  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
971

Integrating distance function learning and support vector machine for content-based image retrieval /

Tang, Siu-shing. January 2006
Thesis (M.Phil.)--Hong Kong University of Science and Technology, 2006. / Includes bibliographical references (leaves 59-66). Also available in electronic version.
972

End-to-End Single-rate Multicast Congestion Detection Using Support Vector Machines.

Liu, Xiaoming. January 2008
IP multicast is an efficient mechanism for simultaneously transmitting bulk data to multiple receivers. Many applications can benefit from multicast, such as audio and videoconferencing, multi-player games, multimedia broadcasting, distance education, and data replication. For either technical or policy reasons, IP multicast has not yet been deployed in today's Internet. Congestion is one of the most important issues impeding the development and deployment of IP multicast and multicast applications.
973

Mining Speech Sounds : Machine Learning Methods for Automatic Speech Recognition and Analysis

Salvi, Giampiero January 2006
This thesis collects studies on machine learning methods applied to speech technology and speech research problems. The six research papers included in this thesis are organised in three main areas. The first group of studies was carried out within the European project Synface. The aim was to develop a low latency phonetic recogniser to drive the articulatory movements of a computer generated virtual face from the acoustic speech signal. The visual information provided by the face is used as a hearing aid for persons using the telephone. Paper A compares two solutions to the problem of mapping acoustic to visual information that are based on regression and classification techniques. Recurrent Neural Networks are used to perform regression while Hidden Markov Models are used for the classification task. In the second case the visual information needed to drive the synthetic face is obtained by interpolation between target values for each acoustic class. The evaluation is based on listening tests with hearing impaired subjects, where the intelligibility of sentence material is compared in different conditions: audio alone, audio and natural face, audio and synthetic face driven by the different methods. Paper B analyses the behaviour, in low latency conditions, of a phonetic recogniser based on a hybrid of Recurrent Neural Networks (RNNs) and Hidden Markov Models (HMMs). The focus is on the interaction between the time evolution model learnt by the RNNs and the one imposed by the HMMs. Paper C investigates the possibility of using the entropy of the posterior probabilities estimated by a phoneme classification neural network as a feature for phonetic boundary detection. The entropy and its time evolution are analysed with respect to the identity of the phonetic segment and the distance from a reference phonetic boundary.
In the second group of studies, the aim was to provide tools for analysing large amounts of speech data in order to study geographical variations in pronunciation (accent analysis). Paper D and Paper E use Hidden Markov Models and Agglomerative Hierarchical Clustering to analyse a data set of about 100 million data points (5000 speakers, 270 hours of speech recordings). In Paper E, Linear Discriminant Analysis was used to determine the features that most concisely describe the groupings obtained with the clustering procedure. The third group comprises studies carried out during the international project MILLE (Modelling Language Learning), which aims at investigating and modelling the language acquisition process in infants. Paper F proposes the use of an incremental form of Model Based Clustering to describe the unsupervised emergence of phonetic classes in the first stages of language acquisition. The experiments were carried out on child-directed speech expressly collected for the purposes of the project. / QC 20100630
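The entropy feature described for Paper C can be illustrated with a minimal sketch. It assumes a classifier that outputs one posterior distribution over phoneme classes per frame; the function names and the toy posteriors below are hypothetical and are not taken from the thesis. High entropy marks frames where the classifier is uncertain, which is where a boundary between two segments would be expected.

```python
import math

def posterior_entropy(posteriors):
    """Shannon entropy (in nats) of one frame's class posteriors."""
    return -sum(p * math.log(p) for p in posteriors if p > 0.0)

def entropy_track(frames):
    """Entropy of each frame, in frame order."""
    return [posterior_entropy(frame) for frame in frames]

# Hypothetical posteriors over three phoneme classes for four frames.
# Frame 1 sits between a segment of class 0 and a segment of class 1.
frames = [
    [0.98, 0.01, 0.01],
    [0.50, 0.49, 0.01],
    [0.02, 0.97, 0.01],
    [0.01, 0.98, 0.01],
]

track = entropy_track(frames)
# Index of the most uncertain frame, a candidate boundary location.
peak_frame = max(range(len(track)), key=track.__getitem__)
```

In this toy input the entropy peaks at the frame where the two leading classes are almost equally likely. A real detector would also need the time evolution of the entropy, as the abstract notes, rather than a single maximum.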
974

Context Dependent Thresholding and Filter Selection for Optical Character Recognition

Kieri, Andreas January 2012
Thresholding algorithms and filters are of great importance when utilizing OCR to extract information from text documents such as invoices. Invoice documents vary greatly and since the performance of image processing methods when applied to those documents will vary accordingly, selecting appropriate methods is critical if a high recognition rate is to be obtained. This paper aims to determine if a document recognition system that automatically selects optimal processing methods, based on the characteristics of input images, will yield a higher recognition rate than what can be achieved by a manual choice. Such a recognition system, including a learning framework for selecting optimal thresholding algorithms and filters, was developed and evaluated. It was established that an automatic selection will ensure a high recognition rate when applied to a set of arbitrary invoice images by successfully adapting and avoiding the methods that yield poor recognition rates.
975

Personalized Medicine through Automatic Extraction of Information from Medical Texts

Frunza, Oana Magdalena 17 April 2012
The wealth of medical-related information available today gives rise to a multidimensional source of knowledge. Research discoveries published in prestigious venues, electronic-health records data, discharge summaries, clinical notes, etc., all represent important medical information that can assist in the medical decision-making process. The challenge that comes with accessing and using such vast and diverse sources of data lies in the ability to distil and extract reliable and relevant information. Computer-based tools that use natural language processing and machine learning techniques have proven to help address such challenges. This work proposes reliable automatic solutions for tasks that can help achieve personalized medicine, a medical practice that brings together general medical knowledge and case-specific medical information. Phenotypic medical observations, along with data coming from test results, are not enough when assessing and treating a medical case. Genetic, life-style, background and environmental data also need to be taken into account in the medical decision process. This thesis's goal is to prove that natural language processing and machine learning techniques represent reliable solutions for solving important medical-related problems. From the numerous research problems that need to be answered when implementing personalized medicine, the scope of this thesis is restricted to four, as follows: 1. Automatic identification of obesity-related diseases by using only textual clinical data; 2. Automatic identification of relevant abstracts of published research to be used for building systematic reviews; 3. Automatic identification of gene functions based on textual data of published medical abstracts; 4. Automatic identification and classification of important medical relations between medical concepts in clinical and technical data.
This thesis's investigation into automatic solutions for achieving personalized medicine through information identification and extraction focused on individual, specific problems that can later be linked in a puzzle-building manner. A diverse representation technique that follows a divide-and-conquer methodological approach proves to be the most reliable solution for building automatic models that solve the above-mentioned tasks. The methodologies that I propose are supported by in-depth research experiments and thorough discussions and conclusions.
976

On surrogate supervision multi-view learning

Jin, Gaole 03 December 2012
Data can be represented in multiple views. Traditional multi-view learning methods (e.g., co-training, multi-task learning) focus on improving learning performance using information from the auxiliary view, although information from the target view is sufficient for the learning task. This work, however, addresses a semi-supervised case of multi-view learning, surrogate supervision multi-view learning, where labels are available on limited views and a classifier is obtained on the target view where labels are missing. In surrogate supervision multi-view learning, one cannot obtain a classifier without information from the auxiliary view. To solve this challenging problem, we propose discriminative and generative approaches. / Graduation date: 2013
977

One General Approach For Analysing Compositional Structure Of Terms In Biomedical Field

Chao, Yang, Zhang, Peng January 2013
The root is the primary lexical unit of an ontological term: it carries the most significant aspects of semantic content and cannot be reduced into smaller constituents. It is the key to ontological term structure. Once the root is identified, the meaning of the term is easy to obtain, and that meaning in turn helps identify the other parts of the term, such as the relation and the definition. In this master thesis we have generated a general classification model to identify the roots of terms. The classification model defines four features: the Token, the POS, the Length and the Position. The model is implemented in Java and uses the Naïve Bayes algorithm. We implemented and evaluated the classification model using Gene Ontology (GO). The evaluation results showed that our framework and model were effective.
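A minimal sketch of how a Naïve Bayes classifier over the four named features might look, written in Python rather than the Java used in the thesis. Everything beyond the feature names is an assumption made for illustration: Length is taken to be the character count of a token, Position its index within the term, the POS tags are supplied by hand, and the three training terms and their root labels are toy data, not Gene Ontology annotations.

```python
import math
from collections import Counter, defaultdict

def token_features(term_tokens, pos_tags):
    """One feature dict per token: Token, POS, Length, Position (assumed definitions)."""
    return [
        {"token": tok.lower(), "pos": pos, "length": len(tok), "position": i}
        for i, (tok, pos) in enumerate(zip(term_tokens, pos_tags))
    ]

class NaiveBayes:
    """Categorical Naive Bayes with add-alpha (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.class_counts = Counter()
        self.value_counts = defaultdict(Counter)  # (label, feature) -> value counts
        self.values = defaultdict(set)            # feature -> values seen in training

    def fit(self, rows, labels):
        for row, label in zip(rows, labels):
            self.class_counts[label] += 1
            for name, value in row.items():
                self.value_counts[(label, name)][value] += 1
                self.values[name].add(value)
        return self

    def log_score(self, row, label):
        total = sum(self.class_counts.values())
        score = math.log(self.class_counts[label] / total)
        for name, value in row.items():
            count = self.value_counts[(label, name)][value]
            # One extra bucket is reserved for values never seen in training.
            buckets = len(self.values[name]) + 1
            score += math.log((count + self.alpha) /
                              (self.class_counts[label] + self.alpha * buckets))
        return score

    def predict(self, row):
        return max(self.class_counts, key=lambda label: self.log_score(row, label))

# Toy, hand-labelled training terms (hypothetical, for illustration only).
training = [
    (["cell", "growth"], ["NN", "NN"], ["other", "root"]),
    (["protein", "binding"], ["NN", "NN"], ["other", "root"]),
    (["lipid", "transport"], ["NN", "NN"], ["other", "root"]),
]
rows, labels = [], []
for tokens, tags, tags_gold in training:
    rows.extend(token_features(tokens, tags))
    labels.extend(tags_gold)

model = NaiveBayes().fit(rows, labels)
predicted = [model.predict(f) for f in token_features(["ion", "transport"], ["NN", "NN"])]
```

With this toy data the Position feature dominates, so the sketch mainly shows the mechanics of combining the four features under the independence assumption; it says nothing about the accuracy reported in the thesis.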
978

Machine learning in engineering : techniques to speed up numerical optimization

Cerbone, G. (Giuseppe) 13 April 1992
Many important application problems in engineering can be formalized as nonlinear optimization tasks. However, numerical methods for solving such problems are brittle and do not scale well. For example, these methods depend critically on choosing a good starting point from which to perform the optimization search. In high-dimensional spaces, numerical methods have difficulty finding solutions that are even locally optimal. The objective of this thesis is to demonstrate how machine learning techniques can improve the performance of numerical optimizers and facilitate optimization in engineering design. The machine learning methods have been tested in the domain of 2-dimensional structural design, where the goal is to find a truss of minimum weight that bears a set of fixed loads. Trusses are constructed from pure tension and pure compression members. The difference in the load-bearing properties of tension and compression members causes the gradient of the objective function to be discontinuous, and this prevents the application of powerful gradient-based optimization algorithms in this domain. In this thesis, the approach to numerical optimization is to find ways of transforming the initial problem into a selected set of subproblems where efficient, gradient-based algorithms can be applied. This is achieved by a three-step "compilation" process. The first step is to apply speedup learning techniques to partition the overall optimization task into sub-problems for which the gradient is continuous. Then, the second step is to further simplify each sub-problem by using inductive learning techniques to identify regularities and exploit them to reduce the number of independent variables. Unfortunately, these first two steps have the potential to produce an exponential number of sub-problems. Hence, in the third step, selection rules are derived to identify those sub-problems that are most likely to contain the global optimum. 
The numerical optimization procedures are only applied to these selected sub-problems. To identify good sub-problems, a novel ID3-like inductive learning algorithm called UTILITYID3 is applied to a collection of training examples to discover selection rules. These rules analyze the problem statement and identify a small number of sub-problems (typically 3) that are likely to contain the global optimum. In the domain of 2-dimensional structural design, the combination of these three steps yields a 6-fold speedup in the time required to find an optimal solution. Furthermore, this method is less reliant on a good starting point for optimization. The methods developed for this problem show promise of being applied to a wide range of numerical optimization problems in engineering design. / Graduation date: 1992
979

Inferring the Binding Preferences of RNA-binding Proteins

Hilal, Kazan 17 December 2012
Post-transcriptional regulation is carried out by RNA-binding proteins (RBPs) that bind to specific RNA molecules and control their processing, localization, stability and degradation. Experimental studies have successfully identified RNA targets associated with specific RBPs. However, because the locations of the binding sites within the targets are unknown and because RBPs recognize both sequence and structure elements in their binding sites, identification of RBP binding preferences from these data remains challenging. The unifying theme of this thesis is to identify RBP binding preferences from experimental data. First, we propose a protocol to design a complex RNA pool that represents diverse sets of sequence and structure elements to be used in an in vitro assay to efficiently measure RBP binding preferences. This design has been implemented in the RNAcompete method, and applied genome-wide to human and Drosophila RBPs. We show that RNAcompete-derived motifs are consistent with established binding preferences. We developed two computational models to learn binding preferences of RBPs from large-scale data. Our first model, RNAcontext, uses a novel representation of secondary structure to infer both sequence and structure preferences of RBPs, and is optimized for use with in vitro binding data on short RNA sequences. We show that including structure information improves the prediction accuracy significantly. Our second model, MaLaRKey, extends RNAcontext to fit motif models to sequences of arbitrary length, and to incorporate a richer set of structure features to better model in vivo RNA secondary structure. We demonstrate that MaLaRKey infers detailed binding models that accurately predict binding of full-length transcripts.
980

Identifying Tissue Specific Distal Regulatory Sequences in the Mouse Genome

Chen, Chih-yu 06 December 2011
Epigenetic modifications, transcription factor (TF) availability and chromatin conformation influence how a genome is interpreted by the transcriptional machinery responsible for gene expression. Enhancers buried in non-coding regions are associated with significant differences in histone marks between different cell types. In contrast, gene promoters show more uniform modifications across cell types. In this report, enhancer identification is first carried out using an enhancer associated feature in mouse erythroid cells. Taking advantage of public domain ChIP-Seq data sets in mouse embryonic stem cells, an integrative model is then used to assess features in enhancer prediction, and subsequently locate enhancers. Significant associations with multiple TF bound loci, higher expression in the closest genes, and active enhancer marks support functionality and tissue-specificity of these enhancers. Motif enrichment analysis further determines known and novel TFs regulating the target cell type. Furthermore, the features identified can facilitate more accurate enhancer prediction in other cell types.
