111 |
Evaluating the performance of machine-learning techniques for recognizing construction materials in digital images / Rashidi, Abbas. 20 September 2013 (has links)
Digital images acquired at construction sites contain valuable information useful for various applications, including as-built documentation of building elements, effective progress monitoring, structural damage assessment, and quality control of construction materials. As a result, there is an increasing need for effective methods to recognize different building materials in digital images and videos.
Pattern recognition is a mature field within image processing; however, its application to civil engineering and building construction is only recent.
In order to develop any robust image recognition method, it is necessary to choose the optimal machine learning algorithm.
To generate a robust color model for building material detection in an outdoor construction environment, a comparative analysis of three machine learning algorithms, namely the multilayer perceptron (MLP), the radial basis function (RBF) network, and the support vector machine (SVM), is conducted. The main focus of this study is on three classes of building materials: concrete, plywood, and brick.
For training purposes, a large data set comprising hundreds of images is collected. The comparative study is conducted by implementing the necessary algorithms in MATLAB and testing them over hundreds of construction-site images. To evaluate the performance of each technique, the results are compared with a manual classification of building materials. To assess each technique more thoroughly, experiments are conducted under various realistic jobsite conditions, e.g., different image resolutions, different camera-to-object distances, and different types of cameras.
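As an illustration of this kind of classifier comparison (a hedged sketch, not the thesis's MATLAB implementation), the following Python snippet trains two of the named model families, an MLP and an SVM with an RBF kernel, on synthetic color features standing in for concrete/plywood/brick patches. scikit-learn and the synthetic data are assumptions, and note that an RBF-kernel SVM is not the same model as an RBF network, which scikit-learn does not provide.

```python
# Hypothetical sketch: comparing classifiers for material recognition.
# Synthetic color-statistic features stand in for real image patches.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Three classes (concrete, plywood, brick); 6 color features per patch.
X, y = make_classification(n_samples=600, n_features=6, n_informative=4,
                           n_classes=3, n_clusters_per_class=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "MLP": make_pipeline(StandardScaler(),
                         MLPClassifier(hidden_layer_sizes=(32,),
                                       max_iter=2000, random_state=0)),
    "SVM (RBF kernel)": make_pipeline(StandardScaler(),
                                      SVC(kernel="rbf", C=10.0)),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)                      # train on labeled patches
    print(f"{name}: test accuracy = {model.score(X_te, y_te):.3f}")
```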
112 |
Machine Learning and Graph Theory Approaches for Classification and Prediction of Protein Structure / Altun, Gulsah. 22 April 2008 (has links)
Recently, many methods have been proposed for classification and prediction problems in bioinformatics; one of these problems is protein structure prediction. Machine learning approaches and new algorithms have been proposed to solve this problem. Among the machine learning approaches, Support Vector Machines (SVM) have attracted a lot of attention due to their high prediction accuracy. Since protein data consist of sequence and structural information, another widely used approach for modeling this structured data is to use graphs. In computer science, graph theory has been widely studied; however, it has only recently been applied to bioinformatics. In this work, we introduce new algorithms based on statistical methods, graph theory concepts, and machine learning for the protein structure prediction problem. A new statistical method based on z-scores is introduced for seed selection in proteins. A new method based on finding common cliques in protein data is also introduced for feature selection, which reduces noise in the data. We also introduce new binary classifiers for the prediction of structural transitions in proteins; these new binary classifiers achieve much higher accuracy than current traditional binary classifiers.
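The abstract does not spell out the z-score criterion; as a generic, hypothetical illustration of z-score-based selection, the sketch below standardizes per-window scores and keeps the windows whose scores stand out. The function name, the scores, and the threshold are all assumptions.

```python
# Hypothetical sketch of z-score-based seed selection: standardize window
# scores and keep statistical outliers as seeds.
import numpy as np

def select_seeds(window_scores: np.ndarray, z_threshold: float = 2.0) -> np.ndarray:
    """Return indices of windows whose z-score exceeds the threshold."""
    z = (window_scores - window_scores.mean()) / window_scores.std(ddof=1)
    return np.flatnonzero(z > z_threshold)

rng = np.random.default_rng(0)
scores = rng.normal(size=200)      # stand-in per-window similarity scores
scores[[17, 58, 120]] += 4.0       # plant a few strong candidate seeds
print(select_seeds(scores))        # -> [ 17  58 120]
```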
113 |
kernlab - An S4 package for kernel methods in R / Karatzoglou, Alexandros, Smola, Alex, Hornik, Kurt, Zeileis, Achim. January 2004 (has links) (PDF)
kernlab is an extensible package for kernel-based machine learning methods in R. It takes advantage of R's new S4 object model and provides a framework for creating and using kernel-based algorithms. The package contains dot-product primitives (kernels), implementations of support vector machines and the relevance vector machine, Gaussian processes, a ranking algorithm, kernel PCA, kernel CCA, and a spectral clustering algorithm. Moreover, it provides a general-purpose quadratic programming solver and an incomplete Cholesky decomposition method. (author's abstract) / Series: Research Report Series / Department of Statistics and Mathematics
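kernlab itself is an R package; to keep the code in this document in one language, here is a hedged scikit-learn analogue of the workflow the abstract describes, with one kernel definition (a dot-product primitive) shared by kernel PCA and an SVM. The dataset and parameters are assumptions, and scikit-learn's API differs from kernlab's.

```python
# Hypothetical analogue of the kernlab workflow in scikit-learn: a single
# kernel choice (RBF with one gamma) reused by kernel PCA and an SVM.
from sklearn.datasets import make_moons
from sklearn.decomposition import KernelPCA
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.1, random_state=0)
gamma = 2.0  # the shared "kernel primitive" (RBF width)

kpca = KernelPCA(n_components=2, kernel="rbf", gamma=gamma).fit(X)
svm = SVC(kernel="rbf", gamma=gamma).fit(X, y)
print("kernel PCA embedding shape:", kpca.transform(X).shape)
print("SVM training accuracy:", svm.score(X, y))
```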
114 |
Automatic Driver Fatigue Monitoring Using Hidden Markov Models and Bayesian Networks / Rashwan, Abdullah. 11 December 2013 (has links)
The automotive industry grows bigger each year, and the central concern for any automotive company is driver and passenger safety. Many automotive companies have developed driver assistance systems to help the driver and to ensure driver safety. These systems include adaptive cruise control, lane departure warning, lane change assistance, collision avoidance, night vision, automatic parking, traffic sign recognition, and driver fatigue detection.
In this thesis, we aim to build a driver fatigue detection system that advances the research in this area. Vision is commonly the key component of driver fatigue detection systems; we have decided to investigate a different direction and examine the driver's voice, heart rate, and driving performance to assess the fatigue level. The system consists of three main modules: the audio module, the heart rate and other signals module, and the Bayesian network module.
The audio module analyzes an audio recording of a driver and estimates the driver's level of fatigue. A Voice Activity Detection (VAD) module is used to extract driver speech from the audio recording. Mel-Frequency Cepstral Coefficient (MFCC) features are extracted from the speech signal, and then Support Vector Machine (SVM) and Hidden Markov Model (HMM) classifiers are used to detect driver fatigue. Both classifiers are tuned for best performance, and their performance is reported and compared.
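As a hedged sketch of the MFCC-plus-SVM stage of the audio module (the thesis does not name its toolchain; librosa, the pooling scheme, and the synthetic clips are assumptions), an utterance could be reduced to a fixed-length feature vector and classified like this:

```python
# Hypothetical audio-module sketch: MFCC features pooled per utterance,
# then a soft-margin SVM. Assumes librosa and scikit-learn.
import numpy as np
import librosa
from sklearn.svm import SVC

SR = 16000

def utterance_features(clip: np.ndarray, sr: int = SR) -> np.ndarray:
    """13 MFCCs, mean/std-pooled over time -> fixed 26-dim vector."""
    mfcc = librosa.feature.mfcc(y=clip.astype(np.float32), sr=sr, n_mfcc=13)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# Synthetic stand-ins for alert vs. fatigued speech (real input would be
# VAD-extracted driver speech): different pitch and noise levels.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, SR, endpoint=False)
def make_clip(f0, noise):
    return np.sin(2 * np.pi * f0 * t) + noise * rng.standard_normal(SR)

X = np.vstack([utterance_features(make_clip(f0, nz))
               for f0, nz in [(180, 0.1), (190, 0.1), (110, 0.4), (100, 0.4)]])
y = np.array([0, 0, 1, 1])                        # 0 = alert, 1 = fatigued

clf = SVC(kernel="rbf", C=1.0).fit(X, y)          # soft margin set via C
print(clf.predict(utterance_features(make_clip(105, 0.4)).reshape(1, -1)))
```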
The heart rate and other signals module uses the heart rate, the steering wheel position, and the positions of the accelerator, brake, and clutch pedals to detect the level of fatigue. The signals' sample rates are first adjusted to match, simple features are then extracted from the aligned signals, and SVM and HMM classifiers are used to detect the fatigue level. The performance of both classifiers is reported and compared.
Bayesian networks' ability to capture dependencies and uncertainty makes them a sound choice for performing the data fusion. Prior information (day/night driving and the previous decision) is also incorporated into the network to improve the final decision. The accuracies of the audio module and of the heart rate and other signals module are used to calculate some of the Bayesian network's conditional probability tables (CPTs), while the remaining CPTs are set subjectively. The inference queries are answered using the variable elimination algorithm. For time steps where the audio module's decision is absent, a window is defined and the last decision within that window is used as the current decision.
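A hedged sketch of such a fusion network follows; pgmpy, the two-module structure, and every CPT number are illustrative assumptions, not the thesis's values.

```python
# Hypothetical fusion sketch with pgmpy: module outputs modeled as noisy
# observations of a latent Fatigue node, with a day/night prior.
from pgmpy.models import BayesianNetwork
from pgmpy.factors.discrete import TabularCPD
from pgmpy.inference import VariableElimination

model = BayesianNetwork([("Night", "Fatigue"),
                         ("Fatigue", "Audio"), ("Fatigue", "Signals")])
model.add_cpds(
    TabularCPD("Night", 2, [[0.7], [0.3]]),            # P(day), P(night)
    TabularCPD("Fatigue", 2, [[0.9, 0.6],              # P(alert | Night)
                              [0.1, 0.4]],             # P(fatigued | Night)
               evidence=["Night"], evidence_card=[2]),
    # Module CPTs would come from each module's measured accuracy.
    TabularCPD("Audio", 2, [[0.85, 0.20], [0.15, 0.80]],
               evidence=["Fatigue"], evidence_card=[2]),
    TabularCPD("Signals", 2, [[0.80, 0.25], [0.20, 0.75]],
               evidence=["Fatigue"], evidence_card=[2]),
)
assert model.check_model()

infer = VariableElimination(model)
posterior = infer.query(["Fatigue"],
                        evidence={"Audio": 1, "Signals": 1, "Night": 1})
print(posterior)   # P(Fatigue | both modules flag fatigue, night driving)
```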
A dataset was built to train and test the system, and the experimental results show that the system is very promising. Assessed by the average accuracy per second, the total accuracy of the system is 90.5%. The system design can also be improved easily by integrating more modules into the Bayesian network.
115 |
Plant-wide Performance Monitoring and Controller Prioritization / Pareek, Samidh. 06 1900 (has links)
Plant-wide performance monitoring has generated a lot of interest in the control engineering community. The idea is to judge the performance of a plant as a whole rather than looking at the performance of individual controllers. Data-based methods are currently used to generate a variety of statistical performance indices that help judge the performance of production units and control assets. However, so much information can be overwhelming when it lacks precise diagnostic content. Powerful computing and data storage capabilities have enabled industries to store huge amounts of data, and commercial performance monitoring software from vendors such as Honeywell, Matrikon, and ExperTune typically uses this data to generate huge amounts of information; the problem of data overload has in this way turned into an information overload problem. This work focuses on developing methods that reconcile these various statistical measures of performance and generate useful diagnostic measures in order to optimize the process performance of a unit or plant. These methods are also able to identify the relative importance of controllers in the way that they affect the performance of the unit or plant under consideration. / Process Control
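The abstract does not name its indices; a standard controller performance index of this kind is the Harris (minimum variance) index, shown here as a hedged sketch rather than the thesis's method. It assumes the process delay is known and that an AR model of the controlled output is adequate.

```python
# Hypothetical Harris-style index: the minimum achievable output variance
# (from the first `delay` impulse-response terms of an AR fit) divided by
# the actual output variance. Values near 1 indicate good control.
import numpy as np

def harris_index(y: np.ndarray, delay: int, ar_order: int = 20) -> float:
    y = y - y.mean()
    # Least-squares AR fit: y[t] = sum_k a[k] * y[t-k-1] + e[t]
    Y = np.column_stack([y[ar_order - k - 1:len(y) - k - 1]
                         for k in range(ar_order)])
    target = y[ar_order:]
    a, *_ = np.linalg.lstsq(Y, target, rcond=None)
    sigma_e2 = (target - Y @ a).var()          # innovation variance
    psi = np.zeros(delay)                      # AR -> MA impulse response
    psi[0] = 1.0
    for t in range(1, delay):
        psi[t] = sum(a[k] * psi[t - k - 1] for k in range(min(t, ar_order)))
    sigma_mv2 = sigma_e2 * np.sum(psi ** 2)    # minimum-variance benchmark
    return sigma_mv2 / y.var()

rng = np.random.default_rng(0)
y = rng.standard_normal(5000)
for _ in range(3):                             # smoothing mimics a sluggish loop
    y = np.convolve(y, [0.5, 0.5], mode="same")
print(f"Harris index: {harris_index(y, delay=3):.2f}")
```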
116 |
Classification of HTML Documents / Xie, Wei. University of Ballarat, January 2006 (has links)
Text Classification is the task of mapping a document into one or more classes based on the presence or absence of words (or features) in the document. It has been studied intensively, and different classification techniques and algorithms have been developed. This thesis focuses on the classification of online documents, a task that has become more critical with the development of the World Wide Web. The WWW vastly increases the availability of on-line documents in digital format and has highlighted the need to classify them. From this background, we have noted the emergence of "automatic Web Classification", which mainly concentrates on classifying HTML-like documents into classes or categories by not only using methods inherited from the traditional Text Classification process, but also utilizing the extra information provided only by Web pages. Our work is based on the fact that Web documents contain not only ordinary features (words) but also extra information, such as meta-data and hyperlinks, that can be used to benefit the classification process. The aim of this research is to study various ways of using this extra information, in particular the hyperlink information provided by HTML documents (Web pages). The merit of the approach developed in this thesis is its simplicity compared with existing approaches. We present different approaches to using hyperlink information to improve the effectiveness of Web classification. Unlike other work in this area, we only use the mappings between linked documents and their own class or classes. In this case, we only need to add a few features, called linked-class features, into the datasets, and then apply classifiers to them for classification. In the numerical experiments we adopted two well-known Text Classification algorithms, Support Vector Machines and BoosTexter. The results obtained show that classification accuracy can be improved by using mixtures of ordinary and linked-class features. Moreover, out-links usually work better than in-links in classification. We also analyse and discuss the reasons behind this improvement. / Master of Computing
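A hedged sketch of the linked-class idea as the abstract describes it, appending, for each document, counts of the known classes of its out-linked documents to the ordinary word features before training an SVM; the feature layout and the toy data are assumptions.

```python
# Hypothetical linked-class feature sketch: augment bag-of-words features
# with counts of the classes of out-linked pages, then train a linear SVM.
import numpy as np
from sklearn.svm import LinearSVC

n_docs, n_words, n_classes = 8, 5, 2
rng = np.random.default_rng(0)
words = rng.integers(0, 3, size=(n_docs, n_words)).astype(float)  # term counts
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
out_links = [[1, 4], [2], [0, 3], [1], [5, 6], [7], [4], [6]]  # i -> linked docs

# Linked-class features: per document, count out-linked pages per class.
linked = np.zeros((n_docs, n_classes))
for i, targets in enumerate(out_links):
    for j in targets:
        linked[i, labels[j]] += 1.0        # uses the class of the linked page

X = np.hstack([words, linked])             # ordinary + linked-class features
clf = LinearSVC(C=1.0).fit(X, labels)
print("training accuracy:", clf.score(X, labels))
```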
118 |
Some problems in high dimensional data analysis / Pham, Tung Huy. January 2010 (has links)
The bloom of economics and technology has had an enormous impact on society. Along with these developments, human activities nowadays produce massive amounts of data that can be collected easily and at relatively low cost with the aid of new technologies. Many examples can be mentioned here, including web term-document data, sensor arrays, gene expression, finance data, imaging, and hyperspectral analysis. Because of the enormous amount of data from various different and new sources, more and more challenging scientific problems appear, and these problems have changed the types of problems on which mathematical scientists work.

In traditional statistics, the dimension of the data, p say, is low, with many observations, n say. In this case, classical rules such as the Central Limit Theorem are often applied to obtain some understanding from the data. A new challenge to statisticians today is dealing with a different setting, where the data dimension is very large and the number of observations is small. The mathematical assumption now could be p > n, or even p tending to infinity with n fixed; for example, there are few patients but many genes. In these cases, classical methods fail to produce a good understanding of the nature of the problem. Hence, new methods need to be found to solve these problems, and mathematical explanations are also needed to generalize these cases.

The research presented in this thesis addresses two problems, variable selection and classification, in the case where the dimension is very large. The work on variable selection, in particular the Adaptive Lasso, was completed by June 2007, and the research on classification was carried out throughout 2008 and 2009; the research on the Dantzig selector and the Lasso was finished in July 2009. Therefore, this thesis is divided into two parts. In the first part we study the Adaptive Lasso, the Lasso, and the Dantzig selector. In particular, in Chapter 2 we present some results for the Adaptive Lasso, and Chapter 3 provides two examples showing that neither the Dantzig selector nor the Lasso is definitely better than the other. The second part of the thesis is organized as follows. In Chapter 5, we construct the model setting. In Chapter 6, we summarize, and prove some results on, the scaled centroid-based classifier. Because there are similarities between the Support Vector Machine (SVM) and Distance Weighted Discrimination (DWD) classifiers, Chapter 8 introduces a class of distance-based classifiers that can be considered a generalization of the SVM and DWD classifiers. Chapters 9 and 10 are about the SVM and DWD classifiers, and Chapter 11 demonstrates the performance of these classifiers on simulated data sets and some cancer data sets.
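As a hedged illustration of the Adaptive Lasso studied in the first part (the standard reweighted construction, not necessarily the thesis's exact estimator), the penalty weights 1/|initial estimate|^gamma can be folded into a column rescaling so that an ordinary Lasso solver applies:

```python
# Hypothetical Adaptive Lasso sketch: weight each coefficient's penalty by
# 1/|beta_init|^gamma, implemented as an ordinary Lasso on rescaled columns.
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
n, p = 100, 30
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = [3.0, -2.0, 1.5]                 # sparse ground truth
y = X @ beta_true + 0.5 * rng.standard_normal(n)

gamma = 1.0
beta_init = Ridge(alpha=1.0).fit(X, y).coef_     # initial estimate
w = np.abs(beta_init) ** gamma                   # per-column weights
lasso = Lasso(alpha=0.1).fit(X * w, y)           # Lasso on rescaled design
beta_hat = lasso.coef_ * w                       # map back to original scale
print("selected variables:", np.flatnonzero(beta_hat))
```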
120 |
Face recognition using customized orthogonal locality preserving projections with soft margin maximization / Soldera, John. January 2015 (has links)
In this thesis, a novel face recognition method is proposed with the objective of being robust to many of the issues that can affect face features in practice, such as changes in illumination, head pose, or facial expression; it is based on projecting high-dimensional face image representations into lower-dimensional, highly discriminative spaces. This is achieved by a modified Orthogonal Locality Preserving Projections (OLPP) method that uses a supervised alternative locality definition scheme designed to preserve the face class (individual) structure in the obtained lower-dimensional face feature space, unlike the typical OLPP method, which preserves the face data structure. Besides, a new kernel equation is proposed to calculate affinities among face samples, presenting better class structure preservation when compared to the heat kernel used by the typical OLPP method. The proposed method can work with sparse and dense face image representations (i.e., it can use subsets or all of the face image pixels), and sparse and dense feature extraction methods are proposed that preserve the color information during the feature extraction process, improving on the typical OLPP method, which uses grayscale low-resolution face images. New test face images are classified in the obtained lower-dimensional feature space using a trained soft-margin Support Vector Machine (SVM), which performs better than the nearest neighbor rule used in the typical OLPP method. A set of experiments was designed to evaluate the proposed method under various conditions found in practice (such as changes in head pose, face expression, and illumination, and the presence of occlusion artifacts). The experimental results were obtained using five challenging public face databases (namely PUT, FEI, FERET, Yale, and ORL). These experiments confirm that the proposed feature extraction methods, integrated with the proposed transformation to a discriminative lower-dimensional space and the alternative classification scheme using soft-margin SVM, obtain higher recognition rates than the OLPP method itself and methods representative of the state of the art, both when high-resolution color (RGB) face images are used (PUT, FEI, and FERET face databases) and when low-resolution grayscale face images are used (Yale and ORL face databases).
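A hedged sketch of the overall pipeline, a supervised locality-preserving projection followed by a soft-margin SVM. The same-class affinity rule, the QR orthogonalization, and the digits data standing in for face images are simplifications and assumptions, not the thesis's exact OLPP procedure:

```python
# Hypothetical sketch: supervised locality-preserving projection followed by
# a soft-margin SVM. Affinities connect only same-class samples; QR supplies
# orthogonal axes (a simplification of OLPP's orthogonalization step).
import numpy as np
from scipy.linalg import eigh, qr
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)            # stand-in for face images
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

W = (y_tr[:, None] == y_tr[None, :]).astype(float)   # supervised affinity
D = np.diag(W.sum(axis=1))
L = D - W                                            # graph Laplacian

# LPP generalized eigenproblem: X^T L X a = lambda X^T D X a (smallest lambda).
reg = 1e-3 * np.eye(X_tr.shape[1])                   # regularize for stability
A = X_tr.T @ L @ X_tr + reg
B = X_tr.T @ D @ X_tr + reg
_, vecs = eigh(A, B)                                 # ascending eigenvalues
P, _ = qr(vecs[:, :20], mode="economic")             # 20 orthogonal axes

clf = SVC(kernel="linear", C=1.0).fit(X_tr @ P, y_tr)   # soft margin via C
print("test accuracy:", clf.score(X_te @ P, y_te))
```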