1 |
Study and Design of Globally Optimal Distributed Scalar Quantizer for Binary Linear Classification
Zendehboodi, Sara, 11 1900
This thesis addresses the design of distributed scalar quantizers (DSQs) for two sensors,
tailored to maximize the classification accuracy for a pre-trained binary linear classifier
at the central node, diverging from traditional designs that prioritize data reconstruction
quality.
The first contribution of this thesis is the development of efficient globally optimal
DSQ design algorithms for two correlated discrete sources when the quantizer cells are
assumed to be convex. First, it is shown that the problem is equivalent to a minimum
weight path problem (with certain constraints) in a weighted directed acyclic graph.
The latter problem can be solved using dynamic programming with O(K_1K_2M^4) computational
complexity, where K_i is the number of cells of the quantizer for source i,
i = 1, 2, and M is the size of the union of the sources’ alphabets. Additionally, it is
proved that the dynamic programming algorithm can be expedited by a factor of M by
exploiting the so-called Monge property, for scenarios where the pre-trained classifier is
the optimal classifier for the unquantized sources.
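As a rough illustration of the graph formulation (not the thesis's two-source algorithm), the single-source version of convex-cell quantizer design is a minimum-weight K-link path problem solvable by the following dynamic program; the `cell_cost` callback is a hypothetical stand-in for a cell's contribution to the classification-error objective.

```python
import numpy as np

def optimal_convex_quantizer(cell_cost, M, K):
    """Minimum-weight K-link path DP for one scalar source.

    A convex-cell quantizer is a set of thresholds
    0 = t_0 < t_1 < ... < t_K = M over the alphabet {0, ..., M-1};
    cell k covers symbols t_{k-1}..t_k - 1. `cell_cost(a, b)` is the
    (hypothetical) cost of a cell covering symbols a..b-1, e.g. the
    misclassified probability mass under the best label for the cell.
    Runs in O(K * M^2) evaluations of `cell_cost`.
    """
    INF = float("inf")
    dp = np.full((K + 1, M + 1), INF)     # dp[k][b]: cover 0..b-1 with k cells
    arg = np.zeros((K + 1, M + 1), dtype=int)
    dp[0][0] = 0.0
    for k in range(1, K + 1):
        for b in range(k, M + 1):
            for a in range(k - 1, b):     # position of the previous threshold
                c = dp[k - 1][a] + cell_cost(a, b)
                if c < dp[k][b]:
                    dp[k][b], arg[k][b] = c, a
    thresholds, b = [M], M                # backtrack the optimal path
    for k in range(K, 0, -1):
        b = arg[k][b]
        thresholds.append(b)
    return dp[K][M], thresholds[::-1]
```

Roughly speaking, the thesis's two-sensor problem couples two such threshold sequences through a joint cost, which is where the additional factors in the O(K_1K_2M^4) bound come from.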
Next, the design of so-called staggered DSQs (SDSQs) is addressed, i.e., DSQs with
K_1 = K_2 = K and with the thresholds of the two quantizers being interleaved. First, a
faster dynamic programming algorithm with only O(KM^2) time complexity is devised
for the design of the SDSQ that minimizes an upper bound on the classification error.
This speedup is obtained by simplifying the graph model for the problem. Moreover,
it is shown that this algorithm can also be further accelerated by a factor of M when
the pre-trained linear classifier is the optimal classifier. Furthermore, some theoretical
results are derived that provide support for imposing the above constraints on the DSQ
design problem in the case where the pre-trained classifier is optimal. First, it is shown
that when the sources (discrete or continuous) satisfy a certain symmetry property, the
SDSQ that minimizes the modified cost also minimizes the original cost within the class
of DSQs without the staggering constraint. For continuous sources, it is also shown
that the SDSQ that minimizes the modified cost also minimizes the original cost and all
quantizer thresholds are distinct, even if the sources do not satisfy the aforementioned
symmetry condition. The latter result implies that DSQs with identical encoders are
not optimal even when the sources have the same marginal distribution, a fact which,
to the best of our knowledge, is proved here for the first time.
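For context (this is textbook material, not code from the thesis): the Monge property makes the minimizing predecessor in each DP layer monotone in the current index, which is exactly what the classic divide-and-conquer row-minima trick and the SMAWK algorithm exploit. A minimal sketch of the former:

```python
def dc_row_minima(cost, lo, hi, opt_lo, opt_hi, out):
    """Fill out[i] = min_j cost(i, j) for rows i in lo..hi, assuming the
    minimizing column j is non-decreasing in i (implied by the Monge
    property of the cell weights). Each DP layer then costs O(M log M)
    instead of O(M^2); the SMAWK algorithm sharpens this to O(M),
    i.e., the full factor-of-M saving."""
    if lo > hi:
        return
    mid = (lo + hi) // 2
    best, best_j = float("inf"), opt_lo
    for j in range(opt_lo, min(mid - 1, opt_hi) + 1):  # boundary j precedes row mid
        c = cost(mid, j)
        if c < best:
            best, best_j = c, j
    out[mid] = best
    dc_row_minima(cost, lo, mid - 1, opt_lo, best_j, out)
    dc_row_minima(cost, mid + 1, hi, best_j, opt_hi, out)

# In the quantizer DP, cost(i, j) = dp_prev[j] + w(j, i), where w is the
# cell weight; the Monge inequality w(a,c) + w(b,d) <= w(a,d) + w(b,c)
# (for a <= b <= c <= d) is what guarantees the monotonicity used above.
```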
The last (but not least) contribution of this thesis resides in leveraging the aforementioned
results to obtain efficient globally optimal solution algorithms for the problem
of decentralized detection, under the probability-of-error criterion, of two discrete vector
sources that are conditionally independent given the class label. The previously
known globally optimal solution has O(N^(K_1+K_2+1)) time complexity, where N is the
size of the union of the alphabets of the two sources. We show that by applying an
appropriate transformation to each vector source, the problem reduces to the problem
of designing the optimal DSQ with convex cells in the transformed scalar domain for
a scenario where the pre-trained linear classifier is the optimal classifier. We conclude
that the problem can be solved by a much faster algorithm with only O(K_1K_2N^3) time
complexity. Similarly, for the case of equal quantizer rates, the problem can be solved
in O(KN) operations if the sources satisfy an additional symmetry condition. Furthermore,
our results prove the conjecture that for continuous sources, imposing the
constraint that the encoders be identical precludes optimality, even when the marginal
distributions of the sources are the same.
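The abstract does not spell out the transformation. A classical choice in decentralized detection with conditionally independent observations is to map each symbol of a vector source to its log-likelihood ratio, after which optimal local rules are threshold rules, i.e., convex-cell quantizers in the scalar LLR domain. A speculative sketch under that assumption (`p0` and `p1` are hypothetical conditional pmfs):

```python
import numpy as np

def to_llr_domain(p0, p1):
    """Map each symbol of a discrete vector source to its log-likelihood
    ratio, and return the symbols sorted by LLR.

    p0[x], p1[x]: conditional pmfs of the source under class 0 / class 1.
    In the sorted-LLR domain, a quantizer with convex cells is a
    monotone threshold rule on the likelihood ratio, the form an
    optimal local decision rule takes for conditionally independent
    observations."""
    symbols = list(p0.keys())
    llr = {x: np.log(p1[x] / p0[x]) for x in symbols}
    order = sorted(symbols, key=lambda x: llr[x])
    return order, llr
```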
|
2 |
Learning via Query Synthesis
Alabdulmohsin, Ibrahim Mansour, 07 May 2017
Active learning is a subfield of machine learning that has been successfully used in many applications. One of the main branches of active learning is query synthesis, where the learning agent constructs artificial queries from scratch in order to reveal sensitive information about the underlying decision boundary. It has found applications in areas such as adversarial reverse engineering, automated science, and computational chemistry. Nevertheless, the existing literature on membership query synthesis has generally focused on finite concept classes or toy problems, with limited extension to real-world applications.
In this thesis, I develop two spectral algorithms for learning halfspaces via query synthesis. The first algorithm is a maximum-determinant convex optimization method, while the second is a Markovian method that relies on Khachiyan’s classical update formulas for solving linear programs. The general theme of these methods is to construct an ellipsoidal approximation of the version space and then to synthesize queries via spectral decomposition. Moreover, I describe how these algorithms can be extended to other settings, such as pool-based active learning.
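As a hedged illustration of the "ellipsoidal approximation plus spectral decomposition" theme, and not the thesis's exact update rules, one can synthesize the next membership query along the version-space ellipsoid's longest axis; `synthesize_query` and its arguments are hypothetical:

```python
import numpy as np

def synthesize_query(c, A):
    """Sketch: query along the version-space ellipsoid's longest axis.

    E = {w : (w - c)^T A^{-1} (w - c) <= 1} approximates the version
    space (center c != 0, PSD shape matrix A). The label sign(w^T x)
    of a query x cuts the version space with the hyperplane
    {w : w^T x = 0}; projecting the top eigenvector of A to be
    orthogonal to c makes that cut pass through the center,
    approximately halving the ellipsoid along its most uncertain
    direction."""
    eigvals, eigvecs = np.linalg.eigh(A)
    v = eigvecs[:, -1]                     # longest-axis direction
    x = v - (c @ v) / (c @ c) * c          # enforce c^T x = 0 (assumes c != 0)
    return x / np.linalg.norm(x)
```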
Having demonstrated that halfspaces can be learned quite efficiently via query synthesis, the second part of this thesis proposes strategies for mitigating the risk of reverse engineering in adversarial environments. One approach that can be used to render query synthesis algorithms ineffective is to implement a randomized response. In this thesis, I propose a semidefinite program (SDP) for learning a distribution of classifiers, subject to the constraint that any individual classifier picked at random from this distribution provides reliable predictions with high probability. This algorithm is then justified both theoretically and empirically. A second approach is to use a non-parametric classification method, such as similarity-based classification. In this thesis, I argue that learning via the empirical kernel maps, also commonly referred to as the 1-norm Support Vector Machine (SVM) or Linear Programming (LP) SVM, is the best method for handling indefinite similarities. The advantages of this method are established both theoretically and empirically.
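The 1-norm (LP) SVM over the empirical kernel map mentioned above can be written as a small linear program. A minimal sketch using SciPy, assuming a precomputed similarity matrix S, which may be indefinite, and labels y in {-1, +1}; the split a = u - v linearizes the 1-norm:

```python
import numpy as np
from scipy.optimize import linprog

def lp_svm(S, y, C=1.0):
    """1-norm (LP) SVM on an empirical similarity map.

    S: (n, n) similarity matrix, possibly indefinite; y: labels in {-1, +1}.
    Solves  min ||a||_1 + C * sum(xi)
            s.t. y_i (S[i] @ a + b) >= 1 - xi_i,  xi >= 0,
    with a = u - v (u, v >= 0) and the free bias b split as bp - bn."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    Y = np.diag(y)
    # variable order: [u (n), v (n), bp, bn, xi (n)], all nonnegative
    cvec = np.concatenate([np.ones(2 * n), [0.0, 0.0], C * np.ones(n)])
    # margin constraints rewritten as A_ub @ z <= b_ub
    A_ub = np.hstack([-Y @ S, Y @ S,
                      -y.reshape(-1, 1), y.reshape(-1, 1),
                      -np.eye(n)])
    b_ub = -np.ones(n)
    res = linprog(cvec, A_ub=A_ub, b_ub=b_ub, bounds=(0, None))
    a = res.x[:n] - res.x[n:2 * n]
    b = res.x[2 * n] - res.x[2 * n + 1]
    return a, b   # predict with sign(S_test @ a + b)
```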
|
3 |
Accounting for Additional Heterogeneity: A Theoretic Extension of an Extant Economic Model
Barney, Bradley John, 26 October 2007
The assumption of a representative agent is often made in economics, but it is a very rigid one. Hall and Jones (2004b) presented an economic model that essentially provided for a representative agent for each age group in determining the group's health level function. Our work seeks to extend the theoretical version of their model by allowing two representative agents for each age, one for each of the “Healthy” and “Sick” risk-factor groups, to allow for additional heterogeneity in the populace. The approach to including even more risk-factor groups is also briefly discussed. While our extended theoretical model is not applied directly to the relevant data, several techniques that could be applicable, were such data to be obtained, are demonstrated on other data sets. These include examples of using linear classification, fitting baseline-category logit models, and running the genetic algorithm.
|
4 |
Interactive Classification Of Satellite Image Content Based On Query By Example
Dalay, Oral, 01 January 2006
In our attempt to construct a semantic filter for satellite image content, we have built software that allows the user to indicate a small number of image regions containing a specific geographical object, such as a bridge, and to retrieve similar objects in the same satellite image. We are particularly interested in a data analysis approach based on user interaction: the user can guide the classification procedure through interaction and visual observation of the results. We have applied a two-step procedure for this, and preliminary results show that we eliminate many true negatives while keeping most of the true positives.
|
5 |
Sparse Multinomial Logistic Regression via Approximate Message Passing
Byrne, Evan Michael, 14 October 2015
No description available.
|
6 |
Linear classification of cattle: creation of a decision model based on “true type” conformation to aid decision-making in the selection of dairy cattle
Sousa, Rogério Pereira de, 29 August 2016
The selection of dairy cattle through a classification system with linear type traits is reflected in increased production, in the productive life of the animal, and in the standardization of the herd, among other benefits. This study obtained its information through bibliographic research and the analysis of a database of actual classification records. It aimed to generate a “true type” classification model for dairy cattle to assist evaluators in processing and analyzing the data, helping in the decision to select a cow for dairy aptitude and making the data reliable for future reference. The research applies computational methods to the classification of dairy cows using data mining and fuzzy logic. The analysis was carried out on a database of 144 animal records classified between the good and excellent categories, using the WEKA tool to extract association rules with the Apriori algorithm, with support, confidence, and lift as the objective metrics determining the degree of dependence of each rule. The fuzzy-logic decision model was built with the R “sets” package. The mined rules identified associations relevant to the classification model with confidence above 90%, indicating that the assessed traits (antecedent) imply other traits (consequent) with high confidence. The results of the fuzzy decision model show that a classification model based on subjective assessments is susceptible to misclassification, suggesting the use of results obtained from association rules as an objective aid in the final classification of a cow for dairy aptitude.
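For concreteness, the objective metrics used to rank the Apriori rules (support, confidence, lift) reduce to ratios of co-occurrence counts. A minimal sketch, with hypothetical trait items; the study itself used WEKA's Apriori implementation:

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence and lift of the rule A -> C over a list of
    transactions (each a set of items): the objective metrics used to
    rank association rules as described above."""
    n = len(transactions)
    a = sum(antecedent <= t for t in transactions)              # count(A)
    c = sum(consequent <= t for t in transactions)              # count(C)
    ac = sum((antecedent | consequent) <= t for t in transactions)
    support = ac / n
    confidence = ac / a if a else 0.0
    lift = confidence / (c / n) if c else 0.0
    return support, confidence, lift

# Hypothetical usage with made-up trait items:
# rule_metrics([{"stature:good", "udder:good"}, {"stature:good"}],
#              {"stature:good"}, {"udder:good"})
```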
|
7 |
Usage of advanced signal processing techniques for motor traffic safety enhancement
Beneš, Radek, January 2009
This diploma thesis deals with the recognition of road signs in video sequences. Such systems increase traffic safety and are deployed by major car manufacturers (Opel, BMW) in production cars. First, the motivation for using these systems is presented, followed by a survey of current state-of-the-art methods. Finally, a specific road-sign detection method is chosen and described in detail. The method uses advanced signal processing techniques: a color-space segmentation method is used for sign detection, and subsequent classification is accomplished by a linear classifier, optionally combined with PCA. In addition, the method predicts road-sign positions using Kalman filtering. The implemented system yields relatively accurate results; an overall analysis and discussion are included.
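As an illustration of the Kalman-filtering step described above, and not the thesis's exact model, a constant-velocity filter for predicting a sign's image position between frames might look as follows; the state layout and noise parameters are assumptions:

```python
import numpy as np

class SignTracker:
    """Constant-velocity Kalman filter sketch for predicting a road
    sign's position between frames. State: [x, y, vx, vy];
    measurement: detected sign center [x, y]."""

    def __init__(self, q=1e-2, r=1.0):
        self.F = np.eye(4)
        self.F[0, 2] = self.F[1, 3] = 1.0          # dt = 1 frame
        self.H = np.eye(2, 4)                      # observe position only
        self.Q, self.R = q * np.eye(4), r * np.eye(2)
        self.x, self.P = np.zeros(4), 1e3 * np.eye(4)

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]                          # predicted sign center

    def update(self, z):
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)   # Kalman gain
        self.x = self.x + K @ (np.asarray(z) - self.H @ self.x)
        self.P = (np.eye(4) - K @ self.H) @ self.P
```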
|
8 |
Supervised Learning for White Matter Bundle Segmentation
Bertò, Giulia, 03 June 2020
Accurate delineation of anatomical structures in the white matter of the human brain is of paramount importance for multiple applications, such as neurosurgical planning, characterization of neurological disorders, and connectomic studies. Diffusion Magnetic Resonance Imaging (dMRI) techniques can provide, in-vivo, a mathematical representation of thousands of fibers composing such anatomical structures, in the form of 3D
polylines called streamlines. Given this representation, a task of invaluable interest is known as white matter bundle segmentation, whose aim is to virtually group together streamlines sharing a similar pathway into anatomically meaningful structures, called white matter bundles.
Obtaining a good and reliable bundle segmentation is, however, not trivial, mainly because of the intrinsic complexity of the data. Most of the current methods for bundle segmentation require extensive neuroanatomical knowledge, are time-consuming, or are not able to adapt to different data settings. To overcome these limitations, the main goal of this thesis is to develop a new automatic method for accurate white matter bundle segmentation, by exploiting, combining and extending multiple up-to-date supervised learning techniques.
The main contribution of the project is the development of a novel streamline-based bundle segmentation method based on binary linear classification, which simultaneously combines information from atlases, bundle geometries, and connectivity patterns. We prove that the proposed method reaches unprecedented quality of segmentation, and that it is robust to a multitude of diverse settings, such as differences in bundle size, tracking algorithm, and/or quality of dMRI data. In addition, we show that some of the state-of-the-art bundle segmentation methods are deeply affected by a geometrical property of the shape of the bundles to be segmented: their fractal dimension.
Important factors involved in the task of streamline classification are: (i) the need for an effective streamline distance function and (ii) the definition of a proper feature space. To this end, we compare some of the most common streamline distance functions available in the literature and provide some guidelines on their practical use for supervised bundle segmentation. Moreover, we investigate the possibility of including, in a streamline-based segmentation method, information additional to the typically employed streamline distance measure. Specifically, we provide evidence that considering anatomical information regarding the cortical terminations of the streamlines and their proximity to specific Regions of Interest (ROIs) helps to improve the results of bundle segmentation.
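One common streamline distance of the kind compared here is the mean direct-flip (MDF) distance, which accounts for the fact that a streamline has no preferred orientation. A minimal sketch, assuming both streamlines are pre-resampled to the same number of points:

```python
import numpy as np

def mdf_distance(s1, s2):
    """Mean direct-flip (MDF) distance between two streamlines.

    s1, s2: (n, 3) arrays of 3D points, resampled to the same n.
    Since a streamline can be traversed in either direction, the
    distance is the smaller of the direct and flipped mean
    pointwise Euclidean distances."""
    direct = np.mean(np.linalg.norm(s1 - s2, axis=1))
    flipped = np.mean(np.linalg.norm(s1 - s2[::-1], axis=1))
    return min(direct, flipped)
```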
Lastly, significant attention is paid to reproducibility in neuroscience. Following the FAIR (Findable, Accessible, Interoperable and Reusable) Data Principles, we have integrated our pipelines of analysis into an online open platform devoted to promoting reproducibility of scientific results and to facilitating knowledge discovery.
|
9 |
W-operator learning using linear models for both gray-level and binary inputs
Montagner, Igor dos Santos, 12 June 2017
Image Processing techniques can be used to solve a broad range of problems, such as medical imaging, document processing and object segmentation. Image operators are usually built by combining basic image operators and tuning their parameters. This requires both experience in Image Processing and trial-and-error to get the best combination of parameters. An alternative approach to designing image operators is to estimate them from pairs of training images containing examples of the expected input and their processed versions. By restricting the learned operators to those that are translation invariant and locally defined ($W$-operators), we can apply Machine Learning techniques to estimate image transformations. The shape that defines which neighbors are used is called a window. $W$-operators trained with large windows usually overfit due to the lack of sufficient training data. This issue is even more pronounced when training operators with gray-level inputs. Although approaches such as the two-level design, which combines multiple operators trained on smaller windows, partly mitigate these problems, they also require more complicated parameter determination to achieve good results. In this work we present techniques that increase the window sizes we can use and decrease the number of manually defined parameters in $W$-operator learning. The first one, KA, is based on Support Vector Machines and employs kernel approximations to estimate image transformations. We also present adequate kernels for processing binary and gray-level images. The second technique, NILC, automatically finds small subsets of operators that can be successfully combined using the two-level approach. Both methods achieve results competitive with methods from the literature in two different application domains. The first is a binary document processing problem common in Optical Music Recognition, while the second is a segmentation problem in gray-level images. The same techniques were applied without modification in both domains.
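A hedged sketch of the KA idea, kernel approximation plus a linear classifier on pixel windows, using scikit-learn's random Fourier features; the window size, kernel, and hyperparameters are illustrative, not the thesis's settings:

```python
import numpy as np
from sklearn.kernel_approximation import RBFSampler
from sklearn.linear_model import SGDClassifier

def train_w_operator(img, target, win=(5, 5), gamma=0.1, n_components=500):
    """Sketch of W-operator learning via kernel approximation.

    Each pixel's win-shaped neighborhood is a feature vector, a random
    Fourier feature map approximates an RBF kernel, and a linear
    classifier predicts the corresponding output pixel."""
    h, w = win
    ph, pw = h // 2, w // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)), mode="edge")
    # one row per pixel: the flattened neighborhood around that pixel
    X = np.stack([padded[i:i + h, j:j + w].ravel()
                  for i in range(img.shape[0])
                  for j in range(img.shape[1])])
    y = target.ravel()
    feat = RBFSampler(gamma=gamma, n_components=n_components, random_state=0)
    clf = SGDClassifier(loss="hinge", random_state=0)
    clf.fit(feat.fit_transform(X), y)
    return feat, clf   # apply with clf.predict(feat.transform(X_new))
```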
|