31

Species Discrimination and Monitoring of Abiotic Stress Tolerance by Chlorophyll Fluorescence Transients

MISHRA, Anamika January 2012 (has links)
Chlorophyll fluorescence imaging has become a versatile and standard tool in fundamental and applied plant research. This method captures time series of images of the chlorophyll fluorescence emission of whole leaves or plants under various illuminations, typically a combination of actinic light and saturating flashes. Several conventional chlorophyll fluorescence parameters have been recognized that have a physiological interpretation and are useful for, e.g., assessment of plant health status and early detection of biotic and abiotic stresses. Chlorophyll fluorescence imaging enables us to probe the performance of plants by visualizing physiologically relevant fluorescence parameters that report on the physiology and biochemistry of plant leaves. Sometimes there is a need to find the most contrasting fluorescence features/parameters in order to quantify the stress response at a very early stage of the stress treatment. Conventional fluorescence analysis utilizes well-defined single images such as F0, Fp, Fm, Fs, or arithmetic combinations of basic images such as Fv/Fm, ΦPSII, NPQ and qP. Therefore, although conventional fluorescence parameters have a physiological interpretation, they may not represent highly contrasting image sets. In order to detect the effect of stress treatments at a very early stage, advanced statistical techniques, based on classifiers and feature selection methods, have been developed to select highly contrasting chlorophyll fluorescence images out of hundreds of captured images. We combined sets of highly performing images, resulting in images with very high contrast, so-called combinatorial imaging. The application of advanced statistical methods to chlorophyll fluorescence imaging data allows us to succeed in tasks where conventional approaches do not work. This thesis aims to explore the application of conventional chlorophyll fluorescence parameters as well as advanced statistical techniques of classifiers and feature selection methods for high-throughput screening. We demonstrate the applicability of the technique in discriminating three species of the family Lamiaceae at a very early stage of their growth. Further, we show that chlorophyll fluorescence imaging can be used for measuring cold and drought tolerance of Arabidopsis thaliana and tomato plants, respectively, in a simulated high-throughput screening.
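Several of the conventional parameters named above are simple pixel-wise arithmetic over the basic fluorescence frames. A minimal sketch (assuming the frames are NumPy arrays; here they are filled with random values purely for illustration, and F0' is approximated by F0 for simplicity):

```python
import numpy as np

# Hypothetical fluorescence frames (2-D arrays, one value per pixel);
# in a real system these would come from the imaging fluorometer.
F0   = np.random.uniform(0.1, 0.2, (256, 256))  # minimal fluorescence, dark-adapted
Fm   = np.random.uniform(0.8, 1.0, (256, 256))  # maximal fluorescence, dark-adapted
Fs   = np.random.uniform(0.3, 0.5, (256, 256))  # steady-state fluorescence in light
Fm_p = np.random.uniform(0.5, 0.7, (256, 256))  # maximal fluorescence in light (Fm')

Fv_Fm    = (Fm - F0) / Fm          # maximum quantum yield of PSII photochemistry
phi_PSII = (Fm_p - Fs) / Fm_p      # effective PSII quantum yield
NPQ      = (Fm - Fm_p) / Fm_p      # non-photochemical quenching
qP       = (Fm_p - Fs) / (Fm_p - F0)  # photochemical quenching (F0' approximated by F0)
```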
32

A Grammar of Northern and Southern Gumuz

Ahland, Colleen January 2012 (has links)
Gumuz is a Nilo-Saharan dialect cluster spoken in the river valleys of northwestern Ethiopia and the southeastern part of the Republic of the Sudan. There are approximately 200,000 speakers, the majority of whom reside in Ethiopia. This study is a phonological and grammatical analysis of two main dialects/languages: Northern Gumuz and Southern Gumuz. The study provides an overview of the Gumuz people and culture, including historical accounts of the language(s) and migration patterns. Most major aspects of the language are described and analyzed in detail: phonology, nouns, pronouns, demonstratives and other noun phrase constituents, verbs and verbal morphology, noun incorporation, verbal classifiers, noun categorization, basic clauses, and subordinate clauses. Northern and Southern Gumuz varieties are contrasted throughout. Gumuz tone has two levels, High and Low, with tonal downstep of High. The tonal melody on bound pronominals on verbs indicates transitivity. Nouns are divided into two basic types: relational and absolute. Relational nouns have an inherent relationship with another nominal element, either within a noun-noun compound or with a (historical) possessive affix. Two sets of relational nouns, attributive nouns and relator nouns, obligatorily take an inherent possession suffix if not in a compound. Gumuz has two noun-noun constructions: the Associative Construction and the Attributive Construction. The first is left-headed, with 'noun of noun' semantics; the second is right-headed, with the initial noun expressing an inherent quality of the second. Certain body part terms have grammaticalized as a variety of other morphosyntactic categories, in particular as relator nouns, verbal classifiers, and class morphemes, the final two of which are noun categorization devices. Many of these same body part terms can be incorporated into the verb or form part of lexicalized verb-noun compounds. Deverbal nominalizations with /ma-/ are found throughout the language's structures. These /ma-/ nominalizations serve as both subject and object complements. They are also commonly found in other subordinate clauses, such as relative and adverbial clauses. Purpose clauses are formed with the dative preposition plus a /ma-/ nominalization. Finite purpose clauses take pronominal inflection and have further grammaticalized as future tense main clause verbs in Southern Gumuz.
33

A credit scoring model based on classifiers consensus system approach

Ala'raj, Maher A. January 2016 (has links)
Managing customer credit is an important issue for every commercial bank; therefore, banks take great care when dealing with customer loans to avoid improper decisions that can lead to loss of opportunity or financial losses. Manual estimation of customer creditworthiness has become both time- and resource-consuming. Moreover, a manual approach is subjective (dependent on the bank employee making the estimation), which is why devising and implementing programmed models that provide loan estimations is the only way of eradicating the 'human factor' in this problem. Such a model should recommend to the bank whether or not a loan should be given, or give a probability that the loan will be repaid. Many models have been designed to date, but there is no ideal classifier among them, since each gives some percentage of incorrect outputs; this is a critical consideration when each percentage point of incorrect answers can mean millions of dollars of losses for large banks. Nevertheless, logistic regression (LR) remains the industry-standard tool for developing credit-scoring models. For this purpose, an investigation is carried out into combining the most efficient classifiers in the credit-scoring domain, in an attempt to produce a classifier that exceeds each of its component classifiers. In this work, a fusion model referred to as 'the Classifiers Consensus Approach' is developed, which performs considerably better than each of the single classifiers that constitute it. The difference between the consensus approach and the majority of other combiners lies in the fact that the consensus approach models the behaviour of a real expert group during the process of finding a consensus (aggregate) answer. The consensus model is compared not only with single classifiers, but also with traditional combiners and a rather complex combiner known as the 'Dynamic Ensemble Selection' approach. As pre-processing techniques, data filtering (selecting training entries that fit the input data well and removing outliers and noisy data) and feature selection (removing useless and statistically insignificant features whose values are weakly correlated with the real quality of the loan) are used. These techniques significantly improve the results of the consensus approach. The results clearly show that the consensus approach is statistically better (at the 95% confidence level, according to the Friedman test) than any other single classifier or combiner analysed; this means that, for similar datasets, there is a 95% guarantee that the consensus approach will outperform all other classifiers. The consensus approach gives not only the best accuracy, but also better AUC values, Brier scores and H-measures for almost all datasets investigated in this thesis. Moreover, it outperformed logistic regression. Thus, the use of the consensus approach for credit scoring has been shown to be justified and can be recommended for commercial banks. Alongside the consensus approach, the dynamic ensemble selection approach is analysed; the results show that, under some conditions, it can rival the consensus approach. Its strengths include stability and high accuracy on various datasets.
The consensus approach, as improved in this work, may be considered by banks whose data share the characteristics of the datasets used in this work, where its use could decrease both the level of mistakenly rejected loans of solvent customers and the level of mistakenly accepted loans that will never be repaid. Furthermore, the consensus approach is a notable step towards building a universal classifier that can fit data with any structure. Another advantage of the consensus approach is its flexibility: even if the input data change for various reasons, the consensus approach can easily be re-trained and used with the same performance.
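The thesis's consensus algorithm models the dynamics of a real expert group; as a rough illustration of that general idea only (not the exact method, whose details are not given in the abstract; the function name and the halfway-concession rule are assumptions), the following sketch iteratively pulls each classifier's predicted repayment probability toward the weighted group mean until the "experts" agree:

```python
import numpy as np

def consensus_predict(prob_preds, weights=None, tol=1e-4, max_iter=50):
    """Toy consensus aggregation: each classifier repeatedly moves its
    'opinion' (predicted repayment probability) halfway toward the
    weighted group mean until the opinions converge.
    prob_preds: array of shape (n_classifiers, n_samples)."""
    opinions = np.asarray(prob_preds, dtype=float)
    n = opinions.shape[0]
    w = np.full(n, 1.0 / n) if weights is None else np.asarray(weights, float)
    w = w / w.sum()
    for _ in range(max_iter):
        group_mean = w @ opinions              # weighted aggregate opinion
        shift = 0.5 * (group_mean - opinions)  # each expert concedes halfway
        opinions = opinions + shift
        if np.abs(shift).max() < tol:
            break
    return w @ opinions                        # final consensus probability

# Example: three classifiers scoring two loan applicants
preds = [[0.9, 0.2], [0.7, 0.4], [0.8, 0.1]]
print(consensus_predict(preds))
```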
34

Comparação de métodos de classificação de imagens, visando o gerenciamento de áreas citrícolas / Comparison of image classification methods for the management of citrus-growing areas

Barbosa, Ana Paula [UNESP] 14 August 2009 (has links) (PDF)
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) / Given the significant importance of citrus farming in the State of São Paulo, these areas require constant monitoring to support quick and comprehensive decision-making aimed at maintaining this activity. Even with the extraordinary human ability to analyze and interpret remote sensing data, analysts tend toward subjectivity when registering the information observed in images, and their knowledge of the study area is often limited, which makes classification a task demanding considerable time and effort to identify the objects represented on the surface. Image classification is the process of extracting information to recognize patterns and homogeneous objects; in remote sensing it is used to map areas of the Earth's surface corresponding to themes of interest. This study aimed to compare the efficiency of orbital image classification methods in areas cultivated with citrus, using geoprocessing techniques for the localized planning and management of citrus production areas. The study area is Água Branca Farm, in the municipality of Bariri/SP. The GIS package Idrisi 15.0 was used to process LANDSAT-5 TM images (orbit/point 221/75, acquired on 16 June 2003 and 26 May 2007).
The evaluation of classification accuracy gave satisfactory results: the CLUSTER algorithm produced classifications of excellent (0.9276) and very good (0.6485) quality; the MAXVER algorithm produced excellent classifications, with kappa values of 0.8338 and 0.8818; and the index obtained by the Fuzzy classification method yielded ratings between very good (0.7260) and good (0.5235) for 2003 and 2007, respectively. The methods used to discriminate areas cultivated with citrus showed different efficiencies in image classification. In general, the classifications of the 2003 images showed the best performance when compared... (Complete abstract click electronic access below)
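The kappa coefficients reported above are computed from the classification's confusion matrix. A minimal sketch of Cohen's kappa (the matrix below is invented for illustration; the thesis's actual matrices are not reproduced in the abstract):

```python
import numpy as np

def cohens_kappa(cm):
    """Cohen's kappa from a confusion matrix
    (rows: reference classes, columns: classified classes)."""
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    po = np.trace(cm) / total                           # observed agreement
    pe = (cm.sum(axis=0) @ cm.sum(axis=1)) / total**2   # chance agreement
    return (po - pe) / (1.0 - pe)

# Hypothetical 3-class confusion matrix (e.g., citrus, pasture, other)
cm = [[50, 3, 2],
      [4, 40, 6],
      [1, 5, 39]]
print(round(cohens_kappa(cm), 4))
```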
35

"Combinação de classificadores simbólicos para melhorar o poder preditivo e descritivo de Ensembles" / Combination of symbolic classifiers to improve predictive and descriptive power of ensembles

Flávia Cristina Bernardini 17 May 2002 (has links)
The quality of the hypotheses induced by current machine learning systems depends mainly on the quantity and quality of the features and examples used in training. Frequently, experiments on large databases containing many irrelevant features yield hypotheses of low precision. On the other hand, many well-known machine learning systems are not prepared to handle very large numbers of examples. Thus, one of the most active research areas in machine learning concerns techniques able to extend the capacity of learning algorithms to process large numbers of training examples, features and classes. Two approaches can be used to learn concepts from large databases: the first selects the most relevant examples and features, and the second is the ensemble approach. An ensemble is a set of classifiers whose individual decisions are combined in some way to classify a new case. Although ensembles classify new examples better than each individual classifier, they behave like black boxes, offering the user no explanation of the classifications they produce. The purpose of this work is to propose a form of combining symbolic classifiers, that is, classifiers induced by symbolic machine learning algorithms in which knowledge is described as if-then rules or equivalents, in order to work with large databases. The proposal is as follows: a large database is randomly divided into small databases that can feasibly be supplied to one or several symbolic machine learning algorithms; afterwards, the rules constituting the induced classifiers are combined into a single classifier. To analyze the viability of this proposal, a system called RuleSystem was implemented in the logic programming language Prolog, with two purposes: (a) to evaluate knowledge rules induced by symbolic machine learning algorithms, implemented by the Rule Analysis Module, and (b) to evaluate several forms of combining symbolic classifiers as well as to explain the classification of new examples produced by an ensemble of symbolic classifiers, implemented by the Combination and Explanation Module. These constitute the principal modules of RuleSystem. This work describes the ensemble construction and classifier combination methods found in the literature; the design and documentation of RuleSystem and the methodology developed to document it; the implementation of the Combination and Explanation Module, the object of study of this work; and two case studies using this module. The first case study used an artificial database, which made it possible to identify needed modifications and improve several of the heuristics used by the Combination and Explanation Module; the second case study used a real database.
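As an illustration of the partition-and-combine idea described above (a sketch only, not the RuleSystem implementation, which combines symbolic if-then rules in Prolog), the following uses decision trees, whose root-to-leaf paths are equivalent to if-then rules:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Split a large dataset into disjoint chunks, induce one symbolic
# classifier per chunk, then combine their decisions by majority vote.
X, y = make_classification(n_samples=3000, n_features=10, random_state=0)
chunks = np.array_split(np.arange(len(X)), 5)   # five "small databases"

classifiers = []
for idx in chunks:
    clf = DecisionTreeClassifier(max_depth=4, random_state=0)
    clf.fit(X[idx], y[idx])        # each root-to-leaf path is an if-then rule
    classifiers.append(clf)

def combined_predict(x):
    """Majority vote over the per-chunk classifiers; each tree's decision
    path doubles as a human-readable explanation of its vote."""
    votes = [clf.predict(x.reshape(1, -1))[0] for clf in classifiers]
    return max(set(votes), key=votes.count)

print(combined_predict(X[0]), y[0])
```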
36

Performance Envelopes of Adaptive Ensemble Data Stream Classifiers

Joe-Yen, Stefan 01 January 2017 (has links)
This dissertation documents a study of the performance characteristics of algorithms designed to mitigate the effects of concept drift on online machine learning. Several supervised binary classifiers were evaluated on their performance when applied to an input data stream with a non-stationary class distribution. The selected classifiers included ensembles that combine the contributions of their member algorithms to improve overall performance. These ensembles adapt to changing class definitions, known as "concept drift" and often present in real-world situations, by adjusting the relative contributions of their members. Three stream classification algorithms and three adaptive ensemble algorithms were compared to determine the capabilities of each in terms of accuracy and throughput. For each run of the experiment, the percentage of correct classifications was measured using prequential analysis, a well-established methodology in the evaluation of streaming classifiers. Throughput was measured in classifications performed per second, as timed by the CPU clock. Two main experimental variables were manipulated to investigate and compare the range of accuracy and throughput exhibited by each algorithm under various conditions: the number of attributes in the instances to be classified and the speed at which the definitions of labeled data drifted were varied across six combinations of drift speed and dimensionality. The implications of the results are used to recommend improved methods for working with stream-based data sources. The typical approach to counteracting concept drift is to update the classification models with new data. In the stream paradigm, classifiers are continuously exposed to new data that may serve as representative examples of the current situation. However, updating the ensemble classifier in order to maintain or improve accuracy can be computationally costly and will negatively impact throughput; in a real-time system, this could lead to an unacceptable slow-down. The results of this research showed that, among several algorithms for reducing the effect of concept drift, adaptive decision trees maintained the highest accuracy without slowing down with respect to the no-drift condition. Adaptive ensemble techniques were also able to maintain reasonable accuracy in the presence of drift without much change in throughput. However, the overall throughput of the adaptive methods is low and may be unacceptable for extremely time-sensitive applications. The performance visualization methodology utilized in this study gives a clear and intuitive visual summary that allows system designers to evaluate candidate algorithms with respect to their performance needs.
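Prequential ("test-then-train") evaluation interleaves testing and training: each arriving instance is first classified and scored, and only then used to update the model. A minimal sketch on a synthetic stream with one abrupt drift (an illustration under assumed data, not the dissertation's experimental setup; real studies typically use a stream framework such as MOA):

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
clf = SGDClassifier(loss="log_loss")   # incremental linear classifier
correct, seen = 0, 0

for t in range(10_000):
    x = rng.normal(size=(1, 5))
    drift = t > 5_000                  # abrupt concept drift at t = 5000
    label = int(x[0, 0] + (x[0, 1] if drift else -x[0, 1]) > 0)
    if seen > 0:                       # test first...
        correct += int(clf.predict(x)[0] == label)
    clf.partial_fit(x, [label], classes=[0, 1])   # ...then train
    seen += 1

print(f"prequential accuracy: {correct / (seen - 1):.3f}")
```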
37

Interpretability for Deep Learning Text Classifiers

Lucaci, Diana 14 December 2020 (has links)
The ubiquitous presence of automated decision-making systems whose performance is comparable to that of humans has brought attention to the necessity of interpretability for the generated predictions. Whether the goal is predicting the system's behavior when the input changes, building user trust, or assisting experts in improving machine learning methods, interpretability is paramount when the problem is not sufficiently validated in real applications and when unacceptable results lead to significant consequences. While for humans there are no standard interpretations for the decisions they make, the complexity of systems with advanced information-processing capacities conceals the detailed explanations for individual predictions, encapsulating them under layers of abstraction and complex mathematical operations. Interpretability for deep learning classifiers thus becomes a challenging research topic, where the ambiguity of the problem statement allows for multiple exploratory paths. Our work focuses on generating natural language interpretations for individual predictions of deep learning text classifiers. We propose a framework for extracting and identifying the phrases of the training corpus that influence the prediction confidence the most, through unsupervised key phrase extraction and neural predictions. We assess the contribution that the added justification makes when the deep learning model predicts the class probability of a text instance, by introducing and defining a contribution metric that allows one to quantify the fidelity of the explanation to the model. We assess both the performance impact of the proposed approach on the classification task, as quantitative analysis, and the quality of the generated justifications, through extensive qualitative and error analysis. This methodology manages to capture the most influential phrases of the training corpus as explanations that reveal the linguistic features used for individual test predictions, allowing humans to predict the behavior of the deep learning classifier.
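One simple way to quantify how much a phrase influences a prediction, in the spirit of the contribution metric described above (the thesis's exact definition is not reproduced here, and the model interface below is hypothetical), is an occlusion test:

```python
# Illustrative occlusion-style contribution score (an assumption, not
# necessarily the thesis's exact metric): how much does the predicted
# class probability drop when a candidate key phrase is removed?
def phrase_contribution(model, text, phrase, target_class):
    """model.predict_proba is assumed to map a list of texts to an
    array of class probabilities (hypothetical interface)."""
    base = model.predict_proba([text])[0][target_class]
    ablated = model.predict_proba([text.replace(phrase, "")])[0][target_class]
    return base - ablated   # large positive value = influential phrase
```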
38

Deaf children's understanding of the language of motion and location in ASL

Conlin-Luippold, Frances 09 November 2015 (has links)
Understanding how a language expresses the existence and action of an entity represents a critical juncture in the development of cognition and the development of language. For deaf children learning a sign language, verbs of motion and location exemplify this critical juncture: these are complex structures that convey substantial morphological, syntactic and semantic information. This dissertation investigated deaf children's understanding of linguistic representations of motion events as presented in a variety of verbs of motion and location in American Sign Language. The sample for this investigation consisted of 350 deaf children (of Deaf and hearing parents) enrolled in schools for the deaf in the United States. The subjects, who ranged in age from 4 to 18, were administered the Real Objects and Plurals Arrangement Task (ROPL) of the American Sign Language Assessment Instrument (ASLAI). The following research questions were addressed: (1) To what extent do deaf children understand each of the features of motion events (figure, ground, motion, path, manner, cause) expressed in verbs of motion and location? (2) What (if any) is the implicational structure of these features in the course of acquisition? (3) What role does exposure (i.e., early vs. late input) play in the acquisition of these features? Do age and parental hearing status influence the acquisition of these features? (4) Is there any difference in how deaf children learn to understand events in verbs of motion compared to verbs of location? Results revealed that deaf children's understanding of motion event features follows a sequential process, with features such as motion and figure being acquired in the earliest stages and path and ground being acquired later. Moreover, both age and length of exposure to a signed language influenced this acquisition process. These findings suggest that for deaf children, the acquisition of motion event structure in verbs of motion and location is a multifaceted process that is dependent on several factors.
39

FUZZY CLASSIFIERS FOR IMBALANCED DATA SETS

VISA, SOFIA 08 October 2007 (has links)
No description available.
40

Contributions to Ensembles of Models for Predictive Toxicology Applications. On the Representation, Comparison and Combination of Models in Ensembles.

Makhtar, Mokhairi January 2012 (has links)
The increasing variety of data mining tools offers a large palette of types and representation formats for predictive models. Managing the models then becomes a big challenge, as does reusing them and keeping model and data repositories consistent; without this, sustainable access to these models and assessment of their quality remain limited for researchers. The Data and Model Governance (DMG) approach makes it easier to process and support complex solutions. In this thesis, contributions are proposed towards ensembles of models, with a focus on model representation, comparison and usage. Predictive toxicology was chosen as an application field to demonstrate the proposed approach of representing predictive models linked to data for DMG. Methods for further analysis, such as comparing and combining predictive models in order to reuse models from a collection, were also studied. Thus, this thesis proposes an original structure for the pool of models, called Predictive Toxicology Markup Language (PTML), to represent predictive toxicology models. PTML offers a representation scheme for predictive toxicology data and for models generated by data mining tools. The proposed representation makes it possible to compare models and to select the relevant ones based on different performance measures, using the proposed similarity-measuring techniques. The relevant models were selected using a proposed cost function that is a composite of performance measures: Accuracy (Acc), False Negative Rate (FNR) and False Positive Rate (FPR). The cost function ensures that only quality models are selected as candidate models for an ensemble. The proposed algorithm for optimising and combining the Acc, FNR and FPR of ensemble models, using the double-fault measure as the diversity measure, improves Acc by between 0.01 and 0.30 for all toxicology data sets compared to other ensemble methods such as Bagging, Stacking, Bayes and Boosting. The highest improvements in Acc were for the Bee (0.30), Oral Quail (0.13) and Daphnia (0.10) data sets. A small improvement in Acc (of about 0.01) was achieved for Dietary Quail and Trout. Combining all three performance measures also reduced the distance between FNR and FPR by about 0.17 to 0.28 for the Bee, Daphnia, Oral Quail and Trout data sets. For the Dietary Quail data set the improvement was only about 0.01, but this data set is well known as a difficult learning exercise. For the five UCI data sets tested, similar results were achieved, with Acc improvements between 0.10 and 0.11, further closing the gap between FNR and FPR. In conclusion, the results show that by combining performance measures (Acc, FNR and FPR) as proposed in this thesis, Acc increases and the distance between FNR and FPR decreases.
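The abstract names two ingredients without giving formulas: a composite cost over Acc, FNR and FPR, and the double-fault diversity measure between a pair of classifiers. A minimal sketch of both (the weighting in the cost function is an assumption; the thesis's exact composite is not reproduced here):

```python
import numpy as np

def cost(acc, fnr, fpr, w=(1.0, 1.0, 1.0)):
    """Illustrative composite cost: penalise inaccuracy and both error
    rates. Lower is better; the weights w are assumed, not the thesis's."""
    return w[0] * (1 - acc) + w[1] * fnr + w[2] * fpr

def double_fault(pred_a, pred_b, y):
    """Double-fault diversity: fraction of cases both classifiers get
    wrong. Lower values mean less overlap in errors, i.e. more diversity."""
    pred_a, pred_b, y = map(np.asarray, (pred_a, pred_b, y))
    return np.mean((pred_a != y) & (pred_b != y))

y      = np.array([0, 1, 1, 0, 1, 0])
pred_a = np.array([0, 1, 0, 0, 1, 1])
pred_b = np.array([0, 0, 0, 0, 1, 1])
print(double_fault(pred_a, pred_b, y))  # 0.333... (2 shared errors out of 6)
```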
