Global ETD Search

141	Geographic information science: contribution to understanding salt and sodium affected soils in the Senegal River valley Ndiaye, Ramatoulaye January 1900 Doctor of Philosophy / Department of Geography / John A. Harrington Jr / The Senegal River valley and delta (SRVD) are affected by long-term climate variability. Indicators of these climatic shifts include a rainfall deficit, warmer temperatures, sea level rise, floods, and drought. These shifts have led to environmental degradation, water deficits, and profound effects on human life and activities in the area. Geographic Information Science (GIScience), including satellite-based remote sensing methods, offers several advantages over conventional ground-based methods used to map and monitor salt-affected soil (SAS) features. This study was designed to assess the accuracy of information on soil salinization extracted from Landsat satellite imagery. Would available imagery and GIScience data analysis make it possible to discriminate natural soil salinization from soil sodication and to characterize the SAS trend and pattern over 30 years? A set of Landsat MSS (June 1973 and September 1979), Landsat TM (November 1987, April 1994 and November 1999) and ETM+ (May 2001 and March 2003) images was used to map and monitor the distribution of salt-impacted soil. Supervised classification, unsupervised classification and post-classification change detection methods were used. Supervised classifications of the May 2001 and March 2003 images were made in conjunction with field data characterizing soil surface chemical characteristics that included exchangeable sodium percentage (ESP), cation exchange capacity (CEC) and electrical conductivity (EC). With this supervised information extraction method, the distribution of three different types of SAS (saline, saline-sodic, and sodic) was mapped with an accuracy of 91.07% for the 2001 image and 73.21% for the 2003 image. 
Change detection results confirmed a decreasing trend in non-saline and saline soil and an increase in saline-sodic and sodic soil. All seven Landsat images were subjected to the unsupervised classification method which resulted in maps that separate SAS according to their degree of salinity. The spatial distribution of sodic and saline-sodic soils has a strong relationship with the area of irrigated rice crop management. This study documented that human-induced salinization is progressively replacing natural salinization in the SRVD. These pedologic parameters obtained using GIScience remote sensing techniques can be used as a scientific tool for sustainable management and to assist with the implementation of environmental policy. Salt affected soil GIScience Supervised classification Change detection Senegal River valley Geography (0366)
142	A Semi-Supervised Predictive Model to Link Regulatory Regions to Their Target Genes Hafez, Dina Mohamed January 2015 Next generation sequencing technologies have provided us with a wealth of data profiling a diverse range of biological processes. In an effort to better understand the process of gene regulation, two predictive machine learning models specifically tailored for analyzing gene transcription and polyadenylation are presented. Transcriptional enhancers are specific DNA sequences that act as "information integration hubs" to confer regulatory requirements on a given cell. These non-coding DNA sequences can regulate genes from long distances, or across chromosomes, and their relationships with their target genes are not limited to one-to-one. With thousands of putative enhancers and fewer than 14,000 protein-coding genes, detecting enhancer-gene pairs becomes a very complex machine learning and data analysis challenge. In order to predict these specific sequences and link them to the genes they regulate, we developed McEnhancer. Using DNaseI sensitivity data and annotated in-situ hybridization gene expression clusters, McEnhancer builds interpolated Markov models to learn enriched sequence content of known enhancer-gene pairs and predicts unknown interactions in a semi-supervised learning algorithm. Classification of predicted relationships was 73-98% accurate for gene sets with varying levels of initial known examples. Predicted interactions showed a great overlap when compared to Hi-C identified interactions. Enrichment of known functionally related TF binding motifs and enhancer-associated histone modification marks, along with the corresponding developmental time point, was highly evident.
On the other hand, pre-mRNA cleavage and polyadenylation is an essential step for 3'-end maturation and subsequent stability and degradation of mRNAs. This process is highly controlled by cis-regulatory elements surrounding the cleavage site (polyA site), which are frequently constrained by sequence content and position. More than 50% of human transcripts have multiple functional polyA sites, and the specific use of alternative polyA sites (APA) results in isoforms with variable 3'-UTRs, thus potentially affecting gene regulation. Elucidating the regulatory mechanisms underlying differential polyA preferences in multiple cell types has been hindered by the lack of appropriate tests for determining APAs with significant differences across multiple libraries. We specified a linear effects regression model to identify tissue-specific biases indicating regulated APA; the significance of differences between tissue types was assessed by an appropriately designed permutation test. This combination allowed us to identify highly specific subsets of APA events in the individual tissue types. Predictive kernel-based SVM models successfully classified constitutive polyA sites from a biologically relevant background (auROC = 99.6%), as well as tissue-specific regulated sets from each other. The main cis-regulatory elements described for polyadenylation were found to be a strong, and highly informative, hallmark for constitutive sites only. Tissue-specific regulated sites were found to contain other regulatory motifs, with the canonical PAS signal being nearly absent at brain-specific sites. We applied this model to data on SRp20, an RNA-binding protein that might be involved in oncogene activation, and obtained interesting insights.
Together, these two models contribute to the understanding of enhancers and the key role they play in regulating tissue-specific expression patterns during development, and provide a better understanding of the diversity of post-transcriptional gene regulation in multiple tissue types. / Dissertation Computer science Bioinformatics Gene regulation Interpolated Markov model Machine learning Semi-supervised learning SVM Transcriptional enhancers
143	Graph-based approaches for semi-supervised and cross-domain sentiment analysis Ponomareva, Natalia January 2014 The rapid development of Internet technologies has resulted in a sharp increase in the number of Internet users who create content online. User-generated content often represents people's opinions, thoughts, speculations and sentiments and is a valuable source of information for companies, organisations and individual users. This has led to the emergence of the field of sentiment analysis, which deals with the automatic extraction and classification of sentiments expressed in texts. Sentiment analysis has been intensively researched over the last ten years, but there are still many issues to be addressed. One of the main problems is the lack of labelled data necessary to carry out precise supervised sentiment classification. In response, research has moved towards developing semi-supervised and cross-domain techniques. Semi-supervised approaches still need some labelled data and their effectiveness is largely determined by the amount of these data, whereas cross-domain approaches usually perform poorly if training data are very different from test data. The majority of research on sentiment classification deals with the binary classification problem, although for many practical applications this rather coarse sentiment scale is not sufficient. Therefore, it is crucial to design methods which are able to perform accurate multiclass sentiment classification. The aims of this thesis are to address the problem of limited availability of data in sentiment analysis and to advance research in semi-supervised and cross-domain approaches for sentiment classification, considering both binary and multiclass sentiment scales. We adopt graph-based learning as our main method and explore the most popular and widely used graph-based algorithm, label propagation. 
We investigate various ways of designing sentiment graphs and propose a new similarity measure which is unsupervised, easy to compute, does not require deep linguistic analysis and, most importantly, provides a good estimate for sentiment similarity as proved by intrinsic and extrinsic evaluations. The main contribution of this thesis is the development and evaluation of a graph-based sentiment analysis system that a) can cope with the challenges of limited data availability by using semi-supervised and cross-domain approaches b) is able to perform multiclass classification and c) achieves highly accurate results which are superior to those of most state-of-the-art semi-supervised and cross-domain systems. We systematically analyse and compare semi-supervised and cross-domain approaches in the graph-based framework and propose recommendations for selecting the most pertinent learning approach given the data available. Our recommendations are based on two domain characteristics, domain similarity and domain complexity, which were shown to have a significant impact on semi-supervised and cross-domain performance. 004.678
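Illustrative note on entry 143: the abstract names label propagation as its core algorithm. The block below is a minimal sketch of generic label propagation on a weighted graph, not the thesis's system. The function name `label_propagation`, the dense `weights` matrix, and the `seeds` mapping are assumptions made for illustration, and the thesis's own sentiment similarity measure is not reproduced here.

```python
def label_propagation(weights, seeds, n_classes, iterations=100):
    """Propagate class scores over a weighted graph, clamping labelled nodes.

    weights: symmetric n x n list of lists of non-negative edge weights
             (hypothetical input; any similarity measure could fill it).
    seeds:   dict mapping node index -> known class index.
    Returns the most likely class index for every node.
    """
    n = len(weights)
    # Unlabelled nodes start from a uniform distribution over classes.
    scores = [[1.0 / n_classes] * n_classes for _ in range(n)]
    for node, cls in seeds.items():
        scores[node] = [1.0 if c == cls else 0.0 for c in range(n_classes)]

    for _ in range(iterations):
        updated = []
        for i in range(n):
            total = sum(weights[i])
            if i in seeds or total == 0:
                # Labelled nodes stay clamped; isolated nodes keep their scores.
                updated.append(scores[i])
                continue
            # Each node takes the weighted average of its neighbours' scores.
            updated.append([
                sum(weights[i][j] * scores[j][c] for j in range(n)) / total
                for c in range(n_classes)
            ])
        scores = updated

    return [max(range(n_classes), key=lambda c: row[c]) for row in scores]
```

In this sketch, documents would be the nodes and pairwise sentiment similarity the edge weights; labelled documents act as seeds, so the same routine could serve a semi-supervised setting (seeds from the target domain) or a cross-domain one (seeds from a source domain).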
144	Active learning in cost-sensitive environments Liu, Alexander Yun-chung 21 June 2010 Active learning techniques aim to reduce the amount of labeled data required for a supervised learner to achieve a certain level of performance. This can be very useful in domains where unlabeled data is easy to obtain but labelling data is costly. In this dissertation, I introduce methods of creating computationally efficient active learning techniques that handle different misclassification costs, different evaluation metrics, and different label acquisition costs. This is accomplished in part by developing techniques from utility-based data mining typically not studied in conjunction with active learning. I first address supervised learning problems where labeled data may be scarce, especially for one particular class. I revisit claims about resampling, a particularly popular approach to handling imbalanced data, and cost-sensitive learning. The presented research shows that while resampling and cost-sensitive learning can be equivalent in some cases, the two approaches are not identical. This work on resampling and cost-sensitive learning motivates a need for active learners that can handle different misclassification costs. After presenting a cost-sensitive active learning algorithm, I show that this algorithm can be combined with a proposed framework for analyzing evaluation metrics in order to create an active learning approach that can optimize any evaluation metric that can be expressed as a function of terms in a confusion matrix. Finally, I address methods for active learning in terms of different utility costs incurred when labeling different types of points, particularly when label acquisition costs are spatially driven. / text Active learning Labeled data Supervised learners Utility-based data mining Resampling Cost-sensitive learning Label acquisition
145	Clustering Via Supervised Support Vector Machines Merat, Sepehr 07 August 2008 An SVM-based clustering algorithm is introduced that clusters data with no a priori knowledge of input classes. The algorithm initializes by first running a binary SVM classifier against a data set with each vector in the set randomly labeled. Once this initialization step is complete, the SVM confidence parameters for classification on each of the training instances can be accessed. The lowest confidence data (e.g., the worst of the mislabeled data) then has its labels switched to the other class label. The SVM is then re-run on the data set (with partly re-labeled data). The repetition of the above process improves the separability until there is no misclassification. Variations on this type of clustering approach are shown. clustering machine learning pattern recognition support vector machines supervised learning unsupervised learning
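Illustrative note on entry 145: the loop described in the abstract (random labels, train a binary SVM, flip the labels of the worst-classified points, retrain, stop when nothing is misclassified) can be sketched as follows. This is a rough sketch under stated assumptions, not the thesis's implementation: the SVM here is a small linear one trained by plain subgradient descent, the signed decision value stands in for the "confidence parameter", and the names `train_linear_svm`, `svm_cluster`, `flip_fraction` and `max_rounds` are hypothetical.

```python
import random


def train_linear_svm(points, labels, epochs=200, lam=0.01, step=0.1):
    """Fit a linear SVM (hinge loss + L2 penalty) by batch subgradient descent."""
    dim = len(points[0])
    n = len(points)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1.0:  # point violates the margin: hinge subgradient
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wi - step * gi for wi, gi in zip(w, grad_w)]
        b -= step * grad_b
    return w, b


def svm_cluster(points, flip_fraction=0.25, max_rounds=50, seed=0):
    """Cluster points into two groups by repeatedly retraining an SVM."""
    rng = random.Random(seed)
    labels = [rng.choice((-1, 1)) for _ in points]  # random initial labelling
    for _ in range(max_rounds):
        if len(set(labels)) < 2:
            break  # every point ended up in one class; nothing left to train on
        w, b = train_linear_svm(points, labels)
        # Signed margin: negative means the SVM disagrees with the current label.
        margins = [
            y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            for x, y in zip(points, labels)
        ]
        wrong = sorted((i for i, m in enumerate(margins) if m < 0),
                       key=lambda i: margins[i])
        if not wrong:
            break  # no misclassification left: labels are linearly separable
        # Flip the worst of the mislabeled points, then retrain.
        for i in wrong[:max(1, int(flip_fraction * len(wrong)))]:
            labels[i] = -labels[i]
    return labels
```

Calling `svm_cluster` on a list of coordinate tuples returns one label in {-1, +1} per point. The `max_rounds` cap and the single-class check are safeguards added for this sketch; the abstract does not say how the original algorithm handles non-convergence or a collapse into one cluster.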
146	Individual Differences in Adolescents’ Driving Practice during the Learner Stage Zhao, Yinan 13 May 2016 The implementation of Graduated Driver Licensing (GDL) policies has reduced the rate of car crashes among adolescents. However, limited research has focused on adolescents’ supervised driving during the learner permit stage of GDL. The study aimed to describe supervised driving practice during the learner permit stage and to test predictors of individual differences in the amount and the quality of supervised driving. 183 adolescents (M age = 16.4 years, 54.1% female) and their parents (84.1% mothers) participated. Adolescents reported driving an average of 25 minutes per day. Adolescents living in single-parent households, with less family income, and with a stronger motivation to drive reported more daily driving. Adolescents with a stronger motivation to drive reported driving in more settings. Discussion focuses on implications for developing effective driving-specific parenting strategies and helping to enrich adolescents’ supervised driving experiences. Adolescent drivers learner permit stage supervised driving Applied Behavior Analysis Personality and Social Contexts Psychology
147	Data Driven Visual Recognition Aghazadeh, Omid January 2014 This thesis is mostly about supervised visual recognition problems. Based on a general definition of categories, the contents are divided into two parts: one which models categories and one which is not category based. We are interested in data driven solutions for both kinds of problems. In the category-free part, we study novelty detection in temporal and spatial domains as a category-free recognition problem. Using data driven models, we demonstrate that based on a few reference exemplars, our methods are able to detect novelties in ego-motions of people, and changes in the static environments surrounding them. In the category level part, we study object recognition. We consider both object category classification and localization, and propose scalable data driven approaches for both problems. A mixture of parametric classifiers, initialized with a sophisticated clustering of the training data, is demonstrated to adapt to the data better than various baselines such as the same model initialized with less subtly designed procedures. A nonparametric large margin classifier is introduced and demonstrated to have a multitude of advantages in comparison to its competitors: better training and testing time costs, the ability to make use of indefinite/invariant and deformable similarity measures, and adaptive complexity are the main features of the proposed model. We also propose a rather realistic model of recognition problems, which quantifies the interplay between representations, classifiers, and recognition performances. Based on data-describing measures which are aggregates of pairwise similarities of the training data, our model characterizes and describes the distributions of training exemplars. The measures are shown to capture many aspects of the difficulty of categorization problems and correlate significantly with the observed recognition performances. 
Utilizing these measures, the model predicts the performance of particular classifiers on distributions similar to the training data. These predictions, when compared to the test performance of the classifiers on the test sets, are reasonably accurate. We discuss various aspects of visual recognition problems: what is the interplay between representations and classification tasks, how can different models better adapt to the training data, etc. We describe and analyze the aforementioned methods that are designed to tackle different visual recognition problems, but share one common characteristic: being data driven. / QC 20140604 Visual Recognition Data Driven Supervised Learning Mixture Models Non-Parametric Models Category Recognition Novelty Detection
148	"Tem alguém vendo": Visitas monitoradas em varas de família sob a perspectiva de operadores do direito, psicólogas judiciárias e familiares / "Someone is watching": supervised visitation in family courts through the perspective of legal operators, judicial psychologists and families Zugman, Maiana Jugend 24 June 2019 Maintaining children's contact with both parents after a marital breakup is a theme discussed in many texts, documents, and national and international laws. However, preserving these relationships is a complex task in the Family Courts, given the litigious separations and divorces brought before them. In addition to parental conflicts, different reasons may cause distance between children and the parent with whom they do not reside, such as allegations of violence, especially sexual violence, against the child, the child's refusal to see the noncustodial parent, and difficulties imposed by the custodial parent regarding contact. In more severe cases, contact can be regulated by the court in the form of supervised visitation, i.e., in the presence of a third party, in order to preserve the bond between parents or other family members and children and/or adolescents and, at the same time, ensure the protection of these children and adolescents. In our research context, the meetings take place inside the courthouses and are supervised by judicial psychologists. These professionals, however, act without a technical or theoretical basis, given the shortage of Psychology courses in Brazil that offer the discipline of Legal Psychology, the scarcity of specific national literature on supervised visitation, and the lack of an adequate structure, including supervision and case discussion, that would allow these professionals to systematize the practice.
The aim of the present study was to understand the meaning and function of court-ordered supervised visitation in the Family Courts for legal operators (judges, prosecutors and lawyers), judicial psychologists and family members. We collected data through semi-structured psychological interviews, using the hermeneutic method (Mandelbaum, 2012), with 18 participants: four magistrates, two prosecutors, one lawyer, eight judicial psychologists, two fathers and one mother. The interviews were transcribed and, during transcription, we identified common themes in the respondents' accounts, which led to the creation of 18 categories of analysis. Starting from these, we carried out an extensive international literature review in order to learn about the practice of supervised visitation around the world. We found a large amount of material published in Europe, Oceania, North America and Israel, whose experiences show some differences from those in Brazil but also many similarities, such as: the variety of terminology used to designate supervised visitation; a diversity of practices and formats of work; problems of communication between the courts and those who supervise the meetings; and difficulty in clearly defining the technique and the role of the professional in the supervised visits.
The analysis of the interviews was consistent with the themes found abroad, which allowed theory and practice to be linked and revealed the limits of the supervised visitation procedure in meeting the demands of the complex family conflicts that routinely reach the Family Courts. We conclude that it is important to make integrated services feasible, offering a network of care and support to families in litigation, whose needs exceed the capacity and even the objective of supervised visitation. Family courts Legal psychology Psicologia jurídica Supervised visitation Varas de família Visitas monitoradas
149	Apprentissage à partir du mouvement / Learning from motion Tokmakov, Pavel 04 June 2018 Weakly-supervised learning studies the problem of minimizing the amount of human effort required for training state-of-the-art models. This makes it possible to leverage a large amount of data. However, in practice weakly-supervised methods perform significantly worse than their fully-supervised counterparts. This is also the case in deep learning, where the top-performing computer vision approaches remain fully-supervised, which limits their usage in real world applications. This thesis attempts to bridge the gap between weakly-supervised and fully-supervised methods by utilizing motion information. It also studies the problem of moving object segmentation itself, proposing one of the first learning-based methods for this task.
In the first part of the thesis, we focus on the problem of weakly-supervised semantic segmentation. This is especially challenging due to the need to precisely capture object boundaries and avoid local optima, such as segmenting only the most discriminative parts. In contrast to most of the state-of-the-art approaches, which rely on static images, we leverage video data with object motion as a strong cue. In particular, our method uses a state-of-the-art video segmentation approach to segment moving objects in videos. The approximate object masks produced by this method are then fused with the semantic segmentation model learned in an EM-like framework to infer pixel-level semantic labels for video frames. Thus, as learning progresses, the quality of the labels improves automatically. We then integrate this architecture with our learning-based approach for video segmentation to obtain a fully trainable framework for weakly-supervised learning from videos.
In the second part of the thesis, we study unsupervised video segmentation, the task of segmenting all the objects in a video that move independently from the camera. This task presents challenges such as strong camera motion, inaccuracies in optical flow estimation and motion discontinuity. We address the camera motion problem by proposing a learning-based method for motion segmentation: a convolutional neural network that takes optical flow as input and is trained to segment objects that move independently from the camera. It is then extended with an appearance stream and a visual memory module to improve temporal continuity. The appearance stream capitalizes on the semantic information which is complementary to the motion information. The visual memory module is the key component of our approach: it combines the outputs of the motion and appearance streams and aggregates a spatio-temporal representation of the moving objects. The final segmentation is then produced based on this aggregated representation. The resulting approach obtains state-of-the-art performance on several benchmark datasets, outperforming the concurrent deep learning and heuristic-based methods. Apprentissage Semi-Supervisé Reconnaissance des objets Semi-Supervised Learning Recognizing Objects 004
150	Classifying natural forests using LiDAR data / Klassificering av nyckelbiotoper med hjälp av LiDAR-data Arvidsson, Simon, Gullstrand, Marcus January 2019 In forestry, natural forests are forest areas with high biodiversity that are in need of preservation. The current mapping of natural forests is a tedious task that requires manual labor that could possibly be automated. In this paper we explore the main features used by a random forest algorithm to classify natural forest and managed forest in northern Sweden. The goal was to create a model with a substantial strength of agreement, meaning a Kappa value of 0.61 or higher, placing the model in the same range as models produced in previous research. We used raster data gathered from airborne LiDAR, combined with labeled sample areas, both supplied by the Swedish Forest Agency. Two experiments were performed with different features. Experiment 1 used features extracted using methods inspired by previous research, while Experiment 2 added further features to that set. Of the total number of sample areas used (n=2882), 70% were used to train the models and 30% were used for evaluation. The result was a Kappa value of 0.26 for Experiment 1 and 0.32 for Experiment 2. The most prominent features were those derived from canopy height, which was also the supplied data with the highest resolution. Percentiles, kurtosis and canopy crown areas derived from the canopy height were shown to be the most important for classification. The results fell short of our goal, possibly indicating a range of flaws in the data used. The size of the sample areas and the resolution of the raster data are likely important factors when extracting features, playing a large role in the produced model’s performance. Geographic information systems Classification and regression trees Supervised learning by classification Computer Systems Datorsystem
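Illustrative note on entry 150: the abstract evaluates its models by a Kappa value, with a target of 0.61 or higher. The block below is a minimal sketch of how Cohen's kappa can be computed from true and predicted labels. The function name `cohens_kappa` and the toy labels are assumptions for illustration; the data are not from the thesis.

```python
from collections import Counter


def cohens_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between two labelings, corrected for chance."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(y_true)
    # Observed agreement: fraction of samples where both labelings match.
    p_observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement expected from the marginal label frequencies.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both labelings use one identical class throughout
    return (p_observed - p_expected) / (1.0 - p_expected)


# Toy example (made-up labels): 7 of 10 areas classified correctly.
y_true = ["natural"] * 5 + ["managed"] * 5
y_pred = ["natural"] * 4 + ["managed"] + ["natural"] * 2 + ["managed"] * 3
kappa = cohens_kappa(y_true, y_pred)  # 0.4 for this toy data
```

The toy example shows why kappa is stricter than raw accuracy: 70% of the labels agree, but half of that agreement is expected by chance for two balanced classes, so kappa is only 0.4.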
