Global ETD Search

211	Tracking non-rigid objects in video Buchanan, Aeron Morgan January 2008 (has links) Video is a sequence of 2D images of the 3D world generated by a camera. As the camera moves relative to the real scene and elements of that scene themselves move, correlated frame-to-frame changes in the video images are induced. Humans easily identify such changes as scene motion and can readily assess attempts to quantify it. For a machine, the identification of the 2D frame-to-frame motion is difficult. This problem is addressed by the computer vision process of tracking. Tracking underpins the solution to the problem of augmenting general video sequences with artificial imagery, a staple task in the visual effects industry. The problem is difficult because tracking in general video sequences is complicated by the presence of non-rigid motion, repeated texture and arbitrary occlusions. Existing methods provide solutions that rely on imposing limitations on the scenes that can be processed or that rely on human artistry and hard work. I introduce new paradigms, frameworks and algorithms for overcoming the challenges of processing general video and thus provide solutions that fill the gap between the `automated' and `manual' approaches. The work is easily sectioned into three parts, which can be considered separately or taken together for dealing with video without limitations. The initial focus is on directly addressing practical issues of human interaction in the tracking process: a new solution is developed by explicitly incorporating the user into an interactive algorithm. It is a novel tracking system based on fast full-frame patch searching and high-speed optimal track determination. This approach makes only minimal assumptions about motion and appearance, making it suitable for the widest variety of input video. I detail an implementation of the new system using k-d trees and dynamic programming. 
The second distinct contribution is an important extension to tracking algorithms in general. It can be noted that existing tracking algorithms occupy a spectrum in their use of global motion information. Local methods are easily confused by occlusions, repeated texture and image noise. Global motion models offer strong predictions to see through these difficulties and have been used in restricted circumstances, but are defeated by scenes containing independently moving objects or modest levels of non-rigid motion. I present a well principled way of combining local and global models to improve tracking, especially in these highly problematic cases. By viewing rank-constrained tracking as a probabilistic model of 2D tracks instead of 3D motion, I show how one can obtain a robust motion prior that can be easily incorporated in any existing tracking algorithm. The development of the global motion prior is based on rank-constrained factorization of measurement matrices. A common difficulty comes from the frequent occurrence of occlusions in video, which means that the relevant matrices are often not complete due to missing data. This defeats standard factorization algorithms. To fully explain and understand the algorithmic complexities of factorization in this practical context, I present a common notation for the direct comparison of existing algorithms and propose a new family of hybrid approaches that combine the superb initial performance of alternation methods with the convergence power of the Newton algorithm. Together, these investigations provide a wide-ranging, yet coherent exploration of tracking non-rigid objects in video. 621.382
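Note on entry 211: the abstract mentions optimal track determination by dynamic programming. The following is a minimal sketch of that general idea only, not the thesis's implementation. It assumes each frame supplies a list of candidate positions with an appearance cost (in the thesis these would come from patch search; here they are given directly), and that motion is penalised by squared displacement. The names `best_track`, `candidates`, `appearance_cost` and `motion_weight` are hypothetical.

```python
def best_track(candidates, appearance_cost, motion_weight=1.0):
    """Viterbi-style dynamic programme over per-frame candidates.

    candidates[t]      -- list of (x, y) candidate positions in frame t
    appearance_cost[t] -- list of costs, one per candidate in frame t
    Returns (total_cost, indices), where indices[t] is the chosen candidate.
    """
    n_frames = len(candidates)
    cost = [list(appearance_cost[0])]  # best cost of any track ending at each candidate
    back = []                          # back-pointers for recovering the track

    for t in range(1, n_frames):
        row_cost, row_back = [], []
        for j, (xj, yj) in enumerate(candidates[t]):
            best_i, best_c = None, float("inf")
            for i, (xi, yi) in enumerate(candidates[t - 1]):
                # Assumed motion penalty: squared displacement between frames.
                c = cost[t - 1][i] + motion_weight * ((xj - xi) ** 2 + (yj - yi) ** 2)
                if c < best_c:
                    best_i, best_c = i, c
            row_cost.append(best_c + appearance_cost[t][j])
            row_back.append(best_i)
        cost.append(row_cost)
        back.append(row_back)

    # Pick the cheapest end point, then follow the back-pointers.
    j = min(range(len(cost[-1])), key=cost[-1].__getitem__)
    total = cost[-1][j]
    path = [j]
    for t in range(n_frames - 1, 0, -1):
        j = back[t - 1][j]
        path.append(j)
    path.reverse()
    return total, path


# Toy example: two candidates per frame, three frames.
toy_candidates = [[(0, 0), (10, 10)], [(1, 0), (10, 10)], [(2, 0), (9, 9)]]
toy_costs = [[0.0, 0.5], [0.2, 0.0], [0.0, 0.1]]
toy_total, toy_path = best_track(toy_candidates, toy_costs)
```

Because the recursion keeps the best cost per candidate, the result is the globally cheapest track under the assumed cost, in time quadratic in the number of candidates per frame.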
212	Amélioration de l'exactitude de l'inférence phylogénomique Roure, Béatrice 04 1900 (has links) L’explosion du nombre de séquences permet à la phylogénomique, c’est-à-dire l’étude des liens de parenté entre espèces à partir de grands alignements multi-gènes, de prendre son essor. C’est incontestablement un moyen de pallier aux erreurs stochastiques des phylogénies simple gène, mais de nombreux problèmes demeurent malgré les progrès réalisés dans la modélisation du processus évolutif. Dans cette thèse, nous nous attachons à caractériser certains aspects du mauvais ajustement du modèle aux données, et à étudier leur impact sur l’exactitude de l’inférence. Contrairement à l’hétérotachie, la variation au cours du temps du processus de substitution en acides aminés a reçu peu d’attention jusqu’alors. Non seulement nous montrons que cette hétérogénéité est largement répandue chez les animaux, mais aussi que son existence peut nuire à la qualité de l’inférence phylogénomique. Ainsi en l’absence d’un modèle adéquat, la suppression des colonnes hétérogènes, mal gérées par le modèle, peut faire disparaître un artéfact de reconstruction. Dans un cadre phylogénomique, les techniques de séquençage utilisées impliquent souvent que tous les gènes ne sont pas présents pour toutes les espèces. La controverse sur l’impact de la quantité de cellules vides a récemment été réactualisée, mais la majorité des études sur les données manquantes sont faites sur de petits jeux de séquences simulées. Nous nous sommes donc intéressés à quantifier cet impact dans le cas d’un large alignement de données réelles. Pour un taux raisonnable de données manquantes, il appert que l’incomplétude de l’alignement affecte moins l’exactitude de l’inférence que le choix du modèle. Au contraire, l’ajout d’une séquence incomplète mais qui casse une longue branche peut restaurer, au moins partiellement, une phylogénie erronée. 
Comme les violations de modèle constituent toujours la limitation majeure dans l’exactitude de l’inférence phylogénétique, l’amélioration de l’échantillonnage des espèces et des gènes reste une alternative utile en l’absence d’un modèle adéquat. Nous avons donc développé un logiciel de sélection de séquences qui construit des jeux de données reproductibles, en se basant sur la quantité de données présentes, la vitesse d’évolution et les biais de composition. Lors de cette étude nous avons montré que l’expertise humaine apporte pour l’instant encore un savoir incontournable. Les différentes analyses réalisées pour cette thèse concluent à l’importance primordiale du modèle évolutif. / The explosion of sequence number allows for phylogenomics, the study of species relationships based on large multi-gene alignments, to flourish. Without any doubt, phylogenomics is essentially an efficient way to eliminate the problems of single gene phylogenies due to stochastic errors, but numerous problems remain despite obvious progress realized in modeling evolutionary process. In this PhD-thesis, we are trying to characterize some consequences of a poor model fit and to study their impact on the accuracy of the phylogenetic inference. In contrast to heterotachy, the variation in the amino acid substitution process over time did not attract so far a lot of attention. We demonstrate that this heterogeneity is frequently observed within animals, but also that its existence can interfere with the quality of phylogenomic inference. In absence of an adequate model, the elimination of heterogeneous columns, which are poorly handled by the model, can eliminate an artefactual reconstruction. In a phylogenomic framework, the sequencing strategies often result in a situation where some genes are absent for some species. 
The controversy over the impact of the quantity of empty cells was recently revived, but most studies of missing data are performed on small datasets of simulated sequences. We therefore set out to quantify this impact for a large alignment of real data. With a reasonable amount of missing data, the accuracy of the inference appears to be affected less by the incompleteness of the alignment than by the choice of model. On the contrary, adding an incomplete sequence that breaks a long branch can correct, at least partially, an erroneous phylogeny. Because model violations still constitute the major limitation on the accuracy of phylogenetic inference, improving species and gene sampling remains a useful alternative in the absence of an adequate model. We therefore developed sequence-selection software that builds reproducible datasets based on the quantity of data present, the evolutionary rate and compositional bias. During this study, we found that human expertise still provides indispensable knowledge. The various analyses performed for this thesis agree on the paramount importance of the evolutionary model. Phylogénomique Exactitude de l’inférence Hétéropécilie Échantillonnage des espèces Sélection des séquences Données manquantes Violation de modèle Phylogenomics Accuracy of the inference Heteropecilly Species sampling Sequence sorting Missing data Model violation
213	Comparaison de quatre méthodes pour le traitement des données manquantes au sein d’un modèle multiniveau paramétrique visant l’estimation de l’effet d’une intervention Paquin, Stéphane 03 1900 (has links) Les données manquantes sont fréquentes dans les enquêtes et peuvent entraîner d’importantes erreurs d’estimation de paramètres. Ce mémoire méthodologique en sociologie porte sur l’influence des données manquantes sur l’estimation de l’effet d’un programme de prévention. Les deux premières sections exposent les possibilités de biais engendrées par les données manquantes et présentent les approches théoriques permettant de les décrire. La troisième section porte sur les méthodes de traitement des données manquantes. Les méthodes classiques sont décrites ainsi que trois méthodes récentes. La quatrième section contient une présentation de l’Enquête longitudinale et expérimentale de Montréal (ELEM) et une description des données utilisées. La cinquième expose les analyses effectuées, elle contient : la méthode d’analyse de l’effet d’une intervention à partir de données longitudinales, une description approfondie des données manquantes de l’ELEM ainsi qu’un diagnostic des schémas et du mécanisme. La sixième section contient les résultats de l’estimation de l’effet du programme selon différents postulats concernant le mécanisme des données manquantes et selon quatre méthodes : l’analyse des cas complets, le maximum de vraisemblance, la pondération et l’imputation multiple. Ils indiquent (I) que le postulat sur le type de mécanisme MAR des données manquantes semble influencer l’estimation de l’effet du programme et que (II) les estimations obtenues par différentes méthodes d’estimation mènent à des conclusions similaires sur l’effet de l’intervention. / Missing data are common in empirical research and can lead to significant errors in parameters’ estimation. 
This methodological dissertation in sociology addresses the influence of missing data on the estimation of the impact of a prevention program. The first two sections outline the potential biases caused by missing data and present the theoretical approaches used to describe them. The third section focuses on methods for handling missing data: conventional methods are described, as well as three recent ones. The fourth section contains a description of the Montreal Longitudinal Experimental Study (MLES) and of the data used. The fifth section presents the analyses performed: the method for analysing the effect of an intervention from longitudinal data, a detailed description of the missing data in the MLES, and a diagnosis of the missing-data patterns and mechanism. The sixth section contains the results of estimating the effect of the program under different assumptions about the missing-data mechanism and by four methods: complete case analysis, maximum likelihood, weighting and multiple imputation. They indicate (I) that the assumption on the type of MAR mechanism seems to affect the estimate of the program’s impact and (II) that the estimates obtained using different estimation methods lead to similar conclusions about the intervention’s effect. Données manquantes Imputation multiple Maximum de vraisemblance Pondération Mécanisme de données manquantes Multiniveau Intervention Analyse longitudinale Analyse de sensibilité Sensitivity analysis Longitudinal Multilevel Experimental Mechanism Missing data Maximum likelihood Weighting Multiple imputation
214	Identification aveugle de mélanges et décomposition canonique de tenseurs : application à l'analyse de l'eau / Blind identification of mixtures and canonical tensor decomposition : application to wateranalysis Royer, Jean-Philip 04 October 2013 (has links) Dans cette thèse, nous nous focalisons sur le problème de la décomposition polyadique minimale de tenseurs de dimension trois, problème auquel on se réfère généralement sous différentes terminologies : « Polyadique Canonique » (CP en anglais), « CanDecomp », ou encore « Parafac ». Cette décomposition s'avère très utile dans un très large panel d'applications. Cependant, nous nous concentrons ici sur la spectroscopie de fluorescence appliquée à des données environnementales particulières de type échantillons d'eau qui pourront avoir été collectés en divers endroits ou différents moments. Ils contiennent un mélange de plusieurs molécules organiques et l'objectif des traitements numériques mis en œuvre est de parvenir à séparer et à ré-estimer ces composés présents dans les échantillons étudiés. Par ailleurs, dans plusieurs applications comme l'imagerie hyperspectrale ou justement, la chimiométrie, il est intéressant de contraindre les matrices de facteurs recherchées à être réelles et non négatives car elles sont représentatives de quantités physiques réelles non négatives (spectres, fractions d'abondance, concentrations, ...etc.). C'est pourquoi tous les algorithmes développés durant cette thèse l'ont été dans ce cadre (l'avantage majeur de cette contrainte étant de rendre le problème d'approximation considéré bien posé). Certains de ces algorithmes reposent sur l'utilisation de méthodes proches des fonctions barrières, d'autres approches consistent à paramétrer directement les matrices de facteurs recherchées par des carrés. / In this manuscript, we focus on the minimal polyadic decomposition of third order tensors, which is often referred to: “Canonical Polyadic” (CP), “CanDecomp”, or “Parafac”. 
This decomposition is useful in a very wide range of applications. Here, however, we only address fluorescence spectroscopy applied to environmental data, namely water samples collected at different locations or times. The samples contain a mixture of several organic compounds, and the goal of the processing is to separate and estimate the compounds present in them. Moreover, in some applications such as hyperspectral unmixing or chemometrics, it is useful to constrain the sought loading matrices to be real and nonnegative, because they represent nonnegative physical quantities (spectra, abundance fractions, concentrations, etc.). This is why all the algorithms developed here take this constraint into account (its main advantage is that it makes the approximation problem well-posed). Some of them rely on methods close to barrier functions; others parameterize the loading matrices directly by squares. Many optimization algorithms were considered: gradient approaches, nonlinear conjugate gradient (which is well suited to high-dimensional problems), quasi-Newton (BFGS and DFP) and finally Levenberg-Marquardt. Two versions of these algorithms were considered: the “Enhanced Line Search” version (ELS, which helps to escape from local minima) and the “backtracking” version (alternating with ELS). Non négativité Tenseurs d'ordre trois Décomposition canonique polyadique Données manquantes Spectroscopie de fluorescence Nonnegativity Third order tensors Polyadic canonical decomposition Missing data Fluorescence spectroscopy
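Note on entry 214: the parameterization of the loading matrices by squares can be written out as follows. This is a sketch in assumed notation (the symbols $\mathcal{T}$, $a$, $b$, $c$, $\alpha$, $\beta$, $\gamma$ and the rank $R$ are chosen here for illustration, not taken from the thesis), and the least-squares cost is one plausible choice rather than the thesis's stated objective.

```latex
% Third-order CP model with R components and nonnegative loadings
% obtained as element-wise squares of unconstrained variables:
\[
  \mathcal{T}_{ijk} \approx \sum_{r=1}^{R} a_{ir}\, b_{jr}\, c_{kr},
  \qquad
  a_{ir} = \alpha_{ir}^{2},\quad
  b_{jr} = \beta_{jr}^{2},\quad
  c_{kr} = \gamma_{kr}^{2}.
\]
% Unconstrained least-squares cost in the new variables:
\[
  f(\alpha,\beta,\gamma) = \tfrac{1}{2} \sum_{i,j,k}
  \Bigl( \mathcal{T}_{ijk}
       - \sum_{r=1}^{R} \alpha_{ir}^{2}\,\beta_{jr}^{2}\,\gamma_{kr}^{2} \Bigr)^{2}.
\]
% Chain rule linking the gradient in alpha to the gradient in a:
\[
  \frac{\partial f}{\partial \alpha_{ir}}
  = 2\,\alpha_{ir}\,\frac{\partial f}{\partial a_{ir}}.
\]
```

Since every loading is a square, nonnegativity holds automatically, so unconstrained methods such as gradient descent, conjugate gradient or quasi-Newton can be applied to $f$ directly.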
215	Modèles conjoints pour données longitudinales et données de survie incomplètes appliqués à l'étude du vieillissement cognitif Dantan, Etienne 08 December 2009 (has links) Dans l'étude du vieillissement cérébral, le suivi des personnes âgées est soumis à une forte sélection avec un risque de décès associé à de faibles performances cognitives. La modélisation de l'histoire naturelle du vieillissement cognitif est complexe du fait de données longitudinales et données de survie incomplètes. Par ailleurs, un déclin accru des performances cognitives est souvent observé avant le diagnostic de démence sénile, mais le début de cette accélération n'est pas facile à identifier. Les profils d'évolution peuvent être variés et associés à des risques différents de survenue d'un événement; cette hétérogénéité des déclins cognitifs de la population des personnes âgées doit être prise en compte. Ce travail a pour objectif d'étudier des modèles conjoints pour données longitudinales et données de survie incomplètes afin de décrire l'évolution cognitive chez les personnes âgées. L'utilisation d'approches à variables latentes a permis de tenir compte de ces phénomènes sous-jacents au vieillissement cognitif que sont l'hétérogénéité et l'accélération du déclin. Au cours d'un premier travail, nous comparons deux approches pour tenir compte des données manquantes dans l'étude d'un processus longitudinal. Dans un second travail, nous proposons un modèle conjoint à état latent pour modéliser simultanément l'évolution cognitive et son accélération pré-démentielle, le risque de démence et le risque de décès. / In cognitive ageing study, older people are highly selected by a risk of death associated with poor cognitive performances. Modeling the natural history of cognitive decline is difficult in presence of incomplete longitudinal and survival data. Moreover, the non observed cognitive decline acceleration beginning before the dementia diagnosis is difficult to evaluate. 
Cognitive decline is highly heterogeneous, e.g. there are various patterns associated with different risks of survival event. The objective is to study joint models for incomplete longitudinal and survival data to describe the cognitive evolution in older people. Latent variable approaches were used to take into account the non-observed mechanisms, e.g. heterogeneity and decline acceleration. First, we compared two approaches to consider missing data in longitudinal data analysis. Second, we propose a joint model with a latent state to model cognitive evolution and its pre-dementia acceleration, dementia risk and death risk. Modèles mixtes Données manquantes Modèles conjoints Modèle multi-états État latent Vieillissement cognitif Démence Décès Mixed model Missing data Joint model Multi-state model Latent state Cognitive ageing Dementia Death
216	Analyse longitudinale de la qualité de vie relative à la santé en cancérologie / Longitudinal analysis of the health-related quality of life in oncology Anota, Amelie 22 October 2014 (has links) La qualité de vie relative à la santé (QdV) est désormais un des objectifs majeurs des essais cliniques en cancérologie pour pouvoir s’assurer du bénéfice clinique de nouvelles stratégies thérapeutiques pour le patient. Cependant, les résultats des données de QdV restent encore peu pris en compte en pratique clinique en raison de la nature subjective et dynamique de la QdV. De plus, les méthodes statistiques pour son analyse longitudinale doivent être capables de tenir compte de l’occurrence des données manquantes et d’un potentiel effet Response Shift reflétant l’adaptation du patient vis-à-vis de la maladie et de la toxicité du traitement. Ces méthodes doivent enfin proposer des résultats facilement compréhensibles par les cliniciens. Dans cette optique, les objectifs de ce travail ont été de faire le point sur ces facteurs limitants et de proposer des méthodes adéquates pour une interprétation robuste des données de QdV longitudinales. Ces travaux sont centrés sur la méthode du temps jusqu’à détérioration d’un score de QdV (TJD), en tant que modalité d’analyse longitudinale, ainsi que sur la caractérisation de l’occurrence de l’effet Response Shift. Les travaux menés ont donné lieu à la création d’un package R pour l’analyse longitudinale de la QdV selon la méthode du TJD avec une interface facile d’utilisation. Certaines recommandations ont été proposées sur les définitions de TJD à appliquer selon les situations thérapeutiques et l’occurrence ou non d’un effet Response Shift. 
Cette méthode attractive pour les cliniciens a été appliquée dans le cadre de deux essais de phase précoces I et II. La méthode de pondération par probabilité inversée du score de propension a été investiguée conjointement avec la méthode du TJD afin de tenir compte de l’occurrence de données manquantes dépendant des caractéristiques des patients. Une comparaison de trois approches statistiques pour l’analyse longitudinale a montré la performance du modèle linéaire mixte et permet de donner quelques recommandations pour l’analyse longitudinale selon le design de l’étude. Cette étude a également montré l’impact de l’occurrence de données manquantes informatives sur les méthodes d’analyse longitudinale. Des analyses factorielles et modèles issus de la théorie de réponse à l’item ont montré leur capacité à caractériser la Response Shift conjointement avec la méthode Then-test. Enfin, bien que les modèles à équation structurelles soient régulièrement appliqués pour caractériser cet effet sur le questionnaire de QdV générique SF-36, ils semblent peu adaptés à la structure des questionnaires spécifiques du cancer du groupe « European Organization of Research and Treatment of Cancer » (EORTC). / Health-related quality of life (HRQoL) has become one of the major objectives of oncology clinical trials to ensure the clinical benefit of new treatment strategies for the patient. However, the results of HRQoL data remain poorly used in clinical practice due to the subjective and dynamic nature of HRQoL. Moreover, statistical methods for its longitudinal analysis have to take into account the occurrence of missing data and the potential Response Shift effect reflecting the patient’s adaptation to the disease and treatment toxicities. Finally, these methods should also yield results that are easily understandable for clinicians. In this context, this work aimed to review these limiting factors and to propose suitable methods for a robust interpretation of longitudinal HRQoL data. 
This work focuses on both the time to HRQoL score deterioration (TTD) as a modality of longitudinal analysis and the characterization of the occurrence of the Response Shift effect. This work has resulted in the creation of an R package for longitudinal HRQoL analysis according to the TTD method, with an easy-to-use interface. Some recommendations were proposed on the definitions of the TTD to apply according to the therapeutic settings and the potential occurrence of the Response Shift effect. This method, which is attractive to clinicians, was applied in two early-phase (phase I and II) trials. The inverse probability weighting method of the propensity score was investigated in conjunction with the TTD method to take into account the occurrence of missing data depending on patients’ characteristics. A comparison between three statistical approaches for longitudinal analysis showed the performance of the linear mixed model and supports some recommendations for longitudinal analysis according to the study design. This study also highlighted the impact of the occurrence of informative missing data on the longitudinal statistical methods. Factor analyses and Item Response Theory models showed their ability to characterize the occurrence of the Response Shift in conjunction with the Then-test method. Finally, although structural equation models are often used to characterize this effect on the SF-36 generic questionnaire, they seem poorly suited to the particular structure of the cancer-specific HRQoL questionnaires of the European Organization of Research and Treatment of Cancer (EORTC) HRQoL group. Cancérologie Qualité de vie relative à la santé Analyse longitudinale Response Shift Essais cliniques Données manquantes Temps jusqu'à détérioration Théorie de réponse à l'item Oncology Health-related quality of life Longitudinal analysis Response Shift Clinical trials Missing data Time to deterioration Item response theory 616.9
217	Técnicas de diagnóstico para modelos lineares generalizados com medidas repetidas / Diagnostics for generalized linear models for repeated measures data with missing values Damiani, Lucas Petri 10 May 2012 (has links) A literatura dispõe de métodos de diagnóstico para avaliar o ajuste de modelos lineares generalizados (MLGs) para medidas repetidas baseado em equações de estimação generalizada (EEG). No entanto, tais métodos não contemplam a distribuição binomial nem bancos de dados com observações faltantes. O presente trabalho generalizou os métodos já desenvolvidos para essas duas situações. Na construção de gráficos de probabilidade meio-normal com envelope simulado para a distribuição binomial, foi proposto um método para geração de variáveis aleatórias com distribuição marginal binomial correlacionadas, baseado na convolução de variáveis com distribuição de Poisson independentes. Os métodos de diagnóstico desenvolvidos foram aplicados em dados reais e simulados. / Literature provides diagnostic methods to assess the fit of generalized linear models (GLM) for repeated measures based on generalized estimating equations (GEE). Still, such methods do not include the binomial distribution or databases with missing observations. This work generalizes the methods already developed for these two situations. A method for generating random variables with correlated marginal binomial distributions based on convolution of independent Poisson random variables has been proposed for the construction of half-normal probability plots. The diagnostic methods developed were applied to real and simulated data. Correlation structure Dados faltantes Diagnostic techniques Equações de estimação generalizadas Generalized estimating equation Medidas repetidas Missing data Repeated measures Simulação de variáveis aleatórias Simulation of random variables. Técnicas de diagnóstico.
218	Análise de dados categorizados com omissão / Analysis of categorical data with missingness Poleto, Frederico Zanqueta 30 August 2006 (has links) Neste trabalho aborda-se aspectos teóricos, computacionais e aplicados de análises clássicas de dados categorizados com omissão. Uma revisão da literatura é apresentada enquanto se introduz os mecanismos de omissão, mostrando suas características e implicações nas inferências de interesse por meio de um exemplo considerando duas variáveis respostas dicotômicas e estudos de simulação. Amplia-se a modelagem descrita em Paulino (1991, Brazilian Journal of Probability and Statistics 5, 1-42) da distribuição multinomial para a produto de multinomiais para possibilitar a inclusão de variáveis explicativas na análise. Os resultados são desenvolvidos em formulação matricial adequada para a implementação computacional, que é realizada com a construção de uma biblioteca para o ambiente estatístico R, a qual é disponibilizada para facilitar o traçado das inferências descritas nesta dissertação. A aplicação da teoria é ilustrada por meio de cinco exemplos de características diversas, uma vez que se ajusta modelos estruturais lineares (homogeneidade marginal), log-lineares (independência, razão de chances adjacentes comum) e funcionais lineares (kappa, kappa ponderado, sensibilidade/especificidade, valor preditivo positivo/negativo) para as probabilidades de categorização. Os padrões de omissão também são variados, com omissões em uma ou duas variáveis, confundimento de células vizinhas, sem ou com subpopulações. / We consider theoretical, computational and applied aspects of classical categorical data analyses with missingness. We present a literature review while introducing the missingness mechanisms, highlighting their characteristics and implications in the inferences of interest by means of an example involving two binary responses and simulation studies. 
We extend the multinomial modeling scenario described in Paulino (1991, Brazilian Journal of Probability and Statistics 5, 1-42) to the product-multinomial setup to allow for the inclusion of explanatory variables. We develop the results in matrix formulation and implement the computational procedures as a library written for the R statistical environment. We illustrate the application of the theory by means of five examples with different characteristics, fitting structural linear (marginal homogeneity), log-linear (independence, constant adjacent odds ratio) and functional linear models (kappa, weighted kappa, sensitivity/specificity, positive/negative predictive value) for the categorization probabilities. The missingness patterns are also varied: missingness in one or two variables, confounding of neighboring cells, and with or without subpopulations. categorical data dados categorizados dados faltantes dados incompletos dados omissos ignorable mechanism incomplete data MAR MAR MCAR MCAR mecanismo ignorável mecanismo não-ignorável missing data MNAR MNAR modelos de seleção non-ignorable mechanism selection models
219	Elastic matching for classification and modelisation of incomplete time series / Appariement élastique pour la classification et la modélisation de séries temporelles incomplètes Phan, Thi-Thu-Hong 12 October 2018 (has links) Les données manquantes constituent un challenge commun en reconnaissance de forme et traitement de signal. Une grande partie des techniques actuelles de ces domaines ne gère pas l'absence de données et devient inutilisable face à des jeux incomplets. L'absence de données conduit aussi à une perte d'information, des difficultés à interpréter correctement le reste des données présentes et des résultats biaisés notamment avec de larges sous-séquences absentes. Ainsi, ce travail de thèse se focalise sur la complétion de larges séquences manquantes dans les séries monovariées puis multivariées peu ou faiblement corrélées. Un premier axe de travail a été une recherche d'une requête similaire à la fenêtre englobant (avant/après) le trou. Cette approche est basée sur une comparaison de signaux à partir d'un algorithme d'extraction de caractéristiques géométriques (formes) et d'une mesure d'appariement élastique (DTW - Dynamic Time Warping). Un package R CRAN a été développé, DTWBI pour la complétion de série monovariée et DTWUMI pour des séries multidimensionnelles dont les signaux sont non ou faiblement corrélés. Ces deux approches ont été comparées aux approches classiques et récentes de la littérature et ont montré leur faculté de respecter la forme et la dynamique du signal. Concernant les signaux peu ou pas corrélés, un package DTWUMI a aussi été développé. Le second axe a été de construire une similarité floue capable de prendre en compte les incertitudes de formes et d'amplitude du signal. Le système FSMUMI proposé est basé sur une combinaison floue de similarités classiques et un ensemble de règles floues. 
Ces approches ont été appliquées à des données marines et météorologiques dans plusieurs contextes : classification supervisée de cytogrammes phytoplanctoniques, segmentation non supervisée en états environnementaux d'un jeu de 19 capteurs issus d'une station marine MAREL CARNOT en France et la prédiction météorologique de données collectées au Vietnam. / Missing data are a prevalent problem in many domains of pattern recognition and signal processing. Most of the existing techniques in the literature suffer from one major drawback: they cannot process incomplete datasets. Missing data cause a loss of information and thus lead to inaccurate data interpretation, biased results or unreliable analysis, especially when large sub-sequences are missing. This thesis therefore focuses on dealing with large runs of consecutive missing values in univariate time series and in multivariate time series with low or no correlation. We begin by investigating an imputation method to overcome these issues in univariate time series. This approach is based on the combination of a shape-feature extraction algorithm and the Dynamic Time Warping method. A new R package, DTWBI, was then developed. In subsequent work, the DTWBI approach is extended to complete large runs of successive missing data in low/un-correlated multivariate time series (called DTWUMI), and a DTWUMI R package was also developed. The key to these two methods is the use of elastic matching to retrieve similar values in the series before and/or after the missing values. This preserves as much as possible the dynamics and shape of the known data, while the shape-feature extraction algorithm reduces the computing time. Next, we introduce a new method for filling large runs of successive missing values in low/un-correlated multivariate time series, namely FSMUMI, which can handle a high level of uncertainty. 
To this end, we propose novel fuzzy grades of basic similarity measures together with fuzzy logic rules. Finally, we employ DTWBI to (i) complete the MAREL Carnot dataset and then detect rare/extreme events in this database, and (ii) forecast various univariate meteorological time series collected in Vietnam. Imputation Données manquantes Séries temporelles univariées Dynamic Time Warping Mesure de similarité Système d'inférence floue Imputation Missing data Univariate time series Uncorrelated multivariate time series Dynamic Time Warping Similarity measure Fuzzy inference system
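Note on entry 219: the idea of filling a gap by elastic matching can be sketched as below. This is a simplified illustration only, not the DTWBI package or its API: it omits the shape-feature extraction step, uses the window just before the gap as the query, and searches only earlier data. The function names `dtw_distance` and `fill_gap` are hypothetical.

```python
def dtw_distance(a, b):
    """Classic Dynamic Time Warping distance with absolute-difference cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping moves.
            d[i][j] = step + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


def fill_gap(series, gap_start, gap_len):
    """Fill series[gap_start:gap_start + gap_len] (marked with None).

    The query is the gap_len values just before the gap. The earlier window
    most similar to the query (by DTW) is located, and the values that
    followed that window are copied into the gap.
    """
    query = series[gap_start - gap_len:gap_start]
    best_pos, best_cost = None, float("inf")
    # The window and the values following it must both end before the gap.
    for s in range(0, gap_start - 2 * gap_len + 1):
        cost = dtw_distance(query, series[s:s + gap_len])
        if cost < best_cost:
            best_pos, best_cost = s, cost
    if best_pos is None:
        raise ValueError("not enough data before the gap")
    filled = list(series)
    filled[gap_start:gap_start + gap_len] = series[best_pos + gap_len:best_pos + 2 * gap_len]
    return filled


# Toy periodic series whose last two values are missing.
toy_series = [0, 1, 2, 1, 0, 1, 2, 1, 0, 1, None, None]
toy_filled = fill_gap(toy_series, gap_start=10, gap_len=2)
```

On this periodic toy series the most similar earlier window is followed by the values 2 and 1, which are the ones copied into the gap.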
220	Decision Making System Algorithm On Menopause Data Set Bacak, Hikmet Ozge 01 September 2007 (has links) (PDF) This study describes a multiple-centered clustering method and a decision making system algorithm, based on multiple-centered clustering, applied to a menopause data set. The method consists of two stages. In the first stage, the fuzzy C-means (FCM) clustering algorithm is applied to the data set under consideration with a high number of cluster centers. As the output of FCM, cluster centers and membership function values for each data member are calculated. In the second stage, the original cluster centers obtained in the first stage are merged until the target number of clusters is reached. The merging process relies upon a “similarity measure” between clusters defined in the thesis. During the merging process, the cluster center coordinates do not change, but the data members in these clusters are merged into a new cluster. As the output of this method, therefore, one obtains clusters which include many cluster centers. In the final part of this study, as an application of the clustering algorithms (including the multiple-centered clustering method), a decision making system is constructed using special data on menopause treatment. The decisions are based on the clusterings created by the algorithms discussed in the previous chapters of the thesis. The decision making system (decision aid system) is verified by a team of experts from the Department of Obstetrics and Gynecology of Hacettepe University under the guidance of Prof. Sinan Beksaç.
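Note on entry 220: the first stage described above, fuzzy C-means, can be sketched in a few lines. This is a generic textbook version under stated assumptions (Euclidean distance, fuzzifier m = 2, random initial memberships, a fixed number of iterations); the second-stage merging and its similarity measure are defined in the thesis and are not reproduced here. The function name `fuzzy_c_means` and its parameters are hypothetical.

```python
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, seed=0):
    """Plain fuzzy C-means; returns (centers, memberships).

    memberships[i][k] is the degree to which points[i] belongs to cluster k;
    each row sums to 1.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    n = len(points)

    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        s = sum(row)
        u.append([v / s for v in row])

    centers = []
    for _ in range(n_iter):
        # Centers are means of the points weighted by membership ** m.
        centers = []
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total for d in range(dim)]
            )
        # Standard membership update from the distances to all centers.
        power = 2.0 / (m - 1.0)
        for i, p in enumerate(points):
            dist = [
                max(sum((p[d] - c[d]) ** 2 for d in range(dim)) ** 0.5, 1e-12)
                for c in centers
            ]
            for k in range(n_clusters):
                u[i][k] = 1.0 / sum((dist[k] / dist[j]) ** power for j in range(n_clusters))
    return centers, u


# Toy data: two well-separated groups of three points each.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
toy_centers, toy_u = fuzzy_c_means(toy_points, n_clusters=2)
toy_labels = [max(range(2), key=row.__getitem__) for row in toy_u]
```

In the two-stage method of the abstract, this step would be run with deliberately many centers, and the resulting clusters would then be merged.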
