  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
221

Decision Making System Algorithm On Menopause Data Set

Bacak, Hikmet Ozge 01 September 2007 (has links) (PDF)
A multiple-centered clustering method and a decision making system algorithm on a menopause data set, based on multiple-centered clustering, are described in this study. The method consists of two stages. At the first stage, the fuzzy C-means (FCM) clustering algorithm is applied to the data set under consideration with a high number of cluster centers. As the output of FCM, cluster centers and membership function values for each data member are calculated. At the second stage, the original cluster centers obtained in the first stage are merged until the desired number of clusters is reached. The merging process relies upon a “similarity measure” between clusters defined in the thesis. During the merging process, the cluster center coordinates do not change, but the data members in these clusters are merged into a new cluster. As the output of this method, therefore, one obtains clusters which include many cluster centers. In the final part of this study, as an application of the clustering algorithms (including the multiple-centered clustering method), a decision making system is constructed using special data on menopause treatment. The decisions are based on the clusterings created by the algorithms discussed in the previous chapters of the thesis. A verification of the decision making system (decision aid system) is performed by a team of experts from the Department of Obstetrics and Gynecology of Hacettepe University under the guidance of Prof. Sinan Beksaç.
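The two-stage procedure can be sketched as follows. This is an illustrative reimplementation, not the thesis code: the FCM update is the textbook one, and plain Euclidean distance between centers stands in for the similarity measure defined in the thesis.

```python
import numpy as np

def fcm(X, c, m=2.0, max_iter=200, tol=1e-6, seed=0):
    """Fuzzy C-means: returns cluster centers and the membership matrix U (n x c)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=c, replace=False)]   # init from data points
    U = np.full((len(X), c), 1.0 / c)
    for _ in range(max_iter):
        d = np.maximum(np.linalg.norm(X[:, None] - centers, axis=2), 1e-12)
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)             # standard FCM membership update
        Um = U ** m
        new_centers = (Um.T @ X) / Um.sum(axis=0)[:, None]   # fuzzy-weighted means
        if np.abs(new_centers - centers).max() < tol:
            centers = new_centers
            break
        centers = new_centers
    return centers, U

def merge_centers(centers, threshold):
    """Stage 2: group centers whose pairwise distance is below `threshold`.
    The thesis defines its own similarity measure; Euclidean distance between
    centers is used here only as a placeholder."""
    labels = np.arange(len(centers))
    for i in range(len(centers)):
        for j in range(i + 1, len(centers)):
            if np.linalg.norm(centers[i] - centers[j]) < threshold:
                labels[labels == labels[j]] = labels[i]      # union the two groups
    return labels
```

Running `fcm` with a deliberately high `c` and then `merge_centers` with a distance threshold reproduces the overall shape of the method: final clusters that each contain several of the original cluster centers.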
222

Signal Processing for Spectroscopic Applications

Gudmundson, Erik January 2010 (has links)
Spectroscopic techniques allow for studies of materials and organisms on the atomic and molecular level. Examples of such techniques are nuclear magnetic resonance (NMR) spectroscopy—one of the principal techniques to obtain physical, chemical, electronic and structural information about molecules—and magnetic resonance imaging (MRI)—an important medical imaging technique for, e.g., visualization of the internal structure of the human body. The less well-known spectroscopic technique of nuclear quadrupole resonance (NQR) is related to NMR and MRI but with the difference that no external magnetic field is needed. NQR has found applications in, e.g., detection of explosives and narcotics. The first part of this thesis is focused on detection and identification of solid and liquid explosives using both NQR and NMR data. Methods allowing for uncertainties in the assumed signal amplitudes are proposed, as well as methods for estimation of model parameters that allow for non-uniform sampling of the data. The second part treats two medical applications. Firstly, new, fast methods for parameter estimation in MRI data are presented. MRI can be used for, e.g., the diagnosis of anomalies in the skin or in the brain. The presented methods allow for a significant decrease in computational complexity without loss in performance. Secondly, the estimation of blood flow velocity using medical ultrasound scanners is addressed. Information about anomalies in the blood flow dynamics is an important tool for the diagnosis of, for example, stenosis and atherosclerosis. The presented methods make no assumption on the sampling schemes, allowing for duplex mode transmissions where B-mode images are interleaved with the Doppler emissions.
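As a concrete illustration of parameter estimation under non-uniform sampling, the sketch below fits a single exponentially damped cosine (the basic shape of an NMR/NQR free-induction-decay line) by nonlinear least squares. The model, sample times, and parameter values are invented for the example; the thesis methods are considerably more elaborate.

```python
import numpy as np
from scipy.optimize import curve_fit

def fid(t, a, beta, f, phi):
    """Exponentially damped cosine: one spectral line of a free-induction decay."""
    return a * np.exp(-beta * t) * np.cos(2 * np.pi * f * t + phi)

rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 2.0, 300))   # non-uniform sample times
true = (1.0, 1.2, 3.0, 0.4)               # amplitude, damping, frequency, phase
y = fid(t, *true) + 0.02 * rng.normal(size=t.size)

# Nonlinear least squares needs no uniform grid, unlike FFT-based estimators.
est, _ = curve_fit(fid, t, y, p0=(0.8, 1.0, 2.9, 0.0))
```

With a starting point reasonably close to the truth, the fit recovers all four parameters despite the irregular sampling; an FFT-based estimator would first require regridding the data.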
223

Comparaison de quatre méthodes pour le traitement des données manquantes au sein d’un modèle multiniveau paramétrique visant l’estimation de l’effet d’une intervention / Comparison of four methods for handling missing data in a parametric multilevel model estimating the effect of an intervention

Paquin, Stéphane 03 1900 (has links)
Les données manquantes sont fréquentes dans les enquêtes et peuvent entraîner d’importantes erreurs d’estimation de paramètres. Ce mémoire méthodologique en sociologie porte sur l’influence des données manquantes sur l’estimation de l’effet d’un programme de prévention. Les deux premières sections exposent les possibilités de biais engendrées par les données manquantes et présentent les approches théoriques permettant de les décrire. La troisième section porte sur les méthodes de traitement des données manquantes. Les méthodes classiques sont décrites ainsi que trois méthodes récentes. La quatrième section contient une présentation de l’Enquête longitudinale et expérimentale de Montréal (ELEM) et une description des données utilisées. La cinquième expose les analyses effectuées, elle contient : la méthode d’analyse de l’effet d’une intervention à partir de données longitudinales, une description approfondie des données manquantes de l’ELEM ainsi qu’un diagnostic des schémas et du mécanisme. La sixième section contient les résultats de l’estimation de l’effet du programme selon différents postulats concernant le mécanisme des données manquantes et selon quatre méthodes : l’analyse des cas complets, le maximum de vraisemblance, la pondération et l’imputation multiple. Ils indiquent (I) que le postulat sur le type de mécanisme MAR des données manquantes semble influencer l’estimation de l’effet du programme et que (II) les estimations obtenues par différentes méthodes d’estimation mènent à des conclusions similaires sur l’effet de l’intervention. / Missing data are common in empirical research and can lead to significant errors in parameters’ estimation. This dissertation in the field of methodological sociology addresses the influence of missing data on the estimation of the impact of a prevention program. The first two sections outline the potential bias caused by missing data and present the theoretical background to describe them. 
The third section focuses on methods for handling missing data; conventional methods are described, as well as three recent ones. The fourth section contains a description of the Montreal Longitudinal Experimental Study (MLES) and of the data used. The fifth section presents the analyses performed; it contains the method for analysing the effect of an intervention from longitudinal data, a detailed description of the missing data of the MLES, and a diagnosis of their patterns and mechanism. The sixth section contains the results of estimating the effect of the program under different assumptions about the missing data mechanism and with four methods: complete case analysis, maximum likelihood, weighting and multiple imputation. They indicate (I) that the assumption about the MAR mechanism seems to affect the estimate of the program’s impact and (II) that the estimates obtained using different estimation methods lead to similar conclusions about the intervention’s effect.
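The contrast between complete-case analysis and a model-based treatment under MAR can be reproduced with a toy simulation (synthetic data, not the ELEM): y depends on a covariate x, and y is missing more often when x is large, so the complete cases underestimate the mean of y, while regression-based imputation using the observed x approximately recovers it.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 50_000
x = rng.normal(size=n)
y = 2.0 + 1.5 * x + rng.normal(size=n)        # true E[y] = 2.0

# MAR mechanism: missingness depends only on the observed covariate x.
missing = rng.random(n) < np.where(x > 0, 0.8, 0.1)
y_obs = np.where(missing, np.nan, y)

cc_mean = np.nanmean(y_obs)                   # complete-case estimate (biased low)

# Regression imputation: fit y ~ x on complete cases, fill in predictions.
mask = ~missing
b1, b0 = np.polyfit(x[mask], y[mask], 1)
y_imp = np.where(missing, b0 + b1 * x, y_obs)
imp_mean = y_imp.mean()                       # approximately unbiased under MAR
```

Because the complete cases over-represent small x, `cc_mean` lands well below 2, while `imp_mean` stays close to it; a proper multiple imputation would additionally add noise to the predictions and pool several imputed datasets to get valid standard errors.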
224

Amélioration de l'exactitude de l'inférence phylogénomique / Improving the accuracy of phylogenomic inference

Roure, Béatrice 04 1900 (has links)
L’explosion du nombre de séquences permet à la phylogénomique, c’est-à-dire l’étude des liens de parenté entre espèces à partir de grands alignements multi-gènes, de prendre son essor. C’est incontestablement un moyen de pallier aux erreurs stochastiques des phylogénies simple gène, mais de nombreux problèmes demeurent malgré les progrès réalisés dans la modélisation du processus évolutif. Dans cette thèse, nous nous attachons à caractériser certains aspects du mauvais ajustement du modèle aux données, et à étudier leur impact sur l’exactitude de l’inférence. Contrairement à l’hétérotachie, la variation au cours du temps du processus de substitution en acides aminés a reçu peu d’attention jusqu’alors. Non seulement nous montrons que cette hétérogénéité est largement répandue chez les animaux, mais aussi que son existence peut nuire à la qualité de l’inférence phylogénomique. Ainsi en l’absence d’un modèle adéquat, la suppression des colonnes hétérogènes, mal gérées par le modèle, peut faire disparaître un artéfact de reconstruction. Dans un cadre phylogénomique, les techniques de séquençage utilisées impliquent souvent que tous les gènes ne sont pas présents pour toutes les espèces. La controverse sur l’impact de la quantité de cellules vides a récemment été réactualisée, mais la majorité des études sur les données manquantes sont faites sur de petits jeux de séquences simulées. Nous nous sommes donc intéressés à quantifier cet impact dans le cas d’un large alignement de données réelles. Pour un taux raisonnable de données manquantes, il appert que l’incomplétude de l’alignement affecte moins l’exactitude de l’inférence que le choix du modèle. Au contraire, l’ajout d’une séquence incomplète mais qui casse une longue branche peut restaurer, au moins partiellement, une phylogénie erronée. 
Comme les violations de modèle constituent toujours la limitation majeure dans l’exactitude de l’inférence phylogénétique, l’amélioration de l’échantillonnage des espèces et des gènes reste une alternative utile en l’absence d’un modèle adéquat. Nous avons donc développé un logiciel de sélection de séquences qui construit des jeux de données reproductibles, en se basant sur la quantité de données présentes, la vitesse d’évolution et les biais de composition. Lors de cette étude nous avons montré que l’expertise humaine apporte pour l’instant encore un savoir incontournable. Les différentes analyses réalisées pour cette thèse concluent à l’importance primordiale du modèle évolutif. / The explosion in the number of sequences allows phylogenomics, the study of species relationships based on large multi-gene alignments, to flourish. Without any doubt, phylogenomics is an efficient way to eliminate the problems of single-gene phylogenies due to stochastic errors, but numerous problems remain despite the obvious progress in modeling the evolutionary process. In this thesis, we characterize some consequences of poor model fit and study their impact on the accuracy of phylogenetic inference. In contrast to heterotachy, the variation of the amino acid substitution process over time has so far attracted little attention. We demonstrate not only that this heterogeneity is frequently observed within animals, but also that its existence can interfere with the quality of phylogenomic inference. In the absence of an adequate model, the elimination of heterogeneous columns, which are poorly handled by the model, can remove a reconstruction artefact. In a phylogenomic framework, the sequencing strategies often result in a situation where some genes are absent for some species.
The controversy about the impact of the quantity of empty cells was recently revived, but the majority of studies on missing data are performed on small datasets of simulated sequences. We were therefore interested in measuring this impact in the case of a large alignment of real data. For a reasonable amount of missing data, the accuracy of the inference is influenced more by the choice of the model than by the incompleteness of the alignment. On the contrary, the addition of an incomplete sequence that breaks a long branch can, at least partially, correct an erroneous phylogeny. Because model violations remain the major limitation on the accuracy of phylogenetic inference, improving species and gene sampling is a useful alternative in the absence of an adequate model. We therefore developed sequence-selection software that allows the reproducible construction of datasets, based on the quantity of data, their evolutionary speed and their compositional bias. During this study, we found that human expertise still furnishes indispensable knowledge. The various analyses performed in the course of this thesis agree on the primordial importance of the model of sequence evolution.
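The sequence-selection idea (the software itself is not named in the abstract) can be illustrated with a toy scoring function over the three criteria mentioned: amount of data present, evolutionary speed, and compositional bias. The scoring form and weights are invented for the sketch; inputs are NumPy arrays.

```python
import numpy as np

def select_taxa(presence, rates, gc, max_taxa, w=(1.0, 1.0, 1.0)):
    """Rank taxa by a composite score and keep the `max_taxa` best.

    presence : (taxa, genes) boolean matrix of gene availability
    rates    : per-taxon relative evolutionary rate (lower is better)
    gc       : per-taxon GC content (closer to the group mean is better)
    """
    completeness = presence.mean(axis=1)                 # fraction of genes present
    spread = max(np.ptp(rates), 1e-12)
    slow = 1.0 - (rates - rates.min()) / spread          # penalize fast-evolving taxa
    dev = np.abs(gc - gc.mean())
    bias = 1.0 - dev / max(dev.max(), 1e-12)             # penalize compositional outliers
    score = w[0] * completeness + w[1] * slow + w[2] * bias
    return np.argsort(score)[::-1][:max_taxa]            # indices of the kept taxa
```

Because the ranking is a deterministic function of the input matrices, the same inputs always yield the same dataset, which is the reproducibility property the abstract emphasizes.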
225

Probabilistic Estimation of Unobserved Process Events

Rogge-Solti, Andreas January 2014 (has links)
Organizations try to gain competitive advantages, and to increase customer satisfaction. To ensure the quality and efficiency of their business processes, they perform business process management. An important part of process management that happens on the daily operational level is process controlling. A prerequisite of controlling is process monitoring, i.e., keeping track of the performed activities in running process instances. Only by process monitoring can business analysts detect delays and react to deviations from the expected or guaranteed performance of a process instance. To enable monitoring, process events need to be collected from the process environment. When a business process is orchestrated by a process execution engine, monitoring is available for all orchestrated process activities. Many business processes, however, do not lend themselves to automatic orchestration, e.g., because of required freedom of action. This situation is often encountered in hospitals, where most business processes are manually enacted. Hence, in practice it is often inefficient or infeasible to document and monitor every process activity. Additionally, manual process execution and documentation is prone to errors, e.g., documentation of activities can be forgotten. Thus, organizations face the challenge of process events that occur, but are not observed by the monitoring environment. These unobserved process events can serve as basis for operational process decisions, even without exact knowledge of when they happened or when they will happen. An exemplary decision is whether to invest more resources to manage timely completion of a case, anticipating that the process end event will occur too late. This thesis offers means to reason about unobserved process events in a probabilistic way. We address decisive questions of process managers (e.g., "when will the case be finished?", or "when did we perform the activity that we forgot to document?") in this thesis. 
As main contribution, we introduce an advanced probabilistic model to business process management that is based on a stochastic variant of Petri nets. We present a holistic approach to use the model effectively along the business process lifecycle. Therefore, we provide techniques to discover such models from historical observations, to predict the termination time of processes, and to ensure quality by missing data management. We propose mechanisms to optimize configuration for monitoring and prediction, i.e., to offer guidance in selecting important activities to monitor. An implementation is provided as a proof of concept. For evaluation, we compare the accuracy of the approach with that of state-of-the-art approaches using real process data of a hospital. Additionally, we show its more general applicability in other domains by applying the approach on process data from logistics and finance. / Unternehmen versuchen Wettbewerbsvorteile zu gewinnen und die Kundenzufriedenheit zu erhöhen. Um die Qualität und die Effizienz ihrer Prozesse zu gewährleisten, wenden Unternehmen Geschäftsprozessmanagement an. Hierbei spielt die Prozesskontrolle im täglichen Betrieb eine wichtige Rolle. Prozesskontrolle wird durch Prozessmonitoring ermöglicht, d.h. durch die Überwachung des Prozessfortschritts laufender Prozessinstanzen. So können Verzögerungen entdeckt und es kann entsprechend reagiert werden, um Prozesse wie erwartet und termingerecht beenden zu können. Um Prozessmonitoring zu ermöglichen, müssen prozessrelevante Ereignisse aus der Prozessumgebung gesammelt und ausgewertet werden. Sofern eine Prozessausführungsengine die Orchestrierung von Geschäftsprozessen übernimmt, kann jede Prozessaktivität überwacht werden. Aber viele Geschäftsprozesse eignen sich nicht für automatisierte Orchestrierung, da sie z.B. besonders viel Handlungsfreiheit erfordern. Dies ist in Krankenhäusern der Fall, in denen Geschäftsprozesse oft manuell durchgeführt werden. 
Daher ist es meist umständlich oder unmöglich, jeden Prozessfortschritt zu erfassen. Zudem ist händische Prozessausführung und -dokumentation fehleranfällig, so wird z.B. manchmal vergessen zu dokumentieren. Eine Herausforderung für Unternehmen ist, dass manche Prozessereignisse nicht im Prozessmonitoring erfasst werden. Solch unbeobachtete Prozessereignisse können jedoch als Entscheidungsgrundlage dienen, selbst wenn kein exaktes Wissen über den Zeitpunkt ihres Auftretens vorliegt. Zum Beispiel ist bei der Prozesskontrolle zu entscheiden, ob zusätzliche Ressourcen eingesetzt werden sollen, wenn eine Verspätung angenommen wird. Diese Arbeit stellt einen probabilistischen Ansatz für den Umgang mit unbeobachteten Prozessereignissen vor. Dabei werden entscheidende Fragen von Prozessmanagern beantwortet (z.B. "Wann werden wir den Fall beenden?", oder "Wann wurde die Aktivität ausgeführt, die nicht dokumentiert wurde?"). Der Hauptbeitrag der Arbeit ist die Einführung eines erweiterten probabilistischen Modells ins Geschäftsprozessmanagement, das auf stochastischen Petri-Netzen basiert. Dabei wird ein ganzheitlicher Ansatz zur Unterstützung der einzelnen Phasen des Geschäftsprozesslebenszyklus verfolgt. Es werden Techniken zum Lernen des probabilistischen Modells, zum Vorhersagen des Zeitpunkts des Prozessendes, zum Qualitätsmanagement von Dokumentationen durch Erkennung fehlender Einträge, und zur Optimierung von Monitoringkonfigurationen bereitgestellt. Letztere dient zur Auswahl von relevanten Stellen im Prozess, die beobachtet werden sollten. Diese Techniken wurden in einer quelloffenen prototypischen Anwendung implementiert. Zur Evaluierung wird der Ansatz mit existierenden Alternativen an echten Prozessdaten eines Krankenhauses gemessen. Die generelle Anwendbarkeit in weiteren Domänen wird exemplarisch an Prozessdaten aus der Logistik und dem Finanzwesen gezeigt.
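The prediction side of the approach can be illustrated with a minimal Monte Carlo sketch: the remaining activities of a running case are modelled as independent exponential durations (a stand-in for the stochastic Petri net model of the thesis), and repeated simulation estimates the probability that the case finishes after a deadline.

```python
import numpy as np

def p_late(remaining_rates, elapsed, deadline, n_sim=100_000, seed=0):
    """P(case finishes after `deadline`) for a case that has already been
    running for `elapsed` time units, with the remaining activities modelled
    as independent exponential durations (one rate per activity)."""
    rng = np.random.default_rng(seed)
    scales = 1.0 / np.asarray(remaining_rates, dtype=float)
    # one row per simulated continuation of the case
    durations = rng.exponential(scales, size=(n_sim, len(scales)))
    finish = elapsed + durations.sum(axis=1)
    return float((finish > deadline).mean())
```

For two remaining unit-rate activities the remaining time is Gamma(2, 1), so the estimate can be checked against the closed-form tail probability (1 + t)e^(-t); as the case's elapsed time grows, the lateness probability rises, which is exactly the signal a process manager would use to decide on extra resources.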
226

Methodology for Handling Missing Data in Nonlinear Mixed Effects Modelling

Johansson, Åsa M. January 2014 (has links)
To obtain a better understanding of the pharmacokinetic and/or pharmacodynamic characteristics of an investigated treatment, clinical data are often analysed with nonlinear mixed effects modelling. The developed models can be used to design future clinical trials or to guide individualised drug treatment. Missing data are a frequently encountered problem in analyses of clinical data, and so as not to jeopardize the predictive ability of the developed model, it is of great importance that the method chosen to handle the missing data is adequate for its purpose. The overall aim of this thesis was to develop methods for handling missing data in the context of nonlinear mixed effects models and to compare strategies for handling missing data, in order to provide guidance on efficient handling, and on the consequences of inappropriate handling, of missing data. In accordance with missing data theory, all missing data can be divided into three categories: missing completely at random (MCAR), missing at random (MAR) and missing not at random (MNAR). When data are MCAR, the underlying missing data mechanism does not depend on any observed or unobserved data; when data are MAR, the underlying missing data mechanism depends on observed data but not on unobserved data; when data are MNAR, the underlying missing data mechanism depends on the unobserved data itself. Strategies and methods for handling missing observation data and missing covariate data were evaluated. These evaluations showed that the most frequently used estimation algorithm in nonlinear mixed effects modelling (first-order conditional estimation) resulted in biased parameter estimates independent of the missing data mechanism. However, expectation maximization (EM) algorithms (e.g. importance sampling) resulted in unbiased and precise parameter estimates as long as data were MCAR or MAR.
When the observation data are MNAR, a proper method for handling the missing data has to be applied to obtain unbiased and precise parameter estimates, independent of the estimation algorithm. The evaluation of different methods for handling missing covariate data showed that a correctly implemented multiple imputation method and full maximum likelihood modelling methods resulted in unbiased and precise parameter estimates when covariate data were MCAR or MAR. When the covariate data were MNAR, the only method resulting in unbiased and precise parameter estimates was a full maximum likelihood modelling method in which an extra parameter was estimated, correcting for the unknown missing data mechanism's dependence on the missing data. This thesis presents new insight into the dynamics of missing data in nonlinear mixed effects modelling. Strategies for handling different types of missing data have been developed and compared in order to provide guidance on efficient handling, and on the consequences of inappropriate handling, of missing data.
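The three mechanisms can be made concrete with a small simulation (illustrative, not the thesis data): the same variable y is masked under an MCAR, a MAR, and an MNAR rule, differing only in what the missingness probability depends on.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 100_000
x = rng.normal(size=n)                 # fully observed covariate
y = x + rng.normal(size=n)             # variable that will receive missing values

p_mcar = np.full(n, 0.3)               # MCAR: constant probability
p_mar = 1 / (1 + np.exp(-2 * x))       # MAR: depends only on the observed x
p_mnar = 1 / (1 + np.exp(-2 * y))      # MNAR: depends on the unobserved y itself

u = rng.random(n)
y_mcar = np.where(u < p_mcar, np.nan, y)
y_mar = np.where(u < p_mar, np.nan, y)
y_mnar = np.where(u < p_mnar, np.nan, y)
```

Under MCAR the observed mean of y remains unbiased; under MAR and, even more strongly, under MNAR it is pulled downward here, because large x or y values raise the probability of being missing. This is the kind of bias the likelihood-based and extra-parameter methods above are designed to correct.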
227

資料採礦中的資料純化過程之效果評估 / Evaluating the effectiveness of the data purification process in data mining

楊惠如 Unknown Date (has links)
數年來台灣金控公司已如雨後春筍般冒出來,在金控公司底下含有產險公司、銀行、證券以及人壽公司等許多金融相關公司,因此,原本各自擺放於各子公司的資料庫可以通通整合在一起,當高階主管想提出決策時可利用資料庫進行資料採礦,以獲取有用的資訊。然而資料採礦的效果再怎麼神奇,也必須先有一個好的、完整的資料庫供使用,如果資料品質太差或者資料內容與研究目標無關,這是無法達成完美的資料採礦工作。 透過抽樣調查與函數映射的方法使得資料庫得以加值,因此當有目標資料庫與輔助資料庫時,可以利用函數映射方法使資料庫整合為一個大資料庫,再將資料庫中遺失值或稀少值作插補得到增值後的資料庫。在此給予這個整個流程一個名詞 ”Data SPA(Data Systematic Purifying Analysis)”,即「資料純化」。在本研究中,主要就是針對純化完成的資料進行結構的確認,確認經過這些過程之後的資料是效用且正確的。在本研究採用了橫向評估、縱向評估與全面性評估三種方法來檢驗資料。 資料純化後的資料經過三項評估後,可以發現資料以每個變數或者每筆觀察樣本的角度去查驗資料時,資料的表現並不理想,但是,資料的整體性卻是相當不錯。雖然以橫向評估和縱向評估來看,資料純化後的資料無法與原本完整的資料完全一致,但是透過資料純化的過程,資料得以插補且欄位得以擴增,這樣使得資料的資訊量增加,所以,資料純化確實有其效果,因為資訊量的增加對於要進行資料採礦的資料庫是一大助益。 / For the past few years, Taiwan has experienced a tremendous growth in its financial industry namely in banks, life and property insurances, brokerages and security firms. Needless to say the need to store the data produced in this industry has become an important and a primary task to accomplish. Originally, firms store the data in their own database. With the progressive development of data management, the data now can be combined and stored into one large database that allows the users an easy access for data retrieval. However, if the quality of the data is questionable, then the existence of database would not provide much insightful information to the users. To tackle the fore mentioned problem, this research uses functional mapping combining the goal and auxiliary database and then imputes the missing data or the rare data from the combined database. This whole process is called Data Systematic Purifying Analysis (Data SPA). The purpose of this research is to evaluate whether there is any improvement of the structure of the data when the data has gone through the process of systematic purifying analysis. Generally the resulting data should be within good quality and useful. After the assessments of the data structure, the behavior of the data with respect to their added variables and observations is unsatisfactory. 
However, the manifestation of the data as a whole shows considerable improvement. The database modified through Data SPA is augmented, making it more useful for data mining techniques.
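The merge-then-impute flow of Data SPA can be sketched with pandas; the tables, column names, and the group-mean imputation rule are invented for illustration, and the thesis's functional mapping between databases is more involved than a plain key join.

```python
import numpy as np
import pandas as pd

# Target database: customers with some missing income values.
target = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "segment": ["A", "A", "B", "B"],
    "income": [50_000, np.nan, 80_000, np.nan],
})

# Auxiliary database from another subsidiary, keyed on the same customers.
aux = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "policy_count": [2, 1, 3, 2],
})

# Step 1: integrate the two databases into one.
merged = target.merge(aux, on="customer_id", how="left")

# Step 2: impute missing income with the segment mean, a simple stand-in
# for the thesis's imputation of missing and rare values.
merged["income"] = merged.groupby("segment")["income"].transform(
    lambda s: s.fillna(s.mean())
)
```

After the two steps the combined table has no missing income values and carries the auxiliary `policy_count` column, i.e., more usable information than either source alone, which is the effect the evaluation in the thesis measures.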
228

Imputação filogenética: uma perspectiva macroecológica / Phylogenetic imputation: a macroecological perspective

Jardim, Lucas Lacerda Caldas Zanini 27 April 2018 (has links)
Macroecology studies ecological patterns at large geographical and temporal scales. At these scales, information about the hundreds or even thousands of species studied is rarely complete. This lack of information may bias conclusions about macroecological processes and patterns. In this thesis, we evaluated phylogenetic imputation methods and their uses and effects in macroecological studies. The first chapter evaluated different methods used to deal with missing data, taking into account different scenarios of species trait evolution, as well as the percentage and pattern of missing data. We found that how to deal with missing data depends on the specific goals and data of each study. Therefore, we suggest caution when using imputed databases. In the second chapter, we tested the island rule effect on body mass and brain volume of primates. To do so, we fitted evolutionary models to those traits and then imputed the body mass and brain volume of Homo floresiensis.
We concluded that primates do not follow the island rule and, even though our models overestimated, on average, the brain and body size of Homo floresiensis, its evolution did not deviate from the evolutionary expectation for primates. Lastly, in the third chapter, we tested the existence of Bergmann’s rule in mammals using multiple imputation methods, and considered the consequences of ignoring missing data while testing the rule. We found that ignoring missing data can invert (e.g., change from positive to negative) the effect of temperature on body mass, but this bias did not make the effect statistically significant. Therefore, we concluded that mammals do not follow Bergmann’s rule when evaluated at the class level. Finally, this thesis discussed pros, cons and future research avenues to make phylogenetic imputation a more robust tool for dealing with missing data in macroecology. / A macroecologia estuda padrões ecológicos em grandes escalas geográficas e temporais, em busca de quais processos moldam esses padrões. Nessas escalas de estudo, há raramente informações completas sobre as centenas ou até milhares de espécies estudadas. Essa ausência de informações tem o potencial de enviesar as conclusões dos estudos sobre padrões e processos macroecológicos. Nessa tese, nós avaliamos métodos de imputação filogenética, a sua aplicação e consequências em estudos macroecológicos. Para avaliar potenciais vieses do uso de banco de dados imputados, no primeiro capítulo, nós aplicamos diferentes métodos utilizados para tratar dados faltantes, sob diferentes cenários de evolução dos atributos das espécies, porcentagem e padrão dos dados faltantes. Nós encontramos que a forma de tratar o dado faltante pode ser dependente dos objetivos e dos dados de cada estudo e, portanto, nós sugerimos cautela ao utilizarmos bancos de dados imputados.
No segundo capítulo, nós testamos o efeito da regra de ilha na evolução da massa corpórea e do volume cerebral de primatas. A partir dos melhores modelos evolutivos ajustados a esses atributos, nós imputamos a massa corpórea e volume cerebral de Homo floresiensis. Nós concluímos que primatas não seguem regra de ilha e que apesar de nossos modelos superestimarem, em média, o tamanho do corpo e cérebro de Homo floresiensis, a sua evolução não se desvia do esperado pela evolução de primatas. Por fim, no terceiro capítulo testamos a regra de Bergmann em mamíferos, utilizando métodos de imputação múltipla e avaliamos as consequências de desconsiderar os dados faltantes na detecção da regra. Nós encontramos que testar a regra sem considerar os dados faltantes pode inverter o efeito da temperatura na massa do corpo, mas esse viés não tornou o efeito estatisticamente significante. Portanto, concluímos que mamíferos não seguem a regra de Bergmann, quando toda a classe é avaliada. Por fim, essa tese discutiu vantagens, desvantagens e futuras linhas de pesquisa para tornar a imputação filogenética uma ferramenta mais robusta para tratarmos dados faltantes em macroecologia.
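In its simplest form, phylogenetic imputation predicts a missing trait from the values of relatives, down-weighting distant species. The sketch below uses inverse phylogenetic distance weights as a crude stand-in for the model-based methods (such as PhyloPars) evaluated in the thesis.

```python
import numpy as np

def impute_trait(dist, trait):
    """Fill NaNs in `trait` with an inverse-distance-weighted mean over the
    species that do have a value. `dist` is a species-by-species matrix of
    phylogenetic (patristic) distances."""
    trait = np.asarray(trait, dtype=float).copy()
    known = ~np.isnan(trait)
    for i in np.flatnonzero(~known):
        w = 1.0 / np.maximum(dist[i, known], 1e-12)   # closer relatives weigh more
        trait[i] = np.sum(w * trait[known]) / w.sum()
    return trait
```

A species twice as close on the tree contributes twice the weight, so the imputed value leans toward its nearest relatives; the model-based methods in the thesis additionally account for the evolutionary model and report uncertainty, which this sketch does not.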
229

Modelagem de mudanças climáticas: do nicho fundamental à conservação da biodiversidade / Climate change modeling: from the fundamental niche to biodiversity conservation

Faleiro, Frederico Augusto Martins Valtuille 07 March 2016 (has links)
Climate change is one of the major threats to biodiversity, and its impact is expected to increase along the 21st century. Climate change affects all levels of biodiversity, from individuals to biomes, reducing ecosystem services. Despite this, predicting the impacts of climate change on biodiversity is still a challenge. Overcoming these issues depends on improvements in different aspects of the science that supports predictions of climate change impact on biodiversity. The common practice to predict the impact of climate change consists in formulating ecological niche models based on the current climate and projecting the changes based on the future climate predicted by climate models. However, there are some recognized limitations, both in the formulation of the ecological niche models and in the use of predictions from the climate models, that need to be analysed.
Here, in the first chapter, we review the science behind the climate models in order to reduce the knowledge gap between the scientific community that formulates the climate models and the community that uses their predictions. We show that there is no consensus on how to evaluate the climate models, obtain regional models with higher spatial resolution, or define consensus models. However, we give some guidelines for using the predictions of the climate models. In the second chapter, we tested whether the predictions of correlative ecological niche models fitted with presence-absence data match the predictions of models fitted with abundance data on metrics of climate change impact on orchid bees in the Atlantic Forest. We found that the presence-absence models were a partial proxy for change in abundance when the output of the models was continuous, but not when the predictions were converted to binary. The orchid bees will in general decrease in abundance in the future, but will retain a good number of suitable sites, and the distance to newly suitable climatic areas can be very short, despite great variation. The changes in species richness and turnover will occur mainly in the west and in some regions of the south of the Atlantic Forest. In the third chapter, we discussed the drawbacks of using estimates of the realized niche instead of the fundamental niche, such as overpredicting the effect of climate change on species’ extinction risk. We proposed a framework based on phylogenetic comparative and missing data methods to predict the dimensions of the fundamental niche of species with missing data. Moreover, we explore sources of uncertainty in predictions of the fundamental niche and highlight future directions to overcome current limitations of phylogenetic comparative and missing data methods to improve predictions.
We conclude that it is possible to make better use of current knowledge about species' fundamental niches, together with phylogenetic information and auxiliary traits, to predict the fundamental niche of poorly studied species. In the fourth chapter, we use the framework of chapter three to test the performance of two recent phylogenetic modeling methods in predicting the thermal niche of mammals. We show that PhyloPars performed better than Phylogenetic Eigenvector Maps in predicting the thermal niche. Moreover, error and bias had similar phylogenetic patterns for both margins of the thermal niche, while their geographic patterns differed. Variance in performance was explained by taxonomic differences rather than by methodological aspects. Finally, our models predicted the upper margin of the thermal niche better than the lower margin, which is good news for predicting the effect of climate change on species lacking physiological data. We hope our findings can be used to improve predictions of climate change effects on biodiversity in future studies and to support political decisions aimed at minimizing those effects.
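The second chapter's finding — that a continuous suitability output partly tracks abundance while a binarized output does not — can be illustrated with a small sketch. This is not the thesis's method or data: the suitability scores, abundances, and 0.5 threshold below are invented for illustration, and agreement is measured with a plain Spearman rank correlation.

```python
# Minimal sketch (synthetic data): comparing a continuous habitat-suitability
# prediction with observed abundance, before and after binarizing the output.
# All numbers are illustrative, not taken from the thesis.

def ranks(values):
    """Rank values from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def spearman(x, y):
    """Spearman rank correlation for tie-free data."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Hypothetical model output (suitability in [0, 1]) and observed abundance
suitability = [0.9, 0.7, 0.4, 0.2, 0.8, 0.1]
abundance = [30, 12, 5, 0, 20, 1]

# The continuous output tracks abundance closely (rank correlation near 1)...
print(round(spearman(suitability, abundance), 3))  # → 0.943

# ...but a presence threshold collapses sites with very different
# abundances (12 vs 30 individuals) into the same "present" class.
binary = [1 if s >= 0.5 else 0 for s in suitability]
print(binary)  # → [1, 1, 0, 0, 1, 0]
```

The point of the sketch is only that binarization discards the within-class ordering that makes the continuous output a partial proxy for abundance.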

Análise de dados categorizados com omissão / Analysis of categorical data with missingness

Frederico Zanqueta Poleto, 30 August 2006
We consider theoretical, computational, and applied aspects of classical categorical data analyses with missingness. We present a literature review while introducing the missingness mechanisms, highlighting their characteristics and their implications for the inferences of interest by means of an example involving two binary responses and simulation studies. We extend the multinomial modeling scenario described in Paulino (1991, Brazilian Journal of Probability and Statistics 5, 1-42) to the product-multinomial setup to allow for the inclusion of explanatory variables. We develop the results in matrix formulation and implement the computational procedures as a library for the R statistical environment, made available to facilitate the inferences described in this dissertation. We illustrate the application of the theory by means of five examples with different characteristics, fitting structural linear (marginal homogeneity), log-linear (independence, constant adjacent odds ratio), and functional linear models (kappa, weighted kappa, sensitivity/specificity, positive/negative predictive value) for the categorization probabilities. The missingness patterns also vary, including missingness in one or two variables, confounding of neighboring cells, and settings with or without subpopulations.
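For concreteness, the functional summary measures named above can be computed from a fully observed 2x2 table as in the sketch below. This is not the thesis's R library or its estimation machinery (which handles partially observed counts); the counts are made up, the formulas are the standard definitions of Cohen's kappa and the accuracy measures, and weighted kappa is omitted.

```python
# Illustrative sketch (made-up counts): agreement and accuracy measures for a
# fully observed 2x2 table. Rows are the gold standard, columns the test,
# both ordered (positive, negative). The dissertation treats the harder case
# where some of these counts are only partially observed.

def kappa(table):
    """Cohen's kappa for a square contingency table of counts."""
    k, n = len(table), sum(map(sum, table))
    po = sum(table[i][i] for i in range(k)) / n             # observed agreement
    pe = sum(sum(table[i]) * sum(row[i] for row in table)   # chance agreement
             for i in range(k)) / n ** 2
    return (po - pe) / (1 - pe)

def accuracy_measures(table):
    """Sensitivity, specificity, and positive/negative predictive values."""
    (tp, fn), (fp, tn) = table
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

table = [[40, 10],   # gold positive: 40 true positives, 10 false negatives
         [5, 45]]    # gold negative:  5 false positives, 45 true negatives

print(round(kappa(table), 3))  # → 0.7
print(accuracy_measures(table))
```

With missing classifications, these quantities cannot simply be computed from the complete cases without bias unless the missingness mechanism allows it — which is precisely the issue the dissertation's models address.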
