1 |
Distributions of Large Mammal Assemblages in Thailand with a Focus on Dhole (Cuon alpinus) Conservation
Jenks, Kate Elizabeth, 01 May 2012
Biodiversity monitoring and predictions of species occurrence are essential to develop outcome-oriented conservation management plans for endangered species and assess their success over time. To assess distribution and patterns of habitat use of large mammal assemblages in Thailand, with a focus on the endangered dhole (Cuon alpinus), I first implemented a long-term camera-trapping project carried out with park rangers from October 2003 through October 2007 in Khao Yai National Park. This project was extremely successful and may serve as a regional model for wildlife conservation. I found significantly lower relative abundance indices for carnivore species, and collectively for all mammals, compared with data obtained in 1999-2000, suggesting population declines resulting from increased human activity. I integrated these data into maximum entropy modeling (Maxent) to further evaluate whether ranger stations reduced poaching activity and increased wildlife diversity and abundances. I then conducted a focused camera trap survey from January 2008 through February 2010 in Khao Ang Rue Nai Wildlife Sanctuary to gather critical baseline information on dholes, one of the predator species that seemed to have declined over time and that is exposed to continued pressure from humans. Additionally, I led a collaborative effort with other colleagues in the field to collate and integrate camera trap data from 15 protected areas to build a country-wide habitat suitability map for dholes, other predators, and their major prey species. The predicted presence probabilities for sambar (Rusa unicolor) and leopards (Panthera pardus) were the most important variables in predicting dhole presence countrywide. Based on my experience from these different field ecological surveys and endeavors, it became clear that local people's beliefs may have a strong influence on dhole management and conservation.
Thus, I conducted villager interview surveys to identify local attitudes towards dholes, document the status of dholes in wildlife sanctuaries adjacent to Cambodia, and determine the best approach to improve local support for dhole conservation before proceeding with further field studies of the species in Thailand. A photograph of a dhole was correctly identified by only 20% of the respondents. My studies provide evidence that some protected areas in Thailand continue to support a diversity of carnivore species of conservation concern, including clouded leopards (Neofelis nebulosa), dholes, and small felids. However, dholes' impact on prey populations may be increasing as tigers (Panthera tigris) and leopards are extirpated from protected areas. The next step in dhole conservation is to estimate the size and stability of their fragmented populations and also focus on maintaining adequate prey bases that would support both large felids and dholes.
2 |
Probabilistic Modeling of Multi-relational and Multivariate Discrete Data
Wu, Hao, 07 February 2017
Modeling and discovering knowledge from multi-relational and multivariate discrete data is a crucial task that arises in many research and application domains, such as text mining, intelligence analysis, epidemiology, and social science. In this dissertation, we study and address three problems involving the modeling of multi-relational discrete data and multivariate multi-response count data, viz. (1) discovering surprising patterns from multi-relational data, (2) constructing a generative model for multivariate categorical data, and (3) simultaneously modeling multivariate multi-response count data and estimating covariance structures between multiple responses.
To discover surprising multi-relational patterns, we first study the "where do I start?" problem originating from intelligence analysis. By studying nine methods with origins in association analysis, graph metrics, and probabilistic modeling, we identify several classes of algorithmic strategies that can supply starting points to analysts, and thus help to discover interesting multi-relational patterns from datasets. To actually mine for interesting multi-relational patterns, we represent the multi-relational patterns as dense and well-connected chains of biclusters over multiple relations, and model the discrete data by the maximum entropy principle, such that in a statistically well-founded way we can gauge the surprisingness of a discovered bicluster chain with respect to what we already know. We design an algorithm for approximating the most informative multi-relational patterns, and provide strategies to incrementally organize discovered patterns into the background model. We illustrate how our method is adept at discovering the hidden plot in multiple synthetic and real-world intelligence analysis datasets. Our approach naturally generalizes traditional attribute-based maximum entropy models for single relations, and further supports iterative, human-in-the-loop, knowledge discovery.
To build a generative model for multivariate categorical data, we apply the maximum entropy principle to propose a categorical maximum entropy model such that in a statistically well-founded way we can optimally use given prior information about the data, and are unbiased otherwise. Generally, inferring the maximum entropy model could be infeasible in practice. Here, we leverage the structure of the categorical data space to design an efficient model inference algorithm to estimate the categorical maximum entropy model, and we demonstrate how the proposed model is adept at estimating underlying data distributions. We evaluate this approach against both simulated data and US census datasets, and demonstrate its feasibility using an epidemic simulation application.
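As a rough illustration of the maximum entropy idea in the paragraph above, the sketch below fits a distribution over a small categorical data space so that it matches given marginal probabilities and is otherwise as uniform as possible. It uses iterative proportional fitting from a uniform start, which is a standard textbook technique and not the inference algorithm developed in the dissertation. The variables, values, and target probabilities are hypothetical. The brute-force enumeration of every state is exactly what becomes infeasible for realistic data spaces.

```python
from itertools import product


def marginal_of(p, idx):
    """Sum a joint distribution p over all variables except those in idx."""
    out = {}
    for state, mass in p.items():
        key = tuple(state[i] for i in idx)
        out[key] = out.get(key, 0.0) + mass
    return out


def fit_maxent(domains, constraints, sweeps=50):
    """Iterative proportional fitting from a uniform start.

    domains: one list of values per categorical variable.
    constraints: (variable_indices, target) pairs; target maps a tuple of
        values for those variables to the required marginal probability.
    """
    states = list(product(*domains))
    # With no constraints, the uniform distribution has maximum entropy.
    p = {s: 1.0 / len(states) for s in states}
    for _ in range(sweeps):
        for idx, target in constraints:
            current = marginal_of(p, idx)
            # Rescale each state so this marginal matches its target.
            for s in states:
                key = tuple(s[i] for i in idx)
                if current[key] > 0:
                    p[s] *= target.get(key, 0.0) / current[key]
    return p


# Hypothetical toy data space: age group, region, household size.
domains = [["young", "old"], ["north", "south"], [1, 2, 3]]
constraints = [
    ((0, 1), {("young", "north"): 0.3, ("young", "south"): 0.2,
              ("old", "north"): 0.1, ("old", "south"): 0.4}),
    ((2,), {(1,): 0.5, (2,): 0.3, (3,): 0.2}),
]
model = fit_maxent(domains, constraints)
```

Because the two constraints here cover disjoint sets of variables, the fitted model treats household size as independent of the (age group, region) pair: nothing in the prior information links them, so the model adds no dependence of its own.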
Modeling data with multivariate count responses is a challenging problem due to the discrete nature of the responses. Existing methods for univariate count responses cannot be easily extended to the multivariate case since the dependency among multiple responses needs to be properly accounted for. To model multivariate data with multiple count responses, we propose a novel multivariate Poisson log-normal model (MVPLN). By simultaneously estimating the regression coefficients and inverse covariance matrix over the latent variables with an efficient Monte Carlo EM algorithm, the proposed model takes advantage of associations among multiple count responses to improve the model prediction accuracy. Simulation studies and applications to real-world data are conducted to systematically evaluate the performance of the proposed method in comparison with conventional methods.
Ph. D.
In this decade of big data, massive data of various types are generated every day from different research areas and industry sectors. Among all these types of data, text data, i.e. text documents, are important to many research and real-world applications. One challenge faced when analyzing massive text data is deciding which documents to investigate first to initialize the analysis and how to identify stories and plots, if any, that are hidden inside the massive text documents. For example, in intelligence analysis, when analyzing intelligence documents, some common questions that analysts ask are ‘How is a suspect connected to the passenger manifest on this flight?’ and ‘How do distributed terrorist cells interface with each other?’. This crucial task is known as storytelling. In the first half of this dissertation, we will study this problem and design mathematical models and computer algorithms to automatically identify useful information from text data to help analysts discover hidden stories and plots from massive text documents.
We also incorporate visual analytics techniques and design a visualization system to support human-in-the-loop exploratory data analysis so that analysts could interact with the algorithms and models iteratively to investigate given datasets.
In the second half of this dissertation, we study two problems that arise in the domain of public health. When an epidemic of a certain disease occurs, e.g. during flu season, public health officials need to set policies in advance to prevent or alleviate the epidemic. A data-driven approach would be to make such public health policies using simulation results and predictions based on historical data. One problem commonly faced in epidemic simulation is that researchers would like to run simulations with real-world data, so that the results are close to real-world scenarios, while still protecting the private information of individuals. To solve this problem, we design and implement a mathematical model that can generate realistic synthetic populations from U.S. Census survey data to support epidemic simulation. Using influenza as an example, we also propose a mathematical model to study associations between different types of flu using information collected from social media, such as Twitter. We believe that identifying such associations will help officials make appropriate public health policies.
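For a concrete picture of the multivariate Poisson log-normal model named in this abstract, the sketch below simulates data from the usual form of such a model: each count is Poisson with rate exp(z), and the latent vector z is Gaussian with a mean that depends linearly on covariates and a covariance that couples the responses. This is only the generative side under assumed values. The coefficient matrix B, the Cholesky factor, and the sample size are hypothetical, and the Monte Carlo EM estimation described in the abstract is not attempted here.

```python
import math
import random


def sample_poisson(rate, rng):
    """Knuth's multiplication method; adequate for the small rates used here."""
    threshold = math.exp(-rate)
    count, prod = 0, rng.random()
    while prod > threshold:
        count += 1
        prod *= rng.random()
    return count


def simulate_mvpln(X, B, chol, rng):
    """Draw count vectors y_i with y_ij ~ Poisson(exp(z_ij)).

    The latent z_i is Gaussian with mean B^T x_i and covariance
    Sigma = chol * chol^T, where chol is lower triangular.
    """
    k = len(chol)
    Y = []
    for x in X:
        mean = [sum(x[d] * B[d][j] for d in range(len(x))) for j in range(k)]
        eps = [rng.gauss(0.0, 1.0) for _ in range(k)]
        # Correlate the standard normal draws through the Cholesky factor.
        z = [mean[j] + sum(chol[j][m] * eps[m] for m in range(j + 1))
             for j in range(k)]
        Y.append([sample_poisson(math.exp(zj), rng) for zj in z])
    return Y


rng = random.Random(0)
# Each row of X: an intercept plus one hypothetical covariate.
X = [[1.0, rng.uniform(-1, 1)] for _ in range(500)]
B = [[1.0, 0.5], [0.3, -0.2]]      # hypothetical coefficients, covariates x responses
chol = [[0.5, 0.0], [0.4, 0.3]]    # Sigma = [[0.25, 0.20], [0.20, 0.25]]
Y = simulate_mvpln(X, B, chol, rng)
```

The off-diagonal entry of Sigma is what lets one count response carry information about another, which is the association the abstract proposes to exploit for prediction.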