101 |
Estudo sobre a aplicação de estatística bayesiana e método de máxima entropia em análise de dados / Study on application of Bayesian statistics and method of maximum entropy in data analysis / Perassa, Eder Arnedo, 1982- 19 April 2007
Advisor: Jose Augusto Chinellato / Dissertation (master's) - Universidade Estadual de Campinas, Instituto de Fisica Gleb Wataghin / Abstract: In this work, we study the methods of Bayesian statistics and maximum entropy in data analysis. We review the basic concepts and procedures that can be used to infer probability distributions. The methods are applied in several areas of interest, with special attention to cases where little information is available about the data set, as can occur in physics experiments in fields such as high-energy physics and astrophysics. Algorithms for implementing these methods are presented, along with detailed examples intended to help readers interested in applying them to the more common cases of data analysis. / Master's / Physics of Elementary Particles and Fields / Master in Physics
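As a small illustration of the kind of maximum entropy inference this abstract refers to, the sketch below finds the least informative distribution over the faces of a die when only its mean is known. This is a textbook exercise and is not taken from the dissertation; the function names `maxent_die` and `entropy` and the target mean of 4.5 are assumptions chosen for illustration.

```python
import math


def maxent_die(target_mean, faces=(1, 2, 3, 4, 5, 6), tol=1e-12):
    """Maximum entropy distribution over `faces` with a fixed mean.

    The solution has the exponential form p_i proportional to
    exp(lam * x_i); the Lagrange multiplier `lam` is found by bisection.
    """
    if not min(faces) < target_mean < max(faces):
        raise ValueError("target mean must lie strictly inside the face range")

    def distribution(lam):
        exponents = [lam * x for x in faces]
        shift = max(exponents)  # subtract the maximum for numerical stability
        weights = [math.exp(e - shift) for e in exponents]
        total = sum(weights)
        return [w / total for w in weights]

    def mean(lam):
        return sum(p * x for p, x in zip(distribution(lam), faces))

    lo, hi = -50.0, 50.0  # the mean is an increasing function of lam
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mean(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    return distribution((lo + hi) / 2)


def entropy(p):
    """Shannon entropy in nats."""
    return -sum(q * math.log(q) for q in p if q > 0)


# Hypothetical constraint: the observed average roll is 4.5.
probs = maxent_die(4.5)
```

With a mean of 3.5 the constraint carries no information beyond normalisation, so the same function returns the uniform distribution, which is the maximum entropy answer when nothing else is known.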
|
102 |
Modelagem de distribuição geográfica para Hydromedusa maximiliani (Mikan, 1820) (Testudines, Chelidae), Brasil / Geographic distribution modeling for Hydromedusa maximiliani (Mikan, 1820) (Testudines, Chelidae), Brazil / Lima, Lúcio Moreira Campos 27 February 2014
CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / Maximilian's snake-necked turtle, Hydromedusa maximiliani, is a species endemic to the Atlantic Forest and listed as Vulnerable by the IUCN. Its populations are associated mainly with streams inside the forest, but its geographic distribution in relation to these hydrologically dynamic environments is still poorly understood. Species distribution modeling has been widely used in recent years, and the maximum entropy algorithm, Maxent, predicts the potential geographic distribution of a species from presence data. This study aimed to build ecological models that predict the potential distribution of H. maximiliani, to support the development of new conservation strategies and to advance knowledge of the species' distribution pattern in regions within the Atlantic Forest domain. Occurrence data were obtained between September 2012 and September 2013 through visits to zoological collections, literature surveys, and the collection of geographic coordinates in the field. The model was built with the Maxent algorithm, aided by ArcGIS version 10 and the digital elevation model of the "Shuttle Radar Topographic Mission". Environmental variables were obtained from Worldclim version 1.1 Global Climate Surface 10. The model was evaluated by the AUC (Area Under the ROC Curve) and by the jackknife statistical test. In total, 42 occurrence points were compiled for H. maximiliani. The potential distribution extended from southern Bahia to the state of São Paulo. The model showed high predictive ability, with an AUC above 0.97, and satisfactory transferability (i.e., the ability to predict distributions in unsampled regions). A high AUC indicates a good model of a species' potential geographic distribution. In large-scale models, however, this value can grow in proportion to the size of the scale, which could lead to a misinterpretation of the model. Nevertheless, the areas predicted for H. maximiliani in this study were realistic and consistent with the actual distribution of the species.
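The AUC reported above can be read as the probability that a randomly chosen presence site receives a higher suitability score than a randomly chosen background site. The sketch below computes it by direct pairwise comparison. It is not Maxent's own implementation; the function name `auc` and the scores are made up for illustration.

```python
def auc(presence_scores, background_scores):
    """Area under the ROC curve by pairwise comparison (ties count one half)."""
    if not presence_scores or not background_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


# Made-up suitability scores, for illustration only.
presence = [0.91, 0.85, 0.78, 0.66]
background = [0.70, 0.40, 0.35, 0.20, 0.10]
score = auc(presence, background)  # 19 of the 20 pairs are ranked correctly
```

Read this way, the abstract's caveat is easier to see: adding background sites that are trivially unsuitable raises the share of correctly ranked pairs without making the model any better inside the species' range.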
|
103 |
Reconhecimento de entidades mencionadas em português utilizando aprendizado de máquina / Portuguese named entity recognition using machine learning / Wesley Seidel Carvalho 24 February 2012
Named Entity Recognition (NER) is a subtask of information extraction that aims to locate and classify elements of a text into predefined categories such as names of people, organizations, places, dates, and other classes of interest. The knowledge obtained enables more advanced tasks. NER can be considered one of the first steps towards the semantic analysis of texts, and it is a crucial subtask for document management, text mining, and information extraction systems, among others. In this thesis, I study several Machine Learning methods applied to NER that are related to the current state of the art, including two methods applied to Portuguese. I present three ways of evaluating such systems found in the literature. I also develop an NER system for Portuguese using Machine Learning, more specifically the maximum entropy framework. The results are comparable to those of the best NER systems for Portuguese developed with other Machine Learning approaches.
|
104 |
Statistické jazykové modely založené na neuronových sítích / STATISTICAL LANGUAGE MODELS BASED ON NEURAL NETWORKS / Mikolov, Tomáš January 2012
Statistical language models are an important part of many successful applications, including automatic speech recognition and machine translation (a well-known example is Google Translate). Traditional techniques for estimating these models are based on so-called N-grams. Despite the known shortcomings of these techniques and the enormous effort of research groups across many fields (speech recognition, machine translation, neuroscience, artificial intelligence, natural language processing, data compression, psychology, etc.), N-grams have essentially remained the most successful technique. The goal of this thesis is to present several language model architectures based on neural networks. Although these models are computationally more expensive than N-gram models, the techniques developed in this thesis make their efficient use in real applications possible. The reduction in speech recognition errors relative to the best N-gram models reaches 20%. A model based on a recurrent neural network achieves the best published results on a very well-known data set (Penn Treebank).
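For readers unfamiliar with the N-gram baseline this abstract compares against, the sketch below estimates a bigram model with add-one smoothing and scores text by perplexity. It is a minimal illustration and not the thesis's code; the function names, the smoothing choice and the toy corpus are assumptions.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count contexts and bigrams over tokenized sentences with boundary markers."""
    contexts, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        contexts.update(padded[:-1])  # every token that has a successor
        bigrams.update(zip(padded[:-1], padded[1:]))
    # Words that can be predicted: all tokens plus the end-of-sentence marker.
    vocab = {w for s in sentences for w in s} | {"</s>"}
    return contexts, bigrams, vocab


def bigram_prob(prev, word, contexts, bigrams, vocab):
    """Add-one smoothed estimate of P(word | prev)."""
    return (bigrams[(prev, word)] + 1) / (contexts[prev] + len(vocab))


def perplexity(sentences, contexts, bigrams, vocab):
    """Exponential of the average negative log-probability per predicted token."""
    log_sum, count = 0.0, 0
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        for prev, word in zip(padded[:-1], padded[1:]):
            log_sum += math.log(bigram_prob(prev, word, contexts, bigrams, vocab))
            count += 1
    return math.exp(-log_sum / count)


# Toy corpus, for illustration only.
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
contexts, bigrams, vocab = train_bigram(corpus)
ppl = perplexity(corpus, contexts, bigrams, vocab)
```

A model of this kind conditions only on the previous word, which is the limitation that neural architectures such as the recurrent network mentioned in the abstract are meant to relax.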
|
105 |
Performance Evaluation and Prediction of 2-D Markovian and Bursty Multi-Traffic Queues. Analytical Solution for 2-D Markovian and Bursty Multi-Traffic Non Priority, Priority and Hand Off Calling Schemes. / Karamat, Taimur January 2010
Queueing theory is the mathematical study of queues, or waiting lines, which form whenever demand for service exceeds the capacity to provide it. A queueing system is composed of customers, packets or calls that need some kind of service. These entities arrive at the queueing system, join a queue if service is not immediately available, and leave the system after receiving service. There are also cases in which customers, packets or calls leave the system without joining the queue, or drop out without receiving service even after waiting for some time. Queueing network models with finite capacity have facilitated the analysis of discrete flow systems, such as computer systems, transportation networks, manufacturing systems and telecommunication networks, by providing powerful and realistic tools for performance evaluation and prediction.
In wireless cellular systems, mobility is the most important feature, and continuous service is achieved by supporting handoff from one cell to another. Handoff is the process of changing the channel associated with the current connection while a call is in progress. A handoff is required when a mobile terminal moves from one cell to another or when the signal quality deteriorates in the current cell. Since neighbouring cells use disjoint subsets of frequency bands, negotiation must take place between the mobile terminal, the current base station and the next potential base station. A poorly designed handoff scheme significantly decreases quality of service (QoS). Different schemes have been devised, and in these schemes handoff calls are prioritised.
Also, most performance evaluation techniques consider the case where the arrival process is Poisson and the service time is exponential, i.e. there are single arrivals and single departures. In practice, however, cellular traffic is bursty, i.e. there can be bulk arrivals and bulk departures. Another issue is that the assumptions made by stochastic process models are not satisfied. Most of the effort is concentrated on providing different interpretations of M/M queues rather than attempting to provide a new methodology.
In this thesis, performance evaluation models are devised for multi-traffic cellular systems with bursty traffic, i.e. non-priority, priority and handoff calling schemes. Moreover, extensions are carried out towards the analysis of a multi-traffic M/M queueing system, and state probabilities are calculated analytically.
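To make "state probabilities calculated analytically" concrete, the sketch below evaluates the textbook single-class M/M/1/N queue, the Poisson and exponential baseline that the abstract contrasts with bursty traffic. It is not the thesis's multi-traffic model; the function name and the rates are assumptions chosen for illustration.

```python
def mm1n_state_probabilities(arrival_rate, service_rate, capacity):
    """Steady-state probabilities p_0..p_N of an M/M/1/N queue.

    p_n is proportional to rho**n with rho = arrival_rate / service_rate,
    where n counts customers in the system, including the one in service.
    """
    if arrival_rate <= 0 or service_rate <= 0 or capacity < 1:
        raise ValueError("rates must be positive and capacity at least 1")
    rho = arrival_rate / service_rate
    weights = [rho ** n for n in range(capacity + 1)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical cell: 8 calls/s offered, 10 calls/s served, room for 5 calls.
probs = mm1n_state_probabilities(arrival_rate=8.0, service_rate=10.0, capacity=5)
blocking = probs[-1]  # an arrival is lost when it finds the system full
mean_in_system = sum(n * p for n, p in enumerate(probs))
```

Because arrivals are Poisson, the probability that an arriving call is blocked equals the probability that the system is full, which is why `blocking` is simply the last state probability.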
|
106 |
Digital Signal Processing of SARSAT Signals Using the MEM and FFT / Chung, Kwai-Sum Thomas 07 1900
This thesis investigates the processing of emergency locator transmitter (ELT) signals, which are used in search and rescue satellite-aided tracking (SARSAT) systems. Essentially, the system relies on ELT signals transmitted from a distressed platform being relayed through an orbiting satellite to an earth station, where signal processing can be performed.
The signal processing methods investigated here include both linear and nonlinear ones. The linear methods include windowing, the autocorrelation function, digital filtering and the Fast Fourier Transform (FFT). The nonlinear processing is based on the Maximum Entropy Method (MEM). In addition, additive white Gaussian noise has been added to simulate performance under different carrier-to-noise density ratio conditions.
For a single ELT signal, the thesis shows that the MEM processor gives good spectral performance compared with the FFT for all types of modulation. When multiple ELT signals are present, the MEM also provides certain benefits in spectral performance compared with the FFT. / Thesis / Master of Engineering (ME)
|
107 |
Using Geospatial Techniques to Assess Responses of Black Bear Populations to Anthropogenically Modified Landscapes: Conflict & Recolonization / McFadden, Jamie Elizabeth 14 December 2018
The convergence of three young scientific disciplines (ecology, geospatial sciences, and remote sensing) has generated unique advancements in wildlife research by connecting ecological data with remote sensing data through the application of geospatial techniques. Ecological datasets may contain spatial and sampling biases. By using geospatial techniques, datasets may be useful in revealing landscape scale (e.g., statewide) trends for wildlife populations, such as population recovery and human-wildlife interactions. Specifically, black bear populations across North America vary greatly in their degree of distribution stability. The black bear population in Michigan may be considered stable or secure, whereas the population in Missouri is currently recolonizing. The focus of the research in this dissertation is to examine the ecological and anthropogenic impacts 1) on human-black bear interactions in Michigan (see Chapter 2) and 2) on black bear presence in Missouri (see Chapter 3), through the use of black bear reports provided by the public to the state wildlife agencies. By using generalized linear modeling (GLM) and maximum entropy (MaxEnt), I developed spatial distribution models of probability of occurrence/presence for the 2 study areas (Michigan and Missouri). For the Missouri study, I quantified the spatiotemporal shifts in the probability of bear presence statewide. The results from my statewide studies corroborate previous local-scale research based on rigorous data collection. Overall, human-black bear interactions (e.g., wildlife sightings, conflicts), while very dynamic, appear greatest in forested and rural areas where the preferred habitat for black bears (i.e., forest) intersects with low density anthropogenic activities. As both human and black bear populations continue to expand, it is reasonable to expect human-black bear interactions to spatiotemporally increase across both study areas. 
The results from my studies provide wildlife managers with information critical to management decisions such as harvest regulations and habitat conservation actions across the landscape and through time. The ability to detect and monitor ecological changes through the use of geospatial techniques can lead to insights about the stressors and drivers of population-level change, further facilitating the development of proactive cause-focused management strategies.
|
108 |
Lyme Disease and Forest Fragmentation in the Peridomestic Environment / Telionis, Pyrros A. 14 May 2020
Over the last 20 years, Lyme disease has grown to become the most common vector-borne disease affecting Americans. Spread in the eastern U.S. primarily by the bite of Ixodes scapularis, the black-legged tick, the disease affects an estimated 329,000 Americans per year. Originally confined to New England, it has since spread across much of the east coast and has become endemic in Virginia. Since 2010 the state has averaged 1200 cases per year, with 200 annually in the New River Health District (NRHD), the location of our study.
Efforts to geographically model Lyme disease primarily focus on landscape and climatic variables. The disease depends heavily on the survival of the tick vector and of the white-footed mouse, the primary reservoir. Both depend on the existence of forest-herbaceous edge-habitats, as well as warm summer temperatures, mild winter lows, and summer wetness. While many studies have investigated the effect of forest fragmentation on Lyme, none have made use of high-resolution land cover data to do so at the peridomestic level.
To fill this knowledge gap, we made use of the Virginia Geographic Information Network’s 1-meter land cover dataset and identified forest-herbaceous edge-habitats for the NRHD. We then calculated the density of these edge-habitats at 100, 200 and 300-meter radii, representing the peridomestic environment. We also calculated the density of <2-hectare forest patches at the same distance thresholds. To avoid confounding from climatic variation, we also calculated mean summer temperatures, total summer rainfall, and number of consecutive days below freezing of the prior winters. Adding to these data, elevation, terrain shape index, slope, and aspect, and including lags on each of our climatic variables, we created environmental niche models of Lyme in the NRHD. We did so using both Boosted Regression Trees (BRT) and Maximum Entropy (MaxEnt) modeling, the two most common niche modeling algorithms in the field today.
We found that Lyme is strongly associated with a higher density of developed-herbaceous edges within 100 meters of the home. Forest patch density was also significant at both the 100-meter and 300-meter levels. This supports the notion that the fine-scale peridomestic environment is significant to Lyme outcomes, and must be considered even if one were to account for fragmentation at a wider scale, as well as variations in climate and terrain. / M.S. / Lyme disease is the most common vector-borne disease in the United States today. Infecting about 330,000 Americans per year, the disease continues to spread geographically. Originally found only in New England, the disease is now common in Virginia. The New River Health District, where we did our study, sees over 200 cases per year.
Lyme disease is mostly spread by the bite of the black-legged tick. As such we can predict where Lyme cases might be found if we understand the environmental needs of these ticks. The ticks themselves depend on warm summer temperatures, mild winter lows, and summer wetness. But they are also affected by forest fragmentation which drives up the population of white-footed mice, the tick’s primary host. The mice are particularly fond of the interface between forests and open fields. These edge habitats provide food and cover for the mice, and in turn support a large population of ticks.
Many existing studies have demonstrated this link, but all have done so across broad scales such as counties or census tracts. To our knowledge, no such studies have investigated forest fragmentation near the home of known Lyme cases. To fill this gap in our knowledge, we made use of high-resolution forest cover data to identify forest-field edge habitats and small isolated forest patches. We then calculated the total density of both within 100, 200 and 300 meters of the homes of known Lyme cases, and compared these to values from non-cases using statistical modeling. We also included winter and summer temperatures, rainfall, elevation, slope, aspect, and terrain shape.
We found that a large amount of forest-field edges within 100 meters of a home increases the risk of Lyme disease to residents of that home. The same can be said for isolated forest patches. Even after accounting for all other variables, this effect was still significant. This information can be used by health departments to predict which neighborhoods may be most at risk for Lyme. They can then increase surveillance in those areas, warn local doctors, or send out educational materials.
|
109 |
Maintaining QoS through preferential treatment to UMTS services / Awan, Irfan U., Al-Begain, Khalid January 2003
One of the main features of third generation (3G) mobile networks is their capability to provide different classes of services, especially multimedia and real-time services, in addition to the traditional telephony and data services. These new services, however, impose stricter Quality of Service (QoS) constraints on the network, mainly regarding delay, delay variation and packet loss. Additionally, the overall traffic profile, both on the air interface and inside the network, will be rather different from that of today's mobile networks. Therefore, providing QoS for the new services will require not only what a call admission control algorithm can achieve at the border of the network, but also continuous buffer control in both the wireless and the fixed part of the network to ensure that higher priority traffic is treated properly. This paper proposes and analytically evaluates a buffer management scheme based on multi-level priority and a Complete Buffer Sharing (CBS) policy for all buffers at the border of and inside the wireless network. The analytical model is based on the G/G/1/N censored queue with a single server and R (R ≥ 2) priority classes under the Head of Line (HoL) service rule for the CBS scheme. The traffic is modelled using the Generalised Exponential distribution. The paper presents an approximate analytical solution based on the Maximum Entropy (ME) principle. The numerical results show the capability of the buffer management scheme to provide higher QoS for the higher priority service classes.
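The Generalised Exponential (GE) distribution mentioned above is commonly written in the maximum entropy queueing literature as F(t) = 1 - tau * exp(-tau * nu * t), with tau = 2 / (C^2 + 1), mean 1/nu and squared coefficient of variation C^2. The abstract does not show the paper's own notation, so the sketch below assumes that parametrisation; the function names are hypothetical.

```python
import random


def ge_parameters(mean, scv):
    """Return (tau, rate) for a GE distribution with the given mean and
    squared coefficient of variation (assumed scv >= 1)."""
    if mean <= 0 or scv < 1:
        raise ValueError("mean must be positive and scv at least 1")
    tau = 2.0 / (scv + 1.0)
    rate = tau / mean  # rate of the exponential branch
    return tau, rate


def ge_sample(mean, scv, rng):
    """Draw one GE-distributed interval.

    With probability 1 - tau the interval is zero, which models members of
    the same batch arriving together; otherwise it is exponential.
    """
    tau, rate = ge_parameters(mean, scv)
    if rng.random() >= tau:
        return 0.0
    return rng.expovariate(rate)


def ge_moments(mean, scv):
    """Mean and variance implied by (tau, rate), as a consistency check."""
    tau, rate = ge_parameters(mean, scv)
    first = tau / rate
    second = 2.0 * tau / rate ** 2
    return first, second - first ** 2


# Hypothetical traffic stream: mean interarrival time 2.0, scv 4.0.
m, v = ge_moments(2.0, 4.0)
rng = random.Random(0)
samples = [ge_sample(2.0, 4.0, rng) for _ in range(20000)]
zero_fraction = sum(1 for s in samples if s == 0.0) / len(samples)
```

Setting the squared coefficient of variation to 1 gives tau = 1, and the GE distribution reduces to the ordinary exponential, so a larger value is what introduces burstiness into the model.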
|
110 |
Performance Modelling of GPRS with Bursty Multi-class Traffic. / Kouvatsos, Demetres D., Awan, Irfan U., Al-Begain, Khalid January 2003
An analytic framework is devised, based on the principle of maximum entropy (ME), for the performance modelling and evaluation of a wireless GSM/GPRS cell supporting bursty multiple class traffic of voice calls and data packets under complete partitioning (CPS), partial sharing (PSS) and aggregate sharing (ASS) traffic handling schemes. Three distinct open queueing network models (QNMs) under CPS, PSS and ASS, respectively, are described, subject to external compound Poisson traffic processes and generalised exponential (GE) transmission times under a repetitive service blocking mechanism and a complete buffer sharing management rule. Each QNM generally consists of three building block stations, namely a loss system with GSM/GPRS traffic and a system of access and transfer finite capacity queues in tandem dealing with GPRS traffic under head-of-line and discriminatory processor sharing scheduling disciplines, respectively. The analytic methodology is illustrated by focusing on the performance study of the GE-type tandem queueing system for GPRS under CPS. An ME product-form approximation is characterised, leading to a decomposition of the tandem system into individual queues, and closed-form ME expressions for state and blocking probabilities are presented. Typical numerical examples are included to validate the ME solutions against simulation and to study the effect of external GPRS bursty traffic upon the performance of the cell. Moreover, an overview is included of recent extensions of the work towards the analysis of a GE-type multiple server finite capacity queue with preemptive resume priorities, and of its implications for the performance modelling and evaluation of GSM/GPRS cells with PSS and ASS.
|