31 |
Measuring Linguistic and Cultural Evolution Using Books and Tweets
Gray, Tyler 01 January 2019
Written language provides a snapshot of linguistic, cultural, and current events information for a given time period. Aggregating these snapshots by studying many texts over time reveals trends in the evolution of language, culture, and society. The ever-increasing amount of electronic text, arising both from the digitization of books and other paper documents and from the increasing frequency with which electronic text is used as a means of communication, has given us an unprecedented opportunity to study these trends. In this dissertation, we use hundreds of thousands of books spanning two centuries scanned by Google, and over 100 billion messages, or ‘tweets’, posted to the social media platform Twitter over the course of a decade, to study the English language and the evolution of culture and society as inferred from changes in language.
We begin by studying the current state of verb regularization and how this compares between the more formal writing of books and the more colloquial writing of tweets on Twitter. We find that the extent of verb regularization is greater on Twitter, taken as a whole, than in English Fiction books, and also for tweets geotagged in the United States relative to American English books, but the opposite is true for tweets geotagged in the United Kingdom relative to British English books. We also find interesting regional variations in regularization across counties in the United States. However, once differences in population are accounted for, we do not identify strong correlations with socio-demographic variables.
Next, we study stretchable words, a fundamental aspect of spoken language that, until the advent of social media, was rarely observed within written language. We examine the frequency distributions of stretchable words and introduce two central parameters that capture their main characteristics of balance and stretch. We explore their dynamics by creating visual tools we call ‘balance plots’ and ‘spelling trees’. We also discuss how the tools and methods we develop could be used to study mistypings and misspellings, and may have further applications both within and beyond language.
Finally, we take a closer look at the English Fiction n-gram dataset created by Google. We begin by explaining why using token counts as a proxy of word, or more generally, ‘n-gram’, importance is fundamentally flawed. We then devise a method to rebuild the Google Books corpus so that meaningful linguistic and cultural trends may be reliably discerned. We use book counts as the primary ranking for an n-gram and use subsampling to normalize across time to mitigate the extraneous results created by the underlying exponential increase in data volume over time. We also combine the subsampled data over a number of years as a method of smoothing. We then use these improved methods to study linguistic and cultural evolution across the last two centuries. We examine the dynamics of Zipf distributions for n-grams by measuring the churn of language reflected in the flux of n-grams across rank boundaries. Finally, we examine linguistic change using wordshift plots and a rank divergence measure with a tunable parameter to compare the language of two different time periods. Our results address several methodological shortcomings associated with the raw Google Books data, strengthening the potential for cultural inference by word changes.
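The flux of n-grams across a rank boundary, mentioned in the abstract above, can be illustrated with a minimal sketch. It assumes each period is represented by a list of n-grams already ranked from most to least common (for instance by book count); the function name `rank_flux` and the toy rankings are hypothetical and are not taken from the dissertation.

```python
def rank_flux(ranked_before, ranked_after, boundary):
    """Return the n-grams that cross a rank boundary between two periods.

    ranked_before / ranked_after: n-grams ordered from most to least common
    (assumed here to be ranked by the number of books containing them).
    boundary: the rank cutoff; the "top" set is the first `boundary` entries.
    """
    top_before = set(ranked_before[:boundary])
    top_after = set(ranked_after[:boundary])
    entered = top_after - top_before   # moved into the top ranks
    exited = top_before - top_after    # dropped out of the top ranks
    return entered, exited


# Toy, made-up rankings for two periods.
before = ["the", "of", "and", "thou", "ship"]
after = ["the", "and", "of", "ship", "phone"]
entered, exited = rank_flux(before, after, boundary=4)
print(sorted(entered), sorted(exited))
```

Because both top sets have the same size, the number of n-grams entering always equals the number leaving, so a single count per boundary is enough to summarize churn.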
|
32 |
Conducting water and sanitation survey using Personal Digital Assistants and Geographic Information System technologies in rural Zimbabwe
Ntozini, Robert 06 1900
Access to clean water and improved sanitation is a basic human right. This quantitative, descriptive study sought to establish current water and sanitation coverage in the Chirumanzu and Shurugwi districts of Zimbabwe and to develop methods of assessing coverage using Geographic Information Systems. Google Earth was used to identify homesteads. Personal digital assistant-based forms were used to collect geo-referenced data on all water points and selected households. Geospatial analysis methods were used to calculate borehole water coverage.
Using Google Earth, 29375 homesteads were identified. The water survey mapped 4134 water points; 821 were boreholes, and only 548 were functional. Functional borehole water coverage was 57.3%, 46.2%, and 33.5% for household-to-water-point distances within 1500 m, 1000 m, and 500 m, respectively. Sanitation coverage was 44.3%, but 96% of the latrines did not meet Blair Ventilated Pit latrine standards. / Health Studies / M.A. (Public Health) (Medical Informatics)
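Distance-based coverage of the kind reported above can be sketched as the share of households lying within a threshold distance of at least one functional borehole. The sketch below assumes straight-line (great-circle) distance and made-up coordinates; the abstract does not say which geospatial method the study used, and the function names are hypothetical.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    radius_m = 6371000.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))


def coverage(households, boreholes, max_dist_m):
    """Fraction of households within max_dist_m of any functional borehole."""
    if not households:
        return 0.0
    covered = sum(
        1
        for h_lat, h_lon in households
        if any(haversine_m(h_lat, h_lon, b_lat, b_lon) <= max_dist_m
               for b_lat, b_lon in boreholes)
    )
    return covered / len(households)


# Made-up coordinates: one borehole, three households at about 0 m, 840 m and 5.2 km.
boreholes = [(-19.70, 30.000)]
households = [(-19.70, 30.000), (-19.70, 30.008), (-19.70, 30.050)]
for threshold in (1500, 1000, 500):
    print(threshold, round(coverage(households, boreholes, threshold), 3))
```

As in the reported figures, coverage can only stay the same or fall as the distance threshold shrinks.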
|
33 |
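Análise geoespacial dos casos de dengue e sua relação com fatores socioambientais nos municípios de João Pessoa, Cabedelo e Bayeux [Geospatial analysis of dengue cases and their relationship with socio-environmental factors in the municipalities of João Pessoa, Cabedelo and Bayeux]
Almeida, Caio Américo Pereira de 19 December 2016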
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - CAPES / This work analyzes the spatial behavior of dengue cases that occurred in João Pessoa, Cabedelo, and Bayeux between 2011 and 2014, taking into account the influence of socio-environmental factors. Methodologically, it applies statistical techniques for geospatial analysis as a public health management tool, using geographic space as the category of analysis. The study also uses geoprocessing techniques to support the mapping and epidemiological characterization of dengue. Atmospheric data were obtained from the National Institute of Meteorology (INMET), socioeconomic data from the Brazilian Institute of Geography and Statistics (IBGE), and epidemiological data, as registered in Sinan, from the Municipal Health Departments of João Pessoa, Cabedelo, and Bayeux. The statistical parameters generated were the coefficient of determination (R²), the Pearson coefficient, the average nearest-neighbor distance, the weighted geographic center, and the standard distance. Among the socio-environmental factors, the variables most strongly correlated with dengue cases were air humidity, rainfall, permanent private dwellings, residents of permanent private dwellings, householders with a nominal monthly income of up to ½ minimum wage, and householders with a nominal monthly income of ½ to 3 minimum wages. The months from March to August had the greatest number of dengue cases (83%) and the highest rainfall and air humidity. The neighborhoods with the most dengue cases were Mangabeira, Castelo Branco, Cruz das Armas, Oitizeiro, and Cristo Redentor, all in the city of João Pessoa. However, the neighborhoods with the highest prevalence were São José in João Pessoa and Morada Nova and Camboinha in Cabedelo. Socio-environmental factors such as improper waste disposal and settlements without garbage collection or public services, combined with climatic conditions, emerged as the main drivers of dengue occurrence.
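Two of the statistical parameters named in this abstract, the weighted geographic center and the standard distance, can be sketched briefly. The example assumes points in a projected coordinate system (planar x/y, for example metres) weighted by case counts; the function names and the toy data are hypothetical.

```python
import math


def weighted_mean_center(points, weights):
    """Weighted mean of x and y coordinates (the weighted geographic center)."""
    total = sum(weights)
    cx = sum(w * x for (x, _), w in zip(points, weights)) / total
    cy = sum(w * y for (_, y), w in zip(points, weights)) / total
    return cx, cy


def standard_distance(points, weights):
    """Weighted root-mean-square distance of the points from their center."""
    cx, cy = weighted_mean_center(points, weights)
    total = sum(weights)
    spread = sum(
        w * ((x - cx) ** 2 + (y - cy) ** 2) for (x, y), w in zip(points, weights)
    ) / total
    return math.sqrt(spread)


# Made-up neighborhood centroids (planar coordinates) and dengue case counts.
centroids = [(0.0, 0.0), (4.0, 0.0)]
cases = [3, 1]
print(weighted_mean_center(centroids, cases))  # center is pulled toward the heavier point
print(standard_distance(centroids, cases))
```

A small standard distance indicates cases concentrated near the center; a large one indicates a dispersed pattern.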
|
34 |
Modeling Occurrence and Assessing Public Perceptions of De Facto Wastewater Reuse across the USA
January 2014
The National Research Council 2011 report lists quantifying the extent of de facto (or unplanned) potable reuse in the U.S. as the top research need associated with assessing the potential for expanding the nation's water supply through reuse of municipal wastewater. Efforts to identify the significance and potential health impacts of de facto water reuse are impeded by outdated information regarding the contribution of municipal wastewater effluent to potable water supplies. This project aims to answer this research need. The overall goal of this project is to quantify the extent of de facto reuse by developing a model that estimates the amount of wastewater effluent present within drinking water treatment plants, and to use the model in conjunction with a survey to help assess public perceptions. The four-step approach to accomplish this goal includes: (1) creating a GIS-based model coupled with Python programming; (2) validating the model with field studies by analyzing sucralose as a wastewater tracer; (3) estimating the percentage of wastewater in raw drinking water sources under varying streamflow conditions; and (4) assessing through a social survey the perceptions of the general public relating to acceptance and occurrence of de facto reuse. The resulting De Facto Reuse in our Nations Consumable Supply (DRINCS) Model estimates that treated municipal wastewater is present at nearly 50% of drinking water treatment plant intake sites serving greater than 10,000 people (N=2,056). In contrast to the high frequency of occurrence, the magnitude of occurrence is relatively low, with 50% of impacted intakes yielding less than 1% de facto reuse under average streamflow conditions. Model estimates increase under low flow conditions (modeled by Q95); in several cases treated wastewater makes up 100% of the water supply.
De facto reuse occurs at levels that surpass what is publicly perceived in the three cities of Atlanta, GA, Philadelphia, PA, and Phoenix, AZ. Respondents with knowledge of de facto reuse occurrence are 10 times more likely to have a high acceptance (greater than 75%) of treated wastewater at their home tap. / Dissertation/Thesis / Ph.D. Civil and Environmental Engineering 2014
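The percentage of wastewater at an intake under different streamflow conditions can be illustrated with a simple ratio. This is only a sketch under a stated assumption: that de facto reuse is expressed as upstream wastewater discharge divided by streamflow at the intake, capped at 100%. The abstract does not give the DRINCS formula, and the function name and flow values below are hypothetical.

```python
def de_facto_reuse_percent(upstream_wastewater_flow, streamflow):
    """Percent of intake water that is treated wastewater (assumed simple ratio).

    Both flows must use the same units (for example cubic metres per second).
    The result is capped at 100 because effluent cannot exceed the whole supply.
    """
    if streamflow <= 0:
        raise ValueError("streamflow must be positive")
    return min(100.0, 100.0 * upstream_wastewater_flow / streamflow)


# Made-up flows: the same effluent discharge under average and low-flow conditions.
effluent = 0.5
print(de_facto_reuse_percent(effluent, streamflow=100.0))  # average streamflow
print(de_facto_reuse_percent(effluent, streamflow=0.4))    # low flow, capped at 100
```

The same effluent volume yields a much higher percentage when streamflow drops, which matches the qualitative pattern the abstract reports for low-flow conditions.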
|
35 |
Modeling Flood Potential Based on Land Use in the Greenbrier River Watershed in West Virginia, USA
Lopez Sanchez, Manuel Eduardo 10 September 2021
No description available.
|
36 |
Geospatial Analysis of Spatial Patterns of U.S. Hospital Readmission Rates
Wang, Yamei 01 January 2017
Unplanned hospital readmission after a recent hospitalization is an indication of poor healthcare quality and a waste of healthcare resources. The Centers for Medicare and Medicaid Services (CMS) initiated the Hospital Readmission Reduction Program (HRRP) to improve healthcare quality and reduce costs; however, studies found that the risk adjustment method used in calculating the standardized readmission rate was less accurate without hospital region or community factors. Accordingly, this cross-sectional quantitative study was designed to examine spatial patterns in hospital readmission rates following Andersen's behavioral model of health service utilization. This study was the first geospatial analysis of risk-standardized readmission rates (RSRR) based on hospital geographic locations. Secondary data from the CMS were used to assess global and local geospatial cluster patterns using Global Moran's Index, Anselin local Moran's Index, and a graphical analysis tool to identify cluster groups. The study found that hospital-wide RSRR was significantly clustered both across the country and at the local level. A total of 15 optimal cluster groups were identified, with wide variability in cluster size. The hospital-wide RSRR and the seven other CMS-published RSRRs were significantly different among all clusters. The geographically bounded hospital RSRRs provided evidence in support of adding a community or regional layer to the risk adjustment of RSRR. The specific cluster groups with extremely high or low readmission rates can help national and local policymakers and hospital administrators identify specific targets for action. This research has social change implications for reducing hospital readmission rates and saving healthcare costs.
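Global Moran's Index, used above to test for clustering, can be computed directly from its standard definition: I = (n / W) · Σᵢⱼ wᵢⱼ (xᵢ − x̄)(xⱼ − x̄) / Σᵢ (xᵢ − x̄)², where W is the sum of all spatial weights. The sketch below is a plain-Python illustration on made-up values with a simple neighbor-adjacency weight matrix; it is not the study's workflow, and the names are hypothetical.

```python
def global_morans_i(values, weights):
    """Global Moran's I for a list of values and a square spatial weight matrix.

    weights[i][j] is the spatial weight between locations i and j
    (here 1 for neighbors, 0 otherwise; the diagonal is 0).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    variance_sum = sum(d * d for d in dev)
    return (n / total_weight) * (cross / variance_sum)


# Four made-up hospitals along a line; each is a neighbor of the adjacent ones.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = [1.0, 1.0, 5.0, 5.0]    # similar rates sit next to each other
alternating = [1.0, 5.0, 1.0, 5.0]  # dissimilar rates sit next to each other
print(global_morans_i(clustered, adjacency))
print(global_morans_i(alternating, adjacency))
```

Positive values indicate that similar rates tend to be neighbors (clustering), and negative values indicate a checkerboard-like pattern. Judging significance requires a permutation or normality-based test, which this sketch omits.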
|
37 |
Impacts of Sea-Level Rise on Urban Properties in Tampa Due to Climate Change
Xie, Weiwei 10 December 2021
Rapid urbanization is producing a large and growing population in coastal areas. However, sea-level rise, one of the most significant impacts of global warming, makes coastal communities much more vulnerable to flooding than before. This Master’s thesis investigates sea-level rise impacts on parcel-level property in the coastal city of Tampa, Florida, USA. An improved sea-level rise model based on satellite altimeter data is first used to predict future sea levels. Based on high-resolution LiDAR digital elevation data and a property map, flooded properties are then identified to evaluate property damage cost. This empirical analysis provides an in-depth understanding of potential flooding risks for individual properties, with detailed spatial information at a fine spatial scale. The spatial and temporal analyses can potentially be used by researchers or governments to mitigate the impact of sea-level rise and to make better urban management plans for adapting to climate change.
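Identifying flooded properties from elevation data can be sketched as a simple threshold comparison. The example assumes a "bathtub" rule, in which a parcel counts as flooded when its ground elevation is at or below the projected sea level, and it sums assessed values as a rough exposure figure. The abstract does not state the thesis's actual flooding or damage-cost method, and every name and number below is hypothetical.

```python
def flooded_parcels(parcels, sea_level_m):
    """Parcels whose elevation is at or below the projected sea level."""
    return [p for p in parcels if p["elevation_m"] <= sea_level_m]


def exposed_value(parcels, sea_level_m):
    """Total assessed value of the parcels flagged as flooded."""
    return sum(p["value_usd"] for p in flooded_parcels(parcels, sea_level_m))


# Made-up parcels: elevation (for example sampled from a LiDAR DEM) and assessed value.
parcels = [
    {"id": "A", "elevation_m": 0.4, "value_usd": 250000},
    {"id": "B", "elevation_m": 1.1, "value_usd": 400000},
    {"id": "C", "elevation_m": 2.5, "value_usd": 310000},
]
for scenario_m in (0.5, 1.5):
    ids = [p["id"] for p in flooded_parcels(parcels, scenario_m)]
    print(scenario_m, ids, exposed_value(parcels, scenario_m))
```

A threshold rule like this ignores hydrological connectivity and flood defences, so it should be read as an upper-level screening step rather than a damage model.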
|
38 |
Looking Outward from the Village: The Contingencies of Soil Moisture on the Prehistoric Farmed Landscape near Goodman Point Pueblo
Brown, Andrew D 08 1900
Ancestral Pueblo communities of the central Mesa Verde region (CMVR) became increasingly reliant on agriculture for their subsistence needs during the Basketmaker III (BMIII) through Terminal Pueblo III (TPIII) periods (AD 600–1300). Researchers have been studying the Ancestral Pueblo people for over a century using a variety of methods to understand the relationships between climate, agriculture, population, and settlement patterns. While this research has produced a well-developed cultural history of the region, studies at a smaller scale are still needed to understand the changes in farming behavior and the distribution of individual sites across the CMVR. Soil moisture is the limiting factor for crop growth in the semi-arid region of the Goodman Watershed in the CMVR. Thus, I constructed a local-scale soil moisture proxy model (SMPM) that focuses on variables relevant to soil moisture: soil particle size, soil depth, slope, and aspect. From the SMPM output, the areas of very high soil moisture are assumed to represent desirable farmland locations. I describe the relationship between very high soil moisture and site locations, then infer the relevance of that relationship to settlement patterns and how those patterns changed over time (BMIII to TPIII). The results of the model and its application help to clarify how Ancestral Pueblo people changed as local farming communities. The results of this study indicate that farmers shifted away from use of preferred farmland during Terminal Pueblo III, which may have been caused by other cultural factors. The general outcome of this thesis is an improved understanding of human-environmental relationships on the local landscape in the CMVR.
|
39 |
Áreas vulneráveis e fatores de risco a ocorrência da esquistossomose em Sergipe [Vulnerable areas and risk factors for the occurrence of schistosomiasis in Sergipe]
Silva, Marília Matos Bezerra Lemos 23 August 2018
Among the parasitic diseases that affect humans, schistosomiasis is one of the most widespread in the world. In Sergipe, the magnitude of its prevalence, together with its socioeconomic importance, makes the disease highly relevant as a public health problem (SVS, 2017). The ability to map the spatial distribution of this parasitic disease and to determine potential areas for its occurrence has important implications for the planning of control programs. From a systemic approach, the present study aims to evaluate vulnerability and risk factors for the occurrence of schistosomiasis in endemic areas of the state of Sergipe. The study is based on the conception of epidemiological structure defined by Loureiro et al. (1979); on Cutter's perception of vulnerability (1996); on the methodological contributions of Batelle (DEE, 1973); and on the use of geoprocessing techniques (spatial distribution and analysis: IDW and Kriging). The research is divided into three stages. First, a mixed ecological study, descriptive of time series and analytical, evaluates the evolution and geographical distribution of the disease while testing the association (Mann-Whitney test) between areas of high prevalence and socio-demographic and environmental variables of the state. Next, a model is proposed for evaluating vulnerability to the occurrence of schistosomiasis in endemic areas of the state. Finally, an analytical, observational, cross-sectional epidemiological study is carried out in two hyperendemic areas, aiming to present the different epidemiological patterns and the predictive factors (logistic regression) associated with transmission of the disease in these localities. The temporal analysis found no evidence of a change in the state epidemiological profile and pointed to a decreasing trend in schistosomiasis positivity for the coming years. Although the hypothesis of a reduction in prevalence was confirmed, human infection rates were found to be very high in almost all of Sergipe, indicating the endemism of the disease in the state. The analyses revealed indicators involved in the formation and maintenance of the endemic structure and identified areas of priority attention. Indicators of quality of life and environmental variables were also found to be associated with the spread of the disease in the state. Density estimators showed high vulnerability in the territories of Grande Aracaju, Baixo São Francisco, and Sul Sergipano, historically endemic areas with population groups at high risk of infection, validating the proposed model. They also pointed out vulnerable areas in regions that lack concrete data on the endemic disease, such as the territory of Médio Sertão Sergipano. The logistic regression analyses showed a significant association between human infection by Schistosoma mansoni and the variables gender, occupation, schooling, income, housing conditions, and time of contact with water sources in both areas investigated, indicating diversity in the risk and transmission of the disease. The products generated in this doctoral thesis will assist in the planning and management of integrative actions aimed at health promotion, disease prevention, and the direction of control programs. / São Cristóvão, SE
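Inverse distance weighting (IDW), one of the geoprocessing techniques named in this abstract, estimates a value at an unsampled location as a distance-weighted average of nearby samples. The sketch below is a generic illustration with made-up prevalence values in planar coordinates and a default power of 2; it does not reproduce the thesis's parameters, and the function name is hypothetical.

```python
def idw(samples, x, y, power=2.0):
    """Inverse-distance-weighted estimate at (x, y).

    samples: iterable of (sx, sy, value) tuples in planar coordinates.
    power: distance exponent; larger values give nearby samples more influence.
    """
    numerator = 0.0
    denominator = 0.0
    for sx, sy, value in samples:
        dist_sq = (sx - x) ** 2 + (sy - y) ** 2
        if dist_sq == 0:
            return value  # exactly on a sample point: return its value
        weight = 1.0 / dist_sq ** (power / 2.0)
        numerator += weight * value
        denominator += weight
    return numerator / denominator


# Made-up prevalence (%) measured at two localities.
samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 30.0)]
print(idw(samples, 1.0, 0.0))  # midway between the two samples
print(idw(samples, 0.5, 0.0))  # closer to the first sample
```

Unlike Kriging, which the abstract also mentions, IDW uses no model of spatial autocorrelation, so it gives estimates but no measure of their uncertainty.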
|
40 |
Learning to Map the Visual and Auditory World
Salem, Tawfiq 01 January 2019
The appearance of the world varies dramatically not only from place to place but also from hour to hour and month to month. Billions of images that capture this complex relationship are uploaded to social-media websites every day and are often associated with precise time and location metadata. This rich source of data can help improve our understanding of the globe. In this work, we propose a general framework that uses these publicly available images to construct dense maps of different ground-level attributes from overhead imagery. In particular, we use well-defined probabilistic models and a weakly-supervised, multi-task training strategy to estimate the expected visual and auditory ground-level attributes, consisting of the types of scenes, objects, and sounds a person can experience at a location. Through a large-scale evaluation on real data, we show that our learned models can be used for applications including mapping, image localization, image retrieval, and metadata verification.
|