  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

Using data analytics and laboratory experiments to advance the understanding of reservoir rock properties

Li, Zihao 01 February 2019 (has links)
Conventional and unconventional reservoirs are both critical in oilfield development. After decades of waterflooding, the petrophysical properties of a conventional reservoir may change in many respects. It is crucial to identify how these properties vary after long-term waterflooding, at both the pore and core scales. For unconventional reservoirs, predicting the productivity and performance of hydraulic fracturing in shales is challenging because of their complicated petrophysical properties. The confining pressure imposed on a shale formation has a tremendous impact on the permeability of the rock. The correlation between confining pressure and rock permeability is complicated and may be nonlinear. In this thesis, a series of laboratory tests was conducted on core samples extracted from four U.S. shale formations to measure their petrophysical properties. In addition, a special 2D microfluidic device that simulates the pore structure of a sandstone formation was developed to investigate the influence of injection flow rate on the development of high-permeability flow channels. Moreover, a multiple linear regression (MLR) model was applied, with predictors based on the development stages, to quantify the variations of reservoir petrophysical properties. The MLR model outcome indicated that certain variables were effectively correlated with permeability. The 2D microfluidic model demonstrated the development of viscous fingering when the water injection rate exceeded a certain level, which reduced overall sweep efficiency. These comprehensive laboratory experiments demonstrate the role of confining pressure, the Klinkenberg effect, and bedding plane direction in gas flow through the nanoscale pore space of shales. / Master of Science / Conventional and unconventional hydrocarbon reservoirs are both important in oil and gas development. 
Waterflooding is the injection of water into a petroleum reservoir to increase reservoir pressure and displace residual oil; it is a widely used enhanced oil recovery method. However, several decades of waterflooding may change many properties of a conventional reservoir. To optimize subsequent oilfield development plans, it is necessary to identify how these properties vary after long-term waterflooding, at both the pore and core scales. In unconventional reservoirs, hydraulic fracturing has been widely used to produce hydrocarbon resources from shale or other tight rocks at an economically viable production rate. Hydraulic fracturing in shales is challenging because of the complicated reservoir pressure conditions. The external pressure imposed on a shale formation has a tremendous impact on the permeability of the rock. The correlation between pressure and rock permeability is intricate. In this thesis, a series of laboratory tests was conducted on core samples to measure their properties and the pressure. Moreover, a statistical model was applied to quantify the variations of reservoir properties. The results indicated that certain reservoir properties were effectively correlated with permeability. These comprehensive investigations demonstrate the role of pressure, a special gas flow effect, and rock bedding direction in gas flow through the extremely small pores of shales.
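The Klinkenberg effect mentioned in this abstract is commonly written as k_app = k_inf (1 + b / p_mean), where k_app is the apparent gas permeability, k_inf the intrinsic (liquid-equivalent) permeability, b the gas slippage factor, and p_mean the mean pore pressure. The sketch below is only an illustration of that standard relation, not the thesis's own processing, which the abstract does not describe. The function name, units, and data values are hypothetical.

```python
def fit_klinkenberg(mean_pressures, apparent_perms):
    """Least-squares fit of k_app = k_inf * (1 + b / p_mean).

    Rewritten as k_app = k_inf + (k_inf * b) * (1 / p_mean), the relation is a
    straight line in 1 / p_mean, so simple linear regression applies.
    Returns (k_inf, b).
    """
    xs = [1.0 / p for p in mean_pressures]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(apparent_perms) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, apparent_perms))
    slope = sxy / sxx              # equals k_inf * b
    k_inf = y_bar - slope * x_bar  # intercept at 1 / p_mean = 0
    b = slope / k_inf
    return k_inf, b


# Hypothetical, noise-free data generated from k_inf = 100 nD and b = 2 MPa.
pressures_mpa = [1.0, 2.0, 4.0, 8.0]
perms_nd = [100.0 * (1.0 + 2.0 / p) for p in pressures_mpa]
k_inf, b = fit_klinkenberg(pressures_mpa, perms_nd)
print(k_inf, b)
```

With real measurements the points scatter around the line, and the intercept gives the slip-corrected permeability.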
12

Improving Turbidity-Based Estimates of Suspended Sediment Concentrations and Loads

Jastram, John Dietrich 12 June 2007 (has links)
As the impacts of human activities increase sediment transport by aquatic systems, the need to accurately quantify this transport becomes paramount. Turbidity is recognized as an effective tool for monitoring suspended sediments in aquatic systems, and with recent technological advances turbidity can be measured in situ remotely, continuously, and at much finer temporal scales than was previously possible. Although turbidity provides an improved method for estimating suspended-sediment concentration (SSC) compared to traditional discharge-based methods, there is still significant variability in turbidity-based SSC estimates and in sediment loadings calculated from those estimates. The purpose of this study was to improve the turbidity-based estimation of SSC. At two monitoring sites on the Roanoke River in southwestern Virginia, stage, turbidity, and other water-quality parameters were monitored with in-situ instrumentation; suspended sediments were sampled manually during elevated turbidity events; those samples were analyzed for SSC and for physical properties; and rainfall was quantified by geologic source area. The study identified physical properties of the suspended-sediment samples that contribute to SSC-estimation variance and hydrologic variables that contribute to variance in those physical properties. Results indicated that including any of the measured physical properties (grain-size distributions, specific surface area, and organic carbon) in turbidity-based SSC estimation models reduces unexplained variance. Further, using hydrologic variables, which were measured remotely and on the same temporal scale as turbidity, to represent these physical properties resulted in a model that was equally capable of predicting SSC. 
A square-root transformed turbidity-based SSC estimation model developed for the Roanoke River at Route 117 monitoring station, which included a water level variable, provided 63% less unexplained variance in SSC estimations and 50% narrower 95% prediction intervals for an annual loading estimate, when compared to a simple linear regression using a logarithmic transformation of the response and regressor (turbidity). Unexplained variance and prediction interval width were also reduced using this approach at a second monitoring site, Roanoke River at Thirteenth Street Bridge; the log-based transformation of SSC and regressors was found to be most appropriate at this monitoring station. Furthermore, this study demonstrated the potential for a single model, generated from a pooled set of data from the two monitoring sites, to estimate SSC with less variance than a model generated only from data collected at a single site. When applied at suitable locations, this pooled model approach could provide many benefits to monitoring programs, such as developing SSC-estimation models for multiple sites that individually do not have enough data to generate a robust model, extending the model to monitoring sites between those for which the model was developed, and significantly reducing sampling costs for intensive monitoring programs. / Master of Science
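The comparison described above, a square-root transformed model against a log-log regression, can be sketched as follows. This is a simplified illustration only: it uses turbidity as the single regressor (the thesis model also included a water level variable), it assumes the square root is applied to both SSC and turbidity, and the data are synthetic values invented for the example. Naive back-transformation, as used here, is known to bias estimates and would normally need a correction.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def fit_transformed(turbidity, ssc, forward, inverse):
    """Fit a line in transformed space and return a predictor in original units."""
    a, b = fit_line([forward(t) for t in turbidity], [forward(s) for s in ssc])
    return lambda t: inverse(a + b * forward(t))


def sse(model, turbidity, ssc):
    """Sum of squared errors in original SSC units."""
    return sum((model(t) - s) ** 2 for t, s in zip(turbidity, ssc))


# Synthetic data (hypothetical): SSC follows an exact square-root relation.
turbidity = [5.0, 20.0, 50.0, 120.0, 300.0]
ssc = [(1.0 + 1.2 * math.sqrt(t)) ** 2 for t in turbidity]

sqrt_model = fit_transformed(turbidity, ssc, math.sqrt, lambda v: v * v)
log_model = fit_transformed(turbidity, ssc, math.log10, lambda v: 10.0 ** v)
print(sse(sqrt_model, turbidity, ssc), sse(log_model, turbidity, ssc))
```

Because the synthetic data were generated from a square-root relation, the square-root model fits them almost exactly here; on field data the better transformation has to be judged site by site, as the abstract reports for the two stations.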
13

Studies on bikeability in a metropolitan area using the active commuting route environment scale (ACRES)

Wahlgren, Lina January 2011 (has links)
Background: The Active Commuting Route Environment Scale (ACRES) was developed to study active commuters’ perceptions of their route environments. The overall aims were to assess the measuring properties of the ACRES and study active bicycle commuters’ perceptions of their commuting route environments. Methods: Advertisement- and street-recruited bicycle commuters from Greater Stockholm, Sweden, responded to the ACRES. Expected differences between inner urban and suburban route environments were used to assess criterion-related validity, together with ratings from an assembled expert panel as well as existing objective measures. Reliability was assessed as test-retest reproducibility. Comparisons of ratings between advertisement- and street-recruited participants were used for assessments of representativity. Ratings of inner urban and suburban route environments were used to evaluate commuting route environment profiles. Simultaneous multiple linear regression analyses were used to assess the relation between the outcome variable: whether the route environment hinders or stimulates bicycle-commuting and environmental predictors, such as levels of exhaust fumes, speeds of traffic and greenery, in inner urban areas. Results: The ACRES was characterized by considerable criterion-related validity and reasonable test-retest reproducibility. There was a good correspondence between the advertisement- and street-recruited participants’ ratings. Distinct differences in commuting route environment profiles between the inner urban and suburban areas were noted. Suburban route environments were rated as safer and more stimulating for bicycle-commuting. Beautiful, green and safe route environments seem to be, independently of each other, stimulating factors for bicycle-commuting in inner urban areas. On the other hand, high levels of exhaust fumes and traffic congestion, as well as low ‘directness’ of the route, seem to be hindering factors. 
Conclusions: The ACRES is useful for assessing bicyclists’ perceptions of their route environments. A number of environmental factors related to the route appear to be stimulating or hindering for bicycle commuting. The overall results demonstrate a complex research area at the beginning of exploration. / BACKGROUND: Route environments may influence people’s physically active commuting and thereby contribute to better public health. Studies of route environments are therefore desirable to increase understanding of possible relations between physically active commuting and route environments. A questionnaire, “The Active Commuting Route Environment Scale” (ACRES), was therefore created to study physically active commuters’ perceptions of their route environments. The main aim of this thesis was partly to study the questionnaire’s psychometric properties in terms of validity and reliability, and partly to study commuting cyclists’ perceptions of their route environments. METHODS: Commuting cyclists from Greater Stockholm were recruited through newspaper advertisements and through direct contact along the route. The participants completed the ACRES questionnaire. Together with ratings from a group of experts and already existing objective measures, expected differences between route environments in the inner urban and suburban areas were used to study criterion-related validity. Reliability was studied as reproducibility through repeated measurements (test-retest). Comparisons between ratings by participants recruited via advertising and via direct contact in route environments were used to study representativeness. Ratings of inner urban and suburban route environments were further used to study route environment profiles. Multiple linear regression analysis was also used to study the relation between the outcome variable, whether the route environment hinders or stimulates bicycle commuting, and environmental predictors such as exhaust fume levels, traffic speed, and greenery, in inner urban environments. 
RESULTS: The ACRES questionnaire showed good criterion-related validity and reasonable reproducibility. There was good agreement between ratings by participants recruited via advertising and via direct contact. The route environment profiles showed clear differences between inner urban and suburban environments. Suburban route environments were rated as safer and more stimulating for bicycle commuting than inner urban route environments. Furthermore, beautiful, green, and safe route environments appear, independently of each other, to be stimulating factors for bicycle commuting in inner urban environments. In contrast, high exhaust levels, high congestion levels, and routes requiring many changes of direction appear to be hindering factors. CONCLUSIONS: The ACRES questionnaire is a useful instrument for measuring cyclists’ perceptions of their route environments. A number of factors related to the route environment appear to be stimulating or hindering for bicycle commuting. Overall, the results point to a relatively unexplored and complex research area. / Örebro universitet, Hälsoakademin / FAAP
14

Privacy Preserving Distributed Data Mining

Lin, Zhenmin 01 January 2012 (has links)
Privacy preserving distributed data mining aims to design secure protocols which allow multiple parties to conduct collaborative data mining while protecting the data privacy. My research focuses on the design and implementation of privacy preserving two-party protocols based on homomorphic encryption. I present new results in this area, including new secure protocols for basic operations and two fundamental privacy preserving data mining protocols. I propose a number of secure protocols for basic operations in the additive secret-sharing scheme based on homomorphic encryption. I derive a basic relationship between a secret number and its shares, with which we develop efficient secure comparison and secure division with public divisor protocols. I also design a secure inverse square root protocol based on Newton's iterative method and hence propose a solution for the secure square root problem. In addition, we propose a secure exponential protocol based on Taylor series expansions. All these protocols are implemented using secure multiplication and can be used to develop privacy preserving distributed data mining protocols. In particular, I develop efficient privacy preserving protocols for two fundamental data mining tasks: multiple linear regression and EM clustering. Both protocols work for arbitrarily partitioned datasets. The two-party privacy preserving linear regression protocol is provably secure in the semi-honest model, and the EM clustering protocol discloses only the number of iterations. I provide a proof-of-concept implementation of these protocols in C++, based on the Paillier cryptosystem.
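The abstract names two numerical building blocks: an inverse square root via Newton's iterative method and an exponential via Taylor series. Both matter in this setting because they need only additions and multiplications (plus public constants), which is what a protocol built on secure multiplication can evaluate. The sketch below shows the arithmetic in plain floating point only; it is not the secure two-party protocol, involves no secret sharing or Paillier encryption, and the function names, initial guess, and iteration counts are assumptions chosen for illustration.

```python
import math


def newton_inv_sqrt(x, y0, iterations=20):
    """Approximate 1 / sqrt(x) with Newton's iteration y <- y * (3 - x * y^2) / 2.

    Uses only multiplication, subtraction, and the public constant 1/2.
    Converges when the initial guess satisfies 0 < y0 < sqrt(3 / x).
    """
    y = y0
    for _ in range(iterations):
        y = y * (3.0 - x * y * y) * 0.5
    return y


def newton_sqrt(x, y0, iterations=20):
    """sqrt(x) obtained as x * (1 / sqrt(x)), again without any division by x."""
    return x * newton_inv_sqrt(x, y0, iterations)


def taylor_exp(x, terms=20):
    """Approximate exp(x) by the truncated series sum of x^k / k!.

    The factors 1 / k are public constants, so each step is a multiplication.
    """
    total = 1.0
    term = 1.0
    for k in range(1, terms):
        term = term * x * (1.0 / k)
        total += term
    return total


print(newton_inv_sqrt(4.0, 0.1), newton_sqrt(4.0, 0.1), taylor_exp(1.0))
```

In a secret-shared setting each multiplication above would be replaced by a secure multiplication sub-protocol, and values would typically be fixed-point integers, not floats.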
15

Risco climático de ocorrência da requeima da batata na região dos Andes, Venezuela / Climatic risk for potato late blight occurrence in the Andes region, Venezuela

Lozada Garcia, Beatriz Ibet 07 October 2005 (has links)
Potato is one of the most important crops in Venezuelan agricultural production. The climatic conditions of low temperature and high humidity in the region where potato is normally grown favor the occurrence of late blight, the disease that most limits potato production in almost all producing regions of the country. Potato is planted in Venezuela at a range of altitudes, from 400 to 3000 m above sea level, and the Andean region (Táchira, Mérida, and Trujillo) is one of the largest producers; the existence of good-quality meteorological databases is therefore extremely important. In this region, the density of weather stations recording air temperature is low and very irregular, and a further problem is the occurrence of gaps in the historical daily rainfall series, which hampers agrometeorological studies associated with determining the climatic risk of crop disease occurrence. For these reasons, and given the importance of the crop to the country, the present work was proposed, with the following objectives: to generate daily models for estimating minimum, maximum, and mean air temperature using multiple linear regression, with geographical coordinates (longitude and latitude) and altitude as independent variables; to fill in missing data in daily rainfall series using a proposed nearest-neighbor method, in which the nearest neighbor is determined by cluster analysis (Ward's method, with Euclidean distance); and to characterize and map the risk of potato late blight occurrence in the Andean region of Venezuela using the forecast model proposed by Hyre (1954), whose input data are daily rainfall and daily mean and minimum temperature. These data were obtained from 106 weather stations belonging to the Ministry of Environment and Natural Resources of Venezuela, for a period of 31 years (1967-1997). 
The models obtained for estimating minimum, maximum, and mean air temperature showed, on average, coefficients of determination above 0.90 when tested with independent data, with estimates free of significant error: agreement index d ranging from 0.98 to 1.0 and mean RMSE below 2 °C. For rainfall, a database was used that initially had 17% missing data, which was reduced to 2.5% with the proposed method. The errors obtained (ME, MAE, and RMSE) were moderate (MAE 1.7 to 4.0 mm/day), and the agreement index ranged from 0.57 for daily data to 0.83 when the data were grouped into monthly values. The maximum potential risk of late blight occurrence was interpolated using geostatistical techniques (ordinary kriging), generating risk maps for each sowing date (20), from 1 January to 15 October. Maximum and probable risk indices were defined. These indices showed that the period of greatest risk occurs during the rainy season, mainly in the states of Táchira and Trujillo. / Potato is one of the most important crops for Venezuela's agriculture. However, low temperature and high humidity in the region where potatoes are grown are favorable to late blight (Phytophthora infestans) occurrence. This disease limits potato production in almost all regions of the country. Potato is grown in Venezuela at different altitudes, between 400 and 3000 m above sea level, and the Andes region, which includes the states of Táchira, Mérida, and Trujillo, is the biggest producer. In this region, the existence of a weather database is of extreme importance, but the density of weather stations that measure air temperature is very low and the stations are not well distributed. 
Another problem with the weather data is that daily rainfall measurements are not very reliable because of failures and missing data, which creates difficulties for agrometeorological studies of the climatic risk of disease occurrence. Given the above and the importance of the potato crop to Venezuela, the objectives of the present study were: to generate models to estimate daily air temperature (average, maximum, and minimum) based on multiple linear regression, with geographical coordinates (latitude and longitude) and altitude as predictors; to fill in missing daily rainfall data through a proposed nearest-neighbor method, in which the nearest neighbor is determined by cluster analysis (Ward's method, with Euclidean distance); and to characterize and map the climatic risk of potato late blight occurrence in the Andes region of Venezuela, based on the forecast model proposed by Hyre (1954), which uses daily rainfall and daily average and minimum air temperature as input. Data used in this study were obtained from 106 weather stations of the Ministry of Environment and Natural Resources of Venezuela, for a period of 31 years (1967-1997). The models obtained for estimating average, maximum, and minimum air temperatures showed, on average, coefficients of determination (R²) higher than 0.90 when tested with independent data, and estimated values free of significant errors: d (agreement index) from 0.98 to 1.00 and average RMSE smaller than 2 °C. Once organized, the rainfall database had 17% missing data. Using the proposed nearest-neighbor method, this percentage fell to 2.5%. The statistical test of this method showed moderate errors, with MAE between 1.7 and 4.0 mm/day and d from 0.57 (daily basis) to 0.83 (monthly basis). 
The potential maximum climatic risk of late blight occurrence was interpolated by geostatistics (ordinary kriging), generating a risk map for each sowing date (20), from January 1st to October 15th. The maximum and most likely risks were then defined. These indices showed that the period with the highest risk of potato late blight occurrence is the rainy season, mainly in the states of Táchira and Trujillo.
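The error statistics quoted in this abstract (ME, MAE, RMSE, and the agreement index d) can be computed as in the sketch below. It assumes that d is Willmott's index of agreement, d = 1 - Σ(P - O)² / Σ(|P - Ō| + |O - Ō|)², which is the usual meaning of an "agreement index d" in agrometeorology; the abstract itself does not give the formula. The function name and sample values are hypothetical.

```python
import math


def error_statistics(observed, predicted):
    """Return ME, MAE, RMSE, and Willmott's index of agreement d.

    ME is the mean error (bias), MAE the mean absolute error, and RMSE the
    root mean squared error, all in the units of the observations.
    """
    n = len(observed)
    diffs = [p - o for o, p in zip(observed, predicted)]
    me = sum(diffs) / n
    mae = sum(abs(e) for e in diffs) / n
    rmse = math.sqrt(sum(e * e for e in diffs) / n)
    o_bar = sum(observed) / n
    # Potential error: distance of predictions and observations from the observed mean.
    potential = sum((abs(p - o_bar) + abs(o - o_bar)) ** 2
                    for o, p in zip(observed, predicted))
    d = 1.0 - sum(e * e for e in diffs) / potential
    return {"ME": me, "MAE": mae, "RMSE": rmse, "d": d}


# Hypothetical daily temperatures (°C): predictions are uniformly 1 °C too warm.
stats = error_statistics([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
print(stats)
```

A value of d close to 1 indicates close agreement between estimates and observations; unlike R², it also penalizes systematic bias.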
16

Modelagem digital de atributos de solo da Fazenda Edgárdia - Botucatu-SP / Digital soil attributes modeling of Fazenda Edgárdia - Botucatu-SP

Carvalho, Tânia Maria de [UNESP] 19 December 2016 (has links)
The soil map is an essential tool for land-use planning and for studies involving environmental aspects of this important natural resource. Quantitative techniques and geoprocessing tools have been combined with the interpretation of pedogenetic processes to enable more accurate maps, obtained through a faster and less costly process. Among the models applied, the so-called hybrid models use auxiliary predictor variables and spatial autocorrelation to enable the prediction of soil attributes at unsampled locations. The worldwide digital soil mapping initiative, GlobalSoilMap.net, works to provide global representations of soil attributes produced by applying a hybrid model to legacy soil data, putting Digital Soil Mapping (MDS) into practice. On this basis, this work rested on the hypothesis that applying the hybrid regression-kriging (RK) technique, using legacy soil survey data and relief and remote sensing covariates, provides maps of soil attributes representative of an area of the Cuesta de Botucatu. 
The model was applied locally, at two depths, for continuous representation of the Redness Index (IAV), base saturation (V%), sand content, clay content, cation-exchange capacity (CTC), and pH of the soils of the Edgárdia Experimental Farm, for which soil survey data are available. The predictor covariates, derived from a digital elevation model (MDE) and an orbital image, were standardized to a spatial resolution of 10 m, and the methods were selected by checking for significant linear correlation between attributes and covariates and for spatial autocorrelation of the attributes or of the residuals of multiple linear regressions (RLM). The data were split into training and validation subsets. The correlation coefficients between soil attributes and covariates were significant and ranged from -0.40 to 0.51. The predictors most correlated with the attributes were the Topographic Wetness Index (ITU), slope (Decl), aspect (Aspc), elevation (Elev), and the NDVI vegetation index, the last four being the main ones in estimating the textural fractions. The adjusted R² values of the RLM, between 0.10 and 0.36, were considered low. In general, the prediction maps showed patterns characteristic of the spatial variation observed in the maps of the predictor covariates used in model calibration. An increase in accuracy was observed between the two steps of the RK process, indicating that the final map is superior to the RLM. However, the models generally performed poorly when assessed by external validation, even with stratification into two areas more uniform in relief. The results indicated the limitation of using sampling designed for survey purposes in prediction models. The models were also difficult to apply because of the complex lithological context and the local dynamics of soil formation, which could not be detected by the selected covariates. 
Despite the limitations, the prediction maps were consistent with knowledge of the attributes under local conditions. / The soil map is an essential tool for land-use planning and for studies related to environmental aspects of this important natural resource. Quantitative techniques and geoprocessing tools are currently combined with the interpretation of pedogenic processes to enable the development of more accurate maps obtained by a faster and less costly process. Among the models applied, hybrid models employ auxiliary predictor variables and spatial autocorrelation to enable the prediction of soil attributes at unsampled locations. The worldwide digital soil mapping project, GlobalSoilMap.net, aims to provide global representations of soil attributes developed by applying a hybrid model to legacy soil data, putting Digital Soil Mapping (MDS) into practice. This work was based on the assumption that applying the hybrid technique of regression-kriging (RK), using legacy soil survey data and covariates of relief and remote sensing, provides a representative map of soil attributes for an area in the Cuesta de Botucatu. The goal was to apply locally, at two depths, prediction models for continuous representation of the Soil Redness Index (IAV), base saturation (V%), sand content, clay content, cation-exchange capacity (CTC), and pH of the soils of the Edgárdia Experimental Farm, for which soil survey data are available. The predictor covariates were derived from a Digital Elevation Model (MDE) and an orbital image. They were all standardized to a spatial resolution of 10 m, and the methods were selected by checking for significant linear correlation between attributes and covariates and for spatial autocorrelation of attributes or of residuals of multiple linear regressions (RLM). The data were separated into training and validation subsets. 
The correlation coefficients (r) between soil attributes and covariates were significant and ranged from -0.40 to 0.51. The predictors most correlated with the attributes were the topographic wetness index (ITU), slope (Decl), aspect (Aspc), elevation (Elev), and vegetation index (NDVI), and the last four were the main predictors of the granulometric fractions. The adjusted R² values of the RLM were between 0.10 and 0.36, which is considered low. In general, the prediction maps exhibited patterns characteristic of the spatial variation observed in the covariate maps used in the calibration of the models. An increase in accuracy was observed between the two steps of the RK modeling process, indicating that the final map is better than the RLM. However, the models generally showed low performance when evaluated by external validation, even when the area was stratified into two smaller plots with more homogeneous relief. The results indicated the limitations of using soil survey sampling in prediction models, and the difficulty of applying MDS in areas with complex lithology, especially where the correlation between the local dynamics of soil genesis and the selected covariates is not strong. Despite the limitations, the prediction maps were consistent with knowledge about soil properties under local conditions.
17

Risco climático de ocorrência da requeima da batata na região dos Andes, Venezuela / Climatic risk for potato late blight occurrence in the Andes region, Venezuela

Beatriz Ibet Lozada Garcia 07 October 2005 (has links)
A batata é uma das culturas de maior importância na produção agrícola da Venezuela. As condições climáticas de baixa temperatura e alta umidade, existentes na região onde normalmente se cultiva a batata, são favoráveis para a ocorrência da Requeima, sendo esta a doença que mais limita a produção de batata em quase todas as regiões produtoras do país. A batata é semeada na Venezuela em variadas condições de altitudes, desde os 400 até 3000 manm, sendo a região Andina (Táchira, Mérida e Trujillo) uma das mais produtoras, portanto, a existência de bases de dados meteorológicas de qualidade é de extrema importância. Nessa região, a densidade de estações meteorológicas que registram a temperatura do ar é muito irregular e pequena, sendo que outro problema é a ocorrência de falhas nas séries históricas diárias de precipitação, o que dificulta os estudos agrometeorológicos associados à determinação do risco climático devido à ocorrência de doenças nas culturas agrícolas. Em decorrência disso e da importância da cultura para o país, propôs-se à realização do presente trabalho, cujos objetivos foram: gerar modelos diários de estimativa da temperatura mínima, máxima e média do ar, empregando-se a regressão linear múltipla, considerando como variáveis independentes às coordenadas geográficas (longitude e latitude) e a altitude, preencher os dados faltantes em séries diárias de precipitação, mediante uma proposta do método do vizinho mais próximo, sendo este aquele determinado a partir da análise de agrupamento (método de Ward, com distância euclidiana); e caracterizar e espacializar o risco de ocorrência da Requeima da batata na região andina da Venezuela, mediante o modelo de previsão proposto por Hyre (1954), cujos dados de entrada são precipitação e temperatura média e mínima diárias. Estes dados foram obtidos de 106 estações meteorológicas pertencentes ao Ministério do Ambiente e Recursos Naturais da Venezuela, para um período de 31 anos (1967-1997). 
Os modelos obtidos para a estimativa das temperaturas mínima, máxima e média do ar apresentaram em média coeficientes de determinação superiores a 0,90, quando testados com dados independentes, com estimativas livres de erro significativos: índice de concordância d variando de 0,98 a 1,0 e RMSE médio menor que 2 °C. Para a precipitação, empregou-se uma base de dados que inicialmente apresentava um 17% de dados faltantes, o que foi reduzido para 2,5% com o uso do método proposto. Os erros (ME, MAE e RMSE) obtidos foram moderados (MAE 1,7 a 4,0 mm/dia) e o índice de concordância variou de 0,57, para dados diários, a 0,83, quando os dados foram agrupados em valores mensais. A interpolação do risco potencial máximo de ocorrência da Requeima se realizou mediante técnicas de Geoestatística, (Krigagem ordinária) gerando mapas de risco de cada data de semeadura (20), de 1° de janeiro a 15 de outubro. Foram definidos os índices de risco máximo e provável. Esses índices mostraram que a época de maior risco acontece durante o período chuvoso, principalmente nos estados de Táchira e Trujillo. / Potato is one of the most importance crops for Venezuela´s agriculture. However, low temperature and high humidity in the region where potatoes are growed are favorable to Late Blight (Phytophtora infestans) occurrence. Such disease limits the potato production in almost all regions of the country. Potato crop is growed in Venezuela at different altitudes, between 400 and 3000 msnm, being the Andes Region, which include the states of Táchira, Mérida, and Trujillo, the biggest producer. In this region, the existence of a weather database is of extreme importance, but the density of weather stations which measure air temperature is very low and the stations are not very well distributed. 
Another problem related to weather data is that daily rainfall measurements is not very reliable due to failures and missing data, which causes difficulties for agrometeorological studies associated to climatic risks of diseases occurrence. In function of what was mentioned above and taking into account the importance of potato crop to Venezuela, the objectives of the present study are: to generate models to estimate daily air temperature (average, maximum, and minimum) based on the multiple linear regression and geographical coordinates (latitude, longitude, and longitude); to fill in missing daily rainfall data through a proposed method of the nearest neighboor, which is determined by the cluster analysis (Ward Method, with Euclidean distance); and to characterize and spatialize the climatic risk of potato Late Blight occurrence in the Andes region of Venezuela, based on the forecast model proposed by Hyre (1954), which uses as input daily rainfall and daily average and minimum air temperature. Data used in this study were obtained from 106 weather stations, from the Ministry of Environment and Natural Resources of Venezuela, for a period of 31 years (1967-1997). The obtained models for estimating average, maximum, and minimum air temperatures showed on average determination coefficients (R2) higher than 0.90 when tested with independent data and estimated values free of significant errors: d (agreement index) from 0.98 to 1.00 and average RMSE smaller than 2oC. After to be organized, rainfall data presented a database with 17% of the missing data. Using the proposed method (the nearest neighboor) this percentage fell down to 2.5%. The statistical test of this method showed moderate errors, with MAE between 1.7 and 4.0 mmday-1 and d from 0.57 (daily basis) to 0.83 (monthly basis). 
The potential maximum climatic risk of Late Blight occurrence was interpolated with geostatistics (ordinary kriging), generating a risk map for each of the 20 sowing dates, from January 1st to October 15th. The maximum and most likely risk indexes were then defined. These indexes showed that the period of highest risk for potato Late Blight occurrence is the rainy season, mainly in the states of Táchira and Trujillo.
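The error statistics quoted in the abstract above (ME, MAE, RMSE, and the agreement index d) can be illustrated with a minimal sketch. It assumes that d is Willmott's index of agreement, which the abstract does not state explicitly; the function name `error_metrics` and the sample values are hypothetical and are not taken from the thesis.

```python
import math


def error_metrics(observed, predicted):
    """Return ME, MAE, RMSE and an agreement index d for paired series.

    Assumption: d is computed as Willmott's index of agreement,
    d = 1 - sum((P - O)^2) / sum((|P - mean(O)| + |O - mean(O)|)^2).
    """
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    me = sum(errors) / n                                  # mean error (bias)
    mae = sum(abs(e) for e in errors) / n                 # mean absolute error
    rmse = math.sqrt(sum(e * e for e in errors) / n)      # root mean square error
    o_mean = sum(observed) / n
    denom = sum((abs(p - o_mean) + abs(o - o_mean)) ** 2
                for o, p in zip(observed, predicted))
    # d is undefined when every observation and prediction equals the mean.
    d = 1.0 - sum(e * e for e in errors) / denom if denom > 0 else float("nan")
    return {"ME": me, "MAE": mae, "RMSE": rmse, "d": d}


# Hypothetical daily rainfall values in mm/day, for illustration only.
metrics = error_metrics([0.0, 2.5, 10.0, 4.0], [0.5, 2.0, 8.0, 5.0])
```

A value of d close to 1 indicates close agreement between estimated and observed values, which is how the reported range of 0.57 (daily) to 0.83 (monthly) would be read under this assumption.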
18

Big Data : le nouvel enjeu de l'apprentissage à partir des données massives / Big Data : the new challenge Learning from data Massive

Adjout Rehab, Moufida 01 April 2016 (has links)
Le croisement du phénomène de mondialisation et du développement continu des technologies de l’information a débouché sur une explosion des volumes de données disponibles. Ainsi, les capacités de production, de stockage et de traitement des données ont franchi un tel seuil qu’un nouveau terme a été mis en avant : Big Data. L’augmentation des quantités de données à considérer nécessite la mise en œuvre de nouveaux outils de traitement. En effet, les outils classiques d’apprentissage sont peu adaptés à ce changement de volumétrie tant au niveau de la complexité de calcul qu’à la durée nécessaire au traitement. Ce dernier est le plus souvent centralisé et séquentiel, ce qui rend les méthodes d’apprentissage dépendantes de la capacité de la machine utilisée. Par conséquent, les difficultés pour analyser un grand jeu de données sont multiples. Dans le cadre de cette thèse, nous nous sommes intéressés aux problèmes rencontrés par l’apprentissage supervisé sur de grands volumes de données. Pour faire face à ces nouveaux enjeux, de nouveaux processus et méthodes doivent être développés afin d’exploiter au mieux l’ensemble des données disponibles. L’objectif de cette thèse est d’explorer la piste qui consiste à concevoir une version scalable de ces méthodes classiques. Cette piste s’appuie sur la distribution des traitements et des données pour augmenter la capacité des approches sans nuire à leurs précisions. Notre contribution se compose de deux parties proposant chacune une nouvelle approche d’apprentissage pour le traitement massif de données. Ces deux contributions s’inscrivent dans le domaine de l’apprentissage prédictif supervisé à partir des données volumineuses telles que la Régression Linéaire Multiple et les méthodes d’ensemble comme le Bagging. La première contribution nommée MLR-MR, concerne le passage à l’échelle de la Régression Linéaire Multiple à travers une distribution du traitement sur un cluster de machines. 
Le but est d’optimiser le processus du traitement ainsi que la charge du calcul induite, sans changer évidemment le principe de calcul (factorisation QR) qui permet d’obtenir les mêmes coefficients issus de la méthode classique. La deuxième contribution proposée est appelée "Bagging MR_PR_D" (Bagging based Map Reduce with Distributed PRuning), elle implémente une approche scalable du Bagging, permettant un traitement distribué sur deux niveaux : l’apprentissage et l’élagage des modèles. Le but de cette dernière est de concevoir un algorithme performant et scalable sur toutes les phases de traitement (apprentissage et élagage) et garantir ainsi un large spectre d’applications. Ces deux approches ont été testées sur une variété de jeux de données associées à des problèmes de régression. Le nombre d’observations est de plusieurs millions. Nos résultats expérimentaux démontrent l’efficacité et la rapidité de nos approches basées sur la distribution de traitement dans le Cloud Computing. / In recent years we have witnessed a tremendous growth in the volume of data generated, partly due to the continuous development of information technologies. Managing these amounts of data requires fundamental changes in the architecture of data management systems in order to adapt to large and complex data. Single machines do not have the required capacity to process such massive data, which motivates the need for scalable solutions. This thesis focuses on building scalable data management systems for treating large amounts of data. Our objective is to study the scalability of supervised machine learning methods in large-scale scenarios. In fact, in most existing algorithms and data structures, there is a trade-off between efficiency, complexity, and scalability. 
To address these issues, we explore recent techniques for distributed learning in order to overcome the limitations of current learning algorithms. Our contribution consists of two new machine learning approaches for large-scale data. The first contribution tackles the problem of scalability of Multiple Linear Regression in distributed environments, which makes it possible to learn quickly from massive volumes of existing data using parallel computing and a divide-and-conquer approach that provides the same coefficients as the classic approach. The second contribution introduces a new scalable approach for ensembles of models, which allows both learning and pruning to be deployed in a distributed environment. Both approaches have been evaluated on a variety of regression datasets ranging from some thousands to several millions of examples. The experimental results show that the proposed approaches are competitive in terms of predictive performance while significantly reducing training and prediction time.
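The abstract above says that the distributed regression keeps the QR factorization and returns the same coefficients as the classic method, but it does not describe how per-partition results are combined. The sketch below shows one common way this can work, assumed here for illustration only: each partition factors its own block and keeps just the small triangular factor R ("map"), and the stacked R factors are factored once more before back-substitution ("reduce"). All function names are hypothetical, and the code is not the thesis's MLR-MR implementation.

```python
import math


def r_factor(rows):
    """Upper-triangular R of a QR factorization, via modified Gram-Schmidt."""
    n = len(rows[0])
    cols = [[float(row[j]) for row in rows] for j in range(n)]
    R = [[0.0] * n for _ in range(n)]
    for i in range(n):
        norm = math.sqrt(sum(v * v for v in cols[i]))
        R[i][i] = norm
        # A zero column (e.g. no residual left after a perfect fit) adds no direction.
        q = [v / norm for v in cols[i]] if norm > 0.0 else [0.0] * len(cols[i])
        for j in range(i + 1, n):
            R[i][j] = sum(a * b for a, b in zip(q, cols[j]))
            cols[j] = [b - R[i][j] * a for a, b in zip(q, cols[j])]
    return R


def map_chunk(X_chunk, y_chunk):
    """Map step: factor the augmented block [1 | X | y] and keep only its R."""
    return r_factor([[1.0] + list(x) + [y] for x, y in zip(X_chunk, y_chunk)])


def reduce_fit(r_factors):
    """Reduce step: stack the small R factors, factor again, back-substitute."""
    R = r_factor([row for Rk in r_factors for row in Rk])
    p = len(R) - 1  # number of coefficients, intercept included
    beta = [0.0] * p
    for i in range(p - 1, -1, -1):
        rhs = R[i][p] - sum(R[i][j] * beta[j] for j in range(i + 1, p))
        beta[i] = rhs / R[i][i]  # assumes the predictors are not collinear
    return beta


# Hypothetical data following y = 1 + 2*x1 + 3*x2, split into three partitions.
X = [(float(i), float((i * i) % 7)) for i in range(12)]
y = [1.0 + 2.0 * a + 3.0 * b for a, b in X]
chunks = [(X[k:k + 4], y[k:k + 4]) for k in range(0, 12, 4)]
beta_distributed = reduce_fit([map_chunk(Xc, yc) for Xc, yc in chunks])
beta_single = reduce_fit([map_chunk(X, y)])  # same routine on one partition
```

Only a (p+1) by (p+1) triangle leaves each partition, regardless of how many rows it holds, which is why this style of combination suits a MapReduce setting. The sketch assumes each partition has at least as many rows as coefficients and full column rank.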
19

Contact Center Employee Characteristics Associated with Customer Satisfaction

Pow, Lara 01 January 2017 (has links)
The management of operations for a customer contact center (CCC) presents significant challenges. Management's direction is to reduce costs through operational efficiency metrics while providing maximum customer satisfaction levels to retain customers and increase profit margins. The purpose of this correlational study was to quantify the significance of various customer service representative (CSR) characteristics including internal service quality, employee satisfaction, and employee productivity, and then to determine their predictive ability on customer satisfaction, as outlined in the service-profit chain model. The research question addressed whether a linear relationship existed between CSR characteristics and the customers' satisfaction with the CSR by applying ordinary least squares regression using archival dyadic data. The data consisted of a random sample of 269 CSRs serving a large Canadian bank. Various subsets of data were analyzed via regression to help generate actionable insights. One particular model involving poor performing CSRs whose customer satisfaction was less than 75% top box proved to be statistically significant (p = .036, R-squared = .321) suggesting that poor performing CSRs contribute to a significant portion of poor customer service while high performing CSRs do not necessarily guarantee good customer service. A key variable used in this research was a CSR's level of education, which was not significant. Such a finding implies that for CCC support, a less-educated labor pool may be maintained, balancing societal benefits of employment for less-educated people at a reasonable service cost to a company. These findings relate to positive social change as hiring less-educated applicants could increase their social and economic status.
20

QUANTIFYING NON-RECURRENT DELAY USING PROBE-VEHICLE DATA

Brashear, Jacob Douglas Keaton 01 January 2018 (has links)
Current practices based on estimated volume and basic queuing theory to calculate delay resulting from non-recurrent congestion do not account for the day-to-day fluctuations in traffic. In an attempt to address this issue, probe GPS data are used to develop impact zone boundaries and calculate Vehicle Hours of Delay (VHD) for incidents stored in the Traffic Response and Incident Management Assisting the River City (TRIMARC) incident log in Louisville, KY. Multiple linear regression along with stepwise selection is used to generate models for the maximum queue length, the average queue length, and VHD to explore the factors that explain the impact boundary and VHD. Models predicting queue length do not explain significant amounts of variance but can be useful in queue spillback studies. Models predicting VHD are as effective as the data collected; models using cheaper-to-collect data sources explain less variance; models collecting more detailed data explained more variance. Models for VHD can be useful in incident management after action reviews and predicting road user costs.
