Global ETD Search

1	Characterization and interwell connectivity evaluation of Green River reservoirs, Wells Draw study area, Uinta Basin, Utah
Abiazie, Joseph Uchechukwu. 15 May 2009.
Recent efforts to optimize oil recovery from Green River reservoirs in the Uinta Basin have created a need to better understand reservoir connectivity at the scale of the operational unit. This study focuses on Green River reservoirs in the Wells Draw study area, where the oil production response to the implemented waterflood is poor and a better understanding of reservoir connectivity is required to enhance future secondary oil recovery. Correlating sand bodies between well locations remains difficult at 40-acre well spacing, so the interwell connectivity of the reservoirs is uncertain. Understanding it requires integrating all static and dynamic data to generate probabilistic models of the reservoir at interwell locations. The study had two objectives. The first was to determine reservoir connectivity at the interwell scale in the Wells Draw study area. To this end, I used well log and perforation data to produce probabilistic models of net-porosity for four producing intervals: (1) Castle Peak, (2) Lower Douglas Creek, (3) Upper Douglas Creek, and (4) Garden Gulch. The second was to find readily applicable methods for determining interwell connectivity. To this end, I used sandstone net thickness and perforation data to evaluate interwell connectivity in the study area. This evaluation was done to (1) assess and visualize connectivity, (2) provide an assessment of connectivity for validating and calibrating percolation- and capacitance-based methods, and (3) determine flow barriers for simulation. The probabilistic models encompass the four producing intervals, with a gross thickness of 1,900 ft, and enable simulation assessments of different development strategies for optimizing oil recovery in the Wells Draw study area. The method developed for determining interwell connectivity in the Wells Draw study area is reliable and suited to the four producing intervals. The study also shows that the percolation-based method is reliable for determining interwell connectivity in the four producing intervals.
Keywords: Reservoir Connectivity; Green River; Characterization; Geostatistical modeling; ranking
3	Amostras virtuais no monitoramento da produção florestal / Virtual samples in the monitoring of forest production
Lima, Natália da Silva. 23 February 2018.
Funding: Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES). CAPES: 1582107
The high productivity of eucalyptus makes it necessary to master accurate techniques and tools for assessing forest stands, such as the forest inventory, a fundamental procedure for monitoring production that reveals the maximum potential of the forest and is essential for planning cutting, harvesting, and timber supply. Conventional techniques for surveying forest production rely on classical statistics, using only average values for decision making and ignoring spatial correlations that may exist between sample plots. Kriging is a geostatistical interpolator that defines the spatial dependence structure of the data and, combined with the virtual-samples methodology, can be a viable way to obtain an acceptably modeled variogram. The general aim of this study was to apply the virtual-samples methodology to planning the sampling grid of the forest inventory, in order to determine a number of plots that can be established in the area to estimate productivity without loss of precision or increase in cost. The study area covers 287.66 ha of plantation belonging to Eucatex S/A in Itatinga, in the state of São Paulo. Volume data (m³ ha⁻¹) came from inventories carried out by the company: 32 permanent rectangular plots for the continuous forest inventories (CFI), and 32 permanent rectangular plots plus 66 temporary plots for the pre-cut inventory (PCI). The CFI data were spatialized in ArcGIS in the following steps: univariate statistics on volume per hectare; generation of the experimental variogram; variogram fitting; validation of the theoretical model; kriging; and generation of year-by-year productivity maps. Subsequently, 47 virtual samples were inserted into the sampling grid of the continuous inventories, considering only the 32 permanent plots, of which 10 had been set aside for external validation; the geostatistical analysis used the virtual samples plus the remaining permanent plots, totaling 69 plots. Cross-validation showed that the mean error and the correlation between observed and estimated values improved when virtual samples were added, allowing a better variogram fit and favorable estimates. The external validation of the sample-insertion methodology gave a mean error of 6.8% in volume relative to the actual inventory. The technique of inserting virtual samples can therefore be used, and it proved satisfactory for planning the next sampling grid for later geostatistical analyses in even-aged plantations.
Keywords: Forest inventory; Geostatistical; Forest productivity
4	Conceptual Cost Estimation of Highway Bid Items Using Geostatistical Interpolation
Awuku, Bright. January 2021.
Ensuring accurate and reliable cost estimates for highway bid items, especially during the conceptual phase of a project, is of significant interest to state highway agencies. Despite existing research on the subject, inaccurate estimation of highway bid items persists. A systematic literature review was performed to determine research trends, identify and categorize the factors influencing highway unit prices, and assess the performance of conceptual cost prediction models. This research proposes a geographic information system (GIS)-based methodology that combines extensive historical bid data for unit-price estimation with GIS capabilities, taking into account the effects of project-specific location and of cost escalation on different bid items. A comparison of the three spatial interpolation techniques used in this research showed that disjunctive and empirical Bayesian kriging models gave more accurate cost predictions than ordinary kriging.
Keywords: conceptual cost estimation; geostatistical modeling; highway unit prices
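To make the ordinary kriging baseline named in this abstract concrete, here is a minimal sketch in Python with NumPy. It is not the thesis's GIS workflow: the exponential variogram model, its parameters (`nugget`, `sill`, `a_range`), the function names, and the bid coordinates and prices are all assumptions chosen for illustration.

```python
import numpy as np

def exp_variogram(h, nugget=0.0, sill=1.0, a_range=10.0):
    """Exponential semivariogram (assumed model); gamma(0) is 0 by definition."""
    h = np.asarray(h, dtype=float)
    gamma = nugget + (sill - nugget) * (1.0 - np.exp(-3.0 * h / a_range))
    return np.where(h > 0, gamma, 0.0)

def ordinary_kriging(coords, values, target, **variogram_params):
    """Return (prediction, kriging variance, weights) at one target location."""
    n = len(coords)
    # Pairwise distances between the observed locations.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    # Kriging system: variogram block bordered by ones (unbiasedness constraint).
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = exp_variogram(d, **variogram_params)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = exp_variogram(np.linalg.norm(coords - target, axis=1), **variogram_params)
    sol = np.linalg.solve(A, b)
    weights, lagrange = sol[:n], sol[n]
    prediction = weights @ values
    variance = weights @ b[:n] + lagrange
    return prediction, variance, weights

# Hypothetical past bids: project coordinates (km) and unit prices.
coords = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
prices = np.array([100.0, 120.0, 110.0, 130.0])
pred, var, weights = ordinary_kriging(coords, prices, np.array([5.0, 5.0]))
```

The weights sum to one because of the constraint row, and with a zero nugget the predictor reproduces an observed price exactly at its own location. Disjunctive and empirical Bayesian kriging, which the abstract reports as more accurate, are not covered by this sketch.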
5	Geostatistical Inspired Metamodeling and Optimization of Nanoscale Analog Circuits
Okobiah, Oghenekarho. 05 1900.
The current trend towards miniaturization of modern consumer electronic devices significantly affects their design. The demand for efficient all-in-one appliances leads to smaller, yet more complex and powerful, nanoelectronic devices. The increasing complexity of such nanoscale Analog/Mixed-Signal Systems-on-Chip (AMS-SoCs) presents difficult challenges to designers. One promising way to reduce this design effort is metamodeling (surrogate modeling), which significantly reduces the time needed for computer simulation, design space exploration, and optimization. This dissertation addresses several issues in metamodeling-based design exploration of nanoelectronic AMS systems. A surrogate modeling technique that uses geostatistical Kriging prediction to create metamodels is proposed. Kriging prediction takes into account the correlation between input parameters when predicting performance points. We propose using this property to model accurately the effects of process variation on designs in the deep nanometer region. Different Kriging methods, such as simple and ordinary Kriging, have been explored in this work. We also propose another metamodeling technique, the Kriging-Bootstrapped Neural Network, which combines the accuracy and process-variation awareness of Kriging with artificial neural network models for ultra-fast and accurate process-aware metamodeling. The proposed methodologies combine Kriging metamodels with selected algorithms for ultra-fast layout optimization: the Gravitational Search Algorithm (GSA), Simulated Annealing Optimization (SAO), and Ant Colony Optimization (ACO). Experimental results demonstrate that the proposed Kriging-metamodel-based methodologies can perform the optimizations with minimal computational burden compared to traditional (SPICE-based) design flows.
Keywords: Kriging; geostatistical; nanoscale; analog; Nanoelectronics; Computer simulation
6	Otimização de amostragem espacial / Optimization of sampling space
Guedes, Luciana Pagliosa Carvalho. 10 April 2008.
The aim of this work was to establish sampling plans with reduced sample size, derived from spatially dependent data sets, that are efficient both for predicting at unsampled locations and for estimating characteristics related to spatial prediction. The reduced sampling plans were obtained with two optimization procedures, simulated annealing and a hybrid genetic algorithm, using the average kriging variance as the objective function to be minimized. Simulated data sets with different values of range and nugget effect were used to identify the influence of these parameters on the choice of the optimized sampling configuration. For each simulated data set, samples were obtained by the optimization procedures and the results were compared with four sampling schemes: random, systematic, lattice plus infill, and lattice plus close pairs. The results showed that the optimized sampling plans, especially those obtained by the hybrid genetic algorithm, produced lower estimates of the average kriging variance and better estimates of the percentage and sum of predicted values above the third quartile and the 90th percentile, which are characteristics related to spatial prediction. Increasing the sample size improved the estimates for all results analyzed and, regardless of the range and nugget effect values, sampling optimized by the hybrid genetic algorithm produced the best results.
In addition, reduced sample sets of 128 plots were obtained by the hybrid genetic algorithm, by simulated annealing, and by random and systematic sampling for the soil chemical property potassium content, taken from a 256-plot data set of a precision agriculture experiment in an experimental area. With the data from these samples, a geostatistical analysis was carried out to identify the spatial dependence of potassium in the area, and potassium was predicted at unsampled locations in the same area. For all sampling schemes, the predicted values were classified according to the potassium fertilization criterion for soybean crops in Paraná (EMATER - Empresa Paranaense de Assistência Técnica e Extensão Rural, 1998). These results were compared with the analysis of the full data set, and the results from the sample obtained by the hybrid genetic algorithm were the most similar. There was thus evidence that a 50% reduction in the sample size of the potassium data set, using a sample obtained by the hybrid genetic algorithm, gave efficient results for classifying potassium fertilization in the study area, cutting the cost of soil chemical analysis by 50% without much loss of efficiency in the conclusions drawn from spatial prediction.
Keywords: Genetic algorithms; Sampling; Geostatistical design; Statistics of stochastic processes; Geostatistics
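The optimization idea in this abstract, choosing a reduced sample that minimizes the average kriging variance, can be sketched with a basic simulated annealing loop. This is an illustrative sketch under stated assumptions, not the thesis's procedure: the exponential variogram and its parameters, the swap move, the cooling schedule, and the 8 x 8 candidate grid are all hypothetical, and the hybrid genetic algorithm is not shown.

```python
import numpy as np

def exp_variogram(h, nugget, sill, a_range):
    """Exponential semivariogram (assumed model); gamma(0) is 0."""
    gamma = nugget + (sill - nugget) * (1.0 - np.exp(-3.0 * h / a_range))
    return np.where(h > 0, gamma, 0.0)

def mean_kriging_variance(sample, grid, nugget=0.1, sill=1.0, a_range=5.0):
    """Average ordinary-kriging variance over all prediction nodes in `grid`."""
    n = len(sample)
    d = np.linalg.norm(sample[:, None] - sample[None, :], axis=-1)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = exp_variogram(d, nugget, sill, a_range)
    A[n, n] = 0.0
    d0 = np.linalg.norm(sample[:, None] - grid[None, :], axis=-1)
    B = np.ones((n + 1, len(grid)))
    B[:n] = exp_variogram(d0, nugget, sill, a_range)
    sol = np.linalg.solve(A, B)  # one column of weights (plus multiplier) per node
    # Variance per node = sum_i w_i * gamma_i0 + Lagrange multiplier.
    return float(np.mean(np.sum(sol * B, axis=0)))

def anneal_subset(candidates, k, grid, n_iter=300, t0=0.05, cooling=0.99, seed=0):
    """Pick k of the candidate plots by simulated annealing on the mean variance."""
    rs = np.random.default_rng(seed)
    idx = rs.choice(len(candidates), size=k, replace=False)
    cost = mean_kriging_variance(candidates[idx], grid)
    best_idx, best_cost = idx.copy(), cost
    temp = t0
    for _ in range(n_iter):
        # Move: swap one selected plot for one currently unselected plot.
        unused = np.setdiff1d(np.arange(len(candidates)), idx)
        trial = idx.copy()
        trial[rs.integers(k)] = rs.choice(unused)
        trial_cost = mean_kriging_variance(candidates[trial], grid)
        # Accept improvements always, worse moves with Metropolis probability.
        if trial_cost < cost or rs.random() < np.exp((cost - trial_cost) / temp):
            idx, cost = trial, trial_cost
            if cost < best_cost:
                best_idx, best_cost = idx.copy(), cost
        temp *= cooling
    return best_idx, best_cost

# Hypothetical 8 x 8 grid of plots, reduced to 16 plots.
xs, ys = np.meshgrid(np.arange(8.0), np.arange(8.0))
plots = np.column_stack([xs.ravel(), ys.ravel()])
chosen, chosen_cost = anneal_subset(plots, k=16, grid=plots)
```

The objective depends only on plot locations and the variogram, not on measured values, which is why a sampling plan can be optimized before any new data are collected.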
7	Simulação geoestatística aplicada ao planejamento de pilhas de homogeneização : um estudo de caso de reconciliação [Geostatistical simulation applied to the planning of homogenization piles: a reconciliation case study]
Abichequer, Luciana Arnt. January 2010.
Profits in the mineral industry depend strongly on proper planning at all stages of mining and ore processing, and every step depends on the results of the previous stage. The processing plant illustrates this chain of events. To maximize ore recovery, the plant must be fed with material that is as homogeneous as possible, minimizing the variance of the head grades; this can be achieved initially by a proper mining plan and optimal schedule and, more effectively, by homogenization piles. The key is to know the input grade variance in order to design a blending system capable of attenuating it. Estimation methods commonly used to predict the block grades that form the geological model used for mine planning do not adequately measure the variance associated with these estimates. This study evaluates geostatistical simulation for predicting in situ grade variability and planning blending piles. The method generates multiple, equally probable grade scenarios for the deposit, which reproduce the sample values, the histogram, and the variogram of the attribute being simulated. For a given mining plan, the set of simulated scenarios generates a group of equiprobable values for each homogenization pile, providing the means to assess the range of values each pile can take. To validate the methodology, the P2O5 grades predicted by the short-term mining plan and sampled at the processing plant of a large phosphate mine in central Brazil were compared with the simulated values. The results matched adequately, demonstrating that geostatistical simulation and pile emulation are efficient tools for predicting in situ grade variability and planning blending piles.
Keywords: Ores; Homogenization; Geostatistics; Stochastic simulation; In situ variability; Homogenization pile; Geostatistical simulation
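The step from simulated scenarios to a set of equiprobable values per pile can be sketched as follows. This is only an illustration under assumptions: the function name, the tonnage weighting, and the random stand-in for the realizations are hypothetical (a real study would use conditional geostatistical simulations of the deposit, which this sketch does not produce).

```python
import numpy as np

def pile_grade_distribution(realizations, pile_of_block, tonnage):
    """Tonnage-weighted grade of each pile in each realization.

    realizations : (n_realizations, n_blocks) simulated block grades
    pile_of_block: (n_blocks,) pile label assigned to each block by the mining plan
    tonnage      : (n_blocks,) block tonnages
    Returns the sorted pile labels and an (n_realizations, n_piles) array.
    """
    piles = np.unique(pile_of_block)
    grades = np.empty((realizations.shape[0], len(piles)))
    for j, pile in enumerate(piles):
        mask = pile_of_block == pile
        grades[:, j] = realizations[:, mask] @ tonnage[mask] / tonnage[mask].sum()
    return piles, grades

# Stand-in data only: 50 "realizations" of 12 block grades (% P2O5) drawn at random.
rs = np.random.default_rng(1)
fake_realizations = rs.normal(loc=12.0, scale=2.0, size=(50, 12))
plan = np.repeat(np.arange(3), 4)   # hypothetical plan: 3 piles of 4 blocks each
tons = np.full(12, 1000.0)          # hypothetical equal block tonnages
piles, grades = pile_grade_distribution(fake_realizations, plan, tons)
p5, p95 = np.percentile(grades, [5, 95], axis=0)  # range of plausible pile grades
```

Comparing the grade sampled at the plant for each pile against an interval such as `p5` to `p95` is one simple way to carry out the kind of reconciliation the abstract describes.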
8	Statistical Geocomputing: Spatial Outlier Detection in Precision Agriculture
Chu Su, Peter. 29 September 2011.
The collection of crop yield data has become much easier with the introduction of technologies such as the Global Positioning System (GPS), ground-based yield sensors, and Geographic Information Systems (GIS). This explosive growth and widespread use of spatial data has challenged the ability to derive useful spatial knowledge. In addition, outlier detection as an important pre-processing step remains a challenge because the technique and the definition of spatial neighbourhood remain non-trivial, and the quantitative assessment of false positives, false negatives, and the concept of region outlier remains unexplored. The overall aim of this study is to evaluate different spatial outlier detection techniques in terms of their accuracy and computational efficiency, and to examine the performance of these outlier removal techniques in a site-specific management context. In a simulation study, unconditional sequential Gaussian simulation is performed to generate crop yield as the response variable along with two explanatory variables. Point and region spatial outliers are added to the simulated datasets by randomly selecting observations and adding or subtracting a Gaussian error term. Because the simulated data contain spatial outliers that are known in advance, the assessment of spatial outlier techniques can be conducted as a binary classification exercise, treating each spatial outlier detection technique as a classifier. Algorithm performance is evaluated with the area and partial area under the ROC curve up to different true positive and false positive rates. Outlier effects in on-farm research are assessed in terms of the influence of each spatial outlier technique on coefficient estimates from a spatial regression model that accounts for autocorrelation.
Results indicate that for point outliers, spatial outlier techniques that account for spatial autocorrelation tend to be better than standard spatial outlier techniques in terms of higher sensitivity, lower false positive detection rate, and consistency in performance. They are also more resistant to changes in the neighbourhood definition. For region outliers, standard techniques tend to be better than spatial autocorrelation techniques in all performance aspects because they are less affected by masking and swamping effects. In particular, one spatial autocorrelation technique, Averaged Difference, is superior to all other techniques in both the point and region outlier scenarios because of its ability to incorporate spatial autocorrelation while at the same time revealing the variation between nearest neighbours. In terms of decision-making, all algorithms led to slightly different coefficient estimates and may therefore result in distinct decisions for site-specific management. The results outlined here will allow an improved removal of crop yield data points that are potentially problematic. The recommendation is to use the Averaged Difference algorithm for cleaning spatial outliers in yield datasets. Identifying the optimal nearest neighbour parameter for the neighbourhood aggregation function is still non-trivial; the recommendation is to specify a large number of nearest neighbours, large enough to capture the region size. Lastly, the unbiased coefficient estimates obtained with Averaged Difference suggest it is the better method for pre-processing spatial outliers in crop yield data, which underlines its suitability for detecting spatial outliers in the context of on-farm research.
Keywords: spatial outlier; geostatistical simulation; spatial neighbourhood; area under ROC curve; sensitivity-specificity; precision agriculture; Geography
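The evaluation design described here, scoring every observation and then treating detection as binary classification against known planted outliers, can be sketched as below. The score used is an assumption: a mean absolute difference to the k nearest neighbours, written in the spirit of a neighbour-difference detector and not claimed to be the thesis's Averaged Difference algorithm. The grid, the planted outlier, and all names are hypothetical.

```python
import numpy as np

def knn_avg_abs_difference(coords, values, k=4):
    """Outlier score: mean absolute difference to the k nearest neighbours."""
    d = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)            # a point is not its own neighbour
    nn = np.argsort(d, axis=1)[:, :k]      # indices of the k nearest neighbours
    return np.mean(np.abs(values[:, None] - values[nn]), axis=1)

def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

# Hypothetical 5 x 5 yield grid with a smooth trend and one planted point outlier.
xs, ys = np.meshgrid(np.arange(5.0), np.arange(5.0))
coords = np.column_stack([xs.ravel(), ys.ravel()])
yield_values = coords.sum(axis=1)
yield_values[12] += 50.0                   # planted outlier at the grid centre
labels = np.zeros(25, dtype=int)
labels[12] = 1
scores = knn_avg_abs_difference(coords, yield_values, k=4)
auc = roc_auc(labels, scores)
```

Note that the neighbours of the planted point also receive raised scores, a small-scale version of the swamping effect the abstract mentions; the choice of `k` controls how strongly that happens.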
9	Applications of Ensemble Kalman Filter for characterization and history matching of SAGD reservoirs
Gul, Ali. Unknown date.
No description available.
Keywords: Ensemble Kalman Filter; Steam Assisted Gravity Drainage; Computer Assisted History Matching; Reservoir Characterization; Geostatistical Reservoir Modeling
