Global ETD Search

141	Automatic Classification of Fish in Underwater Video; Pattern Matching - Affine Invariance and Beyond Gundam, Madhuri 15 May 2015 (has links) Underwater video is used by marine biologists to observe, identify, and quantify living marine resources. Video sequences are typically analyzed manually, which is a time-consuming and laborious process. Automating this process would save significant time and cost. This work proposes a technique for automatic fish classification in underwater video. The steps involved are background subtraction, fish region tracking, and feature-based classification. Background subtraction separates moving objects from their surrounding environment. Tracking associates multiple views of the same fish in consecutive frames. This step is especially important because recognizing and classifying one or a few of the views as a species of interest may allow the whole sequence to be labeled as that species. Shape features are extracted from each object using Fourier descriptors and presented to a nearest neighbor classifier. Finally, the nearest neighbor classifier results are combined using a probabilistic-like framework to classify an entire sequence.
Most existing pattern matching techniques focus on affine invariance, mainly because rotation, scale, translation and shear are common image transformations. However, in some situations, other transformations may be modeled as a small deformation on top of an affine transformation. The proposed algorithm complements the existing Fourier transform-based pattern matching methods in such situations. First, the spatial domain pattern is decomposed into non-overlapping concentric circular rings centered at the middle of the pattern. The Fourier transforms of the rings are computed and then mapped to the polar domain. The algorithm assumes that the individual rings are rotated with respect to each other. The variable angles of rotation provide information about the directional features of the pattern. The angle of rotation is determined starting from the Fourier transform of the outermost ring and moving inwards to the innermost ring. Two approaches, one using a dynamic programming algorithm and the other a greedy algorithm, are used to determine the directional features of the pattern. Electrical and Electronics
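As a rough illustration of the shape-feature step in entry 141, the sketch below computes Fourier descriptors of a closed contour and classifies them with a 1-nearest-neighbor rule. It is not the thesis's implementation: the function names, the normalisation (dropping the DC term, taking magnitudes, dividing by the largest magnitude), the `ellipse` test shapes and the species labels are all assumptions chosen for illustration, and the sequence-level combination of results is not shown.

```python
import cmath
import math


def fourier_descriptors(contour, n_pairs=4):
    """Magnitude-based Fourier descriptors of a closed contour.

    contour: ordered list of (x, y) boundary points.
    Returns [|c1|, |c-1|, |c2|, |c-2|, ...] divided by the largest entry.
    Assumed normalisation: skipping c0 removes translation, magnitudes
    remove rotation and start point, the division removes scale.
    """
    n = len(contour)
    if n < 2 * n_pairs + 1:
        raise ValueError("contour too short for the requested descriptors")
    z = [complex(x, y) for x, y in contour]

    def coeff(k):
        # Plain O(n) evaluation of one DFT coefficient.
        return sum(z[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)) / n

    mags = []
    for k in range(1, n_pairs + 1):
        mags.append(abs(coeff(k)))
        mags.append(abs(coeff(-k)))
    scale = max(mags)
    if scale == 0:
        raise ValueError("degenerate contour")
    return [m / scale for m in mags]


def nearest_neighbor(query, labelled):
    """Return the label of the training descriptor closest to `query`."""
    return min(labelled, key=lambda item: math.dist(query, item[0]))[1]


def ellipse(a, b, n=64):
    """Hypothetical test shape: n points on an axis-aligned ellipse."""
    return [(a * math.cos(2 * math.pi * t / n), b * math.sin(2 * math.pi * t / n))
            for t in range(n)]


# Hypothetical training set with made-up labels.
training = [
    (fourier_descriptors(ellipse(1, 1)), "species_a"),
    (fourier_descriptors(ellipse(2, 1)), "species_b"),
]
```

Because only normalised magnitudes are kept, a rotated, scaled and shifted copy of a training shape maps to (numerically) the same descriptor vector, which is why a simple distance-based classifier can be applied directly.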
142	Optimalizace rozvozu piva společnosti Heineken / Heineken Beer Distribution Optimalisation Vršecká, Renáta January 2009 (has links) This thesis deals with a real logistics problem of the Heineken CZ company. The company draws up a daily itinerary for each vehicle that distributes its goods to customers. These itineraries are created manually, relying only on the skill of experienced drivers. The goal of this thesis is to find an algorithm that sets optimal itineraries for all vehicles, so that the total distance, and therefore operating costs, are minimized, using only the distances between each pair of nodes.
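The abstract of entry 142 does not say which algorithm the thesis uses. Purely to illustrate what "building an itinerary from a distance matrix" can look like, here is a sketch of the well-known nearest-neighbour construction heuristic for a single vehicle. It is a swapped-in, generally non-optimal baseline, and the function names and the example matrix are assumptions.

```python
def nearest_neighbor_route(dist, depot=0):
    """Greedy single-vehicle tour: always drive to the closest unvisited node.

    dist: square matrix (list of lists), dist[i][j] = distance from i to j.
    Returns the visiting order, starting and ending at the depot.
    This is a heuristic; it does not guarantee the shortest tour.
    """
    unvisited = set(range(len(dist))) - {depot}
    route = [depot]
    while unvisited:
        here = route[-1]
        # Ties are broken by node index to keep the result deterministic.
        nxt = min(unvisited, key=lambda j: (dist[here][j], j))
        route.append(nxt)
        unvisited.remove(nxt)
    route.append(depot)
    return route


def route_length(dist, route):
    """Total distance travelled along `route`."""
    return sum(dist[a][b] for a, b in zip(route, route[1:]))


# Hypothetical 4-node example (node 0 is the depot).
example_dist = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]
example_route = nearest_neighbor_route(example_dist)
```

A multi-vehicle setting would additionally need to assign customers to vehicles and respect capacities, which this sketch leaves out.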
143	EVALUATING SPATIAL QUERIES OVER DECLUSTERED SPATIAL DATA Eslam A Almorshdy (6832553) 02 August 2019 (has links) Due to the large volumes of spatial data, data is stored on clusters of machines that inter-communicate to complete a task. In such a distributed environment, communicating intermediate results among computing nodes dominates execution time. Communication overhead is even more dominant if processing is in memory. Moreover, the way spatial data is partitioned affects overall processing cost. Different partitioning strategies influence the size of the intermediate results. Spatial data poses the following additional challenges: 1) storage load balancing because of the skewed distribution of spatial data over the underlying space, 2) query load imbalance due to skewed query workload and query hotspots over both time and space, and 3) lack of effective utilization of the computing resources. We introduce a new kNN query evaluation technique, termed BCDB, for evaluating nearest-neighbor queries (NN-queries, for short). In contrast to clustered partitioning of spatial data, BCDB explores the use of declustered partitioning of data to address data and query skew. BCDB uses summaries of the underlying data and a coarse-grained index to localize processing of the NN-query on each local node as much as possible. The coarse-grained index is locally traversed using a new uncertain version of classical distance browsing, resulting in a minimal O(√k) elements to be communicated across all processing nodes. Theoretical Computer Science Geospatial Information Systems Data Structures Database Management Access methods Spatial Database Nearest Neighbor Queries
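To make the setting of entry 143 concrete, the sketch below shows a naive baseline for answering a kNN query over declustered data: points are dealt round-robin to nodes, every node returns its local k nearest points, and a coordinator merges them. This is not BCDB. It ships up to k candidates per node, which is exactly the communication cost the abstract says BCDB reduces. All names are assumptions.

```python
import heapq
import math


def decluster(points, n_nodes):
    """Round-robin (declustered) assignment of points to n_nodes partitions."""
    return [points[i::n_nodes] for i in range(n_nodes)]


def local_knn(points, query, k):
    """k nearest points held by one node, as (distance, point) pairs."""
    return heapq.nsmallest(k, ((math.dist(p, query), p) for p in points))


def global_knn(partitions, query, k):
    """Naive coordinator: collect every node's local top-k, then merge.

    Communicates up to k candidates per node; shown only as a baseline.
    """
    candidates = []
    for part in partitions:
        candidates.extend(local_knn(part, query, k))
    return [p for _, p in heapq.nsmallest(k, candidates)]


# Hypothetical data: ten points on a line spread over three nodes.
example_points = [(i, 0) for i in range(10)]
example_parts = decluster(example_points, 3)
example_answer = global_knn(example_parts, (0, 0), 3)
```

The merge is correct because each of the true k nearest points is necessarily among the k nearest of the node that stores it.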
144	Weighing Machine Learning Algorithms for Accounting RWISs Characteristics in METRo : A comparison of Random Forest, Deep Learning & kNN Landmér Pedersen, Jesper January 2019 (has links) The numerical model for forecasting road conditions, Model of the Environment and Temperature of Roads (METRo), laid the foundation for solving the energy balance and calculating the temperature evolution of roads. METRo does this by providing a numerical modelling system that makes use of Road Weather Information Stations (RWIS) and meteorological projections. While METRo includes tools for correcting errors at each station, such as regional differences or microclimates, this thesis proposes machine learning as a supplement to the METRo forecasts to account for station characteristics. Controlled experiments were conducted comparing four regression algorithms (recurrent and dense neural networks, random forest and k-nearest neighbour) to predict the squared deviation of METRo-forecasted road surface temperatures. The results reveal that the models using the random forest algorithm yielded the most reliable predictions of METRo deviations. However, the study also shows the promise of neural networks and the possible advantage of the seasonal adjustments that the networks could offer. machine learning neural network random forest k-nearest neighbour Computer Sciences Datavetenskap (datalogi)
145	PCA-tree: uma proposta para indexação multidimensional / PCA-Tree: a multidimensional access method proposal Bernardina, Philipe Dalla 15 June 2007 (has links) The advent of applications demanding the representation of objects in multi-dimensional spaces fostered the development of efficient multi-dimensional access methods for data represented in R^d. Among the early applications that required multi-dimensional access methods, we can cite geo-processing systems, 3D applications and simulators. Later on, multi-dimensional access methods also became important tools in the design of classifiers, mainly those based on the nearest neighbors technique. Consequently, the dimensionality of the spaces has increased, from at most four to more than a thousand. Among the many multi-dimensional access methods, one notable class is based on balanced tree structures with data represented in R^d. These methods are evolutions of the B-tree for unidimensional access and inherit several of its characteristics. In this work, we present some of the access methods based on balanced trees in order to illustrate the central idea of these algorithms, and we propose and implement a new multi-dimensional access method, which we call PCA-tree. It uses a node-splitting heuristic based on the principal component of the samples to be divided. A hyperplane whose normal is the principal component is defined as the element that splits the space represented by the node. From this basic idea we define the data structure and the algorithms for the PCA-tree, employing secondary memory management as in B-trees. Finally, we compare the performance of the PCA-tree with that of other methods in the cited class, and present the advantages and disadvantages of the proposed access method through analysis of experimental results. indexing multidimensional access methods nearest neighbors classifier spatial access methods
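The node-splitting idea in entry 145 can be sketched in a few lines: estimate the principal component of the points in a node and split them with a hyperplane whose normal is that component. The sketch below is an illustration only, not the PCA-tree implementation. Power iteration, the fixed random seed, placing the hyperplane through the mean, and all function names are assumptions, and the B-tree-style secondary-memory management is not shown.

```python
import math
import random


def principal_component(points, iters=200):
    """Return (mean, unit vector of largest variance) via power iteration."""
    n, d = len(points), len(points[0])
    mean = [sum(p[i] for p in points) / n for i in range(d)]
    centered = [[p[i] - mean[i] for i in range(d)] for p in points]
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(d)]
           for i in range(d)]
    rng = random.Random(0)  # fixed seed so the sketch is reproducible
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:  # all points identical: no preferred direction
            break
        v = [x / norm for x in w]
    return mean, v


def pca_split(points):
    """Split points by the hyperplane through the mean with PCA normal.

    Assumption: the hyperplane passes through the mean; the abstract
    only specifies its normal, not its offset.
    """
    mean, normal = principal_component(points)
    left, right = [], []
    for p in points:
        proj = sum((p[i] - mean[i]) * normal[i] for i in range(len(p)))
        (left if proj < 0 else right).append(p)
    return normal, left, right


# Hypothetical node contents: points spread mainly along the x axis.
example_node = [(-3, 0.1), (-2, -0.1), (-1, 0.05), (1, -0.05), (2, 0.1), (3, -0.1)]
example_normal, example_left, example_right = pca_split(example_node)
```

The sign of the returned normal is arbitrary, so which half ends up in `left` and which in `right` may flip; only the partition itself is meaningful.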
146	Artificial intelligence and Machine learning : a diabetic readmission study Forsman, Robin, Jönsson, Jimmy January 2019 (has links) The maturing of artificial intelligence provides great opportunities for healthcare, but also comes with new challenges. For artificial intelligence to be adequate, a comprehensive analysis of the data is necessary, along with testing the data with multiple algorithms to determine which algorithm is appropriate to use. In this study, data were gathered on patients who either were or were not readmitted to hospital within 30 days of being admitted. The data were then analyzed and compared across different algorithms to determine the most appropriate algorithm to use. Artificial intelligence Machine learning Logistic regression K-nearest neighbor Boosted decision tree Artificial neural network Computer Sciences Datavetenskap (datalogi)
147	Floodplain Mapping in Data-Scarce Environments Using Regionalization Techniques Keighobad Jafarzadegan (5929811) 10 June 2019 (has links) Flooding is one of the most devastating and frequently occurring natural phenomena in the world. Due to the adverse impacts of floods on human life and property, it is crucial to investigate the best flood modeling approaches for delineating floodplain areas. Conventionally, different hydrodynamic models are used to identify floodplain areas. However, the high computational cost of these models and their dependency on detailed input datasets limit their application for large-scale floodplain mapping in data-scarce regions. Recently, a new floodplain mapping method based on a hydrogeomorphic feature, named Height Above Nearest Drainage (HAND), has been proposed as a successful alternative for fast and efficient floodplain mapping at large scale. The overall goal of this study is to improve the performance of the HAND-based method by overcoming its current limitations. The main focus is on extending the application of the HAND-based method to data-scarce environments. To achieve this goal, regionalization techniques are integrated with the floodplain models at the regional and continental scales. Four research objectives are established: (1) develop a regression model to create 100-year floodplain maps at a regional scale, (2) develop a classification framework for creating 100-year floodplain maps for the Contiguous United States, (3) develop a new version of the HAND-based method for creating probabilistic 100-year floodplain maps, and (4) propose a general regionalization framework for transferring information from data-rich basins to data-scarce environments.
In the first objective, the state of North Carolina is selected as the study area, and a regression model is developed to regionalize the available 100-year Flood Insurance Rate Maps (FIRMs) to the data-scarce regions. The regression model is an exponential equation with three independent variables: the average slope, the average elevation, and the main stream slope of the watershed. The results show that the estimated floodplains are within the expected range of accuracy of C>0.6 and F>0.9 for the majority of watersheds located in the mid-altitude regions, but the model overpredicts in flat regions and underpredicts in mountainous regions.
The second objective of this research extends the spatial application of the HAND-based method to the entire United States by proposing a new classification framework. The proposed framework classifies the watersheds into three groups using seven watershed characteristics related to topography, climate and land use. The validation results show that the average error of the floodplain maps is around 14%, which demonstrates the reliability and robustness of the proposed framework for continental floodplain mapping. In addition to the acceptable accuracy, the proposed framework can create floodplain maps for any watershed within the United States.
The HAND-based method is a deterministic modeling approach to floodplain mapping. In the third objective, a probabilistic version of this method is proposed. Using a probabilistic approach to floodplain mapping provides more informative maps. In this study, a flat watershed in the state of Kansas is selected as the case study, and the performance of four probabilistic functions for floodplain mapping is compared. The results show that a linear function with one parameter and a gamma function with two parameters are the best options for this study area. It is also shown that the proposed probabilistic approach can reduce the overpredictions and underpredictions made by the deterministic HAND-based approach.
In the fourth objective, a new regionalization framework for transferring calibrated environmental models to data-scarce regions is proposed. This framework aims to improve the current similarity-based regionalization methods by reducing the subjectivity that exists in the selection of basin descriptors. Using this framework for the probabilistic HAND-based method of the third objective, the floodplains are regionalized for a large set of watersheds in the Central United States. The results show that “vertical component of centroid (or latitude)” is the dominant descriptor of spatial variability in the probabilistic floodplain maps. This is an interesting finding that shows how a systematic approach can help to uncover hidden descriptors for regionalization. It is demonstrated that common methods, such as correlation coefficient calculation or stepwise regression analysis, do not reveal the critical role of latitude in the spatial variability of floodplains. Regionalization Floodplain Mapping Data-Scarce Regions Classification Digital Elevation Model Height Above Nearest Drainage
148	Fraud or Not? Åkerblom, Thea, Thor, Tobias January 2019 (has links) This paper uses statistical learning to examine and compare three statistical methods with the aim of predicting credit card fraud. The methods compared are Logistic Regression, K-Nearest Neighbour and Random Forest. They are applied and estimated on a data set consisting of nearly 300,000 credit card transactions to determine their performance, using classification of fraud as the outcome variable. The three models have different properties and advantages. The K-NN model performed best in this paper but has some disadvantages, since it does not explain the data but only predicts the outcome accurately. Random Forest explains the variables but is less precise. The Logistic Regression model seems to be unfit for this specific data set. Logistic Regression K-Nearest Neighbour classification random forest fraud transactions and statistical learning Probability Theory and Statistics Sannolikhetsteori och statistik
149	PCA-tree: uma proposta para indexação multidimensional / PCA-Tree: a multidimensional access method proposal Philipe Dalla Bernardina 15 June 2007 (has links) The advent of applications demanding the representation of objects in multi-dimensional spaces fostered the development of efficient multi-dimensional access methods for data represented in R^d. Among the early applications that required multi-dimensional access methods, we can cite geo-processing systems, 3D applications and simulators. Later on, multi-dimensional access methods also became important tools in the design of classifiers, mainly those based on the nearest neighbors technique. Consequently, the dimensionality of the spaces has increased, from at most four to more than a thousand. Among the many multi-dimensional access methods, one notable class is based on balanced tree structures with data represented in R^d. These methods are evolutions of the B-tree for unidimensional access and inherit several of its characteristics. In this work, we present some of the access methods based on balanced trees in order to illustrate the central idea of these algorithms, and we propose and implement a new multi-dimensional access method, which we call PCA-tree. It uses a node-splitting heuristic based on the principal component of the samples to be divided. A hyperplane whose normal is the principal component is defined as the element that splits the space represented by the node. From this basic idea we define the data structure and the algorithms for the PCA-tree, employing secondary memory management as in B-trees. Finally, we compare the performance of the PCA-tree with that of other methods in the cited class, and present the advantages and disadvantages of the proposed access method through analysis of experimental results. indexing multidimensional access methods nearest neighbors classifier spatial access methods
150	Extensão do Método de Predição do Vizinho mais Próximo para o modelo Poisson misto / An Extension of Nearest Neighbors Prediction Method for mixed Poisson model Arruda, Helder Alves 28 March 2017 (has links) Many proposals have emerged in recent years for predicting future observations in mixed models; however, there are few studies for cases in which random-effect values must be assigned to new groups. Tamura, Giampaoli and Noma (2013) proposed a method that computes the distances between a new group and groups with known random effects, based on the values of the covariates, named the Nearest Neighbors Prediction Method (NNPM), considering the mixed logistic model. The goal of this dissertation was to extend the NNPM to the mixed Poisson model, in addition to obtaining confidence intervals for the predictions. To this end, new prediction performance measures were proposed, as well as the use of the Bootstrap methodology for constructing the intervals. The prediction method was applied to two real data sets and in simulation studies. In both cases good performance was obtained. Thus, the NNPM proved to be a viable prediction method also in the mixed Poisson case. Mixed Poisson model Nearest neighbors Prediction Random effects
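The core idea described in entry 150, borrowing a random effect for a new group from the groups whose covariates are closest, can be sketched as below. The abstract states only that distances between groups are computed from covariate values. How those distances are turned into a prediction (here, the plain average over the k closest groups), the Euclidean distance, the log link and all names are assumptions for illustration, and the bootstrap intervals are not shown.

```python
import math


def predict_random_effect(new_covariates, groups, k=1):
    """Average the random effects of the k groups nearest in covariate space.

    groups: list of (covariate_vector, estimated_random_effect) pairs.
    Assumption: unweighted mean over the k closest groups.
    """
    if k < 1 or not groups:
        raise ValueError("need k >= 1 and at least one known group")
    nearest = sorted(groups, key=lambda g: math.dist(new_covariates, g[0]))[:k]
    return sum(effect for _, effect in nearest) / len(nearest)


def poisson_mean(fixed_linear_predictor, random_effect):
    """Mean of a mixed Poisson model with log link: exp(x'beta + b)."""
    return math.exp(fixed_linear_predictor + random_effect)


# Hypothetical groups with already-estimated random effects.
known_groups = [
    ((0.0, 0.0), -0.5),
    ((1.0, 1.0), 0.2),
    ((5.0, 5.0), 1.0),
]
b_new = predict_random_effect((0.9, 1.2), known_groups, k=1)
mu_new = poisson_mean(0.3, b_new)
```

With `k=1` the new group simply inherits the random effect of its single closest neighbor; larger `k` smooths the prediction over several similar groups.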

Search results