11 |
Automatické rozpoznávání zpěvu ptáků [Automatic recognition of bird song]
Břenek, Roman, January 2014
This master's thesis deals with methods for automatic recognition of bird species from their vocalizations. First, I defined a database of recordings and created reference data by manual annotation. The next step was to find optimal features for describing birdsong; I use Human Factor Cepstral Coefficients (HFCC). For the best recognition accuracy, segments containing bird vocalization must be correctly separated from non-vocalization segments. The voice activity detection (VAD) system is based on the k-Nearest Neighbours algorithm. The last step describes a system based on Hidden Markov Models, which makes it possible to recognize the specific bird species from segments of its song.
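The k-Nearest Neighbours decision behind such a vocalization detector can be illustrated with a minimal sketch. It assumes each audio frame has already been reduced to a feature vector (for example HFCC); the toy feature values, the labels, and the function name `knn_label` are hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter

def knn_label(train, query, k=3):
    """Return the majority label among the k training frames closest to `query`.

    `train` is a list of (feature_vector, label) pairs; distances are Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy 2-D features (hypothetical values standing in for HFCC vectors).
frames = [
    ((0.9, 0.8), "vocal"), ((1.0, 0.7), "vocal"), ((0.8, 0.9), "vocal"),
    ((0.1, 0.2), "silence"), ((0.2, 0.1), "silence"), ((0.0, 0.3), "silence"),
]
print(knn_label(frames, (0.85, 0.75)))  # frame close to the "vocal" group
```

In a full pipeline the frames labelled as vocalization would then be passed on to the species recognizer, while the rest are discarded.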
|
12 |
Graphical Methods for Image Compositing and Completion
Al-Kabbany, Ahmed, January 2016
This thesis is concerned with problems encountered in image-based rendering (IBR) systems. The significance of such systems is increasing as virtual reality as well as augmented reality are finding their way into many applications, from entertainment to military. Particularly, I propose methods that are based on graph theory to address the open problems in the literature of image and video compositing, and scene completion.
For visually plausible compositing, the object to be composited must first be separated from the background it was originally captured against, a problem known as natural image matting. Using some user interactions, matting aims to calculate a map that depicts how much the background color(s) contribute to the color of every pixel in an image. My contributions to matting increase the accuracy of the map calculation and automate the whole process by eliminating the need for user interactions. I propose several techniques for sampling user interactions that enhance the quality of the calculated maps. They rely on statistics of non-parametric color models as well as graph transduction and iterative graph cut techniques. The presented sampling strategies lead to state-of-the-art separation, and their efficiency was acknowledged by the standard benchmark in the literature. I have adopted the Gestalt laws of visual grouping to formulate a novel cost function that automates the generation of interactions that otherwise have to be provided manually. This frees the matting process from a critical limitation when used in rendering contexts.
Scene completion is another task that is often required in IBR systems. This document presents a novel image completion method that overcomes a few drawbacks in the literature. It adopts a binary optimization technique to construct an image summary, which is then shifted according to a map, calculated with combinatorial optimization, to complete the image. I also present the formulation with which the proposed method can be extended to complete scenes, rather than images, in a stereoscopically and temporally consistent manner.
|
13 |
Estimation of the Impacts of Climate Change on the Design, Risk and Performance of Urban Water Infrastructure
Alzahrani, Fahad, 30 March 2023
Changes in the temporal variability of precipitation at all timescales are expected due to global warming. Such changes affect urban water infrastructure by potentially influencing its performance and risk of failure. Unfortunately, there is considerable uncertainty about how hydrological variables will change in the future. While uncertainty is present at all timescales, the climate signal in the daily time series simulated by climate models, for instance, can be estimated with much greater certainty than in the simulated hourly time series. That is problematic because sub-daily precipitation time series are essential to solving specific water resource engineering problems, especially in urban hydrology, where times of concentration are typically less than a day. For instance, hourly or sub-hourly precipitation time series are routinely used to design stormwater and road drainage systems. Rainfall variability at sub-daily time steps is often represented as Intensity-Duration-Frequency (IDF) curves, relating precipitation duration (or basin time of concentration) to return period and average precipitation intensity. Naturally, several researchers have investigated the integration of climate change into IDF curves, leading to methods of variable complexity and performance.
This thesis aims to a) critically analyze the most commonly used methods for deriving IDF curves under climate change in Canada, b) identify the methods with optimal performance for a set of stations located in the South Nation watershed in Ottawa, Ontario, and c) perform a case study highlighting the effect of the choice of temporal disaggregation method on the estimated risk of failure/performance of an urban water system.
The first part of the thesis examines Equidistant Quantile Mapping (EQM) used in the IDF_CC tool developed for the Canadian Water Network project. Two conceptual flaws in the method that led to a systematic underestimation of extreme events were discovered. Two corrections are proposed to the EQM, leading to the development of two new methods for IDF generation. The output of EQM and its improved version is a time series of annual maximum precipitation intensity for different durations that can be used to derive IDF curves.
The time series generated using the above approach are not appropriate for rainfall-runoff models, which require continuous precipitation time series (not only maxima). The second part of the thesis tackles this issue by examining a different approach to evaluating the risk of failure/performance of urban water systems under a changing climate. This second approach yields continuous precipitation time series that can be fed into rainfall-runoff models or used for IDF curve generation. The proposed method is applied in three steps: i) projections of future daily precipitation are generated by downscaling the output of climate models; ii) the downscaled daily precipitation time series are temporally disaggregated to an hourly time step using various techniques; iii) finally, the disaggregated future precipitation time series are used as inputs to rainfall-runoff models or used to generate IDF curves. This approach relaxes several strong assumptions made to develop the EQM approach, such as the implicit assumption that the annual maximum precipitation at two different time steps occurs during the same event. That assumption is not necessarily valid and can affect the realism of the generated IDF curves. The method's performance depends on the temporal disaggregation technique used in step ii. In this thesis, a simple steady-state stochastic disaggregation model that generates wet/dry day occurrence using a binomial distribution and precipitation intensity using an exponential distribution is proposed and compared to widely used temporal disaggregation methods: the multiplicative random cascade model (MRC), the Hurst-Kolmogorov process (HKP), and three versions of the K-nearest neighbor model (KNN), using the nonparametric Kolmogorov-Smirnov (KS) test. The six disaggregation techniques were assessed at four stations in the South Nation River Watershed in Eastern Ontario, Canada.
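The idea of a binomial/exponential disaggregation model can be sketched as follows. This is only an illustration under stated assumptions, not the model developed in the thesis: each hour is treated as wet with a fixed probability `p_wet` (a hypothetical parameter), wet hours receive exponentially distributed weights, and the weights are rescaled so the hourly values add up to the daily total. The rescaling step and the function name `disaggregate_day` are assumptions made for the example.

```python
import random

def disaggregate_day(daily_total_mm, p_wet=0.3, rng=None):
    """Split one daily precipitation total into 24 hourly values.

    Each hour is wet with probability `p_wet` (independent Bernoulli draws, so
    the number of wet hours is binomial). Wet hours get exponential weights,
    which are rescaled so the hourly values sum to the daily total.
    """
    rng = rng or random.Random()
    if daily_total_mm <= 0:
        return [0.0] * 24
    wet = [rng.random() < p_wet for _ in range(24)]
    if not any(wet):
        # A day with rain needs at least one wet hour; pick one at random.
        wet[rng.randrange(24)] = True
    weights = [rng.expovariate(1.0) if is_wet else 0.0 for is_wet in wet]
    scale = daily_total_mm / sum(weights)
    return [w * scale for w in weights]

hourly = disaggregate_day(12.0, rng=random.Random(42))
print(len(hourly), round(sum(hourly), 6))  # 24 values that conserve the daily total
```

Conserving the daily total keeps the disaggregated series consistent with the downscaled daily projections from step i, whichever disaggregation technique is used.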
The third part of the thesis is a case study of the impact of climate change on stormwater management. First, a stormwater management model (SWMM) of St. Catharines, Ontario, developed in a previous study, was selected to simulate its stormwater and sanitary system. The model was forced with downscaled and temporally disaggregated precipitation outputs of the Canadian Regional Climate Model at the Port Dalhousie station, simulated under emission scenario RCP8.5. The temporal disaggregation was done using the Fahad-Ousmane and the KNN (30) methods developed in the previous chapter. The impact of climate change on the frequency, volume, and quality of combined sewer overflows and other hydraulic parameters is examined. Results suggest an increase in the total volume, flow frequency percentage, maximum flow, and average flow in the stormwater system due to climate change. Therefore, adaptation measures should be implemented for the distribution network and wastewater treatment plant to convey and treat the wastewater resulting from wet and dry events.
|
14 |
The many faces of approximation in KNN graph computation / Les multiples facettes des approximations dans la construction de graphes KN
Ruas, Olivier, 17 December 2018
The sheer quantity of content available in online services makes content of interest extremely difficult to find. The most emblematic way to help users is item recommendation. The K-Nearest-Neighbors (KNN) graph connects each user to the k other users most similar to it, according to a given similarity metric. Computing an exact KNN graph takes prohibitively long in online services. Existing approaches approximate the set of candidates for each user's neighborhood to reduce computation time. In this thesis we push the notion of approximation further: we approximate each user's data, the similarity, and the data locality. The resulting approach is markedly faster than all the others.
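To see why the exact computation is costly, here is a minimal brute-force sketch of a KNN graph. It compares every user with every other user, so the number of similarity evaluations grows quadratically with the number of users. Jaccard similarity over item sets and the toy profiles are assumptions chosen for the example; the abstract only says "a given similarity metric".

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of item ids."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def exact_knn_graph(profiles, k):
    """Map each user to its k most similar other users (brute force)."""
    graph = {}
    for user, items in profiles.items():
        scored = [
            (jaccard(items, other_items), other)
            for other, other_items in profiles.items()
            if other != user
        ]
        # Sort by similarity (descending), breaking ties by user name.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        graph[user] = [other for _, other in scored[:k]]
    return graph

profiles = {
    "alice": {"a", "b", "c"},
    "bob": {"a", "b", "d"},
    "carol": {"x", "y"},
    "dave": {"b", "c"},
}
print(exact_knn_graph(profiles, k=2))
```

Approximate methods avoid this all-pairs loop, for instance by only comparing each user against a restricted candidate set.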
|
15 |
Forest inventory improvement based on satellite images / Miškų inventorizacijos tobulinimas kosminių vaizdų pagrindu
Jonikavičius, Donatas, 12 October 2012
The aim of the study is to improve the forest inventories ongoing in Lithuania using satellite images and GIS databases.
The specific objective of the study is to explore how information from satellite images and GIS databases, and the methods for processing it, can be used to determine various characteristics of Lithuanian forests, with a focus on a variety of forest inventory schemes.
The goals of the study:
1. To discuss methodological assumptions for the use of satellite images and GIS database information to estimate various characteristics of the Lithuanian forests.
2. To investigate methodological assumptions for the application of two-phase sampling scheme based on medium-resolution satellite images for the estimation of Lithuanian forest characteristics.
3. To investigate the possibilities of application of medium-resolution satellite images on the basis of two-phase sampling scheme in stand-wise, mature stands and pre-harvesting forest inventories.
4. To investigate methodological decisions and application peculiarities of fast detection of changes in the forest using medium-resolution satellite images under Lithuanian conditions.
Scientific novelty
The development of methodological background for the use of medium-resolution satellite images and two-phase sampling-based schemes in Lithuanian forest inventory.
The use of stand-wise forest inventory data as auxiliary information together with medium-resolution satellite images in two-phase sampling schemes for estimating forest characteristics...
|
16 |
運用kNN文字探勘分析智慧型終端App群集之研究 / The study of analyzing smart handheld device App's clusters by using kNN text mining
曾國傑 (Tseng, Kuo Chieh), Unknown Date
As smart handheld devices become increasingly widespread, user demand for Apps has grown, and major enterprises have created a new form of interactive marketing as a result. At the same time, the large business opportunity created by App downloads has drawn many developers into App development, causing the number of Apps to grow explosively, so that users facing such a wide variety of Apps cannot choose efficiently. This study therefore uses text mining and kNN cluster analysis to analyze App recommendation articles posted by netizens and to cluster the Apps. By adjusting parameters and evaluating measurement indicators, it aims to obtain the best-quality clustering to serve as a reference for users when choosing Apps.
To cluster a large number of Apps and so address users' "information overload", this study takes game Apps in the App Store as its subject. It collected 439 App recommendation articles and, according to whether they recommended the same App, merged them into 357 articles. Text mining techniques were then used to convert the articles into a mutually comparable vector space model, and kNN cluster analysis was used to cluster them. The best-quality clustering was sought by adjusting the value of k and the document similarity threshold, and clustering quality was evaluated with indicators such as the mean intra-cluster similarity. To improve clustering quality, the study adopts multi-stage clustering, using the number of articles in each cluster after clustering to decide whether to re-cluster or to merge clusters.
The results show that the first-stage clustering achieves the best quality at k = 10 and a document similarity threshold of 0.025. In subsequent stages, because the number of articles within each cluster decreases, k is lowered and the similarity threshold is gradually raised to obtain a clustering effect. After the second stage, key terms can be extracted from the clusters that have met the stopping condition, yielding 6 App types such as "baseball/shooting" and "throwing/flying"; later stages following the same rules yield 14 App types such as "tower defense". When clustering ends, 36 clusters and 20 App types are obtained. During the process, the mean intra-cluster similarity gradually increases while the mean inter-cluster similarity gradually decreases, and the clustering quality indicator rises from 12.65% after the first stage to 75.81% at the end of the fifth stage.
The study shows that, after clustering, highly similar Apps gradually gather into groups, and the names given to the resulting clusters can serve as a reference for users choosing Apps. App developers can also learn from each cluster's key terms which game elements users value, and improve App content to better meet user needs. Building on these results, future research directions include building a specialized lexicon to improve clustering quality, using document summarization to strengthen users' understanding of each cluster, and building an App recommendation system.
|
17 |
Consultas kNN em redes dependentes do tempo / KNN queries in time-dependent networks
Cruz, Lívia Almada, January 2013
CRUZ, Lívia Almada. Consultas kNN em redes dependentes do tempo. 2013. 75 f. Dissertação (Mestrado em ciência da computação) - Universidade Federal do Ceará, Fortaleza-CE, 2013.
In this dissertation we study the problem of processing k-nearest neighbours (kNN) queries in road networks considering the history of traffic conditions, in particular the case where the speed of moving objects is time-dependent. For instance, given that the user is at a given location at a certain time, the query returns the k points of interest (e.g., gas stations) that can be reached in the minimum amount of time. Previous solutions for kNN queries and other common queries in road networks do not work when the travel speed on each road is not constant. Building efficient and correct algorithms, together with storage and access schemes, for processing these queries is a challenge because graph properties assumed in static networks do not hold in the time-dependent case. Our approach uses the well-known A* search algorithm, applying incremental network expansion and pruning unpromising vertices. The goal is to reduce the percentage of the network assessed in the search. To support the algorithm's execution, we propose a storage and access method for time-dependent networks. We discuss the design and correctness of our algorithm and present experimental results on real and synthetic data that show the efficiency and effectiveness of our solution.
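A simplified sketch of a time-dependent kNN query is given below. It is not the dissertation's algorithm: it uses plain Dijkstra-style incremental network expansion without the A* heuristic or the proposed storage scheme, and it assumes FIFO edges (departing later never leads to arriving earlier). The graph, the `rush_hour` travel-time profile, and the function name `td_knn` are hypothetical; all times are in minutes since midnight.

```python
import heapq

def td_knn(graph, source, depart, pois, k):
    """Return the k points of interest with the earliest arrival times.

    `graph[u]` is a list of (v, travel_time) pairs, where travel_time(t) gives
    the edge cost in minutes when leaving u at time t. Assumes FIFO edges,
    which keeps Dijkstra-style expansion valid.
    """
    best = {source: depart}
    heap = [(depart, source)]
    settled, result = set(), []
    while heap and len(result) < k:
        t, u = heapq.heappop(heap)
        if u in settled:
            continue
        settled.add(u)
        if u in pois:
            result.append((u, t - depart))  # (POI, travel time in minutes)
        for v, travel_time in graph.get(u, []):
            arrival = t + travel_time(t)
            if v not in settled and arrival < best.get(v, float("inf")):
                best[v] = arrival
                heapq.heappush(heap, (arrival, v))
    return result

def rush_hour(t):
    # Hypothetical profile: 10 minutes between 08:00 and 09:00, else 3 minutes.
    return 10 if 480 <= t < 540 else 3

graph = {
    "q": [("a", lambda t: 2), ("b", rush_hour)],
    "a": [("g1", lambda t: 4)],
    "b": [("g2", lambda t: 1)],
}
print(td_knn(graph, "q", depart=480, pois={"g1", "g2"}, k=1))  # leaving at 08:00
print(td_knn(graph, "q", depart=600, pois={"g1", "g2"}, k=1))  # leaving at 10:00
```

The two calls return different nearest gas stations for the same location, which is exactly why answers computed on a static network cannot be reused when edge costs depend on the departure time.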
|
18 |
Identifiera känslig data inom ramen för GDPR : Med K-Nearest Neighbors [Identifying sensitive data within the scope of GDPR: with K-Nearest Neighbors]
Darborg, Alex, January 2018
The General Data Protection Regulation (GDPR) comes into effect on 25 May 2018. As a result, organizations face major decisions about how to identify sensitive data stored in databases. Meanwhile, machine learning is expanding in the software market. The goal of this project has been to develop a tool that can identify sensitive data through machine learning. The tool was developed using agile methods, and the work included comparisons of various algorithms and the development of a prototype, using tools such as Spyder and XAMPP. The results show that the developed software performs differently on different types of sensitive data. The kNN algorithm showed strong results when the sensitive data were ten-digit Swedish personal identity numbers (personnummer), phone numbers of ten or eleven digits starting with 46, 070, 072 or 076, and addresses. Regular expressions showed strong results for e-mail addresses and IP addresses.
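The regular-expression part of such a tool can be sketched as below. The patterns are deliberately simple assumptions written for this example (the thesis does not list its expressions), and the function name `find_sensitive` is hypothetical; production-grade validation of e-mail and IP addresses is stricter.

```python
import re

# Simplified patterns: an e-mail address, and an IPv4 address whose four
# octets are each limited to the range 0-255.
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
OCTET = r"(?:25[0-5]|2[0-4]\d|1?\d?\d)"
IPV4 = re.compile(rf"\b(?:{OCTET}\.){{3}}{OCTET}\b")

def find_sensitive(text):
    """Return e-mail addresses and IPv4 addresses found in `text`."""
    return {"email": EMAIL.findall(text), "ipv4": IPV4.findall(text)}

sample = "Contact anna@example.se from 192.168.0.12, not 999.1.1.1."
print(find_sensitive(sample))
```

Fixed-format data of this kind is a natural fit for pattern matching, whereas free-form values such as street addresses are harder to capture with a single expression.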
|
19 |
Microarray Data Analysis Tool (MAT)
Selvaraja, Sudarshan, January 2008
No description available.
|
20 |
A Hierarchical Multi-Output Nearest Neighbor Model for Multi-Output Dependence Learning
Morris, Richard Glenn, 08 March 2013
Multi-Output Dependence (MOD) learning is a generalization of standard classification problems that allows for multiple outputs that are dependent on each other. A primary issue that arises in the context of MOD learning is that for any given input pattern there can be multiple correct output patterns. This changes the learning task from function approximation to relation approximation. Previous algorithms do not consider this problem, and thus cannot be readily applied to MOD problems. To perform MOD learning, we introduce the Hierarchical Multi-Output Nearest Neighbor model (HMONN) that employs a basic learning model for each output and a modified nearest neighbor approach to refine the initial results. This paper focuses on tasks with nominal features, although HMONN has the initial capacity for solving MOD problems with real-valued features. Results obtained using UCI repository, synthetic, and business application data sets show improved accuracy over a baseline that treats each output as independent of all the others, with HMONN showing improvement that is statistically significant in the majority of cases.
|