  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
341

DETECTION OF CYBER ATTACKS ON POWER DISTRIBUTION SYSTEM USING QSVM

Urmisha Reddy Janak (20391372) 05 December 2024
As cyber threats evolve, Power Distribution Systems (PDS) face growing risks from sophisticated attacks like False Data Injection Attacks (FDIAs), which can disrupt system stability and reliability. This thesis presents a quantum-based approach using Quantum Support Vector Machines (QSVM) to detect and mitigate FDIAs in PDS. By leveraging quantum feature mapping, the QSVM model efficiently identifies subtle anomalies within high-dimensional data, enhancing the accuracy and speed of FDIA detection. The methodology includes the integration of an augmented Lagrangian function to further optimize detection performance. Validated using the IEEE-13 bus system, this QSVM framework showcases its potential as a robust, real-time detection tool for cybersecurity in smart grid infrastructures. The results underscore the promise of quantum computing in strengthening the resilience of critical energy systems.
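The quantum feature-mapping idea can be illustrated classically. The sketch below is not the thesis's implementation: the data, dimensions, bias offset, and the angle-encoding map are invented for illustration. It trains a standard SVM on a precomputed fidelity-style kernel, the squared inner product of angle-encoded feature vectors, to separate clean measurements from biased, FDIA-like ones:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Toy stand-in for PDS measurements: clean samples vs. samples carrying a
# small injected bias (the FDIA). Dimensions and offset are invented.
n, d = 200, 4
X = np.vstack([rng.normal(0.0, 1.0, (n, d)),
               rng.normal(0.0, 1.0, (n, d)) + 0.8])
y = np.array([0] * n + [1] * n)

def feature_map(x):
    # Angle encoding: each measurement becomes a point on the unit circle,
    # loosely mimicking single-qubit rotation encoding.
    return np.concatenate([np.cos(x), np.sin(x)])

def quantum_style_kernel(A, B):
    # Fidelity-like kernel: squared (normalised) inner product of the
    # mapped vectors; diagonal entries equal 1.
    FA = np.apply_along_axis(feature_map, 1, A)
    FB = np.apply_along_axis(feature_map, 1, B)
    return (FA @ FB.T / A.shape[1]) ** 2

K = quantum_style_kernel(X, X)
clf = SVC(kernel="precomputed").fit(K, y)
train_acc = clf.score(K, y)
```

A genuine QSVM would obtain this kernel from state overlaps evaluated on quantum hardware or a simulator; the surrounding classifier machinery is otherwise the same.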
342

Ensemble baseado em métodos de Kernel para reconhecimento biométrico multimodal / Ensemble Based on Kernel Methods for Multimodal Biometric Recognition

Costa, Daniel Moura Martins da 31 March 2016
With the advancement of technology, traditional strategies for identifying people have become more susceptible to failure; to overcome these difficulties, several approaches have been proposed in the literature, among which biometrics stands out. The field of biometrics encompasses a wide variety of technologies used to identify and verify a person's identity through the measurement and analysis of physical and/or behavioural traits. As a result, biometrics has a broad field of application in systems that require secure identification of their users. The most popular biometric systems are based on face or fingerprint recognition; other systems use the iris, retinal scans, voice, hand geometry, or facial thermograms. In recent years, biometric recognition has advanced in reliability and accuracy, with some modalities offering good overall performance. Nevertheless, even the most advanced biometric systems still face problems. Recently, efforts have been made to employ multiple biometric modalities so as to make the identification process less vulnerable to attacks. Multimodal biometrics is a relatively new approach that consolidates multiple biometric modalities, based on the idea that information obtained from different modalities is complementary: a suitable combination of such information can be more useful than information from any single modality alone. The main issues in building a unimodal biometric system concern the choice of feature-extraction techniques and of the classifier.
In a multimodal biometric system, beyond these issues, it is also necessary to define the fusion level and the fusion strategy to be adopted. The aim of this dissertation is to investigate the use of ensembles for fusing biometric modalities, considering different fusion strategies and drawing on advanced image-processing techniques (such as the Wavelet, Contourlet, and Curvelet transforms) and machine learning. Particular emphasis is given to the study of different types of learning machines based on kernel methods and their organization into ensemble arrangements, targeting biometric identification based on face and iris. The results show that the proposed approach can produce a multimodal biometric system with a recognition rate higher than those obtained by unimodal biometric systems.
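The score-level fusion of kernel machines described above can be sketched with one classifier per modality. Everything below, the feature dimensions, class separations, and the sum rule for fusion, is an illustrative assumption, not the dissertation's actual ensemble design:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)

# Hypothetical per-modality feature vectors for three enrolled subjects.
n_per, d_face, d_iris = 30, 6, 4
labels = np.repeat([0, 1, 2], n_per)
face = rng.normal(0, 1, (90, d_face)) + labels[:, None] * 1.5
iris = rng.normal(0, 1, (90, d_iris)) + labels[:, None] * 1.5

# One kernel machine per biometric modality.
clf_face = SVC(kernel="rbf", probability=True, random_state=0).fit(face, labels)
clf_iris = SVC(kernel="rbf", probability=True, random_state=0).fit(iris, labels)

# Score-level fusion by the sum rule: average the posterior estimates of
# the two modalities, then take the highest-scoring identity.
fused = (clf_face.predict_proba(face) + clf_iris.predict_proba(iris)) / 2
fused_acc = (fused.argmax(axis=1) == labels).mean()
```

The sum rule is one of several fusion strategies; product, max, and majority-vote rules slot into the same structure by replacing the averaging line.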
344

It’s a Match: Predicting Potential Buyers of Commercial Real Estate Using Machine Learning

Hellsing, Edvin, Klingberg, Joel January 2021
This thesis explores the development and potential effects of an intelligent decision support system (IDSS) for predicting potential buyers of commercial real estate. The overarching need for such an IDSS stems from information overload, which the system aims to reduce: by shortening the time needed to process data, more time can be spent making sense of the market together with colleagues. The system architecture explored consists of clustering commercial real estate buyers into groups based on their characteristics, and training a prediction model on historical transaction data for the Swedish market obtained from Lantmäteriet, the Swedish cadastral and land registration authority. The prediction model predicts which of the cluster groups is most likely to buy a given property. Three clustering algorithms were used and evaluated, one density-based, one centroid-based, and one hierarchical; the centroid-based model (K-means) performed best. For the predictions, three supervised machine learning algorithms were used and evaluated: Naive Bayes, Random Forests, and Support Vector Machines. The Random Forest model performed best, with an accuracy of 99.9%.
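The two-step architecture, first cluster the buyers, then predict the buying cluster for a property, can be sketched as follows. The buyer characteristics and property features are synthetic stand-ins; the thesis uses real transaction data from Lantmäteriet:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(2)

# Hypothetical buyer characteristics (e.g. portfolio size, capital, region),
# drawn as three loose groups.
buyers = np.vstack([
    rng.normal(0, 1, (50, 3)),
    rng.normal(5, 1, (50, 3)),
    rng.normal(10, 1, (50, 3)),
])

# Step 1: group buyers into clusters by their characteristics.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(buyers)

# Step 2: from historical transactions, learn which cluster buys which kind
# of property. Property features here are synthetic stand-ins, shifted by
# the buying cluster so the mapping is learnable.
props = rng.normal(0, 1, (150, 4)) + km.labels_[:, None]
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(props, km.labels_)
acc = rf.score(props, km.labels_)
```

In the real system the predicted cluster, rather than an individual buyer, is the output: it narrows the candidate set a broker must consider.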
345

Analysis of machine learning for human motion pattern  recognition on embedded devices / Analys av maskininlärning för igenkänning av mänskliga rörelser på inbyggda system

Fredriksson, Tomas, Svensson, Rickard January 2018
With an increasing number of connected devices and the recent surge of artificial intelligence, both technologies need more attention to fully bloom as useful tools for creating new and exciting products. As machine learning is traditionally implemented on computers and online servers, this thesis explores the possibility of extending machine learning to an embedded environment. This evaluation of existing machine learning on embedded systems with limited processing capabilities was carried out in the specific context of classifying basic human movements. Previous research and implementations indicate that this is possible with some limitations; this thesis aims to answer which hardware limitations affect classification and what classification accuracy the system can reach on an embedded device. The tests used human motion data from an existing dataset and compared four machine learning algorithms on three devices. Support Vector Machines (SVM) performed best compared to CART, Random Forest, and AdaBoost, reaching a classification accuracy of 84.69% across six included motions with a classification time of 16.88 ms per classification on a Cortex-M4 processor. This is the same classification accuracy as that obtained on the host computer, which has far greater computational capabilities. Other combinations of hardware and machine learning algorithms showed a slight decrease in classification accuracy and an increase in classification time. The conclusions are that memory on the embedded device limits which algorithms can be run and the complexity of the data that can be extracted as features, while processing speed mostly affects classification time. Additionally, the performance of the machine learning system depends on the type of data to be observed, so the performance of different setups differs with the use case.
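The evaluation loop, train a classifier on motion features and measure accuracy alongside per-classification latency, can be sketched as below. The synthetic features, class count, and separations are assumptions; timings reflect the host machine running the sketch, not a Cortex-M4:

```python
import time
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(3)

# Synthetic stand-in for windowed accelerometer features of six motions.
n_per, d, n_classes = 40, 8, 6
y = np.repeat(np.arange(n_classes), n_per)
X = rng.normal(0, 1, (n_per * n_classes, d)) + y[:, None]

clf = SVC(kernel="rbf").fit(X, y)
train_acc = clf.score(X, y)

# Per-sample classification time, the metric the thesis reports per device.
start = time.perf_counter()
for x in X[:50]:
    clf.predict(x.reshape(1, -1))
per_sample_ms = (time.perf_counter() - start) / 50 * 1000.0
```

On an MCU the same measurement would bracket the inference call with a cycle counter; the accuracy/latency trade-off across algorithms is what the thesis compares.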
346

Using Data-Driven Feasible Region Approximations to Handle Nonlinear Constraints When Applying CMA-ES to the Initial Margin Optimization Problem / Datadriven approximation av tillåtet område för att hantera icke-linjära bivillkor när CMA-ES används för att optimera initial margin

Wallström, Karl January 2021
The introduction of initial margin requirements for non-cleared OTC derivatives has made it possible to optimize initial margin across a network of trading participants. Applying CMA-ES, this thesis explores a new method for handling the nonlinear constraints present in the initial margin optimization problem. The idea behind the method, and the research question of the thesis, centre on leveraging data created during optimization: specifically, building a linear approximation of the feasible region using support vector machines and applying a repair strategy based on projection. The hypothesis was that repairing solutions would increase convergence speed. To answer the research question, a reference method was first created in which CMA-ES was combined with feasibility rules, referred to as CMA-FS. The proposed method of optimization data leveraging (ODL) was then added to CMA-FS, yielding CMA-ODL. Both algorithms were applied to a single initial margin optimization problem 100 times each, with different random seeds used for sampling within the optimization algorithms. The results show that CMA-ODL converged significantly faster than CMA-FS, without a significant negative effect on final objective values. Convergence was measured in iterations, not computational time; on average, CMA-ODL achieved a 5% increase in convergence speed. No significant difference was found between CMA-FS and CMA-ODL in the percentage of infeasible solutions generated. One reason for the lack of a reduction in violations may be how ODL is integrated with the CMA-ES algorithm: ODL makes more feasible solutions available during recombination, but because of the projection, the repaired solutions do not fully reflect the parameters actually used to generate that generation. The projection should also bias the algorithm towards the boundary of the feasible region. Still, the difference in convergence speed was significant. In conclusion, the proposed boundary-constraint-handling method increased performance, but it is not known whether it has major practical applicability, since only the number of iterations, and not computational time, was considered.
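The ODL idea, learn a linear approximation of the feasible region from points evaluated during the search and repair infeasible candidates by projecting them onto the learned boundary, can be sketched as below. The constraint function, the data, and the nudge constant are invented for illustration:

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(4)

# Hypothetical nonlinear constraint: feasible iff x0 + x1 + 0.5*sin(x0) <= 1.
X = rng.uniform(-4, 4, (500, 2))
feasible = (X[:, 0] + X[:, 1] + 0.5 * np.sin(X[:, 0]) <= 1.0).astype(int)

# Linear approximation of the feasible region, fitted to evaluated points.
svm = LinearSVC(C=1.0, max_iter=10000).fit(X, feasible)
w, b = svm.coef_[0], svm.intercept_[0]
approx_acc = svm.score(X, feasible)

def repair(x):
    # Project an infeasible candidate onto the hyperplane w.x + b = 0,
    # then nudge it slightly to the (approximated) feasible side.
    if w @ x + b >= 0:  # already on the feasible side of the model
        return x
    shift = (w @ x + b) / (w @ w)
    return x - (shift - 1e-6) * w

x_rep = repair(np.array([3.0, 3.0]))  # clearly infeasible point, repaired
```

In the full algorithm the SVM would be refitted as new evaluated points accumulate, and repaired candidates fed back into CMA-ES recombination, which is the source of the parameter-mismatch effect the thesis discusses.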
347

Characterisation and classification of protein sequences by using enhanced amino acid indices and signal processing-based methods

Chrysostomou, Charalambos January 2013
Protein sequencing has produced an overwhelming number of protein sequences, especially in the last decade. Nevertheless, the functional and structural classes of most proteins are still unknown, and the experimental methods currently used to determine these properties are expensive, laborious, and time-consuming. Automated computational methods are therefore urgently required to predict the functional and structural classes of proteins accurately and reliably. Several bioinformatics methods have been developed to determine such properties directly from sequence information. Methods that involve signal processing have recently become popular in bioinformatics; they have been investigated for the analysis of DNA and protein sequences and shown to be useful, generally helping to characterise the sequences better. However, various technical issues need to be addressed to overcome problems associated with applying signal processing methods to protein sequences. Amino acid indices, which are used to transform protein sequences into signals, have various applications and can represent diverse features of protein sequences and amino acids. As the majority of indices have similar features, this project proposes a new set of computationally derived indices that better represent the original group of indices. A study is also carried out that results in a unique and universal set of best-discriminating amino acid indices for the characterisation of allergenic proteins. This analysis extracts features directly from the protein sequences by using the Discrete Fourier Transform (DFT) to build a classification model for allergenic proteins based on Support Vector Machines (SVM). The proposed predictive model yields higher and more reliable accuracy than existing methods. A new method is proposed for performing multiple sequence alignment.
In this method, a DFT-based procedure is used to construct a new distance matrix, in combination with multiple amino acid indices that encode the protein sequences into numerical sequences. Additionally, a new type of substitution matrix is proposed in which the physicochemical similarities between any given amino acids are calculated. These similarities were computed from 25 selected amino acid indices, each representing a unique biological protein feature. The proposed multiple sequence alignment method yields a better and more reliable alignment than existing methods. To evaluate the complex information generated by the DFT, Complex Informational Spectrum Analysis (CISA) is developed and presented. As the results show, when protein classes present similarities or differences at the Common Frequency Peak (CFP) in specific amino acid indices, it is probable that these classes are related to the protein feature that the specific index represents. Using only the absolute spectrum in informational spectrum analysis of protein sequences is shown to be insufficient, as biologically related features can appear individually in either the real or the imaginary spectrum; this is successfully demonstrated in the analysis of influenza neuraminidase protein sequences. Upon identification of a new protein, it is important to single out the amino acids responsible for the structural and functional classification of the protein, as well as those contributing to its specific biological characterisation. In this work, a novel approach is presented to identify and quantify the relationship between individual amino acids and the protein, again demonstrated on influenza neuraminidase protein sequences.
The characterisation and identification problem for Influenza A virus protein sequences is tackled with a Subgroup Discovery (SD) algorithm, which can provide ancillary knowledge to experts. The main objective of the case study was to derive interpretable knowledge for the influenza A virus problem and consequently to better describe the relationships between subtypes of this virus. Finally, using DFT-based sequence-driven features, a Support Vector Machine (SVM) classification model was built and tested that yields higher predictive accuracy than SD. The methods developed and presented in this study yield promising results and can easily be applied to other proteomic problems.
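The core encoding-plus-DFT step can be sketched as follows. The short sequence is an invented excerpt and the hydrophobicity-style index values are illustrative; real analyses would draw indices from the AAindex database:

```python
import numpy as np

# Illustrative hydropathy-style amino acid index (one value per residue).
index = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
         "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8,
         "K": -3.9, "M": 1.9, "F": 2.8, "S": -0.8, "T": -0.7,
         "V": 4.2, "W": -0.9, "Y": -1.3, "P": -1.6, "Q": -3.5}

seq = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"  # hypothetical excerpt

# Encode the sequence as a numerical signal and remove the DC component.
signal = np.array([index[a] for a in seq], float)
signal -= signal.mean()

# DFT of the encoded sequence; the power spectrum is the "absolute
# spectrum", while CISA also keeps real and imaginary parts separately.
spectrum = np.fft.fft(signal)
power = np.abs(spectrum) ** 2
real_part, imag_part = spectrum.real, spectrum.imag

# Common frequency peak: strongest non-DC component in the half-spectrum.
peak = int(np.argmax(power[1: len(seq) // 2]) + 1)
```

Comparing `peak` positions (and the real/imaginary parts) across sequences encoded with the same index is the basis of informational spectrum comparison between protein classes.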
348

Textual data mining applications for industrial knowledge management solutions

Ur-Rahman, Nadeem January 2010
In recent years, knowledge has become an important resource for business, and many activities are required to manage knowledge resources well and help companies remain competitive in industrial environments. The data available in most industrial setups is complex in nature, and many different data formats may be generated to track the progress of projects, whether for developing new products or providing better services to customers. Knowledge discovery from databases requires considerable effort, and data mining techniques serve this purpose for structured data formats. If, however, the data is semi-structured or unstructured, the combined efforts of data and text mining technologies may be needed to produce useful results. This thesis focuses on discovering knowledge in semi-structured or unstructured data through textual data mining techniques that automate the classification of textual information into two categories or classes, which can then be used to help manage the knowledge held in multiple data formats. Applications of data mining techniques for discovering valuable information and knowledge in the manufacturing and construction industries are explored in a literature review, and the application of text mining techniques to semi-structured and unstructured data is discussed in detail. A novel integration of data and text mining tools is proposed in the form of a framework in which knowledge discovery and its refinement are performed through clustering and the Apriori association rule mining algorithm. Finally, the hypothesis of achieving better classification accuracy is examined by applying the methodology to case study data in the form of Post Project Review (PPR) reports.
The process of discovering useful knowledge, interpreting it, and utilising it has been automated to classify the textual data into two classes.
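The end classification step, textual data into two classes, can be sketched with a TF-IDF representation and a Naive Bayes classifier. The snippets below are invented stand-ins for PPR text, and the thesis's full framework additionally involves clustering and Apriori-based refinement:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy stand-ins for post-project-review snippets; the thesis works on real
# PPR reports, which are not reproduced here.
docs = [
    "schedule overrun caused by late supplier delivery",
    "budget exceeded due to unplanned rework",
    "delay in approvals pushed the milestone",
    "cost increase from material price changes",
    "team communication worked well throughout",
    "client praised the quality of the handover",
    "successful delivery ahead of schedule",
    "lessons captured and shared across projects",
]
labels = ["problem"] * 4 + ["success"] * 4

# TF-IDF features feeding a Naive Bayes text classifier.
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(docs, labels)
pred = model.predict(["the client praised the handover quality"])
```

Routing new review text into "problem" versus "success" classes in this way is what makes the accumulated PPR knowledge searchable and manageable.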
349

Реконфигурабилне архитектуре за хардверску акцелерацију предиктивних модела машинског учења / Rekonfigurabilne arhitekture za hardversku akceleraciju prediktivnih modela mašinskog učenja / Reconfigurable Architectures for Hardware Acceleration of Machine Learning Classifiers

Vranjković Vuk 02 July 2015
<p>This thesis proposes universal coarse-grained reconfigurable computing architectures for hardware implementation of decision trees (DTs), artificial neural networks (ANNs), support vector machines (SVMs), and homogeneous and heterogeneous ensemble classifiers (HHESs). Using these universal architectures, two versions of DTs, two versions of SVMs, two versions of ANNs, and seven versions of HHES machine learning classifiers have been implemented in field-programmable gate arrays (FPGAs). Experimental results, based on datasets from the standard UCI machine learning repository, show that the FPGA implementation provides a significant improvement (1&ndash;6 orders of magnitude) in average instance classification time compared with software implementations.</p>
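The dissertation realizes these classifiers directly in FPGA hardware; purely as a software-level illustration of the heterogeneous-ensemble idea (a hypothetical sketch using scikit-learn, not the thesis's reconfigurable architectures), a majority vote over the three model families can look like this:

```python
# Hypothetical software sketch of a heterogeneous ensemble (DT + SVM + ANN).
# The dissertation implements these model families in FPGA hardware; this
# scikit-learn version only illustrates the ensemble concept.
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a UCI-style dataset (the thesis's datasets are not reproduced here)
X, y = make_classification(n_samples=600, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Heterogeneous ensemble: hard majority vote over three different model families
ensemble = VotingClassifier(estimators=[
    ("dt", DecisionTreeClassifier(max_depth=5, random_state=0)),
    ("svm", SVC(kernel="rbf", random_state=0)),
    ("ann", MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0)),
])
ensemble.fit(X_tr, y_tr)
accuracy = ensemble.score(X_te, y_te)
print(f"ensemble accuracy: {accuracy:.3f}")
```

The hardware gain reported in the abstract comes from evaluating all base models in parallel on the chip, which a sequential software loop like the one above cannot match.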
350

A Study of Several Statistical Methods for Classification with Application to Microbial Source Tracking

Zhong, Xiao 30 April 2004 (has links)
With the advent of computers and the information age, the vast amounts of data generated across science and industry demand further statistical exploration. In particular, statistical and computational problems in biology and medicine have created the new field of bioinformatics, which is attracting more and more statisticians, computer scientists, and biologists. Several procedures have been developed for tracing the source of fecal pollution in water resources based on certain characteristics of certain microorganisms. This collection of techniques has been termed microbial source tracking (MST). Most current methods for MST are based on patterns of either phenotypic or genotypic variation in indicator organisms. Studies have also suggested that patterns of genotypic variation may be more reliable because they are less associated with environmental factors than patterns of phenotypic variation. Among the genotypic methods for source tracking, fingerprinting via rep-PCR is the most common. Thus, identifying the specific pollution sources in contaminated waters based on rep-PCR fingerprinting techniques, viewed as a classification problem, has become an increasingly popular research topic in bioinformatics. In this project, several statistical methods for classification were studied: linear discriminant analysis, quadratic discriminant analysis, logistic regression, k-nearest-neighbor rules, neural networks, and support vector machines. This project report summarizes each of these methods and the relevant statistical theory. In addition, an application of these methods to a particular set of MST data is presented and comparisons are made.
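The comparison the report describes can be sketched in scikit-learn (a hypothetical illustration: synthetic data stands in for the rep-PCR fingerprint dataset, which is not reproduced here):

```python
# Hypothetical sketch: cross-validated comparison of the six classifier
# families named in the report, on synthetic data standing in for
# rep-PCR fingerprint profiles (classes = candidate pollution sources).
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis, QuadraticDiscriminantAnalysis)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# Each row is a numeric "fingerprint"; each label is a pollution source class
X, y = make_classification(n_samples=400, n_features=20, n_informative=8,
                           n_classes=3, random_state=1)

models = {
    "LDA": LinearDiscriminantAnalysis(),
    "QDA": QuadraticDiscriminantAnalysis(),
    "logistic": LogisticRegression(max_iter=2000),
    "kNN": KNeighborsClassifier(n_neighbors=5),
    "ANN": MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=1),
    "SVM": SVC(kernel="rbf"),
}

# 5-fold cross-validation gives each method a comparable accuracy estimate
scores = {name: cross_val_score(m, X, y, cv=5).mean() for name, m in models.items()}
for name, acc in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name:9s} mean CV accuracy: {acc:.3f}")
```

On real MST data the ranking would depend on the fingerprint encoding and class balance, which is precisely the kind of comparison the report undertakes.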
