41 |
Automatic Patent Classification
Yehe, Nala January 2020 (has links)
Patents have great research value and are beneficial to the industrial, commercial, legal, and policymaking communities. Effective analysis of patent literature can reveal important technical details and relationships, explain business trends, propose novel industrial solutions, and inform crucial investment decisions. Therefore, we should carefully analyze patent documents and exploit the value of patents. Generally, patent analysts need a certain degree of expertise in various research fields, including information retrieval, data processing, text mining, field-specific technology, and business intelligence. In practice, it is difficult to find and train such an analyst within a relatively short period of time so that he or she meets the requirements of multiple disciplines. Patent classification is also crucial in processing patent applications because it empowers people to manage and maintain patent texts better and more flexibly. In recent years, the number of patents worldwide has increased dramatically, which makes it very important to design an automatic patent classification system. Such a system can replace time-consuming manual classification, thus providing patent analysis managers with an effective method of managing patent texts. This paper designs a patent classification system based on data mining methods and machine learning techniques and uses KNIME software to conduct a comparative analysis. The research applies different machine learning methods to different parts of a patent. The purpose of this thesis is to use text data processing methods and machine learning techniques to classify patents automatically. It mainly includes two parts: the first is data preprocessing and the second is the application of machine learning techniques. The research questions include: Which part of a patent performs best as input data for automatic classification?
And which of the implemented machine learning algorithms performs best regarding the classification of IPC keywords? This thesis uses design science research as its method to study and analyze this topic. It uses the KNIME platform to apply the machine learning techniques, which include decision tree, XGBoost linear, XGBoost tree, SVM, and random forest. The implementation includes data collection, data preprocessing, feature word extraction, and the application of classification techniques. A patent document consists of many parts, such as the description, abstract, and claims. In this thesis, we feed these three input groups separately to our models and then compare the performance of the three parts. Based on the results of these three experiments and their comparison, we suggest using the description part in the classification system because it shows the best performance in English patent text classification. The abstract can serve as an auxiliary standard for classification. However, classification based on the claims part, as proposed by some scholars, did not achieve good performance in our research. In addition, the BoW and TF-IDF methods can be used together to extract feature words efficiently. We also found that the SVM and XGBoost techniques perform better in the automatic patent classification system in our research.
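As an illustration of the kind of pipeline this abstract describes, the sketch below combines TF-IDF features with a linear SVM in scikit-learn. The tiny corpus, its labels, and the IPC-style codes are invented for illustration only; the thesis itself works in KNIME rather than Python.

```python
# Hedged sketch: TF-IDF text features feeding a linear SVM, the kind of
# BoW/TF-IDF + SVM pipeline the abstract describes. Corpus and IPC-style
# labels below are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

docs = [
    "a rotor blade assembly for a wind turbine",
    "turbine housing with cooled rotor shaft",
    "pharmaceutical composition comprising an antibody",
    "antibody formulation for subcutaneous injection",
]
labels = ["F03D", "F03D", "A61K", "A61K"]  # hypothetical IPC codes

clf = make_pipeline(TfidfVectorizer(), LinearSVC())
clf.fit(docs, labels)

# Classify an unseen snippet; shared vocabulary drives the decision.
pred = clf.predict(["cooling a turbine rotor"])[0]
```

In a real system the documents would be full description, abstract, or claims sections, and the label set would span many IPC classes.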
|
42 |
Performance Benchmarking and Cost Analysis of Machine Learning Techniques : An Investigation into Traditional and State-Of-The-Art Models in Business Operations / Prestandajämförelse och kostnadsanalys av maskininlärningstekniker : en undersökning av traditionella och toppmoderna modeller inom affärsverksamhet
Lundgren, Jacob, Taheri, Sam January 2023 (has links)
Eftersom samhället blir allt mer datadrivet revolutionerar användningen av AI och maskininlärning sättet företag fungerar och utvecklas på. Denna studie utforskar användningen av AI, Big Data och Natural Language Processing (NLP) för att förbättra affärsverksamhet och intelligens i företag. Huvudsyftet med denna avhandling är att undersöka om den nuvarande klassificeringsprocessen hos värdorganisationen kan upprätthållas med minskade driftskostnader, särskilt lägre moln-GPU-kostnader. Detta har potential att förbättra klassificeringsmetoden, förbättra produkten som företaget erbjuder sina kunder på grund av ökad klassificeringsnoggrannhet och stärka deras värdeerbjudande. Vidare utvärderas tre tillvägagångssätt mot varandra och implementationerna visar utvecklingen inom området. Modellerna som jämförs i denna studie inkluderar traditionella maskininlärningsmetoder som Support Vector Machine (SVM) och Logistisk Regression, tillsammans med state-of-the-art transformermodeller som BERT, både Pre-Trained och Fine-Tuned. Artikeln visar att det finns en avvägning mellan prestanda och kostnad vilket illustrerar problemet som många företag, som Valu8, står inför när de utvärderar vilket tillvägagångssätt de ska implementera. Denna avvägning diskuteras och analyseras sedan mer detaljerat för att utforska möjliga kompromisser från varje perspektiv i ett försök att hitta en balanserad lösning som kombinerar prestandaeffektivitet och kostnadseffektivitet. / As society is becoming more data-driven, Artificial Intelligence (AI) and Machine Learning are revolutionizing how companies operate and evolve. This study explores the use of AI, Big Data, and Natural Language Processing (NLP) in improving business operations and intelligence in enterprises. The primary objective of this thesis is to examine if the current classification process at the host company can be maintained with reduced operating costs, specifically lower cloud GPU costs. 
This can improve the classification method, enhance the product the company offers its customers due to increased classification accuracy, and strengthen its value proposition. Furthermore, three approaches are evaluated against each other, and the implementations showcase the evolution within the field. The models compared in this study include traditional machine learning methods such as Support Vector Machine (SVM) and Logistic Regression, alongside state-of-the-art transformer models like BERT, both Pre-Trained and Fine-Tuned. The paper shows a trade-off between performance and cost, illustrating the problem many companies, such as Valu8, face when evaluating which approach to implement. This trade-off is discussed and analyzed in further detail to explore possible compromises from each perspective, in an attempt to strike a balanced solution that combines performance efficiency and cost-effectiveness.
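The cheap end of the trade-off described above can be sketched as a classical baseline whose training cost is measured directly, in contrast to GPU-bound transformer fine-tuning. The synthetic features below merely stand in for TF-IDF vectors; this is not the host company's data or the thesis's actual benchmark.

```python
# Hedged sketch of a cheap classical baseline: logistic regression on
# dense features, with wall-clock training time as a crude cost measure.
# The data is synthetic and stands in for TF-IDF features.
import time
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 300))           # stand-in for TF-IDF vectors
y = (X[:, 0] + X[:, 1] > 0).astype(int)    # synthetic labels

t0 = time.perf_counter()
model = LogisticRegression(max_iter=1000).fit(X, y)
train_seconds = time.perf_counter() - t0
accuracy = model.score(X, y)
```

On a benchmark like the thesis's, this training time would sit on one axis of the cost/performance trade-off, with transformer fine-tuning time (and GPU cost) on the other.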
|
43 |
Grön AI : En analys av maskininlärningsalgoritmers prestanda och energiförbrukning
Berglin, Caroline, Ellström, Julia January 2024 (has links)
Trots de framsteg som gjorts inom artificiell intelligens (AI) och maskininlärning (ML), uppkommer utmaningar gällande deras miljöpåverkan. Fokuset på att skapa avancerade och träffsäkra modeller innebär ofta att omfattande beräkningsresurser krävs, vilket leder till en hög energiförbrukning. Syftet med detta arbete är att undersöka ämnet grön AI och sambandet mellan prestanda och energiförbrukning hos två ML-algoritmer. De algoritmer som undersöks är beslutsträd och stödvektormaskin (SVM), med hjälp av två dataset: Bank Marketing och MNIST. Prestandan mäts med utvärderingsmåtten noggrannhet, precision, recall och F1-poäng, medan energiförbrukningen mäts med verktyget Intel VTune Profiler. Arbetets resultat visar att en högre prestanda resulterade i en högre energiförbrukning, där SVM presterade bäst men också förbrukade mest energi i samtliga tester. Vidare visar resultatet att optimering av modellerna resulterade både i en förbättrad prestanda men också i en ökad energiförbrukning. Samma resultat kunde ses när ett större dataset användes. Arbetet anses inte bidra med resultat eller riktlinjer som går att generalisera till andra arbeten. Däremot bidrar arbetet med en förståelse och medvetenhet kring miljöaspekterna gällande AI, vilket kan användas som en grund för att undersöka ämnet vidare. Genom en ökad medvetenhet kan ett gemensamt ansvar tas för att utveckla AI-lösningar som inte bara är kraftfulla och effektiva, utan också hållbara. / Despite the advancements made in artificial intelligence (AI) and machine learning (ML), challenges regarding their environmental impact arise. The focus on creating advanced and accurate models often requires extensive computational resources, leading to a high energy consumption. The purpose of this work is to explore the topic of green AI and the relationship between performance and energy consumption of two ML algorithms. 
The algorithms being evaluated are decision trees and support vector machines (SVM), using two datasets: Bank Marketing and MNIST. Performance is measured using the evaluation metrics accuracy, precision, recall, and F1-score, while energy consumption is measured using the Intel VTune Profiler tool. The results show that higher performance resulted in higher energy consumption, with SVM performing the best but also consuming the most energy in all tests. Furthermore, the results show that optimizing the models resulted in both improved performance and increased energy consumption. The same results were observed when a larger dataset was used. This work is not considered to provide results or guidelines that can be generalized to other studies. However, it contributes to an understanding and awareness of the environmental aspects of AI, which can serve as a foundation for further exploration of the topic. Through increased awareness, shared responsibility can be taken to develop AI solutions that are not only powerful and efficient but also sustainable.
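The comparison described above can be sketched as follows, using training time as a crude stand-in for the energy that Intel VTune Profiler actually measures. The dataset and hyperparameters are invented for illustration and do not reproduce the thesis's experiments.

```python
# Hedged sketch: compare a decision tree and an SVM on accuracy and
# training time (a rough proxy for energy use, which the thesis measures
# properly with Intel VTune Profiler). Synthetic data, invented settings.
import time
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=3000, n_features=20, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

results = {}
for name, model in [("tree", DecisionTreeClassifier(random_state=0)),
                    ("svm", SVC())]:
    t0 = time.perf_counter()
    model.fit(Xtr, ytr)
    results[name] = {"seconds": time.perf_counter() - t0,
                     "accuracy": model.score(Xte, yte)}
```

The thesis's observation, that higher accuracy tends to come with higher resource consumption, would show up here as the better-scoring model also taking longer to train.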
|
44 |
Geotechnical Site Characterization And Liquefaction Evaluation Using Intelligent Models
Samui, Pijush 02 1900 (has links)
Site characterization is an important task in Geotechnical Engineering. In situ tests based on the standard penetration test (SPT), cone penetration test (CPT), and shear wave velocity survey are popular among geotechnical engineers. Site characterization using any of these properties based on a finite number of in-situ test data is an imperative task in probabilistic site characterization. These methods have been used to design future soil sampling programs for the site and to specify the soil stratification. It is never possible to know the geotechnical properties at every location beneath an actual site because, in order to do so, one would need to sample and/or test the entire subsurface profile. Therefore, the main objective of site characterization models is to predict subsurface soil properties with minimum in-situ test data. Predicting soil properties is a difficult task due to uncertainties. Spatial variability, measurement 'noise', measurement and model bias, and statistical error due to limited measurements are the sources of these uncertainties.
Liquefaction in soil is another major problem in geotechnical earthquake engineering. It is defined as the transformation of a granular material from a solid to a liquefied state as a consequence of increased pore-water pressure and reduced effective stress. The generation of excess pore pressure under undrained loading conditions is a hallmark of all liquefaction phenomena. This phenomenon was brought to the attention of engineers especially after the Niigata (1964) and Alaska (1964) earthquakes. Liquefaction can cause building settlement or tipping, sand boils, ground cracks, landslides, dam instability, highway embankment failures, and other hazards. Such damages are generally of great concern to public safety and are of economic significance. Site-specific evaluation of the liquefaction susceptibility of sandy and silty soils is the first step in liquefaction hazard assessment. Many methods (intelligent models as well as simple methods such as that of Seed and Idriss, 1971) have been proposed to evaluate liquefaction susceptibility based on large datasets from sites where soil has or has not liquefied.
The rapid advance in information processing systems in recent decades has directed engineering research towards the development of intelligent models that can model natural phenomena automatically. In an intelligent model, a process of training is used to build up a model of the particular system, from which it is hoped to deduce responses of the system for situations that have yet to be observed. Intelligent models learn the input-output relationship from the data itself. The quantity and quality of the data govern the performance of an intelligent model. The objective of this study is to develop intelligent models [geostatistics, artificial neural network (ANN), and support vector machine (SVM)] to estimate the corrected standard penetration test (SPT) value, Nc, in the three-dimensional (3D) subsurface of Bangalore. The database consists of 766 boreholes spread over a 220 sq km area, with several SPT N values (uncorrected blow counts) in each of them. There are a total of 3015 N values in the 3D subsurface of Bangalore. To obtain the corrected blow counts, Nc, various corrections, such as for overburden stress, borehole size, sampler type, hammer energy, and connecting rod length, have been applied to the raw N values. Using this large database of Nc values in the 3D subsurface of Bangalore, three geostatistical models (simple kriging, ordinary kriging, and disjunctive kriging) have been developed. Simple and ordinary kriging produce linear estimators, whereas disjunctive kriging produces a nonlinear estimator. The knowledge of the semivariogram of the Nc data is used in kriging theory to estimate values at points in the subsurface of Bangalore where field measurements are not available. The capability of disjunctive kriging to be a nonlinear estimator and an estimator of the conditional probability is explored. A cross-validation (Q1 and Q2) analysis is also performed for the developed simple, ordinary, and disjunctive kriging models.
The results indicate that the performance of the disjunctive kriging model is better than that of the simple and ordinary kriging models.
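Ordinary kriging, as used above, can be sketched in a few lines of numpy: solve the kriging system built from a covariance model and the sum-to-one constraint, then take the weighted sum of the observations. The exponential covariance, its parameters, and the SPT values below are invented for illustration and are not the semivariogram fitted in the thesis.

```python
# Hedged sketch of ordinary kriging with an assumed exponential
# covariance model. Parameters and data are invented, not the thesis's.
import numpy as np

def ordinary_kriging(coords, values, target, sill=1.0, corr_len=50.0):
    """Ordinary-kriging estimate at `target` from observed `values`."""
    def cov(h):
        return sill * np.exp(-h / corr_len)   # assumed covariance model
    n = len(values)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    # Kriging system: covariances plus the unbiasedness constraint row.
    K = np.empty((n + 1, n + 1))
    K[:n, :n] = cov(d)
    K[n, :n] = K[:n, n] = 1.0
    K[n, n] = 0.0
    rhs = np.append(cov(np.linalg.norm(coords - target, axis=1)), 1.0)
    w = np.linalg.solve(K, rhs)
    return w[:n] @ values                      # weighted estimate

coords = np.array([[0.0, 0.0], [100.0, 0.0], [0.0, 100.0], [100.0, 100.0]])
nc = np.array([12.0, 18.0, 15.0, 21.0])        # hypothetical Nc values
estimate = ordinary_kriging(coords, nc, np.array([50.0, 50.0]))
```

At the symmetric centre point the four weights are equal, so the estimate reduces to the mean of the observations; at other points the weights favour nearby boreholes.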
This study also describes two ANN modelling techniques applied to predict Nc at any point in the 3D subsurface of Bangalore. The first technique uses a four-layered feed-forward backpropagation (BP) model to approximate the function Nc = f(x, y, z), where x, y, z are the coordinates in the 3D subsurface of Bangalore. The second technique uses a generalized regression neural network (GRNN) that is trained with suitable spread(s) to approximate the same function. In the BP model, the transfer functions used in the first and second hidden layers are tansig and logsig, respectively, and the logsig transfer function is used in the output layer. The maximum number of epochs was set to 30000. A Levenberg-Marquardt algorithm has been used for the BP model. The performance of the models obtained using both techniques is assessed in terms of prediction accuracy. The BP ANN model outperforms the GRNN model and all kriging models.
An SVM model, which is firmly grounded in statistical learning theory and uses a regression technique based on the ε-insensitive loss function, has also been adopted to predict Nc at any point in the 3D subsurface of Bangalore. The SVM implements the structural risk minimization principle (SRMP), which has been shown to be superior to the more traditional empirical risk minimization principle (ERMP) employed by many other modelling techniques. The present study also highlights the advantage of SVM over the developed geostatistical models (simple kriging, ordinary kriging, and disjunctive kriging) and ANN models.
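Support vector regression with the ε-insensitive loss, as described above, can be sketched with scikit-learn's SVR on a task shaped like Nc = f(x, y, z). The synthetic coordinates, target function, and hyperparameters are assumptions for illustration; the thesis's data and tuning are not reproduced.

```python
# Hedged sketch: epsilon-insensitive support vector regression on
# synthetic (x, y, z) coordinates standing in for borehole locations.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 3))          # stand-in coordinates
y = 5 + X[:, 0] + 0.5 * X[:, 2] + rng.normal(scale=0.2, size=200)

# epsilon sets the width of the insensitive tube: errors smaller than
# epsilon incur no loss, which is what makes the solution sparse.
model = SVR(kernel="rbf", C=10.0, epsilon=0.1)
model.fit(X, y)
r2 = model.score(X, y)
```

Only the points lying on or outside the ε-tube become support vectors, so the fitted model typically uses a subset of the 200 training points.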
Further in this thesis, liquefaction susceptibility is evaluated from SPT, CPT, and Vs data using BP-ANN and SVM. Intelligent models (based on ANN and SVM) are developed for the prediction of liquefaction susceptibility using SPT data from the 1999 Chi-Chi earthquake, Taiwan. Two models (MODEL I and MODEL II) are developed. The SPT data from the work of Hwang and Yang (2001) has been used for this purpose. In MODEL I, the cyclic stress ratio (CSR) and corrected SPT values (N1)60 have been used to predict liquefaction susceptibility. In MODEL II, only the peak ground acceleration (PGA) and (N1)60 have been used. Further, the generalization capability of MODEL II has been examined using different case histories available globally (global SPT data) from the work of Goh (1994).
This study also examines the capabilities of ANN and SVM to predict the liquefaction susceptibility of soils from CPT data obtained from the 1999 Chi-Chi earthquake, Taiwan. For the determination of liquefaction susceptibility, both ANN and SVM use the classification technique. The CPT data has been taken from the work of Ku et al. (2004). In MODEL I, cone tip resistance (qc) and CSR values have been used to predict liquefaction susceptibility (using both ANN and SVM). In MODEL II, only PGA and qc have been used. Further, the developed MODEL II has also been applied to different case histories available globally (global CPT data) from the work of Goh (1996).
Intelligent models (ANN and SVM) have also been adopted for liquefaction susceptibility prediction based on shear wave velocity (Vs). The Vs data has been collected from the work of Andrus and Stokoe (1997). The same procedures as for SPT and CPT have also been applied to Vs.
SVM outperforms ANN for all three models based on SPT, CPT, and Vs data. The CPT method gives better results than SPT and Vs for both ANN and SVM models. For CPT and SPT, two input parameters {PGA and qc or (N1)60} are sufficient to determine liquefaction susceptibility using the SVM model.
In this study, an attempt has also been made to evaluate geotechnical site characterization by carrying out in situ tests using different techniques, namely CPT, SPT, and multichannel analysis of surface waves (MASW). For this purpose, a typical site was selected comprising both a man-made homogeneous embankment and natural ground. For this site, in situ tests (SPT, CPT, and MASW) have been carried out under different ground conditions and the obtained results compared. Three continuous CPT profiles, fifty-four SPT tests, and nine MASW profiles with depth have been carried out at the selected site, covering both the homogeneous embankment and the natural ground. Relationships have been developed between Vs, (N1)60, and qc values for this specific site. From the limited test results, it was found that there is a good correlation between qc and Vs. Liquefaction susceptibility is evaluated from the in situ test data, (N1)60, qc, and Vs, using the ANN and SVM models, and has been shown to compare well with the Idriss and Boulanger (2004) approach based on SPT data.
An SVM model has also been adopted to determine the overconsolidation ratio (OCR) based on piezocone data. A sensitivity analysis has been performed to investigate the relative importance of each input parameter. The SVM model outperforms all available methods for OCR prediction.
|
45 |
Análise e classificação de imagens de lesões da pele por atributos de cor, forma e textura utilizando máquina de vetor de suporte
Soares, Heliana Bezerra 22 February 2008
Previous issue date: 2008-02-22 / Conselho Nacional de Desenvolvimento Científico e Tecnológico / Skin cancer is the most common of all cancers, and the increase in its incidence is due, in part, to people's behavior with respect to sun exposure. In Brazil, non-melanoma skin cancer is the most frequent in most regions. Dermatoscopy and videodermatoscopy are the main types of examination for diagnosing dermatological skin diseases.
The field involving the use of computational tools to support or follow up medical diagnosis of dermatological lesions is still very recent. Several methods have been proposed for the automatic classification of skin pathologies from images. The present work aims to present a new intelligent methodology for the analysis and classification of skin cancer images, based on digital image processing techniques for the extraction of color, shape, and texture features, using the Wavelet Packet Transform (WPT) and the machine learning technique called Support Vector Machine (SVM). The Wavelet Packet Transform is applied for the extraction of texture features from the images. The WPT consists of a set of basis functions that represents the image in different frequency bands, each with a distinct resolution corresponding to each scale. Moreover, color features of the lesion are also computed; these depend on a visual context and are influenced by the surrounding colors. Shape attributes are obtained through Fourier descriptors. The Support Vector Machine, which is based on the structural risk minimization principle from statistical learning theory, is used for the classification task. The SVM aims to construct optimal hyperplanes that represent the separation between classes. The generated hyperplane is determined by a subset of the points of the classes, called support vectors. For the database used in this work, the results showed good performance, with a global accuracy of 92.73% for melanoma and 86% for non-melanoma and benign lesions. The extracted descriptors combined with the SVM classifier yielded a method capable of recognizing and classifying the analyzed skin lesions. / O câncer de pele é o mais comum de todos os cânceres e o aumento da sua incidência deve-se, em parte, ao comportamento das pessoas em relação à exposição ao sol. No Brasil, o câncer de pele não melanoma é
o mais incidente na maioria das regiões. A dermatoscopia e videodermatoscopia são os principais tipos de exames para o diagnóstico de doenças dermatológicas da pele. O campo que envolve o uso de ferramentas computacionais para o auxílio ou acompanhamento do diagnóstico médico em lesões dermatológicas ainda é visto como muito recente. Vários métodos foram propostos para classificação automática de patologias da pele utilizando imagens. O presente trabalho tem como objetivo apresentar uma nova metodologia inteligente para análise e classificação de imagens de câncer de pele, baseada nas técnicas de processamento digital de imagens para extração de características de cor, forma e textura, utilizando a Transformada Wavelet Packet (TWP) e a técnica de aprendizado de máquina denominada Máquina de Vetor de Suporte (SVM, Support Vector Machine). A Transformada Wavelet Packet é aplicada para extração de características de textura nas imagens. Esta consiste de um conjunto de funções base que representa a imagem em diferentes bandas de frequência, cada uma com resoluções distintas correspondentes a cada escala. Além disso, são calculadas também as características de cor da lesão, que são dependentes de um contexto visual, influenciadas pelas cores existentes em sua volta, e os atributos de forma através dos descritores de Fourier. Para a tarefa de classificação é utilizada a Máquina de Vetor de Suporte, que se baseia nos princípios da minimização do risco estrutural, proveniente da teoria do aprendizado estatístico. A SVM tem como objetivo construir hiperplanos ótimos que apresentem a maior margem de separação entre classes. O hiperplano gerado é determinado por um subconjunto dos pontos das classes, chamado vetores de suporte. Para o banco de dados utilizado neste trabalho, os resultados apresentaram um bom desempenho, obtendo um acerto global de 92,73% para melanoma e 86% para lesões não-melanoma e benignas.
O potencial dos descritores extraídos aliados ao classificador SVM tornou o método capaz de reconhecer e classificar as lesões analisadas
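A minimal sketch of wavelet-style texture features: a single-level 2D Haar decomposition whose sub-band energies distinguish smooth from noisy texture. The full method above uses a multi-level Wavelet Packet Transform plus color and Fourier shape descriptors; this toy version, with invented test images, illustrates only the texture part.

```python
# Hedged sketch: one-level 2D Haar decomposition and sub-band energies,
# a simplified stand-in for the Wavelet Packet texture features.
import numpy as np

def haar2d(img):
    """Split an even-sized image into LL, LH, HL, HH Haar sub-bands."""
    a = (img[0::2, :] + img[1::2, :]) / 2.0   # row averages
    d = (img[0::2, :] - img[1::2, :]) / 2.0   # row differences
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return ll, lh, hl, hh

def texture_features(img):
    """Mean energy per sub-band: a WPT-style texture descriptor."""
    return np.array([np.mean(b ** 2) for b in haar2d(img)])

rng = np.random.default_rng(0)
smooth = rng.normal(size=(8, 8)).cumsum(axis=0).cumsum(axis=1)  # correlated
noisy = rng.normal(scale=5.0, size=(64, 64))                    # high-frequency

f_smooth = texture_features(smooth)
f_noisy = texture_features(noisy)
```

In a pipeline like the one described, such energy vectors (computed over several decomposition levels) would be concatenated with color and Fourier shape descriptors and fed to the SVM.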
|
46 |
Sentiment-Driven Topic Analysis Of Song Lyrics
Sharma, Govind 08 1900 (has links) (PDF)
Sentiment Analysis is an area of Computer Science that deals with the impact a document makes on a user. The field is further subdivided into Opinion Mining and Emotion Analysis, the latter of which is the basis for the present work. Work on songs is aimed at building affective interactive applications such as music recommendation engines. Using song lyrics, we are interested in both supervised and unsupervised analyses, each of which has its own pros and cons.
For an unsupervised analysis (clustering), we use a standard probabilistic topic model called Latent Dirichlet Allocation (LDA). It mines topics from songs; topics are simply probability distributions over the vocabulary of words. Some of the topics appear sentiment-based, motivating us to continue with this approach. We evaluate our clusters using a gold dataset collected from an appropriate website and obtain positive results. This approach would be useful in the absence of a labelled dataset.
In another part of our work, we argue that supervision is inescapable, in that the topics returned must be analysed manually. Further, we have also used explicit supervision in the form of a training dataset for a classifier to learn sentiment-specific classes. This analysis helps reduce dimensionality and improve classification accuracy. We obtain excellent dimensionality reduction using Support Vector Machines (SVM) for feature selection. For re-classification, we use the Naive Bayes Classifier (NBC) and SVM, both of which perform well. We also use Non-negative Matrix Factorization (NMF) for classification, but observe that the results coincide with those of NBC, with no exceptions. This drives us towards establishing a theoretical equivalence between the two.
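The unsupervised stage can be sketched with scikit-learn's LDA on a toy lyrics corpus. The corpus and its vocabulary below are invented for illustration; the thesis's dataset and modelling choices are not reproduced.

```python
# Hedged sketch: LDA topic mixtures for a tiny invented lyrics corpus.
# Each row of doc_topics is a probability distribution over topics.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

lyrics = [
    "love heart kiss love tonight",
    "heart love baby kiss mine",
    "war fire blood rage war",
    "blood rage fire fight war",
]
vec = CountVectorizer()
X = vec.fit_transform(lyrics)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(X)   # shape: (n_docs, n_topics)
```

Inspecting `lda.components_` gives, per topic, the (unnormalized) weight of each vocabulary word, which is the manual analysis step the abstract argues is unavoidable.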
|
47 |
Anomaly Detection With Machine Learning In Astronomical Images
Etsebeth, Verlon January 2020 (has links)
Masters of Science / Observations that push the boundaries have historically fuelled scientific breakthroughs, and these observations frequently involve phenomena that were previously unseen and unidentified. Data sets have increased in size and quality as modern technology advances at a record pace. Finding these elusive phenomena within these large data sets becomes a tougher challenge with each advancement made. Fortunately, machine learning techniques have proven to be extremely valuable in detecting outliers within data sets. Astronomaly is a framework that utilises machine learning techniques for anomaly detection in astronomy and incorporates active learning to provide target-specific results. It is used here to evaluate whether machine learning techniques are suitable for detecting anomalies within optical astronomical data obtained from the Dark Energy Camera Legacy Survey (DECaLS). Using the machine learning algorithm isolation forest, Astronomaly is applied to subsets of the DECaLS data set. The pre-processing stage of Astronomaly had to be significantly extended to handle real survey data from DECaLS, with the changes made resulting in up to 10% more sources having their features extracted successfully. Of the top 500 sources returned, 292 were ordinary sources, 86 were artefacts and masked sources, and 122 were interesting anomalous sources. Active learning enhances the probability of identifying outliers in data sets by making it easier to find target-specific sources. Its addition further increases the number of interesting sources returned by almost 40%, with 273 ordinary sources, 56 artefacts, and 171 interesting anomalous sources returned. Among the anomalies discovered are some merger events that have been successfully identified in known catalogues and several candidate merger events that have not yet been identified in the literature. The results indicate that machine learning, in combination with active learning, can be effective in detecting anomalies in actual data sets. The extensions integrated into Astronomaly pave the way for its application on future surveys like the Vera C. Rubin Observatory Legacy Survey of Space and Time.
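The isolation forest step can be sketched as follows on synthetic "sources", where a handful of far-away points stand in for anomalies. The features and scales are invented; real inputs would be features extracted from DECaLS image cutouts.

```python
# Hedged sketch: isolation forest ranking synthetic sources by anomaly
# score, with a few far-away points playing the role of anomalies.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
normal = rng.normal(0.0, 1.0, size=(500, 5))   # bulk "ordinary sources"
outliers = rng.normal(8.0, 1.0, size=(5, 5))   # far-away "anomalies"
X = np.vstack([normal, outliers])

forest = IsolationForest(random_state=0).fit(X)
scores = forest.score_samples(X)               # lower = more anomalous
ranked = np.argsort(scores)                    # most anomalous first
top5 = set(ranked[:5].tolist())
```

In a framework like Astronomaly, a human would then inspect this ranked list, and active learning would re-rank it toward the kinds of sources the user marks as interesting.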
|
48 |
Fault detection of planetary gearboxes in BLDC-motors using vibration and acoustic noise analysis
Ahnesjö, Henrik January 2020 (has links)
This thesis aims to use vibration and acoustic noise analysis to help a production line for a certain motor type ensure good quality. Noise from the gearbox is sometimes present, and it is currently detected by a human listening to the motor. This type of error detection is subjective and open to human error. Therefore, an automatic test that passes or fails the produced Brushless Direct Current (BLDC) motors is wanted. Two measurement setups were used: one based on an accelerometer for vibration measurements, the other based on a microphone for acoustic sound measurements. The acquisition and analysis of the measurements were implemented using the data acquisition device compactDAQ NI 9171 and the graphical programming software NI LabVIEW. Two methods, i.e., power spectrum analysis and machine learning, were used for analyzing the vibration and acoustic signals and identifying faults in the gearbox. The first method, based on the Fast Fourier Transform (FFT), was applied to the sound recorded from the BLDC-motor with the integrated planetary gearbox to identify the peaks of the sound signals. The source of the acoustic noise was a faulty planet gear in which a tooth flank had an indentation, which could be measured and analyzed; the resulting noise can serve as an indication of faults in gears. The second method was based on the BLDC-motor's vibration characteristics and uses supervised machine learning to separate healthy motors from faulty ones. Support Vector Machine (SVM) is the suggested machine learning algorithm, and 23 different features are used. The best performing model was a Coarse Gaussian SVM, with an overall accuracy of 92.25 % on the validation data.
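The FFT-based power spectrum analysis can be sketched with numpy: a synthetic signal containing a hypothetical gear-mesh tone buried in broadband noise, recovered as the spectral peak. The sample rate and tone frequency are invented, not the thesis's measurements.

```python
# Hedged sketch: recover a hypothetical 480 Hz gear-mesh tone from a
# noisy synthetic recording via the one-sided FFT magnitude spectrum.
import numpy as np

fs = 20000                       # assumed sample rate (Hz)
t = np.arange(0, 1.0, 1.0 / fs)
mesh_freq = 480.0                # hypothetical gear-mesh frequency (Hz)

rng = np.random.default_rng(0)
signal = 0.05 * rng.normal(size=t.size)            # broadband noise floor
signal += 1.0 * np.sin(2 * np.pi * mesh_freq * t)  # fault tone

spectrum = np.abs(np.fft.rfft(signal)) / t.size
freqs = np.fft.rfftfreq(t.size, 1.0 / fs)
peak_freq = freqs[np.argmax(spectrum)]
```

A pass/fail test could then compare the energy at gear-mesh harmonics against a threshold, or feed spectral features (among the 23 used in the thesis) to the SVM classifier.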
|
49 |
Distributed Support Vector Machine With Graphics Processing Units
Zhang, Hang 06 August 2009
Training a Support Vector Machine (SVM) requires the solution of a very large quadratic programming (QP) optimization problem. Sequential Minimal Optimization (SMO) is a decomposition-based algorithm that breaks this large QP problem into a series of smallest-possible QP problems. However, it still costs O(n²) computation time. In our SVM implementation, we can train on huge data sets in a distributed manner: we break the dataset into chunks, use the Message Passing Interface (MPI) to distribute each chunk to a different machine, and run SVM training within each chunk. In addition, we moved the kernel calculation part of SVM classification to a graphics processing unit (GPU), which has zero scheduling overhead to create concurrent threads. In this thesis, we take advantage of this GPU architecture to improve the classification performance of SVM.
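The kernel calculation moved to the GPU is, at heart, a dense data-parallel block evaluation. The numpy sketch below computes the same RBF kernel block CPU-side in chunks, mimicking the distribute-then-combine scheme; the sizes and gamma value are invented for illustration.

```python
# Hedged sketch: dense RBF kernel block K[i, j] = exp(-g * ||a_i - b_j||^2),
# the data-parallel computation a GPU (or several MPI workers) would handle.
import numpy as np

def rbf_kernel_block(A, B, gamma=0.5):
    """Pairwise RBF kernel values between rows of A and rows of B."""
    sq = (np.sum(A ** 2, axis=1)[:, None]
          + np.sum(B ** 2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))  # clamp tiny negatives

rng = np.random.default_rng(0)
support_vectors = rng.normal(size=(100, 8))
queries = rng.normal(size=(64, 8))

# Process query chunks independently, then stack: the same split-and-
# combine pattern as distributing chunks over machines or GPU threads.
chunks = [rbf_kernel_block(c, support_vectors)
          for c in np.array_split(queries, 4)]
K = np.vstack(chunks)
```

Because each K[i, j] depends only on one (query, support vector) pair, the computation parallelizes with no inter-chunk communication, which is what makes it a natural fit for GPU offloading.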
|