Global ETD Search

241	Stock Market Forecasting Using SVM With Price and News Analysis Hansen, Patrik, Vojcic, Sandi January 2020 Many machine learning approaches have been used for financial forecasting to estimate future stock trends. The focus of this project is to implement a Support Vector Machine that takes price and news analysis for companies within the technology sector as inputs, to predict whether the price of the stock will rise or fall in the coming days, and to observe the impact on prediction accuracy of adding news to the technical analysis. The price analysis is compiled from 9 different financial indicators used to indicate changes in price, and the news analysis uses the bag-of-words method to rate headlines as positive or negative. There is a slight indication that the news improves the results: if the validation data is randomly sampled, the testing accuracy increases. When testing on the last fifth of the data of each company, there was only a small difference in the results when adding news to the calculation, and as such no clear correlation can be seen. The resulting program has a mean and median testing accuracy over 50 % for almost all settings. Complications when using SVM for price forecasting in the stock market are also discussed. / Bachelor's degree project in electrical engineering 2020, KTH, Stockholm Support vector machine stock market financial indicators news bag-of-words Electrical Engineering and Electronics
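As a rough illustration of the bag-of-words headline rating mentioned in the abstract above, the following Python sketch rates a headline by counting words from small positive and negative word lists. The word lists, the function name `score_headline`, and the neutral tie case are assumptions made for illustration; the abstract does not state the thesis's actual vocabulary or scoring rule.

```python
import re
from collections import Counter

# Hypothetical word lists; the thesis's actual vocabulary is not given in the abstract.
POSITIVE = {"beats", "growth", "record", "surge", "upgrade"}
NEGATIVE = {"cuts", "drop", "lawsuit", "loss", "recall"}


def score_headline(headline: str) -> int:
    """Return +1 (positive), -1 (negative) or 0 (neutral) for a headline."""
    # Bag of words: word order is discarded, only per-word counts are kept.
    bag = Counter(re.findall(r"[a-z']+", headline.lower()))
    pos = sum(count for word, count in bag.items() if word in POSITIVE)
    neg = sum(count for word, count in bag.items() if word in NEGATIVE)
    # Sign of (pos - neg): +1, -1, or 0 when the counts tie.
    return (pos > neg) - (pos < neg)
```

A rating like this could be appended to the price-indicator feature vector before training a classifier; whether the thesis combines the inputs in that way is not stated in the abstract.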
242	Development of a Low-Cost and Easy-to-Use Wearable Knee Joint Monitoring System / A Wearable Knee Joint Monitoring System Faisal, Abu Ilius January 2020 The loss of mobility among the elderly has become a significant health and socio-economic concern worldwide. Poor mobility due to gradual deterioration of the musculoskeletal system causes older adults to be more vulnerable to serious health risks such as joint injuries, bone fractures and traumatic brain injury. The costs associated with the treatment and management of these injuries are a huge financial/social burden on the government, society and family. The knee is one of the key joints that bear most of the body weight, so its proper function is essential for good mobility. Further, continuous monitoring of the knee joint can potentially provide important quantitative information related to knee health and mobility that can be utilized for health assessment and early diagnoses of mobility-related problems. In this research work, we developed an easy-to-use, low-cost, multi-sensor-based wearable device to monitor and assess the knee joint and proposed an analysis system to characterize and classify an individual's knee joint features with respect to the baseline characteristics of his/her peer group. The system is composed of a set of different miniaturized sensors (inertial motion, temperature, pressure and galvanic skin response) to obtain linear acceleration, angular velocity, skin temperature, muscle pressure and sweat rate of a knee joint during different daily activities. A database is constructed from 70 healthy adults in the age range from 18 to 86 years using the combination of all signals from our knee joint monitoring system. In order to extract relevant features from the datasets, we employed computationally efficient methods such as complementary filter and wavelet packet decomposition. 
Minimum redundancy maximum relevance algorithm and principal component analysis were used to select key features and reduce the dimension of the feature vectors. The obtained features were classified using the support vector machine, forming two distinct clusters in the baseline knee joint characteristics corresponding to gender, age, body mass index and knee/leg health condition. Thus, this simple, easy‐to‐use, cost-effective, non-invasive and unobtrusive knee monitoring system can be used for real-time evaluation and early diagnoses of joint disorders, fall detection, mobility monitoring and rehabilitation. / Thesis / Master of Applied Science (MASc) Knee joint monitoring system Wearable Mobility Sensor fusion Inertial sensors Gait analysis Feature extraction Support vector machine
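The complementary filter named in the abstract above is a standard way to fuse a gyroscope rate with an accelerometer-derived angle. The Python sketch below shows the general technique only; the function name, the two-axis accelerometer input, the blending weight `alpha = 0.98`, and the units are assumptions for illustration and are not taken from the thesis.

```python
import math


def complementary_filter(gyro_rates, accel_samples, dt, alpha=0.98):
    """Estimate an angle in degrees from gyro rates (deg/s) and (ax, az) samples.

    alpha weights the integrated gyro signal (smooth, but drifts over time);
    1 - alpha weights the accelerometer angle (noisy, but drift-free).
    """
    # Start from the first accelerometer-derived angle (illustrative choice).
    ax0, az0 = accel_samples[0]
    angle = math.degrees(math.atan2(ax0, az0))
    estimates = []
    for rate, (ax, az) in zip(gyro_rates, accel_samples):
        accel_angle = math.degrees(math.atan2(ax, az))
        # Blend the gyro-propagated angle with the accelerometer angle.
        angle = alpha * (angle + rate * dt) + (1.0 - alpha) * accel_angle
        estimates.append(angle)
    return estimates
```

With a constant gyro bias and a level accelerometer reading, the estimate settles at a bounded offset instead of drifting without limit, which is the main reason for blending the two sensors.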
243	Improvement of Bacteria Detection Accuracy and Speed Using Raman Scattering and Machine Learning Mandour, Aseel 15 September 2022 Bacteria identification plays an essential role in preventing health complications and saving patients' lives. The most widely used method to identify bacteria, the bacterial cultural method, suffers from long processing times. Hence, an effective, rapid, and non-invasive method is needed as an alternative. Raman spectroscopy is a potential candidate for bacteria identification due to its effective and rapid results and the fact that, similar to the uniqueness of a human fingerprint, the Raman spectrum is unique for every material. In my lab at the University of Ottawa, we focus on the use of Raman scattering for biosensing in order to achieve high identification accuracy for different types of bacteria. Based on the unique Raman fingerprint for each bacteria type, different types of bacteria can be identified successfully. However, using the Raman spectrum to identify bacteria poses a few challenges. First, the Raman signal is a weak signal, and so enhancement of the signal intensity is essential, e.g., by using surface-enhanced Raman scattering (SERS). Moreover, the Raman signal can be contaminated by different noise sources. Also, the signal consists of a large number of features, and is non-linear due to the correlation between the Raman features. Using machine learning (ML) along with SERS, we can overcome such challenges in the identification process and achieve high accuracy for the system identifying bacteria. In this thesis, I present a method to improve the identification of different bacteria types using a support vector machine (SVM) ML algorithm based on SERS. I also present dimension reduction techniques to reduce the complexity and processing time while maintaining high identification accuracy in the classification process. 
I consider four bacteria types: Escherichia coli (EC), Cutibacterium acnes (CA, formerly known as Propionibacterium acnes), methicillin-resistant Staphylococcus aureus (MRSA), and methicillin-sensitive Staphylococcus aureus (MSSA). Both the MRSA and MSSA are combined in a single class named MS in the classification. We are focusing on using these types of bacteria as they are the most common types in the joint infection disease. Using binary classification, I present the simulation results for three binary models: EC vs CA, EC vs MS, and MS vs CA. Using the full data set, binary classification achieved a classification accuracy of more than 95% for the three models. When the samples data set was reduced, to decrease the complexity based on the samples' signal-to-noise ratio (SNR), a classification accuracy of more than 95% for the three models was achieved using less than 60% of the original data set. The recursive feature elimination (RFE) algorithm was then used to reduce the complexity in the feature dimension. Given that a small number of features were more heavily weighted than the rest of the features, the number of features used in the classification could be significantly reduced while maintaining high classification accuracy. I also present the classification accuracy of using the multiclass one-versus-all (OVA) method, i.e., EC vs all, MS vs all, and CA vs all. Using the complete data set, the OVA method achieved classification accuracy of more than 90%. Similar to the binary classification, the dimension reduction was applied to the input samples. Using the SNR reduction, the input samples were reduced by more than 60% while maintaining classification accuracy higher than 80%. Furthermore, when the RFE algorithm was used to reduce the complexity on the features, and only the 5% top-weighted features of the full data set were used, a classification accuracy of more than 90% was achieved. 
Finally, by combining both reduction dimensions, the classification accuracy was above 92% for a significantly reduced data set. Both the dimension reduction and the improvement in the classification accuracy between different types of bacteria using the ML algorithm and SERS could have a significant impact in fulfilling the demand for accurate, fast, and non-destructive identification of bacteria samples in the medical field, in turn potentially reducing health complications and saving patient lives. Support vector machine (SVM) Bacteria identification
244	Integrative Modeling and Analysis of High-throughput Biological Data Chen, Li 21 January 2011 Computational biology is an interdisciplinary field that focuses on developing mathematical models and algorithms to interpret biological data so as to understand biological problems. With current high-throughput technology development, different types of biological data can be measured on a large scale, which calls for more sophisticated computational methods to analyze and interpret the data. In this dissertation research work, we propose novel methods to integrate, model and analyze multiple biological data, including microarray gene expression data, protein-DNA interaction data and protein-protein interaction data. These methods will help improve our understanding of biological systems. First, we propose a knowledge-guided multi-scale independent component analysis (ICA) method for biomarker identification on time course microarray data. Guided by a knowledge gene pool related to a specific disease under study, the method can determine disease relevant biological components from ICA modes and then identify biologically meaningful markers related to the specific disease. We have applied the proposed method to yeast cell cycle microarray data and Rsf-1-induced ovarian cancer microarray data. The results show that our knowledge-guided ICA approach can extract biologically meaningful regulatory modes and outperform several baseline methods for biomarker identification. Second, we propose a novel method for transcriptional regulatory network identification by integrating gene expression data and protein-DNA binding data. The approach is built upon a multi-level analysis strategy designed for suppressing false positive predictions. With this strategy, a regulatory module becomes increasingly significant as more relevant gene sets are formed at finer levels. 
At each level, a two-stage support vector regression (SVR) method is utilized to reduce false positive predictions by integrating binding motif information and gene expression data; a significance analysis procedure is followed to assess the significance of each regulatory module. The resulting performance on simulation data and yeast cell cycle data shows that the multi-level SVR approach outperforms other existing methods in the identification of both regulators and their target genes. We have further applied the proposed method to breast cancer cell line data to identify condition-specific regulatory modules associated with estrogen treatment. Experimental results show that our method can identify biologically meaningful regulatory modules related to estrogen signaling and action in breast cancer. Third, we propose a bootstrapping Markov Random Field (MRF)-based method for subnetwork identification on microarray data by incorporating protein-protein interaction data. Methodologically, an MRF-based network score is first derived by considering the dependency among genes to increase the chance of selecting hub genes. A modified simulated annealing search algorithm is then utilized to find the optimal/suboptimal subnetworks with maximal network score. A bootstrapping scheme is finally implemented to generate confident subnetworks. Experimentally, we have compared the proposed method with other existing methods, and the resulting performance on simulation data shows that the bootstrapping MRF-based method outperforms other methods in identifying ground truth subnetwork and hub genes. We have then applied our method to breast cancer data to identify significant subnetworks associated with drug resistance. The identified subnetworks not only show good reproducibility across different data sets, but indicate several pathways and biological functions potentially associated with the development of breast cancer and drug resistance. 
In addition, we propose to develop network-constrained support vector machines (SVM) for cancer classification and prediction, by taking into account the network structure to construct classification hyperplanes. The simulation study demonstrates the effectiveness of our proposed method. The study on the real microarray data sets shows that our network-constrained SVM, together with the bootstrapping MRF-based subnetwork identification approach, can achieve better classification performance compared with conventional biomarker selection approaches and SVMs. We believe that the research presented in this dissertation not only provides novel and effective methods to model and analyze different types of biological data, but also, through extensive experiments on several real microarray data sets, shows the potential to improve the understanding of biological mechanisms related to cancers by generating novel hypotheses for further study. / Ph. D. Support Vector Regression Biomarker Identification Transcriptional Regulatory Network Microarray Data Analysis Markov Random Field Support Vector Machine
245	Automated 2D Detection and Localization of Construction Resources in Support of Automated Performance Assessment of Construction Operations Memarzadeh, Milad 11 January 2013 This study presents two computer vision based algorithms for automated 2D detection of construction workers and equipment from site video streams. The state-of-the-art research proposes semi-automated detection methods for tracking of construction workers and equipment. Considering the number of active equipment and workers on jobsites and their frequency of appearance in a camera's field of view, application of semi-automated techniques can be time-consuming. To address this limitation, two new algorithms based on Histograms of Oriented Gradients and Colors (HOG+C), 1) a HOG+C sliding detection window technique, and 2) a HOG+C deformable part-based model, are proposed, and their performance is compared to the state-of-the-art algorithm in the computer vision community. Furthermore, a new comprehensive benchmark dataset containing over 8,000 annotated video frames including equipment and workers from different construction projects is introduced. This dataset contains a large range of pose, scale, background, illumination, and occlusion variation. The preliminary results, with average performance accuracies of 100%, 92.02%, and 89.69% for workers, excavators, and dump trucks respectively, indicate the applicability of the proposed methods for automated activity analysis of workers and equipment from single video cameras. Unlike other state-of-the-art algorithms in automated resource tracking, these methods in particular detect idle resources and do not need manual or semi-automated initialization of the resource locations in 2D video frames. / Master of Science Support Vector Machine Histogram of Oriented Gradients Deformable Part-based Models HSV Colors Resource Detection and Localization Performance Monitoring
246	Data-Driven Supervised Classifiers in High-Dimensional Spaces: Application on Gene Expression Data Efrem, Nabiel H. January 2024 Several ready-to-use supervised classifiers perform predictively well in large-sample cases, but generally, the same cannot be expected when transitioning to high-dimensional settings. This can be explained by the fact that classical supervised theory was not developed for high-dimensional spaces, leaving several classifiers struggling against the curse of dimensionality. A rise in parsimonious classification procedures, particularly techniques incorporating feature selectors, can be observed. Such a technique can be interpreted as a two-step procedure: allowing an arbitrary selector to obtain a feature subset independent of a ready-to-use model, and subsequently classifying unlabelled instances within the selected subset. Work on the two-step procedure is often heavy on motivation, while theoretical and algorithmic descriptions are frequently overlooked. In this thesis, we aim to describe the theoretical and algorithmic framework when employing a feature selector as a pre-processing step for Support Vector Machine and assess its validity in high-dimensional settings. The validity of the proposed classifier is evaluated based on predictive performance through a comparative study with a state-of-the-art algorithm designed for advanced learning tasks. The chosen algorithm effectively employs feature relevance during training, making it suitable for high-dimensional settings. The results suggest that the proposed classifier predictively outperforms the Support Vector Machine in lower input dimensions; however, a high rate of convergence towards a performance comparable to the Support Vector Machine tends to emerge for input dimensions beyond a certain threshold. Additionally, the thesis could not conclude any strict superior performance between the chosen state-of-the-art algorithm and the proposed classifier. 
Nonetheless, the state-of-the-art algorithm imposes a more balanced performance across both labels. Supervised Classification High-Dimensional Space Feature Selection Parsimonious Classifier Support Vector Machine Probability Theory and Statistics Sannolikhetsteori och statistik
247	Mapping eastern redcedar (Juniperus Virginiana L.) and quantifying its biomass in Riley County, Kansas Burchfield, David Richard January 1900 Master of Arts / Department of Geography / Kevin P. Price / Due primarily to changes in land management practices, eastern redcedar (Juniperus virginiana L.), a native Kansas conifer, is rapidly invading valuable rangelands. The suppression of fire and increase of intensive grazing, combined with the rapid growth rate, high reproductive output, and dispersal ability of the species, have allowed it to dramatically expand beyond its original range. There is a growing interest in harvesting this species for use as a biofuel. For economic planning purposes, density and biomass quantities for the trees are needed. Three methods are explored for mapping eastern redcedar and quantifying its biomass in Riley County, Kansas. First, a land cover classification of redcedar cover is performed using a support vector machine classifier applied to a multi-temporal stack of Landsat TM satellite images. Second, a Small Unmanned Aircraft System (sUAS) is used to measure individual redcedar trees in an area where they are encroaching into a pasture. Finally, a hybrid approach is used to estimate redcedar biomass using high resolution multispectral and light detection and ranging (LiDAR) imagery. These methods show promise for the forestry, range management, and bioenergy industries in better understanding an invasive species that has great potential for use as a biofuel resource. Eastern redcedar Remote sensing Multi-temporal image classification Support vector machine LiDAR Forestry (0478) Remote Sensing (0799)
248	Support vector machines, generalization bounds, and transduction Kroon, Rodney Stephen 12 1900 Thesis (MComm)--University of Stellenbosch, 2003. / Please refer to full text for abstract. Machine learning Computer algorithms PAC bounds Support vector machine (SVM) Transductive bounds Model selection Theses -- Computer science Dissertations -- Computer science Theses -- Mathematics Dissertations -- Mathematics
249	Identifying Categorical Land Use Transition and Land Degradation in Northwestern Drylands of Ethiopia Zewdie, Worku, Csaplovics, Elmar 08 June 2016 Land use transition in dryland ecosystems is one of the major driving forces of landscape change that directly impacts the welfare of humans. In this study, the support vector machine (SVM) classification algorithm and cross tabulation matrix analysis are used to identify systematic and random processes of change. The magnitude and prevailing signals of land use transitions are assessed taking into account net change and swap change. Moreover, spatiotemporal patterns and the relationship of precipitation and the Normalized Difference Vegetation Index (NDVI) are explored to evaluate landscape degradation. The assessment showed that 44% of net change and about 54% of total change occurred during the study period, with the latter being due to swap change. The conversion of over 39% of woodland to cropland accounts for the highest loss of valuable ecosystem in the region. The spatial relationship of NDVI and precipitation also showed R² of below 0.5 over 55% of the landscape with no significant changes in the precipitation trend, thus representing an indicative symptom of land degradation. This in-depth analysis of random and systematic landscape change is crucial for designing policy intervention to halt woodland degradation in this fragile environment. support vector machine (SVM) NDVI net change swap change systematic transition dryland degradation Technical University Dresden Publication funds ddc:620 rvk:ZG 1000
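The net change and swap change mentioned in the abstract above are commonly derived from a cross tabulation matrix as follows: for each category, gain is the column total minus persistence, loss is the row total minus persistence, net change is the absolute difference of gain and loss, and swap is twice the smaller of the two. The Python sketch below follows that common definition; the function name and the matrix layout (rows for the earlier date, columns for the later date) are assumptions, and the paper's exact procedure is not given in the abstract.

```python
def change_components(matrix):
    """Per-category gain, loss, net change, swap and total change.

    matrix[i][j] is the share of the landscape that was category i at the
    earlier date and category j at the later date (square matrix).
    """
    n = len(matrix)
    results = []
    for k in range(n):
        persistence = matrix[k][k]
        loss = sum(matrix[k]) - persistence  # left category k
        gain = sum(matrix[i][k] for i in range(n)) - persistence  # entered k
        results.append({
            "gain": gain,
            "loss": loss,
            "net": abs(gain - loss),
            "swap": 2 * min(gain, loss),
            "total": gain + loss,  # always equals net + swap
        })
    return results
```

A category with equal gain and loss has zero net change but non-zero swap, which is why looking at net change alone can hide transitions that relocate a land use class without altering its total area.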
250	官員職等陞遷分類預測之研究 / Classification prediction on government official's rank promotion 賴隆平, Lai, Long Ping Unknown Date Personnel promotion among civil servants is highly complex and conceals many invariant rules and processes. Relationships between superiors and subordinates, and among civil servants generally, are as intricate as a spider's web, and individual promotions further conceal struggles between factions or the mentoring of junior colleagues. The publicly available Presidential Office Gazette (presidential orders) gives clear access to the appointment records of all civil servants, including promotions, appointments, assignments and dismissals; each record also contains the agency, unit, job title and rank, and can be used for research. This study constructs a promotion-sequence data model and applies two data mining algorithms, the Support Vector Machine (SVM) and the decision tree, using personnel domain knowledge to identify the more influential attributes when designing the experimental models. Experiments are run with multiple models and multiple data sets. The prediction for each class is presented through overall average results and charts, results computed from different attribute data are compared to analyze their reasonableness, and the reasonableness and feasibility of the method are finally evaluated from the resulting figures. For both the SVM and the decision tree, the larger the training set, the better the predictions, as with models in general, and the mined supervisory-position attribute parameters and key attributes coincide with the logic of personnel promotion. Each method has its strengths, but overall the SVM is slightly better. The SVM does, however, require less influential attribute parameters to be excluded first; otherwise the computation of the hyperplane is pulled in competing directions, which affects the predictions. The decision tree has no such issue and is more widely applicable, since declaring the type of each attribute value allows classification experiments on different attribute data types. The predictions produced by the SVM and the decision tree reach an accuracy of about 77% to 82%, which indicates that the promotion system for mid- and senior-level civil servants in the country follows discernible patterns, with a degree of institutional regulation and stability, rather than arbitrary appointment, dismissal and promotion. Through this application of data mining, the study offers public agencies reference information for human resource management, organizational development, promotion development and organizational streamlining. In addition, by entering the relevant attributes, serving civil servants can estimate their promotion prospects as an aid to career planning. Data Mining SVM Support Vector Machine DT Decision Tree
