331 |
應用情感分析於指數型證券投資信託基金趨勢預測之研究 / Research into sentimental analysis to predict exchange-traded fund trend
黃泓銘, Huang, Hung-Ming, Unknown Date
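In recent years the ETF market has grown rapidly, and steady economic growth across Asia has been a major driver of the international ETF market. The Yuanta Taiwan Top 50 ETF is favored by investors because of its large scale. Previous studies indicate that text published online influences public sentiment, which in turn affects stock price movements. For investors, being able to quickly analyze public investor sentiment from large volumes of online financial text, and from it predict price movements, would improve returns. However, hundreds of financial texts are produced every day, and manual analysis is time-consuming and labor-intensive. This study therefore applies text mining techniques and proposes a price prediction model based on sentiment analysis.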
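Previous research on text sentiment analysis has shown that supervised learning can achieve good classification results through simple quantification. However, to address the limitation that supervised learning cannot anticipate unseen cases, this study first uses unsupervised learning to identify the topics of financial texts from the whole of 2016, computes a sentiment index, and labels each text's sentiment polarity. It then uses supervised learning, combined with Taiwan stock market indicators, international indicators, macroeconomic indicators, and technical indicators, to build a classification model that predicts the price trend of the Yuanta Taiwan Top 50 ETF.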
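In the experiments on topic labeling, this study found that because the number of texts far exceeded the number of topic terms, the TF-IDF matrix was too sparse, so the topic model combining TF-IDF with K-means classified poorly. The LDA topic model, in which all topics are shared by all documents, grouped terms better than TF-IDF combined with K-means. On sentiment polarity labeling, the study confirmed that its expanded sentiment lexicon judges word polarity better than NTUSD.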
By comparing the accuracy of a classification model that combines the sentiment index with technical indicators against a model that uses technical indicators alone, this study found that the former is 7% more accurate. A classification model that further incorporates indirect sentiment indicators reaches 71% accuracy, confirming that sentiment analysis of financial texts can effectively improve price trend prediction for the Yuanta Taiwan Top 50 ETF. / Rapid and stable economic growth in Asia has driven rapid growth in global ETF assets in recent years. The Yuanta Taiwan Top 50 ETF is favored by investors because of its large market scale. Past research has shown that text documents on the internet, e.g. news and tweets, strongly affect public emotion, and that public emotion can in turn affect stock prices. For investors, it is important to know how to analyze the latent emotion in text documents in order to predict stock trends. However, traditional manual analysis cannot keep up with the large volume of financial text documents on the internet.
In past sentiment analysis research, supervised methods have proven highly accurate, but they are limited in predicting unknown future trends. This research combines supervised and unsupervised methods to handle these large collections of financial text documents. An unsupervised method identifies the topic of each document, and a sentiment index is then calculated for each document to determine its sentiment polarity. Afterwards, a supervised method builds a prediction model using the sentiment index.
According to the results, the LDA model performs better than the TF-IDF with K-means model. Moreover, the prediction model that includes the sentiment index has higher accuracy than the one that includes only the technical indicators.
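The sentiment index described in this abstract can be illustrated with a minimal lexicon-based sketch. The abstract does not give the formula, so the normalized difference used here, the tiny English word lists, and the function names are all assumptions for illustration only; the thesis itself works with an expanded lexicon built on NTUSD.

```python
# Hypothetical mini-lexicon (assumption); the thesis uses an expanded NTUSD-based lexicon.
POSITIVE = {"gain", "growth", "rally"}
NEGATIVE = {"loss", "decline", "risk"}


def sentiment_index(tokens):
    """Return (pos - neg) / (pos + neg), or 0.0 when no lexicon word occurs.

    This normalized difference is an assumed formula, not the one from the thesis.
    """
    pos = sum(1 for t in tokens if t in POSITIVE)
    neg = sum(1 for t in tokens if t in NEGATIVE)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def polarity(index, threshold=0.0):
    """Map a sentiment index to a coarse polarity label."""
    if index > threshold:
        return "positive"
    if index < -threshold:
        return "negative"
    return "neutral"


doc = "etf rally continues as growth offsets risk".split()
score = sentiment_index(doc)   # two positive hits, one negative hit
label = polarity(score)
```

A per-document score of this kind could then be aggregated per trading day and passed to a classifier as one feature alongside the technical indicators.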
|
332 |
Real-time Hand Gesture Detection and Recognition for Human Computer Interaction
Dardas, Nasser Hasan Abdel-Qader, January 2012
This thesis focuses on bare hand gesture recognition by proposing a new architecture to solve the problem of real-time vision-based hand detection, tracking, and gesture recognition for interaction with an application via hand gestures. The first stage of our system detects and tracks a bare hand in a cluttered background using face subtraction, skin detection and contour comparison. The second stage recognizes hand gestures using bag-of-features and multi-class Support Vector Machine (SVM) algorithms. Finally, a grammar has been developed to generate gesture commands for application control.
Our hand gesture recognition system consists of two steps: offline training and online testing. In the training stage, after the keypoints of every training image are extracted using the Scale-Invariant Feature Transform (SIFT), a vector quantization technique maps the keypoints of every training image into a histogram vector of uniform dimension (bag-of-words) after K-means clustering. This histogram is treated as an input vector for a multi-class SVM to build the classifier. In the testing stage, for every frame captured from a webcam, the hand is detected using the proposed algorithm. Then, the keypoints are extracted from every small image that contains the detected hand posture and fed into the cluster model to map them into a bag-of-words vector, which is fed into the multi-class SVM classifier to recognize the hand gesture.
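The vector quantization step above can be sketched as follows. This is a minimal illustration, assuming the K-means centroids are already trained and using 2-D toy points in place of real 128-dimensional SIFT descriptors; the function names are hypothetical and not taken from the thesis.

```python
import math


def nearest_centroid(descriptor, centroids):
    """Index of the centroid (visual word) closest to one keypoint descriptor."""
    return min(range(len(centroids)),
               key=lambda i: math.dist(descriptor, centroids[i]))


def bag_of_words(descriptors, centroids):
    """Map a variable number of descriptors to a fixed-length, normalized histogram."""
    hist = [0] * len(centroids)
    for d in descriptors:
        hist[nearest_centroid(d, centroids)] += 1
    n = len(descriptors)
    return [count / n for count in hist] if n else [0.0] * len(centroids)


# Toy data (assumption): three visual words and four keypoints from one image.
centroids = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
descriptors = [(1.0, 1.0), (9.0, 9.0), (0.0, 1.0), (1.0, 9.0)]
histogram = bag_of_words(descriptors, centroids)
```

Whatever the number of keypoints in an image, the histogram has one entry per cluster, which is what lets a fixed-input classifier such as an SVM consume it.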
Another hand gesture recognition system was proposed using Principal Component Analysis (PCA). The most significant eigenvectors and the weights of the training images are determined. In the testing stage, the hand posture is detected for every frame using the proposed algorithm. Then, the small image that contains the detected hand is projected onto the most significant eigenvectors of the training images to form its test weights. Finally, the minimum Euclidean distance between the test weights and the training weights of each training image is determined to recognize the hand gesture.
Two applications of gesture-based interaction with a 3D gaming virtual environment were implemented. The first, an exertion videogame, uses a stationary bicycle as one of the main inputs for game play; the user controls left-right movement and shooting actions in the game with a set of hand gesture commands. In the second game, the user steers a helicopter over a city with a set of hand gesture commands.
|
333 |
An Efficient Vision-Based Pedestrian Detection and Tracking System for ITS Applications
Zuo, Tianyu, January 2014
In this thesis, a novel Pedestrian Protection System (PPS), composed of the Pedestrian Detection System (PDS) and the Pedestrian Tracking System (PTS), was proposed. The PPS is a supplementary application for the Advanced Driver Assistance System, which is used to avoid collisions between vehicles and pedestrians.
The Pedestrian Detection System (PDS) is used to detect pedestrians from near to far ranges with the feature-classifier-based detection method (HOG + SVM). To achieve pedestrian detection from near to far ranges, a novel structure was proposed. The structure of our PDS consists of two cameras (called CS and CL, respectively). The CS is equipped with a short focal length lens to detect pedestrians in the near-to-mid range, and the CL is equipped with a long focal length lens to detect pedestrians in the mid-to-far range. To accelerate pedestrian detection, the parallel computing capacity of the GPU was utilized in the PDS. A synchronization algorithm is also introduced to synchronize the detection results of CS and CL. Based on the novel pedestrian detection structure, the detection process can reach a distance of more than 130 meters without decreasing detection accuracy. The detection range can be extended by more than 100 meters without decreasing the processing speed of pedestrian detection. Afterwards, an algorithm to eliminate duplicate detection results is proposed to improve the detection accuracy.
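The abstract does not say how duplicate detections are eliminated. One common approach, shown here purely as an assumed stand-in and not as the thesis's algorithm, is greedy suppression based on intersection over union (IoU). The sketch assumes that boxes from both cameras have already been mapped into a common image coordinate frame and that each detection carries a confidence score.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def remove_duplicates(detections, iou_threshold=0.5):
    """Greedy suppression: keep the highest-scoring box of each overlapping group.

    detections: list of (box, score) pairs; the 0.5 threshold is an assumption.
    """
    kept = []
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(iou(box, kept_box) < iou_threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept


# Toy detections (assumption): the first two boxes cover the same pedestrian.
detections = [
    ((10, 10, 50, 90), 0.9),
    ((12, 12, 52, 92), 0.8),
    ((200, 20, 230, 80), 0.7),
]
unique = remove_duplicates(detections)
```

In this toy input the second box overlaps the first heavily and is dropped, while the distant third box is kept.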
The Pedestrian Tracking System (PTS) is applied following the Pedestrian Detection System. The PTS is used to track the movement trajectory of pedestrians and to predict their future motion and movement direction. A C++ class (called the pedestrianTracking class, PTC for short) was created to operate the tracking process for every detected pedestrian. The Kalman filter is the main algorithm inside the PTC. During the operation of the PPS, the final detection results of each frame from the PDS are transmitted to the PTS to enable the tracking process. The new detection results are used to update the existing tracking results in the PTS. Moreover, if there is a newly detected pedestrian, a new process is generated to track that pedestrian in the PTS. Based on the tracking results in the PTS, the movement trajectory of pedestrians can be obtained and their future motion and movement direction can be predicted. Two kinds of alerts are generated based on the predictions: warning alert and dangerous alert. These two alerts represent different situations, and they alert drivers to the upcoming situations. Based on the predictions and alerts, collisions can be prevented effectively and the safety of pedestrians can be guaranteed.
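The predict/update cycle that a Kalman filter runs per tracked pedestrian can be sketched as below. This is a simplified one-dimensional constant-velocity model written in Python for illustration; the thesis's PTC is a C++ class whose state vector and noise settings are not given in the abstract, so the class name, the state layout, and the values of `q` and `r` are all assumptions.

```python
class KalmanTrack1D:
    """Constant-velocity Kalman filter for one coordinate of one pedestrian."""

    def __init__(self, position, dt=1.0, q=1e-3, r=0.5):
        self.p, self.v = position, 0.0      # state: position and velocity
        self.P = [[1.0, 0.0], [0.0, 1.0]]   # state covariance
        self.dt, self.q, self.r = dt, q, r  # frame interval, process and measurement noise

    def predict(self):
        """Propagate state and covariance one frame: x = F x, P = F P F^T + Q."""
        (a, b), (c, d) = self.P
        dt = self.dt
        self.p += self.v * dt
        self.P = [[a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
                  [c + dt * d, d + self.q]]

    def update(self, z):
        """Correct the state with a measured position z (measurement matrix H = [1, 0])."""
        (a, b), (c, d) = self.P
        y = z - self.p                # innovation
        s = a + self.r                # innovation variance
        k0, k1 = a / s, c / s         # Kalman gain
        self.p += k0 * y
        self.v += k1 * y
        self.P = [[(1 - k0) * a, (1 - k0) * b],
                  [c - k1 * a, d - k1 * b]]


# Toy track (assumption): a pedestrian moving one unit per frame.
track = KalmanTrack1D(position=0.0)
for z in range(1, 21):
    track.predict()
    track.update(float(z))
next_position = track.p + track.v * track.dt   # one-frame-ahead prediction
```

One such object per detected pedestrian mirrors the per-pedestrian tracking process described above; a predicted position could then be compared against the vehicle's path to decide which alert to raise.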
|
334 |
Využití metod data miningu při analýze kreditních dat / Using data mining methods in the analysis of credit risk data
Tvaroh, Tomáš, January 2013
This thesis compares selected data mining methods for solving classification tasks with the method of logistic regression. The first part of the thesis briefly introduces data mining as a scientific discipline and presents the classification task in the context of knowledge discovery in databases. The next part explains the principles of the selected methods: logistic regression, artificial neural networks, classification decision trees and the Support Vector Machine method. Together with the mathematical background of each algorithm, a demonstration of how classification works for new examples is given. The analytical part of the thesis tests the described methods on real-world data from the Lending Club company and compares them on classification accuracy. Towards the end, logistic regression is evaluated in terms of whether its dominant position is due to historical reasons or to its high classification accuracy compared with other methods.
|
335 |
Using machine learning to determine fold class and secondary structure content from Raman optical activity and Raman vibrational spectroscopy
Kinalwa-Nalule, Myra, January 2012
The objective of this project was to apply machine learning methods to determine protein secondary structure content and protein fold class from ROA and Raman vibrational spectral data. Raman and ROA are sensitive to biomolecular structure: the bands of each spectrum correspond to structural elements in proteins and, when combined, give a fingerprint of the protein. However, little is known about many of the bands. There is a need, therefore, to find ways of extrapolating information from spectral bands and to investigate which regions of the spectra contain the most useful structural information. Support Vector Machine (SVM) classification and Random Forest (RF) classification were used to mine protein fold class information, and Partial Least Squares (PLS) regression was used to determine the secondary structure content of proteins. The classification methods were used to group proteins into α-helix, β-sheet, α/β and disordered fold classes. The PLS regression was used to determine percentage protein structural content from Raman and ROA spectral data. The analyses were performed on spectral bin widths of 10 cm⁻¹ and on the spectral amide regions I, II and III. The full spectra and different combinations of the amide regions were also analysed. The SVM analyses, both classification and regression, generally did not perform well. SVM classification models, for example, had low Matthews Correlation Coefficient (MCC) values below 0.5, although this is still better than a value of zero or below, which would indicate a prediction no better than random chance. The SVM regression analyses also performed very poorly, with average R² values below 0.5. R² is the squared Pearson correlation coefficient and shows how well predicted and observed structural content values correlate. An R² value close to 1 indicates a good correlation and therefore a good prediction model. The Partial Least Squares regression analyses yielded much improved results with very high accuracies.
Analyses of the full spectrum and the spectral amide regions produced high R² values of 0.8-0.9 for both ROA and Raman spectral data. This high accuracy was also seen in the analysis of the 850-1100 cm⁻¹ backbone region for both ROA and Raman spectra, which indicates that this region could make an important contribution to protein structure analysis. PLS regression analysis of second-derivative Raman spectra showed much improved performance, with high R² values of 0.81-0.97. The Random Forest algorithm used here for classification showed good performance. The 2-dimensional plots used to visualise the classification clusters showed clear clusters in some analyses; for example, tighter clustering was observed for the amide I, amide I & III and amide I & II & III spectral regions than for the amide II, amide III and amide II & III regions. The Random Forest algorithm also determines variable importance, which showed which spectral bins were crucial in the classification decisions. The ROA Random Forest analyses generally performed better than the Raman Random Forest analyses: the highest percentage of correctly classified proteins was 75% for ROA and 50% for Raman. The analyses presented in this thesis have shown that Raman and ROA vibrational spectra contain information about protein secondary structure and that this information can be extracted using mathematical methods such as the machine learning techniques presented here. The machine learning methods applied in this project were used to mine information about protein secondary structure, and the work presented here demonstrates that these techniques are useful and could be powerful tools in the determination of protein structure from spectral data.
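The two evaluation metrics named in this abstract can be computed as in the sketch below. Two assumptions apply: the MCC shown is the binary form (the thesis classifies four fold classes, for which a multi-class generalisation or a one-vs-rest evaluation would be needed), and R² is taken to mean the squared Pearson correlation between observed and predicted values. The function names and toy numbers are hypothetical.

```python
import math


def mcc(tp, tn, fp, fn):
    """Binary Matthews Correlation Coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def squared_pearson(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        return 0.0  # correlation is undefined for constant input; report 0.0
    return cov * cov / (var_o * var_p)


perfect = mcc(tp=50, tn=50, fp=0, fn=0)      # every prediction correct
chance = mcc(tp=25, tn=25, fp=25, fn=25)     # indistinguishable from guessing
inverted = mcc(tp=0, tn=0, fp=50, fn=50)     # every prediction wrong

# Toy helix-content percentages (assumption): predictions offset by a constant.
r2 = squared_pearson([10.0, 20.0, 30.0], [12.0, 22.0, 32.0])
```

MCC ranges from -1 to 1 with 0 corresponding to chance-level prediction. The squared Pearson correlation ignores constant offsets, so the toy predictions above score as perfectly correlated even though each is off by two points.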
|
336 |
App based ski management with performance predictions
Nelson, Lars, January 2018
This report aims to solve a problem for the waxers of the Swedish National Cross-country Ski Team, hereafter referred to as the national team. The problem at hand is that the national team currently lacks a system for book-keeping of ski pairs and ski tests. The project also intends to provide a tool for predicting the best ski pairs in given conditions. The report describes cross-country skis and the factors that affect their performance, and presents the testing procedure of the national team. The project solves the problem at hand by developing a web service based on Django and Django REST Framework and an iOS application to handle the user interaction. The app was tested and approved by the waxers of the national team. To predict the best performing skis in given conditions, three Machine Learning algorithms, Support Vector Machine (SVM), Decision Tree, and Artificial Neural Network (ANN), are implemented and evaluated. Experimental results indicate that the ANN has better accuracy than the Decision Tree and the SVM, and that the SVM performs slightly worse than the other two, when applied to test data that is artificially generated based on the experience of the national team. All three Machine Learning algorithms achieve a mean accuracy significantly higher than that of a baseline algorithm. The report suggests that the accuracy of the ANN algorithm is high enough to be useful for the national team.
|
337 |
Detecting ADS-B spoofing attacks : using collected and simulated data / Insamling och simulering av ADS-B meddelanden för detektion av attacker
Wahlgren, Alex, Thorn, Joakim, January 2021
At a time when technology in general is progressing rapidly, this thesis aims to present possible advancements in the security of air traffic communication. By highlighting how data can be extracted using simple hardware and open-source software, the transparency and lack of authentication of this communication are showcased. The research is narrowed down to discovering vulnerabilities of the ADS-B protocol in order to apply countermeasures. By fetching live aircraft data with OpenSky-Network and simulated ADS-B attack data with OpenScope, this thesis develops a data set with both authentic and malicious ADS-B messages. The data set was cleaned to remove outliers and other improper data. A machine learning model was then trained on the data set to detect malicious ADS-B messages. Using a Support Vector Machine (SVM), it was possible to produce a model that can detect four different types of aviation communication attacks while allowing authentic messages to pass through the IDS. The finished model was able to detect incoming ADS-B attacks with an overall accuracy of 83.10%.
|
338 |
Třífázový měnič pro synchronní servomotory / Three-phase converter for synchronous servomotors
Perout, Miroslav, January 2020
This diploma thesis deals with the design of a DC/AC converter for the control of PMSM motors. First, the type of motor and the options for sensing the rotor position are described. Subsequently, the power section is designed and the losses, heating, and approximate efficiency of the inverter are calculated. Next, the processor is selected and the individual communication and protection circuits are designed. At the same time, control algorithms are analyzed. The last part describes the implementation of the PCB and of the inverter as a whole.
|
339 |
Efektivnost hlubokých konvolučních neuronových sítí na elementární klasifikační úloze / Efficiency of deep convolutional neural networks on an elementary classification task
Prax, Jan, January 2021
This thesis compares deep convolutional neural network models with feature descriptor models, in which each feature descriptor is paired with a suitably chosen classifier. Because these models belong to machine learning, the types of machine learning are described. The chosen models are then described, and their fundamentals and problems are explained. The hardware and software used for the tests are listed, followed by the test results and a summary of the results. Finally, the models are compared on validation accuracy and training time.
|
340 |
Password protection by analyzed keystrokes : Using Artificial Intelligence to find the impostor
Danilovic, Robert, Svensson, Måns, January 2021
A literature review found that there are still issues with typed passwords. The information gathered suggests that keystroke characteristics have the potential to add another layer of security to compromised user accounts. The world has become more and more connected, and the number of people who store personal information online or on their phones has steadily increased. In this thesis, a solution is proposed and evaluated to make authentication safer and less intrusive. Less intrusive in this case means that it does not require cooperation from the user; it only needs to capture data from the user in the background. As authentication methods such as fingerprint scanning and facial recognition are becoming more popular, this work investigates whether there are other biometric features for user authentication. Employing Artificial Intelligence, extra sensor metrics and Machine Learning models together with the user's typing characteristics could be used to uniquely identify users. In this context the Neural Network and Support Vector Machine algorithms have been examined, alongside the gyroscope and the touchscreen sensors. To test the proposed method, an application was built to capture typing characteristics for the models to train on. Ten test subjects were chosen to type a password multiple times in order to generate the data. After the data was gathered and pre-processed, it was analyzed and used to train the Machine Learning models. The proposed solution and the presented data serve as a proof of concept that there are additional sensors that could be used to authenticate users, namely the gyroscope. By capturing the typing characteristics of users, our solution achieved 97.7% accuracy in authenticating users with Support Vector Machines.
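The abstract does not list which typing characteristics were captured. As an assumed illustration only, two features commonly used in keystroke dynamics are dwell time (how long a key is held) and flight time (the gap between releasing one key and pressing the next). The event format and function name below are hypothetical, and the gyroscope and touchscreen readings mentioned above are not modelled.

```python
def keystroke_features(events):
    """Build a feature vector of dwell times followed by flight times.

    events: list of (key, press_ms, release_ms) tuples in typing order.
    """
    dwell = [release - press for _, press, release in events]
    flight = [events[i + 1][1] - events[i][2] for i in range(len(events) - 1)]
    return dwell + flight


# Toy sample (assumption): one user typing the three characters "pas".
sample = [("p", 0, 90), ("a", 150, 230), ("s", 300, 370)]
features = keystroke_features(sample)
```

For a fixed password every attempt yields a vector of the same length, so repeated attempts by each test subject can be stacked into a training matrix for a classifier such as an SVM.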
|