21 |
The Effectiveness of a Random Forests Model in Detecting Network-Based Buffer Overflow Attacks
Julock, Gregory Alan 01 January 2013
Buffer overflows are a common type of network intrusion attack that continues to plague the networked community. Unfortunately, this type of attack is not well detected by current data mining algorithms. This research investigated the use of Random Forests, an ensemble technique that builds multiple decision trees and then combines their votes to classify each instance. The research investigated Random Forests' effectiveness in detecting buffer overflows compared to other data mining methods such as CART and Naïve Bayes. Random Forests was used for variable reduction, cost-sensitive classification was applied, and each method's detection performance was compared and reported along with its receiver operating characteristic (ROC). The experiment showed that Random Forests outperformed CART and Naïve Bayes in classification performance. Using a technique to identify the most important variables for buffer overflow detection, Random Forests was also able to improve upon its buffer overflow classification performance.
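As a rough illustration of the voting step in a Random Forests-style classifier, the sketch below takes a majority vote over the predictions of several trees. It is not the thesis's implementation: the three "trees" are hand-written stand-ins, and the feature names (`payload_len`, `nop_bytes`) and thresholds are hypothetical.

```python
from collections import Counter

def forest_predict(trees, sample):
    """Return the class label that receives the most votes across all trees."""
    votes = Counter(tree(sample) for tree in trees)
    return votes.most_common(1)[0][0]

# Hypothetical stand-ins for trained decision trees: each maps a
# feature dict to a class label. Real trees would be learned from data.
trees = [
    lambda s: "attack" if s["payload_len"] > 1000 else "normal",
    lambda s: "attack" if s["nop_bytes"] > 50 else "normal",
    lambda s: "attack" if s["payload_len"] > 4000 else "normal",
]

# Two of the three trees vote "attack" for this sample.
prediction = forest_predict(trees, {"payload_len": 2048, "nop_bytes": 120})
```

With an odd number of trees and two classes, the vote cannot tie.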
|
22 |
Using Machine Learning to Detect Malicious URLs
Cheng, Aidan 01 January 2017
There is a need for a better predictive model that reduces the number of malicious URLs being sent through emails. This system should learn from existing metadata about URLs. The ideal solution to this problem would be able to learn from its own predictions. For example, if it predicts a URL to be malicious and that URL is deemed safe by the sandboxing environment, the predictor should refine its model to account for this data. The problem, then, is to construct a model with these characteristics that can make these predictions for the vast number of URLs being processed. Given that the current system does not employ machine learning methods, we intend to investigate multiple such models and summarize which of them might be worth pursuing on a large scale.
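The feedback loop described above, where a prediction later contradicted by the sandbox is fed back into the model, can be sketched with an incrementally updated classifier. The example below assumes a count-based Naive Bayes model over URL tokens purely for illustration. The abstract does not say which models were investigated, and the class name, tokens, and labels are all hypothetical.

```python
import math
from collections import defaultdict

class OnlineNaiveBayes:
    """Multinomial Naive Bayes that can be updated one example at a time."""

    def __init__(self):
        self.class_counts = defaultdict(int)
        self.token_counts = defaultdict(lambda: defaultdict(int))
        self.vocab = set()

    def update(self, tokens, label):
        """Fold one labeled example (e.g. a sandbox verdict) into the counts."""
        self.class_counts[label] += 1
        for token in tokens:
            self.token_counts[label][token] += 1
            self.vocab.add(token)

    def predict(self, tokens):
        """Return the label with the highest log-posterior (Laplace smoothing)."""
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / total)
            denom = sum(self.token_counts[label].values()) + len(self.vocab)
            for token in tokens:
                score += math.log((self.token_counts[label].get(token, 0) + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

model = OnlineNaiveBayes()
model.update(["login", "verify", "account"], "malicious")
model.update(["docs", "python", "org"], "safe")
before = model.predict(["verify", "account"])

# Hypothetical sandbox verdict: the URL flagged above turned out to be safe.
model.update(["verify", "account"], "safe")
after = model.predict(["verify", "account"])
```

Because the model stores only counts, each correction is a constant-time update rather than a full retraining pass, which matters when the volume of URLs is large.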
|
23 |
Hierarchical text categorization in a news aggregator portal (Categorização hierárquica de textos em um portal agregador de notícias)
Borges, Hugo Lima January 2009
Advisor: Ana Carolina Lorena / Master's dissertation - Universidade Federal do ABC, Graduate Program in Information Engineering, 2009
|
24 |
Filtering Social Tags for Songs based on Lyrics using Clustering Methods
Chawla, Rahul 21 July 2011
In the field of music data mining, mood and topic information is considered high-level metadata. Extracting mood and topic information is difficult but is regarded as very valuable. The immense growth of Web 2.0 has made social tags a direct form of interaction with users, and their feedback through tags can help in the classification and retrieval of music. One of the major shortcomings of the approaches employed so far is the improper filtering of social tags. This thesis delves into information extraction from songs' tags and lyrics. The main focus is on removing erroneous and unwanted tags with the help of other features. A hierarchical clustering method is applied to create clusters of tags, based on the semantic information that any given pair of tags shares. The lyrics features are used by employing the CLOPE clustering method to form lyrics clusters, and the Naïve Bayes method to compute probability values that aid the classification process. The outputs from classification are finally used to estimate the accuracy of a tag belonging to the song. The results obtained from the experiments all point towards the success of the proposed method, which can be used by other research projects in similar fields.
|
25 |
Game theoretic and machine learning techniques for balancing games
Long, Jeffrey Richard 29 August 2006
Game balance is the problem of determining the fairness of actions or sets of actions in competitive, multiplayer games. This problem primarily arises in the context of designing board and video games. Traditionally, balance has been achieved through large amounts of play-testing and trial-and-error on the part of the designers. In this thesis, it is our intent to lay down the beginnings of a framework for a formal and analytical solution to this problem, combining techniques from game theory and machine learning. We first develop a set of game-theoretic definitions for different forms of balance, and then introduce the concept of a strategic abstraction. We show how machine classification techniques can be used to identify high-level player strategy in games, using the two principal methods of sequence alignment and Naive Bayes classification. Bioinformatics sequence alignment, when combined with a 3-nearest neighbor classification approach, can, with only 3 exemplars of each strategy, correctly identify the strategy used in 55% of cases using all data, and 77% of cases on data that experts indicated actually had a strategic class. Naive Bayes classification achieves similar results, with 65% accuracy on all data and 75% accuracy on data rated to have an actual class. We then show how these game theoretic and machine learning techniques can be combined to automatically build matrices that can be used to analyze game balance properties.
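To make the combination of sequence alignment and 3-nearest neighbor classification concrete, the sketch below scores a query action sequence against labeled exemplars and takes a majority vote among the three best matches. The abstract does not name the alignment algorithm or scoring scheme, so Needleman-Wunsch global alignment with unit match, mismatch, and gap scores is assumed here. The strategy names and action strings are hypothetical.

```python
from collections import Counter

def alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score between two sequences."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]

def classify(query, exemplars, k=3):
    """Label the query by majority vote among the k best-aligned exemplars."""
    ranked = sorted(exemplars, key=lambda ex: alignment_score(query, ex[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical exemplars: three action sequences per strategy,
# with each letter standing for one in-game action.
exemplars = [
    ("BBAAA", "rush"), ("BBAAAA", "rush"), ("BAAA", "rush"),
    ("WWWBE", "econ"), ("WWWWBE", "econ"), ("WWBE", "econ"),
]

strategy = classify("BBAA", exemplars)
```

Using alignment scores as the similarity measure lets sequences of different lengths be compared directly, which a fixed-length feature vector would not allow.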
|
26 |
A wearable real-time system for physical activity recognition and fall detection
Yang, Xiuxin 23 September 2010
This thesis designs and implements a wearable system to recognize physical activities and detect falls in real time. Recognizing people's physical activity has a broad range of applications. These include helping people maintain their energy balance by developing health assessment and intervention tools, investigating the links between common diseases and levels of physical activity, and providing feedback to motivate individuals to exercise. In addition, fall detection has become a hot research topic due to the increasing population over 65 throughout the world, as well as the serious effects and problems caused by falls.
In this work, the Sun SPOT wireless sensor system is used as the hardware platform to recognize physical activity and detect falls. Sensors with tri-axis accelerometers are used to collect acceleration data, which are further processed to extract useful information. The evaluation results from various algorithms indicate that the Naive Bayes algorithm works better than other popular algorithms, both in accuracy and in ease of implementation, in this particular application.
This wearable system works in two modes, indoor and outdoor, depending on the user's needs. The Naive Bayes classifier is successfully implemented on the Sun SPOT sensor. The sampling rate evaluation indicates that 20 Hz is an optimal sampling frequency in this application. If only one sensor is available to recognize physical activity, the best location is the thigh. If two sensors are available, placing them on the left thigh and the right thigh is the best option, with 90.52% overall accuracy in the experiment.
For fall detection, a master sensor is attached to the chest, and a slave sensor is attached to the thigh to collect acceleration data. The results show that all falls are successfully detected. Forward, backward, leftward and rightward falls have been distinguished from standing and walking using the fall detection algorithm. Normal physical activities are not misclassified as falls, and there are no false alarms in fall detection while the user is wearing the system in daily life.
|
29 |
Cross-Lingual Category Integration Technique
Tzeng, Guo-han 30 August 2006
With the emergence of the Internet, many innovative and interesting applications from different countries have been stimulated, and e-commerce is becoming more and more pervasive. Under this scenario, a tremendous amount of information expressed in different languages is exchanged and shared not only by organizations but also by individuals in the modern global environment. A large proportion of this information is typically formatted and available as textual documents and managed by using categories. Consequently, the development of a practical and effective technique for cross-lingual category integration (CLCI) has become an essential issue. Several category integration techniques have been proposed, but all of them deal with category integration involving only monolingual documents. In response, in this study, we combine existing cross-lingual text categorization techniques with an existing monolingual category integration technique (specifically, Enhanced Naive Bayes) and propose a CLCI solution to address cross-lingual category integration. Our empirical evaluation results show that the proposed CLCI technique is feasible and achieves superior effectiveness.
|
30 |
Applying Data Mining Techniques on Continuous Sensed Data: For daily living activity recognition
Li, Yunjie January 2014
Nowadays, with the rapid development of the Internet of Things, the application field of wearable sensors has been continuously expanded and extended, especially in areas such as remote electronic medical treatment and smart homes. Recognizing human daily activities from the sensing data is one of the challenges. With a variety of data mining techniques, the activities can be recognized automatically. But due to the diversity and complexity of the sensor data, not every data mining technique can be applied easily without a systematic analysis and improvement. In this thesis, several data mining techniques were used to analyze a continuous sensing dataset in order to recognize human daily activities. This work studied several data mining techniques and focused on three of them: Decision Tree, Naive Bayes, and neural network. These techniques were analyzed and compared according to their classification results. The paper also proposed some improvements to the data mining techniques according to the specific dataset. The comparison of the three classification results showed that each classifier has its own limitations and advantages. The proposed idea of combining the Decision Tree model with the neural network model significantly increased the classification accuracy in this experiment.
|