• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 57
  • 18
  • 13
  • 7
  • 5
  • 2
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • Tagged with
  • 121
  • 121
  • 64
  • 57
  • 49
  • 42
  • 28
  • 28
  • 27
  • 26
  • 24
  • 21
  • 20
  • 18
  • 17
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
21

Using Machine Learning to Detect Malicious URLs

Cheng, Aidan 01 January 2017 (has links)
There is a need for better predictive model that reduces the number of malicious URLs being sent through emails. This system should learn from existing metadata about URLs. The ideal solution for this problem would be able to learn from its predictions. For example, if it predicts a URL to be malicious, and that URL is deemed safe by the sandboxing environment, the predictor should refine its model to account for this data. The problem, then, is to construct a model with these characteristics that can make these predictions for the vast number of URLs being processed. Given that the current system does not employ machine learning methods, we intend to investigate multiple such models and summarize which of those might be worth pursuing on a large scale.
22

Categorização hierárquica de textos em um portal agregador de notícias

Borges, Hugo Lima January 2009 (has links)
Orientadora: Ana Carolina Lorena / Dissertação (mestrado) - Universidade Federal do ABC. Programa de Pós-Graduação em Engenharia da Informação, 2009
23

Filtering Social Tags for Songs based on Lyrics using Clustering Methods

Chawla, Rahul 21 July 2011 (has links)
In the field of Music Data Mining, Mood and Topic information has been considered as a high level metadata. The extraction of mood and topic information is difficult but is regarded as very valuable. The immense growth of Web 2.0 resulted in Social Tags being a direct interaction with users (humans) and their feedback through tags can help in classification and retrieval of music. One of the major shortcomings of the approaches that have been employed so far is the improper filtering of social tags. This thesis delves into the topic of information extraction from songs’ tags and lyrics. The main focus is on removing all erroneous and unwanted tags with help of other features. The hierarchical clustering method is applied to create clusters of tags. The clusters are based on semantic information any given pair of tags share. The lyrics features are utilized by employing CLOPE clustering method to form lyrics clusters, and Naïve Bayes method to compute probability values that aid in classification process. The outputs from classification are finally used to estimate the accuracy of a tag belonging to the song. The results obtained from the experiments all point towards the success of the method proposed and can be utilized by other research projects in the similar field.
24

Game theoretic and machine learning techniques for balancing games

Long, Jeffrey Richard 29 August 2006
Game balance is the problem of determining the fairness of actions or sets of actions in competitive, multiplayer games. This problem primarily arises in the context of designing board and video games. Traditionally, balance has been achieved through large amounts of play-testing and trial-and-error on the part of the designers. In this thesis, it is our intent to lay down the beginnings of a framework for a formal and analytical solution to this problem, combining techniques from game theory and machine learning. We first develop a set of game-theoretic definitions for different forms of balance, and then introduce the concept of a strategic abstraction. We show how machine classification techniques can be used to identify high-level player strategy in games, using the two principal methods of sequence alignment and Naive Bayes classification. Bioinformatics sequence alignment, when combined with a 3-nearest neighbor classification approach, can, with only 3 exemplars of each strategy, correctly identify the strategy used in 55\% of cases using all data, and 77\% of cases on data that experts indicated actually had a strategic class. Naive Bayes classification achieves similar results, with 65\% accuracy on all data and 75\% accuracy on data rated to have an actual class. We then show how these game theoretic and machine learning techniques can be combined to automatically build matrices that can be used to analyze game balance properties.
25

A wearable real-time system for physical activity recognition and fall detection

Yang, Xiuxin 23 September 2010
This thesis work designs and implements a wearable system to recognize physical activities and detect fall in real time. Recognizing peoples physical activity has a broad range of applications. These include helping people maintaining their energy balance by developing health assessment and intervention tools, investigating the links between common diseases and levels of physical activity, and providing feedback to motivate individuals to exercise. In addition, fall detection has become a hot research topic due to the increasing population over 65 throughout the world, as well as the serious effects and problems caused by fall.<p> In this work, the Sun SPOT wireless sensor system is used as the hardware platform to recognize physical activity and detect fall. The sensors with tri-axis accelerometers are used to collect acceleration data, which are further processed and extracted with useful information. The evaluation results from various algorithms indicate that Naive Bayes algorithm works better than other popular algorithms both in accuracy and implementation in this particular application.<p> This wearable system works in two modes: indoor and outdoor, depending on users demand. Naive Bayes classifier is successfully implemented in the Sun SPOT sensor. The results of evaluating sampling rate denote that 20 Hz is an optimal sampling frequency in this application. If only one sensor is available to recognize physical activity, the best location is attaching it to the thigh. If two sensors are available, the combination at the left thigh and the right thigh is the best option, 90.52% overall accuracy in the experiment.<p> For fall detection, a master sensor is attached to the chest, and a slave sensor is attached to the thigh to collect acceleration data. The results show that all falls are successfully detected. Forward, backward, leftward and rightward falls have been distinguished from standing and walking using the fall detection algorithm. Normal physical activities are not misclassified as fall, and there is no false alarm in fall detection while the user is wearing the system in daily life.
26

Game theoretic and machine learning techniques for balancing games

Long, Jeffrey Richard 29 August 2006 (has links)
Game balance is the problem of determining the fairness of actions or sets of actions in competitive, multiplayer games. This problem primarily arises in the context of designing board and video games. Traditionally, balance has been achieved through large amounts of play-testing and trial-and-error on the part of the designers. In this thesis, it is our intent to lay down the beginnings of a framework for a formal and analytical solution to this problem, combining techniques from game theory and machine learning. We first develop a set of game-theoretic definitions for different forms of balance, and then introduce the concept of a strategic abstraction. We show how machine classification techniques can be used to identify high-level player strategy in games, using the two principal methods of sequence alignment and Naive Bayes classification. Bioinformatics sequence alignment, when combined with a 3-nearest neighbor classification approach, can, with only 3 exemplars of each strategy, correctly identify the strategy used in 55\% of cases using all data, and 77\% of cases on data that experts indicated actually had a strategic class. Naive Bayes classification achieves similar results, with 65\% accuracy on all data and 75\% accuracy on data rated to have an actual class. We then show how these game theoretic and machine learning techniques can be combined to automatically build matrices that can be used to analyze game balance properties.
27

A wearable real-time system for physical activity recognition and fall detection

Yang, Xiuxin 23 September 2010 (has links)
This thesis work designs and implements a wearable system to recognize physical activities and detect fall in real time. Recognizing peoples physical activity has a broad range of applications. These include helping people maintaining their energy balance by developing health assessment and intervention tools, investigating the links between common diseases and levels of physical activity, and providing feedback to motivate individuals to exercise. In addition, fall detection has become a hot research topic due to the increasing population over 65 throughout the world, as well as the serious effects and problems caused by fall.<p> In this work, the Sun SPOT wireless sensor system is used as the hardware platform to recognize physical activity and detect fall. The sensors with tri-axis accelerometers are used to collect acceleration data, which are further processed and extracted with useful information. The evaluation results from various algorithms indicate that Naive Bayes algorithm works better than other popular algorithms both in accuracy and implementation in this particular application.<p> This wearable system works in two modes: indoor and outdoor, depending on users demand. Naive Bayes classifier is successfully implemented in the Sun SPOT sensor. The results of evaluating sampling rate denote that 20 Hz is an optimal sampling frequency in this application. If only one sensor is available to recognize physical activity, the best location is attaching it to the thigh. If two sensors are available, the combination at the left thigh and the right thigh is the best option, 90.52% overall accuracy in the experiment.<p> For fall detection, a master sensor is attached to the chest, and a slave sensor is attached to the thigh to collect acceleration data. The results show that all falls are successfully detected. Forward, backward, leftward and rightward falls have been distinguished from standing and walking using the fall detection algorithm. Normal physical activities are not misclassified as fall, and there is no false alarm in fall detection while the user is wearing the system in daily life.
28

Cross-Lingual Category Integration Technique

Tzeng, Guo-han 30 August 2006 (has links)
With the emergence of the Internet, many innovative and interesting applications from different countries have been stimulated and e-commerce is also getting more and more pervasive. Under this scenario, tremendous amount of information expressed in different languages are exchanged and shared by not only organizations but also individuals in the modern global environment. A large proportion of information is typically formatted and available as textual documents and managed by using categories. Consequently, the development of a practical and effective technique to deal with the problem of cross-lingual category integration (CLCI) becomes a very essential and important issue. Several category integration techniques have been proposed, but all of them deal with category integration involving only monolingual documents. In response, in this study, we combine the existing cross-lingual text categorization techniques with an existing monolingual category integration technique (specifically, Enhanced Naive Bayes) and proposed a CLCI solution to address cross-lingual category integration. Our empirical evaluation results show that our proposed CLCI technique demonstrates its feasibility and superior effectiveness.
29

Applying Data Mining Techniques on Continuous Sensed Data : For daily living activity recognition

Li, Yunjie January 2014 (has links)
Nowadays, with the rapid development of the Internet of Things, the applicationfield of wearable sensors has been continuously expanded and extended, especiallyin the areas of remote electronic medical treatment, smart homes ect. Human dailyactivities recognition based on the sensing data is one of the challenges. With avariety of data mining techniques, the activities can be automatically recognized. Butdue to the diversity and the complexity of the sensor data, not every kind of datamining technique can performed very easily, until after a systematic analysis andimprovement. In this thesis, several data mining techniques were involved in theanalysis of a continuous sensing dataset in order to achieve the objective of humandaily activities recognition. This work studied several data mining techniques andfocuses on three of them; Decision Tree, Naive Bayes and neural network, analyzedand compared these techniques according to the classification results. The paper alsoproposed some improvements to the data mining techniques according to thespecific dataset. The comparison of the three classification results showed that eachclassifier has its own limitations and advantages. The proposed idea of combing theDecision Tree model with the neural network model significantly increased theclassification accuracy in this experiment.
30

Disaster tweet classification using parts-of-speech tags: a domain adaptation approach

Robinson, Tyler January 1900 (has links)
Master of Science / Department of Computer Science / Doina Caragea / Twitter is one of the most active social media sites today. Almost everyone is using it, as it is a medium by which people stay in touch and inform others about events in their lives. Among many other types of events, people tweet about disaster events. Both man made and natural disasters, unfortunately, occur all the time. When these tragedies transpire, people tend to cope in their own ways. One of the most popular ways people convey their feelings towards disaster events is by offering or asking for support, providing valuable information about the disaster, and voicing their disapproval towards those who may be the cause. However, not all of the tweets posted during a disaster are guaranteed to be useful or informative to authorities nor to the general public. As the number of tweets that are posted during a disaster can reach the hundred thousands range, it is necessary to automatically distinguish tweets that provide useful information from those that don't. Manual annotation cannot scale up to the large number of tweets, as it takes significant time and effort, which makes it unsuitable for real-time disaster tweet annotation. Alternatively, supervised machine learning has been traditionally used to learn classifiers that can quickly annotate new unseen tweets. But supervised machine learning algorithms make use of labeled training data from the disaster of interest, which is presumably not available for a current target disaster. However, it is reasonable to assume that some amount of labeled data is available for a prior source disaster. Therefore, domain adaptation algorithms that make use of labeled data from a source disaster to learn classifiers for the target disaster provide a promising direction in the area of tweet classification for disaster management. In prior work, domain adaptation algorithms have been trained based on tweets represented as bag-of-words. In this research, I studied the effect of Part of Speech (POS) tag unigrams and bigrams on the performance of the domain adaptation classifiers. Specifically, I used POS tag unigram and bigram features in conjunction with a Naive Bayes Domain Adaptation algorithm to learn classifiers from source labeled data together with target unlabeled data, and subsequently used the resulting classifiers to classify target disaster tweets. The main research question addressed through this work was if the POS tags can help improve the performance of the classifiers learned from tweet bag-of-words representations only. Experimental results have shown that the POS tags can improve the performance of the classifiers learned from words only, but not always. Furthermore, the results of the experiments show that POS tag bigrams contain more information as compared to POS tag unigrams, as the classifiers learned from bigrams have better performance than those learned from unigrams.

Page generated in 0.0673 seconds