Global ETD Search

101	SVM-based algorithms for aligning ontologies using literature Xu, Wei January 2008 (has links) Ontologies are one of the key techniques used in establishing the Semantic Web. Many ontologies have now been developed, and it is critical to understand the relationships between the terms of the ontologies, i.e. we need to align the ontologies. This thesis deals with an approach for finding relationships between ontologies using literature by classifying documents related to terms in the ontologies. In this project the general method from [1] is used, but in the classifier generation part, a new SVM-based classifier is implemented using LPU and SVM-light. We evaluate our approach and compare it to previous approaches. ontology alignment text classifier SVM LPU SVM-light Computer science Datavetenskap
102	Segmentation of human ovarian follicles from ultrasound images acquired in vivo using geometric active contour models and a naïve Bayes classifier Harrington, Na 14 September 2007 Ovarian follicles are spherical structures inside the ovaries which contain developing eggs. Monitoring the development of follicles is necessary for both gynecological medicine (ovarian disease diagnosis and infertility treatment) and veterinary medicine (determining when to introduce superstimulation in cattle, or dividing herds into different stages in the estrous cycle). Ultrasound imaging provides a non-invasive method for monitoring follicles. However, manually detecting follicles from ovarian ultrasound images is time-consuming and sensitive to the observer's experience. Existing (semi-)automatic follicle segmentation techniques show the power of automation, but are not widely used due to their limited success. A new automated follicle segmentation method is introduced in this thesis. Human ovarian images acquired in vivo were smoothed using an adaptive neighbourhood median filter. Dark regions were initially segmented using geometric active contour models. Only some of these segmented dark regions were true follicles. A naïve Bayes classifier was applied to determine whether each segmented dark region was a true follicle or not. The Hausdorff distance between contours of the automatically segmented regions and the gold standard was 2.43 ± 1.46 mm per follicle, and the average root mean square distance per follicle was 0.86 ± 0.49 mm. Both the average Hausdorff distance and the root mean square distance were larger than those reported for other follicle segmentation algorithms. 
The mean absolute distance between contours of the automatically segmented regions and the gold standard was 0.75 ± 0.32 mm, which was below that reported for other follicle segmentation algorithms. The overall follicle recognition rate was 33% to 35%, and the overall image misidentification rate was 23% to 33%. If only follicles with diameter greater than or equal to 3 mm were considered, the follicle recognition rate increased to 60% to 63%, and the follicle misidentification rate increased slightly to 24% to 34%. The proposed follicle segmentation method was shown to be accurate in detecting a large number of follicles with diameter greater than or equal to 3 mm. Naïve Bayes Classifier Ultrasound Images Follicle Segmentation Geometric Active Contour Models
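The abstract above reports a Hausdorff distance between automatically segmented contours and a gold standard. As a rough illustration only, the sketch below computes the standard symmetric Hausdorff distance between two contours given as lists of (x, y) points. The point-list representation and the function names are assumptions made for this example; the thesis does not state how its contours were sampled or which variant of the metric it used.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from any point in `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance: the larger of the two directed distances."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# Hypothetical contours (coordinates in mm), not data from the thesis.
gold = [(0.0, 0.0), (1.0, 0.0)]
auto = [(0.0, 0.0), (1.0, 0.0), (1.0, 3.0)]

print(hausdorff(gold, auto))  # 3.0: the point (1, 3) lies 3 mm from the gold contour
```

Because the metric takes a maximum, a single stray contour point dominates the result, which is one reason a method can score worse on Hausdorff distance while scoring better on an averaged distance.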
103	A wearable real-time system for physical activity recognition and fall detection Yang, Xiuxin 23 September 2010 This thesis designs and implements a wearable system to recognize physical activities and detect falls in real time. Recognizing people's physical activity has a broad range of applications. These include helping people maintain their energy balance by developing health assessment and intervention tools, investigating the links between common diseases and levels of physical activity, and providing feedback to motivate individuals to exercise. In addition, fall detection has become a hot research topic due to the increasing population over 65 throughout the world, as well as the serious effects and problems caused by falls. In this work, the Sun SPOT wireless sensor system is used as the hardware platform to recognize physical activity and detect falls. Sensors with tri-axis accelerometers are used to collect acceleration data, which are further processed to extract useful information. The evaluation results from various algorithms indicate that the Naive Bayes algorithm works better than other popular algorithms in both accuracy and implementation in this particular application. This wearable system works in two modes, indoor and outdoor, depending on the user's demand. A Naive Bayes classifier is successfully implemented on the Sun SPOT sensor. The results of evaluating the sampling rate indicate that 20 Hz is an optimal sampling frequency in this application. If only one sensor is available to recognize physical activity, the best location for it is the thigh. If two sensors are available, the combination of the left thigh and the right thigh is the best option, with 90.52% overall accuracy in the experiment. For fall detection, a master sensor is attached to the chest, and a slave sensor is attached to the thigh to collect acceleration data. The results show that all falls are successfully detected. 
Forward, backward, leftward and rightward falls have been distinguished from standing and walking using the fall detection algorithm. Normal physical activities are not misclassified as falls, and there are no false alarms in fall detection while the user is wearing the system in daily life. Machine learning Accelerometer Naive Bayes classifier Fall detection Physical activity recognition
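To make the Naive Bayes step in the abstract above more concrete, here is a minimal Gaussian naive Bayes sketch in plain Python. Everything specific in it is an assumption for illustration: the two features (a per-window acceleration variance and a per-window mean magnitude), the toy training values, and the function names are hypothetical, and the abstract does not say which features or which Naive Bayes variant the thesis used.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(samples, labels):
    """Estimate a log prior plus per-feature mean and variance for each class."""
    grouped = defaultdict(list)
    for x, y in zip(samples, labels):
        grouped[y].append(x)
    model = {}
    for y, rows in grouped.items():
        columns = list(zip(*rows))
        means = [sum(c) / len(c) for c in columns]
        # A small constant keeps the variance strictly positive.
        variances = [sum((v - m) ** 2 for v in c) / len(c) + 1e-6
                     for c, m in zip(columns, means)]
        model[y] = (math.log(len(rows) / len(samples)), means, variances)
    return model

def predict(model, x):
    """Return the class with the highest log posterior for feature vector x."""
    def log_posterior(params):
        log_prior, means, variances = params
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances))
    return max(model, key=lambda y: log_posterior(model[y]))

# Hypothetical windows: (acceleration variance, mean magnitude in g).
train_x = [(0.02, 1.00), (0.03, 0.98), (0.01, 1.01),
           (0.40, 1.10), (0.50, 0.90), (0.45, 1.20)]
train_y = ["standing"] * 3 + ["walking"] * 3

model = fit_gaussian_nb(train_x, train_y)
print(predict(model, (0.02, 1.00)))
print(predict(model, (0.48, 1.00)))
```

Training reduces to a handful of means and variances per class, and prediction to a few additions per feature, which is consistent with the abstract's remark that the classifier could be implemented on the sensor itself.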
104	Feature Ranking for Text Classifiers Makrehchi, Masoud January 2007 (has links) Feature selection based on feature ranking has received much attention from researchers in the field of text classification. The major reasons are its scalability, ease of use, and fast computation. However, compared to search-based feature selection methods such as wrappers and filters, feature ranking methods suffer from poor performance. This is linked to their major deficiencies: (i) feature ranking is problem-dependent; (ii) they ignore term dependencies, including redundancies and correlation; and (iii) they usually fail on unbalanced data. When using feature ranking methods for dimensionality reduction, we should be aware of these drawbacks, which arise from the function of feature ranking methods. In this thesis, a set of solutions is proposed to handle the drawbacks of feature ranking and boost its performance. First, an evaluation framework called feature meta-ranking is proposed to evaluate ranking measures. The framework is based on a newly proposed Differential Filter Level Performance (DFLP) measure. It was proved that, in ideal cases, the performance of a text classifier is a monotonic, non-decreasing function of the number of features. Then we theoretically and empirically validate the effectiveness of DFLP as a meta-ranking measure to evaluate and compare feature ranking methods. The meta-ranking framework is also examined on a stopword extraction problem. We use the framework to select an appropriate feature ranking measure for building domain-specific stoplists. The proposed framework is evaluated with SVM and Rocchio text classifiers on six benchmark data sets. The meta-ranking method suggests that in searching for a proper feature ranking measure, the backward feature ranking is as important as the forward one. Second, we show that the destructive effect of term redundancy gets worse as we decrease the feature ranking threshold. 
This implies that for aggressive feature selection, effective redundancy reduction should be performed as well as feature ranking. An algorithm based on extracting term dependency links using an information-theoretic inclusion index is proposed to detect and handle term dependencies. The dependency links are visualized by a tree structure called a term dependency tree. By grouping the nodes of the tree into two categories, hub nodes and link nodes, a heuristic algorithm is proposed to handle the term dependencies by merging or removing the link nodes. The proposed method of redundancy reduction is evaluated with SVM and Rocchio classifiers on four benchmark data sets. According to the results, redundancy reduction is more effective on weak classifiers, since they are more sensitive to term redundancies. The results also suggest that for feature ranking methods which compact the information into a small number of features, aggressive feature selection is not recommended. Finally, to deal with class imbalance at the feature level using ranking methods, a local feature ranking scheme called the reverse discrimination approach is proposed. The proposed method is applied to a highly unbalanced social network discovery problem. In this case study, the problem of learning a social network is translated into a text classification problem using newly proposed actor and relationship modeling. Since social networks are usually sparse structures, the corresponding text classifiers become highly unbalanced. Experimental assessment of the reverse discrimination approach validates the effectiveness of the local feature ranking method in improving classifier performance when dealing with unbalanced data. The application itself suggests a new approach to learning social structures from textual data. Feature Ranking Feature Selection Information Theory Text Classifier Social Network Link Mining Electrical and Computer Engineering
105	Adaptive agents in the House of Quality Fent, Thomas January 1999 (has links) (PDF) Managing the information flow within a big organization is a challenging task. Moreover, in a distributed decision-making process conflicting objectives occur. In this paper, artificial adaptive agents are used to analyze this problem. The decision makers are implemented as Classifier Systems, and their learning process is simulated by Genetic Algorithms. To validate the outcomes we compared the results with the optimal solutions obtained by full enumeration. It turned out that the genetic algorithm indeed was able to generate useful rules that describe how the decision makers involved in new product development should react to the requests they are required to fulfill. (author's abstract) / Series: Working Papers SFB "Adaptive Information Systems and Modelling in Economics and Management Science"
109	Automatic Assignment of Protein Function with Supervised Classifiers Jung, Jae 16 January 2010 (has links) High-throughput genome sequencing and sequence analysis technologies have created the need for automated annotation and analysis of large sets of genes. The Gene Ontology (GO) provides a common controlled vocabulary for describing gene function. However, annotating proteins with GO terms is usually a tedious manual curation process carried out by trained professional annotators. With the wealth of genomic data that are now available, there is a need for accurate automated annotation methods. The overall objective of my research is to improve our ability to automatically annotate proteins with GO terms. The first method, Automatic Annotation of Protein Functional Class (AAPFC), employs protein functional domains as features and learns independent Support Vector Machine classifiers for each GO term. This approach relies only on protein functional domains as features, and demonstrates that statistical pattern recognition can outperform expert-curated mapping of protein functional domain features to protein functions. The second method, Predict of Gene Ontology (PoGO), is a meta-classification method that integrates multiple heterogeneous data sources. This method achieves better performance than the protein domain method can achieve alone. Apart from these two methods, several systems have been developed that employ pattern recognition to assign gene function using a variety of features, such as sequence similarity, presence of protein functional domains and gene expression patterns. Most of these approaches have not considered the hierarchical relationships among the terms in the form of a directed acyclic graph (DAG). The DAG represents the functional relationships between the GO terms, and thus it should be an important component of an automated annotation system. 
I describe a Bayesian network used as a multi-layered classifier that incorporates the relationships among GO terms found in the GO DAG. I also describe an inference algorithm for quickly assigning GO terms to unlabeled proteins. A comparative analysis of the method against previously described annotation systems shows that the method provides improved annotation accuracy when the performance on individual GO terms is compared. More importantly, this method enables the assignment of significantly more GO terms to more proteins than was previously possible. Protein Function Gene Annotation Gene Ontology Protein domain InterPro Multi-layered classifier
110	Empirical Evaluations of Different Strategies for Classification with Skewed Class Distribution Ling, Shih-Shiung 09 August 2004 (has links) Existing classification analysis techniques (e.g., decision tree induction) generally exhibit satisfactory classification effectiveness when dealing with data with a non-skewed class distribution. However, real-world applications (e.g., churn prediction and fraud detection) often involve highly skewed data in decision outcomes. Such a highly skewed class distribution, if not properly addressed, would imperil the resulting learning effectiveness. In this study, we empirically evaluate three different approaches, namely the under-sampling, the over-sampling and the multi-classifier committee approaches, for addressing classification with a highly skewed class distribution. Due to its popularity, C4.5 is selected as the underlying classification analysis technique. Based on 10 data sets with highly skewed class distributions, our empirical evaluations suggest that the multi-classifier committee generally outperformed the under-sampling and the over-sampling approaches, using the recall rate, precision rate and F1-measure as the evaluation criteria. Furthermore, for applications aiming at a high recall rate, use of the over-sampling approach is suggested. On the other hand, if the precision rate is the primary concern, adoption of the classification model induced directly from the original data sets is recommended. Classification Analysis Decision Tree Induction Multi-classifier Committee Approach Under-sampling Over-sampling Skewed Class Distribution
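The under-sampling and over-sampling strategies and the evaluation criteria named in the abstract above can be sketched as follows. This is a generic illustration under stated assumptions: rows are hypothetical (features, label) pairs, both strategies are implemented as simple random sampling with a fixed seed, and the minority class is assumed to be non-empty. The abstract does not specify how the study drew its samples, so none of this should be read as the study's procedure.

```python
import random

def split_by_class(rows, minority_label):
    """Separate (features, label) rows into minority and majority groups."""
    minority = [r for r in rows if r[1] == minority_label]
    majority = [r for r in rows if r[1] != minority_label]
    return minority, majority

def under_sample(rows, minority_label, seed=0):
    """Randomly discard majority rows until both classes have equal size."""
    rng = random.Random(seed)
    minority, majority = split_by_class(rows, minority_label)
    keep = min(len(minority), len(majority))
    return minority + rng.sample(majority, keep)

def over_sample(rows, minority_label, seed=0):
    """Randomly duplicate minority rows until both classes have equal size."""
    rng = random.Random(seed)
    minority, majority = split_by_class(rows, minority_label)
    extra = max(0, len(majority) - len(minority))
    return majority + minority + rng.choices(minority, k=extra)

def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall and F1 for the class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Hypothetical skewed data set: 8 negative rows, 2 positive rows.
rows = [(i, "neg") for i in range(8)] + [(i, "pos") for i in range(2)]
print(len(under_sample(rows, "pos")))  # 4 rows: 2 positive, 2 negative
print(len(over_sample(rows, "pos")))   # 16 rows: 8 positive, 8 negative
```

Under-sampling throws away majority examples, while over-sampling repeats minority examples; either resampled set would then be passed to the chosen learner (C4.5 in the study), and the three metrics would be computed on untouched test data.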
