281 |
Automated Message Triage - A Proposal for Supervised Semantic Classification of Messages
Tavasoli, Amir 09 1900
<p>Classification, or categorization, is a text mining technique in which given text documents are assigned to specified categories. There are several techniques for classifying messages, ranging from simple K Nearest Neighbours to complicated Support Vector Machines. These classifiers have proven to be effective in cases where the documents in each category do not have a great deal of overlap with documents in other categories. Designing a classifier that is effective in environments where this overlap cannot be avoided, such as emails, text messages, or user opinions and comments, remains a continuing challenge. This work proposes a system that classifies such documents based on their content so they can be sorted by semantic significance. This has several real-world applications, such as triaging patient messages to physicians in the healthcare field or sorting user opinions on a product webpage. We have combined and tailored different classifiers to build a high-performance classifier that supports this type of classification. The system has been tested and shown to perform well on real-world user messages exchanged between patients and physicians during a hypertension prevention study.</p> / Master of Science (MS)
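As a rough illustration of the simplest technique named in the abstract above (K Nearest Neighbours), the sketch below classifies a message by majority vote among its most similar labelled messages. It is not the combined classifier the thesis proposes: the bag-of-words representation, the cosine similarity, the function names, and the tiny `TRAINING` set are all assumptions made for the example.

```python
import math
from collections import Counter


def vectorize(text):
    """Turn a message into a bag-of-words term-count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_classify(message, labelled, k=3):
    """Label a message by majority vote among its k most similar examples."""
    query = vectorize(message)
    ranked = sorted(
        labelled,
        key=lambda pair: cosine(query, vectorize(pair[0])),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training messages, invented for illustration only.
TRAINING = [
    ("my blood pressure reading was very high today", "urgent"),
    ("severe headache and chest pain since morning", "urgent"),
    ("feeling dizzy and blood pressure is high", "urgent"),
    ("thank you for the appointment reminder", "routine"),
    ("please reschedule my appointment to next week", "routine"),
    ("can you send the diet brochure again", "routine"),
]

if __name__ == "__main__":
    print(knn_classify("blood pressure is very high and I have chest pain", TRAINING))
```

A plain word-overlap measure like this struggles exactly where the abstract says classifiers struggle: when messages in different categories share most of their vocabulary.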
|
282 |
A New Incremental Classification Approach Monitoring The Risk of Heart Disease
Aghtar, Shima 10 1900
<p>Medical decision support systems are one of the main applications for data mining and machine learning techniques. Most of these systems involve solving a classification problem. Classification models can be generated by one of two types of learning algorithms: batch or incremental.</p> <p>A batch learning algorithm generates a classification model trained on the complete available data. Examples of batch learning algorithms are the C4.5 decision tree and multilayer perceptron neural network algorithms. In contrast, an incremental learning algorithm generates a classification model trained incrementally through batches of training data. Examples are Learn++ and DWMV Learn++. Incremental learning algorithms are effective for problems in the healthcare domain where the training data become available periodically over time or where the database is very large. In the health care system, heart disease is considered a major cause of death, and thus it is a domain requiring attention. Early screening of patients for heart disease before they actually have its symptoms could therefore be an effective way of decreasing the risk of this disease. Classification techniques can be employed to recognize patients who are at high risk of developing heart disease in order to send them for further attention or treatment by specialists.</p> <p>This work proposes an incremental learning algorithm, called modified DWMV Learn++, for primary care decision support that classifies patients as high risk or low risk based on certain risk factors. This system has been tested and shown to perform well on real-world patient clinical records.</p> / Master of Science (MSc)
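The batch-by-batch idea described above can be sketched with a small weighted-voting ensemble. This is not Learn++, DWMV Learn++, or the modified algorithm from the thesis: it is a simplified stand-in in which each new batch trains one nearest-centroid model whose vote is weighted by its error on that batch. The class name, the weighting rule, the features, and the data are assumptions for illustration.

```python
import math


def train_centroids(batch):
    """Fit a nearest-centroid model on one batch of (features, label) pairs."""
    sums, counts = {}, {}
    for features, label in batch:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict_centroids(model, features):
    """Return the label whose centroid is closest to the feature vector."""
    return min(model, key=lambda label: math.dist(model[label], features))


class IncrementalEnsemble:
    """Adds one weighted member per batch; earlier batches are never revisited."""

    def __init__(self):
        self.members = []  # list of (model, voting weight)

    def partial_fit(self, batch):
        model = train_centroids(batch)
        errors = sum(predict_centroids(model, x) != y for x, y in batch)
        # Clip the error rate so the log-odds weight stays finite (assumed rule).
        err = min(max(errors / len(batch), 0.01), 0.99)
        self.members.append((model, math.log((1 - err) / err)))

    def predict(self, features):
        votes = {}
        for model, weight in self.members:
            label = predict_centroids(model, features)
            votes[label] = votes.get(label, 0.0) + weight
        return max(votes, key=votes.get)


# Hypothetical (systolic pressure, age) records arriving in two batches.
BATCH_1 = [((120, 35), "low"), ((125, 40), "low"), ((160, 62), "high"), ((170, 68), "high")]
BATCH_2 = [((118, 30), "low"), ((165, 70), "high"), ((130, 45), "low"), ((175, 66), "high")]

if __name__ == "__main__":
    ensemble = IncrementalEnsemble()
    for batch in (BATCH_1, BATCH_2):
        ensemble.partial_fit(batch)
    print(ensemble.predict((168, 65)))
```

The point of the design is that `partial_fit` only ever sees the current batch, which is what makes the approach usable when records arrive periodically or the full database is too large to retrain on.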
|
283 |
Graph and Subspace Learning for Domain Adaptation
Shu, Le January 2015
In many practical problems, the instances in the training and test sets may be drawn from different distributions, so traditional supervised learning cannot achieve good performance on the new domain. Domain adaptation algorithms are therefore designed to bridge the distribution gap between training (source) data and test (target) data. In this thesis, I propose two graph learning and two subspace learning methods for domain adaptation. Graph learning methods use a graph to model pairwise relations between instances and then minimize the domain discrepancy based on the graphs directly. The first effort we make is to propose a novel locality preserving projection method for the domain adaptation task, which can find a linear mapping preserving the intrinsic structure of both source and target domains. We first construct two graphs encoding the neighborhood information for the source and target domains separately. We then find linear projection coefficients that have the locality preserving property for each graph. Instead of combining the two objective terms under a compatibility assumption and requiring the user to decide the importance of each objective function, we propose a multi-objective formulation for this problem and solve it simultaneously using Pareto optimization. Pareto optimization allows multiple objectives to compete with each other in deciding the optimal trade-off. We use generalized eigen-decomposition to find the Pareto frontier, which captures all possible good linear projection coefficients that are preferred by one or more objectives. The second effort is to directly improve the pairwise similarities between instances in the same domain as well as in different domains. We propose a novel method to solve the domain adaptation task in a transductive setting. The proposed method bridges the distribution gap between the source domain and the target domain through affinity learning.
It exploits the existence of a subset of data points in the target domain that are distributed similarly to the data points in the source domain. These data points act as the bridge that facilitates the propagation of data similarities across domains. We also propose to control the relative importance of intra- and inter-domain similarities to boost the similarity propagation. In our approach, we first construct the similarity matrix, which encodes both the intra- and inter-domain similarities. We then learn the true similarities among data points in the joint manifold using graph diffusion. We demonstrate that with improved similarities between source and target data, spectral embedding provides a better data representation, which boosts the prediction accuracy. Subspace learning methods aim to find a new coordinate system in which the domain discrepancy is minimized. In this thesis, we refer to subspace-based methods as those that model the domain shift between two subspaces directly. Our first effort is to propose a novel linear subspace learning approach for domain adaptation. Our key observation is that in many real-world problems, such as image classification with blurred test images or cross-domain text classification, domain shift can be modeled by a linear transformation between the source and target domains (intrinsically, a linear transformation between the two subspaces underlying the source and target data). Motivated by this observation, our method explicitly aligns the data in the two domains using a linear transformation while simultaneously finding a subspace that preserves the most data variance. With explicit data alignment, the subspace learning is formulated as the minimization of a PCA-like objective, which consists of two variables: the basis vectors of the common subspace and the linear transformation between the two domains.
We show that the optimization can be solved efficiently using an iterative algorithm based on alternating minimization, and prove its convergence to a local optimum. Our method can also integrate the label information of the source data, which further improves the robustness of the subspace learning and yields better prediction. Existing subspace-based domain adaptation methods assume that data lie in a single low-dimensional subspace. This assumption is too strong in many real-world applications, especially considering that the domain could be a mixture of latent domains with significant inner-domain variations that should not be neglected. In our second approach, the key idea is to assume the data lie in a union of multiple low-dimensional subspaces, which relaxes the common assumption above. We propose a novel two-step subspace-based domain adaptation algorithm: in the subspace discovery step, we cluster the source and target data using a subspace clustering algorithm and estimate the subspace for each cluster using principal component analysis; in the domain adaptation step, we propose a novel multiple subspace alignment (Multi-SA) algorithm, in which we identify one common subspace that aligns well with both source and target subspaces and therefore best preserves the variance for both domains. To solve this alignment problem jointly for multiple subspaces, we formulate it as an optimization problem that minimizes the weighted sum of multiple alignment costs. A higher weight is assigned to a source subspace if its label distribution is closer, as measured by KL divergence, to the overall label distribution. By putting more weight on those subspaces, the learned common subspace is able to preserve the distinctive information. / Computer and Information Science
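The weighting rule in the last sentences of the abstract above can be sketched as follows. The abstract only says that a smaller KL divergence from the overall label distribution earns a higher weight; the specific mapping `exp(-KL)`, the smoothing constant, the normalization, and all function names are assumptions made for this example, not the Multi-SA formulation.

```python
import math
from collections import Counter


def label_distribution(labels, classes, smoothing=1e-6):
    """Smoothed relative frequency of each class in a list of labels."""
    counts = Counter(labels)
    total = len(labels) + smoothing * len(classes)
    return [(counts[c] + smoothing) / total for c in classes]


def kl_divergence(p, q):
    """KL divergence D(p || q) for two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def subspace_weights(cluster_labels):
    """Weight each source cluster by closeness to the overall label mix."""
    all_labels = [label for labels in cluster_labels for label in labels]
    classes = sorted(set(all_labels))
    overall = label_distribution(all_labels, classes)
    # Assumed mapping: smaller divergence gives a larger raw weight.
    raw = [
        math.exp(-kl_divergence(label_distribution(labels, classes), overall))
        for labels in cluster_labels
    ]
    total = sum(raw)
    return [r / total for r in raw]


if __name__ == "__main__":
    # Hypothetical source clusters: one mixed, one containing a single class.
    clusters = [["a", "b", "a", "b"], ["a", "a", "a", "a"]]
    print(subspace_weights(clusters))
```

In this toy input the mixed cluster sits closer to the overall label distribution than the single-class cluster, so it receives the larger weight.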
|
284 |
Robust Speech Enhancement in the Time Domain
Pandey, Ashutosh 13 September 2022
No description available.
|
285 |
Learning with Imperfect Data and Supervision for Visual Perception and Understanding
Zhang, Cheng 02 September 2022
No description available.
|
286 |
The Simulation of the Behavior of a Student-Created Operating System using GPSS
Droucas, George 08 1900
<p>While operating system concepts are taught to students in undergraduate programs in Computer Science, a student project involving the development of an operating system creates a difficult situation due to time and financial considerations. Using GPSS to simulate the behavior of a student-created operating system can reduce these problems and serve as an effective learning device. Many features and concepts can be simulated that might otherwise be ignored in a student project. An implementation of a student-created operating system is discussed. Statistics collected from the GPSS simulated model are used to operating system.</p> / Master of Science (MS)
|
287 |
An Investigation and Implementation of Some Binary Search Tree Algorithms
Walker, Aldon N. 11 1900
<p>This project documents the results of an investigation into binary search trees. Because of their favourable characteristics, binary search trees have become popular for information storage and retrieval applications in a one-level store. The trees may be of two types, weighted and unweighted. Various algorithms are presented, in a machine-independent context, for both types, and an empirical evaluation is performed. An important software aid used for graphically displaying a binary tree is also described.</p> / Master of Science (MS)
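For readers unfamiliar with the structure, here is a minimal sketch of an unweighted binary search tree with insertion, lookup, and in-order traversal. It illustrates the general data structure only; the algorithms, names, and weighted variants studied in the project are not reproduced here.

```python
class Node:
    """One tree node holding a key and links to its two subtrees."""

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key into the subtree at root and return the subtree's root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root  # duplicate keys are ignored


def contains(root, key):
    """Walk down from the root, going left or right by comparison."""
    while root is not None:
        if key == root.key:
            return True
        root = root.left if key < root.key else root.right
    return False


def in_order(root):
    """Return all keys in ascending order."""
    if root is None:
        return []
    return in_order(root.left) + [root.key] + in_order(root.right)


if __name__ == "__main__":
    tree = None
    for key in [50, 30, 70, 20, 40]:
        tree = insert(tree, key)
    print(in_order(tree), contains(tree, 40), contains(tree, 60))
```

Lookup cost depends on the tree's height, which is why the shape produced by different insertion orders (and, in weighted trees, by access frequencies) matters for retrieval performance.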
|
288 |
Plotting on an Electrostatic Printer/Plotter
Bryce, Christopher A. 04 1900
<p>A survey of printers and plotters is given, and in particular the operation and capabilities of electrostatic printer/plotters are discussed. An implementation of a plotting system that plots on an electrostatic printer/plotter is presented. This plotting system is designed to be compatible with the Benson-Lehner plotting system. The standard "PLOT" routine is replaced by a two-pass system, which generates plots on the printer/plotter. In addition to the plotting system, an implementation of a graph utility is presented. This utility provides a single one-pass system that plots one or more functions (where each function has one value for each value of x).</p> / Master of Science (MS)
|
289 |
Measures of Association on a Bibliographic Data Base
Fox, Allan Donald 06 1900
<p>A critical survey of research done in automatic indexing and classification and in statistical linguistics, of importance to the study of bibliographic data bases, is given. A theory of measures of association in vector form is presented and applied using the FAMULUS system for storage and retrieval, and a data base in the social sciences constructed using that system. Certain conclusions are drawn regarding the usefulness of the various measures of association employed, and some areas of future research are given.</p> / Master of Science (MS)
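To make "measures of association in vector form" concrete, the sketch below computes three coefficients that are commonly used in information retrieval on term-occurrence vectors. The abstract does not say which measures the thesis employs, so the choice of cosine, Dice, and Jaccard, along with the example vectors, is an assumption.

```python
import math


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def cosine(x, y):
    """Cosine coefficient: inner product scaled by both vector lengths."""
    return dot(x, y) / math.sqrt(dot(x, x) * dot(y, y))


def dice(x, y):
    """Dice coefficient in vector form."""
    return 2 * dot(x, y) / (dot(x, x) + dot(y, y))


def jaccard(x, y):
    """Jaccard coefficient in vector form."""
    return dot(x, y) / (dot(x, x) + dot(y, y) - dot(x, y))


if __name__ == "__main__":
    # Hypothetical binary vectors: 1 means the index term occurs in the record.
    record_a = [1, 1, 0, 1]
    record_b = [1, 0, 0, 1]
    for measure in (cosine, dice, jaccard):
        print(measure.__name__, round(measure(record_a, record_b), 4))
```

All three reward shared terms but normalize differently, which is why they can rank the same pairs of records in different orders.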
|
290 |
A Four Factor Model for the Selection of a Systems Development Approach
Dececchi, Thomas January 1989
<p>The purpose of this research was to develop a model that would aid in selecting the best systems development approach for supplying a decision maker with a computer-based support system. The research proceeded in several stages. First, a hierarchical model was developed. The "top" level of the model described situations in terms of four factors or meta-constructs: User Participation in the Decision Making Process, Problem Space Complexity, Resource Availability, and Organizational Context. The set of factors was based on Churchman's systems theory and the organizational interaction represented by the Leavitt Diamond. In the "lower" level, the factors were each described by a set of attributes. The list of attributes was based on a literature search, aided by a model developed by Ginzberg and Stohr. Next, the model was validated in a three-phase process. The first phase involved validation of the model structure and content. A normative group technique (Delphi method) was chosen to obtain expert consensus on both the factors and the attributes that defined them. The second phase of the validation aided in content validation of the lower level of the model and associated a factor value with each unique set of attribute levels. It consisted of two sets of case-based interviews. Two of the factors had been defined as managerial in nature, and these interviews were conducted with senior administrative personnel. The other two factors had been defined as technical in nature, and the subjects of these interviews had a systems background. The third phase of the research aided in content validation of the "top" level of the model and determined which approaches were preferred in which situations (unique sets of factor values). It consisted of a set of case-based interviews with senior MIS personnel (including experienced academics) to assign the "best" or "preferred" approach to each of the situations (sets of factor values).
Based on the results of these studies, we have shown that it is possible to define situations in terms of a hierarchically ordered set of attributes, for the purpose of determining how best to provide computer-based support for the decision maker facing a particular situation.</p> / Doctor of Philosophy (PhD)
|