651

USING SNP DATA TO PREDICT RADIATION TOXICITY FOR PROSTATE CANCER PATIENTS

Mirzazadeh, Farzaneh 06 1900 (has links)
Radiotherapy is often used to treat prostate cancer. While a high dose of radiation kills cancer cells, it can cause toxicity in healthy tissue for some patients. Ideally, this treatment would be applied only to patients who are unlikely to suffer such toxicity, which requires a classifier that can predict, before treatment, which patients are likely to exhibit severe toxicity. Here, we explore ways to use certain genetic features, called Single Nucleotide Polymorphisms (SNPs), for this task. This thesis applies several machine learning methods to learn such classifiers. The problem is challenging because there are a large number of features (164,273 SNPs) but only 82 samples. We explore an ensemble classification method for this problem, called Mixture Using Variance (MUV), which first learns several different base probabilistic classifiers, then, for each query, combines the responses of the base classifiers based on their respective variances. The original MUV learns the individual classifiers using bootstrap sampling of the training data; we modify this by considering a different subset of the features for each classifier. We derive a new combination rule for base classifiers in this setting and obtain some new theoretical results. Based on characteristics of our task, we propose an approach that first clusters the features, then selects a subset of features from each cluster for each base classifier. Unfortunately, we were unable to predict radiation toxicity in prostate cancer patients using the SNP values alone. However, our further experimental results reveal a strong relation between the correctness of a classifier's prediction and the variance of its response to the corresponding classification query, which shows that the main idea is promising.
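The variance-weighted combination at the core of MUV can be sketched briefly. The Python below is a minimal illustration, not the combination rule derived in the thesis: base logistic-regression classifiers are trained on random feature subsets (a hypothetical stand-in for the cluster-guided subset selection described above), and their probability estimates are combined with inverse-variance weights, using the Bernoulli variance p(1-p) as a simple proxy for the response variance.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_feature_subset_ensemble(X, y, n_models=10, subset_frac=0.1, seed=0):
    """Train base classifiers on random feature subsets (a hypothetical
    stand-in for the thesis's cluster-guided feature selection)."""
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(n_models):
        n_cols = max(1, int(subset_frac * X.shape[1]))
        cols = rng.choice(X.shape[1], size=n_cols, replace=False)
        clf = LogisticRegression(max_iter=1000).fit(X[:, cols], y)
        models.append((cols, clf))
    return models

def predict_muv(models, x):
    """Variance-weighted combination: each base model contributes its
    positive-class probability, weighted by the inverse of an estimated
    response variance p(1-p). This is one simple variance proxy, not
    necessarily the rule derived in the thesis."""
    probs = np.array([clf.predict_proba(x[cols].reshape(1, -1))[0, 1]
                      for cols, clf in models])
    var = probs * (1.0 - probs) + 1e-9           # Bernoulli variance proxy
    weights = (1.0 / var) / np.sum(1.0 / var)    # normalized inverse-variance weights
    return float(np.dot(weights, probs))         # combined P(severe toxicity)
```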
652

An Annotated Guide to the Songs of Karl Goldmark

Spivak, Mary Amanda 21 April 2008 (has links)
The purpose of this study was to examine and provide a pedagogical content analysis of the published songs of Karl Goldmark (1830-1915), an Austrian composer of the Romantic era. The songs' characteristics were evaluated to determine the level of singer for which they would be appropriate. An annotation format was devised for the analysis of each song, covering subject matter, difficulty level, range, tessitura, tempo indication, duration, and unique characteristics of the vocal and piano lines. The detailed entries let the voice teacher evaluate each song quickly and accurately and assess its value for a particular student, with particular attention to suitability for beginning, intermediate, and advanced singers. These levels generally correspond to freshman or sophomore, junior or senior, and graduate student, respectively. The results indicate a spread of difficulty levels across the songs, with moderate difficulty being the most common, and show that there are valuable songs for all levels of student. Areas for further study are included.
653

Towards a Framework For Resource Allocation in Networks

Ranasingha, Maththondage Chamara Sisirawansha 26 May 2009 (has links)
Network resources (such as bandwidth on a link) are not unlimited, and must be shared by all networked applications in some manner of fairness. This calls for the development and implementation of effective strategies that enable optimal utilization of these scarce network resources among the various applications that share the network. Although several rate controllers have been proposed in the literature to address the issue of optimal rate allocation, they do not appear to capture other factors that are of critical concern. For example, consider a battlefield data fusion application where a fusion center desires to allocate more bandwidth to incoming flows that are perceived to be more accurate and important. For such applications, network users should consider the transmission rates of other users in the process of rate allocation. Hence, a rate controller should respect application-specific rate coordination directives given by the underlying application. The work reported herein addresses how a rate controller may establish and maintain these directives. We identify three major challenges in meeting this objective. First, the application-specific performance measures must be formulated as rate coordination directives. Second, these directives must be incorporated into a rate controller; of course, the resulting rate controller must co-exist with ordinary rate controllers, such as TCP Reno, in a shared network. Finally, a mechanism for identifying the flows that require the rate coordination directives must be put in place. The first challenge is addressed by means of a utility function that allows the performance of the underlying application to be maximized. The second challenge is addressed by utilizing the Network Utility Maximization (NUM) framework. The standard utility function (i.e., the utility function of the standard rate controller) is augmented by inserting the application-specific utility function as an additive term. The rate allocation problem is then formulated as a constrained optimization problem, where the objective is to maximize the aggregate utility of the network. The gradient projection algorithm is used to solve the optimization problem, and the resulting solution is formulated and implemented as a window update function. To address the final challenge we resort to a machine learning algorithm: we demonstrate how data features estimated from only a fraction of a flow can be used as evidential input to a series of Bayesian Networks (BNs), and we account for the uncertainty introduced by partial flow data through the Dempster-Shafer (DS) evidential reasoning framework.
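The gradient projection solution can be illustrated on a single shared link. The sketch below assumes the standard proportional-fairness utility w_i log(x_i) and solves the problem through its dual (price) variable; the flow weights, capacity, and step size are illustrative, and the application-specific additive utility term and the window-update formulation of the thesis are omitted.

```python
import numpy as np

def num_rate_allocation(weights, capacity, gamma=0.01, iters=5000):
    """Gradient projection on the dual of a single-link NUM problem:
    maximize sum_i w_i * log(x_i) subject to sum_i x_i <= capacity.
    The log utility is a standard proportional-fairness choice; the
    thesis augments it with an application-specific additive term."""
    p = 1.0  # link price (dual variable)
    for _ in range(iters):
        x = weights / p  # utility-maximizing rates at the current price
        # projected price update: raise the price if demand exceeds capacity
        p = max(p + gamma * (x.sum() - capacity), 1e-6)
    return weights / p

# e.g. three flows sharing one link; the "more important" flow gets a
# larger weight and therefore a larger share of the capacity
rates = num_rate_allocation(np.array([1.0, 1.0, 2.0]), capacity=10.0)
print(rates)  # approximately [2.5, 2.5, 5.0]
```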
654

Vegetation changes in the Willamette River Greenway, Benton and Linn Counties, Oregon, 1972-1981

Wickramaratne, Siri Nimal. January 1983 (has links)
Thesis (M.S.)--Oregon State University, 1983. Typescript (photocopy). Includes bibliographical references (leaves 76-80). Also available via the World Wide Web.
655

Data aggregation for capacity management

Lee, Yong Woo 30 September 2004 (has links)
This thesis presents a methodology for data aggregation for capacity management. It is assumed that a company manufactures a very large number of products and that every product is stored in the database with its standard units-per-hour value and attributes that uniquely specify it. The methodology aggregates products into families based on the standard units-per-hour and finds a subset of attributes that unambiguously identifies each family. Data reduction and classification are achieved using well-known multivariate statistical techniques such as cluster analysis, variable selection, and discriminant analysis. The experimental results suggest that the proposed methodology achieves good data reduction.
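The two-step aggregation can be sketched on hypothetical data: products are clustered into families from the standard units-per-hour alone, and discriminant analysis then checks how well the attributes identify each family. The data, the cluster count, and the choice of k-means (the thesis names cluster analysis generically) are all illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
# Hypothetical products: units-per-hour drawn around three family means.
true_family = rng.integers(0, 3, size=300)
units_per_hour = (20 + 30 * true_family + rng.normal(0, 3, 300)).reshape(-1, 1)
# Coded attributes: two track the family, four are pure noise.
attributes = np.column_stack(
    [true_family + rng.normal(0, 0.4, 300) for _ in range(2)]
    + [rng.normal(0, 1, 300) for _ in range(4)])

# Step 1: aggregate products into families from units-per-hour alone.
families = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(units_per_hour)

# Step 2: discriminant analysis checks whether the attributes identify
# each family (variable selection would then prune the noise columns).
lda = LinearDiscriminantAnalysis().fit(attributes, families)
print("attribute-based family recovery:", lda.score(attributes, families))
```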
656

Evolutionary study of the Hox gene family with matrix-based bioinformatics approaches

Thomas-Chollier, Morgane 27 June 2008 (has links)
Hox transcription factors are extensively investigated in diverse fields of molecular and evolutionary biology. Hox genes belong to the family of homeobox transcription factors characterised by a 60-amino-acid region called the homeodomain. These genes are evolutionarily conserved and play crucial roles in the development of animals. In particular, they are involved in the specification of segmental identity and in tetrapod limb differentiation. In vertebrates, this family of genes can be divided into 14 groups of homology. Common methods to classify Hox proteins focus on the homeodomain, but classification is hampered by the high conservation of this short domain, and since phylogenetic tree reconstruction is time-consuming, it is not suitable for classifying the growing number of Hox sequences. The first goal of this thesis is therefore to design an automated approach to classify vertebrate Hox proteins into their groups of homology. This approach classifies Hox proteins on the basis of their scores for a combination of generalised protein profiles. The resulting program, HoxPred, combines predictive accuracy and time efficiency. We used this program to detect and classify Hox genes in several teleost fish genomes; in particular, it allowed us to clarify the evolutionary history of the HoxC1a genes in teleosts. Overall, HoxPred could efficiently contribute to the bioinformatics toolbox commonly used to annotate vertebrate Hox sequences. The program was then evaluated on non-vertebrate species. Although not intended for the classification of Hox proteins in distantly related species, HoxPred showed high accuracy in bilaterians. It has also given insights into the evolutionary relationships between bilaterian posterior Hox genes, which are notoriously difficult to classify with phylogenetic trees. As transcription factors, Hox proteins regulate target genes by specifically binding DNA at cis-regulatory elements, yet only a few of these target genes have been identified so far. The second goal of this work was therefore to evaluate whether computational approaches can detect Hox cis-regulatory elements in genomic sequences. Regulatory Sequence Analysis Tools (RSAT) is a suite of bioinformatics tools dedicated to the detection of cis-regulatory elements in genomes. We contributed to the development of matrix-based pattern-matching approaches in RSAT. After performing a statistical validation of the pattern-matching scores, we focused on a case study based on the vertebrate HoxB1 protein, which binds DNA with its cofactors Pbx and Meis, aiming to predict combinations of cis-regulatory elements for these three transcription factors.
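Matrix-based pattern matching of the kind implemented in RSAT can be sketched as follows: a position-specific scoring matrix is slid along a sequence, and windows with a high log-odds score are reported as candidate cis-regulatory elements. The matrix, sequence, and threshold below are toy values, not the actual Pbx/Meis/HoxB1 matrices used in the study.

```python
import numpy as np

# Toy position-specific scoring matrix (PSSM) over A, C, G, T; real
# matrices come from curated motif collections, not this sketch.
pssm = np.log2(np.array([
    [0.7, 0.1, 0.1, 0.1],   # position 1 favours A
    [0.1, 0.1, 0.7, 0.1],   # position 2 favours G
    [0.1, 0.1, 0.1, 0.7],   # position 3 favours T
]) / 0.25)                   # log-odds against a uniform background
base_index = {"A": 0, "C": 1, "G": 2, "T": 3}

def scan(sequence, matrix, threshold=2.0):
    """Slide the matrix along the sequence; report windows whose
    log-odds score passes the threshold (candidate binding sites)."""
    w = matrix.shape[0]
    hits = []
    for i in range(len(sequence) - w + 1):
        score = sum(matrix[j, base_index[sequence[i + j]]] for j in range(w))
        if score >= threshold:
            hits.append((i, round(score, 2)))
    return hits

print(scan("CCAGTAGTACC", pssm))  # reports the two AGT windows
```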
657

On Data Mining and Classification Using a Bayesian Confidence Propagation Neural Network

Orre, Roland January 2003 (has links)
The aim of this thesis is to describe how a statistically based neural network technology, here named BCPNN (Bayesian Confidence Propagation Neural Network), which may be identified by rewriting Bayes' rule, can be used within a few applications: data mining and classification with credibility intervals, as well as unsupervised pattern recognition. BCPNN is a neural network model somewhat reminiscent of the Bayesian decision trees often used within artificial intelligence systems. It has previously been successfully applied to classification tasks such as fault diagnosis, supervised pattern recognition, and hierarchical clustering, and has also been used as a model for cortical memory. The learning paradigm used in BCPNN differs from that of many other neural network architectures. Learning in, e.g., the popular backpropagation (BP) network is a gradient method on an error surface, whereas learning in BCPNN is based upon calculations of marginal and joint probabilities between attributes. This is a quite time-efficient process compared to, for instance, gradient learning. The weight values in BCPNN are also easy to interpret compared to those of many other network architectures, and these weights and their uncertainty are what we focus on in our data mining application. The most important results and findings in this thesis can be summarised in the following points:
• We demonstrate how BCPNN can be extended to model the uncertainties in collected statistics to produce outcomes as distributions, from two different aspects: uncertainties induced by sparse sampling, which is useful for data mining, and uncertainties due to input data distributions, which is useful for process modelling.
• We indicate how classification with BCPNN gives higher certainty than an optimal Bayes classifier and better precision than a naive Bayes classifier for limited data sets.
• We show how these techniques have been turned into a useful tool for real-world applications, within the drug safety area in particular.
• We present a simple but working method for automatic temporal segmentation of data sequences, and indicate some aspects of temporal tasks for which a Bayesian neural network may be useful.
• We present a method, based on recurrent BCPNN, which performs a task similar to that of an unsupervised clustering method on a large database with noisy, incomplete data, but much more quickly, with an efficiency in finding patterns comparable to that of a well-known Bayesian clustering method (AutoClass) when their performance is compared on artificial data sets. Apart from BCPNN being able to deal with really large data sets, because it is a global method working on collective statistics, we also obtain good indications that the outcome from BCPNN has higher clinical relevance than AutoClass in our application on the WHO database of adverse drug reactions, and it is therefore a relevant data mining tool to use on that database.
Keywords: artificial neural network, Bayesian neural network, data mining, adverse drug reaction signalling, classification, learning.
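The probability-based learning rule can be sketched concisely: a BCPNN-style weight between an input attribute and an output attribute is the log ratio of their joint probability to the product of their marginals, estimated from counts. The sketch below assumes a simple Dirichlet-style smoothing constant and omits the uncertainty modelling that the thesis develops; the example data are hypothetical case reports.

```python
import numpy as np

def bcpnn_weights(X, Y, alpha=1.0):
    """BCPNN-style weights from binary attribute matrices: the weight
    between input attribute i and output attribute j is
    log( P(x_i, y_j) / (P(x_i) * P(y_j)) ), estimated from counts with
    a smoothing constant alpha. A minimal sketch of the idea only; the
    thesis also models the uncertainty of these estimates."""
    n = X.shape[0]
    p_x = (X.sum(axis=0) + alpha) / (n + 2 * alpha)   # marginal P(x_i)
    p_y = (Y.sum(axis=0) + alpha) / (n + 2 * alpha)   # marginal P(y_j)
    p_xy = (X.T @ Y + alpha) / (n + 4 * alpha)        # joint P(x_i, y_j)
    return np.log(p_xy / np.outer(p_x, p_y))          # weight matrix w_ij

# e.g. X = drugs taken, Y = reactions reported, one row per case report
X = np.array([[1, 0], [1, 0], [0, 1], [1, 1]])
Y = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])
W = bcpnn_weights(X, Y)  # positive weight => co-occurrence above chance
```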
658

The Robust Classification of Hyperspectral Images Using Adaptive Wavelet Kernel Support Vector Data Description

Kollegala, Revathi 2012 May 1900 (has links)
Detection of targets in hyperspectral images is a specific case of one-class classification. It is particularly relevant in the area of remote sensing and has received considerable interest in the past few years. This thesis proposes the use of wavelet functions as kernels with Support Vector Data Description for target detection in hyperspectral images. Specifically, it proposes the Adaptive Wavelet Kernel Support Vector Data Description (AWK-SVDD), which learns the optimal wavelet function to be used given the target signature. The performance and computational requirements of AWK-SVDD are compared with those of existing methods and other wavelet functions. The thesis gives an introduction to target detection, particularly in the context of hyperspectral images, lists its contributions, and provides a brief mathematical background on one-class classification as it relates to target detection. It describes existing methods and introduces concepts essential to the proposed approach. The use of wavelet functions as kernels with Support Vector Data Description, the conditions under which wavelet functions can be used, and the combination of two functions to form the kernel are analyzed, and AWK-SVDD is described mathematically. Implementation details are given, along with results on the Urban hyperspectral image dataset with a random target signature. The results confirm the better performance of AWK-SVDD compared with conventional kernels, wavelet kernels, and the two-function Morlet-Radial Basis Function kernel. Convergence problems encountered during the Support Vector Data Description optimization are discussed, and the thesis concludes with suggestions for future work.
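A fixed-kernel version of the idea can be sketched with a Morlet wavelet kernel and a one-class SVM on a precomputed kernel matrix (a close relative of Support Vector Data Description for stationary, unit-diagonal kernels). This is only a sketch of the kernel substitution: the adaptive learning of the optimal wavelet, which is the thesis's contribution, is not attempted, and the data are synthetic stand-ins for target spectra.

```python
import numpy as np
from sklearn.svm import OneClassSVM

def morlet_kernel(A, B, a=1.0):
    """Morlet wavelet kernel: K(x, y) = prod_d cos(1.75*d/a) * exp(-d^2/(2a^2))
    over coordinate differences d, a common wavelet kernel choice."""
    diff = A[:, None, :] - B[None, :, :]  # pairwise coordinate differences
    k = np.cos(1.75 * diff / a) * np.exp(-diff**2 / (2 * a**2))
    return k.prod(axis=2)

rng = np.random.default_rng(0)
# Hypothetical target pixels: rows are spectra of the target class.
target = rng.normal(0, 0.1, size=(50, 8))

# One-class SVM on the precomputed wavelet kernel matrix.
K = morlet_kernel(target, target)
ocsvm = OneClassSVM(kernel="precomputed", nu=0.1).fit(K)

test = rng.normal(0, 0.1, size=(5, 8))
scores = ocsvm.decision_function(morlet_kernel(test, target))  # > 0 => target-like
```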
659

Improving Query Classification by Features’ Weight Learning

Abghari, Arash January 2013 (has links)
This work is an attempt to enhance query classification in call routing applications. A new method is introduced to learn feature weights from training data by means of a regression model. This work investigates the tf-idf weighting method, but the approach is not limited to a specific method and can be used with any weighting scheme. Empirical evaluations with several classifiers, including Support Vector Machines (SVM), Maximum Entropy, Naive Bayes, and k-Nearest Neighbor (k-NN), show substantial improvement in both macro and micro F1 measures.
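The baseline this work starts from, fixed tf-idf weights feeding a classifier, can be sketched in a few lines. The learned, regression-based reweighting that is the work's contribution is not reproduced here, and the routing data are invented examples.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical routing data: short caller utterances with destination labels.
queries = ["check my balance", "report a lost card", "balance on savings",
           "my card was stolen", "transfer money to savings", "wire money abroad"]
routes = ["balance", "card", "balance", "card", "transfer", "transfer"]

# Fixed tf-idf weights feeding an SVM; the learned reweighting step
# of the thesis would adjust these weights via a regression model.
clf = make_pipeline(TfidfVectorizer(), LinearSVC()).fit(queries, routes)
print(clf.predict(["what is my balance", "lost my card"]))
```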
660

Benchmark Evaluation of HOG Descriptors as Features for Classification of Traffic Signs

Fleyeh, Hasan, Roch, Janina January 2013 (has links)
The purpose of this paper is to analyze the performance of Histograms of Oriented Gradients (HOG) as descriptors for traffic sign recognition. The test dataset consists of speed-limit traffic signs because of their high inter-class similarity. HOG features of speed-limit signs extracted from different traffic scenes were computed, and a Gentle AdaBoost classifier was invoked to evaluate the different features. The performance of HOG was tested on a dataset of 1727 Swedish speed-sign images. Different numbers of HOG features per descriptor, ranging from 36 up to 396 features, were computed for each traffic sign in the benchmark testing. The results show that HOG features achieve a high classification rate, as the Gentle AdaBoost classification rate was 99.42%, and that they are suitable for real-time traffic sign recognition. However, changing the number of orientation bins was found to have an insignificant effect on the classification rate. In addition, HOG descriptors are not robust with respect to sign orientation.
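Extracting a HOG descriptor of the kind evaluated here is straightforward with scikit-image. The cell, block, and orientation-bin parameters below are one illustrative configuration, not necessarily those of the paper, and a stock grayscale image stands in for a cropped speed-limit sign.

```python
from skimage import data, transform
from skimage.feature import hog

# Any grayscale image stands in for a cropped speed-limit sign here.
image = transform.resize(data.camera(), (64, 64))

# 8x8 cells, 2x2 blocks, 9 orientation bins; the resulting descriptor
# length depends on these choices (the paper varied it from 36 to 396).
features = hog(image, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), feature_vector=True)
print(features.shape)  # descriptor length for this configuration
```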
