Global ETD Search

11	Evaluating the use of neighborhoods for query dependent estimation of survival prognosis for oropharyngeal cancer patients Shay, Keegan P. 01 May 2019 (has links) Oropharyngeal Cancer diagnoses make up three percent of all cancer diagnoses in the United States per year. Recently, there has been an increase in the incidence of HPV-associated oropharyngeal cancer, necessitating updates to prior survival estimation techniques, in order to properly account for this shift in demographic. Clinicians depend on accurate survival prognosis estimates in order to create successful treatment plans that aim to maximize patient life while minimizing adverse treatment side effects. Additionally, recent advances in data analysis have resulted in richer and more complex data, motivating the use of more advanced data analysis techniques. Incorporation of sophisticated survival analysis techniques can leverage complex data, from a variety of sources, resulting in improved personalized prediction. Current survival prognosis prediction methods often rely on summary statistics and underlying assumptions regarding distribution or overall risk. We propose a k-nearest neighbor influenced approach for predicting oropharyngeal survival outcomes. We evaluate our approach for overall survival (OS), recurrence-free survival (RFS), and recurrence-free overall survival (RF+OS). We define two distance functions, not subject to the curse of dimensionality, in order to reconcile heterogeneous features with patient-to-patient similarity scores to produce a meaningful overall measure of distance. Using these distance functions, we obtain the k-nearest neighbors for each patient, forming neighborhoods of similar patients. We leverage these neighborhoods for prediction in two novel ensemble methods. The first ensemble method uses the nearest neighbors for each patient to combine globally trained predictions, weighted by their accuracies within a selected neighborhood. 
The second ensemble method combines Kaplan-Meier predictions from a variety of neighborhoods. Both proposed methods outperform an ensemble of standard global survival predictive models, with statistically significant calibration. k nearest neighbors local prediction Machine learning oropharyngeal QED Electrical and Computer Engineering
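As a rough illustration of the neighborhood idea in entry 11, the sketch below computes a Kaplan-Meier survival curve from only the k nearest neighbors of a query patient. It is not the thesis's method: plain Euclidean distance stands in for the two custom distance functions described in the abstract, and the names `knn_indices`, `kaplan_meier`, and `neighborhood_survival` are hypothetical.

```python
import math


def knn_indices(query, points, k):
    """Return indices of the k points closest to query.

    Assumption: Euclidean distance is only a stand-in for the thesis's
    own patient-to-patient distance functions.
    """
    order = sorted(range(len(points)), key=lambda i: math.dist(query, points[i]))
    return order[:k]


def kaplan_meier(durations, events):
    """Kaplan-Meier estimate as a list of (event_time, survival_probability).

    events[i] is 1 if the event was observed at durations[i], 0 if censored.
    """
    survival = 1.0
    curve = []
    # Only times at which an event was observed change the estimate.
    for t in sorted({d for d, e in zip(durations, events) if e}):
        at_risk = sum(1 for d in durations if d >= t)
        observed = sum(1 for d, e in zip(durations, events) if d == t and e)
        survival *= 1.0 - observed / at_risk
        curve.append((t, survival))
    return curve


def neighborhood_survival(query, features, durations, events, k):
    """Kaplan-Meier curve computed from the query's k nearest neighbors only."""
    idx = knn_indices(query, features, k)
    return kaplan_meier([durations[i] for i in idx], [events[i] for i in idx])
```

Restricting the estimate to a neighborhood trades sample size for similarity: a smaller k gives a more patient-specific but noisier curve.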
12	Social learning in labor markets and in real estate brokerage Gathright, Graton Marshal Randal. January 2010 (has links) Thesis (Ph. D.)--University of California, San Diego, 2010. / Title from first page of PDF file (viewed Feb. 19, 2010). Available via ProQuest Digital Dissertations. Vita. Includes bibliographical references (p. 59-60).
13	Classification of Twitter disaster data using a hybrid feature-instance adaptation approach Mazloom, Reza January 1900 (has links) Master of Science / Department of Computer Science / Doina Caragea / Huge amounts of data that are generated on social media during emergency situations are regarded as troves of critical information. The use of supervised machine learning techniques in the early stages of a disaster is challenged by the lack of labeled data for that particular disaster. Furthermore, supervised models trained on labeled data from a prior disaster may not produce accurate results. To address these challenges, domain adaptation approaches can be used; these learn models for the target by using unlabeled data from the target disaster in addition to labeled data from prior source disasters. However, the resulting models can still be affected by the variance between the target domain and the source domain. In this context, we propose to use a hybrid feature-instance adaptation approach based on matrix factorization and the k-nearest neighbors algorithm, respectively. The proposed hybrid adaptation approach is used to select a subset of the source disaster data that is representative of the target disaster. The selected subset is subsequently used to learn accurate supervised or domain adaptation Naïve Bayes classifiers for the target disaster. In other words, this study focuses on transforming the existing source data to bring it closer to the target data, thus overcoming the domain variance which may prevent effective transfer of information from source to target. A combination of selective and transformative methods is used on instances and features, respectively. We show experimentally that the proposed approaches are effective in transferring information from source to target. Furthermore, we provide insights into which types and combinations of selections and transformations result in more accurate models for the target. 
Tweet classification Domain adaptation Matrix factorization K-nearest neighbors Disaster response
14	Pattern Recognition applied to Continuous integration system. VANGALA, SHIVAKANTHREDDY January 2018 (has links) Context: This thesis focuses on regression testing in the continuous integration environment, that is, integration testing that ensures that changes made in new development code do not introduce new faults into the software product. Continuous integration is a software development practice that integrates all development, testing, and deployment activities. In continuous integration, regression testing is done by manually selecting and prioritizing testcases from a larger set of testcases. The main challenge with manual testcase selection and prioritization is that needed testcases are sometimes missing from the selected subset, because testers did not include them manually while designing the hourly-cycle regression test suite for a particular feature development in the product. So Ericsson, the company in whose environment this thesis is conducted, aims at improving its testcase selection and prioritization in regression testing using pattern recognition. Objectives: This thesis suggests prediction models that use pattern recognition algorithms to predict future testcase failures from historical data. This helps to improve the present quality of the continuous integration environment by selecting an appropriate subset of testcases from a larger set of testcases for regression testing. There exist several candidate pattern recognition algorithms that are promising for predicting testcase failures. Based on the characteristics of the data collected at Ericsson, suitable pattern recognition algorithms are selected and predictive models are built. Finally, two predictive models are evaluated and the best performing model is integrated into the continuous integration system. 
Methods: The experiment research method is chosen for this research because the discovery of cause-and-effect relationships between dependent and independent variables can be used to evaluate the predictive model. The experiment is conducted in RStudio, which is used to train the predictive models on continuous integration historical data. The predictive ability of the algorithms is evaluated using prediction accuracy metrics. Results: After implementing two predictive models (neural networks and k-nearest neighbors) using continuous integration data, neural networks achieved a prediction accuracy of 75.3% and k-nearest neighbors achieved 67.75%. Conclusions: This research investigated the feasibility of an adaptive and self-learning test machinery based on pattern recognition in a continuous integration environment, to improve testcase selection and prioritization in regression testing. Neural networks predicted failing testcases more accurately (75.3%) than k-nearest neighbors. A predictive model can make continuous integration more efficient only if it has 100% prediction capability; a prediction capability of 75.3% will not make the continuous integration system more efficient than the present static testcase selection and prioritization, because it still falls short by about 25%. So this research can only conclude that neural networks at present have 75.3% prediction capability, but in future, when more data is available, this may reach 100%. The present Ericsson continuous integration system needs to improve its storage of historical data; at present it can only store 30 days of historical data. The predictive models require large amounts of data to give good predictions. 
To support continuous integration, Ericsson currently uses the Jenkins automation server. Other automation servers, such as TeamCity, Travis CI, GoCD, and CircleCI, can store more than 30 days of data; using them would mitigate the data storage problem. continuous integration pattern recognition artificial neural network k-nearest neighbors Computer Sciences Datavetenskap (datalogi)
15	Automatic Pain Assessment from Infants’ Crying Sounds Pai, Chih-Yun 01 November 2016 (has links) Crying is what infants use to express their emotional state. It gives parents and nurses a criterion for understanding infants’ physiological state. Many researchers have analyzed infants’ crying sounds to diagnose specific diseases or determine the reasons for crying. This thesis presents an automatic crying level assessment system that classifies infants’ crying sounds, recorded under realistic conditions in the Neonatal Intensive Care Unit (NICU), as whimpering or vigorous crying. To analyze the crying signal, Welch’s method and Linear Predictive Coding (LPC) are used to extract spectral features; the average and the standard deviation of the frequency signal and the maximum power spectral density are the other spectral features used in classification. For classification, three state-of-the-art classifiers, namely K-nearest Neighbors, Random Forests, and Least Squares Support Vector Machine, are tested in this work. In the experiments, the highest accuracy in classifying whimpering versus vigorous crying is 90%, achieved on the clean dataset, sampled from 10 seconds before scoring to 5 seconds after scoring, with K-nearest Neighbors as the classifier. Whimpering Vigorous Crying K-Nearest Neighbors Random Forests Least Squares Support Vector Machines Computer Sciences
16	Nepal and Bhutan two similar nations with different strategic approach towards their big neighbors-India and China Ghimire, Anupama January 2021 (has links) There have been instances when powerful neighboring countries are seen as a difficulty for smaller ones. Moreover, the phenomenon of subjugation has its roots in the era of imperialism, and its lingering notion of superiority is still practiced by most developed and powerful countries. But the most vital thing to consider here is the smaller nations’ response to this dominance, which is sometimes resilient and sometimes completely submissive. In the era of globalization, where relationships between nations are entangled in a complex web of dependency, nations with limited resources, weak diplomacy and unstable politics are mostly compelled to yield to relatively large powers. And if the powerful states happen to be the immediate neighbors, then things may get more complex. In addition, the situation can be worse if the nation is a Least Developed Country (LDC hereafter) and also a landlocked state, like Nepal and Bhutan. This research paper intends to analyze the situation of two such nations, Nepal and Bhutan, which are squeezed between China, a rising global power, and India, an aspiring regional power. The interfering and controlling nature of these giants, exercised at times through diplomatic and coercive tactics, has been evident in both nations. But despite the similarities, these two small countries are seen to have adopted different strategies in dealing with their neighbors. Nepal has developed bilateral relations with both of its neighbors, whereas Bhutan has bilateral relations only with India and has still not welcomed China into its friendship zone; this puzzle drives the research paper. 
The paper attempts to understand the situation through the lens of realism, a theory which implies that the nation is the nucleus and that whatever action it undertakes is based on its own advantage and mostly concerned with the growth of its own power. Realism holds that a nation’s behavior does not follow utopian notions but is driven solely by self-interest. Furthermore, the paper has tried to make an analysis with the help of inductive theory. The research finds that realism alone is not sufficient to understand the small country’s perspective. There are many other factors that have contributed to the strategic choices these small countries have made in order to establish a certain kind of relationship with their neighbors. In addition, the area of study needs to be broadened in order to comprehend the situation completely. Land Locked Neighbors Realism Foreign Policies Domestic Politics Political Science Statsvetenskap
17	Hybrid Recommender Systems via Spectral Learning and a Random Forest Williams, Alyssa 01 December 2019 (has links) We demonstrate spectral learning can be combined with a random forest classifier to produce a hybrid recommender system capable of incorporating meta information. Spectral learning is supervised learning in which data is in the form of one or more networks. Responses are predicted from features obtained from the eigenvector decomposition of matrix representations of the networks. Spectral learning is based on the highest weight eigenvectors of natural Markov chain representations. A random forest is an ensemble technique for supervised learning whose internal predictive model can be interpreted as a nearest neighbor network. A hybrid recommender can be constructed by first deriving a network model from a recommender's similarity matrix then applying spectral learning techniques to produce a new network model. The response learned by the new version of the recommender can be meta information. This leads to a system capable of incorporating meta data into recommendations. similarity learning collaborative filtering nearest neighbors Databases and Information Systems Other Mathematics Theory and Algorithms
18	The Hornet’s Nest: Humanism, Neighbors, and Hatred in Renaissance Florence Maxson, Brian 09 July 2012 (has links) humanism neighbors Renaissance Florence History Cultural History European History Intellectual History
19	Disc : Approximative Nearest Neighbor Search using Ellipsoids for Photon Mapping on GPUs / Disc : Approximativ närmaste grannsökning med ellipsoider för fotonmappning på GPU:er Bergholm, Marcus, Kronvall, Viktor January 2016 (has links) Recent developments in Graphics Processing Units (GPUs) have enabled inexpensive high-performance computing for general-purpose applications. The K-Nearest Neighbors problem arises in applications ranging from classification in machine learning to the gathering of photons in the Photon Mapping algorithm for rendering three-dimensional scenes. Using the Euclidean distance measure when gathering photons can cause false bleeding of colors between surfaces. Ellipsoidal search boundaries for photon gathering have been shown to reduce artifacts due to this false bleeding. Shifted Sorting has been found to yield high performance on GPUs while retaining a high approximation rate. This study presents an algorithm for approximately solving the K-Nearest Neighbors problem, modified to use a distance measure that creates an ellipsoidal search boundary. The ellipsoidal search boundary is used to alleviate the issue of false bleeding of colors between surfaces in Photon Mapping. The approximate K-Nearest Neighbors algorithm presented is a modification of the Shifted Sorting algorithm. The algorithm is found to be highly parallelizable and achieves 86% of the queries processed per millisecond of a reference implementation that uses the spherical search boundaries implied by the Euclidean distance. The rate of compression from spherical to ellipsoidal search boundary, applied along the surface normal of the query point, is appropriately chosen in the range 3.0 to 7.0. The algorithm is found to scale well with respect to increases in both the number of data points and the number of query points. photon mapping k-nearest neighbors ellipsoid gpu parallel Computer Sciences Datavetenskap (datalogi)
20	An Analysis of Notions of Differential Privacy for Edge-Labeled Graphs / En analys av olika uppfattningar om differentiell integritet i grafer med kantetiketter Christensen, Robin January 2020 (has links) The user data in social media platforms is an excellent source of information that is beneficial for both commercial and scientific purposes. However, recent times have shown that user data is not always used for good, which has led to higher demands on user privacy. With accurate statistical research data being just as important as the privacy of the user data, the relevance of differential privacy has increased. Differential privacy allows user data to be accessible under certain privacy conditions at the cost of accuracy in query results, which is caused by noise. The noise is based on a tuneable constant ε and the global sensitivity of a query. The query sensitivity is defined as the greatest possible difference in query result between the queried database and a neighboring database. Whereas a neighboring database is defined to differ by one record in a tabular database, there are multiple neighborhood notions for edge-labeled graphs. This thesis considers the notions of edge neighborhood, node neighborhood, QL-edge neighborhood and QL-outedges neighborhood. To study these notions, a framework was developed in Java to function as a query mechanism for a graph database. ArangoDB was used as storage for the graphs, which were generated by parsing data sets in the RDF format as well as by a graph synthesizer in the developed framework. Querying a database in the framework is done with Apache TinkerPop, and a Laplace distribution is used when generating noise for the query results. The framework was used to study the privacy and utility trade-off of different histogram queries on a number of data sets, while employing the different notions of neighborhood in edge-labeled graphs. 
The level of privacy is determined by the value of ε, and the utility is defined as a measurement based on the L1-distance between the true and noisy result. In the general case, the notions of edge neighborhood and QL-edge neighborhood are the better alternatives in terms of privacy and utility. However, there are indications that node neighborhood and QL-outedges neighborhood are options worth considering for larger graphs, where the level of privacy for edge neighborhood and QL-edge neighborhood appears to be negligible based on utility measurements. Differential privacy Analysis framework Query mechanism Graph databases Graph neighbors Neighborhood notions Computer Engineering Datorteknik
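The noise mechanism described in entry 20 can be sketched as follows, assuming the standard Laplace mechanism with scale equal to global sensitivity divided by ε. The thesis's framework is written in Java; this Python sketch is independent of it, the function names are hypothetical, and the abstract does not give the exact utility formula, so only the underlying L1-distance is computed.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a zero-mean Laplace distribution.

    The difference of two independent exponential variables with mean
    `scale` follows a Laplace(0, scale) distribution.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_histogram(counts, sensitivity, epsilon, rng=None):
    """Add Laplace noise with scale sensitivity / epsilon to each bin.

    `sensitivity` is the global sensitivity of the histogram query under
    the chosen neighborhood notion (an input here, not computed).
    """
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    rng = rng or random.Random()
    scale = sensitivity / epsilon
    return [c + laplace_noise(scale, rng) for c in counts]


def l1_distance(true_counts, noisy_counts):
    """L1-distance between the true and noisy histogram."""
    return sum(abs(t - n) for t, n in zip(true_counts, noisy_counts))
```

A smaller ε or a larger sensitivity widens the noise distribution, which is the privacy and utility trade-off the abstract refers to: a neighborhood notion with higher sensitivity needs more noise for the same ε.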

Search results