  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
31

DIMENSIONALITY REDUCTION FOR DATA DRIVEN PROCESS MODELING

DWIVEDI, SAURABH January 2003 (has links)
No description available.
32

High Level Design Methodology for Reconfigurable Systems

Ding, Mingwei January 2005 (has links)
No description available.
33

AN EVALUATION OF DIMENSIONALITY REDUCTION ON CELL FORMATION EFFICACY

Sharma, Vikas Manesh 28 August 2007 (has links)
No description available.
34

Multi-Platform Genomic Data Fusion with Integrative Deep Learning

Oni, Olatunji January 2019 (has links)
The abundance of next-generation sequencing (NGS) data has encouraged the adoption of machine learning methods to aid in the diagnosis and treatment of human disease. In particular, the last decade has shown the extensive use of predictive analytics in cancer research due to the prevalence of rich cellular descriptions of genetic and transcriptomic profiles of cancer cells. Despite the availability of wide-ranging forms of genomic data, few predictive models are designed to leverage multidimensional data sources. In this paper, we introduce a deep learning approach using neural network based information fusion to facilitate the integration of multi-platform genomic data, and the prediction of cancer cell sub-class. We propose the dGMU (deep gated multimodal unit), a series of multiplicative gates that can learn intermediate representations between multi-platform genomic data and improve cancer cell stratification. We also provide a framework for interpretable dimensionality reduction and assess several methods that visualize and explain the decisions of the underlying model. Experimental results on nine cancer types and four forms of NGS data (copy number variation, simple nucleotide variation, RNA expression, and miRNA expression) showed that the dGMU model improved the classification agreement of unimodal approaches and outperformed other fusion strategies in class accuracy. The results indicate that deep learning architectures based on multiplicative gates have the potential to expedite representation learning and knowledge integration in the study of cancer pathogenesis. / Thesis / Master of Science (MSc)
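The multiplicative-gate idea behind the dGMU can be sketched in a few lines of NumPy. This is an illustrative single-layer gated fusion unit, not the author's architecture; all names, dimensions, and the per-modality sigmoid gating are assumptions for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

def gated_multimodal_unit(modalities, W_h, W_z):
    """Fuse per-modality feature vectors with learned multiplicative gates.

    modalities: list of 1-D arrays, one per genomic platform
    W_h: per-modality projection matrices into a shared hidden space
    W_z: gate weight matrix applied to the concatenated inputs
    """
    h = [np.tanh(W @ x) for W, x in zip(W_h, modalities)]   # modality representations
    x_cat = np.concatenate(modalities)
    z = 1.0 / (1.0 + np.exp(-(W_z @ x_cat)))                # one sigmoid gate per modality
    return sum(z_i * h_i for z_i, h_i in zip(z, h))         # gated fusion

# Toy example: CNV (20-d), SNV (15-d), and RNA (30-d) vectors fused into an 8-d representation
dims, hidden = [20, 15, 30], 8
x = [rng.normal(size=d) for d in dims]
W_h = [rng.normal(size=(hidden, d)) for d in dims]
W_z = rng.normal(size=(len(dims), sum(dims)))
fused = gated_multimodal_unit(x, W_h, W_z)
print(fused.shape)  # (8,)
```

In a trained model the gates learn how much each platform should contribute to the fused representation for a given sample, which is the intermediate-representation learning the abstract describes.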
35

Andromeda in Education: Studies on Student Collaboration and Insight Generation with Interactive Dimensionality Reduction

Taylor, Mia Rachel 04 October 2022 (has links)
Andromeda is an interactive visualization tool that projects high-dimensional data into a scatterplot-like visualization using Weighted Multidimensional Scaling (WMDS). The visualization can be explored through surface-level interaction (viewing data values), parametric interaction (altering underlying parameterizations), and observation-level interaction (directly interacting with projected points). This thesis presents analyses on the collaborative utility of Andromeda in a middle school class and the insights college-level students generate when using Andromeda. The first study discusses how a middle school class collaboratively used Andromeda to explore and compare their engineering designs. The students analyzed their designs, represented as high-dimensional data, as a class. This study shows promise for introducing collaborative data analysis to middle school students in conjunction with other technical concepts such as the engineering design process. Participants in the study on college-level students were given a version of Andromeda, with access to different interactions, and were asked to generate insights on a dataset. By applying a novel visualization evaluation methodology on students' natural language insights, the results of this study indicate that students use different vocabulary supported by the interactions available to them, but not equally. The implications, as well as limitations, of these two studies are further discussed. / Master of Science / Data is often high-dimensional. A good example of this is a spreadsheet with many columns. Visualizing high-dimensional data is a difficult task because it must capture all information in 2 or 3 dimensions. Andromeda is a tool that can project high-dimensional data into a scatterplot-like visualization. Data points that are considered similar are plotted near each other and vice versa. 
Users can alter how important certain parts of the data are to the plotting algorithm, as well as move points directly to update the display based on the user-specified layout. These interactions within Andromeda allow data analysts to explore high-dimensional data based on their personal sensemaking processes. As high-dimensional thinking and exploratory data analysis are introduced into more classrooms, it is important to understand the ways in which students analyze high-dimensional data. To that end, this thesis presents two studies. The first study discusses how a middle school class used Andromeda for their engineering design assignments. The results indicate that using Andromeda in a collaborative way enriched the students' learning experience. The second study analyzes how college-level students, when given access to different interaction types in Andromeda, generate insights into a dataset. Students use different vocabulary supported by the interactions available to them, but not equally. The implications, as well as limitations, of these two studies are further discussed.
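The weighted projection at the heart of Andromeda can be approximated in NumPy. This is a hedged sketch only: WMDS proper optimizes a weighted stress function iteratively, whereas the version below simply scales each attribute by its weight and applies classical MDS, which conveys the same intuition that raising a weight makes that attribute matter more in the layout:

```python
import numpy as np

def weighted_classical_mds(X, w, out_dim=2):
    """Project rows of X into out_dim dimensions via classical MDS
    on per-attribute weighted Euclidean distances."""
    Xw = X * np.sqrt(w)                      # scale columns by sqrt(weight)
    D2 = np.sum((Xw[:, None, :] - Xw[None, :, :]) ** 2, axis=-1)  # squared distances
    n = len(X)
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ D2 @ J                    # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)
    idx = np.argsort(vals)[::-1][:out_dim]   # keep the top eigenpairs
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0))

rng = np.random.default_rng(1)
X = rng.normal(size=(10, 5))                 # 10 points, 5 attributes
P = weighted_classical_mds(X, w=np.array([1.0, 1.0, 0.1, 0.1, 0.1]))
print(P.shape)  # (10, 2)
```

Parametric interaction corresponds to editing `w` and re-projecting; observation-level interaction would instead solve the inverse problem of finding weights that explain a user's rearrangement of points.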
36

Interpretation of Dimensionality Reduction with Supervised Proxies of User-defined Labels

Leoni, Cristian January 2021 (has links)
Research on Machine Learning (ML) explainability has received considerable attention in recent times. The interest, however, has mostly centered on supervised models, while other ML fields have not had the same level of attention. Despite its usefulness in a variety of fields, unsupervised learning explainability is still an open issue. In this paper, we present a Visual Analytics framework based on eXplainable AI (XAI) methods to support the interpretation of dimensionality reduction (DR) methods. The framework provides the user with an interactive and iterative process to investigate and explain user-perceived patterns for a variety of DR methods, using XAI methods to explain a supervised model trained on the selected data. To evaluate the effectiveness of the proposed solution, we focus on two main aspects: the quality of the visualization and the quality of the explanation. This challenge is tackled using both quantitative and qualitative methods, and, due to the lack of pre-existing test data, a new benchmark has been created. The quality of the visualization is established using a well-known survey-based methodology, while the quality of the explanation is evaluated using both case studies and a controlled experiment, in which the accuracy of the generated explanations is evaluated on the proposed benchmark. The results show a strong capacity of our framework to generate accurate explanations, with an accuracy of 89% in the controlled experiment. The explanations generated for the two case studies yielded very similar results when compared with pre-existing, well-known ground truths in the literature. Finally, the user experiment produced high overall scores for all assessed aspects of the visualization.
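The core supervised-proxy idea — a user labels a pattern they perceive in the projection, a supervised model is fit on those labels over the original features, and that model's structure becomes the explanation — can be sketched minimally. This stand-in uses logistic-regression weights as the explanation; the thesis's actual XAI methods and framework are more elaborate, and all names here are illustrative:

```python
import numpy as np

def explain_selection(X, labels, steps=2000, lr=0.1):
    """Fit a logistic-regression proxy for a user-defined binary labeling
    and return per-feature importances (absolute weights)."""
    X = (X - X.mean(0)) / (X.std(0) + 1e-9)      # standardize so weights are comparable
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted probability
        g = p - labels                            # gradient of the log loss
        w -= lr * (X.T @ g) / len(X)
        b -= lr * g.mean()
    return np.abs(w)

# Toy data: feature 0 drives the user-perceived split in the projection; feature 1 is noise
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 2))
labels = (X[:, 0] > 0).astype(float)              # the user's labeling of the pattern
importance = explain_selection(X, labels)
print(importance[0] > importance[1])
```

The same pattern generalizes to richer proxies (trees, SHAP-style attributions) without changing the workflow: label, fit, explain, iterate.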
37

Computational analysis of facial expressions

Shenoy, A. January 2010 (has links)
This PhD work constitutes a series of inter-disciplinary studies that use biologically plausible computational techniques and experiments with human subjects to analyze facial expressions. The performance of the computational models and of the human subjects is analyzed in terms of accuracy and response time. The computational models process images in three stages: preprocessing, dimensionality reduction, and classification. Preprocessing of the face expression images is based on feature extraction with Gabor filters, chosen as the most biologically plausible of the computational methods considered. Several dimensionality reduction methods are then applied: Principal Component Analysis (PCA), Curvilinear Component Analysis (CCA), and Fisher Linear Discriminant (FLD), followed by classification with Support Vector Machines (SVM) and Linear Discriminant Analysis (LDA). The six basic prototypical facial expressions that are universally accepted are used for the analysis: angry, happy, fear, sad, surprise, and disgust. The performance of the computational models in classifying each expression category is compared with that of the human subjects. The Effect size and Encoding face enable the discrimination of the areas of the face specific to a particular expression; the Effect size in particular emphasizes the areas of the face that are involved during the production of an expression. This use of Effect size on faces has not been reported previously in the literature and has shown very interesting results. The detailed PCA analysis identified the significant PCA components specific to each of the six basic prototypical expressions. An important observation from this analysis was that with Gabor filtering followed by non-linear CCA for dimensionality reduction, the dataset vector size may be reduced to a very small number of components, in most cases just 5.
The hypothesis that the average response time (RT) of the human subjects in classifying the different expressions is analogous to the distance of the data points from the classification hyperplane was verified: the harder a facial expression is for human subjects to classify, the closer it lies to the classifier's hyperplane. A bivariate correlation analysis of the distance measure and the average RT showed a significant anti-correlation. Signal detection theory (SDT), via the d-prime measure, determined how well the model or the human subjects distinguished an expressive face from a neutral one. In comparison, human subjects are better at classifying the surprise, disgust, fear, and sad expressions, while the RAW computational model is better at distinguishing the angry and happy expressions. To summarize, there seem to be some similarities between the computational models and the human subjects in the classification process.
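The preprocessing-reduction-classification pipeline described above can be outlined with a small PCA step in NumPy. This is a generic sketch under stated assumptions: the Gabor filtering and the SVM/LDA classifiers are elided, and the feature vectors are random stand-ins for filtered face images:

```python
import numpy as np

def pca_reduce(X, k):
    """Center the data and keep the top-k principal components."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)  # rows of Vt are components
    return Xc @ Vt[:k].T, Vt[:k]                       # scores and components

# Toy stand-in for Gabor feature vectors of face images: 100 images, 256 features
rng = np.random.default_rng(3)
features = rng.normal(size=(100, 256))
scores, components = pca_reduce(features, k=5)  # ~5 components sufficed after CCA in the thesis
print(scores.shape)  # (100, 5)
```

The reduced `scores` matrix is what a downstream SVM or LDA classifier would be trained on; the distance of each score vector from the learned hyperplane is the quantity correlated with human response time above.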
38

Dimension Reduction Techniques in Morphometrics

Kratochvíl, Jakub January 2011 (has links)
This thesis centers on dimensionality reduction and its use on landmark-type data, which are often used in anthropology and morphometrics. In particular, we focus on non-linear dimensionality reduction methods: locally linear embedding and multidimensional scaling. We introduce a new approach to dimensionality reduction, called multipass dimensionality reduction, and show that it improves the quality of classification while requiring fewer dimensions for successful classification than traditional singlepass methods.
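Of the two methods named above, locally linear embedding admits a compact NumPy rendition. This is standard single-pass LLE, not the thesis's multipass variant (which is not specified here); neighbor count, regularization, and the landmark dimensions are illustrative assumptions:

```python
import numpy as np

def lle(X, n_neighbors=8, out_dim=2, reg=1e-3):
    """Minimal locally linear embedding: reconstruct each point from its
    neighbors, then find the low-dimensional embedding preserving those weights."""
    n = len(X)
    D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(D[i])[1:n_neighbors + 1]      # skip the point itself
        Z = X[nbrs] - X[i]                              # neighbors in local coordinates
        C = Z @ Z.T
        C += reg * np.trace(C) * np.eye(n_neighbors)    # regularize the local Gram matrix
        w = np.linalg.solve(C, np.ones(n_neighbors))
        W[i, nbrs] = w / w.sum()                        # normalized reconstruction weights
    M = (np.eye(n) - W).T @ (np.eye(n) - W)
    vals, vecs = np.linalg.eigh(M)
    return vecs[:, 1:out_dim + 1]                       # drop the constant eigenvector

# Toy stand-in for flattened landmark configurations: 60 specimens, 10 coordinates each
rng = np.random.default_rng(4)
landmarks = rng.normal(size=(60, 10))
Y = lle(landmarks)
print(Y.shape)  # (60, 2)
```

A multipass scheme, as the abstract sketches it, would chain reductions like this one rather than jumping to the target dimensionality in a single step.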
39

On Dimensionality Reduction of Data

Vamulapalli, Harika Rao 05 August 2010 (has links)
The random projection method is an important tool for the dimensionality reduction of data and can be made efficient with strong error guarantees. In this thesis, we focus on linear transforms of high-dimensional data into a low-dimensional space satisfying the Johnson-Lindenstrauss lemma. In addition, we prove some theoretical results about these projections that are of interest in practical applications. We show how the technique can be applied to synthetic data with a probabilistic guarantee on the pairwise distances. The connection between dimensionality reduction and compressed sensing is also discussed.
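The Johnson-Lindenstrauss guarantee the abstract refers to is easy to observe empirically: a scaled Gaussian random matrix preserves pairwise distances with high probability when the target dimension is on the order of log(n) / ε². A minimal sketch on synthetic data (the specific dimensions are arbitrary choices, not the thesis's experiments):

```python
import numpy as np
from itertools import combinations

def random_project(X, k, rng):
    """Gaussian random projection R^d -> R^k, scaled so that distances
    are preserved in expectation (Johnson-Lindenstrauss)."""
    d = X.shape[1]
    R = rng.normal(size=(d, k)) / np.sqrt(k)
    return X @ R

rng = np.random.default_rng(5)
X = rng.normal(size=(20, 1000))          # 20 points in 1000 dimensions
Y = random_project(X, k=256, rng=rng)    # projected to 256 dimensions

# Empirical distortion over all pairs: ratios of projected to original distances
dist = lambda A, i, j: np.linalg.norm(A[i] - A[j])
ratios = [dist(Y, i, j) / dist(X, i, j) for i, j in combinations(range(20), 2)]
print(min(ratios), max(ratios))          # both should be close to 1
```

Because the projection matrix is data-independent, it can be drawn once and applied to a stream of points, which is part of what makes the method efficient.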
40

Topic extraction based on association rule clustering

Santos, Fabiano Fernandes dos 29 May 2015 (has links)
A structured representation of documents in a format appropriate for automatic knowledge extraction, without loss of relevant information relative to the original unstructured form, is one of the most important steps of text mining, since the quality of the results obtained with automatic approaches to knowledge extraction from text is strongly related to the quality of the attributes selected to represent the collection of documents. The Vector Space Model (VSM) is a traditional structured representation of documents. In this model, each document is represented as a vector of weights corresponding to the features of the document. The bag-of-words model is the most popular VSM approach because of its simplicity and general applicability. However, the bag-of-words model does not capture dependencies between terms and has high dimensionality. Several models for document representation have been proposed in the literature to capture the dependence among terms, notably models based on phrases or compound terms, the Generalized Vector Space Model (GVSM) and its extensions, non-probabilistic topic models such as Latent Semantic Analysis (LSA) and Non-negative Matrix Factorization (NMF), and probabilistic topic models such as Latent Dirichlet Allocation (LDA) and its extensions. The topic model representation is one of the most interesting approaches, since it provides a structure that describes the collection of documents in a way that reveals their internal structure and their interrelationships.
Also, this approach provides a dimensionality reduction strategy aiming to built new dimensions that represent the main topics or ideas of the document collection. However, the efficient extraction of information about the relations of terms for document representation is still a major research challenge nowadays. The document representation models that explore correlated terms usually face a great challenge of keeping a good balance among the (i) number of extracted features, (ii) the computational performance and (iii) the interpretability of new features. In this way, we proposed the Latent Association Rule Cluster based Model (LARCM). The LARCM is a non-probabilistic topic model that explores association rule clustering to build a document representation with low dimensionality in a way that each dimension is composed by information about the relations among the terms. In the proposed approach, the association rules are built for each document to extract the correlated terms that will compose the multi-word expressions. These relations among the terms are the local context of relations. Then, a clustering process is applied for all association rules to discover the general context of the relations, and each obtained cluster is an extracted topic or a dimension of the new document representation. This work also proposes in this work an evaluation methodology to select topic models that maximize the results in the text classification task as much as the interpretability of the obtained topics. The LARCM model was compared against both the traditional LDA model and the LDA model using a document representation that includes multi-word expressions (bag-of-related-words). The experimental results indicate that LARCM provides an document representation that improves the results in the text classification task and even retains a good interpretability of the extract topics. 
The LARCM model also achieved great results as a method to extract contextual information for context-aware recommender systems.
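The two-step pipeline described above — mine per-document term associations, then cluster the rules so that each cluster becomes a topic dimension — can be caricatured with frequent term pairs and a greedy grouping. This is a deliberately crude sketch, not LARCM: real association-rule mining uses support/confidence thresholds and a proper clustering algorithm, and the toy corpus below is invented:

```python
from itertools import combinations
from collections import Counter

docs = [
    "neural network deep learning network",
    "deep learning gradient descent",
    "market basket association rules",
    "association rules support confidence",
]

# Step 1: per-document co-occurring term pairs (a crude stand-in for association rules)
pair_counts = Counter()
for doc in docs:
    terms = sorted(set(doc.split()))
    pair_counts.update(combinations(terms, 2))

frequent = [p for p, c in pair_counts.items() if c >= 2]  # minimum support: 2 documents

# Step 2: greedily group rules that share a term; each group approximates one topic
topics = []
for a, b in frequent:
    for t in topics:
        if a in t or b in t:
            t.update((a, b))
            break
    else:
        topics.append({a, b})

print(topics)  # [{'deep', 'learning'}, {'association', 'rules'}]
```

Each resulting term group would then serve as one dimension of the reduced document representation, with a document's weight on that dimension reflecting how strongly it expresses the group's term relations.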
