  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
21

Computational Intelligence and Data Mining Techniques Using the Fire Data Set

Storer, Jeremy J. 04 May 2016
No description available.
22

Cairn Detection in Southern Arabia Using a Supervised Automatic Detection Algorithm and Multiple Sample Data Spectroscopic Clustering

Schuetter, Jared Michael 25 August 2010
No description available.
23

Segmentation d'images TEP dynamiques par classification spectrale automatique et déterministe / Automatic and deterministic spectral clustering for segmentation of dynamic PET images

Zbib, Hiba 09 December 2013
Quantification of dynamic PET images is a powerful tool for the in vivo study of tissue function. However, this quantification requires the definition of regions of interest (ROIs) for extracting the time-activity curves. These regions are usually identified manually by an expert operator, which makes them subjective. As a result, there is growing interest in clustering methods that aim to separate the dynamic PET sequence into functional regions based on the temporal profiles of voxels. In this thesis, a spectral clustering method for the temporal profiles of voxels is developed; it has the advantage of handling nonlinear clusters. The method is then extended to make it better suited to clinical applications. First, a global search procedure is used to locate, in a deterministic way, the optimal cluster centroids of the projected data. Second, an unsupervised clustering criterion is proposed and optimized by simulated annealing to automatically estimate the scale parameter and the weighting factors involved in the method. The proposed automatic and deterministic spectral clustering method is validated on simulated and real images and compared to two other segmentation methods from the literature. It improves the ROI definition and appears to be a promising pre-processing tool before ROI-based quantification and input function estimation tasks.
24

Contributions à l'étude de la classification spectrale et applications / Contributions to the study of spectral clustering and applications

Mouysset, Sandrine 07 December 2010
Spectral clustering consists in creating, from the spectral elements of a Gaussian affinity matrix, a low-dimensional space in which data are grouped into clusters. This unsupervised method is mainly based on the Gaussian affinity measure, its parameter and its spectral elements. However, questions about the separability of clusters in the spectral projection space and about the choice of the parameter remain open. First, the role of the Gaussian affinity parameter is investigated through quality measures, and two heuristics for choosing this parameter are proposed and tested. Then, the functioning of the method itself is studied through the spectral elements of the Gaussian affinity matrix. By interpreting this matrix as the discretization of the heat kernel defined on the whole space and using finite elements, the eigenvectors of the affinity matrix are shown to be asymptotic representations of functions whose support is included in a single connected component. These results help define clustering properties and conditions on the Gaussian parameter. From these theoretical elements, two parallelization strategies by decomposition into sub-domains are formulated and tested on geometrical and image-processing examples. Finally, as unsupervised applications, spectral clustering is applied, first, in genomics to identify different gene expression profiles of a legume and, second, in functional PET imaging to segment brain regions with similar time-activity curves.
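The pipeline described in this abstract (Gaussian affinity matrix, spectral projection, grouping in the reduced space) can be sketched as follows. This is a minimal illustration and not the thesis's implementation: the symmetric normalization, the row normalization, the farthest-point k-means initialization, and the names `sigma`, `k` and `spectral_clustering` are all assumptions chosen for the example.

```python
import numpy as np

def gaussian_affinity(X, sigma):
    """Gaussian affinity A_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)), zero diagonal."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    A = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)
    return A

def spectral_embedding(A, k):
    """Row-normalized k leading eigenvectors of D^{-1/2} A D^{-1/2} (assumed variant)."""
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    L = A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L)          # eigenvalues come back in ascending order
    Y = eigvecs[:, -k:].copy()              # keep the k largest
    Y /= np.linalg.norm(Y, axis=1, keepdims=True)
    return Y

def kmeans(Y, k, n_iter=50):
    """Plain k-means with a deterministic farthest-point initialization."""
    centers = [Y[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((Y - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Y[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        sq = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(sq, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Y[labels == j].mean(axis=0)
    return labels

def spectral_clustering(X, k, sigma):
    """Cluster the rows of X into k groups in the spectral projection space."""
    return kmeans(spectral_embedding(gaussian_affinity(X, sigma), k), k)

# Two well-separated groups of four points each (toy data, not from the thesis).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
              [10, 10], [10, 11], [11, 10], [11, 11]], dtype=float)
labels = spectral_clustering(X, k=2, sigma=1.0)
```

The result depends strongly on `sigma`, which is the parameter choice the abstract identifies as an open question; the value used here is only convenient for the toy data.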
25

Structuration de bases multimédia pour une exploration visuelle / Structuring multimedia bases for visual exploration

Voiron, Nicolas 18 December 2015
The large increase in multimedia data volume requires the development of effective solutions for visual exploration of multimedia databases. After reviewing the visualization processes involved, we emphasize the need to structure the data. The main objective of this thesis is to propose and study methods for structuring multimedia databases (clustering and classification) for their visual exploration. We begin with a state of the art detailing the data and the metrics we can produce according to the nature of the variables describing each document. A review of projection and classification techniques follows. We also present in detail the Spectral Clustering method, on which we subsequently focus. Our first contribution is an original method that produces and fuses metrics using rank correlation. We validate this method on an animation movie database coming from an international festival. Then we propose a supervised classification method based on rank correlation, which we evaluate on a multimedia challenge dataset. We then focus on Spectral Clustering methods. We test a supervised Spectral Clustering technique and compare it to state-of-the-art methods. Finally, we examine active semi-supervised Spectral Clustering methods. In this context, we propose and validate constraint propagation techniques and strategies to improve the convergence of these active methods.
26

A GPU Accelerated Tensor Spectral Method for Subspace Clustering

Pai, Nithish January 2016
In this thesis we consider the problem of clustering data lying in a union of subspaces using spectral methods. Though the data generated may have high dimensionality, in many applications, such as motion segmentation and illumination-invariant face clustering, the data reside in a union of subspaces of small dimension. Furthermore, for a number of classification and inference problems, it is often useful to identify these subspaces and work with data in this lower-dimensional manifold. If the observations in each cluster are distributed around a centroid, applying spectral clustering to an affinity matrix built using distance-based similarity measures between the data points has been used successfully to solve the problem. But it has been observed that using such pairwise distance-based measures between the data points to construct a similarity matrix is not sufficient to solve the subspace clustering problem. Hence, a major challenge is to find a similarity measure that can capture the information of the subspace the data lie in. This is the motivation to develop methods that use an affinity tensor, calculating similarity between multiple data points. One can then use spectral methods on these tensors to solve the subspace clustering problem. In order to keep the algorithm computationally feasible, one can employ column sampling strategies. However, the computational cost of performing the tensor factorization increases very quickly with the sampling rate. Fortunately, advances in GPU computing have made it possible to perform many linear algebra operations several orders of magnitude faster than traditional CPU and multicore computing. In this work, we develop parallel algorithms for subspace clustering in a GPU computing environment. We show that this gives a significant speedup over CPU implementations, which allows us to sample a larger fraction of the tensor and thereby achieve better accuracy. We empirically analyze the performance of these algorithms on a number of synthetically generated subspace configurations. We finally demonstrate the effectiveness of these algorithms on motion segmentation, handwritten digit clustering and illumination-invariant face clustering, and show that their performance is comparable with state-of-the-art approaches.
27

Steps towards end-to-end neural speaker diarization / Étapes vers un système neuronal de bout en bout pour la tâche de segmentation et de regroupement en locuteurs

Yin, Ruiqing 26 September 2019
Speaker diarization is the task of determining "who speaks when" in an audio stream that usually contains an unknown amount of speech from an unknown number of speakers. Speaker diarization systems are usually built as the combination of four main stages. First, non-speech regions such as silence, music, and noise are removed by Voice Activity Detection (VAD). Next, speech regions are split into speaker-homogeneous segments by Speaker Change Detection (SCD), and these segments are later grouped according to the identity of the speaker using unsupervised clustering approaches. Finally, speech turn boundaries and labels are (optionally) refined with a re-segmentation stage. In this thesis, we propose to address these four stages with neural network approaches. We first formulate both the initial segmentation (voice activity detection and speaker change detection) and the final re-segmentation as a set of sequence labeling problems and then address them with Bidirectional Long Short-Term Memory (Bi-LSTM) networks. In the speech turn clustering stage, we propose to use affinity propagation on top of neural speaker embeddings. Experiments on a broadcast TV dataset show that affinity propagation clustering is more suitable than hierarchical agglomerative clustering when applied to neural speaker embeddings. The LSTM-based segmentation and affinity propagation clustering are also combined and jointly optimized to form a speaker diarization pipeline. Compared to the pipeline with independently optimized modules, the new pipeline brings a significant improvement. In addition, we propose to improve the similarity matrix with a bidirectional LSTM and then apply spectral clustering on top of the improved similarity matrix. The proposed system achieves state-of-the-art performance on the CALLHOME telephone conversation dataset. Finally, we formulate sequential clustering as a supervised sequence labeling task and address it with stacked RNNs. To better understand the system's behavior, an analysis based on an encoder-decoder architecture is proposed. On toy examples, our proposed systems bring a significant improvement compared with traditional clustering methods.
28

DRFS: Detecting Risk Factor of Stroke Disease from Social Media Using Machine Learning Techniques

Pradeepa, S., Manjula, K. R., Vimal, S., Khan, Mohammad S., Chilamkurti, Naveen, Luhach, Ashish Kr 01 January 2020
Humans are generally said to be social animals. On the vast and ever-expanding internet, it is difficult to find useful information about a medical illness. Pending more definitive studies of a causal association between stroke risk and social networks, it would be useful to help social media users detect the risk of stroke. In this work, a DRFS methodology is proposed to identify, from social media content, the various symptoms associated with stroke and the corresponding preventive measures. We define an architecture for clustering tweets by content using Spectral Clustering in an iterative fashion. Class labels are detected using the words with the highest TF-IDF values. The clusters produced by spectral clustering are given as input to the Probability Neural Network (PNN) to obtain suitable class labels and their probabilities. Frequent word sets are then found across the group of clusters, using the support count measure, to identify the risk factors of stroke. We found that the proposed approach is able to recognize new symptoms and causes that are not listed by the World Health Organization (WHO), Mayo Clinic and National Health Survey (NHS). It is noted that these are associated with precise outcomes that reflect real statistics. Experiments of this type will enable health organizations, doctors and government bodies to keep track of stroke. The experimental results show the causes, preventive measures, and high and low risk factors of stroke.
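The labeling step in this abstract (choosing the words with the highest TF-IDF values for each cluster) can be illustrated with a short sketch. It assumes each cluster is treated as one document, whitespace tokenization, and the plain `tf * log(N / df)` weighting; none of these details are stated in the abstract, and the function name `top_tfidf_terms` and the sample tweets are hypothetical.

```python
import math
from collections import Counter

def top_tfidf_terms(clusters, n_terms=2):
    """Return the n_terms highest TF-IDF words for each cluster of tweets.

    clusters: list of clusters, each a list of tweet strings.
    Each cluster is treated as a single document (an assumption).
    """
    # Term counts per cluster, using lowercase whitespace tokens.
    counts = [Counter(w for tweet in c for w in tweet.lower().split())
              for c in clusters]
    n_clusters = len(clusters)
    # Document frequency: in how many clusters each term appears.
    df = Counter(term for c in counts for term in c)
    labels = []
    for c in counts:
        total = sum(c.values())
        scores = {t: (n / total) * math.log(n_clusters / df[t])
                  for t, n in c.items()}
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        labels.append(ranked[:n_terms])
    return labels

# Hypothetical tweets, already grouped into two clusters.
clusters = [
    ["high blood pressure raises stroke risk", "blood pressure and stroke"],
    ["smoking raises stroke risk", "quit smoking"],
]
labels = top_tfidf_terms(clusters)
```

Words that occur in every cluster, such as "stroke" here, receive a score of zero, so the labels are drawn from the terms that distinguish one cluster from the others.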
29

[pt] TARIFAÇÃO ZONAL DO USO DA TRANSMISSÃO APLICADA A SISTEMAS ELÉTRICOS INTERLIGADOS / [en] ZONAL TARIFF FOR THE TRANSMISSION USAGE APPLIED TO INTERCONNECTED POWER SYSTEMS

JÉSSICA FELIX MACEDO TALARICO 30 September 2021
Transmission systems play a vital role in the good performance of electricity markets. The pricing of their use directly affects the remuneration of the concession holders and the costs of market participants. In Brazil, users of the national interconnected system (NIS) must pay the transmission companies that own the network assets for making this equipment available, in proportion to their use. Thus, the Brazilian regulatory agency (ANEEL) established the tariffs for transmission system usage (TTSU), which are calculated annually per bus using the nodal methodology. These tariffs are made up of two components: locational and postage stamp. The locational component reflects the effective use of the grid by each participating agent, measuring the impact of a marginal power injection at a bus on the system equipment. The postage stamp component is a constant amount that guarantees the remuneration of the unused portion of the network. In general, electrical proximity between system buses leads to similar tariff values. This dissertation proposes a new methodology to be incorporated into the TTSU calculation, considering the division of the NIS into transmission tariff zones (TTZ). Each TTZ then has a single tariff applied to its participants, which corresponds to the weighted average of the final tariffs calculated via the nodal methodology. To identify the TTZ, k-Means and spectral clustering techniques are applied to the IEEE-RTS and NIS systems. The use of mathematical models to define the ideal number of TTZ is also assessed. Various sensitivity analyses are carried out regarding changes in dispatch, changes in grid topology, and the evolution of the system over the years. The corresponding results are then discussed extensively.
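The zonal tariff described in this abstract, a weighted average of the nodal tariffs of the buses in each zone, can be sketched in a few lines. The abstract does not say which weights are used; the sketch assumes each bus is weighted by a charged quantity such as contracted capacity in MW, and the bus names, zone names, and values are hypothetical.

```python
from collections import defaultdict

def zonal_tariffs(nodal_tariff, weight, zone_of_bus):
    """Weighted average of nodal tariffs per zone.

    nodal_tariff: bus -> tariff from the nodal methodology
    weight:       bus -> weighting quantity (assumed: contracted MW)
    zone_of_bus:  bus -> zone label, e.g. from k-Means or spectral clustering
    """
    weighted_sum = defaultdict(float)
    weight_sum = defaultdict(float)
    for bus, tariff in nodal_tariff.items():
        zone = zone_of_bus[bus]
        weighted_sum[zone] += tariff * weight[bus]
        weight_sum[zone] += weight[bus]
    return {zone: weighted_sum[zone] / weight_sum[zone] for zone in weighted_sum}

# Hypothetical three-bus example with two zones.
nodal_tariff = {"b1": 10.0, "b2": 20.0, "b3": 6.0}
weight = {"b1": 100.0, "b2": 300.0, "b3": 50.0}
zone_of_bus = {"b1": "Z1", "b2": "Z1", "b3": "Z2"}
tariffs = zonal_tariffs(nodal_tariff, weight, zone_of_bus)
```

If the weights are the quantities the tariff is charged on, the total amount collected in each zone is the same as under the nodal tariffs, which is one reason a weighted rather than a simple average is a natural choice.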
30

Contributions to Data Reduction and Statistical Model of Data with Complex Structures

Wei, Yanran 30 August 2022
With advanced technology and the information explosion, the data of interest often have complex structures, with large size and dimension, in the form of continuous or discrete features. There is an emerging need for data reduction, efficient modeling, and model inference. For example, data can contain millions of observations with thousands of features. Traditional methods, such as linear regression or LASSO regression, cannot effectively deal with such a large dataset directly. This dissertation aims to develop several techniques to effectively analyze large datasets with complex structures in observational, experimental and time series data. In Chapter 2, I focus on data reduction for model estimation in sparse regression. Commonly used subdata selection methods often consider sampling or feature screening. For data with both a large number of observations and a large number of predictors, we propose a filtering approach for model estimation (FAME) to reduce both the number of data points and the number of features. The proposed algorithm can be easily extended to data with a discrete response or discrete predictors. In simulations and case studies, the proposed method performs well in parameter estimation with efficient computation. In Chapter 3, I focus on modeling experimental data with a quantitative-sequence (QS) factor. Here the QS factor concerns both the quantities and the sequence orders of several components in the experiment. Existing methods usually can focus only on either the sequence orders or the quantities of the multiple components. To fill this gap, we propose a QS transformation that transforms the QS factor into a generalized permutation matrix, and consequently develop a simple Gaussian process approach to model experimental data with QS factors. In Chapter 4, I focus on forecasting multivariate time series data by leveraging autoregression and clustering. Existing time series forecasting methods treat each series independently and ignore their inherent correlation. To fill this gap, I propose a clustering method based on autoregression and control the sparsity of the transition matrix estimate by adaptive lasso and a clustering coefficient. The clustering-based cross prediction can outperform conventional time series forecasting methods. Moreover, the clustering result can also enhance the forecasting accuracy of other forecasting methods. The proposed method can be applied to practical data, for tasks such as stock forecasting and topic trend detection. / Doctor of Philosophy / This dissertation focuses on three projects that are related to data reduction and statistical modeling of data with complex structures. In Chapter 2, we propose a data filtering approach for parameter estimation in sparse regression. Given data with thousands of observations and predictors or even more, large storage and computation resources are needed to handle these data. This challenges computational power and takes a long time. So we come up with an algorithm (FAME) that can reduce both the number of observations and the number of predictors. After data reduction, the subdata selected by FAME keeps most of the information of the original dataset in terms of parameter estimation. Compared with existing methods, the dimension of the subdata generated by the proposed algorithm is smaller while the computational time does not increase. In Chapter 3, we use a quantitative-sequence (QS) factor to describe experimental data. One simple example of experimental data is milk tea. Adding 1 cup of milk first or adding 2 cups of tea first will influence the flavor. This case can be extended to cases where thousands of ingredients need to be input into the experiment. The order and amount of ingredients will then generate different experimental results. We use the QS factor to describe this kind of order and amount. Then, by transforming the QS factor into a matrix containing continuous values and using this matrix as input, we model the experimental results with a simple Gaussian process. In Chapter 4, we propose an autoregression-based clustering and forecasting method for multivariate time series data. Existing research works often treat each time series independently. Our approach incorporates the inherent correlation of the data and clusters related series into one group. The forecasting is built on each cluster, and data within one cluster can cross-predict each other. One application of this method is topic trend detection. With thousands of topics, it is infeasible to apply one model for forecasting all time series. Considering the similarity of trends among related topics, the proposed method can cluster topics based on their similarity and then perform forecasting with an autoregression model based on historical data within each cluster.
