71. Implementation and evaluation of packet loss concealment schemes with the JM reference software (Cooke, Henrik, January 2010)
Communication over today's IP-based networks is to some extent subject to packet loss. Most real-time applications, such as video streaming, need methods to hide this effect, since resending lost packets may introduce unacceptable delays. For IP-based video streaming applications such a method is referred to as a packet loss concealment scheme. In this thesis a recently proposed mixture-model and least-squares-based packet loss concealment scheme is implemented and evaluated together with three other well-known concealment methods. The JM reference software, a publicly available software codec for the H.264 video coding standard, is used as the basis for the implementation. The evaluation is carried out by comparing the schemes in terms of objective measurements, subjective observations and a study with human observers. The recently proposed packet loss concealment scheme shows good performance with respect to the objective measures, and careful observation indicates better concealment of scenes with fast motion and rapidly changing video content. The study with human observers confirms these results for the case when a more sophisticated packetization technique is used. A new packet loss concealment scheme, based on joint modeling of motion vectors and pixels, is also investigated in the last chapter as an additional contribution of the thesis.
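To make the setting concrete, here is a minimal Python/NumPy sketch of the simplest temporal-copy baseline of the kind such evaluations compare against; it is not the thesis's mixture-model/least-squares scheme, and all names are illustrative.

```python
import numpy as np

def conceal_lost_blocks(prev_frame, cur_frame, lost_mask, motion=None):
    """Baseline temporal concealment: replace lost pixels with co-located
    (or motion-shifted) pixels from the previous decoded frame. A sketch of
    the simplest scheme, not the proposed mixture-model method."""
    out = cur_frame.copy()
    ref = prev_frame if motion is None else np.roll(prev_frame, motion, axis=(0, 1))
    out[lost_mask] = ref[lost_mask]
    return out

# usage: conceal a lost 16x16 macroblock with a small assumed motion shift
prev = np.random.default_rng(0).integers(0, 256, (64, 64), dtype=np.uint8)
cur = prev.copy()
mask = np.zeros((64, 64), dtype=bool)
mask[16:32, 16:32] = True                 # macroblock lost in transmission
cur[mask] = 0
concealed = conceal_lost_blocks(prev, cur, mask, motion=(1, 2))
```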
72. Model-based clustering based on sparse finite Gaussian mixtures (Malsiner-Walli, Gertraud; Frühwirth-Schnatter, Sylvia; Grün, Bettina; January 2016)
In the framework of Bayesian model-based clustering based on a finite mixture of Gaussian distributions, we present a joint approach to estimate the number of mixture components and identify cluster-relevant variables simultaneously, as well as to obtain an identified model. Our approach consists in specifying sparse hierarchical priors on the mixture weights and component means. In a deliberately overfitting mixture model, the sparse prior on the weights empties superfluous components during MCMC. A straightforward estimator for the true number of components is given by the most frequent number of non-empty components visited during MCMC sampling. Specifying a shrinkage prior, namely the normal-gamma prior, on the component means leads to improved parameter estimates as well as identification of cluster-relevant variables. After estimating the mixture model using MCMC methods based on data augmentation and Gibbs sampling, an identified model is obtained by relabeling the MCMC output in the point-process representation of the draws. This is performed using K-centroids cluster analysis based on the Mahalanobis distance. We evaluate the proposed strategy in a simulation setup with artificial data and by applying it to benchmark data sets.
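As a sketch of the component-number estimator described above, the following Python snippet post-processes MCMC allocations from an overfitting mixture; the toy illustration of the sparsity effect assumes a symmetric Dirichlet weight prior with a small concentration parameter, an assumption on my part.

```python
import numpy as np
from collections import Counter

def estimate_n_components(allocations):
    """Post-process MCMC allocations (n_draws x n_obs component labels) from a
    deliberately overfitting mixture: estimate the number of clusters as the
    most frequent number of non-empty components across the draws."""
    counts = [np.unique(draw).size for draw in allocations]
    return Counter(counts).most_common(1)[0][0]

# toy illustration: a sparse Dirichlet prior on the weights empties extras
rng = np.random.default_rng(0)
weights = rng.dirichlet(np.full(10, 0.01))    # small concentration -> sparse weights
draw = rng.choice(10, size=500, p=weights)    # one draw's allocations
print(np.unique(draw).size, "non-empty of 10 components")
```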
73. Bayesian mixture models for frequent itemset mining (He, Ruofei, January 2012)
In binary transaction data mining, traditional frequent itemset mining often produces results which are not straightforward to interpret. To overcome this problem, probability models are often used to produce more compact and conclusive results, albeit with some loss of accuracy. Bayesian statistics have been widely used in the development of probability models in machine learning in recent years, and these methods have many advantages, including the ability to avoid overfitting. In this thesis, we develop two Bayesian mixture models, with a Dirichlet distribution prior and a Dirichlet process (DP) prior, to improve the earlier non-Bayesian mixture model developed for transaction data set mining. First, we develop a finite Bayesian mixture model by introducing conjugate priors to the model. Then, we extend this model to an infinite Bayesian mixture using a Dirichlet process prior. The Dirichlet process mixture model is a nonparametric Bayesian model which allows for the automatic determination of an appropriate number of mixture components. We implement the inference of both mixture models using two methods: a collapsed Gibbs sampling scheme and a variational approximation algorithm. Experiments on several benchmark problems have shown that both mixture models achieve better performance than a non-Bayesian mixture model. The variational algorithm is the faster of the two approaches, while the Gibbs sampling method achieves a more accurate result. The Dirichlet process mixture model can automatically grow to a proper complexity for a better approximation. However, these approaches also show that mixture models underestimate the probabilities of frequent itemsets. Consequently, these models have a higher sensitivity but a lower specificity.
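To illustrate how a fitted mixture yields itemset probability estimates, here is a hedged Python sketch for a mixture of independent Bernoullis over items; the weights and item probabilities are illustrative stand-ins for fitted parameters, not the thesis's code.

```python
import numpy as np

def itemset_probability(weights, item_probs, itemset):
    """Probability that a random transaction contains every item in `itemset`
    under a mixture of independent Bernoullis:
        P(I) = sum_k w_k * prod_{i in I} p[k, i]
    weights: shape (K,); item_probs: shape (K, n_items)."""
    return float(weights @ np.prod(item_probs[:, itemset], axis=1))

# toy usage: two components, five items
w = np.array([0.6, 0.4])
p = np.array([[0.9, 0.8, 0.1, 0.2, 0.3],
              [0.1, 0.2, 0.7, 0.9, 0.5]])
print(itemset_probability(w, p, [0, 1]))   # support estimate for items {0, 1}
```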
74. Retinal blood vessel segmentation in fundus images via statistical-based methods (Šolc, Radek, January 2015)
This diploma thesis deals with the segmentation of blood vessels in images acquired by a fundus camera. The characteristics of fundus images and current segmentation methods are described in the theoretical part. The focus of the practical part is a method using a statistical model. A model using Student's t-distribution for automatic segmentation is developed step by step: first an EM algorithm is incorporated, and the model is then built on Markov random fields to improve robustness to noise. In the preprocessing stage, the contrast of thin blood vessels is improved by a discrete wavelet transform; the output image is used as a mask to decrease the grayscale intensity of the thinnest vessels and to increase the intensity of the background. The whole model was programmed in Matlab and tested on the whole HRF database, and the resulting binary images were quantitatively evaluated against gold-standard images.
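A minimal Python sketch of the intensity model follows: EM for a one-dimensional mixture of Student's t distributions with fixed degrees of freedom. It is a simplification under stated assumptions; the thesis additionally couples the labels through a Markov random field, which is omitted here.

```python
import numpy as np
from scipy.stats import t as student_t

def fit_student_mixture(x, K=2, nu=4.0, n_iter=100, seed=0):
    """EM for a 1-D mixture of Student's t distributions with fixed
    degrees of freedom `nu` (pixel-intensity model sketch)."""
    rng = np.random.default_rng(seed)
    pi = np.full(K, 1.0 / K)
    mu = rng.choice(x, K, replace=False)
    sigma = np.full(K, x.std())
    for _ in range(n_iter):
        # E-step: responsibilities and latent gamma weights u
        dens = np.stack([pi[k] * student_t.pdf(x, nu, mu[k], sigma[k])
                         for k in range(K)]) + 1e-300          # K x n
        r = dens / dens.sum(axis=0, keepdims=True)
        d2 = ((x[None, :] - mu[:, None]) / sigma[:, None]) ** 2
        u = (nu + 1.0) / (nu + d2)                             # downweights outliers
        # M-step: weighted updates for fixed nu
        pi = r.mean(axis=1)
        mu = (r * u * x[None, :]).sum(1) / (r * u).sum(1)
        sigma = np.sqrt((r * u * (x[None, :] - mu[:, None]) ** 2).sum(1) / r.sum(1))
    return pi, mu, sigma, r

# vessel vs. background: threshold responsibilities of the fitted components
```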
75. Genre-based Video Clustering using Deep Learning: Feature Extraction using Object Detection and Action Recognition (Vellala, Abhinay, January 2021)
Social media has become an integral part of the Internet, with users across the world sharing content such as images, texts and videos. A huge amount of data is being generated, and it has become a challenge for social media platforms to group the content for further use, such as recommending a video. In particular, grouping videos by similarity requires feature extraction. This thesis investigates potential approaches to extract features that can help in determining the similarity between videos. Features of given videos are extracted using object detection and action recognition. A bag-of-features representation is used to build the vocabulary of all the features and to transform the data into a form useful for clustering videos. A probabilistic model-based clustering method, the multinomial mixture model, is used to determine the underlying clusters in the data by maximizing the expected log-likelihood and estimating the parameters of the data as well as the cluster probabilities. The clusters are analyzed to understand the genre based on dominant actions and objects. The Bayesian information criterion (BIC) and the Akaike information criterion (AIC) are used to determine the optimal number of clusters for the given videos. The AIC/BIC scores reached their minimum at 32 clusters, which was chosen as the optimal number. The data is labeled with the genres, and logistic regression is performed to check the cluster performance on test data, achieving 96% accuracy.
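The following Python sketch shows EM for a multinomial mixture over bag-of-features counts together with AIC/BIC model selection; the count matrix and candidate cluster numbers are illustrative assumptions, not the thesis's data.

```python
import numpy as np
from scipy.special import logsumexp

def fit_multinomial_mixture(X, K, n_iter=200, seed=0, eps=1e-10):
    """EM for a mixture of multinomials over count data X (n x V).
    Returns weights, event probabilities, and the log-likelihood
    (up to the multinomial coefficient, constant in the parameters)."""
    rng = np.random.default_rng(seed)
    n, V = X.shape
    pi = np.full(K, 1.0 / K)
    theta = rng.dirichlet(np.ones(V), size=K)                  # K x V
    for _ in range(n_iter):
        log_post = np.log(pi + eps) + X @ np.log(theta + eps).T  # n x K
        r = np.exp(log_post - logsumexp(log_post, axis=1, keepdims=True))
        pi = r.mean(axis=0)                                    # M-step
        theta = r.T @ X + eps
        theta /= theta.sum(axis=1, keepdims=True)
    log_post = np.log(pi + eps) + X @ np.log(theta + eps).T
    return pi, theta, logsumexp(log_post, axis=1).sum()

def information_criteria(ll, K, V, n):
    p = (K - 1) + K * (V - 1)                                  # free parameters
    return 2 * p - 2 * ll, p * np.log(n) - 2 * ll              # AIC, BIC

# choose the K minimising AIC/BIC on toy bag-of-features counts
X = np.random.default_rng(1).poisson(1.0, size=(300, 50))
scores = {k: information_criteria(fit_multinomial_mixture(X, k)[2], k, 50, 300)
          for k in (2, 4, 8, 16, 32)}
```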
76. A Machine Learning Recommender System Based on Collaborative Filtering Using Gaussian Mixture Model Clustering (Shakoor, Delshad M.; Maihami, Vafa; Maihami, Reza; 1 January 2021)
With the shift toward online shopping, it has become necessary to cater to customers' needs and give them more choices. Before making a purchase, buyers research the products' features. Recommender systems facilitate the search task for customers by narrowing down the search space to specific products that align with the customer's needs. A recommender system uses clustering to filter information, calculating the similarity between members of a cluster to determine the factors that lead to more accurate predictions. We propose a new method for predicting scores in machine learning recommender systems, using Gaussian mixture model clustering and the Pearson correlation coefficient. The proposed method is applied to MovieLens data. The results are then compared to three commonly used methods: Pearson correlation coefficients, K-means, and fuzzy C-means. As the number of neighbors increases, our method shows a lower error than the others. Additionally, the results show that accuracy increases as the number of users increases. Our model, for instance, is 5% more accurate than existing methods when the neighbor size is 30. Gaussian mixture clustering chooses similar users and takes the score distance into account when choosing nearby users that are similar to the active user.
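A hedged Python sketch of the pipeline described above: GMM user clustering followed by Pearson-weighted prediction within the active user's cluster. Function and variable names are illustrative, and details such as the number of components and the mean-filling step are assumptions, not the paper's exact procedure.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def predict_rating(R, active, item, n_components=4, seed=0):
    """Predict R[active, item] from Pearson-weighted deviations of users in
    the active user's GMM cluster. R is a user x item matrix, np.nan = missing."""
    user_means = np.nanmean(R, axis=1)
    filled = np.where(np.isnan(R), user_means[:, None], R)    # features for clustering
    labels = GaussianMixture(n_components, random_state=seed).fit_predict(filled)
    num = den = 0.0
    for v in np.where(labels == labels[active])[0]:
        if v == active or np.isnan(R[v, item]):
            continue
        co = ~np.isnan(R[active]) & ~np.isnan(R[v])           # co-rated items
        if co.sum() < 2:
            continue
        sim = np.corrcoef(R[active, co], R[v, co])[0, 1]      # Pearson similarity
        if np.isnan(sim):
            continue
        num += sim * (R[v, item] - user_means[v])
        den += abs(sim)
    return user_means[active] + (num / den if den > 0 else 0.0)
```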
77. Clustering of temporal gene expression data with mixtures of mixed effects models (Lu, Darlene, 27 February 2019)
While time-dependent processes are important to biological functions, methods to leverage temporal information from large data sets have remained computationally challenging. In temporal gene-expression data, clustering can be used to identify genes with shared function in complex processes. Algorithms like K-means and standard Gaussian mixture models (GMMs) fail to account for variability in replicated data or repeated measures over time, and they require a priori assumptions about the number of clusters, evaluating many candidate numbers to select an optimal result. An improved penalized GMM offers a computationally efficient algorithm that optimizes the cluster number and labels simultaneously.
The work presented in this dissertation was motivated by mouse bone-fracture models aimed at determining patterns of temporal gene expression during bone-healing progression. To address this, an extension to the penalized GMM was proposed that accounts for correlation between replicated data and repeated measures over time by introducing random effects, using a mixture of mixed-effects polynomial regression models and an entropy-penalized EM algorithm (EPEM); a simplified sketch follows.
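The Python sketch below shows EM for a mixture of fixed-effects polynomial regressions over shared timepoints, a deliberate simplification: the dissertation's EPEM additionally includes gene-level random effects and an entropy penalty that selects the number of clusters, both omitted here.

```python
import numpy as np
from scipy.special import logsumexp

def fit_poly_mixture(Y, t, K, degree=2, n_iter=100, seed=0):
    """EM for a mixture of polynomial regression curves.
    Y: genes x timepoints expression matrix; t: shared timepoints."""
    rng = np.random.default_rng(seed)
    n, T = Y.shape
    X = np.vander(t, degree + 1, increasing=True)              # T x (d+1) design
    beta = rng.normal(size=(K, degree + 1))
    s2 = np.full(K, Y.var())
    pi = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        # E-step: per-gene Gaussian log-likelihood under each cluster's curve
        resid = Y[:, None, :] - (X @ beta.T).T[None, :, :]     # n x K x T
        ll = -0.5 * (T * np.log(2 * np.pi * s2)[None, :]
                     + (resid ** 2).sum(-1) / s2[None, :])
        log_r = np.log(pi)[None, :] + ll
        r = np.exp(log_r - logsumexp(log_r, axis=1, keepdims=True))
        # M-step: weighted least squares per cluster
        pi = r.mean(axis=0)
        for k in range(K):
            w = r[:, k]
            beta[k] = np.linalg.solve(X.T @ X * w.sum(), X.T @ (Y.T @ w))
            s2[k] = w @ ((Y - X @ beta[k]) ** 2).sum(axis=1) / (w.sum() * T)
    return pi, beta, s2, r
```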
First, the performance of EPEM for different mixed-effects models was assessed with simulation studies and applied to the fracture-healing study. Second, modifications to address the high computational cost of EPEM were considered, which either clustered subsets of the data determined by the predicted polynomial order (S-EPEM) or used modified initialization to decrease the initial burden (I-EPEM). Each was compared to EPEM and applied to the fracture-healing study. Lastly, as varied rates of fracture healing were observed for mice with different genetic backgrounds (strains), a new analysis strategy was proposed to compare patterns of temporal gene expression between different mouse strains and was assessed with simulation studies. Expression profiles for each strain were treated as separate objects to cluster, in order to determine which genes cluster into different groups across strains.
We found that the addition of random effects decreased the accuracy of predicted cluster labels compared to K-means, GMM, and fixed-effects EPEM. Polynomial-order optimization with BIC performed with the highest accuracy, and optimization on subspaces obtained with singular value decomposition performed well. Computation time for S-EPEM was much reduced, with a slight decrease in accuracy. I-EPEM was comparable to EPEM, with similar accuracy and a decrease in computation time. Application of the new analysis strategy to the fracture-healing data identified several distinct temporal gene-expression patterns for the different strains.
78. Noise sources in robust uncompressed video watermarking (Dumitru, Corneliu Octavian, 11 January 2010)
This thesis addresses natural-video and attack modelling for uncompressed video watermarking. By reconsidering a statistical investigation combining four types of statistical tests, it first identifies with accuracy the drawbacks and limitations of the popular Gaussian model in watermarking applications, mathematically refuting the Gaussian model generally adopted in the literature to represent channel noise. An advanced statistical approach then establishes, with mathematical rigour and for the first time, the stationarity of the random processes representing the natural video and/or the watermarking attacks; the method developed is independent of the type of data, its processing, and the estimation procedure. On this basis, the thesis proposes a methodology for modelling channel noise as a mixture of Gaussians, for both the discrete cosine and the discrete wavelet transforms and for a large set of attacks (filtering, rotation, compression, StirMark, ...). Among other benefits, this approach allows the exact computation of channel capacity, whereas the literature provided only upper and lower bounds; more generally, accurate models for natural video and attacks increase the precision of basic information-theoretic quantities (entropies and capacity). On the technological side, integrating and implementing these models in the IProtect watermarking method, patented by Institut Télécom/ARTEMIS and SFR, speeds up the insertion procedure by a factor of 100 relative to the state of the art.
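As a small illustration of the information-theoretic use of such models, the following Python sketch fits a Gaussian mixture to channel-noise samples and estimates its differential entropy by Monte Carlo. The noise here is synthetic and the attack is a stand-in assumption, not the thesis's data or method.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# model channel noise (difference between attacked and original transform
# coefficients) as a Gaussian mixture; `original` and `attacked` stand in
# for arrays of DCT or DWT coefficients
rng = np.random.default_rng(0)
original = rng.normal(0.0, 1.0, 50_000)
attacked = original + rng.laplace(0.0, 0.3, 50_000)     # stand-in for an attack
noise = (attacked - original).reshape(-1, 1)

gmm = GaussianMixture(n_components=4, random_state=0).fit(noise)
samples, _ = gmm.sample(200_000)
h_nats = -gmm.score_samples(samples).mean()             # Monte Carlo E[-log p(x)]
print(f"estimated differential entropy: {h_nats:.3f} nats")
```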
79. Sum-Product Network in the context of missing data (Clavier, Pierre, January 2020)
In recent years, interest in new deep learning methods has increased considerably due to their robustness and their applications in many fields. However, the lack of interpretability of these models and the lack of theoretical knowledge about them raise many issues. It is in this context that sum-product network (SPN) models have emerged; an SPN sits somewhere between a linear neural network without activation functions and a probabilistic graphical model. From a mathematical point of view, SPNs can be described as directed acyclic graphs. In practice, they can be seen as deep mixture models, and as a consequence they can represent very rich collections of distributions. Since real data are often incomplete, censored, or truncated, handling missing values is a natural concern. The objective of this master's thesis was threefold. First, we formalized the concept of SPNs with proper mathematical notation, using directed acyclic graphs and Bayesian network theory. Then we developed a new method for learning the structure of an SPN, based on K-means and mutual information. Finally, we proposed a new method for estimating the parameters of a fixed SPN in the context of incomplete data; our estimation method is based on maximum likelihood via the EM algorithm.
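A minimal Python sketch of SPN evaluation with missing data follows: sum nodes mix their children, product nodes factorize, and a missing variable is marginalized exactly at the leaves (a Gaussian leaf integrates to one, contributing log 1 = 0). The structure and parameters below are illustrative, not the thesis's learned model.

```python
import numpy as np
from scipy.special import logsumexp
from scipy.stats import norm

class Leaf:
    """Univariate Gaussian leaf over variable index `var`; a missing value
    (np.nan) is marginalised out, so the leaf contributes log 1 = 0."""
    def __init__(self, var, mu, sigma):
        self.var, self.mu, self.sigma = var, mu, sigma
    def logp(self, x):
        v = x[self.var]
        return 0.0 if np.isnan(v) else norm.logpdf(v, self.mu, self.sigma)

class Product:
    def __init__(self, children):
        self.children = children
    def logp(self, x):                       # product -> sum of log-densities
        return sum(c.logp(x) for c in self.children)

class Sum:
    def __init__(self, weights, children):
        self.weights, self.children = np.asarray(weights), children
    def logp(self, x):                       # weighted mixture of children
        return logsumexp([np.log(w) + c.logp(x)
                          for w, c in zip(self.weights, self.children)])

# two-variable SPN: a mixture of two product distributions
spn = Sum([0.3, 0.7], [
    Product([Leaf(0, -1.0, 1.0), Leaf(1, -1.0, 0.5)]),
    Product([Leaf(0,  2.0, 1.0), Leaf(1,  1.5, 0.5)]),
])
print(spn.logp(np.array([1.0, np.nan])))     # x2 missing: exact marginal
```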
80. The Single Imputation Technique in the Gaussian Mixture Model Framework (Aisyah, Binti M.J., 2018)
Missing data is a common issue in data analysis, and numerous techniques have been proposed to deal with it. Imputation, the process of replacing missing values with plausible values, is the most popular strategy for handling missing data. The two imputation techniques most frequently cited in the literature are single imputation and multiple imputation.
Multiple imputation, also known as the golden imputation technique, was proposed by Rubin in 1987 to address missing data; however, inconsistency is its major problem. Single imputation is less popular in missing-data research due to bias and reduced variability. One way to improve the single imputation technique is the basic regression model with an added residual: the motivation is that the residual improves the bias and variability. The residual is drawn under a normality assumption with a mean of 0 and a variance equal to the residual variance. Although newer single imputation methods, such as the stochastic regression model and hot-deck imputation, may improve the variability and bias issues, single imputation techniques suffer from understated uncertainty, which may lead to underestimated R-squared values or standard errors in the analysis results.
The research reported in this thesis provides two solutions for the single imputation technique. In the first imputation procedure, the wild bootstrap is proposed to better reflect the uncertainty of the residual variance in the regression model; a sketch follows.
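One plausible Python reading of the first procedure: stochastic regression imputation whose residuals are drawn by a wild bootstrap (resampled residuals with random sign flips). This is a hedged sketch; the exact weighting scheme in the thesis may differ.

```python
import numpy as np

def wild_bootstrap_impute(X_obs, y_obs, X_mis, rng=None):
    """Stochastic regression imputation with wild-bootstrap residuals
    (a sketch, not the thesis's exact procedure)."""
    rng = np.random.default_rng(rng)
    # OLS fit on the observed cases
    Xo = np.column_stack([np.ones(len(X_obs)), X_obs])
    beta, *_ = np.linalg.lstsq(Xo, y_obs, rcond=None)
    resid = y_obs - Xo @ beta
    Xm = np.column_stack([np.ones(len(X_mis)), X_mis])
    # resample residuals and flip signs (Rademacher weights), so the
    # imputed values carry residual-variance uncertainty
    draw = rng.choice(resid, size=len(X_mis), replace=True)
    signs = rng.choice([-1.0, 1.0], size=len(X_mis))
    return Xm @ beta + draw * signs
```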
In the second solution, predictive mean matching (PMM) is enhanced: the regression model takes the main role of generating the recipient values, while the donor values are taken from the observed data, and each missing value is then imputed by randomly drawing one of the observations in the donor pool. The size of the donor pool is significant in determining the quality of the imputed values. A fixed donor pool size is employed in many existing works on PMM imputation, but it may not be appropriate in certain circumstances, such as when the data distribution has high-density regions. Instead of a fixed donor pool size, the proposed method applies a radius-based solution to determine the size of the donor pool, as in the sketch below. Both proposed imputation procedures are combined with the Gaussian mixture model framework to preserve the original data distribution.
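A minimal Python sketch of radius-based PMM under stated assumptions: a single linear regression serves as the predictive model, and the thesis's Gaussian mixture layer is omitted. The fallback rule for empty pools is an illustrative choice.

```python
import numpy as np

def pmm_radius_impute(X_obs, y_obs, X_mis, radius, rng=None):
    """Predictive mean matching with a radius-based donor pool (sketch).
    Donors are observed cases whose predicted value lies within `radius`
    of the recipient's predicted value; one donor's observed y is drawn."""
    rng = np.random.default_rng(rng)
    Xo = np.column_stack([np.ones(len(X_obs)), X_obs])
    beta, *_ = np.linalg.lstsq(Xo, y_obs, rcond=None)
    yhat_obs = Xo @ beta
    Xm = np.column_stack([np.ones(len(X_mis)), X_mis])
    yhat_mis = Xm @ beta
    imputed = np.empty(len(X_mis))
    for i, yh in enumerate(yhat_mis):
        pool = y_obs[np.abs(yhat_obs - yh) <= radius]
        if pool.size == 0:                   # fall back to the nearest donor
            pool = y_obs[[np.argmin(np.abs(yhat_obs - yh))]]
        imputed[i] = rng.choice(pool)
    return imputed
```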
The results reported in the thesis, from experiments on benchmark and artificial data sets, confirm improvements for further data analysis. The proposed approaches are therefore worth considering for further investigation and experiment.