11 |
Multi-view hockey tracking with trajectory smoothing and camera selection
Wu, Lan 11 1900
We address the problem of multi-view, multi-target tracking with multiple stationary cameras, applied to hockey tracking, and test the approach with data from two cameras. The system builds on the previous work of Okuma et al. [50]. We replace AdaBoost detection with blob detection, performed in each camera's image coordinate system after background subtraction. The two sets of blob detections are then mapped to the rink coordinate system using a homography transformation and merged into a final detection result, which is incorporated into the particle filter. In addition, we extend the particle filter to use multiple observation models, one per view. An observation likelihood and a reference color model are also maintained for each player in each view and are updated only when the player is not occluded in that view. Because multi-view tracking offers expanded coverage and multiple perspectives, a target that is occluded in one view can still be tracked as long as it is visible from another view. The multi-view tracking data are further processed by trajectory smoothing with a maximum a posteriori (MAP) smoother. Finally, automatic camera selection is performed with a hidden Markov model to create personalized video programs. / Science, Faculty of / Computer Science, Department of / Graduate
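The mapping and merging step described in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the thesis implementation: the homography matrices `H_A` and `H_B`, the blob coordinates, the greedy pairing rule, and the `radius` parameter are all hypothetical.

```python
from math import hypot

def to_rink(H, point_px):
    """Map one image point (x, y) to rink coordinates using a 3x3 homography H."""
    x, y = point_px
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)  # divide out the projective scale

def merge_detections(rink_a, rink_b, radius=1.0):
    """Greedily pair detections from two views that lie within `radius` of each
    other on the rink and average each pair; unmatched detections are kept."""
    merged, used_b = [], set()
    for ax, ay in rink_a:
        best, best_d = None, radius
        for j, (bx, by) in enumerate(rink_b):
            d = hypot(ax - bx, ay - by)
            if j not in used_b and d < best_d:
                best, best_d = j, d
        if best is None:
            merged.append((ax, ay))
        else:
            used_b.add(best)
            bx, by = rink_b[best]
            merged.append(((ax + bx) / 2, (ay + by) / 2))
    merged.extend(p for j, p in enumerate(rink_b) if j not in used_b)
    return merged

# Hypothetical homographies: view A is a pure scale, view B a scale plus a shift.
H_A = [[0.1, 0.0, 0.0], [0.0, 0.1, 0.0], [0.0, 0.0, 1.0]]
H_B = [[0.2, 0.0, 1.0], [0.0, 0.2, 2.0], [0.0, 0.0, 1.0]]
blobs_a = [(100, 200)]            # blob centres in view A (pixels)
blobs_b = [(47, 91), (145, 15)]   # blob centres in view B (pixels)

rink = merge_detections([to_rink(H_A, p) for p in blobs_a],
                        [to_rink(H_B, p) for p in blobs_b])
```

Here the first blob of each view lands on nearly the same rink position and is merged into one detection, while the second blob of view B, seen by only one camera, is kept as its own detection.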
|
13 |
Automatické rozpoznávání zpěvu ptáků / Automatic recognition of bird song
Břenek, Roman January 2014
This master's thesis deals with methods for the automatic recognition of bird species by their vocalizations. First, I defined a database of recordings and created reference data by manual annotation. The next step was to find the optimal features for describing bird song; I use Human Factor Cepstral Coefficients (HFCC). For the best recognition accuracy, it is necessary to correctly separate vocalization segments from non-vocalization segments. The voice activity detection (VAD) system is based on the k-Nearest Neighbours algorithm. The last step describes a system based on Hidden Markov Models, which recognizes the specific bird species from segments of its song.
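As a rough illustration of a k-Nearest Neighbours decision between vocalization and non-vocalization frames, consider the sketch below. The two-dimensional feature vectors, the labels, and `k=3` are hypothetical stand-ins; the thesis uses HFCC features, which are not computed here.

```python
from collections import Counter
from math import dist

def knn_predict(train_x, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training frames."""
    nearest = sorted(zip(train_x, train_y), key=lambda xy: dist(xy[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Hypothetical per-frame features (toy values, not HFCC).
train_x = [(0.90, 0.20), (0.80, 0.30), (0.85, 0.25),   # frames containing song
           (0.10, 0.80), (0.20, 0.90), (0.15, 0.70)]   # background frames
train_y = ["vocal", "vocal", "vocal", "non-vocal", "non-vocal", "non-vocal"]

label_a = knn_predict(train_x, train_y, (0.80, 0.20))
label_b = knn_predict(train_x, train_y, (0.20, 0.80))
```

With these toy values, `label_a` is `"vocal"` and `label_b` is `"non-vocal"`, since each query's three nearest neighbours all belong to one class.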
|
14 |
Model based approaches to array CGH data analysis
Shah, Sohrab P. 05 1900
DNA copy number alterations (CNAs) are genetic changes that can produce
adverse effects in numerous human diseases, including cancer. CNAs are
segments of DNA that have been deleted or amplified and can range in size
from one kilobase to whole chromosome arms. Development of array
comparative genomic hybridization (aCGH) technology enables CNAs to be
measured at sub-megabase resolution using tens of thousands of probes.
However, aCGH data are noisy and result in continuous valued measurements of
the discrete CNAs. Consequently, the data must be processed through
algorithmic and statistical techniques in order to derive meaningful
biological insights. We introduce model-based approaches to analysis of aCGH
data and develop state-of-the-art solutions to three distinct analytical
problems.
In the simplest scenario, the task is to infer CNAs from a single aCGH
experiment. We apply a hidden Markov model (HMM) to accurately identify
CNAs from aCGH data. We show that borrowing statistical strength across
chromosomes and explicitly modeling outliers in the data improves on
baseline models.
In the second scenario, we wish to identify recurrent CNAs in a set of aCGH
data derived from a patient cohort. These are locations in the genome
altered in many patients, providing evidence for CNAs that may be playing
important molecular roles in the disease. We develop a novel hierarchical
HMM profiling method that explicitly models both statistical and biological
noise in the data and is capable of producing a representative profile for a
set of aCGH experiments. We demonstrate that our method is more accurate
than simpler baselines on synthetic data, and show our model produces output
that is more interpretable than other methods.
Finally, we develop a model-based clustering framework to stratify a patient
cohort, expected to be composed of a fixed set of molecular subtypes. We
introduce a model that jointly infers CNAs, assigns patients to subgroups
and infers the profiles that represent each subgroup. We show our model to
be more accurate on synthetic data, and show in two patient cohorts how the
model discovers putative novel subtypes and clinically relevant subgroups. / Science, Faculty of / Computer Science, Department of / Graduate
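To make the single-experiment scenario concrete, the following minimal sketch decodes discrete copy number states from noisy continuous measurements with a three-state HMM and the Viterbi algorithm. It is a plain baseline under assumed parameters, not the model of the thesis: the state means, the noise level `SIGMA`, and the self-transition probability `STAY` are hypothetical, and the sketch neither models outliers explicitly nor shares strength across chromosomes.

```python
from math import log, pi

STATES = ("loss", "neutral", "gain")
MEANS = {"loss": -0.5, "neutral": 0.0, "gain": 0.5}  # hypothetical log-ratio means
SIGMA = 0.2   # hypothetical measurement noise (standard deviation)
STAY = 0.9    # hypothetical probability of staying in the same state

def log_gauss(x, mean, sigma):
    """Log density of a Gaussian emission."""
    return -0.5 * log(2 * pi * sigma ** 2) - (x - mean) ** 2 / (2 * sigma ** 2)

def log_trans(prev, cur):
    """Log transition probability: sticky diagonal, uniform off-diagonal."""
    return log(STAY) if prev == cur else log((1 - STAY) / (len(STATES) - 1))

def viterbi(log_ratios):
    """Return the most probable state path for a sequence of probe log-ratios."""
    # Uniform initial distribution over states.
    score = {s: log(1 / len(STATES)) + log_gauss(log_ratios[0], MEANS[s], SIGMA)
             for s in STATES}
    back = []
    for x in log_ratios[1:]:
        new_score, pointers = {}, {}
        for cur in STATES:
            prev = max(STATES, key=lambda p: score[p] + log_trans(p, cur))
            new_score[cur] = (score[prev] + log_trans(prev, cur)
                              + log_gauss(x, MEANS[cur], SIGMA))
            pointers[cur] = prev
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=score.get)
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

segment = viterbi([0.02, -0.05, 0.10, 0.55, 0.48, 0.60, 0.05, -0.02])
outlier = viterbi([0.0, 0.5, 0.0])
```

With these parameters, `segment` labels the three elevated probes as a gain flanked by neutral probes, while the single elevated probe in `outlier` stays neutral because one measurement does not outweigh the cost of two state changes.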
|
15 |
Style-driven virtual camera control in 3D environments / Contrôle, basé sur le style, de caméras virtuelles dans des environnements 3D
Merabti, Billal 24 September 2015
Automatically computing a cinematographically consistent sequence of shots over a set of actions occurring in a 3D world is a complex task. It requires not only computing appropriate shots (viewpoints) and appropriate transitions between shots (cuts), but also the ability to encode and reproduce elements of cinematographic style. The models proposed in the literature, generally rule-based and built on finite state machine (FSM) representations, provide limited functionality for building sequences of shots. These approaches are not designed to learn elements of cinematographic style easily, nor do they allow significant variations in style over the same sequence of actions. In this thesis, we first propose an expressive, data-driven model for automated cinematography (framing and editing) that can compute significant variations in cinematographic style, with the ability to control the duration of shots and to add specific constraints to the desired sequence. By using a Hidden Markov Model representation of the editing process, we demonstrate the possibility of easily reproducing elements of style extracted from real movies. The proposed model is more general than existing representations, handles dialogue scenes in a more elaborate way, and proves to be more expressive in its ability to precisely encode elements of cinematographic style. Then, we introduce an extension of this model to handle more complex cinematographic content (beyond dialogues). To this end, we replace the Hidden Markov Model with a more general statistical model, the Dynamic Bayesian Network, which considerably enlarges the possibilities in editing and film analysis. Finally, we designed a data annotation tool and a representation format that simplify the creation and manipulation of film annotations. The collected data will serve as training sets for data-driven techniques such as ours, as well as for film analysis.
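One simple way to picture how elements of editing style might be learned from annotated films is to estimate the shot-to-shot transition probabilities of a Markov model by counting cuts. The sketch below assumes the shot types are already annotated, so it covers only the transition part of an HMM; the shot labels, the example sequences, and the additive smoothing are hypothetical and are not taken from the thesis.

```python
from collections import Counter, defaultdict

def estimate_transitions(sequences, smoothing=1.0):
    """Estimate P(next shot type | current shot type) from annotated shot
    sequences, with additive smoothing so unseen cuts keep a small probability."""
    labels = sorted({shot for seq in sequences for shot in seq})
    counts = defaultdict(Counter)
    for seq in sequences:
        for cur, nxt in zip(seq, seq[1:]):
            counts[cur][nxt] += 1
    probs = {}
    for cur in labels:
        total = sum(counts[cur].values()) + smoothing * len(labels)
        probs[cur] = {nxt: (counts[cur][nxt] + smoothing) / total for nxt in labels}
    return probs

# Hypothetical annotations of two dialogue scenes.
scenes = [
    ["two_shot", "over_shoulder_A", "over_shoulder_B", "over_shoulder_A"],
    ["two_shot", "over_shoulder_A", "over_shoulder_B"],
]
style = estimate_transitions(scenes)
```

Each row of `style` sums to one; a different set of annotated films would yield a different matrix, which is one way a change of style could show up in such a model.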
|
17 |
Modelagem computacional de famílias de proteínas microbianas relevantes para produção de bioenergia / Computational modeling of microbial protein families relevants to bioenergy production process.
Rego, Fernanda Orpinelli Ramos do 17 August 2015
Hidden Markov Models (HMMs) are essential tools for the automated annotation of protein sequences. For many years, protein family resources based on HMMs have been made available to the scientific community (e.g. TIGRfams). Much effort has also been devoted to the automated generation of protein family HMMs (e.g. PANTHER). However, manually curated protein family HMMs remain the gold standard for genome annotation. In this context, the main objective of this work was to generate approximately 80 HMM-based families of microbial proteins relevant to bioenergy production. To create the HMMs, we followed a manual curation protocol developed in this work. We start from a protein that has an experimentally proven function, is associated with a publication, and was manually annotated with new Gene Ontology terms created by the MENGO (Microbial ENergy Gene Ontology) project. The next steps consisted of (1) defining selection criteria for including members in the family; (2) searching for members via BLAST; (3) generating the multiple alignment (MUSCLE 3.7) and the HMM (HMMER 3.0); (4) analyzing the results and iterating the process, with the preliminary HMM used in the additional searches; (5) defining a cutoff score for the final HMM; and (6) validating each model individually with tests against NCBI's NR database. The main deliverables of this work are 74 manually curated HMMs, available on the project website (http://mengofams.lbi.iq.usp.br/), where the models can be browsed and downloaded; a detailed protocol for the manual curation of protein family HMMs; and a list of proteins that are candidates for reannotation.
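Step (5) of the protocol, defining a cutoff for the final HMM, could be approached in many ways, and the abstract does not say which rule was used. The sketch below shows one plausible rule purely as an assumption: place the cutoff midway between the weakest accepted family member and the strongest rejected sequence. The function name and the scores are hypothetical.

```python
def choose_cutoff(member_scores, nonmember_scores):
    """Pick a score cutoff between the weakest member and the strongest
    non-member. Returns None when the groups overlap and no clean cutoff exists."""
    lowest_member = min(member_scores)
    highest_nonmember = max(nonmember_scores)
    if lowest_member <= highest_nonmember:
        return None
    return (lowest_member + highest_nonmember) / 2

# Hypothetical search scores for curated members and for rejected hits.
cutoff = choose_cutoff([310.5, 290.0, 275.2], [120.0, 88.4])
overlap = choose_cutoff([150.0, 90.0], [120.0])
```

In this toy case `cutoff` falls between 120.0 and 275.2, while `overlap` is `None`, signalling that the family membership or the model would need another round of curation.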
|
18 |
Propagação semi-automática de termos Gene Ontology a proteínas com potencial biotecnológico para a produção de bioenergia / Semi-automatic propagation of Gene Ontology terms to proteins with biotechnology potential for bioenergy production
Taniguti, Lucas Mitsuo 18 November 2014
The increase in biological data, produced mainly by second-generation sequencing technologies, poses a challenge for biological databases, which must store the data, make it available and, in the case of secondary databases, propagate biological information to sequences with no experimental characterization. This propagation is crucial because new sequences are deposited in databases at a much higher rate than proteins are experimentally characterized. Like the EC number (Enzyme Commission number), the organization of proteins into families aims to organize and facilitate automatic operations in databases. In this context, the goal of this work was to generate computational models for protein families involved in microbial processes with biotechnological potential for bioenergy production. Reference proteins previously annotated in collaboration with the MENGO project were used as seeds for the statistical models. Starting from each seed protein, searches were run against UniProtKB to find representative proteins for each family and function descriptions supported by the scientific literature. Multiple sequence alignments of each collection of selected sequences were built with MUSCLE 3.7, and the computational models (profile Hidden Markov Models, pHMMs) were then generated with the HMMER package. The models were reviewed repeatedly until the curator considered them reliable for propagating Gene Ontology (GO) terms. A set of 1,233 proteins from UniProtKB was classified into our families, suggesting that they could be annotated with GO terms using the MENGOfams families. Of those proteins, 79% were not annotated with the MENGO-specific GO terms in UniProtKB. To compare the results obtained using only BLAST similarity measures with those obtained using pHMMs, we generated similarity networks with an E-value cutoff of 10^-14. The comparison showed that pHMM classification is valuable for propagating biological annotations because it precisely identifies the members of each family. A second analysis was applied to each family, using the respective pHMM to query a collection of sequences generated by a null model. The null model assumed that all sequences were non-homologous and could be represented just by the amino acid frequencies observed in the Swiss-Prot database. No non-homologous proteins were classified as members by the MENGOfams models, suggesting that the models identify only true member sequences.
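The null model used in the second analysis can be pictured as drawing each residue independently from background amino acid frequencies. The sketch below shows that sampling step only; the four-letter alphabet and its frequencies are toy placeholders rather than real Swiss-Prot values, and the function name, sequence count, and length are hypothetical. Querying the sampled sequences with the pHMMs is not shown.

```python
import random

def sample_null_sequences(freqs, n_seqs, length, seed=0):
    """Draw sequences whose residues are independent samples from the
    background frequencies in `freqs` (a residue -> weight mapping)."""
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    residues = list(freqs)
    weights = [freqs[r] for r in residues]
    return ["".join(rng.choices(residues, weights=weights, k=length))
            for _ in range(n_seqs)]

# Toy background frequencies (placeholder values, not taken from Swiss-Prot).
toy_freqs = {"A": 0.4, "L": 0.3, "G": 0.2, "W": 0.1}
null_seqs = sample_null_sequences(toy_freqs, n_seqs=5, length=50)
```

Because such sequences carry no family signal by construction, any of them scoring above a model's cutoff would count as a false positive for that model.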
|