261

Customs supply chain engineering: modeling and risk management: application to customs

Hammadi, Lamia 10 December 2018 (has links)
The security, safety and efficiency of the international supply chain are of central importance for governments, for their financial and economic interests, and for the security of their residents.
In this regard, society faces multiple threats, such as illicit traffic of drugs, arms and other contraband, as well as counterfeiting and commercial fraud. To counter (detect, prevent, investigate and mitigate) such threats, customs act as the gatekeepers of international trade and the main actors in securing the international supply chain. Customs intervene at all stages along the routing of cargo; all transactions leaving or entering a country must be processed by its customs agencies. In such an environment, customs become an integral thread within the supply chain. We adopt this point of view, with a particular focus on customs operations, and to underline this focus we refer to this analysis as the "customs supply chain". In this thesis, we first set up the concept of the customs supply chain, identify the actors and the structural links between them, and then establish the process mapping, the integration approach and the performance measurement model. Second, we develop a new approach for managing risks in the customs supply chain based on qualitative analysis. Such an approach identifies the risk classes and recommends the best possible solutions to reduce the risk level. Our approach is applied to Moroccan customs, first considering criticality as a risk indicator, using Failure Modes, Effects and Criticality Analysis (FMECA) and the cross Activity-Based Costing (ABC) method, and then priority weights, using the Analytic Hierarchy Process (AHP) and fuzzy AHP (i.e., risk assessment under uncertainty); a benchmarking of the two indicators is then conducted to examine the effectiveness of the obtained results. Finally, we develop stochastic models for risk time series that address the most important challenge of risk modeling in the customs context: seasonality. More specifically, on the one hand we propose models based on uncertainty quantification to describe monthly components; the different models are fitted with the moment matching method to time series of seized quantities of illicit traffic at five sites. On the other hand, hidden Markov models are fitted with the EM algorithm on the same observation sequences. We show that these models accurately handle and describe the seasonal components of risk time series in the customs context. It is also shown that the fitted models are easily interpreted and provide a good description of important properties of the data, such as the second-order structure and the probability density functions (PDFs) per season and per site.
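As an illustration of the AHP step mentioned above, the sketch below derives priority weights from a pairwise comparison matrix via the principal eigenvector and checks consistency. The matrix values and risk classes are hypothetical placeholders, not the thesis's data.

```python
# Minimal AHP sketch (hypothetical data): derive priority weights for three
# customs risk classes from a pairwise comparison matrix via the principal
# eigenvector, and check consistency (CR < 0.1 is the usual threshold).
import numpy as np

# Saaty-scale pairwise comparisons for, e.g., {smuggling, counterfeiting, fraud}.
A = np.array([
    [1.0, 3.0, 5.0],
    [1/3, 1.0, 2.0],
    [1/5, 1/2, 1.0],
])

eigvals, eigvecs = np.linalg.eig(A)
k = np.argmax(eigvals.real)                 # principal eigenvalue
w = np.abs(eigvecs[:, k].real)
w /= w.sum()                                # normalized priority weights

lam_max = eigvals.real[k]
n = A.shape[0]
CI = (lam_max - n) / (n - 1)                # consistency index
RI = 0.58                                   # Saaty's random index for n = 3
print("weights:", np.round(w, 3), "CR:", round(CI / RI, 3))
```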
262

Identification and validation of putative therapeutic and diagnostic antimicrobial peptides against HIV: An in silico approach

January 2013 (has links)
Magister Scientiae (Medical Bioscience) - MSc(MBS) / Background: Despite scientific research efforts on HIV therapies and on reducing the rate of HIV infection, AIDS remains one of the major causes of death in the world, mostly in sub-Saharan Africa. To date, neither a cure nor an HIV vaccine has been found, and the disease can only be managed with Highly Active Antiretroviral Therapy (HAART) if detected early. The need for effective early diagnostics and non-toxic treatment calls for the discovery of additional HIV diagnostic methods and treatment regimens to lower mortality rates. Antimicrobial peptides (AMPs) are components of the first line of defense of prokaryotes and eukaryotes and have been proven to be promising therapeutic agents against HIV. Methods: Using computational biology, this work proposes profile search methods combined with structural modeling to identify putative AMPs with diagnostic and anti-HIV activity. First, experimentally validated anti-HIV AMPs were retrieved from publicly available AMP databases (APD, CAMP, Bactibase and UniProtKB) and classified into super-families. Hidden Markov model (HMMER) and Gap Local Alignment of Motifs (GLAM2) profiles were built for each super-family of anti-HIV AMPs. Putative anti-HIV AMPs were identified by scanning genome sequence databases with the trained models and ranked based on their E-values. The 3-D structures of the 10 highest-ranked peptides were predicted using I-TASSER. These peptides were docked against various HIV proteins using PatchDock, and putative AMPs showing the highest affinity and the correct orientation toward the HIV-1 proteins gp120 and p24 were selected for future work to establish their function in HIV therapy and diagnosis. Results: The in silico analysis showed that the models constructed with the HMMER algorithm performed better than those built with the GLAM2 algorithm. Furthermore, the former tool has a better statistical and probabilistic grounding than the latter. Thus, only the HMMER scanning results were considered for further study. Out of 1059 species scanned by the HMMER models, 30 putative anti-HIV AMPs were identified from genome scans with the family-specific profile models after the elimination of duplicate peptides. Docking analysis of putative AMPs against HIV proteins showed that, among the 10 best-performing anti-HIV AMPs with the highest E-scores, molecules 1, 3, 8 and 10 firmly bind the gp120 binding pocket at the V1/V2 domain and the point of interaction between gp120 and T cells, with the 1st- and 3rd-highest-scoring anti-HIV AMPs having the highest binding affinities. However, all 10 putative anti-HIV AMPs bind to the N-terminal domain of p24 with a large interaction surface, rather than the C-terminal. Conclusion: The in silico approach made it possible to construct high-performing computational models, which enabled the identification of putative anti-HIV peptides from genome sequence scans. The in silico validation of these putative peptides through docking studies has shown that some of these AMPs may be useful in HIV/AIDS therapeutics and diagnostics. Molecular validation of these findings will be the way forward for the development of an early diagnostic tool and, as a consequence, early treatment. This could prevent invasion of the immune system by blocking the V1/V2 domain and thus support the design of a vaccine with broad neutralizing activity against this domain.
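For readers unfamiliar with the profile-search step described above, the following hedged sketch shows how such a scan-and-rank pipeline might look with the HMMER command-line tools, assuming they are installed; the file names are hypothetical placeholders.

```python
# Hedged sketch of a profile-search pipeline: build a profile HMM from an
# aligned AMP super-family, scan a sequence database, then rank hits by
# E-value. Assumes the HMMER binaries (hmmbuild, hmmsearch) are on PATH;
# file names are hypothetical placeholders, not the thesis's data.
import subprocess

subprocess.run(["hmmbuild", "antihiv_family.hmm", "antihiv_family.sto"], check=True)
subprocess.run(["hmmsearch", "--tblout", "hits.tbl",
                "antihiv_family.hmm", "genomes.fasta"], check=True)

hits = []
with open("hits.tbl") as f:
    for line in f:
        if line.startswith("#"):
            continue
        fields = line.split()
        # target name is field 0; full-sequence E-value is field 4
        hits.append((fields[0], float(fields[4])))

hits.sort(key=lambda h: h[1])          # smaller E-value = better match
for name, evalue in hits[:10]:
    print(name, evalue)
```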
263

Training of Hidden Markov models as an instance of the expectation maximization algorithm

Majewsky, Stefan 22 August 2017 (has links)
In Natural Language Processing (NLP), speech and text are parsed and generated with language models and parser models, and translated with translation models. Each model contains a set of numerical parameters which are found by applying a suitable training algorithm to a set of training data. Many such training algorithms are instances of the Expectation-Maximization (EM) algorithm. In [BSV15], a generic EM algorithm for NLP is described. This work presents a particular speech model, the hidden Markov model, and its standard training algorithm, the Baum-Welch algorithm. It is then shown that the Baum-Welch algorithm is an instance of the generic EM algorithm introduced by [BSV15], from which it follows that all statements about the generic EM algorithm also apply to the Baum-Welch algorithm, especially its correctness and convergence properties.
Contents:
1 Introduction; 1.1 N-gram models; 1.2 Hidden Markov model
2 Expectation-maximization algorithms; 2.1 Preliminaries; 2.2 Algorithmic skeleton; 2.3 Corpus-based step mapping; 2.4 Simple counting step mapping; 2.5 Regular tree grammars; 2.6 Inside-outside step mapping; 2.7 Review
3 The Hidden Markov model; 3.1 Forward and backward algorithms; 3.2 The Baum-Welch algorithm; 3.3 Deriving the Baum-Welch algorithm (3.3.1 Model parameter and countable events; 3.3.2 Tree-shaped hidden information; 3.3.3 Complete-data corpus; 3.3.4 Inside weights; 3.3.5 Outside weights; 3.3.6 Complete-data corpus (cont.); 3.3.7 Step mapping); 3.4 Review
Appendix A Elided proofs from Chapter 3 (A.1 Proof of Lemma 3.8; A.2 Proof of Lemma 3.9)
B Formulary for Chapter 3
Bibliography
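To make the Baum-Welch algorithm concrete, here is a minimal sketch of one EM iteration for a discrete HMM: the E-step computes forward-backward posteriors, and the M-step re-estimates parameters from expected counts. This is a toy illustration, not the formalization used in the thesis.

```python
# One Baum-Welch (EM) iteration for a discrete HMM with transition matrix A,
# emission matrix B and initial distribution pi.
import numpy as np

def baum_welch_step(obs, A, B, pi):
    N, T = A.shape[0], len(obs)
    alpha = np.zeros((T, N)); beta = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]                      # forward recursion
    for t in range(1, T):
        alpha[t] = (alpha[t-1] @ A) * B[:, obs[t]]
    beta[-1] = 1.0                                    # backward recursion
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t+1]] * beta[t+1])
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)         # P(state_t | obs)
    xi = (alpha[:-1, :, None] * A[None] *
          (B[:, obs[1:]].T * beta[1:])[:, None, :])   # P(state_t, state_t+1 | obs)
    xi /= xi.sum(axis=(1, 2), keepdims=True)
    # M-step: normalized expected counts
    A_new = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    B_new = np.zeros_like(B)
    for k in range(B.shape[1]):
        B_new[:, k] = gamma[obs == k].sum(axis=0)
    B_new /= gamma.sum(axis=0)[:, None]
    return A_new, B_new, gamma[0]

obs = np.array([0, 1, 1, 0, 2, 1])                    # toy observation sequence
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
pi = np.array([0.6, 0.4])
A, B, pi = baum_welch_step(obs, A, B, pi)
print(np.round(A, 3), np.round(B, 3), sep="\n")
```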
264

Frequency based efficiency evaluation - from pattern recognition via backwards simulation to purposeful drive design

Starke, Martin, Beck, Benjamin, Ritz, Denis, Will, Frank, Weber, Jürgen 23 June 2020 (has links)
The efficiency of hydraulic drive systems in mobile machines is influenced by several factors, such as operator guidance, weather conditions, material and loading properties, and primarily the working cycle. This leads to varying operation points which have to be performed by the drive system. For efficiency analysis, the use of standardized working cycles, either obtained through measurements or synthetically generated, is state of the art; thereby, only a small extract of the real usage profile is taken into account. This contribution deals with process pattern recognition (PPR) and frequency-based efficiency evaluation to gain more precise information and conclusions for the drive design of mobile machines. Using an 18 t mobile excavator as an example, the recognition system based on Hidden Markov Models (HMMs) and the efficiency evaluation process by means of backwards simulation of measured operation points are described.
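As a rough illustration of the recognition idea (not the authors' implementation), one could fit one Gaussian HMM per known working-cycle pattern and label a measured segment by maximum log-likelihood, e.g. with the hmmlearn library; the sensor channels, patterns and data below are hypothetical.

```python
# Hedged sketch of HMM-based process pattern recognition: fit one Gaussian
# HMM per known working-cycle pattern and label a new measurement segment
# by maximum log-likelihood. Features (pressure, flow) are placeholders.
import numpy as np
from hmmlearn.hmm import GaussianHMM

rng = np.random.default_rng(0)
train = {   # pattern -> stacked [pressure, flow] samples (synthetic data)
    "digging": rng.normal([200.0, 80.0], [30.0, 15.0], size=(500, 2)),
    "driving": rng.normal([120.0, 40.0], [10.0, 5.0], size=(500, 2)),
}

models = {}
for name, X in train.items():
    m = GaussianHMM(n_components=3, covariance_type="diag", n_iter=50)
    m.fit(X)                       # one multi-state HMM per process pattern
    models[name] = m

segment = rng.normal([195.0, 78.0], [30.0, 15.0], size=(120, 2))
label = max(models, key=lambda name: models[name].score(segment))
print("recognized pattern:", label)
```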
265

Analysis of the expressive content of body gestures

Truong, Arthur 21 September 2016 (has links)
Nowadays, research on gesture analysis suffers from a lack of generic models. On the one hand, gesture formalizations from the human sciences remain purely theoretical and do not lend themselves to quantification; on the other hand, commonly used motion descriptors are generally purely intuitive and limited to the visual aspects of the gesture. We build on the concepts developed by the choreographer Rudolf Laban for the analysis of dance (Laban Movement Analysis, LMA) and extend them into a generic model of gesture based on its expressive elements. We also present two 3D gesture corpora that we have assembled. The first, ORCHESTRE-3D, consists of pre-segmented gestures of orchestra conductors recorded in rehearsal; its annotation with musical emotions is intended for studying the emotional content of musical conducting. The second corpus, HTI 2014-2015, offers sequences of varied everyday actions. In a first, so-called global recognition approach, we define a descriptor that characterizes the gesture as a whole. This type of characterization allows us to discriminate various actions, as well as to recognize the different musical emotions carried by the conductors' gestures in our ORCHESTRE-3D corpus. In a second, so-called dynamic approach, we define a frame descriptor (i.e., defined at every instant of the gesture). These frame descriptors are used to extract key poses of the motion, so as to obtain at every instant a simplified representation of body motion usable for recognizing actions on the fly. We test our approach on several gesture datasets, including our own HTI 2014-2015 corpus.
266

Contributions to the joint segmentation and classification of sequences (My two cents on decoding and handwriting recognition)

España Boquera, Salvador 05 April 2016 (has links)
[EN] This work is focused on problems (like automatic speech recognition (ASR) and handwritten text recognition (HTR)) that: 1) can be represented (at least approximately) in terms of one-dimensional sequences, and 2) entail breaking the observed sequence down into segments which are associated to units taken from a finite repertoire. The required segmentation and classification tasks are so intrinsically interrelated ("Sayre's Paradox") that they have to be performed jointly. We have been inspired by what some works call the "successful trilogy", which refers to the synergistic improvements obtained when considering: a good formalization framework and powerful algorithms; a clever design and implementation taking the best profit of hardware; and an adequate preprocessing and a careful tuning of all heuristics. We describe and study "two stage generative models" (TSGMs) comprising two stacked probabilistic generative stages without reordering. This class of models includes not only Hidden Markov Models (HMMs) but also "segmental models" (SMs). "Two stage decoders" may be deduced by simply running a TSGM in reverse, introducing non-determinism where required: 1) a directed acyclic graph (DAG) is generated, and 2) it is used together with a language model (LM). One-pass decoders constitute a particular case. A formalization of parsing and decoding in terms of semiring values and language equations proposes the use of recurrent transition networks (RTNs) as a normal form for context-free grammars (CFGs), using them in a parsing-as-composition paradigm, so that parsing CFGs results in a slight extension of parsing regular ones. Novel transducer composition algorithms are proposed that can work with RTNs and can deal with null transitions and non-idempotent semirings without resorting to filter composition. A review of LMs is given, with contributions mainly focused on LM interfaces, LM representation and the evaluation of neural network LMs (NNLMs). A review of SMs covers the combination of generative and discriminative segmental models, a general scheme of frame emission, and a general scheme of SMs. Fast, cache-friendly, specialized Viterbi lexicon decoders exploiting particular HMM topologies are proposed; they are able to manage sets of active states without requiring dictionary look-ups (e.g. hashing). A dataflow architecture allowing the design of flexible and diverse recognition systems from a small repertoire of components is proposed, including a novel DAG serialization protocol. DAG generators can take over-segmentation constraints into account, make use of SMs other than HMMs, take advantage of the specialized decoders proposed in this work, and use a transducer model to control their behavior, making it possible, for instance, to use context-dependent units. As for DAG decoders, they profit from a general LM interface that can be extended to deal with RTNs. Improvements for one-pass decoders are proposed by combining the specialized lexicon decoders and the "bunch" extension of the LM interface, including an adequate parallelization. The experimental part is mainly focused on HTR tasks on different input modalities (offline, bimodal). We propose novel preprocessing techniques for offline HTR which replace classical geometrical heuristics with automatic learning techniques (neural networks).
Experiments conducted on the IAM database using this new preprocessing and HMMs hybridized with Multilayer Perceptrons (MLPs) have obtained some of the best results reported for this reference database. Among other HTR experiments described in this work, we have used over-segmentation information, tried lexicon-free approaches, performed bimodal experiments, and experimented with the combination of hybrid HMMs with holistic classifiers. / España Boquera, S. (2016). Contributions to the joint segmentation and classification of sequences (My two cents on decoding and handwriting recognition) [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/62215
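The specialized lexicon decoders mentioned above build on the standard Viterbi algorithm; the following minimal log-space sketch shows that baseline (toy parameters, not the cache-friendly active-state implementation of the thesis).

```python
# Minimal Viterbi decoder for a discrete HMM: computes the most likely
# hidden state sequence for an observation sequence in log space.
import numpy as np

def viterbi(obs, A, B, pi):
    T, N = len(obs), A.shape[0]
    delta = np.zeros((T, N)); psi = np.zeros((T, N), dtype=int)
    delta[0] = np.log(pi) + np.log(B[:, obs[0]])
    for t in range(1, T):
        scores = delta[t-1][:, None] + np.log(A)     # scores[i, j]
        psi[t] = scores.argmax(axis=0)               # best predecessor of j
        delta[t] = scores.max(axis=0) + np.log(B[:, obs[t]])
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):                    # backtrace
        path.append(int(psi[t][path[-1]]))
    return path[::-1]

A = np.array([[0.8, 0.2], [0.3, 0.7]])
B = np.array([[0.6, 0.3, 0.1], [0.1, 0.3, 0.6]])
pi = np.array([0.5, 0.5])
print(viterbi([0, 1, 2, 2], A, B, pi))   # most likely hidden state sequence
```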
267

Models of Discrete-Time Stochastic Processes and Associated Complexity Measures

Löhr, Wolfgang 12 May 2010 (has links)
Many complexity measures are defined as the size of a minimal representation in a specific model class. One such complexity measure, which is important because it is widely applied, is statistical complexity. It is defined for discrete-time, stationary stochastic processes within a theory called computational mechanics. Here, a mathematically rigorous, more general version of this theory is presented, and abstract properties of statistical complexity as a function on the space of processes are investigated. In particular, weak-* lower semi-continuity and concavity are shown, and it is argued that these properties should be shared by all sensible complexity measures. Furthermore, a formula for the ergodic decomposition is obtained. The same results are also proven for two other complexity measures that are defined by different model classes, namely process dimension and generative complexity. These two quantities, and also the information theoretic complexity measure called excess entropy, are related to statistical complexity, and this relation is discussed here. It is also shown that computational mechanics can be reformulated in terms of Frank Knight's prediction process, which is of both conceptual and technical interest. In particular, it allows for a unified treatment of different processes and facilitates topological considerations. Continuity of the Markov transition kernel of a discrete version of the prediction process is obtained as a new result.
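For reference, the standard definition of statistical complexity from computational mechanics, which the thesis generalizes, can be stated as follows (a LaTeX sketch of the textbook definition, not the thesis's more general formulation):

```latex
% Statistical complexity as defined in computational mechanics: the Shannon
% entropy of the causal-state distribution of a stationary process.
\documentclass{article}
\usepackage{amsmath}
\begin{document}
The causal states $\mathcal{S}$ are the equivalence classes of pasts with
identical conditional distributions over futures,
\[
  \overleftarrow{x} \sim \overleftarrow{x}' \iff
  \Pr(\overrightarrow{X} \mid \overleftarrow{X} = \overleftarrow{x})
  = \Pr(\overrightarrow{X} \mid \overleftarrow{X} = \overleftarrow{x}'),
\]
and the statistical complexity is the entropy of their stationary distribution:
\[
  C_\mu = H[\mathcal{S}] = -\sum_{s \in \mathcal{S}} \Pr(s) \log_2 \Pr(s).
\]
\end{document}
```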
268

Predictive Modeling and Statistical Inference for CTA returns : A Hidden Markov Approach with Sparse Logistic Regression

Fransson, Oskar January 2023 (has links)
This thesis focuses on predicting trends in the returns of Commodity Trading Advisors (CTAs), also known as trend-following hedge funds. A Hidden Markov Model (HMM) is applied to classify trends, and, by incorporating additional features, a regularized logistic regression model is used to enhance prediction capability. The model demonstrates success in identifying positive trends in CTA funds, with particular emphasis on precision and risk-adjusted return metrics. In the context of regularized regression models, statistical inference techniques such as bootstrap resampling and Markov Chain Monte Carlo are applied to estimate the distribution of parameters. The findings suggest the model is effective in predicting favorable CTA performance and mitigating equity market drawdowns. For future research, it is recommended to explore alternative classification models and to extend the methodology to different markets and datasets.
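A hedged sketch of how the two model components named above might be combined: a Gaussian HMM labels trend regimes in a return series, and an L1-regularized logistic regression predicts the positive-trend state from additional features. The synthetic data, feature choices and libraries (hmmlearn, scikit-learn) are illustrative assumptions, not the thesis's setup.

```python
# Two-stage sketch: an HMM labels trend regimes in (synthetic) CTA returns,
# then a sparse (L1) logistic regression predicts the positive-trend state
# from added features. Data and features are hypothetical placeholders.
import numpy as np
from hmmlearn.hmm import GaussianHMM
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
returns = np.concatenate([rng.normal(0.002, 0.01, 250),   # up-trend regime
                          rng.normal(-0.001, 0.02, 250)]) # drawdown regime

hmm = GaussianHMM(n_components=2, covariance_type="diag", n_iter=100)
hmm.fit(returns.reshape(-1, 1))
states = hmm.predict(returns.reshape(-1, 1))
up = int(np.argmax(hmm.means_.ravel()))        # state with higher mean return

# Hypothetical features: lagged return and 20-day moving-average momentum.
lag1 = np.roll(returns, 1)
mom20 = np.convolve(returns, np.ones(20) / 20, mode="same")
X = np.column_stack([lag1, mom20])[20:]
y = (states == up).astype(int)[20:]

clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
clf.fit(X, y)                                   # sparse coefficients
print("coefficients:", clf.coef_.ravel())
```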
269

Detecting Anomalous Behavior in Radar Data

Rook, Jayson Carr 01 June 2021 (has links)
No description available.
270

Scalable Detection and Extraction of Data in Lists in OCRed Text for Ontology Population Using Semi-Supervised and Unsupervised Active Wrapper Induction

Packer, Thomas L 01 October 2014 (has links) (PDF)
Lists of records in machine-printed documents contain much useful information. As one example, the thousands of family history books scanned, OCRed, and placed online by FamilySearch.org probably contain hundreds of millions of fact assertions about people, places, family relationships, and life events. Data like this cannot be fully utilized until a person or process locates the data in the document text, extracts it, and structures it with respect to an ontology or database schema. Yet, in the family history industry and other industries, data in lists goes largely unused because no known approach adequately addresses all of the costs, challenges, and requirements of a complete end-to-end solution to this task. The diverse information is costly to extract because many kinds of lists appear even within a single document, differing from each other in both structure and content. The lists' records and component data fields are usually not set apart explicitly from the rest of the text, especially in a corpus of OCRed historical documents. OCR errors and the lack of document structure (e.g. HTML tags) make list content hard to recognize by a software tool developed without a substantial amount of highly specialized, hand-coded knowledge or machine learning supervision. Making an approach that is not only accurate but also sufficiently scalable in terms of time and space complexity to process a large corpus efficiently is especially challenging. In this dissertation, we introduce a novel family of scalable approaches to list discovery and ontology population. Its contributions include the following. We introduce the first general-purpose methods of which we are aware for both list detection and wrapper induction for lists in OCRed or other plain text. We formally outline a mapping between in-line labeled text and populated ontologies, effectively reducing the ontology population problem to a sequence labeling problem, opening the door to applying sequence labelers and other common text tools to the goal of populating a richly structured ontology from text. We provide a novel admissible heuristic for inducing regular expression wrappers using an A* search. We introduce two ways of modeling list-structured text with a hidden Markov model. We present two query strategies for active learning in a list-wrapper induction setting. Our primary contributions are two complete and scalable wrapper-induction-based solutions to the end-to-end challenge of finding lists, extracting data, and populating an ontology. The first has linear time and space complexity and extracts highly accurate information at a low cost in terms of user involvement. The second has time and space complexity that are linear in the size of the input text and quadratic in the length of an output record, and achieves higher F1-measures for extracted information as a function of supervision cost. We measure the performance of each of these approaches and show that they perform better than strong baselines, including variations of our own approaches and a conditional random field-based approach.
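The mapping from in-line labeled text to a populated ontology can be illustrated with a toy example (not the dissertation's actual schema or tag set): collapsing a BIO-tagged token sequence from an OCRed list entry into structured field values.

```python
# Toy illustration of the in-line-labels-to-ontology mapping: collapse a
# BIO-tagged token sequence from an OCRed list entry into structured field
# values. The tag set and record schema are hypothetical.
tokens = ["John", "H.", "Smith", ",", "b", ".", "1843", ",", "Boston"]
tags   = ["B-NAME", "I-NAME", "I-NAME", "O", "O", "O",
          "B-BIRTHYEAR", "O", "B-BIRTHPLACE"]

def populate(tokens, tags):
    record, field, value = {}, None, []
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):               # a new field begins
            if field:
                record[field] = " ".join(value)
            field, value = tag[2:], [tok]
        elif tag.startswith("I-") and field:   # current field continues
            value.append(tok)
        else:                                  # outside any field
            if field:
                record[field] = " ".join(value)
            field, value = None, []
    if field:
        record[field] = " ".join(value)
    return record

print(populate(tokens, tags))
# {'NAME': 'John H. Smith', 'BIRTHYEAR': '1843', 'BIRTHPLACE': 'Boston'}
```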
