41 |
Computational models for multilingual negation scope detection / Fancellu, Federico, January 2018
Negation is a common property of languages: few languages, if any, lack the means to reverse the truth value of a statement. A challenge for cross-lingual studies of negation lies in the fact that languages encode and use it in different ways. Although this variation has been extensively researched in linguistics, little has been done in automated language processing. In particular, we lack computational models of negation processing that generalize across languages, and we even lack knowledge of what developing such models would require. This thesis shows that such models can nonetheless be built by means of existing cross-lingual resources, even when annotated data for a language other than English is not available, in the context of detecting string-level negation scope, i.e. the set of tokens in a sentence whose meaning is affected by a negation marker (e.g. 'not'). Our contribution has two parts. First, we investigate the scenario where annotated training data is available. We show that Bi-directional Long Short-Term Memory (BiLSTM) networks are state-of-the-art models whose features generalize across languages. We also show that these models suffer from genre effects and that, for most of the corpora we experimented with, high performance is simply an artifact of the annotation styles, where negation scope is often a span of text delimited by punctuation. Second, we investigate the scenario where annotated data is available in only one language, experimenting with model transfer. To test our approach, we first build NEGPAR, a parallel corpus annotated for negation, where pre-existing annotations on English sentences have been edited and extended to Chinese translations.
We then show that transferring a model for negation scope detection across languages is possible by means of structured neural models, where negation scope is detected on top of a cross-linguistically consistent representation, Universal Dependencies. Cross-lingual lexical information, on the other hand, was found to help performance very little. Finally, error analysis shows that performance is better when a negation marker is in the same dependency substructure as its scope, and that some phenomena related to negation scope that require lexical knowledge are still not captured correctly. In the conclusions, we tie together the contributions of this thesis and point future work towards representing negation scope across languages at the level of logical form as well.
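The string-level framing above, in which every token is either inside or outside a negation scope, lends itself to a simple token-level evaluation. The sketch below is illustrative only (hypothetical 0/1 label sequences, not the thesis's actual scoring code):

```python
def scope_token_metrics(gold, pred):
    """Token-level precision/recall/F1 for negation scope labels.

    gold, pred: equal-length sequences of 1 (token in scope) / 0 (out of scope).
    """
    assert len(gold) == len(pred)
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# "I do not like green eggs": cue "not", gold scope covers "like green eggs"
gold = [0, 0, 0, 1, 1, 1]
pred = [0, 0, 0, 1, 1, 0]   # model misses the final scope token
p, r, f = scope_token_metrics(gold, pred)
# p == 1.0, r == 2/3, f == 0.8
```

A sequence labeler (BiLSTM or otherwise) that tends to stop scopes at punctuation would score well on corpora where annotated scopes are punctuation-delimited spans, which is exactly the annotation artifact the thesis warns about.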
|
42 |
Biological applications, visualizations, and extensions of the long short-term memory network / van der Westhuizen, Jos, January 2018
Sequences are ubiquitous in the domain of biology. One of the current best machine learning techniques for analysing sequences is the long short-term memory (LSTM) network. Owing to significant barriers to adoption in biology, focussed efforts are required to realize the use of LSTMs in practice. Thus, the aim of this work is to improve the state of LSTMs for biology, and we focus on biological tasks pertaining to physiological signals, peripheral neural signals, and molecules. This goal drives the three themes of this thesis: biological applications, visualizations, and extensions. We start by demonstrating the utility of LSTMs for biological applications. On two new physiological-signal datasets, LSTMs were found to outperform hidden Markov models. LSTM-based models, implemented by other researchers, also constituted the majority of the best-performing approaches on publicly available medical datasets. However, even if these models achieve the best performance on such datasets, their adoption will be limited if they fail to indicate when they are likely mistaken. Thus, we demonstrate on medical data that it is straightforward to use LSTMs in a Bayesian framework via dropout, providing model predictions with corresponding uncertainty estimates. Another dataset used to show the utility of LSTMs is a novel collection of peripheral neural signals. Manual labelling of this dataset is prohibitively expensive, and as a remedy, we propose a sequence-to-sequence model regularized by Wasserstein adversarial networks. The results indicate that the proposed model is able to infer, with reasonable accuracy, which actions a subject performed from their peripheral neural signals. As these LSTMs achieve state-of-the-art performance on many biological datasets, one of the main concerns for their practical adoption is their interpretability.
We explore various visualization techniques for LSTMs applied to continuous-valued medical time series and find that learning a mask to optimally delete information in the input provides useful interpretations. Furthermore, we find that the input features looked for by the LSTM align well with medical theory. For many applications, extensions of the LSTM can provide enhanced suitability. One such application is drug discovery -- another important aspect of biology. Deep learning can aid drug discovery by means of generative models, but these often produce invalid molecules because of the complex discrete structures involved. As a solution, we propose a version of active learning that leverages the sequential nature of the LSTM along with its Bayesian capabilities. This approach enables efficient learning of the grammar that governs the generation of discrete-valued sequences such as molecules. Efficiency is achieved by reducing the search space from one over sequences to one over the set of possible elements at each time step -- a much smaller space. Having demonstrated the suitability of LSTMs for biological applications, we seek a hardware-efficient implementation. Given the success of the gated recurrent unit (GRU), which has two gates, a natural question is whether any of the LSTM gates are redundant. Research has shown that the forget gate is one of the most important gates in the LSTM. Hence, we propose a forget-gate-only version of the LSTM -- the JANET -- which outperforms both the LSTM and some of the best contemporary models on benchmark datasets, while also reducing computational cost.
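The dropout-as-Bayesian-inference idea mentioned above, keeping dropout active at test time and sampling the network repeatedly, can be sketched independently of any deep learning framework. The toy forward pass below is purely illustrative, standing in for an LSTM with dropout:

```python
import random
import statistics

def mc_dropout_predict(forward, x, n_samples=200, seed=0):
    """Monte Carlo dropout: run a stochastic forward pass many times and
    summarize the samples as a mean prediction plus an uncertainty estimate."""
    rng = random.Random(seed)
    samples = [forward(x, rng) for _ in range(n_samples)]
    return statistics.mean(samples), statistics.stdev(samples)

def toy_forward(x, rng, p_drop=0.5):
    # Stand-in for a network forward pass: each feature is dropped with
    # probability p_drop and survivors are rescaled (inverted dropout).
    kept = [xi / (1 - p_drop) if rng.random() >= p_drop else 0.0 for xi in x]
    return sum(kept)

mean, std = mc_dropout_predict(toy_forward, [1.0, 2.0, 3.0])
# mean hovers near sum(x) = 6.0; std quantifies predictive uncertainty
```

A clinician-facing model would report `std` alongside each prediction, flagging the cases where the model is "likely mistaken" that the abstract describes.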
|
43 |
Comparative analysis of XGBoost, MLP and LSTM techniques for the problem of predicting fire brigade interventions / Cerna Ñahuis, Selene Leya, January 2019
Advisor: Anna Diva Plasencia Lotufo / Abstract: Many environmental, economic and societal factors are leading fire brigades to be increasingly solicited; as a result, they face an ever-increasing number of interventions, most of the time with constant resources. These interventions are directly related to human activity, which is itself predictable: swimming pool drownings occur in summer, while road accidents due to ice storms occur in winter. One solution to improve the response of firefighters with constant resources is therefore to predict their workload, i.e. their number of interventions per hour, from explanatory variables that condition human activity. The present work develops three models that are compared to determine whether they can predict the firefighters' response load reasonably well. The tools chosen are the most representative of their respective categories in machine learning: XGBoost, built on decision trees; a classic neural method, the Multi-Layer Perceptron; and a more advanced recurrent architecture, Long Short-Term Memory. The entire process is detailed, from data collection to obtaining the predictions. The results show predictions of reasonable quality that can be improved by data science techniques such as feature selection and hyperparameter tuning. / Mestre
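A minimal version of the shared setup, turning an hourly intervention count series into supervised (lags, next value) pairs and scoring predictions by mean absolute error, could look like the following. This is an illustrative sketch only; the thesis's actual features also include explanatory variables on human activity:

```python
def make_lagged_dataset(counts, n_lags):
    """Build (previous n_lags hours, next hour) pairs from an hourly series."""
    X, y = [], []
    for t in range(n_lags, len(counts)):
        X.append(counts[t - n_lags:t])
        y.append(counts[t])
    return X, y

def mean_absolute_error(y_true, y_pred):
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

counts = [3, 5, 4, 6, 7, 5, 8]          # hypothetical interventions per hour
X, y = make_lagged_dataset(counts, n_lags=3)
# X[0] == [3, 5, 4], y[0] == 6; XGBoost, an MLP, or an LSTM can now fit (X, y)
baseline = [x[-1] for x in X]           # naive "same as last hour" predictor
err = mean_absolute_error(y, baseline)
```

Comparing each candidate model's MAE against such a naive baseline is a standard way to check that the learned models add predictive value.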
|
44 |
SENSOR-BASED HUMAN ACTIVITY RECOGNITION USING BIDIRECTIONAL LSTM FOR CLOSELY RELATED ACTIVITIES / Pavai, Arumugam Thendramil, 01 December 2018
Recognizing human activities using deep learning methods has significance in many fields such as sports, motion tracking, surveillance, healthcare and robotics. Inertial sensors comprising accelerometers and gyroscopes are commonly used for sensor-based HAR. In this study, a Bidirectional Long Short-Term Memory (BLSTM) approach is explored for recognizing and classifying closely related human activities on body-worn inertial sensor data provided by the UTD-MHAD dataset. The BLSTM model of this study achieves an overall accuracy of 98.05% for 15 different activities and 90.87% for 27 different activities performed by 8 persons with 4 trials per activity per person. A comparison of this BLSTM model is made with a unidirectional LSTM model, and a significant improvement in accuracy is observed for recognition of all 27 activities with the BLSTM over the LSTM.
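Sensor-based HAR pipelines like this one typically segment the continuous inertial stream into fixed-length, overlapping windows before feeding a (bidirectional) recurrent classifier. A generic windowing helper might look like the following; the window and stride values are illustrative, not those of the study:

```python
def sliding_windows(samples, window, stride):
    """Split a list of per-timestep sensor samples into overlapping windows."""
    return [samples[i:i + window]
            for i in range(0, len(samples) - window + 1, stride)]

# 10 timesteps of hypothetical (accelerometer, gyroscope) readings,
# windows of 4 timesteps with 50% overlap
stream = [(0.1 * t, 0.2 * t) for t in range(10)]
windows = sliding_windows(stream, window=4, stride=2)
# each window is a short sequence a recurrent classifier can consume
```

A bidirectional model sees each window both forwards and backwards, which is what helps it separate closely related activities whose distinguishing motion occurs late in the window.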
|
45 |
Efeitos da baixa liquidez em posições compradas em relação às posições vendidas sobre o desempenho de fundos long short em situação de crise sistêmica (2008) / Effects of low liquidity in long positions relative to short positions on the performance of long-short funds in a systemic crisis (2008) / Carvalho, Luís Renato, 28 May 2013
/ Several studies on investments in stocks and investment funds in Brazil, more specifically on long-short multimarket funds, focus on their neutrality with respect to the Bovespa index and on the performance of their managers and respective strategies, such as Penna (2007) and Gomes and Cresto (2010). With emphasis on the comparison between the liquidity of the long position and that of the short position in stocks, the behavior of long-short funds was examined under normal and crisis conditions in the period between 2007 and 2009. Strong evidence was found that losses during times of stress were greater for funds that carried less liquid stocks in the long position than in the short position, despite the reduced number of funds studied and the monthly periodicity used. An average return of 11.1% in 2008 was found for a portfolio of funds with more liquid stocks in the long position than in the short position, versus 5.4% for a portfolio with the inverse position. A risk-return analysis using the Sharpe ratio (IS) supports the study: the portfolio of funds with the more liquid short position presented an IS of -1.5368, well below the IS of -0.3374 of the portfolio with the inverse position (more liquid in the long position). The Index Model, as in Bodie, Kane and Marcus (2005), was also used to verify whether these funds, divided into separate portfolios (more liquid in the long position than in the short position, and vice versa), systematically outperformed the market (Bovespa index), measured by alpha (α), and to estimate the exposure of these portfolios to market risk, beta (β), together with the Multifactor Model. The regressions performed for these models yield coefficients and statistical inferences that support the above hypothesis, despite the low number of observations used.
|
46 |
On the Effectiveness of Multi-Task Learning: An evaluation of Multi-Task Learning techniques in deep learning models / Tovedal, Sofiea, January 2020
Multi-Task Learning is today an interesting and promising field which many mention as a must for achieving the next level of advancement within machine learning. However, in reality, Multi-Task Learning is much more rarely used in real-world implementations than its more popular cousin, Transfer Learning. The question is why that is, and whether Multi-Task Learning outperforms its Single-Task counterparts. In this thesis, different Multi-Task Learning architectures were utilized in order to build a model that can label real technical issues within two categories. The model faces a challenging imbalanced dataset with many labels to choose from and short texts to base its predictions on. Can task-sharing be the answer to these problems? This thesis investigated three Multi-Task Learning architectures and compared their performance to a Single-Task model. An authentic dataset and two labeling tasks were used in training the models with supervised learning. The four model architectures -- Single-Task, Multi-Task, Cross-Stitched and Shared-Private -- first went through a hyperparameter tuning process using one of the two layer options, LSTM and GRU. They were then boosted by auxiliary tasks and finally evaluated against each other.
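The core of the plain Multi-Task architecture, one shared encoder feeding several task-specific heads, can be sketched with plain functions. The random dense layers below merely stand in for the tuned LSTM/GRU layers of the thesis:

```python
import random

random.seed(0)

def make_linear(n_in, n_out):
    """A tiny dense layer with random weights, standing in for a trained encoder."""
    w = [[random.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return lambda x: [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

shared_encoder = make_linear(8, 4)   # parameters reused by every task
head_task_a = make_linear(4, 3)      # labeling task A: 3 labels
head_task_b = make_linear(4, 5)      # labeling task B: 5 labels

def multi_task_forward(x):
    h = shared_encoder(x)                    # one shared representation ...
    return head_task_a(h), head_task_b(h)    # ... two task-specific outputs

scores_a, scores_b = multi_task_forward([1.0] * 8)
```

Cross-Stitched and Shared-Private variants differ in how much of the trunk is shared: cross-stitch units learn a mixing matrix between per-task encoders, while Shared-Private keeps both a common and a private encoder per task.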
|
47 |
Arrival Time Predictions for Buses using Recurrent Neural Networks / Ankomsttidsprediktioner för bussar med rekurrenta neurala nätverk / Fors Johansson, Christoffer, January 2019
In this thesis, two different types of bus passengers are identified. These two types, namely current passengers and passengers-to-be, have different needs in terms of arrival time predictions. A set of machine learning models based on recurrent neural networks and long short-term memory units was developed to meet these needs. Furthermore, bus data from the public transport in Östergötland county, Sweden, were collected and used for training new machine learning models. These new models are compared with the prediction system that is used today to provide passengers with arrival time information. The models proposed in this thesis use a sequence of time steps as input and the observed arrival time as output. Each input time step contains information about the current state, such as the time of arrival, the departure time from the very first stop, and the current position in Cartesian coordinates. The target value for each input is the arrival time at the next time step. To predict the rest of the trip, the prediction for the next step is simply used as input in the next time step. The results show that the proposed models can improve the mean absolute error per stop by between 7.2% and 40.9% compared to the system used today, on all eight routes tested. Furthermore, the choice of loss function introduces models that can meet the identified passengers' needs by trading average prediction accuracy for a certainty that predictions do not overestimate or underestimate the target time in approximately 95% of the cases.
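The rollout scheme described above, predicting the next step and feeding that prediction back in as input, is generic to any one-step-ahead model. A minimal sketch with a dummy one-step model (the constant 30-second increment is purely illustrative):

```python
def rollout(step_model, history, n_steps):
    """Autoregressive multi-step prediction: each one-step-ahead prediction
    is appended to the input window used for the next prediction."""
    window = list(history)
    predictions = []
    for _ in range(n_steps):
        nxt = step_model(window)
        predictions.append(nxt)
        window = window[1:] + [nxt]   # slide the window forward in time
    return predictions

# Dummy one-step model: arrival time grows by a constant 30 s per stop
one_step = lambda w: w[-1] + 30
future = rollout(one_step, history=[0, 30, 60], n_steps=3)
# future == [90, 120, 150]
```

With a real recurrent model in place of `one_step`, prediction errors compound along the rollout, which is why per-stop error is the natural metric for comparing against the currently deployed system.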
|
48 |
Quantifying implicit and explicit constraints on physics-informed neural processes / Haoyang Zheng (10141679), 30 April 2021
<p>Due to strong interactions among the various phases and between the phases and fluid motions, multiphase flows (MPFs) are so complex that considerable effort is required to predict the sequential patterns of their phases and motions. The present work takes the physical constraints inherent in MPFs and enforces them in a physics-informed neural network (PINN) model, either explicitly or implicitly depending on the type of constraint. To predict the unobserved order parameters (OPs), which locate the phases, in future steps, conditional neural processes (CNPs) combined with long short-term memory (LSTM) networks (CNP-LSTM) are applied to quickly infer the dynamics of the phases after encoding only a few observations. The multiphase consistent and conservative boundedness mapping (MCBOM) algorithm is then applied to correct the OPs predicted by CNP-LSTM, so that mass conservation, the summation of the phase volume fractions to unity, the consistency of reduction, and the boundedness of the OPs are strictly satisfied. Next, the density of the fluid mixture is computed from the corrected OPs. The observed velocity and the density of the fluid mixture are then encoded in a physics-informed conditional neural process with long short-term memory (PICNP-LSTM), where the constraint of momentum conservation is included in the loss function. Finally, the unobserved velocity in future steps is predicted by PICNP-LSTM. The proposed physics-informed neural processes (PINPs) model (CNP-LSTM-MCBOM-PICNP-LSTM) for MPFs avoids unphysical behaviors of the OPs, accelerates convergence, and requires less data. The model successfully predicts several canonical MPF problems, namely the horizontal shear layer (HSL) and dam break (DB) problems, and its performance is validated.</p>
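A heavily simplified stand-in for the boundedness and sum-to-unity parts of the correction step (the real MCBOM algorithm also enforces mass conservation and consistency of reduction) is to clip each predicted volume fraction to [0, 1] and renormalize:

```python
def correct_volume_fractions(fracs, eps=1e-12):
    """Clip each phase's volume fraction to [0, 1] and rescale to sum to 1.

    Illustrative projection only -- not the actual MCBOM algorithm.
    """
    clipped = [min(max(f, 0.0), 1.0) for f in fracs]
    total = sum(clipped)
    if total < eps:   # degenerate input: fall back to a uniform mixture
        return [1.0 / len(fracs)] * len(fracs)
    return [f / total for f in clipped]

raw = [1.07, -0.03, 0.46]   # hypothetical network output violating both constraints
fixed = correct_volume_fractions(raw)
# every entry of fixed lies in [0, 1] and the entries sum to 1
```

Feeding constraint-satisfying OPs into the downstream density computation is what prevents unphysical mixture densities from corrupting the velocity prediction stage.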
|
49 |
A STUDY OF TRANSFORMER MODELS FOR EMOTION CLASSIFICATION IN INFORMAL TEXT / Alvaro S Esperanca (11797112), 07 January 2022
<div>Textual emotion classification is a task in affective AI that branches from sentiment analysis and focuses on identifying emotions expressed in a given text excerpt. </div><div>It has a wide variety of applications that improve human-computer interactions, particularly by empowering computers to understand subjective human language better. </div><div>Significant research has been done on this task, but very little of it leverages one of the most emotion-bearing symbols in modern communication: emojis.</div><div>In this thesis, we propose ReferEmo, a transformer-based model for emotion classification that processes emojis as textual input tokens and leverages the pretrained DeepMoji model to generate affective feature vectors, used as a reference when aggregating different modalities of text encoding. </div><div>To evaluate ReferEmo, we experimented on the SemEval 2018 and GoEmotions datasets, two benchmark datasets for emotion classification, and achieved competitive performance compared to state-of-the-art models tested on these datasets. Notably, our model performs better on the underrepresented classes of each dataset.</div>
|
50 |
Aktiemarknadsprognoser: En jämförande studie av LSTM- och SVR-modeller med olika dataset och epoker / Stock Market Forecasting: A Comparative Study of LSTM and SVR Models Across Different Datasets and Epochs / Nørklit Johansen, Mads; Sidhu, Jagtej, January 2023
Predicting stock market trends is a complex task due to the inherent volatility and unpredictability of financial markets. Nevertheless, accurate forecasts are of critical importance to investors, financial analysts, and stakeholders, as they directly inform decision-making processes and risk management strategies associated with financial investments. Inaccurate forecasts can lead to notable financial consequences, emphasizing the crucial and demanding task of developing models that provide accurate and trustworthy predictions. This article addresses this challenging problem by utilizing a long short-term memory (LSTM) model to predict stock market developments. The study undertakes a thorough analysis of the LSTM model's performance across multiple datasets, critically examining the impact of different timespans and epoch counts on the accuracy of its predictions. Additionally, a comparison is made with a support vector regression (SVR) model using the same datasets and timespans, allowing a comprehensive evaluation of the relative strengths of the two techniques. The findings offer insights into the capabilities and limitations of both models, paving the way for future research in stock market prediction methodologies. Crucially, the study reveals that larger datasets and an increased number of epochs can significantly enhance the LSTM model's performance, whereas the SVR model exhibits significant challenges with overfitting. Overall, this research contributes to ongoing efforts to improve financial prediction models and provides potential solutions for individuals and organizations seeking accurate and reliable forecasts of stock market trends.
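One detail implicit in any fair LSTM-versus-SVR comparison on market data is that the train/test split must be chronological rather than shuffled, so that neither model sees the future during training. A helper along these lines (illustrative, not the study's code):

```python
def chronological_split(series, train_frac=0.8):
    """Split a time-ordered series without shuffling, so the test set
    always lies strictly in the future of the training set."""
    cut = int(len(series) * train_frac)
    return series[:cut], series[cut:]

# Hypothetical daily closing prices, oldest first
prices = [101.2, 100.8, 102.5, 103.1, 102.9, 104.0, 103.7, 105.2, 104.9, 106.1]
train, test = chronological_split(prices, train_frac=0.8)
# train holds the first 8 observations, test the final 2
```

Overfitting of the kind the study reports for SVR shows up as a widening gap between the error on `train` and the error on the held-out `test` segment.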
|