581

Financial market monitoring and surveillance systems framework: a service systems and business intelligence approach

Diaz Solis, David Alejandro January 2012
The thesis introduces a framework for analysing market monitoring and surveillance systems in order to provide a common foundation for researchers and practitioners to specify, design, implement, compare and evaluate such systems. The proposed framework serves as a reference map for researchers and practitioners to position their work in the context of market monitoring and surveillance, resulting in a useful instrument for the analysis, testing and management of such systems. More specifically, the thesis examines the new requirements for the operation of financial markets, the role of technologies, the recent consultations on the structure and governance of EU and US markets, as well as future usage scenarios and emerging technologies. It examines the context in which market monitoring and market surveillance systems are currently being used. It reports on their processes, performance, and on the organisational and regulatory environments in which they exist. Furthermore, it develops a set of taxonomies covering the majority of the concepts of market manipulation, market monitoring, market surveillance, entities, technologies and actors relevant to the work in this thesis. Building on the gaps and limitations of current systems, it proposes a new framework following the Design Science methodology. The usefulness of the framework is evaluated through four critical case studies, which not only help, through practical exercises, to understand how market monitoring and surveillance systems work, but also investigate their weaknesses, potential evolution and ways to improve them. For each case study, the thesis develops a fully working prototype, tested using a sample prosecution case and evaluated in terms of the appropriateness and suitability of the proposed framework. Finally, implications relating to policies, procedures and future market structures are discussed, followed by suggestions for future research.
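The abstract does not reproduce any concrete monitoring rule, so as a hedged illustration of the kind of rule such surveillance systems encode, the sketch below flags traders with an unusually high order-cancellation ratio, a pattern commonly watched for in spoofing/layering detection. The event format, threshold values and function name are assumptions for illustration, not taken from the thesis.

```python
# Illustrative surveillance alert rule (not from the thesis): flag traders
# whose order-cancellation ratio exceeds a threshold.
from collections import Counter

def cancellation_alerts(events, max_ratio=0.8, min_orders=10):
    """events: list of (trader_id, action) with action in {'new', 'cancel'}."""
    placed, cancelled = Counter(), Counter()
    for trader, action in events:
        if action == "new":
            placed[trader] += 1
        elif action == "cancel":
            cancelled[trader] += 1
    alerts = []
    for trader, n in placed.items():
        # only alert on traders with enough activity to make the ratio meaningful
        if n >= min_orders and cancelled[trader] / n > max_ratio:
            alerts.append(trader)
    return alerts
```

A real system would combine many such rules with case-management workflows; this only shows the shape of a single alert check.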
582

O algoritmo de aprendizado semi-supervisionado co-training e sua aplicação na rotulação de documentos / The semi-supervised learning algorithm co-training applied to label text documents

Edson Takashi Matsubara 26 May 2004
In Machine Learning, the supervised approach usually requires a large number of labeled training examples to learn accurately. However, labeling is often performed manually, making the process costly and time-consuming. By contrast, unlabeled examples are often inexpensive and easier to obtain than labeled examples. This is especially true for text classification tasks involving on-line data sources, such as web pages, email and scientific papers. Text classification is of great practical importance today given the massive volume of online text available. Semi-supervised learning, a relatively new area in Machine Learning, blends supervised and unsupervised learning, and has the potential to reduce the need for expensive labeled data whenever only a small set of labeled examples is available. This work describes the semi-supervised learning algorithm co-training, which requires the description of each example to be partitioned into two distinct views. It should be observed that the two views required by co-training can easily be obtained from textual documents through pre-processing. In this work, several extensions of the co-training algorithm have been implemented. Furthermore, a computational environment for text pre-processing, called PreTexT, has been implemented in order to apply the co-training algorithm to text classification problems. Experimental results using co-training on three data sets are described: two related to text classification and one to web-page classification. The results, which range from excellent to poor, show that co-training, like other semi-supervised learning algorithms, is affected by modelling assumptions in a rather complicated way.
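The co-training loop described above can be sketched minimally: each view trains its own classifier on the labeled pool, labels its most confident unlabeled example, and feeds it back to the shared pool. The toy data and the deliberately simple frequency-based scorer below are assumptions for illustration; real implementations typically use naive Bayes per view.

```python
# Minimal co-training sketch: two views, each labels its most confident
# unlabeled example per round and adds it to the shared labeled pool.
from collections import Counter, defaultdict

def train_view(labelled, view):
    """Count word frequencies per class for one view (index 0 or 1)."""
    counts = defaultdict(Counter)
    for views, label in labelled:
        counts[label].update(views[view])
    return counts

def predict(counts, words):
    """Score each class by smoothed relative word frequency; return (label, score)."""
    best = None
    for label, c in counts.items():
        total = sum(c.values())
        score = sum((c[w] + 1) / (total + len(words)) for w in words)
        if best is None or score > best[1]:
            best = (label, score)
    return best

def co_train(labelled, unlabelled, rounds=2):
    labelled, unlabelled = list(labelled), list(unlabelled)
    for _ in range(rounds):
        for view in (0, 1):
            if not unlabelled:
                break
            model = train_view(labelled, view)
            # this view labels its single most confident unlabeled example
            scored = [(predict(model, ex[view]), ex) for ex in unlabelled]
            (label, _), ex = max(scored, key=lambda p: p[0][1])
            unlabelled.remove(ex)
            labelled.append((ex, label))
    return labelled
```

Each example is a pair of views, e.g. (title words, body words), which is exactly the kind of split the abstract says pre-processing can produce from text.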
583

Auxílio na prevenção de doenças crônicas por meio de mapeamento e relacionamento conceitual de informações em biomedicina / Support in the Prevention of Chronic Diseases by means of Mapping and Conceptual Relationship of Biomedical Information

Juliana Tarossi Pollettini 28 November 2011
Genomic medicine has suggested that exposure to risk factors from conception onwards may influence gene expression and consequently induce the development of chronic diseases in adulthood. Scientific papers reporting these discoveries indicate that epigenetics must be exploited to prevent diseases of high prevalence, such as cardiovascular diseases, diabetes and obesity. The large volume of scientific information burdens health care professionals interested in staying up to date, since searches for accurate information become complex and expensive. Some computational techniques can support the management of large biomedical information repositories and the discovery of knowledge. This study presents a framework to support surveillance systems that alert health professionals to human development problems by retrieving scientific papers that relate chronic diseases to risk factors detected in a patient's clinical record. As a contribution, healthcare professionals will be able to establish a routine with the family, setting up the best conditions for a child's development. According to Butte (2008), the effective transformation of results from biomedical research into knowledge that actually improves public health has been considered an important domain of informatics and has been called Translational Bioinformatics. Since chronic diseases are a serious health problem worldwide and lead the causes of mortality, accounting for 60% of all deaths, this scientific investigation may enable results from bioinformatics research to directly benefit public health.
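The core retrieval step above — relating risk-factor terms found in a clinical record to relevant papers — can be sketched with a simple inverted index and term-overlap scoring. The paper texts and record terms below are hypothetical, and a real system would use controlled biomedical vocabularies rather than raw word matching.

```python
# Hedged sketch: match risk-factor terms from a clinical record against
# paper abstracts using an inverted index and overlap scoring.
def build_index(papers):
    """papers: {paper_id: abstract text} -> {term: set of paper_ids}."""
    index = {}
    for pid, text in papers.items():
        for term in set(text.lower().split()):
            index.setdefault(term, set()).add(pid)
    return index

def rank_papers(index, record_terms):
    """Score papers by how many of the record's terms they contain."""
    scores = {}
    for term in record_terms:
        for pid in index.get(term.lower(), ()):
            scores[pid] = scores.get(pid, 0) + 1
    return sorted(scores, key=scores.get, reverse=True)
```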
584

Étude comparative du vocabulaire de description de la danse dans les archives et du vocabulaire de représentation de la danse dans la littérature / A comparative study of the vocabulary describing dance in archives and the vocabulary representing dance in literature

Paquette-Bigras, Ève 03 1900
This research falls within the field of digital humanities, bringing arts and information science into dialogue. In the last few decades, dance has become a distinct research subject. Dance description in archives needs to be improved, because the quality of the description affects access to the documentation. Knowledge extraction seems to offer new opportunities in this regard. The goal of this research is to contribute to the development of information management tools by comparing a vocabulary for describing dance in archives with a vocabulary for representing dance in literature obtained through knowledge extraction. We look for possible complementarity, particularly with regard to aesthetic experience. First, some tools for describing dance in archives are described, and the Collier Descriptor Thesaurus is analyzed. We observe that this vocabulary for describing dance in archives does not take aesthetic experience into account. Second, a vocabulary for representing dance in literature is analyzed. More specifically, a structured vocabulary of the aesthetic experience of modern dance is drawn from a corpus of texts by the French writer Stéphane Mallarmé, and the resulting vocabulary is analyzed. Finally, the two vocabularies are compared to assess their complementarity. We conclude that vocabularies for describing dance in archives consisting mainly of factual terms, such as the Collier Descriptor Thesaurus, can be enriched with terms related to aesthetic experience. The vocabulary for representing dance in literature complements, to a certain extent, the vocabulary for describing dance in archives. This initial experiment thus supports the relevance of knowledge extraction to the development and maintenance of information resources for performing arts such as dance.
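One standard way to extract a candidate vocabulary from a corpus, as the study above does from Mallarmé's texts, is to rank terms by tf-idf so corpus-specific words rise above common function words. The mini-corpus below is an invented stand-in; the actual extraction method used in the thesis is not detailed in the abstract.

```python
# Hedged sketch of corpus-driven vocabulary extraction: rank terms by tf-idf.
import math
from collections import Counter

def tfidf_ranking(corpus):
    """corpus: list of documents (strings) -> terms sorted by best tf-idf score."""
    docs = [doc.lower().split() for doc in corpus]
    df = Counter()                       # document frequency per term
    for words in docs:
        df.update(set(words))
    n = len(docs)
    scores = Counter()
    for words in docs:
        tf = Counter(words)
        for term, f in tf.items():
            # keep each term's best score across documents
            scores[term] = max(scores[term], f * math.log(n / df[term]))
    return [t for t, _ in scores.most_common()]
```

Terms appearing in every document (like "the") get an idf of zero and sink to the bottom of the ranking.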
585

Web mining for social network analysis

Elhaddad, Mohamed Kamel Abdelsalam 09 August 2021
Undoubtedly, the rapid development of information systems and the widespread use of electronic means and social networks have played a significant role in accelerating the pace of events worldwide, such as in the 2012 Gaza conflict (the 8-day war), the pro-secessionist rebellion of the 2013-2014 conflict in Eastern Ukraine, the 2016 US Presidential elections, and the COVID-19 pandemic since the beginning of 2020. As the amount of data shared daily on various social networking platforms, in different languages, grows quickly, techniques are needed to classify this huge volume of data automatically, promptly and correctly. Of the many social networking platforms, Twitter is among the most used. It allows its users to communicate, share their opinions, and express their emotions (sentiments) easily and at no cost in the form of short blogs. Moreover, unlike other social networking platforms, Twitter allows research institutions to access its public and historical data, upon request and under control. Therefore, many organizations at different levels (e.g., governmental, commercial) seek to benefit from the analysis and classification of shared tweets in many application domains, for example sentiment analysis, to evaluate and determine users' polarity from the content of their shared text, and misleading-information detection, to ensure the legitimacy and credibility of the shared information. To attain this objective, one can apply numerous data representation, preprocessing and natural language processing techniques, together with machine/deep learning algorithms. Existing approaches face several challenges and limitations, including the management of tweets in multiple languages, the determination of which features the feature vector should include, and the assignment of representative and descriptive weights to these features for different mining tasks.
In addition, existing performance evaluation metrics are limited in their ability to fully assess the developed classification systems. In this dissertation, two novel frameworks are introduced: the first efficiently analyzes and classifies bilingual (Arabic and English) textual content of social networks, while the second evaluates the performance of binary classification algorithms. The first framework provides: (1) an approach to handle tweets written in Arabic and English, extensible to data written in more languages and from other social networking platforms; (2) effective data preparation and preprocessing techniques; (3) a novel feature selection technique that allows utilizing different types of features (content-dependent, context-dependent, and domain-dependent); and (4) a novel feature extraction technique that assigns weights to the linguistic features based on how representative they are of the classes they belong to. The proposed framework is employed for sentiment analysis and misleading-information detection. Its performance is compared to state-of-the-art classification approaches on 11 benchmark datasets comprising both Arabic and English textual content, demonstrating considerable improvement on all performance evaluation metrics. The framework is then applied in a real-life case study to detect misleading information surrounding the spread of COVID-19. In the second framework, a new multidimensional classification assessment score (MCAS) is introduced. MCAS determines how good a classification algorithm is at binary classification problems. It takes into consideration the effect of misclassification errors on the probability of correct detection of instances from both classes, and it remains valid regardless of the size of the dataset and whether the distribution of its instances over the classes is balanced or unbalanced.
An empirical and practical analysis is conducted on both synthetic and real-life datasets to compare the behaviour of the proposed metric against commonly used ones. The analysis reveals that the new measure can distinguish the performance of different classification techniques. Furthermore, it allows a class-based assessment of classification algorithms, evaluating the ability of the algorithm on data from each class separately. This is useful when correctly classifying instances from one class matters more than instances from the other, as in COVID-19 testing, where the detection of positive patients is much more important than that of negative ones.
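The MCAS formula itself is not given in the abstract. As a hedged illustration of the class-based assessment idea it pursues — a score that penalises errors on either class even under imbalance — the sketch below computes per-class detection rates from a confusion matrix and combines them with a geometric mean (G-mean), a well-known metric of this kind, not the MCAS.

```python
# Class-based assessment from a binary confusion matrix (illustrative; not MCAS).
import math

def class_rates(tp, fn, tn, fp):
    sensitivity = tp / (tp + fn)   # detection rate for the positive class
    specificity = tn / (tn + fp)   # detection rate for the negative class
    return sensitivity, specificity

def g_mean(tp, fn, tn, fp):
    """Geometric mean of per-class rates: low if either class is detected poorly."""
    sens, spec = class_rates(tp, fn, tn, fp)
    return math.sqrt(sens * spec)
```

Unlike plain accuracy, this drops sharply when one class is badly detected, which is the behaviour the dissertation argues a binary-classification score should have.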
586

Zpracování uživatelských recenzí / Processing of User Reviews

Cihlářová, Dita January 2019
Very often, people buy goods on the Internet that they cannot see or try, and therefore rely on the reviews of other customers. However, there may be too many reviews for a human to process quickly and comfortably. The aim of this work is to offer an application that can recognize, in Czech reviews, which features of a product are commented on most, and whether each comment is positive or negative. The results can save e-shop customers a lot of time and provide interesting feedback to the manufacturers of the products.
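The aspect-plus-polarity idea above can be sketched in a few lines: count mentions of known product aspects and score each mention by sentiment words in the same sentence. The English stand-in words and the tiny lexicons are assumptions for illustration — the thesis works on Czech text and presumably uses a richer method.

```python
# Hedged sketch of aspect-based review summarisation with a word lexicon.
POSITIVE = {"great", "excellent", "good"}
NEGATIVE = {"poor", "bad", "weak"}

def aspect_sentiment(reviews, aspects):
    """Per aspect: how often it is mentioned and the net sentence-level tone."""
    summary = {a: {"mentions": 0, "polarity": 0} for a in aspects}
    for review in reviews:
        for sentence in review.lower().split("."):
            words = sentence.split()
            tone = sum((w in POSITIVE) - (w in NEGATIVE) for w in words)
            for a in aspects:
                if a in words:
                    summary[a]["mentions"] += 1
                    summary[a]["polarity"] += tone
    return summary
```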
587

Multi-label klasifikace textových dokumentů / Multi-Label Classification of Text Documents

Průša, Petr January 2012
The master's thesis deals with the automatic classification of text documents. It explains basic terms and problems of text mining, introduces term clustering, and presents some basic clustering algorithms. The thesis also surveys classification methods and deals in detail with matrix regression. An application using matrix regression for classification was designed and developed. The experiments focused on normalization and thresholding.
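The thresholding step mentioned above is what turns multi-label scores into label sets: each document keeps every label whose score clears a cutoff, rather than only the top-scoring one. A minimal sketch, with per-label thresholds calibrated on validation scores by maximising F1 (toy data and function names are illustrative, not from the thesis):

```python
# Multi-label thresholding sketch: keep every label whose score clears its cutoff.
def multilabel_predict(scores, thresholds):
    """scores: {label: score}; thresholds: {label: cutoff} -> sorted label list."""
    return sorted(l for l, s in scores.items() if s >= thresholds[l])

def best_threshold(pairs):
    """pairs: list of (score, is_relevant); pick the cutoff maximising F1."""
    best_t, best_f1 = 0.5, -1.0
    for t in sorted({s for s, _ in pairs}):
        tp = sum(1 for s, r in pairs if s >= t and r)
        fp = sum(1 for s, r in pairs if s >= t and not r)
        fn = sum(1 for s, r in pairs if s < t and r)
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t
```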
588

Rozpoznávání emocí v česky psaných textech / Recognition of emotions in Czech texts

Červenec, Radek January 2011
With advances in information and communication technologies over the past few years, the amount of information stored in the form of electronic text documents has been growing rapidly. Since human abilities to effectively process and analyze large amounts of information are limited, there is an increasing demand for tools that automatically analyze these documents and benefit from their emotional content. Such systems have extensive applications. The purpose of this work is to design and implement a system for identifying expressions of emotion in Czech texts. The proposed system is based mainly on machine learning methods, and the design and creation of a training set are therefore described as well. The training set is then used to build a classifier model using SVM (support vector machines). To improve the classification results, additional components were integrated into the system, such as a lexical database, a lemmatizer and a derived keyword dictionary. The thesis also presents the results of classifying text documents into the defined emotion classes and evaluates various approaches to categorization.
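The classifier at the heart of the system above is an SVM. As a hedged, self-contained illustration of how a linear SVM can be trained, here is the Pegasos sub-gradient method on toy 2-D points; the thesis itself works on engineered text features (lemmas, keywords) and presumably an SVM library, so everything below is a stand-in.

```python
# Minimal linear SVM trained with the Pegasos sub-gradient method (toy data).
def pegasos(data, lam=0.01, epochs=200):
    """data: list of (x, y) with x a feature tuple and y in {-1, +1}."""
    w = [0.0] * len(data[0][0])
    t = 0
    for _ in range(epochs):
        for x, y in data:            # deterministic sweep instead of random sampling
            t += 1
            eta = 1.0 / (lam * t)
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            w = [(1 - eta * lam) * wi for wi in w]   # regularisation shrinkage
            if margin < 1:                            # hinge-loss sub-gradient step
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w

def predict(w, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1
```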
589

Reprezentace textu a její vliv na kategorizaci / Representation of Text and Its Influence on Categorization

Šabatka, Ondřej January 2010
The thesis deals with the machine processing of textual data. The theoretical part describes issues related to natural language processing and introduces different ways of pre-processing and representing text. The thesis also focuses on the use of N-grams as features for document representation and describes some algorithms used to extract them. The next part outlines the classification methods used. In the practical part, an application for pre-processing and creating different textual data representations is designed and implemented. The experiments analyse the influence of these representations on the accuracy of classification algorithms.
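N-gram extraction, the representation the thesis focuses on, fits in a few lines: a sliding window over either words or characters. A minimal sketch (the thesis's own extraction algorithms may differ):

```python
# Sliding-window N-gram extraction at the word and character level.
def word_ngrams(text, n):
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

def char_ngrams(text, n):
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]
```

Counting these N-grams per document yields the feature vectors whose effect on classification accuracy the experiments measure.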
590

Modelling of a System for the Detection of Weak Signals Through Text Mining and NLP. Proposal of Improvement by a Quantum Variational Circuit

Griol Barres, Israel 30 May 2022
Thesis by compendium / In this doctoral thesis, a system to detect weak signals related to future transcendental changes is proposed and tested. While most known solutions are based on the use of structured data, the proposed system quantitatively detects these signals using heterogeneous and unstructured information from scientific, journalistic, and social sources. Predicting new trends in an environment has many applications. For instance, companies and startups face constant changes in their markets that are very difficult to predict. For this reason, developing systems to automatically detect significant future changes at an early stage is relevant for any organization that needs to make the right decisions in time. This work has been designed to obtain weak signals of the future in any field, depending only on the input dataset of documents. Text mining and natural language processing techniques are applied to process all these documents. As a result, a map of ranked terms, a list of automatically classified keywords and a list of multi-word expressions are obtained. The overall system has been tested in four different sectors: solar panels, artificial intelligence, remote sensing, and medical imaging. This work has obtained promising results, evaluated with two different methodologies. The system was able to detect, at a very early stage, new trends that have since become more and more important. Quantum computing is a new paradigm for a multitude of computing applications. This doctoral thesis also presents a study of the technologies currently available for the physical implementation of qubits and quantum gates, establishing their main advantages and disadvantages, as well as the frameworks available for programming and implementing quantum circuits. In order to improve the effectiveness of the system, a design of a quantum circuit based on support vector machines (SVMs) is described for solving classification problems. This circuit is specially designed for the noisy intermediate-scale quantum (NISQ) processors that are currently available. As an experiment, the circuit has been tested on a real quantum computer by IBM, based on superconducting qubits, as an improvement to the text mining subsystem in the detection of weak signals. The results of the quantum experiment show interesting outcomes, with an improvement of close to 20% in performance over conventional systems, but also confirm that ongoing technological development is still required to take full advantage of quantum computing. / Griol Barres, I. (2022). Modelling of a System for the Detection of Weak Signals Through Text Mining and NLP. Proposal of Improvement by a Quantum Variational Circuit [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/183029
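The quantum circuit itself requires quantum hardware or a simulator, so as a classical stand-in for the kernel-based classification it performs, here is a kernel perceptron (a simpler kernel method than a full SVM, swapped in deliberately) with an RBF kernel fitting the XOR pattern, a toy task that is not linearly separable. The data and parameters are assumptions for illustration.

```python
# Classical kernel-method stand-in: a kernel perceptron with an RBF kernel.
import math

def rbf(a, b, gamma=1.0):
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))

def train_kernel_perceptron(data, epochs=10):
    """data: list of (x, y) with y in {-1, +1}; returns per-example weights."""
    alpha = [0] * len(data)
    for _ in range(epochs):
        errors = 0
        for i, (x, y) in enumerate(data):
            f = sum(a * yj * rbf(xj, x) for a, (xj, yj) in zip(alpha, data))
            if y * f <= 0:           # misclassified: strengthen this example
                alpha[i] += 1
                errors += 1
        if errors == 0:              # converged on the training set
            break
    return alpha

def predict(alpha, data, x):
    f = sum(a * yj * rbf(xj, x) for a, (xj, yj) in zip(alpha, data))
    return 1 if f >= 0 else -1
```

A quantum kernel classifier follows the same template, with the kernel evaluated by a quantum feature-map circuit instead of the RBF function.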
