Global ETD Search

1	T bangos kaitos analizė, naudojant modifikuotą slenkančio vidurkio metodą: įvairių laiko eilučių išlyginimo ir panašumo nustatymo būdų palyginimas / T wave alternans analysis using a modified moving average method: a comparison of various time series alignment and similarity detection techniques Puronaitė, Roma 04 July 2014 This work analyzes how various time series alignment and similarity detection methods can be applied in T wave alternans (TWA) analysis to improve the modified moving average method proposed by Nearing and Verrier. Using the TWA database and generated data, the alignment and similarity evaluation methods best suited to TWA analysis were identified. The suitability for heart disease diagnostics of TWA computed with the modified moving average method, supplemented by those best-suited alignment and similarity detection methods, was tested on data from the PTB database. Based on the PTB data, a possible biomarker for heart disease diagnostics was found: the min-max combination of the last two TWA estimates obtained by applying the modified moving average method with open-begin, open-end dynamic time warping alignment using an asymmetric step pattern, with similarity measured as the absolute difference between the maximum points. / T wave alternans (TWA) is a beat-to-beat change in the amplitude or shape of the T wave. TWA is one of the potential biomarkers for ventricular arrhythmias and can be a sign of serious heart disease. Because there is no gold standard for measuring TWA, modifications of existing methods and new solutions are possible. The modified moving average method, proposed by Nearing and Verrier, is among the most widely used in medical practice, but it can give misleading results when T waves are not properly aligned or when T wave length and morphology change because of heart rate variability. 
It is known that some ventricular arrhythmias can cause heart rate variability, so this type of error is undesirable, because online TWA measurement may become a predictor of sudden ventricular arrhythmias in the near future. In this work, various time series alignment and similarity detection techniques were used to improve TWA measurement, and the capabilities of this measure in heart disease diagnostics were analyzed. TWA analysis was performed with simulated data and real data from ECG databases, and a potential biomarker was found using the biomarker-combining method proposed by Liu, Liu and Halabi. 57. T wave alternans Time series alignment Similarity detection
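As an illustration of the kind of computation the abstract above refers to, the following is a minimal sketch of a simplified modified-moving-average TWA estimate. It is not the thesis's implementation: the function name `mma_twa`, the update fraction of 1/8 and the step limit of 32 are assumptions chosen for illustration, and the beats are assumed to be already aligned and of equal length, which is exactly the condition the thesis tries to relax.

```python
def mma_twa(beats, fraction=8.0, limit=32.0):
    """Simplified modified-moving-average TWA estimate (illustrative only).

    beats: list of aligned T waves, each a list of amplitudes of equal length.
    fraction, limit: assumed update parameters, not taken from the thesis.
    """
    if len(beats) < 2:
        raise ValueError("need at least one even and one odd beat")
    # Running averages for the two beat streams: index 0 = even, 1 = odd.
    avg = [[float(x) for x in beats[0]], [float(x) for x in beats[1]]]
    for n in range(2, len(beats)):
        k = n % 2
        for i, sample in enumerate(beats[n]):
            # Move the average a bounded step toward the new beat,
            # so a single noisy beat cannot dominate the estimate.
            step = (sample - avg[k][i]) / fraction
            avg[k][i] += max(-limit, min(limit, step))
    # TWA estimate: largest absolute gap between the two averaged T waves.
    return max(abs(a - b) for a, b in zip(avg[0], avg[1]))


if __name__ == "__main__":
    # Synthetic example: even and odd beats differ by 20 units at the peak.
    example = [[0, 10, 0], [0, 30, 0]] * 4
    print(mma_twa(example))
```

If the T waves are shifted relative to one another, the sample-by-sample difference in the last line overstates the alternans, which is why an alignment step before averaging matters.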
2	The Value of Everything: Ranking and Association with Encyclopedic Knowledge Coursey, Kino High 12 1900 This dissertation describes WikiRank, an unsupervised method of assigning relative values to elements of a broad-coverage encyclopedic information source in order to identify those entries that may be relevant to a given piece of text. The valuation given to an entry is based not on textual similarity but instead on the links that associate entries, and an estimation of the expected frequency of visitation that would be given to each entry based on those associations in context. This estimation of relative frequency of visitation is embodied in modifications to the random walk interpretation of the PageRank algorithm. WikiRank is an effective algorithm to support natural language processing applications. It is shown to exceed the performance of previous machine learning algorithms for the task of automatic topic identification, providing results comparable to those of human annotators. Second, WikiRank is found useful for the task of recognizing text-based paraphrases on a semantic level, by comparing the distribution of attention generated by two pieces of text using the encyclopedic resource as a common reference. Finally, WikiRank is shown to have the ability to use its base of encyclopedic knowledge to recognize terms from different ontologies as describing the same thing, thus allowing the automatic generation of mapping links between ontologies. The conclusion of this thesis is that the "knowledge access heuristic" is valuable and that a ranking process based on a large encyclopedic resource can form the basis for an extendable general-purpose mechanism capable of identifying relevant concepts by association, which in turn can be effectively utilized for enumeration and comparison at a semantic level. 
PageRank topic identification similarity detection mapreduce ontology matching associative ranking Wikipedia WikiRank Information retrieval. Computational linguistics. Relevance. Wikipedia.
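The abstract above describes ranking entries by expected visitation frequency under a modified random walk. The sketch below shows the standard personalized PageRank power iteration, which is one common way to bias a random walk toward entries mentioned in a text; it is not WikiRank itself, whose specific modifications are not given here. The function name, the toy link graph and the seed entries are all hypothetical.

```python
def personalized_pagerank(links, seeds, damping=0.85, iterations=50):
    """Standard personalized PageRank by power iteration (not WikiRank).

    links: dict mapping an entry to the list of entries it links to.
    seeds: non-empty set of entries the walk restarts from,
           e.g. entries matched in the input text.
    """
    nodes = sorted(set(links) | {t for ts in links.values() for t in ts} | set(seeds))
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        # With probability (1 - damping) the walker jumps back to a seed.
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            targets = links.get(n, [])
            if targets:
                share = damping * rank[n] / len(targets)
                for t in targets:
                    nxt[t] += share
            else:
                # Dangling entry: send its mass back to the seeds.
                for m in nodes:
                    nxt[m] += damping * rank[n] * restart[m]
        rank = nxt
    return rank


if __name__ == "__main__":
    # Hypothetical miniature link graph between encyclopedia entries.
    toy_links = {
        "Jaguar": ["Cat", "Car"],
        "Cat": ["Animal"],
        "Car": ["Engine"],
        "Animal": [],
        "Engine": [],
    }
    scores = personalized_pagerank(toy_links, seeds={"Jaguar", "Cat"})
    for entry, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(f"{entry}: {score:.3f}")
```

Because the restart distribution depends on the text, the same graph yields different rankings for different inputs, which is the property that makes link-based ranking usable for topic identification.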
3	Similarités textuelles sémantiques translingues : vers la détection automatique du plagiat par traduction / Cross-lingual semantic textual similarity: towards automatic cross-language plagiarism detection Ferrero, Jérémy 08 December 2017 The massive availability of documents on the Internet (web pages, data warehouses, digital, digitized or transcribed documents, etc.) makes it increasingly easy to pick up ideas. Unfortunately, this phenomenon is accompanied by an increase in plagiarism cases. Indeed, appropriating content, whatever its form, without the consent of its author (or rights holders) and without citing sources, in order to present it as one's own work or creation, is considered plagiarism. Moreover, in recent years the expansion of the Internet has also made it easier to access documents from all over the world (written in foreign languages) and increasingly capable machine translation tools, accelerating the spread of a new type of plagiarism: cross-language plagiarism. This plagiarism involves borrowing a text while translating it (manually or automatically) from its original language into the language of the document in which the plagiarist wants to include it. Today, plagiarism prevention is starting to bear fruit, thanks in particular to effective anti-plagiarism software that relies on well-proven monolingual comparison techniques. However, such software does not yet handle cross-language cases effectively. This thesis arose from the need of Compilatio, a company that publishes one of these anti-plagiarism tools, to measure cross-lingual semantic textual similarity (a sub-task of plagiarism detection). After defining plagiarism and the various concepts addressed in this thesis, we review the state of the art in cross-language plagiarism detection approaches. 
We also present the existing corpora for cross-language plagiarism detection and set out the limitations they may encounter when used to evaluate cross-language plagiarism detection methods. We then present the corpus we built, which is free of most of the limitations of the existing corpora. Using this new corpus, we evaluate several state-of-the-art methods and find that they behave differently depending on certain characteristics of the texts on which they operate. Next, we present new methods for measuring cross-lingual semantic textual similarity based on continuous word representations (word embeddings). We also propose a notion of morphosyntactic and frequency weighting of words, which can be used within a vector as well as within a bag of words, and we show that introducing it into these new methods improves their respective performance. We then test different systems for fusing and combining methods and study the performance, on our corpus, of these methods and fusions by comparing them with state-of-the-art methods. We thus obtain better results than the state of the art in all the sub-corpora studied. We conclude by presenting and discussing the results of these methods during our participation in the cross-lingual semantic textual similarity (STS) task of the SemEval 2017 evaluation campaign, where we ranked 1st in the sub-task that best matches Compilatio's industrial scenario. / The massive amount of documents available through the Internet (e.g. web pages, data warehouses and digital or transcribed texts) makes the recycling of ideas easier. Unfortunately, this phenomenon is accompanied by an increase in plagiarism cases. 
Indeed, claiming ownership of content without the consent of its author and without crediting its source, and presenting it as new and original, is considered plagiarism. In addition, the expansion of the Internet, which facilitates access to documents throughout the world (written in foreign languages), together with increasingly efficient (and freely available) machine translation tools, contributes to the spread of a new kind of plagiarism: cross-language plagiarism. Cross-language plagiarism means plagiarism by translation, i.e. a text has been plagiarized while being translated (manually or automatically) from its original language into the language of the document in which the plagiarist wishes to include it. While prevention of plagiarism is an active field of research and development, it covers mostly monolingual comparison techniques. This thesis is a joint work between an academic laboratory (LIG) and Compilatio (a software publishing company of solutions for plagiarism detection), and proposes cross-lingual semantic textual similarity measures, which is an important sub-task of cross-language plagiarism detection. After defining plagiarism and the different concepts discussed in this thesis, we present the state of the art of the different cross-language plagiarism detection approaches. We also present the preexisting corpora for cross-language plagiarism detection and show their limits. Then we describe how we have gathered and built a new dataset, which does not have most of the limits encountered by the preexisting corpora. Using this new dataset, we conduct a rigorous evaluation of several state-of-the-art methods and discover that they behave differently according to certain characteristics of the texts on which they operate. 
We next present new methods for measuring cross-lingual semantic textual similarities based on word embeddings. We also propose a notion of morphosyntactic and frequency weighting of words, which can be used both within a vector and within a bag-of-words, and we show that its introduction in the new methods increases their respective performance. Then we test different fusion systems (mostly based on linear regression). Our experiments show that we obtain better results than the state of the art in all the sub-corpora studied. We conclude by presenting and discussing the results of these methods obtained during our participation in the cross-lingual Semantic Textual Similarity (STS) task of SemEval-2017, where we ranked 1st on the sub-task that best corresponds to Compilatio’s use-case scenario. Plagiarism detection Similarity detection Translation Cross-Language Cross-Lingual Semantic textual similarity 004
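To make the idea of a weighted bag-of-words similarity over word embeddings concrete, here is a minimal sketch. It assumes the two languages already share one embedding space, and it uses a hand-made two-dimensional vocabulary and hand-picked weights; the names `EMBEDDINGS`, `WEIGHTS` and `sts_score` are hypothetical, and the weighting scheme is a stand-in for the morphosyntactic and frequency weighting described in the abstract, not a reproduction of it.

```python
import math

# Hypothetical toy vectors in a shared English/French embedding space.
EMBEDDINGS = {
    "the": [0.5, 0.5], "le": [0.5, 0.5], "la": [0.5, 0.5],
    "cat": [1.0, 0.0], "chat": [0.9, 0.1],
    "car": [0.0, 1.0], "voiture": [0.1, 0.9],
}
# Hypothetical weights: function words count less than content words.
WEIGHTS = {"the": 0.1, "le": 0.1, "la": 0.1}


def sentence_vector(tokens):
    """Weighted sum of the embeddings of the known tokens."""
    dim = len(next(iter(EMBEDDINGS.values())))
    total = [0.0] * dim
    for tok in tokens:
        if tok in EMBEDDINGS:
            w = WEIGHTS.get(tok, 1.0)
            for i, x in enumerate(EMBEDDINGS[tok]):
                total[i] += w * x
    return total


def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def sts_score(tokens_a, tokens_b):
    """Similarity between two tokenized sentences, possibly in different languages."""
    return cosine(sentence_vector(tokens_a), sentence_vector(tokens_b))


if __name__ == "__main__":
    print(sts_score(["the", "cat"], ["le", "chat"]))
    print(sts_score(["the", "cat"], ["la", "voiture"]))
```

Down-weighting frequent function words keeps them from pulling every sentence vector toward the same direction, which is one plausible reason a weighting step would improve such measures.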
4	Evaluating Similarity of Cross-Architecture Basic Blocks Meyer, Elijah L. 26 May 2022 No description available. Computer Science neural network long short term memory natural language processing architecture ghidra analysis binary similarity detection keras intermediate representation ARM x86