331

Kompendium der Online-Forschung (DGOF)

Deutsche Gesellschaft für Online-Forschung e. V. (DGOF) 24 November 2021 (has links)
Here the DGOF publishes digital compendia on current topics in online research, with contributions from experts in the field.
332

Aiding Remote Diagnosis with Text Mining / Underlätta fjärrdiagnostik genom textbaserad datautvinning

Hellström Karlsson, Rebecca January 2017 (has links)
The topic of this thesis is how text mining could be used on patient-reported symptom descriptions, and how it could aid doctors in their diagnostic process. Healthcare delivery today is struggling to reach remote settings, and costs are increasing together with the aging population. How much text mining on patient descriptions can aid doctors is unknown. Investigating whether doctors are aided by additional information, based on what patients with descriptions similar to the current patient's have written, is relevant to many settings in healthcare: it has the potential to improve the quality of care in remote settings and to increase the number of patients treated with the limited resources available. In this work, patient texts were represented using the Bag-of-Words model and clustered using the k-means algorithm. The final clustering model used 41 clusters, and the ten most important words of each cluster centroid were used as that cluster's representative words. An experiment was then performed to gauge how doctors were aided in their diagnostic process when patient texts were paired with these additional words. The results were that the words aided doctors in difficult patient cases, and that the clustering algorithm can be used to pose specific follow-up questions to the current patient.
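A minimal sketch of the described pipeline, assuming scikit-learn is available; the sample texts are invented, and the cluster count is reduced from the thesis's 41 to fit the toy corpus:

```python
# Bag-of-Words + k-means, reporting each centroid's top-weighted terms.
# Illustrative sketch only: texts are invented; the thesis used 41 clusters.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import CountVectorizer

patient_texts = [
    "sharp pain in lower back after lifting",
    "back pain radiating down the left leg",
    "dry cough and mild fever for three days",
    "persistent cough with sore throat and fever",
]

vectorizer = CountVectorizer(stop_words="english")   # Bag-of-Words model
X = vectorizer.fit_transform(patient_texts)

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# The highest-weighted vocabulary terms of each centroid represent the cluster
terms = vectorizer.get_feature_names_out()
for label, centroid in enumerate(kmeans.cluster_centers_):
    top = centroid.argsort()[::-1][:10]
    print(label, [terms[i] for i in top])
```

In the experiment described above, these per-cluster terms are what would be shown to the doctor alongside the patient's own text.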
333

Automating debugging through data mining / Automatisering av felsökning genom data mining

Thun, Julia, Kadouri, Rebin January 2017 (has links)
Contemporary technological systems generate massive quantities of log messages, which can be stored, searched, and visualized efficiently using log management and analysis tools. Analyzing log messages offers insights into system behavior such as performance, server status, and execution faults in web applications. iStone AB wants to explore the possibility of automating its debugging process: since iStone does most of its debugging manually, finding errors within the system takes time, so the aim was to find solutions that reduce the time it takes to debug. Access and console logs were analyzed so that the most appropriate data mining techniques for iStone's system could be chosen, and data mining algorithms as well as log management and analysis tools were compared. The comparisons showed that the ELK Stack, together with a mixture of Eclat and a hybrid algorithm (Eclat and Apriori), were the most appropriate choices. To demonstrate their feasibility, the ELK Stack and Eclat were implemented. The results show that data mining and a log analysis platform can facilitate debugging and reduce the time it takes.
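To illustrate the Eclat side of that comparison, here is a self-contained sketch that treats each set of log events as a transaction and finds frequently co-occurring events by intersecting vertical transaction-id lists; the event names and support threshold are invented, not iStone's data:

```python
# Eclat over toy "log event" transactions: depth-first search on tid-lists.
transactions = [
    {"timeout", "retry", "db_error"},
    {"timeout", "db_error"},
    {"login_ok", "timeout"},
    {"timeout", "retry"},
]
min_support = 2

# Vertical representation: item -> set of transaction ids containing it
tidlists = {}
for tid, items in enumerate(transactions):
    for item in items:
        tidlists.setdefault(item, set()).add(tid)

def eclat(prefix, candidates, results):
    # Extend the current prefix with each frequent item, then recurse on
    # intersected tid-lists of the remaining candidates.
    for i, (item, tids) in enumerate(candidates):
        if len(tids) >= min_support:
            results[prefix + (item,)] = len(tids)
            suffix = [(other, tids & other_tids)
                      for other, other_tids in candidates[i + 1:]]
            eclat(prefix + (item,), suffix, results)

frequent = {}
eclat((), sorted(tidlists.items()), frequent)
for itemset, support in sorted(frequent.items()):
    print(itemset, support)
```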
334

針對臉書粉絲專頁貼文之政治傾向預測 / Predicting Political Affiliation for Posts on Facebook Fan Pages

張哲嘉, Chang, Che Chia Unknown Date (has links)
Recently, social media, and Facebook in particular, has become more and more popular. In Taiwan there are 15 million Facebook users, ranging from celebrities to the general public, and receiving information from Facebook every day has become part of most people's lifestyle. These platforms contain many meaningful messages, including users' emotions and affiliations, and using social media data to predict election results and political affiliation has become a trend in Taiwan: politicians try to win elections and predict polls by means of the Internet and social media, and every political party has its own fan pages. In this thesis, we predict the political inclination of fan-page posts, focusing on the KMT and the DPP, the two largest political parties in Taiwan. We select suitable literal and interactive features, build classification models on the posts of the two parties, and compare the performance of different classifiers. The results show that party-typical word features combined with log-scaled interactive features work best with a KNN classifier, whose accuracy and F1-score are 0.908 and 0.827, respectively.
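A hedged sketch of the best-performing configuration reported above, with TF-IDF standing in for the party-typical word features; the posts, labels, and interaction counts are invented placeholders:

```python
# Literal (word) features + log-scaled interaction features, KNN classifier.
import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier

posts = [
    "support cross strait trade and stable governance",
    "defend sovereignty and push transitional justice",
    "economic growth through closer regional cooperation",
    "protect local industry and democratic reform",
]
labels = [0, 1, 0, 1]                       # 0 = KMT, 1 = DPP (toy labels)
interactions = np.array([[120, 5], [300, 40], [80, 2], [450, 60]])  # likes, shares

text_features = TfidfVectorizer().fit_transform(posts)
interaction_features = csr_matrix(np.log1p(interactions))  # log-scaled counts
X = hstack([text_features, interaction_features]).tocsr()

knn = KNeighborsClassifier(n_neighbors=3).fit(X, labels)
print(knn.predict(X[:1]))                   # predicted party for the first post
```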
335

運用文字探勘技術輔助建構法律條文之語意網路-以公司法為例 / Using Text Mining to Support the Construction of a Semantic Network of Legal Provisions: A Case Study of the Company Act

張露友 Unknown Date (has links)
This thesis uses text mining techniques to automatically calculate, compare, and analyze the similarity between legal provisions, helping experts derive rules from the many articles of the Company Act and establish relations between them, so that the code is not an isolated collection of article numbers and texts but a network of semantically linked provisions. In explaining and evaluating these relations, the thesis also examines the difficulties and limitations of applying text mining to legal texts, as a reference for subsequent research. On the positive side, the results can establish a method for using text mining to support legal knowledge extraction; on the negative side, if the results show that text mining does not readily apply to knowledge extraction in the legal domain, the conclusions and suggestions offered here can still serve as an important reference for professionals seeking to improve the relevant techniques.
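A minimal sketch of the core computation, assuming TF-IDF cosine similarity as the inter-article measure; the article texts are paraphrased English placeholders, and a real system would first segment the Chinese legal text into words:

```python
# Pairwise article similarity -> edges of a semantic network of provisions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

articles = {
    "Art. 1": "a company is a corporate juristic person organized for profit",
    "Art. 2": "companies are classified as unlimited limited or joint stock",
    "Art. 8": "the responsible persons of a company include its directors",
}
names = list(articles)
tfidf = TfidfVectorizer().fit_transform(articles.values())
sim = cosine_similarity(tfidf)

threshold = 0.1                     # assumed cut-off for drawing an edge
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        if sim[i, j] >= threshold:
            print(names[i], "--", names[j], round(float(sim[i, j]), 3))
```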
336

Introducing Explorer of Taxon Concepts with a case study on spider measurement matrix building

Cui, Hong, Xu, Dongfang, Chong, Steven S., Ramirez, Martin, Rodenhausen, Thomas, Macklin, James A., Ludäscher, Bertram, Morris, Robert A., Soto, Eduardo M., Koch, Nicolás Mongiardino 17 November 2016 (has links)
Background: Taxonomic descriptions are traditionally composed in natural language and published in a format that cannot be used directly by computers. The Exploring Taxon Concepts (ETC) project has been developing a set of web-based software tools that convert morphological descriptions published in telegraphic style into character data that can be reused and repurposed. This paper introduces the first semi-automated pipeline, to our knowledge, that converts morphological descriptions into taxon-character matrices to support systematics and evolutionary biology research. We then demonstrate and evaluate the use of the ETC Input Creation - Text Capture - Matrix Generation pipeline to generate body part measurement matrices from a set of 188 spider morphological descriptions and report the findings. Results: From the given set of spider taxonomic publications, two versions of input (original and normalized) were generated and used by the ETC Text Capture and ETC Matrix Generation tools. The tools produced two corresponding spider body part measurement matrices, and the matrix from the normalized input was found to be much more similar to a gold standard matrix hand-curated by the scientist co-authors. Special conventions used in the original descriptions (e.g., the omission of measurement units) accounted for the lower performance on the original input. The results show that simple normalization of the description text greatly increased the quality of the machine-generated matrix and reduced edit effort. The machine-generated matrix also helped identify issues in the gold standard matrix. Conclusions: ETC Text Capture and ETC Matrix Generation are low-barrier and effective tools for extracting measurement values from spider taxonomic descriptions and are more effective when the descriptions are self-contained. Special conventions that make the description text less self-contained challenge automated extraction of data from biodiversity descriptions and hinder the automated reuse of the published knowledge. The tools will be updated to support new requirements revealed in this case study.
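For a sense of what such a pipeline automates, here is a toy sketch that pulls body-part/measurement pairs out of a telegraphic description line; the regex, the sample sentence, and the assumed unit are simplifications, not the ETC implementation:

```python
# Extract (part, value) pairs from telegraphic spider description text.
import re

description = "Total length 5.21. Carapace 2.40 long, 1.90 wide. Femur I 2.15."

# Capitalized part name, optional lowercase words and leg numeral, then a value
pattern = re.compile(
    r"(?P<part>[A-Z][a-z]+(?: [a-z]+)*(?: [IVX]+)?)\s+(?P<value>\d+\.\d+)")
for m in pattern.finditer(description):
    # Units are often omitted in the original text, which is exactly the
    # convention the study found to hurt extraction; mm is assumed here.
    print(m.group("part"), "=", float(m.group("value")), "mm (assumed)")
```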
337

Identifying Genetic Pleiotropy through a Literature-wide Association Study (LitWAS) and a Phenotype Association Study (PheWAS) in the Age-related Eye Disease Study 2 (AREDS2)

Simmons, Michael 26 May 2017 (has links)
A Thesis submitted to The University of Arizona College of Medicine - Phoenix in partial fulfillment of the requirements for the Degree of Doctor of Medicine. / Genetic association studies simplify the investigation of genotype-phenotype relationships by considering only the presence of a given polymorphism and the presence or absence of a given downstream phenotype. Although such associations do not indicate causation, collections of phenotypes sharing association with a single genetic polymorphism may provide valuable mechanistic insights. In this thesis we explore such genetic pleiotropy with Deep Phenotype Association Studies (DeePAS) using data from the Age-Related Eye Disease Study 2 (AREDS2). We also employ a novel text mining approach to extract pleiotropic associations from the published literature as a hypothesis-generation mechanism: is it possible to identify pleiotropic genetic associations across multiple published abstracts and validate them in data from AREDS2? Data from the AREDS2 trial include 123 phenotypes covering AMD features, other ocular conditions, cognitive function, and cardiovascular, neurological, gastrointestinal, and endocrine disease. A previously validated relationship-extraction algorithm was used to isolate descriptions of genetic associations with these phenotypes in MEDLINE abstracts; results were filtered to exclude negated findings and to normalize variant mentions. Genotype data were available for 1826 AREDS2 participants. A DeePAS was performed by evaluating the association between selected SNPs and all available phenotypes, and associations that remained significant after Bonferroni correction were replicated in AREDS. The LitWAS analysis identified 9372 SNPs with literature support for at least two distinct phenotypes, averaging 3.1 phenotypes per SNP. The PheWAS analyses revealed that two variants of the ARMS2-HTRA1 locus at 10q26, rs10490924 and rs3750846, were significantly associated with sub-retinal hemorrhage in AMD (rs3750846 OR 1.79 (1.41-2.27), p = 1.17×10^-7). This association remained significant even among participants with neovascular AMD. Furthermore, odds ratios for the development of sub-retinal hemorrhage in carriers of the rs3750846 variant were similar between incident and prevalent AREDS2 sub-populations (OR 1.94 vs 1.75), and the association was also replicated in data from the AREDS trial. No literature-defined pleiotropic associations tested remained significant after multiple-testing correction. The rs3750846 variant of the ARMS2-HTRA1 locus is associated with sub-retinal hemorrhage. Automatic literature mining, when paired with clinical data, is a promising method for exploring genotype-phenotype relationships.
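A conceptual sketch of the PheWAS step, assuming a simple chi-squared test per phenotype with Bonferroni correction; the genotype and phenotype data below are random placeholders, not AREDS2 data:

```python
# Test one SNP against many binary phenotypes; keep Bonferroni-significant hits.
import numpy as np
from scipy.stats import chi2_contingency

rng = np.random.default_rng(0)
n = 1826                                  # number of genotyped participants
carrier = rng.integers(0, 2, n)           # toy risk-allele carrier status
phenotypes = {f"pheno_{i}": rng.integers(0, 2, n) for i in range(123)}

alpha = 0.05 / len(phenotypes)            # Bonferroni-corrected threshold
for name, pheno in phenotypes.items():
    table = np.array([[np.sum((carrier == g) & (pheno == p)) for p in (0, 1)]
                      for g in (0, 1)])   # 2x2 genotype-by-phenotype counts
    _, p_value, _, _ = chi2_contingency(table)
    if p_value < alpha:
        print(name, p_value)              # random data rarely clears this bar
```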
338

A WEB PERSONALIZATION ARTIFACT FOR UTILITY-SENSITIVE REVIEW ANALYSIS

Flory, Long, Mrs. 01 January 2015 (has links)
Online customer reviews are web content voluntarily posted by the users of a product (e.g. a camera) or service (e.g. a hotel) to express their opinions about it, and they are important resources for businesses and consumers. This dissertation focuses on the important consumer concern of review utility, i.e., the helpfulness or usefulness of online reviews in informing consumer purchase decisions. Review utility concerns consumers because not all online reviews are useful, and the quantity of online reviews of a product or service tends to be very large; manual assessment of review utility is therefore not only time-consuming but also information-overloading. To address this issue, review helpfulness research (RHR) has become a very active research stream dedicated to studying utility-sensitive review analysis (USRA) techniques for automating review utility assessment. Unfortunately, prior RHR solutions are inadequate, and RHR researchers call for more suitable USRA approaches. Our research responds to this call by addressing the research problem: what is an adequate USRA approach? We address this problem by offering novel Design Science (DS) artifacts for personalized USRA (PUSRA). Our proposed solution extends not only RHR but also web personalization research (WPR), which studies web-based solutions for personalized web provision. We have evaluated the proposed solution by applying three evaluation methods: analytical, descriptive, and experimental. The evaluations corroborate the practical efficacy of our proposed solution. This research contributes what we believe to be (1) the first DS artifacts to the knowledge body of RHR and WPR, and (2) the first PUSRA contribution to USRA practice. Moreover, we consider our evaluations the first comprehensive assessment of USRA solutions. In addition, this research contributes to the advancement of decision support research and practice: the proposed solution is a web-based decision support artifact capable of substantially improving accurate personalized webpage provision, and website designers can apply it to transform their work in ways that add substantial value to businesses.
339

Vulnerability Reports Analysis and Management

Domány, Dušan January 2011 (has links)
Vulnerabilities in software products can represent a significant security threat if they are discovered by malicious attackers, so it is important to identify them and report their presence to the responsible persons before they are exploited. The number of security reports about vulnerabilities discovered in various software products has grown rapidly over the last decade, and it is becoming more and more difficult to process all incoming reports manually. This work discusses methods that can automate several important steps in collecting and sorting the reports. The reports are analyzed in various ways, including text mining techniques, and the results of the analysis are applied in the form of a practical implementation.
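One of the sorting steps discussed above, sketched under the assumption that TF-IDF cosine similarity is used to flag an incoming report as a likely duplicate of one already filed; the reports and threshold are illustrative:

```python
# Flag probable duplicate vulnerability reports by text similarity.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

known_reports = [
    "buffer overflow in image parser allows remote code execution",
    "sql injection in login form exposes user table",
]
incoming = "remote code execution via overflow in the image parsing library"

vectorizer = TfidfVectorizer().fit(known_reports + [incoming])
sims = cosine_similarity(vectorizer.transform([incoming]),
                         vectorizer.transform(known_reports))[0]

best = int(sims.argmax())
if sims[best] > 0.3:                      # assumed duplicate threshold
    print(f"possible duplicate of report {best}, similarity {sims[best]:.2f}")
```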
340

Abordagem simbólica de aprendizado de máquina na recuperação automática de artigos científicos a partir de web / A symbolic machine learning approach to automatic retrieval of scientific articles from the web

Brasil, Christiane Regina Soares 07 April 2006 (has links)
Due to the ever-increasing number of scientific documents available on the World Wide Web, search tools have become an important aid for retrieving information from the Internet for researchers and users in all fields of knowledge. However, the search tools currently available generally return a huge list of pages, leaving the user with the final task of choosing those that actually fit their research. It is therefore important to develop techniques and tools that not only return a list of documents related to the user's query but also organize this information according to the documents' content, presenting the result in a graphical representation that improves exploration and understanding of the retrieved articles. In this context, the project of an Intelligent Tool for Research Supporting (FIP) was proposed; this MSc work is part of that project. The objective of this work is to analyze strategies for automatic retrieval of scientific articles in a given research field from the Web, suitable for adoption by the retrieval module of the FIP. Articles written in English, in PDF format, covering the fields of Computer Science were considered. Training and test corpora were used to evaluate symbolic Machine Learning approaches for inducing rules that could be embedded in an intelligent crawler for automatic retrieval of articles in these fields. Several experiments were carried out to define preprocessing parameters appropriate to the domain, such as attribute weights, the cut-off point, and domain stopwords, as well as the best strategy for applying the induced rules and the best symbolic induction algorithm.
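A minimal sketch of the symbolic idea, with a decision tree standing in for the rule learners evaluated in the thesis; the training documents and labels are invented:

```python
# Induce human-readable classification rules a crawler could embed.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.tree import DecisionTreeClassifier, export_text

docs = [
    "abstract introduction method experiments references",   # article
    "proceedings of the conference on machine learning",     # article
    "buy now special offer subscribe to our newsletter",     # non-article
    "home about contact products latest news",               # non-article
]
labels = [1, 1, 0, 0]                     # 1 = scientific article, 0 = other

vectorizer = CountVectorizer(binary=True)
X = vectorizer.fit_transform(docs)
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, labels)

# The induced tree reads as IF-THEN rules over word occurrences
print(export_text(tree, feature_names=list(vectorizer.get_feature_names_out())))
```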
