Global ETD Search

241	Techniques d'identification d'entités nommées et de classification non-supervisée pour des requêtes de recherche web à l'aide d'informations contenues dans les pages web visitées / Techniques for named entity identification and unsupervised classification of web search queries using information contained in visited web pages Goulet, Sylvain January 2014 (has links) The web has become an important source of information and entertainment for a large number of people, and the techniques for accessing the desired content keep evolving. For example, in addition to the usual list of web pages, some search engines now directly present, when possible, the information the user is looking for. In this context, studying the queries submitted to this type of search engine becomes a tool that can help refine such systems and thus improve the experience of their users. With this in mind, this document presents techniques developed to study web search queries submitted to a search engine. In particular, the work addresses two distinct problems. The first concerns the unsupervised classification of a set of web search queries, with the aim of grouping together queries that deal with the same subject. The second concerns the unsupervised detection of named entities contained in a set of queries submitted to a search engine. Both proposed techniques use the additional information provided by knowledge of the web pages visited by the users who issued the queries under study. Unsupervised classification Web search query Named entity detection Topic modeling Web mining
242	Multi Domain Semantic Information Retrieval Based on Topic Model Lee, Sanghoon 07 May 2016 (has links) Over the last decades, there have been remarkable shifts in the area of Information Retrieval (IR) as a huge amount of information is increasingly accumulated on the Web. This information explosion increases the need for new tools that retrieve meaningful knowledge from various complex information sources. Thus, techniques used to search and extract important information from numerous database sources have been a key challenge in current IR systems. Topic modeling is one of the most recent techniques that discover hidden thematic structures in large data collections without human supervision. Several topic models have been proposed in various fields of study and have been utilized extensively for many applications. Latent Dirichlet Allocation (LDA) is the most well-known topic model; it generates topics from a large corpus of resources, such as text, images, and audio. It has been widely used in many areas of information retrieval and data mining, providing an efficient way of identifying latent topics among document collections. However, LDA has a drawback: topic cohesion within a concept is attenuated when estimating infrequently occurring words. Moreover, LDA does not seem to consider the meaning of words, but rather infers hidden topics with a purely statistical approach. This can cause either a reduction in the quality of topic words or an increase in loose relations between topics. To solve these problems, we propose a domain-specific topic model that combines domain concepts with LDA. Two domain-specific algorithms are suggested for solving the difficulties associated with LDA. The main strength of our proposed model is that it narrows semantic concepts from broad domain knowledge to a specific one, which solves the unknown domain problem. 
Our proposed model is extensively tested on various applications (query expansion, classification, and summarization) to demonstrate its effectiveness. Experimental results show that the proposed model significantly increases the performance of these applications. Information retrieval Semantics Topic model Query expansion Text classification Text summarization
243	High performance latent dirichlet allocation for text mining Liu, Zelong January 2013 (has links) Latent Dirichlet Allocation (LDA), a fully probabilistic generative model, is a three-tier Bayesian model. LDA computes the latent topic structure of the data and obtains the significant information of documents. However, traditional LDA has several limitations in practical applications. LDA cannot be used directly for classification because it is an unsupervised learning model; it needs to be embedded into an appropriate classification algorithm. Because LDA is a generative model, it normally generates latent topics in categories to which the target documents do not belong, which introduces deviation in computation and reduces classification accuracy. The number of topics in LDA greatly influences the learning of the model parameters. Noise samples in the training data also affect the final text classification result, and the quality of LDA-based classifiers depends to a great extent on the quality of the training samples. Although parallel LDA algorithms have been proposed to deal with huge amounts of data, balancing computing loads in a computer cluster poses another challenge. This thesis presents a text classification method that combines the LDA model with the Support Vector Machine (SVM) classification algorithm to improve classification accuracy while reducing the dimensionality of datasets. Based on Density-Based Spatial Clustering of Applications with Noise (DBSCAN), the algorithm automatically optimizes the number of topics to be selected, which reduces the number of iterations in computation. Furthermore, this thesis presents a noise data reduction scheme to process noise data. When the noise ratio is large in the training data set, the noise reduction scheme can always produce a high level of accuracy in classification. 
Finally, the thesis parallelizes LDA using the MapReduce model which is the de facto computing standard in supporting data intensive applications. A genetic algorithm based load balancing algorithm is designed to balance the workloads among computers in a heterogeneous MapReduce cluster where the computers have a variety of computing resources in terms of CPU speed, memory space and hard disk space. 006.3
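The "three-tier" structure of LDA mentioned in entry 243 (corpus level, document level, word level) can be illustrated with a small sketch of the standard LDA generative process. This is not the thesis's implementation; the function names, the symmetric hyperparameters `alpha` and `beta`, and the toy vocabulary are assumptions made purely for illustration.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_categorical(probs, rng):
    """Draw one index according to the probability vector `probs`."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_corpus(num_docs, doc_len, num_topics, vocab,
                    alpha=0.5, beta=0.5, seed=0):
    """Generate a toy corpus following the standard LDA generative story.

    alpha and beta are assumed symmetric Dirichlet hyperparameters.
    """
    rng = random.Random(seed)
    # Corpus level: each topic is a distribution over the vocabulary.
    topic_word = [sample_dirichlet([beta] * len(vocab), rng)
                  for _ in range(num_topics)]
    corpus = []
    for _ in range(num_docs):
        # Document level: each document has its own mixture over topics.
        doc_topic = sample_dirichlet([alpha] * num_topics, rng)
        words = []
        for _ in range(doc_len):
            # Word level: pick a topic, then pick a word from that topic.
            z = sample_categorical(doc_topic, rng)
            w = sample_categorical(topic_word[z], rng)
            words.append(vocab[w])
        corpus.append(words)
    return corpus
```

Calling `generate_corpus(3, 5, 2, ["bank", "rate", "goal", "team"])` returns three documents of five words each. Inference in LDA runs this story in reverse, estimating the topic mixtures and word distributions from observed documents.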
244	International Students' Cross-cultural Communication Accommodation through Language Approximation and Topic Selection Strategies on Facebook and Its Relationship to the Students' Acculturation Attitude, Psychological Adjustment, and Socio-cultural Adaptation Kim, Sara January 2015 (has links) Language use and communicative behaviors are important indicators of sojourners' adjustment. The current research was conducted to understand international students' communication behavior on Facebook during their adjustment period in the US and its relationship to the students' acculturative attitude (identification with heritage and mainstream culture), current psychological adjustment level, socio-cultural adaptation level, and target audience on Facebook. Two main theories provided the theoretical framework of the study: Giles' communication accommodation theory (1973) and Berry's acculturation model (1984). Snowball and convenience samples were used to recruit 178 international students from different universities across the US. A mixed approach of online survey and content analysis was used to test the hypotheses and research questions. The results showed that during the stay in the US, international students accommodate their language and topic choice towards their American peers on Facebook. Particularly, it was found that language accommodation levels increase as the students' length of stay in the US increases. The results also demonstrate that international students use Facebook mainly to communicate with friends who reside in the US. When students had higher levels of mainstream identification, they were likely to target American friends as their audience on Facebook and thus have more language and topic accommodation. Additionally, acculturation attitude (heritage and mainstream identification) predicted the students' language accommodation level. 
Lastly, the study showed that there is a positive relationship between language accommodation and sociocultural adjustment. The findings of the study not only expand the scope of communication accommodation theory and acculturation model, but also enhance understanding of international students' online communication patterns, their purposes, and practical consequences upon their adjustment in the US. This is important because it can be useful in finding ways to improve the students' experience in the US. cross-cultural adjustment Facebook International students language accommodation topic accommodation Communication acculturation model
245	Topic and focus in Cantonese: an OT-LFG account Fung, Suet-man., 馮雪雯. January 2007 (has links) Humanities / Master / Master of Philosophy Cantonese dialects - Topic and comment Cantonese dialects - Discourse analysis. Lexical-functional grammar.
246	Nonparametric Discovery of Human Behavior Patterns from Multimodal Data Sun, Feng-Tso 01 May 2014 (has links) Recent advances in sensor technologies and the growing interest in context-aware applications, such as targeted advertising and location-based services, have led to a demand for understanding human behavior patterns from sensor data. People engage in routine behaviors. Automatic routine discovery goes beyond low-level activity recognition such as sitting or standing and analyzes human behaviors at a higher level (e.g., commuting to work). The goal of the research presented in this thesis is to automatically discover high-level semantic human routines from low-level sensor streams. One recent line of research is to mine human routines from sensor data using parametric topic models. The main shortcoming of parametric models is that they assume a fixed, pre-specified parameter regardless of the data. Choosing an appropriate parameter usually requires an inefficient trial-and-error model selection process. Furthermore, it is even more difficult to find optimal parameter values in advance for personalized applications. The research presented in this thesis offers a novel nonparametric framework for human routine discovery that can infer high-level routines without knowing the number of latent low-level activities beforehand. More specifically, the framework automatically finds the size of the low-level feature vocabulary from sensor feature vectors at the vocabulary extraction phase. At the routine discovery phase, the framework further automatically selects the appropriate number of latent low-level activities and discovers latent routines. Moreover, we propose a new generative graphical model to incorporate multimodal sensor streams for the human activity discovery task. The hypothesis and approaches presented in this thesis are evaluated on public datasets in two routine domains: two daily-activity datasets and a transportation mode dataset. 
Experimental results show that our nonparametric framework can automatically learn the appropriate model parameters from multimodal sensor data without any form of manual model selection procedure and can outperform traditional parametric approaches for human routine discovery tasks. Activity recognition machine learning topic modeling nonparametric Bayesian probabilistic graphical models context-aware systems
247	Effekter av patientutbildning på livskvalité och egenvård hos patienter med hjärtsvikt - en litteraturstudie / Effects of patient education on quality of life and self-care in patients with heart failure - a literature study Herrero, Anna, Engberg, Emelie January 2015 (has links) Background: Heart failure is a serious and common disease and one of the most common causes of hospitalization. Heart failure contributes to deteriorating health and quality of life. Self-care measures play a significant role in promoting the patient's health and in preventing worsening of the condition. Shortcomings in patients' self-care depend largely on a lack of knowledge about heart failure and self-care measures. Aim: To describe the effects of patient education on self-care and quality of life in patients with heart failure. The aim was also to describe the data collection methods of the included articles. Method: A descriptive literature study in which 12 scientific articles from the database PubMed were included in order to answer the aim and research questions. The articles' results and methods were analyzed and compiled under 6 categories. Result: Patient education has been shown to have positive effects on both self-care and quality of life. Improvements could be seen in medication management, medication adherence and other self-care measures, such as following salt and fluid restrictions. The majority of patients were also more positive about lifestyle changes. Regarding quality of life, improvements could be seen in physical, mental and social well-being. In one study, this could be linked to the patients' experience of greater control over their situation. The most prevalent data collection methods in the included articles were surveys and questionnaires. Conclusion: Patient education has been shown to have positive effects on quality of life and self-care. 
Different types of interventions can affect quality of life and self-care in different ways. Knowledge of heart failure increases with patient education, which can affect attitude and motivation in patients with heart failure. This means better conditions for medication and self-care measures, leading to better health and an improved quality of life. Heart failure Patient education as topic Quality of life Self-care Impact
248	台灣學生英語中介語中主題顯著現象的探討 / Topic Prominence: Taiwanese EFL Learner's Interlanguage 賴曉琳, Lai, Xiao lin Unknown Date (has links) With regard to language typology, Mandarin Chinese has been considered a topic-prominent language and English a subject-prominent language (Li & Thompson, 1976; Rutherford, 1983; et al.). The present study explored Taiwanese EFL learners’ interlanguage from the perspective of typological transfer; it investigated the influence of first language (L1) topic-prominence typology on the transfer effect and the acquisition of L2 subject-prominence. Seventy-eight vocational high school students in Taipei County participated in the experiment and were divided into three proficiency groups. Three tasks were used to measure learners’ L2 interlanguage: a grammaticality judgment task, a translation task, and a free writing task. The tasks were designed around structures where L1 and L2 contrast typologically, including four topic-prominence properties (null subject and null object, topicalized verb phrase and clause, serial verb construction, and double nominatives) and two subject-prominence properties (subject-verb agreement and dummy subject). Both quantitative and qualitative results showed that topic-prominence was transferred into learners’ interlanguage; learners were also found to have difficulty acquiring subject-prominence properties. 
In addition, as learners’ proficiency increases, there is a gradual decrease in topic-prominence and a corresponding development of subject-prominence. Pearson correlation coefficients indicated that the development of the two linguistic typologies is highly correlated in learners’ interlanguage. Finally, a methodological effect was found: of the two controlled tasks, the comprehension task was harder than the production task. The free writing task did not lead to a significant group difference as the other tasks did. Also, different task formats changed the trend of topic-prominence transfer. topic prominence interlanguage
249	BNS informacinių žinučių analizė teminiu aspektu / Topic analysis in news items of BNS news agency Grigaitytė, Justina 17 June 2010 (has links) The thesis deals with topic detection in BNS news reports. At news agencies, reports are divided into groups and subgroups according to topic. This topic analysis is done manually: a text is read and assigned to a topic. 
However, media and news portals are developing very quickly, so distributing reports automatically rather than manually is a relevant problem. An automated topic detection process would be useful for news portals, since automated distribution would save time and energy. The task of the paper is therefore topic detection, which is associated with the classification of text into certain categories; in other words, various text data are grouped by subject. The object of the thesis is the reports of the BNS news agency press centre from 2007. The aim of the paper is to analyze how individual words help identify the topic. Three methods are applied to detect the topic: high-frequency words, bigrams (two-word combinations) and keywords. The paper consists of three parts. The first part is theoretical; it presents the foundations of topic detection, text classification and the language of news reports. The news report was chosen because this informational genre is concise and states facts clearly. Moreover, it is hypothesized that the accuracy of topic detection depends on the length and relevance of the report. The second part describes how the lists of frequent words and bigrams were compiled and how keywords were extracted. Those frequent word and bigram lists were... [to full text] Philology Topic Bigrams Frequent words Keyword
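Two of the methods named in entry 249, frequent words and bigrams, can be sketched in a few lines. This is only an illustrative sketch and not the procedure used in the thesis: the tokenizer, the `top_n` cutoff, the overlap-count scoring rule and the English toy reports are all assumptions.

```python
from collections import Counter

PUNCTUATION = ".,;:!?()\"'"


def tokenize(text):
    """Lowercase the text and strip surrounding punctuation from each token."""
    tokens = (t.strip(PUNCTUATION).lower() for t in text.split())
    return [t for t in tokens if t]


def bigrams(tokens):
    """Return adjacent word pairs, e.g. ('interest', 'rates')."""
    return list(zip(tokens, tokens[1:]))


def build_profiles(labeled_reports, top_n=20):
    """Collect the top_n most frequent words and bigrams per topic."""
    counts = {}
    for topic, text in labeled_reports:
        tokens = tokenize(text)
        counter = counts.setdefault(topic, Counter())
        counter.update(tokens)
        counter.update(bigrams(tokens))
    return {topic: {feature for feature, _ in counter.most_common(top_n)}
            for topic, counter in counts.items()}


def detect_topic(text, profiles):
    """Pick the topic whose profile shares the most features with the text."""
    tokens = tokenize(text)
    features = set(tokens) | set(bigrams(tokens))
    scores = {topic: len(features & profile)
              for topic, profile in profiles.items()}
    return max(scores, key=scores.get)
```

With profiles built from a handful of hand-labeled reports, `detect_topic` assigns a new report to the topic with the largest overlap. A practical system would also need stop-word handling and, for Lithuanian, lemmatization, neither of which this sketch attempts.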
250	Intergovernmental relations between Britain, Ireland and Northern Ireland 1966-1974 Craig, Anthony January 2009 (has links) This thesis investigates how relations between the governments of Britain, Ireland and Northern Ireland changed in the early years of the Northern Ireland Troubles, up to the collapse of the Sunningdale executive in May 1974. Specifically, this research examines the three relationships, studying many of the important aspects of intergovernmental relations within the three jurisdictions at the time and using a wide range of examples to demonstrate how the primary driver of relations between all three jurisdictions moved from economic concerns to political, security and intelligence concerns by 1972, and how these relationships grew and developed before their eventual collapse in the months following the Ulster Workers’ Council Strike. This study is based primarily on archive research in London, Dublin and Belfast at the official national archives of the three states. However, it has also made use of interviews with officials. It includes new insight into negotiations for membership of the EEC, Territorial Seas Delimitation, the Arms Crisis, British relations with Terence O’Neill (and the Northern Ireland government’s opinion of the British), the preparations for internment and Direct Rule, the origins of the Northern Ireland Office and the Irish government’s relations with Northern Ireland’s nationalists. This thesis, using recently released sources, challenges a number of conclusions from previously published research, particularly into North-South relations after 1966 and Britain’s preparations for sending British troops in support of the Northern Ireland government. Significantly, this PhD also demonstrates a long series of British attempts at the end of 1972 and throughout 1973 to tease the Irish government into increasing their border security operations. 
In doing so it explains the Sunningdale Agreement in the context of a relationship between the Cosgrave and Heath governments that went far beyond what was known at the time and was dependent to a far greater extent on security cooperation than has previously been accepted. 941.5
