51

Visualização de documentos partilhados em colecções dinâmicas / Visualization of shared documents in dynamic collections

Almeida, João Miguel Neves Pereira de January 2009 (has links)
Integrated master's thesis. Electrical and Computer Engineering (Telecommunications major). Faculdade de Engenharia, Universidade do Porto, 2009.
52

Improving disease surveillance : sentinel surveillance network design and novel uses of Wikipedia

Fairchild, Geoffrey Colin 01 December 2014 (has links)
Traditional disease surveillance systems are instrumental in guiding policy-makers' decisions and understanding disease dynamics. The first study in this dissertation looks at sentinel surveillance network design. We consider three location-allocation models: two based on the maximal coverage model (MCM) and one based on the K-median model. The MCM selects sites that maximize the total number of people within a specified distance of a site; the K-median model minimizes the sum of the distances from each individual to that individual's nearest site. Using a ground truth dataset of two million de-identified Medicaid billing records covering eight complete influenza seasons, and an evaluation function based on the Huff spatial interaction model, we empirically compare candidate networks against the existing volunteer-based Iowa Department of Public Health influenza-like illness network by simulating the spread of influenza across the state of Iowa. We compare networks on two metrics: outbreak intensity (i.e., disease burden) and outbreak timing (i.e., the start, peak, and end of the epidemic). We show that it is possible to design a network that matches the status quo network's outbreak intensity performance using two fewer sites, and that if outbreak timing detection is of primary interest, a network can match the existing network's performance using 42% fewer sites. Finally, to demonstrate the general usefulness of these location-allocation models, we examine primary stroke center selection, describe the ineffectiveness of the current self-initiated approach, and argue for a more organized primary stroke center system.

While these traditional disease surveillance systems are important, they have several downsides. First, because of a complex reporting hierarchy, there is generally a reporting lag; most diseases in the United States are reported with a lag of approximately 1-2 weeks. Second, many regions of the world lack trustworthy or reliable data. As a result, there has been a surge of research into using publicly available internet data for disease surveillance. The second and third studies in this dissertation analyze Wikipedia's viability in this sphere. The second study uses Wikipedia access logs. Hourly access logs dating back to December 2007 are freely available for anyone to download and contain, among other things, the total number of accesses for every article on Wikipedia. Using a linear model and a simple article selection procedure, we show that it is possible to nowcast, and in some cases forecast up to the 28 days tested, in 8 of the 14 disease-location contexts considered. We also demonstrate that it may be possible to train a model in one context and use the same model to nowcast or forecast in another context that has poor surveillance data.

The third study examines disease-relevant data in Wikipedia article content. A number of disease outbreaks are meticulously tracked on Wikipedia, with case counts, death counts, and hospitalization counts often provided in the article narrative. Using a dataset created from 14 Wikipedia articles, we trained a named-entity recognizer (NER) to recognize and tag these phrases; the NER achieved an F1 score of 0.753. In addition to the counts in the narrative, we tested the accuracy of tabular data using the article on the 2014 West African Ebola virus disease epidemic, which, like a number of other disease articles on Wikipedia, contains granular case and death counts per affected country. By computing the root-mean-square error between the Wikipedia time series and a ground truth time series, we show that the Wikipedia time series are both timely and accurate.
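A small sketch may help make the site-selection optimization concrete. Below is a minimal greedy approximation of a maximal-coverage-style selection, assuming hypothetical candidate sites and population blocks as inputs; the dissertation's actual MCM formulation and Huff-model evaluation are more involved.

```python
# Minimal sketch of greedy maximal-coverage site selection
# (assumed inputs; not the dissertation's actual formulation).
from math import radians, sin, cos, asin, sqrt

def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(h))

def greedy_mcm(sites, blocks, k, radius_km):
    """Greedily pick k sites maximizing the population within radius_km.

    sites:  {site_id: (lat, lon)} candidate locations
    blocks: list of (population, (lat, lon)) census-block-like units
    """
    chosen, covered = [], set()
    for _ in range(k):
        def gain(s):
            # Population of not-yet-covered blocks this site would cover.
            return sum(pop for i, (pop, loc) in enumerate(blocks)
                       if i not in covered and haversine_km(sites[s], loc) <= radius_km)
        best = max((s for s in sites if s not in chosen), key=gain)
        chosen.append(best)
        covered |= {i for i, (pop, loc) in enumerate(blocks)
                    if haversine_km(sites[best], loc) <= radius_km}
    return chosen
```

Greedy selection is a common heuristic for this kind of problem because coverage is submodular, which gives the greedy solution a (1 - 1/e) approximation guarantee.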
53

Automatic Document Topic Identification Using Hierarchical Ontology Extracted from Human Background Knowledge

Hassan, Mostafa January 2013 (has links)
The rapid growth in the number of documents available to end users around the world has greatly increased the need for machine understanding of their topics, as well as for automatic grouping of related documents. This constitutes one of the main current challenges in text mining. We introduce in this thesis a novel approach for identifying document topics, in which we utilize human background knowledge to automatically find the best-matching topic for input documents. This task has several applications: it can improve the relevance of search engine results by categorizing them according to their general topic, give users the ability to choose the domain most relevant to their needs, or serve an application such as a news publisher, where each article should be automatically assigned to one of a set of predefined main topics. To achieve this, we need to extract background knowledge in a form appropriate to the task. The thesis contributions can be summarized into two main modules. In the first module, we introduce a new approach to extract background knowledge from a human knowledge source, in the form of a knowledge repository, and store it in a well-structured and organized form, namely an ontology. We define a methodology for identifying ontological concepts and the relations between them. We use the ontology to infer the semantic similarity between documents, as well as to identify their topics. We apply our proposed approach using perhaps the best known of the knowledge repositories, namely Wikipedia. The second module of this dissertation defines the framework for automatic document topic identification (ADTI). We present a new approach that utilizes the knowledge stored in the created ontology to automatically find the best-matching topics for input documents, without the need for a training process such as in document classification. We compare ADTI to other text mining tasks by conducting several experiments comparing the performance of ADTI and its competitors, namely document clustering and document classification. Results show that our document topic identification approach outperforms several document clustering techniques. They also show that, while ADTI does not require training, it nevertheless achieves performance competitive with one of the state-of-the-art methods for document classification.
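The training-free matching that ADTI performs can be illustrated with a minimal sketch: score a document against each topic's concept vocabulary and return the closest topic. The `topic_concepts` mapping and the cosine measure below are illustrative assumptions, not the thesis's actual Wikipedia-derived ontology or similarity function.

```python
# Sketch: pick the best-matching topic for a document by cosine similarity
# against per-topic concept vocabularies (hypothetical data).
from collections import Counter
from math import sqrt

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def identify_topic(document, topic_concepts):
    """Return the topic whose concept vector is closest to the document."""
    doc_vec = Counter(document.lower().split())
    return max(topic_concepts,
               key=lambda t: cosine(doc_vec, Counter(topic_concepts[t])))

# Hypothetical usage: no training step, only a concept vocabulary per topic.
topics = {"sports": ["match", "league", "team", "goal"],
          "politics": ["election", "parliament", "vote", "policy"]}
print(identify_topic("The team won the league after a late goal", topics))  # sports
```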
54

Map-like Wikipedia visualization

Pang, Cheong Iao January 2011 (has links)
University of Macau / Faculty of Science and Technology / Department of Computer and Information Science
55

Missing Link Discovery In Wikipedia: A Comparative Study

Sunercan, Omer 01 February 2010 (has links) (PDF)
The fast-growing online encyclopedia concept presents original and innovative features by taking advantage of information technologies, and the links connecting articles are one of the most important of these features. In this thesis, we present our work on discovering missing links in Wikipedia articles. This task is important for both readers and authors of Wikipedia: readers benefit from increased article quality and better navigation support, while the system can also be employed to support authors during editing. This study combines the strengths of different approaches previously applied to the task and proposes its own techniques to reach satisfactory results. Because of the subjectivity inherent in the task, automatic evaluation is hard to apply; comparing approaches seems to be the best way to evaluate new techniques, so we offer a semi-automated method for evaluating the results. Recall is calculated automatically using existing links in Wikipedia, while precision is calculated from manual evaluations by human assessors. Comparative results for the different techniques are presented, showing the success of our improvements. Our system employs the Turkish Wikipedia (Vikipedi) and, to our knowledge, is the first study on it. We aim to exploit the Turkish Wikipedia as a semantic resource and to examine whether it is scalable enough for such purposes.
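The semi-automated evaluation described above lends itself to a short sketch: recall is computed automatically against links already present in an article, while precision comes from manual relevance judgments. The function and variable names are illustrative, not from the thesis.

```python
# Sketch of the semi-automated evaluation: automatic recall, manual precision
# (illustrative names; assumed inputs).
def recall_against_existing(suggested, existing_links):
    """Fraction of an article's existing links that the system rediscovers."""
    return len(set(suggested) & existing_links) / len(existing_links) if existing_links else 0.0

def precision_from_judgments(suggested, judged_relevant):
    """Fraction of suggested links that human assessors marked as relevant."""
    return sum(1 for s in suggested if s in judged_relevant) / len(suggested) if suggested else 0.0
```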
56

Was haben Viehweiden mit Software zu tun? Informationstechnologien und die Allmende / What do cattle pastures have to do with software? Information technologies and the commons

Pentzold, Christian 28 May 2010 (has links) (PDF)
This talk was given at the UNIX-Stammtisch on 25 May 2010.
57

Hospitable texts

Brown, James Joseph, 1978- 03 September 2009 (has links)
This dissertation examines Wikipedia, the free encyclopedia that "anyone can edit," in order to locate an emerging digital rhetoric. That emerging rhetoric is being developed from the bottom up by various rhetors, and it offers rhetoricians a framework for rethinking some of the foundations of the discipline. The discipline has tended to define agency in terms of the conscious rhetor, intellectual property in terms of an author-origin, and community in terms of a shared project that a collective has agreed upon. This dissertation rethinks each of these disciplinary key terms by examining Wikipedia's hospitable structure, a structure that welcomes writers regardless of identity or credentials. This structure of hospitality troubles the notions that agency can be reduced to consciousness, that texts are easily linked to an owner, or that community is the result of an agreed-upon project. In many ways, Wikipedia acts as a microcosm of the various rhetorical collisions that happen to rhetors both online and offline. The proliferation of new media makes for more rhetors and more rhetorical situations, and this requires a complete rethinking of certain portions of rhetorical theory. The theory of hospitality that grounds this project is not utopian; it is instead a full consideration of the complications and perils of welcoming others regardless of identity or credentials. This is a structural hospitality, one that is not necessarily the result of conscious choice. This structure means that Wikipedia is far from a utopia: certain voices are filtered or silenced. But these filters are put up in the face of a hospitable structure that welcomes a broad range of writers, invites colliding interests, allows libelous or inaccurate writings, and encourages an endless chain of citations. The invitations extended by hospitable texts open up difficult questions for rhetoricians: Who is editing this text as I read it? How do we define "community" in such a situation? Who owns this text? "Hospitable Texts" rethinks these questions in light of the Web's emerging ethical and rhetorical structures.
58

Wikipedia and Encyclopaedism: A Genre Analysis of Epistemological Values

Jankowski, Steven J. 14 May 2013 (has links)
This thesis considers how Wikipedia justifies, structures, and legitimizes its production of knowledge. To do so, the thesis positions Wikipedia as a site of conflict over the epistemic values of its wiki and encyclopaedic traditions. Through the literature review, the wiki epistemology is argued to be composed of six values: self-identification, collaboration, co-construction, cooperation, trust in the community, and constructionism. While these values are explicit, encyclopaedism's values were found to be far less clearly defined. To fill this gap, the thesis conducts a genre analysis of encyclopaedism. It first identifies the genre through its communicative purposes: to create a universal system of total knowledge and to use this system to educate the public. Second, an analysis of recurrent social contexts within Chambers' Cyclopaedia (1728), Diderot & d'Alembert's Encyclopédie (1751–72), the Encyclopaedia Britannica (1771–), and Wikipedia (2001–) finds that these communicative purposes are achieved through the use of five epistemic values: utility, systematic organization, authority, trust in experts, and consistency. Third, a comparison of Wikipedia's and the Britannica's article headings, spanning 240 years, finds that the value of systematic organization structures Wikipedia's articles using seventeenth-century categories of knowledge. Having established two sets of values that determine Wikipedia's production of knowledge, the thesis sets the stage for future research to analyze how Wikipedia's epistemology is articulated in its different production spaces. Ultimately, such research may not only describe the shifting values of Wikipedia's epistemology but also explain how knowledge is transformed and produced in the network society.
60

Analyse et critique des méthodes d'une encyclopédie en ligne : Wikipédia / Analysis and criticism of the methods of an online encyclopedia: Wikipedia

Boussetat Mbaye, Sana 20 June 2016 (has links)
Based on participant observation and on data gathered by other observation techniques (statistics, direct observation), we explored in depth the workings of an encyclopedia: the free and open online encyclopedia Wikipedia. The participant observation consisted of actively contributing to the open source project and getting involved in its operation, even to the point of trying to act on certain organizational aspects (decision-making, publication of articles), all with the aim of analyzing the methods employed. To be clear from the outset, however, this is not merely an open source project in the strict sense of the term, but a very particular concept whose scope, audience, practices, and legal nature place it at the crossroads of various worlds: the scientific community, free culture, the student population, and the public of forums and Web 2.0.

On the theoretical level, this observation is of interest in several respects. First, it shows that Wikipedia is not limited to an immaterial computing activity: Wikipedia now permeates cultural, scientific, and even legal activities. The study also offers lessons about the dynamics, the functioning, and the credibility of such an undertaking; immersion in the project allowed us to better understand the logics and dynamics of Wikipedia. The study can further serve as a reference or point of comparison in the observation and analysis of similar projects, or of paper encyclopedias produced by different methods (Encyclopædia Universalis, the Larousse online encyclopedia, Encarta...). Finally, in step with the digital age, it points to new avenues of regulation and ways of exploiting these new cognitive tools.
