41

Indexation et recherche conceptuelles de documents pédagogiques guidées par la structure de Wikipédia / Learning document indexing and retrieval based on Wikipedia's structure

Abi Chahine, Carlo 14 October 2011 (has links)
Cette thèse propose un système d'aide à l'indexation et à la recherche de documents pédagogiques fondé sur l'utilisation de Wikipédia. L'outil d'aide à l'indexation permet de seconder les documentalistes dans la validation, le filtrage et la sélection des thématiques, des concepts et des mots-clés issus de l'extraction automatique d'un document. En effectuant une analyse des données textuelles d'un document, nous proposons au documentaliste une liste de descripteurs permettant de représenter et de discriminer le document. Le travail du documentaliste se limite alors à une lecture rapide du document et à la sélection et suppression des descripteurs suggérés par le système pour rendre l'indexation homogène, discriminante et exhaustive. Pour cela, nous utilisons Wikipédia comme base de connaissances. Le modèle utilisé pour l'extraction des descripteurs permet également de faire de la recherche d'information sur un corpus de documents déjà indexés. / This thesis proposes an indexing support and information retrieval system for learning resources based on Wikipedia. The indexing support system assists archivists in selecting the descriptors for a given document. The system analyses the textual content of the document and suggests to the archivists a set of relevant descriptors for it. After a speed-reading of the document and a superficial analysis, the archivists can validate, filter and select the descriptors they consider relevant. To perform this task, we use Wikipedia as a knowledge base. The suggestion model also enables us to carry out information retrieval tasks on the previously analyzed documents of a corpus.
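The abstract's pipeline (extract terms from the document, match them against a knowledge base of titles, hand the ranked candidates to the documentalist for validation) could be sketched roughly as follows. This is an illustrative assumption, not the thesis's actual model: the `suggest_descriptors` helper, the frequency-based scoring, and the requirement that every title word appear in the document are all simplifications.

```python
from collections import Counter
import re

def suggest_descriptors(text, wiki_titles, top_n=5):
    """Rank candidate descriptors: knowledge-base titles whose words all
    occur in the document, ordered by their summed frequency in the text.
    A human indexer would then validate, filter, or discard these."""
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter(tokens)
    scored = []
    for title in wiki_titles:
        words = title.lower().split()
        if all(w in counts for w in words):
            scored.append((title, sum(counts[w] for w in words)))
    # Highest score first; break ties alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return [title for title, _ in scored[:top_n]]
```

In a real system the title list would come from a dump of Wikipedia rather than a hand-picked set, and the scoring would be far more elaborate.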
42

Automatic Editing Rights Management in Wikipedia

Wöhner, Thomas 12 March 2012 (has links)
The free online encyclopedia Wikipedia is one of the most successful collaborative web projects. It is based on an open editing model, which allows everyone to edit the articles directly in the web browser. As a result of the open concept, undesirable contributions like vandalism cannot be ruled out. These contributions reduce article quality temporarily, consume system resources and require effort to correct. To address these problems, this paper introduces an approach for automatic editing rights management in Wikipedia that assigns editing rights according to the reputation of the author and the quality of the article to be edited. The analysis shows that this approach reduces undesirable contributions significantly while valuable contributions are nearly unaffected.
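A minimal sketch of such a rights-management rule, to make the idea concrete: an edit is applied directly only when the author's reputation clears a threshold that rises with the quality of the target article; otherwise it is routed to review. The score scales, the threshold formula, and the function name here are all illustrative assumptions, not the paper's actual model.

```python
def editing_right(author_reputation, article_quality, base_threshold=0.5):
    """Decide whether an edit goes through directly or is held for review.

    author_reputation, article_quality: scores in [0, 1] (assumed scale).
    The required reputation grows with the quality of the article, so
    high-quality articles are better protected against bad edits."""
    threshold = base_threshold * article_quality
    return "direct" if author_reputation >= threshold else "review"
```

Under this rule a low-reputation author can still edit a low-quality stub directly, while edits to a featured-quality article demand an established reputation.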
43

Descriptive Labeling of Document Clusters / Deskriptiv märkning av dokumentkluster

Österberg, Adam January 2022 (has links)
Labeling is the process of giving a set of data a descriptive name. This thesis dealt with documents with no additional information and aimed at clustering them using topic modeling and labeling them using Wikipedia as a secondary source. Labeling documents is a new field with many potential solutions. This thesis examined one method in a practical setting. Unstructured data was preprocessed and clustered using a topic model. Frequent words from each cluster were used to generate a search query sent to Wikipedia, where titles and categories from the most relevant pages were stored as candidate labels. Each candidate label was evaluated based on the frequency of common cluster words among the candidate labels. The frequency was weighted proportionally to the relevance of the original Wikipedia article. The relevance was based on the order of appearance in the search results. The five labels with the highest scores were chosen to describe the cluster. The clustered documents consisted of exam questions that students use to practice before a course exam. Each question in the cluster was scored by someone experienced in the relevant topic by evaluating whether one of the five labels correctly described the content. The method proved unreliable, with only one course receiving labels considered descriptive for most of its questions. A significant problem was the closely related data, with all documents belonging to one overarching category instead of a dataset containing independent topics. However, for one dataset, 80% of the documents received a descriptive label, indicating that labeling using secondary sources has potential but needs to be investigated further. / Märkning handlar om att ge okända data en beskrivning. I denna uppsats behandlas data i form av dokument som utan ytterligare information klustras med temamodellering samt märks med hjälp av Wikipedia som en sekundär källa. Märkning av dokument är ett nytt forskningsområde med flera tänkbara vägar framåt.
I denna uppsats undersöks en möjlig metod i en praktisk miljö. Dokumenten förbehandlas och grupperas i kluster med hjälp av en temamodell. Vanliga ord från varje kluster används sedan för att generera en sökfråga som skickas till Wikipedia där titlar och kategorier från de mest relevanta sidorna lagras som kandidater. Varje kandidat utvärderas sedan baserat på frekvensen av kandidatordet bland titlarna i klustret och relevansen av den ursprungliga Wikipedia-artikeln. Relevansen av artiklarna baserades på i vilken ordning de dök upp i sökresultatet. De fem märkningarna med högst poäng valdes ut för att beskriva klustret. De klustrade dokumenten bestod av tentamensfrågor som studenter använder sig av för att träna inför ett prov. Varje fråga i klustret utvärderades av någon med erfarenhet av det i frågan behandlade ämnet. Utvärderingen baserades på om någon av de fem märkningarna ansågs beskriva innehållet. Metoden visade sig vara opålitlig med endast en kurs som erhöll märkningar som ansågs beskrivande för majoriteten av dess frågor. Ett stort problem var att data var nära relaterad med alla dokument tillhörande en övergripande kategori i stället för oberoende ämnen. För en datamängd fick dock 80 % av dokumenten en beskrivande etikett. Detta visar att märkning med hjälp av sekundära källor har potential, men behöver undersökas ytterligare.
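The label-scoring step described above (frequent cluster words matched against candidate titles and categories, down-weighted by the search rank of the Wikipedia page each candidate came from) might look like this in outline. The exact weighting scheme and the `score_labels` helper are assumptions for illustration, not the thesis's implementation.

```python
from collections import Counter

def score_labels(cluster_words, candidates, top_n=5):
    """Score candidate labels. `candidates` is a list of (label, rank)
    pairs, where rank is the position of the source Wikipedia page in
    the search results (1 = most relevant). A label earns points for
    each frequent cluster word it contains, weighted by 1/rank."""
    freq = Counter(cluster_words)
    scores = {}
    for label, rank in candidates:
        weight = 1.0 / rank
        hits = sum(freq[w] for w in label.lower().split())
        scores[label] = scores.get(label, 0.0) + weight * hits
    # The top-scoring labels are chosen to describe the cluster.
    return sorted(scores, key=scores.get, reverse=True)[:top_n]
```

With closely related clusters, as the thesis found, many candidates share the same overarching words, which is exactly when a scheme like this stops discriminating well.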
44

The Swedish Wikipedia Gender Gap

Helgeson, Björn January 2015 (has links)
The proportion of women editors on the English language Wikipedia has for years been known to be very low. The purpose of this thesis is to see if this gender gap exists on the Swedish language Wikipedia as well, and investigate the reasons behind it. To do this, three methods are used. Firstly a literature review is conducted, looking at women in computing and how Wikipedia works and how it was founded. Secondly, user behavior and activity-levels are measured through means of a database analysis of editors and edits. And thirdly, a survey is distributed, aimed at both readers and editors of Swedish Wikipedia, gathering some 2700 respondents. The results indicate that there is indeed a big disproportion, and that only between 13-19% of editors are women. The findings did not indicate readers of the encyclopedia having any strong negative preconceptions about Wikipedia or its community. However when looking at reasons for not contributing, women were significantly more likely to perceive themselves as not competent enough to edit. Computer skills were found to be an important factor for trying out editing in the first place, and Wikipedia’s connection to a male-dominated computing/programming culture is put forth as a reason for the resilience of the gender gap. The difference in men’s and women’s communication styles in relation to the climate Wikipedia’s policies and guidelines is also discussed. / Andelen kvinnor som redigerar engelskspråkiga Wikipedia har visats vara väldigt låg. Syftet med detta arbetet är att undersöka om andelen ser likadan ut på den Svenskspråkiga siten också, samt undersöka de bakomliggande orsakerna. För att göra detta används tre metoder. Först görs en literaturstudie som behandlar kvinnor inom programmering och hur Wikipedia fungerar och dess grundande. Därefter mäts användarbeteende och aktivitetsnivåer genom en databasanalys på redigerare och redigeringar. 
Slutligen distribuerades en webbenkät riktad till både läsare och redigerare av svenskspråkiga Wikipedia, med runt 2700 svarande. Resultaten visar att det finns en stor snedfördelning och att endast mellan 13-19% av redigerarna är kvinnor. Resultaten påvisar inte några särskilda negativa uppfattningar hos läsare om Wikipedia eller dess gemenskap. Däremot uppgav kvinnor i signifikant högre utsträckning att en viktig anledning till att de inte bidrog till encyklopedin var att de inte upplevde sig tillräckligt kompetenta. Datorvana befanns vara en viktig faktor för att testa på att redigera första gången, och Wikipedias koppling till en mansdominerad programmeringskultur diskuteras som en faktor till den låga andelen kvinnor. Wikipedias policyer och riktlinjer och deras sammankoppling med skillnader i mäns och kvinnors kommunikationsstilar på internet diskuteras även.
45

"Diderot 2.0" : mémoire de maîtrise portant sur la portée des mesures prises par les nouvelles encyclopédies pour s'adapter aux nouvelles technologies / "Diderot 2.0": a master's thesis on the scope of the measures taken by the new encyclopedias to adapt to new technologies

Péloquin, André January 2007 (has links)
Thesis digitized by the Direction des bibliothèques de l'Université de Montréal.
46

Content policies in Social Media Critical Discourse Studies: The invisible hand of social media providers?

Kopf, Susanne January 2019 (has links) (PDF)
This paper complements theoretical and methodological considerations regarding social media in critical discourse studies as it addresses social media content policies as a key contextual element. Specifically, this paper argues that, and why, the exploration of content policies and their enforcement is indispensable when approaching social media platforms, and social media data in particular, from a critical perspective. A number of researchers have already begun to identify contextual elements that require particular attention when viewing social media and social media data through a CDS lens. However, social media sites' content policies, as a pervasive contextual element, have not yet received adequate research attention. Drawing on Computer-Mediated Discourse Analysis (CMDA) and recent developments in Social Media CDS (SM-CDS), this paper first demonstrates the existing gap in research. Then, it contends that social media sites' content policies deserve more detailed attention in SM-CDS, argues why this is the case, and elaborates on the different aspects of content policies and policy enforcement that require examination. After a detailed theoretical discussion, empirical evidence supporting this argument is presented in the form of a case study of Wikipedia and Wikipedia data.
47

Towards a definition of Web 2.0 - a comparative study of the 'wiki', 'blog' and 'social network' as instances of Web 2.0

Lewis, Belinda Ann 03 February 2009 (has links)
Web 2.0 was a phrase coined in 2004 to describe the characteristics of websites which survived the original dot-com crash. Despite the discussion of this phenomenon in a wide variety of both academic and mass media sources, its exact definition remains unclear. The relative contributions of technology and social participation to this phenomenon are particularly unclear. The primary aim of this research report is to provide a clear and comprehensive definition of Web 2.0. This definition is determined through a combined social and technological analysis of blogs, wikis and social network sites, through their particular manifestations in Boing Boing, Wikipedia and Facebook respectively. It is the finding of this research that Web 2.0 is primarily the result of a natural evolution from Web 1.0 technologies and attitudes, and that Web 2.0 is essentially a social phenomenon. This research provides separate definitions for Web 2.0 technologies and Web 2.0 platforms. A Web 2.0 technology is any technology that aids and encourages simple, intuitive user interaction through an architecture of participation. These technologies enable user feedback, and are thus constantly improved and exist within the ethos of a perpetual beta. Web 2.0 technologies embrace re-mix and mash-up philosophies. A Web 2.0 platform is a read-write Web platform designed to enable and encourage User Generated Content and interaction. These platforms can be built with any set of technologies, and their primary characteristics are social in nature, but the platforms must allow users to interact with the technology at either an open-source, network or appropriation level. These platforms become more powerful and richer the greater the number of people using the platform, and ultimately result in the formation of Web 2.0 communities.
48

Network of Knowledge: Wikipedia as a Sociotechnical System of Intelligence

Livingstone, Randall January 2012 (has links)
The purpose of this study was to explore the codependencies of the social and technical structures that yield Wikipedia the website and Wikipedia the community. In doing so, the research investigated the implications of such a sociotechnical system for the maintenance of the project and the emergence of collective intelligence. Using a theoretical framework informed by digital media studies, science and technology studies, and the political economy of communication, this study examined the material and ideological conditions in which Wikipedia has developed. The study's guiding research questions addressed the nature of Wikipedia's sociotechnical system and potential for collective intelligence, as well as the historical development of the project's technical infrastructure and the state of its technology-assisted collaboration. A mainly qualitative multi-method research approach was employed, including document analysis, semi-structured interviewing, and social network analysis. A plethora of documents were carefully selected and examined to explore how and why decisions were made, policies implemented, and technologies adopted on the site. Additionally, 45 interviews were conducted with members of Wikipedia's technical community to understand the relationships between social and technical aspects of the project and the motivations of programmers who contribute automated tools. Finally, social network measures and visualizations were used to interrogate notions of collaboration and make more transparent the centrality of technology to the content creation process. The study revealed that Wikipedia's technical development has been shaped by the dueling ideologies of the open-source software movement and postindustrial capitalism. Its sociotechnical system features the complex collaboration of human contributors, automated programs, social bureaucracy, and technical protocol, each of which conditions the existence and meaning of the others. 
In addition, the activity on Wikipedia fits established models of collective intelligence and suggests the emergence of a cyberculture, or culturally informed shared intelligence, unique to the digital media context. Software robots (bots) are central actors in this system and are explored in detail throughout this research.
49

Métricas de análise de links e qualidade de conteúdo: um estudo de caso na Wikipédia / Link analysis metrics and content quality: a case study in Wikipedia

Hanada, Raíza Tamae Sarkis 26 February 2013 (has links)
Muitos links entre páginas na Web podem ser vistos como indicadores de qualidade e importância para as páginas que eles apontam. A partir desta ideia, vários estudos propuseram métricas baseadas na estrutura de links para inferir qualidade de conteúdo em páginas da web. Contudo, até onde sabemos, o único trabalho que examinou a correlação entre tais métricas e qualidade de conteúdo consistiu de um estudo limitado que deixou várias questões em aberto. Embora tais métricas sejam muito bem sucedidas na tarefa de ranquear páginas que foram fornecidas como respostas para consultas submetidas para máquinas de busca, não é possível determinar a contribuição específica de fatores como qualidade, popularidade e importância para os resultados. Esta dificuldade se deve em parte ao fato de que a informação sobre qualidade, popularidade e importância é difícil de obter para páginas da web em geral. Ao contrário de páginas da web, estas informações podem ser obtidas para artigos da Wikipédia, uma vez que qualidade e importância são avaliadas por especialistas humanos, enquanto a popularidade pode ser estimada com base nas visualizações dos artigos. Isso torna possível a verificação da relação existente entre estes fatores e métricas de análise de links, nosso objetivo neste trabalho. Para fazer isto, nós implementamos vários algoritmos de análise de links e comparamos os rankings obtidos com eles com os obtidos considerando a avaliação humana feita na Wikipédia com relação aos fatores qualidade, popularidade e importância. Nós observamos que métricas de análise de links são mais relacionadas com qualidade e popularidade que com importância e a correlação é moderada / Many links between Web pages can be viewed as indicative of the quality and importance of the pages pointed to. Accordingly, several studies have proposed metrics based on links to infer web page content quality. 
However, as far as we know, the only work that has examined the correlation between such metrics and content quality was a limited study that left many open questions. Although these metrics have proven successful in ranking pages returned as answers to queries submitted to search engines, it is not possible to determine the specific contribution of factors such as quality, popularity, and importance to the results. This difficulty is partly due to the fact that such information is hard to obtain for Web pages in general. Unlike ordinary Web pages, the content quality of Wikipedia articles is evaluated by human experts, which makes it feasible to verify the relation between such link analysis metrics and the quality of Wikipedia articles, our goal in this work. To accomplish that, we implemented several link analysis algorithms and compared their resulting rankings with the ones created by human evaluators regarding quality, popularity and importance. We found that the metrics are more correlated with quality and popularity than with importance, and that the correlation is moderate.
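As a rough illustration of the kind of comparison the thesis performs (not its actual implementation or choice of algorithms), one can compute a link-based metric such as PageRank over an article link graph and then correlate the resulting ranking with human quality scores using Spearman's rank coefficient. Both helpers below are self-contained sketches under that assumption.

```python
def pagerank(links, damping=0.85, iters=50):
    """Plain power-iteration PageRank over a {page: [outlinks]} dict."""
    pages = set(links) | {p for outs in links.values() for p in outs}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        new = {p: (1.0 - damping) / n for p in pages}
        for p, outs in links.items():
            if outs:
                share = damping * rank[p] / len(outs)
                for q in outs:
                    new[q] += share
            else:  # dangling page: spread its rank uniformly
                for q in pages:
                    new[q] += damping * rank[p] / n
        rank = new
    return rank

def spearman(xs, ys):
    """Spearman rank correlation between two equal-length score lists
    (no tie handling, which suffices for this sketch)."""
    def ranks(vals):
        order = sorted(range(len(vals)), key=lambda i: vals[i])
        r = [0] * len(vals)
        for pos, i in enumerate(order):
            r[i] = pos + 1
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))
```

A moderate correlation, as the thesis reports, would show up here as a Spearman coefficient well between 0 and 1 when the PageRank scores are compared against the human quality ratings of the same articles.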
50

Kognitiv auktoritet och Wikipedia : En analys av gymnasieelevers källkritiska granskning av Wikipedia / Cognitive authority and Wikipedia : An analysis of high school students’ critical evaluation of Wikipedia

Johansson, Henrik, Stiel, Johan January 2008 (has links)
The main purpose of this master’s thesis is to examine how high school students evaluate the quality of the information available on the online encyclopedia Wikipedia. By doing quantitative research based on questionnaires, we expected to find that the means of judging that students use in this situation are most frequently based on a text’s intrinsic qualities from the point of view of its content. It was also our presupposition that 18- and 19-year-old high school graduates have received some education in information research and authoritative sources on the Internet. From this assumption we assessed that, while not always using methods recommended by experts, the students use Wikipedia with a high level of healthy suspicion. We considered Patrick Wilson’s theory about cognitive authority to be valid for these assumptions. Though generalizations based on a large sample reduce an understanding of each individual’s expected behavior during information quality evaluation, conclusions could be drawn about which criteria most individuals in this age group use in a situation like this. The results show that though intrinsic plausibility has great significance in the students’ evaluation of quality, looking for outside sources is still the main criterion for credibility. It was further discovered that the students in the study used a wide variety of criteria apart from intrinsic plausibility and examining outside sources. This can be of particular significance for future research because it gives us valuable information about the behavior of high school students in terms of information research. / Uppsatsnivå: D
