Global ETD Search

181	Broad-domain Quantifier Scoping with RoBERTa Rasmussen, Nathan Ellis 10 August 2022 (has links) No description available. Linguistics quantifiers quantifier scope disambiguation explanatory text Simple English Wikipedia corpus annotation inter-annotator agreement RoBERTa self-trained language models transfer learning span pair classification
182	Accountability in action: how can archaeology make amends? Fitzpatrick, Alexandra L. 22 March 2022 (has links) This special issue gathers together a selection of short articles reflecting on the historical construction of inequality and race in the histories of archaeology. The articles also suggest ways in which the discipline might grapple with the (often obvious, sometimes subtle) consequences of that historical process. The articles were solicited via an open call for papers in the summer of 2020 (one made with the aim of speedy publication), and the breadth of the topics they discuss reflects how inequality and race have become more prominent research themes within the histories of archaeology in the previous five to ten years. At the same time, the pieces show how research can, and should, be connected to attempts to promote social justice and an end to racial discrimination within archaeological practice, the archaeological profession, and the wider worlds with which the discipline interacts. Published at a time when a pandemic has not only swept the world, but also exposed such inequalities further, the special issue represents a positive intervention in what continues to be a contentious issue. / The EDH project was funded by the UK’s Arts and Humanities Research Council (AHRC), project number AH/S004580/1, and conducted in compliance with UCL’s ethical guidance, project id 14901/001. Yugoslavia Racial anthropology Wikipedia Activism Race Gender Museum of Fine Arts Boston Antiquities market Images of Africans in classical art Flinders Petrie Human remains Kahun Decolonisation Racism Indigenous archaeology
183	Frihetens rike: Wikipedianer om sin praktik, sitt produktionssätt och kapitalismen [The Realm of Freedom: Wikipedians on Their Practice, Their Mode of Production and Capitalism] Lund, Arwid January 2015 (has links) This study is about voluntary productive activities in digital networks and on digital platforms that are often described as pleasurable. The aim of the study is to relate the peer producers’ perceptions of their activities on a micro level, in terms of play, game, work and labour, to their views on Wikipedia’s relation to capitalism on a macro level; to compare the identified ideological formations on both levels and how they relate to each other; and finally to compare the identified ideological formations with contemporary Marxist theory on cognitive capitalism. The intention is to perform a critical evaluation of the economic role of peer production in society. Qualitative, semi-structured interviews with eight Wikipedians active within the Swedish-language version of Wikipedia constitute the empirical base of the study, together with one public lecture by a Wikipedian on the encyclopaedia and a selection of pages in the encyclopaedia that are analysed as texts. The transcribed interviews have been analysed using a version of ideological analysis as developed by the Gothenburg School. The views on the peer-producing activities on the micro level have been analysed in a dialectical way, but the analysis is also grounded in a specific field model. Six ideological formations are identified in the empirical material. On the micro level: the peripheral, bottom-up and top-down formations; on the macro level: the Californian alikeness ideology, communism of capital and capitalism of communism. Communism of capital has two sides to it: one stresses the synergies and the other the conflicts between the two phenomena. The formations on the macro level conform broadly to contemporary Marxist theory, but there are important differences as well.
The study results in a hypothesis that the critical side of communism of capital and the peripheral and bottom-up formations could help to further a more sustainable capitalism of communism, and counteract a deeper integration of the top-down formation with Californian alikeness ideology. The latter is the main risk of capitalist co-optation of the peer production that is underway, as the manifestly dominant formations on the macro level are Californian alikeness ideology and communism of capital. / © 2015 Arwid Lund, used under a Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 license: http://creativecommons.org/licenses/by-nc-nd/3.0/ Wikipedia Peer Production Crowdsourcing Digital Labour Digital Work Wikipedians Ideology Ideology Analysis Historical Materialism Marxism Autonomist Marxism Capitalism Cognitive Capitalism Communism of Capital Capitalism of Communism Play Playing Work Working Gaming Labour Labouring Mode of Production Communism Making Competition Wage Labour Gift Economy Encyclopaedias Reference Works Immaterial Labour
184	Zur Beziehung von Raum und Inhalt nutzergenerierter geographischer Informationen [On the Relationship between Space and Content of Volunteered Geographic Information] Hahmann, Stefan 21 July 2014 (has links) (PDF) Over the last ten years the World Wide Web has made significant progress and evolved into the so-called “Web 2.0”. The most important feature of this new quality of the WWW is the participation of users in generating content. This trend facilitates the formation of user communities that collaborate on diverse projects in which they collect and publish information. Prominent examples of such projects are the online encyclopedia “Wikipedia”, the microblogging platform “Twitter”, the photo platform “Flickr” and the database of topographic information “OpenStreetMap”. User-generated content that is directly or indirectly geospatially referenced is often termed more specifically “volunteered geographic information”. The geospatial reference of this information is constituted either directly, by coordinates given as meta-information, or indirectly, through georeferencing of toponyms or addresses contained in the information. Volunteered geographic information is particularly suited for research, as it can be accessed at low or even no cost. Furthermore, it reflects a variety of human decisions that are linked to geographic space. In this thesis, the relationship between space and content of volunteered geographic information is investigated from two different perspectives. The first part of the thesis addresses the question of what share of information has a relationship between space and content such that the information is locatable in geospace. In this context, the assumption that about 80% of all information has a reference to space is well known within the community of geographic information system users. Since the 1980s it has served as a marketing tool within the whole geoinformation sector, although there has not been any empirical evidence for it.
This thesis helps to fill this research gap. For the validation of the ‘80% hypothesis’, two approaches are presented. The first approach is based on a corpus of information that is as representative as possible of world knowledge. For this purpose the German-language edition of Wikipedia has been selected. This corpus is modeled as a network of information in which the articles are the nodes and the cross-references are the edges of a directed graph. With the help of this network a graduated definition of geospatial reference is possible. It is implemented by computing the distance of each article to the closest article within the network that has spatial coordinates assigned. In parallel, a survey-based approach is developed in which participants are asked to assign pieces of information to one of the categories “direct geospatial reference”, “indirect geospatial reference” and “no geospatial reference”. A synthesis of both approaches leads to an empirically justified figure for the “80% assertion”. The result of the investigation is that, for the Wikipedia corpus, 27% of the information may be categorized as directly geospatially referenced and 30% as indirectly geospatially referenced. The second part of the thesis investigates to what extent volunteered geographic information that is produced on mobile devices is related to the locations where it is published. For this purpose, a collection of microblogging texts produced on mobile devices serves as the research corpus. Microblogging texts are short texts that are published via the World Wide Web. For this type of information, the relationship between the content of the information and its position is less obvious than, for example, for topographic information or photo descriptions.
The analysis of microblogging texts offers new possibilities for market and opinion research, the monitoring of natural events and human activities, and decision support in disaster management. Spatial analysis of the texts may add extra value; for some of these applications it is in fact a necessary condition. For this reason, the relationship between the published content and the locations where it is generated is of interest. This thesis describes methods that support the investigation of this relationship. In the presented approach, classified Points of Interest serve as a model of the environment. To investigate the correlation between these points and the microblogging texts, manual classification and natural language processing are used to classify the texts according to their relevance to the respective feature classes. Subsequently, it is tested whether the share of relevant texts in the proximity of objects of the tested classes is above average. The results show that the strength of the location-content correlation depends on the tested feature class. While for the feature classes ‘train station’, ‘airport’ and ‘restaurant’ a significant dependency of the share of relevant texts on the distance to the respective objects may be observed, this is not confirmed for objects of other feature classes, such as ‘cinema’ and ‘supermarket’. However, as prior research at small cartographic scale has detected correlations between space and content of microblogging texts, it can be concluded that the strength of the correlation between space and content of microblogging texts depends on scale and topic.
Volunteered Geographic Information VGI User Generated Content UGC Geographical information science Wikipedia Twitter OpenStreetMap Networks Geospatial reference Geographic information retrieval machine learning natural language programming computational linguistics ddc:550 rvk:RB 10104
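The corpus-based approach described in record 184 (computing, for each article, the distance to the closest article that carries spatial coordinates) can be illustrated with a short sketch. This is not code from the thesis: it assumes that distance means the number of link hops along outgoing cross-references, and it uses a breadth-first search over the reversed graph. The function name `hops_to_nearest_georeferenced` and the toy article graph are hypothetical.

```python
from collections import deque


def hops_to_nearest_georeferenced(links, georeferenced):
    """Return {article: hops to the closest georeferenced article}.

    links: dict mapping each article to the articles it links to (directed).
    georeferenced: set of articles that carry spatial coordinates.
    Assumption (not from the thesis): distance = number of outgoing-link hops.
    Articles that cannot reach any georeferenced article are left out.
    """
    # Reverse the edges so one search from all georeferenced articles
    # reaches every article that links (directly or indirectly) to them.
    reverse = {}
    for source, targets in links.items():
        for target in targets:
            reverse.setdefault(target, []).append(source)

    distance = {article: 0 for article in georeferenced}
    queue = deque(distance)
    while queue:
        node = queue.popleft()
        for predecessor in reverse.get(node, []):
            if predecessor not in distance:
                distance[predecessor] = distance[node] + 1
                queue.append(predecessor)
    return distance


# Hypothetical toy graph: only "Dresden" has coordinates.
toy_links = {
    "Dresden": [],
    "Elbe": ["Dresden"],
    "River": ["Elbe"],
    "Algebra": ["Mathematics"],
    "Mathematics": [],
}
toy_result = hops_to_nearest_georeferenced(toy_links, {"Dresden"})
```

In this toy graph, “Dresden” is at distance 0, “Elbe” at 1 and “River” at 2, while “Algebra” and “Mathematics” have no path to a georeferenced article and are omitted. How such distances map onto the categories of direct and indirect geospatial reference is not specified in the abstract.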
185	AUTOMATED OPTIMAL FORECASTING OF UNIVARIATE MONITORING PROCESSES: Employing a novel optimal forecast methodology to define four classes of forecast approaches and testing them on real-life monitoring processes Razroev, Stanislav January 2019 (has links) This work explores practical one-step-ahead forecasting of structurally changing data, an unstable behaviour that real-life data connected to human activity often exhibit. This setting can be characterized as a monitoring process. Forecast models, methods and approaches range from simple and computationally "cheap" to very sophisticated and computationally "expensive". Moreover, different forecast methods handle different data patterns and structural changes differently: for particular data types or data intervals some forecast methods are better than others, which is usually not known beforehand. This raises a question: "Can one design a forecast procedure that effectively and optimally switches between various forecast methods, adapting their usage to the changes in the incoming data flow?" The thesis answers this question by introducing an optimality concept that allows optimal switching between simultaneously executed forecast methods, thus "tailoring" the forecast methods to the changes in the data. It is also shown how another forecast approach, combinational forecasting, in which forecast methods are combined using a weighted average, can be utilized by the optimality principle and can therefore benefit from it. Thus, four classes of forecast results can be considered and compared: basic forecast methods, basic optimality, combinational forecasting, and combinational optimality. The thesis shows that, most of the time, optimality gives results that are no worse than, or better than, the best of the forecast methods on which it is based.
Optimality also reduces the scatter of many different forecast suggestions to a single number or only a few numbers (in a controllable fashion). Optimality additionally gives a lower bound for optimal forecasting: the hypothetically best achievable forecast result. The main conclusion is that the optimality approach makes more or less obsolete the traditional way of treating monitoring processes, namely trying to find the single best forecast method for some structurally changing data. Such a search can still be carried out, of course, but it is best done within the optimality approach as its innate component. All this makes the proposed optimality approach for forecasting purposes a valid "representative" of a broader ensemble approach (which likewise motivated the development of the now popular Ensemble Learning concept as a valid part of the Machine Learning framework).
predictions forecasting optimal forecasting forecast classes optimality rules ensemble forecasting state switching combinational forecasting optimality framework exponential smoothing ARIMA ARMA SARIMA Double-Seasonal Holt-Winters time series Wikipedia Wikimedia Wikipedia data Twitter Twitter data electricity data monitoring processes monitoring process monitoring error metrics outliers missing values Mathematics
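The switching idea in record 185 (several forecast methods run side by side, with the procedure choosing among them as the data change) can be illustrated with a short sketch. The abstract does not state the thesis's optimality rule, so the selection rule here is an assumption: at each step, use the method with the lowest mean absolute error over the last few steps. The names `switching_forecast`, `naive`, `running_mean` and the `window` parameter are hypothetical.

```python
def naive(history):
    """Forecast the next value as the last observed value."""
    return history[-1]


def running_mean(history):
    """Forecast the next value as the mean of all observed values."""
    return sum(history) / len(history)


def switching_forecast(series, methods, window=3):
    """One-step-ahead forecasts that switch between methods.

    Assumed rule (not the thesis's optimality concept): at each step, use
    the method with the lowest mean absolute error over the last `window`
    steps. Ties, including the first step, go to the first method listed.
    """
    names = list(methods)
    errors = {name: [] for name in names}

    def recent_error(name):
        recent = errors[name][-window:]
        return sum(recent) / len(recent) if recent else 0.0

    forecasts, chosen = [], []
    for t in range(1, len(series)):
        history = series[:t]
        # Every method is run at every step, so all of them can be scored.
        candidates = {name: methods[name](history) for name in names}
        best = min(names, key=recent_error)
        forecasts.append(candidates[best])
        chosen.append(best)
        for name in names:
            errors[name].append(abs(candidates[name] - series[t]))
    return forecasts, chosen


# Hypothetical trending series: the naive method wins after a few steps.
toy_forecasts, toy_chosen = switching_forecast(
    [1, 2, 3, 4, 5], {"running_mean": running_mean, "naive": naive}
)
```

On this series the sketch starts with the running mean and switches to the naive method from the third forecast on, once the naive method's recent error is lower.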
186	Zur Beziehung von Raum und Inhalt nutzergenerierter geographischer Informationen [On the Relationship between Space and Content of Volunteered Geographic Information] Hahmann, Stefan 12 June 2014 (has links) Over the last ten years the World Wide Web has made significant progress and evolved into the so-called “Web 2.0”. The most important feature of this new quality of the WWW is the participation of users in generating content. This trend facilitates the formation of user communities that collaborate on diverse projects in which they collect and publish information. Prominent examples of such projects are the online encyclopedia “Wikipedia”, the microblogging platform “Twitter”, the photo platform “Flickr” and the database of topographic information “OpenStreetMap”. User-generated content that is directly or indirectly geospatially referenced is often termed more specifically “volunteered geographic information”. The geospatial reference of this information is constituted either directly, by coordinates given as meta-information, or indirectly, through georeferencing of toponyms or addresses contained in the information. Volunteered geographic information is particularly suited for research, as it can be accessed at low or even no cost. Furthermore, it reflects a variety of human decisions that are linked to geographic space. In this thesis, the relationship between space and content of volunteered geographic information is investigated from two different perspectives. The first part of the thesis addresses the question of what share of information has a relationship between space and content such that the information is locatable in geospace. In this context, the assumption that about 80% of all information has a reference to space is well known within the community of geographic information system users. Since the 1980s it has served as a marketing tool within the whole geoinformation sector, although there has not been any empirical evidence for it.
This thesis helps to fill this research gap. For the validation of the ‘80% hypothesis’, two approaches are presented. The first approach is based on a corpus of information that is as representative as possible of world knowledge. For this purpose the German-language edition of Wikipedia has been selected. This corpus is modeled as a network of information in which the articles are the nodes and the cross-references are the edges of a directed graph. With the help of this network a graduated definition of geospatial reference is possible. It is implemented by computing the distance of each article to the closest article within the network that has spatial coordinates assigned. In parallel, a survey-based approach is developed in which participants are asked to assign pieces of information to one of the categories “direct geospatial reference”, “indirect geospatial reference” and “no geospatial reference”. A synthesis of both approaches leads to an empirically justified figure for the “80% assertion”. The result of the investigation is that, for the Wikipedia corpus, 27% of the information may be categorized as directly geospatially referenced and 30% as indirectly geospatially referenced. The second part of the thesis investigates to what extent volunteered geographic information that is produced on mobile devices is related to the locations where it is published. For this purpose, a collection of microblogging texts produced on mobile devices serves as the research corpus. Microblogging texts are short texts that are published via the World Wide Web. For this type of information, the relationship between the content of the information and its position is less obvious than, for example, for topographic information or photo descriptions.
The analysis of microblogging texts offers new possibilities for market and opinion research, the monitoring of natural events and human activities, and decision support in disaster management. Spatial analysis of the texts may add extra value; for some of these applications it is in fact a necessary condition. For this reason, the relationship between the published content and the locations where it is generated is of interest. This thesis describes methods that support the investigation of this relationship. In the presented approach, classified Points of Interest serve as a model of the environment. To investigate the correlation between these points and the microblogging texts, manual classification and natural language processing are used to classify the texts according to their relevance to the respective feature classes. Subsequently, it is tested whether the share of relevant texts in the proximity of objects of the tested classes is above average. The results show that the strength of the location-content correlation depends on the tested feature class. While for the feature classes ‘train station’, ‘airport’ and ‘restaurant’ a significant dependency of the share of relevant texts on the distance to the respective objects may be observed, this is not confirmed for objects of other feature classes, such as ‘cinema’ and ‘supermarket’.
However, as prior research at small cartographic scale has detected correlations between space and content of microblogging texts, it can be concluded that the strength of the correlation between space and content of microblogging texts depends on scale and topic.
Contents:
1 Introduction
1.1 Motivation
1.1.1 Significance of geospatially referenced user-generated content for geographic information science and cartography
1.1.2 The geospatial reference hypothesis
1.1.3 The correlation of location and content in user-generated content
1.2 Research objectives and research questions
1.2.1 Testing the geospatial reference hypothesis
1.2.2 Investigating the correlation of location and content of user-generated content
1.3 Structure of the thesis
1.3.1 The relationship between space and content of volunteered geographic information
1.3.2 Outline of the thesis
1.3.3 Publications used
2 State of research
2.1 Relevant terms
2.1.1 Web 2.0
2.1.2 User Generated Content
2.1.2.1 Meaning and origin of the term
2.1.2.2 Definition of the term
2.1.2.3 Types of UGC
2.1.2.4 Criticism
2.1.2.5 Research potential
2.1.3 Geospatial reference
2.1.3.1 The term “geospatial reference” (Raumbezug) in the literature
2.1.3.2 Categories of geospatial reference
2.1.4 Geospatial
2.1.5 Geographic information and geodata
2.1.5.1 Definition of terms
2.1.5.2 Points of Interest as a special case
2.1.6 Volunteered Geographic Information
2.1.6.1 Origin of the term and characteristics of VGI
2.1.6.2 The concept of human sensors
2.1.6.3 Communication of geographic information in VGI
2.1.6.4 The added value of VGI
2.1.6.5 Motives of contributors
2.1.6.6 VGI in a global context
2.1.6.7 Collection of information: participatory vs. opportunistic
2.1.6.8 Formal definition
2.1.6.9 German equivalent of the term
2.1.7 Semantics of volunteered geographic information
2.1.7.1 Structured form
2.1.7.2 Unstructured form
2.2 Types of volunteered geographic information
2.2.1 Topographic information: OpenStreetMap
2.2.1.1 Corpus description
2.2.1.2 Research overview
2.2.1.3 Geospatial reference
2.2.2 Encyclopedic information: Wikipedia
2.2.2.1 Corpus description
2.2.2.2 Research overview
2.2.2.3 Geospatial reference
2.2.2.4 Meta-properties of articles in the German Wikipedia
2.2.3 Microblogging texts: Twitter
2.2.3.1 Corpus description
2.2.3.2 Research overview
2.2.3.3 Geospatial reference
2.2.4 Images and image metadata: Flickr, Instagram, Picasa, Panoramio, Geograph
2.2.4.1 Corpus description
2.2.4.2 Research overview
2.3 Information and networks
2.3.1 Examples of network structures
2.3.2 Implications of networked information for the geospatial reference hypothesis
2.3.3 Network properties of Wikipedia
2.4 Geographic information and cognition
2.5 Classifying information through natural language processing
2.5.1 Naive Bayes
2.5.2 Maximum Entropy
2.5.3 Support Vector Machines
3 Methods and results
3.1 Corpus-analytic approach to testing the geospatial reference hypothesis
3.1.1 Network degree of geospatial reference (Netzwerkgrad des Georaumbezuges, NGGR)
3.1.2 Data processing
3.1.3 Results of the NGGR computation
3.1.4 Correlation between NGGR and the properties of Wikipedia articles
3.2 Survey-based approach to testing the geospatial reference hypothesis
3.2.1 Categorization task for investigating geospatial reference
3.2.1.1 Material
3.2.1.2 Procedure
3.2.1.3 Participants
3.2.2 Hypotheses
3.2.3 Data on survey participation
3.2.4 Results
3.3 Synthesis of the corpus-analytic and survey-based approaches to testing the geospatial reference hypothesis
3.3.1 Methodology
3.3.2 Results
3.3.3 Influence of the knowledge factor on the survey results
3.3.4 Influence of professional background on the survey results
3.3.5 Prediction of the share of geospatially referenced information for the entire corpus of the German Wikipedia
3.4 Classification of volunteered geographic information with respect to the location-content correlation, using microblogging texts written on mobile devices as an example
3.4.1 Manual text classification
3.4.2 Supervised machine text classification with manually classified training data
3.4.2.1 Preprocessing of the microblogging texts
3.4.2.2 Evaluation of the machine text classification results
3.4.2.3 Tuning of the machine classification
3.4.3 Supervised machine text classification with lexical training data
3.4.4 Data used
3.4.4.1 Recording mobile microblogging texts with the Twitter Streaming API
3.4.4.2 Filtering of usable microblogging texts
3.4.4.3 Temporal and spatial patterns of the microblogging texts
3.4.4.4 Points of Interest used
3.4.5 Results
3.4.5.1 Manual annotation of texts
3.4.5.2 Supervised machine classification of texts with manually classified training data
3.4.5.3 Supervised machine classification of texts with lexical training data
3.5 Determining the distance dependency of the share of information relevant to specific places, using microblogging texts written on mobile devices as an example
3.5.1 Methodology
3.5.2 Results
4 Discussion
4.1 Methods for testing the geospatial reference hypothesis, using the Wikipedia corpus as an example
4.1.1 Choice of corpus
4.1.2 Abstract concept and instance
4.1.3 Corpus-analytic approach
4.1.4 Survey-based approach
4.2 Methods for determining the location-content correlation of user-generated information, using microblogging texts produced on mobile devices as an example
4.2.1 Manual classification
4.2.2 Supervised machine classification with manually classified training data
4.2.3 Unsupervised machine classification with lexical training data
4.2.4 Computing the distance dependency of the share of location-related texts
4.2.5 Points of Interest as a model for the spatial context
4.3 The term “geospatial reference” (Raumbezug) in the context of volunteered geographic information
5 Conclusions and research outlook
5.1 Answering the research questions
5.1.1 On testing the geospatial reference hypothesis
5.1.2 On the correlation of location and content of volunteered geographic information
5.2 Implications of the research results
5.3 Research outlook for volunteered geographic information
5.3.1 Quality of VGI
5.3.2 Synthesis of VGI with official data
5.3.3 Further current developments in VGI research
6 Bibliography
7 Appendix
Appendix A Documentation of the “geospatial reference experiment”
Appendix B Results of the categorization task of the “geospatial reference experiment”
Appendix C Feedback from participants in the “geospatial reference experiment”
Appendix D Influence of the factors professional background and knowledge on the categorization of terms with respect to their geospatiality
Appendix E Results of the manual classification of the microblogging texts
Appendix F Classification models resulting from manual and lexical training data
Appendix G Research data appendix
/ Over the past ten years the World Wide Web has changed significantly, developing into the so-called “Web 2.0”. The most essential feature of this new quality of the WWW is the participation of users in creating content. This development fosters the emergence of user communities that collaboratively collect and publish information in a wide variety of projects. Prominent examples of such projects are the online encyclopedia “Wikipedia”, the microblogging platform “Twitter”, the photo platform “Flickr” and the collection of topographic information “OpenStreetMap”. User-generated content that is directly or indirectly geospatially referenced can be described more specifically as “volunteered geographic information”. The geospatial reference of this information arises either directly, through spatial coordinates given as meta-information, or indirectly, through georeferencing of toponyms or addresses contained in the information. For research, volunteered geographic information has the particular advantage that it can often be made available at little or no cost and that it reflects a multitude of human decisions linked to space. This dissertation examines the relationship between space and content of volunteered geographic information from two perspectives. The first part of the thesis focuses on the question of what share of information has a relationship between space and information content such that the information can be located in geospace. In this context, the claim that 80% of all information has a geospatial reference has been widespread among users of geographic information systems since the 1980s. This claim serves as a marketing tool across the entire industry, but it has not been empirically substantiated. This thesis helps to close the existing research gap. To test this claim, which the thesis calls the “geospatial reference hypothesis” (Raumbezugshypothese), two approaches are presented. The first approach is based on the analysis of an information corpus that is as representative as possible, for which the German-language edition of Wikipedia was selected.
Diese wird als Informationsnetzwerk modelliert, indem deren Artikel als Knoten und deren interne Querverweise als Kanten eines gerichteten Graphen betrachtet werden. Mit Hilfe dieses Netzwerkes ist es möglich eine abgestufte Definition des Raumbezuges von Informationen einzuführen, indem die Entfernung jedes Artikels innerhalb des Netzwerkes zum jeweils nächstgelegenen Artikel, der mit räumlichen Koordinaten gekennzeichnet ist, berechnet wird. Parallel dazu wird ein Befragungsansatz entwickelt, bei dem Probanden die Aufgabe haben, Informationen in die Kategorien „Direkter Raumbezug“, „Indirekter Raumbezug“ und „Kein Raumbezug“ einzuordnen. Die Synthese beider Ansätze führt zu einer empirisch begründeten Zahl für die „Raumbezugsthese“. Das Ergebnis ist, dass für das Untersuchungskorpus Wikipedia 27% der Informationen als direkt raumbezogenen und 30% der Informationen als indirekt raumbezogen kategorisiert werden können. Im zweiten Teil der Arbeit wird die Forschungsfrage untersucht, inwiefern nutzergenerierte Informationen, die über mobile Geräte erzeugt werden, in Beziehung zu den Orten stehen, an denen sie veröffentlicht werden. Als Forschungskorpus dienen mobil verfasste Microblogging-Texte. Dies sind kurze Texte, die über das WWW veröffentlicht werden. Bei dieser Informationsart liegt im Gegensatz zu beispielsweise topographischen Information oder Fotobeschreibungen die Vermutung eines starken Zusammenhanges zwischen dem Inhalt der Informationen und deren Positionen nicht nahe. Die Analyse von Microblogging-Texten bietet unter anderem Potential für die Markt- und Meinungsforschung, die Beobachtung von Naturereignissen und menschlichen Aktivitäten sowie die Entscheidungsunterstützung in Katastrophenfällen. Aus der räumlichen Auswertung kann sich dabei ein Mehrwert ergeben, für einen Teil der Anwendungen ist die räumliche Auswertung sogar die notwendige Voraussetzung. 
Aus diesem Grund ist die Erforschung des Zusammenhanges der veröffentlichten Inhalte mit den Orten, an denen diese entstehen, von Interesse. In der Arbeit werden eine Methoden vorgestellt, mit deren Hilfe die Untersuchung dieser Korrelation am Beispiel von klassifizierten Points of Interest durchgeführt wird. Zu diesem Zweck werden die Texte mit Hilfe von manueller Klassifikation und maschineller Sprachverarbeitung entsprechend ihrer Relevanz für die getesteten Objektklassen klassifiziert. Anschließend wird geprüft, ob der Anteil der relevanten Texte in der Nähe von Objekten der getesteten Klassen überdurchschnittlich hoch ist. Die Ergebnisse der Untersuchungen zeigen, dass die Stärke der Raum-Inhalt-Korrelation von den getesteten Objektklassen abhängig ist. Während sich beispielsweise bei Bahnhöfen, Flughäfen und Restaurants eine deutliche Abhängigkeit des Anteils der relevanten Texte von der Entfernung zu den betreffenden Objekten zeigt, kann dies für andere Objektklassen, wie z.B. Kino oder Supermarkt nicht bestätigt werden. 
Da frühere Forschungsarbeiten bei der Analyse im kleinmaßstäbigen Bereich eine Korrelation der Informationsinhalte mit deren Entstehungsorten feststellten, kann geschlussfolgert werden, dass der Zusammenhang zwischen Raum und Inhalt bei Microblogging-Texten sowohl vom Maßstab als auch vom Thema abhängig ist.:1 Einleitung 1 1.1 Motivation 1 1.1.1 Bedeutung raumbezogener nutzergenerierter Inhalte für die geographische Informationswissenschaft und die Kartographie 1 1.1.2 Die Raumbezugshypothese 3 1.1.3 Die Korrelation von Ort und Inhalt bei nutzergenerierten Inhalten 4 1.2 Forschungsziele und Forschungsfragen 5 1.2.1 Prüfung der Raumbezugshypothese 5 1.2.2 Untersuchung der Korrelation von Ort und Inhalt von nutzergenerierten Inhalten 6 1.3 Aufbau der Arbeit 7 1.3.1 Die Beziehung zwischen Raum und Inhalt von nutzergenerierten geographischen Informationen 7 1.3.2 Gliederung der Arbeit 7 1.3.3 Verwendete Publikationen 8 2 Forschungsstand 11 2.1 Relevante Begriffe 11 2.1.1 Web 2.0 11 2.1.2 User Generated Content / Nutzergenerierte Inhalte 12 2.1.2.1 Bedeutung und Begriffsherkunft 12 2.1.2.2 Begriffsklärung 12 2.1.2.3 Arten von UGC 13 2.1.2.4 Kritik 14 2.1.2.5 Forschungspotential 14 2.1.3 Raumbezug 14 2.1.3.1 Der Begriff ‚Raumbezug‘ in der Fachliteratur 14 2.1.3.2 Kategorien des Georaumbezuges 16 2.1.4 Georäumlich 16 2.1.5 Geographische Information und Geodaten 17 2.1.5.1 Begriffsklärung 17 2.1.5.2 Points of Interest als Spezialfall 19 2.1.6 Volunteered Geographic Information / Nutzergenerierte geographische Informationen 19 2.1.6.1 Begriffsherkunft und Charakteristika von VGI 19 2.1.6.2 Das Konzept der menschlichen Sensoren 20 2.1.6.3 Kommunikation geographischer Informationen bei VGI 21 2.1.6.4 Der Mehrwert von VGI 21 2.1.6.5 Motive der Beitragenden 22 2.1.6.6 VGI im globalen Kontext 22 2.1.6.7 Erfassung der Informationen: partizipativ vs. 
opportunistisch 23 2.1.6.8 Formale Definition 23 2.1.6.9 Deutsche Entsprechung des Begriffs 24 2.1.7 Semantik nutzergenerierter geographischer Informationen 25 2.1.7.1 Strukturierte Form 25 2.1.7.2 Unstrukturierte Form 26 2.2 Arten nutzergenerierter geographischer Informationen 26 2.2.1 Topographische Informationen – OpenStreetMap 28 2.2.1.1 Korpusbeschreibung 28 2.2.1.2 Forschungsüberblick 30 2.2.1.3 Raumbezug 32 2.2.2 Enzyklopädische Informationen – Wikipedia 34 2.2.2.1 Korpusbeschreibung 34 2.2.2.2 Forschungsüberblick 35 2.2.2.3 Raumbezug 36 2.2.2.4 Metaeigenschaften von Artikeln der deutschen Wikipedia 37 2.2.3 Microblogging-Texte – Twitter 39 2.2.3.1 Korpusbeschreibung 39 2.2.3.2 Forschungsüberblick 41 2.2.3.3 Raumbezug 42 2.2.4 Bilder und Bildmetainformationen – Flickr, Instagram, Picasa, Panoramio, Geograph 43 2.2.4.1 Korpusbeschreibung 43 2.2.4.2 Forschungsüberblick 45 2.3 Informationen und Netzwerke 46 2.3.1 Beispiele für Netzwerkstrukturen 46 2.3.2 Implikationen vernetzter Informationen für die Raumbezugshypothese 47 2.3.3 Netzwerkeigenschaften der Wikipedia 47 2.4 Geographische Informationen und Kognition 49 2.5 Informationen klassifizieren durch maschinelle Sprachverarbeitung 50 2.5.1 Naive Bayes 51 2.5.2 Maximum Entropy 51 2.5.3 Support Vector Machines 52 3 Methoden und Ergebnisse 53 3.1 Korpusanalytischer Ansatz für die Prüfung der Raumbezugshypothese 53 3.1.1 Netzwerkgrad des Georaumbezuges 53 3.1.2 Datenprozessierung 56 3.1.3 Ergebnisse der NGGR-Berechnung 57 3.1.4 Korrelation zwischen NGGR und den Eigenschaften von Wikipedia-Artikeln 60 3.2 Befragungsansatz für die Prüfung der Raumbezugshypothese 65 3.2.1 Kategorisierungsaufgabe zur Untersuchung des Georaumbezuges 65 3.2.1.1 Material 66 3.2.1.2 Prozedur 66 3.2.1.3 Teilnehmer 67 3.2.2 Hypothesen 68 3.2.3 Daten zur Beteiligung an der Befragung 68 3.2.4 Ergebnisse 70 3.3 Synthese von korpusanalytischem Ansatz und Befragungsansatz für die Prüfung der Raumbezugshypothese 71 3.3.1 Methodik 71 3.3.2 
Ergebnisse 72 3.3.3 Einfluss des Faktors Wissen auf die Ergebnisse der Befragung 73 3.3.4 Einfluss des fachlichen Hintergrundes auf die Ergebnisse der Befragung 74 3.3.5 Prädiktion des Anteils raumbezogener Informationen für das gesamte Korpus der deutschen Wikipedia 76 3.4 Klassifikation nutzergenerierter geographischer Informationen hinsichtlich der Korrelation Ort-Inhalt am Beispiel von mobil verfassten Microblogging-Texten 77 3.4.1 Manuelle Textklassifikation 78 3.4.2 Überwachte maschinelle Textklassifikation mit manuell klassifizierten Trainingsdaten 80 3.4.2.1 Vorverarbeitung der Microblogging-Texte 81 3.4.2.2 Evaluation der Ergebnisse der maschinellen Textklassifikation 82 3.4.2.3 Tuning der maschinellen Klassifikation 83 3.4.3 Überwachte maschinelle Textklassifikation mit lexikalischen Trainingsdaten 83 3.4.4 Verwendete Daten 86 3.4.4.1 Aufzeichnung von mobilen Microblogging-Texten mit der Twitter-Streaming-API 86 3.4.4.2 Filterung verwendbarer Microblogging-Texte 87 3.4.4.3 Zeitliche und räumliche Muster der Microblogging-Texte 89 3.4.4.4 Verwendete Points of Interest 91 3.4.5 Ergebnisse 92 3.4.5.1 Manuelle Annotation von Texten 92 3.4.5.2 Überwachte maschinelle Klassifikation von Texten mit manuell klassifizierten Trainingsdaten 95 3.4.5.3 Überwachte maschinelle Klassifikation von Texten mit lexikalischen Trainingsdaten 99 3.5 Bestimmung der Entfernungsabhängigkeit des Anteils von für spezifische Orte relevanten Informationen am Beispiel von mobil verfassten Microblogging-Texten 103 3.5.1 Methodik 103 3.5.2 Ergebnisse 104 4 Diskussion 111 4.1 Methoden zur Prüfung der Raumbezugshypothese am Beispiel des Korpus Wikipedia 111 4.1.1 Wahl des Korpus 111 4.1.2 Abstraktes Konzept und Instanz 112 4.1.3 Korpusanalytischer Ansatz 112 4.1.4 Befragungsansatz 114 4.2 Methoden zur Bestimmung der Korrelation Ort-Inhalt von nutzergenerierten Informationen am Beispiel von mobil erzeugten Microblogging-Texten 115 4.2.1 Manuelle Klassifikation 116 4.2.2 Überwachte 
maschinelle Klassifikation mit manuell klassifizierten Trainingsdaten 117 4.2.3 Unüberwachte maschinelle Klassifikation mit lexikalischen Trainingsdaten 118 4.2.4 Berechnung der Entfernungsabhängigkeit des Anteils ortsbezogener Texte 119 4.2.5 Points of Interest als Modell für den räumlichen Kontext 120 4.3 Der Begriff ‚Raumbezug‘ im Kontext von nutzergenerierten geographischen Informationen 120 5 Schlussfolgerungen und Forschungsausblick 123 5.1 Beantwortung der Forschungsfragen 123 5.1.1 Zur Überprüfung der Raumbezugshypothese 123 5.1.2 Zur Korrelation von Ort und Inhalt von nutzergenerierten geographischen Informationen 125 5.2 Implikationen der Forschungsergebnisse 128 5.3 Forschungsausblick nutzergenerierte geographische Informationen 130 5.3.1 Qualität von VGI 130 5.3.2 Synthese von VGI mit amtlichen Daten 132 5.3.3 Weitere aktuelle Entwicklungen im Bereich VGI-Forschung 132 6 Literaturverzeichnis 135 7 Anhang 151 Anhang A Dokumentation des „Experiments Geoaumbezug“ 152 Anhang B Ergebnisse der Kategorisierungsaufgabe des „Experiments Georaumbezug“ 157 Anhang C Rückmeldungen der Teilnehmer des „Experiments Georaumbezug“ 163 Anhang D Einfluss der Faktoren fachlicher Hintergrund und Wissen auf die Kategorisierung von Begriffen hinsichtlich ihrer Georäumlichkeit 166 Anhang E Ergebnisse der manuellen Klassifikation der Microblogging-Texte 168 Anhang F Klassifikationsmodelle resultierend aus manuellen und lexikalischen Trainingsdaten 177 Anhang G Forschungsdaten-Anhang 181 info:eu-repo/classification/ddc/550 ddc:550
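The graded notion of spatial reference described in the abstract above (each Wikipedia article's distance, within the directed link graph, to the nearest article carrying coordinates) can be illustrated with a multi-source breadth-first search. The following is only a minimal sketch under stated assumptions, not the thesis's implementation: the function name `geo_reference_degree`, the dictionary input format, the toy article names, and the choice to measure distance along outgoing links are all hypothetical.

```python
from collections import deque


def geo_reference_degree(links, geotagged):
    """Return {article: link hops to the nearest geotagged article}.

    links     -- dict mapping each article to the articles it links to
                 (directed edges; hypothetical input format)
    geotagged -- iterable of articles that carry coordinates
    Articles that cannot reach any geotagged article are left out.
    """
    # Reverse the edges so that a single BFS started from all geotagged
    # articles yields, for every article, its forward distance to the
    # nearest geotagged one.
    reverse = {}
    for source, targets in links.items():
        reverse.setdefault(source, set())
        for target in targets:
            reverse.setdefault(target, set()).add(source)

    dist = {article: 0 for article in geotagged}
    queue = deque(dist)
    while queue:
        node = queue.popleft()
        for predecessor in reverse.get(node, ()):
            if predecessor not in dist:
                dist[predecessor] = dist[node] + 1
                queue.append(predecessor)
    return dist


# Toy link graph (invented for illustration only).
toy_links = {
    "Dresden": ["Elbe"],
    "Elbe": ["Dresden"],
    "Baroque": ["Dresden"],
    "Harmony": ["Baroque"],
    "Prime number": [],
}
degrees = geo_reference_degree(toy_links, {"Dresden", "Elbe"})
```

In this toy graph the geotagged articles get degree 0, "Baroque" gets 1, "Harmony" gets 2, and "Prime number" has no path to a geotagged article and is therefore absent from the result. Running one search on the reversed graph avoids a separate search per article.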
187	Semantic Web Identity of academic organizations / search engine entity recognition and the sources that influence Knowledge Graph Cards in search results Arlitsch, Kenning 11 January 2017 (has links) Semantic Web Identity (SWI) characterizes an entity that has been recognized as such by search engines. The display of a Knowledge Graph Card in Google search results for an academic organization is proposed as an indicator of SWI, as it demonstrates that Google has gathered enough verifiable facts to establish the organization as an entity. This recognition may in turn improve the accuracy and relevancy of its referrals to that organization. This dissertation presents findings from an in-depth survey of the 125 member libraries of the Association of Research Libraries (ARL). The findings show that these academic libraries are poorly represented in the structured data records that are a crucial underpinning of the Semantic Web and a significant factor in achieving SWI. Lack of SWI extends to other academic organizations, particularly those at the lower hierarchical levels of academic institutions, including colleges, departments, centers, and research institutes. A lack of SWI may affect other factors of interest to academic organizations, including the ability to attract research funding, increase student enrollment, and improve institutional reputation and ranking. This study hypothesizes that the poor state of SWI is in part the result of a failure by these organizations to populate appropriate Linked Open Data (LOD) and proprietary Semantic Web knowledge bases. The situation represents an opportunity for academic libraries to develop skills and knowledge to establish and maintain their own SWI, and to offer SWI service to other academic organizations in their institutions. The research examines the current state of SWI for ARL libraries and some other academic organizations, and describes case studies that validate the effectiveness of proposed techniques to correct the situation. It also explains new services that are being developed at the Montana State University Library to address SWI needs on its campus, which could be adapted by other academic libraries. Google Wikipedia Wikidata Semantic Web Identity Knowledge Graph Cards search engines academic libraries Association of Research Libraries Google My Business AN 93100 ddc:020
188	Lärande utan läraren : Internetkällors framställningar av judendomar / Teaching Without the Teacher : Depictions of Judaisms in Online Sources Mårtensson, Christoffer January 2023 (has links) As the digital age takes root, more and more students use the internet to acquire information for their studies. Common sources in Sweden are online encyclopedias such as Wikipedia, SO-rummet and Nationalencyklopedin (NE). Since these online encyclopedias can serve as teaching aids, it is prudent to examine whether their contents meet the standards established by the Swedish National Agency for Education (Skolverket). Focusing on the subject “religion” and the topic “Judaism”, this study evaluates the contents and framing of the main articles about Judaism in the Swedish and English versions of the collaborative encyclopedia Wikipedia, in the state-sponsored Swedish source NE, and in the commercial Swedish source SO-rummet. Additionally, this paper discusses how these sources compare with the central contents of the course Religionsvetenskap 1 (Religious Studies 1) for Swedish upper secondary school. The results show that the articles from NE and Swedish Wikipedia mostly state facts without elaborating and are more likely to give the reader a homogeneous picture of Judaism. SO-rummet is the most beginner-friendly source, while English Wikipedia is the most nuanced but perhaps the most difficult for students to comprehend. Generally, the sources fail to portray diversity within the tradition, English Wikipedia being the exception. Ranked by how well they match the central contents of Religionsvetenskap 1, the sources are, in descending order: English Wikipedia, SO-rummet, Swedish Wikipedia and, lastly, NE. This is problematic because previous studies show that students have greater faith in NE than they do in Wikipedia. 
It is worth keeping in mind, however, that students are likely to use more than one source, especially for a group assignment. It is up to the teacher to recommend good sources, fill in the blanks and guide the students with their own teaching. Wikipedia Nationalencyklopedin NE SO-rummet Swedish upper secondary school Swedish education source criticism Judaism religious education teaching aids religion Religionskunskap 1 lgy 11 History of Religions
189	Comportamento de Metricas de Inteligibilidade Textual em Documentos Recuperados na Web / THE BEHAVIOR OF READABILITY METRICS IN DOCUMENTS RETRIEVED IN INTERNET AND ITS USE AS AN INFORMATION RETRIEVAL QUERY PARAMETER Londero, Eduardo Bauer 29 March 2011 (has links) Texts retrieved from the Internet through Google and Yahoo queries are evaluated using the Flesch-Kincaid Grade Level, a simple measure of text readability. Metrics of this kind were created to help writers evaluate their texts and have recently also been used in experimental automatic text simplification for inexperienced readers. In this work we apply these metrics to documents freely retrieved from the Internet, seeking correlations between readability and the relevance assigned to them by the search engines. The initial premise guiding the comparison between readability and relevance is the statement known as Occam's Principle, or the Principle of Economy. The study applies the Flesch-Kincaid Grade Level to text documents retrieved through search-engine queries and correlates it with their ranking position. A centralist trend was found in the retrieved texts: the groups of best-ranked files deviate only slightly, on average, from the mean of the category they belong to. This average deviation makes it possible to establish a correlation between relevance and readability, and also to detect differences in the way the two search engines calculate relevance. A subsequent experiment seeks to determine whether the readability measure can help Internet users choose simpler documents, and whether showing it alongside the list of retrieved links is useful and informative for the user's choice and navigation. In a final experiment, based on the knowledge previously obtained, the Britannica and Wikipedia encyclopedias are compared using the Flesch-Kincaid Grade Level readability metric. Textual Information Retrieval Natural Language Processing
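The Flesch-Kincaid Grade Level named in the abstract above is usually given as 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. The sketch below shows that formula in Python. It is only an illustration and not the thesis's tooling: the function names are hypothetical, and the regular-expression tokenizer and vowel-group syllable estimate in `rough_counts` are crude assumptions.

```python
import re


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level computed from raw counts."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def rough_counts(text: str) -> tuple:
    """Very rough (words, sentences, syllables) estimate for English text.

    Assumption: syllables are approximated by counting groups of
    consecutive vowels, with at least one syllable per word.
    """
    sentences = len([s for s in re.split(r"[.!?]+", text) if s.strip()])
    tokens = re.findall(r"[A-Za-z]+", text)
    syllables = sum(
        max(1, len(re.findall(r"[aeiouy]+", token.lower()))) for token in tokens
    )
    return len(tokens), sentences, syllables


# Example with counts supplied directly: 100 words, 5 sentences, 150 syllables.
grade = flesch_kincaid_grade(100, 5, 150)
```

Keeping the formula separate from the counting step makes it easy to swap the rough estimator for a proper tokenizer or a dictionary-based syllable counter.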
190	Wickrpedia : Integrering av sociala tjänster Ekström, Johan January 2006 (has links) The web has evolved considerably over the years. From being a place where author and reader were clearly distinguished, it now invites everyone to take part in the development of both content and technology. Social services are central to what is called Web 2.0. Wikis, blogs and folksonomies are all examples of how users and their communities are key to the development of services. Collaborative writing, tags and APIs are central. Social services are given an extra dimension through integration. The purpose of this study was to investigate whether it was possible to integrate an encyclopedia with a photo-sharing service. The question was whether this could be done in such a way that the images were relevant to the articles they were connected to. The method for examining the question was to create a service whose function was investigated through user tests. Wickrpedia was created, which is an integration of Wikipedia and Flickr. Wikipedia is an encyclopedia in the form of a wiki, while Flickr is used to store, organize and share photos. The results show that the images added something to the encyclopedia: it became more entertaining and pleasant, and the users' knowledge increased. The relevance of the images was good. The service has shortcomings and can be developed further. The conclusion is nevertheless that the service worked well and was seen as an improvement by the users. API collaborative Flickr folksonomy open source integration images media technology metadata tags RDF RSS Web 2.0 wiki Wikipedia XML Informatics, computer and systems science
