101
Merging the real with the virtual: crowd behaviour mining with virtual environments
Ch'ng, E., Gaffney, Vincent, Garwood, P., Chapman, H., Bates, R., Neubauer, W. 28 February 2017 (has links)
The first recorded crowdsourcing activity was in 1714 [1], with intermittent public event occurrences up until the millennium, when such activities became widespread, spanning multiple domains. Crowdsourcing, however, is relatively novel as a methodology within virtual environment studies, in archaeology, and within the heritage domains where this research is focused; the studies that have been conducted are few and far between in comparison to other areas. This paper aims to develop a recent concept in crowdsourcing work termed 'crowd behaviour mining' [2] using virtual environments, and to establish a unique concept in crowdsourcing activities that can be applied beyond the case studies presented here to other domains that involve human behaviour as independent variables. The case studies described here use data from experiments involving separate heritage projects conducted during two Royal Society Summer Science Exhibitions, in 2012 and 2015. 'Crowd behaviour mining' analysis demonstrated a capacity to inform research with respect to potential patterns and trends across space and time, as well as preferences between demographic user groups and the influence of experimenters during the experiments.
102
Data-driven Methods for Spoken Dialogue Systems : Applications in Language Understanding, Turn-taking, Error Detection, and Knowledge Acquisition
Meena, Raveesh January 2016 (has links)
Spoken dialogue systems are application interfaces that enable humans to interact with computers using spoken natural language. A major challenge for these systems is dealing with the ubiquity of variability—in user behavior, in the performance of the various speech and language processing sub-components, and in the dynamics of the task domain. However, as the predominant methodology for dialogue system development is to handcraft the sub-components, these systems typically lack robustness in user interactions. Data-driven methods, on the other hand, have been shown to offer robustness to variability in various domains of computer science and are increasingly being used in dialogue systems research. This thesis makes four novel contributions to data-driven methods for spoken dialogue system development. First, a method for interpreting the meaning contained in spoken utterances is presented. Second, an approach for determining when in a user's speech it is appropriate for the system to give a response is presented. Third, an approach for error detection and analysis in dialogue system interactions is reported. Finally, an implicitly supervised learning approach for knowledge acquisition through the interactive setting of spoken dialogue is presented. The general approach taken in this thesis is to model dialogue system tasks as classification problems and to investigate features (e.g., lexical, syntactic, semantic, prosodic, and contextual) for training various classifiers on interaction data. The central hypothesis of this thesis is that the models for the aforementioned dialogue system tasks trained using the features proposed here perform better than their corresponding baseline models. The empirical validity of this claim has been assessed through both quantitative and qualitative evaluations, using both objective and subjective measures. / This thesis explores data-driven methods for the development of spoken dialogue systems. The motivation behind such methods is that dialogue systems must be able to handle great variability, in users' behaviour as well as in the performance of the various speech and language technology sub-components. Traditional approaches, based on handcrafted dialogue system components, often struggle to handle such variation robustly. Data-driven methods have proven robust to variation in various problems within computer science and artificial intelligence, and have recently become popular in research on spoken dialogue systems as well. This thesis presents four new contributions to data-driven methods for the development of spoken dialogue systems. The first contribution is a data-driven method for semantic interpretation of spoken language. The proposed method has two important properties: robust handling of "ungrammatical" input (due to the spontaneous nature of speech and errors in speech recognition), and preservation of the structural relations between concepts in the semantic representation. Previous methods for semantic interpretation of spoken language have typically addressed only one of these two challenges. The second contribution of the thesis is a data-driven method for turn-taking in dialogue systems. The proposed model exploits prosody, syntax, semantics and dialogue context to determine when in the user's speech it is appropriate for the system to respond. The third contribution is a data-driven method for the detection of errors and misunderstandings in dialogue systems. Where previous work has focused on detecting errors on-line and has only been tested in individual domains, models are presented here for the analysis of errors both off-line and on-line, trained and evaluated on three distinct dialogue system corpora. Finally, a method is presented for how dialogue systems can acquire new knowledge through interaction with the user. The method is evaluated in a scenario where the system is to build up a knowledge base in a geographic domain through so-called "crowdsourcing". The system starts with minimal knowledge and uses the spoken dialogue both to gather new information and to verify the knowledge acquired. The general approach in this thesis is to model various dialogue system tasks as classification problems, and to investigate features in the discourse context that can be used to train classifiers. In the course of the work, various kinds of lexical, syntactic, prosodic and contextual features have been examined. A general discussion of these features' contributions to the modelling of the aforementioned tasks constitutes one of the thesis's main contributions. Another central aspect of the thesis is the training of models that can be used directly in dialogue systems, which is why only automatically extractable features (requiring no manual annotation) are used to train the models. Furthermore, the models' performance is evaluated on both speech recognition results and transcriptions in order to examine how robust the proposed methods are. The central hypothesis of this thesis is that models trained with the proposed contextual features perform better than a baseline model. The validity of this hypothesis has been assessed through both qualitative and quantitative evaluations, using both objective and subjective measures.
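The classification-based approach summarised above lends itself to a compact illustration. The following sketch is not the thesis's actual code: the feature names, toy data, and the scikit-learn pipeline are illustrative assumptions. It shows how one such task (turn-taking: respond vs. wait) could be cast as classification over automatically extractable lexical, syntactic, and prosodic features.

```python
# A minimal sketch, under assumptions, of modelling a dialogue-system task
# (here turn-taking) as classification over context features.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Each training example: features from the discourse context and a label.
# All feature values here are invented for illustration.
examples = [
    ({"last_word": "please", "pos_last": "UH", "pitch_slope": -0.8, "pause_ms": 600}, "RESPOND"),
    ({"last_word": "the",    "pos_last": "DT", "pitch_slope":  0.1, "pause_ms": 120}, "WAIT"),
    ({"last_word": "okay",   "pos_last": "UH", "pitch_slope": -0.5, "pause_ms": 450}, "RESPOND"),
    ({"last_word": "to",     "pos_last": "TO", "pitch_slope":  0.3, "pause_ms":  90}, "WAIT"),
]
X = [features for features, _ in examples]
y = [label for _, label in examples]

# DictVectorizer one-hot encodes string features and passes numeric ones through.
model = make_pipeline(DictVectorizer(), LogisticRegression())
model.fit(X, y)

context = {"last_word": "thanks", "pos_last": "UH", "pitch_slope": -0.6, "pause_ms": 500}
print(model.predict([context])[0])  # e.g. RESPOND
```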
103
Tillgänglighetsstrategi för kommunala planerare : Ett arbetssätt för att identifiera och planera åtgärder av enkelt avhjälpta hinder / An accessibility strategy for municipal planners : A working method for identifying and planning measures for easily remedied obstacles
Enebjörk, Maria January 2012 (has links)
There are many barriers and obstacles that need to be removed to make the urban environment more accessible for people with disabilities. There are currently no adequate tools for urban planners to assess and address these obstacles. Geographical Information Systems (GIS) provide powerful methods for the visualization and analysis of spatial data, including the kind of data important in studies of accessibility. This study suggests a method urban planners can use to identify accessibility obstacles and to propose future actions to fix them. A literature review was conducted to find out what has been done and what is new regarding accessibility issues. Different methods were evaluated, which in turn led to the construction of a multi-step analytical strategy. The proposed method identifies how urban planners can best work with accessibility issues given existing tools, and where there is potential for further improvement in areas such as data collection. Data only has value when it is used and kept up to date. Web services or mobile applications would help municipalities receive and share information about obstacles with local residents, especially the disabled. / Today's urban environment is not adapted to all individuals and contains many barriers. Some of these barriers count as so-called easily remedied obstacles (enkelt avhjälpta hinder, EAH) and must, according to a government decision, be removed. Municipal planners lack a tool for working with accessibility adaptation in which the obstacles are displayed clearly. Geographical information systems (GIS) are a powerful tool for visualization and analysis and can be used to analyse accessibility. The aim of this thesis is to propose a working method that municipal planners can use to identify EAH and to plan remedial measures. A literature study was carried out to find out what has already been done and whether there is ongoing work in the accessibility field. Existing methods were evaluated and a multi-step strategy was developed. The working method is a proposal for how municipal planners can work with accessibility issues, and the method has the potential to be developed further. It is essential that the information about obstacles is used once it has been collected, and the information must be kept continuously up to date. By developing a web service or mobile application, the municipality could get help collecting information about obstacles from residents, including those affected by them.
104
Les plateformes Internet comme intermédiaires hybrides du marché / Internet platforms as hybrid market intermediaries
Pelissier, Cédric 18 December 2015 (has links)
Digitisation and the new Web economy have created new practices of consuming and working with Internet systems. Innovation actors (public laboratories, R&D departments, local authorities, etc.) are increasingly turning to alternative models of distributed design. From "lead users" to "crowdsourcing", these models of distributed collaborative design draw in particular on the potential for dissemination and communication offered by the Internet and on the pooling of resources it makes possible. The ambition is to build open spaces for design and cooperation, bringing together designers from diverse backgrounds and multiplying collaborative interfaces with experienced users, so as to lead to the joint definition of new products combining the technologies and skills contributed by each participating entity. The thesis sets out to develop knowledge of, and to take a reflexive look back at, these new models of market intermediation and innovation on the basis of case studies. It seeks to reason about cooperative exchanges by examining, on the one hand, the systems that support them (interfaces) and, on the other, the construction of the operating rules of this type of "modular community" (individuals dispersed geographically, organisationally and culturally, with heterogeneous profiles) that is nevertheless engaged in pooling knowledge and integrating skills around new technological assemblies. The lines of research addressing these questions are organised around the interfaces and instrumentation of coordination processes, economic and social exchange systems (contribution/retribution), and community operation and regulation.
105
Scalable algorithms for monitoring activity traces / Algorithmes pour le monitoring de traces d'activité à grande échelle
Pilourdault, Julien 28 September 2017 (has links)
In this thesis, we study scalable algorithms for monitoring activity traces. In several domains, monitoring is a key ability for extracting value from data and improving a system. This thesis designs algorithms for monitoring two kinds of activity traces. First, we investigate temporal data monitoring. We introduce a new kind of interval join that features scoring functions reflecting the degree of satisfaction of temporal predicates. We study these joins in the context of batch processing: we formalize the Ranked Temporal Join (RTJ), which combines collections of intervals and returns the k best results. We show how to exploit the nature of temporal predicates and the properties of their associated scored semantics to design TKIJ, an efficient query evaluation approach on a distributed Map-Reduce architecture. Our extensive experiments on synthetic and real datasets show that TKIJ outperforms state-of-the-art competitors and provides very good performance for n-ary RTJ queries on temporal data. We also propose a preliminary study to extend our work on TKIJ to stream processing. Second, we investigate monitoring in crowdsourcing. We advocate the need to incorporate workers' motivation in task assignment. We propose to study an adaptive approach that captures workers' motivation during task completion and uses it to revise task assignment across iterations. We study two variants of motivation-aware task assignment: Individual Task Assignment (Ita) and Holistic Task Assignment (Hta). First, we investigate Ita, where we assign tasks to workers individually, one worker at a time. We model Ita and show it is NP-Hard. We design three task assignment strategies that pursue different objectives. Our live experiments study the impact of each strategy on overall performance. We find that different strategies prevail for different performance dimensions: in particular, the strategy that assigns random and relevant tasks offers the best task throughput, and the strategy that assigns tasks best matching a worker's compromise between task diversity and task payment yields the best outcome quality. Our experiments confirm the need for adaptive, motivation-aware task assignment. Then, we study Hta, where we assign tasks to all available workers holistically. We model Hta and show it is both NP-Hard and MaxSNP-Hard. We develop efficient approximation algorithms with provable guarantees. We conduct offline experiments to verify the efficiency of our algorithms, as well as online experiments with real workers comparing our approach with various non-adaptive assignment strategies. We find that our approach offers the best compromise between performance dimensions, confirming the need for adaptability.
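To make the ranked temporal join concrete, here is a minimal, centralized sketch. The overlap-based score, the toy data, and the naive nested-loop evaluation are assumptions for exposition; they do not reproduce the thesis's actual scored semantics or the distributed TKIJ algorithm.

```python
# A minimal sketch, under assumptions, of a Ranked Temporal Join (RTJ):
# join two interval collections with a scored temporal predicate and
# keep the k best pairs.
import heapq

def overlap_score(a, b):
    """Degree to which intervals a=(start, end) and b overlap, in time units."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))

def ranked_temporal_join(r, s, k):
    """Naive nested-loop RTJ: return the k highest-scoring interval pairs."""
    scored = []
    for a in r:
        for b in s:
            score = overlap_score(a, b)
            if score > 0:  # temporal predicate: the intervals overlap
                scored.append((score, a, b))
    return heapq.nlargest(k, scored)

r = [(0, 10), (5, 8), (20, 30)]
s = [(4, 12), (25, 26), (9, 21)]
print(ranked_temporal_join(r, s, k=2))
# [(6, (0, 10), (4, 12)), (3, (5, 8), (4, 12))]
```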
106
Quelles méthodes pour la gestion durable de la ressource des plantes aromatiques et médicinales ? : Analyse des inventaires historiques en Albanie, modélisation des habitats à partir des traces GPS des cueilleurs et construction d’un observatoire / What methods for the sustainable management of medicinal and aromatic plants as a resource? : Analysis of historical inventories in Albania, habitat modelling from gatherers’ GPS traces, and the construction of an observatory
Hoxha, Valter 16 December 2014 (has links)
Aromatic and medicinal plants (MAPs) in Albania constitute an economic sector that puts heavy pressure on the natural resource, degrading plant habitats and even exposing some species to a risk of extinction. The overall objective of the thesis is to propose new, complementary approaches to improve the knowledge base on the MAP resource in Albania. The first part of the thesis deals with the work required on existing material (inventories and studies) in order to draw lessons from it and identify possible gaps. Work on the Albanian archives covering the period from 1920 to 1986, and on various studies carried out between 1988 and 2010, made it possible to identify the different resource-management arrangements. Part of the usable historical data was collected and structured as a database. The second part of the thesis proposes a method for modelling the exploited habitat from the GPS traces of the gatherers, relying mainly on concepts from "Time Geography". The collection of information rests on a participatory (crowdsourcing) approach that involves the gatherers as information contributors. The GPS traces are processed and analysed by a model that applies a set of filters to retain only those portions of a trace that belong to the picking activity in the strict sense; determining the picking activity amounts to indirectly detecting the location of a plant. Successive application of filters on instantaneous speed, spatio-temporal density, surface area, and mean angle variation serves to model the picking zone (zc), which, aggregated at different scales, makes it possible to reconstruct the exploited habitat. The theoretical model was translated into SQL and implemented in a spatial database to facilitate automated data processing. The model was tested on three plants: sage, rosemary, and linden. Comparing the modelling results, represented as synthetic maps, with field data (georeferenced photos) made it possible first to refine the model and then to validate the results. The construction of a database able to integrate the output of GPS-trace processing and the historical archive data, while rendering them as cartographic or statistical views, demonstrates that data from sources of different origin and nature can coexist and be cross-referenced. Despite a limited number of experiments, the model coupled with the "BD OPAM" database lays the first foundations of an observatory prefiguring adaptive management of MAPs.
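The first of the filters mentioned above is easy to illustrate. The following sketch is not the thesis's model: the 0.5 m/s threshold, the flat-earth distance approximation, and the toy track are illustrative assumptions, and the remaining filters (spatio-temporal density, surface, mean angle variation) are not shown.

```python
# A minimal sketch, under assumptions, of an instantaneous-speed filter:
# keep only GPS fixes slow enough to be compatible with picking activity.
import math

def distance_m(p, q):
    """Approximate ground distance in metres between (lat, lon) points."""
    mean_lat = math.radians((p[0] + q[0]) / 2)
    dx = math.radians(q[1] - p[1]) * 6_371_000 * math.cos(mean_lat)
    dy = math.radians(q[0] - p[0]) * 6_371_000
    return math.hypot(dx, dy)

def speed_filter(track, max_speed=0.5):
    """Keep fixes (t_seconds, lat, lon) whose speed relative to the previous
    raw fix is at most max_speed (m/s). Assumes a non-empty track."""
    kept = [track[0]]
    for prev, cur in zip(track, track[1:]):
        dt = cur[0] - prev[0]
        if dt > 0 and distance_m(prev[1:], cur[1:]) / dt <= max_speed:
            kept.append(cur)
    return kept

track = [(0, 41.33, 19.82), (10, 41.33003, 19.82002), (20, 41.34, 19.83)]
print(len(speed_filter(track)))  # 2; the last fix implies >100 m/s and is dropped
```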
107
An ontology for enhancing automation and interoperability in Enterprise Crowdsourcing Environments
Hetmank, Lars January 2014 (has links)
Enterprise crowdsourcing transforms the way in which traditional business tasks can be processed by harnessing the collective intelligence and workforce of a large and often diversified group of people. At the present time, data and information residing within enterprise crowdsourcing systems and other business applications are insufficiently interlinked and are rarely made publicly available in an open and semantically structured manner – neither on the corporate intranet nor on the World Wide Web (WWW). However, the semantic annotation of enterprise crowdsourcing activities is a promising research and application domain. The Semantic Web and its related technologies, methods and principles for publishing structured data offer an extension of the traditional layout-oriented Web to provide more intelligent and complex services.
This technical report describes the efforts toward a universal and lightweight yet powerful Semantic Web vocabulary for the domain of enterprise crowdsourcing. As a methodology for developing the vocabulary, the approach of ontology engineering is applied. To illustrate the purpose and to limit the scope of the ontology, several informal competency questions as well as functional and non-functional requirements are presented. The subsequent conceptualization of the ontology applies different sources of knowledge and considers various perspectives. A set of semantic entities is derived from a review of existing crowdsourcing applications and a review of recent crowdsourcing literature. During the domain capture, all partial results of the review are integrated into a consistent data dictionary and structured as a UML data schema. The designed ontology includes 24 classes, 22 object properties and 30 datatype properties to describe the key aspects of a crowdsourcing model (CSM). To demonstrate the technical feasibility, the ontology is implemented using the Web Ontology Language (OWL). Finally, the ontology is evaluated by means of transforming informal to formal competency questions, comparing it to existing semantic vocabularies, and calculating ontology metrics. Evidence is shown that the CSM ontology covers the key representational needs of the enterprise crowdsourcing domain. At the end of the technical report, current limitations are illustrated and directions for future research are proposed.

Table of Contents
List of Figures
List of Tables
List of Code Listings
List of Abbreviations
Abstract
1 Introduction
2 Research Objective
3 Ontology Engineering
4 Purpose and Scope
4.1 Informal Competency Questions
4.2 Requirements
4.2.1 Functional Requirements
4.2.2 Non-Functional Requirements
5 Ontology Development
5.1 Conceptualization
5.1.1 System Review
5.1.2 Literature Review
5.2 Domain Capture
5.3 Integration
5.3.1 Semantic Vocabularies and Standards
5.3.2 Implications for the Design
5.4 Implementation
6 Evaluation
6.1 Transforming Informal to Formal Competency Questions
6.2 Comparing the Ontology to other Semantic Vocabularies
6.3 Calculating Ontology Metrics
7 Conclusion
8 References
Appendix A (System Review)
Appendix B (Crowdsourcing Taxonomies)
Appendix C (Data Dictionary)
Appendix D (Semantic Vocabularies)
Appendix E (CSM Ontology Source Code)
Appendix F (Sample Data Instance 1)
Appendix G (Sample Data Instance 2)
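To give a flavour of what such an OWL vocabulary looks like in practice, here is a small sketch using rdflib. The namespace URI and the class and property names are hypothetical, not the report's actual CSM identifiers; the closing query echoes the kind of formalised competency question the evaluation chapter describes.

```python
# A minimal sketch, under assumptions, of expressing a few crowdsourcing-model
# entities in OWL with rdflib, plus a SPARQL query over a sample instance.
from rdflib import Graph, Literal, Namespace, RDF, RDFS, OWL, XSD

CSM = Namespace("http://example.org/csm#")  # hypothetical namespace
g = Graph()
g.bind("csm", CSM)

# Two illustrative classes and two illustrative properties.
g.add((CSM.Task, RDF.type, OWL.Class))
g.add((CSM.Worker, RDF.type, OWL.Class))
g.add((CSM.hasReward, RDF.type, OWL.DatatypeProperty))
g.add((CSM.hasReward, RDFS.domain, CSM.Task))
g.add((CSM.hasReward, RDFS.range, XSD.decimal))
g.add((CSM.assignedTo, RDF.type, OWL.ObjectProperty))
g.add((CSM.assignedTo, RDFS.domain, CSM.Task))
g.add((CSM.assignedTo, RDFS.range, CSM.Worker))

# A sample instance and a query in the spirit of a formal competency
# question such as "which tasks offer a reward above a given amount?".
g.add((CSM.task42, RDF.type, CSM.Task))
g.add((CSM.task42, CSM.hasReward, Literal("2.50", datatype=XSD.decimal)))
query = "SELECT ?t WHERE { ?t a csm:Task ; csm:hasReward ?r . FILTER(?r > 1.0) }"
for row in g.query(query):
    print(row.t)  # http://example.org/csm#task42
```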
108
Enhancing Automation and Interoperability in Enterprise Crowdsourcing Environments
Hetmank, Lars 01 September 2016 (has links)
The last couple of years have seen a fascinating evolution. While the early Web predominantly focused on human consumption of Web content, the widespread dissemination of social software and Web 2.0 technologies enabled new forms of collaborative content creation and problem solving. These new forms often utilize the principles of collective intelligence, a phenomenon that emerges from a group of people who either cooperate or compete with each other to create a result that is better or more intelligent than any individual result (Leimeister, 2010; Malone, Laubacher, & Dellarocas, 2010). Crowdsourcing has recently gained attention as one of the mechanisms that taps into the power of web-enabled collective intelligence (Howe, 2008). Brabham (2013) defines it as “an online, distributed problem-solving and production model that leverages the collective intelligence of online communities to serve specific organizational goals” (p. xix). Well-known examples of crowdsourcing platforms include Wikipedia, Amazon Mechanical Turk, and InnoCentive.
Since the emergence of the term crowdsourcing in 2006, one popular misconception has been that crowdsourcing relies largely on an amateur crowd rather than a pool of professionally skilled workers (Brabham, 2013). While this might be true for low-complexity cognitive tasks, such as tagging a picture or rating a product, it is often not true for complex problem-solving and creative tasks, such as developing a new computer algorithm or creating an impressive product design. This raises the question of how to efficiently allocate an enterprise crowdsourcing task to appropriate members of the crowd. The sheer number of crowdsourcing tasks available at crowdsourcing intermediaries makes it especially challenging for workers to identify a task that matches their skills, experiences, and knowledge (Schall, 2012, p. 2).
An explanation of why the identification of appropriate expert knowledge plays a major role in crowdsourcing is partly given by Condorcet’s jury theorem (Sunstein, 2008, p. 25). The theorem states that if the average participant in a binary decision process is more likely to be correct than incorrect, then the probability that the aggregate arrives at the right answer increases with the number of participants. When assuming that a suitable participant for a task is more likely to give a correct answer or solution than an unsuitable one, efficient task recommendation becomes crucial to improving the aggregated results of crowdsourcing processes. Although some assumptions of the theorem, such as independent votes, binary decisions, and homogeneous groups, are often unrealistic in practice, it illustrates the importance of optimized task allocation and group formation that consider the task requirements and workers’ characteristics.
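The theorem's effect is easy to verify numerically. The following sketch (an illustration of the standard binomial formulation, not code from the thesis) computes the probability that a simple majority of n independent voters with individual competence p is correct.

```python
# A quick numeric illustration of Condorcet's jury theorem as stated above.
from math import comb

def majority_correct(n, p):
    """P(a simple majority of n independent voters is correct), n odd,
    each voter correct independently with probability p."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

for n in (1, 11, 101):
    print(n, round(majority_correct(n, p=0.6), 3))
# roughly 0.6, 0.75, 0.98; the trend reverses when p < 0.5,
# which is why matching suitable workers to a task matters.
```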
Ontologies are widely applied to support semantic search and recommendation mechanisms (Middleton, De Roure, & Shadbolt, 2009). However, little research has investigated the potential and the design of an ontology for the domain of enterprise crowdsourcing. The author of this thesis argues in favor of enhancing the automation and interoperability of an enterprise crowdsourcing environment by introducing a semantic vocabulary in the form of an expressive but easy-to-use ontology. The deployment of a semantic vocabulary for enterprise crowdsourcing is likely to provide several technical and economic benefits for an enterprise. These benefits were the main drivers of the efforts made during the research project of this thesis:
1. Task allocation: By utilizing the semantics, requesters are able to form smaller task-specific crowds that perform tasks at lower cost and in less time than larger crowds. A standardized and controlled vocabulary allows requesters to communicate specific details about a crowdsourcing activity within a web page along with other displayed information. This has advantages for both contributors and requesters. On the one hand, contributors can easily and precisely search for tasks that correspond to their interests, experiences, skills, knowledge, and availability. On the other hand, crowdsourcing systems and intermediaries can proactively recommend crowdsourcing tasks to potential contributors (e.g., based on their social network profiles).
2. Quality control: Capturing and storing crowdsourcing data increases the overall transparency of the entire crowdsourcing activity and thus allows for a more sophisticated quality control. Requesters are able to check the consistency and receive appropriate support to verify and validate crowdsourcing data according to defined data types and value ranges. Before involving potential workers in a crowdsourcing task, requesters can also judge their trustworthiness based on previous accomplished tasks and hence improve the recruitment process.
3. Task definition: A standardized set of semantic entities supports the configuration of a crowdsourcing task. Requesters can evaluate historical crowdsourcing data to get suggestions for equal or similar crowdsourcing tasks, for example, which incentive or evaluation mechanism to use. They may also decrease their time to configure a crowdsourcing task by reusing well-established task specifications of a particular type.
4. Data integration and exchange: Applying a semantic vocabulary as a standard format for describing enterprise crowdsourcing activities allows not only crowdsourcing systems inside but also crowdsourcing intermediaries outside the company to extract crowdsourcing data from other business applications, such as project management, enterprise resource planning, or social software, and use it for further processing without retyping and copying the data. Additionally, enterprise or web search engines may exploit the structured data and provide enhanced search, browsing, and navigation capabilities, for example, clustering similar crowdsourcing tasks according to the required qualifications or the offered incentives.

Summary: Hetmank, L. (2014). Enhancing Automation and Interoperability in Enterprise Crowdsourcing Environments (Summary).
Article 1: Hetmank, L. (2013). Components and Functions of Crowdsourcing Systems – A Systematic Literature Review. In 11th International Conference on Wirtschaftsinformatik (WI). Leipzig.
Article 2: Hetmank, L. (2014). A Synopsis of Enterprise Crowdsourcing Literature. In 22nd European Conference on Information Systems (ECIS). Tel Aviv.
Article 3: Hetmank, L. (2013). Towards a Semantic Standard for Enterprise Crowdsourcing – A Scenario-based Evaluation of a Conceptual Prototype. In 21st European Conference on Information Systems (ECIS). Utrecht.
Article 4: Hetmank, L. (2014). Developing an Ontology for Enterprise Crowdsourcing. In Multikonferenz Wirtschaftsinformatik (MKWI). Paderborn.
Article 5: Hetmank, L. (2014). An Ontology for Enhancing Automation and Interoperability in Enterprise Crowdsourcing Environments (Technical Report).
Retrieved from http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-155187.
109
The letters of Casa Ricordi
Rebulla, Patrizia, Ledda, Pierluigi, Müller, Helen 03 December 2019 (has links)
The Archivio Storico Ricordi holds the historical records of one of the most important music publishers of all time. For almost two hundred years, beyond their main business as music publishers, the Ricordis were also impresarios, agents, and cultural organisers, and played a central and unique mediating role within Italian musical life. This role is very well documented by some 30,000 autograph letters addressed to Casa Ricordi by composers, writers, librettists, singers, and conductors, and an impressive and neatly ordered collection of around 600,000 sent letters. The whole collection will be published online step by step. The goal of the project is to connect the letters not only with the relevant records of the Ricordi archive (ledgers, contracts, stage designs, scores, pictures...), but also with other music archives over the web.
110
Gestion de la collaboration et compétition dans le crowdsourcing : une approche avec prise en compte de fuites de données via les réseaux sociaux / Managing collaboration and competition in crowdsourcing : an approach that takes into account data leakage via social networks
Ben Amor, Iheb 27 November 2014 (links)
Crowdsourcing is a practice that allows companies to call on human intelligence at large scale in order to solve problems they wish to outsource. The outsourced problems are increasingly complex and cannot be solved by individuals alone. In this thesis we propose an approach called SocialCrowd, which contributes to improving the quality of crowdsourcing results. It consists in having participants collaborate so as to combine their problem-solving capacities and address increasingly complex outsourced problems. The collaborative groups are put in competition, via attractive rewards, in order to obtain better solutions. Furthermore, the private data of the competing groups must be protected, and we consider social networks as the channel through which data leaks. We propose an approach based on Dijkstra's algorithm to estimate the probability that a member's private data propagates across the social network. Given the size of social networks, this computation is costly, so a parallelization following the MapReduce model is proposed. A classification algorithm based on this propagation computation is then proposed to group participants into collaborative and competing groups while minimizing data leakage from one group to another. As this classification problem has combinatorial complexity, we propose classification algorithms based on combinatorial-optimization methods such as simulated annealing and genetic algorithms. Given the large number of feasible solutions, an approach based on the Soft Constraint Satisfaction Problem (SCSP) model is proposed to rank the different solutions.
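The Dijkstra-based estimation admits a compact illustration. The sketch below is an assumption-laden reading of the idea, not the thesis's implementation: if each social tie leaks data independently with some probability, the most probable leakage path maximises the product of edge probabilities, which is equivalent to running ordinary Dijkstra over edge weights -log(p). The toy graph is invented, and the MapReduce parallelisation is not reproduced.

```python
# A minimal sketch, under assumptions, of estimating the most likely
# propagation probability of private data from a source member.
import heapq
from math import log, exp

def max_propagation_probability(graph, source):
    """graph: {node: [(neighbour, leak_probability), ...]}.
    Returns, per reachable node, the probability of the most likely
    leakage path from source (Dijkstra over -log(p) edge weights)."""
    dist = {source: 0.0}            # accumulated -log(probability)
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                 # stale heap entry
        for v, p in graph.get(u, ()):
            nd = d - log(p)          # multiplying probabilities = adding -logs
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return {v: exp(-d) for v, d in dist.items()}

social = {"alice": [("bob", 0.9), ("carol", 0.2)],
          "bob":   [("carol", 0.8)],
          "carol": []}
print(max_propagation_probability(social, "alice"))
# approximately {'alice': 1.0, 'bob': 0.9, 'carol': 0.72} (via bob, not the direct 0.2 tie)
```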