  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
171

Non-asymptotic bounds for prediction problems and density estimation.

Minsker, Stanislav 05 July 2012
This dissertation investigates learning scenarios in which a high-dimensional parameter has to be estimated from a given sample of fixed size, often smaller than the dimension of the problem. The first part answers some open questions for the binary classification problem in the framework of active learning. Given a random couple (X,Y) with unknown distribution P, the goal of binary classification is to predict a label Y based on the observation X. A prediction rule is constructed from a sequence of observations sampled from P. The concept of active learning can be informally characterized as follows: on every iteration, the algorithm is allowed to request a label Y for any instance X which it considers to be the most informative. The contribution of this work consists of two parts: first, we provide minimax lower bounds for the performance of active learning methods. Second, we propose an active learning algorithm which attains nearly optimal rates over a broad class of underlying distributions and is adaptive with respect to the unknown parameters of the problem. The second part of this thesis is related to sparse recovery in the framework of dictionary learning. Let (X,Y) be a random couple with unknown distribution P. Given a collection of functions H, the goal of dictionary learning is to construct a prediction rule for Y given by a linear combination of the elements of H. The problem is sparse if there exists a good prediction rule that depends on a small number of functions from H. We propose an estimator of the unknown optimal prediction rule based on a penalized empirical risk minimization algorithm. We show that the proposed estimator is able to take advantage of the possible sparse structure of the problem by providing probabilistic bounds for its performance.
172

Active visual category learning

Vijayanarasimhan, Sudheendra 02 June 2011
Visual recognition research develops algorithms and representations to autonomously recognize visual entities such as objects, actions, and attributes. The traditional protocol involves manually collecting training image examples, annotating them in specific ways, and then learning models to explain the annotated examples. However, this is a rather limited way to transfer human knowledge to visual recognition systems, particularly considering the immense number of visual concepts that are to be learned. I propose new forms of active learning that facilitate large-scale transfer of human knowledge to visual recognition systems in a cost-effective way. The approach is cost-effective in the sense that the division of labor between the machine learner and the human annotators respects any cues regarding which annotations would be easy (or hard) for either party to provide. The approach is large-scale in that it can deal with a large number of annotation types, multiple human annotators, and huge pools of unlabeled data. In particular, I consider three important aspects of the problem: (1) cost-sensitive multi-level active learning, where the expected informativeness of any candidate image annotation is weighed against the predicted cost of obtaining it in order to choose the best annotation at every iteration. (2) budgeted batch active learning, a novel active learning setting that perfectly suits automatic learning from crowd-sourcing services where there are multiple annotators and each annotation task may vary in difficulty. (3) sub-linear time active learning, where one needs to retrieve those points that are most informative to a classifier in time that is sub-linear in the number of unlabeled examples, i.e., without having to exhaustively scan the entire collection. Using the proposed solutions for each aspect, I then demonstrate a complete end-to-end active learning system for scalable, autonomous, online learning of object detectors. 
The approach provides state-of-the-art recognition and detection results, while using minimal total manual effort. Overall, my work enables recognition systems that continuously improve their knowledge of the world by learning to ask the right questions of human supervisors.
173

The Effects Of Activities Based On Role-play On Ninth Grade Students

Kucuker (tuncer), Yadikar 01 September 2004
This study investigated the effects of activities based on role-play on ninth grade students' achievement in and attitudes towards simple electric circuits. A Physics Achievement Test was developed to evaluate students' achievement on simple electric circuits, and role-play activities about simple electric circuits were prepared. In addition, a Physics Attitude Scale was administered to explore students' attitudes towards physics. The study was conducted at one of the high schools in Acipayam during the 2003-2004 Spring Semester with a total of 104 ninth grade students (51 female and 53 male) from four classes of two physics teachers. One class of each physics teacher was assigned as the experimental group and instructed with role-play activities, while the other class of each teacher served as the control group and was instructed by the traditional method. The teachers were trained in how to implement role-play activities in class before the study began. The Physics Attitude Scale and the Physics Achievement Test were administered twice to both groups, as a pre-test and, after a three-week treatment period, as a post-test, to assess and compare the effectiveness of the two types of teaching: role-play versus the traditional teaching method. The data were analyzed using descriptive and inferential statistics. The post-test scores were analyzed by Multivariate Analysis of Covariance (MANCOVA). The experimental group showed significantly higher achievement than the control group. However, the statistical analysis failed to show any significant difference between the experimental and control groups' attitudes towards physics in simple electric circuits.
174

How can a science educator incorporate field study into their advanced high school science courses?

Apffel, Michael Alexis 01 January 2006
Organizes information and opportunities for high school level science field work and categorizes it to inform the educator of the field study possibilities. Assists educators in overcoming the obstacles of implementing field science into existing science courses. Several field study lesson plans are provided.
175

Deep Active Learning for Image Classification using Different Sampling Strategies

Saleh, Shahin January 2021
Convolutional Neural Networks (CNNs) have been shown to deliver great results in the area of computer vision. However, one fundamental bottleneck with CNNs is that they are heavily dependent on the ground truth, that is, labeled training data. A labeled dataset is a group of samples that have been tagged with one or more labels. In this degree project, we mitigate the data-greedy behavior of CNNs by applying deep active learning with various kinds of sampling strategies. The main focus is on the sampling strategies random sampling, least confidence sampling, margin sampling, entropy sampling, and K-means sampling. We study the random sampling strategy because it serves as a baseline for the other sampling strategies. Moreover, least confidence sampling, margin sampling, and entropy sampling are uncertainty-based sampling strategies, so it is interesting to study how they perform in comparison with the geometry-based K-means sampling strategy. These sampling strategies help to find the most informative or representative samples amongst all unlabeled samples, thus allowing us to label fewer samples. Furthermore, the benchmark datasets MNIST and CIFAR10 are used to verify the performance of the various sampling strategies. The performance is measured in terms of accuracy and the reduction in data needed. Lastly, we concluded that by using least confidence sampling and margin sampling we reduced the number of labeled samples by 79.25% in comparison with the random sampling strategy for the MNIST dataset. Moreover, by using entropy sampling we reduced the number of labeled samples by 67.92% for the CIFAR10 dataset.
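The three uncertainty-based strategies named in the abstract above can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the toy probability vectors in `pool`, and the pure-Python style are assumptions made for illustration, and a real system would take the probabilities from a trained CNN's softmax output.

```python
import math


def least_confidence(probs):
    """1 minus the top class probability; higher means more uncertain."""
    return 1.0 - max(probs)


def margin(probs):
    """Gap between the two most likely classes; smaller means more uncertain."""
    top, second = sorted(probs, reverse=True)[:2]
    return top - second


def entropy(probs):
    """Shannon entropy of the predicted distribution; higher means more uncertain."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_batch(pool_probs, k, strategy="entropy"):
    """Return the indices of the k unlabeled samples ranked most uncertain."""
    if strategy == "least_confidence":
        score = least_confidence
    elif strategy == "margin":
        # Negate so that a small margin yields a high uncertainty score.
        score = lambda p: -margin(p)
    elif strategy == "entropy":
        score = entropy
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    ranked = sorted(range(len(pool_probs)),
                    key=lambda i: score(pool_probs[i]),
                    reverse=True)
    return ranked[:k]


# Hypothetical class probabilities for three unlabeled samples (assumption).
pool = [
    [0.98, 0.01, 0.01],  # confident prediction
    [0.40, 0.35, 0.25],  # probability spread over all classes
    [0.50, 0.49, 0.01],  # two classes nearly tied
]

for name in ("least_confidence", "margin", "entropy"):
    print(name, select_batch(pool, 1, name))
```

On this toy pool the strategies disagree: least confidence and entropy rank the second sample first because its probability mass is spread out, while margin sampling prefers the third sample, whose top two classes are nearly tied. Random sampling, the baseline in the abstract, would ignore the probabilities entirely.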
176

Analyzing the performance of active learning strategies on machine learning problems

Werner, Vendela January 2023
Digitalisation within industries is rapidly advancing and data possibilities are growing daily. Machine learning models need a large amount of data that are well-annotated for good performance. To get well-annotated data, an expert is needed, which is expensive, and the annotation itself could be very time-consuming. The performance of machine learning models is dependent on the size of the data set since a large amount of annotation is required for a good performance. Active learning has emerged as a solution to increase the size of the data through selective annotation. Instead of labelling data points at random, active learning strategies can be used to select data points based on informativeness or uncertainty. The challenge lies in determining the most effective active learning strategy for a combination of machine learning model and problem type. Although active learning has been around for a while, benchmarking strategies have not widely been explored. The aim of the thesis was to benchmark different AL strategies and analyse their performance on underlying ML problems and ML methods/models. For this purpose, an experiment was constructed to, in an unbiased way, compare different machine learning models in combination with different active learning strategies within the areas of computer vision, drug discovery, and natural language processing. Nine different active learning strategies were analysed in the thesis, with a random strategy working as the baseline, tested on six different machine learning methods/models. The result of this thesis was that active learning had a positive effect within all problem areas and especially worked well for unbalanced data. The two main conclusions are that all active learning strategies work better for a smaller budget due to the importance of selecting informative data points and that prediction-based strategies are the most successful for all problem types. 
/ Imagine having a tool to cure a genetic disease. Today data is everywhere; even your DNA is considered full of valuable information and mysteries ready to be explored. Our data contains endless connections and hidden relationships that not even the best human mind can find, and computing power has become a force to be reckoned with. A winning concept has proven to be human-in-the-loop programming, in which human and computer work together. In machine learning this is called supervised learning. Supervised learning normally requires a large amount of data and, for more complex tasks, an expert, since feedback from a human is expected. One can think of the computer as a detective and the expert as its boss, who points it in the right direction. The direction is indicated by annotating data: one tells the computer which answer is correct so that it learns to pick out features. For example, if you want a program that tells dogs from cats, it can be hard to know which is which if you have never seen an animal before. Both have two ears, two eyes, four legs and, in many cases, fur. A human can then tell the computer whether the picture shows a dog or a cat, and the computer will begin to learn to see patterns and distinguishing characteristics. Annotating data takes a long time and costs a lot of money. What do you do when the amount of data is too small, or the cost of an expert becomes too high? Sam is a person with a rare genetic disease. They have heard of a program based on supervised learning that can suggest which medical treatment they could try to relieve their symptoms. Because Sam's genetic disease is so unusual, there is not much data about it, which means the software will not work in Sam's case. Remember that supervised learning needs a lot of well-annotated data to produce reliable output. How can the programmer help Sam? With active learning, of course!
Active learning is an umbrella term for strategies that select the most informative, or most uncertain, data points to annotate. Instead of making, for example, 2000 annotations, better performance can be achieved with only 100. The difference is that in supervised learning without active learning, the expert is presented with a ready-made set of points to annotate. With active learning, an interaction takes place to choose the points for annotation. This results in more cost-effective learning that also performs well on a small data set. This thesis project studied the performance of active learning in the pharmaceutical industry as well as on problems in computer vision and language technology. The result was that at least one of the applied active learning strategies led to improved performance in every area. Perhaps in the future we can actually use active learning to help people like Sam and have the tool to solve the mystery and cure their genetic disease.
177

Kompetenscenter : En genomlysning av Kompetenscenters digitala klassrum

Dadayan, Tatevik, Englöv, Alice January 2023
This study investigates the use of active learning in the Competence Center's digital classroom with a focus on areas such as creative climate, motivation, and digital teaching. Competence Center is a municipal adult education provider located in Köping. The aim of this study is to improve student engagement in distance learning and create a more accessible and inclusive climate in the digital classroom. The innovation contribution lies in creating better conditions for the students in the digital environment with the help of the teaching method active learning. The chosen research method is a qualitative approach with a case study design. The researchers have used an abductive approach, continuously comparing new empirical evidence with theory. For data collection, the researchers conducted seven semi-structured interviews with seven different respondents from Competence Center. The analysis method is thematic analysis of empirical data; the researchers coded the collected data to identify themes. Through the teaching method active learning, the study has produced guidelines for Competence Center. The implementation of these guidelines will help Competence Center to create a creative climate that is accessible and inclusive for the students in the digital classroom.
178

Practical Cost-Conscious Active Learning for Data Annotation in Annotator-Initiated Environments

Haertel, Robbie A. 12 August 2013
Many projects exist whose purpose is to augment raw data with annotations that increase the usefulness of the data. The number of these projects is rapidly growing and in the age of “big data” the amount of data to be annotated is likewise growing within each project. One common use of such data is in supervised machine learning, which requires labeled data to train a predictive model. Annotation is often a very expensive proposition, particularly for structured data. The purpose of this dissertation is to explore methods of reducing the cost of creating such data sets, including annotated text corpora. We focus on active learning to address the annotation problem. Active learning employs models trained using machine learning to identify instances in the data that are most informative and least costly. We introduce novel techniques for adapting vanilla active learning to situations wherein data instances are of varying benefit and cost, annotators request work “on-demand,” and there are multiple, fallible annotators of differing levels of accuracy and cost. In order to account for data instances of varying cost, we build a model of cost from real annotation data based on a user study. We also introduce a novel cost-conscious active learning algorithm, which we call return-on-investment (ROI), that selects instances for annotation that contain the most benefit per unit cost. To address the issue of annotators that request instances “on-demand,” we develop a parallel, “no-wait” framework that performs computation while the annotator is annotating. As a result, annotators need not wait for the computer to determine the best instance for them to annotate, a common problem with existing approaches. Finally, we introduce a Bayesian model designed to simultaneously infer ground truth annotations from noisy annotations, infer each individual annotator's accuracy, and predict its own accuracy on unseen data, without the use of a held-out set.
We extend ROI-based active learning and our annotation framework to handle multiple annotators using this model. As a whole, our work shows that the techniques introduced in this dissertation reduce the cost of annotation in scenarios that are more true-to-life than previous research.
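The return-on-investment idea in the abstract above, choosing the instance with the most benefit per unit cost, can be sketched in a few lines. This is only an illustration under stated assumptions: the function `roi_select`, the instance names, and the benefit and cost numbers are hypothetical, and the dissertation's actual benefit estimates and learned cost model are not reproduced here.

```python
def roi_select(candidates, k=1):
    """Pick the k candidates with the highest estimated benefit per unit cost.

    candidates: list of (instance_id, estimated_benefit, estimated_cost)
    tuples. Both estimates are assumed to come from models outside this
    sketch, for example an uncertainty score and a predicted annotation time.
    """
    for instance_id, _, cost in candidates:
        if cost <= 0:
            raise ValueError(f"cost must be positive for {instance_id}")
    ranked = sorted(candidates, key=lambda c: c[1] / c[2], reverse=True)
    return [instance_id for instance_id, _, _ in ranked[:k]]


# Hypothetical pool: (id, estimated benefit, estimated cost in seconds).
pool = [
    ("long_sentence", 0.90, 30.0),    # most informative, but slow to annotate
    ("medium_sentence", 0.50, 10.0),
    ("short_sentence", 0.30, 5.0),    # least informative, but very cheap
]

print(roi_select(pool))        # best single instance by benefit / cost
print(roi_select(pool, k=2))   # top two by benefit / cost
```

Ranking by benefit alone would choose the long sentence, whereas ranking by the ratio favors the short one; that trade-off is the point of a cost-conscious selection rule.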
179

[pt] ESTRATÉGIAS PARA OTIMIZAR PROCESSOS DE ANOTAÇÃO E GERAÇÃO DE DATASETS DE SEGMENTAÇÃO SEMÂNTICA EM IMAGENS DE MAMOGRAFIA / [en] STRATEGIES TO OPTIMIZE ANNOTATION PROCESSES AND GENERATION OF SEMANTIC SEGMENTATION DATASETS IN MAMMOGRAPHY IMAGES

BRUNO YUSUKE KITABAYASHI 17 November 2022
With the recent advance of supervised deep learning in computer vision applications, industry and the academic community have been showing that one of the main difficulties for the success of these applications is the lack of datasets with a sufficient amount of annotated data. In this sense, there is a need to leverage large amounts of labeled data so that these intelligent models can solve problems relevant to their context and achieve the desired results. Techniques to generate annotated data more efficiently are being increasingly explored, together with techniques to support the generation of the datasets that serve as inputs for training artificial intelligence models. This work aims to propose strategies to optimize annotation processes and the generation of semantic segmentation datasets. Among the approaches used in this work, we highlight Interactive Segmentation and Active Learning. The first tries to improve the data annotation process, making it more efficient and effective from the point of view of the annotator or specialist responsible for labeling the data, using a semantic segmentation model that tries to imitate the annotations made by the annotator. The second consists of an approach that consolidates a deep learning model using an intelligent criterion, aiming at the selection of the most informative unannotated data for training the model, based on an acquisition function that uses the network's uncertainty estimation to filter these data. To apply and validate the results of both techniques, the work incorporates them in a use case in mammography images for segmentation of anatomical structures.
180

Active learning for text classification in cyber security / Aktiv inlärning för textklassificering i cyberdomänen

Carp, Amanda January 2023
In the domain of cyber security, machine learning promises advanced threat detection. However, the volume of available unlabeled data poses challenges for efficient data management. This study investigates the potential for active learning, a subset of interactive machine learning, to reduce the effort required for manual data labelling. Through different query strategies, the most informative unlabeled data points were selected for manual labelling. The performance of different query strategies was assessed by testing a transformer model’s ability to accurately distinguish tweets mentioning names of advanced persistent threats. The findings suggest that the K-means diversity-based query strategy outperformed both the uncertainty-based approach and random data point selection when the amount of labelled training data was limited. This study also evaluated the cost-effective active learning approach, which incorporates high-confidence data points into the training dataset. However, this was shown to be the least effective strategy. Lastly, the study acknowledges that the computational time required varies significantly between query strategies. Hence, selecting an optimal query strategy requires a balanced consideration of F-score performance together with time efficiency.
