Global ETD Search

1	Semi-automated annotation and active learning for language documentation
Palmer, Alexis Mary, 03 April 2013
By the end of this century, half of the approximately 6000 extant languages will cease to be transmitted from one generation to the next. The field of language documentation seeks to make a record of endangered languages before they reach the point of extinction, while they are still in use. The work of documenting and describing a language is difficult and extremely time-consuming, and resources are extremely limited. Developing efficient methods for making lasting records of languages may increase the amount of documentation achieved within budget restrictions. This thesis approaches the problem from the perspective of computational linguistics, asking whether and how automated language processing can reduce human annotation effort when very little labeled data is available for model training. The task addressed is morpheme labeling for the Mayan language Uspanteko, and we test the effectiveness of two complementary types of machine support: (a) learner-guided selection of examples for annotation (active learning); and (b) annotator access to the predictions of the learned model (semi-automated annotation). Active learning (AL) has been shown to increase efficacy of annotation effort for many different tasks. Most of the reported results, however, are from studies which simulate annotation, often assuming a single, infallible oracle. In our studies, crucially, annotation is not simulated but rather performed by human annotators. We measure and record the time spent on each annotation, which in turn allows us to evaluate the effectiveness of machine support in terms of actual annotation effort. We report three main findings with respect to active learning. First, in order for efficiency gains reported from active learning to be meaningful for realistic annotation scenarios, the type of cost measurement used to gauge those gains must faithfully reflect the actual annotation cost. Second, the relative effectiveness of different selection strategies in AL seems to depend in part on the characteristics of the annotator, so it is important to model the individual oracle or annotator when choosing a selection strategy. And third, the cost of labeling a given instance from a sample is not a static value but rather depends on the context in which it is labeled. We report two main findings with respect to semi-automated annotation. First, machine label suggestions have the potential to increase annotator efficacy, but the degree of their impact varies by annotator, with annotator expertise a likely contributing factor. At the same time, we find that implementation and interface must be handled very carefully if we are to accurately measure gains from semi-automated annotation. Together these findings suggest that simulated annotation studies fail to model crucial human factors inherent to applying machine learning strategies in real annotation settings.
Keywords: Active learning; Computational linguistics; Language documentation; Language endangerment; Uspanteko; Semi-automated annotation; Interlinear text; Annotator expertise
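The abstract above describes learner-guided selection of examples for annotation but does not name the selection strategy used in the thesis. As a rough illustration only, the sketch below shows one common strategy, uncertainty sampling by prediction entropy. The token ids, the probability values, and the function names are hypothetical and are not taken from the thesis.

```python
import math

def entropy(probs):
    """Shannon entropy of a predicted label distribution (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_most_uncertain(pool, k=1):
    """Return the k example ids whose predictions have the highest entropy.

    pool maps an example id to the model's predicted label probabilities.
    """
    ranked = sorted(pool, key=lambda ex: entropy(pool[ex]), reverse=True)
    return ranked[:k]

# Hypothetical unlabeled pool: three tokens, three candidate morpheme labels.
pool = {
    "tok1": [0.98, 0.01, 0.01],  # model is confident
    "tok2": [0.40, 0.35, 0.25],  # model is very unsure
    "tok3": [0.70, 0.20, 0.10],  # model is somewhat unsure
}

# The examples the learner would ask a human annotator to label next.
chosen = select_most_uncertain(pool, k=2)
print(chosen)
```

In a study like the one described, each selected example would go to a human annotator and the time taken would be recorded, so that gains are measured in actual annotation effort and not in number of labels.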
2	Body Rumen Fill Scoring of Dairy Cows Using Digital Images
Derakhshan, Reza; Yousefzadeh Boroujeni, Soroush, January 2024
The research presented in this thesis focuses on an innovative use of digital imaging and machine learning techniques to assess body rumen fill scoring in dairy cows. This study aims to enhance the efficiency of monitoring and managing dairy cow health, which is crucial for the dairy industry's productivity and sustainability. The primary objective was to develop an automated annotation system for evaluating rumen fill status in dairy cows using digital images extracted from recorded videos. This system leverages advanced machine learning algorithms and neural networks, aiming to mimic manual assessments by veterinarians and specialists on farms. To achieve these objectives, the thesis used existing video records from a Swedish dairy farm hosting mainly the Swedish Red and the Swedish Holstein breeds. A subset of the extracted images was processed and manually classified using a modified rumen fill scoring system based on visual assessment, and supervised classification algorithms were trained on 277 manually annotated images. The thesis explored various machine learning techniques for classifying these images, including Logistic Regression, Support Vector Machine (SVM), and a Deep Neural Network using the VGG16 architecture. These models were trained, validated, and tested with a dataset that included variations in cow color patterns, aiming to determine the most effective approach for automated rumen fill scoring. The results indicated that while each model had its strengths and weaknesses, the simple logistic model performed best in terms of test accuracy and F1 score. This research contributes to the field of precision livestock farming, particularly in the context of dairy farming. By automating the process of rumen fill scoring, the study aims to provide dairy farmers with a reliable, efficient, and cost-effective tool for monitoring cow health. This tool has the potential to enhance dairy cow welfare, improve milk production, and support the sustainability of dairy farming operations. However, in its current state, the accuracy of the best model was only moderate. Prediction performance needs further improvement, possibly by adding more cow images and by using improved image processing and feature engineering.
Keywords: Dairy Cows; Image Processing; Machine Learning; Rumen Fill Scoring; Automated Annotation; Precision Agriculture; Dairy Farming; VGG16; Computer Sciences; Animal and Dairy Science
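The abstract above compares models by test accuracy and F1 score without giving the formulas or stating how F1 is averaged across rumen fill classes. The sketch below shows how the two metrics could be computed for a multi-class problem, assuming macro averaging. The labels and predictions are invented for illustration and are not results from the thesis.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_for_class(y_true, y_pred, cls):
    """F1 score for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == cls and p == cls for t, p in pairs)
    fp = sum(t != cls and p == cls for t, p in pairs)
    fn = sum(t == cls and p != cls for t, p in pairs)
    if tp == 0:
        return 0.0  # no correct predictions for this class
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (macro averaging is an assumption here)."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)

# Hypothetical rumen fill scores for six test images.
y_true = [1, 1, 2, 2, 3, 3]
y_pred = [1, 2, 2, 2, 3, 1]

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy={acc:.3f} macro_f1={f1:.3f}")
```

Reporting both metrics matters when classes are unevenly represented, since accuracy alone can look acceptable while one score class is rarely predicted correctly.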
