Global ETD Search

11	On the effective deployment of current machine translation technology González Rubio, Jesús 03 June 2014 Machine translation is a fundamental technology that is gaining more importance each day in our multilingual society. Companies and individuals are turning their attention to machine translation since it dramatically cuts down their expenses on translation and interpreting. However, the output of current machine translation systems is still far from the quality of translations generated by human experts. The overall goal of this thesis is to narrow this quality gap by developing new methodologies and tools that enable a broader and more efficient deployment of machine translation technology. We start by proposing a new technique to improve the quality of the translations generated by fully automatic machine translation systems. The key insight of our approach is that different translation systems, implementing different approaches and technologies, can exhibit different strengths and limitations. Therefore, a proper combination of the outputs of such different systems has the potential to produce translations of improved quality. We present minimum Bayes' risk system combination, an automatic approach that detects the best parts of the candidate translations and combines them to generate a consensus translation that is optimal with respect to a particular performance metric. We thoroughly describe the formalization of our approach as a weighted ensemble of probability distributions and provide efficient algorithms to obtain the optimal consensus translation according to the widespread BLEU score. Empirical results show that the proposed approach is indeed able to generate statistically better translations than the provided candidates. Compared to other state-of-the-art system combination methods, our approach achieves similar performance without requiring any data beyond the candidate translations. 
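As a rough illustration of the minimum Bayes' risk idea described above, the following Python sketch picks, from a set of candidate translations, the one with the highest expected gain under a weighted ensemble. It is a simplification under stated assumptions and not the thesis's algorithm: it selects a whole candidate rather than combining parts of candidates into a new consensus, and `overlap_gain` is a hypothetical toy stand-in for BLEU (mean clipped n-gram precision, no brevity penalty). All function names and the example sentences are invented for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def overlap_gain(hyp, ref, max_n=2):
    """Toy stand-in for BLEU (assumption): mean clipped n-gram precision."""
    hyp_tokens, ref_tokens = hyp.split(), ref.split()
    precisions = []
    for n in range(1, max_n + 1):
        hyp_counts, ref_counts = ngrams(hyp_tokens, n), ngrams(ref_tokens, n)
        total = sum(hyp_counts.values())
        if total == 0:
            precisions.append(0.0)
            continue
        # Clip each n-gram count by its count in the reference.
        matched = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        precisions.append(matched / total)
    return sum(precisions) / len(precisions)


def mbr_select(candidates, weights):
    """Return the candidate with the highest expected gain.

    Each candidate acts as a pseudo-reference, weighted by the
    (normalized) weight of the system that produced it.
    """
    z = sum(weights)

    def expected_gain(hyp):
        return sum(w / z * overlap_gain(hyp, other)
                   for other, w in zip(candidates, weights))

    return max(candidates, key=expected_gain)


candidates = [
    "the cat sat on the mat",
    "the cat sits on the mat",
    "a cat sat on a mat",
]
best_uniform = mbr_select(candidates, weights=[1.0, 1.0, 1.0])
best_skewed = mbr_select(candidates, weights=[0.1, 0.1, 10.0])
```

With uniform weights the sketch favors the candidate that agrees most with the others; shifting almost all weight to one system makes that system's output win, which is the role the ensemble weights play in the decision rule.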
Then, we focus our attention on how to improve the utility of automatic translations for the end-user of the system. Since automatic translations are not perfect, a desirable feature of machine translation systems is the ability to predict at run-time the quality of the generated translations. Quality estimation is usually addressed as a regression problem where a quality score is predicted from a set of features that represents the translation. However, although the concept of translation quality is intuitively clear, there is no consensus on which features actually account for it. As a consequence, quality estimation systems for machine translation have to use a large number of weak features to predict translation quality. This raises several learning problems related to feature collinearity and ambiguity, and to the "curse" of dimensionality. We address these challenges by adopting a two-step training methodology. First, a dimensionality reduction method computes, from the original features, the reduced set of features that best explains translation quality. Then, a prediction model is built from this reduced set to finally predict the quality score. We study various reduction methods previously used in the literature and propose two new ones based on statistical multivariate analysis techniques. More specifically, the proposed dimensionality reduction methods are based on partial least squares regression. The results of thorough experiments show that the quality estimation systems trained following the proposed two-step methodology obtain better prediction accuracy than systems trained using all the original features. Moreover, one of the proposed dimensionality reduction methods obtained the best prediction accuracy with only a fraction of the original features. This feature reduction ratio is important because it implies a dramatic reduction in the operating time of the quality estimation system. 
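The two-step methodology above can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions, not the thesis's method: step 1 projects the features onto a single partial-least-squares direction (the first PLS1 weight vector, proportional to the covariance between each centered feature and the quality score), and step 2 fits a one-variable least-squares model on that reduced feature. The function names, the single-component choice, and the toy data are all hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def fit_two_step(X, y):
    """Fit a one-component PLS projection followed by a linear model.

    Assumes at least one feature is correlated with y, so the weight
    vector is non-zero.
    """
    n, d = len(X), len(X[0])
    x_mean = [mean([row[j] for row in X]) for j in range(d)]
    y_mean = mean(y)
    Xc = [[row[j] - x_mean[j] for j in range(d)] for row in X]
    yc = [v - y_mean for v in y]

    # Step 1 (dimensionality reduction): w is proportional to X^T y,
    # the direction of maximum covariance with the quality score.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(d)]
    norm = sum(v * v for v in w) ** 0.5
    w = [v / norm for v in w]
    t = [sum(Xc[i][j] * w[j] for j in range(d)) for i in range(n)]

    # Step 2 (prediction model): least-squares regression of y on the
    # single reduced feature t.
    slope = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return x_mean, y_mean, w, slope


def predict(model, x):
    """Project a feature vector onto w and apply the linear model."""
    x_mean, y_mean, w, slope = model
    t = sum((xj - mj) * wj for xj, mj, wj in zip(x, x_mean, w))
    return y_mean + slope * t


# Toy data: feature 2 is collinear with feature 1, feature 3 is weak noise.
X = [[1.0, 2.0, 0.0], [2.0, 4.0, 1.0], [3.0, 6.0, 0.0], [4.0, 8.0, 1.0]]
y = [1.0, 2.0, 3.0, 4.0]
model = fit_two_step(X, y)
predictions = [predict(model, row) for row in X]
```

The collinear features collapse into one latent score, so the downstream model has a single coefficient to estimate instead of three; a fuller treatment would extract several components with deflation and choose their number by validation.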
An alternative use of current machine translation systems is to embed them within an interactive editing environment where the system and a human expert collaborate to generate error-free translations. This interactive machine translation approach has been shown to reduce the user's supervision effort in comparison to the conventional decoupled post-editing approach. However, interactive machine translation considers the translation system as a passive agent in the interaction process. In other words, the system only suggests translations to the user, who then makes the necessary supervision decisions. As a result, the user is bound to exhaustively supervise every suggested translation. This passive approach ensures error-free translations but it also demands a large amount of supervision effort from the user. Finally, we study different techniques to improve the productivity of current interactive machine translation systems. Specifically, we focus on the development of alternative approaches where the system becomes an active agent in the interaction process. We propose two different active approaches. On the one hand, we describe an active interaction approach where the system informs the user about the reliability of the suggested translations. The hope is that this information may help the user to locate translation errors, thus improving the overall translation productivity. We propose different scores to measure translation reliability at the word and sentence levels and study the influence of such information on the productivity of an interactive machine translation system. Empirical results show that the proposed active interaction protocol is able to achieve a large reduction in supervision effort while still generating translations of very high quality. On the other hand, we study an active learning framework for interactive machine translation. 
In this case, the system is not only able to inform the user of which suggested translations should be supervised, but it is also able to learn from the user-supervised translations to improve its future suggestions. We develop a value-of-information criterion to select which automatic translations undergo user supervision. However, given its high computational complexity, in practice we study different selection strategies that approximate this optimal criterion. Results of large-scale experiments show that the proposed active learning framework is able to obtain better trade-offs between the quality of the generated translations and the human effort required to obtain them. Moreover, in comparison to a conventional interactive machine translation system, our proposal obtained translations of twice the quality with the same supervision effort. / González Rubio, J. (2014). On the effective deployment of current machine translation technology [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/37888 Statistical machine translation Minimum Bayes' Risk System combination Partial least squares regression Quality estimation Confidence measures Interactive machine translation Interactive translation prediction Active Interaction Active learning Online learning ESTADISTICA E INVESTIGACION OPERATIVA LENGUAJES Y SISTEMAS INFORMATICOS
12	TASK-AWARE VIDEO COMPRESSION AND QUALITY ESTIMATION IN PRACTICAL VIDEO ANALYTICS SYSTEMS Praneet Singh (20797433) 28 February 2025 Practical video analytics systems that perform computer vision tasks are widely used in critical real-world scenarios such as autonomous driving and public safety. These end-to-end systems sequentially perform tasks like object detection, segmentation, and recognition such that the performance of each analytics task depends on how well the previous tasks are performed. Typically, these systems are deployed in resource- and bandwidth-constrained environments, so video compression algorithms like HEVC are necessary to minimize transmission bandwidth at the expense of input quality. Furthermore, to optimize resource utilization of these systems, the analytics tasks should be executed solely on inputs that may provide valuable insights on task performance. Hence, it is essential to understand the impact of compression and input data quality on the overall performance of end-to-end video analytics systems, using meaningfully curated datasets and interpretable evaluation procedures. This information is crucial for the overall improvement of system performance. Thus, in this thesis we focus on:
1. Understanding the effects of compression on the performance of video analytics systems that perform tasks such as pedestrian detection, face detection, and face recognition. With this, we develop a task-aware video encoding strategy for HEVC that improves system performance under compression.
2. Designing methodologies to perform a meaningful and interpretable evaluation of an end-to-end system that sequentially performs face detection, alignment, and recognition. This involves balancing datasets, creating consistent ground truths, and capturing the performance interdependence between the various tasks of the system.
3. Estimating how image quality is linked to task performance in end-to-end face analytics systems. Here, we design novel task-aware image Quality Estimators (QEs) that determine the suitability of images for face detection. We also propose systematic evaluation protocols to showcase the efficacy of our novel face detection QEs and existing face recognition QEs.
Image and video coding Video processing Practical Video Analytics Quality Estimation Deep Learning End-to-end systems Evaluation Protocols Image Quality Video Compression Object Detection Face Recognition Dataset Balancing Data Curation
13	Directing Post-Editors’ Attention to Machine Translation Output that Needs Editing through an Enhanced User Interface: Viability and Automatic Application via a Word-level Translation Accuracy Indicator Gilbert, Devin Robert 13 July 2022 No description available. Linguistics Language Artificial Intelligence translation process research post-editing post-editing user interface machine translation translation entropy machine translation quality estimation translation accuracy indicator postediting PEMT post-editing of machine translation MT postediting of machine translation Trados Studio Qualitivity CRITT CRITT TPR-DB TPR-DB
