1 |
Audio Moment Retrieval based on Natural Language Query
Shevchuk, Danylo, January 2020
Background. Users spend a lot of time searching through media content to find a desired fragment. People can usually describe verbally what they are looking for, but today that description is of little practical use. Using it as a query to search for the right interval in a given audio sample would save people a lot of time.
Objectives. The aim of this thesis is to compare the performance of methods suitable for retrieving desired intervals from audio of arbitrary length using a natural language query as input. There are two objectives. The first is to train models that match a natural language input to a specific interval of a given soundtrack. The second is to evaluate the models' performance using conventional metrics.
Methods. A mixed research method was used. Literature on existing methods suitable for audio classification was reviewed, and three models were selected for the experiments: YamNet, AlexNet and ResNet-50. Two experiments were conducted. The goal of the first experiment was to measure the models' performance on classifying audio samples. The goal of the second experiment was to measure the same models' performance on the audio interval retrieval problem, which uses classification as part of the approach. The steps taken to conduct the experiments are reported together with the statistical data obtained. These steps include data collection, data preprocessing, model training and performance evaluation.
Results. The two tests were conducted to see which model performs better on two separate problems: audio classification and interval retrieval based on a natural language query. Statistical data was obtained from the tests. The degree (performance-wise) to which a natural language query can be matched to the corresponding interval of audio of arbitrary length was calculated for each of the selected models. The aggregated performance of the models is mostly comparable, with YamNet occasionally outperforming the other two models. The average Area Under the Curve and Accuracy for the studied models are (67, 71.62), (68.99, 67.72) and (66.59, 71.93) for YamNet, AlexNet and ResNet-50, respectively.
Conclusions. We found that the tested models were not capable of retrieving intervals from audio of arbitrary length based on a natural language query; however, the degree to which the models are able to retrieve the intervals varies depending on the queried keyword and on hyperparameters such as the threshold used to filter out audio patches whose probability for the queried class is too low.
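The threshold-based filtering of audio patches mentioned in the conclusions can be illustrated with a minimal sketch. It is not the thesis's implementation: the function name `retrieve_intervals`, the non-overlapping patch layout and the default patch length of 1.0 second are assumptions made only for illustration. Given one probability per patch for the queried class, it keeps patches at or above the threshold and merges consecutive ones into time intervals.

```python
from typing import List, Sequence, Tuple


def retrieve_intervals(
    probs: Sequence[float],
    threshold: float,
    patch_seconds: float = 1.0,  # assumed patch length, non-overlapping patches
) -> List[Tuple[float, float]]:
    """Merge consecutive patches whose probability is >= threshold
    into (start_seconds, end_seconds) intervals."""
    intervals: List[Tuple[float, float]] = []
    start = None  # index of the first patch in the currently open run
    for i, p in enumerate(probs):
        if p >= threshold:
            if start is None:
                start = i  # open a new run
        elif start is not None:
            # the run ended at patch i - 1, so it ends at time i * patch_seconds
            intervals.append((start * patch_seconds, i * patch_seconds))
            start = None
    if start is not None:
        # close a run that extends to the end of the audio
        intervals.append((start * patch_seconds, len(probs) * patch_seconds))
    return intervals


if __name__ == "__main__":
    # hypothetical per-patch probabilities for one queried class
    print(retrieve_intervals([0.1, 0.8, 0.9, 0.2, 0.7], threshold=0.5))
```

Raising the threshold in this sketch shortens or removes intervals, and lowering it merges more patches, which is one way to see why retrieval quality depends on that value.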
|
2 |
[en] IMPROVING THE QUALITY OF THE USER EXPERIENCE BY QUERY ANSWER MODIFICATION / [pt] MELHORANDO A QUALIDADE DA EXPERIÊNCIA DO USUÁRIO ATRAVÉS DA MODIFICAÇÃO DA RESPOSTA DA CONSULTA
JOAO PEDRO VALLADAO PINHEIRO, 30 June 2021
[en] The answer to a query submitted to a database or a knowledge base is often long and may contain redundant data. The user is frequently forced to browse through a long answer, or to refine and repeat the query until the answer reaches a manageable size. Without proper treatment, consuming the query answer may become a tedious task. This study therefore proposes a process that modifies the presentation of a query answer to improve the quality of the user's experience, in the context of an RDF knowledge base. The process reorganizes the original query answer by applying heuristics to summarize the results. The original SPARQL query is modified, and an exploration of the result set starts through a guided navigation over predicates and their facets. The article also includes experiments based on RDF versions of MusicBrainz, enriched with DBpedia data, and IMDb, each with over 200 million RDF triples. The experiments use sample queries from well-known benchmarks.
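The idea of navigating a result set by predicates and their facets can be illustrated with a minimal sketch. It does not reproduce the heuristics of the study, which the abstract does not specify: the function name `facet_summary` and the sample triples are hypothetical, and the grouping is done in plain Python rather than by rewriting a SPARQL query. It groups result triples by predicate and counts how often each value occurs, so that a long answer can be shown as a short list of facets.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def facet_summary(triples: Iterable[Triple]) -> Dict[str, List[Tuple[str, int]]]:
    """Group triples by predicate and count each object value,
    most frequent value first."""
    facets: Dict[str, Counter] = defaultdict(Counter)
    for _subject, predicate, obj in triples:
        facets[predicate][obj] += 1
    return {p: counts.most_common() for p, counts in facets.items()}


if __name__ == "__main__":
    # hypothetical query answer, already flattened into triples
    answer = [
        ("artist1", "genre", "rock"),
        ("artist2", "genre", "rock"),
        ("artist3", "genre", "jazz"),
        ("artist1", "country", "UK"),
    ]
    for predicate, values in facet_summary(answer).items():
        print(predicate, values)
```

A user could then pick one facet value, for example a genre, and only the matching subset of the answer would need to be displayed.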
|