21

Effective Strategies for Improving Peptide Identification with Tandem Mass Spectrometry

Han, Xi January 2011 (has links)
Tandem mass spectrometry (MS/MS) is routinely used to identify peptides from protein mixtures in the field of proteomics. However, only about 30% to 40% of current MS/MS spectra can be identified; many remain unassigned even though they are of reasonable quality. The ubiquitous presence of post-translational modifications (PTMs) is one of the reasons for the current low spectral identification rate. In order to identify post-translationally modified peptides, most existing software requires the user to specify a few possible modifications, but such knowledge is not always available. In this thesis, we describe a new algorithm for identifying modified peptides that does not require users to specify the possible modifications before the search; instead, all modifications from the Unimod database are considered. Several new techniques are employed to avoid the exponential growth of the search space and to control the false discoveries caused by this unrestricted search approach. A software tool, PeaksPTM, has been developed, and it already achieves stronger performance than competing tools for unrestricted identification of post-translationally modified peptides. Another important reason search tools fail is inaccurate mass or charge-state measurement of the precursor peptide ion. In this thesis, we study the precursor mono-isotopic mass and charge determination problem and propose an algorithm that corrects precursor ion mass errors by assessing the isotopic features in the parent MS spectrum. The algorithm has been tested on two annotated data sets and achieved almost 100 percent accuracy. Furthermore, we study the more complicated MS/MS preprocessing problem and propose a spectrum deconvolution algorithm. Experiments were conducted to compare its performance with that of existing software.
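The abstract does not spell out the mass-correction procedure; the sketch below only illustrates the general idea of testing whole-isotope offsets of the reported precursor against the isotope spacing (roughly 1.00335 Da divided by the charge) in the parent MS spectrum. It is a minimal illustration, not the thesis algorithm; the peak-list format, tolerance, and function names are assumptions.

```python
# Hypothetical sketch of precursor mono-isotopic mass correction.
# Assumes a centroided MS1 peak list [(mz, intensity), ...]; names are illustrative.

NEUTRON_MASS = 1.00335  # approximate spacing of isotope peaks, in Da

def has_peak(peaks, mz, tol=0.01):
    """Return True if a peak exists within `tol` of `mz`."""
    return any(abs(p_mz - mz) <= tol for p_mz, _ in peaks)

def correct_precursor_mz(ms1_peaks, reported_mz, charge, max_shift=3, tol=0.01):
    """Try shifting the reported precursor by whole isotope spacings and keep the
    candidate whose isotope envelope is best supported by the MS1 scan."""
    spacing = NEUTRON_MASS / charge
    best_mz, best_score = reported_mz, -1
    for shift in range(-max_shift, max_shift + 1):
        candidate = reported_mz + shift * spacing
        # score: is the candidate peak present, and is its +1 isotope present?
        score = has_peak(ms1_peaks, candidate, tol) + has_peak(ms1_peaks, candidate + spacing, tol)
        if score > best_score:
            best_mz, best_score = candidate, score
    return best_mz
```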
22

Feature extraction and matching of palmprints using Level I detail

Kitching, Peter January 2017 (has links)
Current Automatic Palmprint Identification Systems (APIS) closely follow the matching philosophy of Automatic Fingerprint Identification Systems (AFIS), in that they exclusively use a small subset of Level II palmar detail when matching a latent to an exemplar palm print. However, due to the increased size and the significantly more complex structure of the palm, it has long been recognised that much detail remains underutilised. Forensic examiners routinely use this additional information when manually matching latents. The thesis develops novel automatic feature extraction and matching methods which exploit the underutilised Level I detail contained in the friction ridge flow. When applied to a database of exemplars, the approach creates a ranked list of matches. It is shown that the matching success rate varied with latent size: for latents of diameter 38 mm, 91.1% were ranked first and 95.6% of the matches were contained within the top 10 ranks. The thesis presents improved orientation field extraction methods which are optimised for friction ridge flow, together with novel enhancement techniques based upon the novel use of local circular statistics on palmar orientation fields. In combination, these techniques are shown to provide a more accurate orientation estimate than previous work. The feature extraction stages exploit the level sets of higher-order local circular statistics, which naturally segment the palm into homogeneous regions representing Level I detail. These homogeneous regions, characterised by their spatial and circular features, are used to form a novel compact tree-like hierarchical representation of the Level I detail. Matching between the latent and an exemplar is performed between their respective tree-like hierarchical structures. The methods developed within the thesis are complementary to current APIS techniques.
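As a rough illustration of what a block-wise orientation field computed with doubled-angle (circular) statistics looks like in code, the sketch below estimates one dominant ridge orientation per block from image gradients. It is a generic, simplified version assuming a grayscale NumPy array; it is not the enhancement method developed in the thesis.

```python
# Illustrative sketch of block-wise ridge orientation estimation using the
# doubled-angle (circular) representation; not the thesis's exact method.
import numpy as np

def orientation_field(img, block=16):
    """Estimate one ridge orientation (radians) per block from image gradients."""
    gy, gx = np.gradient(img.astype(float))
    h, w = img.shape
    thetas = np.zeros((h // block, w // block))
    for i in range(0, h - block + 1, block):
        for j in range(0, w - block + 1, block):
            bx = gx[i:i + block, j:j + block]
            by = gy[i:i + block, j:j + block]
            # doubled-angle averaging handles the 180-degree ambiguity of ridge flow
            vx = np.sum(2 * bx * by)
            vy = np.sum(bx ** 2 - by ** 2)
            # ridge direction is perpendicular to the dominant gradient direction
            thetas[i // block, j // block] = 0.5 * np.arctan2(vx, vy) + np.pi / 2
    return thetas
```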
23

Metodologia para mapeamento de informações não estruturadas descritas em laudos médicos para uma representação atributo-valor / A methodology for mapping non-structured medical findings to the attribute-value table format

Daniel de Faveri Honorato 29 April 2008 (has links)
The retrieval of information from free text stored in computer-based patient records is an important open research problem, as biomedical information is increasingly recorded and stored in digital form. Thus, means to extract structured information (for example, in the so-called attribute-value format) from free-text records is an important research endeavour. Furthermore, by representing free-text records in the attribute-value format, available pattern extraction methods can be applied directly. To map free-text medical records into the attribute-value format, we propose a methodology that can be used to automatically (or semi-automatically, with the help of a medical expert) map the important medical information stored in patient records, described in natural language, into a structured format. This methodology has been implemented in a computational system called TP-DISCOVER, which generates a database in the attribute-value format from a set of patient records (documents). In order to identify important entities in the set of documents, as well as significant relations among these entities, we propose a hybrid linguistic/statistical terminology extraction approach which selects words and phrases that appear with a frequency above a given threshold by applying statistical measures. The underlying assumption of this hybrid approach to terminology extraction is that specialised documents are characterised by the repeated use of certain lexical units or morpho-syntactic constructions. Our goal is to reduce the effort spent on manual modelling by observing regularities in the texts and mapping them to suitable attribute names in the attribute-value representation. The proposed methodology was evaluated by automatically structuring a collection of 6000 documents containing the results of upper digestive endoscopy exams described in natural language. The experimental results, which can be considered a lower bound since they would improve considerably if the methodology were applied semi-automatically together with a medical expert, show that the proposed methodology is suitable and reduces the time the expert needs to analyse large amounts of medical records.
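The frequency-threshold side of the hybrid terminology extraction could be sketched roughly as below; the linguistic filtering is reduced to a stop-word check, and the names and threshold are illustrative assumptions rather than TP-DISCOVER's actual implementation.

```python
# Minimal sketch of the frequency-threshold idea behind hybrid terminology
# extraction; the linguistic step is reduced to a stop-word check here.
from collections import Counter

STOP_WORDS = {"the", "a", "of", "and", "with", "in", "no", "was", "is"}

def candidate_terms(documents, min_freq=5):
    """Return unigrams and bigrams whose corpus frequency is at least `min_freq`."""
    counts = Counter()
    for doc in documents:
        tokens = [t.lower() for t in doc.split() if t.lower() not in STOP_WORDS]
        counts.update(tokens)                    # unigrams
        counts.update(zip(tokens, tokens[1:]))   # bigrams
    return {term: freq for term, freq in counts.items() if freq >= min_freq}

# Terms that survive the threshold become candidate attribute names
# in the attribute-value representation.
```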
24

"Pré-processamento de dados em aprendizado de máquina supervisionado" / "Data pre-processing for supervised machine learning"

Gustavo Enrique de Almeida Prado Alves Batista 16 May 2003 (has links)
Data quality is a major concern in Machine Learning, which is frequently used to extract knowledge during the Data Mining phase of the relatively new research area called Knowledge Discovery from Databases (KDD). As most Machine Learning algorithms induce knowledge strictly from data, the quality of the knowledge extracted is largely determined by the quality of the underlying data. Several aspects of data quality may influence the performance of a learning system. In real-world databases, two of these aspects are related to (i) the presence of missing data, which is handled in a rather naive way by many Machine Learning algorithms, and (ii) the difference between the number of examples, or database records, that belong to different classes, since, when this difference is large, learning systems may have difficulty learning the concept related to the minority class. The problem of missing data is of great practical and theoretical interest. In many applications it is important to know how to react if the available information is incomplete or if sources of information become unavailable. Missing data treatment should be carefully thought out, otherwise bias might be introduced into the induced knowledge. In this work, we propose the use of the k-nearest neighbour algorithm as an imputation method. Imputation denotes a procedure that replaces the missing values in a data set with plausible values. Our analysis indicates that missing data imputation based on the k-nearest neighbour algorithm can outperform the internal missing data treatment strategies used by C4.5 and CN2, as well as mean or mode imputation, a widely used method for treating missing values. The problem of learning from imbalanced data sets is of crucial importance, since such data sets are encountered in a large number of domains. Imbalanced class distributions can cause a significant bottleneck in the performance obtained by standard learning methods, which assume a balanced distribution of the classes. One solution to the problem of learning with skewed class distributions is to artificially balance the data set. In this work we propose the use of the one-sided selection method, which performs a careful removal of cases belonging to the majority class while leaving untouched all cases from the minority class. This careful removal consists of detecting and removing cases considered less reliable, using some heuristics. An experimental application confirmed the efficiency of the proposed method. As there is no mathematical analysis able to predict whether the performance of one learning system will be better than another's, experimentation plays an important role in evaluating learning systems. In this work we propose and implement a computational environment, the Discover Learning Environment (DLE), which is a framework to develop and evaluate new data pre-processing methods. The DLE is integrated into the Discover project, a major research project under development in our laboratory for the planning and execution of experiments related to the use of learning systems during the Data Mining phase of the KDD process.
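For readers unfamiliar with k-nearest-neighbour imputation, a minimal modern sketch follows; it uses scikit-learn's KNNImputer purely to illustrate the idea, not the implementation developed in this work (which predates that library's imputer).

```python
# Sketch of k-nearest-neighbour imputation using scikit-learn (illustrative only).
import numpy as np
from sklearn.impute import KNNImputer

X = np.array([[1.0, 2.0, np.nan],
              [3.0, np.nan, 6.0],
              [4.0, 5.0, 9.0],
              [2.0, 3.0, 5.0]])

# Each missing value is replaced by the (distance-weighted) mean of that feature
# over the k most similar complete examples.
imputer = KNNImputer(n_neighbors=2, weights="distance")
X_imputed = imputer.fit_transform(X)
print(X_imputed)
```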
25

Spatio-Temporal Pre-Processing Methods for Region-of-Interest Video Coding

Karlsson, Linda S. January 2007 (has links)
In video transmission at low bit rates, the challenge is to compress the video with a minimal reduction of the perceived quality. The compression can be adapted using knowledge of which regions in the video sequence are of most interest to the viewer. Region-of-interest (ROI) video coding uses this information to control the allocation of bits between the background and the ROI. The aim is to increase the quality in the ROI at the expense of the quality in the background. To do this, the typical content of an ROI for a particular application is first determined, the actual detection is performed based on this information, and the allocation of bits is then controlled based on the result of the detection. This licentiate thesis investigates existing methods to control bit allocation in ROI video coding, in particular pre-processing methods that are applied independently of the codec or standard. This makes it possible to apply the method directly to the video sequence without modifications to the codec. Three filters, based on previous approaches, are proposed in this thesis: a spatial filter that only modifies the background within a single frame, a temporal filter that uses information from the previous frame, and a combination of the two, a spatio-temporal filter. The ability of these filters to reduce the number of bits necessary to encode the background and to successfully re-allocate these bits to the ROI is investigated, and the computational complexities of the algorithms are analysed. The theoretical analysis is verified by quantitative tests, which include measuring the quality using the PSNR of the ROI and of the border of the background, as well as subjective tests with human test subjects and an analysis of motion vector statistics. The analysis shows that the spatio-temporal filter has better coding efficiency than the other filters and successfully re-allocates bits from the background to the ROI. The spatio-temporal filter gives an improvement in average PSNR in the ROI of more than 1.32 dB, or a reduction in bit rate of 31%, compared to the encoding of the original sequence. This result is similar to or slightly better than that of the spatial filter. However, the spatio-temporal filter performs better overall, since its computational complexity is lower than that of the spatial filter.
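A very small sketch of the idea behind the spatial pre-processing filter, assuming a grayscale frame and a boolean ROI mask: the background is low-pass filtered so it costs fewer bits to encode, while ROI pixels are left untouched. The filter kernel and parameter values are assumptions, not those used in the thesis.

```python
# Illustrative sketch of a spatial ROI pre-processing filter.
import numpy as np
from scipy.ndimage import gaussian_filter

def spatial_roi_filter(frame, roi_mask, sigma=3.0):
    """frame: 2-D luma array; roi_mask: boolean array, True inside the ROI."""
    blurred = gaussian_filter(frame.astype(float), sigma=sigma)
    # keep original pixels inside the ROI, blurred pixels in the background
    return np.where(roi_mask, frame, blurred)
```

The temporal variant described in the abstract would additionally reuse background pixels from the previous frame, which reduces the motion-compensation cost of the background.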
26

"Seleção de atributos importantes para a extração de conhecimento de bases de dados" / "Selection of important features for knowledge extraction from data bases"

Huei Diana Lee 16 December 2005 (has links)
Progress in computer systems and devices applied to many fields has made it possible to collect and store an increasing amount of data, often faster than we are able to process it. This technological advance enables the storage of a huge amount of data that is difficult to process unless new approaches are used. The main reason to retain all these data is to use them, in a general way, for the benefit of humanity. Many areas are engaged in research and in proposing methods and processes to deal with this growing volume of data. One such process is Knowledge Discovery from Databases, which aims at finding valuable and interesting knowledge that may be hidden in the data. In order to extract knowledge from data, models (hypotheses) are usually built with support from fields such as Machine Learning. Feature Selection plays an important role in this process, since it represents a central problem in machine learning and is frequently applied as a data pre-processing step. Its objective is to choose, according to some importance criterion, a subset of the original features that describe a data set, removing irrelevant and/or redundant features, as they may decrease data quality and reduce the comprehensibility of hypotheses induced by supervised learning algorithms. Most state-of-the-art feature selection algorithms focus mainly on finding relevant features. However, it has been shown that relevance alone is not sufficient to select important features: redundant features also affect the quality of the induced hypotheses. Different approaches have been proposed to select features, among them the filter approach, whose idea is to remove features before model induction takes place, based on general characteristics of the data set. In order to select some features and discard others, it is necessary to measure the features' importance, and many importance measures have been proposed. Some are based on distance, consistency, or information measures, while others are founded on dependence measures. As there is no mathematical analysis capable of predicting whether one feature selection algorithm will produce better feature subsets than another, it is important to evaluate the performance of these algorithms empirically. Comparisons among algorithms are usually carried out by analysing the error of the model built from the feature subsets they select. Nevertheless, this parameter alone is not sufficient; other issues, such as the percentage of reduction of the feature subset, should also be taken into account. In this work we propose an algorithm that decouples the analysis of feature relevance from the analysis of feature redundancy and introduces the use of the Fractal Dimension to deal with redundant features in supervised learning. We also propose a performance evaluation model based on the error of the constructed hypothesis and the percentage of reduction obtained in the selected feature subset. Experimental results obtained on several data sets using well-established feature selection algorithms show that our proposal is competitive with them. Another important issue related to knowledge extraction from data is the format in which the data are represented; usually, examples must be described in the so-called attribute-value format. 
This work also proposes a methodology to support, through a semi-automatic process, the construction of data sets in the attribute-value format from patient information contained in medical findings described in natural language. This process was successfully applied to a real case.
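The decoupled relevance/redundancy filter could be sketched along the following lines. The thesis uses the Fractal Dimension to detect redundancy; the sketch substitutes a plain correlation check as a simplified stand-in and uses mutual information for relevance, so it should be read as an illustration of the two-stage structure rather than the proposed algorithm.

```python
# Sketch of a filter that decouples relevance and redundancy analysis.
# Correlation stands in for the thesis's Fractal Dimension redundancy test.
import numpy as np
from sklearn.feature_selection import mutual_info_classif

def select_features(X, y, redundancy_threshold=0.9):
    relevance = mutual_info_classif(X, y)      # relevance of each feature to the class
    order = np.argsort(relevance)[::-1]        # most relevant first
    selected = []
    for i in order:
        # redundancy check: drop the feature if it is highly correlated
        # with any already-selected (more relevant) feature
        redundant = any(abs(np.corrcoef(X[:, i], X[:, j])[0, 1]) > redundancy_threshold
                        for j in selected)
        if not redundant:
            selected.append(i)
    return selected
```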
27

Assessing biofilm development in drinking water distribution systems by Machine Learning methods

Ramos Martínez, Eva 02 May 2016 (has links)
One of the main challenges for drinking water utilities is to ensure a high-quality supply, in particular in chemical and microbiological terms. However, biofilms invariably develop in all drinking water distribution systems (DWDSs), despite the presence of residual disinfectant. As a result, water utilities are not able to ensure total bacteriological control, and biofilms currently represent a real paradigm in water quality management for all DWDSs. Biofilms are complex communities of microorganisms bound by an extracellular polymer that provides them with structure, protects them from toxic agents, and helps them retain nutrients. Besides the health risk that biofilms pose through their role as a shelter for pathogens, a number of additional problems associated with biofilm development in DWDSs can be identified; among others, aesthetic deterioration of the water, biocorrosion, and disinfectant decay are universally recognised. A large amount of research has been conducted in this field since the early 1980s. However, due to the complexity of the environment and of the community studied, most studies have been carried out under certain simplifications. We build on this prior work and the knowledge acquired about biofilm growth in DWDSs to change the common approach of these studies. Our proposal is based on extensive pre-processing followed by analysis with Machine Learning methods. A multi-disciplinary procedure is undertaken, serving as a practical approach to develop a decision-making tool that helps DWDS management keep biofilm at the lowest possible level and mitigate its negative effects on the service. A methodology to detect the areas most susceptible to biofilm development in DWDSs is proposed. Knowing the location of these hot-spots in the network, mitigation actions can be focused more specifically, saving resources and money. Prevention programmes can also be developed, acting before the consequences of biofilm are noticed by consumers. In this way, the economic cost would be reduced and the service quality would improve, eventually increasing consumers' satisfaction. / Ramos Martínez, E. (2016). Assessing biofilm development in drinking water distribution systems by Machine Learning methods [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/63257 / TESIS
28

Prostorovo-časová analýza HD-EEG dat u pacientů s neurodegenerativním onemocněním / Spatial-temporal analysis of HD-EEG data in patients with neurodegenerative disease

Jordánek, Tomáš January 2021 (has links)
This master's thesis deals with the diagnosis of the prodromal stage of Lewy body disease using microstate analysis. The first part of the thesis provides the theoretical background needed to understand the topics discussed and the results presented: a description of the disease, diagnostic options, electroencephalography, pre-processing of the EEG record, and the microstate analysis process. The theoretical background is followed by the practical part of the thesis, which begins with a chapter on the dataset, the EEG device used, and our own pre-processing solution. Microstate analysis is discussed next, and its output parameters were compared between groups using statistical methods. Comparison of subjects in the prodromal stage of Lewy body disease with healthy controls revealed significant differences in three microstate parameters, in the rate of unlabelled time frames, and in some counts of transitions between individual maps or unlabelled sections.
29

Biometrická identifikace otisku prstu / Biometric fingerprint identification

Ruttkay, Michal January 2015 (has links)
This thesis describes the anatomical characteristics of fingerprints and their application in identifying a person. The theoretical part describes the importance of papillary lines on fingerprints, statistical analysis, and in particular the pre-processing of fingerprint images. The practical section implements the operations needed to compare fingerprints. The implementation was done in Matlab.
30

Automation of support service using Natural Language Processing : - Automation of errands tagging

Haglund, Kristoffer January 2020 (has links)
In this paper, Natural Language Processing and classification algorithms were used to create a program that can automatically tag errands submitted to the support service of Fortnox (an IT company based in Växjö). Controlled experiments were conducted to find the classification algorithm and Bag-of-Words pre-processing steps best suited to this problem. All data were provided by Fortnox and manually labeled with tags to serve as training and test data. The final algorithm correctly predicted 69.15% of errands when using all original data. Inspection of the incorrectly predicted errands revealed that many of them have identical text attached to them; removing the majority of these errands increased the result to 94.08%.
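A minimal sketch of a bag-of-words tagging pipeline of the kind described above; the classifier choice, vectoriser settings, and example errands are assumptions for illustration, not Fortnox's actual setup.

```python
# Illustrative bag-of-words tagging pipeline (classifier and data are assumed).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_texts = ["cannot log in to my account", "invoice total is wrong", "reset my password"]
train_tags = ["login", "invoicing", "login"]

model = make_pipeline(CountVectorizer(lowercase=True, ngram_range=(1, 2)),
                      MultinomialNB())
model.fit(train_texts, train_tags)

print(model.predict(["the invoice amount does not match"]))  # -> e.g. ['invoicing']
```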
