Global ETD Search

21	Extrakce strukturovaných dat z českého webu s využitím extrakčních ontologií / Extracting Structured Data from Czech Web Using Extraction Ontologies Pouzar, Aleš January 2012 The presented thesis deals with automatic information extraction from HTML documents for two selected domains. Laptop offers are extracted from e-shops, and freely published job offers are extracted from company sites. The extraction process outputs structured data of high granularity grouped into data records, in which a corresponding semantic label is assigned to each data item. The task was performed using the extraction system Ex, which combines two approaches: manually written rules and supervised machine learning algorithms. Expert knowledge encoded as extraction rules made it possible to overcome the lack of training data. The rules are independent of the specific formatting structure, so one extraction model can be used for a heterogeneous set of documents. The success achieved for laptop offers showed that an extraction ontology describing one or a few product types could be combined with wrapper induction methods to automatically extract offers of all product types on a web scale with minimal human effort.
22	Rozpoznávání hudební nálady a emocí za pomoci technik Music Information Retrieval / Music mood and emotion recognition using Music information retrieval techniques Smělý, Pavel January 2019 This work focuses on the scientific area called Music Information Retrieval, more precisely on its subdivision concerned with the recognition of emotions in music, called Music Emotion Recognition (MER). The first part gives a general overview and definition of MER, categorizes the individual methods and offers a comprehensive view of this discipline. The thesis also concentrates on the selection and description of suitable parameters for the recognition of emotions, using the openSMILE and MIRtoolbox tools. The freely available DEAM database was used to obtain the set of music recordings and their subjective emotional annotations. The practical part deals with the design of a static dimensional regression evaluation system for numerical prediction of musical emotions in music recordings, more precisely their position in the arousal-valence (AV) emotional space. The thesis presents and comments on the results of analysing the significance of individual parameters and the overall prediction performance of the proposed model.
23	Active Learning pro zpracování archivních pramenů / Active Learning for Processing of Archive Sources Hříbek, David January 2021 This work deals with the creation of a system that allows uploading and annotating scans of historical documents and subsequently using the available annotations (marked lines and their transcripts) for active learning of optical character recognition (OCR) models. The work describes the character recognition process, classifies the techniques and presents an existing system for character recognition, with emphasis placed above all on machine learning methods. Furthermore, the methods of active learning are explained and a method of active learning of available OCR models from annotated scans is proposed. The rest of the work deals with the system design, implementation, available datasets, evaluation of a self-created OCR model and testing of the entire system.
24	Datová sada pro klasifikaci síťových zařízení pomocí strojového učení / Dataset for Classification of Network Devices Using Machine Learning Eis, Pavel January 2021 Automatic classification of devices in a computer network can be used to detect anomalies in the network and also enables the application of security policies per device type. The key to creating a device classifier is a high-quality data set, but few such data sets are publicly available and creating a new one is difficult. The aim of this work is to create a tool that enables automated annotation of a data set of network devices, and to create a classifier of network devices that uses only basic data from network flows. The result of this work is a modular tool providing automated annotation of network devices using the ADiCT system of the CESNET association, the Shodan and Censys search engines, information from PassiveDNS, TOR and WhoIs, a geolocation database and information from blacklists. Several classifiers are built on the annotated data set; they classify network devices according to the services they use. The results of the work not only significantly simplify the process of creating new data sets of network devices, but also show a non-invasive approach to the classification of network devices.
25	Změny dokumentu v editoru anotací / Document Modifications in Annotation Editor Cudrák, Miloš January 2014 This thesis deals with the design and implementation of document modifications and other annotation editor improvements developed as part of the Decipher project. It explains the nature of the Decipher project and the role of the 4A annotation system within it. It examines the annotation editor and proposes solutions to its problems, as well as new functionality that makes it easier to work with annotations and with the editor itself.
26	Využití anotací primární struktury pro strukturní predikci protein-ligand aktivních míst / Use of residue-level annotations for structural prediction of protein-ligand binding sites Břicháčková, Kateřina January 2021 The number of experimentally resolved protein structures in the Protein Data Bank has been growing fast in the last 20 years, which motivates the development of many computational tools for protein-ligand binding site prediction. Binding site prediction from protein 3D structure has many important applications; it is an essential step in the complex process of rational drug design, it helps to infer the side effects of drugs, it provides insight into proteins' biological functions and it is helpful in many other fields, such as protein-ligand docking and molecular dynamics. As far as we know, there has not been a study that would systematically investigate general properties of known ligand binding sites on a large scale. In this thesis, we examine these properties using existing experimental and predicted residue-level annotations of protein sequence and structure. We present an automated pipeline for statistical analysis of these annotations, based on hypothesis testing and effect size estimation. It is implemented in Python and it is easily extensible by user-defined annotations. The usage is demonstrated on 33 existing annotations and 4 different datasets. The practical significance of the results is tested with the P2Rank prediction method. We hope that the results as well as the pipeline...
27	Zpracování dat z vysokokapacitního DNA sekvenování pro studium variability genomu a transkriptomu. / Study of genome and transcriptome variability employing data processing from massive parallel DNA sequencing. Vojta, Petr January 2018 Massive parallel sequencing (MPS) data analysis tasks are often computationally demanding, and their execution would take too long on standard computing machines. There is therefore a need to parallelize these tasks and to execute them on sufficiently powerful computing machines. In the first chapter we describe MOLDIMED, a newly created platform for resequencing analysis of MPS data, and a novel annotation tool, which is ready to deploy on HPC infrastructure. The second chapter describes MPS approaches in Diamond-Blackfan anaemia (DBA), which is predominantly underlain by mutations in genes encoding ribosomal proteins (RP); however, its etiology remains unexplained in approximately 25% of patients. We performed panel sequencing of all ribosomal genes in DBA patients without previously known molecular pathology. A novel heterozygous RPS7 mutation encoding RPS7 p.V134F was found in one female patient and subsequently confirmed in two asymptomatic family members, in whom mild anaemia was detected on further examination. Subsequently, we performed whole transcriptome analysis in all family members and the patient with the RPS7 mutation, in comparison with a healthy control group and with DBA patients with a known mutation in RPS19. We observed dysregulation mainly in signal pathways of translation,...
28	Syntéza železo-sirných center v Monocercomonoides exilis / Iron-Sulfur cluster assembly in Monocercomonoides exilis Vacek, Vojtěch January 2020 In the search for the mitochondrion of oxymonads, DNA of Monocercomonoides exilis, an oxymonad isolated from the gut of a chinchilla, was isolated and its genome was sequenced. Sequencing resulted in a fairly complete genome, which was extensively searched for genes for mitochondrion-related proteins, but no reliable candidate for such a gene was identified. Even genes for the ISC pathway, which is responsible for Fe-S cluster assembly and considered to be the only essential function of reduced mitochondrion-like organelles (MROs), were absent. Instead, we were able to detect the presence of a SUF pathway which functionally replaced the ISC pathway. Closer examination of the SUF pathway based on heterologous localisation revealed that this pathway is localised in the cytosol. In silico analysis showed that SUF genes are highly conserved at the level of secondary and tertiary structure and that most catalytic residues and motifs are present in their sequences. The functionality of these proteins was further indirectly confirmed by complementation experiments in Escherichia coli, where SUF proteins of M. exilis were able to at least partially restore Fe-S cluster assembly in strains deficient in the SUF and ISC pathways. We also proved using a bacterial adenylate cyclase two-hybrid system that SufB and SufC can form...
29	Komponent pro sémantické obohacení / Semantic Enrichment Component Doležal, Jan January 2018 This master's thesis describes the Semantic Enrichment Component (SEC), which searches for entities (e.g., persons or places) in an input text document and returns information about them. The goals of this component are to create a single interface for named entity recognition tools, to enable parallel document processing, to save memory while using the knowledge base, and to speed up access to its content. To achieve these goals, the output of the named entity recognition tools was specified, a tool for storing the preprocessed knowledge base in shared memory was implemented, and a client-server scheme was used to create the component.
30	Chybovost v písemném projevu romských žáků 9. ročníků základních škol praktických na základě elektronické databanky ROMi / Error Analysis of Czech Written Expression of the Romani Pupils in the 9th Grade of the Secondary Practical Schools Based on the Corpora ROMi Bedřichová, Zuzanna January 2015 English summary (ÚČJTK FFUK, Prague 2014): The study focuses on errors made in the written expression of Romani pupils in the 9th grade of secondary practical schools (schools for children with special educational needs). It analyses 130 written school assignments by these pupils, available through the ROMi database (a database of written and spoken accounts in Czech by children and youth of Romani origin). The author offers an innovative and elaborate scheme of error analysis, together with qualitative-quantitative analyses of the pupils' written accounts. Besides these analyses, the study outlines the current state of issues such as the education of Romani children in the Czech language, the Romani ethnolect of Czech, and spoken language as a source of stigmatisation. Furthermore, it provides details about the ROMi database, the 130 original written accounts in full length and practical proposals for remedying these errors.
