91 |
Aplikace metod předzpracování při dolování znalostí z textových dat / Application of preprocessing methods in knowledge mining from textual data
Kotíková, Michaela (January 2014)
This diploma thesis focuses on the preprocessing of unstructured textual data for text mining. A series of text mining experiments is designed and carried out. Based on the output of these experiments, the effect of different preprocessing techniques on the entire text mining process and its results is evaluated.
|
92 |
Vyhledávač údajů ve webových stránkách / Web page data figure finder
Janata, Dominik (January 2016)
The thesis addresses the automatic extraction of semantic data from Web pages. Within this broad problem, it focuses on finding the values of data figures within a page presenting a certain entity (e.g. the price of a laptop). The main idea we wanted to evaluate is that a figure can be found using its context in the page: the words that surround it and the attribute values of the containing HTML tags, the class attribute in particular. Our research revealed two types of contemporary solutions to this problem: either the author of the Web page must inline semantic information in the markup of the page, or commercial tools are trained to parse a particular page format (targeting pages from a single Web domain). We examined the possibilities of developing a general solution that would, for a given entity, find its properties across Web domains using text analysis and machine learning. The naïve algorithm had about 30% accuracy, and the learning algorithms had an accuracy between 40% and 50% in finding the properties. Although this accuracy is not acceptable for a final solution, we believe it confirms the potential of the idea. Keywords: Web pages data extraction
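The idea of describing a figure by its context can be illustrated with a small sketch. It is not the algorithm from the thesis: the class name `FigureContextParser`, the two-word window, and the feature names `value`, `words`, and `classes` are all assumptions made for illustration. It collects, for each number in a page, the neighbouring words and the class attributes of the enclosing HTML tags, which could then serve as features for a learning algorithm.

```python
import re
from html.parser import HTMLParser

# A token counts as a candidate figure if it is a plain number (assumption).
NUMBER = re.compile(r"\d+(?:[.,]\d+)?")
PUNCTUATION = ".,:;()"


class FigureContextParser(HTMLParser):
    """Collects numeric tokens together with their textual and markup context."""

    def __init__(self):
        super().__init__()
        self.open_tags = []    # (tag name, class attribute) of each open tag
        self.candidates = []   # one dict per numeric token found

    def handle_starttag(self, tag, attrs):
        self.open_tags.append((tag, dict(attrs).get("class") or ""))

    def handle_endtag(self, tag):
        # Pop back to the most recent matching open tag, if there is one.
        for i in range(len(self.open_tags) - 1, -1, -1):
            if self.open_tags[i][0] == tag:
                del self.open_tags[i:]
                break

    def handle_data(self, data):
        tokens = [w.strip(PUNCTUATION) for w in data.split()]
        for i, token in enumerate(tokens):
            if not NUMBER.fullmatch(token):
                continue
            # Hypothetical window: two words before and two words after.
            nearby = tokens[max(0, i - 2):i] + tokens[i + 1:i + 3]
            classes = [c for _, cls in self.open_tags for c in cls.split()]
            self.candidates.append({
                "value": token,
                "words": [w.lower() for w in nearby if w],
                "classes": classes,
            })


def find_candidates(html: str) -> list:
    """Return every numeric token in `html` with its context features."""
    parser = FigureContextParser()
    parser.feed(html)
    parser.close()
    return parser.candidates


if __name__ == "__main__":
    page = '<div class="product"><span class="price">Price: 499 USD</span></div>'
    for candidate in find_candidates(page):
        print(candidate)
```

Only words within the same text node are considered here, which keeps the sketch short; a real extractor would probably also look at neighbouring elements.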
|
93 |
Porovnatelnost dat v dobývání znalostí z databází / Data comparability in knowledge discovery in databases
Horáková, Linda (January 2017)
The master thesis analyzes data comparability and commensurability in datasets used for obtaining knowledge with data mining methods. Data comparability is one aspect of data quality, which is crucial for correct and applicable results from data mining tasks. The aim of the theoretical part of the thesis is to briefly describe the field of knowledge discovery and define the specifics of mining aggregated data. Moreover, the terms comparability and commensurability are discussed. The main part is focused on the process of knowledge discovery. These findings are applied in the practical part of the thesis. The main goal of this part is to define a general methodology that can be used to discover potential problems of data comparability in the analyzed data. This methodology is based on an analysis of a real dataset containing daily sales of products. In conclusion, the methodology is applied to data from the field of public budgets.
|
94 |
IBM Cognos Report Studio as an Effective Tool for Human Capital Reporting / IBM Cognos Report Studio jako efektivní nástroj reportingu v oblasti lidského kapitálu
Zinchenko, Yulia (January 2013)
The main topic discussed in this diploma thesis is corporate reporting on Human Capital using Business Intelligence tools, specifically IBM Cognos Report Studio. One of the objectives is to show a step-by-step methodology for creating a complex dynamic report, which includes data structure modeling, layout design and quality checking. Another objective is to conduct a Cost-Benefit Analysis for a real-life project focused on recreating an Excel-based report in a Cognos-based environment in order to automate information flows. An essential part of the diploma thesis is the theoretical background of Business Intelligence aspects of data quality and visualization, as well as the purposes of human capital reporting and a description of appropriate KPIs. The objectives are addressed by analyzing and researching resources related to the topics described above and by using IBM Cognos Report Studio provided by one of the major companies in the financial advisory field. This diploma thesis is relevant reading for those interested in the real-life application of data quality improvement and information flow automation using Business Intelligence reporting tools.
|
95 |
Datová kvalita v prostředí otevřených a propojitelných dat / Data quality on the context of open and linked data
Tomčová, Lucie (January 2014)
The master thesis deals with data quality in the context of open and linked data. One of the goals is to define the specifics of data quality in this context. The specifics are considered mainly with respect to data quality dimensions (i.e. the data characteristics studied in data quality) and the possibilities of measuring them. The thesis also defines the effect on data quality of transforming data into linked data; the effect is defined with regard to possible risks and benefits that can influence data quality. For the data quality dimensions considered relevant in the context of open and linked data, a list of metrics is compiled and verified on real data (open linked data published by a government institution). The thesis points to the need to recognize the differences specific to this context when assessing and managing data quality. At the same time, it offers possibilities for further study of this question and presents subsequent directions for both the theoretical and practical development of the topic.
|
96 |
Analýza dat pro řešení problémů s vlhkostí v budovách / Analysis of Data to Solve Problems with Humidity in Buildings
Nečasová, Klára (January 2019)
The aim of this work was to solve problems with excessive humidity in buildings using data analysis. The theoretical part of the work deals with the impacts of excessive humidity on the health of building occupants and on the condition of the building structure. Data mining methods including classification, prediction, and clustering are described together with model evaluation and selection. The practical part focuses on the description of the hardware platform and the measurement scenarios. The key parameters affecting indoor relative humidity are indoor and outdoor temperature and outdoor relative humidity. Long-term measurement of these parameters was performed using a set of sensors and the BeeeOn system. The measured data was used to design a system for detecting events related to a humidity change. The approach to regulating air change in the room was based on natural ventilation.
|
97 |
Komplexní on-line tréninkový deník / Complex On-Line Training Diary
Kamenský, Zdeněk (January 2021)
The main goal of this thesis is the design and implementation of an online training diary for athletes. First, it was necessary to explain some key terms related to the thesis topic. One of the most important is data mining and its usability for sports data analysis. After that, existing sports applications were analyzed, and it was also necessary to analyze the requirements of potential users. Application design, implementation and testing were the next steps. Several data mining methods were used to analyze sports data intended for individual athletes and their coaches.
|
98 |
Zabezpečení přenosu dat proti dlouhým shlukům chyb / Protection of data transmission against long error bursts
Malach, Roman (January 2008)
This Master's thesis discusses the protection of data transmission against long error bursts. The data is transmitted through a channel with a defined error rate. The parameters of the channel are an error-free interval of 2000 bits and an error burst length of 250 bits. One of the aims of this work is to compile a set of possible methods for realizing a data correction system. The basic selection is made from the best-known codes. These codes are divided into several categories, and the best code in each category is chosen for the next round of selection; only one code from each category can advance. Interleaving is used as well. At the end, the codes are compared and the best three are simulated in Matlab to check that they function correctly. From these three options, one is chosen as optimal with regard to practical realization. Two options exist, hardware or software realization, and the second seems more useful. The actual codec is written in the C language. Considering the C language and current computer architecture, an 8-bit element size is convenient. That is why the code RS(255, 191), which works with 8-bit symbols, is optimal. In the next step, the codec for this code is created, containing its coder and decoder. A final program simulates the error channel. Finally, the results are presented using several examples.
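The fit between RS(255, 191) and the stated channel can be checked with a little arithmetic. The sketch below is not taken from the thesis. It assumes 8-bit symbols and a symbol-wise block interleaver that deals consecutive channel symbols to codewords in round-robin order, and the function names are hypothetical. It considers one isolated burst only and ignores the 2000-bit error-free interval.

```python
def rs_correctable_symbols(n: int, k: int) -> int:
    """A Reed-Solomon (n, k) code corrects up to floor((n - k) / 2) symbol errors."""
    return (n - k) // 2


def symbols_hit_by_burst(burst_bits: int, symbol_bits: int = 8) -> int:
    """Worst-case number of symbols touched by a burst of `burst_bits` bits.

    The worst case occurs when the burst starts on the last bit of a symbol.
    """
    worst_offset = symbol_bits - 1
    return (worst_offset + burst_bits - 1) // symbol_bits + 1


def worst_errors_per_codeword(symbols_hit: int, depth: int) -> int:
    """Most symbol errors one codeword receives from a single burst when
    `depth` codewords are interleaved symbol by symbol (ceiling division)."""
    return -(-symbols_hit // depth)


if __name__ == "__main__":
    t = rs_correctable_symbols(255, 191)
    hit = symbols_hit_by_burst(250)
    print("correctable symbols per codeword:", t)
    print("symbols hit by a 250-bit burst (worst case):", hit)
    for depth in (1, 2, 4):
        print("depth", depth, "->", worst_errors_per_codeword(hit, depth))
```

Under these assumptions the code corrects 32 symbol errors per codeword, while a worst-case 250-bit burst touches 33 symbols, one more than a single codeword can absorb. That is consistent with the abstract's remark that interleaving is used; with a depth of 2, a single burst leaves at most 17 errors in any one codeword. A full design would also have to account for a second burst arriving after the error-free interval.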
|
99 |
Extrakce textových dat z internetových stránek / Extracting text data from the webpages
Mazal, Zdeněk (January 2011)
This work focuses on data mining, and especially text mining, from Web pages. It gives an overview of programs for downloading text and of the ways text can be extracted. It also contains an overview of the most frequently used programs for extracting data from the internet. The output of this thesis is a Java program that can download text from a selection of servers and save it into an XML file.
|
100 |
Bezdrátový sběr diagnostických dat z automobilu podporujících OBD-II / Wireless gathering of diagnostic data of cars supporting OBD-II
Fadrný, Jaroslav (January 2014)
Study the OBD-II protocol and familiarize yourself with the concepts of existing devices that use this protocol for the wireless transmission of car diagnostic data. Compare the different concepts with one another and propose your own device concept. The device should consist of standard components, without developing custom hardware. Implement the proposed device. Verify the function of the device and measure its range and data throughput. Discuss the connection to a database server and a simple presentation of the data.
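To make the kind of data involved more concrete, here is a minimal sketch of decoding OBD-II mode 01 (current data) responses. It is not part of the thesis assignment. It assumes the response arrives as hexadecimal text such as `41 0C 1A F8`, the form typically returned by ELM327-style adapters (an assumption about the hardware), and it covers only three standard PIDs. The function name `decode_obd_response` and the returned labels are hypothetical.

```python
def decode_obd_response(response: str):
    """Decode a mode 01 response given as hex text, e.g. '41 0C 1A F8'.

    Returns a (label, value) pair. Byte 0 is 0x41 (mode 01 plus 0x40),
    byte 1 is the PID, and the remaining bytes are the payload A, B, ...
    """
    data = bytes.fromhex(response.replace(" ", ""))
    if len(data) < 3 or data[0] != 0x41:
        raise ValueError("not a mode 01 response: " + response)
    pid, payload = data[1], data[2:]

    if pid == 0x0C:  # engine speed: (256 * A + B) / 4 rpm
        if len(payload) < 2:
            raise ValueError("engine speed needs two data bytes")
        return "engine_rpm", (256 * payload[0] + payload[1]) / 4
    if pid == 0x0D:  # vehicle speed: A km/h
        return "vehicle_speed_kmh", payload[0]
    if pid == 0x05:  # engine coolant temperature: A - 40 degrees Celsius
        return "coolant_temp_c", payload[0] - 40
    raise ValueError("PID not handled in this sketch: 0x%02X" % pid)


if __name__ == "__main__":
    for line in ("41 0C 1A F8", "41 0D 3C", "41 05 7B"):
        print(line, "->", decode_obd_response(line))
```

A device built from standard components would read such lines over its wireless link and forward the decoded values to the database server.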
|