Global ETD Search

571	MotifGP: DNA Motif Discovery Using Multiobjective Evolution Belmadani, Manuel January 2016 The motif discovery problem is becoming increasingly important for molecular biologists as new sequencing technologies produce large amounts of data at unprecedented rates. The solution space for DNA motifs is too large to search with naive methods, meaning there is a need for fast and accurate motif detection tools. We propose MotifGP, a multiobjective motif discovery tool evolving regular expressions that characterize overrepresented motifs in a given input dataset. This thesis describes and evaluates a multiobjective strongly typed genetic programming algorithm for the discovery of network expressions in DNA sequences. Using 13 realistic data sets, we compare the results of our tool, MotifGP, to those of DREME, a state-of-the-art program. MotifGP outperforms DREME when the motifs to be sought are long, and the specificity is distributed over the length of the motif. For shorter motifs, the performance of MotifGP compares favourably with the state-of-the-art method. Finally, we discuss the advantages of multi-objective optimization in the context of this specific motif discovery problem. Genetic Programming Multiobjective optimization Motif Discovery Evolutionary Computing Bioinformatics ChIP-seq
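The idea of scoring candidate regular expressions under several objectives at once can be illustrated with a minimal Python sketch. This is not MotifGP's actual fitness function: the two objectives (coverage and discrimination), the use of a control sequence set, and every function name below are assumptions made for illustration only.

```python
import re


def objectives(pattern, positives, controls):
    """Return (coverage, discrimination) for one candidate motif.

    coverage:       fraction of input sequences containing the motif.
    discrimination: share of all matching sequences that come from the input set.
    Both are hypothetical objectives, to be maximised.
    """
    regex = re.compile(pattern)
    pos_hits = sum(1 for seq in positives if regex.search(seq))
    ctl_hits = sum(1 for seq in controls if regex.search(seq))
    coverage = pos_hits / len(positives)
    total = pos_hits + ctl_hits
    discrimination = pos_hits / total if total else 0.0
    return coverage, discrimination


def dominates(a, b):
    """True if score tuple a is at least as good as b everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates, positives, controls):
    """Keep the candidates whose scores are not dominated by any other candidate."""
    scored = {p: objectives(p, positives, controls) for p in candidates}
    return [
        p for p in candidates
        if not any(dominates(scored[q], scored[p]) for q in candidates if q != p)
    ]


# Toy data, invented for this example.
POSITIVES = ["ACGTGACC", "TTCGTGAA", "GGACGTGA", "CCCCCCCC"]
CONTROLS = ["ACGTTTTT", "GGGGGGGG", "TTCGTGTT", "AAAAAAAA"]
FRONT = pareto_front(["CGTGA", "CGT", "[AC]GTGA"], POSITIVES, CONTROLS)
```

In this toy setting the short pattern `CGT` also matches control sequences, so it is dominated by the longer candidates and dropped from the front. A multiobjective search would keep the whole non-dominated set rather than collapsing the objectives into one number.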
572	Analýza reálných dat produktové redakce Alza.cz pomocí metod DZD / Analysis of real data from Alza.cz product department using methods of KDD Válek, Martin January 2014 This thesis deals with data analysis using methods of knowledge discovery in databases (KDD). The goal is to select appropriate methods and tools for a specific project based on real data from the Alza.cz product department. The data are analysed using association rules and decision rules in LISp-Miner and decision trees in RapidMiner. The methodology used is CRISP-DM. The thesis is divided into three main sections. The first section gives a theoretical summary of KDD: it defines the basic terms and describes the types of KDD tasks and methods. The second section introduces the CRISP-DM methodology. The practical part first introduces the company Alza.cz and its goals for this task. It then describes the basic structure of the data and their preparation for the next step (data mining). In conclusion, the results are evaluated and their possible uses are outlined.
573	Reálná úloha dobývání znalostí / The Real Knowledge Discovery Task Kolafa, Ondřej January 2012 The main objective of this thesis is to perform a real data mining task: classifying holders of term deposit accounts. The task uses anonymised data on bank customers with a low funds position. In accordance with the CRISP-DM methodology, the work proceeds through these steps: business understanding, data understanding, data preparation, modeling, evaluation and deployment. The RapidMiner application is used for modeling. The methods and procedures used in the task are described in the theoretical part, which introduces the basic concepts of data mining with special regard to the CRM segment, the CRISP-DM methodology, and techniques suitable for this task. Because the proportions of long-term account holders and non-holders differed, the data set had to be balanced in favour of holders. In the final stage, twelve models were built. According to the chosen criteria (area under the curve and F-measure), the two best models (logistic regression and Bayes network) were selected. The last stage of the data mining process discusses possible real-world use. Deployment is developed only in the form of recommendations, because it cannot be applied to the real situation.
574	Neighbour discovery and distributed spatio-temporal cluster detection in pocket switched networks Orlinski, Matthew January 2013 Pocket Switched Networks (PSNs) offer a means of infrastructureless inter-human communication by utilising Delay and Disruption Tolerant Networking (DTN) technology. However, creating PSNs involves solving challenges which were not encountered in the Deep Space Internet for which DTN technology was originally intended. End-to-end communication over multiple hops in PSNs is a product of short range opportunistic wireless communication between personal mobile wireless devices carried by humans. Opportunistic data delivery in PSNs is far less predictable than in the Deep Space Internet because human movement patterns are harder to predict than the orbital motion of satellites. Furthermore, PSNs require some scheme for efficient neighbour discovery in order to save energy and because mobile devices in PSNs may be unaware of when their next encounter will take place. This thesis offers novel solutions for neighbour discovery and opportunistic data delivery in PSNs that make practical use of dynamic inter-human encounter patterns. The first contribution is a novel neighbour discovery algorithm for PSNs called PISTONS which relies on a new inter-probe time calculation (IPC) and the bursty encounter patterns of humans to set the time between neighbour discovery scans. The IPC equations and PISTONS also give participants the ability to easily specify their required level of connectivity and energy saving with a single variable. This thesis also contains novel distributed spatio-temporal clustering and opportunistic data delivery algorithms for PSNs which can be used to deliver data over multiple hops. The spatio-temporal clustering algorithms are also used to analyse the social networks and transient groups which are formed when humans interact. 004.6
575	Computer programming and kindergarten children in two learning environments Clouston, Dorothy Ruth January 1988 This study examined the appropriateness of introducing computer programming to kindergarten children. Three issues were explored in the research: 1. the programming capabilities of kindergarten children using a single keystroke program 2. suitable teaching techniques and learning environments for introducing programming 3. the benefits of programming at the kindergarten level. The subjects for the study were 40 kindergarten students from a suburban community in British Columbia, Canada. All students used the single keystroke program, DELTA DRAWING. Two teaching techniques were used: a structured method and a guided discovery method. Quantitative data were collected by administering five skills tests (skills relating to programming) as pretests and posttests to both groups. A programming posttest was also given. Qualitative data were obtained by recording detailed observation reports for each of the 22 lessons (11 for each group), conducting an interview with each child at the end of the study and distributing a parent questionnaire. It can be concluded that it is appropriate to introduce computer programming to kindergarten students. The children in this study showed they are capable of programming. All students mastered some programming commands to instruct the "turtle" to move on the screen. DELTA DRAWING was determined to be a suitable means to introduce programming to kindergarten children. A combination of a structured teaching method and a guided discovery method is recommended for introducing a single keystroke program. It was observed that students in a guided discovery learning environment are more enthusiastic and motivated than students in a structured environment. Students need time to explore and make discoveries, but some structure is necessary to teach specific commands and procedures which may otherwise not be discovered.
Social interaction should be encouraged while children use the computer; however, most kindergarten children prefer to work on their own computer. There was no significant difference between the two groups on all but one of the five skills tests for both the pretests and the posttests. On the Programming Test the two groups did not perform significantly differently. It can also be concluded that learning to program promotes cognitive development in certain areas. On all but one of the five skills tests both the Structured Group and the Guided Discovery Group scored significantly better on the posttest than on the pretest. Lesson observation reports, student interviews and responses on parent questionnaires suggested that the computer experience was positive and rewarding for the kindergarten students. / Education, Faculty of / Graduate Kindergarten -- Methods and manuals Learning by discovery Direct instruction
576	Porovnání nástrojů pro Data Discovery / Data Discovery Tools Comparison Kopecký, Martin January 2012 This diploma thesis focuses on Data Discovery tools, which have been growing in importance in the Business Intelligence (BI) field during the last few years. An increasing number of companies of all sizes tend to include them in their BI environments. The main goal of this thesis is to compare QlikView, Tableau and PowerPivot using a defined set of criteria. The comparison is based on the development of a human resources report, which was modeled on a real-life banking sector business case. The main goal is supported by a number of minor goals, namely: analysis of existing comparisons, definition of a new set of criteria, basic description of the compared platforms, and documentation of the case study. The text can be divided into two major parts. The theoretical part describes elemental BI architecture, discusses in-memory databases and data visualisation in the context of a BI solution, and analyses existing comparisons of Data Discovery tools and BI platforms in general. Eight different comparisons are analysed in total, including reports of consulting companies and diploma theses. The applied part of the thesis builds upon the previous analysis and defines comparison criteria divided into five groups: Data import, transformation and storage; Data analysis and presentation; Operations criteria; User friendliness and support; Business criteria. The subsequent chapter describes the selected platforms, their brief history, component architecture, available editions and licensing. The case study chapter documents the development of the report in each of the platforms and pinpoints their pros and cons. The final chapter applies the defined set of criteria and uses it to compare the selected Data Discovery platforms to fulfil the main goal of this thesis. The results are presented both numerically, utilising the weighted sum model, and verbally.
The contribution of the thesis lies in the transparent confrontation of three Data Discovery tools, in the definition of a new set of comparison criteria, and in the documentation of the practical testing. The thesis offers an indirect answer to the question: "Which analytical tool should we use to supplement our existing BI solution?"
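The weighted sum model named in the abstract above can be sketched in a few lines of Python. The criterion names, weights, scores and the two placeholder tools below are hypothetical and are not taken from the thesis; the sketch only shows how per-criterion scores are combined into one number per alternative.

```python
def weighted_sum(scores, weights):
    """Combine per-criterion scores into one value, normalised by total weight.

    scores:  mapping criterion -> score of one alternative (higher is better)
    weights: mapping criterion -> relative importance of that criterion
    """
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same criteria")
    total_weight = sum(weights.values())
    return sum(scores[c] * weights[c] for c in weights) / total_weight


def rank_alternatives(table, weights):
    """Return (name, weighted score) pairs sorted from best to worst."""
    totals = {name: weighted_sum(scores, weights) for name, scores in table.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical weights and scores for two placeholder tools.
WEIGHTS = {"data": 2, "analysis": 3, "usability": 1}
TABLE = {
    "Tool A": {"data": 4, "analysis": 3, "usability": 5},
    "Tool B": {"data": 5, "analysis": 2, "usability": 4},
}
RANKING = rank_alternatives(TABLE, WEIGHTS)
```

With these invented numbers "Tool A" ranks first because the heavily weighted analysis criterion outweighs its lower data score. The outcome of such a comparison depends directly on the chosen weights, which is why a numerical result is usefully accompanied by a verbal assessment.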
577	Implementace procedur pro předzpracování dat v systému Rapid Miner / Implementation of data preparation procedures for RapidMiner Černý, Ján January 2014 Knowledge Discovery in Databases (KDD) is gaining importance with the rising amount of data being collected. Despite this, analytic software systems often provide only the basic and most widely used procedures and algorithms. The aim of this thesis is to extend RapidMiner, one of the most frequently used systems, with new procedures for data preprocessing. To understand and develop the procedures, it is important to be acquainted with KDD, with emphasis on the data preparation phase. It is also important to describe the analytical procedures themselves. To develop an extension for RapidMiner, it is necessary to become acquainted with the process of creating an extension and the tools that are used. Finally, the resulting extension is introduced and tested.
578	Discovery of Resources and Conflict in the Interstate System, 1816-2001 Clark, Bradley 05 1900 This study tests a theory detailing the increased likelihood of conflict following an initial resource discovery in the discovering nation and its region. A survey of the literature shows a multitude of prior research concerning resources and nations' willingness to initiate conflict over those resources, but this research lacks any study of the effects of the discovery of resources on interstate conflict. The theory discusses the increased likelihood of conflict in the discovering nation as both target and initiator. It further looks at the increased chance of conflict in the discoverer's region due to security dilemmas and proxy wars. The results show strong support for the theory, suggesting nations making new resource discoveries must take extra care to avoid conflict. Resources discovery conflict Natural resources -- Political aspects. International relations. World politics.
579	Computação Evolutiva para a Construção de Regras de Conhecimento com Propriedades Específicas / Evolutionary Computing for Knowledge Rule Construction with Specific Properties Adriano Donizete Pila 12 April 2007 Most symbolic machine learning approaches use if-then knowledge rules as the description language in which the learned knowledge is expressed. The aim of these learners is to find a set of classification rules that can be used to predict the class of new instances that have not been seen by the learner before. However, these sorts of learners take into account the rule interaction problem, which consists of evaluating the quality of the set of induced rules (the classifier) as a whole, rather than evaluating the quality of each rule independently. Thus, as classifiers aim at good precision on unseen instances, they tend to neglect other desirable properties of knowledge rules, such as the ability to cause surprise or bring new knowledge to the domain specialist. In this work, we are interested in building knowledge rules with specific properties in an isolated manner, i.e. without considering the rule interaction problem. To this end, we propose an evolutionary approach in which each individual of the algorithm's population represents a single rule and the specific properties are encoded as rule quality measures, a set of which can be freely selected by the domain specialist to build rules with the desired properties. The proposed evolutionary algorithm uses a rich structure for individual representation, which makes it possible to consider a great variety of evolutionary operators. The algorithm uses a ranking-based multi-objective fitness function that considers more than one rule evaluation measure concomitantly, transforming them into a single-objective function. As experimentation plays an important role in this sort of work, in order to evaluate our proposal we have implemented the Evolutionary Computing Learning Environment (ECLE), a class library for running and evaluating the evolutionary algorithm in different scenarios. Furthermore, the ECLE has been implemented taking into account future development of new evolutionary operators. The ECLE is integrated into the DISCOVER project, a research project under development in our laboratory for automatic knowledge acquisition. Experimental analysis of the evolutionary algorithm to construct knowledge rules with specific properties, which can also be considered a form of intelligent data analysis, was carried out using ECLE. Results show the suitability of our proposal. Evolutionary computing Knowledge discovery Knowledge rules
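One simple way to turn several rule quality measures into a single ranking-based fitness value is sketched below in Python. The abstract does not give the thesis's actual formula, so the rank-sum scheme, the function names and the example measure values are all assumptions made for illustration.

```python
def rank_positions(values):
    """Rank values so that the largest gets rank 1; ties share the best rank."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


def rank_sum_fitness(population):
    """Collapse several measures into one scalar per individual.

    population: list of tuples, one tuple of measure values per rule
                (every measure is assumed to be 'higher is better').
    Returns the sum of each rule's per-measure ranks; lower is better.
    """
    per_measure = [rank_positions(column) for column in zip(*population)]
    return [sum(ranks) for ranks in zip(*per_measure)]


# Four hypothetical rules scored on two hypothetical quality measures.
POPULATION = [(0.9, 0.2), (0.6, 0.6), (0.3, 0.9), (0.6, 0.1)]
FITNESS = rank_sum_fitness(POPULATION)
```

Because only ranks enter the sum, measures with different scales can be mixed without normalising them first, and the domain specialist can add or remove measures by changing the tuples alone.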
580	O processo de extração de conhecimento de base de dados apoiado por agentes de software. / The process of knowledge discovery in databases supported by software agents. Robson Butaca Taborelli de Oliveira 01 December 2000 Nowadays, commercial and scientific application systems generate huge amounts of data that cannot be easily analyzed without the use of appropriate tools and techniques. A great number of these applications are also based on the Internet, that is, their data are distributed, which makes tasks such as data collection even more difficult. The field of Computer Science called Knowledge Discovery in Databases (KDD) deals with the creation and use of tools and techniques that allow for the automatic discovery of knowledge from data. Applying these techniques in a networked environment can be particularly difficult, especially for steps of the KDD process such as data collection and processing. Thus, new technologies need to be used in order to aid the knowledge discovery process. Software agents are computer programs with properties such as autonomy, reactivity and mobility that can be used for this purpose. In this context, the main goal of this work is to present the proposal of a multi-agent system, called Minador, aimed at supporting the execution and management of the Knowledge Discovery in Databases process. agents KDD data mining knowledge discovery in databases multi-agent system
