About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
571

Reálná úloha dobývání znalostí / The Real Knowledge Discovery Task

Kolafa, Ondřej January 2012 (has links)
The major objective of this thesis is to perform a real data mining task: classifying holders of term deposit accounts. The task uses anonymised data on bank customers with low funds positions. Following the CRISP-DM methodology, the work proceeds through business understanding, data understanding, data preparation, modeling, evaluation and deployment, with the RapidMiner application used for modeling. The theoretical part describes the methods and procedures applied in the practical task, introducing basic concepts of data mining, with particular attention to the CRM segment, along with the CRISP-DM methodology and techniques suitable for this task. Because holders of long-term accounts were heavily outnumbered by non-holders, the data set had to be rebalanced in favour of holders. Twelve models were built in total, and the two best (logistic regression and Bayesian network) were selected according to the chosen criteria: area under the curve and F-measure. The final stage of the data mining process discusses possible real-world use; since the results could not be applied to the real situation, they are presented only as recommendations.
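The abstract describes rebalancing and AUC/F-measure model selection but includes no code; the sketch below is a minimal illustration of that evaluation step using scikit-learn, with synthetic data in place of the bank data and a naive Bayes classifier standing in for the thesis's Bayesian network. All names, numbers and the oversampling strategy are assumptions, not details from the thesis.

```python
# Illustrative sketch only: rebalancing plus AUC/F-measure model selection,
# as described in the abstract. Data, the oversampling strategy, and the
# naive Bayes stand-in are assumed, not taken from the thesis.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import roc_auc_score, f1_score
from sklearn.model_selection import train_test_split

# Stand-in for the bank data: a heavily imbalanced binary problem.
X, y = make_classification(n_samples=5000, weights=[0.95], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Rebalance the training set in favour of the minority class (holders).
minority = np.where(y_tr == 1)[0]
extra = np.random.default_rng(0).choice(minority, size=4 * len(minority))
X_bal = np.vstack([X_tr, X_tr[extra]])
y_bal = np.concatenate([y_tr, y_tr[extra]])

for model in (LogisticRegression(max_iter=1000), GaussianNB()):
    model.fit(X_bal, y_bal)
    scores = model.predict_proba(X_te)[:, 1]
    print(type(model).__name__,
          "AUC=%.3f" % roc_auc_score(y_te, scores),
          "F1=%.3f" % f1_score(y_te, scores > 0.5))
```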
572

Neighbour discovery and distributed spatio-temporal cluster detection in pocket switched networks

Orlinski, Matthew January 2013 (has links)
Pocket Switched Networks (PSNs) offer a means of infrastructureless inter-human communication by utilising Delay and Disruption Tolerant Networking (DTN) technology. However, creating PSNs involves solving challenges which were not encountered in the Deep Space Internet for which DTN technology was originally intended. End-to-end communication over multiple hops in PSNs is a product of short-range opportunistic wireless communication between personal mobile wireless devices carried by humans. Opportunistic data delivery in PSNs is far less predictable than in the Deep Space Internet because human movement patterns are harder to predict than the orbital motion of satellites. Furthermore, PSNs require some scheme for efficient neighbour discovery, both to save energy and because mobile devices in PSNs may be unaware of when their next encounter will take place. This thesis offers novel solutions for neighbour discovery and opportunistic data delivery in PSNs that make practical use of dynamic inter-human encounter patterns. The first contribution is a novel neighbour discovery algorithm for PSNs called PISTONS, which relies on a new inter-probe time calculation (IPC) and the bursty encounter patterns of humans to set the time between neighbour discovery scans. The IPC equations and PISTONS also give participants the ability to easily specify their required level of connectivity and energy saving with a single variable. This thesis also contains novel distributed spatio-temporal clustering and opportunistic data delivery algorithms for PSNs which can be used to deliver data over multiple hops. The spatio-temporal clustering algorithms are also used to analyse the social networks and transient groups which are formed when humans interact.
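The thesis's actual IPC equations are not reproduced in the abstract; the sketch below only illustrates the general idea of adaptive probing under bursty encounters — scan frequently just after a contact (when further contacts are likely) and back off during quiet periods, with a single parameter trading connectivity against energy. The update rule here is an assumption for illustration, not the PISTONS algorithm.

```python
# Illustrative adaptive scan scheduler, NOT the PISTONS/IPC algorithm:
# the abstract gives the idea (bursty encounters, one tuning knob) but not
# the equations, so this update rule is an invented stand-in.

class AdaptiveScanScheduler:
    def __init__(self, aggressiveness=0.5, min_gap=1.0, max_gap=300.0):
        # aggressiveness in (0, 1]: higher -> more scans, more energy used.
        self.aggressiveness = aggressiveness
        self.min_gap = min_gap
        self.max_gap = max_gap
        self.gap = min_gap  # seconds until the next neighbour-discovery scan

    def after_scan(self, found_neighbour: bool) -> float:
        if found_neighbour:
            # Encounters are bursty: another contact is likely soon.
            self.gap = self.min_gap
        else:
            # Quiet period: back off, faster when aggressiveness is low.
            growth = 1.0 + (1.0 - self.aggressiveness)
            self.gap = min(self.gap * growth, self.max_gap)
        return self.gap

sched = AdaptiveScanScheduler(aggressiveness=0.3)
for contact in [True, False, False, False, True]:
    print("next scan in %.1fs" % sched.after_scan(contact))
```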
573

Computer programming and kindergarten children in two learning environments

Clouston, Dorothy Ruth January 1988 (has links)
This study examined the appropriateness of introducing computer programming to kindergarten children. Three issues were explored in the research: 1. the programming capabilities of kindergarten children using a single keystroke program; 2. suitable teaching techniques and learning environments for introducing programming; and 3. the benefits of programming at the kindergarten level. The subjects for the study were 40 kindergarten students from a suburban community in British Columbia, Canada. All students used the single keystroke program DELTA DRAWING. Two teaching techniques were used: a structured method and a guided discovery method. Quantitative data were collected by administering five skills tests (skills relating to programming) as pretests and posttests to both groups; a programming posttest was also given. Qualitative data were obtained by recording detailed observation reports for each of the 22 lessons (11 for each group), interviewing each child at the end of the study, and distributing a parent questionnaire. It can be concluded that it is appropriate to introduce computer programming to kindergarten students. The children in this study showed they are capable of programming: all students mastered some programming commands to instruct the "turtle" to move on the screen. DELTA DRAWING was determined to be a suitable means of introducing programming to kindergarten children. A combination of a structured teaching method and a guided discovery method is recommended for introducing a single keystroke program. It was observed that students in a guided discovery learning environment are more enthusiastic and motivated than students in a structured environment. Students need time to explore and make discoveries, but some structure is necessary to teach specific commands and procedures which might otherwise not be discovered. Social interaction should be encouraged while children use the computer; however, most kindergarten children prefer to work on their own computer. There was no significant difference between the two groups on all but one of the five skills tests, for both the pretests and the posttests, and the two groups did not perform significantly differently on the Programming Test. It can also be concluded that learning to program promotes cognitive development in certain areas: on all but one of the five skills tests, both the Structured Group and the Guided Discovery Group scored significantly better on the posttest than on the pretest. Lesson observation reports, student interviews and responses on parent questionnaires suggested that the computer experience was positive and rewarding for the kindergarten students.
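DELTA DRAWING's actual command set is not described in the abstract; purely to illustrate what a "single keystroke program" is, the hedged sketch below binds single keys to turtle movements using Python's standard turtle module, so that one key press issues one drawing command.

```python
# A minimal sketch of a "single keystroke" turtle interface, in the spirit
# of DELTA DRAWING (the real product's commands are not assumed here):
# one key press = one drawing command.
import turtle

t = turtle.Turtle()
screen = turtle.Screen()

screen.onkey(lambda: t.forward(20), "d")  # d: draw forward
screen.onkey(lambda: t.right(30), "r")    # r: rotate right 30 degrees
screen.onkey(lambda: t.left(30), "l")     # l: rotate left 30 degrees
screen.onkey(screen.bye, "q")             # q: quit

screen.listen()  # route key presses to the handlers above
turtle.mainloop()
```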
574

Porovnání nástrojů pro Data Discovery / Data Discovery Tools Comparison

Kopecký, Martin January 2012 (has links)
This diploma thesis focuses on Data Discovery tools, which have been growing in importance in the Business Intelligence (BI) field during the last few years; an increasing number of companies of all sizes tend to include them in their BI environments. The main goal of this thesis is to compare QlikView, Tableau and PowerPivot using a defined set of criteria. The comparison is based on the development of a human resources report modeled on a real-life banking sector business case. The main goal is supported by a number of minor goals, namely: analysis of existing comparisons, definition of a new set of criteria, basic description of the compared platforms, and documentation of the case study. The text can be divided into two major parts. The theoretical part describes elemental BI architecture, discusses in-memory databases and data visualisation in the context of a BI solution, and analyses existing comparisons of Data Discovery tools and BI platforms in general; eight different comparisons are analysed in total, including reports of consulting companies and diploma theses. The applied part of the thesis builds upon the previous analysis and defines comparison criteria divided into five groups: data import, transformation and storage; data analysis and presentation; operations criteria; user friendliness and support; and business criteria. The subsequent chapter describes the selected platforms, their brief history, component architecture, available editions and licensing. The case study chapter documents the development of the report in each of the platforms and pinpoints their pros and cons. The final chapter applies the defined set of criteria to compare the selected Data Discovery platforms, fulfilling the main goal of this thesis. The results are presented both numerically, using the weighted sum model, and verbally. The contribution of the thesis lies in the transparent side-by-side comparison of three Data Discovery tools, in the definition of a new set of comparison criteria, and in the documentation of the practical testing. The thesis offers an indirect answer to the question: "Which analytical tool should we use to supplement our existing BI solution?"
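The thesis's actual weights and scores are not given in the abstract; the following sketch shows only how a weighted sum model of the kind mentioned turns per-criterion scores into a single ranking. The criterion groups match the abstract; every weight and score is made up for illustration.

```python
# Weighted sum model (WSM) sketch. Criterion groups follow the abstract;
# all weights and scores below are invented purely for illustration.
weights = {
    "data_import_transformation_storage": 0.25,
    "data_analysis_presentation": 0.30,
    "operations": 0.15,
    "user_friendliness_support": 0.15,
    "business": 0.15,
}

scores = {  # hypothetical per-criterion scores on a 0-10 scale
    "QlikView":   {"data_import_transformation_storage": 8, "data_analysis_presentation": 7,
                   "operations": 7, "user_friendliness_support": 6, "business": 7},
    "Tableau":    {"data_import_transformation_storage": 7, "data_analysis_presentation": 9,
                   "operations": 6, "user_friendliness_support": 8, "business": 6},
    "PowerPivot": {"data_import_transformation_storage": 6, "data_analysis_presentation": 6,
                   "operations": 7, "user_friendliness_support": 7, "business": 8},
}

for tool, s in scores.items():
    total = sum(weights[c] * s[c] for c in weights)  # weighted sum
    print(f"{tool}: {total:.2f}")
```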
575

Implementace procedur pro předzpracování dat v systému Rapid Miner / Implementation of data preparation procedures for RapidMiner

Černý, Ján January 2014 (has links)
Knowledge Discovery in Databases (KDD) is gaining importance with the rising amount of data being collected; despite this, analytic software systems often provide only the basic and most frequently used procedures and algorithms. The aim of this thesis is to extend RapidMiner, one of the most frequently used systems, with new procedures for data preprocessing. Understanding and developing the procedures requires familiarity with KDD, with emphasis on the data preparation phase, and a description of the analytical procedures themselves. Developing an extension for RapidMiner further requires familiarity with the extension-building process and the tools it involves. Finally, the resulting extension is introduced and tested.
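The abstract does not name the specific preprocessing procedures added; as a hedged stand-in, the sketch below implements one typical data preparation step (min-max normalisation with missing-value handling) of the kind such an extension would wrap as an operator. The thesis's actual extension is written against RapidMiner's Java operator API, which is not reproduced here.

```python
# A typical data preparation procedure of the kind wrapped by such an
# extension: min-max normalisation that tolerates missing values.
# Generic illustration only, not code from the thesis's extension.
import math

def min_max_normalize(column):
    """Scale numeric values to [0, 1], passing NaNs through unchanged."""
    present = [v for v in column if not math.isnan(v)]
    lo, hi = min(present), max(present)
    if lo == hi:  # constant column: map every present value to 0.0
        return [v if math.isnan(v) else 0.0 for v in column]
    return [v if math.isnan(v) else (v - lo) / (hi - lo) for v in column]

print(min_max_normalize([2.0, float("nan"), 4.0, 10.0]))
# [0.0, nan, 0.25, 1.0]
```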
576

Discovery of Resources and Conflict in the Interstate System, 1816-2001

Clark, Bradley 05 1900 (has links)
This study tests a theory detailing the increased likelihood of conflict following an initial resource discovery, both in the discovering nation and in its region. A survey of the literature shows a multitude of prior research on resources and nations' willingness to initiate conflict over them, but no study of the effects of resource discoveries themselves on interstate conflict. The theory addresses the increased likelihood of conflict involving the discovering nation as both target and initiator, and the increased chance of conflict in the discoverer's region due to security dilemmas and proxy wars. The results show strong support for the theory, suggesting that nations making new resource discoveries must take extra care to avoid conflict.
577

Computação Evolutiva para a Construção de Regras de Conhecimento com Propriedades Específicas / Evolutionary Computing for Knowledge Rule Construction with Specific Properties

Adriano Donizete Pila 12 April 2007 (has links)
Most symbolic machine learning approaches use if-then knowledge rules as the description language in which the learned knowledge is expressed. The aim of these learners is to find a set of classification rules that can be used to predict the class of new instances not previously seen by the learner. However, such learners are subject to the rule interaction problem: the quality of the induced rule set (classifier) is evaluated as a whole, rather than evaluating the quality of each rule independently. Because classifiers aim at good precision on unseen instances, they tend to neglect other desirable properties of knowledge rules, such as the ability to cause surprise or bring new knowledge to the domain specialist. In this work, we are interested in building knowledge rules with specific properties in isolation, i.e. without considering the rule interaction problem. To this end, we propose an evolutionary approach in which each individual of the algorithm's population represents a single rule, and the specific properties are encoded as rule quality measures, which the domain specialist can freely select in order to construct rules with the desired properties. The proposed evolutionary algorithm uses a rich structure for individual representation, which makes it possible to consider a great variety of evolutionary operators. The algorithm uses a ranking-based multi-objective fitness function that considers more than one rule evaluation measure concomitantly, transforming them into a single objective. As experimentation plays an important role in this sort of work, to evaluate our proposal we implemented the Evolutionary Computing Learning Environment (ECLE), a class library for running and evaluating the evolutionary algorithm under different scenarios, designed with future implementations of new evolutionary operators in mind. ECLE is integrated into the DISCOVER project, a research project under development in our laboratory for automatic knowledge acquisition. Experimental analyses of the evolutionary algorithm for constructing knowledge rules with specific properties, which can be considered a form of intelligent data analysis, were carried out using ECLE. The results show the suitability of our proposal.
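ECLE's actual representation and operators are not given in the abstract; the sketch below is a toy illustration of the two ideas it does describe: one individual = one rule, and a ranking-based fitness that folds several rule quality measures into a single objective. The rule encoding, the measures (coverage and precision) and the data are all assumptions.

```python
# Toy sketch: individuals are single if-then rules, and fitness folds two
# rule-quality measures into one objective by rank averaging. Encoding,
# measures and data are illustrative assumptions, not ECLE's design.
import random

random.seed(0)
data = [({"a": random.random(), "b": random.random()}, random.random() > 0.5)
        for _ in range(200)]

def make_rule():
    # A rule here is one threshold condition on one attribute -> class True.
    return (random.choice(["a", "b"]), random.random())

def coverage_and_precision(rule):
    attr, thr = rule
    covered = [cls for x, cls in data if x[attr] > thr]
    if not covered:
        return 0.0, 0.0
    return len(covered) / len(data), sum(covered) / len(covered)

def ranks(values):
    # Rank positions per measure (higher value -> higher rank).
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order):
        r[i] = rank
    return r

population = [make_rule() for _ in range(20)]
measures = [coverage_and_precision(r) for r in population]

# Ranking-based multi-objective fitness: average the per-measure ranks
# to obtain a single objective.
cov_ranks = ranks([m[0] for m in measures])
prec_ranks = ranks([m[1] for m in measures])
fitness = [(cov_ranks[i] + prec_ranks[i]) / 2 for i in range(len(population))]
best = max(range(len(population)), key=fitness.__getitem__)
print("best rule:", population[best], "fitness:", fitness[best])
```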
578

O processo de extração de conhecimento de base de dados apoiado por agentes de software. / The process of knowledge discovery in databases supported by software agents.

Robson Butaca Taborelli de Oliveira 01 December 2000 (has links)
Commercial and scientific application systems generate ever larger amounts of data, which can hardly be analyzed without appropriate analysis tools and techniques. Moreover, many of these applications are Internet-based, with their data distributed, which makes tasks such as data collection even harder. The field of Knowledge Discovery in Databases (KDD) concerns the techniques and tools used to automatically discover knowledge embedded in data. In a computer network environment, some steps of the KDD process, such as data collection and processing, are more complicated to carry out, so new technologies can be brought in to support the knowledge discovery process. Software agents are computer programs with properties such as autonomy, reactivity and mobility that can be used for this purpose. In this context, the goal of this work is to propose a multi-agent system, called Minador, to support the execution and management of the Knowledge Discovery in Databases process.
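Minador's actual architecture is not detailed in the abstract; the sketch below only illustrates the general idea of agents with distinct roles cooperating on KDD steps (collection, preprocessing, mining) under a coordinator that manages the process. The roles and interfaces are assumptions.

```python
# Conceptual sketch of a multi-agent KDD pipeline. Minador's real design is
# not described in the abstract; the roles and interfaces here are assumed.

class CollectorAgent:
    def run(self, _):
        # Stand-in for autonomously gathering distributed data.
        return [3, 1, 2, None, 5]

class PreprocessorAgent:
    def run(self, data):
        return sorted(x for x in data if x is not None)  # drop missing values

class MinerAgent:
    def run(self, data):
        # Trivial stand-in for a data mining step.
        return {"min": data[0], "max": data[-1], "n": len(data)}

class Coordinator:
    """Manages the KDD process by routing results between agents."""
    def __init__(self, agents):
        self.agents = agents

    def execute(self):
        result = None
        for agent in self.agents:
            result = agent.run(result)
        return result

print(Coordinator([CollectorAgent(), PreprocessorAgent(), MinerAgent()]).execute())
```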
579

SNP discovery, high-density genetic map construction, and identification of genes associated with climate adaptation, and lack of intermuscular bone in tambaqui (Colossoma macropomum) / Descoberta de SNP, construção de mapa genético de alta densidade e identificação de genes associados com adaptação climática e ausência da espinha intermuscular em tambaqui (Colossoma macropomum)

José de Ribamar da Silva Nunes 08 March 2017 (has links)
Tambaqui (Colossoma macropomum) is the largest native characiform species of the Amazon and Orinoco river basins of South America. Tambaqui farming is growing rapidly in Brazil: production reached 139,209 tons in 2014, a 57.7% increase over 2013. However, few genetic studies of tambaqui are currently available. Genetic studies of both cultured and wild tambaqui populations need a holistic approach in order to act rationally on the ecological and market challenges facing aquaculture. Genetic approaches have provided important tools for understanding population dynamics, local adaptation and gene function, improving the selection strategies applied in breeding programs. Next-generation sequencing (NGS) has enabled great advances in genomic and transcriptomic approaches, especially for non-model species. Genotyping-by-sequencing (GBS) is one such approach, based on reducing genome complexity with restriction enzymes (REs). This thesis applies these approaches to advance the genetic groundwork for tambaqui studies. The GBS approach yielded a high-density SNP panel that allowed us to build the first linkage map for the species and to conduct association studies with environmental variables, local adaptation, and the lack of intermuscular bones, using tambaqui as a model. This work provides theoretical references to be applied in tambaqui breeding programs, allowing a better understanding of the genetic processes behind traits of interest in aquaculture.
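The enzymes used in the thesis are not named in the abstract; purely as an illustration of RE-based complexity reduction, the sketch below performs an in-silico digest of a toy sequence using EcoRI's well-known recognition site (GAATTC) and reports the resulting fragments. GBS then sequences only the regions near such cut sites rather than the whole genome.

```python
# In-silico digest sketch illustrating RE-based genome complexity reduction.
# EcoRI's GAATTC site is a textbook example; the thesis's actual enzymes
# and protocol are not given in the abstract.

def digest(sequence: str, site: str = "GAATTC", cut_offset: int = 1):
    """Cut a sequence at every occurrence of a recognition site.

    cut_offset is where the enzyme cuts within the site (EcoRI cuts G^AATTC).
    """
    fragments, start, pos = [], 0, 0
    while True:
        pos = sequence.find(site, pos)
        if pos == -1:
            break
        cut = pos + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        pos += 1
    fragments.append(sequence[start:])
    return fragments

toy_genome = "ATGAATTCCGTTAGGAATTCAAGT"
for frag in digest(toy_genome):
    print(len(frag), frag)
# GBS sequences only the ends of such fragments, reducing the portion of
# the genome that must be covered.
```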
580

Geração automática de metadados: uma contribuição para a Web semântica. / Automatic metadata generation: a contribution to the semantic Web.

Ferreira, Eveline Cruz Hora Gomes 05 April 2006 (has links)
This thesis offers a contribution to the Semantic Web area, in the scope of document representation and indexing, defining a model for context-based automatic metadata generation from unstructured (txt) textual documents in Portuguese. A broad theoretical treatment of subjects related to the creation of semantic digital environments is also presented. As recommended by SemanticWeb.org, the textual documents studied here were automatically converted into semantically annotated Web pages, using Dublin Core as the standard for defining the metadata elements and RDF/XML as the standard for representing the documents and describing the metadata elements. Of the fifteen Dublin Core metadata elements, nine were generated fully automatically by the model and six semi-automatically. The Description and Subject elements required the most complex algorithms, being obtained through statistical, text mining and natural language processing techniques. The main purpose of evaluating the model was to verify how the documents converted to RDF/XML behaved when submitted to an information retrieval process. The Description and Subject elements were evaluated exhaustively, since they are mainly responsible for capturing the semantics of textual documents. The diversity of contexts, the complexity of the problems of the Portuguese language, and the new concepts introduced by Semantic Web standards and technologies were among the strong challenges faced in building the model proposed here. Although the techniques used to explore the contents of the documents are not especially new, the innovative elements introduced by the Semantic Web made it possible to obtain important results in this thesis. As demonstrated here, joining those techniques with the standards and technologies recommended by the Semantic Web can mitigate one of the biggest problems of the current Web, and one of the strong reasons for implementing the Semantic Web: the tendency of search engines to flood users with irrelevant results because they do not take the user's specific context into account. It is therefore important to continue studies and research in all areas related to the implementation of the Semantic Web, opening the way for more functional information systems to be designed.
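As a rough illustration of the output format the thesis targets, the sketch below emits a minimal RDF/XML description of one document using a few Dublin Core elements. The element values and the document URL are placeholders; the real model derives them (notably Description and Subject) from text mining and NLP.

```python
# Minimal RDF/XML + Dublin Core output sketch. The namespace URIs are the
# standard RDF and DC ones; all element values are placeholders standing
# in for what the model would extract from the document text.
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("rdf", RDF)
ET.register_namespace("dc", DC)

rdf = ET.Element(f"{{{RDF}}}RDF")
desc = ET.SubElement(rdf, f"{{{RDF}}}Description",
                     {f"{{{RDF}}}about": "http://example.org/docs/tese01.txt"})
ET.SubElement(desc, f"{{{DC}}}title").text = "Documento de exemplo"
ET.SubElement(desc, f"{{{DC}}}language").text = "pt"
ET.SubElement(desc, f"{{{DC}}}subject").text = "web semantica; metadados"
ET.SubElement(desc, f"{{{DC}}}description").text = (
    "Resumo gerado automaticamente a partir do texto do documento.")

print(ET.tostring(rdf, encoding="unicode"))
```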
