Global ETD Search

21	Análise associativa: identificação de padrões de associação entre o perfil socioeconômico dos alunos do ensino básico e os resultados nas provas de matemática / Association analysis: identification of patterns related to the socioeconomic profiles Lyvia Aloquio 20 February 2014 (has links) Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / Nos dias atuais, a maioria das operações feitas por empresas e organizações é armazenada em bancos de dados que podem ser explorados por pesquisadores com o objetivo de se obter informações úteis para auxílio da tomada de decisão. Devido ao grande volume envolvido, a extração e análise dos dados não é uma tarefa simples. O processo geral de conversão de dados brutos em informações úteis chama-se Descoberta de Conhecimento em Bancos de Dados (KDD - Knowledge Discovery in Databases). Uma das etapas deste processo é a Mineração de Dados (Data Mining), que consiste na aplicação de algoritmos e técnicas estatísticas para explorar informações contidas implicitamente em grandes bancos de dados. Muitas áreas utilizam o processo KDD para facilitar o reconhecimento de padrões ou modelos em suas bases de informações. Este trabalho apresenta uma aplicação prática do processo KDD utilizando a base de dados de alunos do 9 ano do ensino básico do Estado do Rio de Janeiro, disponibilizada no site do INEP, com o objetivo de descobrir padrões interessantes entre o perfil socioeconômico do aluno e seu desempenho obtido em Matemática na Prova Brasil 2011. Neste trabalho, utilizando-se da ferramenta chamada Weka (Waikato Environment for Knowledge Analysis), foi aplicada a tarefa de mineração de dados conhecida como associação, onde se extraiu regras por intermédio do algoritmo Apriori. Neste estudo foi possível descobrir, por exemplo, que alunos que já foram reprovados uma vez tendem a tirar uma nota inferior na prova de matemática, assim como alunos que nunca foram reprovados tiveram um melhor desempenho. Outros fatores, como a sua pretensão futura, a escolaridade dos pais, a preferência de matemática, o grupo étnico o qual o aluno pertence, se o aluno lê sites frequentemente, também influenciam positivamente ou negativamente no aprendizado do discente. Também foi feita uma análise de acordo com a infraestrutura da escola onde o aluno estuda e com isso, pôde-se afirmar que os padrões descobertos ocorrem independentemente se estes alunos estudam em escolas que possuem infraestrutura boa ou ruim. Os resultados obtidos podem ser utilizados para traçar perfis de estudantes que tem um melhor ou um pior desempenho em matemática e para a elaboração de políticas públicas na área de educação, voltadas ao ensino fundamental. / Nowadays, most of the transactions made by companies and organizations is stored in databases that can be explored by researchers in order to obtain useful information to aid decision making. Due to the large volume involved, the extraction and analysis of data is not a simple task. The general process of converting raw data into useful information is called Knowledge Discovery in Databases (KDD). One step in this process is the Data Mining, which involves the application of algorithms and statistical techniques to exploit information contained implicitly in large databases. Many areas use the KDD process to facilitate the recognition of patterns or models on their bases of information. This work presents a practical application of KDD process using the database of students in the 9th grade of elementary education in the State of Rio de Janeiro, available in INEP site, with the aim of finding interesting patterns between the socioeconomic profile of the student and his/her performance obtained in Mathematics. The tool called Weka was used and the Apriori algorithm was applied to extracting association rules. This study revealed, for example, that students who have been reproved once tend to get a lower score on the math test, as well as students who had never been disapproved have had superior performance. Other factors like student future perspectives, ethnic group, parent's schooling, satisfaction in mathematics studying, and the frequency of access to Internet also affect positively or negatively the students learning. An analysis related to the schools infrastructure was made, with the conclusion that patterns do not change regardless of the student studying in good or bad infrastructure schools. The results obtained can be used to trace the students profiles which have a better or a worse performance in mathematics and to the development of public policies in education, aimed at elementary education. Aprendizagem de matemática Algoritmo Apriori Mineração de dados Regras de associação Learning of Mathematics Association rules Apriori Algorithm Data mining MATEMATICA APLICADA
22	Metody pro získávání asociačních pravidel z dat / Methods for Mining Association Rules from Data Uhlíř, Martin January 2007 (has links) The aim of this thesis is to implement Multipass-Apriori method for mining association rules from text data. After the introduction to the field of knowledge discovery, the specific aspects of text mining are mentioned. In the mining process, preprocessing is a very important problem, use of stemming and stop words dictionary is necessary in this case. Next part of thesis deals with meaning, usage and generating of association rules. The main part is focused on the description of Multipass-Apriori method, which was implemented. On the ground of executed tests the most optimal way of dividing partitions was set and also the best way of sorting the itemsets. As a part of testing, Multipass-Apriori method was compared with Apriori method.
23	pcApriori: Scalable apriori for multiprocessor systems Schlegel, Benjamin, Kiefer, Tim, Kissinger, Thomas, Lehner, Wolfgang 16 September 2022 (has links) Frequent-itemset mining is an important part of data mining. It is a computational and memory intensive task and has a large number of scientific and statistical application areas. In many of them, the datasets can easily grow up to tens or even several hundred gigabytes of data. Hence, efficient algorithms are required to process such amounts of data. In the recent years, there have been proposed many efficient sequential mining algorithms, which however cannot exploit current and future systems providing large degrees of parallelism. Contrary, the number of parallel frequent-itemset mining algorithms is rather small and most of them do not scale well as the number of threads is largely increased. In this paper, we present a highly-scalable mining algorithm that is based on the well-known Apriori algorithm; it is optimized for processing very large datasets on multiprocessor systems. The key idea of pcApriori is to employ a modified producer--consumer processing scheme, which partitions the data during processing and distributes it to the available threads. We conduct many experiments on large datasets. pcApriori scales almost linear on our test system comprising 32 cores. info:eu-repo/classification/ddc/004 ddc:004
24	Parallel Mining of Association Rules Using a Lattice Based Approach Thomas, Wessel Morant 01 January 2009 (has links) The discovery of interesting patterns from database transactions is one of the major problems in knowledge discovery in database. One such interesting pattern is the association rules extracted from these transactions. Parallel algorithms are required for the mining of association rules due to the very large databases used to store the transactions. In this paper we present a parallel algorithm for the mining of association rules. We implemented a parallel algorithm that used a lattice approach for mining association rules. The Dynamic Distributed Rule Mining (DDRM) is a lattice-based algorithm that partitions the lattice into sublattices to be assigned to processors for processing and identification of frequent itemsets. Experimental results show that DDRM utilizes the processors efficiently and performed better than the prefix-based and partition algorithms that use a static approach to assign classes to the processors. The DDRM algorithm scales well and shows good speedup. Apriori Association rules Data mining Distributed algorithms Lattice Parallel systems Computer Sciences
25	Mining Medical Data in a Clinical Environment Ivanovskiy, Tim V. 07 July 2006 (has links) The availability of new treatments for a disease depends on the success of clinical trials. In order for a clinical trial to be successful and approved, medical researchers must first recruit patients with a specific set of conditions in order to test the effectiveness of the proposed treatment. In the past, the accrual process was tedious and time-consuming. Since accruals rely heavily on the ability of physicians and their staff to be familiar with the protocol eligibility criteria, candidates tend to be missed. This can result and has resulted in unsuccessful trials.A recent project at the University of South Florida aimed to assist research physicians at H. Lee Moffitt Cancer Center & Research Institute, Tampa, Florida, with a screening process by utilizing a web-based expert system, Moffitt Expedited Accrual Network System (MEANS). This system allows physicians to determine the eligibility of a patient for several clinical trials simultaneously.We have implemented this web-based expert system at the H. Lee Moffitt Cancer Center & Research Gastroenterology (GI) Clinic. Based on our findings and staff feedback, the system has undergone many optimizations. We used data mining techniques to analyze the medical data of current gastrointestinal patients. The use of the Apriori algorithm allowed us to discover new rules (implications) in the patient data. All of the discovered implications were checked for medical validity by a physician, and those that were determined to be valid were entered into the expert system. Additional analysis of the data allowed us to streamline the system and decrease the number of mouse clicks required for screening. We also used a probability-based method to reorder the questions, which decreased the amount of data entry required to determine a patient's ineligibility. data mining rule association medical expert system apriori medical implications American Studies Arts and Humanities
26	Didelių duomenų sekų analizės problemos / Data mining problems Ambraziūnas, Valdas 11 June 2004 (has links) The main goal of these thesis is to compare association rules finding algorithms and to indicate the usability of finding association rules in business area. In order to achieve this goal, the theoretical analysis of three algorithms is done: 1. The Apriori algorithm – the most well known association rule algorithm – based on the property: “Any subset of a large itemset must be large”. This algorithm assumes that the database is memory-resident. The maximum number of database scans is one more than the cardinality of the largest large itemset. 2. The Sampling algorithm deals with the database sample prior the full database scan. The database sample is drawn such that it can be memory-resident. The Sampling algorithm reduces the number of database scans to one in the best case and two in the worst case. 3. The Partitioning algorithm divides database into partitions and bases on the property: “A large itemset must be large in at least one of the partitions”. This algorithm reduces the number of database scans to two and divides the database into partitions such that each partition can be placed into main memory. There are created programs for all three algorithms plus the program for the full set of itemsets algorithm. Programs are created in C++ language. In order to achieve topmost performance, the GUI is missed. Nine test data sets are created to compare the algorithms. Six of them contains real life data from telecommunications business area. Datasets varies from the... [to full text] Informatics Apriori Duomenų analizė Partitioning Sampling Association rules Data mining Asociacinės taisyklės
27	Applying the Apriori and FP-Growth Association Algorithms to Liver Cancer Data Pinheiro, Fabiola M. R. 27 August 2013 (has links) Cancer is the leading cause of deaths globally. Although liver cancer ranks only fourth in incidence worldwide among all types of cancer, its survivability rate is the lowest. Liver cancer is often diagnosed at an advanced stage, because in the early stages of the disease patients usually do not have signs or symptoms. After initial diagnosis, therapeutic options are limited and tend to be effective only for small size tumors with limited spread and minimal vascular invasion. As a result, long-term patient survival remains minimal, and has not improved in the past three decades. In order to reduce morbidity and mortality from liver cancer, improvement in early diagnosis and the evaluation of current treatments are essential. This study tested the applicability of the Apriori and FP-Growth association data mining algorithms to liver cancer patient data, obtained from the British Columbia Cancer Agency. The data was used to develop association rules which indicate what combinations of factors are most commonly observed with liver cancer incidence as well as with increased or decreased rates of mortality. Ideally, these association rules will be applied in future studies using liver cancer data extracted from other Electronic Health Record (EHR) systems. The main objective of making these rules available is to facilitate early detection guidelines for liver cancer and to evaluate current treatment options. / Graduate / 0566 / 0984 / fabiola@uvic.ca liver cancer association analysis association rule apriori fp-growth data mining British Columbia Yukon
28	Uma arquitetura de software para descoberta de regras de associação multidimensional, multinível e de outliers em cubos OLAP: um estudo de caso com os algoritmos APriori e FPGrowth Moreira Tanuro, Carla 31 January 2010 (has links) Made available in DSpace on 2014-06-12T15:55:26Z (GMT). No. of bitstreams: 2 arquivo2236_1.pdf: 2979608 bytes, checksum: 3c3ed256a9de67bd5b716bb15d15cb6c (MD5) license.txt: 1748 bytes, checksum: 8a4605be74aa9ea9d79846c1fba20a33 (MD5) Previous issue date: 2010 / Conselho Nacional de Desenvolvimento Científico e Tecnológico / O processo tradicional de descoberta de conhecimento em bases de dados (KDD Knowledge Discovery in Databases) não contempla etapas de processamento multidimensional e multinível (i.e., processamento OLAP - OnLine Analytical Processing) para minerar cubos de dados. Por conseqüência, a maioria das abordagens de OLAM (OLAP Mining) propõe adaptações no algoritmo minerador. Dado que esta abordagem provê uma solução fortemente acoplada ao algoritmo minerador, ela impede que as adaptações para mineração multidimensional e multinível sejam utilizadas com outros algoritmos. Além disto, grande parte das propostas de OLAM para regras de associação não considera o uso de um servidor OLAP e não tira proveito de todo o potencial multidimensional e multinível presentes nos cubos OLAP. Por estes motivos, algum retrabalho (e.g., re-implementação de operações OLAP) é realizado e padrões possivelmente fortes decorrentes de generalizações não são identificados. Diante desse cenário, este trabalho propõe a arquitetura DOLAM (Decoupled OLAM) para mineração desacoplada de regras de associação multidimensional, multinível e de outliers em cubos OLAP. A arquitetura DOLAM deve ser inserida no processo de KDD (Knowledge Discovery in Databases) como uma etapa de processamento que fica entre as etapas de Pré-Processamento e Transformação de Dados. A arquitetura DOLAM define e implementa três componentes: 1) Detector de Outliers, 2) Explorador de Subcubos e 3) Expansor de Ancestrais. A partir de uma consulta do usuário, estes componentes são capazes de, respectivamente: 1) identificar ruídos significativos nas células do resultado; 2) explorar, recursivamente, todas as células do resultado, de forma a contemplar todas as possibilidades de combinações multidimensional e multinível e 3) recuperar todos os antecessores (generalizações) das células do resultado. O componente central da arquitetura é o Expansor de Ancestrais - o único de uso obrigatório. Ressalta-se que, a partir desses componentes, o processamento OLAM fica desacoplado do algoritmo minerador e permite realizar descobertas mais abrangentes, as quais, por conseqüência, podem retornar padrões potencialmente mais fortes. Como prova de conceito, foi realizado um estudo de caso com dados reais de uma empresa de micro-crédito. O estudo de caso foi implementado em Java, fez uso do servidor OLAP Mondrian e utilizou as implementações dos algoritmos para mineração de regras de associação APriori e FP-Growth do pacote de software Weka OLAP Mineração de dados KDD OLAM Regras de associação APriori FP-growth Mineração multidimensional Mineração multinível Outlier
29	Identification of Discriminating Motifs in Heart Rate Time Series Data of Soccer Players Ravindranathan, Sampurna January 2018 (has links) No description available. Computer Science Wearable Devices Soccer Analytics Heart Rate Motifs Apriori Algorithm Feature Selection
30	Analysis of Classes of Singular Boundary Value Problems Ko, Eunkyung 11 August 2012 (has links) In this dissertation we study positive solutions to a singular p-Laplacian elliptic boundary value problem on a bounded domain with smooth boundary when a positive parameter varies. Our main focus is the analysis of a challenging class of singular p-Laplacian problems. We establish the existence of a positive solution for all positive values of the parameter and the existence of at least two positive solutions for a certain explicit range of the parameter. In the Laplacian case, we also prove the uniqueness of the positive solution for large values of the parameter. We extend our existence and multiplicity results to classes of singular systems and to the case when a domain is an exterior domain. We prove our existence and multiplicity results by the method of sub and supersolutions and our uniqueness result by establishing apriori and boundary estimates. Such results are well known in the literature for the nonsingular case. In this study, we extend these results to the more difficult singular case. sub-supersolutions apriori estimate uniqueness multiplicity existence positive solution singular boundary value problems p-Laplacian operator

Search results