11

Implementation of the Apriori algorithm for effective item set mining in VigiBase™ : Project report in Teknisk Fysik 15 hp

Olofsson, Niklas January 2010 (has links)
No description available.
12

Van apriorisme tot positivisme in de fysica / From apriorism to positivism in physics

Sjoerdsma, Wijmer. January 1900 (has links)
Academic dissertation--Amsterdam. / Bibliography: p. 147.
14

Une approche à base de règles d'association pour l'explication et la prévision de l'évolution territoriale / An association rule-based approach for the explanation and prediction of territorial evolution

Gharbi, Asma 13 February 2018 (has links)
Dans ce mémoire, nous partons de l'hypothèse que les dynamiques spatiales et les évolutions des usages des objets géographiques peuvent, en partie, être expliquées ou anticipées par leurs historiques de changements de fonctions et de co-localisations. Nous proposons d'exploiter la recherche des motifs fréquents et des règles d'associations pour en extraire des règles régissant ces dynamiques. Ce travail adapte également le processus de fouille de données pour tenir compte de la spécificité des données spatio-temporelles utilisées, en particulier, leur asymétrie. Dans ce contexte, notre proposition traite des questions liées à la modélisation des relations spatio-temporelles incorporées dans le jeu de données et à la représentation adéquate des données d'apprentissage, pour ainsi produire des règles adaptées à notre problème de prédiction. La prise en compte de l'asymétrie des attributs d'apprentissage en termes de fréquence est traitée selon deux approches : une approche utilisant plusieurs seuils de support minimum et une approche traitant disjointement les attributs. Pour la première approche, deux adaptations de l'algorithme MSApriori ont été proposées pour la définition et l'affectation de ces seuils. Pour la seconde, nous proposons l'algorithme BERA pour la génération de règles en allant de la construction de la conclusion vers la construction des prémisses. Afin de vérifier et d'évaluer nos propositions, nous présentons une étude expérimentale menée sur différents jeux de données issus des données Corine Land Cover dans le cadre d'un dispositif expérimental appelé SAFFIET. / In this dissertation, we start from the hypothesis that spatial dynamics and the evolution of geographical objects' usage may partially be explained or predicted by their previous spatial configurations. We therefore propose to exploit frequent pattern mining and association rule mining in order to extract rules governing these dynamics. This work also adapts the data mining process to take into account the specificity of the spatiotemporal data used, in particular their asymmetry. In this context, our proposal deals with questions related to the modeling of the spatiotemporal relations incorporated in the data set and the adequate representation of the learning data, so as to produce rules adapted to our prediction problem. The asymmetry of the learning attributes, mainly in terms of their frequencies, is tackled with two approaches: the first uses multiple minimum supports (minsup) and the second addresses the attributes disjointly. The first approach is based on two adaptations of the MSApriori algorithm for the definition and assignment of these thresholds. The second approach exploits the novel BERA algorithm, which relies on the semantics of the predicates to generate rules, going from the construction of the conclusion part to the construction of the premise part. In order to verify and evaluate our proposals, an experimental study is carried out on different datasets derived from Corine Land Cover, within an experimental tool called SAFFIET.
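The abstract above mentions two ways of handling attribute asymmetry: per-item minimum supports (in the spirit of MSApriori) and building rules from the conclusion toward the premise. The following is a minimal sketch of the first idea only, applied to 1- and 2-itemsets over toy transactions; the item names and thresholds are illustrative and are not taken from the thesis.

from itertools import combinations

# Toy transactions: each is a set of "attribute=value" items (illustrative names).
transactions = [
    {"landuse=urban", "near=road", "evolved=densified"},
    {"landuse=urban", "near=road"},
    {"landuse=farm", "near=river", "evolved=densified"},
    {"landuse=farm", "near=river"},
    {"landuse=urban", "near=river", "evolved=densified"},
]

# Per-item minimum supports (MIS): rare but important items get a lower threshold.
mis = {"evolved=densified": 0.2}   # rare consequent (assumed value)
default_mis = 0.4                  # everything else (assumed value)

def support(itemset):
    return sum(itemset <= t for t in transactions) / len(transactions)

def min_support(itemset):
    # MSApriori-style rule: an itemset's threshold is the smallest MIS of its items.
    return min(mis.get(i, default_mis) for i in itemset)

items = sorted(set().union(*transactions))
frequent_1 = [frozenset([i]) for i in items
              if support(frozenset([i])) >= min_support(frozenset([i]))]
frequent_2 = [frozenset(p) for p in combinations(items, 2)
              if support(frozenset(p)) >= min_support(frozenset(p))]

print("frequent 1-itemsets:", frequent_1)
print("frequent 2-itemsets:", frequent_2)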
15

Previsão de incidência de Dengue nas cidades brasileiras através de Inteligência Artificial / Prediction of Dengue incidence in Brazilian cities using Artificial Intelligence

Rezende Brasil Neto, Carlos 31 January 2009 (has links)
Dengue is a tropical disease with a high impact on Brazilian society. This work presents a method to estimate the level of incidence of the disease based on demographic and socioeconomic data at the municipal level. A Multi-Layer Perceptron (MLP) neural network estimated the risk of a high incidence level, and the rules induced by an adapted version of the Apriori algorithm offered a potential explanation for cities having a high or low incidence level. The data were collected in 2002 and cover about 5,500 cities (all Brazilian cities). The results on a statistically independent test set showed high performance of the risk estimates and good quality of the induced rules.
16

A distributed approach to Frequent Itemset Mining at low support levels

Clark, Neal 22 December 2014 (has links)
Frequent Itemset Mining, the process of finding frequently co-occurring sets of items in a dataset, has been at the core of the field of data mining for the past 25 years. During this time the datasets have grown much faster than the algorithms' capacity to process them. Great progress has been made at optimizing this task on a single computer; however, despite years of research, very little progress has been made on parallelizing it. FPGrowth-based algorithms have proven notoriously difficult to parallelize and Apriori has largely fallen out of favor with the research community. In this thesis we introduce a parallel, Apriori-based, Frequent Itemset Mining algorithm capable of distributing computation across large commodity clusters. Our case study demonstrates that our algorithm can efficiently scale to hundreds of cores on a standard Hadoop MapReduce cluster, and can improve execution times by at least an order of magnitude at the lowest support levels. / Graduate
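As a rough illustration of how one Apriori support-counting pass maps onto the MapReduce model the abstract refers to, here is a sketch written as plain map and reduce functions in Python; the chunking, the candidate set, and the threshold are all invented for illustration and do not reproduce the thesis's actual implementation.

from collections import Counter
from itertools import combinations

def map_count(candidates, transaction_chunk):
    """Mapper: count, per chunk, how many transactions contain each candidate."""
    counts = Counter()
    for t in transaction_chunk:
        for c in candidates:
            if c <= t:
                counts[c] += 1
    return counts

def reduce_counts(partial_counts, min_count):
    """Reducer: sum the per-chunk counts and keep only the frequent candidates."""
    total = Counter()
    for c in partial_counts:
        total.update(c)
    return {itemset: n for itemset, n in total.items() if n >= min_count}

# Toy data split into two "mapper" chunks.
chunks = [
    [frozenset("ab"), frozenset("abc"), frozenset("ac")],
    [frozenset("bc"), frozenset("abc"), frozenset("a")],
]
candidates = [frozenset(p) for p in combinations("abc", 2)]  # 2-itemset candidates
partials = [map_count(candidates, chunk) for chunk in chunks]
print(reduce_counts(partials, min_count=3))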
17

Displacement Convexity for First-Order Mean-Field Games

Seneci, Tommaso 01 May 2018 (has links)
In this thesis, we consider the planning problem for first-order mean-field games (MFG). These games degenerate into optimal transport when there is no coupling between players. Our aim is to extend the concept of displacement convexity from optimal transport to MFGs. This extension gives new estimates for solutions of MFGs. First, we introduce the Monge-Kantorovich problem and examine related results on rearrangement maps. Next, we present the concept of displacement convexity. Then, we derive first-order MFGs, which are given by a Hamilton-Jacobi equation coupled with a transport equation. Finally, we identify a large class of functions that depend on solutions of MFGs and are convex in time. Among these, we find several norms. This convexity gives bounds for the density of solutions of the planning problem.
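For readers unfamiliar with the system mentioned in the abstract, a standard form of the first-order MFG planning problem (a Hamilton-Jacobi equation coupled with a transport equation, with the density prescribed at both endpoints) can be written as follows; the notation is the usual one from the MFG literature and is not necessarily the exact one used in the thesis.

\[
\begin{cases}
-\partial_t u + H(x, D_x u) = g(m), \\
\partial_t m - \operatorname{div}\!\big(m\, D_p H(x, D_x u)\big) = 0, \\
m(\cdot,0) = m_0, \qquad m(\cdot,T) = m_T,
\end{cases}
\qquad (x,t) \in \mathbb{T}^d \times [0,T].
\]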
18

一個基於記憶體內運算之多維度多顆粒度資料探勘之研究-以yahoo user profile為例 / A Study of Multi-dimensional and Multi-granular Data Mining with In-memory Computing with Yahoo User Profile

林洸儂, Lin, Guang-Nung Unknown Date (has links)
近年來雲端運算技術的發展與電腦設備效能提升，使得以大量電腦主機以水平擴充的方式組成叢集運算系統，成為一可行的選擇。Apache Hadoop是Apache基金會的一個開源軟體框架，它是由Google公司的MapReduce與Google檔案系統實作成的分布式系統，可以管理數千台以上的電腦群集。Hadoop利用分散式檔案系統HDFS可以提供PB級以上的資料存放空間，透過MapReduce框架可以將應用程式分割成小工作分散到叢集中的運算節點上執行。此外，企業累積了巨量的資料，如何處理與分析這些結構化或者是非結構化的資料成了現在熱門研究的議題。因此傳統的資料挖掘方式與演算法必須因應新的雲端運算技術與分散式框架的概念，進行調整與改良，發展新的方法。關聯規則是分析資料庫龐大的資料中，項目之間隱含的關聯，常見的應用為購物籃分析。一般情形下會在特定的維度與特定的顆粒度範圍內挖掘關聯規則，但這樣的方式無法找出更細微範圍下之規則，例如挖掘一個年度的交易資料無法發現消費者在聖誕節為了慶祝而購買的商品項目間的規則，但若將時間限縮在12月份即可挖掘出這些規則。Apriori演算法是挖掘關聯規則的一個著名的演算法，透過產生候選項目集合與使用自訂的最小支持度進行篩選，產生高頻項目集合，接著以最小信賴度篩選獲得關聯規則的結果。若有k種單一項目集合，則候選項目集合最多有2^k − 1個，計算高頻項目時則需反覆掃描整個資料庫，Apriori這兩個主要步驟需要耗費相當大量的運算能力。因此本研究提出將資料庫分割成多個資料區塊挖掘關聯規則，再將結果逐步更新的演算法，以解決大範圍挖掘遺失關聯規則的問題，並結合Spark分散式運算的架構實作程式，在電腦群集上平行運算以減少關聯規則的挖掘時間。 / Because of improving cloud-computing techniques and the increasing capability of computer hardware, it has become feasible to build clusters by horizontally scaling out large numbers of commodity machines. Apache Hadoop is an open-source software framework of the Apache foundation. It provides cluster resource management, a distributed storage system named Hadoop Distributed File System (HDFS), and a parallel processing model called MapReduce. Enterprises have accumulated a huge amount of data, and processing and analyzing these structured or unstructured data is a hot research issue. Traditional data mining methods and algorithms must therefore be adjusted and improved for cloud-computing technology and the distributed-framework concept. Association rules describe relations among items in a large database. In general, association rules are mined over fixed dimensions and a fixed granularity of the database, but this can miss rules that only hold in narrower ranges. The Apriori algorithm is a well-known algorithm for mining association rules. Two of its main steps consume a large amount of computing resources: candidate generation, which can produce up to 2^k − 1 candidate itemsets when there are k distinct items, and support counting, which must scan all transactions in the database. This work divides the database into segments, mines association rules in each segment, and then combines the rules of the segments, which avoids losing itemsets that are only frequent locally. In addition, we implement the method on Spark to reduce the computation time.
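A very small sketch of the partition-then-merge idea described above (mine each segment of the database separately, then re-count the union of locally frequent itemsets globally) might look as follows in Python; the data, the segment boundaries and the thresholds are made up for illustration and the brute-force miner merely stands in for Apriori.

from itertools import combinations

def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force frequent itemsets up to max_size (stand-in for Apriori)."""
    items = sorted(set().union(*transactions))
    result = {}
    for k in range(1, max_size + 1):
        for cand in combinations(items, k):
            s = sum(set(cand) <= t for t in transactions) / len(transactions)
            if s >= min_support:
                result[frozenset(cand)] = s
    return result

# One year of toy transactions, split into segments.
segments = {
    "jan-nov": [{"milk", "bread"}, {"milk"}, {"bread"}] * 4,
    "dec":     [{"tree", "lights"}, {"tree", "lights", "milk"}, {"tree"}],
}

# Locally frequent itemsets per segment: seasonal patterns survive here.
local = {name: frequent_itemsets(tx, min_support=0.5) for name, tx in segments.items()}

# Merge step: re-count every locally frequent itemset over the whole database.
all_tx = [t for tx in segments.values() for t in tx]
candidates = set().union(*(d.keys() for d in local.values()))
merged = {c: sum(c <= t for t in all_tx) / len(all_tx) for c in candidates}

print("december-only pattern:", local["dec"].get(frozenset({"tree", "lights"})))
print("its global support:", merged.get(frozenset({"tree", "lights"})))

Running this shows the {tree, lights} pattern clearing the threshold inside the December segment while its support over the whole year stays far below it, which is exactly the effect the abstract's Christmas example describes.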
19

Advancing the discovery of unique column combinations

Abedjan, Ziawasch, Naumann, Felix January 2011 (has links)
Unique column combinations of a relational database table are sets of columns that contain only unique values. Discovering such combinations is a fundamental research problem and has many different data management and knowledge discovery applications. Existing discovery algorithms are either brute force or have a high memory load and can thus be applied only to small datasets or samples. In this paper, the well-known GORDIAN algorithm and "Apriori-based" algorithms are compared and analyzed for further optimization. We greatly improve the Apriori algorithms through efficient candidate generation and statistics-based pruning methods. A hybrid solution, HCA-GORDIAN, combines the advantages of GORDIAN and our new algorithm HCA, and it significantly outperforms all previous work in many situations. / Unique-Spaltenkombinationen sind Spaltenkombinationen einer Datenbanktabelle, die nur einzigartige Werte beinhalten. Das Finden von Unique-Spaltenkombinationen spielt sowohl eine wichtige Rolle im Bereich der Grundlagenforschung von Informationssystemen als auch in Anwendungsgebieten wie dem Datenmanagement und der Erkenntnisgewinnung aus Datenbeständen. Vorhandene Algorithmen, die dieses Problem angehen, sind entweder Brute-Force oder benötigen zu viel Hauptspeicher. Deshalb können diese Algorithmen nur auf kleine Datenmengen angewendet werden. In dieser Arbeit werden der bekannte GORDIAN-Algorithmus und Apriori-basierte Algorithmen zum Zwecke weiterer Optimierung analysiert. Wir verbessern die Apriori-Algorithmen durch eine effiziente Kandidatengenerierung und heuristikbasierte Kandidatenfilter. Eine hybride Lösung, HCA-GORDIAN, kombiniert die Vorteile von GORDIAN und unserem neuen Algorithmus HCA und übertrifft die bisherigen Algorithmen hinsichtlich der Effizienz in vielen Situationen.
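To make the problem statement concrete, here is a brute-force sketch of unique column combination discovery over a toy table (project each column combination and check for duplicate row tuples); this is the naive baseline the paper improves on, not the GORDIAN or HCA algorithm itself, and the table contents are invented.

from itertools import combinations

# Toy relation: column name -> values per row (illustrative data).
table = {
    "first": ["ada",  "alan", "ada",  "grace"],
    "last":  ["l.",   "t.",   "k.",   "h."],
    "year":  [1815,   1912,   1815,   1906],
}

def is_unique(columns):
    """A column combination is unique iff its projected row tuples have no duplicates."""
    rows = list(zip(*(table[c] for c in columns)))
    return len(rows) == len(set(rows))

uniques = []
for k in range(1, len(table) + 1):
    for combo in combinations(table, k):
        # Report only minimal uniques: skip supersets of an already found unique.
        if is_unique(combo) and not any(set(u) < set(combo) for u in uniques):
            uniques.append(combo)

print("minimal unique column combinations:", uniques)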
20

Análise associativa: identificação de padrões de associação entre o perfil socioeconômico dos alunos do ensino básico e os resultados nas provas de matemática / Association analysis: identification of association patterns between the socioeconomic profile of primary school students and their results in mathematics tests

Lyvia Aloquio 20 February 2014 (has links)
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / Nos dias atuais, a maioria das operações feitas por empresas e organizações é armazenada em bancos de dados que podem ser explorados por pesquisadores com o objetivo de se obter informações úteis para auxílio da tomada de decisão. Devido ao grande volume envolvido, a extração e análise dos dados não é uma tarefa simples. O processo geral de conversão de dados brutos em informações úteis chama-se Descoberta de Conhecimento em Bancos de Dados (KDD - Knowledge Discovery in Databases). Uma das etapas deste processo é a Mineração de Dados (Data Mining), que consiste na aplicação de algoritmos e técnicas estatísticas para explorar informações contidas implicitamente em grandes bancos de dados. Muitas áreas utilizam o processo KDD para facilitar o reconhecimento de padrões ou modelos em suas bases de informações. Este trabalho apresenta uma aplicação prática do processo KDD utilizando a base de dados de alunos do 9 ano do ensino básico do Estado do Rio de Janeiro, disponibilizada no site do INEP, com o objetivo de descobrir padrões interessantes entre o perfil socioeconômico do aluno e seu desempenho obtido em Matemática na Prova Brasil 2011. Neste trabalho, utilizando-se da ferramenta chamada Weka (Waikato Environment for Knowledge Analysis), foi aplicada a tarefa de mineração de dados conhecida como associação, onde se extraiu regras por intermédio do algoritmo Apriori. Neste estudo foi possível descobrir, por exemplo, que alunos que já foram reprovados uma vez tendem a tirar uma nota inferior na prova de matemática, assim como alunos que nunca foram reprovados tiveram um melhor desempenho. Outros fatores, como a sua pretensão futura, a escolaridade dos pais, a preferência de matemática, o grupo étnico o qual o aluno pertence, se o aluno lê sites frequentemente, também influenciam positivamente ou negativamente no aprendizado do discente. Também foi feita uma análise de acordo com a infraestrutura da escola onde o aluno estuda e com isso, pôde-se afirmar que os padrões descobertos ocorrem independentemente se estes alunos estudam em escolas que possuem infraestrutura boa ou ruim. Os resultados obtidos podem ser utilizados para traçar perfis de estudantes que tem um melhor ou um pior desempenho em matemática e para a elaboração de políticas públicas na área de educação, voltadas ao ensino fundamental. / Nowadays, most of the transactions made by companies and organizations is stored in databases that can be explored by researchers in order to obtain useful information to aid decision making. Due to the large volume involved, the extraction and analysis of data is not a simple task. The general process of converting raw data into useful information is called Knowledge Discovery in Databases (KDD). One step in this process is the Data Mining, which involves the application of algorithms and statistical techniques to exploit information contained implicitly in large databases. Many areas use the KDD process to facilitate the recognition of patterns or models on their bases of information. This work presents a practical application of KDD process using the database of students in the 9th grade of elementary education in the State of Rio de Janeiro, available in INEP site, with the aim of finding interesting patterns between the socioeconomic profile of the student and his/her performance obtained in Mathematics. The tool called Weka was used and the Apriori algorithm was applied to extracting association rules. 
This study revealed, for example, that students who had already been held back a year tended to get a lower score on the mathematics test, while students who had never been held back performed better. Other factors, such as the student's future aspirations, ethnic group, parents' schooling, enjoyment of mathematics, and how frequently the student accesses the Internet, also influence learning positively or negatively. An analysis of the schools' infrastructure was also carried out, leading to the conclusion that the discovered patterns hold regardless of whether the students attend schools with good or poor infrastructure. The results obtained can be used to trace the profiles of students who perform better or worse in mathematics and to support the design of public education policies aimed at elementary education.
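As a concrete reminder of what the extracted association rules measure, here is a tiny Python sketch computing support and confidence for one rule of the kind described above; the feature values and the rule itself are invented for illustration, and the study used Weka's Apriori implementation rather than this code.

# Each record is a set of discretized questionnaire/score items (illustrative values).
students = [
    {"held_back=yes", "math=low"},
    {"held_back=yes", "math=low"},
    {"held_back=no",  "math=high"},
    {"held_back=no",  "math=high"},
    {"held_back=no",  "math=low"},
]

def support(itemset):
    return sum(itemset <= s for s in students) / len(students)

def confidence(antecedent, consequent):
    return support(antecedent | consequent) / support(antecedent)

rule_lhs = frozenset({"held_back=yes"})
rule_rhs = frozenset({"math=low"})
print("support:", support(rule_lhs | rule_rhs))        # 2/5 = 0.4
print("confidence:", confidence(rule_lhs, rule_rhs))   # 2/2 = 1.0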
