  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
161

Smart Meters Big Data : Behavioral Analytics via Incremental Data Mining and Visualization

Singh, Shailendra January 2016
The big data framework applied to smart meters offers an exceptional platform for data-driven forecasting and decision making to achieve sustainable energy efficiency. Winning consumer confidence by respecting occupants' energy consumption behavior and preferences, and thereby improving participation in various energy programs, is imperative but difficult to achieve. The key elements for understanding and predicting household energy consumption are the activities occupants perform, the appliances used and the times at which they are used, and inter-appliance dependencies. This information can be extracted from the context-rich big data from smart meters, although this is challenging because: (1) it is not trivial to mine complex interdependencies between appliances from multiple concurrent data streams; (2) it is difficult to derive accurate relationships between interval-based events where multiple appliance usages persist; (3) continuous generation of the energy consumption data can trigger changes in appliance associations with time and appliances. To overcome these challenges, we propose an unsupervised progressive incremental data mining technique using frequent pattern mining (appliance-appliance associations) and cluster analysis (appliance-time associations) coupled with a Bayesian network based prediction model. The proposed technique addresses the need to analyze temporal energy consumption patterns at the appliance level, which directly reflect consumers' behaviors and provide a basis for generalizing household energy models. Extensive experiments were performed on the model with real-world datasets and strong associations were discovered. The accuracy of the proposed model for predicting multiple appliance usage outperformed a support vector machine at every stage, attaining accuracies of 81.65%, 85.90% and 89.58% for 25%, 50% and 75% of the training dataset size, respectively. Moreover, accuracy results of 81.89%, 75.88%, 79.23%, 74.74%, and 72.81% were obtained for short-term (hours) and long-term (day, week, month, and season) energy consumption forecasts, respectively.
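The appliance-appliance associations mentioned in this abstract can be illustrated with a minimal support/confidence computation over item pairs. This is only a sketch under stated assumptions: the function name `pair_rules`, the thresholds, and the toy usage windows are hypothetical, and the thesis's incremental, progressive mining procedure is not reproduced here.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.5, min_confidence=0.7):
    """Return (antecedent, consequent, support, confidence) for item pairs.

    Each transaction is the set of appliances active in one time window
    (a hypothetical encoding chosen for this illustration).
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # the pair is not frequent enough
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return sorted(rules)


# Toy usage windows (made-up data, not from the thesis).
windows = [
    {"washer", "dryer"},
    {"washer", "dryer", "tv"},
    {"tv", "oven"},
    {"washer", "dryer"},
]
rules = pair_rules(windows)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: support={support:.2f}, confidence={confidence:.2f}")
```

Restricting the search to pairs keeps the sketch short; a full frequent pattern miner would also consider larger itemsets.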
162

Možnosti prezentace výsledků DZD na webu / Options of presentation of KDD results on Web

Koválik, Tomáš January 2015
This diploma thesis covers KDD analysis of data and options for presenting KDD results on the Web. The thesis is divided into three main sections that follow the course of the work. The first section covers the theoretical basics needed to understand the problem: the notions of data matrix and domain knowledge, the CRISP-DM methodology, the GUHA method, the LISp-Miner system, and the implementation of the GUHA method in LISp-Miner, including the core procedures 4ft-Miner and CF-Miner. The second section is dedicated to the first goal of the thesis. It briefly summarizes the analysis made during the pre-analysis phase and then describes the analysis of domain knowledge in a given data set. The third section focuses on the second goal of the thesis, the presentation of KDD results on the Web. It gives a brief theoretical basis for the technologies used, then describes the development of an export script that automatically generates a website from results found with the LISp-Miner system, including the structure of the output and recommendations for working in LISp-Miner.
163

Pokročilé dolování v datech v kardiologii / Advanced Data Mining in Cardiology

Mézl, Martin January 2009
The aim of this master's thesis is to analyse and search for unusual dependencies in a database of patients from the Internal Cardiology Clinic, Faculty Hospital Brno. Part of the work is a theoretical overview of common data mining methods used in medicine, especially decision trees, the naive Bayesian classifier, artificial neural networks and association rules. The search for unusual dependencies between attributes is carried out with association rules and the naive Bayesian classifier. The output of this work is a complex system that supports the knowledge discovery in databases process for any data set. This work was carried out in collaboration with the Internal Cardiology Clinic, Faculty Hospital Brno. All programs were written in Matlab 7.0.1.
164

Database Support for 3D-Protein Data Set Analysis

Lehner, Wolfgang, Hinneburg, Alexander 25 May 2022
Progress in genome research demands an adequate infrastructure to analyze the data sets. Database systems are a key technology for organizing data and speeding up the analysis process. This paper discusses the role of a relational database system, using the problem of finding frequent substructures in multi-dimensional protein databases as an example. The specific problem consists of producing a set of association rules regarding frequent substructures with different lengths and gaps between the amino acid residues of a protein. From a database point of view, the process of finding association rules, which form the base for a more in-depth analysis of the data material, is split into two parts. The first part performs a discretization of the conformational angle space of a single amino acid residue by computing the nearest neighbor within a given set of representatives. The second part consists of adapting a well-known association rule algorithm to determine the frequent substructures. Both steps within this comprehensive analysis task require substantial support from the underlying database in order to reduce the programming overhead at the application level.
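The discretization step described in this abstract, mapping each residue's conformational angles to its nearest representative, might look roughly like the following. This is a sketch, assuming (phi, psi) angle pairs in degrees and a wrap-around Euclidean distance; the representative values and function names are hypothetical, and the paper performs this step inside the database rather than in application code.

```python
import math


def angular_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def nearest_representative(angles, representatives):
    """Index of the representative closest to a (phi, psi) pair."""
    phi, psi = angles

    def dist(rep):
        # Euclidean distance on the torus of angle pairs (an assumption).
        return math.hypot(angular_diff(phi, rep[0]), angular_diff(psi, rep[1]))

    return min(range(len(representatives)), key=lambda i: dist(representatives[i]))


# Hypothetical representatives and residues, for illustration only.
REPRESENTATIVES = [(-60.0, -45.0), (-120.0, 130.0), (60.0, 45.0)]
residues = [(-65.0, -40.0), (-130.0, 140.0), (175.0, -170.0), (55.0, 50.0)]

# Each residue becomes a discrete symbol that a rule miner can treat as an item.
symbols = [nearest_representative(r, REPRESENTATIVES) for r in residues]
print(symbols)
```

The wrap-around distance matters because angles such as 175 and -175 degrees are only 10 degrees apart.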
165

Exploiting Graphic Card Processor Technology to Accelerate Data Mining Queries in SAP NetWeaver BIA

Lehner, Wolfgang, Weyerhaeuser, Christoph, Mindnich, Tobias, Faerber, Franz 15 June 2022
Within business intelligence contexts, the importance of data mining algorithms is continuously increasing, particularly from the perspective of applications and users that demand novel algorithms on the one hand and an efficient implementation exploiting novel system architectures on the other hand. Within this paper, we focus on the latter issue and report our experience with the exploitation of graphic card processor technology within the SAP NetWeaver business intelligence accelerator (BIA). The BIA is a highly distributed analytical engine that supports OLAP and data mining processing primitives. The system organizes data entities in a column-wise fashion and its operation is completely main-memory-based. Since case studies have shown that classic data mining queries spend a large portion of their runtime on scanning and filtering the data as a necessary prerequisite to the actual mining step, our main goal was to speed up this expensive scanning and filtering process. In a first step, the paper outlines the basic data mining processing techniques within SAP NetWeaver BIA and illustrates the implementation of scans and filters. In a second step, we give insight into the main features of a hybrid system architecture design exploiting graphic card processor technology. Finally, we sketch the implementation and give details of our extensive evaluations.
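The scan-and-filter step that this abstract identifies as the expensive prerequisite to mining can be pictured with a tiny column-store model. This is a sketch under stated assumptions: the table layout, the `scan_filter` helper, and the data are hypothetical, and it says nothing about how SAP NetWeaver BIA or its graphics-processor offloading is actually implemented.

```python
def scan_filter(column, predicate):
    """Scan one column and return the row ids whose value passes the predicate."""
    return [row_id for row_id, value in enumerate(column) if predicate(value)]


# A column-wise table: each attribute is stored as its own list (toy data).
table = {
    "region": ["EU", "US", "EU", "ASIA"],
    "revenue": [120, 80, 200, 150],
}

# Each filter touches only the column it needs.
eu_rows = scan_filter(table["region"], lambda v: v == "EU")
big_rows = scan_filter(table["revenue"], lambda v: v >= 150)

# A conjunction of filters is an intersection of row-id lists.
rows = sorted(set(eu_rows) & set(big_rows))

# Only the surviving rows are materialized for the later mining step.
selected = [table["revenue"][i] for i in rows]
print(rows, selected)
```

Because each scan is an independent pass over a contiguous array, this kind of work is a natural candidate for parallel hardware.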
166

Implementace části standardu SQL/MM DM pro asociační pravidla / Implementation of SQL/MM DM for Association Rules

Škodík, Zdeněk Unknown Date
This project is concerned with knowledge discovery in databases, specifically with association rules, which are part of data mining. In this way we try to obtain knowledge that cannot be found directly in the database and that can be useful. The thesis describes SQL/MM DM, in particular all user-defined types that the standard defines for association rules, as well as the common types that form the framework for data mining. Before describing the implementation of these types, it presents the tools used for it: the PL/SQL programming language and Oracle Data Mining support. The correctness of the implementation is verified by a sample application. The conclusion evaluates the results achieved and mentions a possible continuation of this work.
167

Data mining v oblasti kurzového sázení 3. anglické fotbalové ligy / Data Mining in the Field of English Football League Third Division's Betting Odds

Faruzel, Jiří January 2009
The thesis "Data Mining in the Field of English Football League Third Division's Betting Odds" deals with data mining, that is, with acquiring knowledge from data. The main objective of this work is to develop data models for the prediction of match results and to compare these predictions with a chosen betting strategy. The selected betting strategy is based on placing single bets with odds belonging to chosen intervals that generate a profit. These odds intervals were discovered by analyzing 2006-2009 football matches in a purpose-built simulator. On the basis of these odds ranges, data models were constructed. Each data model contains a hypothesis generated by the SD4ft procedure of LispMiner from all football matches played in seasons 2001-2008. The developed data models are then tested on 2006-2009 football match data. Results show that all derived data models are profitable in all four seasons under consideration. More than half of them successfully predicted 2009 matches as well. The analysis showed that betting agencies mostly offer odds that make it almost impossible to be profitable when betting on matches according to their odds. In spite of this, I identified some odds intervals with which one can succeed when placing single bets on the home team, a draw or the visiting team with odds falling within these intervals. Association rules with reasonable confidence and support can generate high profitability. It is important to realize that there are no data models that guarantee a certain profit. Most of the developed data models are not applicable in the real world, and some of them can actually generate a loss. Nevertheless, there are data models to be found that could generate a profit in the real world.
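The odds-interval strategy in this abstract can be illustrated by a small flat-stake simulation. This is a sketch, assuming decimal odds, a fixed stake per bet and a half-open interval; the function name, the match records and the interval bounds are made up for illustration and do not reproduce the thesis's simulator.

```python
def interval_profit(matches, outcome, low, high, stake=1.0):
    """Net profit of flat-stake single bets on `outcome` with odds in [low, high).

    Returns (number_of_bets, profit). Decimal odds are assumed, so a winning
    bet returns stake * (odds - 1) and a losing bet costs the stake.
    """
    bets = 0
    profit = 0.0
    for match in matches:
        odds = match["odds"][outcome]
        if low <= odds < high:
            bets += 1
            if match["result"] == outcome:
                profit += stake * (odds - 1.0)
            else:
                profit -= stake
    return bets, profit


# Made-up matches: decimal odds for home win, draw and away win.
matches = [
    {"odds": {"home": 2.0, "draw": 3.2, "away": 3.5}, "result": "home"},
    {"odds": {"home": 2.5, "draw": 3.0, "away": 2.5}, "result": "away"},
    {"odds": {"home": 1.5, "draw": 4.0, "away": 6.0}, "result": "home"},
    {"odds": {"home": 2.2, "draw": 3.1, "away": 3.0}, "result": "draw"},
]

for outcome, low, high in [("home", 2.0, 3.0), ("draw", 3.0, 3.5)]:
    bets, profit = interval_profit(matches, outcome, low, high)
    print(f"{outcome} odds in [{low}, {high}): {bets} bets, profit {profit:+.2f}")
```

Scanning many candidate intervals on past seasons and keeping those with positive profit is one way to arrive at intervals like the ones the abstract mentions; the profit on held-out seasons is what matters.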
168

Metodika vývoje a nasazování Business Intelligence v malých a středních podnicích / Methodology of development and deployment of Business Intelligence solutions in Small and Medium Sized Enterprises

Rydzi, Daniel January 2005
This dissertation deals with the development and implementation of Business Intelligence (BI) solutions for Small and Medium Sized Enterprises (SME) in the Czech Republic. It is the culmination of the author's effort to date to complete a methodological model for developing this kind of application in SMEs using in-house skills and a minimum of external resources and costs. The thesis can be divided into five major parts. The first part, which describes the technologies used, is divided into two chapters. The first chapter describes the contemporary state of the Business Intelligence concept and also contains an original taxonomy of Business Intelligence solutions. The second chapter describes the two Knowledge Discovery in Databases (KDD) techniques that were used to build the BI solutions introduced in the case studies. The second part describes the area of Czech SMEs, the environment in which the thesis was written and to which it is meant to contribute. This environment is covered in one chapter that defines the differences between SMEs and large corporations. The author also explains why he personally focuses on this area. The third major part introduces the results of a survey conducted among Czech SMEs with the support of the Department of Information Technologies of the Faculty of Informatics and Statistics of the University of Economics in Prague. This survey had three objectives. The first was to map the readiness of Czech SMEs for BI solution development and deployment. The second was to determine the major problems and consequent decisions of Czech SMEs that could be supported by BI solutions, and the third was to determine the top factors preventing SMEs from developing and deploying BI solutions. The fourth part is the core of the thesis. Its two chapters describe the original methodology for the development and deployment of BI solutions by SMEs, as well as the other methodologies that were studied. The original methodology is partly based on the well-known CRISP-DM methodology. Finally, the last part describes the particular company that became a testing ground for the author's theories and that supports his research. Further chapters introduce case studies of the development and deployment of BI solutions in this company, which were built using contemporary BI and KDD techniques in accordance with the original methodology. In that sense, these case studies verified the theoretical methodology in real use.
169

透過圖片標籤觀察情緒字詞與事物概念之關聯 / An analysis on association between emotion words and concept words based on image tags

彭聲揚, Peng, Sheng-Yang Unknown Date
Starting from psychology, this study explores how emotional states can be classified. To link emotion with semantics, we treat images as the stimulus source of emotional states and sample and observe content co-created and shared by the Flickr community. Using basic emotion words from psychological research and their part-of-speech variations, we extracted 12,000 tagged photos and computed the co-occurrence of tag words with emotion-category words, as well as association rules. At the same time, using the semantic differential scale, we propose a new coordinate classification method based on bias and intensity. Through frequency-threshold filtering, part-of-speech tagging and stem-based merging of words, we reduced 65,983 distinct text tags to 272 concept words that carry an emotional bias, together with positively and negatively biased emotion association rules. To verify through images whether these words are related to the emotional states that image content evokes in people, we used three query channels to select the final 42 photos for comparison: Flickr single-word queries, google image single-word queries, and our composite photo-tag indicators (the proportion of emotion words and community filtering parameters). Using the semantic differential scale, we measured whether the answers of 136 users for the three groups of photos matched the intensity-bias model proposed earlier. The experimental results show that our method returns results similar to those of google image. The user survey supports our method's determination of positive and negative bias, and our method separates strong from weak intensity better than google does.
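The co-occurrence step described in this abstract, counting how often concept tags appear alongside emotion words, could be sketched as follows. Everything here is an assumption for illustration: the emotion word lists, the `emotion_bias` function, the frequency threshold, and in particular the bias score (positive minus negative co-occurrences, divided by their sum), which is not taken from the thesis.

```python
from collections import Counter

# Hypothetical emotion vocabularies; the thesis uses word lists from psychology.
POSITIVE = {"happy", "joy"}
NEGATIVE = {"sad", "fear"}


def emotion_bias(photos, min_count=2):
    """Map each frequent concept tag to a bias score in [-1, 1]."""
    pos = Counter()
    neg = Counter()
    for tags in photos:
        tags = set(tags)
        concepts = tags - POSITIVE - NEGATIVE
        if tags & POSITIVE:
            pos.update(concepts)  # concept co-occurs with a positive word
        if tags & NEGATIVE:
            neg.update(concepts)  # concept co-occurs with a negative word

    bias = {}
    for word in set(pos) | set(neg):
        total = pos[word] + neg[word]
        if total >= min_count:  # frequency threshold filter
            bias[word] = (pos[word] - neg[word]) / total
    return bias


# Toy tag sets standing in for tagged photos.
photos = [
    {"beach", "happy", "sun"},
    {"beach", "joy"},
    {"rain", "sad"},
    {"rain", "fear", "night"},
    {"beach", "sad"},
]
bias = emotion_bias(photos)
for word in sorted(bias):
    print(f"{word}: {bias[word]:+.2f}")
```

Tags that fall below the threshold, such as `sun` and `night` in the toy data, are dropped rather than assigned an unreliable score.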
170

Development of an intelligent analytics-based model for product sales optimisation in retail enterprises

Matobobo, Courage 03 July 2016
A retail enterprise is a business organisation that sells goods or services directly to consumers for personal use. Retail enterprises such as supermarkets enable customers to go around the shop picking items from the shelves and placing them into their baskets. The basket of each customer is captured in transactional systems. In this research study, retail enterprises were classified into two main categories: centralised and distributed retail enterprises. A distributed retail enterprise is one that issues the decision rights to the branches or groups nearest to the data collection, while in centralised retail enterprises the decision rights of the branches are concentrated in a single authority. It is difficult for retail enterprises to ascertain customer preferences by merely observing transactions. This has led to quantifiable losses. Although some enterprises implemented classical business models to address these challenging issues, they still lacked analytics-based marketing programs to gain competitive advantage. This research study develops an intelligent analytics-based (ARANN) model for both distributed and centralised retail enterprises in the cross-demographics of a developing country. The ARANN model is built on association rules (AR), complemented by artificial neural networks (ANN) to strengthen the results of these two individual models. The ARANN model was tested using real-life and publicly available transactional datasets for the generation of product arrangement sets. In centralised retail enterprises, the data from different branches was integrated and pre-processed to remove data impurities. The cleaned data was then fed into the ARANN model. In distributed retail enterprises, on the other hand, data was collected and cleaned branch by branch. The cleaned data was fed into the ARANN model. 
According to experimental analytics, the ARANN model can generate improved product arrangement sets, thereby improving the confidence of retail enterprise decision-makers in competitive environments. It was also observed that the ARANN model performed faster in distributed than in centralised retail enterprises. This research is beneficial for sustainable businesses and consideration of the results is therefore recommended to retail enterprises. / Computing / M Sc. (Computing)
