71.
Dolování asociačních pravidel z datových skladů / Association Rules Mining over Data Warehouses. Hlavička, Ladislav. January 2009.
This thesis deals with association rules mining over data warehouses. The first part familiarizes the reader with terms such as knowledge discovery in databases and data mining. The next part deals with data warehouses. Association analysis, association rules, their types, and the possibilities of mining them are then described, and the architecture of Microsoft SQL Server and its tools for working with data warehouses are presented. The rest of the thesis covers the description and analysis of the Star-miner algorithm and the design, implementation, and testing of the application.
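The support and confidence measures at the core of association rule mining can be illustrated with a short sketch (a toy example with invented basket data, not the Star-miner algorithm itself):

```python
from itertools import combinations

def mine_pair_rules(transactions, min_support=0.5, min_confidence=0.7):
    """Mine two-item association rules (A -> B) from a list of transactions."""
    n = len(transactions)
    sets = [set(t) for t in transactions]

    def support(itemset):
        # Fraction of transactions containing every item in the itemset.
        return sum(1 for t in sets if itemset <= t) / n

    items = {i for t in sets for i in t}
    rules = []
    for a, b in combinations(sorted(items), 2):
        for lhs, rhs in ((a, b), (b, a)):
            sup = support({lhs, rhs})
            if sup >= min_support:
                conf = sup / support({lhs})
                if conf >= min_confidence:
                    rules.append((lhs, rhs, round(sup, 2), round(conf, 2)))
    return rules

baskets = [["bread", "milk"], ["bread", "butter", "milk"],
           ["bread", "milk"], ["butter", "tea"]]
print(mine_pair_rules(baskets))
# -> [('bread', 'milk', 0.75, 1.0), ('milk', 'bread', 0.75, 1.0)]
```

Real miners such as Apriori prune the candidate space instead of enumerating all pairs, but the rule-acceptance test (support and confidence thresholds) is the same.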
72.
Získávání znalostí z multimediálních databází / Knowledge Discovery in Multimedia Databases. Jirmásek, Tomáš. Unknown Date.
This master's thesis deals with knowledge discovery in databases; in particular, it describes the basic classification and prediction methods used in data mining. The next chapter introduces multimedia databases and knowledge discovery in multimedia databases, with a focus on the extraction of low-level features from video data and images. The following parts describe the data set and the results of experiments with RapidMiner, LibSVM, and an application developed by the author. The last chapter summarizes the results of the methods used for extracting high-level features from the low-level description of the data.
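As a toy illustration of low-level feature extraction (not the descriptors actually used in the thesis), a normalized intensity histogram is one of the simplest image features:

```python
def intensity_histogram(image, bins=4):
    """Normalized intensity histogram: a basic low-level image feature.
    Pixel values are assumed to lie in 0..255."""
    hist = [0] * bins
    total = 0
    for row in image:
        for px in row:
            # Map the pixel value to one of `bins` equal-width buckets.
            hist[min(px * bins // 256, bins - 1)] += 1
            total += 1
    return [count / total for count in hist]

# A 2x4 toy "image": half dark pixels, half bright pixels.
img = [[0, 10, 240, 250],
       [5, 20, 230, 255]]
print(intensity_histogram(img))  # -> [0.5, 0.0, 0.0, 0.5]
```

Classifiers such as the SVMs in LibSVM then operate on feature vectors of this kind rather than on raw pixels.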
73.
Vytvoření modulu pro dolování dat z databází / Creation of Unit for Datamining. Krásenský, David. Unknown Date.
The goal of this work is to create a data mining module for the Belinda information system. Data from a database of clients are analyzed using SAS Enterprise Miner, and the results obtained with several data mining methods are compared. In the second phase, the selected data mining method is implemented as a module of the Belinda information system. The final part of the work evaluates the results achieved and the possibilities of using the module.
74.
Rozšíření funkcionality systému pro dolování z dat na platformě NetBeans / Functionality Extension of Data Mining System on NetBeans Platform. Šebek, Michal. January 2009.
Databases continually grow with new data. The process called Knowledge Discovery in Databases has been defined for analyzing these data, and complex systems have been developed to support it. This thesis describes the development of one such system. The main goal is to analyze the current state of its implementation, which is based on the Java NetBeans Platform and the Oracle database system, and to extend it with data preprocessing algorithms and source data analysis. The implementation of the data preprocessing components and the changes to the system's kernel are described in detail.
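Two textbook preprocessing steps of the kind such a module might provide can be sketched as follows (mean imputation and min-max normalization; a hedged illustration, not the system's actual components):

```python
def preprocess(column):
    """Impute missing values (None) with the column mean, then
    min-max normalize to [0, 1]: two classic preprocessing steps."""
    known = [v for v in column if v is not None]
    mean = sum(known) / len(known)
    filled = [mean if v is None else v for v in column]
    lo, hi = min(filled), max(filled)
    if hi == lo:
        # A constant column carries no information after scaling.
        return [0.0] * len(filled)
    return [(v - lo) / (hi - lo) for v in filled]

print(preprocess([10, None, 30, 20]))  # -> [0.0, 0.5, 1.0, 0.5]
```

Normalization like this is usually applied before distance-based mining algorithms so that no single attribute dominates.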
75.
Implementace části standardu SQL/MM DM pro asociační pravidla / Implementation of SQL/MM DM for Association Rules. Škodík, Zdeněk. Unknown Date.
This project is concerned with knowledge discovery in databases, specifically with association rules, which are one product of data mining. In this way we try to obtain knowledge that cannot be found directly in the database and that may be useful. The SQL/MM DM standard is described, especially the user-defined types it specifies for association rules, as well as the common types that form the framework for data mining. Before the implementation of these types is described, the tools used for it are introduced: the PL/SQL programming language and Oracle Data Mining. The correctness of the implementation is verified by a sample application. In the conclusion, the achieved results are evaluated and possible continuations of this work are outlined.
76.
Metodika vývoje a nasazování Business Intelligence v malých a středních podnicích / Methodology of development and deployment of Business Intelligence solutions in Small and Medium Sized Enterprises. Rydzi, Daniel. January 2005.
This dissertation deals with the development and implementation of Business Intelligence (BI) solutions for small and medium-sized enterprises (SMEs) in the Czech Republic. It represents the culmination of the author's efforts to date to complete a methodological model for developing such applications in SMEs using in-house skills and a minimum of external resources and costs. The thesis can be divided into five major parts. The first part, which describes the technologies used, consists of two chapters: the first describes the contemporary state of the Business Intelligence concept and contains an original taxonomy of BI solutions; the second describes two Knowledge Discovery in Databases (KDD) techniques that were used to build the BI solutions introduced in the case studies. The second part describes the environment of Czech SMEs, in which the thesis was written and to which it is meant to contribute; one chapter defines the differences between SMEs and large corporations and explains the author's reasons for focusing on this area. The third part introduces the results of a survey conducted among Czech SMEs with the support of the Department of Information Technologies of the Faculty of Informatics and Statistics, University of Economics in Prague. The survey had three objectives: to map the readiness of Czech SMEs for developing and deploying BI solutions; to determine the major problems and consequent decisions of Czech SMEs that could be supported by BI solutions; and to determine the top factors preventing SMEs from developing and deploying BI solutions. The fourth part is the core of the thesis: two chapters describe the original methodology for the development and deployment of BI solutions by SMEs, as well as the other methodologies that were studied. The original methodology is partly based on the well-known CRISP-DM methodology. Finally, the last part describes the particular company that became a testing ground for the author's theories and supported his research, and presents case studies of the development and deployment of BI solutions in that company, built using contemporary BI and KDD techniques in accordance with the original methodology. In that sense, the case studies verified the theoretical methodology in real use.
77.
Extraction de connaissances pour la modélisation tri-dimensionnelle de l'interactome structural / Knowledge-based approaches for modelling the 3D structural interactome. Ghoorah, Anisah W. 22 November 2012.
Understanding how the protein interactome works at a structural level could provide useful insights into the mechanisms of diseases. Comparative homology modelling and ab initio protein docking are two computational methods for modelling the three-dimensional (3D) structures of protein-protein interactions (PPIs). Previous studies have shown that both methods give significantly better predictions when they incorporate experimental PPI information. However, such information is often not available in an easily accessible form and cannot be re-used by 3D PPI modelling algorithms, so a reliable framework is needed to facilitate its reuse. This thesis presents a systematic knowledge-based approach for representing, describing and manipulating 3D interactions in order to study PPIs on a large scale and to facilitate knowledge-based modelling of protein-protein complexes. Its main contributions are: (1) it describes KBDOCK, an integrated database of non-redundant 3D hetero domain-domain interactions (DDIs); (2) it presents a novel method of describing and clustering DDIs according to the spatial orientations of the binding partners, introducing the notion of "domain family-level binding sites" (DFBSs); (3) it proposes a structural classification of DFBSs similar to the CATH classification of protein folds, together with a study of the secondary structure propensities of DFBSs and their interaction preferences; (4) it introduces a systematic case-based reasoning approach for modelling the 3D structures of protein complexes on a large scale from existing structural DDIs. All these contributions have been made publicly available through a web server (http://kbdock.loria.fr).
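The retrieve step of such a case-based reasoning approach can be sketched as a similarity lookup over known domain-domain interactions (a minimal illustration with made-up records; the real similarity measures used for structural templates are considerably more involved):

```python
def jaccard(a, b):
    """Jaccard similarity between two collections, as a toy measure."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

def retrieve_template(case_base, query_families):
    """Retrieve step of case-based reasoning: return the known
    domain-domain interaction whose pair of domain families best
    matches the query pair, to reuse as a modelling template."""
    return max(case_base,
               key=lambda case: jaccard(case["families"], query_families))

# Hypothetical mini case base of known DDIs (identifiers are illustrative).
cases = [
    {"pdb": "1abc", "families": ("PF00001", "PF00002")},
    {"pdb": "2xyz", "families": ("PF00003", "PF00004")},
]
print(retrieve_template(cases, ("PF00001", "PF00099"))["pdb"])  # -> 1abc
```

The retrieved template's binding orientation would then be reused (and adapted) to pose the query proteins.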
78.
Apport des images satellites à très haute résolution spatiale couplées à des données géographiques multi-sources pour l'analyse des espaces urbains / Contribution of very high spatial resolution satellite images combined with multi-sources geographic data to analyse urban spaces. Rougier, Simon. 28 September 2016.
Climate change presents cities with significant environmental challenges, and urban planners need decision-making tools and a better knowledge of their territory. One objective is to better understand the link between the grey and green infrastructures in order to analyse and represent them. The second objective is to propose a methodology for mapping the urban structure at the urban-fabric scale, taking both infrastructures into account. In current databases, vegetation is not mapped exhaustively, so the first step is to extract tree and grass vegetation from Pléiades satellite images using object-based image analysis and an active learning classification. Based on those classifications and multi-source data, an approach based on knowledge discovery in databases is proposed, focused on a set of indicators drawn mostly from urbanism and landscape ecology. The methodology is built on Strasbourg and then applied to Rennes to validate it and check its reproducibility.
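Uncertainty sampling is one common realization of active learning classification (a minimal sketch assuming a binary classifier; not the thesis's actual implementation):

```python
def uncertainty_sampling(samples, predict_proba, batch=2):
    """Select the samples whose predicted class probability is closest
    to 0.5, i.e. the ones the current model is least certain about.
    Asking an expert to label these first is the core idea of
    active learning."""
    ranked = sorted(samples, key=lambda s: abs(predict_proba(s) - 0.5))
    return ranked[:batch]

# Toy model: each "sample" is already its predicted probability of class 1.
probs = [0.95, 0.48, 0.10, 0.55, 0.99]
print(uncertainty_sampling(probs, lambda p: p))  # -> [0.48, 0.55]
```

The selected samples are labelled by the analyst, the classifier is retrained, and the loop repeats until accuracy stops improving.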
79.
有關對調適與演化機制的再審思-在財務時間序列資料中應用的統計分析 / Rethinking the Appeal of Adaptation and Evolution: Statistical Analysis of Empirical Study in the Financial Time Series. 林維垣. Unknown Date.
The main purpose of this study is to draw the attention of scholars at home and abroad to evolutionary science in economics, combining computer science, biotechnology, psychology, and mathematics within economics, in the hope that practical economic problems which traditional economics cannot overcome because of its simplifying assumptions can be solved with computer simulation techniques, yielding new knowledge and skills.
This study consists of six chapters. Chapter 1 is the introduction, describing the background and research motivation. Chapter 2 reviews the shortcomings of traditional economics and then constructs financial markets using knowledge discovery from data and intelligent systems. Chapter 3 introduces various artificial intelligence methods for simulating investment strategies in financial markets. Chapter 4 builds a time-series model without structural change for computer simulation of trading strategies, using genetic algorithms alone to simulate investment strategies, and analyzes strategy performance from the viewpoints of portfolios, transaction costs, adaptation, evolution, and statistics. Chapter 5 builds a simple structural-change model and again uses genetic algorithms to evaluate the effectiveness of investment strategies from adaptive and statistical viewpoints. Chapter 6 combines knowledge discovery from data and intelligent systems with econometric methods to construct the steps by which genetic algorithms develop investment strategies, and carries out an empirical study on data from the Taiwan stock market, analyzed from the viewpoints of investment strategies, transaction costs, adaptation, and evolution. The final chapter concludes.
Directions for future research include:
1. Comparative analysis of other artificial intelligence methods, such as artificial neural networks and genetic programming, with cross-comparison of their performance.
2. Using classifier systems and fuzzy logic to improve the efficiency with which the standard genetic algorithm encodes strategies, and constructing various complex strategies that match real-world decision processes.
3. Simulation and comparison on other artificial time-series data, for example ARCH (Autoregressive Conditional Heteroskedasticity) models, threshold models, deterministic models, other time-series models, and more complex structural-change models.
4. Further study of the complete information used by the genetic algorithms (for example, the selection of various indicators).
5. This study uses an offline analysis system; further research on an online analysis system is necessary in practice. / Historically, the study of economics has been advanced by a combination of empirical observation and theoretical development. The analysis of mathematical equilibrium in theoretical economic models has been the predominant mode of progress in recent decades. Such models provide powerful insights into economic processes, but they usually make restrictive assumptions and appear to be oversimplifications of a complex economic system. However, the advent of cheap computing power and new intelligent technologies makes it possible to delve further into some of the complexities inherent in the real economy. It is now feasible to create a rudimentary form of "artificial economic life".
First, we build the framework of an artificial stock market using data mining and intelligent systems. Second, in order to analyze competition among buyers and sellers in the artificial market, we introduce various methods of artificial intelligence to design trading rules and investigate how machine-learning techniques might be applied to search for the optimal investment strategy. Third, we create a miniature economic laboratory, building the artificial stock market with genetic algorithms to analyze investment strategies, using both real and artificial data and covering cases with and without structural change. Finally, we use statistical analysis to examine the performance of the portfolio strategies generated by the genetic algorithms.
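A minimal genetic algorithm of the kind used to evolve trading strategies might look as follows (a sketch: the bitstring stands in for an encoded strategy, and the fitness function here simply counts 1-bits in place of a backtested profit; all parameters are illustrative):

```python
import random

def evolve(fitness, n_bits=8, pop_size=20, generations=40, seed=1):
    """Minimal genetic algorithm: tournament selection, one-point
    crossover, bit-flip mutation. In a trading application the
    bitstring would encode a strategy and `fitness` would be its
    backtested profit."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)]
           for _ in range(pop_size)]
    for _ in range(generations):
        def pick():
            # Tournament selection of size 2: the fitter of two wins.
            a, b = rng.sample(pop, 2)
            return a if fitness(a) >= fitness(b) else b
        nxt = []
        while len(nxt) < pop_size:
            p1, p2 = pick(), pick()
            cut = rng.randrange(1, n_bits)   # one-point crossover
            child = p1[:cut] + p2[cut:]
            if rng.random() < 0.1:           # occasional bit-flip mutation
                i = rng.randrange(n_bits)
                child[i] = 1 - child[i]
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)

best = evolve(sum)
print(best, sum(best))
```

With `sum` as the fitness (the "one-max" problem), selection pressure drives the population toward the all-ones string within a few dozen generations.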
80.
Análise dos indicadores de qualidade versus taxa de abandono utilizando método de regressão múltipla para serviço de banda larga / Analysis of quality indicators versus churn rate using multiple regression for broadband service. Fernandes Neto, André Pedro. 20 June 2008.
Telecommunications is one of the most dynamic and strategic areas in the world. Many technological innovations have modified the way information is exchanged: information and knowledge are now shared in networks, and broadband Internet is the new way of sharing content. This dissertation deals with performance indicators related to the maintenance of telecommunications networks and uses multiple regression models to estimate churn, the loss of customers to other companies. In a competitive environment, telecommunications companies devise strategies to minimize the loss of customers, since losing a customer costs more than acquiring a new one. Corporations have plenty of data stored in a diversity of databases, yet these data are usually not explored properly. This work uses Knowledge Discovery in Databases (KDD) to establish rules and models that explain how churn, as the dependent variable, is related to a diversity of service quality indicators, such as time to deploy the service (in hours) and time to repair (in hours). Extracting meaningful knowledge is, in many cases, a challenge. The models were tested and statistically analyzed. The work presents results that allow the identification of which service quality indicators most influence churn, and proposes actions to solve, at least in part, this problem.
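The core of such a regression analysis can be illustrated with ordinary least squares on a single invented quality indicator (the dissertation itself relates churn to several indicators at once):

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b*x: the one-variable case of
    the multiple regression used to relate a quality indicator to the
    churn rate."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    # Slope: covariance of x and y divided by the variance of x.
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    a = my - b * mx
    return a, b

# Toy data: churn rate (%) against mean time to repair (hours).
repair_hours = [2, 4, 6, 8]
churn_rate = [1.0, 2.0, 3.0, 4.0]
a, b = fit_line(repair_hours, churn_rate)
print(a, b)  # -> 0.0 0.5
```

With several indicators, the same idea generalizes to multiple regression, typically solved via the normal equations or a library routine.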