271 |
Árbol de decisión para la selección de un motor de base de datos / Decision tree for the selection of database engine
Bendezú Kiyán, Enrique Renato; Monjaras Flores, Álvaro Gianmarco (30 August 2020)
Desde los últimos años, la cantidad de usuarios que navega en internet ha crecido exponencialmente. Por consecuencia, la cantidad de información que se maneja crece a manera desproporcionada y, por ende, el manejo de grandes volúmenes de información obtenidos de internet ha ocasionado grandes problemas.
Los diferentes tipos de bases de datos tienen un funcionamiento variado, dado que, se ve afectado el rendimiento para ejecutar las transacciones cuando se lidia con diferentes cantidades de información. Entre este tipo de variedades, se analizará las bases de datos relacionales, bases de datos no relaciones y bases de datos en memoria.
Para las organizaciones es muy importante contar con un acelerado manejo de información debido a la gran demanda por parte de los clientes y el mercado en general, permitiendo que no se disminuya la agilidad de operación interna cuando se requiera manejar información, y conservar la integridad de esta. Sin embargo, cada categoría de base de datos está diseñada para cubrir diferentes casos de usos específicos para mantener un alto rendimiento con respecto al manejo de los datos.
El presente proyecto tiene como objetivo el estudio de diversos escenarios de los principales casos de uso, costos, aspectos de escalabilidad y rendimiento de cada base de datos, mediante la elaboración de un árbol de decisión, en el cual, se determine la mejor opción de categoría de base de datos según el flujo que decida tomar el usuario.
Palabras clave: Base de Datos, Base de Datos Relacional, Base de Datos No Relacional, Base de Datos en Memoria, Árbol de Decisión. / In recent years, the number of internet users has grown exponentially. Consequently, the amount of information being handled has grown disproportionately, and managing the large volumes of information obtained from the internet has caused major problems.
Different types of databases behave differently, since transaction performance is affected by the amount of information being handled. Among these types, relational databases, non-relational databases, and in-memory databases will be analyzed.
For organizations, fast information management is very important because of the high demand from customers and the market in general; it keeps internal operations agile when information must be handled and preserves the integrity of that information. However, each category of database is designed to cover specific use cases in order to maintain high performance in data handling.
The purpose of this project is to study various scenarios covering the main use cases, costs, scalability, and performance aspects of each database by developing a decision tree that determines the best database category according to the path the user chooses to take. / Thesis
|
272 |
Získávání znalostí pro modelování následných akcí / Data Mining for Suggesting Further Actions
Veselovský, Martin (January 2017)
Knowledge discovery from databases is a complex issue involving integration, data preparation, data mining using machine learning methods and visualization of results. The thesis deals with the whole process of knowledge discovery, especially with the issue of data warehousing, where it offers the design and implementation of a specific data warehouse for the company ROI Hunter, a.s. In the field of data mining, the work focuses on the classification and forecasting of the advertising data available from the prepared data warehouse and, in particular, on the decision tree classification. When predicting the development of new ads, emphasis is put on the rationale for the prediction as well as the proposal to adjust the ad settings so that the prediction ends positively and, with a certain likelihood, the ads actually get better results.
|
273 |
Identifikace objektů v obraze / The identification of the objects in the image
Zavalina, Viktoriia (January 2014)
This master's thesis deals with methods of object detection in images. It contains theoretical, practical, and experimental parts. The theoretical part describes image representation, image preprocessing methods, and methods for the detection and identification of objects. The practical part describes the programs that were created and the algorithms used in them. The application was created in MATLAB. It offers an intuitive graphical user interface and three different methods for the detection and identification of objects in an image. The experimental part contains test results for the implemented program.
|
274 |
Umělá inteligence ve hře Bang! / Artificial Intelligence in Bang! Game
Kolář, Vít (January 2010)
The goal of this master's thesis is to create an artificial intelligence for the Bang! game. It includes a full description of the game, its complete rules, the principles of the strategies players use, and an analysis of the game from an AI point of view. The thesis also summarizes methods of artificial intelligence and basic information about the domain of game theory. The next part describes the implementation in C++ and how it works, using Bayes classification and decision trees based on expert systems. The last part presents an analysis of the overall positive results and a conclusion with possible further extensions.
|
275 |
Adaptivní klient pro sociální síť Twitter / Adaptive Client for Twitter Social Network
Guňka, Jiří (January 2011)
The goal of this term project is to create a user-friendly Twitter client. The client may use machine learning methods, such as a naive Bayes classifier, to point out new tweets of interest. Hyperbolic trees and some other methods will be used to visualize these tweets.
|
276 |
Prise en compte économique du long terme dans les choix énergétiques relatifs à la gestion des déchets radioactifs / Economic analysis of long-term energy choices related to the radioactive waste management
Doan, Phuong Hoai Linh (07 December 2017)
Actuellement, bien que la plupart des pays nucléaires converge vers la même solution technique: le stockage profond pour la gestion des déchets radioactifs de haute activité et à vie longue, les objectifs calendaires divergent d'un pays à l'autre. Grâce au calcul économique, nous souhaitons apporter des éléments de réponse à la question suivante : En termes de temporalité, comment les générations présentes, qui bénéficient de la production d'électricité nucléaire, doivent-elles supporter les charges de la gestion des déchets radioactifs en tenant compte des générations futures ? Cette thèse se propose d'analyser spécifiquement la décision française en tenant compte de son contexte. Nous proposons un ensemble d'outils qui permet d'évaluer l'Utilité du projet de stockage profond en fonction des choix de temporalité. Notre thèse étudie également l'influence en retour des choix de stockage sur le cycle du combustible nucléaire. Au-delà, nous prenons en compte les interactions entre le stockage profond et les choix de parc nucléaire et de cycle du combustible qui constituent un « système complet ». / Nowadays, the deep geological repository is generally considered the reference solution for the final management of spent nuclear fuel and high-level waste, but different countries have adopted different disposal deployment schedules. Through economic calculation, we hope to offer some answers to the following question: in terms of disposal timing, how should the present generations, who benefit from nuclear power generation, bear the costs of radioactive waste management while taking future generations into account? This thesis analyzes specifically the French decision in its context. We propose a set of tools to evaluate the utility of the deep geological repository project according to the deployment schedule choices. The thesis also studies the influence of disposal choices on the nuclear fuel cycle. Beyond that, we take into account the interactions between the deep geological repository, the nuclear fleet, and the fuel cycle choices, which together constitute a "complete system".
|
277 |
Automatic Patent Classification
Yehe, Nala (January 2020)
Patents have great research value and benefit the industrial, commercial, legal, and policymaking communities. Effective analysis of patent literature can reveal important technical details and relationships, explain business trends, suggest novel industrial solutions, and inform crucial investment decisions. Patent documents should therefore be analyzed carefully so that their value can be used. Patent analysts generally need a certain degree of expertise in several fields, including information retrieval, data processing, text mining, field-specific technology, and business intelligence. In practice, it is difficult to find and train such an analyst in a relatively short period of time so that he or she meets the requirements of multiple disciplines. Patent classification is also crucial in processing patent applications because it lets people manage and maintain patent texts better and more flexibly. In recent years, the number of patents worldwide has increased dramatically, which makes it very important to design an automatic patent classification system. Such a system can replace time-consuming manual classification, giving patent analysis managers an effective method of managing patent texts. This thesis designs a patent classification system based on data mining methods and machine learning techniques and uses the KNIME software to conduct a comparative analysis of different machine learning methods and different parts of a patent. The purpose of the thesis is to use text data processing methods and machine learning techniques to classify patents automatically. The work has two main parts: data preprocessing and the application of machine learning techniques. The research questions are: Which part of a patent performs best as input data for automatic classification? And which of the implemented machine learning algorithms performs best in classifying IPC keywords? The thesis uses design science research as its method. It uses the KNIME platform to apply the machine learning techniques, which include decision tree, XGBoost linear, XGBoost tree, SVM, and random forest. The implementation covers data collection, data preprocessing, feature word extraction, and the application of classification techniques. A patent document consists of many parts, such as the description, abstract, and claims. In this thesis, these three parts are fed to the models separately as input data, and their performance is then compared. Based on the results of these three experiments, we suggest using the description in the classification system because it shows the best performance in English patent text classification. The abstract can serve as an auxiliary basis for classification. However, classification based on the claims, as proposed by some scholars, did not achieve good performance in our research. In addition, the BoW and TF-IDF methods can be used together to extract feature words efficiently. We also found that the SVM and XGBoost techniques perform better in the automatic patent classification system in our research.
|
278 |
Spatial patterns of humus forms, soil organisms and soil biological activity at high mountain forest sites in the Italian Alps
Hellwig, Niels (24 October 2018)
The objective of the thesis is the model-based analysis of spatial patterns of decomposition properties on the forested slopes of the montane level (ca. 1200-2200 m a.s.l.) in a study area in the Italian Alps (Val di Sole / Val di Rabbi, Autonomous Province of Trento). The analysis includes humus forms and enchytraeid assemblages as well as pH values, activities of extracellular enzymes and C/N ratios of the topsoil. The first aim is to develop, test and apply data-based techniques for spatial modelling of soil ecological parameters. This methodological approach is based on the concept of digital soil mapping. The second aim is to reveal the relationships between humus forms, soil organisms and soil microbiological parameters in the study area. The third aim is to analyze if the spatial patterns of indicators of decomposition differ between the landscape scale and the slope scale.
At the landscape scale, sample data from six sites are used, covering three elevation levels on both north- and south-facing slopes. A knowledge-based approach that combines a decision tree analysis with the construction of fuzzy membership functions is introduced for spatial modelling. In line with the sampling design, elevation and slope exposure are the explanatory variables.
The investigations at the slope scale refer to one north-facing and one south-facing slope, with 30 sites occurring on each slope. These sites have been derived using conditioned Latin Hypercube Sampling, and thus reasonably represent the environmental conditions within the study area. Predictive maps have been produced in a purely data-based approach with random forests.
At both scales, the models indicate a high variability of spatial decomposition patterns depending on elevation and slope exposure. In general, sites at high elevation on north-facing slopes almost exclusively exhibit the humus forms Moder and Mor. Sites on south-facing slopes and at low elevation also exhibit Mull and Amphimull. The predictions of those enchytraeid species characterized as Mull and Moder indicators match the occurrence of the corresponding humus forms well. Furthermore, for the mineral topsoil, the predictive models show increasing pH values, increasing leucine-aminopeptidase activity, an increasing ratio of alkaline to acid phosphomonoesterase activity, and a decreasing C/N ratio from north-facing to south-facing slopes and from high to low elevation.
The predicted spatial patterns of indicators of decomposition are basically similar at both scales. However, the patterns are predicted in more detail at the slope scale because of a larger data basis and a higher spatial precision of the environmental covariates. These factors enable the observation of additional correlations between the spatial patterns of indicators of decomposition and environmental influences, for example slope angle and curvature. Both the corresponding results and broad model evaluations have shown that the applied methods are generally suitable for modelling spatial patterns of indicators of decomposition in a heterogeneous high mountain environment. The overall results suggest that the humus form can be used as an indicator of organic matter decomposition processes in the investigated high mountain area.
|
279 |
Vytvoření modulu pro dolování dat z databází / Creation of Unit for Datamining
Krásenský, David (date unknown)
The goal of this work is to create a data mining module for the Belinda information system. Data from the client database will be analyzed using SAS Enterprise Miner, and the results obtained with several data mining methods will be compared. In the second phase, a selected data mining method will be implemented as a module of the Belinda information system. The final part of the work evaluates the results obtained and the possibilities for using this module.
|
280 |
Metody klasifikace www stránek / Methods for Classification of WWW Pages
Svoboda, Pavel (January 2009)
The main goal of this master's thesis was to study the main principles of classification methods. The basic principles of the knowledge discovery process, data mining, and the use of the external class CSSBox are described. Special attention was paid to the implementation of a k-nearest neighbors classification method. The first objective of this work was to create training and testing data described by 'n' attributes. The second objective was to perform an experimental analysis to determine a good value for 'k', the number of neighbors.
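The two objectives named in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the synthetic two-class data, the helper names (`make_data`, `knn_predict`, `accuracy`), the Euclidean distance, and the candidate values of k are all assumptions chosen for illustration, and no web-page features or CSSBox output are involved.

```python
import math
import random
from collections import Counter


def knn_predict(train, query, k):
    """Classify `query` by majority vote among its k nearest training points."""
    # `train` is a list of (feature_vector, label) pairs; distance is Euclidean.
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k):
    """Fraction of test points whose predicted label matches the true label."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)


def make_data(n_points, n_attributes, seed):
    """Hypothetical data: class 0 centred at 0.0, class 1 centred at 3.0."""
    rng = random.Random(seed)
    data = []
    for i in range(n_points):
        label = i % 2
        center = 3.0 * label
        features = [rng.gauss(center, 1.0) for _ in range(n_attributes)]
        data.append((features, label))
    return data


# Objective 1: training and testing data described by n attributes.
train = make_data(200, n_attributes=4, seed=0)
test = make_data(100, n_attributes=4, seed=1)

# Objective 2: compare several odd values of k and keep the most accurate one.
scores = {k: accuracy(train, test, k) for k in (1, 3, 5, 7, 9)}
best_k = max(scores, key=scores.get)
```

Odd values of k are used so that a two-class vote cannot tie; with real data, the comparison would normally be run on a held-out validation set or with cross-validation rather than on the final test set.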
|