Global ETD Search

21	An Empirical Application with Data Mining in the Construction of Predictive Model on Corruption Wu, Hsing-yi 03 August 2006 Taiwan is not the only country that faces the threat of corruption. Greedy politicians and insatiable merchants continue to cause scandals around the world, harming national economies and people's wealth. This research addresses how to select the important variables from corruption cases. In recent years, data mining techniques have become popular in shopping behavior analysis, customer relationship management, and crime investigation; however, their application in the political and social domains remains limited. In this research, we introduce the concepts and techniques of data mining and use them to build a selection model that the government can consider in preventing corruption. The work also explores opportunities for social science research. Artificial Neural Network Corruption Decision Tree Data Mining Clustering
22	Text Categorization for E-Government Applications: The Case of City Mayor's Mailbox Kuo, Chiung-Jung 29 August 2006 The central government and most local governments in Taiwan have adopted e-mail services that let citizens request services or express their opinions over the Internet. Traditionally, these requests and opinions are manually routed to the appropriate departments for service rendering. However, as the number of requests and opinions keeps growing, manual classification is time consuming and becomes impractical. In this study, we therefore apply text categorization techniques to construct an automatic classification mechanism and so establish an efficient e-government service portal. The purpose of this thesis is to investigate the effectiveness of different text categorization methods in supporting automatic classification of the service request and opinion e-mails sent to the Mayor's mailbox. Specifically, in each phase of text categorization learning, we adopt and evaluate two methods commonly employed in prior research. In the feature selection phase, both the maximal χ² statistic method and the weighted average χ² statistic method are evaluated. In the document representation phase, we consider the Binary and TFxIDF representation schemes. Finally, we adopt the decision tree induction technique and the support vector machine (SVM) technique for inducing a text categorization model for our target e-government application. Our empirical evaluation results show that the method that combines the maximal χ² statistic for feature selection, the Binary representation scheme, and support vector machines as the underlying induction algorithm reaches an accuracy of 77.28%, with recall and precision rates of more than 77%. 
Such satisfactory classification effectiveness suggests that the text categorization approach can be employed to establish an effective and intelligent e-government service portal. Decision Tree Induction Support Vector Machines E-government Text categorization
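The two feature selection scores named in entry 22 can be illustrated with a short sketch. The abstract does not give the formulas, so this assumes the usual text categorization definitions: a per-category χ² computed from a 2x2 term/category contingency table, then either the maximum over categories or the average weighted by category prior. The function names and the toy documents are hypothetical.

```python
from collections import Counter


def chi2_term_category(a, b, c, d):
    """Chi-square of one term against one category (assumed standard form).

    a: docs in the category that contain the term
    b: docs outside the category that contain the term
    c: docs in the category without the term
    d: docs outside the category without the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


def chi2_scores(docs, labels, term):
    """Return (maximal chi2, prior-weighted average chi2) for one term.

    docs is a list of token sets; labels holds one category per document.
    """
    n = len(docs)
    category_sizes = Counter(labels)
    has_term = [term in doc for doc in docs]
    per_category = {}
    for cat, n_cat in category_sizes.items():
        a = sum(1 for h, lab in zip(has_term, labels) if h and lab == cat)
        b = sum(1 for h, lab in zip(has_term, labels) if h and lab != cat)
        c = n_cat - a
        d = n - n_cat - b
        per_category[cat] = chi2_term_category(a, b, c, d)
    chi2_max = max(per_category.values())
    chi2_avg = sum(category_sizes[cat] / n * score
                   for cat, score in per_category.items())
    return chi2_max, chi2_avg


# Hypothetical mailbox messages, already tokenized into sets of words.
docs = [{"road", "repair"}, {"road", "light"}, {"tax", "bill"}, {"tax", "refund"}]
labels = ["works", "works", "finance", "finance"]
print(chi2_scores(docs, labels, "road"))
```

Terms would then be ranked by one of the two scores and the highest-scoring ones kept as features.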
23	The Application of Data Mining: ID3 Decision Tree and Fuzzy Theory on Distribution System Service Restoration Lu, Shao-Yi 23 June 2003 A distribution system contains numerous protective devices and switching equipment and covers a wide area. The most urgent problem facing the dispatcher right after a distribution system fault is how to restore power to the non-faulted, out-of-service area as quickly as possible. Distribution system service restoration is therefore an important and practical subject. During maintenance and operation, the distribution district collects a large amount of data, including automatic operating data from substations, feeder switch section data, load data for high- and low-tension customers, and historical planning information on load transfers during operation. These resources can help the dispatcher find the best load transfer for service restoration. Data mining refers to the discovery of hidden rules and knowledge in data and to the techniques for building comprehensible models. This thesis collects the maintenance and operation data of the distribution district and uses data mining techniques, the ID3 decision tree, and fuzzy theory to derive load transfer rules and knowledge and to establish a load transfer model. After a fault occurs, the model helps the dispatcher find the best load transfer strategy for service restoration under the operating constraints. The thesis uses the underground distribution system in Sijhih, in the Keelung District Office of Taiwan Power Company, as a sample. 
That underground distribution system, comprising 8 distribution feeders, 73 distribution spaces, 16 high-tension customers, 28 feeder sections, 4 feeder tie switches, and 19 lateral tie switches, is simulated on a computer to verify the proposed service restoration method. decision tree distribution system data mining fuzzy theory service restoration
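Entry 23 relies on the ID3 decision tree, which at each node splits on the attribute with the highest information gain. The following is a minimal sketch of that split criterion only. The attribute names and the four load transfer records are hypothetical and are not taken from the thesis; the fuzzy theory part is not modeled.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, attr, target):
    """Entropy reduction obtained by splitting rows on attr."""
    base = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attr], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


def best_attribute(rows, attrs, target):
    """ID3 split rule: pick the attribute with the largest information gain."""
    return max(attrs, key=lambda a: information_gain(rows, a, target))


# Hypothetical historical records: which tie switch took the transferred load.
records = [
    {"load": "high", "period": "day", "transfer": "tie_A"},
    {"load": "high", "period": "night", "transfer": "tie_A"},
    {"load": "low", "period": "day", "transfer": "tie_B"},
    {"load": "low", "period": "night", "transfer": "tie_B"},
]
print(best_attribute(records, ["load", "period"], "transfer"))
```

A full ID3 tree would apply `best_attribute` recursively to each resulting subset until the subsets are pure or no attributes remain.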
24	Evaluating feature selection in a marketing classification problem Salmeron Perez, Irving Ivan January 2015 Machine learning is becoming more popular for prediction and classification tasks in many fields. In banks, telemarketing uses this approach by gathering information from phone calls made to clients during past campaigns. In practice, phone calls are sometimes annoying and time consuming for both parties, the marketing department and the client. This project is therefore intended to show that feature selection can improve machine learning models. A Portuguese bank gathered data on phone calls and client statistics, such as their current jobs, salaries, and employment status, to determine the probability that a person would buy the offered product and/or service. The C4.5 decision tree (J48) and the multilayer perceptron (MLP) are the machine learning models used in the experiments. For feature selection, the correlation-based feature selection (Cfs), Chi-squared attribute selection, and RELIEF attribute selection algorithms are used. The WEKA framework provides the tools to implement and test the experiments carried out in this research. The results of the two data mining models were very close, with a slight advantage for C4.5 in correct classifications and for MLP in ROC curve rate. With these results it was confirmed that feature selection improves classification and/or prediction results. Neural networks bank marketing decision tree feature selection
25	Prognosis of Glioblastoma Multiforme Using Textural Properties on MRI Heydari, Maysam Unknown Date No description available. glioblastoma GBM MRI texture machine learning prognosis survival decision tree
26	Decision Trees for Dynamic Decision Making And System Dynamics Modelling Calibration and Expansion 2014 June 1900 Many practical problems raise the challenge of making decisions over time in the presence of both dynamic complexity and pronounced uncertainty about how important factors affecting the dynamics of the system will evolve. In this thesis, we provide an end-to-end implementation of an easy-to-use system to confront such challenges. This system gives policy makers a new way to combine the complementary strengths of decision analysis techniques and System Dynamics by allowing easy creation, evaluation, and interactive exploration of hybrid models. As an important application of this methodology, we extended a System Dynamics model of West Nile virus transmission in Saskatchewan.
27	Using Decision Tree Voting to Select a Polyhedral Model Loop Transformation Ruvinskiy, Ray January 2013 Algorithms in fields like image manipulation, sound and signal processing, and statistics frequently employ tight loops. These loops are computationally intensive and CPU-bound, making their performance highly dependent on efficient utilization of the CPU pipeline and memory bus. Recent years have seen CPU pipelines become more and more complicated, with features such as branch prediction and speculative execution. At the same time, clock speeds have stopped their prior exponential growth due to heat dissipation issues, and multiple cores have become prevalent. These developments have made it more difficult for developers to reason about how their code executes on the CPU, which in turn makes it difficult to write performant code. An automated method to take code and optimize it for the most efficient execution would, therefore, be desirable. The Polyhedral Model allows the generation of alternative transformations for a loop nest that are semantically equivalent to the original. The transformations vary the degree of loop tiling, loop fusion, loop unrolling, parallelism, and vectorization. However, selecting the transformation that would most efficiently utilize the architecture remains challenging. Previous work utilizes regression models to select a transformation, using as features hardware performance counter values collected during a sample run of the program being optimized. Due to inaccuracies in the resulting regression model, the transformation selected by the model as the best often yields unsatisfactory performance. As a result, previous work resorts to a five-shot technique, which entails running the top five transformations suggested by the model and selecting the best one based on their actual runtime. However, for long-running benchmarks, five runs may take an excessive amount of time. 
I present a variation on the previous approach which does not need to resort to the five-shot selection process to achieve performance comparable to the best five-shot results reported in previous work. With the transformations in the search space ranked in reverse runtime order, the transformation selected by my classifier is, on average, in the 86th percentile. There are several key contributing factors to the performance improvements attained by my method: formulating the problem as a classification problem rather than a regression problem, using static features in addition to dynamic performance counter features, performing feature selection, and using ensemble methods to boost the performance of the classifier. Decision trees are constructed from pairs of features (performance counters and structural features that can be determined statically from the source code). The trees are then evaluated according to the number of benchmarks for which they select a transformation that performs better than two baseline variants: the original program and the expected runtime if a randomly selected transformation were applied. The top 20 trees vote to select a final transformation.
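The final voting step in entry 27 can be sketched as a plain majority vote among the best-scoring trees. This is an illustration under assumptions, not the thesis implementation: each tree is modeled as a hypothetical callable that maps a feature dictionary to a transformation label, `scores` stands for the number of benchmarks on which each tree beat the baselines, and the tie-breaking behavior is whatever `Counter.most_common` gives.

```python
from collections import Counter


def vote(trees, scores, features, top_k=20):
    """Let the top_k highest-scoring trees vote on a transformation.

    trees:    callables mapping a feature dict to a transformation label
    scores:   one quality score per tree (higher is better)
    features: feature dict of the program being optimized
    """
    ranked = sorted(range(len(trees)), key=lambda i: scores[i], reverse=True)
    ballots = Counter(trees[i](features) for i in ranked[:top_k])
    return ballots.most_common(1)[0][0]


# Hypothetical stand-ins for trees built on pairs of features.
trees = [
    lambda f: "tile" if f["cache_misses"] > 1000 else "unroll",
    lambda f: "tile" if f["loop_depth"] >= 2 else "fuse",
    lambda f: "fuse",
    lambda f: "unroll",
]
scores = [5, 4, 3, 1]
program = {"cache_misses": 5000, "loop_depth": 3}
print(vote(trees, scores, program, top_k=3))
```

With `top_k=3` the lowest-scoring tree is excluded, and two of the three remaining ballots go to the same label.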
28	Prognosis of Glioblastoma Multiforme Using Textural Properties on MRI Heydari, Maysam 11 1900 This thesis addresses the challenge of prognosis, in terms of survival prediction, for patients with Glioblastoma Multiforme brain tumors. Glioblastoma is the most malignant brain tumor, with a median survival time of no more than a year. Accurate assessment of prognostic factors is critical in deciding among different treatment options and in designing stratified clinical trials. This thesis is motivated by two observations. Firstly, clinicians often refer to properties of glioblastoma tumors based on magnetic resonance images when assessing prognosis. However, clinical data, along with histological and, most recently, molecular and gene expression data, have been more widely and systematically studied and used in prognosis assessment than image-based information. Secondly, patient survival times are often used along with clinical data to conduct population studies on brain tumor patients. Recursive Partitioning Analysis is typically used in these population studies. However, researchers validate and assess the predictive power of these models by measuring the statistical association between survival groups and survival times. In this thesis, we propose a learning approach that uses historical training data to produce a system that predicts patient survival. We introduce a classification model for predicting patient survival class, which uses texture-based features extracted from magnetic resonance images as well as other patient properties. Our prognosis approach is novel in that it is the first to use image-extracted textural characteristics of glioblastoma scans in a classification model whose accuracy can be reliably validated by cross validation. We show that our approach is a promising new direction for prognosis in brain tumor patients. glioblastoma GBM MRI texture machine learning prognosis survival decision tree
29	Metodologias para mapeamento de suscetibilidade a movimentos de massa (Methodologies for mapping susceptibility to mass movements) Riffel, Eduardo Samuel January 2017 Mapping areas predisposed to adverse events that threaten and damage society is highly important, mainly because of its role in planning and in environmental, territorial, and risk management. This work therefore seeks to contribute to the improvement of methodologies and morphometric parameters for mapping susceptibility to mass movements using GIS and remote sensing. One objective is to apply and compare susceptibility methodologies, among them Shalstab and the Decision Tree, which is still little used in this area. To reach a consensus on the literature, information on adverse events was organized through classification; to this end, concepts related to disasters, such as susceptibility, vulnerability, hazard, and risk, were reviewed. A study was also carried out in the municipality of Três Coroas, RS, relating occurrences of mass movements to the CPRM risk zones. From morphometric parameters, patterns of landslide occurrence were identified, along with the contribution of factors such as land use, occupation, and slope. Finally, two susceptibility mapping methods, the Shalstab model and the Decision Tree, were compared. The models' input data were morphometric parameters extracted from SRTM images and landslide samples identified in high spatial resolution satellite images. In the comparison of the methodologies and the accuracy analysis, the Decision Tree performed better. The difference, however, was small, and both methods can represent the susceptibility map satisfactorily. Shalstab nevertheless showed more limitations because it needs data of higher spatial resolution. Applying methodologies based on GIS and remote sensing contributed to better prevention of damage caused by mass movements. Consistent inventories are nonetheless needed to make the application of the models more reliable. Landslides Decision Tree Shalstab Geoprocessing Disasters
30	Real-Time Power System Topology Monitoring Supported by Synchrophasor Measurements January 2015 This dissertation introduces a real-time topology monitoring scheme for power systems intended to provide enhanced situational awareness during major system disturbances. The topology monitoring scheme requires accurate real-time topology information to be effective. This scheme is supported by advances in transmission line outage detection based on data-mining phasor measurement unit (PMU) measurements. A network flow analysis scheme is proposed to track changes in user-defined minimal cut sets within the system. This work introduces a new algorithm used to update a previous network flow solution after the loss of a single system branch. The proposed algorithm significantly reduces solution time, which is desirable in a real-time environment. This method of topology monitoring can provide system operators with visual indications of potential problems in the system caused by changes in topology. This work also presents a method of determining all singleton cut sets within a given network topology, called the one line remaining (OLR) algorithm. During operation, if a singleton cut set exists, then the system cannot withstand the loss of any one line and still remain connected. The OLR algorithm activates after the loss of a transmission line and determines if any singleton cut sets were created. These cut sets are found using properties of power transfer distribution factors and minimal cut sets. The topology analysis algorithms proposed in this work are supported by line outage detection using PMU measurements aimed at providing accurate real-time topology information. This process uses a decision tree (DT) based data-mining approach to characterize a lost tie line in simulation. The trained DT is then used to analyze PMU measurements to detect line outages. 
The trained decision tree was applied to real PMU measurements to detect the loss of a 500 kV line and had no misclassifications. The work presented has the objective of enhancing situational awareness during significant system disturbances in real time. This dissertation presents all parts of the proposed topology monitoring scheme and justifies and validates the methodology using a real system event. / Dissertation/Thesis / Doctoral Dissertation Electrical Engineering 2015 Electrical engineering Decision Tree Maximum Flow PMU Power Systems
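A singleton cut set, as described in entry 30, corresponds in graph terms to a bridge: a single line whose loss splits the network. The dissertation finds these with power transfer distribution factors and minimal cut sets. The sketch below swaps in a different, purely topological technique (depth-first search with low-link values) to illustrate the idea. The bus numbering and the four-bus example network are hypothetical.

```python
def find_singleton_cut_sets(n_buses, lines):
    """Return every line whose loss alone disconnects the network.

    n_buses: number of buses, numbered 0 .. n_buses - 1
    lines:   list of (bus_u, bus_v) pairs; parallel lines are allowed
    """
    adjacency = [[] for _ in range(n_buses)]
    for idx, (u, v) in enumerate(lines):
        adjacency[u].append((v, idx))
        adjacency[v].append((u, idx))

    discovered = [-1] * n_buses  # DFS discovery time, -1 means unvisited
    low = [0] * n_buses          # earliest bus reachable from the subtree
    bridges = []
    timer = 0

    def dfs(u, parent_line):
        nonlocal timer
        discovered[u] = low[u] = timer
        timer += 1
        for v, idx in adjacency[u]:
            if idx == parent_line:
                continue  # do not walk back along the line we arrived on
            if discovered[v] == -1:
                dfs(v, idx)
                low[u] = min(low[u], low[v])
                if low[v] > discovered[u]:
                    bridges.append(lines[idx])  # no alternative path exists
            else:
                low[u] = min(low[u], discovered[v])

    for start in range(n_buses):
        if discovered[start] == -1:
            dfs(start, -1)
    return bridges


# Hypothetical network: a three-bus loop with one radial line to bus 3.
intact = [(0, 1), (1, 2), (2, 0), (2, 3)]
print(find_singleton_cut_sets(4, intact))

# After losing line (0, 1) the loop is broken and every line is critical.
after_outage = [(1, 2), (2, 0), (2, 3)]
print(find_singleton_cut_sets(4, after_outage))
```

Rerunning the search after each detected outage mirrors the way the OLR algorithm is described as activating after the loss of a transmission line, although the recursion here is only suitable for small networks.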

Search results