51 |
Detecting students who are conducting inquiry Without Thinking Fastidiously (WTF) in the Context of Microworld Learning Environments / Wixon, Michael, 09 April 2013
In recent years, there has been increased interest in and research on identifying the various ways that students can deviate from expected or desired patterns while using educational software. This includes research on gaming the system, player transformation, haphazard inquiry, and failure to use key features of the learning system. Detection of these sorts of behaviors has helped researchers to better understand them, thus allowing software designers to develop interventions that can remediate them and/or reduce their negative impacts on student learning. This work addresses two types of student disengagement: carelessness and a behavior we term WTF (“Without Thinking Fastidiously”) behavior. Carelessness is defined as not demonstrating a skill despite knowing it; we measured carelessness using a machine-learned model. In WTF behavior, the student is interacting with the software, but their actions appear to have no relationship to the intended learning task. We discuss the detector development process, validate the detectors with human labels of the behavior, and discuss implications for understanding how and why students conduct inquiry without thinking fastidiously while learning in science inquiry microworlds. Following this work, we explore the relationship between student learner characteristics and the aforementioned disengaged behaviors, carelessness and WTF. Our goal was to develop a deeper understanding of which learner characteristics correlate with carelessness or WTF behavior. Our work examines three alternative methods for predicting carelessness and WTF behaviors from learner characteristics: simple correlations, k-means clustering, and decision tree rule learners.
|
52 |
Classifying textual fast food restaurant reviews quantitatively using text mining and supervised machine learning algorithms / Wright, Lindsey, 01 May 2018
Companies continually seek to improve their business model through feedback and customer satisfaction surveys. Social media provides additional opportunities for this advanced exploration into the mind of the customer. By extracting customer feedback from social media platforms, companies may increase the sample size of their feedback and remove bias often found in questionnaires, resulting in better-informed decision making. However, simply using personnel to analyze thousands of relevant social media posts is financially expensive and time consuming. Thus, our study aims to establish a method to extract business intelligence from social media content by structuring opinionated textual data using text mining and classifying these reviews by the degree of customer satisfaction. By quantifying textual reviews, companies may perform statistical analysis to extract insight from the data as well as effectively address concerns. Specifically, we analyze a subset of 56,000 Yelp reviews of fast food restaurants and attempt to predict a quantitative value reflecting the overall opinion of each review. We compare two predictive modeling techniques, bagged Decision Trees and Random Forest Classifiers. To simplify the problem, we train our model to classify strongly negative and strongly positive reviews (1 and 5 stars). In addition, we identify drivers behind strongly positive or negative reviews, allowing businesses to understand their strengths and weaknesses. This method provides companies with an efficient and cost-effective way to process and understand customer satisfaction as it is discussed on social media.
|
53 |
Learning From Spatially Disjoint Data / Bhadoria, Divya, 02 April 2004
Committees of classifiers, also called mixtures or ensembles of classifiers, have become popular because they have the potential to improve on the performance of a single classifier constructed from the same set of training data. Bagging and boosting are some of the better known methods of constructing a committee of classifiers. Committees of classifiers are also important because they have the potential to provide a computationally scalable approach to handling massive datasets. When the emphasis is on computationally scalable approaches to handling massive datasets, the individual classifiers are often constructed from a small fraction of the total data. In this context, the ability to improve on the accuracy of a hypothetical single classifier created from all of the training data may be sacrificed.
The design of a committee of classifiers typically assumes that all of the training data is equally available to be assigned to subsets as desired, and that each subset is used to train a classifier in the committee. However, there are some important application contexts in which this assumption is not valid. In many real life situations, massive data sets are created on a distributed computer, recording the simulation of important physical processes.
Currently, experts visually browse such datasets to search for interesting events in the simulation. This sort of manual search for interesting events in massive datasets is time consuming. Therefore, one would like to construct a classifier that could automatically label the "interesting" events. The problem is that the dataset is distributed across a large number of processors in chunks that are spatially homogenous with respect to the underlying physical context in the simulation. Here, a potential solution to this problem using ensembles is explored.
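The committee idea described above can be illustrated with a minimal sketch. This is not the method of the thesis: it assumes a single numeric feature, binary labels (1 = "interesting"), one decision stump per chunk, and a simple majority vote. The names `train_stump` and `committee_predict` and all data values are hypothetical.

```python
from collections import Counter

def train_stump(samples):
    """Fit a one-feature threshold classifier to (value, label) pairs, labels 0/1."""
    best = None
    for threshold, _ in samples:
        for above in (0, 1):  # label predicted when value >= threshold
            errors = sum(
                (above if v >= threshold else 1 - above) != y for v, y in samples
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, above)
    _, threshold, above = best
    return lambda v: above if v >= threshold else 1 - above

def committee_predict(stumps, value):
    """Return the label chosen by the majority of the committee members."""
    votes = Counter(stump(value) for stump in stumps)
    return votes.most_common(1)[0][0]

# Hypothetical data: each chunk stands for the part of the dataset held by
# one processor; no classifier ever sees the data of another chunk.
chunks = [
    [(0.1, 0), (0.2, 0), (0.6, 1), (0.7, 1)],
    [(0.3, 0), (0.4, 0), (0.8, 1), (0.9, 1)],
    [(0.15, 0), (0.45, 0), (0.55, 1), (0.95, 1)],
]
stumps = [train_stump(chunk) for chunk in chunks]
prediction = committee_predict(stumps, 0.65)
```

An odd number of members avoids ties in a two-class vote. Because each member is trained only on local data, individual stumps can disagree (here the second one places its threshold higher than the others), and the vote is what smooths this out.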
|
54 |
Automatic gully detection using geographic object-based image analysis (GEOBIA) [Detecção automática de voçorocas a partir da análise de imagens baseada em objetos geográficos - GEOBIA] / Utsumi, Alex Garcez, January 2019
Advisor: Teresa Cristina Tarlé Pissarra / Co-advisor: David Luciano Rosalen / Committee: Luiz Henrique da Silva Rotta / Committee: Marcílio Vieira Martins Filho / Committee: Rejane Ennes Cicerelli / Committee: Newton La Scxala Junior / Abstract: Gullies are the most advanced stage of water erosion, causing extensive damage to the environment and to people. Due to the extent of this phenomenon and the difficulty of access in the field, automatic gully detection techniques have attracted interest, especially through Geographic Object-Based Image Analysis (GEOBIA). The objective of this work was to map gullies using GEOBIA from RapidEye images and SRTM data, in two regions located in Uberaba, Minas Gerais. To this end, the Segmentation Evaluation Index (SEI) was applied in the image segmentation stage. The rule set for gully detection was created empirically, in the InterIMAGE software, and automatically, with a decision tree algorithm. Accuracy was assessed using agreement coefficients extracted from the confusion matrix and, additionally, by overlap with manually digitized reference data. The SEI index allowed the creation of objects similar to real gullies, enabling the extraction of attributes specific to these targets. The rule set of the empirical model allowed gully detection in both study areas, even though these features occupied a small portion of the scene. The empirical models achieved satisfactory results: Kappa index of 0.74 and F-measure of 53.46% in area 1, and Kappa index of 0.73 and F-measure of 55.95% in area 2. Altimetric information proved to be an important parameter for gully detection, since removing slope from the empirical models reduced the F-measure by 34.90% in area 1 and 28.65% in are... / Doctorate
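As a reminder of how the two reported agreement measures are obtained from a confusion matrix, here is a minimal sketch for the two-class case (gully versus non-gully). The function name `kappa_and_f_measure` and the object counts are hypothetical and are not taken from the thesis.

```python
def kappa_and_f_measure(tp, fp, fn, tn):
    """Return (Cohen's kappa, F-measure) for a 2x2 confusion matrix."""
    total = tp + fp + fn + tn
    # Observed agreement: share of objects on the matrix diagonal.
    observed = (tp + tn) / total
    # Agreement expected by chance, from the row and column marginals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total**2
    kappa = (observed - expected) / (1 - expected)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return kappa, f_measure

# Hypothetical counts for a scene in which the target class is rare.
kappa, f_measure = kappa_and_f_measure(tp=40, fp=10, fn=20, tn=930)
```

Both measures ignore the large number of correctly rejected background objects to a greater degree than plain accuracy does, which is why they are informative when the target class covers only a small part of the scene.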
|
55 |
Classification techniques for hyperspectral remote sensing image data / Jia, Xiuping, Electrical Engineering, Australian Defence Force Academy, UNSW, January 1996
Hyperspectral remote sensing image data, such as that recorded by AVIRIS with 224 spectral bands, provides rich information on ground cover types. However, it presents new problems in machine assisted interpretation, mainly in long processing times and the difficulties of class training due to the low ratio of number of training samples to the number of bands. This thesis investigates feasible and efficient feature reduction and image classification techniques which are appropriate for hyperspectral image data. The study is reported in three parts. The first concerns a deterministic approach for hyperspectral data interpretation. Multigroup and multiple threshold spectral coding procedures, and associated techniques for spectral matching and classification, are proposed and tested. By coding on subgroups of bands using one or three thresholds, spectral searching and matching becomes simple, fast and free of the need for radiometric correction. Modifications of existing statistical techniques are proposed in the second part of the investigation. A block-based maximum likelihood classification technique is developed. Several subgroups are formed from the complete set of spectral bands in the data, based on the properties of global correlation among the bands. Subgroups which are poorly correlated with each other are treated independently using conventional maximum likelihood classification. Experimental results demonstrate that, when using appropriate subgroup sizes, the new method provides a compromise among classification accuracy, processing time and available training pixels. Furthermore, a segmented, and possibly multi-layer, principal components transformation is proposed as a possible feature reduction technique prior to classification, and for effective colour display. The transformation is performed efficiently on each of the highly correlated subgroups of bands independently. Selected features from each transformed subgroup can then be transformed again to achieve a satisfactory data reduction ratio and to generate the three most significant components for colour display. Classification accuracy is improved and high quality colour image display is achieved in experiments using two AVIRIS data sets.
|
56 |
The Decision making processes of semi-commercial farmers: a case study of technology adoption in Indonesia / Sambodo, Leonardo Adypurnama Alias Teguh, January 2007
An exploration of the creation and use of farmers' commonly used "rules of thumb" is required to conceptualize farmers' decision making processes. While farmers face complex situations, particularly when subsistence is an issue, they do appear to use simple rules in their decision making. To date inadequate attention has been given to understanding their reasoning processes in creating the rules, so this study traces the origins of farmers' beliefs, and extracts the decisive and dynamic elements in their decision making systems to provide this understanding.
The analysis was structured by using a model based on the Theory of Planned Behaviour (TPB). Modifications included recognizing a bargaining process (BP) and other decision stimuli to represent socio-cultural influences and sources of perception, respectively. Two analyses based on the Personal Construct Theory (PCT) and the Ethnographic Decision Tree Modelling (EDTM) were also applied to help elaborate the farmers' cognitive process and actual decision criteria. The method involved interviews in two villages in Lamongan Regency in East Java Province of Indonesia, where the farmers adopted an improved paddy-prawn system ("pandu").
The results highlighted that farmers use rational strategies, and that socio-cultural factors influence decision making. This was represented by interactions between the farmers' perceptions, their bargaining effort, and various background factors. The TPB model revealed that the farmers' perceptions about the potential of "pandu", and the interaction with their "significant others", influenced their intention to adopt "pandu". The farmers appeared to prefer a steady income and familiar practices at the same time as obtaining new information, mainly from their peers. When "pandu" failed to show sufficiently profitable results, most farmers decided to ignore or discontinue "pandu". This became the biggest disincentive to a wide and sustainable adoption. However, the PCT analysis showed that part of this problem also stemmed from the farmers' lack of resources and knowledge.
The farmers' restrictive conditions also led them to seek socio-cultural and practical support for their actions. This was highlighted by a bargaining process (BP) that integrated what the farmers had learned, and believed, into their adoption behaviour. The BP also captured the farmers' communication strategies when dealing with "pandu" as its adoption affected resource allocation within the family and required cooperation with neighbours. The PCT and EDTM analyses also confirmed how the BP accommodated different sets of decision criteria to form different adoption behaviours. Such a process indicated the importance of considering the adoption decision and the relevant changes resulting from the farmers' cognition. This provided a more dynamic and realistic description of the farmers' decision-making process than has previously been attempted.
Overall, the results suggested that semi-commercial farmers need to know, and confirm, that a new technology is significantly superior to the existing system, and can provide a secure income. The introduction of a new technology should use a participatory approach allowing negotiation, conflict mitigation and the creation of consensus among the relevant parties. This can be supported through better access to knowledge, information and financing. A specific and well-targeted policy intervention may also be needed to accommodate the diversity in the farmers' ways of learning and making decisions. Ways to improve the current analytical approaches are also suggested.
|
57 |
An approach to boosting from positive-only data / Mitchell, Andrew, Computer Science & Engineering, Faculty of Engineering, UNSW, January 2004
Ensemble techniques have recently been used to enhance the performance of machine learning methods. However, current ensemble techniques for classification require both positive and negative data to produce a result that is both meaningful and useful. Negative data is, however, sometimes difficult, expensive or impossible to access. In this thesis a learning framework is described that has a very close relationship to boosting. Within this framework a method is described which bears remarkable similarities to boosting stumps and that does not rely on negative examples. This is surprising since learning from positive-only data has traditionally been difficult. An empirical methodology is described and deployed for testing positive-only learning systems using commonly available multiclass datasets to compare these learning systems with each other and with multiclass learning systems. Empirical results show that our positive-only boosting-like method learns, using stumps as a base learner and from positive data only, successfully, and in the process does not pay too heavy a price in accuracy compared to learners that have access to both positive and negative data. We also describe methods of using positive-only learners on multiclass learning tasks and vice versa and empirically demonstrate the superiority of our method of learning in a boosting-like fashion from positive-only data over a traditional multiclass learner converted to learn from positive-only data. Finally we examine some alternative frameworks, such as when additional unlabelled training examples are given. Some theoretical justifications of the results and methods are also provided.
|
58 |
A Feasibility Study of Setting-up New Production Line : Either Partly Outsource a process or Fully Produce In-House / Cheepweasarash, Piansiri, Pakapongpan, Sarinthorn, January 2008
This paper presents a feasibility study of setting up a new potting tray production line based on two alternatives: partly outsourcing a process in the production line or carrying out all processes in-house. Both qualitative and quantitative approaches are used to analyze and compare the make-or-buy options. The nature of business in Thailand, particularly among SMEs, is also presented, since its characteristics influence how business is conducted and how decisions are made, especially in supply chain management. The literature on forecasting techniques, outsourcing decision frameworks, inventory management, and investment analysis is reviewed and applied to the empirical findings. As this production line is not yet in place, monthly sales volumes are forecast over a five-year time frame. Based on the forecast sales volume, simulations are run to obtain a probability distribution and project the demand required for each month. The projected demand is used as a baseline to determine the required safety stock of materials, inventory cost, time between production runs, and resource utilization for each option. Finally, in the quantitative analysis, the five-year forecast sales volume is used as a framework, and several decision-making techniques, such as break-even analysis, cash flow analysis, and decision trees, are employed to produce the financial results.
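The break-even analysis mentioned above can be illustrated with a small worked sketch. It assumes the usual linear cost model (a fixed cost plus a constant cost per unit); the function names and all figures are hypothetical and do not come from the study.

```python
def break_even_volume(fixed_cost, price, unit_cost):
    """Units that must be sold before revenue covers total cost."""
    return fixed_cost / (price - unit_cost)

def indifference_volume(fixed_make, unit_make, fixed_buy, unit_buy):
    """Volume at which making in-house and partly outsourcing cost the same."""
    return (fixed_make - fixed_buy) / (unit_buy - unit_make)

# Hypothetical figures: in-house production needs more equipment (higher
# fixed cost) but is cheaper per unit than the partly outsourced option.
price = 10.0
make = {"fixed": 120_000.0, "unit": 4.0}
buy = {"fixed": 30_000.0, "unit": 7.0}

be_make = break_even_volume(make["fixed"], price, make["unit"])
be_buy = break_even_volume(buy["fixed"], price, buy["unit"])
crossover = indifference_volume(make["fixed"], make["unit"], buy["fixed"], buy["unit"])
```

Under these assumed figures, outsourcing breaks even at a lower volume, while in-house production becomes the cheaper option once forecast volume exceeds the crossover point, which is why the choice depends on the sales forecast.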
|
59 |
Analysis of quality in a web panel: a study of the web panel members and their response patterns [Analys av kvalitet i en webbpanel : Studie av webbpanelsmedlemmarna och deras svarsmönster] / Tran, Vuong, Öhgren, Sebastian, January 2013
During 2012, the organization that commissioned this thesis carried out a telephone survey with 18,000 participants and a web panel survey with 708 participants. Those who took part in the telephone survey were given the choice to join the web panel. The purpose of this work is to study the participants of the telephone survey and see whether they reflect the Swedish population with regard to several socio-demographic factors. We also investigate whether the propensity to join the web panel differs among telephone survey participants with different socio-demographic affiliations. It is also of interest to study whether the response pattern differs between telephone survey participants who would like to join the web panel and those who decline. The response patterns of the telephone survey and the web panel survey are also compared, to see whether there are any differences between the two surveys. The statistical methods used in this thesis are descriptive statistics, multiple logistic regression and decision trees. The results of these methods indicate that the participants of the telephone survey do reflect the Swedish population with regard to certain socio-demographic factors, and that there is a slight difference in the propensity to join the web panel among people with different socio-demographic affiliations. There is also a slight difference in response pattern between participants who would and would not like to join the web panel, and differences in response pattern also exist between the telephone survey and the web panel survey.
|