1

Predictive analysis at Kronofogden : Classifying first-time debtors with an uplift model

Rantzer, Måns January 2016 (has links)
The use of predictive analysis is becoming more commonplace with each passing day, which strengthens the case for governmental institutions to adopt it as well. Kronofogden is in the middle of a digitization process and is therefore in a unique position to implement predictive analysis at the core of its operations. This project studies whether methods from predictive analysis can predict how many debts will be registered for a first-time debtor, through the use of uplift modeling. The difference between uplift modeling and conventional modeling is that uplift modeling aims to measure the difference in behavior after a treatment, in this case guidance from Kronofogden. Another aim of the project is to examine whether the scarce literature on uplift modeling has it right about how the conventional two-model approach fails to perform well in practical situations. The project shows results similar to Kronofogden’s internal evaluations. Three models were compared: random forests, gradient-boosted models and neural networks, with the last performing best. Positive uplift could be found for 1-5% of the debtors, meaning the current cutoff level of 15% is too high. The models have several potential sources of error, however: the modeling choices, the possibility that the data are not informative enough, or that the actual expected uplift for new data is equal to zero.
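A minimal sketch of the conventional two-model uplift approach the abstract refers to, assuming a pandas DataFrame of first-time debtors; the column names (got_guidance, n_new_debts) and the random forest choice are illustrative, not Kronofogden's actual setup:

```python
# Two-model uplift sketch: fit one outcome model on treated (guided) debtors and
# one on untreated debtors, then score the difference in predicted outcomes.
# Column names and model choice are illustrative only.
from sklearn.ensemble import RandomForestRegressor

def two_model_uplift(df, features, outcome="n_new_debts", treatment="got_guidance"):
    treated = df[df[treatment] == 1]
    control = df[df[treatment] == 0]

    m_treated = RandomForestRegressor(n_estimators=200, random_state=0)
    m_control = RandomForestRegressor(n_estimators=200, random_state=0)
    m_treated.fit(treated[features], treated[outcome])
    m_control.fit(control[features], control[outcome])

    # Uplift = predicted outcome with guidance minus predicted outcome without it;
    # for "number of future debts" a negative value means guidance appears to help.
    return m_treated.predict(df[features]) - m_control.predict(df[features])
```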
2

Empirical Studies of Online Crowdfunding

Gao, Qiang January 2016 (has links)
Online crowdfunding, an emerging business model, has been thriving for the last decade. It enables small firms and individuals to conduct financial transactions that would previously have been impossible. Along with unprecedented opportunities, two fundamental issues still hinder crowdfunding's ability to fulfill its potential: information asymmetry and a limited understanding of the impact of crowdfunding. Both are exacerbated by the "virtual" nature of these marketplaces. The success of this new market therefore depends critically on improving existing mechanisms, or designing new ones, to mitigate the issue of unobservable fundraiser quality, which can lead to adverse selection and market collapse; and on better understanding the impact of crowdfunding, particularly its offline impact, which will allow the effective allocation of scarce resources. My dissertation includes three essays on these topics, using data from debt-, reward- and donation-based crowdfunding contexts, respectively. My first two essays focus on two popular but understudied components of crowdfunding campaigns, texts and videos, and aim at predicting fundraiser quality by quantifying them. In particular, the first essay develops scalable approaches to extracting linguistic features from the texts provided by borrowers when they request funds, and uses those features to explain and predict the repayment probability of problematic loans. The second essay focuses on videos in reward crowdfunding, and preliminary results show excellent predictive performance and strong associations between multi-dimensional video information and crowdfunding campaign success and quality. The last essay investigates the impact of educational crowdfunding on school performance, using data from a crowdfunding platform for educational purposes. The results show that educational crowdfunding plays a role far beyond that of a financial source. Overall, my dissertation identifies the non-financial impact of crowdfunding as well as potential opportunities for efficiency improvement in the crowdfunding market, which have thus far not been documented in the literature.
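A generic stand-in for the first essay's pipeline, sketched here with TF-IDF features and logistic regression; the dissertation's actual linguistic features and model are not specified in the abstract, and the toy texts and labels below are invented for illustration:

```python
# Illustrative text-to-repayment pipeline: TF-IDF over borrower text, then a
# logistic regression giving a repayment probability. Toy data only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "I need this loan to restock my small shop before the holiday season.",
    "Medical bills piled up and I am behind on every payment.",
]
repaid = [1, 0]  # toy labels: 1 = loan repaid, 0 = default

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=1),
    LogisticRegression(max_iter=1000),
)
model.fit(texts, repaid)
print(model.predict_proba(["Expanding my bakery, steady income for five years."])[:, 1])
```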
3

A machine learning approach to fundraising success in higher education

Ye, Liang 01 May 2017 (has links)
New donor acquisition and current donor promotion are the two major programs in fundraising for higher education, and developing proper targeting strategies plays an important role in both programs. This thesis presents machine learning solutions as targeting strategies for both programs, based on alumni data readily available in almost any institution. The targeting strategy for new donor acquisition is modeled as a donor identification problem. The Gaussian naïve Bayes, random forest, and support vector machine algorithms are used and evaluated. The test results show that, having been trained with enough samples, all three algorithms can distinguish donors from rejectors well, and big donors are identified more often than others. While there is a trade-off between the cost of soliciting candidates and the success of donor acquisition, the results show that in a practical scenario where the models are properly used as the targeting strategy, more than 85% of new donors and more than 90% of new big donors can be acquired when only 40% of the candidates are solicited. The targeting strategy for donor promotion is modeled as a machine learning problem of predicting promising donors (i.e., those who will upgrade their pledge). The Gaussian naïve Bayes, random forest, and support vector machine algorithms are tested. The test results show that all three algorithms can distinguish promising donors from non-promising donors (i.e., those who will not upgrade their pledge). When the age information is known, the best model produces an overall accuracy of 97% on the test set. The results show that in a practical scenario where the models are properly used as the targeting strategy, more than 85% of promising donors can be acquired when only 26% of the candidates are solicited. / Graduate / liangye714@gmail.com
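A sketch of the acquisition-targeting idea described above: train a classifier on labeled alumni records, rank the remaining candidates by predicted donation probability, and solicit only the top fraction of the list. The random forest and the 40% solicitation fraction follow the abstract; the feature matrices themselves are assumed to exist:

```python
# Rank-and-solicit sketch for donor acquisition. y_train uses 1 = donor,
# 0 = rejector; X_candidates holds alumni not yet solicited.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def select_candidates(X_train, y_train, X_candidates, solicit_fraction=0.40):
    clf = RandomForestClassifier(n_estimators=300, random_state=0)
    clf.fit(X_train, y_train)
    scores = clf.predict_proba(X_candidates)[:, 1]   # predicted donor probability
    k = int(np.ceil(solicit_fraction * len(scores)))
    return np.argsort(scores)[::-1][:k]              # indices of candidates to solicit
```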
4

Analyse prédictive du devenir des médicaments dans l'environnement / Predictive analysis of the fate of pharmaceuticals in the environment

Laurencé, Céline 05 December 2011 (has links)
Pharmaceuticals, as well as personal care products, are classified as emerging pollutants of increasing concern due to possible negative impacts on ecosystems. They are constantly introduced into sewage treatment plants through excretion, through the disposal by flushing of unused or expired medication, or directly within the sewage effluents of plants or hospitals. They end up in surface and ground waters and can even be found in drinking water. Many studies report adverse effects on terrestrial and aquatic organisms. Yet pharmaceuticals have complex chemical structures capable of reacting in an aqueous medium under the action of chemical, biological or physical agents. The transformation products (TPs) thus gradually replace the parent drug in the environment; in addition, these transformation products constitute markers of the past or current presence of the drug in the environment. Faced with this problem, we believe it is necessary to synthesize the transformation products of the parent compounds in order to develop methods for their detection. The proposed approach consists, first, of preparing the largest possible number of TPs of a particular, widely used drug using three complementary approaches: bioconversion, electro-Fenton and electrochemical oxidation. A second step is to identify the structures most likely to be present in the environment. The expected advance is the development of a predictive methodology applicable to the study of any molecule involved in environmental risk.
5

Preana: Game-theory Based Prediction with Reinforcement Learning

Eftekhari, Zahra 01 December 2014 (has links)
We have developed a game-theory based prediction tool, named Preana, based on a promising model developed by Professor Bruce Bueno de Mesquita. The first part of this work is dedicated to exploring the specifics of Mesquita's algorithm and reproducing the factors and features that have not been revealed in the literature. In addition, we have developed a learning mechanism to model the players' reasoning ability when it comes to taking risks. Preana can predict the outcome of any issue with multiple stakeholders who have conflicting interests in economics, business, and political science. We have utilized game theory, expected utility theory, median voter theory, probability distributions and reinforcement learning. We were able to reproduce Mesquita's reported results, have included two case studies from his publications, and have compared his results to those of Preana. We have also applied Preana to Iran's 2013 presidential election to verify the accuracy of the prediction made by Preana.
6

Integrating predictive analysis in self-adaptive pervasive systems / Intégration de l’analyse prédictive dans des systèmes auto-adaptatifs

Paez Anaya, Ivan Dario 22 September 2015 (has links)
In recent years there has been growing interest in software systems that are able to cope with the dynamics of constantly evolving environments. Self-adaptive systems are needed to adapt dynamically to new situations while maximizing performance and availability. Ubiquitous and pervasive systems operate in complex and heterogeneous environments and run on resource-constrained devices, where events can compromise the quality of the system. It is therefore desirable to rely on mechanisms that adapt the system according to the events occurring in its execution context. In particular, the Software Engineering for Self-Adaptive Systems (SEAMS) community strives to achieve a set of self-management properties in computing systems. These self-management properties include the so-called self-configuring, self-healing, self-optimizing and self-protecting properties. To achieve self-management, the software system implements an autonomic control-loop mechanism known as the MAPE-K loop [78]. The MAPE-K loop is the reference paradigm for designing self-adaptive software in the context of autonomic computing. This model consists of sensors and effectors together with four key activities, Monitor, Analyze, Plan and Execute, complemented by a knowledge base called Knowledge, which passes information between the other activities [78]. A study of the recent literature on the subject [109, 71] shows that dynamic adaptation is generally performed reactively, in which case software systems are unable to anticipate recurring problematic situations. In some situations this can lead to unnecessary overhead or to temporary unavailability of system resources. A proactive approach, by contrast, does not simply act in response to events from the environment; it behaves in a goal-directed way, taking anticipatory initiatives to improve system performance or quality of service. / In this thesis we propose proactive self-adaptation by integrating predictive analysis into two phases of the software process. At design time, we propose a predictive modeling process that includes the following activities: defining goals, collecting data, selecting the model structure, preparing the data, building candidate predictive models, training, testing and cross-validating the candidate models, and selecting the "best" models based on a measure of model goodness. At runtime, we consume the predictions of the selected predictive models using the running system's actual data. Depending on the input data and the time allowed for the learning algorithms, we argue that the software system can foresee possible future input variables of the system and adapt proactively in order to accomplish medium- and long-term goals and requirements.
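A skeleton of the MAPE-K loop described above, with the Analyze activity consuming a predictive model so that adaptation can be triggered proactively rather than reactively; the sensor/effector interfaces, the threshold and the scale-out action are placeholders, not the thesis's actual design:

```python
# MAPE-K loop skeleton with a predictive Analyze step. Sensors, effectors and
# the predictive model are assumed to be provided; names and thresholds are
# illustrative only.
import time

class MapeKLoop:
    def __init__(self, sensors, effectors, model, knowledge=None):
        self.sensors, self.effectors = sensors, effectors
        self.model = model                                 # predictive model built at design time
        self.knowledge = knowledge or {"threshold": 0.8}   # shared Knowledge base

    def monitor(self):
        return self.sensors.read()                         # e.g. CPU load, latency, battery level

    def analyze(self, observation):
        forecast = self.model.predict([observation])[0]    # predicted future load/quality
        self.knowledge["last_forecast"] = forecast
        return forecast > self.knowledge["threshold"]      # adaptation needed?

    def plan(self):
        return [("scale_out", 1)]                          # choose adaptation actions

    def execute(self, actions):
        for action, argument in actions:
            self.effectors.apply(action, argument)

    def run(self, period_s=5.0):
        while True:                                        # one control-loop iteration per period
            if self.analyze(self.monitor()):
                self.execute(self.plan())
            time.sleep(period_s)
```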
7

Applications of proteochemometrics (PCM) : from species extrapolation to cell-line sensitivity modelling / Applications de proteochemometrics : à partir de l'extrapolation des espèces à la modélisation de la sensibilité de la lignée cellulaire

Cortes Ciriano, Isidro 16 June 2015 (has links)
Proteochemometrics (PCM) is a predictive bioactivity modelling method to simultaneously model the bioactivity of multiple ligands against multiple targets. PCM therefore permits exploring the selectivity and promiscuity of ligands on biomolecular systems of different complexity, such as proteins or even cell-line models. In practice, each ligand-target interaction is encoded by the concatenation of ligand and target descriptors. These descriptors are then used to train a single machine learning model. This simultaneous inclusion of both chemical and target information enables extrapolation and interpolation to predict the bioactivity of compounds on targets that may not be present in the training set. In this thesis, a methodological advance in the field is first introduced, namely how Bayesian inference (Gaussian Processes) can be successfully applied in the context of PCM for (i) the prediction of compound bioactivity along with an error estimate for the prediction; (ii) the determination of the applicability domain of a PCM model; and (iii) the inclusion of the experimental uncertainty of the bioactivity measurements. Additionally, the influence of noise in bioactivity models is benchmarked across a panel of 12 machine learning algorithms, showing that noise in the input data has a marked and algorithm-dependent influence on predictive power. Subsequently, two R packages are presented. The first one, Chemically Aware Model Builder (camb), constitutes an open source platform for the generation of predictive bioactivity models. The functionalities of camb include: (i) normalized chemical structure representation, (ii) calculation of 905 one- and two-dimensional physicochemical descriptors and of 14 fingerprints for small molecules, (iii) 8 types of amino acid descriptors, (iv) 13 whole protein sequence descriptors, and (v) training, validation and visualization of predictive models. The second package, conformal, permits the calculation of confidence intervals for individual predictions in the case of regression, and p-values in classification settings. The usefulness of PCM for concomitantly optimizing compound selectivity and potency is subsequently illustrated in two application scenarios: (a) modelling isoform-selective cyclooxygenase inhibition; and (b) large-scale cancer cell-line drug sensitivity prediction, where the predictive signal of several types of cell-line profiling data is benchmarked, among others basal gene expression, gene copy-number variation, exome sequencing, and protein abundance data. Overall, the application of PCM in these two scenarios lets us conclude that PCM is a suitable technique to model the activity of ligands exhibiting uncorrelated bioactivity profiles across a panel of targets, which can range from protein binding sites (a) to cancer cell lines (b).
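A minimal sketch of the PCM encoding and Gaussian Process modelling described above: each example is a concatenated ligand-target descriptor pair, and a single model over all pairs yields both a prediction and an uncertainty estimate. The random descriptor matrices and toy pKi values are placeholders, and the RBF-plus-noise kernel is just one reasonable choice:

```python
# PCM-style sketch: concatenate ligand and target descriptors, fit one Gaussian
# Process over all ligand-target pairs, and predict with uncertainty estimates.
# Descriptors and bioactivities below are random placeholders.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
ligand_desc = rng.normal(size=(40, 8))          # e.g. physicochemical descriptors
target_desc = rng.normal(size=(40, 5))          # e.g. amino-acid/sequence descriptors
pki = rng.normal(loc=6.5, scale=1.0, size=40)   # toy bioactivity values

X = np.hstack([ligand_desc, target_desc])       # concatenated ligand-target encoding
gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gp.fit(X, pki)

mean, std = gp.predict(X[:3], return_std=True)  # prediction plus its uncertainty
print(mean, std)
```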
8

Využitie VBA v podnikovej praxi / Use of VBA in business practice

Nagrant, Richard January 2013 (has links)
My diploma thesis is dedicated to data mining, to its subfield of predictive analysis, and to a process used at HP for various predictions. The introductory part describes the concept of data mining, its role and its benefits. Predictive analysis, one of its tasks, is then discussed together with the principles of its use. This is followed by the HP framework, which applies one of the tasks of predictive analysis. In the practical part, I applied VBA to automate one part of the process of converting text to numerical data under different rules. The result is a set of supporting macros, a system sheet and a new ribbon for data transformation.
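The thesis implements the conversion with VBA macros in Excel; the sketch below illustrates the same rule-based text-to-number mapping in Python instead, with made-up rules and column names:

```python
# Rule-based text-to-numeric conversion, sketched in Python rather than VBA.
# The mapping rules and column names are invented for illustration.
import pandas as pd

RULES = {
    "priority": {"low": 1, "medium": 2, "high": 3},
    "status":   {"open": 0, "in progress": 1, "closed": 2},
}

def to_numeric(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    for column, mapping in RULES.items():
        out[column] = out[column].str.strip().str.lower().map(mapping)
    return out

print(to_numeric(pd.DataFrame({"priority": ["High", "low"], "status": ["Open", "Closed"]})))
```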
9

Monitorování stavu při obrábění bloku motoru - Plazma / Condition monitoring of engine block machining - Plasma

Váško, Ondřej January 2020 (has links)
The aim of this thesis is to design and implement two methods of predictive analysis for the company Škoda Auto a.s. In the first part I conducted a literature search on methods of predictive diagnostics. In the next part, with the help of the thesis consultant at Škoda Auto a.s., an analysis of the assembly line and of the data blocks from machinery and measuring was carried out. I then designed and programmed a data generator based on real data, and created two methods of predictive diagnostics capable of analyzing input data and deciding on their condition. Finally, I tested the two methods and evaluated the accuracy of their predictions. The main output of my thesis is two methods of predictive diagnostics that are feasible in the real world.
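The abstract does not name the two diagnostic methods, so the following is only a generic illustration of one common condition-monitoring approach: flagging a signal that drifts outside control limits derived from healthy reference data. All numbers are synthetic.

```python
# Generic condition-monitoring check (not the thesis's actual methods): a signal
# is considered healthy while it stays within k-sigma limits of a healthy run.
import numpy as np

def condition_ok(signal, reference, k=3.0):
    mu, sigma = np.mean(reference), np.std(reference)
    return bool(np.all(np.abs(np.asarray(signal) - mu) <= k * sigma))

healthy_run = np.random.default_rng(1).normal(10.0, 0.5, size=1000)  # reference data
print(condition_ok([10.2, 9.8, 10.1], healthy_run))    # True  -> looks healthy
print(condition_ok([10.2, 14.9, 10.1], healthy_run))   # False -> possible fault
```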
10

Predictive model to reduce the dropout rate of university students in Perú: Bayesian Networks vs. Decision Trees

Medina, Erik Cevallos, Chunga, Claudio Barahona, Armas-Aguirre, Jimmy, Grandon, Elizabeth E. 01 June 2020 (has links)
The full text of this work is not available in the UPC Academic Repository due to restrictions imposed by the publisher where it has been published. / This research proposes a prediction model that might help reduce the dropout rate of university students in Peru. For this, a three-phase predictive analysis model was designed and combined with the stages proposed by the IBM SPSS Modeler methodology. Bayesian network techniques were compared with decision trees for their level of accuracy over other algorithms in an Educational Data Mining (EDM) scenario. Data were collected from 500 undergraduate students from a private university in Lima. The results indicate that Bayesian networks behave better than decision trees based on the metrics of precision, accuracy, specificity, and error rate. In particular, the accuracy of Bayesian networks reaches 67.10% while the accuracy of decision trees is 61.92% on the training sample for the iteration with an 8:2 ratio. On the other hand, the variables athletic person (0.30%), own house (0.21%), and high school grades (0.13%) are the ones that contribute most to the prediction model for both Bayesian networks and decision trees.
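A sketch of this kind of comparison on an 8:2 train/test split; scikit-learn has no general Bayesian-network learner, so Gaussian naïve Bayes stands in for the Bayesian model here, and the student features and dropout labels are synthetic placeholders rather than the study's data:

```python
# Compare a (naive) Bayesian classifier with a decision tree on an 8:2 split.
# GaussianNB is a stand-in for a full Bayesian network; data are synthetic.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))                         # e.g. grades, socio-economic variables
y = (X[:, 0] + rng.normal(size=500) > 0).astype(int)  # 1 = drops out, 0 = continues

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
for name, clf in [("Bayesian (naive)", GaussianNB()),
                  ("Decision tree", DecisionTreeClassifier(max_depth=5))]:
    clf.fit(X_tr, y_tr)
    print(name, accuracy_score(y_te, clf.predict(X_te)))
```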
