1 |
Empirical Studies of Online Crowdfunding
Gao, Qiang, January 2016
Online crowdfunding, an emerging business model, has been thriving for the last decade. It enables small firms and individuals to conduct financial transactions that would previously have been impossible. Along with unprecedented opportunities, two fundamental issues still hinder crowdfunding's ability to fulfill its potential: information asymmetry and a limited understanding of the impact of crowdfunding. Both are exacerbated by the "virtual" nature of these marketplaces. The success of this new market therefore depends critically on two things: improving existing mechanisms or designing new ones to mitigate the issue of unobservable fundraiser quality, which can lead to adverse selection and market collapse; and better understanding the impact of crowdfunding, particularly its offline impact, which will allow the effective allocation of scarce resources. My dissertation includes three essays on these topics, using data from debt-, reward-, and donation-based crowdfunding contexts, respectively. My first two essays focus on two popular but understudied components of crowdfunding campaigns, texts and videos, and aim to predict fundraiser quality by quantifying them. In particular, the first essay focuses on developing scalable approaches to extracting linguistic features from the texts borrowers provide when they request funds, and on using those features to explain and predict the repayment probability of problematic loans. The second essay focuses on videos in reward crowdfunding; preliminary results show excellent predictive performance and strong associations between multi-dimensional video information and crowdfunding campaign success and quality. The last essay investigates the impact of educational crowdfunding on school performance, using data from a crowdfunding platform for educational purposes. The results show that educational crowdfunding plays a role far beyond that of a financial source.
Overall, my dissertation identifies the non-financial impact of crowdfunding as well as potential opportunities for efficiency improvement in the crowdfunding market, which have thus far not been documented in the literature.
|
2 |
A machine learning approach to fundraising success in higher education
Ye, Liang, 01 May 2017
New donor acquisition and current donor promotion are the two major programs in fundraising for higher education, and developing proper targeting strategies plays an important role in both programs. This thesis presents machine learning solutions as targeting strategies for both programs, based on alumni data that is readily available in almost any institution. The targeting strategy for new donor acquisition is modeled as a donor identification problem. The Gaussian naïve Bayes, random forest, and support vector machine algorithms are used and evaluated. The test results show that, when trained with enough samples, all three algorithms can distinguish donors from rejectors well, and big donors are identified more often than others. While there is a trade-off between the cost of soliciting candidates and the success of donor acquisition, the results show that in a practical scenario where the models are properly used as the targeting strategy, more than 85% of new donors and more than 90% of new big donors can be acquired when only 40% of the candidates are solicited. The targeting strategy for donor promotion is modeled as the problem of predicting promising donors (i.e., those who will upgrade their pledge). The Gaussian naïve Bayes, random forest, and support vector machine algorithms are tested. The test results show that all three algorithms can distinguish promising donors from non-promising donors (i.e., those who will not upgrade their pledge). When age information is known, the best model produces an overall accuracy of 97% on the test set. The results show that in a practical scenario where the models are properly used as the targeting strategy, more than 85% of promising donors can be acquired when only 26% of the candidates are solicited. / Graduate / liangye714@gmail.com
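The trade-off between solicitation cost and donor acquisition described in this abstract can be illustrated by ranking candidates by model score and measuring the share of actual donors reached in the top fraction. The following is a minimal Python sketch under stated assumptions: the function name `capture_rate`, the scores, and the labels are hypothetical and are not taken from the thesis.

```python
import math

def capture_rate(scores, is_donor, fraction):
    """Share of actual donors reached when soliciting the top `fraction` of candidates by score."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    total_donors = sum(is_donor)
    if total_donors == 0:
        raise ValueError("no donors in the sample")
    # Rank candidates from highest to lowest predicted donor score.
    ranked = sorted(zip(scores, is_donor), key=lambda pair: pair[0], reverse=True)
    # Number of candidates solicited (at least one).
    n_solicited = max(1, math.floor(fraction * len(ranked)))
    reached = sum(label for _, label in ranked[:n_solicited])
    return reached / total_donors

# Hypothetical scores from any classifier (naive Bayes, random forest, SVM, ...).
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10, 0.05]
is_donor = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
print(capture_rate(scores, is_donor, 0.4))
```

Sweeping `fraction` from 0 to 1 traces a cumulative gains curve, which is one way a figure such as "85% of donors at 40% solicited" could be read off.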
|
3 |
Analyse prédictive du devenir des médicaments dans l'environnement / Predictive analysis of the fate of pharmaceuticals in the environment
Laurencé, Céline, 05 December 2011
Pharmaceutical substances are classified as emerging environmental contaminants and are attracting growing attention because of their potentially harmful effects on ecosystems. After excretion or improper disposal, drugs end up in surface water, groundwater, and even drinking water. Many ecotoxicological studies aim to measure their impact on ecosystems. However, these studies focus mainly on the drug itself, whereas many drugs are liable to be transformed in the environment through biotic (microorganisms) and/or abiotic (chemical treatment, photodegradation) processes. The transformation products (TPs) thus formed gradually replace the parent drug in the environment and may also be ecotoxic there. To address this problem, we propose, starting from a widely used drug, to synthesize its plausible TPs and to develop methods for detecting them in complex matrices. The TPs will be obtained through a multidisciplinary approach combining bioconversion, electro-Fenton, and electrochemical oxidation. Comparative analysis of the compounds obtained by these different approaches will make it possible to select the most probable TPs. / Pharmaceuticals as well as personal care products are classified as emerging pollutants of increasing concern due to possible negative impacts on ecosystems. They are constantly introduced into sewage treatment plants through excretion, through disposal by flushing of unused or expired medication, or directly within the sewage effluents of plants or hospitals. They end up in surface and ground waters and can even be found in drinking water. Many studies report adverse effects on terrestrial and aquatic organisms.
Pharmaceuticals have complex chemical structures capable of reacting in an aqueous medium under the action of chemical, biological, or physical agents. Thus, the transformation products (TPs) gradually replace the parent drug in the environment. In addition, these transformation products constitute markers of the past or current presence of the drug in the environment. Faced with this problem, we believe it is necessary to synthesize the transformation products of the parent compounds in order to develop methods for their detection. The proposed method consists, first, in preparing the largest possible number of TPs of a particular drug using three complementary approaches: bioconversion, electro-Fenton, and electrochemical oxidation. A second step is to identify the structures that are most likely to be present in the environment. The expected advance is a predictive methodology applicable to the study of any molecule involved in environmental risk.
|
4 |
Preana: Game-theory Based Prediction with Reinforcement Learning
Eftekhari, Zahra, 01 December 2014
We have developed a game-theory based prediction tool, named Preana, based on a promising model developed by Professor Bruce Bueno de Mesquita. The first part of this work is dedicated to exploring the specifics of Bueno de Mesquita's algorithm and reproducing the factors and features that have not been revealed in the literature. In addition, we have developed a learning mechanism to model the players' reasoning ability when it comes to taking risks. Preana can predict the outcome of any issue with multiple stakeholders who have conflicting interests in economic, business, and political sciences. We have utilized game theory, expected utility theory, median voter theory, probability distributions, and reinforcement learning. We were able to reproduce Bueno de Mesquita's reported results, and we have included two case studies from his publications and compared his results to those of Preana. We have also applied Preana to Iran's 2013 presidential election to verify the accuracy of its predictions.
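Median voter theory, one of the ingredients listed in this abstract, can be illustrated with a weighted median over stakeholder positions. The sketch below is a simplified baseline under stated assumptions: weighting each stakeholder by influence times salience is an illustrative choice, and the function name and data are hypothetical. It is not Preana's algorithm, whose details the abstract does not give.

```python
def weighted_median_position(actors):
    """Return the position at which cumulative weight first reaches half the total.

    `actors` is a list of (position, influence, salience) tuples. The weight of
    each actor is influence * salience (an assumption made for illustration).
    """
    weighted = sorted((pos, infl * sal) for pos, infl, sal in actors)
    total = sum(w for _, w in weighted)
    if total <= 0:
        raise ValueError("total weight must be positive")
    cumulative = 0.0
    for pos, w in weighted:
        cumulative += w
        if cumulative >= total / 2:
            return pos
    return weighted[-1][0]

# Hypothetical stakeholders on a 0-100 policy scale.
actors = [
    (10, 20, 0.5),  # weight 10
    (40, 50, 1.0),  # weight 50
    (60, 30, 1.0),  # weight 30
    (90, 40, 0.5),  # weight 20
]
print(weighted_median_position(actors))
```

A full model of this kind would go further, for example by letting stakeholders shift positions over several bargaining rounds; the weighted median only gives a static first guess.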
|
5 |
Integrating predictive analysis in self-adaptive pervasive systems / Intégration de l’analyse prédictive dans des systèmes auto-adaptatifs
Paez Anaya, Ivan Dario, 22 September 2015
In recent years, there has been growing interest in software systems that can cope with the dynamics of constantly changing environments. Today, self-adaptive systems are needed to adapt dynamically to new situations while maximizing performance and availability. Ubiquitous and pervasive systems operate in complex, heterogeneous environments and rely on resource-constrained devices, where events can compromise system quality. It is therefore desirable to rely on mechanisms that adapt the system according to events occurring in the execution context. In particular, the Software Engineering for Self-Adaptive Systems (SEAMS) community strives to achieve a set of self-management properties in computing systems. These properties include self-configuring, self-healing, self-optimizing, and self-protecting. To achieve self-management, the software system implements an autonomic control loop known as the MAPE-K loop [78]. The MAPE-K loop is the reference paradigm for designing self-adaptive software in the context of autonomic computing. This model consists of sensors and effectors together with four key activities, Monitor, Analyze, Plan, and Execute, complemented by a knowledge base, called Knowledge, that passes information between the other activities [78]. A review of the recent literature on the subject [109, 71] shows that dynamic adaptation is generally performed reactively, in which case software systems cannot anticipate recurring problematic situations. In some situations, this can lead to unnecessary overhead or to temporary unavailability of system resources.
In contrast, a proactive approach does not simply act in response to events in the environment; its behavior is goal-driven, taking initiatives in advance to improve system performance or quality of service. / In this thesis we propose proactive self-adaptation by integrating predictive analysis into two phases of the software process. At design time, we propose a predictive modeling process comprising the following activities: define goals, collect data, select the model structure, prepare data, build candidate predictive models, train, test, and cross-validate the candidate models, and select the "best" models based on a measure of model goodness. At runtime, we consume the predictions from the selected predictive models using the running system's actual data. Depending on the input data and the time allowed for the learning algorithms, we argue that the software system can foresee possible future input variables of the system and adapt proactively in order to accomplish medium- and long-term goals and requirements.
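The difference between reactive and proactive adaptation in a MAPE-K loop can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the class `ProactiveLoop`, the naive linear forecast, the threshold, and the `scale_up` action are all hypothetical and stand in for the predictive models the thesis selects at design time.

```python
def forecast_next(history):
    """Naive linear extrapolation from the last two observations (a stand-in predictor)."""
    if len(history) < 2:
        return history[-1]
    return history[-1] + (history[-1] - history[-2])

class ProactiveLoop:
    """Minimal MAPE-K style loop: Monitor, Analyze, Plan, Execute over shared Knowledge."""

    def __init__(self, threshold):
        self.knowledge = {"history": [], "threshold": threshold, "actions": []}

    def monitor(self, reading):
        # Record the latest sensor reading in the knowledge base.
        self.knowledge["history"].append(reading)

    def analyze(self):
        # Proactive step: compare the predicted next value, not the current one,
        # with the threshold.
        predicted = forecast_next(self.knowledge["history"])
        return predicted > self.knowledge["threshold"]

    def plan(self, violation_expected):
        return "scale_up" if violation_expected else None

    def execute(self, action):
        if action is not None:
            step_index = len(self.knowledge["history"]) - 1
            self.knowledge["actions"].append((step_index, action))

    def step(self, reading):
        self.monitor(reading)
        self.execute(self.plan(self.analyze()))

loop = ProactiveLoop(threshold=80)
for load in [50, 60, 72]:
    loop.step(load)
print(loop.knowledge["actions"])
```

In this run the loop adapts at the third reading, while the observed load (72) is still below the threshold, because the forecast already exceeds it. A reactive loop would compare the current reading instead and act only after the violation.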
|
6 |
Applications of proteochemometrics (PCM) : from species extrapolation to cell-line sensitivity modelling / Applications de proteochemometrics : à partir de l'extrapolation des espèces à la modélisation de la sensibilité de la lignée cellulaire
Cortes Ciriano, Isidro, 16 June 2015
Proteochemometrics (PCM) is a predictive bioactivity modelling method to simultaneously model the bioactivity of multiple ligands against multiple targets. PCM therefore makes it possible to explore the selectivity and promiscuity of ligands on biomolecular systems of different complexity, such as proteins or even cell-line models. In practice, each ligand-target interaction is encoded by the concatenation of ligand and target descriptors. These descriptors are then used to train a single machine learning model. This simultaneous inclusion of both chemical and target information enables extrapolation and interpolation to predict the bioactivity of compounds on targets that may not be present in the training set. In this thesis, a methodological advance in the field is first introduced, namely how Bayesian inference (Gaussian Processes) can be successfully applied in the context of PCM for (i) the prediction of compound bioactivity along with an error estimate for the prediction; (ii) the determination of the applicability domain of a PCM model; and (iii) the inclusion of the experimental uncertainty of the bioactivity measurements. Additionally, the influence of noise in bioactivity models is benchmarked across a panel of 12 machine learning algorithms, showing that noise in the input data has a marked and different influence on the predictive power of the algorithms considered. Subsequently, two R packages are presented. The first one, Chemically Aware Model Builder (camb), constitutes an open source platform for the generation of predictive bioactivity models.
The functionalities of camb include: (i) normalized chemical structure representation, (ii) calculation of 905 one- and two-dimensional physicochemical descriptors, and of 14 fingerprints for small molecules, (iii) 8 types of amino acid descriptors, (iv) 13 whole protein sequence descriptors, and (v) training, validation, and visualization of predictive models. The second package, conformal, permits the calculation of confidence intervals for individual predictions in the case of regression, and P values for classification settings. The usefulness of PCM for concomitantly optimizing compound selectivity and potency is subsequently illustrated in two application scenarios: (a) modelling isoform-selective cyclooxygenase inhibition; and (b) large-scale cancer cell-line drug sensitivity prediction, where the predictive signal of several cell-line profiling data sets is benchmarked, among others basal gene expression, gene copy-number variation, exome sequencing, and protein abundance data. Overall, the application of PCM in these two scenarios lets us conclude that PCM is a suitable technique to model the activity of ligands exhibiting uncorrelated bioactivity profiles across a panel of targets, which can range from protein binding sites (a) to cancer cell-lines (b).
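Confidence intervals for individual regression predictions, as mentioned for the conformal package, can be illustrated with split (inductive) conformal prediction. The Python sketch below is an assumption-laden illustration of that general technique: the function name `conformal_interval` and the calibration residuals are hypothetical, and the code does not reproduce the API of the R package described in the abstract.

```python
import math

def conformal_interval(prediction, calibration_residuals, alpha=0.2):
    """Split conformal interval with nominal coverage 1 - alpha.

    `calibration_residuals` are (observed - predicted) values on a held-out
    calibration set that was not used to train the model.
    """
    n = len(calibration_residuals)
    if n == 0:
        raise ValueError("calibration set is empty")
    # Nonconformity scores: absolute residuals, sorted ascending.
    scores = sorted(abs(r) for r in calibration_residuals)
    # Rank of the score used as interval half-width.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for the requested coverage.
        return (-math.inf, math.inf)
    half_width = scores[k - 1]
    return (prediction - half_width, prediction + half_width)

# Hypothetical calibration residuals (e.g. in pIC50 units).
residuals = [0.1, -0.2, 0.3, -0.4, 0.5, -0.6, 0.7, -0.8, 0.9]
low, high = conformal_interval(5.0, residuals, alpha=0.2)
print(low, high)
```

With nine calibration points and `alpha=0.2`, the eighth smallest absolute residual (0.8) becomes the half-width. Requesting 95% coverage from the same nine points returns an unbounded interval, which shows why the calibration set size limits the achievable confidence level.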
|
7 |
Využitie VBA v podnikovej praxi / Use of VBA in business practice
Nagrant, Richard, January 2013
My diploma thesis deals with data mining, in particular predictive analysis, and with the process used at HP for various predictions. The introductory part describes the concept of data mining, its role, and its benefits. Predictive analysis is one of its tasks, and I discuss the principles of its use. A description of the HP framework, which uses one of the tasks of predictive analysis, follows. In the practical part, I used VBA to automate one part of the process, the conversion of text to numerical data under various rules. The result is a set of supporting macros, a system sheet, and a new ribbon for data transformation.
|
8 |
Predictive model to reduce the dropout rate of university students in Perú: Bayesian Networks vs. Decision Trees
Medina, Erik Cevallos, Chunga, Claudio Barahona, Armas-Aguirre, Jimmy, Grandon, Elizabeth E., 01 June 2020
The full text of this work is not available in the Repositorio Académico UPC due to restrictions imposed by the publisher where it was published. / This research proposes a prediction model that might help reduce the dropout rate of university students in Peru. For this, a three-phase predictive analysis model was designed and combined with the stages proposed by the IBM SPSS Modeler methodology. Bayesian networks were compared with decision trees, both selected for their accuracy relative to other algorithms in an Educational Data Mining (EDM) scenario. Data were collected from 500 undergraduate students from a private university in Lima. The results indicate that Bayesian networks perform better than decision trees on the metrics of precision, accuracy, specificity, and error rate. In particular, the accuracy of Bayesian networks reaches 67.10%, while the accuracy of decision trees is 61.92%, in the training sample for the iteration with an 8:2 ratio. Moreover, the variables athletic person (0.30%), own house (0.21%), and high school grades (0.13%) are the ones that contribute most to the prediction model for both Bayesian networks and decision trees.
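The four metrics named in this abstract (precision, accuracy, specificity, and error rate) all follow from a binary confusion matrix. The Python sketch below shows the standard definitions. The function name and the counts are hypothetical and are not the study's data.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard metrics from binary confusion-matrix counts.

    tp/fp/tn/fn: true positives, false positives, true negatives, false negatives.
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return {
        # Share of all cases classified correctly.
        "accuracy": (tp + tn) / total,
        # Share of predicted positives that are truly positive.
        "precision": tp / (tp + fp) if (tp + fp) else float("nan"),
        # Share of true negatives that are correctly identified.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        # Share of all cases classified incorrectly.
        "error_rate": (fp + fn) / total,
    }

# Hypothetical counts for a dropout classifier evaluated on 100 students.
metrics = classification_metrics(tp=40, fp=10, tn=30, fn=20)
print(metrics)
```

Computing the same dictionary for each model on the same test split gives a like-for-like comparison of the kind the abstract reports for Bayesian networks and decision trees.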
|
9 |
Predictive Analysis of Heating Systems for Fault Detection
Vemana, Syam Kumar, Applili, Sai Keerthi, January 2021
Background: Heat load plays a major role in the energy consumption of heating systems in buildings. Industry experts have been focusing on heat load optimization techniques, and in recent years numerous Machine Learning (ML) techniques have been applied to various related tasks. Objectives: This study analyzes hourly time-series data and chooses a suitable supervised machine learning approach among Multivariate Linear Regression (MLR), Support Vector Regression, and Multi-layer Perceptron (MLP) Regressor to predict heat demand, in order to identify deviating behaviors and potential faults. Methods: An experiment is performed. The method consists of imputing missing values and extreme values and selecting six different feature sets. Cross-validation of Multivariate Linear Regression, Support Vector Regression, and Multi-layer Perceptron Regressor was performed to find the most suitable algorithm. Finally, the residuals of the best algorithm on the best feature set were used to find faults by calculating studentized residuals. Because the data set is a continuous time series, regression-based algorithms were the most suitable choice. Observations whose studentized residuals exceed the threshold value of 3 are classified as faults. Results: Among the regression-based algorithms, Multi-layer Perceptron Regressor achieved a Mean Absolute Error (MAE) of 1.77 and a Mean Absolute Percentage Error (MAPE) of 0.29% on feature set 1. Multivariate Linear Regression showed a Mean Absolute Error of 1.83 and a Mean Absolute Percentage Error of 0.31% on feature set 1, a relatively higher error on both metrics than Multi-layer Perceptron Regressor.
Support Vector Regression (SVR) showed a Mean Absolute Error of 2.54, higher than that of both Multivariate Linear Regression and Multi-layer Perceptron Regressor, while its Mean Absolute Percentage Error of 0.24% is similar to that of Multivariate Linear Regression and Multi-layer Perceptron Regressor on feature set 1. The best performing algorithm is therefore Multi-layer Perceptron Regressor. Feature sets 4, 5, and 6 are supersets of feature sets 1, 2, and 3, with outdoor temperature added. Adding outdoor temperature did not have much impact. According to Table 5.1, feature sets 1, 2, and 3 are comparatively better than feature sets 4, 5, and 6 on Mean Absolute Error and Mean Absolute Percentage Error. Finally, comparing the first three feature sets, feature set 1 resulted in lower error for all three algorithms than feature sets 2 and 3, as can be seen in Table 5.1. Feature set 1 is therefore the best feature set. Conclusions: Multi-layer Perceptron Regressor performed well on the six feature sets compared with Multivariate Linear Regression and Support Vector Regression. Feature set 1 showed lower Mean Absolute Error and Mean Absolute Percentage Error values than the other feature sets. Therefore, feature set 1 was the best performing feature set and Multi-layer Perceptron Regressor the best suited algorithm. Figure A.3 shows the flow of the work done in the thesis.
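The fault rule described in this abstract, flagging observations whose studentized residual exceeds 3, can be sketched for the simplest case. The Python example below assumes a one-feature linear regression and uses internally studentized residuals. The thesis applies the rule to the residuals of its best model, so the model, the function names, and the data here are illustrative assumptions only.

```python
import math

def studentized_residuals(x, y):
    """Internally studentized residuals of a simple least-squares line y = a + b*x."""
    n = len(x)
    if n < 3:
        raise ValueError("need at least three observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Residual standard error with n - 2 degrees of freedom.
    s = math.sqrt(sum(e * e for e in residuals) / (n - 2))
    if s == 0:
        raise ValueError("perfect fit: residual standard error is zero")
    result = []
    for xi, e in zip(x, residuals):
        # Leverage of observation i in simple linear regression.
        leverage = 1 / n + (xi - mean_x) ** 2 / sxx
        result.append(e / (s * math.sqrt(1 - leverage)))
    return result

def flag_faults(x, y, threshold=3.0):
    """Indices whose absolute studentized residual exceeds the threshold."""
    return [i for i, r in enumerate(studentized_residuals(x, y)) if abs(r) > threshold]

# Hypothetical hourly data: a linear heat-demand trend with one injected anomaly.
x = list(range(30))
y = [2 * xi + 1 for xi in x]
y[15] += 20
print(flag_faults(x, y))
```

Dividing each residual by its own standard error, which depends on leverage, puts all observations on a common scale, so a single threshold such as 3 can be applied across the whole series.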
|
10 |
Generating Value Using Predictive Analysis in E-retail : A Case Study on How Predictive Analysis Affords Value-Generating Actions
Emitslöf, Isak, January 2023
Information systems and information technology are evolving rapidly, and their usage is growing at the same pace. Predictive analysis is used daily in many fields. In e-retail, that is, online retailing, it is used for personalisation and as decision support. There is a lot of research on how to increase the accuracy of predictions and on different methods for doing so; however, there is a lack of research on the actions an organisation can take given different predictions. Hence, this master's study investigates which factors afford or constrain value in the use of predictive tools, given different organisational roles. The thesis is a qualitative case study following the interpretivist paradigm. The data come from document analysis followed by semi-structured interviews, which gather additional information not covered by the document analysis or previous research. The empirical findings were analysed using thematic analysis and are then discussed in relation to the research questions and theoretical framework, together with what has previously been stated in the literature. The research questions for this thesis are the following. RQ1: How do different organisational roles affect the actions taken on information from predictive analysis in e-retail? RQ2: What are the key factors affording or constraining value generation in predictive analysis within e-retail? The empirical findings resulted in six themes, three relevant to each research question. The findings suggest that there are four major categories of roles that have similar affordances of predictive analysis: customer-related, sales and financial operations-related, management-related, and supply chain and inventory-related. When several roles within an organisation use the same prediction tool, there are positive effects such as less biased decisions and improved communication through collaboration.
Several factors, both constraining and affording value, were found. The main constraining factors relate to technological knowledge, interpreted value, and trustworthiness. The affording factors are the ability to tie predictions to certain KPIs and the ability to slice the data to show what is relevant for the individual. In addition to these factors, some desired functionality was identified, including a confidence score for predictions, predictions for specific goals, and predicted optimal send times for emails. My suggestion for future research is to approach the same problem using another theoretical framework, to further develop a novel field, and to involve participants with backgrounds different from those in this thesis.
|