Global ETD Search

621	Exploratory Analysis of Human Sleep Data. Laxminarayan, Parameshvyas. 19 January 2004. In this thesis we develop data mining techniques to analyze sleep irregularities in humans. We investigate the effects of several demographic, behavioral and emotional factors on sleep progression and on patients' susceptibility to sleep-related and other disorders. Mining is performed over subjective and objective data collected from patients visiting the UMass Medical Center and the Day Kimball Hospital for treatment. Subjective data are obtained from patient responses to a sleep questionnaire. Objective data comprise observations and clinical measurements recorded by sleep technicians using a suite of instruments collectively called a polysomnogram. We create suitable filters to capture significant events within sleep epochs. We propose and employ a Window-based Association Rule Mining Algorithm to discover associations among sleep progression, pathology, demographics and other factors. This algorithm is a modified and extended version of the Set-and-Sequences Association Rule Mining Algorithm developed at WPI to support the mining of association rules from complex data types. We analyze both the medical and the statistical significance of the associations discovered by our algorithm. We also develop predictive classification models using logistic regression and compare the results with those obtained through association rule mining. Keywords: association rule mining; logistic regression; statistical significance of rules; window-based association rule mining; data mining; sleep data; sleep disorders.
622	Proposta de construção de um modelo econométrico para estimar a probabilidade de risco de inadimplência: uma verificação empírica na Universidade Católica de Pelotas [Proposal for an econometric model to estimate the probability of default risk: an empirical verification at the Universidade Católica de Pelotas]. Ribeiro, Cristiane Freitas. 26 September 2008. Access to credit for individuals has become easier over the last three years. Factors such as lower interest rates, longer repayment terms and payroll-deducted loans have given the general population access to movable goods, real estate and other assets. In this context, more robust credit risk analysis tools have become necessary to avoid or reduce default rates in the sector. This study aims to build an econometric model to estimate the probability of default risk at a private higher education institution. Using logistic regression, the credit risk model was built from a sample of students (individuals) enrolled at the Universidade Católica de Pelotas, located in Pelotas/RS. The explanatory variables were obtained from a socioeconomic questionnaire that generated 59 variables, of which only 3 proved relevant: existence of previously negotiated debts, possession of a … Keywords: applied social sciences; private higher education institution; credit scoring models; logistic regression; credit risk.
623	Algoritmo kNN na imputação de dados de espectros de massa do tipo MALDI-TOF: uma análise da influência da imputação com kNN sobre o desempenho de classificadores logísticos para identificação de bactérias [The kNN algorithm in the imputation of MALDI-TOF mass spectrum data: an analysis of the influence of kNN imputation on the performance of logistic classifiers for bacterial identification]. Santos, Fábio dos. 14 September 2018. Funding agency: Coordenação de Aperfeiçoamento de Pessoal de Nível Superior. The identification of plant growth-promoting bacteria is the subject of several studies in bioinformatics. One approach is to use MALDI-TOF mass spectrometry data to detect ribosomal proteins in a sample, then use classifiers to process these data and select the label with the highest probability. When mass spectra are generated for classification, however, some peaks related to ribosomal proteins often go undetected. This work therefore studies the use of the kNN algorithm to impute these missing values. Logistic classifiers were used to identify bacteria of the species Staphylococcus aureus and of the genus Bacillus. Three imputation techniques were tested: imputation with zero, imputation with the mean of the missing attribute, and imputation with kNN. For kNN, two approaches were used: a mean aggregation function and a median aggregation function. The experimental protocol made it possible to evaluate the influence of imputation on classification results under different scenarios regarding the number of missing variables. The results show that neither kNN approach led to a significant reduction in classifier performance compared with complete data, and that classification of kNN-imputed data outperformed classification with the other imputation methods. Keywords: imputation with kNN; mass spectrometry; logistic regression; bacterial classification.
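The kNN imputation with mean and median aggregation described in entry 623 can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the function name `knn_impute`, the Euclidean distance over jointly observed features, and the toy peak-intensity rows are all invented for the example.

```python
import math
from statistics import mean, median

def knn_impute(rows, k=2, aggregate=mean):
    """Fill None entries using the k nearest rows (hypothetical helper).

    Distance is Euclidean over the features observed in both rows; this
    choice is an assumption, not something stated in the abstract.
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared))

    imputed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate neighbours: other rows where feature j is observed.
            candidates = [r for n, r in enumerate(rows) if n != i and r[j] is not None]
            candidates.sort(key=lambda r: distance(row, r))
            neighbours = candidates[:k]
            if neighbours:
                # Aggregate the neighbours' values with the mean or the median.
                imputed[i][j] = aggregate([r[j] for r in neighbours])
    return imputed

# Toy data: each row is a sample, each column a peak intensity (made up).
spectra = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 3.3],
    [5.0, 6.0, 9.0],
    [1.0, 2.0, None],
]
print(knn_impute(spectra, k=2, aggregate=mean)[3])
print(knn_impute(spectra, k=3, aggregate=median)[3])
```

Passing `aggregate=mean` or `aggregate=median` switches between the two aggregation functions the abstract compares. Zero imputation and attribute-mean imputation, the two baselines, need no neighbour search at all.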
624	Evolution et facteurs pronostiques de la Neurofibromatose 1 / Factors Associated to Neurofibromatosis1. Sbidian, Émilie. 23 October 2012. Neurofibromatosis 1 (NF1) is a common autosomal dominant condition that causes various multisystemic manifestations related either to the accumulation of neurofibromas or to specific developmental abnormalities. No obvious factors predict disease progression: neither the type of gene mutation, nor the severity of any familial cases, nor a first complication predicts the prognosis. The aim of this thesis was to characterize the NF1 patients most at risk of morbidity and mortality. Methods. The studies used phenotypic data from NF1 patients followed in the Réseau NF-France, a national network accredited by the French Ministry of Health and devoted to the care of patients with NF1. The network has issued recommendations for managing the disease and recommends coordinated follow-up in specialized multidisciplinary centres. A cohort of about 2,500 patients is currently followed. Results. We first compared the mortality of NF1 patients with that of the general French population by estimating the standardized mortality ratio (SMR), the ratio of observed to expected deaths, with its 95% confidence interval (CI). Between 1980 and 2006, 1,895 NF1 patients were retrospectively included in the cohort. Excess mortality was observed among patients aged 10 to 20 years (SMR = 5.2; 95% CI, 2.6-9.3; P < 10^-4) and 20 to 40 years (SMR = 4.1; 95% CI, 2.8-5.8; P < 10^-4). The main cause of death was malignant peripheral nerve sheath tumors (MPNSTs) developing from pre-existing internal neurofibromas. A case-control study of 208 NF1 patients then explained the increased mortality risk among patients with subcutaneous neurofibromas (SC-NF): MRI confirmed that these patients had internal neurofibromas at high risk of transformation into MPNSTs (OR = 4.3; 95% CI, 2.2-8.2). The association was stronger for patients with ten or more SC-NFs (OR = 82; 95% CI, 10.4-647.9) and for internal neurofibromas that were diffuse (OR = 14.7; 95% CI, 3.8-57.3) or ≥ 3 cm (OR = 6.3; 95% CI, 2.3-17.4). Patients with SC-NF make up 20 to 30% of the NF1 population. To identify patients at risk of developing MPNSTs, we developed and validated a clinical score that predicts the presence of internal neurofibromas in adults from phenotypic characteristics. Four variables were independently associated with internal neurofibromas: at least two subcutaneous neurofibromas (OR = 4.7; 95% CI, 2.1-10.5), age ≤ 30 years (OR = 3.1; 95% CI, 1.4-6.8), absence of cutaneous neurofibromas (OR = 2.6; 95% CI, 0.9-7.5), and fewer than six café-au-lait spots (OR = 2; 95% CI, 0.9-4.6). The score was computed as NF1Score = 10 × [age ≤ 30 years] + 10 × [absence of cutaneous neurofibromas] + 15 × [≥ 2 subcutaneous neurofibromas] + 5 × [< 6 café-au-lait spots], where each bracketed term equals 1 if the condition holds and 0 otherwise. Calibration was excellent (Hosmer-Lemeshow statistic = 4.53; 7 degrees of freedom; P > 0.5) and discrimination was good (nonparametric area under the ROC curve = 0.75; 95% CI, 0.68-0.82). Finally, because phenotypic expression varies over a patient's lifetime, we conducted a specific study in children. In multivariate analysis, age (OR = 1.1; 95% CI, 1.0-1.2), xanthogranulomas (OR = 4.5; 95% CI, 0.9-21.7), and the presence of both subcutaneous and plexiform neurofibromas (OR = 5.0; 95% CI, 1.8-13.6) were independently associated with internal neurofibromas in children with NF1 under 17 years of age. In this study, internal neurofibromas developed exponentially during adolescence, and earlier in females, consistent with the literature. Conclusion. The period of excess risk of developing internal neurofibromas thus appears to lie between adolescence and the age of 30. Follow-up recommendations could take into account the at-risk phenotype and the period in which these complications arise, re-evaluating the value of complementary investigations in this context. These clinical features in adults and children would define a new at-risk population that may need closer clinical and imaging follow-up. Keywords: neurofibromatosis 1; internal neurofibroma; prediction score; logistic regression; mortality; case-control study.
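The NF1Score formula given in entry 624 translates directly into a small function. The sketch below is illustrative only: the function and parameter names are hypothetical, the weights and thresholds are the ones stated in the abstract, and the abstract gives no cut-off for interpreting the total, so none is applied here.

```python
def nf1_score(age_years, has_cutaneous_nf, n_subcutaneous_nf, n_cafe_au_lait):
    """Compute the NF1Score from the four items listed in the abstract.

    Each item adds its weight when its condition holds, otherwise 0.
    Parameter names are hypothetical; the weights come from the abstract.
    """
    score = 0
    if age_years <= 30:
        score += 10  # age <= 30 years
    if not has_cutaneous_nf:
        score += 10  # absence of cutaneous neurofibromas
    if n_subcutaneous_nf >= 2:
        score += 15  # at least two subcutaneous neurofibromas
    if n_cafe_au_lait < 6:
        score += 5   # fewer than six café-au-lait spots
    return score

# Made-up patient: 25 years old, no cutaneous NF, 3 subcutaneous NF, 7 spots.
print(nf1_score(25, False, 3, 7))  # 10 + 10 + 15 + 0 = 35
```

The score ranges from 0 to 40. Integer weights of this kind are typically rounded from logistic regression coefficients so that the score can be added up at the bedside, although the abstract does not say how these particular weights were derived.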
625	Detecção de fraudes em cartões: um classificador baseado em regras de associação e regressão logística / Card fraud detection: a classifier based on association rules and logistic regression. Oliveira, Paulo Henrique Maestrello Assad. 11 December 2015. Credit and debit cards are widely used means of payment, which attracts fraudsters. The card market treats fraud as an operating cost that is passed on to consumers and to society in general. The high volume of transactions and the need to combat fraud create room for machine learning techniques, among them classifiers. Rule-based classifiers are widely used in this domain. In practice, however, they depend heavily on domain experts: professionals who detect patterns in fraudulent transactions, turn them into rules and implement those rules in classification systems. Given this scenario, this thesis proposes an architecture based on association rules and logistic regression, both techniques studied in machine learning, to mine rules from data, produce rule sets for detecting fraudulent transactions, and make them available to domain experts. These professionals thus gain computer support for discovering and generating the rules that underpin the classifier, which reduces the chance that fraudulent patterns go unrecognized and makes generating and maintaining the rules more efficient. To test the proposal, the experimental part of the work used about 7.7 million real card transactions supplied by a company in the card market. Because the classifier can make errors (false positives and false negatives), cost-sensitive analysis was applied so that most of these errors carry a lower cost. After a long analysis of the database, 141 features were combined and, using the FP-Growth algorithm, 38,003 rules were generated. After filtering and selection, these were grouped into five rule sets, the largest with 1,285 rules. Each of the five sets was fitted with a logistic regression model so that its rules could be validated and weighted by statistical criteria. The goodness-of-fit metrics indicated well-fitted models, and the performance indicators showed, overall, very good classification power (AUC between 0.788 and 0.820). In conclusion, the combined application of the statistical techniques (cost-sensitive analysis, association rules and logistic regression) proved conceptually and theoretically cohesive and coherent, and the experiment and its results demonstrated the technical and practical feasibility of the proposal. Keywords: cost-sensitive learning; machine learning; association rule learning; fraud detection and prevention; logistic regression.
626	Uma contribuição ao estudo de acidentes fatais por queda de rochas: o caso da mineração peruana. / A contribuition to the study of fatal accidents by rocks falls: the case of peruvian mining. Collantes Candia, Renan. 26 July 2011. Developing countries clearly depend on primary industries such as mining. In the Peruvian economy, about 6% of GDP and more than 50% of exports come from mining, which underlines the country's competitive position worldwide. The importance of the activity also shows in workplace safety. Although the number of accidents in Peruvian mining has fallen in recent years, the fatality rate remains high compared with other traditional mining countries, especially the more developed ones. In Peru, the root causes of accidents are officially attributed to personal and work factors, as well as to unsafe conditions and unsafe acts. Identifying these causes is therefore very important for proposing effective solutions that improve safety and health management in the mining industry. This thesis studies rock fall accidents in underground mines in Peru. The primary source of information was the record of fatal accidents in 2007 in medium and large mines, provided by the Oficina de Fiscalización Minera del Organismo Superior de la Inversión en Energía y Minería del Perú (OSINERGMIN), an agency under the Ministerio de Energía y Minas del Perú (MEM). The study shows that rock falls in underground excavations cause the largest share of fatal accidents, accounting for 29.41% of events in the period studied. The victims' personal characteristics show that workers engaged in drilling and in preparing and installing post-blast support, both at production faces and in development excavations, die from severe multiple and cranioencephalic trauma. Most victims worked for mining contractors. From the victims' personal characteristics and using logistic regression methods, the thesis proposes a mathematical model to determine the odds of suffering a rock fall accident relative to other types of accident. The results show that workers employed as helpers and workers with more than three years of experience have lower odds of suffering rock fall accidents; statistically, mining experience is the most representative variable. Finally, the root and immediate causes of the accidents studied were identified. Among personal and work factors, overconfidence and deficient supervision, respectively, stand out as the main causes. The study also shows that non-compliance with operating procedures and the presence of loose rock in excavations are the main unsafe acts and unsafe conditions, respectively. Keywords: occupational accidents; causal factors; logistic regression methods; Peruvian mining; rock fall.
627	Misskötta studielån : Hur mycket förväntas de kosta? / Defaulted student loans : What to expect? Peco, Amina. January 2016. When the bill for a reformed student aid system was presented in 1999, it was emphasized that the system should bear its own costs. Nevertheless, large amounts are written off. Both the Swedish National Audit Office (Riksrevisionen) and the Swedish National Debt Office (Riksgälden) have shown that CSN does not use generally accepted methods when calculating the amount expected to be lost through defaulted payments. The purpose of this thesis was to estimate what defaulted payments are expected to cost the state in future write-offs, and to calculate what it would mean for individuals to bear that cost instead. As part of this work, factors that affect the probability of default on student loans were also identified. The results show, among other things, that the probability of default is lower for individuals with post-secondary education, high debt and low age. The state's credit losses on student loans for, for example, individuals who became liable for repayment in 2012 are expected to be between 100 and 338 million kronor. If this cost were instead borne by the cohort, it would mean a cost increase of 2.2 to 7.8 percent for an individual with average debt. Keywords: risk premium; student loans; logistic regression; probability of default; expected loss; credit losses.
628	透過利率期限結構建立總體經濟產出缺口之預測模型 ─ 以美國為例 / Construct the forecast models for economic output gap through the term structure of interest rates ─ evidences for the United States. 張楷翊. Unknown Date. The output gap of an economy has always been a focus of policymakers. An output gap indicates that resources are not allocated in equilibrium, and inflation or unemployment will follow. Anticipating an output gap would let policymakers act earlier, and the literature indicates that the yield curve carries implicit information about future economic conditions. Using public data from the U.S. Department of the Treasury and the Federal Reserve, this study predicts the output gap from the slope of the yield curve. It examines U.S. data on the components of gross national product and on yields from 1977 to 2016, with the goal of building a model of whether a positive or negative gap will appear in the next quarter. Three prediction models are compared: linear regression, logistic regression, and the support vector machine from machine learning. For the real GDP gap, all three reach an accuracy above 65%, and the support vector machine reaches 80.85%. The conclusions are as follows. First, the yield curve has some explanatory power for the future aggregate output gap. Second, for high-dimensional prediction models, the support vector machine performs better than the commonly used regression models. Third, predictive power for imports and exports is poor under all three models, possibly because the yield curve does not carry complete and effective information about them; other economic indicators or financial market information may explain them. Fourth, accuracy exceeds 80% for private-sector economic activity such as real consumption and investment. Keywords: yield curve; economic forecast; support vector machine (SVM); logistic regression.
629	A estimativa do risco na constituição da PDD. / The risk estimation for the allowance for doubtful accounts. Ernesto Fernando Rodrigues Vicente 15 May 2001 This work reviews the main models for assessing credit risk and for provisioning customer losses, and concludes by proposing a statistical model to measure the risk associated with financing and lending to customers, with the consequent impact on asset measurement. Without aiming to exhaust the subject, it develops the theme through the following steps. The introduction justifies the theme and states the research question and the challenges accounting faces in measuring assets. On risk management, it lists the general types of risk, details credit risk in particular, and evaluates credit-granting models. On the constitution of the Allowance for Doubtful Accounts (Provisão para Devedores Duvidosos, PDD), the leading authors in accounting and finance were surveyed and found to make similar propositions, which can be summarized in four provisioning models: 1. Write-off; 2. Percentage of sales; 3. Percentage of accounts receivable; 4. Aging of the portfolio. The correlations between insolvency prediction models and credit losses are then analyzed, showing that insolvency models are useful for granting credit but little used for estimating probable losses on doubtful accounts. On 21 December 1999, the Central Bank of Brazil (Banco Central do Brasil) issued Resolution 2.682, which recommends that financial institutions change their provisioning methodologies for doubtful accounts. The Central Bank, however, does not indicate which model to use, leaving the development of the models to each institution. Using the Central Bank rule as a reference and seeking a scientific basis for constituting the allowance, a model is proposed, tested and evaluated both for compliance with the Central Bank rules and from a managerial perspective. To this end, a statistical model was developed by applying logistic (LOGIT) regression to 202 customers of a financial institution; the study concludes that using the model to constitute the Allowance for Doubtful Accounts may bring benefits in measuring the real value of investments in accounts receivable. allowance for doubtful accounts credit risk estimation LOGIT logistic regression risk accountancy
630	Modelos de classificação de risco de crédito para financiamentos imobiliários: regressão logística, análise discriminante, árvores de decisão, bagging e boosting / Credit risk classification models for real estate financing: logistic regression, discriminant analysis, decision trees, bagging and boosting Lopes, Neilson Soares 08 August 2011 Fundo Mackenzie de Pesquisa / This study applied the traditional parametric techniques of discriminant analysis and logistic regression to the credit analysis of real estate financing operations in which borrowers may or may not also hold a payroll loan. The hit rate of these methods was compared with that of non-parametric techniques based on classification trees and with the meta-learning methods bagging and boosting, which combine classifiers to improve accuracy. In a context of high housing deficit, especially in Brazil, real estate financing can still be greatly encouraged. Sustainable growth in mortgage lending brings social as well as economic benefits. Housing is, for most individuals, the largest source of expenditure and the most valuable asset they will own during their lifetime. The study concluded that decision trees are the most effective technique for predicting bad payers (94.2% correct), followed by bagging (80.7%) and boosting (or arcing, 75.2%). For bad payers, logistic regression and discriminant analysis showed the worst results (74.6% and 70.7%, respectively). For good payers, the decision tree also showed the best predictive power (75.8%), followed by discriminant analysis (75.3%) and boosting (72.9%), while bagging and logistic regression showed the worst results (72.1% and 71.7%, respectively). The logistic regression shows that, for a borrower with a payroll loan, the odds of being a bad payer are 2.19 times those of a borrower without such a loan. The presence of payroll loans among the operations of mortgage borrowers is also relevant in the discriminant analysis. credit risk discriminant analysis logistic regression bagging boosting arcing real estate financing
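Note on the 2.19 figure in entry 630: the sketch below is not taken from the thesis. It assumes the reported 2.19 is an odds ratio from the logistic regression and shows how such a ratio maps to a model coefficient and to a change in default probability. The 10% baseline probability and the helper name apply_odds_ratio are hypothetical, chosen only for illustration.

```python
import math


def apply_odds_ratio(p_baseline: float, odds_ratio: float) -> float:
    """Multiply the baseline odds by odds_ratio and convert back to a probability."""
    if not 0.0 < p_baseline < 1.0:
        raise ValueError("p_baseline must be strictly between 0 and 1")
    baseline_odds = p_baseline / (1.0 - p_baseline)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# Figure reported in the abstract, read here as an odds ratio (assumption).
ODDS_RATIO = 2.19

# In a logistic regression, the coefficient of a binary indicator is the
# natural log of its odds ratio.
coefficient = math.log(ODDS_RATIO)

# Hypothetical baseline: 10% default probability without a payroll loan.
P_BASELINE = 0.10
p_with_payroll_loan = apply_odds_ratio(P_BASELINE, ODDS_RATIO)

print(f"implied coefficient: {coefficient:.3f}")
print(f"default probability: {P_BASELINE:.3f} -> {p_with_payroll_loan:.3f}")
```

An odds ratio multiplies odds, not probabilities, so the resulting probability is less than 2.19 times the baseline, and the gap widens as the baseline grows.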
