391 |
A Quality Criteria Based Evaluation of Topic Models. Sathi, Veer Reddy; Ramanujapura, Jai Simha. January 2016.
Context. Software testing is the process in which a particular software product or system is executed in order to find bugs or issues that may otherwise degrade its performance. Software testing is usually done based on pre-defined test cases. A test case can be defined as a set of terms or conditions that software testers use to determine whether a particular system under test operates as it is supposed to. However, in numerous situations there can be so many test cases that executing every one of them is practically impossible, as there may be many constraints. This forces testers to prioritize the functions to be tested, and this is where the ability of topic models can be exploited. Topic models are unsupervised machine learning algorithms that can explore large corpora of data and classify them by identifying the hidden thematic structure in those corpora. Using topic models for test case prioritization can save a lot of time and resources. Objectives. In our study, we provide an overview of the amount of research that has been done in relation to topic models. We want to uncover the various quality criteria, evaluation methods, and metrics that can be used to evaluate topic models. Furthermore, we compare the performance of two topic models that are optimized for different quality criteria on a particular interpretability task, and thereby determine the topic model that produces the better results for that task. Methods. A systematic mapping study was performed to gain an overview of the previous research on the evaluation of topic models. The mapping study focused on identifying quality criteria, evaluation methods, and metrics that have been used to evaluate topic models. The results of the mapping study were then used to identify the most-used quality criteria. The evaluation methods related to those criteria were then used to generate two optimized topic models. An experiment was conducted in which the topics generated from those two topic models were provided to a group of 20 subjects; the task was designed to evaluate the interpretability of the generated topics. The performance of the two topic models was then compared using Precision, Recall, and F-measure. Results. Based on the results obtained from the mapping study, Latent Dirichlet Allocation (LDA) was found to be the most widely used topic model. Two LDA topic models were created, one optimized for the quality criterion Generalizability (TG) and one for Interpretability (TI), using the Perplexity and Point-wise Mutual Information (PMI) measures respectively. For the selected metrics, TI showed better performance than TG in Precision and F-measure, while the performance of the two models was comparable in Recall. The total run time of TI was also found to be significantly higher than that of TG: 46 hours and 35 minutes for TI, versus 3 hours and 30 minutes for TG. Conclusions. Looking at the F-measure, it can be concluded that the interpretability topic model (TI) performs better than the generalizability topic model (TG). However, while TI performed better in Precision, Recall was comparable. Furthermore, the computational cost of creating TI is significantly higher than that of TG.
Hence, we conclude that the choice of topic model optimization should be based on the aim of the task the model is used for. If the task requires high interpretability of the model and precision is important, such as for the prioritization of test cases based on content, then TI is the right choice, provided time is not a limiting factor. However, if the task aims at generating topics that provide a basic understanding of the concepts (i.e., interpretability is not a high priority), then TG is the more suitable choice, which also makes it the better option for time-critical tasks.
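The two optimisation routes described above can be reproduced in outline with off-the-shelf tooling. Below is a minimal sketch, not the thesis's code: it assumes gensim and a toy tokenised corpus, and scores candidate LDA models by a held-out perplexity bound (a generalizability proxy) and by the PMI-based "c_uci" coherence (an interpretability proxy).

```python
# Hedged sketch: scoring LDA models by perplexity vs. PMI-based coherence.
# The corpus is an invented stand-in for real test-case descriptions.
from gensim.corpora import Dictionary
from gensim.models import CoherenceModel, LdaModel

texts = [
    ["login", "password", "reset", "email"],
    ["payment", "checkout", "cart", "invoice"],
    ["login", "session", "timeout", "password"],
    ["invoice", "payment", "refund", "cart"],
]
dictionary = Dictionary(texts)
corpus = [dictionary.doc2bow(t) for t in texts]

for k in (2, 3):
    lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=k,
                   passes=20, random_state=0)
    # gensim reports a per-word likelihood bound; lower perplexity
    # (2 ** -bound, per gensim's convention) suggests better generalizability.
    bound = lda.log_perplexity(corpus)
    # "c_uci" averages PMI over top-word pairs; higher values are usually
    # read as more interpretable topics. topn is kept small for the toy data.
    pmi = CoherenceModel(model=lda, texts=texts, dictionary=dictionary,
                         coherence="c_uci", topn=3).get_coherence()
    print(f"k={k}: per-word bound={bound:.3f}, PMI coherence={pmi:.3f}")
```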
|
392 |
Email Mining Classifier: The empirical study on combining the topic modelling with Random Forest classification. Halmann, Marju. January 2017.
Filtering emails and replying to them automatically are of interest to many, but both are hard due to the complexity of language and to dependencies on background information that is not present in the email itself. This paper investigates whether Latent Dirichlet Allocation (LDA) combined with a Random Forest classifier can be used for the more general email classification task and how it compares to other existing email classifiers. The comparison is based on a literature study and on empirical experimentation using two real-life datasets. Firstly, a literature study is performed to gain insight into the accuracy of other available email classifiers. Secondly, the proposed model's accuracy is explored through experimentation. The literature study shows that the accuracy of more general email classifiers differs greatly across user sets. The proposed model's accuracy is within the reported accuracy range, albeit in the lower part, which indicates that the proposed model performs poorly compared to other classifiers. On average, however, the classifier's performance improves by 15 percentage points with additional information. This indicates that Latent Dirichlet Allocation (LDA) combined with a Random Forest classifier is promising, but future studies are needed to explore the model and ways to further increase its accuracy.
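As a rough illustration of the pipeline this abstract describes, LDA-derived topic proportions used as features for a Random Forest, the following scikit-learn sketch chains the two models. The emails and labels are invented stand-ins, not the study's datasets.

```python
# Hedged sketch: topic proportions from LDA feed a Random Forest classifier.
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline

emails = [
    "meeting rescheduled to friday please confirm",
    "your invoice is attached payment due next week",
    "team lunch friday anyone joining",
    "payment reminder invoice overdue",
]
labels = ["work", "billing", "social", "billing"]  # hypothetical classes

clf = make_pipeline(
    CountVectorizer(),                                   # bag of words
    LatentDirichletAllocation(n_components=2,            # topic features
                              random_state=0),
    RandomForestClassifier(n_estimators=100, random_state=0),
)
clf.fit(emails, labels)
print(clf.predict(["invoice for the friday meeting"]))
```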
|
393 |
Multivariate Ordinal Regression Models: An Analysis of Corporate Credit Ratings. Hirk, Rainer; Hornik, Kurt; Vana, Laura.
Correlated ordinal data typically arise from multiple measurements on a collection of subjects. Motivated by an application in credit risk, where multiple credit rating agencies assess the creditworthiness of a firm on an ordinal scale, we consider multivariate ordinal models with a latent variable specification and correlated error terms. Two different link functions are employed, by assuming a multivariate normal and a multivariate logistic distribution for the latent variables underlying the ordinal outcomes. Composite likelihood methods, more specifically the pairwise and tripletwise likelihood approach, are applied for estimating the model parameters. We investigate how sensitive the pairwise likelihood estimates are to the number of subjects and to the presence of observations missing completely at random, and find that these estimates are robust for both link functions and reasonable sample size. The empirical application consists of an analysis of corporate credit ratings from the big three credit rating agencies (Standard & Poor's, Moody's and Fitch). Firm-level and stock price data for publicly traded US companies as well as an incomplete panel of issuer credit ratings are collected and analyzed to illustrate the proposed framework. / Series: Research Report Series / Department of Statistics and Mathematics
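The pairwise likelihood applied here reduces a high-dimensional multivariate ordinal likelihood to a sum of bivariate rectangle probabilities. The following numpy/scipy sketch shows only that core computation for a bivariate ordinal probit; the thresholds, correlation grid, and rating pairs are illustrative, and large finite bounds stand in for infinite ones.

```python
# Hedged sketch of a pairwise log-likelihood for two correlated ordinal
# outcomes with a latent bivariate-normal specification.
import numpy as np
from scipy.stats import multivariate_normal

def rect_prob(lo, hi, rho):
    """P(lo1 < Z1 <= hi1, lo2 < Z2 <= hi2) for a standard bivariate normal."""
    mvn = multivariate_normal(mean=[0.0, 0.0], cov=[[1.0, rho], [rho, 1.0]])
    return (mvn.cdf([hi[0], hi[1]]) - mvn.cdf([lo[0], hi[1]])
            - mvn.cdf([hi[0], lo[1]]) + mvn.cdf([lo[0], lo[1]]))

# Thresholds for a 3-category rating scale; +/-8 acts as practical infinity.
cut = np.array([-8.0, -0.5, 0.8, 8.0])

def pairwise_loglik(rho, ratings):
    """Sum of log rectangle probabilities over observed rating pairs."""
    return sum(
        np.log(rect_prob((cut[r1], cut[r2]), (cut[r1 + 1], cut[r2 + 1]), rho))
        for r1, r2 in ratings
    )

ratings = [(0, 0), (1, 1), (1, 2), (2, 2), (0, 1)]  # toy agency rating pairs
for rho in (0.0, 0.4, 0.8):  # profile the likelihood over the correlation
    print(rho, round(pairwise_loglik(rho, ratings), 3))
```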
|
394 |
Oncolytic Viruses as a Potential Approach to Eliminate Cells That Constitute the Latent HIV Reservoir. Ranganath, Nischal. 3 April 2018.
HIV infection represents a major health and socioeconomic challenge worldwide. Despite significant advances in therapy, a cure for HIV continues to be elusive. The design of novel curative strategies will require targeting and elimination of cells that constitute the latent HIV-1 reservoir. However, such an approach is impeded by the inability to distinguish latently HIV-infected cells from uninfected cells.
The type-I interferon (IFN-I) response is an integral antiviral defense mechanism, but is impaired at multiple levels during productive HIV infection. Interestingly, similar global impairments in IFN-I signaling have been observed in various human cancers. This led to the development of IFN-sensitive oncolytic viruses, including the recombinant Vesicular Stomatitis Virus (VSVΔ51) and Maraba virus (MG1), as virotherapies designed to treat various cancers.
Based on this, it was hypothesized that IFN-I signaling is impaired in latently HIV-infected cells (as observed in productively infected cells) and that VSVΔ51 and MG1 may be able to exploit such intracellular defects to target and eliminate latently HIV-infected cells, while sparing healthy cells. First, using cell line models of HIV-1 latency, intracellular defects in IFN-I responses, including impaired IFNα/β production and expression of IFNAR1, MHC-I, ISG15, and PKR, were demonstrated to represent an important feature of latently HIV-infected cells. Consistent with this, the latently HIV-infected cell lines were observed to have a greater sensitivity to VSVΔ51 and MG1 infection, and to MG1-mediated killing, than the HIV-uninfected parental cells.
Next, the ability of oncolytic viruses to kill latently HIV-infected human primary cells was demonstrated using an in vitro resting CD4+ T cell model of latency. Interestingly, while both VSVΔ51 and MG1 infection resulted in a significant reduction in inducible p24 expression, a dose-dependent decrease in integrated HIV-1 DNA was only observed following MG1 infection. In keeping with this, MG1 infection of memory CD4+ T cells from HIV-1 infected individuals on HAART also resulted in a significant decrease in inducible HIV-1 gag RNA expression.
By targeting an intracellular pathway that is impaired in latently HIV-infected cells, the findings presented in this dissertation highlight a novel, proof-of-concept approach to eliminating the latent HIV-1 reservoir. Given that VSVΔ51 and MG1 are currently being studied in cancer clinical trials, there is significant potential to translate this work to in vivo studies.
|
395 |
Design, development and evaluation of an algorithm to detect overlapping sub-communities using social network analysis and data mining (Diseño, desarrollo y evaluación de un algoritmo para detectar sub-comunidades traslapadas usando análisis de redes sociales y minería de datos). Muñoz Cancino, Ricardo Luis. January 2013.
Master in Operations Management / Industrial Civil Engineer / Virtual social network sites have grown enormously over the last decade. Their main purpose is to facilitate the creation of links between people who, for example, share interests, activities, knowledge, or connections in real life. The interaction between users generates a community in the social network.
There are several types of communities; communities of interest and communities of practice stand out. A community of interest is a group of people interested in sharing and discussing a particular topic of interest. In a community of practice, by contrast, people share a concern or passion for something they do and learn how to do it better. When the interactions take place over the internet, they are called virtual communities (VCoP/VCoI for their English acronyms). It is common for members to interact with only some of the other users, thereby forming sub-communities, and a member may belong to more than one. Identifying these substructures is necessary, because that is where the interactions that create and develop the community's knowledge are generated. Many algorithms have been designed to detect sub-communities. However, most of them detect disjoint sub-communities and, moreover, do not consider the content generated by the community's members. The main objective of this work is to design, develop, and evaluate an algorithm to detect overlapping sub-communities using social network analysis (SNA) and text mining.
To this end, the SNA-KDD methodology proposed by Ríos et al. [79], which combines Knowledge Discovery in Databases (KDD) and SNA, is used. It was applied to two virtual communities, Plexilandia (VCoP) and The Dark Web Portal (VCoI). In the KDD stage, the users' posts were preprocessed, and Latent Dirichlet Allocation (LDA) was then applied to describe each post in terms of topics. In the SNA stage, filtered networks were built from the information obtained in the previous stage. Two algorithms developed in this thesis, SLTA and TPA, were then used to find overlapping sub-communities.
The results show that SLTA achieves, on average, performance 5% better than the best existing algorithm when applied to a VCoP. In addition, the quality of the detected sub-community structure was found to increase, on average, by 64% when the semantic filter is strengthened. With respect to TPA, this algorithm achieves an average modularity of 0.33, whereas the best existing algorithm achieves 0.043, when applied to a VCoI. Furthermore, the joint application of our algorithms appears to offer a way of determining the type of community being analyzed; however, this must be verified by analyzing more virtual communities.
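SLTA and TPA themselves are not public, so they are not reproduced here. As a generic stand-in for the semantic-filter-then-detect idea, the sketch below weights interaction edges by LDA topic similarity, drops weak edges, and extracts overlapping groups with networkx's k-clique percolation; all users, topic vectors, and thresholds are invented.

```python
# Hedged sketch: topic-filtered interaction graph + overlapping communities.
import networkx as nx
import numpy as np
from networkx.algorithms.community import k_clique_communities

# Per-user topic proportions (e.g., averaged LDA output); illustrative only.
user_topics = {
    "u1": np.array([0.8, 0.1, 0.1]),
    "u2": np.array([0.7, 0.2, 0.1]),
    "u3": np.array([0.1, 0.8, 0.1]),
    "u4": np.array([0.2, 0.7, 0.1]),
    "u5": np.array([0.4, 0.4, 0.2]),  # straddles two topics -> overlap
}
replies = [("u1", "u2"), ("u1", "u5"), ("u2", "u5"),
           ("u3", "u4"), ("u3", "u5"), ("u4", "u5")]

G = nx.Graph()
for a, b in replies:
    # Semantic filter: keep an edge only if topic profiles agree enough.
    sim = float(user_topics[a] @ user_topics[b])
    if sim > 0.15:  # illustrative threshold
        G.add_edge(a, b, weight=sim)

# k-clique communities may share nodes, i.e. they overlap.
for comm in k_clique_communities(G, 3):
    print(sorted(comm))
```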
|
396 |
Profiles of Trauma Exposure and Biopsychosocial Health among Sex Trafficking Survivors: Exploring Differences in Help-Seeking Attitudes and Intentions. Ruhlman, Lauren.
Doctor of Philosophy / School of Family Studies and Human Services / Briana S. Goff / Human sex trafficking is a complex and unique phenomenon involving the commercial sexual exploitation (CSE) of persons by means of force, fraud, or coercion. The purpose of this study was to investigate unique patterns of trauma exposure and biopsychosocial health among a sample of CSE survivors. Results from a latent profile analysis with 135 adults trafficked in the United States yielded three distinct survivor sub-groups: mildly distressed, moderately distressed, and severely distressed. The mildly distressed class (18.5%) was characterized by the lowest reports of trauma exposure and an absence of clinically significant psycho-social stress symptoms. The moderately distressed class (48.89%) endorsed comparatively medial levels of trauma exposure, as well as clinically significant disturbance in six domains of psycho-social health. The severely distressed class (32.59%) reported the highest degree of trauma exposure and exhibited clinically significant symptoms of pervasive psycho-social stress across all domains assessed. To better understand variation in CSE survivors’ engagement with formal support services, this study also examined differences in help-seeking attitudes and intentions between latent classes. Results indicated that compared to those in the mildly and moderately distressed classes, severely distressed survivors endorsed significantly more unfavorable attitudes toward seeking professional help, along with no intention to seek help from any source when facing a personal or emotional crisis. Findings from this study provide a snapshot of significant heterogeneity in trauma exposure and biopsychosocial health among CSE survivors, as well as associated differences in help-seeking attitudes and intentions. The identification of distinct survivor sub-groups in these and future analyses mark an important intermediate step toward developing empirically-testable support services that are specifically designed to meet the unique needs of CSE survivors.
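Latent profile analysis over continuous indicators is closely related to fitting a Gaussian mixture model. The sketch below, on synthetic distress indicators rather than the study's data, recovers three profiles and selects the class count by BIC using scikit-learn.

```python
# Hedged sketch: latent-profile-style analysis via a Gaussian mixture.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic "trauma exposure + symptom" indicators for three latent profiles.
mild = rng.normal(loc=[1.0, 1.0, 1.0], scale=0.5, size=(50, 3))
moderate = rng.normal(loc=[3.0, 3.0, 2.5], scale=0.5, size=(60, 3))
severe = rng.normal(loc=[5.0, 5.5, 5.0], scale=0.5, size=(40, 3))
X = np.vstack([mild, moderate, severe])

for k in range(1, 5):  # compare class counts by BIC (lower is better)
    gm = GaussianMixture(n_components=k, covariance_type="diag",
                         random_state=0).fit(X)
    print(f"{k} classes: BIC={gm.bic(X):.1f}")

best = GaussianMixture(n_components=3, covariance_type="diag",
                       random_state=0).fit(X)
print("class sizes:", np.bincount(best.predict(X)))
print("class means:\n", best.means_.round(2))
```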
|
397 |
Zdrojové faktory indexů ekonomické svobody (Factors of Economic Freedom Indices). Ondruš, Martin. January 2015.
This work addresses the detection of the latent variables that underlie indices of economic freedom. Firstly, we present the best-known indices of economic freedom (IEF, EFW). Secondly, we discuss a multivariate statistical method, factor analysis, which we use to detect the latent variables. We present different estimation methods for factor analysis and focus on the principal factor method. Furthermore, we compare these methods by analysing the structure of the EFW index. Based on the estimated models, we interpret the detected latent variables. We use the statistical software SPSS and R for the factor analysis of the EFW index.
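The principal factor (principal axis) method mentioned above can be written in a few lines of numpy: replace the diagonal of the correlation matrix with communality estimates, eigendecompose, and take loadings from the leading eigenpairs. The data here are simulated indicators, not the EFW components, and the factor count is fixed for illustration.

```python
# Hedged sketch of the principal factor method on synthetic data.
import numpy as np

rng = np.random.default_rng(1)
F = rng.normal(size=(500, 2))                   # two latent variables
L = np.array([[0.9, 0.0], [0.8, 0.1], [0.7, 0.2],
              [0.1, 0.8], [0.0, 0.9]])          # true loadings
X = F @ L.T + 0.4 * rng.normal(size=(500, 5))   # observed indicators

R = np.corrcoef(X, rowvar=False)
# Communality estimates: squared multiple correlations from R's inverse.
smc = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
R_reduced = R.copy()
np.fill_diagonal(R_reduced, smc)

eigvals, eigvecs = np.linalg.eigh(R_reduced)
order = np.argsort(eigvals)[::-1]               # largest eigenvalues first
k = 2                                           # number of factors retained
loadings = eigvecs[:, order[:k]] * np.sqrt(eigvals[order[:k]])
# Loadings are identified only up to sign (and rotation).
print("estimated loadings:\n", loadings.round(2))
```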
|
398 |
Food safety, perceptions and preferences: empirical studies on risks, responsibility, trust, and consumer choices. Erdem, Seda. January 2011.
This thesis addresses various food safety issues and investigates them from an economic perspective within four different, but related, studies. The studies are intended to provide policy-makers and other decision-makers in the industry with valuable information that will help them to implement better mitigation strategies and policies. The studies also present applications of advancements in choice modelling, and thus contribute to the literature. To address these issues, various surveys were conducted in the UK.
The first study investigates different stakeholder groups' perceptions of responsibility among the stages of the meat chain for ensuring that the meat they eat does not cause them to become ill, and how this differs with food types. The means by which this is achieved is novel, as we elicit stakeholders' relative degrees of responsibility using the Best-Worst Scaling (BWS) technique. BWS is particularly useful because it avoids the necessity of ranking a large set of items, which people have been found to struggle with. The results from this analysis reveal a consistent pattern among respondents of downplaying the extent of their own responsibility.
The second study explores people's perceptions of various food and non-food risks within a framework characterised by the level of control that respondents believe they have over the risks and the level of worry that the risks prompt. The means by which this is done differs from past risk perception analyses in that it questions people directly regarding their relative assessments of the levels of control and worry over the risks presented. The substantive analysis of the risk perceptions has three main foci concerning the relative assessment of (i) novel vs. more familiar risks, (ii) food vs. non-food risks, and (iii) differences in risk perceptions across farmers and consumers, with a particular orientation on E. coli.
The third study investigates consumers' willingness to pay (WTP) for reductions in the level of foodborne health risk achieved by (1) nanotechnology and (2) less controversial means in the food system. The difference between consumers' valuations provides an implicit value for nanotechnology. This comparison is achieved via a split-sample Discrete Choice Experiment. Valuations of the risk reductions are derived from conditional, heteroskedastic conditional, mixed, and heteroskedastic mixed logit models. General results show the existence of heterogeneity in British consumers' preferences and variances, and that the value of nanotechnology differs for different types of consumers.
The fourth study investigates consumers' trust in institutions to provide information about nanotechnology and its use in food production and packaging. It is shown how the use of BWS and Latent Class modelling of survey data can provide in-depth information on consumer categories useful for the design of effective public policy, which in turn would allow the development of best practice in risk communication for novel technologies. Results show heterogeneity in British consumers' preferences. Three distinct consumer segments are identified: Class-1, who trust "government institutions and scientists" most; Class-2, who trust "non-profit organisations and environmental groups" most; and Class-3, who trust "food producers and handlers, and media" most.
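For the Best-Worst Scaling used in the first study, the simplest summary is the best-minus-worst count score per item, normalized by how often the item appeared. The sketch below uses invented meat-chain stages and choices, not the survey's tasks.

```python
# Hedged sketch: best-minus-worst count scores for BWS responses.
from collections import Counter

# Each task: (items shown, item picked "best", item picked "worst").
tasks = [
    (("farmer", "processor", "retailer", "consumer"), "processor", "consumer"),
    (("farmer", "processor", "retailer", "consumer"), "retailer", "consumer"),
    (("farmer", "processor", "retailer", "consumer"), "processor", "farmer"),
]

best = Counter(b for _, b, _ in tasks)
worst = Counter(w for _, _, w in tasks)
appearances = Counter(i for items, _, _ in tasks for i in items)

for item in sorted(appearances):
    # Positive scores mean an item was chosen "best" more often than "worst".
    score = (best[item] - worst[item]) / appearances[item]
    print(f"{item:10s} best-worst score = {score:+.2f}")
```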
|
399 |
Exploring the latent structure of IT employees' intention to resign in South Africa. Le Roux, Mark. January 2013.
One of the major challenges facing South African IT organisations today is the dramatic shortage of IT professionals. Both the literature and business sentiment indicate that employee turnover within the IT sector is on a continually rising trend. These high turnover rates translate into exorbitant direct and indirect costs to organisations. The purpose of this research was to identify the factors pertaining to the underlying structure of the turnover intention of these employees. A deeper understanding of these drivers may enable management to reduce the turnover intention of employees within their organisations.
A quantitative, multi-disciplinary research approach, focussing on the antecedents of turnover intention and the three systemic levels of organisational behaviour (micro, meso and macro), was used to operationalise the main research construct of this study. Data was collected by means of an anonymous, self-administered, web-based survey. A sample of 188 completed questionnaires was collected using a snowball sampling technique from the population of employees in the IT industry in South Africa. A statistical data reduction method, exploratory factor analysis, was conducted on the dataset to determine the underlying nature of the construct, IT employees' perceived intention to resign from employment.
After an appropriate number of factor analytic rounds, a robust 4-factor model of the data set was established. The results indicated that the factor Personal Enrichment from Management Support possibly plays the most significant role in understanding, monitoring, and managing IT employees' perceived intention to resign from employment. The study provided support that monetary factors have the most significant influence on an employee's decision to join an organisation; however, non-monetary benefits, such as job satisfaction and skills development, were found to be more effective in retaining employees. The practical implications uncovered by this study will enable management to gain further insight into the underlying factors and drivers of turnover intention and thereby minimise its impact on the organisation. / Dissertation (MBA)--University of Pretoria, 2013. / Gordon Institute of Business Science (GIBS) / MBA
|
400 |
The Effects of Age and Gender on Pedestrian Traffic Injuries: A Random Parameters and Latent Class Analysis. Raharjo, Tatok. 21 June 2016.
Pedestrians are vulnerable road users because they have no protection while they walk, unlike cyclists and motorcyclists, who often have at least helmet protection and sometimes additional body protection (in the case of motorcyclists, body-armored jackets and pants). In the US, pedestrian fatalities are increasing and becoming an ever larger proportion of overall roadway fatalities (NHTSA, 2016), thus underscoring the need to study the factors that influence pedestrian-injury severity and potentially develop appropriate countermeasures. One of the critical elements in the study of pedestrian-injury severities is to understand how injuries vary across age and gender, two elements that have been shown to be critical injury determinants in past research. In the current research effort, 4829 police-reported pedestrian crashes from Chicago in 2011 and 2012 are used to estimate multinomial logit, mixed logit, and latent class logit models to study the effects of age and gender on the resulting injury severities in pedestrian crashes. The results from these model estimations show that the injury severity levels for older males, younger males, older females, and younger females are statistically different. Moreover, the overall findings show that older males and older females are more likely to sustain higher injury-severity levels in many instances (when a crash occurs on city streets or state-maintained urban roads, the primary cause of the crash is failing to yield right-of-way, the pedestrian is entering, leaving, or crossing away from an intersection, the road surface is dry, or the road functional class is a local road or street). The findings suggest that well-designed and well-placed crosswalks, small islands in two-way streets, narrow streets, clear road signs, provisions for resting places, and wide, flat sidewalks all have the potential to reduce pedestrian-injury severities across age/gender combinations.
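As a baseline for the three model families named above, a multinomial logit can be estimated with statsmodels. The sketch uses simulated age/gender indicators and severity outcomes in place of the Chicago crash records; the coefficients and severity coding are invented for illustration.

```python
# Hedged sketch: multinomial logit of injury severity on age/gender dummies.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 2000
older = rng.integers(0, 2, n)      # 1 = older pedestrian (hypothetical cut)
male = rng.integers(0, 2, n)       # 1 = male

# Simulate three severity levels (0 = no injury, 1 = minor, 2 = severe)
# from utilities that make severe outcomes more likely for older pedestrians.
util = np.column_stack([np.zeros(n),
                        0.3 + 0.4 * older,
                        -0.5 + 0.9 * older + 0.2 * male])
probs = np.exp(util) / np.exp(util).sum(axis=1, keepdims=True)
severity = np.array([rng.choice(3, p=p) for p in probs])

X = sm.add_constant(pd.DataFrame({"older": older, "male": male}))
fit = sm.MNLogit(severity, X).fit(disp=False)
print(fit.summary())
```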
|