1 |
Statistical analysis of grouped data
Crafford, Gretel. 01 July 2008
The maximum likelihood (ML) estimation procedure of Matthews and Crowther (1995: A maximum likelihood estimation procedure when modelling in terms of constraints. South African Statistical Journal, 29, 29-51) is utilized to fit a continuous distribution to a grouped data set. This grouped data set may be a single frequency distribution or various frequency distributions that arise from a cross classification of several factors in a multifactor design. It will also be shown how to fit a bivariate normal distribution to a two-way contingency table where the two underlying continuous variables are jointly normally distributed. This thesis is organized in three parts, each playing a vital role in explaining the analysis of grouped data with the ML estimation procedure of Matthews and Crowther. In Part I the ML estimation procedure of Matthews and Crowther is formulated. This procedure plays an integral role and is implemented in all three parts of the thesis. In Part I the exponential distribution is fitted to a grouped data set to explain the technique. Two different formulations of the constraints are employed in the ML estimation procedure and provide identical results. The method is further justified by a simulation study. As with the exponential distribution, the estimation of the normal distribution is also explained in detail. Part I is summarized in Chapter 5, where a general method is outlined to fit continuous distributions to a grouped data set. Distributions such as the Weibull, the log-logistic and the Pareto distributions can be fitted very effectively by formulating the vector of constraints in terms of a linear model. In Part II it is explained how to model a grouped response variable in a multifactor design. This multifactor design arises from a cross classification of the various factors or independent variables to be analysed.
The cross classification of the factors results in a total of T cells, each containing a frequency distribution. Distribution fitting is done simultaneously to each of the T cells of the multifactor design. Distribution fitting is also done under the additional constraints that the parameters of the underlying continuous distributions satisfy a certain structure or design. The effect of the factors on the grouped response variable may be evaluated from this fitted design. Applications of a single-factor and a two-factor model are considered to demonstrate the versatility of the technique. A two-way contingency table where the two variables have an underlying bivariate normal distribution is considered in Part III. The estimation of the bivariate normal distribution reveals the complete underlying continuous structure between the two variables. The ML estimate of the correlation coefficient ρ is used to great effect to describe the relationship between the two variables. Apart from an application, a simulation study is also provided to support the proposed method. / Thesis (PhD (Mathematical Statistics))--University of Pretoria, 2007. / Statistics / unrestricted
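The idea of fitting a continuous distribution to a frequency distribution can be sketched for the exponential case. The sketch below is not the constrained procedure of Matthews and Crowther; it simply maximizes the multinomial log-likelihood of the class frequencies directly, using a golden-section search. The function names, the search interval, and the example class boundaries and frequencies are all assumptions made for illustration.

```python
import math


def grouped_exp_loglik(rate, bounds, freqs):
    """Multinomial log-likelihood of class frequencies under Exp(rate).

    bounds: interior class boundaries x_1 < ... < x_{k-1} (last class is open-ended).
    freqs:  the k observed class frequencies.
    """
    edges = [0.0] + list(bounds) + [math.inf]
    loglik = 0.0
    for lower, upper, freq in zip(edges[:-1], edges[1:], freqs):
        # P(lower <= X < upper) = exp(-rate*lower) - exp(-rate*upper)
        prob = math.exp(-rate * lower) - math.exp(-rate * upper)
        if prob <= 0.0:
            return -math.inf  # numerical underflow: treat this rate as impossible
        loglik += freq * math.log(prob)
    return loglik


def fit_grouped_exponential(bounds, freqs, lo=1e-6, hi=50.0, tol=1e-9):
    """Golden-section search for the rate that maximizes the log-likelihood.

    The search interval [lo, hi] is an assumption; the log-likelihood is
    concave in the rate, so a unimodal search is adequate.
    """
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if grouped_exp_loglik(c, bounds, freqs) > grouped_exp_loglik(d, bounds, freqs):
            b = d  # maximum lies in [a, d]
        else:
            a = c  # maximum lies in [c, b]
    return (a + b) / 2


# Hypothetical grouped data: classes [0,1), [1,2), [2,4), [4,inf)
example_bounds = [1.0, 2.0, 4.0]
example_freqs = [40, 25, 20, 15]
print(fit_grouped_exponential(example_bounds, example_freqs))
```

Only the class boundaries and counts enter the likelihood, which is the defining feature of grouped-data estimation: the individual observations are never needed.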
|
2 |
Investigating the association between atypical antipsychotic medication use and falls among personal care home residents in the Winnipeg Health Region
Bozat-Emre, Songul. 16 January 2012
Falls among older adults (age 65 years and older) residing in personal care homes (PCHs) are an important health concern. Atypical antipsychotic drugs (AADs) have been shown to be associated with fall risk among older adults. However, previous studies face some methodological limitations that affect the quality, consistency, and comparability of these studies. Therefore, a population-based study was undertaken to examine the effect of AAD use on the risk of falling among older PCH residents.
A nested case-control study was conducted using the administrative healthcare records and Minimum Data Set for PCHs (MDS) housed at the Manitoba Centre for Health Policy in the Faculty of Medicine, University of Manitoba. The study period was from April 1, 2005 to March 31, 2007. Cases (n=626) were fallers as recorded in MDS. Using incidence density sampling, each case was matched to four controls on length of PCH stay, age, and sex (n=2,388). Exposure to AADs was obtained from the Drug Program Information Network database. Conditional logistic regression was used to model the effects of AAD use on the risk of falling while accounting for matching and for confounding of other covariates.
While the adjusted odds of falling were statistically significantly greater for AAD users than for nonusers (adjusted odds ratio = 1.60, 95% CI 1.10-2.32), this association was type and dose dependent. Compared to nonusers, the odds of falling were greater for quetiapine users, regardless of dose, and for high-dose risperidone users. On the other hand, use of low-dose risperidone, and of olanzapine irrespective of dose, was not associated with the risk of falling. Furthermore, the effect of AAD use in general on the risk of falling was significantly greater for people with wandering problems (adjusted odds ratio = 1.84, 95% CI 1.09-3.09).
Despite some methodological limitations, this research has provided some unique findings that enhance our understanding of AAD use as a fall risk factor. Study findings allow policymakers to further develop evidence-based interventions specific to AADs in order to better manage falls in the PCH setting. However, a great deal of research is still needed to address other important unanswered questions.
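As a reading aid for the odds ratios reported above, the following sketch checks that each reported interval is roughly symmetric around its point estimate on the log scale and recovers the implied standard error. It assumes the intervals are Wald-type intervals on the log-odds-ratio scale, which is common for logistic regression output but is not stated in the abstract; the function name is hypothetical.

```python
import math


def log_scale_summary(lower, upper, z=1.96):
    """Point estimate and log-scale standard error implied by a confidence
    interval assumed to be symmetric on the log-odds-ratio scale."""
    point = math.sqrt(lower * upper)  # geometric mean of the two limits
    se_log = (math.log(upper) - math.log(lower)) / (2 * z)
    return point, se_log


# Values taken from the abstract above
for label, reported, lower, upper in [
    ("AAD users vs nonusers", 1.60, 1.10, 2.32),
    ("residents with wandering problems", 1.84, 1.09, 3.09),
]:
    point, se_log = log_scale_summary(lower, upper)
    print(f"{label}: reported OR {reported}, implied OR {point:.2f}, "
          f"SE(log OR) {se_log:.3f}")
```

If the implied and reported odds ratios agree to rounding, the interval is consistent with the assumed log-scale construction.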
|
4 |
Incorporating User Reviews as Implicit Feedback for Improving Recommender Systems
Heshmat Dehkordi, Yasamin. 26 August 2014
Recommendation systems have become extremely common in recent years due to the ubiquity of information across various applications. Online entertainment (e.g., Netflix), e-commerce (e.g., Amazon, eBay) and publishing services such as Google News are all examples of services which use recommender systems. Recommendation systems have evolved rapidly in recent years, but existing methods have fallen short in coping with several emerging trends such as likes or votes on reviews. In this work we propose a new method based on collaborative filtering that considers other users' feedback on each review. To validate our approach we use the Yelp data set, with more than 335,000 product and service category ratings and 70,817 real users. We present our results using comparative analysis with other well-known recommendation systems for particular categories of users and items. / Graduate / 0984 / 0800 / yheshmat@uvic.ca
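One way to picture how votes on reviews could enter a collaborative filter is sketched below. The abstract does not specify the actual weighting scheme, so this is only an illustration under assumptions: a user-based filter in which each neighbour's rating is weighted by cosine similarity multiplied by a hypothetical boost of `1 + log(1 + votes)`. All names and the toy data are invented for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity over the items both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict(user, item, ratings, votes):
    """Predict `user`'s rating of `item`.

    ratings[u][i]  star rating that user u gave item i
    votes[(u, i)]  number of votes other users gave u's review of i
                   (the vote boost below is an assumed weighting, not the thesis's)
    """
    numerator = denominator = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        similarity = cosine_similarity(ratings[user], other_ratings)
        weight = similarity * (1.0 + math.log1p(votes.get((other, item), 0)))
        numerator += weight * other_ratings[item]
        denominator += weight
    return numerator / denominator if denominator else None


# Toy data: v and w are equally similar to u but disagree about item "x"
toy_ratings = {"u": {"a": 4}, "v": {"a": 4, "x": 5}, "w": {"a": 4, "x": 1}}
toy_votes = {("v", "x"): 10}  # v's review of "x" was voted up by other users
print(predict("u", "x", toy_ratings, {}), predict("u", "x", toy_ratings, toy_votes))
```

With no votes the two neighbours count equally; once one review has been voted up, the prediction moves toward that reviewer's rating.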
|
5 |
On the efficient distributed evaluation of SPARQL queries / Sur l'évaluation efficace de requêtes SPARQL distribuées
Graux, Damien. 15 December 2016
Le Web Sémantique est une extension du Web standardisée par le World Wide Web Consortium. Les différents standards utilisent comme format de base pour les données le Resource Description Framework (RDF) et son langage de requêtes nommé SPARQL. Plus généralement, le Web Sémantique tend à orienter l’évolution du Web pour permettre de trouver et de traiter l’information plus facilement. L'augmentation des volumes de données RDF disponibles tend à faire rendre standard la distribution des jeux de données. Par conséquent, des évaluateurs de requêtes SPARQL efficaces et distribués sont de plus en plus nécessaires. Pour faire face à ces challenges, nous avons commencé par comparer plusieurs évaluateurs SPARQL distribués de l'état-de-l'art tout en adaptant le jeu de métriques considéré. Ensuite, une analyse guidée par des cas typiques d'utilisation nous a conduit à définir de nouveaux champs de développement dans le domaine de l'évaluation distribuée de SPARQL. Sur la base de ces nouvelles perspectives, nous avons développé plusieurs évaluateurs efficaces pour ces différents cas d'utilisation que nous avons comparé expérimentalement. / The Semantic Web, standardized by the World Wide Web Consortium, aims at providing a common framework that allows data to be shared and analyzed across applications. To this end, it introduced the Resource Description Framework (RDF) as a common base for data, together with its query language SPARQL. Because of the increasing amounts of RDF data available, dataset distribution across clusters is poised to become a standard storage method. As a consequence, efficient and distributed SPARQL evaluators are needed. To tackle these needs, we first benchmark several state-of-the-art distributed SPARQL evaluators while adapting the considered set of metrics to a distributed context (e.g., network traffic). Then, an analysis driven by typical use cases leads us to define new development areas in the field of distributed SPARQL evaluation.
On the basis of these fresh perspectives, we design several efficient distributed SPARQL evaluators which fit into each of these use cases and whose performance is validated against the already benchmarked evaluators. For instance, our distributed SPARQL evaluator named SPARQLGX offers efficient time performance while being resilient to the loss of nodes.
|
6 |
Relationship between nurse staffing and quality of life in Iowa nursing homes
Shin, Juh Hyun. 01 January 2008
The purpose of this study was to investigate the relationship between nurse staffing and quality of life (QOL) in nursing homes (NHs). The relationships between nursing staff hours per resident day, nurse staffing skill mix, turnover of nursing staff, and the answers given to QOL questions by 231 residents in Iowa NHs were investigated. Unexpectedly, only some of the staffing variables were statistically significantly correlated with the QOL of residents, and nurse staffing variables seemed to have little influence on predicting the QOL of residents in this study. The major difference between this study and previous studies is that previous research focused on quality of care (QOC), whereas this study measured QOL by measuring residents' outcomes. Previous studies found that nurse staffing is an important factor in improving QOC (and by implication, QOL) of NH residents. Based on the statistically significant relationships, RNs' unique contributions were supported by the findings that NHs with more RNs, compared with LPNs/LVNs and CNAs, had residents with higher scores in the functional competence domain and overall QOL summary items. This study found that nurse staffing turnover is positively correlated with QOL, especially in the individuality domain. However, the whole study took place in one state, Iowa, which has a homogeneous population with limited racial diversity. Because only Iowa NHs were selected, it is questionable whether the findings are generalizable to the rest of the United States. Further research is required to confirm the relationship and provide policy guidelines, including nurse staffing recommendations, to guarantee optimal QOL for NH residents.
|
7 |
Tools for AI Music Creatives: Mapping the field
Martin, Elliot; Avila Rojas, Ley-Olivia. January 2022
Within the creative industries, such as visual arts and music, there has been a rise in AI implementations to solve various tasks in each respective creative field. Implementations within the field of AI music creation have gained a lot of attention in recent years, because many tools have become proficient at making music. Previously, a lot of research has been dedicated to the algorithms behind these tools, but not as much to other software qualities that may be useful for both users and developers of such tools to know. Hence, the focus of this thesis is on completing a mapping of 6 established AI music creation tools according to a set of technical evaluation components. The mapping was carried out with a functional taxonomy. The results show that a majority of the tools implement deep learning (DL) algorithms, that all data sets are constructed differently, that the majority apply user-friendly cloud-based environments for their tools, and that there was an equal divide between open- and closed-source tools. The discussion chapter analyzes why developers have created the tools in a certain way, why potential developers should consider implementing a music creation tool with a DL algorithm, and why they should consider studying existing open-source tools, given the knowledge and resources developers stand to gain from such a platform. Closed-source tools are more suitable for users who only want to create music with AI music creation tools, considering the uncomplicated usage of and access to such tools. / Inom de kreativa branscherna, till exempel bildkonst och musik, har det skett en ökning av AI-implementeringar för att lösa olika uppgifter, inom respektive kreativt område. Implementeringar inom området AI-musikskapande har fått stor uppmärksamhet de senaste åren, på grund av att många verktyg har blivit skickliga i att skapa musik. 
Tidigare har det gjorts mycket forskning tillägnad till algoritmerna bakom dessa verktyg, men inte lika mycket andra mjukvaru-kvaliteter som kan vara användbara för både användare av dessa verktyg och utvecklare av sådana verktyg att känna till. Denna avhandling kommer därmed fokusera på att slutföra en kartläggning av 6 etablerade AI-musikskapande verktyg, med hjälp av en uppsättning tekniska utvärderingskomponenter. Kartläggningen utfördes med en funktionell taxonomi. Resultaten visar att en majoritet av verktygen implementerar DL-algoritmer, alla datamängder är konstruerade på olika sätt, majoriteten tillämpar användarvänliga molnbaserade miljöer för sina verktyg, och att det fanns en lika uppdelning mellan verktyg med öppen,-och sluten källkod. Diskussionskapitlet analyserar varför utvecklare har skapat verktygen på ett visst sätt, varför potentiella utvecklare bör överväga att implementera ett musikskapande verktyg med en DL-algoritm och varför de bör överväga att studera befintliga verktyg med öppen källkod, på grund av den kunskap och resurser som utvecklare har att vinna på från en sådan plattform. Verktyg med sluten källkod är mer lämpade för användare som endast vill skapa musik med AI-musikskapande verktyg, med tanke på den okomplicerade användningen och tillgången till dessa verktyg.
|
8 |
Single data set detection for multistatic Doppler radar
Shtarkalev, Bogomil Iliev. January 2015
The aim of this thesis is to develop and analyse single data set (SDS) detection algorithms that can utilise the advantages of widely-spaced (statistical) multiple-input multiple-output (MIMO) radar to increase their accuracy and performance. The algorithms make use of the observations obtained from multiple space-time adaptive processing (STAP) receivers and focus on covariance estimation and inversion to perform target detection. One of the main interferers for a Doppler radar has always been the radar’s own signal being reflected off the surroundings. The reflections of the transmitted waveforms from the ground and other stationary or slowly-moving objects in the background generate observations that can potentially raise false alarms. This creates the problem of searching for a target in both additive white Gaussian noise (AWGN) and highly-correlated (coloured) interference. Traditional STAP deals with the problem by using target-free training data to study this environment and build its characteristic covariance matrix. The data usually comes from range gates neighbouring the cell under test (CUT). In non-homogeneous or non-stationary environments, however, this training data may not reflect the statistics of the CUT accurately, which justifies the need to develop SDS methods for radar detection. The maximum likelihood estimation detector (MLED) and the generalised maximum likelihood estimation detector (GMLED) are two reduced-rank STAP algorithms that eliminate the need for training data when mapping the statistics of the background interference. The work in this thesis is largely based on these two algorithms. The first work derives the optimal maximum likelihood (ML) solution to the target detection problem when the MLED and GMLED are used in a multistatic radar scenario. This application assumes that the spatio-temporal Doppler frequencies produced in the individual bistatic STAP pairs of the MIMO system are ideally synchronised. 
Therefore the focus is on providing the multistatic outcome to the target detection problem. It is shown that the derived MIMO detectors possess the desirable constant false alarm rate (CFAR) property. Gaussian approximations to the statistics of the multistatic MLED and GMLED are derived in order to provide a more in-depth analysis of the algorithms. The viability of the theoretical models and their approximations is tested against a numerical simulation of the systems. The second work focuses on the synchronisation of the spatio-temporal Doppler frequency data from the individual bistatic STAP pairs in the multistatic MLED scenario. It expands the idea to a form that could be implemented in a practical radar scenario. To reduce the information shared between the bistatic STAP channels, a data compression method is proposed that extracts the significant contributions of the MLED likelihood function before transmission. To perform the inter-channel synchronisation, the Doppler frequency data is projected into the space of potential target velocities where the multistatic likelihood is formed. Based on the expected structure of the velocity likelihood in the presence of a target, a modification to the multistatic MLED is proposed. It is demonstrated through numerical simulations that the proposed modified algorithm performs better than the basic multistatic MLED while having the benefit of reducing the data exchange in the MIMO radar system.
|
9 |
Improved quantification under dataset shift / Quantificação em problemas com mudança de domínio
Vaz, Afonso Fernandes. 17 May 2018
Several machine learning applications use classifiers as a way of quantifying the prevalence of positive class labels in a target dataset, a task named quantification. For instance, a naive way of determining the proportion of positive reviews about a given product on Facebook, where no labeled reviews are available, is to (i) train a classifier based on Google Shopping reviews to predict whether a user likes a product given its review, and then (ii) apply this classifier to Facebook posts about that product. Unfortunately, it is well known that such a two-step approach, named Classify and Count, fails because of dataset shift, and thus several improvements have been recently proposed under an assumption named prior shift. However, these methods only explore the relationship between the covariates and the response via classifiers, and none of them takes advantage of the fact that one often has access to a few labeled samples in the target set. Moreover, the literature lacks approaches that can handle a target population that varies with another covariate; for instance, how can one accurately estimate how the proportion of new posts or new webpages in favor of a political candidate varies in time? We propose novel methods that fill these important gaps and compare them using both real and artificial datasets. Finally, we provide a theoretical analysis of the methods. / Muitas aplicações de aprendizado de máquina usam classificadores para determinar a prevalência da classe positiva em um conjunto de dados de interesse, uma tarefa denominada quantificação. Por exemplo, uma maneira ingênua de determinar qual a proporção de postagens positivas sobre um determinado protuto no Facebook sem ter resenhas rotuladas é (i) treinar um classificador baseado em resenhas do Google Shopping para prever se um usuário gosta de um produto qualquer, e então (ii) aplicar esse classificador às postagens do Facebook relacionados ao produtos de interesse. 
Infelizmente, é sabido que essa técnica de dois passos, denominada classificar e contar, falha por não levar em conta a mudança de domínio. Assim, várias melhorias vêm sendo feitas recentemente sob uma suposição denominada prior shift. Entretanto, estes métodos exploram a relação entre as covariáveis apenas via classificadores e nenhum deles aproveitam o fato de que, em algumas situações, podemos rotular algumas amostras do conjunto de dados de interesse. Além disso, a literatura carece de abordagens que possam lidar com uma população-alvo que varia com outra covariável; por exemplo: Como estimar precisamente como a proporção de novas postagens ou páginas web a favor de um candidato político varia com o tempo? Nós propomos novos métodos que preenchem essas lacunas importantes e os comparamos utilizando conjuntos de dados reais e similados. Finalmente, nós fornecemos uma análise teórica dos métodos propostos.
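The gap between Classify and Count and a prior-shift correction can be sketched in a few lines. The correction shown is the widely used adjusted count estimator, which inverts the relation cc = p * tpr + (1 - p) * fpr; it is one of the existing improvements of the kind the abstract alludes to, not the novel method proposed in the thesis. Function names and the numbers in the example are assumptions.

```python
def classify_and_count(predictions):
    """Fraction of target items the classifier labels positive (0/1 predictions)."""
    return sum(predictions) / len(predictions)


def adjusted_count(predictions, tpr, fpr):
    """Correct Classify and Count under prior shift.

    tpr, fpr: the classifier's true and false positive rates, estimated on
    labeled source data and assumed unchanged in the target set.
    """
    if tpr == fpr:
        raise ValueError("classifier is uninformative: tpr equals fpr")
    raw = classify_and_count(predictions)
    estimate = (raw - fpr) / (tpr - fpr)
    return min(1.0, max(0.0, estimate))  # clip to a valid proportion


# Hypothetical target set: 1000 posts, of which the classifier flags 310 as positive
target_predictions = [1] * 310 + [0] * 690
print(classify_and_count(target_predictions))
print(adjusted_count(target_predictions, tpr=0.8, fpr=0.1))
```

The raw count is biased whenever the classifier is imperfect and the class proportions differ between source and target; the adjustment removes that bias only as far as the assumed error rates carry over.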
|
10 |
A General Framework for Multi-Resolution Visualization
Yang, Jing. 05 May 2005
Multi-resolution visualization (MRV) systems are widely used for handling large amounts of information. These systems look different but they share many common features. The visualization research community lacks a general framework that summarizes the common features among the wide variety of MRV systems in order to help in MRV system design, analysis, and enhancement. This dissertation proposes such a general framework. This framework is based on the definition that an MRV system is a visualization system that visually represents perceptions in different levels of detail and allows users to interactively navigate among the representations. The visual representation of a perception is called a view. The framework is composed of two essential components: view simulation and interactive visualization. View simulation means that an MRV system simulates views of non-existing perceptions through simplification on the data structure or the graphics generation process. This is needed when the perceptions provided to the MRV system are not at the user's desired level of detail. The framework identifies classes of view simulation approaches and describes them in terms of simplification operators and operands (spaces). The simplification operators are further divided into four categories, namely sampling operators, aggregation operators, approximation operators, and generalization operators. Techniques in these categories are listed and illustrated via examples. The simplification operands (spaces) are also further divided into categories, namely data space and visualization space. How different simplification operators are applied to these spaces is also illustrated using examples. Interactive visualization means that an MRV system visually presents the views to users and allows users to interactively navigate among different views or within one view. 
Three types of MRV interface, namely the zoomable interface, the overview + context interface, and the focus + detail interface, are presented with examples. Common interaction tools used in MRV systems, such as zooming and panning, selection, distortion, overlap reduction, previewing, and dynamic simplification are also presented. A large number of existing MRV systems are used as examples in this dissertation, including several MRV systems developed by the author based on the general framework. In addition, a case study that analyzes and suggests possible improvements for an existing MRV system is described. These examples and the case study reveal that the framework covers the common features of a wide variety of existing MRV systems, and helps users analyze and improve existing MRV systems as well as design new MRV systems.
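Two of the simplification operator categories named above, sampling and aggregation, can be sketched as functions applied in data space. This is a minimal illustration under assumptions: the function names are hypothetical, the data is a plain list of numbers, and the mean is just one possible aggregate; the dissertation's own operators are not reproduced here.

```python
def sampling_operator(values, step):
    """Sampling: keep every `step`-th item and discard the rest."""
    return values[::step]


def aggregation_operator(values, group_size):
    """Aggregation: replace each run of `group_size` items by its mean."""
    groups = [values[i:i + group_size] for i in range(0, len(values), group_size)]
    return [sum(group) / len(group) for group in groups]


def build_levels(values, group_size=2):
    """Simulate views at successively coarser levels of detail by repeated
    aggregation, from the full data down to a single summary value."""
    if group_size < 2:
        raise ValueError("group_size must be at least 2")
    levels = [list(values)]
    while len(levels[-1]) > 1:
        levels.append(aggregation_operator(levels[-1], group_size))
    return levels


# Hypothetical data series; each entry of `levels` is one simulated view
for level in build_levels([3, 5, 8, 2, 7, 7, 1, 9]):
    print(level)
```

A sampling operator shows real items but may miss outliers, while an aggregation operator shows every item's contribution but no item exactly; an interface that lets users move between the levels corresponds to the interactive visualization component.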
|