1 |
The development of a statistical computer software resource for medical research
Buchan, Iain Edward (January 2000)
Medical research is often weakened by poor statistical practice, and inappropriate use of statistical computer software is part of this problem. The statistical knowledge that medical researchers require has traditionally been gained in both dedicated and ad hoc learning time, often separate from the research processes in which the statistical methods are applied. Computer software, however, can be written to support statistical practice flexibly. The work of this thesis was to explore the possibility of, and if possible to create, a resource supporting medical researchers in statistical knowledge and calculation at the point of need. The work was carried out over eleven years and was directed towards the medical research community in general. Statistical and software engineering methods were used to produce a unified resource for statistical computation and knowledge support. Mathematically and computationally robust approaches to statistical methods were continually sought in the current literature. The evaluation undertaken was formative; it included monitoring uptake of the software and feedback from its users, comparisons with other software, reviews in peer-reviewed publications, and testing of results against classical and reference data. Large-scale opportunistic feedback from users of the resource was employed in its continuous improvement. The software resulting from the work of this thesis is provided herein as supporting evidence. Results of applying the software to classical reference data are shown in the written thesis. The scope and presentation of statistical methods are considered in a comparison of the software with common statistical software resources. This comparison showed that the software written for this thesis more closely matched the statistical methods commonly used in medical research and contained more statistical knowledge support materials. Up to October 31st 2000, uptake of the software was recorded for 5621 separate instances by individuals or institutions. The development has been self-sustaining. Medical researchers need to have sufficient statistical understanding, just as statistical researchers need to understand sufficiently the nature of data. Statistical software tools may damage statistical practice if they distract attention from statistical goals and tasks onto the tools themselves. The work of this thesis provides a practical computing framework supporting statistical knowledge and calculation in medical research. This work has shown that sustainable software can be engineered to improve statistical appreciation and practice in ways that are beyond the reach of traditional medical statistical education.
|
2 |
Statistická analýza ve webovém prostředí / Statistical Analysis in Web Environment
Postler, Štěpán (January 2013)
The aim of this thesis is to create a web application that allows importing datasets and analyzing the data using statistical methods. The application provides user accounts that allow multiple people to work with a single dataset and to interact with each other. Data are stored on a remote server, and the application is accessible from any computer connected to the Internet. The application is written in the PHP programming language with the MySQL database system, and the user interface is built in HTML with CSS styles. All parts of the application are stored on an attached CD as text files. In addition to the web application, the thesis includes a written part containing a theoretical section, which describes the chosen statistical analysis methods, and a practical section, which lists the application's functions, describes the data model, and demonstrates the data analysis options on specific examples.
|
3 |
On Multivariate Longitudinal Binary Data Models and Their Applications in Forecasting
Asar, Ozgur (01 July 2012)
Longitudinal data arise when subjects are followed over time. This type of data is typically dependent, because it includes repeated observations, and this dependence is termed within-subject dependence. Often the scientific interest is in multiple longitudinal measurements, which introduce two additional types of association: between-response and cross-response temporal dependencies. Only statistical methods that take these association structures into account can yield reliable and valid statistical inferences. Although methods for univariate longitudinal data have been studied extensively, multivariate longitudinal data still need more work. In this thesis, although we mainly focus on multivariate longitudinal binary data models, we also consider other types of response families when necessary. We extend a work on multivariate marginal models, namely multivariate marginal models with response-specific parameters (MMM1), and propose multivariate marginal models with shared regression parameters (MMM2). Both of these models are based on generalized estimating equations (GEE) and are valid for several response families such as Binomial, Gaussian, Poisson, and Gamma. Two R packages, mmm and mmm2, are proposed to fit them, respectively. We further develop a marginalized multilevel model, namely the probit normal marginalized transition random effects model (PNMTREM), for multivariate longitudinal binary responses. In this model, the implicit function theorem is introduced to explicitly link the levels of marginalized multilevel models with transition structures for the first time. An R package, pnmtrem, is proposed to fit the model. PNMTREM is applied to data collected through the Iowa Youth and Families Project (IYFP). Five different models, including univariate and multivariate ones, are considered to forecast multivariate longitudinal binary data. A comparative simulation study, which includes a model-independent data simulation process, is considered for this purpose. Forecasting of the independent variables is taken into account as well. To assess the forecasts, several accuracy measures, such as expected proportion of correct prediction (ePCP), area under the receiver operating characteristic (AUROC) curve, and mean absolute scaled error (MASE), are considered. Mother's Stress and Children's Morbidity (MSCM) data are used to illustrate this comparison in real life. Results show that marginalized models yield better forecasting results compared to marginal models. Simulation results are in agreement with these results as well.
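The mmm and mmm2 packages described above are written for R; as a rough analogue, here is a minimal sketch in Python of fitting a marginal model for one longitudinal binary response with GEE, using statsmodels (the toy data, variable names, and exchangeable working correlation are illustrative assumptions, not the thesis's code):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Toy longitudinal binary data: 100 subjects, 4 repeated visits each
# (hypothetical names and effect sizes, for illustration only).
rng = np.random.default_rng(0)
n_subj, n_visits = 100, 4
df = pd.DataFrame({
    "id": np.repeat(np.arange(n_subj), n_visits),
    "time": np.tile(np.arange(n_visits), n_subj),
    "x": rng.normal(size=n_subj * n_visits),
})
df["y"] = rng.binomial(1, 1.0 / (1.0 + np.exp(-(0.5 * df["x"] - 0.2 * df["time"]))))

# Marginal (population-averaged) model fitted by GEE; the exchangeable
# working correlation structure accounts for within-subject dependence.
model = sm.GEE.from_formula(
    "y ~ x + time",
    groups="id",
    data=df,
    family=sm.families.Binomial(),
    cov_struct=sm.cov_struct.Exchangeable(),
)
print(model.fit().summary())
```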
|
4 |
Alternative estimation approaches for some common Item Response Theory models
Sabouri, Pooneh (06 January 2011)
In this report we give a brief introduction to Item Response Theory models and multilevel models. The general assumptions of two classical Item Response Theory models, the 1PL and 2PL, are discussed. We follow the discussion by introducing a multilevel framework for these two Item Response Theory models. We explain Bock and Aitkin's (1981) work to estimate item parameters for these two models. Finally, we illustrate these models with LSAT exam data and two statistical software packages: R and Stata.
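To make the model equations concrete, here is a minimal sketch of the 2PL item response function in Python (the ability and item parameter values are illustrative assumptions):

```python
import numpy as np

def irf_2pl(theta, a, b):
    """2PL item response function: probability that a person with ability
    theta answers an item with discrimination a and difficulty b correctly.
    The 1PL (Rasch) model is the special case with a fixed at 1."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

# Correct-response probabilities for abilities -1, 0, 1 on an item with
# discrimination 1.2 and difficulty 0.3 (illustrative values).
print(irf_2pl(np.array([-1.0, 0.0, 1.0]), a=1.2, b=0.3))
```

Bock and Aitkin's (1981) marginal maximum likelihood approach integrates this probability over an assumed ability distribution to estimate a and b for each item.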
|
5 |
Design and adaptation of a general purpose, user friendly statistical software package for the IBM personal computer and IBM PC compatibles (PC VSTAT)
Morley, Deborah G. (January 1986)
No description available.
|
6 |
Détection de ruptures multiples – application aux signaux physiologiques / Multiple change point detection – application to physiological signals
Truong, Charles (29 November 2018)
This work addresses the problem of detecting multiple change points in (univariate or multivariate) physiological signals. Well-known examples of such signals include the electrocardiogram (ECG), the electroencephalogram (EEG), and inertial measurements (accelerations, angular velocities, etc.). The objective of this thesis is to provide change point detection algorithms that (i) can handle long signals, (ii) can be applied to a wide range of real-world scenarios, and (iii) can incorporate the knowledge of medical experts. In particular, emphasis is placed on fully automatic procedures that can be used in daily clinical practice. To that end, robust detection methods as well as supervised calibration strategies are described, and a documented open-source Python package is released.

The first contribution of this thesis is a sub-optimal change point detection algorithm that can accommodate time complexity constraints while retaining most of the robustness of optimal procedures. This algorithm is sequential and alternates between the following two steps: a change point is estimated, then its contribution to the signal is projected out. In the context of mean-shifts, asymptotic consistency of the estimated change points is obtained. We prove that this greedy strategy can easily be extended to other types of changes by using reproducing kernel Hilbert spaces. Thanks to this approach, physiological signals can be handled without strong assumptions about the generative model of the data. Experiments on real-world signals show that these approaches are more accurate than standard sub-optimal algorithms and faster than optimal algorithms.

The second contribution of this thesis consists of two supervised algorithms for automatic calibration. Both rely on labeled examples, which in our context consist of segmented signals. The first approach learns the smoothing parameter for the penalized detection of an unknown number of changes. The second procedure learns a non-parametric transformation of the representation space that improves detection performance. Both supervised procedures yield finely tuned detection algorithms that are able to replicate the segmentation strategy of an expert. Results show that these supervised algorithms outperform unsupervised algorithms, especially in the case of physiological signals, where the notion of change heavily depends on the physiological phenomenon of interest.

All algorithmic contributions of this thesis can be found in "ruptures", an open-source Python library, available online. Thoroughly documented, "ruptures" also comes with a consistent interface for all methods.
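As an illustration of that consistent interface, a minimal sketch using ruptures: a piecewise-constant signal is simulated, then a penalized search detects an unknown number of changes (the PELT method, kernelized cost, and penalty value are illustrative choices, not the thesis's exact experiments):

```python
import numpy as np
import ruptures as rpt

# Simulate a 3-dimensional piecewise-constant signal with 4 change points.
n_samples, n_dims, n_bkps, noise_std = 1000, 3, 4, 2.0
signal, true_bkps = rpt.pw_constant(n_samples, n_dims, n_bkps, noise_std=noise_std)

# Penalized detection of an unknown number of changes; the kernel ("rbf")
# cost avoids strong assumptions about the generative model of the data.
algo = rpt.Pelt(model="rbf", min_size=10).fit(signal)
predicted_bkps = algo.predict(pen=10)

print("true:", true_bkps)
print("predicted:", predicted_bkps)
```

The penalty `pen` plays the role of the smoothing parameter that the first supervised calibration procedure described above learns from annotated examples.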
|
7 |
常用統計套裝軟體的U(0,1)亂數產生器之探討 / A Study of the U(0,1) Random Number Generators in Common Statistical Software Packages
Chang, Hao-Ju (張浩如), Unknown Date
With the development and popularity of computers, more and more people in different fields use the results of computer simulation as a basis for decisions. The generation of random numbers is one of the most important steps in computer simulation. Most users today rely directly on the random number generators built into packaged software, yet the literature offers few detailed examinations of these built-in generators. The main purpose of this thesis is therefore to give a fairly complete introduction, comparison, and discussion of the built-in U(0,1) random number generators of five software packages commonly used in statistical analysis: SAS 6.12, SPSS 8.0, EXCEL 97, S-PLUS 2000, and MINITAB 12. Besides evaluating these generators from three points of view, namely period length, statistical properties, and computational efficiency, we also use the performance of the sample-mean Monte Carlo method in evaluating integrals as an application example of computer simulation.
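To make the application example concrete, here is a minimal sketch of the sample-mean Monte Carlo method in Python (the integrand and sample size are illustrative assumptions); the accuracy of such estimates rests directly on the quality of the underlying U(0,1) generator being examined:

```python
import numpy as np

def sample_mean_mc(f, a, b, n, rng=None):
    """Estimate the integral of f over [a, b] as (b - a) * mean(f(U_i)),
    where U_1, ..., U_n are uniform draws on [a, b]."""
    rng = np.random.default_rng() if rng is None else rng
    u = rng.uniform(a, b, size=n)
    return (b - a) * f(u).mean()

# Example: the integral of exp(x) over [0, 1] is e - 1 ~ 1.71828.
print(sample_mean_mc(np.exp, 0.0, 1.0, n=100_000))
```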
|
8 |
Design, Development and Testing of Web Services for Multi-Sensor Snow Cover Mapping
Kadlec, Jiri (01 March 2016)
This dissertation presents the design, development and validation of new data integration methods for mapping the extent of snow cover based on open access ground station measurements, remote sensing images, volunteer observer snow reports, and cross-country ski track recordings from location-enabled mobile devices. The first step of the data integration procedure includes data discovery, data retrieval, and data quality control of snow observations at ground stations. The WaterML R package developed in this work enables hydrologists to retrieve and analyze data from multiple organizations that are listed in the Consortium of Universities for the Advancement of Hydrologic Science, Inc. (CUAHSI) Water Data Center catalog directly within the R statistical software environment. Use of the WaterML R package is demonstrated by running an energy balance snowpack model in R with data inputs from CUAHSI, and by automating uploads of real-time sensor observations to a CUAHSI HydroServer. The second step of the procedure requires efficient access to multi-temporal remote sensing snow images. The Snow Inspector web application developed in this research enables users to retrieve a time series of fractional snow cover from the Moderate Resolution Imaging Spectroradiometer (MODIS) for any point on Earth. The time series retrieval method is based on automated data extraction from tile images provided by a Web Map Tile Service (WMTS). The average time required to retrieve 100 days of data using this technique is 5.4 seconds, which is significantly faster than other methods that require the download of large satellite image files. The presented data extraction technique and space-time visualization user interface can be used as a model for working with other multi-temporal hydrologic or climate data WMTS services. The third and final step of the data integration procedure is generating continuous daily snow cover maps. A custom inverse distance weighting method has been developed to combine volunteer snow reports, cross-country ski track reports, and station measurements to fill cloud gaps in the MODIS snow cover product; a simplified sketch of this weighting idea follows below. The method is demonstrated by producing a continuous, daily time step snow presence probability map dataset for the Czech Republic region. The ability of the presented methodology to reconstruct MODIS snow cover under cloud is validated by simulating cloud cover datasets and comparing estimated snow cover to actual MODIS snow cover. The percent correctly classified indicator showed accuracy between 80 and 90% using this method. Using crowdsourced data (volunteer snow reports and ski tracks) improves the map accuracy by 0.7–1.2%. The output snow probability map datasets are published online using web applications and web services.
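A minimal sketch of the inverse distance weighting idea behind that final step, in Python (the weighting exponent, toy coordinates, and equal treatment of all report types are illustrative assumptions; the dissertation develops a custom variant):

```python
import numpy as np

def idw_snow_probability(obs_xy, obs_snow, grid_xy, power=2.0, eps=1e-12):
    """Estimate snow presence probability at grid points as the
    inverse-distance-weighted mean of binary snow observations
    (station reports, volunteer reports, ski tracks)."""
    # Pairwise distances between grid cells and observation points.
    d = np.linalg.norm(grid_xy[:, None, :] - obs_xy[None, :, :], axis=2)
    w = 1.0 / (d ** power + eps)  # closer observations get larger weights
    return (w * obs_snow).sum(axis=1) / w.sum(axis=1)

# Three observations (snow = 1, no snow = 0) and two cloud-gap cells.
obs_xy = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
obs_snow = np.array([1.0, 0.0, 1.0])
grid_xy = np.array([[0.2, 0.2], [0.9, 0.1]])
print(idw_snow_probability(obs_xy, obs_snow, grid_xy))
```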
|
9 |
Application of Fluid Inclusions and Mineral Textures in Exploration for Epithermal Precious Metals Deposits
Moncada de la Rosa, Jorge Daniel (05 January 2009)
Fluid inclusion and mineralogical features indicative of boiling have been characterized in 855 samples from epithermal precious metals deposits along the Veta Madre at Guanajuato, Mexico. Features associated with boiling that have been identified at Guanajuato include colloform texture silica, plumose texture silica, moss texture silica, ghost-sphere texture silica, lattice-bladed calcite, lattice-bladed calcite replaced by quartz and pseudo-acicular quartz after calcite and coexisting liquid-rich and vapor-rich fluid inclusions. Most samples were assayed for Au, Ag, Cu, Pb, Zn, As and Sb, and were divided into high-grade and low-grade samples based on the gold and silver concentrations. For silver, the cutoff for high grade was 100 ppm Ag, and for gold the cutoff was 1 ppm Au. The feature that is most closely associated with high grades of both gold and silver is colloform texture silica, and this feature also shows the largest difference in grade between the presence or absence of that feature (178.8 ppm Ag versus 17.2 ppm Ag, and 1.1 ppm Au versus 0.2 ppm Au). For both Ag and Au, there is no significant difference in average grade as a function of whether or not coexisting liquid-rich and vapor-rich fluid inclusions are present.
The textural and fluid inclusion data obtained in this study were analyzed using the binary classifier within SPSS Clementine. The models that correctly predicted high versus low grade samples most consistently (~70-75% of the tests) for both Ag and Au were the neural network, the C5 decision tree and Quest decision tree models. For both Au and Ag, the presence of colloform silica texture was the variable with the greatest importance, i.e., the variable that has the greatest predictive power.
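As a rough analogue of that classification experiment, here is a minimal decision-tree sketch in Python with scikit-learn; the synthetic features and labels are assumptions, and CART stands in for Clementine's C5 and Quest algorithms:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 855  # number of characterized samples in the study

# Binary texture indicators per sample (synthetic stand-ins): colloform
# silica, lattice-bladed calcite, coexisting liquid- and vapor-rich inclusions.
X = rng.integers(0, 2, size=(n, 3))
# Synthetic high/low-grade labels driven mainly by colloform silica,
# mimicking that feature's reported predictive power.
y = (X[:, 0] & (rng.random(n) < 0.75)).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
tree = DecisionTreeClassifier(max_depth=3).fit(X_train, y_train)
print("held-out accuracy:", tree.score(X_test, y_test))
print("feature importances:", tree.feature_importances_)
```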
Boiling features are absent or rare in samples collected along a traverse perpendicular to the Veta Madre. This suggests that if an explorationist observes these features in samples collected during exploration, an environment favorable to precious metal mineralization is likely nearby. Similarly, good evidence for boiling is observed in the deepest levels of the Veta Madre that have been sampled in the mines and drill cores, suggesting that additional precious metal reserves are likely beneath the deepest levels sampled.
|
10 |
Gestão do risco de granizo pelo seguro e outras alternativas: estudo de caso em pomares de maçã de Santa Catarina / Hail risk management using insurance and other alternatives: case study on apple orchards in Santa Catarina, Brazil
Yuri, Henrique Massaru (03 February 2004)
The damage caused by hailstorms is one of the most important problems faced by apple producers in Brazil and in other countries. Based on a comprehensive literature review and field surveys carried out in the apple-producing region of Santa Catarina State, Brazil, this work presents a characterization of the hail risk problem and an evaluation of the existing alternatives for defining optimal risk management strategies. The alternatives considered included: commercial insurance, mutual insurance, spatial diversification, anti-hail nets, anti-hail rockets, and ground burners. The work uses a conceptual model, specified in a decision diagram together with data gathered in the surveys, to make explicit the qualitative and quantitative relationships between the different alternatives considered and the variables most relevant to characterizing the problem. The model defined in the decision diagram was implemented in software, as a spreadsheet, to facilitate the selection of the best combination of alternatives for hail risk management given the prevailing prices, costs, and other relevant information. This software was used in a case study with a cooperative of apple producers in São Joaquim, SC, for the quantitative analysis of the alternatives identified. The work aims to provide technical support that helps farmers decide on the most appropriate strategy for managing the risk of hailstorms in their orchards, helps insurance companies design new contracts in their agricultural portfolios, and helps the government develop new policies for the agricultural sector.
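A minimal sketch of the kind of expected-cost comparison such a spreadsheet decision model supports, in Python (every number below is an illustrative assumption, not a value from the case study):

```python
# Compare the expected annual cost of hail-risk alternatives for one orchard.
p_hail = 0.12      # assumed annual probability of a damaging hailstorm
loss = 100_000.0   # assumed loss per event with no protection

alternatives = {
    "do nothing": 0.0 + p_hail * loss,
    "commercial insurance": 9_000.0 + p_hail * loss * 0.20,  # 80% of loss covered
    "anti-hail nets": 15_000.0 + p_hail * loss * 0.05,       # nets stop ~95% of damage
}
for name, cost in sorted(alternatives.items(), key=lambda kv: kv[1]):
    print(f"{name}: expected annual cost {cost:,.0f}")
```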
|