161 |
An analysis of the technical efficiency in Hong Kong's construction industry
Wang, You-song, 王幼松. January 1998
published_or_final_version / Real Estate and Construction / Doctoral / Doctor of Philosophy
|
162 |
Propriétés fréquentistes des méthodes Bayésiennes semi-paramétriques et non paramétriques / Frequentist properties of Bayesian semiparametric and nonparametric procedures
Salomond, Jean-Bernard. 30 September 2014
Research on Bayesian nonparametric methods has attracted growing interest over the past twenty years, especially since the development of powerful simulation algorithms that make complex Bayesian methods feasible in practice. It is therefore necessary to understand, from a theoretical point of view, how these methods behave. This thesis presents several contributions to the study of the frequentist properties of Bayesian nonparametric procedures. Although studying these methods from an asymptotic angle may seem restrictive at first, it makes it possible to grasp how the Bayesian machinery operates in extremely complex models. In particular, this approach helps to detect the aspects of the prior that are strongly influential on the inference. Many general results have been obtained in this setting; however, as models become more complex and more realistic, they move away from the usual assumptions and are no longer covered by the existing theory. Beyond the intrinsic interest of studying a specific model that does not satisfy the usual assumptions, such a study also leads to a better understanding of the mechanisms that govern Bayesian nonparametric methods.
|
163 |
Essays on econometric modelling of temporal networks / Essais sur la modélisation économétrique des réseaux temporels
Iacopini, Matteo. 05 July 2018
Graph theory has long been studied in mathematics and probability as a tool for describing dependence between nodes. However, only recently has it been applied to data, giving rise to the statistical analysis of real networks. The topology of economic and financial networks is remarkably complex: it is generally unobserved, and thus requires adequate inferential procedures for its estimation; moreover, not only the nodes but the structure of dependence itself evolves over time. Statistical and econometric tools for modelling the dynamics of change in the network structure are lacking, despite the growing need for them in several fields of research. At the same time, with the beginning of the era of “Big data”, the size of available datasets is growing and their internal structure is becoming more complex, hampering traditional inferential processes in many cases. This thesis aims to contribute to this new field of literature, which joins probability, economics, physics and sociology, by proposing novel statistical and econometric methodologies for studying the temporal evolution of network structures of medium to high dimension.
|
164 |
Abordagem não-paramétrica para cálculo do tamanho da amostra com base em questionários ou escalas de avaliação na área de saúde / Non-parametric approach for calculation of sample size based on questionnaires or scales of assessment in the health care
Couto Junior, Euro de Barros. 01 October 2009
This text suggests how to calculate a sample size when data are collected with an instrument made up of categorical items. The arguments for this suggestion are based on the theories of Combinatorics and Paraconsistency. The purpose is to suggest a simple and practical calculation procedure that yields an acceptable sample size for collecting information, organizing it and analyzing data from the application of a medical data collection instrument based exclusively on discrete (categorical) items; that is, each item of the instrument is treated as a non-parametric variable with a finite number of categories. In health care, it is very common to use survey instruments based on such items: clinical protocols, hospital records, questionnaires, scales and other inquiry tools consist of an organized sequence of categorical items. A formula for calculating the sample size was proposed for populations of unknown size, and an adjustment of this formula was proposed for populations of known size. Practical examples showed that both formulas can be used, which supports their practicality in cases where little or no information is available about the population from which the sample will be drawn.
|
165 |
An empirical analysis of hedge ratio: the case of Nikkei 225 options
Lam Suet-man. January 2001
Thesis (M.Phil.)--Chinese University of Hong Kong, 2001. / Includes bibliographical references (leaves 111-117). / Abstracts in English and Chinese. / ACKNOWLEDGMENTS / LIST OF TABLES / LIST OF ILLUSTRATIONS / CHAPTER / Chapter ONE --- INTRODUCTION / Chapter TWO --- REVIEW OF THE LITERATURE / Parametric Models / Nonparametric Estimation Techniques / Chapter THREE --- METHODOLOGY / Parametric Models / Nonparametric Models / Chapter FOUR --- DATA DESCRIPTION / Chapter FIVE --- EMPIRICAL FINDINGS / Estimation Results / Evaluation of Model Performance / Out-of-sample Forecast Evaluation / Chapter SIX --- CONCLUSION / TABLES / ILLUSTRATIONS / APPENDIX / BIBLIOGRAPHY
|
167 |
Obtenção dos níveis de significância para os testes de Kruskal-Wallis, Friedman e comparações múltiplas não-paramétricas. / Obtaining significance levels for Kruskal-Wallis, Friedman and nonparametric multiple comparisons tests.
Pontes, Antonio Carlos Fonseca. 29 June 2000
One of the main difficulties researchers face when using nonparametric experimental statistics is obtaining reliable results. The Kruskal-Wallis and Friedman tests are the most widely used for completely randomized one-way layouts and for randomized blocks, respectively. The tables available for these tests are not very comprehensive, so the researcher is forced to rely on approximations. These approximations differ from author to author and can lead to contradictory results. Furthermore, the tables do not account for ties, even for small samples. For multiple comparisons this is even more evident, especially when ties occur or, in completely randomized designs, when the number of replications differs between treatments. Widely used software packages such as SAS, STATISTICA, S-Plus and MINITAB generally rely on approximations to obtain significance levels and do not report results for multiple comparisons. Thus, the aim of this work is to present a program, written in C, that performs the Kruskal-Wallis test, the Friedman test and multiple comparisons among all treatments (two-sided) and between treatments and a control (one- and two-sided), considering either all systematic configurations of the ranks or 1,000,000 random configurations, depending on the total number of possible permutations. Two significance levels are presented: DW or MaxDif, based on comparison with the maximum difference within each configuration, and Geral (overall), based on comparison with all differences in each configuration. The Geral significance levels are similar to those given by the normal approximation. The results obtained with the program also show that tests based on random permutations can be good substitutes when the number of systematic permutations is very large, since the probability levels are very close.
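The random-permutation idea described in this abstract can be illustrated with a minimal sketch. The Python code below is not the author's C program: it covers only the Kruskal-Wallis statistic with randomly shuffled rank configurations (no exhaustive enumeration, no Friedman test, no multiple comparisons), and the function names `midranks`, `kw_statistic` and `kw_permutation_pvalue`, the default of 10,000 shuffles and the fixed seed are assumptions made for illustration.

```python
import random
from itertools import chain


def midranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kw_statistic(ranks, sizes):
    """Kruskal-Wallis H from pooled ranks laid out group by group.

    The tie-correction divisor is omitted: it is identical for every
    rearrangement of the same ranks, so it cannot change a permutation p-value.
    """
    n = len(ranks)
    total, start = 0.0, 0
    for size in sizes:
        rank_sum = sum(ranks[start:start + size])
        total += rank_sum ** 2 / size
        start += size
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


def kw_permutation_pvalue(groups, n_random=10000, seed=0):
    """Estimate the p-value of H from random rearrangements of the ranks."""
    sizes = [len(g) for g in groups]
    ranks = midranks(list(chain.from_iterable(groups)))
    observed = kw_statistic(ranks, sizes)
    rng = random.Random(seed)
    shuffled = ranks[:]
    hits = 0
    for _ in range(n_random):
        rng.shuffle(shuffled)
        # Small tolerance guards against floating-point noise in H.
        if kw_statistic(shuffled, sizes) >= observed - 1e-9:
            hits += 1
    # Add-one estimator: the observed configuration counts as one draw.
    return observed, (hits + 1) / (n_random + 1)
```

For example, `kw_permutation_pvalue([[1, 2, 3], [4, 5, 6], [7, 8, 9]])` returns the observed H (7.2 for these perfectly separated groups) together with an estimated p-value. When the total number of rank configurations is small, enumerating all of them, as the abstract describes, would give the exact level instead of an estimate.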
|
168 |
Résumé des Travaux en Statistique et Applications des Statistiques / Summary of Work in Statistics and Applications of Statistics
Clémençon, Stéphan. 01 December 2006
This report briefly presents the core of my research activity since my doctoral thesis [53], whose main aim was to extend the use of recent advances in computational harmonic analysis for adaptive nonparametric estimation from i.i.d. observations (such as wavelet analysis) to statistical estimation for Markovian data. As explained in [123], results on concentration-of-measure properties (i.e. probability and moment inequalities over certain function classes suited to nonlinear approximation) are indispensable for exploiting these analytical tools in a probabilistic setting and for obtaining statistical estimation procedures whose rates of convergence surpass those of earlier methods. In [53] (see also [54], [55] and [56]), an analytical method based on renewal theory, the so-called 'regenerative' method (see [185]), which consists in dividing the trajectories of a Harris recurrent Markov chain into asymptotically i.i.d. segments, was used extensively to establish the required probabilistic results, the long-term behaviour of Markov processes being governed by renewal processes (which randomly define the segments of the trajectory). Once the estimator has been constructed, it is important to be able to quantify the uncertainty inherent in the estimate (measured by specific quantiles, the variance or certain appropriate functionals of the distribution of the statistic considered). In this respect, and beyond the extreme simplicity of its implementation (it simply amounts to drawing i.i.d. samples from the original sample and recomputing the statistic on the new sample, the bootstrap sample), the bootstrap has major theoretical advantages over the asymptotic Gaussian approximation (the bootstrap distribution automatically approximates the second-order structure in the Edgeworth expansion of the distribution of the statistic). It seemed natural to me to consider the problem of extending the traditional bootstrap procedure to Markovian data. Through work carried out in collaboration with Patrice Bertail, the regenerative method proved to be not only a powerful analytical tool for establishing limit theorems or inequalities, but also a source of practical methods for statistical estimation: the proposed generalization of the bootstrap consists in resampling a random number of regenerative data segments (or approximations of them) so as to mimic the renewal structure underlying the data. This approach also proved relevant for many other statistical problems. The first part of the report thus mainly aims to present the principle of renewal-based statistical methods for Harris Markov chains.
The second part of the report is devoted to the construction and study of statistical methods for learning to rank objects, and no longer merely to classify them (i.e. assign them a label), in a supervised setting. This difficult problem is of crucial importance in many application areas, ranging from the design of indicators for medical diagnosis to information retrieval (search engines), and it raises ambitious theoretical and algorithmic questions that have not yet been resolved satisfactorily. One possible approach is to reduce the problem to the classification of pairs of observations, as suggested by a criterion widely used in the applications mentioned above (the AUC criterion) for assessing the quality of a ranking. In work carried out in collaboration with Gabor Lugosi and Nicolas Vayatis, several results were obtained in this direction, requiring the study of U-processes: the novel aspect of the problem lies in the fact that the natural estimator of the risk here takes the form of a U-statistic. However, in many applications such as information retrieval, only the ordering of the most relevant objects truly matters, and the search for criteria corresponding to such problems (known as localized ranking problems) and for algorithms that build rules yielding optimal 'rankings' with respect to these criteria is a crucial issue in this field. Several developments along these lines have been achieved in a series of works (still ongoing) in collaboration with Nicolas Vayatis.
Finally, the third part of the report reflects my interest in applications of probabilistic concepts and statistical methods. Because of my initial training, I was naturally led to consider applications in finance first. Although historical approaches generally arouse little enthusiasm in this field, I gradually became convinced of the important role that nonparametric statistical methods could play in analyzing the massive data (very high-dimensional and 'high-frequency' in nature) available in finance, in order to detect hidden structures and exploit them for market risk assessment or portfolio management, for example. This point of view is illustrated by a brief presentation of the work carried out along these lines in collaboration with Skander Slim in this third part. In recent years, I have had the opportunity to meet applied mathematicians and scientists working in other fields that can also benefit from advances in probabilistic modelling and statistical methods. I was thus able to tackle applications in toxicology, more precisely the problem of assessing the risks of contamination through food, during my year of secondment to the Institut National de la Recherche Agronomique within the Metarisk unit, a multidisciplinary unit entirely devoted to food risk analysis. For example, I was able to use my expertise in Markovian modelling to propose a stochastic model describing the evolution over time of the quantity of contaminant present in the body (so as to account both for the accumulation due to successive intakes and for the contaminant-specific pharmacokinetics governing the elimination process), together with suitable statistical inference methods, in joint work with Patrice Bertail and Jessica Tressou. This line of research is currently being pursued, and one may hope that it will eventually provide a basis for public health recommendations. In addition, I am fortunate to be working at present with Hector de Arazoza, Bertran Auvert, Patrice Bertail, Rachid Lounes and Viet-Chi Tran on stochastic modelling of the HIV epidemic based on the epidemiological data recorded for the population of Cuba, which constitute one of the best-documented databases on the evolution of an epidemic of this type. Although this project mainly aims to obtain a numerical model (allowing short-term forecasts of the incidence of the epidemic, so that, for example, production of the required quantity of antiretrovirals can be planned), it has led us to address ambitious theoretical questions, ranging from the existence of a quasi-stationary measure describing the long-term evolution of the epidemic to problems related to the incompleteness of the available epidemiological data. Unfortunately, it is impossible for me to discuss these questions here without risking distorting them; the presentation of the mathematical problems encountered in this project would on its own deserve an entire report.
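The reduction of ranking to the classification of pairs, mentioned in the second part of the report, can be illustrated with a short sketch. The Python function below computes the empirical AUC of a scoring rule as an average over all (positive, negative) pairs, which is a two-sample U-statistic; the function name `empirical_auc`, the tie convention of one half and the toy scores are assumptions for illustration and are not taken from the report.

```python
from itertools import product


def empirical_auc(scores_pos, scores_neg):
    """Average, over all (positive, negative) pairs, of the pairwise ranking kernel.

    The kernel is 1 when the positive item is scored above the negative one,
    0.5 on a tie and 0 otherwise, so the result is a two-sample U-statistic.
    """
    if not scores_pos or not scores_neg:
        raise ValueError("both samples must be non-empty")
    total = 0.0
    for s_pos, s_neg in product(scores_pos, scores_neg):
        if s_pos > s_neg:
            total += 1.0
        elif s_pos == s_neg:
            total += 0.5
    return total / (len(scores_pos) * len(scores_neg))


# Toy scores (hypothetical): five of the six pairs are ordered correctly.
auc = empirical_auc([0.9, 0.8, 0.4], [0.5, 0.3])
ranking_risk = 1.0 - auc  # empirical proportion of misordered pairs
```

Because each pair contributes one term, the empirical ranking risk `1 - auc` is an average over pairs rather than over single observations, which is why results on U-statistics and U-processes enter the analysis.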
|
169 |
台灣地區影音著作盜版率之研究 / The study of audio-visual works' piracy rate in Taiwan.
邱奕傑, Chiu, Yi-Jye. Unknown Date
With the development of information technology and the spread of the internet, piracy of music and film discs has grown more serious year by year. Audio-visual piracy not only affects rights-holder groups, the audio-visual industry and creators, but also bears on the development of intellectual property in Taiwan, and it has frequently become an important issue in Taiwan's international trade negotiations. Against this background, many industry, government, academic and research organizations in Taiwan want to study audio-visual piracy: How serious is the problem? Are there objective and reasonable indicators or evaluation methods? Can effective measures be devised to keep piracy from worsening? These questions motivate this research.
Most existing research on audio-visual piracy concentrates on calculating the piracy rate or on the causes, psychology and legal aspects of piracy; none has yet constructed a probability distribution of the piracy rate that would support inference, or pursued other related quantitative studies. The empirical work in this study is therefore based on the 2004 consumer survey data that the Intellectual Property Office, Ministry of Economic Affairs, R.O.C. commissioned from National Chengchi University. For music CDs and for video VCDs/DVDs, and for the variables of interest to the author (gender, age, education level, whether the respondent downloads from the internet, and so on), the study (1) constructs a mixture distribution for each and examines the differences and trends among the distributions, (2) examines consumers' attitudes toward piracy, (3) examines differences in the price consumers are willing to pay for discs they like, (4) builds confidence bands for the piracy rate distribution, and (5) classifies piracy in the existing survey data.
The empirical results are as follows. (1) The piracy rate distribution follows a mixture model: two degenerate distributions (at X = 0 and X = 100) mixed in proportion with a normal distribution. (2) As for differences and trends among the distributions, for both music CDs and video VCDs/DVDs the Mann-Whitney test and the two-sample Kolmogorov-Smirnov test both reveal a rising tendency in the overall piracy rate. The 20-29 age group is the main pirating group; moreover, groups with higher education levels pirate more, and lower-income groups engage in more unauthorized copying. Furthermore, with the development of internet technology, infringement is more serious among those who download from the internet than among those who do not. (3) In the survey of consumer opinions about piracy, for both music and films, attitudes deviate more among men than women, among those under 30 than those older, among the less educated than the more educated, among lower-income than higher-income respondents, among pirates than non-pirates, and among downloaders than non-downloaders. The problem lies not only in a lack of understanding and recognition of intellectual property rights, but also in the scarcity of moral or legal constraints on unauthorized copying or downloading. Curiously, although the more highly educated groups hold more balanced views on piracy, the collected data show that they are also the group that pirates more. (4) As for the price a consumer is willing to pay, most pirating consumers tend to pay a low price for audio-visual goods, most non-pirating consumers tend to pay a typical price, and there is no significant difference between the two groups at high prices. (5) As for confidence bands for the overall music CD and video VCD/DVD piracy rates, because of a peculiarity of the piracy data, namely the high frequencies at piracy rates of 0 and 100, the upper and lower bounds appear at 0 and 100. Furthermore, because the confidence bands are obtained from the population distribution function, they are suitable for goodness-of-fit testing; the results agree with the one-sample Kolmogorov-Smirnov test. (6) As for classifying the data, a logistic regression model of piracy is constructed. Classification with the fitted logistic regression models misjudges 107 non-pirates as pirates and 186 pirates as non-pirates, for a correct classification rate of 88%.
Finally, the empirical analysis is offered to the relevant government agencies as a reference for both anti-piracy enforcement and the protection of intellectual property rights, in support of thorough measures against the worsening piracy problem.
Key Words: Piracy Rate, Mixture Models, Mann-Whitney, Kolmogorov-Smirnov, Logistic Regression Model, Nonparametric Statistics.
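Result (1) of this abstract describes the piracy rate as a mixture of point masses at 0 and 100 with a normal distribution. A minimal sketch of the corresponding cumulative distribution function follows; the parameter names (`p_zero`, `p_hundred`, `mu`, `sigma`) and the numeric values in the usage line are hypothetical, and the abstract does not say whether the normal component is truncated to [0, 100], so it is left untruncated here.

```python
import math


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal with mean mu and sd sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def piracy_rate_cdf(x, p_zero, p_hundred, mu, sigma):
    """CDF of a mixture: mass p_zero at 0, mass p_hundred at 100, rest normal.

    All four parameters are illustrative; none is reported in the abstract.
    """
    if p_zero < 0.0 or p_hundred < 0.0 or p_zero + p_hundred > 1.0:
        raise ValueError("point masses must be non-negative and sum to at most 1")
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    cdf = (1.0 - p_zero - p_hundred) * normal_cdf(x, mu, sigma)
    if x >= 0.0:
        cdf += p_zero      # jump contributed by respondents with rate 0
    if x >= 100.0:
        cdf += p_hundred   # jump contributed by respondents with rate 100
    return cdf


# Hypothetical weights and normal parameters, chosen only to show the call.
value_at_50 = piracy_rate_cdf(50.0, p_zero=0.3, p_hundred=0.2, mu=50.0, sigma=20.0)
```

The two jumps in this function correspond to the high frequencies at 0 and 100 that the abstract mentions when discussing the confidence bands.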
|