21 |
Using Three Different Categorical Data Analysis Techniques to Detect Differential Item Functioning
Stephens-Bonty, Torie Amelia, 16 May 2008
Diversity in the population, along with the diversity of testing usage, has resulted in smaller identified groups of test takers. In addition, computer adaptive testing sometimes results in a relatively small number of items being used for a particular assessment. Statistical techniques that can effectively detect differential item functioning (DIF) when the population is small and/or the assessment is short are therefore needed. Identification of empirically biased items is a crucial step in creating equitable and construct-valid assessments. Parshall and Miller (1995) compared the conventional asymptotic Mantel-Haenszel (MH) test with the exact test (ET) for the detection of DIF with small sample sizes. Several studies have since compared the performance of MH to logistic regression (LR) under a variety of conditions. Both Swaminathan and Rogers (1990) and Hidalgo and López-Pina (2004) demonstrated that MH and LR were comparable in their detection of items with DIF. This study followed up by comparing the performance of MH, the ET, and LR when the sample size is small and the test is short. The purpose of this Monte Carlo simulation study was to expand on the research done by Parshall and Miller (1995) by examining power and power with effect size measures for each of the three DIF detection procedures. The following variables were manipulated in this study: focal group sample size, percent of items with DIF, and magnitude of DIF. For each condition, a small reference group size of 200 was used, as well as a short, 10-item test. The results demonstrated that, in general, LR was slightly more powerful in detecting items with DIF. In most conditions, however, power was well below the acceptable rate of 80%. As the size of the focal group and the magnitude of DIF increased, the three procedures were more likely to reach acceptable power. Also, all three procedures demonstrated the highest power for the most discriminating item.
Collectively, the results from this research provide information in the area of small sample size and DIF detection.
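As an illustration of the Mantel-Haenszel procedure named in the abstract, the following is a minimal sketch, not the study's own code. It assumes examinees have already been matched into strata by total test score, and that each stratum is summarized as a 2x2 table for one item. The function name `mantel_haenszel` and the example counts are hypothetical.

```python
from math import log


def mantel_haenszel(strata):
    """Mantel-Haenszel DIF statistics for one item.

    strata: list of (a, b, c, d) tuples, one per matched score level:
        a = reference group correct,  b = reference group incorrect,
        c = focal group correct,      d = focal group incorrect.
    Returns (common odds ratio, continuity-corrected chi-square, ETS delta).
    """
    num = den = 0.0                  # numerator and denominator of the odds ratio
    sum_a = sum_e = sum_v = 0.0      # observed, expected, and variance of a
    for a, b, c, d in strata:
        n = a + b + c + d
        if n < 2:                    # stratum too small to contribute
            continue
        num += a * d / n
        den += b * c / n
        n_ref, n_foc = a + b, c + d  # group sizes in this stratum
        m1, m0 = a + c, b + d        # correct / incorrect totals
        sum_a += a
        sum_e += n_ref * m1 / n
        sum_v += n_ref * n_foc * m1 * m0 / (n * n * (n - 1))
    if den == 0 or sum_v == 0:
        raise ValueError("strata carry no information for this item")
    alpha = num / den
    chi2 = (abs(sum_a - sum_e) - 0.5) ** 2 / sum_v
    delta = -2.35 * log(alpha)       # ETS delta metric; negative favors the reference group
    return alpha, chi2, delta


# Illustrative counts only (three score levels), not data from the study.
example = [(12, 8, 9, 11), (15, 5, 12, 8), (18, 2, 16, 4)]
alpha, chi2, delta = mantel_haenszel(example)
print(f"alpha={alpha:.3f} chi2={chi2:.3f} delta={delta:.3f}")
```

The chi-square value is compared against a chi-square distribution with one degree of freedom. With focal groups as small as those in the study, this asymptotic approximation is exactly what the exact test is meant to avoid.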
|
22 |
"Clustering Categorical Response" Application to Lung Cancer Problems in Living ScalesGuo, Ling 22 April 2008 (has links)
The study aims to estimate the ability of different grouping techniques on categorical response. We try to find out how well do they work? Do they really find clusters when clusters exist? We use Cancer Problems in Living Scales from the ACS as our categorical data variables and lung cancer survivors as our studying group. Five methods of cluster analysis are examined for their accuracy in clustering on both real CPILS dataset and simulated data. The methods include hierarchical cluster analysis (Ward's method), model-based clustering of raw data, model-based clustering of the factors scores from a maximum likelihood factor analysis, model-based clustering of the predicted scores from independent factor analysis, and the method of latent class clustering. The results from each of the five methods are then compared to actual classifications. The performance of model-based clustering on raw data is poorer than that of the other methods and the latent class clustering method is most appropriate for the specific categorical data examined. These results are discussed and recommendations are made regarding future directions for cluster analysis research.
|
23 |
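Applications of Nonmetric Principal Component Analysis of Categorical Data / カテゴリカル・データの非計量的主成分分析の応用
Murakami, Takashi (村上, 隆), 26 December 1997
This record uses content digitized by the National Institute of Informatics.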
|
24 |
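Quantification of Free-Response Answers Using the KJ Method and Multiple Correspondence Analysis / KJ法および多重対応分析を用いた自由記述型応答の数量化
Suzuki, Ikuko (鈴木, 郁子); Wada, Shinyu (和田, 真雄); Murakami, Takashi (村上, 隆), 27 December 2005
This record uses content digitized by the National Institute of Informatics.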
|
25 |
Algorithmically Guided Information Visualization : Explorative Approaches for High Dimensional, Mixed and Categorical Data / Algoritmiskt vägledd informationsvisualisering för högdimensionell och kategorisk data
Johansson Fernstad, Sara, January 2011
Facilitated by the technological advances of the last decades, increasing amounts of complex data are being collected within fields such as biology, chemistry and social sciences. The major challenge today is not to gather data, but to extract useful information and gain insights from it. Information visualization provides methods for visual analysis of complex data but, as the amounts of gathered data increase, the challenges of visual analysis become more complex. This thesis presents work utilizing algorithmically extracted patterns as guidance during interactive data exploration processes, employing information visualization techniques. It provides efficient analysis by taking advantage of fast pattern identification techniques as well as making use of the domain expertise of the analyst. In particular, the presented research is concerned with the issues of analysing categorical data, where the values are names without any inherent order or distance; mixed data, including a combination of categorical and numerical data; and high dimensional data, including hundreds or even thousands of variables. The contributions of the thesis include a quantification method, assigning numerical values to categorical data, which utilizes an automated method to define category similarities based on underlying data structures, and integrates relationships within numerical variables into the quantification when dealing with mixed data sets. The quantification is incorporated in an interactive analysis pipeline where it provides suggestions for numerical representations, which may interactively be adjusted by the analyst. The interactive quantification enables exploration using commonly available visualization methods for numerical data. Within the context of categorical data analysis, this thesis also contributes the first user study evaluating the performance of what are currently the two main visualization approaches for categorical data analysis. 
Furthermore, this thesis contributes two dimensionality reduction approaches, which aim at preserving structure while reducing dimensionality, and provide flexible and user-controlled dimensionality reduction. Through algorithmic quality metric analysis, where each metric represents a structure of interest, potentially interesting variables are extracted from the high dimensional data. The automatically identified structures are visually displayed, using various visualization methods, and act as guidance in the selection of interesting variable subsets for further analysis. The visual representations furthermore provide an overview of structures within the high dimensional data set and may, through this, aid in focusing subsequent analysis, as well as enable interactive exploration of the full high dimensional data set and selected variable subsets. The thesis also contributes the application of algorithmically guided approaches for high dimensional data exploration in the rapidly growing field of microbiology, through the design and development of a quality-guided interactive system in collaboration with microbiologists.
|
26 |
Generating a synthetic dataset for kidney transplantation using generative adversarial networks and categorical logit encoding
Bartocci, John Timothy, 24 May 2021
No description available.
|
27 |
Integrated studies on structure and formation mechanism of environmental consciousness in rural and urban China / 中国農村部と都市部における環境意識の構造と形成のメカニズムに関する総合的研究
Chen, Yanyan (陳 艶艶), 22 March 2016
Long-standing institutional and socioeconomic segmentation has made rural China a society distinct from urban China. This marked rural-urban division provides a good context in which to explore the formation and diverse social facets of environmental consciousness. This study aims to clarify the specific structure and formation mechanism of environmental consciousness under the different social backgrounds of rural and urban China, based on statistical results derived from survey data. Three dimensions of environmental consciousness and an integrated theoretical framework involving both social structural and social psychological variables are proposed. Based on the proposed theoretical framework and empirical data analyses, the internal factors and external influences shaping environmental consciousness were clarified. / Doctor of Culture and Information Science / Doshisha University
|
28 |
Clustering and visualization for enhancing interpretation of categorical data / カテゴリカルデータの解釈容易性を向上させるためのクラスタリングと視覚化法について
Takagishi, Mariko (髙岸 茉莉子), 20 September 2019
Large-scale categorical data are often obtained in various fields. Because the interpretation of large-scale data tends to be complicated, methods that capture the latent structure in data, such as cluster analysis and visualization, are often used to make the data more interpretable. However, there are situations where these methods fail to capture an interpretable latent structure (e.g., when each respondent interprets the categories differently, or when respondents with the same attributes show different response tendencies). Therefore, this thesis considers two problems that often occur in large-scale categorical data analysis and proposes new methods to address them. / Doctor of Culture and Information Science / Doshisha University
|
29 |
Hiring Practices for Graphic Designers In Utah County, Utah
Densley, Landon T., 12 July 2004
The purpose of this study was to show how hiring standards of evidence for graphic designers in Utah County compared with the national standards of evidence. The four major national standards of evidence for hiring graphic designers, identified by the American Institute of Graphic Arts (AIGA) and Goldfarb, are, in order of importance, portfolio, recommendations, personality, and education. The data from this study revealed that Utah County employers' standards of evidence closely matched the national standards of evidence, but the order of importance was slightly different because personality was ranked ahead of recommendations and education.
|
30 |
Unsupervised Categorical Clustering on Labor Markets
Steffen, Matthew James, 10 April 2023
During this "white collar recession," the labor market is flooded with workers. Employers seeking to hire need to identify potential qualified candidates for each job. The current state of the art is LinkedIn Recruiting or elastic search on resumes, which lacks efficiency and scalability as well as an intuitive ranking of candidates. We believe this can be fixed with multi-layer categorical clustering via modularity maximization. To test this, we gathered a dataset that is extensive and representative of the job market. Our data comes from PeopleDataLabs and LinkedIn and is sampled from 153 million individuals. As such, this data represents one of the most informative datasets for the task of ranking and clustering job titles and skills. Properly grouping individuals will help identify more candidates to fill the multitude of vacant positions. We implement a novel framework for categorical clustering, involving these attributes, to deliver a reliable pool of candidates. We develop a metric for clustering based on commonality to rank clustering algorithms. The metric prefers modularity-based clustering algorithms like the Louvain algorithm. This allows us to use such algorithms to outperform other unsupervised methods for categorical clustering. Our implementation accurately clusters emergency services, health-care, and other fields, while managerial positions are, interestingly, swamped by soft or uninformative features, resulting in dominant ambiguous clusters.
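The quantity that modularity-based algorithms such as Louvain try to maximize can be sketched as follows. This is a hedged illustration, not the thesis's framework: it assumes an unweighted, undirected graph with no self-loops or duplicate edges, and the idea of linking job titles that share skills is a hypothetical way to build such a graph. The function name `modularity` and the toy data are likewise assumptions.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once, no self-loops.
    community: dict mapping every node to its community label.
    Q = sum over communities of (internal_edges / m) - (total_degree / (2 m))^2.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")
    degree = defaultdict(int)
    internal = defaultdict(int)      # edges with both endpoints in one community
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    total_degree = defaultdict(int)  # summed degree per community
    for node, k in degree.items():
        total_degree[community[node]] += k
    return sum(
        internal[c] / m - (total_degree[c] / (2 * m)) ** 2
        for c in total_degree
    )


# Toy graph (hypothetical): two tight groups of job titles joined by one edge.
edges = [
    ("nurse", "paramedic"), ("paramedic", "emt"), ("nurse", "emt"),
    ("developer", "sre"), ("sre", "dba"), ("developer", "dba"),
    ("emt", "developer"),
]
split = {"nurse": 0, "paramedic": 0, "emt": 0, "developer": 1, "sre": 1, "dba": 1}
merged = {title: 0 for title in split}
print(modularity(edges, split), modularity(edges, merged))
```

A partition that separates the two groups scores higher than one that lumps every title together, which is the signal a modularity-maximizing method follows. This sketch only evaluates a given partition; it does not perform the Louvain search itself.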
|