Global ETD Search

711	Exploring Characteristics of Social Classification Lin, Xia, Beaudoin, Joan, Bul, Yen, Desal, Kushal January 2006 (has links) Three empirical studies on characteristics of social classification are reported in this paper. The first study compared social tags with controlled vocabularies and title-based automatic indexing and found little overlap among the three indexing methods. The second study investigated how well tags could be categorized to improve effectiveness of searching and browsing. The third study explored factors and ratios that had the most significant impact on tag convergence. Findings of the three studies will help to identify characteristics of those tagging terms that are content-rich and that can be used to increase effectiveness of tagging, searching and browsing. Classification World Wide Web Knowledge Organization
712	Social classification: Panacea or Pandora? Furner, Jonathan 11 1900 (has links) Proceedings 17th Workshop of the American Society for Information Science and Technology Special Interest Group in Classification Research / Presentation at the beginning of the workshop, given to set the tone and outline issues key to the event. [jtt] Classification World Wide Web Indexing Knowledge Organization
713	The Index Catalogue and Historical Shifts in Medical Knowledge, & Word Usage Patterns Lussky, Joan January 2004 (has links) Faithful aggregated accounts of the advancement of science are invaluable for those setting scientific policy as well as scholars of the history of science. As science develops the scholarly community's determination of the accepted knowledge undergoes shifts. Within medicine these shifts include our understanding of what can cause disease and what defines specific disease entities. Shifts in accepted medical knowledge are captured in the medical literature. The Index Catalogue of the Library of the Surgeon General's Office, United States Army, published from 1880-1961, is an extremely large index to medical literature. The newly digitized form of this index, referred to as the IndexCat, allows us a way to generate faithful accounts of the development of medical science during the late nineteenth and early twentieth centuries. My data looks at shifts within the IndexCat surrounding three disease entities: syphilis, Huntington's chorea, and beriberi, and their interactions with two disease causation theories: germ and hereditary, from 1880-1930. Temporal changes in the prominent subject heading words and title words within the literature of these diseases and disease theories corroborate qualitative accounts of this same literature, which reports the complex and sometimes oblique process of knowledge accretion. Although preliminary, my results indicate that the IndexCat is a valuable tool for studying the development of medical knowledge. Classification Indexing Cataloging Medical Libraries History
714	Advances in Classification Research, Volume 17: Proceedings of the 17th ASIS&T SIG/CR Classification Research Workshop January 2006 (has links) Papers from the 17th ASIS&T Special Interest Group for Classification Research Workshop. The workshop ran November 4, 2006 in Austin, Texas. Classification World Wide Web Knowledge Organization
715	MARC on the CDC 6400; a teaching tool for library classification Kacena, Carolyn, 1944- January 1974 (has links) No description available. MARC formats.
716	Data aggregation for capacity management Lee, Yong Woo 30 September 2004 (has links) This thesis presents a methodology for data aggregation for capacity management. It is assumed that there are a very large number of products manufactured in a company and that every product is stored in the database with its standard unit per hour and attributes that uniquely specify each product. The methodology aggregates products into families based on the standard units-per-hour and finds a subset of attributes that unambiguously identifies each family. Data reduction and classification are achieved using well-known multivariate statistical techniques such as cluster analysis, variable selection and discriminant analysis. The experimental results suggest that the efficacy of the proposed methodology is good in terms of data reduction. data aggregation capacity management data reduction classification
717	Phenotypic Classification of Paediatric Inflammatory Bowel Disease Sherlock, Mary 19 March 2013 (has links) This thesis explores aspects pertinent to the phenotypic classification of paediatric patients with inflammatory bowel disease (IBD). In the current era it has never been more important to have rigorous phenotypic classification to facilitate genotype-phenotype correlation studies as well as to optimize design of clinical trials of emerging therapies, where frequently response may differ according to phenotype of disease. The first study examined the reliability of the Montreal Classification for classifying paediatric IBD patients. This is the first study exploring the reliability of phenotypic classification in a paediatric population. The reliability of assigning an overall diagnosis of type of IBD was good, but not excellent. Amongst Crohn’s disease patients, reliability of assigning disease behaviour was excellent, while the reliability of assigning disease location categories varied from fair to good. The percentage agreement when describing disease extent for ulcerative colitis was high. The second study described the evolution of disease phenotype in a cohort of paediatric IBD patients. Similar to observations in adult-onset IBD, disease location was found to be relatively stable, while Crohn’s disease behaviour evolves from an inflammatory to a stricturing and/or penetrating phenotype in 20% of patients by 5 years of follow-up. The final study explored the association between 2 polymorphisms in the NOD2 gene with the requirement for intestinal resection (a surrogate marker of complicated disease) in models that included and excluded disease duration. Although no difference was found, this may have been influenced by data quality, which was suboptimal. In conclusion, this thesis has demonstrated that imprecision exists in the phenotyping of paediatric IBD patients, in whom phenotypic characteristics evolve over time. 
It is pertinent that disease duration be considered in any study attempting to make phenotypic correlations. Inflammatory bowel disease paediatric classification phenotype 0564
719	Large-Scale Web Page Classification Marath, Sathi 09 November 2010 (has links) Web page classification is the process of assigning predefined categories to web pages. Empirical evaluations of classifiers such as Support Vector Machines (SVMs), k-Nearest Neighbor (k-NN), and Naïve Bayes (NB), have shown that these algorithms are effective in classifying small segments of web directories. The effectiveness of these algorithms, however, has not been thoroughly investigated on large-scale web page classification of such popular web directories as Yahoo! and LookSmart. Such web directories have hundreds of thousands of categories, deep hierarchies, spindle category and document distributions over the hierarchies, and skewed category distribution over the documents. These statistical properties indicate class imbalance and rarity within the dataset. In hierarchical datasets similar to web directories, expanding the content of each category using the web pages of the child categories helps to decrease the degree of rarity. This process, however, results in the localized overabundance of positive instances, especially in the upper-level categories of the hierarchy. The class imbalance, rarity and the localized overabundance of positive instances make applying classification algorithms to web directories very difficult, and the problem has not been thoroughly studied. To our knowledge, the maximum number of categories ever previously classified on web taxonomies is 246,279 categories of the Yahoo! directory using hierarchical SVMs, leading to a Macro-F1 of only 12%. We designed a unified framework for the content-based classification of imbalanced hierarchical datasets. The complete Yahoo! web directory of 639,671 categories and 4,140,629 web pages is used to set up the experiments. 
In a hierarchical dataset, the prior probability distribution of the subcategories indicates the presence or absence of class imbalance, rarity and the overabundance of positive instances within the dataset. Based on the prior probability distribution and associated machine learning issues, we partitioned the subcategories of the Yahoo! web directory into five mutually exclusive groups. The effectiveness of different data-level, algorithmic and architectural solutions to the associated machine learning issues is explored. Later, the best-performing classification technologies for a particular prior probability distribution were identified and integrated into the Yahoo! web directory classification model. The methodology is evaluated using a DMOZ subset of 17,217 categories and 130,594 web pages, and we statistically proved that the methodology of this research works equally well on large and small datasets. The average classifier performance in terms of macro-averaged F1-Measure achieved in this research for the Yahoo! web directory and the DMOZ subset is 81.02% and 84.85% respectively.
720	Towards Coevolutionary Genetic Programming with Pareto Archiving Under Streaming Data Atwater, Aaron 13 August 2013 (has links) Classification under streaming data constraints implies that training must be performed continuously, can only access individual exemplars for a short time after they arrive, must adapt to dynamic behaviour over time, and must be able to retrieve a current classifier at any time. A coevolutionary genetic programming framework is adapted to operate in non-stationary streaming data environments. Methods to generate synthetic datasets for benchmarking streaming classification algorithms are introduced, and the proposed framework is evaluated against them. The use of Pareto archiving is evaluated as a mechanism for retaining access to a limited number of useful exemplars throughout training, and several fitness sharing heuristics for archiving are evaluated. Fitness sharing alone is found to be most effective under streams with continuous (incremental) changes, while the addition of an aging heuristic is preferred when the stream has stepwise changes. Tapped delay lines are explored as a method for explicitly incorporating sequence context in cyclical data streams, and their use in combination with the aging heuristic suggests a promising route forward. / Hyperref'd copy available at: https://web.cs.dal.ca/~atwater/ computer science genetic programming machine learning classification
