About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
91

Application of Neural Networks to Population Pharmacokinetic Data Analysis

Chow, Hsiao-Hui; Tolle, Kristin M.; Roe, Denise J.; Elsberry, Victor; Chen, Hsinchun
Artificial Intelligence Lab, Department of MIS, University of Arizona / This research examined the applicability of a neural network approach to analyzing population pharmacokinetic data. Such data were collected retrospectively from pediatric patients who had received tobramycin for the treatment of bacterial infection. The information collected included patient-related demographic variables (age, weight, gender, and other underlying illness), the individual's dosing regimens (dose and dosing interval), time of blood drawn, and the resulting tobramycin concentration. Neural networks were trained with this information to capture the relationships between the plasma tobramycin levels and the following factors: patient-related demographic factors, dosing regimens, and time of blood drawn. The data were also analyzed using a standard population pharmacokinetic modeling program, NONMEM. The observed vs. predicted concentration relationships obtained from the neural network approach were similar to those from NONMEM. The residuals of the predictions from the neural network analyses showed a positive correlation with those from NONMEM. Average absolute errors were 33.9 and 37.3% for the neural networks and 39.9% for NONMEM. Average prediction errors were 2.59 and -5.01% for the neural networks and 17.7% for NONMEM. We concluded that neural networks were capable of capturing the relationships between plasma drug levels and patient-related prognostic factors from routinely collected sparse within-patient pharmacokinetic data. Neural networks can therefore be considered to have potential to become a useful analytical tool for population pharmacokinetic data analysis.
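As a rough illustration of the approach described above (not the authors' model), the sketch below trains a small feedforward regression network to map patient covariates and dosing information to a plasma level. The covariate set, network size, and synthetic data are all illustrative assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
n = 500
# Hypothetical inputs: age (yr), weight (kg), dose (mg/kg), dosing
# interval (h), and time of blood draw after dose (h).
X = np.column_stack([
    rng.uniform(0.1, 12, n),           # age
    rng.uniform(3, 40, n),             # weight
    rng.uniform(1.5, 3.0, n),          # dose per kg
    rng.choice([8.0, 12.0, 24.0], n),  # dosing interval
    rng.uniform(0.5, 8, n),            # sampling time
])
# Toy one-compartment-style decay, used only to generate training targets.
y = 0.4 * X[:, 2] * X[:, 1] ** 0.2 * np.exp(-0.25 * X[:, 4]) + rng.normal(0, 0.1, n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
net = MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000, random_state=0)
net.fit(X_tr, y_tr)
print("mean absolute error:", np.abs(net.predict(X_te) - y_te).mean())
```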
92

A Graph Model for E-Commerce Recommender Systems

Huang, Zan; Chung, Wingyan; Chen, Hsinchun, January 2004
Artificial Intelligence Lab, Department of MIS, University of Arizona / Information overload on the Web has created enormous challenges for customers selecting products for online purchases and for online businesses attempting to identify customers' preferences efficiently. Various recommender systems employing different data representations and recommendation methods are currently used to address these challenges. In this research, we developed a graph model that provides a generic data representation and can support different recommendation methods. To demonstrate its usefulness and flexibility, we developed three recommendation methods: direct retrieval, association mining, and high-degree association retrieval. We used a data set from an online bookstore as our research test-bed. Evaluation results showed that combining product content information and historical customer transaction information achieved more accurate predictions and relevant recommendations than using only collaborative information. However, comparisons among different methods showed that high-degree association retrieval did not perform significantly better than the association mining method or the direct retrieval method in our test-bed.
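To make the graph intuition concrete, here is a minimal sketch under assumptions of our own: a tiny hand-built purchase matrix, with recommendations scored by counting length-3 paths in the customer-product bipartite graph. It shows only the collaborative side of such a model, not the content links or the authors' full representation.

```python
import numpy as np

# Rows = customers, columns = products; 1 = past purchase (made-up data).
A = np.array([
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [1, 0, 1, 1],
], dtype=float)

# Count length-3 paths customer -> product -> customer -> product; higher
# counts suggest stronger collaborative association with unseen products.
scores = A @ A.T @ A
scores[A > 0] = 0  # do not recommend already-purchased products
for c in range(A.shape[0]):
    print(f"customer {c}: recommend product {int(np.argmax(scores[c]))}")
```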
93

Automatically Detecting Deceptive Criminal Identities

Wang, Gang; Chen, Hsinchun; Atabakhsh, Homa
Artificial Intelligence Lab, Department of MIS, University of Arizona / Concerns about identity verification have reached new heights since the terrorist attacks of September 11, 2001, with national security issues related to detecting identity deception attracting more interest than ever before. Identity deception is an intentional falsification of identity in order to deter investigations. Conventional investigation methods run into difficulty when dealing with criminals who use deceptive or fraudulent identities, as the FBI discovered when trying to determine the true identities of the 19 hijackers involved in the attacks. Besides its use in post-event investigation, the ability to validate identity can also be used as a tool to prevent future tragedies. Here, we focus on uncovering patterns of criminal identity deception based on actual criminal records and suggest an algorithmic approach to revealing deceptive identities.
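The sketch below illustrates one plausible form of such an algorithm: normalized edit distances over identity fields, combined into a single disagreement score and thresholded. The field choice and the 0.45 threshold are illustrative assumptions, not the published parameters.

```python
def edit_distance(a: str, b: str) -> int:
    """Classic dynamic-programming Levenshtein distance."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
    return d[m][n]

def disagreement(r1: dict, r2: dict, fields=("name", "dob", "address")) -> float:
    # Normalized per-field distances, combined as a root-mean-square score.
    ds = [edit_distance(r1[f], r2[f]) / max(len(r1[f]), len(r2[f]), 1)
          for f in fields]
    return (sum(d * d for d in ds) / len(ds)) ** 0.5

rec1 = {"name": "Edward J. Smith", "dob": "1969-01-02", "address": "100 Main St"}
rec2 = {"name": "Ed Smith", "dob": "1969-01-02", "address": "100 Main Street"}
score = disagreement(rec1, rec2)
print(f"disagreement = {score:.3f}; likely same person: {score < 0.45}")
```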
94

Avoiding the Great Data-Wipe of Ought-Three: Maintaining an Institutional Record for Library Decision-Making in Threatening Times

Nicholson, Scott, January 2003
Because of the USA PATRIOT Act and similar legislation that allows the government to track the actions of individuals suspected of terrorist activities, many librarians are concerned about protecting information about library use at any cost. Some propose that the solution is to delete all data from the operational databases whenever possible; in fact, a recent New York Times article discusses daily shredding of library records from the Santa Cruz Public Library System (“Librarians Use Shredder to Show Opposition to New F.B.I. Powers”, Apr. 7th, 2003). However, deleting all data associated with library transactions will make data-based evaluation and justification of library services difficult; therefore, libraries must seek a balance between protecting the privacy of patrons and maintaining a history of library transactions.
95

The Bibliomining Process: Data Warehousing and Data Mining for Library Decision-Making

Nicholson, Scott, January 2003
The goal of this brief article is to explain the bibliomining process. Emphasis is placed on data warehousing and patron privacy issues because they must be addressed before anything else can begin. It is essential to capture our data-based institutional records while still protecting the privacy of users. By using a data warehouse, both goals can be met. Once the data warehouse is in place, the library can use reporting and exploration tools to gain a more thorough knowledge of its user communities and resource utilization.
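As a toy illustration of the balance described above, the sqlite3 sketch below copies circulation records into a warehouse table in which the patron identifier is dropped and exact age is coarsened to a decade band. All table and column names are invented for the example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE circulation (patron_id TEXT, zip TEXT, age INTEGER,
                          item_id TEXT, checkout_date TEXT);
INSERT INTO circulation VALUES
  ('P001', '13210', 34, 'B17', '2003-04-01'),
  ('P002', '13210', 61, 'B17', '2003-04-02');
CREATE TABLE warehouse AS
  SELECT zip,                          -- location kept at ZIP granularity
         (age / 10) * 10 AS age_band,  -- decade band instead of exact age
         item_id, checkout_date        -- no patron_id carried over
  FROM circulation;
""")
for row in con.execute("SELECT * FROM warehouse"):
    print(row)  # usage can be analyzed without tracing individuals
```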
96

Discovering and summarizing email conversations

Zhou, Xiaodong
With the ever-increasing popularity of email, it is now common for groups of people to discuss specific issues, events, or tasks over email. Such discussions can be viewed as email conversations and are valuable to the user as a personal information repository. For instance, 10 minutes before a meeting, a user may want to quickly review a previous email discussion that will soon be addressed in that meeting. In this case, rather than reading each email one by one, it is preferable to read a concise summary of the previous discussion that captures the major information. In this thesis, we study the problem of discovering and summarizing email conversations. We believe that our work can greatly help users manage their email folders. However, the characteristics of email conversations, e.g., lack of synchronization, conversational structure, and informal writing style, make this task particularly challenging. In this thesis, we tackle this task by considering the following aspects: discovering the emails in one conversation, capturing the conversation structure, and summarizing the email conversation. We first study how to discover all emails belonging to one conversation. Specifically, we study the hidden email problem, which is important for email summarization and other applications but has not been studied before. We propose a framework to discover and regenerate hidden emails. The empirical evaluation shows that this framework is accurate and scalable to large folders. Second, we build a fragment quotation graph to capture email conversations. The hidden emails belonging to each conversation are also included in the corresponding graph. Based on the quotation graph, we develop a novel email conversation summarizer, ClueWordSummarizer. A comparison with a state-of-the-art email summarizer as well as with a popular multi-document summarizer shows that ClueWordSummarizer obtains higher accuracy in most cases. Furthermore, to address the characteristics of email conversations, we study several ways to improve ClueWordSummarizer by considering more lexical features. The experiments show that many of those improvements can significantly increase the accuracy, especially the subjective words and phrases.
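The following toy sketch conveys only the clue-word intuition behind such a summarizer, not the thesis's ClueWordSummarizer: given a hand-built fragment quotation graph (construction from quotation markers is omitted), words recurring in a fragment's neighboring fragments count as clue words, and sentences are ranked by how many they contain.

```python
import re

fragments = {  # made-up email fragments
    0: "Can we move the demo to Friday? The client asked for more time.",
    1: "Friday works for me. I will update the demo schedule.",
    2: "The client confirmed Friday, so the schedule is final.",
}
children = {0: [1], 1: [2], 2: []}   # quotation (reply) structure
parents = {0: [], 1: [0], 2: [1]}

def words(text):
    return set(re.findall(r"[a-z]+", text.lower()))

def summarize(top_k=2):
    scored = []
    for fid, text in fragments.items():
        neighbour_words = set()
        for nb in children[fid] + parents[fid]:
            neighbour_words |= words(fragments[nb])
        for sent in re.split(r"(?<=[.?])\s+", text):
            # A sentence's score = number of clue words it shares with
            # the fragment's neighbours in the quotation graph.
            scored.append((len(words(sent) & neighbour_words), sent))
    return [s for _, s in sorted(scored, reverse=True)[:top_k]]

print(summarize())
```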
97

Explicating a Biological Basis for Chronic Fatigue Syndrome

Abou-Gouda, Samar A., 18 December 2007
In the absence of clinical markers for Chronic Fatigue Syndrome (CFS), research to find a biological basis for it remains open. Many data-mining techniques have been widely employed to analyze biomedical data describing different aspects of CFS. However, the inconsistency of the results of these studies reflects the uncertainty regarding the real basis of this disease. In this thesis, we show that CFS has a biological basis that is detectable in gene expression data better than in blood profile and Single Nucleotide Polymorphism (SNP) data. Using random forests, the analysis of gene expression data achieves a prediction accuracy of approximately 89%. We also identify sets of differentially expressed candidate genes that might contribute to CFS. We show that integrating data spanning multiple levels of the biological scale might reveal further insights into the understanding of CFS. Using integrated data, we achieve a prediction accuracy of approximately 91%. We find that Singular Value Decomposition (SVD) is a useful technique for visualizing the performance of random forests. / Thesis (Master, Computing) -- Queen's University, 2007-12-11
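A minimal sketch of this kind of classification pipeline follows, with random synthetic "expression" data standing in for the actual CFS dataset; the planted signal and all numbers are invented for illustration only.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_samples, n_genes = 120, 500
X = rng.normal(size=(n_samples, n_genes))
y = rng.integers(0, 2, n_samples)   # 0 = control, 1 = CFS (synthetic labels)
X[y == 1, :10] += 1.0               # plant a weak signal in 10 "genes"

clf = RandomForestClassifier(n_estimators=500, random_state=0)
print("CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())

# Feature importances point at candidate differentially expressed genes.
clf.fit(X, y)
print("top genes:", np.argsort(clf.feature_importances_)[::-1][:10])
```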
98

Mining frequent itemsets from uncertain data: extensions to constrained mining and stream mining

Hao, Boyu, 19 July 2010
Most studies on frequent itemset mining focus on mining precise data. However, there are situations in which the data are uncertain. This leads to the mining of uncertain data. There are also situations in which users are only interested in frequent itemsets that satisfy user-specified aggregate constraints. This leads to constrained mining of uncertain data. Moreover, floods of uncertain data can be produced in many other situations. This leads to stream mining of uncertain data. In this M.Sc. thesis, we propose algorithms to deal with all these situations. We first design a tree-based mining algorithm to find all frequent itemsets from databases of uncertain data. We then extend it to mine databases of uncertain data for only those frequent itemsets that satisfy user-specified aggregate constraints and to mine streams of uncertain data for all frequent itemsets. Experimental results show the effectiveness of all these algorithms.
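The core notion here is expected support: each item carries an existential probability, and an itemset's expected support sums, over the transactions containing it, the product of its items' probabilities (assuming independence). The sketch below computes this with a naive Apriori-style level search, purely to illustrate the definition; the thesis's tree-based algorithms are not reproduced.

```python
db = [  # each transaction maps an item to its existential probability
    {"a": 0.9, "b": 0.8},
    {"a": 0.6, "c": 0.7},
    {"a": 0.9, "b": 0.5, "c": 0.4},
]

def exp_support(itemset):
    total = 0.0
    for t in db:
        if all(i in t for i in itemset):
            p = 1.0
            for i in itemset:
                p *= t[i]          # independence assumption
            total += p
    return total

def frequent_itemsets(minsup=1.0):
    items = sorted({i for t in db for i in t})
    result, level = {}, [(i,) for i in items]
    while level:
        kept = [s for s in level if exp_support(s) >= minsup]
        result.update({s: round(exp_support(s), 3) for s in kept})
        # Extend each kept itemset with lexicographically larger items;
        # expected support is anti-monotone, so this level search is complete.
        level = [s + (i,) for s in kept for i in items if i > s[-1]]
    return result

print(frequent_itemsets(minsup=1.0))
```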
99

Frequent pattern mining of uncertain data streams

Jiang, Fan, January 2011
When dealing with uncertain data, users may not be certain about the presence of an item in the database. For example, due to inherent instrumental imprecision or errors, data collected by sensors are usually uncertain. In various real-life applications, uncertain databases are not necessarily static; new data may arrive continuously and at a rapid rate. These uncertain data can come in batches, which form a data stream. To discover useful knowledge in the form of frequent patterns from streams of uncertain data, algorithms have been developed that use the sliding window model for processing and mining data streams. However, for some applications, the landmark window model and the time-fading model are more appropriate. In this M.Sc. thesis, I propose tree-based algorithms that use the landmark window model or the time-fading model to mine frequent patterns from streams of uncertain data. Experimental results show the effectiveness of these algorithms.
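As a sketch of the time-fading model alone (none of the thesis's tree structures are reproduced), each batch's contribution to an item's expected support is discounted by an assumed fading factor whenever a newer batch arrives:

```python
FADE = 0.8    # assumed fading factor
MINSUP = 0.5  # assumed threshold on faded expected support

counts = {}   # item -> faded expected support

def process_batch(batch):
    # Age existing counts, then add this batch's expected supports.
    for item in counts:
        counts[item] *= FADE
    for transaction in batch:
        for item, prob in transaction.items():
            counts[item] = counts.get(item, 0.0) + prob

stream = [  # made-up stream of batches of uncertain transactions
    [{"a": 0.9, "b": 0.3}],               # batch 1 (oldest)
    [{"a": 0.7}, {"b": 0.9, "c": 0.6}],   # batch 2
    [{"c": 0.8}],                         # batch 3 (newest)
]
for batch in stream:
    process_batch(batch)
print({i: round(s, 3) for i, s in counts.items() if s >= MINSUP})
```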
100

Mining frequent patterns from uncertain data with MapReduce

Hayduk, Yaroslav, 4 April 2012
Frequent pattern mining from uncertain data allows data analysts to mine frequent patterns from probabilistic databases, within which each item is associated with an existential probability representing the likelihood of the presence of the item in the transaction. When compared with precise data, the solution space for mining uncertain data is often much larger due to the probabilistic nature of uncertain databases. Thus, uncertain data mining algorithms usually take substantially more time to execute. Recent studies show that the MapReduce programming model yields significant performance gains for data mining algorithms, which can be mapped to the map and reduce execution phases of MapReduce. An attractive feature of MapReduce is fault-tolerance, which permits detecting and restarting failed jobs on working machines. In this M.Sc. thesis, I explore the feasibility of applying MapReduce to frequent pattern mining of uncertain data. Specifically, I propose two algorithms for mining frequent patterns from uncertain data with MapReduce.
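The sketch below simulates the map and reduce phases in plain Python for singleton expected supports only; a real deployment would run on a MapReduce engine such as Hadoop, and the thesis's algorithms also mine longer patterns.

```python
from collections import defaultdict

def map_phase(transaction):
    # Emit (item, existential probability) pairs from one transaction.
    for item, prob in transaction.items():
        yield item, prob

def reduce_phase(pairs):
    # Sum each item's emitted probabilities into its expected support.
    acc = defaultdict(float)
    for item, prob in pairs:
        acc[item] += prob
    return dict(acc)

db = [{"a": 0.9, "b": 0.8}, {"a": 0.6, "c": 0.7}, {"b": 0.5, "c": 0.4}]
pairs = [kv for t in db for kv in map_phase(t)]  # "map" over transactions
print(reduce_phase(pairs))                       # "reduce" by item key
```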
