Submitted to the Health Informatics Graduate Program Faculty, Indiana University, in partial fulfillment of the requirements for the degree of Master of Science in Health Informatics. May 2006.
The purpose of this study is to examine the current use of natural language processing (NLP) for extracting meaningful data from free text in medical reports. Natural language processing has been used to process information from various genres. To evaluate its use, a synthesized review was conducted of primary research papers specific to natural language processing and the extraction of data from medical reports. A three-phased approach is used to describe the process of gathering the final metrics for validating the use of natural language processing.
The main purpose of any NLP system is to extract or understand human language and to process it into meaning for a specified area of interest or end user. There are three types of approaches: symbolic, statistical, and connectionist. The research identifies four problems with natural language processing and its different approaches: acquisition, coverage, robustness, and extensibility.
Metrics were gathered from primary research papers to evaluate the success of the natural language processors. Average recall across four papers was 85%. Average precision across five papers was 87.7%. Average accuracy was 97%. Average sensitivity was 84%, while specificity was 97.4%. Based on the results of the primary research, there was no definitive way to validate one NLP approach as an industry standard.
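The evaluation metrics named above have standard definitions in terms of true and false positives and negatives. The following is a minimal sketch of those definitions, not the procedure used in the thesis: the `ConfusionCounts` class, the `extraction_metrics` and `mean_over_papers` helpers, and the example counts are all hypothetical and introduced only for illustration. Note that recall and sensitivity share the same formula, so differing averages for the two would presumably reflect different sets of papers reporting each metric.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical counts of NLP extraction decisions versus a reference standard."""
    tp: int  # findings extracted that the reference standard also contains
    fp: int  # findings extracted that the reference standard does not contain
    tn: int  # findings correctly not extracted
    fn: int  # findings in the reference standard that the system missed


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN instead of raising when a metric is undefined (zero denominator).
    return numerator / denominator if denominator else math.nan


def extraction_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Compute the standard metrics from one study's counts."""
    recall = _ratio(c.tp, c.tp + c.fn)
    return {
        "recall": recall,
        "sensitivity": recall,  # same quantity under a different name
        "precision": _ratio(c.tp, c.tp + c.fp),
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
    }


def mean_over_papers(values: list[float]) -> float:
    """Average a metric over only the papers that reported it (NaN values skipped)."""
    reported = [v for v in values if not math.isnan(v)]
    return sum(reported) / len(reported) if reported else math.nan


if __name__ == "__main__":
    # Made-up counts for a single study; they are not taken from any reviewed paper.
    counts = ConfusionCounts(tp=85, fp=10, tn=890, fn=15)
    for name, value in extraction_metrics(counts).items():
        print(f"{name}: {value:.3f}")
```

Averaging only over the papers that report a given metric is one way to explain why each average can rest on a different number of papers, as with the four papers for recall and five for precision.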
From the research reviewed, it is clear that there has been at least limited success in extracting information from free text with natural language processing. It is important to understand the continuum of data, information, and knowledge in previous and future research on natural language processing. In the health informatics industry, this technology is necessary for improving healthcare and research.
Identifier | oai:union.ndltd.org:IUPUI/oai:scholarworks.iupui.edu:1805/610 |
Date | 29 June 2006 |
Creators | Pfeiffer II, Richard D. |
Contributors | McDaniel, Anna M. |
Source Sets | Indiana University-Purdue University Indianapolis |
Language | en_US |
Detected Language | English |
Type | Thesis |
Format | 642560 bytes, application/msword |