
EVALUATION OF A NOVEL TERMINOLOGY TO CATEGORIZE CLINICAL DOCUMENT SECTION HEADERS AND A RELATED CLINICAL NOTE SECTION TAGGER

The aims of this project are 1) to build and evaluate a terminology that provides categorization labels, or tags, for common segments within clinical documents, and 2) to evaluate a tool that parses and labels natural-language clinical documents using the terminology. Clinical documents generally contain many sections and subsections, such as history of present illness, physical examination, and cardiovascular exam. The author developed a section header terminology that models common section names, subsection names, and their relationships. This terminology was built using existing standardized terminologies, textbooks, and review of over 9,000 clinical notes. The section tagging tool, named SecTag, identifies terminology matches from clinical documents using a combination of linguistic, natural language processing, and machine learning techniques. The evaluation study focused on recognizing sections in 319 randomly chosen history and physical examination notes that were generated during hospitalizations and outpatient visits. The overall recall and precision were 99% and 96%, respectively, over 16,036 possible sections. Recall and precision for sections not labeled in the document were 97% and 87%, respectively. The system correctly tagged 93% of the section start and end boundaries. SecTag failed to label 160 sections (1%); only 11 of these were headings absent from the terminology that should be added to it. SecTag and its terminology are important first steps for understanding clinical notes. Future studies are needed to extend the terminology to other clinical note types and to link SecTag to a more in-depth natural language processing system.
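
As a rough check on the figures above, the reported overall recall can be recomputed from the counts given in the abstract. This is only a sketch: it assumes the 160 unlabeled sections are the false negatives among the 16,036 possible sections, and the function and variable names are illustrative, not taken from SecTag. The abstract gives no count of false positives, so precision is defined but not evaluated.

```python
def recall(true_positives: int, false_negatives: int) -> float:
    """Fraction of the sections that should be found that were found."""
    return true_positives / (true_positives + false_negatives)


def precision(true_positives: int, false_positives: int) -> float:
    """Fraction of the sections that were labeled that were correct."""
    return true_positives / (true_positives + false_positives)


# Counts taken from the abstract.
POSSIBLE_SECTIONS = 16_036
MISSED_SECTIONS = 160  # assumption: these are the false negatives

# Sections that were labeled, under the assumption above.
found_sections = POSSIBLE_SECTIONS - MISSED_SECTIONS
overall_recall = recall(found_sections, MISSED_SECTIONS)

print(f"recall = {overall_recall:.3f}")
```

Under that assumption the result rounds to the 99% recall stated in the abstract.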

Identifier: oai:union.ndltd.org:VANDERBILT/oai:VANDERBILTETD:etd-07272007-173034
Date: 03 August 2007
Creators: Denny, Joshua C
Contributors: Anderson Spickard, III, MD, MS, Randolph A. Miller, Kevin B. Johnson
Publisher: VANDERBILT
Source Sets: Vanderbilt University Theses
Language: English
Detected Language: English
Type: text
Format: application/pdf
Source: http://etd.library.vanderbilt.edu//available/etd-07272007-173034/
Rights: unrestricted, I hereby certify that, if appropriate, I have obtained and attached hereto a written permission statement from the owner(s) of each third party copyrighted matter to be included in my thesis, dissertation, or project report, allowing distribution as specified below. I certify that the version I submitted is the same as that approved by my advisory committee. I hereby grant to Vanderbilt University or its agents the non-exclusive license to archive and make accessible, under the conditions specified below, my thesis, dissertation, or project report in whole or in part in all forms of media, now or hereafter known. I retain all other ownership rights to the copyright of the thesis, dissertation or project report. I also retain the right to use in future works (such as articles or books) all or part of this thesis, dissertation, or project report.
