Concept-value pair extraction from semi-structured clinical reports : a case study using echocardiogram reports

Thesis (S.M.)--Harvard-MIT Division of Health Sciences and Technology, 2004. / Includes bibliographical references (p. 38-39). / The task of gathering detailed patient information from narrative clinical text presents a significant barrier to clinical research. A prototype information extraction system was developed to extract pre-specified findings from narrative echocardiogram reports. The system, which uses a Unified Medical Language System-compatible architecture, is very simple and takes advantage of canonical language-use patterns to identify sentence templates with which concepts and their values can be identified. The data extracted from this system will be used to enrich an existing database used by clinical researchers in a large university healthcare system to identify potential research candidates fulfilling clinical inclusion criteria. The system was developed and evaluated using ten pre-determined clinical concepts. Concept-value pairs extracted by the system related to these ten conditions were compared with findings extracted manually by the author. The system was able to recall 78% of the relevant findings (CI, 76% to 80%), with a precision of 99% (CI, 98% to 99%). Because data acquired from the system will ultimately be used in document and patient retrieval, preliminary analysis was done to evaluate document retrieval effectiveness. Median recall across the ten conditions was 36% (range, 0% to 93%). The system retrieved no documents for two of the ten conditions; median precision for the remaining eight conditions was 100% (range, 92% to 100%). / by Jeanhee Chung. / S.M.

Identifier: oai:union.ndltd.org:MIT/oai:dspace.mit.edu:1721.1/28584
Date: January 2004
Creators: Chung, Jeanhee, 1972-
Contributors: G. Octo Barnett, Henry Chueh and Shawn N. Murphy; Harvard University--MIT Division of Health Sciences and Technology
Publisher: Massachusetts Institute of Technology
Source Sets: M.I.T. Theses and Dissertation
Language: en_US
Detected Language: English
Type: Thesis
Format: 39 p., 2470218 bytes, 2472397 bytes, application/pdf, application/pdf, application/pdf
Rights: M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission: http://dspace.mit.edu/handle/1721.1/7582
