
Personalized Medicine through Automatic Extraction of Information from Medical Texts

The wealth of medical information available today forms a multidimensional source of knowledge. Research discoveries published in prestigious venues, electronic health record data, discharge summaries, clinical notes, and similar sources all contain important medical information that can assist the medical decision-making process. The challenge in accessing and using such vast and diverse sources of data lies in distilling and extracting reliable and relevant information. Computer-based tools that use natural language processing and machine learning techniques have proven helpful in addressing this challenge. This work proposes reliable automatic solutions for tasks that can help achieve personalized medicine, a medical practice that brings together general medical knowledge and case-specific medical information. Phenotypic medical observations, along with data from test results, are not enough when assessing and treating a medical case. Genetic, lifestyle, background, and environmental data also need to be taken into account in the medical decision process. The goal of this thesis is to show that natural language processing and machine learning techniques are reliable solutions for important medical problems.
Of the numerous research problems that need to be addressed when implementing personalized medicine, the scope of this thesis is restricted to four, as follows:
1. Automatic identification of obesity-related diseases by using only textual clinical data;
2. Automatic identification of relevant abstracts of published research to be used for building systematic reviews;
3. Automatic identification of gene functions based on textual data of published medical abstracts;
4. Automatic identification and classification of important medical relations between medical concepts in clinical and technical data.
This thesis's investigation into automatic solutions for achieving personalized medicine through information identification and extraction focuses on individual, specific problems that can later be linked together in a puzzle-building manner. A diverse representation technique that follows a divide-and-conquer methodological approach proves to be the most reliable solution for building automatic models that solve the above-mentioned tasks. The methodologies that I propose are supported by in-depth research experiments and thorough discussions and conclusions.

Identifier: oai:union.ndltd.org:LACETR/oai:collectionscanada.gc.ca:OOU-OLD./22724
Date: 17 April 2012
Creators: Frunza, Oana Magdalena
Source Sets: Library and Archives Canada ETDs Repository / Centre d'archives des thèses électroniques de Bibliothèque et Archives Canada
Language: English
Detected Language: English
Type: Thèse / Thesis
