<p>One in nine women is expected to be diagnosed with breast cancer during her life. In 2013, an estimated 23, 800 Canadian women will be diagnosed with breast cancer and 5, 000 will die of it. Making decisions about the treatment for a patient is difficult since it depends on various clinical features, genomic factors, and pathological and cellular classification of a tumor.</p> <p>In this research, we propose a probabilistic graphical model for prognosis and diagnosis of breast cancer that can help medical doctors make better decisions about the best treatment for a patient. Probabilistic graphical models are suitable for making decisions under uncertainty from big data with missing attributes and noisy evidence.</p> <p>Using the proposed model, we may enter the results of different tests (e.g. estrogen and progesterone receptor test and HER2/neu test), microarray data, and clinical traits (e.g. woman's age, general health, menopausal status, stage of cancer, and size of the tumor) to the model and answer to following questions. How likely is it that the cancer will extend in the body (distant metastasis)? What is the chance of survival? How likely is that the cancer comes back (local or regional recurrence)? How promising is a treatment? For example, how likely metastasis is and how likely recurrence is for a new patient, if certain treatment e.g. surgical removal, radiation therapy, hormone therapy, or chemotherapy is applied. We can also classify various types of breast cancers using this model.</p> <p>Previous work mostly relied on clinical data. In our opinion, since cancer is a genetic disease, the integration of the genomic (microarray) and clinical data can improve the accuracy of the model for prognosis and diagnosis. However, increasing the number of variables may lead to poor results due to the curse of dimensionality dilemma and small sample size problem. The microarray data is high dimensional. It consists of around 25, 000 variables per patient. Moreover, structure learning and parameter learning for probabilistic graphical models require a significant amount of computations. The number of possible structures is also super-exponential with respect to the number of variables. For instance, there are more than 10^18 possible structures with just 10 variables.</p> <p>We address these problems by applying manifold learning and dimensionality reduction techniques to improve the accuracy of the model. Extensive experiments using real-world data sets such as METRIC and NKI show the accuracy of the proposed method for classification and predicting certain events, like recurrence and metastasis.</p> / Master of Science (MSc)
Identifer | oai:union.ndltd.org:mcmaster.ca/oai:macsphere.mcmaster.ca:11375/13754 |
Date | 04 1900 |
Creators | KHADEMI, MAHMOUD |
Contributors | Nedialkov, Ned, Computing and Software |
Source Sets | McMaster University |
Detected Language | English |
Type | thesis |
Page generated in 0.0021 seconds