Global ETD Search

491	Extracting meaningful statistics for the characterization and classification of biological, medical, and financial data
Woods, Tonya M. 21 September 2015
This thesis focuses on extracting meaningful statistics for the characterization and classification of biological, medical, and financial data and contains four chapters. The first chapter contains theoretical background on scaling and wavelets, which supports the work in chapters two and three. In the second chapter, we outline a methodology for representing sequences of DNA nucleotides as numeric matrices in order to analytically investigate important structural characteristics of DNA. This methodology involves assigning unit vectors to nucleotides, placing the vectors into columns of a matrix, and accumulating across the rows of this matrix. Transcribing the DNA in this way allows us to compute the 2-D wavelet transformation and assess regularity characteristics of the sequence via the slope of the wavelet spectra. In addition to computing a global slope measure for a sequence, we can apply our methodology to overlapping sections of nucleotides to obtain an evolutionary slope. In the third chapter, we describe various ways wavelet-based scaling may be used for cancer diagnostics. There were nearly half a million new cases of ovarian, breast, and lung cancer in the United States last year. Breast and lung cancer have the highest prevalence, while ovarian cancer has the lowest survival rate of the three. Early detection is critical for all of these diseases, but substantial obstacles to early detection exist in each case. In this work, we use wavelet-based scaling on metabolic data and radiography images in order to produce meaningful features to be used in classifying cases and controls. Computer-aided detection (CAD) algorithms for detecting lung and breast cancer often focus on select features in an image and make a priori assumptions about the nature of a nodule or a mass. In contrast, our approach to analyzing breast and lung images captures information contained in the background tissue of images as well as information about specific features, and makes no such a priori assumptions. In the fourth chapter, we investigate the value of social media data in building commercial default and activity credit models. We use random forest modeling, which has been shown in many instances to achieve better predictive accuracy than logistic regression in modeling credit data. This investigation is of interest because some entities are beginning to build credit scores based on this type of publicly available online data alone. Our work has shown that adding social media data does not improve model accuracy over the bureau-only models. However, the social media data on its own does have some limited predictive power.
Wavelets; Scaling; Regularity; Classification; SVM; GC content; Exons; Introns; Ovarian cancer; Breast cancer; Mammography; Lung cancer; Lung CXR; Credit risk; Response model; Random forest; Social media data; Online review data
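The matrix construction described in entry 491 (unit vectors assigned to nucleotides, placed into columns, then accumulated across rows) can be illustrated with a small sketch. The nucleotide-to-row assignment in `NUCLEOTIDE_ROW` and the function name are assumptions made for illustration and are not taken from the thesis; the subsequent wavelet analysis is not shown.

```python
# Hypothetical assignment of nucleotides to rows; the thesis may use another order.
NUCLEOTIDE_ROW = {"A": 0, "C": 1, "G": 2, "T": 3}


def dna_to_cumulative_matrix(sequence):
    """Return a 4 x N matrix whose row r holds the running count of nucleotide r."""
    n = len(sequence)
    matrix = [[0] * n for _ in range(4)]

    # Each column is a unit vector marking which nucleotide occurs at that position.
    for col, base in enumerate(sequence.upper()):
        matrix[NUCLEOTIDE_ROW[base]][col] = 1

    # Accumulate across each row (a cumulative sum along the sequence).
    for row in matrix:
        running_total = 0
        for col in range(n):
            running_total += row[col]
            row[col] = running_total
    return matrix


if __name__ == "__main__":
    for row in dna_to_cumulative_matrix("ACGTA"):
        print(row)
```

Each row of the result is a non-decreasing count of one nucleotide along the sequence, which gives a numeric signal that a wavelet transform could then be applied to.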
492	Détection automatique de chutes de personnes basée sur des descripteurs spatio-temporels : définition de la méthode, évaluation des performances et implantation temps-réel [Automatic detection of human falls based on spatio-temporal descriptors: method definition, performance evaluation, and real-time implementation]
Charfi, Imen 21 October 2013
We propose a supervised method for detecting human falls in real time that is robust to changes in viewpoint and environment. The first part of the work makes available online a video dataset, DSFD, recorded in four different locations and carrying a large number of manual annotations suited to comparing methods. We also defined an evaluation metric that adapts to the nature of the video stream and the duration of a fall while taking real-time constraints into account. Second, we built and evaluated the STHF spatio-temporal descriptors, computed from the geometric attributes of the moving shape in the scene and from their transformations, and used a feature selection method to define the optimized fall descriptor. Robustness to environment changes was evaluated using SVMs and boosting. Performance improves when training is updated with fall-free videos recorded in the final environment. Finally, we implemented the detector on an embedded system comparable to a smart camera, based on a Zynq-type SoC component. An Algorithm-Architecture Adequation approach yielded a good trade-off between classification performance and processing time.
[INFO:INFO_OH] Computer Science/Other; [SPI:OTHER] Engineering Sciences/Other; Real-time fall detection; Spatio-temporal descriptors; Feature selection; SVM; Boosting; Fall video dataset; System on Chip (SoC)
493	Semantic Assisted, Multiresolution Image Retrieval in 3D Brain MR Volumes
Quddus, Azhar January 2010
Content Based Image Retrieval (CBIR) is an important research area in the field of multimedia information retrieval. The application of CBIR in the medical domain has been attempted before; however, the use of CBIR in medical diagnostics is a daunting task. The goal of diagnostic medical image retrieval is to provide diagnostic support by displaying relevant past cases, along with proven pathologies as ground truths. Moreover, medical image retrieval can be extremely useful as a training tool for medical students and residents, for follow-up studies, and for research purposes. Despite the impressive amount of research in the area of CBIR, its acceptance for mainstream and practical applications is quite limited. The research in CBIR has mostly been conducted as an academic pursuit, rather than to provide the solution to a need. For example, many researchers have proposed CBIR systems where the image database consists of a heterogeneous mixture of man-made objects and natural scenes, while ignoring the practical uses of such systems. Furthermore, the intended use of a CBIR system is important in addressing the problem of the "Semantic Gap". Indeed, the requirements for the semantics in an image retrieval system for pathological applications are quite different from those intended for training and education. Moreover, many researchers have underestimated the level of accuracy required for a useful and practical image retrieval system. The human eye is extremely dexterous and efficient in visual information processing; consequently, CBIR systems should be highly precise in image retrieval so as to be useful to human users. Unsurprisingly, for these and other reasons, most of the proposed systems have not found useful real-world applications.
In this dissertation, an attempt is made to address the challenging problem of developing a retrieval system for medical diagnostics applications. More specifically, a system for semantic retrieval of Magnetic Resonance (MR) images in 3D brain volumes is proposed. The proposed retrieval system has the potential to be useful for clinical experts where the human eye may fail. Previously proposed systems used imprecise segmentation and feature extraction techniques, which are not suitable for the precise matching requirements of image retrieval in this application domain. This dissertation uses a multiscale representation for image retrieval, which is robust against noise and MR inhomogeneity. In order to achieve a higher degree of accuracy in the presence of misalignments, an image registration based retrieval framework is developed. Additionally, to speed up the retrieval system, a fast discrete wavelet based feature space is proposed. Further improvement in speed is achieved by semantically classifying the human brain into various "Semantic Regions", using an SVM based machine learning approach. A novel and fast identification system is proposed for identifying a 3D volume given a 2D image slice. To this end, we used SVM output probabilities for ranking and identification of patient volumes. The proposed retrieval systems are tested not only under noise conditions but also on healthy and abnormal cases, resulting in promising retrieval performance with respect to multi-modality, accuracy, speed and robustness. This dissertation furnishes medical practitioners with a valuable set of tools for semantic retrieval of 2D images, where the human eye may fail. Specifically, the proposed retrieval algorithms provide medical practitioners with the ability to retrieve 2D MR brain images accurately and to monitor disease progression in various lobes of the human brain, including in multiple patients simultaneously.
Additionally, the proposed semantic classification scheme can be extremely useful for semantic based categorization, clustering and annotation of images in MR brain databases. This research framework may evolve in a natural progression towards more powerful and robust retrieval systems. It also provides researchers in semantic based retrieval systems with a foundation for expanding existing toolsets to solve retrieval problems.
Image Retrieval; Medical Image Retrieval; Semantic Image Retrieval; Diagnostic Image Retrieval; Multiresolution Wavelets; Machine Learning; Support Vector Machines; Image Registration; 2D Rigid Registration; SVM; MR inhomogeneity; Electrical and Computer Engineering
494	Évaluation de la correction du mouvement respiratoire sur la détection des lésions en oncologie TEP [Evaluation of respiratory motion correction for lesion detection in PET oncology]
Marache-Francisco, Simon 14 February 2012
Positron emission tomography (PET) is a rapidly expanding clinical imaging method in oncology. Many clinical studies show that PET makes it possible both to diagnose and characterize cancerous lesions at earlier stages than conventional anatomical imaging and to assess treatment response more quickly. Shortening the cycle of diagnosis, therapy, follow-up, and therapeutic reorientation helps improve the patient's prognosis and contain healthcare costs. A PET examination takes too long to be acquired under breath-hold. The quality of PET images is therefore affected by the patient's respiratory motion, which blurs the images. The effects of respiratory motion are particularly pronounced in the thorax and abdomen. Several types of methods have been proposed to correct the data for this phenomenon, but they remain cumbersome to deploy in clinical routine. Recently published work evaluates these methods using quality criteria such as signal-to-noise ratio or bias. To date, no study has evaluated the impact of these corrections on the quality of clinical diagnosis. We focused on the detection of small-diameter, low-contrast lesions of the thorax and abdomen, which are the most likely to benefit from respiratory motion correction in clinical routine. Our work first consisted of building a database of PET images that model non-uniform respiratory motion and inter-individual variability and that contain a sampling of lesions of varying size and contrast.
These requirements led us to Monte Carlo simulation methods, which allow control over all the parameters influencing image formation and quality. A database of 15 patient models was created by fitting the XCAT anthropomorphic model to patient computed tomography (CT) images. In parallel, we developed an original strategy for evaluating detection performance. This method includes an automated lesion detection system based on support vector machines. Performance is measured by analyzing free-response receiver operating characteristic (FROC) curves, which we adapted to the specifics of PET imaging. Performance is evaluated for two respiratory motion correction techniques by comparing them with the performance obtained on uncorrected images and on images without respiratory motion. The results are promising and show a real improvement in lesion detection after correction, approaching the performance obtained on static images.
[SPI:OTHER] Engineering Sciences/Other; Medical imaging; Breathing thorax; Oncology; Pattern recognition; Computer-aided detection (CAD); Monte Carlo simulation; Computed tomography; Anthropomorphic model; Support vector machine (SVM)
495	Forecasting Mid-Term Electricity Market Clearing Price Using Support Vector Machines
2014 May 1900
In a deregulated electricity market, offering the appropriate amount of electricity at the right time with the right bidding price is of paramount importance. Forecasting the electricity market clearing price (MCP) means predicting the future electricity price from given forecasts of electricity demand, temperature, sunshine, fuel cost, precipitation and other related factors. Currently, many techniques are available for short-term electricity MCP forecasting, but very little has been done in the area of mid-term electricity MCP forecasting. Mid-term electricity MCP forecasting covers a time frame of one to six months. It is essential for mid-term planning and decision making, such as generation plant expansion and maintenance scheduling, reallocation of resources, bilateral contracts and hedging strategies. Six mid-term electricity MCP forecasting models are proposed and compared in this thesis: 1) a single support vector machine (SVM) forecasting model, 2) a single least squares support vector machine (LSSVM) forecasting model, 3) a hybrid SVM and autoregressive moving average with external input (ARMAX) forecasting model, 4) a hybrid LSSVM and ARMAX forecasting model, 5) a multiple SVM forecasting model and 6) a multiple LSSVM forecasting model. PJM interconnection data are used to test the proposed models. Cross-validation was used to optimize the control parameters and the selection of training data for the six proposed models. Three evaluation measures, mean absolute error (MAE), mean absolute percentage error (MAPE) and mean square root error (MSRE), are used to analyze the forecasting accuracy of the system.
According to the experimental results, the multiple SVM forecasting model performed best among the six proposed forecasting models. The proposed multiple SVM based mid-term electricity MCP forecasting model contains a data classification module and a price forecasting module. The data classification module first pre-processes the input data into corresponding price zones, and the forecasting module then forecasts the electricity price using four SVMs designed in parallel. Compared with the other five forecasting models proposed in this thesis, this model gives the largest improvement in forecasting accuracy, both on peak prices and on the system overall.
Classification; Deregulated electric market; Electricity market clearing price; Electricity price forecasting; PJM; Support vector machine (SVM); Peak price
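Two of the evaluation measures named in entry 495, MAE and MAPE, can be sketched under their usual textbook definitions. The function names and the sample prices below are hypothetical, and the thesis may define or scale these measures differently; MSRE is left out because the abstract does not define it.

```python
def mean_absolute_error(actual, forecast):
    """Average absolute difference between actual and forecast prices."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mean_absolute_percentage_error(actual, forecast):
    """Average absolute error relative to the actual price, in percent.

    Assumes no actual price is zero.
    """
    ratios = (abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * sum(ratios) / len(actual)


if __name__ == "__main__":
    # Made-up clearing prices, used only to exercise the functions.
    actual_prices = [50.0, 40.0, 100.0]
    forecast_prices = [45.0, 44.0, 110.0]
    print(mean_absolute_error(actual_prices, forecast_prices))
    print(mean_absolute_percentage_error(actual_prices, forecast_prices))
```

MAE is expressed in the price unit itself, whereas MAPE is scale-free, which is one reason the two are often reported together when price levels vary between peak and off-peak periods.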
496	Multi-label Classification with Multiple Label Correlation Orders And Structures
Posinasetty, Anusha January 2016
Multilabel classification has attracted much interest in recent times due to the wide applicability of the problem and the challenges involved in learning a classifier for multilabeled data. A crucial aspect of multilabel classification is to discover the structure and order of correlations among labels and their effect on the quality of the classifier. In this work, we propose a structural Support Vector Machine (structural SVM) based framework that enables us to systematically investigate the importance of label correlations in multi-label classification. The proposed framework is very flexible: it provides a unified approach to handling multiple correlation orders and structures in an adaptive manner, and it helps to effectively assess the importance of label correlations in improving generalization performance. We perform an extensive empirical evaluation on several datasets from different domains and present results on various performance metrics. Our experiments provide, for the first time, interesting insights into the following questions: a) Are label correlations always beneficial in multilabel classification? b) What effect do label correlations have on the multiple performance metrics typically used in multilabel classification? c) Is label correlation order significant and, if so, what would be the favorable correlation order for a given dataset and a given performance metric? d) Can we make useful suggestions on the label correlation structure?
Multi Label Classification; Structural Support Vector Machine; Machine Learning; Multiclass Classification; Multi-Label Classification Algorithms; Structural SVM; Computer Science
497	Réseaux de neurones, SVM et approches locales pour la prévision de séries temporelles [Neural networks, SVMs, and local approaches for time series forecasting]
Cherif, Aymen 16 July 2013
Time series forecasting has been a widely discussed problem for many years. Researchers from various disciplines have addressed it in several application areas: finance, medicine, transportation, etc. In this thesis, we focused on machine learning methods: neural networks and SVMs. We were also interested in meta-methods for improving predictor performance, and more specifically in local models. Following a divide-and-conquer strategy, local models cluster the data before predictors are assigned to each resulting subset. We present a modified learning algorithm for recurrent neural networks that adapts them for use as local predictors. We also propose two novel clustering techniques suitable for local models. The first is based on Kohonen maps, and the second is based on binary trees.
Neural networks; Multilayer perceptron; Recurrent neural networks; SVM (Support Vector Machines); Time series forecasting; Regression; Machine learning; Supervised learning; Unsupervised learning
498	Vision-based moving pedestrian recognition from imprecise and uncertain data / Reconnaissance de piétons par vision à partir de données imprécises et incertaines
Zhou, Dingfu 05 December 2014
Implementing vision-based Advanced Driver Assistance Systems (ADAS) is a complex and challenging task, especially with respect to robustness in real-world traffic scenarios. ADAS aim to perceive and understand the environment surrounding the ego-vehicle and to provide the driver with the necessary assistance in emergencies. In this thesis, we focus on detecting and recognizing moving objects, because their dynamics make them more unpredictable and therefore more dangerous than static ones. Detecting these objects, estimating their positions, and recognizing their categories are important for ADAS and autonomous navigation. Consequently, we propose a complete system for moving object detection and recognition based only on vision sensors. The proposed approach can detect any kind of moving object from only two adjacent frames. The core idea is to detect moving pixels using the Residual Image Motion Flow (RIMF), defined as the residual image changes caused by moving objects once the camera motion has been compensated. To detect all kinds of motion robustly and remove false positives, uncertainties in the ego-motion estimation and disparity computation must also be considered.
The main steps of the algorithm are as follows. First, the relative camera pose is estimated by minimizing the sum of the reprojection errors of matched features, and its covariance matrix is calculated using a first-order error propagation strategy. Next, a motion likelihood is obtained for each pixel by propagating the uncertainties of the ego-motion and disparity to the RIMF. Finally, the motion likelihood and the depth gradient are used in a graph-cut-based approach to segment the moving objects. At the same time, bounding boxes of the moving objects are generated from the U-disparity map.
Given the bounding box of a moving object, we then classify the object as a pedestrian or not. Compared with supervised classification algorithms (such as boosting and SVM), which require a large number of labeled training instances, our proposed semi-supervised boosting algorithm is trained with only a few labeled instances and many unlabeled instances. The labeled instances are first used to estimate probabilistic class labels for the unlabeled instances, using Gaussian Mixture Models after a dimension reduction step performed via Principal Component Analysis. We then apply a boosting strategy to decision stumps trained on the resulting soft-labeled instances. The performance of the proposed method is evaluated on several benchmark classification datasets, as well as on a pedestrian detection and recognition problem.
Finally, both the moving object detection and the recognition algorithms are tested on the public KITTI image dataset, and the experimental results show that the proposed methods achieve good performance in different urban driving scenarios.
Stereo vision; Vision sensors; Uncertainty; ADAS; RIMF; SVM; Motion detection; Pedestrian recognition; Semi-supervised learning; Boosting; Driver assistance system; Pattern recognition systems; Computer vision; Robot vision; Automotive sensors; Traffic safety; Support vector machines; Decision trees
499	Machine Learning for Market Prediction : Soft Margin Classifiers for Predicting the Sign of Return on Financial Assets
Abo Al Ahad, George, Salami, Abbas January 2018
Forecasting procedures have found applications in a wide variety of areas within finance, and forecasting has proven to be one of the most challenging problems in the field. Faced with an immense variety of economic data, stakeholders aim to understand the current and future state of the market. Since it is hard for a human to make sense of large amounts of data, different modeling techniques have been applied to extract useful information from financial databases, with machine learning techniques among the most recent. Binary classifiers such as Support Vector Machines (SVMs) have to some extent been used for this purpose, and extensions of the algorithm have been developed with increased prediction performance as the main goal. The objective of this study has been to develop a process for improving performance when predicting the sign of return of financial time series with soft margin classifiers. An analysis of the algorithms is presented, followed by a description of the methodology used. The developed process, which contains some of the presented soft margin classifiers together with other aspects of kernel methods such as Multiple Kernel Learning, has shown good results over the long term: the capability to capture different market conditions has been shown to improve when different models and kernels are incorporated instead of only a single one. However, the results are mostly congruent with earlier studies in this field. Furthermore, two research questions have been answered: one concerning the complexity of the kernel functions used by the SVM, and one concerning the robustness of the process as a whole.
Complexity refers to achieving more complex feature maps by combining kernels, either by adding, multiplying or functionally transforming them. Increased complexity was not found to lead to a consistent improvement; however, for the individual models, the combined kernel function is superior during some periods of the time series used in this thesis. Robustness has been investigated for different signal-to-noise ratios, and it has been observed that windows with previously poor performance are more exposed to the impact of noise.
Machine Learning; Finance; Financial Time Series; Support Vector Machines; Relevance Vector Machines; Multiple Kernel Learning; Simulated Annealing; SVM; RVM; MKL; SA; FSVM; TSVM; FTSVM; Other engineering and technologies
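To make concrete what combining kernels by adding or multiplying them means in entry 499, a minimal sketch follows. The choice of base kernels (linear and Gaussian RBF), the weights, and all function names are illustrative assumptions; the thesis may combine different kernels and learn the weights rather than fix them.

```python
import math


def linear_kernel(x, y):
    """Inner product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian RBF kernel; gamma is a hypothetical bandwidth setting."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def add_kernels(k1, k2, w1=1.0, w2=1.0):
    """Weighted sum of two kernels (a valid kernel for non-negative weights)."""
    return lambda x, y: w1 * k1(x, y) + w2 * k2(x, y)


def multiply_kernels(k1, k2):
    """Pointwise product of two kernels (also a valid kernel)."""
    return lambda x, y: k1(x, y) * k2(x, y)


if __name__ == "__main__":
    summed = add_kernels(linear_kernel, rbf_kernel)
    product = multiply_kernels(linear_kernel, rbf_kernel)
    u, v = [1.0, 2.0], [0.5, 1.5]
    print(summed(u, v), product(u, v))
```

A combined kernel of this kind can be passed to any SVM implementation that accepts a user-supplied kernel function or a precomputed Gram matrix.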
500	Multimodal radiomics in neuro-oncology / Radiomique multimodale en neuro-oncologie
Upadhaya, Taman 02 May 2017
Glioblastoma multiforme (GBM) is a WHO grade IV tumor that represents 49% of all brain tumors. Despite aggressive treatment modalities (radiotherapy, chemotherapy, and surgical resection), the prognosis is poor, with a median overall survival (OS) of 12 to 14 months. Non-invasive neuroimaging features of GBM can provide opportunities for subclassification, prognostication, and the development of targeted therapies that could advance clinical practice. This thesis focuses on developing a prognostic model based on radiomics derived from multimodal MRI (T1 pre- and post-contrast, T2, and FLAIR) in GBM. The proposed methodological framework consists of i) registering the available 3D multimodal MR images and segmenting the tumor volume, ii) extracting radiomics, and iii) building and validating a prognostic model using machine learning algorithms applied to multicentric clinical cohorts of patients.
The core of the framework relies on extracting radiomics (including intensity, shape, and textural metrics) and building prognostic models using two machine learning algorithms, Support Vector Machine (SVM) and Random Forest (RF), which were compared in their ability to select, rank, and combine optimal features. The potential benefits and respective impact of several MRI pre-processing steps (spatial resampling of the voxels, intensity quantization and normalization, segmentation) on the reliable extraction of radiomics were thoroughly assessed. Moreover, the radiomics features were standardized across methodological teams through a contribution to the "Multicentre Initiative for Standardisation of Radiomics". Depending on the modalities used and the number of features combined, the accuracy obtained on the independent test dataset with SVM and RF reached 77% to 83% when combining all available features, and 77% to 87% when using only features previously identified as reliable.
This thesis proposes, develops, and validates a framework for building prognostic models for patients with GBM from multimodal MRI-derived radiomics and machine learning. Future work will consist of building a unified prognostic model that exploits other contextual data such as genomics. On the algorithmic side, ensemble models and deep learning-based techniques are promising directions.
Glioblastoma multiforme; Prognosis; Radiomics; Machine learning; Support vector machines; Random forest; Prognostic model; SVM; RF; Feature selection; Supervised learning; 616.994 81
