• About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
361

New support vector machine formulations and algorithms with application to biomedical data analysis

Guan, Wei 13 June 2011 (has links)
The Support Vector Machine (SVM) classifier seeks the separating hyperplane wx = r that maximizes the margin distance 1/||w||₂². It can be formalized as an optimization problem that minimizes the hinge loss Σᵢ (1 − yᵢ f(xᵢ))₊ plus the L₂-norm of the weight vector. SVM is now a mainstay method of machine learning. The goal of this dissertation is to solve different biomedical data analysis problems efficiently using extensions of SVM, in which we augment the standard SVM formulation based on the application requirements. The biomedical applications we explore in this thesis include cancer diagnosis, biomarker discovery, and energy function learning for protein structure prediction. Ovarian cancer diagnosis is problematic because the disease is typically asymptomatic, especially at early stages of progression and/or recurrence. We investigate a sample set consisting of 44 women diagnosed with serous papillary ovarian cancer and 50 healthy women or women with benign conditions. We profile the relative metabolite levels in the patient sera using a high-throughput ambient ionization mass spectrometry technique, Direct Analysis in Real Time (DART). We then reduce the diagnostic classification on these metabolic profiles to a functional classification problem and solve it with the functional Support Vector Machine (fSVM) method. The assay distinguished between the cancer and control groups with an unprecedented 99% accuracy (100% sensitivity, 98% specificity) under leave-one-out cross-validation. This approach has significant clinical potential as a cancer diagnostic tool. High-throughput technologies provide simultaneous evaluation of thousands of potential biomarkers to distinguish different patient groups. To assist biomarker discovery from these low-sample-size, high-dimensional cancer data, we first explore a convex relaxation of the L₀-SVM problem and solve it using mixed-integer programming techniques.
We further propose a more efficient L₀-SVM approximation, fractional norm SVM, by replacing the L₂-penalty with an L_q-penalty (q in (0,1)) in the optimization formulation. We solve it through the Difference of Convex functions (DC) programming technique. Empirical studies on synthetic data sets as well as real-world biomedical data sets support the effectiveness of our proposed L₀-SVM approximation methods over other commonly used sparse SVM methods such as the L₁-SVM method. A critical open problem in ab initio protein folding is protein energy function design. We reduce the problem of learning an energy function for ab initio folding to a standard machine learning problem, learning-to-rank. Based on the application requirements, we constrain the reduced ranking problem with non-negative weights and develop two efficient algorithms for non-negativity constrained SVM optimization. We conduct the empirical study on an energy data set for random conformations of 171 proteins that fall into the ab initio folding class. We compare our approach with the optimization approach used in the protein structure prediction tool TASSER. Numerical results indicate that our approach was able to learn energy functions with improved rank statistics (evaluated by pairwise agreement) as well as improved correlation between the total energy and structural dissimilarity.
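The objective described in this abstract (hinge loss plus a norm penalty, with the L₂-penalty swapped for an L_q-penalty) can be illustrated with a small sketch. This is not the dissertation's implementation: the trade-off constant `C`, the penalty form `sum(|w_j|^q)`, and all function names are assumptions made for illustration, and the code only evaluates the objective rather than solving it with DC programming.

```python
def hinge_loss(w, r, X, y):
    """Sum of hinge losses (1 - y_i * f(x_i))_+ with f(x) = w.x - r."""
    total = 0.0
    for xi, yi in zip(X, y):
        score = sum(wj * xj for wj, xj in zip(w, xi)) - r
        total += max(0.0, 1.0 - yi * score)
    return total


def penalty(w, q):
    """Assumed penalty sum_j |w_j|^q: q = 2 is the squared L2 norm,
    0 < q < 1 is a fractional-norm penalty that favors sparse weights."""
    return sum(abs(wj) ** q for wj in w)


def svm_objective(w, r, X, y, C=1.0, q=2.0):
    """Penalty plus C times the hinge loss (C is a hypothetical trade-off)."""
    return penalty(w, q) + C * hinge_loss(w, r, X, y)


if __name__ == "__main__":
    X = [[2.0, 0.0], [-2.0, 0.0]]
    y = [1, -1]
    print(svm_objective([0.25, 0.0], 0.0, X, y, q=2.0))
    print(svm_objective([0.25, 0.0], 0.0, X, y, q=0.5))
```

For a fixed weight vector, lowering `q` changes only the penalty term, which is why the fractional norm acts as a closer stand-in for counting non-zero weights than the L₂ or L₁ penalties do.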
362

以財務比率、共同比分析和公司治理指標預測 上市公司財務危機之基因演算法與支持向量機的計算模型 / Applying Genetic Algorithms and Support Vector Machines for Predicting Financial Distresses with Financial Ratios and Features for Common-Size Analysis and Corporate Governance

黃珮雯, Huang, Pei-Wen Unknown Date (has links)
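Many techniques have been applied to build models for predicting financial distress, such as multivariate statistical analysis and classification techniques like neural networks. Most of these early prediction models use financial ratios as variables. However, frauds such as Enron and WorldCom showed that financial ratios computed from reported figures have an inherent limitation: they cannot give a timely warning when management deliberately inflates earnings. This thesis therefore makes a preliminary exploration of common-size analysis, corporate governance, and the traditional Altman financial ratios, trying to overcome the limits of financial ratios in financial distress prediction and to select feature sets that may improve prediction. We then apply a genetic algorithm to screen qualitative and non-qualitative features, expecting that the crossover process, in which offspring inherit the best genes of their parents, will maximize the fitness of the offspring and find the best combination of features. This feature set is then used to train a support vector machine prediction model, in order to improve prediction performance and reduce losses to the public. Experimental results show that features related to common-size analysis and corporate governance do improve the performance of financial distress prediction models. Genetic algorithms should be used to try more combinations of qualitative and non-qualitative features, so that distressed companies can be flagged early and social costs reduced.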
363

兩階段特徵選取法在蛋白質質譜儀資料之應用 / A Two-Stage Approach of Feature Selection on Proteomic Spectra Data

王健源, Wang,Chien-yuan Unknown Date (has links)
Early detection and diagnosis can effectively reduce the mortality of cancer. The discovery of biomarkers for the early detection and diagnosis of cancer is thus an important task. In this study, a real proteomic spectra data set of prostate cancer patients and normal patients was analyzed. The data were collected from a Surface-Enhanced Laser Desorption/Ionization Time-Of-Flight Mass Spectrometry (SELDI-TOF MS) experiment. The SELDI-TOF MS technology captures protein features in a biological sample. Without suitable pre-processing steps to remove experimental noise, a mass spectrum could consist of hundreds or thousands of peaks. To narrow down the search for possible protein biomarkers, only those features that can distinguish between cancer and normal patients are selected. The Genetic Algorithm (GA) is a global optimization procedure that uses an analogy of the genetic evolution of biological organisms. GA has been shown to be effective in searching complex high-dimensional spaces. In this study, we consider a GA-Like algorithm (GAL) for feature selection on proteomic spectra data to classify prostate cancer patients versus normal patients. In addition, we propose two types of Two-Stage GAL algorithm (TSGAL) to address the shortcomings of GAL.
364

Semantic Assisted, Multiresolution Image Retrieval in 3D Brain MR Volumes

Quddus, Azhar January 2010 (has links)
Content Based Image Retrieval (CBIR) is an important research area in the field of multimedia information retrieval. The application of CBIR in the medical domain has been attempted before, however the use of CBIR in medical diagnostics is a daunting task. The goal of diagnostic medical image retrieval is to provide diagnostic support by displaying relevant past cases, along with proven pathologies as ground truths. Moreover, medical image retrieval can be extremely useful as a training tool for medical students and residents, follow-up studies, and for research purposes. Despite the presence of an impressive amount of research in the area of CBIR, its acceptance for mainstream and practical applications is quite limited. The research in CBIR has mostly been conducted as an academic pursuit, rather than for providing the solution to a need. For example, many researchers proposed CBIR systems where the image database consists of images belonging to a heterogeneous mixture of man-made objects and natural scenes while ignoring the practical uses of such systems. Furthermore, the intended use of CBIR systems is important in addressing the problem of "Semantic Gap". Indeed, the requirements for the semantics in an image retrieval system for pathological applications are quite different from those intended for training and education. Moreover, many researchers have underestimated the level of accuracy required for a useful and practical image retrieval system. The human eye is extremely dexterous and efficient in visual information processing; consequently, CBIR systems should be highly precise in image retrieval so as to be useful to human users. Unsurprisingly, due to these and other reasons, most of the proposed systems have not found useful real world applications. In this dissertation, an attempt is made to address the challenging problem of developing a retrieval system for medical diagnostics applications. 
More specifically, a system for semantic retrieval of Magnetic Resonance (MR) images in 3D brain volumes is proposed. The proposed retrieval system has the potential to be useful for clinical experts where the human eye may fail. Previously proposed systems used imprecise segmentation and feature extraction techniques, which are not suitable for the precise matching requirements of image retrieval in this application domain. This dissertation uses a multiscale representation for image retrieval, which is robust against noise and MR inhomogeneity. In order to achieve a higher degree of accuracy in the presence of misalignments, an image registration based retrieval framework is developed. Additionally, to speed up the retrieval system, a fast discrete wavelet based feature space is proposed. Further improvement in speed is achieved by semantically classifying the human brain into various "Semantic Regions", using an SVM based machine learning approach. A novel and fast identification system is proposed for identifying a 3D volume given a 2D image slice. To this end, we used SVM output probabilities for ranking and identification of patient volumes. The proposed retrieval systems are tested not only under noise conditions but also on healthy and abnormal cases, resulting in promising retrieval performance with respect to multi-modality, accuracy, speed and robustness. This dissertation furnishes medical practitioners with a valuable set of tools for semantic retrieval of 2D images, where the human eye may fail. Specifically, the proposed retrieval algorithms give medical practitioners the ability to retrieve 2D MR brain images accurately and to monitor disease progression in various lobes of the human brain, including in multiple patients simultaneously.
Additionally, the proposed semantic classification scheme can be extremely useful for semantic based categorization, clustering and annotation of images in MR brain databases. This research framework may evolve in a natural progression towards developing more powerful and robust retrieval systems. It also provides a foundation to researchers in semantic based retrieval systems on how to expand existing toolsets for solving retrieval problems.
365

運用雲端運算於智慧型健保費用異常偵測之研究 / A Research into Intelligent Cloud Computing Techniques for Detecting Anomalous Health-insurance Expenses

黃聖尹, Huang, Sheng Yin Unknown Date (has links)
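National health insurance expenses in our country have grown steadily, giving rise to many problems. Among them, inflated claims, false claims, and fraud waste substantial medical resources. Current computerized file analysis can detect only inflated and false claims; it cannot detect fraud. Fraud detection relies on traditional random sampling and manual analysis, and with an average of about 350 million outpatient claims submitted for review each year, the manpower burden is very heavy. This study therefore explores how to use computer tools for a preliminary assessment of the expense claims filed by medical institutions. Through an extensive literature review, this study found that research in the United States reported good results from combining Benford's law with intelligent methods for fraud detection (Busta & Weinberg 1998). Benford's law states that data from many sources exhibit a specific digit frequency distribution, and in recent years it has been applied in fraud review processes in many fields. This study uses Apache Hadoop and related projects to build an environment for storing and analyzing large volumes of data, and uses it to analyze a large set of health insurance claim data. The system combines Benford's law digit analysis with a Support Vector Machine (SVM) to perform a large-scale preliminary computer review of claims, judging whether a medical institution has filed anomalous claims, and passes the preliminary results to the relevant auditors at the Bureau of National Health Insurance for in-depth review. The intelligent anomaly detection model built in this study combines indicator variables derived from Benford's law with practical indicator variables, and uses SVM to analyze historical claim data and produce a prediction model, which can then be used to judge whether future claim data are anomalous. In identifying anomalous data, the model reached an overall accuracy of 97.7995%, and all anomalous claims were predicted correctly. This study therefore hopes to combine Benford's law with intelligent computing methods for claim anomaly detection, so that preliminary review can be done by computer, reducing both the uncertainty caused by traditional random sampling and the excessive manpower spent reviewing large volumes of health insurance data.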
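As a rough illustration of the digit analysis mentioned in this abstract, the sketch below compares the observed first-digit frequencies of a list of amounts with the distribution given by Benford's law, P(d) = log10(1 + 1/d). The mean absolute deviation used here is only a hypothetical example of a "Benford-derived indicator"; the abstract does not say which indicators the thesis actually feeds to the SVM, and the function names are invented for this sketch.

```python
import math
from collections import Counter


def benford_expected():
    """Expected first-digit probabilities under Benford's law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(x):
    """Leading non-zero digit of a non-zero number."""
    x = abs(x)
    if x == 0:
        raise ValueError("zero has no leading digit")
    while x >= 10:
        x /= 10
    while x < 1:
        x *= 10
    return int(x)


def first_digit_freq(values):
    """Observed relative frequency of each leading digit 1..9."""
    digits = [first_digit(v) for v in values if v != 0]
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


def mean_abs_deviation(values):
    """Hypothetical indicator: mean |observed - expected| over digits 1..9."""
    expected = benford_expected()
    observed = first_digit_freq(values)
    return sum(abs(observed[d] - expected[d]) for d in range(1, 10)) / 9


if __name__ == "__main__":
    benford_like = [2 ** k for k in range(1, 200)]   # powers of 2 follow Benford closely
    suspicious = [5000 + k for k in range(200)]      # every amount starts with 5
    print(mean_abs_deviation(benford_like))
    print(mean_abs_deviation(suspicious))
```

A per-institution value of such an indicator could be one column in the feature matrix given to a classifier, alongside the practical indicators the abstract mentions.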
366

Reconhecimento de produtos por imagem utilizando palavras visuais e redes neurais convolucionais / Image recognition of products using bag of visual words and convolutional neural networks

Juraszek, Guilherme Defreitas 15 December 2014 (has links)
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / The popularization of electronic devices such as cameras and smartphones has resulted in an increasing volume of images and videos available on the internet. This scenario has allowed researchers to explore new search and retrieval techniques that use not only the widely available text but also information extracted directly from images and videos. In this work three image recognition techniques are compared: Bag of Features, or Bag of Visual Words (BOVW), using artificial descriptors; Convolutional Neural Networks (CNN); and CNN as a natural descriptor, where the descriptors are obtained from a large CNN pre-trained on a different dataset. The techniques are applied to the problem of recognizing products through image analysis. They can be used in product search applications on smartphones and smart glasses, product recognition in videos, and others. The BOVW technique is demonstrated using the artificial descriptors SIFT, SURF and MSER, with dense and interest-point-based extraction. KMeans and unsupervised Optimum-Path Forest (OPF-U) are used for clustering, and supervised Optimum-Path Forest (OPF-S) and Support Vector Machines (SVM) are used for classification. The second technique uses a CNN with three convolutional layers. The third technique uses Overfeat, a large CNN with five convolutional layers pre-trained on the ImageNet dataset, to extract a feature vector from the new image dataset. This feature vector acts as a natural descriptor and is then classified using OPF-S and SVM.
Accuracy, total processing time, clustering time (KMeans and OPF-U), and classification time (OPF-S and SVM) are evaluated on the Caltech 101 dataset and on a dataset of product images created by the author (RecogProd). The effect of image size, number of categories and parameter choices on accuracy is also evaluated. The results show that the CNN (Overfeat), pre-trained on a different large dataset, used to extract the natural descriptor of the new dataset and combined with an SVM classifier, achieved the best accuracy: 0.855 on Caltech 101 and 0.905 on the author's dataset, on a scale of 0 to 1. The CNN created and trained entirely by the author showed the second best result, with an accuracy of 0.710 using the RGB color space on the author's dataset and 0.540 using the YUV color space on Caltech 101. The CNNs using RGB and YUV showed similar accuracies, but the CNN using YUV images took significantly less time to train. The BOVW technique resulted in lower accuracy than the other two techniques on both datasets. In the experiments on the author's dataset with different numbers of categories (5, 10, 15, 36), the CNN as a natural descriptor again gave the best accuracy of the tested techniques. It was also the most robust: as the number of categories increased, its accuracy decayed less than that of the other techniques. On a dataset with 5 categories, the CNN as a natural descriptor recognized all the images correctly.
367

Multi-label Classification with Multiple Label Correlation Orders And Structures

Posinasetty, Anusha January 2016 (has links) (PDF)
Multilabel classification has attracted much interest in recent times due to the wide applicability of the problem and the challenges involved in learning a classifier for multilabeled data. A crucial aspect of multilabel classification is to discover the structure and order of correlations among labels and their effect on the quality of the classifier. In this work, we propose a structural Support Vector Machine (structural SVM) based framework which enables us to systematically investigate the importance of label correlations in multi-label classification. The proposed framework is very flexible and provides a unified approach to handle multiple correlation orders and structures in an adaptive manner and helps to effectively assess the importance of label correlations in improving the generalization performance. We perform extensive empirical evaluation on several datasets from different domains and present results on various performance metrics. Our experiments provide for the first time, interesting insights into the following questions: a) Are label correlations always beneficial in multilabel classification? b) What effect do label correlations have on multiple performance metrics typically used in multilabel classification? c) Is label correlation order significant and if so, what would be the favorable correlation order for a given dataset and a given performance metric? and d) Can we make useful suggestions on the label correlation structure?
368

Réseaux de neurones, SVM et approches locales pour la prévision de séries temporelles / Neural networks, SVM and local approaches for time series forecasting

Cherif, Aymen 16 July 2013 (has links)
Time series forecasting has been studied for many years. Researchers from various disciplines have addressed it in several application areas: finance, medicine, transportation, etc. In this thesis, we focus on machine learning methods: neural networks and SVM. We are also interested in meta-methods that improve predictor performance, specifically the local approach. Following a divide-and-conquer strategy, local approaches cluster the data before assigning a predictor to each resulting subset. We present a modification of the learning algorithm for recurrent neural networks that adapts them to this approach. We also propose two novel clustering techniques suitable for local models: the first is based on Kohonen maps, and the second on binary trees.
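The local approach described in this abstract (cluster the data, then attach one predictor to each cluster) can be sketched in a few lines. To keep the example short, plain k-means stands in for the Kohonen maps and binary trees proposed in the thesis, and each cluster's "predictor" is simply the mean of the next values seen in that cluster rather than a neural network or SVM. All names and the initialization rule are assumptions made for this illustration.

```python
def make_windows(series, lag):
    """Pairs (window of `lag` past values, next value)."""
    return [(series[i:i + lag], series[i + lag]) for i in range(len(series) - lag)]


def sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centroids, x):
    """Index of the centroid closest to window x."""
    return min(range(len(centroids)), key=lambda k: sqdist(centroids[k], x))


def kmeans(points, k, iters=20):
    """Plain k-means with a deterministic, evenly spaced initialization."""
    centroids = [list(points[(i * len(points)) // k]) for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(centroids, p)].append(p)
        for j, g in enumerate(groups):
            if g:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(g) for col in zip(*g)]
    return centroids


def fit_local(series, lag=2, k=2):
    """Cluster the windows, then fit one trivial predictor per cluster."""
    windows = make_windows(series, lag)
    centroids = kmeans([w for w, _ in windows], k)
    targets = [[] for _ in range(k)]
    for w, t in windows:
        targets[nearest(centroids, w)].append(t)
    overall = sum(t for _, t in windows) / len(windows)
    means = [sum(ts) / len(ts) if ts else overall for ts in targets]
    return centroids, means


def predict(model, window):
    """Route the window to its cluster and use that cluster's predictor."""
    centroids, means = model
    return means[nearest(centroids, window)]


if __name__ == "__main__":
    series = [0, 0, 0, 0, 0, 0, 10, 10, 10, 10, 10, 10]
    model = fit_local(series, lag=2, k=2)
    print(predict(model, [10, 10]), predict(model, [0, 0]))
```

Swapping the per-cluster mean for a trained regressor leaves the routing logic in `predict` unchanged, which is the main appeal of the divide-and-conquer structure.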
369

Smart control of a soft robotic hand prosthesis / Contrôle intelligent d’une prothèse de main robotique souple

Rubiano Fonseca, Astrid 09 December 2016 (has links)
The aim of this thesis is to develop smart control for a soft robotic hand prosthesis called the ProMain Hand, which is characterized by (i) flexible interaction with the grasped object and (ii) friendly, intuitive interaction between the human and the robotic hand. Flexible interaction results from the synergies between rigid bodies, soft bodies, and the actuation mechanism. The ProMain hand has three fingers, each with three phalanges: proximal, medial and distal.
The proximal and medial phalanges are built with rigid bodies, and the distal phalange is fabricated from a deformable material. The soft distal phalange carries a new smart force sensor, created to detect contact and force at the fingertip and so facilitate control of the hand. The friendly, intuitive human-hand interaction is developed to make the hand easier to use. It is driven by a controller that uses superficial electromyographic (sEMG) signals measured on the forearm with a wearable device. The wearable device, called MyoArmband, is placed around the forearm near the elbow joint. Based on the signals transmitted by the wearable device, the beginning of a movement is detected automatically by analyzing the entropy behavior of the EMG signals with artificial intelligence. Then, three selected grasping gestures are recognized with the following methodology: (i) learning the patient's entropy patterns from electromyographic signals captured during the execution of the selected grasping gestures, and (ii) running a support vector machine classifier on raw entropy data extracted in real time from the electromyographic signals. The work also covers hybrid force-position control strategies for the soft robotic hand, based on detection of contact with the object and hybrid control theory.
370

Vision-based moving pedestrian recognition from imprecise and uncertain data / Reconnaissance de piétons par vision à partir de données imprécises et incertaines

Zhou, Dingfu 05 December 2014 (has links)
Building vision-based Advanced Driver Assistance Systems (ADAS) is a complex and challenging task in real-world traffic scenarios. ADAS aim at perceiving and understanding the environment surrounding the ego-vehicle and providing the necessary assistance for drivers facing an emergency. In this thesis, we focus on detecting and recognizing moving objects, because their dynamics make them less predictable and therefore more dangerous than static ones. Detecting these objects, estimating their positions and recognizing their categories are important for ADAS and autonomous navigation. Consequently, we propose to build a complete system for moving object detection and recognition based only on vision sensors. The proposed approach can detect any kind of moving object based on two adjacent frames only. The core idea is to detect the moving pixels by using the Residual Image Motion Flow (RIMF). The RIMF is defined as the residual image changes caused by moving objects once the camera motion has been compensated. In order to detect all kinds of motion robustly and remove false positive detections, uncertainties in the ego-motion estimation and disparity computation must also be considered. The main steps of our algorithm are the following. First, the relative camera pose is estimated by minimizing the sum of the reprojection errors of matched features, and its covariance matrix is calculated using a first-order error propagation strategy. Next, a motion likelihood for each pixel is obtained by propagating the uncertainties of the ego-motion and disparity to the RIMF. Finally, the motion likelihood and the depth gradient are used in a graph-cut-based approach to obtain the segmentation of moving objects. At the same time, the bounding boxes of moving objects are generated based on the U-disparity map.
After obtaining the bounding box of a moving object, we classify the object as a pedestrian or not. Compared to supervised classification algorithms (such as boosting and SVM), which require a large number of labeled training instances, our proposed semi-supervised boosting algorithm is trained with only a few labeled instances and many unlabeled instances. The labeled instances are first used to estimate probabilistic class labels for the unlabeled instances, using Gaussian Mixture Models after a dimension reduction step performed via Principal Component Analysis. Then, we apply a boosting strategy on decision stumps trained using the resulting soft-labeled instances. The performance of the proposed method is evaluated on several reference classification datasets, as well as on a pedestrian detection and recognition problem. Finally, both our moving object detection and recognition algorithms are tested on the public KITTI image dataset, and the experimental results show that the proposed methods achieve good performance in different urban driving scenarios.
