Sumarização multidocumento com base em aspectos informativos / Multidocument summarization based on information aspects

Garay, Alessandro Yovan Bokan 20 August 2015 (has links)
Multi-document summarization is the task of automatically producing a single summary from a group of texts on the same topic. Given the huge amount of information available on the Web, this task is highly relevant because it can ease users' reading. Informative aspects represent the basic information units in texts and summaries; for example, news texts reporting an event typically convey what happened, when it happened, where it happened, how it happened, and why it happened. Knowing these aspects and the strategies for producing and organizing summaries, it is possible to automate aspect-based summarization. However, there is no research on aspect-based multi-document summarization for Brazilian Portuguese. This master's research therefore investigates multi-document summarization methods based on informative aspects, following the deep approach to summarization, which aims at interpreting the texts to produce more informative summaries. In particular, two related stages were developed: (i) the automatic identification of informative aspects and (ii) the development and evaluation of two summarization methods based on aspect patterns (or templates) in summaries.
In step (i), aspect classifiers were built using a semantic role labeler, a named entity recognizer, handcrafted rules, and machine learning techniques, and were evaluated on the annotated CSTNews corpus (Rassi et al., 2013; Felippo et al., 2014). The results were satisfactory, demonstrating that some aspects can be automatically identified in news texts with reasonable performance. In step (ii), two novel aspect-based multi-document summarization methods were developed. The results show that the proposed methods are competitive with methods from the literature. It is worth noting that this approach to summarization has recently received considerable attention; moreover, it is unprecedented among summarization work developed in Brazil and has the potential to bring important contributions to the area.
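As an illustration of the aspect-identification step, the sketch below maps named-entity labels to informative aspects (who/where/when). It is a minimal stand-in, assuming spaCy and its small English model; the thesis itself combines semantic role labeling, NER, manual rules, and machine learning over Portuguese CSTNews texts.

```python
# Minimal sketch: named entities mapped to informative aspects.
# Assumes spaCy with the en_core_web_sm model installed.
import spacy

# Hypothetical mapping from entity labels to aspects.
ASPECT_MAP = {
    "PERSON": "who", "ORG": "who",
    "GPE": "where", "LOC": "where",
    "DATE": "when", "TIME": "when",
}

nlp = spacy.load("en_core_web_sm")

def identify_aspects(text: str) -> dict:
    """Group entity mentions under the aspect they are likely to fill."""
    aspects = {}
    for ent in nlp(text).ents:
        aspect = ASPECT_MAP.get(ent.label_)
        if aspect:
            aspects.setdefault(aspect, []).append(ent.text)
    return aspects

print(identify_aspects(
    "A flood hit Porto Alegre on Monday, according to civil defense officials."
))
# e.g. {'where': ['Porto Alegre'], 'when': ['Monday']}
```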

Performance comparison of support vector machine and relevance vector machine classifiers for functional MRI data

Perez, Daniel Antonio 12 July 2010 (has links)
Multivariate pattern analysis (MVPA) of fMRI data has been growing in popularity due to its sensitivity to networks of brain activation. It is performed in a predictive modeling framework, which is natural for brain-state prediction and real-time fMRI applications such as brain-computer interfaces. Support vector machines (SVMs) have been particularly popular for MVPA owing to their high prediction accuracy even on noisy datasets. Recent work has proposed relevance vector machines (RVMs) as an alternative to SVMs. RVMs are particularly attractive in time-sensitive applications such as real-time fMRI, since they tend to perform classification faster than SVMs. Despite the use of both methods in fMRI research, little has been done to compare their performance. This study compares RVMs to SVMs in terms of time and accuracy to determine which is better suited to real-time applications.
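A minimal sketch of such a time/accuracy comparison, assuming scikit-learn; no RVM ships with scikit-learn, so any scikit-learn-compatible RVM implementation would be plugged in where indicated.

```python
# Timing/accuracy harness for comparing two classifiers on synthetic
# fMRI-like data (many features, few samples). Assumes scikit-learn.
import time
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

def benchmark(name, model):
    t0 = time.perf_counter()
    model.fit(X_tr, y_tr)
    t_fit = time.perf_counter() - t0
    t0 = time.perf_counter()
    acc = model.score(X_te, y_te)
    t_pred = time.perf_counter() - t0
    print(f"{name}: acc={acc:.3f} fit={t_fit:.3f}s predict={t_pred:.3f}s")

benchmark("SVM", SVC(kernel="linear"))
# benchmark("RVM", rvm_model)  # plug in an RVM implementation here
```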

Learning without labels and nonnegative tensor factorization

Balasubramanian, Krishnakumar 08 April 2010 (has links)
Supervised learning tasks, such as building a classifier or estimating the error rate of a predictor, are typically performed with labeled data. In most cases, obtaining labeled data is costly because it requires manual labeling, whereas unlabeled data is available in abundance. In this thesis, we discuss methods for performing supervised learning tasks with no labeled data. We prove consistency of the proposed methods and demonstrate their applicability with synthetic and real-world experiments. In some cases, small quantities of labeled data may be easily available and supplemented with large quantities of unlabeled data (semi-supervised learning). We derive the asymptotic efficiency of generative models for semi-supervised learning and quantify the effect of labeled and unlabeled data on the quality of the estimate. Another independent track of the thesis is efficient computational methods for nonnegative tensor factorization (NTF). NTF provides the user with rich modeling capabilities, but it comes with an added computational cost. We provide a fast algorithm for performing NTF using a modified active-set method called block principal pivoting and demonstrate its applicability to social network analysis and text mining.
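To make the NTF objective concrete, the sketch below factorizes a 3-way tensor with plain multiplicative updates — a simpler baseline than the block principal pivoting (active-set) method proposed in the thesis.

```python
# Toy nonnegative CP factorization of a 3-way tensor via multiplicative
# updates (a baseline, not the thesis's block principal pivoting method).
import numpy as np

def ntf_3way(T, rank, n_iter=200, eps=1e-9):
    """Factor T (I x J x K) into nonnegative A (I x r), B (J x r), C (K x r)."""
    I, J, K = T.shape
    rng = np.random.default_rng(0)
    A, B, C = (rng.random((n, rank)) for n in (I, J, K))
    for _ in range(n_iter):
        # Mode-1 unfolding: T1[i, j*K + k] = T[i, j, k] ~ A @ KR.T
        T1 = T.reshape(I, J * K)
        KR = np.einsum("jr,kr->jkr", B, C).reshape(J * K, rank)
        A *= (T1 @ KR) / (A @ (KR.T @ KR) + eps)
        T2 = T.transpose(1, 0, 2).reshape(J, I * K)
        KR = np.einsum("ir,kr->ikr", A, C).reshape(I * K, rank)
        B *= (T2 @ KR) / (B @ (KR.T @ KR) + eps)
        T3 = T.transpose(2, 0, 1).reshape(K, I * J)
        KR = np.einsum("ir,jr->ijr", A, B).reshape(I * J, rank)
        C *= (T3 @ KR) / (C @ (KR.T @ KR) + eps)
    return A, B, C

T = np.abs(np.random.default_rng(1).random((5, 6, 7)))
A, B, C = ntf_3way(T, rank=3)
```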

Functional data mining with multiscale statistical procedures

Lee, Kichun 01 July 2010 (has links)
The Hurst exponent and variance are two quantities that often characterize real-life, high-frequency observations. We develop a method for the simultaneous estimation of a time-varying Hurst exponent H(t) and a constant scale (variance) parameter C in a multifractional Brownian motion model in the presence of white noise, based on the asymptotic behavior of the local variation of its sample paths. We also discuss the accuracy of this stable, simultaneous estimator compared with a few selected methods, and the stability of computations that use adapted wavelet filters. Multifractals have become popular as flexible models for high-frequency real-life data. We develop a method for testing whether high-frequency data are consistent with monofractality, using meaningful descriptors derived from a wavelet-generated multifractal spectrum. We discuss the theoretical properties of the descriptors, their computational implementation, their use in data mining, and their effectiveness in the context of simulations, an application in turbulence, and the analysis of coding/noncoding regions in DNA sequences. Wavelet thresholding is a simple and effective operation in the wavelet domain that selects a subset of wavelet coefficients from a noisy signal. We propose selecting this subset in a semi-supervised fashion, in which a neighbor structure and a classification function appropriate for wavelet domains are utilized. The decision to include an unlabeled coefficient in the model depends not only on its magnitude but also on the labeled and unlabeled coefficients from its neighborhood. The theoretical properties of the method are discussed, and its performance is demonstrated on simulated examples.
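A simplified illustration of wavelet-based Hurst estimation, assuming PyWavelets: a constant H is read off the slope of log2 detail energies across scales. The thesis's estimator handles a time-varying H(t) and additive noise, which this sketch does not attempt.

```python
# Constant-H Hurst estimate from the wavelet "logscale diagram":
# for fBm-like signals, detail energy at level j scales as 2^{j(2H+1)}.
import numpy as np
import pywt

def hurst_wavelet(x, wavelet="db3", max_level=6):
    coeffs = pywt.wavedec(x, wavelet, level=max_level)
    details = coeffs[1:]  # wavedec returns [cA_n, cD_n, ..., cD_1]
    levels, log_energy = [], []
    for j, d in enumerate(reversed(details), start=1):  # cD_1 ... cD_n
        levels.append(j)
        log_energy.append(np.log2(np.mean(d ** 2)))
    slope, _ = np.polyfit(levels, log_energy, 1)  # slope ~ 2H + 1
    return (slope - 1) / 2

# Cumulative sum of white noise behaves like Brownian motion, so H ~ 0.5.
x = np.cumsum(np.random.default_rng(0).standard_normal(4096))
print(f"estimated H ~ {hurst_wavelet(x):.2f}")
```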

Enhanced classification approach with semi-supervised learning for reliability-based system design

Patel, Jiten 02 July 2012 (has links)
Traditionally, design engineers have used the factor-of-safety method to ensure that designs do not fail in the field. Access to advanced computational tools and resources has made this process obsolete, and new methods for introducing higher levels of reliability into engineering systems are currently being investigated. However, even with ample computing power available, the computational cost of reliability analysis procedures leaves much to be desired. Furthermore, regression-based surrogate modeling techniques fail when there is discontinuity in the design space, caused by failure mechanisms, when the design is required to perform under severe externalities. Hence, in this research we propose efficient semi-supervised-learning-based surrogate modeling techniques that enable accurate estimation of a system's response, even under discontinuity. These methods combine the available labeled and unlabeled datasets and provide better models than labeled data alone. Labeled data is expensive to obtain, since the responses have to be evaluated, whereas unlabeled data is plentiful during reliability estimation, since the PDF information of the uncertain variables is assumed to be known. This superior performance is gained by combining the efficiency of probabilistic neural networks (PNNs) for classification with the expectation-maximization (EM) algorithm, which treats the unlabeled data as labeled data with hidden labels.
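A minimal sketch of the labeled-plus-unlabeled idea, assuming scikit-learn: a Parzen-window (PNN-style) density per class, refined with EM-style soft labels on the unlabeled points. The toy "failure" labels below are hypothetical, not the thesis's reliability examples.

```python
# PNN-style per-class kernel densities plus EM-style soft relabeling of
# unlabeled samples (class priors omitted for brevity). Assumes scikit-learn.
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)
X_lab = rng.normal(size=(40, 2))
y_lab = (X_lab.sum(axis=1) > 0).astype(int)   # toy "failure" labels
X_unlab = rng.normal(size=(2000, 2))          # cheap: drawn from the known PDF

def class_scores(X_train, weights, X_query, bandwidth=0.5):
    """Weighted KDE log-density per class (weights = soft responsibilities)."""
    scores = []
    for c in (0, 1):
        kde = KernelDensity(bandwidth=bandwidth)
        kde.fit(X_train, sample_weight=weights[:, c])
        scores.append(kde.score_samples(X_query))
    return np.column_stack(scores)

X_all = np.vstack([X_lab, X_unlab])
resp = np.full((len(X_all), 2), 0.5)          # uniform for unlabeled points
resp[: len(y_lab)] = np.eye(2)[y_lab]         # hard labels for labeled points

for _ in range(5):                            # EM-style refinement
    log_p = class_scores(X_all, resp, X_all)
    new_resp = np.exp(log_p - log_p.max(axis=1, keepdims=True))
    new_resp /= new_resp.sum(axis=1, keepdims=True)
    resp[len(y_lab):] = new_resp[len(y_lab):]  # labeled points stay fixed

pred = resp.argmax(axis=1)
```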

Towards a novel medical diagnosis system for clinical decision support system applications

Kanwal, Summrina January 2016 (has links)
Clinical diagnosis of chronic disease is a vital and challenging research problem that requires intensive clinical practice guidelines in order to ensure consistent and efficient patient care. Conventional medical diagnosis systems suffer from certain limitations, such as complex diagnosis processes, lack of expertise, lack of well-described procedures for conducting diagnoses, low computing skills, and so on. An automated clinical decision support system (CDSS) can help physicians and radiologists overcome these challenges by combining the competency of radiologists and physicians with the capabilities of computers. A CDSS depends on many techniques from the fields of image acquisition, image processing, pattern recognition, machine learning, and optimization for medical data analysis in order to produce efficient diagnoses. In this dissertation, we discuss the current challenges in designing an efficient CDSS as well as a number of the latest techniques (while identifying best practices for each stage of the framework) to meet these challenges by finding informative patterns in a medical dataset, analysing them, and building a descriptive model of the object of interest, thus aiding medical diagnosis. To meet these challenges, we propose an extension of the conventional CDSS framework that incorporates artificial immune network (AIN) based hyper-parameter optimization as an integral part of it. We applied the conventional as well as the optimized CDSS to four case studies (most of them comprising medical images) for efficient medical diagnosis and compared the results. The first key contribution is the novel application of a local energy-based shape histogram (LESH) as the feature set for the recognition of abnormalities in mammograms. We investigated the implications of this technique for the mammogram datasets of the Mammographic Image Analysis Society (MIAS) and INbreast. In the evaluation, regions of interest were extracted from the mammograms, their LESH features were calculated, and these were fed to support vector machine (SVM) and echo state network (ESN) classifiers. In addition, the impact of selecting a subset of LESH features based on classification performance was also observed and benchmarked against a state-of-the-art wavelet-based feature extraction method. The second key contribution is the application of the LESH technique to detect lung cancer. The JSRT Digital Image Database of chest radiographs was selected for the research experimentation. Prior to LESH feature extraction, we enhanced the radiograph images using a contrast-limited adaptive histogram equalization (CLAHE) approach. Selected state-of-the-art cognitive machine learning classifiers, namely the extreme learning machine (ELM), SVM, and ESN, were then applied to the LESH-extracted features to enable the efficient diagnosis of the correct medical state (the existence of benign or malignant cancer) in the X-ray images. Comparative simulation results, evaluated using the classification accuracy performance measure, were further benchmarked against state-of-the-art wavelet-based features and authenticated the distinct capability of our proposed framework for enhancing the diagnosis outcome. As the third contribution, this thesis presents a novel technique for detecting breast cancer in volumetric medical images based on a three-dimensional (3D) LESH model. It is a hybrid approach that combines the 3D LESH feature extraction technique with machine learning classifiers to detect breast cancer in MRI images.
The proposed system applies CLAHE to the MRI images before extracting the 3D LESH features. Furthermore, a selected subset of features is fed to a machine learning classifier, namely the SVM, ELM, or ESN, to detect abnormalities and to distinguish between different stages of abnormality. The results indicate the high performance of the proposed system. When compared with the wavelet-based feature extraction technique, statistical analysis testifies to the significance of our proposed algorithm. The fourth contribution is a novel application of the AIN for optimizing machine learning classification algorithms as part of the CDSS. We employed the proposed technique in conjunction with selected machine learning classifiers, namely the ELM, SVM, and ESN, and validated it on the benchmark medical datasets of Pima Indians diabetes and BUPA liver disorders, on two-dimensional (2D) medical images, namely MIAS, INbreast, and the JSRT chest radiographs, as well as on the three-dimensional TCGA-BRCA breast MRI dataset. The results were investigated using the classification accuracy measure and the learning time. We also compared our methodology with a benchmarked multi-objective genetic algorithm (ES)-based optimization technique. The results authenticate the potential of the AIN-optimized CDSS.
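A minimal sketch of the CLAHE-then-classify pipeline, assuming OpenCV, scikit-image, and scikit-learn; HOG stands in for the LESH descriptor, which has no standard library implementation.

```python
# CLAHE enhancement followed by feature extraction and an SVM, as a
# stand-in for the LESH-based pipeline described above.
import cv2
import numpy as np
from skimage.feature import hog
from sklearn.svm import SVC

def clahe_enhance(gray_image: np.ndarray) -> np.ndarray:
    """Contrast-limited adaptive histogram equalization (expects 8-bit grayscale)."""
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    return clahe.apply(gray_image)

def extract_features(gray_image: np.ndarray) -> np.ndarray:
    enhanced = clahe_enhance(gray_image)
    # HOG is a placeholder for the LESH descriptor used in the thesis.
    return hog(enhanced, pixels_per_cell=(16, 16), cells_per_block=(2, 2))

# Hypothetical usage on preloaded regions of interest and labels:
# X = np.array([extract_features(roi) for roi in rois])
# clf = SVC(kernel="rbf").fit(X, labels)
```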

Um modelo para recomendação de cursos de especialização baseado no perfil profissional do candidato / A model for recommending specialization courses based on the candidate's professional profile

Souza, Antonio Eduardo Rodrigues de 27 August 2013 (has links)
Economic globalization has made product and service markets more competitive, demanding better-qualified manpower. Consequently, companies need better-qualified professionals to meet specific demands. In this context, specialization courses have become options sought by professionals to acquire and update knowledge. However, the diversification of courses offered by various institutions in various areas, targeted at specific or general audiences, together with the lack of objective information, hinders the understanding of the factors that matter in a candidate's decision. A poorly chosen option may trigger a change of course or even dropping out altogether. The topic is current and relevant to Higher Education Institutions (HEIs), showing the importance of offering specialization courses that are aligned with the competencies of the educational institutions and with professionals' training and retraining interests. This work therefore proposes to study the professional factors that influence candidates in choosing a course and to develop a recommendation model, using artificial intelligence techniques, for practical use in HEIs — one that assists applicants in choosing courses and also supports and guides staff in selecting candidates. A methodology based on the Knowledge Discovery in Databases (KDD) and Cross-Industry Standard Process for Data Mining (CRISP-DM) processes was applied to evaluate historical data on incoming candidates at a private university in the city of São Paulo, and a recommendation model was proposed to identify the course best suited to a candidate's profile, using a decision-tree-based data mining technique to discover relevant knowledge in the database. The completion of this project made it possible to propose the courses best suited to candidates' professional profiles, based on the professional and educational history information considered most important for candidate selection. It is expected that the counseling service will thereby become more accurate, and candidate selection more responsive, helping to reduce the number of withdrawals, dropouts, and course changes in the specialization courses offered by the university under study. A sketch of the decision-tree step follows below.
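A minimal sketch of the decision-tree recommendation step, assuming scikit-learn; the profile features and course labels are hypothetical stand-ins for the professional and educational attributes used in the study.

```python
# Decision-tree course recommendation on toy encoded candidate profiles.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Hypothetical encoded profiles: [years_experience, sector_code, degree_code]
X = np.array([[2, 0, 0], [8, 1, 1], [5, 1, 0], [1, 0, 1], [10, 2, 1]])
y = np.array(["Data Science", "MBA", "MBA", "Data Science", "Finance"])

model = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(model.predict([[6, 1, 1]]))  # recommend a course for a new candidate
```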

Sélection séquentielle en environnement aléatoire appliquée à l'apprentissage supervisé / Sequential selection in a random environment applied to supervised learning

Caelen, Olivier 25 September 2009 (has links)
This thesis addresses problems in which decisions must be made sequentially within a random environment. At each step of such a decision problem, an alternative must be selected from a set of alternatives. Each alternative has its own mean gain, and when one of them is selected, it yields a random gain. The selection can pursue two types of objectives.

In the first case, the trials aim to maximize the sum of the gains collected, and a fair compromise must then be found between exploitation and exploration. This problem is commonly known in the scientific literature as the multi-armed bandit problem.

In the second case, a maximum number of selections is imposed, and the objective is to allocate these selections so as to increase the chances of finding the alternative with the highest mean gain. This second problem is commonly referred to in the literature as "selecting the best".

Greedy selection plays an important role in solving these decision problems; it operates by choosing the alternative that has so far appeared optimal. However, the generally random nature of the environment makes the outcome of such a selection uncertain.

In this thesis, we introduce a new quantity, called the "expected gain of a greedy action". Based on several properties of this quantity, new algorithms for solving the two decision problems above are proposed.

Particular attention is paid to applying the presented techniques to model selection in supervised machine learning.

A collaboration with the anesthesiology department of the Erasme Hospital allowed us to apply the proposed algorithms to real data from the medical field. We also developed a decision support system, a prototype of which has already been tested under real conditions on a small sample of patients.
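For context, the sketch below implements the standard epsilon-greedy baseline for the multi-armed bandit setting described above; it is not the thesis's expected-gain-of-a-greedy-action algorithm, only the exploration/exploitation trade-off made concrete.

```python
# Epsilon-greedy multi-armed bandit: mostly exploit the empirically best
# alternative, occasionally explore a random one.
import numpy as np

rng = np.random.default_rng(0)
true_means = [0.2, 0.5, 0.7]             # unknown mean gain of each alternative
counts = np.zeros(3)
estimates = np.zeros(3)
total_gain, epsilon = 0.0, 0.1

for t in range(10_000):
    if rng.random() < epsilon:
        arm = int(rng.integers(3))       # explore
    else:
        arm = int(np.argmax(estimates))  # exploit the greedy choice
    gain = rng.normal(true_means[arm], 1.0)  # random gain of the alternative
    counts[arm] += 1
    estimates[arm] += (gain - estimates[arm]) / counts[arm]  # running mean
    total_gain += gain

print(estimates, total_gain)
```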

Detekce cizích objektů v rentgenových snímcích hrudníku s využitím metod strojového učení / Detection of foreign objects in X-ray chest images using machine learning methods

Matoušková, Barbora January 2021 (has links)
Foreign objects in chest X-ray (CXR) images cause complications during automatic image processing. To prevent errors caused by these foreign objects, it is necessary to find them automatically and omit them from the analysis. They are mainly buttons, jewellery, implants, wires, and tubes. At the same time, finding pacemakers and other implanted devices can help with automatic processing. The aim of this work was to design a method for the detection of foreign objects in CXR images. For this task, the Faster R-CNN method with a pre-trained ResNet50 network for feature extraction was chosen; it was trained on 4,000 images and subsequently tested on 1,000 images from a publicly available database. After finding the optimal learning parameters, the network was trained to achieve 75% precision, 77% recall, and a 76% F1 score. However, part of the error stems from non-uniform object annotations in the data, because not all annotated foreign objects are located in the lung area, as stated in the dataset description.
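A minimal sketch of the model setup described above, using torchvision's Faster R-CNN with a ResNet50-FPN backbone and a replaced box-predictor head; the class count is illustrative, and data loading and training are omitted.

```python
# Faster R-CNN with a pre-trained ResNet50-FPN backbone, with the box
# predictor replaced for a custom number of foreign-object classes.
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# Hypothetical count: background + button, jewellery, implant, wire, tube.
num_classes = 6

# In older torchvision versions, pretrained=True replaces weights="DEFAULT".
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

# Training then follows the usual torchvision detection recipe:
# images are lists of CHW float tensors; targets are dicts with
# "boxes" (N x 4) and "labels" (N,) tensors.
```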
