  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
31

Traitement de données numériques par analyse formelle de concepts et structures de patrons / Mining numerical data with formal concept analysis and pattern structures

Kaytoue, Mehdi 22 April 2011
This thesis addresses the mining of numerical data, and of gene expression data in particular. Such data characterize the behaviour of thousands of genes in various biological situations (time, cell, etc.). A difficult task is to cluster genes into classes with similar behaviour, which are assumed to be involved together in a biological process; this makes it possible, for example, to identify the genes that are active when an organism defends itself against an attack. The thesis therefore belongs to the field of knowledge discovery from biological data. We study how the conceptual classification method called Formal Concept Analysis (FCA) can handle the problem of extracting interesting classes of genes. For this purpose, we designed and tested several original methods based on a little-explored extension of FCA called pattern structures. More precisely, we show how to build a concept lattice that summarizes families of genes with similar behaviour. The originality of this work lies in (i) building a concept lattice efficiently without prior discretization of the data, (ii) introducing a similarity relation between genes, and (iii) proposing minimal sets of necessary and sufficient conditions that explain the resulting groups. Finally, we show that pattern structures can also improve decision making about the hazardousness of agricultural practices, within the broad domain of information fusion.
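To make the idea of building concepts without prior discretization more concrete, here is a minimal Python sketch of interval patterns, one common way pattern structures are applied to numerical data. It is an illustration only, not the thesis's implementation: the gene names, the expression values and the function names (`to_pattern`, `meet`, `describe`, `extent`) are all hypothetical. Each gene is described by a vector of intervals, the similarity of two descriptions is the componentwise smallest interval containing both, and the extent of a pattern is the set of genes whose values fall inside every interval.

```python
from typing import Dict, List, Sequence, Tuple

Interval = Tuple[float, float]
Pattern = List[Interval]


def to_pattern(values: Sequence[float]) -> Pattern:
    """Describe one object by degenerate intervals [v, v]."""
    return [(v, v) for v in values]


def meet(p: Pattern, q: Pattern) -> Pattern:
    """Similarity of two interval patterns: componentwise convex hull."""
    return [(min(a1, a2), max(b1, b2)) for (a1, b1), (a2, b2) in zip(p, q)]


def describe(objects: Sequence[str], data: Dict[str, List[float]]) -> Pattern:
    """Common interval pattern shared by a non-empty set of objects."""
    pattern = to_pattern(data[objects[0]])
    for g in objects[1:]:
        pattern = meet(pattern, to_pattern(data[g]))
    return pattern


def extent(pattern: Pattern, data: Dict[str, List[float]]) -> List[str]:
    """All objects whose values lie inside every interval of the pattern."""
    return sorted(
        g for g, values in data.items()
        if all(lo <= v <= hi for v, (lo, hi) in zip(values, pattern))
    )


# Hypothetical expression values of four genes in three situations.
expression = {
    "g1": [5.0, 7.0, 6.0],
    "g2": [6.0, 8.0, 4.0],
    "g3": [5.5, 7.5, 5.0],
    "g4": [9.0, 1.0, 2.0],
}

shared = describe(["g1", "g2"], expression)
covered = extent(shared, expression)
print(shared)
print(covered)
```

In this toy data the pattern shared by g1 and g2 also covers g3, so the pair (`covered`, `shared`) behaves like a concept: a group of genes together with the value ranges that characterize it.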
32

Praktické uplatnění technologií data mining ve zdravotních pojišťovnách / Practical applications of data mining technologies in health insurance companies

Kulhavý, Lukáš January 2010
This thesis focuses on data mining technology and its possible practical use in health insurance companies. The thesis defines the term data mining and its relation to the term knowledge discovery in databases. Data mining is explained, among other things, through methodologies that describe the individual phases of the knowledge discovery process (CRISP-DM, SEMMA). The thesis also covers possible practical applications, technologies and products available on the market (both free and commercial). An introduction to the main data mining methods and specific algorithms (decision trees, association rules, neural networks and other methods) serves as the theoretical basis on which the practical applications to real data from real health insurance companies are built. These applications seek the causes of increased remittances and predict churn. I solved them in the freely available systems Weka and LISP-Miner. The objective is to introduce and demonstrate data mining capabilities on this type of data, and to demonstrate the capabilities of Weka and LISP-Miner in solving tasks according to the CRISP-DM methodology. The last part of the thesis is devoted to cloud and grid computing in conjunction with data mining. It offers an insight into the possibilities of these technologies and their benefits for data mining. The possibilities of cloud computing are presented on the Amazon EC2 system; grid computing can be used through the Weka Experimenter interface.
33

Análise de grandezas cinemáticas e dinâmicas inerentes à hemiparesia através da descoberta de conhecimento em bases de dados / Analysis of kinematic and dynamic data inherent to hemiparesis through knowledge discovery in databases

Moretti, Caio Benatti 31 March 2016
As a result of higher life expectancy worldwide, natural accidents and physical traumas in everyday life become more likely, which increases the demand for rehabilitation. Physical therapy under the paradigm of robotic rehabilitation with serious games offers the patient greater motivation and engagement with the treatment; its use is recommended by the American Heart Association (AHA), which gives it the highest rating (Level A) for inpatients and outpatients. However, the potential for analysing the data collected by the robotic devices involved is poorly exploited, so information that could be valuable for treatment goes unextracted. This work applies knowledge discovery techniques to classify the performance of patients diagnosed with chronic hemiparesis. The patients were placed in a robotic rehabilitation environment and exercised with the InMotion ARM, a robotic device for upper-limb rehabilitation that also collects performance data. A knowledge discovery in databases roadmap was applied to the collected data: preprocessing, transformation (feature extraction) and then data mining with machine learning algorithms. The strategy culminated in a pattern classifier able to distinguish hemiparetic sides with an accuracy of 94%, using eight attributes as input. Interpreting these attributes showed that force-related data are the most significant, making up half of the composition of a sample.
34

Temporale Aspekte entdeckten Wissens / Temporal aspects of discovered knowledge

Baron, Steffan 06 October 2004
Over the past years the number and size of available datasets have grown significantly, which has made the development of techniques for discovering knowledge in these data a major challenge. Traditionally the emphasis has been on criteria such as performance and scalability; in recent years, however, the temporal dimension of the data has become a focus of interest. Methods have been developed that deal with the maintenance of the discovered knowledge. These approaches are based on the idea that data are often collected over a long period of time and are thus affected by the same changes as the aspects of reality they capture. Hence, changes to the data will also be reflected in changes to the results of analysing the data. It is therefore not sufficient to ensure only that the results are up to date; it is also necessary to capture how the identified patterns develop over time. In this work, knowledge discovery is considered to be a continuous process: data are collected over a potentially long period of time and analysed at specific time intervals. Each analysis produces a set of patterns, which are stored in a rule base and whose development is recorded. Starting from a temporal data model that represents both the content of a pattern and its statistical properties, a general framework for monitoring and analysing the development of the discovered knowledge is proposed. The framework integrates the many different facets of pattern evolution and also provides for trend recognition. It is used to detect and assess different types of pattern change with respect to their qualitative, quantitative and temporal aspects. In addition, it permits the use of the temporal properties of patterns as a criterion for their relevance and makes it possible to determine the causes of the observed changes. Two case studies are presented and discussed in which the proposed concepts are examined thoroughly.
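As a rough illustration of monitoring a pattern through its statistical properties, the Python sketch below recomputes the support and confidence of one association rule in each time period and flags periods where the support shifts by more than a tolerance. It is a simplified stand-in under stated assumptions, not the framework proposed in the thesis: the function names `rule_stats` and `monitor`, the tolerance value and the toy transactions are all hypothetical.

```python
from typing import FrozenSet, List, Set, Tuple

Stats = Tuple[float, float]  # (support, confidence)


def rule_stats(transactions: List[Set[str]],
               lhs: FrozenSet[str], rhs: FrozenSet[str]) -> Stats:
    """Support and confidence of the rule lhs => rhs in one period."""
    n = len(transactions)
    lhs_count = sum(1 for t in transactions if lhs <= t)
    both_count = sum(1 for t in transactions if (lhs | rhs) <= t)
    support = both_count / n if n else 0.0
    confidence = both_count / lhs_count if lhs_count else 0.0
    return support, confidence


def monitor(periods: List[List[Set[str]]], lhs: FrozenSet[str],
            rhs: FrozenSet[str], tolerance: float) -> Tuple[List[Stats], List[int]]:
    """Record the rule's statistics per period and list the periods whose
    support differs from the previous period by more than `tolerance`."""
    history = [rule_stats(p, lhs, rhs) for p in periods]
    changed = [
        i for i in range(1, len(history))
        if abs(history[i][0] - history[i - 1][0]) > tolerance
    ]
    return history, changed


# Three hypothetical analysis periods, each a list of transactions.
periods = [
    [{"a", "b"}, {"a", "b"}, {"a"}, {"c"}],
    [{"a", "b"}, {"a", "b"}, {"b"}, {"c"}],
    [{"a"}, {"a", "b"}, {"c"}, {"c"}],
]

history, changed = monitor(periods, frozenset({"a"}), frozenset({"b"}), tolerance=0.2)
print(history)
print(changed)
```

Only the step from the second to the third period is flagged here, because the support of the rule drops from 0.5 to 0.25 while the first two periods agree. A fuller treatment would also track confidence and changes in the content of the patterns themselves.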
35

Applications of Knowledge Discovery in Quality Registries - Predicting Recurrence of Breast Cancer and Analyzing Non-compliance with a Clinical Guideline

Razavi, Amir Reza January 2007
In medicine, data are produced from different sources and continuously stored in data repositories. Quality registries are examples of these growing databases. In Sweden, there are many cancer registries in which data on cancer patients are gathered and recorded, mainly for reporting survival analyses to high-level health authorities. In this thesis, a breast cancer quality registry operating in the south-east of Sweden is used as the data source for newer analytical techniques, namely data mining as part of the knowledge discovery in databases (KDD) methodology. Analyses sift through these data to find interesting information and hidden knowledge. KDD consists of multiple steps, starting with gathering data from different sources and preparing them in pre-processing stages prior to data mining. Outliers and noise were cleaned from the data, and missing values were handled. A proper subset of the data was then chosen by canonical correlation analysis (CCA) in a dimensionality reduction step. This technique was chosen because there were multiple outcomes and the variables had complex relationships to one another. Once prepared, the data were mined with decision tree induction, a simple and efficient method. To show the benefits of proper pre-processing, results from data mining with pre-processing were compared with results from data mining without it. The comparison showed that pre-processing yields a more compact model that performs better at predicting the recurrence of cancer. An important part of knowledge discovery in medicine is increasing the involvement of medical experts in the process. This starts with enquiring about current problems in their field, which leads to finding areas where computer support can be helpful. 
The experts can suggest potentially important variables and should then approve and validate new patterns or knowledge as predictive or descriptive models. If it can be shown that the performance of a model is comparable to domain experts, it is more probable that the model will be used to support physicians in their daily decision-making. In this thesis, we validated the model by comparing predictions done by data mining and those made by domain experts without finding any significant difference between them. Breast cancer patients who are treated with mastectomy are recommended to receive radiotherapy. This treatment is called postmastectomy radiotherapy (PMRT) and there is a guideline for prescribing it. A history of this treatment is stored in breast cancer registries. We analyzed these datasets using rules from a clinical guideline and identified cases that had not been treated according to the PMRT guideline. Data mining revealed some patterns of non-compliance with the PMRT guideline. Further analysis with data mining revealed some reasons for guideline non-compliance. These patterns were then compared with reasons acquired from manual inspection of patient records. The comparisons showed that patterns resulting from data mining were limited to the stored variables in the registry. A prerequisite for better results is availability of comprehensive datasets. Medicine can take advantage of KDD methodology in different ways. The main advantage is being able to reuse information and explore hidden knowledge that can be obtained using advanced analysis techniques. The results depend on good collaboration between medical informaticians and domain experts and the availability of high quality data.
36

Aplicação do processo de descoberta de conhecimento em dados do poder judiciário do estado do Rio Grande do Sul / Applying the Knowledge Discovery in Database (KDD) Process to Data of the Judiciary Power of Rio Grande do Sul

Schneider, Luís Felipe January 2003
The search for previously unknown, useful knowledge and information in large stored datasets arose from the need to explore the relationships that exist among data. This field was named Knowledge Discovery in Databases (KDD) and was formalized in 1989. KDD is a process made up of stages or phases and is iterative and interactive in nature. This work is based on the CRISP-DM methodology. Regardless of the methodology used, the process has a phase that can be considered the core of KDD, “data mining” (or modeling, in CRISP-DM terms), with which the concept of problem-type class (task) is associated, along with the techniques and algorithms that can be employed in a KDD application. This study highlights the association and clustering classes, the techniques associated with them, and the Apriori and K-means algorithms. All of this is carried out in the chosen data mining tool, Weka (Waikato Environment for Knowledge Analysis). The research plan focuses on applying the KDD process to the Judiciary with respect to its core activity, the judgment of cases, looking for findings on how procedural classification influences the incidence of cases, processing time, the types of sentences handed down and hearing attendance. It also explores the search for defendant profiles in criminal cases according to characteristics such as sex, marital status, level of education, profession and race. Chapters 2 and 3 present the theoretical foundations of KDD and detail the CRISP-DM methodology. Chapter 4 covers the full application to the Judiciary data and, finally, Chapter 5 presents the conclusions.
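For readers unfamiliar with Apriori, the level-wise search it performs can be sketched in a few lines of plain Python. This is not the Weka implementation used in the thesis, and the case attributes below are invented for illustration; the function name `apriori` and the `min_support` value are likewise assumptions. The sketch counts itemsets of size k, keeps those that reach the minimum support, and builds candidates of size k+1 only from itemsets whose subsets were all frequent.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set


def apriori(transactions: List[Set[str]],
            min_support: float) -> Dict[FrozenSet[str], float]:
    """Return every itemset whose relative support is at least min_support."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]
    frequent: Dict[FrozenSet[str], float] = {}
    k = 1
    while candidates:
        # Count each candidate and keep the ones that are frequent enough.
        level = {}
        for c in candidates:
            support = sum(1 for t in transactions if c <= t) / n
            if support >= min_support:
                level[c] = support
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-candidates, then prune any
        # candidate that has an infrequent k-subset (the Apriori property).
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent


# Hypothetical court cases described by categorical attributes.
cases = [
    {"criminal", "hearing", "conviction"},
    {"criminal", "hearing"},
    {"civil", "hearing"},
    {"criminal", "conviction"},
]

frequent = apriori(cases, min_support=0.5)
for itemset, support in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), support)
```

With these four toy cases, the pairs criminal+hearing and criminal+conviction are frequent, while the three-item candidate is pruned because hearing+conviction is not.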
37

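Teorias sociais implícitas nos índices e sistemas de indicadores: uma contribuição estatística ao estudo do desenvolvimento / Implicit social theories in indices and indicator systems: a statistical contribution to the study of development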

Soares Júnior, Jair Sampaio January 2010
Stored public data that had never before been made available to the population can now be freely accessed over the Internet. In parallel, driven by growth in computational capacity for data storage and processing, Knowledge Discovery in Databases (KDD) has become a widely discussed method for extracting knowledge from public databases. This work aims to contribute to the study of human development by assessing the potential of KDD as a method for measuring social development from public information. At the same time, scientific thinking on development is converging more and more on a transdisciplinary, complex and intangible view of the phenomenon. From this perspective, the methods currently used to build indices and systems of social indicators are insufficient to represent the phenomenon, and KDD stands out among the state-of-the-art research technologies applied to this subject as a promising method, since it covers the modeling of social concepts and makes it possible to identify and measure various relationships among the factors associated with the phenomenon. Starting from measurement theory, which is grounded in the philosophy of science and in statistics, the analytical potential of the method for building social indices is explored both at the theoretical-methodological level, where constructs, models and indicators are discussed, and at the practical level, through its application to social development in Brazil. At the theoretical-methodological level, forty-three of the best-known national and international works used to measure social development are analysed. At the practical level, public data from the Brazilian Institute of Geography and Statistics, the Ministry of Health, the Ministry of Education and the Ministry of Justice are used for all 5,560 Brazilian municipalities. The results indicate that KDD does offer great analytical potential relative to traditional methods and is also suited to a multivariate approach, since it can reflect the complexity of the phenomenon in investigations of social reality, in line with the most recent theoretical formulations.
38

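Modelação e análise da vida útil (metrológica) de medidores tipo indução de energia elétrica ativa / Modeling and analysis of the (metrological) service life of induction-type active electric energy meters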

Silva, Marcelo Rubia da. January 2010
Advisor: Carlos Alberto Canesin / Committee: Júlio Borges de Souza / Committee: Denizar Cruz Martins / Abstract: The study of the operational reliability of equipment has become fundamental for companies to keep proper control of their assets, for both financial and safety reasons. Studying the failure rate of equipment makes it possible to foresee when failures will occur and to act preventively, but such a study must be carried out under established, fixed operating conditions. Electric energy meters, which are part of the financial assets of energy utilities, are used under widely varying operating conditions, both in the energy flow (presence of harmonics, undervoltages, overvoltages and distinct consumption patterns) and at the physical installation site (sea air, temperature, humidity, etc.). Failures in electromechanical energy meters are difficult to detect, since most metering errors, caused mainly by ageing components, neither change the quality of the energy supplied nor interrupt its supply. In this context, this work proposes a new methodology for determining failures in electromechanical active energy (Wh) meters. It uses the database of an electric utility and the process of knowledge discovery in databases to select the most significant variables for determining failures in these meters, counting as a failure any operation with metering errors above those permitted by national regulations (2010). Two data mining techniques were used: stepwise regression and decision trees. The selected variables were used to build a model that clusters similar equipment, and the probability of failure of each cluster was determined. As final results, an application was developed on a user-friendly platform to apply the methodology, and a case study was carried out to demonstrate its feasibility. / Degree: Master's
40

Análise de grandezas cinemáticas e dinâmicas inerentes à hemiparesia através da descoberta de conhecimento em bases de dados / Analysis of kinematic and dynamic data inherent to hemiparesis through knowledge discovery in databases

Caio Benatti Moretti 31 March 2016 (has links)
Owing to higher life expectancy worldwide, natural accidents and physical trauma are increasingly likely to occur in everyday life, which raises the demand for rehabilitation. Physical therapy under the paradigm of robotic rehabilitation with serious games offers patients greater motivation and engagement with the treatment; its use was recommended by the American Heart Association (AHA), which gave it the highest rating (Level A) for inpatients and outpatients. However, the analytical potential of the data collected by the robotic devices involved is little explored, so information that could be of great value to treatments goes unextracted. This work focuses on applying knowledge discovery techniques to classify the performance of patients diagnosed with chronic hemiparesis. The patients were placed in a robotic rehabilitation environment and used the InMotion ARM, a robotic device for upper-limb rehabilitation that also collects performance data. A roadmap for knowledge discovery in databases was applied to the collected data, comprising preprocessing, transformation (feature extraction), and then data mining with machine learning algorithms. The strategy culminated in a pattern classification able to distinguish hemiparetic sides with an accuracy of 94%, with eight attributes feeding the input of the resulting mechanism. Interpreting this set of attributes showed that force-related data are more significant, making up half of the composition of a sample.
