1 |
Development of a Framework to Identify Patient Pathways through a Segment of the Health Care Cycle
Bhattacharya, Abhik 10 March 2009
The US spends more money on health care than other industrialized nations. Nevertheless, the US lags behind them in life expectancy, access to care, and other health indicators. This can be attributed to the numerous issues that afflict the US health care sector: the lack of universal health coverage, increasing medical errors, over- and under-treatment of patients, lack of standardization, and so on.
It is believed that the structure of health care delivery as it exists in the US is broken, which reduces the quality of the care provided and increases costs. There is a growing consensus among the different players in the sector that a complete overhaul of the health care system is required. This study presents an approach to identify patient treatment over a cycle of care.
Every medical condition has a care cycle over which treatment is provided. The complete cycle of care for most medical conditions comprises both inpatient and ambulatory care and runs from the onset of the disease to its resolution. There are established guidelines that state what care should be provided at various points of this cycle. It is important to identify and analyze the flow of patients through this cycle of care. Once the flow is identified, various analyses can be conducted to identify bottlenecks, delays, redundancies and other issues that reduce efficiency and increase costs.
Unfortunately, because medical data are collected for either medical or billing purposes and not for operational analysis, it is very difficult to analyze the flow of patients over this cycle of care. This study developed a framework to extract relevant patient medical information from the existing administrative databases of health care organizations. This information was used to create patient flow paths across a segment of the care cycle to enable analysis of the care provided. A case study was conducted at a federal health care provider to identify and map the flow of patients with lung cancer over the care cycle.
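The idea of turning administrative records into patient flow paths can be illustrated with a minimal sketch. It is not the framework developed in the thesis: the record layout `(patient_id, visit_date, care_step)`, the step names and the sample rows are all hypothetical, chosen only to show how ordering events per patient yields a path and how delays between consecutive steps can then be averaged.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical administrative records: (patient_id, visit_date, care_step).
RECORDS = [
    ("p1", date(2008, 3, 1), "primary care"),
    ("p1", date(2008, 3, 20), "radiology"),
    ("p1", date(2008, 4, 2), "oncology"),
    ("p2", date(2008, 5, 5), "radiology"),
    ("p2", date(2008, 5, 1), "primary care"),
    ("p2", date(2008, 6, 10), "oncology"),
]


def group_visits(records):
    """Group records by patient and sort each patient's visits by date."""
    by_patient = defaultdict(list)
    for patient_id, visit_date, step in records:
        by_patient[patient_id].append((visit_date, step))
    return {pid: sorted(visits) for pid, visits in by_patient.items()}


def build_paths(records):
    """Return the ordered list of care steps for each patient."""
    return {pid: [step for _, step in visits]
            for pid, visits in group_visits(records).items()}


def transition_delays(records):
    """Average number of days between consecutive steps, per transition."""
    totals, counts = Counter(), Counter()
    for visits in group_visits(records).values():
        for (d1, s1), (d2, s2) in zip(visits, visits[1:]):
            totals[(s1, s2)] += (d2 - d1).days
            counts[(s1, s2)] += 1
    return {key: totals[key] / counts[key] for key in counts}


if __name__ == "__main__":
    print(build_paths(RECORDS))
    print(transition_delays(RECORDS))
```

Transitions with unusually long average delays would be natural candidates for the bottleneck analysis the abstract mentions.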
2 |
Accès sémantique aux données massives et hétérogènes en santé / Semantic access to massive and heterogeneous health data
Lelong, Romain 17 June 2019
Clinical data are produced as part of the practice of medicine by different health professionals, in several places and in various formats. They are therefore heterogeneous in both nature and structure, and their volume is large enough to qualify as Big Data. The work carried out in this thesis aims to propose an effective information retrieval method for this type of complex and massive data.
First, access to clinical data is constrained by the need to model clinical information. This can be done within Electronic Health Records and, to a larger extent, within data warehouses. In this thesis, I propose a proof of concept of a search engine that gives access to the information contained in the Semantic Health Data Warehouse of the Rouen University Hospital. Thanks to a generic data model, this data warehouse views information as a graph of data, which makes it possible to model the information while preserving its conceptual complexity. In order to provide search functionalities adapted to this generic representation, a query language allowing access to clinical information through the various entities of which it is composed was developed and implemented as part of this thesis.
Second, the massiveness of clinical data is a major technical challenge that hinders the implementation of efficient information retrieval. The initial implementation of the proof of concept on a relational database management system exposed the performance limits of such systems in the context of clinical data. A migration to a NoSQL key-value store was then carried out. Although it offers good atomic data access performance, this migration required additional developments and the design of a suitable hardware and application architecture to provide the search and data access functionalities.
Finally, the contribution of this work within the general context of the Semantic Health Data Warehouse of the Rouen University Hospital was evaluated. The proof of concept proposed in this work was used to access semantic descriptions of information in order to answer the inclusion and exclusion criteria for patients in clinical studies. In this evaluation, a total or partial response was given to 72.97% of the criteria. In addition, the genericity of the tool made it possible to use it in other contexts, such as documentary and bibliographic information retrieval in health.
3 |
Seleção e construção de features relevantes para o aprendizado de máquina. / Relevant feature selection and construction for machine learning.
Lee, Huei Diana 27 April 2000
In supervised Machine Learning (ML), an induction algorithm is typically presented with a set of training instances, where each instance is described by a vector of feature values and a class label. The task of the induction algorithm (inducer) is to induce a classifier that will be useful in classifying new cases. Conventional inductive-learning algorithms rely on user-provided data to build their concept descriptions. An inadequate representation space or description language, as well as errors in the training examples, can make learning problems difficult.
One of the main problems in ML is Feature Subset Selection (FSS), in which the learning algorithm must select some subset of features upon which to focus its attention while ignoring the rest. There are several reasons for doing FSS. First, most computationally feasible ML algorithms do not work well in the presence of a very large number of features, so FSS can improve the accuracy of the classifiers they generate. Second, FSS can improve comprehensibility, i.e. the human ability to understand the data and the rules generated by symbolic ML algorithms. Third, collecting data is costly in some domains. Finally, FSS can reduce the cost of processing huge quantities of data. There are basically three approaches to FSS in Machine Learning: embedded, filter and wrapper.
On the other hand, if the features provided to describe the training examples are inadequate, the learning algorithms are likely to create excessively complex and inaccurate descriptions. These individually inadequate features can sometimes be combined conveniently, generating new features which can turn out to be highly representative for the description of the concept. The process of constructing new features is called Feature Construction or Constructive Induction (CI).
In this work we focus on the filter and wrapper approaches to FSS as well as on knowledge-driven CI. We describe a series of experiments on FSS and CI, performed on four natural datasets using several symbolic ML algorithms. For each dataset and each inducer, various measures are taken, for example accuracy, time taken to run the inducer and number of features selected by each evaluated induction algorithm. Several experiments using three real-world datasets are also described. The focus of these three case studies is not only on comparing the performance of the induction algorithms but also on evaluating the extracted knowledge. During the knowledge extraction step, results were presented to the specialists, who made many suggestions for further experiments. Some of the knowledge extracted from these three real-world datasets was found very interesting by the specialists. This shows that the interaction between different areas, in this case the medical and computational areas, may produce interesting results. Thus, two groups of researchers need to be brought together if the application of ML is to bear fruit: those who are acquainted with the existing ML methods, and those with expertise in the given application domain, who provide the training data and evaluate the acquired knowledge.
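To make the filter approach to feature subset selection concrete, here is a minimal sketch. The abstract does not say which scoring criterion the thesis uses, so mutual information between each discrete feature and the class label is an assumption made purely for illustration; the function names and the toy data are hypothetical as well. A filter scores features independently of any inducer, whereas a wrapper would instead train the inducer on candidate subsets and compare the resulting accuracies.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def filter_select(rows, labels, k):
    """Rank features by mutual information with the label and keep the top k.

    Returns (indices of the selected features, score of every feature).
    """
    n_features = len(rows[0])
    scores = [mutual_information([row[j] for row in rows], labels)
              for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k], scores


# Toy data (hypothetical): feature 0 determines the class,
# feature 1 is constant, feature 2 is unrelated to the class.
ROWS = [(1, 0, 0), (1, 0, 1), (0, 0, 0), (0, 0, 1)]
LABELS = [1, 1, 0, 0]

if __name__ == "__main__":
    selected, scores = filter_select(ROWS, LABELS, k=1)
    print(selected, scores)
```

Because the filter never runs the inducer, it is cheap, but it can miss features that are only useful in combination, which is one motivation for the wrapper approach and for constructive induction.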
4 |
Seleção e construção de features relevantes para o aprendizado de máquina. / Relevant feature selection and construction for machine learning.
Huei Diana Lee 27 April 2000
5 |
Διάγνωση, πρόγνωση και υποστήριξη θεραπευτικής αγωγής κακοηθών λεμφωμάτων με χρήση τεχνητής νοημοσύνης / Diagnosis, prognosis and treatment support of malignant lymphomas using artificial intelligence
Δράκος, Ιωάννης 13 July 2010
This dissertation aims to create an efficient model for biomedical data integration.
Starting from an analysis of medical knowledge and of the problems that arise from the way medical data are produced, it proceeds to solve the individual issues of data integration within a single medical domain and concludes with a working framework for the integration of medical data originating from multiple sources and domains of knowledge.
The proposed database model follows a “horizontal” logic and is efficient enough to answer complex, large-scale queries in real time.
The proof of concept of the framework and its goals is provided by a medical information system. By taking advantage of data integration and of the “horizontal” database design, the system is able to manage Flow Cytometry measurements originating from any flow cytometer and, by combining them with other hematological clinical and laboratory tests, to answer both everyday and complex research medical questions.
The original research results produced within the scope of this dissertation were published in international journals and in refereed international and Greek conferences.
6 |
Traditional Chinese medical clinic system
Liu, Chaomei 01 January 2004
The Chinese Medical Clinic System is designed to help acupuncturists and their assistants record and store information. The system can maintain and schedule appointments and display patient diagnoses. It will be implemented on a desktop PC connected to the internet to make it easier for acupuncturists to record information.
7 |
Enhancing association rules algorithms for mining distributed databases: integration of fast BitTable and multi-agent association rules mining in distributed medical databases for decision support
Abdo, Walid Adly Atteya January 2012
Over the past few years, mining data located in heterogeneous and geographically distributed sites has been recognized as a key issue. Loading distributed data into a centralized location to mine interesting rules is not a good approach, because it compromises data privacy and imposes network overheads. The situation becomes worse when the network has limited bandwidth, which is the case in most real-time systems. This has prompted the need for intelligent data analysis to discover the hidden information in these large distributed databases.
In this research, we present an incremental approach for building an efficient Multi-Agent based algorithm for mining real-world databases in geographically distributed sites. First, we propose the Distributed Multi-Agent Association Rules algorithm (DMAAR) to minimize the all-to-all broadcasting between distributed sites. Analytical calculations show that DMAAR reduces the algorithm complexity and minimizes the message communication cost. The proposed Multi-Agent based algorithm complies with the standards of the Foundation for Intelligent Physical Agents (FIPA), which are considered the global standards for communication between agents, thus enabling the agents of the proposed algorithm to cooperate with other standard agents.
Second, the BitTable Multi-Agent Association Rules algorithm (BMAAR) is proposed. BMAAR includes an efficient BitTable data structure that compresses the database so that it can easily fit into the memory of the local sites. It also includes two BitWise AND/OR operations for quick candidate itemset generation and support counting. Moreover, the algorithm includes three transaction trimming techniques to reduce the size of the mined data.
Third, we propose the Pruning Multi-Agent Association Rules algorithm (PMAAR), which includes three candidate itemset pruning techniques for reducing the large number of generated candidate itemsets and, consequently, the total time of the mining process.
The proposed PMAAR algorithm has been compared with existing Association Rules algorithms on different benchmark datasets and has shown better performance and execution time. Moreover, PMAAR has been applied to real-world distributed medical databases obtained from more than one hospital in Egypt to discover the hidden Association Rules in patients' records and further demonstrate the merits and capabilities of the proposed model. Medical data were obtained anonymously, without the patients' personal details. The analysis helped to identify the presence or absence of the disease based on a minimum number of effective examinations and tests. Thus, the proposed algorithm can help in providing accurate medical decisions based on cost-effective treatments, improving the medical service for patients, reducing the response time of the health system and improving the quality of clinical decision making.
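The use of a BitTable with bitwise AND for support counting can be sketched as follows. The abstract does not describe the exact layout used by BMAAR, so this sketch assumes the common vertical layout in which each item owns one bit vector with one bit per transaction; Python integers stand in for the bit vectors, and the item names and transactions are hypothetical.

```python
def build_bittable(transactions):
    """Map each item to an int whose bit i is set when transaction i has the item."""
    table = {}
    for i, items in enumerate(transactions):
        for item in items:
            table[item] = table.get(item, 0) | (1 << i)
    return table


def support(table, itemset):
    """Number of transactions that contain every item of a non-empty itemset.

    The bit vectors of the items are combined with bitwise AND, and the
    set bits of the result are counted.
    """
    columns = [table.get(item, 0) for item in itemset]
    if not columns:
        raise ValueError("itemset must not be empty")
    bits = columns[0]
    for column in columns[1:]:
        bits &= column
    return bin(bits).count("1")


# Hypothetical transactions: the findings recorded for four patients.
TRANSACTIONS = [
    {"cough", "fever"},
    {"cough", "xray"},
    {"cough", "fever", "xray"},
    {"fever"},
]

if __name__ == "__main__":
    table = build_bittable(TRANSACTIONS)
    print(support(table, ["cough", "fever"]))
```

Each support count costs one AND per item plus a bit count, instead of a scan over the transactions, and the bit vectors are compact enough to be kept in memory at each site.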
8 |
Enhancing association rules algorithms for mining distributed databases. Integration of fast BitTable and multi-agent association rules mining in distributed medical databases for decision support.
Abdo, Walid A.A. January 2012