1 |
Prototype Research to Improve Online Medical Literature Search Interface
Wang, Chi-chao 22 August 2008
none
|
2 |
PubMed
Wallace, Rick L. 01 January 2005
No description available.
|
3 |
Indicateurs SIGAPS : quels sont les profils des publications des établissements de santé français ? / SIGAPS indicators : what are the publications' profiles of French healthcare institutions from 2004 to 2014?
Blanc, Emeline 21 June 2019
Research funding in French hospitals is based in part on the number of scientific publications, also taking into account author position (first, second, third, second-to-last, last, investigator list, and Other) and journal category (A being the highest, followed by B, C, D, E, and NC). The profile and evolution of publications were analysed over the 2004-2014 period for the six types of hospital: public teaching hospitals, cancer centres, public non-teaching hospitals, not-for-profit private hospitals, the military health service, and for-profit private hospitals. Among the 192,886 publications analysed, the most frequent journal category was B for cancer centres and E for the other types. The first author position was the most frequent for public teaching hospitals and the military health service, whereas the Other position was the most frequent for the other types. The author-position data indicate that all types of hospital take part in research projects. Public non-teaching hospitals, not-for-profit private hospitals, for-profit private hospitals, and cancer centres collaborated with other hospitals; cancer centres stood out by publishing in high-category journals.
Over the same period, for the public teaching hospitals, means, Spearman correlation coefficients, and Pearson regression lines were calculated between the following variables per full-time equivalent (FTE) physician: number of publications, number of hospital stays, number of FTE university hospital practitioners, and number of residents. The means per FTE physician were 0.73 publications, 235.8 hospital stays, and 0.63 residents. The pairwise correlations between these three variables were weak. The correlations between the proportion of university hospital practitioners and the number of publications per FTE physician, the number of residents per FTE physician, the number of category A and B publications per FTE physician, or the number of first- and last-author publications per FTE physician were strong. All French hospitals are involved in research, but with different patterns. Among them, public teaching hospitals have three missions: research, teaching, and care. Each showed a different pattern of care, teaching, and research activity. None was above the mean for all three activities. Nine had above-average scores in at least two of the three missions.
|
4 |
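Estudio bibliométrico de la producción científica sobre dioxinas a través de las bases de datos Pubmed e I.M.E. (1997-2003) / Bibliometric study of the scientific literature on dioxins through the PubMed and I.M.E. databases (1997-2003)
Peña-Rey Lorenzo, Isabel 20 December 2004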
Introduction: Dioxins are organochlorine compounds that accumulate in the food chain and cause a range of health problems. They were declared human carcinogens in 1997. Their main source is solid waste incinerators. Objectives: to analyse the scientific literature on dioxins over the last 7 years; to describe the distribution of articles across journals, their geographic distribution, and their publication languages; to analyse author productivity, patterns of authorship and teamwork, the quality of the articles and the impact of the journals publishing on the topic, the existence of invisible colleges, and the diseases that produce the greatest burden of disease in the countries publishing on the topic.
Methods: The PubMed and Índice Médico Español (IME, Spanish Medical Index) databases were used. New databases were created in Reference Manager v.10 and analysed with SPSS v.11.0. The Science Citation Index-Expanded database was used to study citations, and Journal Citation Reports (JCR) to study journal quality. The Solla Price, Bradford, and Lotka laws were applied; the productivity and transience indices of authors and the Impact Factor and Immediacy Index of journals were studied. Article quality was checked against the Uniform Requirements for Manuscripts Submitted to Biomedical Journals of the International Committee of Medical Journal Editors. The components of the invisible colleges were analysed and mapped through citation analysis. Morbidity, mortality, and burden of disease were studied for the diseases that, according to the WHO, occur in the countries most affected by these compounds.
Results: 3,522 articles were found. The number of articles published per year fits the Solla Price law, with a correlation coefficient close to 1. Of the 641 journals publishing on the topic, the Bradford nucleus consists of the journal Chemosphere, with 446 articles; at the other extreme, 313 journals published a single article in the 7 years of the study. The most prolific author is Peterson, with 42 articles. Lotka's equation fits the data, with an exponent of -0.5. The journals that publish the most have higher impact factors. Article quality does not differ between the journals that publish the most and those that publish the least. According to the WHO, the three disease groups producing the greatest burden of disease in the countries that produce information on dioxins are neuropsychiatric disorders, cardiovascular diseases, and neoplasms.
Conclusions: The dispersion of the scientific literature is shown. Most articles belong to journals published in the country of origin of the database. The analysis of authors reveals research groups; those who publish the most are also the most cited and the most collaborative. Articles with more than 7 authors predominate by the end of the study period. The journals that publish the most articles more often have a JCR impact factor. Some journals in the nucleus and first Bradford zone are specialised. Article quality does not differ between more and less prolific authors. No relationship was found between the publication of articles on dioxins in a country and a higher prevalence or incidence of certain cancers. The diseases producing the greatest burden of disease are probably due to factors other than dioxins.
|
5 |
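ARQUITETURA DE UM SISTEMA DE CONSULTAS E VISUALIZAÇÃO GRÁFICA DA REPRESENTAÇÃO DO CONHECIMENTO CONTIDO NO PUBMED / Architecture of a system for querying and graphically visualizing the representation of the knowledge contained in PubMed
Machado, Henrique Tamiosso 12 March 2009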
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / Bioinformatics deals with a large amount of biological and genetic information that supports research, and that amount grows every day. Research carried out by many researchers in different areas, such as molecular biology, structural biochemistry, enzymology, physiology, and pathology, among others, generates results and information that must be stored so that they can be used in various ways. This raises the problem of how to store, manipulate, and visualize all this information. Bioinformatics uses computational power to catalogue, organize, structure, and manipulate this information in a way that makes it easier to use, which is of great importance to biology.
PubMed is a service of the U.S. National Library of Medicine (NLM) that provides access to more than 18 million citations for scientific journal articles in the health sciences. To propose a new approach to searching and to representing the knowledge found in the results obtained, this work presents the architecture of a system for prioritized queries and for visualizing information through semantic networks that represent the knowledge contained in PubMed. To that end, the ISO 13250 Topic Maps standard was used to create semantic networks involving the concepts found in the PubMed information system, and an architecture was developed that comprises a database with PubMed information made available by the NLM and a search interface, different from the one provided by the Entrez system, in which priorities can be defined for queries. The knowledge contained in PubMed can also be represented through semantic networks and data mining techniques.
|
6 |
TEXT MINER FOR HYPERGRAPHS USING OUTPUT SPACE SAMPLING
Tirupattur, Naveen 16 August 2011
Indiana University-Purdue University Indianapolis (IUPUI) / Text mining is the process of extracting high-quality knowledge from the analysis of textual data. Rapidly growing interest and research focus in many fields is producing an overwhelming amount of research literature. This literature is a vast source of knowledge, but because of its sheer volume it is practically impossible for researchers to extract that knowledge manually. Hence there is a need for an automated approach to extracting knowledge from unstructured data, and text mining is a suitable approach for doing so from textual data.
The objective of this thesis is to mine documents from the research literature to find novel associations among the entities appearing in that literature, using incremental mining. Traditional text mining approaches provide binary associations, but it is important to understand the context in which these associations occur: for example, entity A is associated with entity B in the context of entity C. These contexts can be visualized as multi-way associations among the entities, which are represented by a hypergraph. This thesis describes the extraction of such multi-way associations using frequent itemset mining, and the application of a new concept called output space sampling to extract them in a space- and time-efficient manner. We incorporated personalization into output space sampling so that users can specify their interests as the frequent hyper-associations are extracted from the text.
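To make the idea of a multi-way association concrete, the following is a minimal sketch, assuming each document has already been reduced to a set of entity names. It enumerates entity sets by brute force and keeps those that co-occur in enough documents, so each surviving set can be read as one hyperedge. It is not the output space sampling method of the thesis; the function name, the entity names, and the toy data are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(documents, size, min_support):
    """Return entity sets of the given size found in at least min_support documents.

    Brute-force enumeration for illustration only; this is NOT output space sampling.
    """
    counts = Counter()
    for entities in documents:
        # Sort so that every itemset has a single canonical tuple form.
        for itemset in combinations(sorted(set(entities)), size):
            counts[itemset] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Hypothetical entity sets, one per document (made-up toy data).
docs = [
    {"geneA", "diseaseB", "drugC"},
    {"geneA", "diseaseB", "drugC", "cellD"},
    {"geneA", "diseaseB"},
    {"drugC", "cellD"},
]

# Three-way associations supported by at least two documents.
hyperedges = frequent_itemsets(docs, size=3, min_support=2)
print(hyperedges)
```

Brute-force enumeration grows combinatorially with the number of entities per document, which is the kind of cost that motivates sampling the output space rather than listing every frequent set.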
|
7 |
Wikipdf - A Tool To Help Scientists Understand The Literature Of The Biological, Health, And Life Sciences
Calloway, David 01 January 2006
Biological sciences literature can be extraordinarily difficult to understand. Papers are commonly filled with terminology unique to a particular sub-discipline, and readers whose expertise lies outside that sub-discipline often have difficulty understanding the information the author is trying to convey. The WikiPDF project that is the subject of this thesis helps readers understand the biological sciences literature by automatically generating a customized glossary for each page of any technical paper available in Adobe Portable Document Format (PDF). WikiPDF relies on Wikipedia®, an online encyclopedia created and supported by a host of volunteers, as the source of the definitions used in its glossaries. WikiPDF uses the National Institutes of Health (NIH) Medline/PubMed database of journal papers to organize, index, and locate WikiPDF glossaries. The design and implementation of this project relied exclusively on open-source software, including the Linux operating system, the Apache Tomcat web server, and the MySQL relational database system.
|
8 |
Recuperação da Informação: estudo da usabilidade na base de dados Public Medical (PUBMED) / Information retrieval: a usability study of the Public Medical (PubMed) database
COELHO, Odete Máyra Mesquita January 2014
COELHO, Odete Máyra Mesquita; PINTO, Virgínia Bentes. Recuperação da Informação: estudo da usabilidade na base de dados Public Medical (PUBMED). 2014. 172 f. Dissertação (Mestrado) - Universidade Federal da Paraíba, Mestrado em Ciência da Informação, Paraíba, 2014.
This study investigates how resident physicians understand the process of information retrieval in the Public Medical (PubMed) database, taking into account usability aspects of human-computer interaction, the resources available, and the level of user satisfaction with the search process. The theoretical framework relates the concepts of information and of information for the health field, then addresses information retrieval systems and databases, and enters the field of information architecture to evaluate the usability of these information sources. The methodology is exploratory. Its first phase consisted of a heuristic evaluation of the PubMed database interface using the guidelines proposed by Nielsen and Tahir (2002). The results of this analysis show that, although these guidelines were designed for building homepages, thirty-eight of them suit the PubMed interface. It is therefore inferred that these guidelines can be used for the heuristic evaluation of health-related databases. Regarding the usability of this database, the interface was found to have a well-structured architecture, to be friendly and objective, and to offer numerous possibilities for searching and retrieving information.
The second phase of the empirical study applied prospective usability tests to measure the satisfaction of the database's users. These tests used a semi-structured questionnaire administered to resident physicians in Internal Medicine at the Walter Cantídio University Hospital of the Federal University of Ceará, for a total participation of 36%. The results of this phase show good performance and good user satisfaction with PubMed's usability, since it allows users to achieve their search goals effectively and efficiently, even though they do not know all the search and retrieval resources the database offers.
|
9 |
Recuperação da informação: estudo da usabilidade na base de dados Public Medical (PUBMED). / Information retrieval: a usability study of the Public Medical (PubMed) database.
Coelho, Odete Máyra Mesquita 21 February 2014
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - CAPES / This study investigates how resident physicians understand the process of information retrieval in the Public Medical (PubMed) database, taking into account usability aspects of human-computer interaction, the resources available, and the level of user satisfaction with the search process. The theoretical framework relates the concepts of information and of information for the health field, then addresses information retrieval systems and databases, and enters the field of information architecture to evaluate the usability of these information sources. The methodology is exploratory. Its first phase consisted of a heuristic evaluation of the PubMed database interface using the guidelines proposed by Nielsen and Tahir (2002). The results of this analysis show that, although these guidelines were designed for building homepages, thirty-eight of them suit the PubMed interface. It is therefore inferred that these guidelines can be used for the heuristic evaluation of health-related databases. Regarding the usability of this database, the interface was found to have a well-structured architecture, to be friendly and objective, and to offer numerous possibilities for searching and retrieving information.
The second phase of the empirical study applied prospective usability tests to measure the satisfaction of the database's users. These tests used a semi-structured questionnaire administered to resident physicians in Internal Medicine at the Walter Cantídio University Hospital of the Federal University of Ceará, for a total participation of 36%. The results of this phase show good performance and good user satisfaction with PubMed's usability, since it allows users to achieve their search goals effectively and efficiently, even though they do not know all the search and retrieval resources the database offers.
|
10 |
BioEve: User Interface Framework Bridging IE and IR
January 2010
abstract: Continuous advancements in biomedical research have resulted in the production of vast amounts of scientific data and literature discussing them. The ultimate goal of computational biology is to translate these large amounts of data into actual knowledge of the complex biological processes and accurate life science models. The ability to rapidly and effectively survey the literature is necessary for the creation of large scale models of the relationships among biomedical entities as well as hypothesis generation to guide biomedical research. To reduce the effort and time spent in performing these activities, an intelligent search system is required. Even though many systems aid in navigating through this wide collection of documents, the vastness and depth of this information overload can be overwhelming. An automated extraction system coupled with a cognitive search and navigation service over these document collections would not only save time and effort, but also facilitate discovery of the unknown information implicitly conveyed in the texts. This thesis presents the different approaches used for large scale biomedical named entity recognition, and the challenges faced in each. It also proposes BioEve: an integrative framework to fuse a faceted search with information extraction to provide a search service that addresses the user's desire for "completeness" of the query results, not just the top-ranked ones. This information extraction system enables discovery of important semantic relationships between entities such as genes, diseases, drugs, and cell lines and events from biomedical text on MEDLINE, which is the largest publicly available database of the world's biomedical journal literature. It is an innovative search and discovery service that makes it easier to search/navigate and discover knowledge hidden in life sciences literature. 
To demonstrate the utility of this system, this thesis also details a prototype enterprise quality search and discovery service that helps researchers with a guided step-by-step query refinement, by suggesting concepts enriched in intermediate results, and thereby facilitating the "discover more as you search" paradigm. / Dissertation/Thesis / M.S. Computer Science 2010
|