51

GoPubMed: Ontology-based literature search for the life sciences / GoPubMed: ontologie-basierte Literatursuche für die Lebenswissenschaften

Doms, Andreas 20 January 2009 (has links) (PDF)
Background: Most of our biomedical knowledge is accessible only through texts. The biomedical literature grows exponentially, and PubMed comprises over 18,000,000 literature abstracts. Recently much effort has been put into the creation of biomedical ontologies which capture biomedical facts. The exploitation of ontologies to explore the scientific literature is a new area of research. Motivation: When people search, they have questions in mind. Answering questions in a domain requires knowledge of the terminology of that domain. Classical search engines do not provide background knowledge for the presentation of search results. Ontology-annotated structured databases allow for data mining. The hypothesis is that ontology-annotated literature databases allow for text mining. The central problem is to associate scientific publications with ontological concepts; this is a prerequisite for ontology-based literature search. The question then is how to answer biomedical questions using ontologies and a literature corpus. Finally, the task is to automate bibliometric analyses on a corpus of scientific publications. Approach: Recent joint efforts on automatically extracting information from free text showed that the applied methods are complementary. The idea is to employ the rich terminological and relational information stored in biomedical ontologies to mark up biomedical text documents. Based on established semantic links between documents and ontology concepts, the goal is to answer biomedical questions on a corpus of documents. The fully annotated literature corpus allows, for the first time, bibliometric analyses to be generated automatically for ontological concepts, authors and institutions. Results: This work includes a novel framework for annotating free texts with ontological concepts. The framework generates recognition-pattern rules from the terminological and relational information in an ontology.
Maximum-entropy models can be trained to distinguish the meanings of ambiguous concept labels. The framework was used to develop an annotation pipeline for PubMed abstracts with 27,863 Gene Ontology concepts. The evaluation of the recognition performance yielded a precision of 79.9% and a recall of 72.7%, improving on the previously used algorithm by 25.7% F-measure. The evaluation was done on a curation corpus, created manually by the original authors, of 689 PubMed abstracts with 18,356 concept curations. Methods to reason over large collections of documents with ontologies were developed. The ability to answer questions with the online system was shown on a set of biomedical questions from the TREC Genomics Track 2006 benchmark. This work includes the first ontology-based, large-scale, online, up-to-date bibliometric analysis for topics in molecular biology represented by GO concepts. The automatic bibliometric analysis is in line with existing, but often outdated, manual analyses. Outlook: A number of promising continuations have been spun off from this work. A freely available online search engine has a growing user community. A spin-off company, financed by the High-Tech Gründerfonds, commercializes the new ontology-based search paradigm. Several offshoots of GoPubMed are under development, including GoWeb (general web search), Go3R (search in replacement, reduction and refinement methods for animal experiments) and GoGene (search in gene/protein databases).
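The reported precision and recall combine into the F-measure via the harmonic mean; the short check below recomputes it from the figures given in the abstract (the helper name is ours, not from the thesis).

```python
def f_measure(precision: float, recall: float) -> float:
    """F1 score: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Figures reported for the GO concept recognition evaluation:
p, r = 0.799, 0.727
print(round(f_measure(p, r), 3))  # -> 0.761
```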
52

Datenzentrierte Bestimmung von Assoziationsregeln in parallelen Datenbankarchitekturen / Data-centric determination of association rules in parallel database architectures

Legler, Thomas 15 August 2009 (has links) (PDF)
Die folgende Arbeit befasst sich mit der Alltagstauglichkeit moderner Massendatenverarbeitung, insbesondere mit dem Problem der Assoziationsregelanalyse. Vorhandene Datenmengen wachsen stark an, aber deren Auswertung ist für ungeübte Anwender schwierig. Daher verzichten Unternehmen auf Informationen, welche prinzipiell vorhanden sind. Assoziationsregeln zeigen in diesen Daten Abhängigkeiten zwischen den Elementen eines Datenbestandes, beispielsweise zwischen verkauften Produkten. Diese Regeln können mit Interessantheitsmaßen versehen werden, welche dem Anwender das Erkennen wichtiger Zusammenhänge ermöglichen. Es werden Ansätze gezeigt, die dem Nutzer die Auswertung der Daten erleichtern. Das betrifft sowohl die robuste Arbeitsweise der Verfahren als auch die einfache Auswertung der Regeln. Die vorgestellten Algorithmen passen sich dabei an die zu verarbeitenden Daten an, was sie von anderen Verfahren unterscheidet. Assoziationsregelsuchen benötigen die Extraktion häufiger Kombinationen (EHK). Hierfür werden Möglichkeiten gezeigt, Lösungsansätze auf die Eigenschaften moderner Systeme anzupassen. Als Ansatz werden Verfahren zur Berechnung der häufigsten N Kombinationen erläutert, welche anders als bekannte Ansätze leicht konfigurierbar sind. Moderne Systeme rechnen zudem oft verteilt. Diese Rechnerverbünde können große Datenmengen parallel verarbeiten, benötigen jedoch die Vereinigung lokaler Ergebnisse. Für verteilte Top-N-EHK auf realistischen Partitionierungen werden hierfür Ansätze mit verschiedenen Eigenschaften präsentiert. Aus den häufigen Kombinationen werden Assoziationsregeln gebildet, deren Aufbereitung ebenfalls einfach durchführbar sein soll. In der Literatur wurden viele Maße vorgestellt. Je nach Anforderung entspricht jedes Maß einer subjektiven Bewertung, allerdings nicht zwingend der des Anwenders. Hierfür wird untersucht, wie mehrere Interessantheitsmaße zu einem globalen Maß vereinigt werden können.
Dies findet Regeln, welche mehrfach wichtig erschienen. Der Nutzer kann mit den Vorschlägen sein Suchziel eingrenzen. Ein zweiter Ansatz gruppiert Regeln. Dies erfolgt über die Häufigkeiten der Regelelemente, welche die Grundlage von Interessantheitsmaßen bilden. Die Regeln einer solchen Gruppe sind daher bezüglich vieler Interessantheitsmaße ähnlich und können gemeinsam ausgewertet werden. Dies reduziert den manuellen Aufwand des Nutzers. Diese Arbeit zeigt Möglichkeiten, Assoziationsregelsuchen auf einen breiten Benutzerkreis zu erweitern und neue Anwender zu erreichen. Die Assoziationsregelsuche wird dabei derart vereinfacht, dass sie statt als Spezialanwendung als leicht nutzbares Werkzeug zur Datenanalyse verwendet werden kann. / The importance of data mining is widely acknowledged today. Mining for association rules and frequent patterns is a central activity in data mining. Three main strategies are available for such mining: APRIORI, FP-tree-based approaches like FP-GROWTH, and algorithms based on vertical data structures and depth-first mining strategies like ECLAT and CHARM. Unfortunately, most of these algorithms are only moderately suitable for many “real-world” scenarios, because usability and the special characteristics of the data are two aspects of practical association-rule mining that require further work. All mining strategies for frequent patterns use a parameter called minimum support to define a minimum occurrence frequency for searched patterns. This parameter cuts down the number of patterns searched in order to improve the relevance of the results. In complex business scenarios, it can be difficult and expensive to define a suitable value for the minimum support because it depends strongly on the particular dataset. Users are often unable to set this parameter for unknown datasets, and unsuitable minimum-support values can extract millions of frequent patterns and generate enormous runtimes.
For this reason, it is not feasible to permit ad-hoc data mining by unskilled users. Such users have neither the knowledge nor the time to define suitable parameters by trial-and-error procedures. Discussions with users of SAP software have revealed great interest in the results of association-rule mining techniques, but most of these users are unable or unwilling to set very technical parameters. Given such user constraints, several studies have addressed the problem of replacing the minimum-support parameter with more intuitive top-n strategies. We have developed an adaptive mining algorithm to give untrained SAP users a tool to analyze their data easily without the need for elaborate data preparation and parameter determination. Previously implemented approaches to distributed frequent-pattern mining were expensive and time-consuming tasks for specialists. In contrast, we propose a method to accelerate and simplify the mining process by using top-n strategies and relaxing some requirements on the results, such as completeness. Unlike data approximation techniques such as sampling, our algorithm always returns exact frequency counts. The only drawback is that the result set may fail to include some of the patterns up to a specific frequency threshold. Another aspect of real-world datasets is that they are often partitioned for shared-nothing architectures, following business-specific parameters like location, fiscal year, or branch office. Users may also want to conduct mining operations spanning data from different partners, even if the local data from the respective partners cannot be integrated at a single location for data-security reasons or due to their large volume. Almost every data mining solution is constrained by the need to hide complexity. As far as possible, the solution should offer a simple user interface that hides technical aspects like data distribution and data preparation.
Given that BW Accelerator users have such simplicity and distribution requirements, we have developed an adaptive mining algorithm to give unskilled users a tool to analyze their data easily, without the need for complex data preparation or consolidation. For example, Business Intelligence scenarios often partition large data volumes by fiscal year to enable efficient optimizations for the data used in actual workloads. For most mining queries, more than one data partition is of interest, and therefore distribution handling that leaves the data unaffected is necessary. The algorithms presented here have been developed to work with data stored in SAP BW. A salient feature of SAP BW Accelerator is that it is implemented as a distributed landscape that sits on top of a large number of shared-nothing blade servers. Its main task is to execute OLAP queries that require fast aggregation of many millions of rows of data. Therefore, the distribution of data over the dedicated storage is optimized for such workloads. Data mining scenarios use the same data from storage, but reporting takes precedence over data mining, and hence the data cannot be redistributed without massive costs. Distribution by special data semantics or user-defined selections can produce many partitions and very different partition sizes. The handling of such real-world distributions for frequent-pattern mining is an important task, but it conflicts with the requirement of balanced partitions.
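The top-n idea described above — returning the n most frequent patterns instead of asking the user for a minimum-support value — can be sketched as follows. This is a toy illustration, not the adaptive distributed algorithm of the thesis; the function and data names are invented.

```python
from collections import Counter
from itertools import combinations

def top_n_pairs(transactions, n):
    """Count every item pair across all transactions and keep the n most
    frequent ones -- a top-n strategy needs no minimum-support parameter
    from the user."""
    counts = Counter()
    for t in transactions:
        for pair in combinations(sorted(set(t)), 2):
            counts[pair] += 1
    return counts.most_common(n)

baskets = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["bread", "milk"],
    ["butter", "jam"],
]
print(top_n_pairs(baskets, 2))
```

On partitioned data, each partition could compute local top-n candidates that are then merged — which, as the abstract notes, trades the completeness guarantee for speed while keeping frequency counts exact.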
53

GoWeb: Semantic Search and Browsing for the Life Sciences

Dietze, Heiko 21 December 2010 (has links) (PDF)
Searching is a fundamental task in research. Current search engines are keyword-based. Semantic technologies promise a next generation of semantic search engines which will be able to answer questions. Current approaches either apply natural language processing to unstructured text or assume the existence of structured statements over which they can reason. This work provides a system that combines classical keyword-based search engines with semantic annotation. Conventional search results are annotated using a customized annotation algorithm which takes the textual properties and requirements such as speed and scalability into account. The biomedical background knowledge consists of the Gene Ontology, the Medical Subject Headings and other related entities, e.g. protein/gene names and person names. Together they provide the relevant semantic context for a search engine for the life sciences. We develop the system GoWeb for semantic web search and evaluate it using three benchmarks. It is shown that GoWeb is able to aid question answering with success rates of up to 79%. Furthermore, the system includes semantic hyperlinks that enable semantic browsing of the knowledge space. The semantic hyperlinks facilitate the use of the eScience infrastructure, even for complex workflows of composed web services. To complement the web search of GoWeb, other data sources and more specialized information needs are tested in different prototypes, including patent and intranet search. Semantic search is applicable in these usage scenarios, but the developed systems also show the limits of the semantic approach: the size, applicability and completeness of the integrated ontologies, as well as technical issues of text extraction and metadata gathering. Additionally, semantic indexing is implemented as an alternative approach to semantic search and evaluated with a question-answering benchmark.
A semantic index can help to answer questions and address some limitations of GoWeb. Still, the maintenance and optimization of such an index remain a challenge, whereas GoWeb provides a straightforward system.
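The annotation step such a system performs on search results can be sketched as simple dictionary matching of ontology labels against text. The vocabulary below is a tiny toy excerpt (the identifiers are quoted from memory and the matching is far simpler than GoWeb's customized algorithm):

```python
import re

# Toy vocabulary mapping ontology labels to concept IDs; for
# illustration only, not an excerpt of the system's knowledge base.
VOCAB = {
    "apoptosis": "GO:0006915",
    "cell cycle": "GO:0007049",
    "asthma": "MeSH:D001249",
}

def annotate(text):
    """Return (label, concept-id, offset) triples for every vocabulary
    label found in the text, longest labels matched first."""
    hits = []
    for label in sorted(VOCAB, key=len, reverse=True):
        for m in re.finditer(r"\b" + re.escape(label) + r"\b", text.lower()):
            hits.append((label, VOCAB[label], m.start()))
    return sorted(hits, key=lambda h: h[2])

snippet = "Dysregulated apoptosis during the cell cycle is discussed."
print(annotate(snippet))
```

A production annotator must additionally handle synonyms, ambiguous labels and scale to millions of documents — exactly the speed and scalability requirements the abstract mentions.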
54

Automated Patent Categorization and Guided Patent Search using IPC as Inspired by MeSH and PubMed

Eisinger, Daniel 08 September 2014 (has links) (PDF)
The patent domain is a very important source of scientific information that is currently not used to its full potential. Searching for relevant patents is a complex task: the number of existing patents is very high and grows quickly, patent text is extremely complicated, and standard vocabulary is not used consistently or does not even exist. As a consequence, pure keyword searches often fail to return satisfying results in the patent domain. Major companies employ patent professionals who are able to search patents effectively, but even they have to invest a lot of time and effort into their search. Academic scientists, on the other hand, do not have access to such resources and therefore often do not search patents at all, risking missing up-to-date information that will not appear in scientific publications until much later, if at all. Document search on PubMed, the pre-eminent database for biomedical literature, relies on the annotation of its documents with relevant terms from the Medical Subject Headings ontology (MeSH) to improve recall through query expansion. Similarly, professional patent searches expand beyond keywords by including class codes from various patent classification systems. However, classification-based searches can only be performed effectively if the user has very detailed knowledge of the classification system, which is usually not the case for academic scientists. Consequently, we investigated methods to automatically identify relevant classes that can then be suggested to the user to expand their query. Since every patent is assigned at least one class code, these assignments should be usable in a similar way to the MeSH annotations in PubMed. In order to develop a system for this task, it is necessary to have a good understanding of the properties of both classification systems.
To gain such knowledge, we perform an in-depth comparative analysis of MeSH and the main patent classification system, the International Patent Classification (IPC). We investigate the hierarchical structures as well as the properties of the terms and classes respectively, and we compare the assignment of IPC codes to patents with the annotation of PubMed documents with MeSH terms. Our analysis shows that the hierarchies are structurally similar, but terms and annotations differ significantly. The most important differences concern the considerably higher complexity of the IPC class definitions compared to MeSH terms and the far lower number of class assignments for the average patent compared to the number of MeSH terms assigned to PubMed documents. These differences cause problems both for inexperienced patent searchers and for professionals. On the one hand, the complex term system makes it very difficult for members of the former group to find any IPC classes that are relevant for their search task. On the other hand, the low number of IPC classes per patent points to incomplete class assignments by the patent office, limiting the recall of the classification-based searches that are frequently performed by the latter group. We approach these problems from two directions: first, by automatically assigning additional patent classes to make up for the missing assignments, and second, by automatically retrieving relevant keywords and classes that are proposed to the user so they can expand their initial search. For the automated assignment of additional patent classes, we adapt an approach to the patent domain that was successfully used for the assignment of MeSH terms to PubMed abstracts. Each document is assigned a set of IPC classes by a large set of binary Maximum-Entropy classifiers.
Our evaluation shows good performance by individual classifiers (precision/recall between 0.84 and 0.90), making the retrieval of additional relevant documents for specific IPC classes feasible. The assignment of additional classes to specific documents is more problematic, since the precision of our classifiers is not high enough to avoid false positives. However, we propose filtering methods that can help solve this problem. For the guided patent search, we demonstrate various methods to expand a user's initial query. Our methods use both keywords and class codes that the user enters to retrieve additional relevant keywords and classes, which are then suggested to the user. These additional query components are extracted from different sources such as patent text, IPC definitions, external vocabularies and co-occurrence data. The suggested expansions can help inexperienced users refine their queries with relevant IPC classes, and professionals can compose their complete queries faster and more easily. We also present GoPatents, a patent retrieval prototype that incorporates some of our proposals and makes faceted browsing of a patent corpus possible.
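One way to read the query-expansion idea: rank IPC codes by how often they co-occur with the user's keywords in an annotated corpus, then suggest the top-ranked codes. A toy sketch — the corpus and the keyword-to-code associations are invented (the code strings follow the IPC symbol pattern but stand in for real assignments), and the thesis draws on richer sources such as patent text, IPC definitions and external vocabularies.

```python
from collections import Counter

# Toy corpus: each record pairs extracted text keywords with the
# patent's assigned IPC codes (associations invented for illustration).
PATENTS = [
    ({"antibody", "cancer"}, {"C07K16/00", "A61P35/00"}),
    ({"antibody", "assay"}, {"C07K16/00", "G01N33/53"}),
    ({"battery", "anode"}, {"H01M4/00"}),
]

def suggest_classes(keywords, top=3):
    """Rank IPC codes by co-occurrence with the query keywords, so a
    user unfamiliar with the IPC can expand a keyword-only query."""
    scores = Counter()
    for words, codes in PATENTS:
        overlap = len(keywords & words)
        if overlap:
            for code in codes:
                scores[code] += overlap
    return [code for code, _ in scores.most_common(top)]

print(suggest_classes({"antibody"}))
```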
55

Explorative Suchstrategien am Beispiel von flickr.com

Wenke, Birgit, Lechner, Ulrike 13 May 2014 (has links) (PDF)
No description available.
56

Jenseits der Suchmaschinen: Konzeption einer iterativen Informationssuche in Blogs

Franke, Ingmar S., Taranko, Severin, Wessel, Hans 15 May 2014 (has links) (PDF)
No description available.
57

Application of the FITT framework to evaluate a prototype health information system

Honekamp, Wilfried, Ostermann, Herwig 24 June 2011 (has links) (PDF)
We developed a prototype information system with an integrated expert system for headache patients. The FITT (fit between individual, task and technology) framework was used to evaluate the prototype health information system and to determine which deltas to work on in future developments. The system was evaluated positively in all FITT dimensions, and the framework proved a proper tool for this evaluation.
58

Angeleitete internetbasierte Patienteninformation / Guided consumer health information retrieval

Honekamp, Wilfried 14 June 2011 (has links) (PDF)
Eine stetig wachsende Zahl von Nutzern sucht im Internet nach Gesundheitsinformationen. Hierzu steht ihnen eine Vielzahl ganz unterschiedlicher Anbieter zur Verfügung, die bei divergierenden Interessen gesundheitsrelevante Angebote im Internet vorhalten. Abgesehen von der Informationsflut, mit der die Nutzer bei der Suche überhäuft werden, erhalten sie auch falsche, irreführende, veraltete und sogar gesundheitsgefährdende Informationen. In den letzten zehn Jahren haben verschiedene Wissenschaftler die Anforderungen an ein ideales Gesundheitsinformationssystem ermittelt. Im Rahmen des in diesem Beitrag beschriebenen Projekts wurde ein Gesundheitsinformationssystem als Prototyp zur anamnesebezogenen, internetbasierten Patienteninformation entwickelt und anhand einer Studie evaluiert. Dabei wird die Untersuchung auf deutschsprachige Erwachsene mit Kopfschmerzen eingegrenzt. Insgesamt wird die Hypothese überprüft, dass eine angeleitete, anamnesebezogene Internetsuche für den Patienten bessere Ergebnisse liefert, als dies durch die herkömmliche Nutzung von Gesundheitsportalen oder Suchmaschinen erreicht werden kann. Zur Evaluation wurde eine kontrollierte Zweigruppenstudie mit insgesamt 140 Teilnehmern in zwei Studienabschnitten durchgeführt. Dabei wurde im ersten Abschnitt festgestellt, dass das Informationssystem bei einfach strukturierten Krankheitsfällen gleich gute Ergebnisse liefert wie die herkömmliche Suche. Im zweiten Abschnitt konnte allerdings festgestellt werden, dass bei komplexen Kopfschmerzfällen mit Hilfe des Prototyps signifikant (P=0,031) bessere Diagnosen gestellt werden konnten als ohne. Medizinische Expertensysteme in Kombination mit einer Meta-Suche nach maßgeschneiderten, qualitätsgesicherten Informationen erweisen sich als probate Möglichkeit, den Ansprüchen an eine geeignete Versorgung mit Gesundheitsinformationen gerecht zu werden. / A steadily increasing number of users search for health information online.
To this end, a multitude of quite different providers with diverging interests offer health-related information. Apart from the information overload the users are flooded with, they may also access false, misleading, outdated or even life-threatening information. In the last ten years, scientists have determined the requirements for an ideal health information system. In the study described in this paper, a prototype health information system providing anamnesis-related, internet-based consumer health information is evaluated. In total, the hypothesis that a computer-aided, anamnesis-related internet search provides better results than the use of conventional search engines or health portals is tested. For evaluation, a controlled two-group study with 140 participants was conducted in two study sections. In the first section it was found that, for less complex diagnoses, the prototype information system performed as well as conventional information retrieval. In the second study section it was found that, when dealing with complex headache cases, participants using the prototype determined significantly better (P=0.031) diagnoses than the control group did without prototype support. It has been shown that medical expert systems in combination with a meta-search for tailored, quality-controlled information represent a feasible strategy to provide reliable health information.
59

Active Brownian Particles with alpha Stable Noise in the Angular Dynamics: Non Gaussian Displacements, Adiabatic Eliminations, and Local Searchers

Nötel, Jörg 17 January 2019 (has links)
Das Konzept aktiver Brownscher Teilchen kann benutzt werden, um das Verhalten einfacher biologischer Organismen oder künstlicher Objekte zu beschreiben, welche die Möglichkeit besitzen, sich von selbst fortzubewegen. Als Bewegungsgleichungen für aktive Brownsche Teilchen kommen Langevin-Gleichungen zum Einsatz. In dieser Arbeit werden aktive Teilchen mit konstanter Geschwindigkeit diskutiert. Im ersten Teil der Arbeit wirkt auf die Bewegungsrichtung des Teilchens weißes alpha-stabiles Rauschen. Es werden die mittlere quadratische Verschiebung und der effektive Diffusionskoeffizient bestimmt. Eine überdämpfte Beschreibung, gültig für Zeiten groß gegenüber der Relaxationszeit, wird hergeleitet. Als experimentell zugängliche Messgröße, welche als Unterscheidungsmerkmal für die unterschiedlichen Rauscharten herangezogen werden kann, wird die Kurtosis berechnet. Neben weißem Rauschen wird der Fall eines Ornstein-Uhlenbeck-Prozesses, angetrieben von Cauchy-verteiltem Rauschen, diskutiert. Während eine normale Diffusion mit einem zum weißen Rauschen identischen Diffusionskoeffizienten bestimmt wird, kann die beobachtete Verteilung der Verschiebungen nicht gaußförmig sein. Die Zeit für den Übergang zur Gaußverteilung kann deutlich größer als die Relaxationszeit und die Zeitskala des Ornstein-Uhlenbeck-Prozesses sein. Eine Schranke für die benötigte Zeit wird durch eine Näherung der Kurtosis ermittelt. Weiterhin werden die Grundlagen eines stochastischen Modells für lokale Suche gelegt. Lokale Suche ist die Suche in der näheren Umgebung eines bestimmten Punktes, welcher Haus genannt wird. Abermals diskutieren wir ein aktives Teilchen mit unveränderlichem Absolutbetrag der Geschwindigkeit und weißem alpha-stabilem Rauschen in der Bewegungsrichtungsdynamik. Die deterministische Bewegung des Teilchens wird analysiert, bevor die Situation mit Rauschen betrachtet wird. Die stationäre Aufenthaltswahrscheinlichkeitsdichte wird bestimmt.
Es wird eine optimale Rauschstärke für die lokale Suche, das heißt für das Auffinden eines neuen Ortes in kleinstmöglicher Zeit, festgestellt. Die kleinstmögliche Zeit hängt kaum von der Rauschart ab. Wir stellen jedoch fest, dass die Rauschart deutlichen Einfluss auf die Rückkehrwahrscheinlichkeit zum Haus hat, wenn die Richtung zum Haus fehlerbehaftet ist. Weiterhin wird das Modell durch eine abstandsabhängige Kopplung an das Haus erweitert. Zum Abschluss betrachten wir eine Gruppe von Suchern. / Active Brownian particles described by Langevin equations are used to model the behavior of simple biological organisms or artificial objects that are able to perform self-propulsion. In this thesis we discuss active particles with constant speed. In the first part, we consider angular driving by white Levy-stable noise and discuss the mean squared displacement and diffusion coefficients. We derive an overdamped description for those particles that is valid at time scales larger than the relaxation time. In order to provide an experimentally accessible property that distinguishes between the considered noise types, we derive an analytical expression for the kurtosis. Afterwards, we consider an Ornstein-Uhlenbeck process driven by Cauchy noise in the angular dynamics of the particle. While we find normal diffusion with a diffusion coefficient identical to the white-noise case, we observe a non-Gaussian displacement at time scales that can be considerably larger than the relaxation time and the time scale provided by the Ornstein-Uhlenbeck process. In order to provide a limit for the time needed for the transition to a Gaussian displacement, we approximate the kurtosis. Afterwards, we lay the foundation for a stochastic model for local search. Local search is concerned with the neighborhood of a given spot called home. We consider an active particle with constant speed and alpha-stable noise in the dynamics of the direction of motion.
The deterministic motion is discussed before the noise is considered. An analytical result for the steady-state spatial density is given. We find an optimal noise strength for the local search and only a weak dependence on the considered noise types. Several extensions to the introduced model are then considered. One extension includes a distance-dependent coupling towards the home, making the model more general. Another extension, concerned with an erroneous estimate by the particle of the direction of the home, leads to the result that the return probability to the home depends on the noise type. Finally, we consider a group of searchers.
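The model described above — constant speed, noise acting only on the heading — can be sketched with a simple Euler scheme. The sketch below uses Gaussian increments, i.e. the alpha = 2 member of the alpha-stable family the thesis treats; the parameter names are ours.

```python
import math
import random

def simulate(v0=1.0, d_phi=0.5, dt=0.01, steps=5000, seed=1):
    """Active particle with constant speed v0 whose heading phi performs
    Brownian motion with rotational diffusion coefficient d_phi."""
    rng = random.Random(seed)
    x = y = phi = 0.0
    path = [(x, y)]
    for _ in range(steps):
        phi += math.sqrt(2.0 * d_phi * dt) * rng.gauss(0.0, 1.0)
        x += v0 * math.cos(phi) * dt
        y += v0 * math.sin(phi) * dt
        path.append((x, y))
    return path

path = simulate()
x_end, y_end = path[-1]
print(len(path), round(math.hypot(x_end, y_end), 2))
```

For this Gaussian special case the long-time effective diffusion coefficient in two dimensions is the standard result v0²/(2·d_phi); replacing the Gaussian increments with heavy-tailed alpha-stable ones is what produces the non-Gaussian displacements the thesis analyzes.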
60

Lack of Association between Polymorphisms of the Dopamine D4 Receptor Gene and Personality

Strobel, Alexander, Spinath, Frank M., Angleitner, Alois, Riemann, Rainer, Lesch, Klaus-Peter January 2003 (has links)
Recent studies have suggested a role of two polymorphisms of the dopamine D4 receptor gene (DRD4 exon III and –521C/T) in the modulation of personality traits such as ‘novelty seeking’ or ‘extraversion’, which are supposed to be modulated by individual differences in dopaminergic function. However, several replication studies have not provided positive findings. The present study was performed to further investigate whether DRD4 exon III and –521C/T are associated with individual differences in personality. One hundred and fifteen healthy German volunteers completed the NEO Five-Factor Inventory (NEO-FFI) and were genotyped for the two DRD4 polymorphisms. We found no association between DRD4 exon III and –521C/T, respectively, and estimated novelty seeking, NEO-FFI extraversion or other personality factors. Our findings are in line with several earlier studies which have failed to replicate the initial association results. Hence, our data do not provide evidence for a role of DRD4 exon III and the –521C/T polymorphism in the modulation of novelty seeking and extraversion. / This article is freely accessible with the consent of the rights holder under a (DFG-funded) Alliance or National Licence.
