131 |
Importance of substrate quality and clay content on microbial extracellular polymeric substances production and aggregate stability in soils
Olagoke, Folasade K.; Bettermann, Antje; Nguyen, Phuong Thi Bich; Redmile-Gordon, Marc; Babin, Doreen; Smalla, Kornelia; Nesme, Joseph; Sørensen, Søren J.; Kalbitz, Karsten; Vogel, Cordula (04 June 2024)
We investigated the effects of substrate (cellulose or starch) and different clay contents on the production of microbial extracellular polymeric substances (EPS) and the concomitant development of stable soil aggregates. Soils were incubated with different amounts of montmorillonite (+ 0.1%, + 1%, + 10%), with and without the addition of two substrates of contrasting quality (starch and cellulose). Microbial respiration (CO2), biomass carbon (C), EPS-protein, and EPS-polysaccharide were determined over the experimental period. The diversity and compositional shifts of microbial communities (bacteria/archaea) were analysed by sequencing 16S rRNA gene fragments amplified from soil DNA. The soil aggregate size distribution was determined, and the geometric mean diameter was calculated as a measure of aggregate formation. Aggregate stability was compared within the 1–2-mm size fraction. Starch amendment supported a faster increase in both respiration and microbial biomass than cellulose. Microbial community structure and composition differed depending on the C substrate added. However, clay addition had a more pronounced effect on alpha diversity than the addition of starch or cellulose. Substrate addition resulted in an increased EPS concentration only if combined with clay addition. At high clay addition, starch resulted in higher EPS concentrations than cellulose. Where additional substrate was not provided, EPS-protein was only weakly correlated with aggregate formation and stability; the relationship became stronger with the addition of substrate. Labile organic C thus clearly plays a role in aggregate formation, but increasing clay content was found to enhance aggregate stability and additionally resulted in the development of distinct microbial communities and increased EPS production.
|
132 |
Buchführungsergebnisse spezialisierter Schafbetriebe in ausgewählten Bundesländern: Wirtschaftsjahr ... [Farm accounting results of specialised sheep farms in selected federal states: financial year ...] (03 June 2024)
No description available.
|
133 |
Characterization of Influences on the Wall Stability of Deep Drawn Paperboard Shapes
Hauptmann, Marek; Majschak, Jens-Peter (08 June 2016)
Deep drawn shapes with orthogonal wall components are usually evaluated by shape accuracy and visual quality. There have been only a few investigations of the stability of such structures; however, the effect of the wrinkles on the stability of the wall is important for packaging applications and can support shape accuracy. This paper focuses on how process parameters influence the stability of the orthogonal walls of shapes produced by deep drawing with rigid tools and immediate compression. The wall stability was evaluated by tensile testing orthogonal to the direction of the wrinkles. The stability distribution was characterized with regard to the drawing height, and the two different materials were compared. The wall stability decreased with increasing forming height. Furthermore, an appropriate blank holder force design and z-directional compression level improved the wall stability. Combined with an elevated moisture content of the material and a supply of thermal energy, which delivered two to three times higher resistance against wrinkle extension, these measures drastically improved the wall stability.
|
134 |
Egyptian Christianity: an historical examination of the belief systems prevalent in Alexandria c.100 B.C.E. - 400 C.E. and their role in the shaping of early Christianity
Fogarty, Margaret Elizabeth
Thesis (MPhil)--Stellenbosch University, 2004.
This thesis sets out to examine, as far as possible within the constraints of a limited study, the nature of the Christianity professed in the first centuries of the Common Era, by means of an historical examination of Egyptian Christianity. The thesis contends that the believers in Christ's teachings in the first century were predominantly Jewish, and that "Christianity" did not exist as a developed separate religion until its first formal systematizations commenced in the second century, through the prolific writings of the Alexandrians Clement and Origen. It is noted that the name "Christianity" itself was coined for the first time in the second century by Ignatius of Antioch, and that until the fourth century it is more accurate to speak of many Christianities in view of regional-cultural and interpretative differences where the religion took root. The study examines the main religions of the world in which the new religion began to establish itself, and against which it had to contend for its very survival. Many elements of these religions influenced the rituals and formulation of the new religion and are traced through ancient Egyptian religion, the Isis and Serapis cults, Judaism, Gnosticism and Hermeticism. Alexandria, as the intellectual matrix of the Graeco-Roman world, was the key centre in which the new religion was formally developed. The thesis argues, therefore, that despite the obscurity of earliest Christianity owing to the dearth of extant sources, the emergent religion was significantly Egyptian in formulation, legacy and influence in the world of Late Antiquity. It is argued, in conclusion, that the politics of the West in making Christianity the official religion of the empire, thus centring it henceforth in Rome, effectively effaced its Egyptian roots. In line with current major research into the earliest centuries of Christianity, the thesis contends that while Jerusalem was the spring of the new religion, Alexandria, and Egypt as a whole, formed a vital tributary of the river of Christianity which was to flow through the whole world. It is argued that without the Egyptian branch, Christianity would have been a different phenomenon to what it later became. The legacy of Egyptian Christianity is not only of singular importance in the development of Christianity; attracting as it does the continued interest of researchers in the historical, papyrological and archaeological fields, it also holds considerable significance for the study of the history of religions in general, and of Christianity in particular.
|
135 |
Konzept einer an semantischen Kriterien orientierten Kommunikation für medizinische Informationssysteme [A concept for semantics-oriented communication for medical information systems]
Nguyen-Dobinsky, Trong-Nghia (03 April 1998)
Introduction: In a large university hospital such as the Charité, computer-based systems are in use in different units and for different tasks: administration, patient care, research and teaching. These subsystems are usually not able to exchange data with one another without losing the semantics contained in the data; the cause lies essentially in the complexity and vagueness of medical information. Medical standards (HL7, DICOM, SNOMED, ICD, ICPM, ...) can be used to exchange data that are readily formalisable and carry a clear meaning, but non-formalisable data, such as those that frequently occur in diagnostic reports, cannot readily be represented with these standards. Aim: Development of a concept for the exchange of medical data that avoids the problems described above. Material and methods: An analysis of the existing subsystems, standards and concepts shows that such a concept must, on the one hand, have a very simple syntax and structure while, on the other hand, fully preserving the medical semantics. The relational database serves as a model: it manages with a single data type (the relation, or table) and a single operation (SELECT) on that data type. Results: The concept is object oriented and contains only one data type, the AMICI object (AMICI: Architecture for Medical Information Exchange and Communication Interface), through which all data exchange is carried out. If the receiving system cannot interpret an object, or cannot interpret it correctly, the sending system takes over the interpretation. A subsystem is connected to the network via a medical context that describes its area of interest and its capabilities; from the medical contexts known in the network, a subsystem can determine which other subsystems might be relevant for its own purposes. Every AMICI object receives a globally unique identifier, so that data from different institutions, including international ones, can be combined, for example in multi-centre studies. Discussion: The concept can serve as a basis for further services in a hospital, notably telemedical applications in which communication is possible not only between physicians but also between patient and physician, and software agents that individually attend to a physician's information needs.
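To make the single-data-type idea concrete, here is a minimal Python sketch of an AMICI-style exchange object with a globally unique identifier, a medical-context label and a sender-supplied fallback rendering; all field and function names are illustrative assumptions, not the specification from the thesis.

    from dataclasses import dataclass, field
    from typing import Optional
    from uuid import uuid4

    @dataclass
    class AmiciObject:
        # Single exchange unit: every piece of data travels in the same envelope.
        oid: str = field(default_factory=lambda: str(uuid4()))  # globally unique identifier
        context: str = ""                                       # medical context, e.g. "radiology/findings"
        payload: dict = field(default_factory=dict)             # formalised content, if available
        rendering: Optional[str] = None                         # sender-supplied presentation as fallback

    def receive(obj: AmiciObject, can_interpret) -> str:
        # If the receiver cannot interpret the object, the sender's rendering is used instead.
        if can_interpret(obj.context):
            return f"interpreted {obj.oid}: {obj.payload}"
        return obj.rendering or "<no fallback rendering supplied>"

    report = AmiciObject(context="radiology/findings",
                         payload={"finding": "no focal lesion"},
                         rendering="Radiology report: no focal lesion detected.")
    print(receive(report, can_interpret=lambda ctx: ctx.startswith("lab/")))  # falls back to the rendering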
|
136 |
Three essays on hidden liquidity in financial markets
Cebiroglu, Gökhan (10 April 2014)
In recent years, the share of hidden liquidity on the world's trading venues has increased dramatically, moving it to the centre of regulatory debates and market-microstructure research. Yet investors, researchers and regulators disagree about its implications and the adequate regulatory response. This thesis addresses these questions in three chapters, on both empirical and theoretical grounds. Chapter 1 uses a special NASDAQ data set to investigate empirically the determinants of hidden liquidity and its impact on markets. The cross-sectional variation of hidden liquidity across the stock universe is largely explained by observable market characteristics, and hidden orders trigger significantly stronger price reactions than visible orders; the results suggest that markets with a high share of hidden liquidity are more volatile and subject to greater trading frictions. Chapter 2 develops a structural trading model and studies the optimal trading strategy with hidden orders in limit order book markets. The optimal exposure size marks a trade-off between the costs and benefits of exposure; explicit characterisations of this size are derived for various market specifications, and model parameters and exposure strategies are estimated from high-frequency order book data. The results suggest that the use of hidden orders can substantially reduce transaction costs and enhance trade performance. Chapter 3 develops a dynamic equilibrium model of a limit order book market with an off-exchange trading mechanism, within which the empirical observations of the first two chapters can be rationalised. In particular, large hidden orders cause excess returns and amplify price fluctuations by weakening the coordination between the supply and demand sides, and the model correctly predicts the role of observable market characteristics in the origination of hidden liquidity.
|
137 |
Ein Repräsentationsformat zur standardisierten Beschreibung und wissensbasierten Modellierung genomischer Expressionsdaten [A representation format for the standardised description and knowledge-based modelling of genomic expression data]
Schober, Daniel (08 June 2006)
The analysis of microarray data often begins with information-retrieval approaches that reduce the mass of data to a manageable set of genes or probe-set IDs that are particularly interesting for a given question. Efficient searching of such data, however, requires that the data formats used carry formalised semantics. This work presents an ontology as a standardised, semantically defined representation format that allows domain knowledge to be formalised in an interactive knowledge model which can be queried comprehensively, interpreted consistently and, where required, processed automatically. Using a molecular-biology ontology of 1,200 hierarchically structured concepts and the Toll-like receptor signalling pathway as an example, it is shown how such an object-oriented description vocabulary can be used to annotate genes on Affymetrix microarrays. The resulting GandrKB (gene annotation data representation) knowledgebase uses Protégé-2000 for editing, querying and visualising microarray data and annotations. Genes can be annotated with provided, newly created or imported ontological concepts; in practice, annotation amounts to a 'drag and drop' of one or more gene instances onto the concept that describes their function, thereby embedding each gene in a defined functional context. Further contextual annotation is achieved by relating genes to other concepts or genes via relational slots, and annotated genes can inherit the properties of their assigned concepts. The resulting knowledge model can be visualised as an interactive semantic network whose nodes and edges represent genes, their annotations and their functional relationships, allowing immediate, associative, context-guided exploration; ontologically annotated gene data also lend themselves to automatic, data-driven visualisation strategies. An ontological query interface permits semantically complex queries over properties and relationships with increased recall and precision, and enables lab-bench scientists to query for implicit domain knowledge inferred from the ontological domain model. Compared with unstructured annotation systems, the annotation process itself becomes faster and, owing to the constraints enforced by the ontology, yields annotation schemes of better quality.
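As a rough illustration of annotation-by-concept and inference-based querying (a toy Python sketch, not the actual Protégé-2000/GandrKB interface; the concepts, gene symbols and is-a links are assumed for the example):

    # Toy is-a hierarchy: child concept -> parent concept.
    is_a = {
        "TLR signalling": "signal transduction",
        "signal transduction": "biological process",
    }
    # Annotation = assigning a gene instance to the concept that describes its function.
    annotations = {"TLR4": "TLR signalling", "MYD88": "TLR signalling", "GAPDH": "biological process"}

    def ancestors(concept):
        # Walk up the is-a hierarchy from a concept towards the root.
        while concept in is_a:
            concept = is_a[concept]
            yield concept

    def query(concept):
        # Ontological query: return genes annotated to the concept or to any concept below it.
        return [gene for gene, c in annotations.items()
                if c == concept or concept in ancestors(c)]

    print(query("signal transduction"))  # -> ['TLR4', 'MYD88'], found via inference over the is-a links

A query for "biological process" would additionally return GAPDH, mirroring the increased recall that such hierarchical queries provide over flat keyword lookup.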
|
138 |
Word-sense disambiguation in biomedical ontologies
Alexopoulou, Dimitra (12 January 2011)
With the ever-increasing volume of biomedical literature, text mining has emerged as an important technology to support bio-curation and search. Word-sense disambiguation (WSD), the correct identification of the intended sense of an ambiguous term in text, is an important problem in text mining. Since the late 1940s, many approaches based on supervised machine learning (decision trees, naive Bayes, neural networks, support vector machines) and unsupervised machine learning (context clustering, word clustering, co-occurrence graphs) have been developed, as have knowledge-based methods that make use of the WordNet computational lexicon. Only a few approaches, however, make use of ontologies, i.e. hierarchical controlled vocabularies, to solve the problem, and none exploit inference over ontologies or the use of metadata from publications.
This thesis addresses the WSD problem in biomedical ontologies by proposing approaches that use ontologies and metadata. The "Closest Sense" method assumes that the ontology defines multiple senses of the term and computes the shortest path from co-occurring terms in the document to one of these senses. The "Term Cooc" method defines a log-odds ratio for co-occurring terms, including inferred co-occurrences. The "MetaData" approach trains a classifier on metadata; it does not require an ontology, but it does require training data, which the other methods do not. The approaches are compared on a manually curated corpus of 2,600 documents for seven ambiguous terms from the Gene Ontology and MeSH. Averaged over all conditions, all approaches achieve a success rate of about 80%. The MetaData approach performs best, reaching 96% when trained on high-quality data, although its performance deteriorates as the quality of the training data decreases. The Term Cooc approach performs better on the Gene Ontology (92% success) than on MeSH (73% success), as MeSH is not a strict is-a/part-of hierarchy but rather a loose is-related-to hierarchy. The Closest Sense approach achieves an average success rate of 80%.
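As an illustration of the co-occurrence idea behind the "Term Cooc" approach, the following minimal Python sketch scores candidate senses by a smoothed log-odds of co-occurrence with the terms surrounding the ambiguous word; the exact formula, the smoothing and the example counts are assumptions, and in the thesis the counts additionally include co-occurrences inferred over the ontology.

    import math

    def log_odds(k_joint, k_sense, k_term, n_docs, eps=0.5):
        # Smoothed log-odds that a context term and a candidate sense co-occur
        # more often than expected under independence.
        p_joint = (k_joint + eps) / (n_docs + eps)
        p_indep = ((k_sense + eps) / (n_docs + eps)) * ((k_term + eps) / (n_docs + eps))
        return math.log(p_joint / p_indep)

    def disambiguate(context_terms, senses, cooc, marginals, n_docs):
        # Score each candidate sense by its summed log-odds with the context terms
        # and return the best-scoring sense.
        scores = {s: sum(log_odds(cooc.get((s, t), 0), marginals.get(s, 0),
                                  marginals.get(t, 0), n_docs)
                         for t in context_terms)
                  for s in senses}
        return max(scores, key=scores.get)

    # Hypothetical counts for the ambiguous term "development" with two candidate senses.
    cooc = {("GO:development", "embryo"): 40, ("MeSH:photography", "embryo"): 1}
    marginals = {"GO:development": 300, "MeSH:photography": 200, "embryo": 90}
    print(disambiguate(["embryo"], ["GO:development", "MeSH:photography"], cooc, marginals, 10000))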
Furthermore, the thesis showcases applications ranging from ontology design to semantic search where WSD is important.
|
139 |
Soil Chemical and Microbial Properties in a Mixed Stand of Spruce and Birch in the Ore Mountains (Germany) - A Case Study
Schua, Karoline; Wende, Stefan; Wagner, Sven; Feger, Karl-Heinz (27 July 2015)
A major argument for incorporating deciduous tree species in coniferous forest stands is their role in the amelioration and stabilisation of biogeochemical cycles. Current forest management strategies in central Europe aim to increase the area of mixed stands. In order to formulate statements about the ecological effects of mixtures, studies at the stand level are necessary. In a mixed stand of Norway spruce (Picea abies (L.) Karst.) and silver birch (Betula pendula Roth) in the Ore Mountains (Saxony, Germany), the effects of these two tree species on chemical and microbial parameters in the topsoil were studied at one site in the form of a case study. Samples were taken from the O layer and A horizon in areas of the stand influenced by either birch, spruce or a mixture of birch and spruce. The microbial biomass, basal respiration, metabolic quotient, pH-value and the C and N contents and stocks were analysed in the horizons Of, Oh and A. Significantly higher contents of microbial N were observed in the Of and Oh horizons in the birch and in the spruce-birch strata than in the stratum containing only spruce. The same was found with respect to pH-values in the Of horizon and basal respiration in the Oh horizon. Compared to the spruce stratum, in the birch and spruce-birch strata, significantly lower values were found for the contents of organic C and total N in the A horizon. The findings of the case study indicated that single birch trees have significant effects on the chemical and microbial topsoil properties in spruce-dominated stands. Therefore, the admixture of birch in spruce stands may distinctly affect nutrient cycling and may also be relevant for soil carbon sequestration. Further studies of these functional aspects are recommended.
|
140 |
Automated Patent Categorization and Guided Patent Search using IPC as Inspired by MeSH and PubMed
Eisinger, Daniel (08 September 2014)
The patent domain is a very important source of scientific information that is currently not used to its full potential. Searching for relevant patents is a complex task because the number of existing patents is very high and grows quickly, patent text is extremely complicated, and standard vocabulary is not used consistently or doesn’t even exist. As a consequence, pure keyword searches often fail to return satisfying results in the patent domain. Major companies employ patent professionals who are able to search patents effectively, but even they have to invest a lot of time and effort into their search. Academic scientists on the other hand do not have access to such resources and therefore often do not search patents at all, but they risk missing up-to-date information that will not be published in scientific publications until much later, if it is published at all.
Document search on PubMed, the pre-eminent database for biomedical literature, relies on the annotation of its documents with relevant terms from the Medical Subject Headings ontology (MeSH) for improving recall through query expansion. Similarly, professional patent searches expand beyond keywords by including class codes from various patent classification systems. However, classification-based searches can only be performed effectively if the user has very detailed knowledge of the system, which is usually not the case for academic scientists. Consequently, we investigated methods to automatically identify relevant classes that can then be suggested to the user to expand their query. Since every patent is assigned at least one class code, it should be possible for these assignments to be used in a similar way as the MeSH annotations in PubMed.
In order to develop a system for this task, it is necessary to have a good understanding of the properties of both classification systems. In order to gain such knowledge, we perform an in-depth comparative analysis of MeSH and the main patent classification system, the International Patent Classification (IPC). We investigate the hierarchical structures as well as the properties of the terms/classes respectively, and we compare the assignment of IPC codes to patents with the annotation of PubMed documents with MeSH terms. Our analysis shows that the hierarchies are structurally similar, but terms and annotations differ significantly. The most important differences concern the considerably higher complexity of the IPC class definitions compared to MeSH terms and the far lower number of class assignments to the average patent compared to the number of MeSH terms assigned to PubMed documents.
As a result of these differences, problems arise both for inexperienced patent searchers and for professionals. On the one hand, the complex term system makes it very difficult for members of the former group to find any IPC classes that are relevant for their search task. On the other hand, the low number of IPC classes per patent points to incomplete class assignments by the patent office, thereby limiting the recall of the classification-based searches that are frequently performed by the latter group. We approach these problems from two directions: first, by automatically assigning additional patent classes to make up for the missing assignments, and second, by automatically retrieving relevant keywords and classes that are proposed to the user so they can expand their initial search.
For the automated assignment of additional patent classes, we adapt an approach to the patent domain that was successfully used for the assignment of MeSH terms to PubMed abstracts. Each document is assigned a set of IPC classes by a large set of binary Maximum-Entropy classifiers. Our evaluation shows good performance by individual classifiers (precision/recall between 0.84 and 0.90), making the retrieval of additional relevant documents for specific IPC classes feasible. The assignment of additional classes to specific documents is more problematic, since the precision of our classifiers is not high enough to avoid false positives. However, we propose filtering methods that can help solve this problem.
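The classification setup can be pictured roughly as follows; this is a sketch only, using scikit-learn's logistic regression as the maximum-entropy classifier, with invented documents and IPC codes, and it does not reproduce the thesis's actual feature engineering or filtering.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.multiclass import OneVsRestClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import MultiLabelBinarizer

    # Invented toy data: patent texts and their (possibly incomplete) IPC class assignments.
    docs = ["an isolated polypeptide encoding a receptor ...",
            "a combustion engine with variable valve timing ...",
            "a vector expressing a recombinant protein ..."]
    labels = [["C07K", "C12N"], ["F02B"], ["C12N"]]

    mlb = MultiLabelBinarizer()
    y = mlb.fit_transform(labels)  # one binary target column per IPC class

    # One binary maximum-entropy (logistic regression) classifier per IPC class,
    # trained on TF-IDF features of the patent text.
    model = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2)),
        OneVsRestClassifier(LogisticRegression(max_iter=1000)),
    )
    model.fit(docs, y)

    # Predicted classes can be proposed as additional assignments for a new patent.
    print(mlb.inverse_transform(model.predict(["a polypeptide receptor construct ..."])))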
For the guided patent search, we demonstrate various methods to expand a user's initial query. Our methods use both the keywords and the class codes that the user enters to retrieve additional relevant keywords and classes that are then suggested to the user. These additional query components are extracted from different sources such as patent text, IPC definitions, external vocabularies and co-occurrence data. The suggested expansions can help inexperienced users refine their queries with relevant IPC classes, and professionals can compose their complete query faster and more easily. We also present GoPatents, a patent retrieval prototype that incorporates some of our proposals and makes faceted browsing of a patent corpus possible.
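One simple way such class suggestions could be derived from co-occurrence data is sketched below; the function, corpus and ranking are hypothetical, and the thesis also draws on patent text, IPC definitions and external vocabularies for its suggestions.

    from collections import Counter

    def suggest_classes(query_classes, patent_class_sets, top_n=5):
        # Rank IPC classes by how often they co-occur with the classes already in the query.
        counts = Counter()
        for assigned in patent_class_sets:
            if assigned & query_classes:
                counts.update(assigned - query_classes)
        return [c for c, _ in counts.most_common(top_n)]

    # Hypothetical class assignments of previously retrieved patents.
    corpus = [{"C07K14", "C12N15", "A61K38"}, {"C12N15", "C12Q1"}, {"F02B75"}]
    print(suggest_classes({"C12N15"}, corpus))  # -> e.g. ['C07K14', 'A61K38', 'C12Q1']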
|