161 |
Entwicklung und Bewertung von Algorithmen zur Umfeldmodellierung mithilfe von Radarsensoren im Automotive Umfeld. Lochbaum, Julius. 10 March 2022
Modern vehicles are equipped with various assistance systems that can support the driver. For these systems to react to the current traffic situation, the vehicle's surroundings must be detected. Various sensors are used for this purpose to capture the environment.
Distances and velocities, for example, can be determined very well with radar sensors. A virtual environment model can be constructed from these sensor data.
This requires several algorithms that preprocess and evaluate the sensor data.
To test these algorithms, BMW developed a tool for visualizing the sensor data. The "Radar Analysis Tool" (RAT) can replay the sensor data of recorded test drives and makes it possible to test implemented algorithms against the recorded data.
The goal of this thesis is to implement several algorithms for environment modeling and to compare them with respect to various properties. New approaches are also developed and implemented. The comparison shows which algorithms are particularly well suited for use in the vehicle. A prerequisite for implementing the algorithms is an extension of the RAT tool: among other things, a new 2D view of the vehicle's surroundings is added, and a new interface is developed that allows algorithms to be integrated into RAT easily. In addition, a configuration mechanism is implemented for selecting between the different algorithms, so that no changes to the tool's source code are required to switch algorithms.
To compare the algorithms, an evaluation concept is designed. The evaluation reveals the advantages and disadvantages of the respective methods, and these results help in selecting algorithms for final use in the vehicle.
162 |
Advanced Methods for Entity Linking in the Life Sciences. Christen, Victor. 25 January 2021
The amount of available knowledge grows rapidly with the increasing number of data sources. However, the autonomy of these data sources and the resulting heterogeneity hinder comprehensive data analysis and applications.
Data integration aims to overcome this heterogeneity by unifying different data sources and enriching unstructured data. Data enrichment consists of several subtasks, among them the annotation process, which links document phrases to terms of a standardized vocabulary. Annotated documents enable effective retrieval methods, comparability across documents, and comprehensive data analysis, such as finding adverse drug effects based on patient data.
A vocabulary enables comparability through standardized terms. A vocabulary can also be represented by an ontology, which additionally defines concepts, relationships, and logical constraints. The annotation process is applicable in different domains; nevertheless, generic and specialized domains differ with respect to the annotation process. This thesis emphasizes these differences and addresses the identified challenges. The majority of annotation approaches focuses on the evaluation of general domains such as Wikipedia. This thesis evaluates the developed annotation approaches on case report forms, i.e., medical documents used for examining clinical trials. Natural language poses various challenges, such as expressing similar meanings with different phrases. The proposed annotation method, AnnoMap, accounts for this fuzziness of natural language. A further challenge is the reuse of verified annotations: existing annotations represent knowledge that can be reused in further annotation processes. AnnoMap therefore includes a reuse strategy that utilizes verified annotations to link new documents to appropriate concepts. Due to the broad spectrum of areas in the biomedical domain, different annotation tools exist, and they perform differently depending on the domain. This thesis proposes a combination approach to unify the results of different tools: existing tool results are used to build a classification model that classifies new annotations as correct or incorrect.
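To illustrate the idea of combining tool results (a generic sketch, not the implementation described in the thesis), the following Python fragment trains a classifier on features derived from the outputs of several annotation tools; the tool features, scores, and labels are invented assumptions.

```python
# Hedged sketch: combining annotation-tool outputs with a classifier.
# Tool names, feature choices, scores, and labels are illustrative assumptions.
from sklearn.ensemble import RandomForestClassifier

# Each candidate annotation (phrase-concept pair) is described by per-tool evidence:
# [tool_a_hit, tool_a_score, tool_b_hit, tool_b_score, string_similarity]
train_features = [
    [1, 0.92, 1, 0.85, 0.90],
    [1, 0.40, 0, 0.00, 0.35],
    [0, 0.00, 1, 0.77, 0.60],
]
train_labels = [1, 0, 0]  # 1 = verified correct annotation, 0 = incorrect

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(train_features, train_labels)

# Classify new candidate annotations produced by the tools.
new_candidates = [[1, 0.88, 1, 0.91, 0.95]]
print(model.predict(new_candidates))        # e.g. [1] -> keep the annotation
print(model.predict_proba(new_candidates))  # confidence of the decision
```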
The results show that both the reuse strategy and the machine-learning-based combination improve annotation quality compared to existing approaches that focus on the biomedical domain.
A further part of data integration is entity resolution, which builds unified knowledge bases from different data sources. A data source consists of a set of records characterized by attributes, and the goal of entity resolution is to identify records that represent the same real-world entity. Many methods focus on linking data sources whose records are characterized only by attributes; few methods can handle graph-structured knowledge bases or consider temporal aspects. Temporal aspects are essential for identifying the same entities across different time intervals, since records describing the same entity may change over time. Moreover, records can be related to other records, so that a small graph structure exists for each record. These small graphs can be linked to each other if they represent the same entity. This thesis proposes an entity resolution approach for census data consisting of person records from different time intervals. The approach also considers the graph structure over persons given by family relationships.
To achieve high-quality results, current methods apply machine-learning techniques to classify record pairs as matches or non-matches. The classification relies on a model generated from training data, in this case a set of record pairs labeled as duplicates or non-duplicates. Since generating training data is a time-consuming task, active learning techniques are relevant for reducing the number of required training examples.
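As an illustration of how active learning can reduce labeling effort, the sketch below runs a generic pool-based uncertainty-sampling loop over record-pair feature vectors; the features, the simulated oracle, and the use of logistic regression are assumptions for the example and do not reproduce the specific approach developed in the thesis.

```python
# Hedged sketch: pool-based active learning with uncertainty sampling.
# Record pairs are represented by invented similarity feature vectors.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
pool = rng.random((500, 4))                            # unlabeled record-pair features
oracle = (pool[:, 0] + pool[:, 1] > 1.2).astype(int)   # stand-in for a human labeler

# Seed with one match and one non-match, then query the most uncertain pairs.
labeled_idx = [int(np.argmax(oracle)), int(np.argmin(oracle))]
budget = 50

model = LogisticRegression()
while len(labeled_idx) < budget:
    model.fit(pool[labeled_idx], oracle[labeled_idx])
    unlabeled = [i for i in range(len(pool)) if i not in labeled_idx]
    proba = model.predict_proba(pool[unlabeled])[:, 1]
    # Uncertainty sampling: pick the pair whose match probability is closest to 0.5.
    query = unlabeled[int(np.argmin(np.abs(proba - 0.5)))]
    labeled_idx.append(query)          # ask the oracle to label this pair

print("record pairs labeled:", len(labeled_idx))
```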
The entity resolution method for temporal, graph-structured data shows an improvement over previous collective entity resolution approaches. The developed active learning approach achieves results comparable to supervised learning methods and outperforms other limited-budget active learning methods.
Besides the entity resolution approach, the thesis introduces the concept of evolution operators for communities. These operators express the dynamics of communities and individuals; for instance, they can express that two communities merged or split over time. Moreover, the operators make it possible to trace the history of individuals.
Overall, the presented annotation approaches generate high-quality annotations for medical forms. The annotations enable comprehensive analyses across different data sources as well as accurate queries. The proposed entity resolution approaches improve on existing ones and thus contribute to the generation of high-quality knowledge graphs and to data analysis tasks.
163 |
Algorithms for Map Generation and Spatial Data Visualization in LIFE. Lin, Ying-Chi. 27 February 2018
The goal of this master thesis is to construct a software system, named the LIFE Spatial Data Visualization System (LIFE-SDVS), that automatically visualizes the data obtained in the LIFE project in a spatial manner. LIFE stands for the Leipzig Research Centre for Civilization Diseases. It is part of the Medical Faculty of the University of Leipzig and conducts a large medical research project focusing on civilization diseases in the Leipzig population. Currently, more than 20,000 participants have joined this population-based cohort study. The analyses in LIFE have so far been mostly limited to non-spatial aspects. To integrate a geographical facet into the findings, a spatial visualization tool is necessary. Hence, LIFE-SDVS, an automatic map visualization tool wrapped in an interactive web interface, is constructed. LIFE-SDVS is conceptualized with a three-layered architecture consisting of a data source layer, a functionality layer, and a spatial visualization layer. The implementation of LIFE-SDVS rests on two software components: an independent, self-contained R package, lifemap, and the LIFE Shiny Application. The package lifemap enables the automatic spatial visualization of statistics on the map of Leipzig and, to the best of the author's knowledge, is the first R package to achieve boundary labeling for maps. The package lifemap also contains two self-developed algorithms. The Label Positioning Algorithm finds good positions within each map region for placing labels and statistical graphics and for serving as starting points for boundary-label leaders. The Label Alignment Algorithm solves the leader intersection problem of boundary labeling.
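To make the leader intersection problem concrete, the following sketch merely counts crossings between straight leaders using a standard segment-intersection test; it is written in Python with invented coordinates and is not the Label Alignment Algorithm implemented in lifemap.

```python
# Hedged sketch: counting crossings between boundary-label leaders.
# Each leader is a straight segment from an anchor inside a map region
# to a label placed on the map boundary (coordinates are made up).

def orientation(p, q, r):
    """Sign of the cross product (q - p) x (r - p)."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])

def segments_cross(a, b, c, d):
    """True if segment a-b properly intersects segment c-d."""
    return (orientation(a, b, c) * orientation(a, b, d) < 0 and
            orientation(c, d, a) * orientation(c, d, b) < 0)

leaders = [  # (anchor_in_region, label_on_boundary)
    ((2, 2), (0, 5)),
    ((3, 1), (0, 3)),
    ((4, 4), (0, 1)),
]

crossings = sum(
    segments_cross(*leaders[i], *leaders[j])
    for i in range(len(leaders))
    for j in range(i + 1, len(leaders))
)
print("leader crossings:", crossings)  # an alignment algorithm would aim for 0
```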
However, to use the plotting functions in lifemap, users need basic knowledge of R, and manually entering the argument values whenever the maps have to be changed is a tedious job. An interactive Shiny web application, the LIFE Shiny Application, is therefore built to provide a user-friendly data exploration and map generation tool. The LIFE Shiny Application can obtain experimental data directly from the LIFE database at runtime. Additionally, a data preprocessing unit transforms the raw data into the format needed for spatial visualization. On the LIFE Shiny Application user interface, users can specify the data to display, including which data are fetched from the database and which part of the data is visualized, using the filter functions provided. Many map features are also available to improve the aesthetic presentation of the maps, and the resulting maps can be downloaded for further use in scientific publications or reports. Two use cases, based on LIFE hand grip strength and body mass index data, demonstrate the functionalities of LIFE-SDVS. The current LIFE-SDVS lays a foundation for the spatial visualization of LIFE data; suggestions for adding further functionality in future versions are also provided.
|
164 |
Optimierung der Visualisierung eines Dashboards für das Microservice-Monitoring. Urban, Dario. 29 November 2021
Microservice architectures are now well established and are being adopted by more and more companies. However, the increased complexity of a microservice architecture, which results from the distribution of the system, makes efficient and successful administration of the system more difficult.
The goal of this thesis is to identify all metrics required for efficient and successful microservice monitoring and, based on these findings, to extend the Linkerd dashboard as a prototype. To this end, a literature review was conducted and practitioners were surveyed by means of an online questionnaire. Finally, the prototype extension was evaluated in a semi-structured interview.
The literature review showed that central processing unit (CPU) and random access memory (RAM) usage, response time, workload, error rate, and service interaction form a set of metrics with which microservice architectures can be monitored effectively. It also showed that these metrics are presented mainly through visualizations.
CPU and RAM utilization are a sensible extension of the Linkerd dashboard, since both the literature and the questionnaire identify them as important metrics, while all other metrics rated as essential are already covered by the Linkerd dashboard.
The prototype was judged successful, but it needs some minor improvements to its visualization before it can be used in production.
165 |
Algorithmische Gouvernementalität am Beispiel von OKCupid. Wandelt, Alina. 16 August 2021
No description available.
166 |
Entwicklung und Evaluation der Darstellung von Testabdeckungen in Getaviz. Sillus, Aaron. 29 September 2021
Software visualization uses, among other techniques, three-dimensional models to represent software. These models allow software projects to be explored by interacting with a 3D scene. As part of its research in this field, the Institute of Business Information Systems at Leipzig University develops the Getaviz tool, which comprises various functions to support the analysis of software. In this thesis, an extension for visualizing test coverage in Getaviz is developed. Techniques from usability engineering are applied to achieve a high degree of usability. In particular, the development proceeds in several iterations, in each of which the design is assessed in a formative study and adjusted for the next iteration. The development process and the final state are also documented in a repository on GitHub (https://github.com/AaronSil/Getaviz/tree/development).

Table of Contents
List of Figures
List of Tables
1 Introduction
1.1 Motivation and Problem Statement
1.2 Objective and Structure of the Thesis
2 Fundamentals
2.1 Software Visualization
2.2 Getaviz
2.3 Test Coverage
2.4 Test Coverage in Software Visualization
2.5 Usability Engineering
3 Design of the Prototype
3.1 Approach
3.2 Requirements Analysis
3.2.1 Scope Definition
3.2.2 Functional Requirements
3.2.3 Non-Functional Requirements
3.2.4 Objectives for the First Iteration
4 Design of the Evaluation
4.1 Object and Design of the Study
4.2 Methods
4.3 Test Design
4.3.1 Preparation and Setup
4.3.2 Execution
4.3.3 Follow-Up
5 Conducting the Evaluation
5.1 Sample Composition
5.2 First Iteration
5.3 Second Iteration
5.4 Third Iteration
6 Implementation of the Prototype
6.1 Extension of the Generator
6.2 Source Code Controller
6.3 Treemap
6.4 Spheres
6.5 Color Coding
6.6 Color Controller
6.7 Experiment Popover Window
6.8 Tooltip Controller
6.9 Package Explorer
6.10 Other Features
7 Study Results
7.1 Categorization of the Results
7.2 Interpretation of the Results
7.3 Discussion
8 Conclusion and Outlook
Bibliography
Declaration of Authorship
Appendix
Appendix 1: Questionnaire
Appendix 2: Interview Guide
Appendix 3: Key Data of the Iterations
Appendix 4: Scenarios
Appendix 5: Implementation of the Prototype
Appendix 6: Evaluation - Questionnaire
Appendix 7: Evaluation - Findings
167 |
Unraveling the genetic secrets of ancient Baikal amphipods. Rivarola-Duarte, Lorena. 24 August 2021
Lake Baikal is the oldest, the largest by volume, and the deepest freshwater lake on Earth. It is characterized by an outstanding diversity of endemic faunas, with more than 350 amphipod species and subspecies (Amphipoda, Crustacea, Arthropoda). Amphipods are the dominant benthic organisms in the lake and contribute substantially to the overall biomass. Eulimnogammarus verrucosus, E. cyaneus, and E. vittatus, in particular, serve as emerging models in ecotoxicological studies.
It was therefore necessary to investigate whether these endemic littoral amphipod species form genetically separate populations across Baikal, in order to scrutinize whether results obtained with samples from a single location (Bolshie Koty, where the biological station is located), for example on stress responses, can be extrapolated to the whole lake. The genetic diversity within these three endemic littoral amphipod species was determined based on fragments of cytochrome c oxidase subunit I (COI) and, for E. verrucosus only, 18S rDNA. Gammarus lacustris, a Holarctic species living in water bodies near Baikal, was examined for comparison. The intra-specific genetic diversities within E. verrucosus and E. vittatus (13% and 10%, respectively) were similar to inter-species differences, indicating the occurrence of cryptic, morphologically highly similar species; this was confirmed with 18S rDNA for E. verrucosus. The haplotypes of E. cyaneus and G. lacustris specimens were more homogeneous, with intra-specific genetic distances of 3% and 2%, respectively, indicating no disruption, or only a recent disruption, of gene flow of E. cyaneus across Baikal, and a recent colonization of water bodies around Baikal by G. lacustris. The data provide the first clear evidence for the formation of cryptic (sub)species within endemic littoral amphipod species of Lake Baikal and mark the inflows and the outflow of large rivers as dispersal barriers.
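For readers unfamiliar with such percentages, the sketch below computes uncorrected pairwise distances (p-distances) between aligned COI fragments; the toy sequences are invented, and the thesis may have used a different distance model, so this only illustrates the general idea.

```python
# Hedged sketch: uncorrected p-distance between aligned COI fragments.
# Sequences are invented toy data, not actual haplotypes from the study.

def p_distance(seq1, seq2):
    """Fraction of differing sites among compared (non-gap) positions."""
    pairs = [(a, b) for a, b in zip(seq1, seq2) if a != '-' and b != '-']
    diffs = sum(1 for a, b in pairs if a != b)
    return diffs / len(pairs)

haplotypes = {
    "E_verrucosus_loc1": "ATGGCACTTTTATTAGGAGC",
    "E_verrucosus_loc2": "ATGGCTCTATTATTAGGGGC",
    "G_lacustris":       "ATGACACTTCTTTTAGGAGC",
}

names = list(haplotypes)
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        d = p_distance(haplotypes[names[i]], haplotypes[names[j]])
        print(f"{names[i]} vs {names[j]}: {d:.2%}")
```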
Lake Baikal has provided a stable environment for millions of years, in stark contrast to the small, transient water bodies in its immediate vicinity. A highly diverse endemic amphipod fauna is found in the one habitat but not the other. To gain more insight into, and help explain, the immiscibility barrier between the faunas of Lake Baikal and non-Baikal environments, the differences in their stress response pathways were studied. To this end, exposure experiments with increasing temperature and a heavy metal (cadmium) as proteotoxic stressors were conducted in Russia. High-quality de novo transcriptome assemblies covering multiple conditions were obtained for three amphipod species: the Baikal endemics E. verrucosus and E. cyaneus, and the Holarctic G. lacustris as a potential invader. Comparing the transcriptomic stress responses showed that both Baikal species possess intact stress response systems and respond to elevated temperature with relatively similar changes in their expression profiles. G. lacustris reacts less strongly to the same stressors, possibly because its transcriptome is already perturbed by the acclimation conditions (matching the Lake Baikal littoral).
Comprehensive genomic resources are of utmost importance for ecotoxicological and ecophysiological studies in an evolutionary context, especially considering the exceptional value of Baikal as a UNESCO World Heritage Site. In that context, the results presented here on the genome of Eulimnogammarus verrucosus constitute a first major step towards establishing genomic sequence resources for a Baikalian amphipod (beyond mitochondrial genomes and gene expression data in the form of de novo transcriptome assemblies). Based on data from a survey of the genome (a single lane of paired-end Illumina HiSeq 2000 reads, 3X) as well as the full dataset (two complete flow cells, 46X), the genome size was estimated at nearly 10 Gb from the k-mer spectra and from the coverage of highly conserved miRNAs, Hox genes, and other Sanger-sequenced genes. At least two-thirds of the genome is non-unique DNA, and no less than half of the genomic DNA is composed of just five families of repetitive elements, including low-complexity sequences. Some of the repeat families found in high abundance in E. verrucosus appear to be species-specific or Baikal-specific.
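The k-mer-based estimate rests on a standard argument: the number of non-error k-mers divided by the k-mer coverage at the main peak of the spectrum approximates the genome size. The sketch below illustrates this with an invented spectrum; the numbers are not the actual survey data.

```python
# Hedged sketch: genome size estimation from a k-mer spectrum.
# The spectrum below is invented; real spectra come from k-mer counting tools.

# k-mer spectrum: multiplicity -> number of distinct k-mers with that multiplicity
spectrum = {1: 3.0e10, 2: 5.0e9, 20: 1.4e9, 21: 3.2e9, 22: 3.1e9, 23: 1.5e9, 40: 2.0e8}

# Total k-mers, excluding low-multiplicity k-mers that are mostly sequencing errors
error_cutoff = 3
total_kmers = sum(m * count for m, count in spectrum.items() if m >= error_cutoff)

# Multiplicity of the main peak approximates the effective k-mer coverage
peak = max((m for m in spectrum if m >= error_cutoff), key=lambda m: spectrum[m])

genome_size = total_kmers / peak
print(f"estimated genome size: {genome_size / 1e9:.1f} Gb (peak coverage {peak}x)")
```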
Attempts to use off-the-shelf assembly tools on the available low-coverage data, both before and after removal of the highly repetitive components, as well as on the full dataset, resulted in extremely fragmented assemblies. Nevertheless, the analysis of coverage in Hox genes and their homeoboxes showed no clear evidence for paralogs, indicating that a genome duplication did not contribute to the large genome size. Several mate-pair libraries with larger insert sizes than the 2 kb used here, together with long-read sequencing technology and semi-automated methods for genome assembly, seem necessary to obtain a reliable assembly for this species.
168 |
Jahresspiegel / Universität Leipzig. Universität Leipzig. January 2014
With the "Jahresspiegel", now in its third year of publication, the university leadership reports on key figures of the university's development and on the results of its performance processes in research, teaching, and administration for the past year, 2013.
169 |
Adding Threshold Concepts to the Description Logic EL. Fernández Gil, Oliver. 18 May 2016
We introduce a family of logics that extend the lightweight Description Logic EL and allow concepts to be defined in an approximate way. The main idea is to use a graded membership function m, which for each individual and concept yields a number in the interval [0,1] expressing the degree to which the individual belongs to the concept. Threshold concepts C~t for ~ in {<,<=,>,>=} then collect all the individuals that belong to C with degree ~t. We further study this framework in two particular directions. First, we define a specific graded membership function deg and investigate the complexity of reasoning in the resulting Description Logic tEL(deg) w.r.t. both the empty terminology and acyclic TBoxes. Second, we show how to turn concept similarity measures into membership degree functions. It turns out that under certain conditions such functions are well defined and therefore induce a wide range of threshold logics. Finally, we present preliminary results on the computational complexity landscape of reasoning in this large family of threshold logics.
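The threshold construction described above can be written out formally; the following rendering is a paraphrase of this abstract, and its notation may differ in detail from the thesis.

```latex
\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}
% Threshold-concept semantics as described in the abstract (notation assumed):
% m^I(d,C) is the graded membership degree of individual d in concept C.
\[
  (C_{\bowtie t})^{\mathcal{I}} = \{\, d \in \Delta^{\mathcal{I}} \mid m^{\mathcal{I}}(d, C) \bowtie t \,\},
  \qquad \bowtie \in \{<, \le, >, \ge\},\ t \in [0,1].
\]
\end{document}
```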
170 |
Efficient Source Selection For SPARQL Endpoint Query Federation. Saleem, Muhammad. 13 May 2016
The Web of Data has grown enormously over the last years. Currently, it comprises a large compendium of linked and distributed datasets from multiple domains. Due to the decentralised architecture of the Web of Data, several of these datasets contain complementary data. Running complex queries on this compendium thus often requires accessing data from different data sources within one query. The abundance of datasets and the need to run complex queries have thus motivated a considerable body of work on SPARQL query federation systems, the dedicated means to access data distributed over the Web of Data.
This thesis addresses two key areas of federated SPARQL query processing: (1) efficient source selection, and (2) comprehensive SPARQL benchmarks to test and rank federated SPARQL engines as well as triple stores.
Efficient Source Selection: Efficient source selection is one of the most important optimization steps in federated SPARQL query processing. An overestimation of the query-relevant data sources increases network traffic, results in irrelevant intermediate results, and can significantly affect the overall query processing time. Previous works have focused on generating optimized query execution plans for fast result retrieval. However, devising source selection approaches that go beyond triple-pattern-wise source selection has not received much attention, and only little attention has been paid to the effect of duplicated data on federated querying. This thesis presents HiBISCuS and TBSS, novel hypergraph-based source selection approaches, and DAW, a duplicate-aware source selection approach for federated querying over the Web of Data. Each of these approaches can be combined directly with existing SPARQL query federation engines to achieve the same recall while querying fewer data sources. We combined the three source selection approaches (HiBISCuS, DAW, and TBSS) with query rewriting to form a complete SPARQL query federation engine named Quetsal. Furthermore, we present TopFed, a federated query processing engine tailored to The Cancer Genome Atlas (TCGA) that exploits the data distribution to perform intelligent source selection while querying large TCGA SPARQL endpoints. Finally, we address the issue of rights management and privacy when accessing sensitive resources. To this end, we present SAFE, a global source selection approach that enables decentralised, policy-aware access to sensitive clinical information represented as distributed RDF Data Cubes.
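For context, the baseline that these approaches improve upon, triple-pattern-wise source selection, can be sketched as follows: each triple pattern of a federated query is probed against every endpoint with a SPARQL ASK query, and only endpoints that answer positively are retained. The endpoint URLs and the query below are placeholders; the sketch is not HiBISCuS, TBSS, or DAW themselves.

```python
# Hedged sketch: triple-pattern-wise source selection via SPARQL ASK probes.
# Endpoint URLs and triple patterns are placeholders, not real services.
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINTS = [
    "http://example.org/sparql/drugbank",   # placeholder endpoints
    "http://example.org/sparql/kegg",
    "http://example.org/sparql/chebi",
]

# Triple patterns of a (decomposed) federated query, as text fragments.
TRIPLE_PATTERNS = [
    "?drug <http://example.org/vocab/name> ?name .",
    "?drug <http://example.org/vocab/interactsWith> ?target .",
]

def relevant_sources(pattern):
    """Return the endpoints that can contribute bindings for one triple pattern."""
    sources = []
    for url in ENDPOINTS:
        sparql = SPARQLWrapper(url)
        sparql.setQuery(f"ASK WHERE {{ {pattern} }}")
        sparql.setReturnFormat(JSON)
        try:
            if sparql.query().convert().get("boolean"):
                sources.append(url)
        except Exception:
            pass  # unreachable endpoint: skip it in this sketch
    return sources

for tp in TRIPLE_PATTERNS:
    print(tp, "->", relevant_sources(tp))
```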
Comprehensive SPARQL Benchmarks: Benchmarking is indispensable when aiming to assess technologies with respect to their suitability for given tasks. While several benchmarks and benchmark generation frameworks have been developed to evaluate federated SPARQL engines and triple stores, they mostly provide a one-size-fits-all solution to the benchmarking problem. This approach to benchmarking is, however, unsuitable for evaluating the performance of a triple store for a given application with particular requirements. The fitness of current SPARQL query federation approaches for real applications is also difficult to evaluate with current benchmarks, as these are either synthetic or too small in size and complexity. Furthermore, state-of-the-art federated SPARQL benchmarks mostly focus on a single performance criterion, namely the overall query runtime, and thus cannot provide a fine-grained evaluation of the systems. We address these drawbacks by presenting FEASIBLE, an automatic approach for generating benchmarks out of the query history of applications (i.e., query logs), and LargeRDFBench, a billion-triple benchmark for SPARQL query federation that encompasses real data as well as real queries pertaining to real biomedical use cases.
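The idea behind log-driven benchmark generation can be illustrated by clustering simple query features and picking one representative query per cluster; the features and log entries below are invented, and the sketch does not reproduce FEASIBLE's actual selection procedure.

```python
# Hedged sketch: picking a small, diverse benchmark from a SPARQL query log
# by clustering simple query features. Features and log entries are invented.
import numpy as np
from sklearn.cluster import KMeans

# One feature vector per logged query:
# [number of triple patterns, number of join variables, result set size (log10)]
query_features = np.array([
    [1, 0, 1.2], [2, 1, 2.5], [8, 5, 0.7], [3, 2, 3.1],
    [9, 6, 1.0], [2, 1, 2.4], [7, 4, 0.9], [4, 2, 2.8],
])

k = 3  # desired benchmark size
kmeans = KMeans(n_clusters=k, n_init=10, random_state=0).fit(query_features)

# Take the query closest to each cluster centroid as a benchmark representative.
benchmark = []
for c in range(k):
    members = np.where(kmeans.labels_ == c)[0]
    dists = np.linalg.norm(query_features[members] - kmeans.cluster_centers_[c], axis=1)
    benchmark.append(int(members[np.argmin(dists)]))

print("selected log queries:", sorted(benchmark))
```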
Our evaluation results show that HiBISCuS, TBSS, TopFed, DAW, and SAFE can all significantly reduce the total number of sources selected and thus improve the overall query performance. In particular, TBSS is the first source selection approach to keep the overall overestimation of relevant sources below 5%. Quetsal reduces the number of sources selected (without losing recall), the source selection time, and the overall query runtime compared to state-of-the-art federation engines. The LargeRDFBench evaluation results suggest that the performance of current SPARQL query federation systems on simple queries does not reflect the systems' performance on more complex queries. Moreover, current federation systems seem unable to deal with many of the challenges that await them in the age of Big Data. Finally, the FEASIBLE evaluation results show that it generates better sample queries than the state of the art. In addition, the better query selection and the larger set of query types used lead to triple store rankings that partly differ from the rankings generated by previous works.