1

Automating Laboratory Operations by Integrating Laboratory Information Management Systems (LIMS) with Analytical Instruments and Scientific Data Management System (SDMS)

Zhu, Jianyong June 2005 (has links)
Submitted to the faculty of the University Graduate School in partial fulfillment of the requirements for the degree Master of Science in the School of Informatics, Indiana University, June 2005 / The large volume of data generated by commercial and research laboratories, along with requirements mandated by regulatory agencies, has forced companies to use laboratory information management systems (LIMS) to improve efficiency in tracking and managing samples and in precisely reporting test results. However, most general-purpose LIMS do not provide an interface to automatically collect data from analytical instruments and store it in a database. A scientific data management system (SDMS) provides "Print-to-Database" technology, which captures reports generated by instruments directly into the SDMS database as Windows enhanced metafiles, minimizing data entry errors. Unfortunately, an SDMS does not allow further analysis of the captured data. Many LIMS vendors provide plug-ins for individual instruments, but none provides a general-purpose interface to extract data from an SDMS and store it in a LIMS. In this project, a general-purpose middle layer named LabTechie was designed, built, and tested for seamless integration between instruments, SDMS, and LIMS. The project was conducted at American Institute of Technology (AIT) Laboratories, an analytical laboratory that specializes in trace chemical measurement of biological fluids, where data is generated by 20 analytical instruments, including gas chromatography/mass spectrometry (GC/MS), high-performance liquid chromatography (HPLC), and liquid chromatography/mass spectrometry (LC/MS), and is currently stored in NuGenesis SDMS (Waters, Milford, MA). This approach can be easily expanded to include additional instruments.
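The integration pattern this abstract describes — pull instrument reports out of the SDMS, parse them, and load the results into the LIMS database — can be sketched in a few lines. The report format, table layout, and function names below are illustrative assumptions, not LabTechie's actual interface:

```python
import sqlite3

def parse_report(report_text):
    """Parse a plain-text instrument report into (sample_id, analyte, value)
    rows. The tab-separated layout here is a stand-in for whatever format
    the instrument actually prints into the SDMS."""
    results = []
    for line in report_text.strip().splitlines():
        sample_id, analyte, value = line.split("\t")
        results.append((sample_id, analyte, float(value)))
    return results

def transfer(sdms_reports, lims_db_path):
    """Move parsed instrument results from SDMS report exports into a
    LIMS results table."""
    conn = sqlite3.connect(lims_db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results "
        "(sample_id TEXT, analyte TEXT, value REAL)"
    )
    for report in sdms_reports:
        conn.executemany(
            "INSERT INTO results VALUES (?, ?, ?)", parse_report(report)
        )
    conn.commit()
    conn.close()
```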
2

Realizing a feature-based framework for scientific data mining

Mehta, Sameep 13 September 2006 (has links)
No description available.
3

Seafarers, silk, and science : oceanographic data in the making

Halfmann, Gregor January 2018 (has links)
This thesis comprises an empirical case study of scientific data production in oceanography and a philosophical analysis of the relations between newly created scientific data and the natural world. Based on qualitative interviews with researchers, I reconstruct research practices that lead to the ongoing production of digital data related to long-term developments of plankton biodiversity in the oceans. My analysis is centred on four themes: materiality, scientific representing with data, methodological continuity, and the contribution of non-scientists to epistemic processes. These are critically assessed against the background of today’s data-intensive sciences and increased automation and remoteness in oceanographic practices. Sciences of the world’s oceans have by and large been disregarded in philosophical scholarship thus far. My thesis opens this field for philosophical analysis and reveals various conditions and constraints of data practices that are largely uncontrollable by ocean scientists. I argue that the creation of useful scientific data depends on the implementation and preservation of material, methodological, and social continuities. These allow scientists to repeatedly transform visually perceived characteristics of research samples into meaningful scientific data stored in a digital database. In my case study, data are not collected but result from active intervention and subsequent manipulation and processing of newly created material objects. My discussion of scientific representing with data suggests that scientists do not extract or read any intrinsic representational relation between data and a target, but make data gradually more computable and compatible with already existing representations of natural systems. My arguments shed light on the epistemological significance of materiality, on limiting factors of scientific agency, and on an inevitable balance between changing conditions of concrete research settings and long-term consistency of data practices.
4

Enhanced Bitmap Indexes for Large Scale Data Management

Canahuate, Guadalupe M. 08 September 2009 (has links)
No description available.
5

Generic Metadata Handling in Scientific Data Life Cycles

Grunzke, Richard 11 May 2016 (has links) (PDF)
Scientific data life cycles define how data is created, handled, accessed, and analyzed by users. Such data life cycles become increasingly sophisticated as the sciences they serve grow more demanding and complex with the coming advent of exascale data and computing. The overarching data life cycle management background includes multiple abstraction categories: data sources, data and metadata management, computing and workflow management, security, data sinks, and methods for enabling utilization. The challenges in this context are manifold. One is to hide complexity from the user and to enable seamless use of resources, improving usability and efficiency. Another is to enable generic metadata management that is not restricted to one use case but can be adapted to further ones with limited effort. Metadata management is essential for scientists to save time by avoiding the need to manually keep track of data, for example by its content and location. As the number of files grows into the millions, managing data without metadata becomes increasingly difficult. The solution is thus to employ metadata management to organize data based on information about it. Previously, use cases tended to support either highly specific metadata management or none at all. Now, a generic metadata management concept is available that can be used to efficiently integrate metadata capabilities with use cases. The concept was implemented within the MoSGrid data life cycle, which enables molecular simulations on distributed HPC-enabled data and computing infrastructures. The implementation enables easy-to-use and effective metadata management: automated extraction, annotation, and indexing of metadata were designed, developed, and integrated, and search capabilities are provided via a seamless user interface. Further analysis runs can be started directly from search results. A complete evaluation of the concept, both in general and along the example implementation, is presented. In conclusion, the generic metadata management concept advances the state of the art in scientific data life cycle management.
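As a rough illustration of the generic pattern the abstract describes — automated extraction of metadata followed by indexing and search — here is a minimal sketch. The extracted fields and the index structure are assumptions for illustration, not MoSGrid's actual implementation:

```python
import os
from collections import defaultdict

def extract_metadata(path):
    """Stand-in extractor; a real data life cycle would parse
    domain-specific formats (e.g. simulation outputs) here."""
    return {"name": os.path.basename(path).lower(),
            "ext": os.path.splitext(path)[1].lstrip(".").lower()}

class MetadataIndex:
    """Minimal inverted index mapping metadata (field, value) pairs
    to the file paths that carry them."""
    def __init__(self):
        self._index = defaultdict(set)

    def add(self, path):
        for field, value in extract_metadata(path).items():
            self._index[(field, value)].add(path)

    def search(self, field, value):
        return sorted(self._index.get((field, value.lower()), set()))

index = MetadataIndex()
index.add("runs/run42/output.log")
index.add("runs/run42/traj.xtc")
print(index.search("ext", "xtc"))  # ['runs/run42/traj.xtc']
```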
6

Middleware for online scientific data analytics at extreme scale

Zheng, Fang 22 May 2014 (has links)
Scientific simulations running on High End Computing machines in domains like Fusion, Astrophysics, and Combustion now routinely generate terabytes of data in a single run, and these data volumes are only expected to increase. Since such massive simulation outputs are key to scientific discovery, the ability to rapidly store, move, analyze, and visualize data is critical to scientists' productivity. Yet there are already serious I/O bottlenecks on current supercomputers, and movement toward the Exascale is further accelerating this trend. This dissertation is concerned with the design, implementation, and evaluation of middleware-level solutions to enable high-performance and resource-efficient online data analytics to process massive simulation output data at large scales. Online data analytics can effectively overcome the I/O bottleneck for scientific applications at large scales by processing data as it moves through the I/O path. Online analytics can extract valuable insights from live simulation output in a timely manner, better prepare data for subsequent deep analysis and visualization, and achieve improved performance and reduced data movement cost (both in time and in power) compared to the conventional post-processing paradigm. The thesis identifies the key challenges for online data analytics based on the needs of a variety of large-scale scientific applications, and proposes a set of novel and effective approaches to efficiently program, distribute, and schedule online data analytics along the critical I/O path. In particular, its solution approach i) provides a high-performance data movement substrate to support parallel and complex data exchanges between simulation and online data analytics, ii) enables placement flexibility of analytics to exploit distributed resources, iii) for co-placement of analytics with simulation codes on the same nodes, uses fine-grained scheduling to harvest idle resources for running online analytics with minimal interference to the simulation, and finally, iv) supports scalable, efficient online spatial indices to accelerate data analytics and visualization on the deep memory hierarchies of high end machines. Our middleware approach is evaluated with leadership scientific applications in domains like Fusion, Combustion, and Molecular Dynamics, and on different High End Computing platforms. Substantial improvements are demonstrated in end-to-end application performance and in resource efficiency at scales of up to 16384 cores, for a broad range of analytics and visualization codes. The outcome is a useful and effective software platform for online scientific data analytics facilitating large-scale scientific data exploration.
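The core idea — processing data in transit along the I/O path rather than after it lands on disk — can be illustrated with a toy pipeline. The NumPy-based stream below is a stand-in under stated assumptions, not the dissertation's actual middleware:

```python
import numpy as np

def simulation_steps(n_steps, n_cells):
    """Stand-in for a running simulation emitting one array per timestep."""
    rng = np.random.default_rng(0)
    for step in range(n_steps):
        yield step, rng.normal(size=n_cells)

def online_analytics(stream):
    """Process each chunk as it moves through the I/O path: compute a
    small summary for immediate inspection, then pass the raw data on."""
    for step, data in stream:
        summary = {"step": step,
                   "mean": float(data.mean()),
                   "max": float(data.max())}
        yield summary, data

for summary, data in online_analytics(simulation_steps(5, 1_000)):
    # In a real system `data` would continue toward storage while
    # `summary` feeds live monitoring; here we just print it.
    print(summary)
```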
7

Modelling scientific expertise to constitute the program committee of a scientific conference

Tran, Hong Diep 19 December 2017 (has links)
Academic publishing in specialized journals and conference proceedings is the main way progress in science is communicated. The underlying editorial and program committees represent the cornerstone of the evaluation process. With the development of journals and the increasing number of scientific conferences held annually, searching for experts to serve on these committees is a time-consuming yet critical activity. This PhD thesis focuses on the task of suggesting program committee (PC) members for scientific conferences. It is organized into three parts. First, we propose a model of the multifaceted scientific expertise of researchers based on a weighted heterogeneous graph. Second, we define scientometric indicators to quantify the criteria involved in the composition of PCs. Third, we design a PC member suggestion approach for a given conference, combining the results of the aforementioned scientometric indicators. Our approach is evaluated on one of the leading conferences of our research community, SIGIR, considering its editions from 1971 to 2015, as well as topically related conferences.
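A toy version of the first step — scoring a researcher's expertise from a weighted heterogeneous graph — might look as follows. The node types, weights, and the indicator itself are illustrative assumptions, not the thesis's actual model:

```python
from collections import defaultdict

class ExpertiseGraph:
    """Toy weighted heterogeneous graph: nodes are (type, name) pairs,
    e.g. ("author", "A. Smith") or ("topic", "retrieval")."""
    def __init__(self):
        self.edges = defaultdict(float)  # (node, node) -> weight

    def add_edge(self, u, v, weight=1.0):
        self.edges[(u, v)] += weight
        self.edges[(v, u)] += weight

    def topic_expertise(self, author, topics):
        """A naive scientometric indicator: sum of an author's edge
        weights toward the conference's topics."""
        return sum(self.edges.get((("author", author), ("topic", t)), 0.0)
                   for t in topics)

g = ExpertiseGraph()
g.add_edge(("author", "A. Smith"), ("topic", "retrieval"), 3.0)
g.add_edge(("author", "A. Smith"), ("topic", "evaluation"), 1.0)
print(g.topic_expertise("A. Smith", ["retrieval", "evaluation"]))  # 4.0
```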
8

A database solution for scientific data from driving simulator studies

Rasheed, Yasser January 2011 (has links)
Many research institutes produce huge amounts of data. As the saying goes, "We are drowning in data, but starving for information." This is particularly true for scientific data. Being able to search data across different experiments, in order to look for differences and similarities among them, is increasingly needed and enables meta-studies. A meta-study is a method that takes data from different independent studies and integrates them using statistical analysis. If data is well described and data access is flexible, it becomes possible to establish unexpected relationships among data. It also supports the reuse of data from studies that have already been conducted, which saves time, money, and resources. In this thesis, we explore ways to store data from experiments and to make cross-experiment searches more efficient. The main aim of this thesis work is to propose a database solution for storing time series data generated by different simulators and to investigate the feasibility of using it with ICAT, a metadata system for searching and browsing scientific data. The work was completed in two steps: the first proposes an efficient database solution for storing time series data; the second investigates the feasibility of using ICAT together with the proposed database solution. We found that it is feasible to use ICAT as a metadata system for scientific studies. Since it is free and open source, it can be linked to any system and customized as needed.
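One plausible shape for such a time-series store is a narrow measurement table keyed by experiment and channel. The schema below is a hedged sketch of that pattern, not the solution proposed in the thesis:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One row per experiment, carrying the searchable metadata.
    CREATE TABLE experiment (
        id        INTEGER PRIMARY KEY,
        simulator TEXT NOT NULL,
        scenario  TEXT
    );
    -- One narrow row per sample: easy to query across experiments.
    CREATE TABLE measurement (
        experiment_id INTEGER REFERENCES experiment(id),
        t_seconds     REAL NOT NULL,   -- time since experiment start
        channel       TEXT NOT NULL,   -- e.g. 'speed', 'lane_offset'
        value         REAL NOT NULL
    );
    CREATE INDEX idx_meas ON measurement (experiment_id, channel, t_seconds);
""")
conn.execute("INSERT INTO experiment VALUES (1, 'SimIV', 'night driving')")
conn.executemany("INSERT INTO measurement VALUES (1, ?, 'speed', ?)",
                 [(0.0, 70.2), (0.1, 70.4)])
for row in conn.execute("SELECT t_seconds, value FROM measurement "
                        "WHERE experiment_id = 1 AND channel = 'speed'"):
    print(row)
```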
9

Automation of Laboratory Activities Through Integration of an Electronic Laboratory Notebook (ELN) with a Scientific Data Management System (SDMS)

Roberts, Nathan William 14 June 2006 (has links)
Submitted to the faculty of the School of Informatics in partial fulfillment of the requirements for the degree Master of Science in Chemical Informatics (Laboratory Informatics Specialization), Indiana University, June 2006 / Industry and academic laboratories have long resisted conversion to electronic laboratory notebooks (ELNs), while at the same time integrating many other kinds of information systems into laboratory operations, most notably laboratory information management systems (LIMS), chromatography data systems (CDS), and scientific data management systems (SDMS). By adopting ELNs, scientists in both academia and industry stand to gain important functionality unavailable with paper notebooks, such as comprehensive searching of notebooks (keyword, result, and molecular structure/substructure searching, for example); distributed availability; and long-term access to data. Currently, most laboratory information systems operate independently, requiring manual data entry by users into each individual system. This process creates data and information disparities as well as poor referential integrity within experimental metadata. An electronic laboratory notebook would provide a logical point around which experiment details and observations could be centered electronically. Through an ELN, experimental documentation and metadata could be communicated automatically with a LIMS, SDMS, or CDS without analyst involvement. This "electronically connected" system would allow analysts to perform their responsibilities without the interruption of independent information systems, thereby increasing analyst productivity and reducing user entry errors in data management systems. The thesis project consisted of two phases: the implementation of an ELN, and the development of a software development kit (SDK) for LABTrack based on Web Services. In the first phase, the adoption of an ELN was studied within a classroom laboratory (G823 Introduction to Cell Biology) over the course of one academic semester. In the second phase, an SDK for LABTrack was developed to allow the importation of information from custom-developed applications into LABTrack. Finally, a web portal integrating LABTrack and NuGenesis was developed to demonstrate the capabilities of the LABTrack SDK and the existing capabilities of the NuGenesis SDK.
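The kind of programmatic import such an SDK enables can be sketched as a small web-service client. The endpoint, payload fields, and function name below are hypothetical, since LABTrack's actual Web Services API is not given here:

```python
import json
from urllib import request

def import_experiment(base_url, notebook_id, metadata):
    """Post experiment metadata to a hypothetical ELN web-service
    endpoint; the URL scheme and payload fields are illustrative only."""
    payload = json.dumps({"notebook": notebook_id, "metadata": metadata})
    req = request.Request(
        f"{base_url}/experiments",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.load(resp)
```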
