261 |
Transformace a publikace otevřených a propojitelných dat / Transformation and Publication of Open and Linked Data. Nohejl, Petr, January 2013 (has links)
The principles of Open Data and Linked Data are attracting growing interest from many organizations, developers, and even government institutions. This work aims to provide up-to-date information on the development of Open and Linked Data; it then introduces notable tools for creating, manipulating, transforming, and otherwise operating on Open and Linked Data. Finally, it describes the development of a Linked Data application based on the universal visualization system Payola.
|
262 |
A methodology for database management of time-variant encodings and/or missing information. Threlfall, William John, January 1988 (has links)
The problem presented is how to handle encoded data for which the encodings or decodings change with respect to time, and which contains codes indicating that certain data is unknown, invalid, or not applicable with respect to certain entities during certain time periods.
It is desirable to build a database management system that knows about, and can handle, the changes in encodings and the missing-information codes by embedding such knowledge in the data definition structure, in order to remove the need for application programmers and users to worry constantly about how the data is encoded.
The experimental database management language DEFINE is utilized to achieve the desired result, and a database structure is created for a real-life dataset containing many instances of time-variant encodings and missing information. / Science, Faculty of / Computer Science, Department of / Graduate
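The core idea, embedding knowledge of time-variant encodings and missing-information codes in the data definition rather than in application code, can be sketched as follows. All codes, dates, and field semantics here are hypothetical illustrations, not taken from the thesis:

```python
from datetime import date

# Hypothetical codebook: each encoding version is valid for a date range.
# Before 1980 the field used numeric codes; afterwards, character codes.
CODEBOOK = [
    (date(1970, 1, 1), date(1979, 12, 31), {1: "male", 2: "female", 9: None}),
    (date(1980, 1, 1), date(9999, 12, 31), {"M": "male", "F": "female", "U": None}),
]

MISSING = object()  # sentinel distinguishing "unknown/not applicable" from real values

def decode(code, observed_on):
    """Decode `code` using the codebook version in force on `observed_on`."""
    for start, end, mapping in CODEBOOK:
        if start <= observed_on <= end:
            if code not in mapping:
                raise KeyError(f"invalid code {code!r} for {observed_on}")
            value = mapping[code]
            return MISSING if value is None else value
    raise KeyError(f"no codebook version covers {observed_on}")

print(decode(2, date(1975, 6, 1)))    # decoded with the old numeric encoding
print(decode("F", date(1990, 6, 1)))  # decoded with the newer character encoding
```

Because the lookup is driven entirely by the codebook, an application that calls `decode` never needs to know which encoding era a record belongs to.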
|
263 |
Interface conversion between CCITT recommendations X.21 and V.24. Van der Harst, Hubert, January 1983 (has links)
The subject of this thesis concerns conversion between the interfaces specified by CCITT recommendations X.21 and V.24. The evolution of public data networks against the background of data communications using the telephone network is outlined. The DTE/DCE interface is identified as being of particular importance and is explained in terms of the ISO model for Open Systems Interconnection (OSI). CCITT recommendation X.21 is described in detail using the OSI layered approach. Finite state machine (FSM) terminology is defined and the concept of an interface machine is introduced. CCITT recommendation V.24 is described in terms of the physical layer of the OSI model. Only those aspects of V.24 relevant to the subject of this thesis are examined. Interface conversion between X.21 and V.24 is discussed in detail and the design of devices to perform the conversion is described. A microprocessor-based translator to perform interface conversion between a V.24 DTE and an X.21 DCE for switched-circuit use is designed using the FSM approach. A preliminary model of such a translator, implemented on a development system, is described. Its hardware and software are outlined and areas for further work identified.
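FSM terminology lends itself to a compact table-driven implementation. The sketch below models an interface machine as a transition table; the state and event names are a simplified, hypothetical subset of X.21-style call setup, not the actual states of the recommendation:

```python
# A toy table-driven "interface machine": (state, event) -> next state.
# States and events are a simplified, hypothetical subset of call setup.
TRANSITIONS = {
    ("ready", "call_request"): "call_requested",
    ("call_requested", "proceed_to_select"): "selecting",
    ("selecting", "connected"): "data_transfer",
    ("data_transfer", "clear_request"): "ready",
}

def step(state, event):
    """Advance the machine; (state, event) pairs not in the table are rejected."""
    nxt = TRANSITIONS.get((state, event))
    if nxt is None:
        raise ValueError(f"event {event!r} not valid in state {state!r}")
    return nxt

# Walk one complete call: setup, data transfer, clear-down.
state = "ready"
for event in ["call_request", "proceed_to_select", "connected", "clear_request"]:
    state = step(state, event)
print(state)  # back to "ready"
```

A translator between two interfaces can then be built as two such machines whose events drive each other, which is essentially the table-driven structure the FSM approach makes natural.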
|
264 |
Leveraging big data resources and data integration in biology: applying computational systems analyses and machine learning to gain insights into the biology of cancers. Sinkala, Musalula, 24 February 2021 (has links)
Recently, many "molecular profiling" projects have yielded vast amounts of genetic, epigenetic, transcription, protein expression, metabolic and drug response data for cancerous tumours, healthy tissues, and cell lines. We aim to facilitate a multi-scale understanding of these high-dimensional biological data and the complexity of the relationships between the different data types taken from human tumours. Further, we intend to identify molecular disease subtypes of various cancers, uncover the subtype-specific drug targets and identify sets of therapeutic molecules that could potentially be used to inhibit these targets. We collected data from over 20 publicly available resources. We then leverage integrative computational systems analyses, network analyses and machine learning, to gain insights into the pathophysiology of pancreatic cancer and 32 other human cancer types. Here, we uncover aberrations in multiple cell signalling and metabolic pathways that implicate regulatory kinases and the Warburg effect as the likely drivers of the distinct molecular signatures of three established pancreatic cancer subtypes. Then, we apply an integrative clustering method to four different types of molecular data to reveal that pancreatic tumours can be segregated into two distinct subtypes. We define sets of proteins, mRNAs, miRNAs and DNA methylation patterns that could serve as biomarkers to accurately differentiate between the two pancreatic cancer subtypes. Then we confirm the biological relevance of the identified biomarkers by showing that these can be used together with pattern-recognition algorithms to infer the drug sensitivity of pancreatic cancer cell lines accurately. Further, we evaluate the alterations of metabolic pathway genes across 32 human cancers. We find that while alterations of metabolic genes are pervasive across all human cancers, the extent of these gene alterations varies between them. 
Based on these gene alterations, we define two distinct cancer supertypes that tend to be associated with different clinical outcomes and show that these supertypes are likely to respond differently to anticancer drugs. Overall, we show that the time has arrived when we can leverage available data resources to potentially elicit more precise and personalised cancer therapies that would yield better clinical outcomes at a much lower cost than is currently achieved.
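The claim that biomarker panels plus pattern-recognition algorithms can infer drug sensitivity can be illustrated with a minimal nearest-neighbour sketch. The biomarker vectors and response labels below are invented for illustration and are not data from the study:

```python
import math

# Hypothetical training set: biomarker expression vectors for cell lines
# with a known drug response label ("sensitive" / "resistant").
TRAIN = [
    ([0.9, 0.1, 0.8], "sensitive"),
    ([0.8, 0.2, 0.9], "sensitive"),
    ([0.1, 0.9, 0.2], "resistant"),
    ([0.2, 0.8, 0.1], "resistant"),
]

def predict(profile):
    """1-nearest-neighbour: return the label of the closest training profile."""
    _, label = min((math.dist(profile, vec), label) for vec, label in TRAIN)
    return label

print(predict([0.85, 0.15, 0.85]))  # near the "sensitive" profiles
print(predict([0.15, 0.85, 0.15]))  # near the "resistant" profiles
```

Real analyses of this kind would use far larger panels and cross-validated classifiers, but the principle, mapping a molecular profile to the response of its most similar labelled examples, is the same.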
|
265 |
Možnosti využití konceptu Big Data v pojišťovnictví / Possibilities of Using the Big Data Concept in the Insurance Industry. Stodolová, Jana, January 2019 (has links)
This diploma thesis deals with the phenomenon of recent years called Big Data. Big Data are unstructured data of large volume that cannot be managed and processed by commonly used software tools. The analytical part deals with the concept of Big Data and analyses the possibilities of using this concept in the insurance sector. The practical part presents specific methods and approaches for applying big data analysis, specifically to increasing the competitiveness of an insurance company and to detecting insurance fraud. Most space is devoted to data mining methods for modelling the task of detecting insurance fraud. This diploma thesis builds on and extends the author's bachelor thesis, "Modern technology of data analysis and its use in detection of insurance frauds".
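One simple data-mining approach in this spirit is outlier flagging: score each claim by how far it deviates from the portfolio mean and route high scorers to manual review. A minimal sketch with invented claim amounts (not data from the thesis):

```python
from statistics import mean, stdev

# Hypothetical claim amounts; the last one is an obvious outlier.
claims = [1200, 950, 1100, 1300, 1050, 980, 1150, 9800]

mu, sigma = mean(claims), stdev(claims)

def is_suspicious(amount, threshold=2.0):
    """Flag claims more than `threshold` standard deviations above the mean."""
    return (amount - mu) / sigma > threshold

flagged = [c for c in claims if is_suspicious(c)]
print(flagged)  # only the extreme claim is flagged
```

Production fraud detection would use supervised models trained on labelled fraud cases and many more features than amount alone; this z-score rule is only the simplest member of that family.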
|
266 |
Linked open data pro informace veřejného sektoru / Linked Open Data for Public Sector Information. Mynarz, Jindřich, January 2012 (has links)
The diploma thesis introduces the domain of proactive disclosure of public sector information via linked open data. At the start, the legal framework encompassing public sector information is expounded, along with the basic approaches to its disclosure. The practice of publishing data as open data is defined as an approach to proactive disclosure based on applying the principle of openness to data, with the goal of enabling equal access to and equal use of the data. The reviewed practices range from necessary legal actions to choices of appropriate technologies and the ways those technologies should be used to achieve the best data quality. Linked data is presented as a knowledge technology that, for the most part, fulfils the requirements on an open technology suitable for open data. The thesis extrapolates further from the adoption of linked open data in the public sector to recognize the impact and challenges proceeding from this change. The distinctive focus on the data-supplying side and the trust in the transformative effects of technological changes are identified among the key sources of these challenges. The emphasis on technologies for data disclosure, at the expense of more careful attention to the use of data, is presented as a possible source of risks that may undermine the...
|
267 |
An Application for Downloading and Integrating Molecular Biology Data. Fontaine, Burr R., 24 August 2005 (has links)
Submitted to the faculty of the University Graduate School in partial fulfillment of the requirements for the degree Master of Science in the School of Informatics, Indiana University
July 2004 / Integrating large volumes of data from diverse sources is a formidable challenge for many investigators in the field of molecular biology. Developing efficient methods for accessing and integrating this data is a major focus of investigation in the field of bioinformatics.
In early 2003, the Hereditary Genomics division of the department of Medical and Molecular Genetics at IUPUI recognized the need for a software application that would automate many of the manual processes that were being used to obtain data for their research. The two primary objectives for this project were: 1) an application that would provide large-scale, integrated output tables to help answer questions that frequently arose in the course of their research, and 2) a graphic user interface (GUI) that would minimize or eliminate the need for technical expertise in computer programming or database operations on the part of the end-users.
In early 2003, Indiana University (IU), IBM, and the Indiana Genomics Initiative (INGEN) introduced a new resource called Centralized Life Sciences Data Services (CLSD). CLSD is a centralized data repository that provides programmatic access to biological data that is collected and integrated from multiple public, online databases.
METHODS
1. An in-depth analysis was conducted to assess the department's data requirements and map them to the data available at CLSD.
2. CLSD incorporated new data as necessary.
3. SQL was written to generate tables that would replace the targeted manual processes.
4. A DB2 client was installed in Medical and Molecular Genetics to establish remote access to CLSD.
5. A graphical user interface (GUI) was designed and implemented in HTML/CGI.
6. A Perl program was written to accept parameters from the web input form, submit queries to CLSD, and generate HTML-based output tables.
7. Validation, updates, and maintenance procedures were conducted after early prototype implementation.
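Steps 3–6 above amount to a parameter-to-SQL-to-HTML pipeline. A minimal sketch of that shape in Python, using an in-memory SQLite database in place of the DB2/CLSD backend (the table name, columns, and query are invented for illustration, not the project's actual schema):

```python
import sqlite3
from html import escape

# Stand-in for the CLSD repository: a hypothetical gene table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE gene (symbol TEXT, chromosome TEXT, description TEXT)")
con.executemany(
    "INSERT INTO gene VALUES (?, ?, ?)",
    [("BRCA1", "17", "DNA repair associated"),
     ("TP53", "17", "tumor protein p53"),
     ("CFTR", "7", "chloride channel")],
)

def query_to_html(chromosome):
    """Run a parameterized query and render the result rows as an HTML table."""
    cur = con.execute(
        "SELECT symbol, description FROM gene WHERE chromosome = ?",
        (chromosome,),  # parameter bound safely, as a web form value would be
    )
    rows = "".join(
        f"<tr><td>{escape(sym)}</td><td>{escape(desc)}</td></tr>"
        for sym, desc in cur
    )
    return f"<table>{rows}</table>"

print(query_to_html("17"))
```

The GUI layer described in step 5 would simply collect `chromosome` from a form field and emit the returned table in the HTTP response; binding parameters rather than interpolating them is what keeps user input out of the SQL text.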
RESULTS AND CONCLUSIONS
This application resulted in a substantial increase in efficiency over the manual methods that were previously used for data collection. The application also allows research teams to update their data much more frequently. A high level of accuracy in the output tables was confirmed by a thorough validation process.
|
268 |
SIBIOS as a Framework for Biomarker Discovery Using Microarray Data. Choudhury, Bhavna, 26 July 2006 (links)
Submitted to the Faculty of the School of Informatics in partial fulfillment of the requirements for the degree of Master of Science in Bioinformatics, Indiana University, August 2006 / Decoding the human genome resulted in a large amount of data that needs to be analyzed and given a biological meaning. The field of Life Sciences is highly information-driven. The genomic data are mainly gene expression data obtained from measurements of mRNA levels in an organism. Efficiently processing large amounts of gene expression data has been made possible by high-throughput technology. Research studies on microarray data have led to the possibility of finding disease biomarkers. Carrying out biomarker discovery experiments has been greatly facilitated by the emergence of various analytical and visualization tools as well as annotation databases. These tools and databases are often termed 'bioinformatics services'.
The main purpose of this research was to develop SIBIOS (System for Integration of Bioinformatics Services) as a platform for carrying out microarray experiments for the purpose of biomarker discovery. Such experiments require an understanding of the current procedures adopted by researchers to extract biologically significant genes.
In the course of this study, sample protocols were built for the purpose of biomarker discovery. A case study on the BCR-ABL subtype of ALL was selected to validate the results. Different approaches for biomarker discovery were explored and both statistical and mining techniques were considered. Biological annotation of the results was also carried out. The final task was to incorporate the new proposed sample protocols into SIBIOS by providing the workflow capabilities and therefore enhancing the system's characteristics to be able to support biomarker discovery workflows.
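A common statistical technique for extracting biologically significant genes, ranking them by a two-sample t-statistic between disease and control groups, can be sketched as follows. The expression values and gene names are invented toy data, not values from the BCR-ABL/ALL case study:

```python
from statistics import mean, variance

# Hypothetical expression values per gene for two sample groups
# (e.g. disease vs. control replicates).
EXPRESSION = {
    "GENE_A": ([8.1, 7.9, 8.3, 8.0], [3.1, 2.9, 3.3, 3.0]),  # clearly different
    "GENE_B": ([5.0, 5.2, 4.9, 5.1], [5.1, 4.8, 5.2, 5.0]),  # about the same
}

def t_statistic(a, b):
    """Welch's two-sample t-statistic (no pooled-variance assumption)."""
    return (mean(a) - mean(b)) / (
        (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    )

# Rank genes by absolute t-statistic: strongest group difference first.
ranked = sorted(
    EXPRESSION, key=lambda g: abs(t_statistic(*EXPRESSION[g])), reverse=True
)
print(ranked[0])
```

In a real protocol this ranking step would be followed by multiple-testing correction and by the biological annotation of the top-ranked genes, as the thesis describes.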
|
269 |
Exploration of 5G Traffic Models using Machine Learning / Analys av trafikmodeller i 5G-nätverk med maskininlärning. Gosch, Aron, January 2020 (has links)
The Internet is a major communication tool that handles massive information exchanges, sees rapidly increasing usage, and offers an increasingly wide variety of services. In addition to these trends, the services themselves have highly varying quality-of-service (QoS) requirements, and network providers must take into account the frequent releases of new network standards like 5G. This has resulted in a significant need for new theoretical models that can capture different network traffic characteristics. Such models are important both for understanding the existing traffic in networks and for generating better synthetic traffic workloads that can be used to evaluate future generations of network solutions using realistic workload patterns, under a broad range of assumptions, and based on how the popularity of existing and future application classes may change over time. To better meet these changes, new flexible methods are required. In this thesis, a new framework aimed at analyzing large quantities of traffic data is developed and used to discover key characteristics of application behavior in IP network traffic. Traffic models are created by breaking down IP log traffic data into different abstraction layers with descriptive values. The aggregated statistics are then clustered using the K-means algorithm, which results in groups with closely related behaviors. Lastly, the model is evaluated with cluster analysis and three different machine learning algorithms to classify the network behavior of traffic flows. From the analysis framework a set of observed traffic models with distinct behaviors is derived that may be used as building blocks for traffic simulations in the future. Based on the framework, we have seen that machine learning achieves high performance on the classification of network traffic, with a Multilayer Perceptron getting the best results.
Furthermore, the study has produced a set of ten traffic models that have been demonstrated to be able to reconstruct traffic for various network entities. / Due to COVID-19 the presentation was performed over Zoom.
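The clustering step described above, grouping aggregated flow statistics into behaviourally similar groups with K-means, can be sketched in a few lines. The feature vectors (mean packet size and a duration-like value) and the fixed initial centroids are invented toy values:

```python
import math

# Hypothetical per-flow features: (mean packet size, log flow duration).
flows = [(1400, 9.0), (1350, 8.5), (1450, 9.2),   # bulk-transfer-like flows
         (80, 1.0), (90, 1.2), (70, 0.8)]         # short, chatty flows

def kmeans(points, centroids, iterations=10):
    """Plain K-means: assign each point to its nearest centroid, then
    recompute each centroid as the mean of its assigned points."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            i = min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            clusters[i].append(p)
        centroids = [
            tuple(sum(c) / len(c) for c in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids, clusters

centroids, clusters = kmeans(flows, centroids=[(1400, 9.0), (100, 1.0)])
print([len(c) for c in clusters])  # the two behaviour groups separate cleanly
```

In practice features would be standardized before clustering and the number of clusters chosen by a validity index, but the two-phase assign/update loop is the whole algorithm.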
|
270 |
An inexpensive system of geophysical data acquisition. Momayezzadeh, Mohammed, January 1987 (links)
|