21
A Knowledge Based Approach of Toxicity Prediction for Drug Formulation. Modelling Drug Vehicle Relationships Using Soft Computing Techniques. Mistry, Pritesh, January 2015.
This multidisciplinary thesis is concerned with the prediction of drug formulations for the reduction of drug toxicity. Both scientific and computational approaches are utilised to make original contributions to the field of predictive toxicology.
The first part of this thesis provides a detailed scientific discussion of all aspects of drug formulation and toxicity. Discussion focuses on the principal mechanisms of drug toxicity and on how drug toxicity is studied and reported in the literature. A review of the current technologies available for formulating drugs for toxicity reduction is provided, together with examples of studies that have used these technologies to reduce drug toxicity. The thesis also gives an overview of the computational approaches currently employed in in silico predictive toxicology, focusing on the machine learning approaches used to build predictive QSAR classification models, with examples drawn from the literature.
Two methodologies have been developed as part of the main work of this thesis. The first focuses on the use of directed bipartite graphs and Venn diagrams for the visualisation and extraction, from large un-curated datasets, of drug-vehicle relationships which show changes in the patterns of toxicity. These relationships can be rapidly extracted and visualised using the methodology proposed in chapter 4.
The second methodology involves mining large datasets for the extraction of drug-vehicle toxicity data. It uses an area-under-the-curve principle to make pairwise comparisons of vehicles, which are classified according to the toxicity protection they offer, and from these comparisons predictive classification models based on random forests and decision trees are built. The results of this methodology are reported in chapter 6.
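To make the pairwise principle concrete, a minimal Python sketch is given below; the response curves, descriptors, and labels are hypothetical placeholders, and the sketch illustrates only the general idea, not the thesis's actual pipeline.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

def auc(time, response):
    """Area under a toxicity-response curve, by the trapezoid rule."""
    return float(np.sum((response[1:] + response[:-1]) / 2 * np.diff(time)))

# Hypothetical response curves for one drug in two vehicles.
time = np.linspace(0, 24, 9)                # hours
curve_a = rng.uniform(0.2, 0.8, size=9)     # placeholder measurements
curve_b = rng.uniform(0.2, 0.8, size=9)

# Pairwise comparison: the vehicle with the smaller AUC is taken here to
# offer more toxicity protection.
protective = "vehicle A" if auc(time, curve_a) < auc(time, curve_b) else "vehicle B"
print(protective)

# Placeholder descriptors and protection labels for many vehicle pairs,
# standing in for the physicochemical features a real study would use.
X = rng.normal(size=(100, 5))
y = rng.integers(0, 2, size=100)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
print(model.predict(X[:3]))
```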
22
Forecasting Large-scale Time Series Data. Hartmann, Claudio, 03 December 2018.
The forecasting of time series data is an integral component for management, planning, and decision making in many domains. The prediction of electricity demand and supply in the energy domain or sales figures in market research are just two of the many application scenarios that require thorough predictions. Many of these domains have in common that they are influenced by the Big Data trend, which also affects time series forecasting. Data sets consist of thousands of temporally fine-grained time series and have to be predicted in reasonable time. The time series may suffer from noisy behavior and missing values, which makes modeling them especially hard; nonetheless, accurate predictions are required. Furthermore, data sets from different domains exhibit various characteristics, so forecast techniques have to be flexible and adaptable to these characteristics.
Long-established forecast techniques like ARIMA and Exponential Smoothing do not fulfill these new requirements. Most of the traditional models only represent one individual time series. This makes the prediction of thousands of time series very time consuming, as an equally large number of models has to be created. Furthermore, these models do not incorporate additional data sources and are, therefore, not capable of compensating missing measurements or noisy behavior of individual time series.
In this thesis, we introduce CSAR (Cross-Sectional AutoRegression Model), a new forecast technique designed to address the new requirements on forecasting large-scale time series data. It is based on the novel concept of cross-sectional forecasting, which assumes that time series from the same domain follow a similar behavior and therefore represents many time series with one common model. CSAR combines this new approach with the modeling concept of ARIMA to make the model adaptable to the various properties of data sets from different domains. Furthermore, we introduce auto.CSAR, which helps to configure the model and to choose the right model components for a specific data set and forecast task.
With CSAR, we present a new forecast technique that is suited for the prediction of large-scale time series data. By representing many time series with one model, large data sets can be predicted in a short time. Furthermore, using data from many time series in one model helps to compensate for missing values and noisy behavior of individual series. The evaluation on three real-world data sets shows that CSAR outperforms long-established forecast techniques in accuracy and execution time. Finally, with auto.CSAR, we create a way to apply CSAR to new data sets without requiring the user to have extensive knowledge about our new forecast technique and its configuration.
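A minimal Python sketch of the cross-sectional idea, as described in the abstract, is given below: the lagged values of many synthetic series are pooled and a single shared autoregression is fitted. It illustrates only the pooling concept, not the full CSAR model or auto.CSAR.

```python
import numpy as np
from numpy.linalg import lstsq

rng = np.random.default_rng(1)
n_series, length, p = 50, 200, 3          # 50 series, AR order 3

# Synthetic stand-in for a large set of similar time series.
series = np.cumsum(rng.normal(size=(n_series, length)), axis=1)

# Build one design matrix from the lagged values of ALL series.
X, y = [], []
for s in series:
    for t in range(p, length):
        X.append(s[t - p:t][::-1])        # the p most recent lags
        y.append(s[t])
X, y = np.asarray(X), np.asarray(y)

coef, *_ = lstsq(X, y, rcond=None)        # one shared set of AR coefficients

# One-step-ahead forecast for every series from the common model.
forecasts = series[:, -p:][:, ::-1] @ coef
print(forecasts[:5])
```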
23
Measurement properties of respondent-defined rating-scales: an investigation of individual characteristics and respondent choices. Chami-Castaldi, Elisa, January 2010.
It is critical for researchers to be confident of the quality of survey data. Problems with data quality often relate to measurement method design, through choices made by researchers in their creation of standardised measurement instruments. This is known to affect the way respondents interpret and respond to these instruments, and can result in substantial measurement error. Current methods for removing measurement error are post-hoc and have been shown to be problematic. This research proposes that innovations can be made through the creation of measurement methods that take respondents' individual cognitions into consideration, to reduce measurement error in survey data. Specifically, the aim of the study was to develop and test a measurement instrument capable of having respondents individualise their own rating-scales. A mixed methodology was employed. The qualitative phase provided insights that led to the development of the Individualised Rating-Scale Procedure (IRSP). This electronic measurement method was then tested in a large multi-group experimental study, where its measurement properties were compared to those of Likert-Type Rating-Scales (LTRSs). The survey included pre-validated psychometric constructs, which provided a baseline for comparing the methods and allowed exploration of whether certain individual characteristics are linked to respondent choices. Structural equation modelling was used to analyse the survey data. Whilst no strong associations were found between individual characteristics and respondent choices, the results demonstrated that the IRSP is reliable and valid. This study has produced a dynamic measurement instrument that accommodates individual-level differences not addressed by typical fixed rating-scales.
24
Studies of thermal transpiration. York, David Christopher, January 2000.
No description available.
25
Problematické aspekty ochrany osobních údajů / Problematic Aspects of Personal Data Protection. Všetečková, Anna, January 2018.
The thesis consists of five chapters, an introduction, and a conclusion. In the introduction, the author introduces the topic of personal data protection and its relevance in the contemporary world, and sets out the aims of the work. The first chapter identifies the basic sources of legislation in the area of personal data protection at the Czech, European, and international levels. The second chapter is devoted to the foundations of this legislation: in the first subchapter the author defines the basic concepts, in the second she gives an overview of the basic principles of personal data processing, and in the third she summarizes the legal titles for personal data processing. The institute of the Data Protection Officer within the meaning of the General Regulation is analysed in the third chapter. Its first subchapter identifies the cases in which the processor is obliged to designate a Data Protection Officer, the second addresses the requirements for the Data Protection Officer's qualifications, and the third and fourth deal with the Data Protection Officer's position in relation to the controller...
26
Comparison of object and pixel-based classifications for land-use and land cover mapping in the mountainous Mokhotlong District of Lesotho using high spatial resolution imagery. Gegana, Mpho, January 2016.
Research Report submitted in partial fulfilment for the degree of Master of Science (Geographical Information Systems and Remote Sensing), School of Geography, Archaeology and Environmental Studies, University of the Witwatersrand, Johannesburg, August 2016.
The thematic classification of land use and land cover (LULC) from remotely sensed imagery data is one of the most common research branches of the applied remote sensing sciences. The performances of the pixel-based image analysis (PBIA) and object-based image analysis (OBIA) Support Vector Machine (SVM) learning algorithms were subjected to comparative assessment using WorldView-2 and SPOT-6 multispectral images of the Mokhotlong District in Lesotho, covering an area of approximately 100 km². For this purpose, four LULC classification models were developed by combining an SVM-based image analysis approach (OBIA or PBIA) with a high-resolution image (WorldView-2 or SPOT-6), and the results were compared with one another. Of the four LULC models, the OBIA WorldView-2 model (overall accuracy 93.2%) was found to be the most appropriate and reliable for remote sensing application purposes in this environment.
The OBIA-WorldView-2 LULC model was subjected to spatial overlay analysis with DEM-derived topographic variables in order to evaluate the relationship between the spatial distribution of LULC types and topography, particularly topographically controlled patterns. It was found that, although there are traces of a relationship between the distribution of LULC types and topography, the relationship is significantly convoluted by both natural and anthropogenic forces, such that the topographically induced patterns of most LULC types have been substantially disrupted.
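The pixel-based half of such a comparison can be sketched in Python with scikit-learn as below; the band values, class labels, and SVM parameters are illustrative placeholders rather than the report's actual data or settings, and the object-based variant would first segment the image into objects and classify those.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(2)

# Each sample is one pixel: 8 spectral bands (WorldView-2 style) with a
# LULC class label digitised from reference data. Values are placeholders.
X = rng.uniform(0.0, 1.0, size=(500, 8))   # reflectance per band
y = rng.integers(0, 4, size=500)           # e.g. grassland/shrub/bare/water

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0, gamma="scale"))
clf.fit(X_tr, y_tr)

# Overall accuracy, the headline figure reported for each of the four models.
print(f"overall accuracy: {accuracy_score(y_te, clf.predict(X_te)):.3f}")
```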
27
Big Data and Regional Science: Opportunities, Challenges, and Directions for Future Research. Schintler, Laurie A., and Fischer, Manfred M., January 2018.
Recent technological, social, and economic trends and transformations are contributing to the production of what is usually referred to as Big Data. Big Data, typically defined by four dimensions (Volume, Velocity, Veracity, and Variety), changes the methods and tactics for using, analyzing, and interpreting data, requiring new approaches for data provenance, data processing, data analysis and modeling, and knowledge representation. The use and analysis of Big Data involves several distinct stages, from "data acquisition and recording" through "information extraction" and "data integration" to "data modeling and analysis" and "interpretation", each of which introduces challenges that need to be addressed. There are also cross-cutting challenges, common to many, and sometimes all, of the stages of the data analysis pipeline; these relate to "heterogeneity", "uncertainty", "scale", "timeliness", "privacy", and "human interaction". Using the Big Data analysis pipeline as a guiding framework, this paper examines the challenges arising in the use of Big Data in regional science. The paper concludes with some suggestions for future activities to realize the possibilities and potential of Big Data in regional science.
Series: Working Papers in Regional Science
28
Kvalita kmenových dat a datová synchronizace v segmentu FMCG / Master Data Quality and Data Synchronization in FMCG. Tlučhoř, Tomáš, January 2013.
This master thesis deals with the quality of master data at retailers and suppliers of fast moving consumer goods (FMCG). The objective is to map the flow of product master data in the FMCG supply chain and to identify the causes of poor data quality. Emphasis is placed on analyzing the process of listing new items at retailers. Global data synchronization is one of the tools for increasing the efficiency of the listing process and improving master data quality; a further objective is therefore to clarify why its adoption in the Czech market is low. The thesis also suggests measures leading to better master data quality in FMCG and to the expansion of global data synchronization in the Czech Republic. The thesis consists of a theoretical and a practical part. The theoretical part defines several terms and explores supply chain operation and communication; it also covers the theory of data quality and its governance. The practical part is focused on the objectives of the thesis, whose accomplishment is based on the results of a survey among FMCG suppliers and retailers in the Czech Republic. The thesis contributes to the academic literature, which currently pays little attention to master data quality in FMCG and global data synchronization. Retailers and suppliers of FMCG can use the results as inspiration to improve the quality of their master data; a few methods of achieving better data quality are introduced. The thesis was assigned by the non-profit organization GS1 Czech Republic, which can use the results as supporting material for the development of its next global data synchronization strategy.
29
Komplexní řízení kvality dat a informací / Towards Complex Data and Information Quality Management. Pejčoch, David, January 2010.
This work deals with the issue of Data and Information Quality. It critically assesses the current state of knowledge of the various methods used for Data Quality Assessment and Data (Information) Quality improvement, and proposes new principles where this critical assessment revealed gaps. The main idea of this work is the concept of Data and Information Quality Management across the entire universe of data. This universe represents all the data sources with which the respective subject comes into contact and which are used in its existing or planned processes. For all these data sources, the approach considers setting a consistent set of rules, policies, and principles with respect to the current and potential benefits of these resources, while also taking into account the potential risks of their use. An imaginary red thread that runs through the text is the importance of additional knowledge in the process of Data (Information) Quality Management. The introduction of a knowledge base oriented to support Data (Information) Quality Management (QKB) is therefore one of the fundamental principles of the set of best practices proposed by the author.
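As a loose illustration of one of these principles, the Python sketch below applies a single shared rule set uniformly across several hypothetical data sources; it is an assumption drawn from the abstract, not the author's actual QKB design.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class QualityRule:
    name: str
    check: Callable[[dict], bool]   # returns True when a record passes

# The shared rule set, defined once for the whole "universe of data".
RULES = [
    QualityRule("email present", lambda r: bool(r.get("email"))),
    QualityRule("age plausible", lambda r: 0 < r.get("age", -1) < 120),
]

def assess(source_name: str, records: list) -> None:
    """Apply the same rules to any data source and report failures."""
    for rule in RULES:
        failed = sum(not rule.check(r) for r in records)
        print(f"{source_name} / {rule.name}: {failed} of {len(records)} failed")

# Two hypothetical sources evaluated against one consistent rule set.
assess("crm",     [{"email": "a@b.cz", "age": 34}, {"email": "", "age": 41}])
assess("billing", [{"email": "x@y.cz", "age": 230}])
```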
30
Aplikace metod DZD na otevřená data / Use of data mining techniques for open data. Prokůpek, Miroslav, January 2015.
This diploma thesis examines the application of data mining methods to open data, realized by solving analytical questions with the LISp-Miner system. The analytical questions are examined on data from the Czech Trade Inspection Authority from the perspective of the data owner, using the 4ft-Miner procedure. Four analytical questions are presented and resolved; these constitute the results of the work. The work includes a detailed description of the transformation of the relational database into a format suitable for data mining, as well as a detailed description of the data. The theoretical part deals with the GUHA method and the CRISP-DM methodology.
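The core of the 4ft-Miner procedure can be sketched as follows: each candidate rule of the form antecedent => succedent is scored on its four-fold contingency table (a, b, c, d), here with GUHA's founded-implication quantifier. The data and thresholds below are illustrative, not those of the thesis.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1000

# Boolean attributes derived from the analysed data (synthetic here).
antecedent = rng.random(n) < 0.4
succedent = (antecedent & (rng.random(n) < 0.8)) | (rng.random(n) < 0.1)

# The four-fold table the 4ft-Miner procedure evaluates for each rule.
a = int(np.sum(antecedent & succedent))     # both antecedent and succedent
b = int(np.sum(antecedent & ~succedent))    # antecedent only
c = int(np.sum(~antecedent & succedent))    # succedent only
d = int(np.sum(~antecedent & ~succedent))   # neither

# GUHA founded implication: confidence a/(a+b) >= p with support a >= Base.
p, base = 0.7, 50
if a >= base and a / (a + b) >= p:
    print(f"rule holds: a={a}, b={b}, c={c}, d={d}, conf={a / (a + b):.2f}")
```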