About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations. Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
61

Data Management and Wireless Transport for Large Scale Sensor Networks

Li, Ming 01 September 2010 (has links)
Today many large scale sensor networks have emerged, spanning many different sensing applications. Each of these networks often consists of millions of sensors collecting data and supports thousands of users with diverse data needs. Between users and wireless sensors there is often a group of powerful servers that collect and process data from sensors and answer users' requests. To build such a large scale sensor network, we have to answer two fundamental research problems: i) what data to transmit from sensors to servers? ii) how to transmit the data over wireless links? Wireless sensors often cannot transmit all collected data due to energy and bandwidth constraints. Therefore sensors need to decide what data to transmit to best satisfy users' data requests. Sensor network users can often tolerate some data errors, so sensors may transmit data at lower fidelity and still satisfy users' requests. There are generally two types of requests: raw data requests and meta-data requests. To answer users' raw data requests, we propose a model-driven data collection approach, PRESTO. PRESTO splits intelligence between sensors and servers, i.e., resource-rich servers perform expensive model training and resource-poor sensors perform simple model evaluation. PRESTO can significantly reduce the data to be transmitted without sacrificing service quality. To answer users' meta-data requests, we propose a utility-driven multi-user data sharing approach, MUDS. MUDS uses a utility function to unify diverse meta-data metrics. Sensors estimate the utility value of each data packet and send the packets with the highest utility first to improve overall system utility. After deciding what data to transmit from sensors to servers, the next question is how to transmit the data over wireless links. Wireless transport often suffers from low bandwidth and unstable connectivity. To improve wireless transport, I propose a clean-slate redesign of wireless transport, Hop. Hop uses reliable per-hop block transfer as a building block and builds all other components, including hidden-terminal avoidance, congestion avoidance, and end-to-end reliability, on top of it. Hop is built on three key ideas: a) hop-by-hop transfer adapts to the lossy and highly variable nature of the wireless channel significantly better than end-to-end transfer, b) blocks are a more efficient unit of control over wireless links than packets, and c) functionality duplicated across layers of the network stack should be removed to simplify the protocol and avoid complex interactions.
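The PRESTO split of intelligence lends itself to a short sketch. The following is a minimal, hypothetical illustration of model-driven suppression, not the authors' code: PRESTO itself trains seasonal time-series models, whereas the per-hour mean model, class names, and tolerance parameter below are assumptions for illustration only.

```python
# Sketch of PRESTO-style model-driven collection (hypothetical API).
# The server trains a cheap predictive model; the sensor transmits only
# readings whose prediction error exceeds the user's error tolerance.
from collections import defaultdict

class Server:
    def train(self, history):  # history: list of (hour_of_day, value)
        sums, counts = defaultdict(float), defaultdict(int)
        for hour, value in history:
            sums[hour] += value
            counts[hour] += 1
        # The "model" pushed to sensors: a predicted value per hour of day.
        return {h: sums[h] / counts[h] for h in sums}

class Sensor:
    def __init__(self, model, tolerance):
        self.model, self.tolerance = model, tolerance

    def maybe_transmit(self, hour, reading):
        predicted = self.model.get(hour, reading)
        # Push only when the prediction is off by more than the tolerance;
        # otherwise the server answers queries from the prediction alone.
        if abs(reading - predicted) > self.tolerance:
            return ("push", reading)
        return ("suppress", predicted)

model = Server().train([(9, 21.0), (9, 21.4), (10, 24.0)])
sensor = Sensor(model, tolerance=1.0)
print(sensor.maybe_transmit(9, 21.3))   # ('suppress', 21.2)
print(sensor.maybe_transmit(10, 27.5))  # ('push', 27.5)
```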
62

Data Cleaning with Minimal Information Disclosure

Gairola, Dhruv 11 1900 (has links)
Businesses analyze large datasets in order to extract valuable insights from the data. Unfortunately, most real datasets contain errors that need to be corrected before any analysis. Businesses can utilize various data cleaning systems and algorithms to automate the correction of data errors. Many systems correct data errors by using information present within the dirty dataset itself. Some also incorporate user feedback in order to validate the quality of the suggested corrections. However, users are not always available for feedback. Hence, some systems rely on clean data sources to help with the data cleaning process. This involves comparing records between the dirty dataset and the clean dataset in order to detect high-quality fixes for the erroneous data. Every record in the dirty dataset is compared with every record in the clean dataset in order to find similar records. The values of the records in the clean dataset can then be used to correct the values of the erroneous records in the dirty dataset. Realistically, comparing records across two datasets may not be possible for privacy reasons. For example, there are laws that restrict the free movement of personal data. Additionally, different records within a dataset may have different privacy requirements. Existing data cleaning systems do not factor in these privacy requirements on the respective datasets. This motivates the need for privacy-aware data cleaning systems. In this thesis, we examine the role of privacy in the data cleaning process. We present a novel data cleaning framework that supports cooperation between the clean and the dirty datasets such that the clean dataset discloses a minimal amount of information and the dirty dataset uses this information to (maximally) clean its data. We investigate the tradeoff between information disclosure and data cleaning utility, modelling this tradeoff as a multi-objective optimization problem within our framework. We propose four optimization functions to solve our optimization problem. Finally, we perform extensive experiments on datasets containing up to 3 million records, varying parameters such as the error rate of the dataset, the size of the dataset, and the number of constraints on the dataset, and measure the impact of these parameters on accuracy and performance. Our results demonstrate that disclosing a larger amount of information within the clean dataset helps clean the dirty dataset to a larger extent. We find that with 80% information disclosure (relative to the weighted optimization function), we are able to achieve a precision of 91% and a recall of 85%. We also compare our algorithms against each other to discover which ones produce better data repairs and which ones take longer to find repairs. We incorporate ideas from Barone et al. into our framework and show that our approach is 30% faster but 7% worse in precision. We conclude that our data cleaning framework can be applied to real-world scenarios where controlling the amount of information disclosed is important. / Thesis / Master of Computer Science (MCS)
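As a rough sketch of the record-matching step under a disclosure constraint (illustrative only; the attribute names, similarity measure, and threshold below are assumptions, not the thesis's actual algorithms):

```python
# Hedged sketch of privacy-aware, similarity-based repair. The clean
# source discloses only a subset of attributes (a "disclosure budget");
# each dirty record is repaired from its most similar disclosed record.

def similarity(dirty_rec, clean_rec, disclosed):
    # Fraction of disclosed attributes on which the two records agree.
    matches = sum(1 for a in disclosed if dirty_rec.get(a) == clean_rec.get(a))
    return matches / len(disclosed)

def repair(dirty, clean, disclosed, threshold=0.5):
    repaired = []
    for rec in dirty:
        best = max(clean, key=lambda c: similarity(rec, c, disclosed))
        if similarity(rec, best, disclosed) >= threshold:
            # Overwrite only the disclosed attributes with the clean values.
            rec = {**rec, **{a: best[a] for a in disclosed if a in best}}
        repaired.append(rec)
    return repaired

clean = [{"city": "Hamilton", "zip": "L8S"}, {"city": "Toronto", "zip": "M5S"}]
dirty = [{"city": "Hamiltn", "zip": "L8S"}]
print(repair(dirty, clean, disclosed=["city", "zip"]))
# [{'city': 'Hamilton', 'zip': 'L8S'}]
```

Disclosing more attributes widens the comparison and improves repairs, which mirrors the disclosure/utility tradeoff the thesis optimizes.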
63

Realizing and Satisfying Informational Requirements throughout the Customer Journey: A Case Study on the Industrial Manufacturing Industry

Santos, Kenneth, Törnros, Rasmus January 2023 (has links)
This thesis examines the informational requirements of customers throughout the customer journey within the industry for industrial facilitating goods and explores how to manage data to meet these requirements. The research adopts a qualitative design with a case study approach, using semi-structured interviews supplemented with secondary survey data. Thematic analysis together with descriptive and correlational analysis was employed to analyze the data. The study identifies digital touchpoints as crucial areas for understanding data requirements and data exchange, and highlights the importance of data quality and searchability in enhancing customer satisfaction. The findings also emphasize the need for a balance between physical and digital touchpoints and the significance of qualitative human interactions in generating a positive customer experience. Managerial implications include the importance of updating and delivering the appropriate data sets for different customer roles, investing in the business's web presence to facilitate effective customer interactions, and maintaining a balance between physical and digital touchpoints.
64

Enhanced Bitmap Indexes for Large Scale Data Management

Canahuate, Guadalupe M. 08 September 2009 (has links)
No description available.
65

Web-Based Platform for Force Main Infrastructure Asset Management

Dasari, Vamsi Mohan Bhaskar 13 August 2016 (has links)
Asset management of force main infrastructure entails accurately predicting the condition of the system so that it can be operated and maintained at the lowest overall cost. In this thesis, guidelines for asset management of force main infrastructure are provided by synthesizing the trends observed in inspection, condition assessment, and renewal engineering strategies. Furthermore, the thesis focuses on the development of a centralized web-based platform for advanced asset management of force main infrastructure. The key components of this comprehensive asset management approach are data management, model implementation, and information visualization. The thesis describes the aspects involved in developing a web-based application for utilities that store, collect, and analyze data in dissimilar ways. A risk assessment model employed by a utility to prioritize assets for renewal is demonstrated with data from several utilities. The model is then published as geo-processing services through ESRI ArcGIS Server. A visualization tool is developed for individual utilities that interacts with the geo-processing services and renders a web-based interactive map to visualize the model results. A Drupal website (www.pipeid.org) was developed to support the data collection and model dissemination process. / Master of Science
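A hedged sketch of the risk-based prioritization step is given below. The scoring factors are common proxies assumed for illustration; the utility's actual model is not reproduced here.

```python
# Illustrative risk scoring for force main renewal prioritization:
# risk = likelihood-of-failure x consequence-of-failure, with assets
# ranked highest risk first. Factor choices here are hypothetical.

def risk_score(asset):
    # Likelihood grows with age relative to design life; consequence is
    # approximated by pipe diameter (a proxy for service impact).
    likelihood = min(asset["age_years"] / asset["design_life_years"], 1.0)
    consequence = asset["diameter_mm"] / 1000.0
    return likelihood * consequence

force_mains = [
    {"id": "FM-01", "age_years": 42, "design_life_years": 50, "diameter_mm": 600},
    {"id": "FM-02", "age_years": 15, "design_life_years": 50, "diameter_mm": 900},
]
for asset in sorted(force_mains, key=risk_score, reverse=True):
    print(asset["id"], round(risk_score(asset), 3))
# FM-01 0.504
# FM-02 0.27
```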
66

An Infrastructure to Support Usability Problem Data Analysis

Howarth, Jonathan R. 18 May 2004 (has links)
Increasing the usability of software by integrating usability engineering into the development cycle has become common practice. Although usability engineering is effective, it can be expensive, and organizations want to receive the best possible returns on their investments. Oftentimes, however, organizations spend large sums of money collecting usability problem data through activities such as usability testing, but do not receive acceptable returns on those investments during redesign. The primary reason is that there is an almost complete lack of methods and tools for usability problem data analysis to transform raw usability data into effective inputs for developers. In this thesis, we develop an infrastructure for usability problem data analysis to address the need for better returns on usability engineering investments. The infrastructure consists of four main components: a framework, a process, tools, and semantic analysis technology. Embedded within the infrastructure is the User Action Framework, a conceptual framework of usability concepts, which is used to organize usability data. The process addresses extraction of usability problems from raw usability data, diagnosis of problems according to usability concepts, and reporting of problems in a form that is usable by developers. The tools leverage the framework and guide practitioners through the process, while the semantic analysis technology supplements the capabilities of the tools to automate parts of the process. / Master of Science
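As a rough illustration of the kind of record that flows through the extraction, diagnosis, and reporting steps described above (the field names and category label are hypothetical, not the actual User Action Framework taxonomy):

```python
# Sketch of a usability problem record moving through an
# extraction -> diagnosis -> reporting pipeline.
from dataclasses import dataclass

@dataclass
class UsabilityProblem:
    raw_observation: str                # extracted from raw session data
    uaf_category: str = "undiagnosed"   # assigned during diagnosis
    severity: int = 0                   # 1 (cosmetic) .. 4 (catastrophic)
    redesign_note: str = ""             # developer-facing report output

problem = UsabilityProblem("User could not find the 'export' command")
problem.uaf_category = "planning/translation"  # hypothetical label
problem.severity = 3
problem.redesign_note = "Surface export in the toolbar, not a submenu"
print(problem)
```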
67

Evaluating and Enhancing FAIR Compliance in Data Resource Portal Development

Yiqing Qu (18437745) 01 May 2024 (has links)
<p dir="ltr">There is a critical need for improvement in scientific data management when the big-data era arrives. Motivated by the evolution and significance of FAIR principles in contemporary research, the study focuses on the development and evaluation of a FAIR-compliant data resource portal. The challenge lies in translating the abstract FAIR principles into actionable, technological implementations and the evaluation. After baseline selection, the study aims to benchmark standards and outperform existing FAIR compliant data resource portals. The proposed approach includes an assessment of existing portals, the interpretation of FAIR principles into practical considerations, and the integration of modern technologies for the implementation. With a FAIR-ness evaluation framework designed and applied to the implementation, this study evaluated and improved the FAIR-compliance of data resource portal. Specifically, the study identified the need for improved persistent identifiers, comprehensive descriptive metadata, enhanced metadata access methods and adherence to community standards and formats. The evaluation of the FAIR-compliant data resource portal with FAIR implementation, showed a significant improvement in FAIR compliance, and eventually enhanced data discoverability, usability, and overall management in academic research.</p>
68

Usability Issues within Technical Data Management Systems

Dersche, Klara Maria, Nord, Philip January 2019 (has links)
The purpose of this thesis is to explore and study usability issues within Technical Data Management Systems (TDMS). The research was conducted as a single case study at the gardening and landscape maintenance company Husqvarna. The inductive research comprised 10 interviews, 2 expert focus groups, and an observational study. An artefact was produced during the research to emulate a potential system. The researchers identified ten heuristic usability issues within TDMS. Furthermore, the functional and non-functional needs of Husqvarna were identified. The artefact was created based on existing usability guidelines, addressing the usability issues and the needs of Husqvarna, and was used to assess whether the applied guidelines solved the identified usability issues. It was concluded that the applied guidelines had solved the identified issues. Because the research was conducted as a single case study, the results may lack generalisability. Future researchers are encouraged to conduct a multiple case study to further identify issues within the research area.
69

Data management plan: Good housekeeping or a bureaucratic exercise? : Data management in digital humanities projects at Uppsala University

Margeti, Anneta January 2023 (has links)
Introduction. Research data management is a topic of ongoing discussion, particularly in academic institutions, where researchers strive to effectively handle diverse types of data. This study examines research data management practices in selected digital humanities projects at Uppsala University. The objective is to assess the impact of data management plans (DMPs) on these interdisciplinary projects and to evaluate the application of the FAIR guiding principles. It is crucial to consider the researchers' perspective on this matter: universities could invest in robust data management practices by taking into account the needs and skills of researchers. Method. Semi-structured interviews were conducted using purposive sampling, targeting researchers from various departments within the Faculty of Arts who were involved in interdisciplinary digital humanities projects. Eight interviews were carried out with principal investigators (PIs) and researchers. Analysis. The interviews, along with the provided DMPs, were thematically analysed to address the research questions regarding the effect of DMPs in the selected projects. Results. The findings indicate that the PIs and researchers do not perceive the DMP as an integral part of their research work in digital humanities projects. Nonetheless, most participants recognise its significance, and its role could be enhanced in research projects. Challenges typically arise during stages of the research data life cycle such as data analysis, rather than in the development of the DMP itself. Moreover, the practical implementation of the FAIR principles often poses difficulties due to variations in data types and project goals. Conclusion. The results highlight the need for more actionable DMPs in digital humanities projects and further training for researchers on data management issues. The interdisciplinary nature of these projects facilitates collaboration among researchers in the development of DMPs.
70

Byråkratisk pålaga eller användbart verktyg? En studie om forskares syn på datahanteringsplaner / Bureaucratic burden or useful tool? A study of researchers’ views on data management plans

Jonsson, Björn January 2024 (has links)
Introduction. This thesis analyzes how researchers view data management plans, how they use them, and how they interact with the data support functions at the university. Theory & Method. The theoretical underpinning of the study is Janken Myrdal's theory of the research method as a signifier of natural-scientific or humanistic research, and the research method is semi-structured interviews. The study's empirical material consists of six interviews with researchers from four different scientific fields, which have been transcribed and processed through a qualitative thematic and comparative analysis. Results & Analysis. Five main themes were identified: (1) bureaucratic demands, (2) a lack of support or a lack of interest?, (3) the data management plan's connection to ethical review, (4) the impact of the research method on the view of the data management plan, and (5) time constraints and the plan as support in the research work. Conclusions. Researchers' experiences of working with data management plans do not give an unambiguous picture: some see them as an administrative burden or something of a time sink, while others view them as beneficial both for their own work and when collaborating with colleagues. Views are also divided on the support offered by the university for working with data management plans: some respondents consider it adequate, while others suggest that researchers need to inform themselves on how to write the plan.
