1051

Learning hash codes for multimedia retrieval

Chen, Junjie 28 August 2019 (has links)
The explosive growth of multimedia data in online media repositories and social networks has created high demand for fast and accurate large-scale multimedia retrieval services. Hashing, which encodes high-dimensional data into a low-dimensional binary space, has proven effective for this retrieval task. Despite recent progress, how to learn hashing models that achieve the best trade-off between retrieval efficiency and accuracy remains an open research issue. This thesis research aims to develop hashing models that are effective for image and video retrieval. An unsupervised hashing model called APHash is first proposed to learn hash codes for images by exploiting the distribution of the data. To reduce the underlying computational complexity, a methodology based on an asymmetric similarity matrix is explored and found effective. In addition, the deep learning approach to learning hash codes for images is also studied. In particular, a novel deep model called DeepQuan is proposed, which incorporates product quantization methods into an unsupervised deep model. Rather than adopting only a quadratic loss as the optimization objective, as most related deep models do, DeepQuan jointly optimizes the data representations and their quantization codebooks to explore the clustering structure of the underlying data manifold; introducing a weighted triplet loss into the learning objective is found to be effective. Furthermore, the case where some labeled data are available for learning is also considered. To alleviate the high training cost (especially crucial for a large-scale database), another hashing model named Similarity Preserving Deep Asymmetric Quantization (SPDAQ) is proposed for both image and video retrieval, in which the compact binary codes and quantization codebooks for all items in the database are explicitly learned in an efficient manner. All the proposed hashing methods have been rigorously evaluated on benchmark datasets and found to outperform the related state-of-the-art methods.
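Once hash codes are learned, retrieval itself reduces to ranking database items by Hamming distance to the query's code. The sketch below illustrates that final lookup step; it is a minimal illustration of hash-based retrieval in general, not an implementation of APHash, DeepQuan, or SPDAQ, and the code length and dataset size are invented.

```python
import numpy as np

def hamming_rank(query_code: np.ndarray, db_codes: np.ndarray) -> np.ndarray:
    """Rank database items by Hamming distance to a query hash code.

    query_code: (n_bits,) array of 0/1 values.
    db_codes:   (n_items, n_bits) array of 0/1 values.
    Returns database indices, nearest first.
    """
    distances = np.count_nonzero(db_codes != query_code, axis=1)
    return np.argsort(distances)

# Toy example: random 32-bit codes for 1000 "images".
rng = np.random.default_rng(0)
db = rng.integers(0, 2, size=(1000, 32))
query = rng.integers(0, 2, size=32)
print(hamming_rank(query, db)[:5])  # indices of the 5 nearest codes
```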
1052

A graph database implementation of an event recommender system

Olsson, Alexander January 2022 (has links)
The internet is larger than ever, and so is the amount of information on it. The average internet user has next to endless possibilities and choices, which can cause information overload. Companies have therefore developed systems to guide their users to the right product or object, in the form of recommender systems. Recommender systems are tools created to filter data and find patterns in order to recommend relevant information to specific customers with the help of different algorithms. MarketHype is a company that aggregates large amounts of data about event organizers, their events, their visitors, and related transactions. In the near future, they want to be able to manage and offer event organizers recommended target groups for their events using a recommender system.

This study tries to find out how to model event data in a graph database to support relevant recommendations for event organizers. An empirical research method was used to answer the question. The goal was to create a prototype of a recommender system with the help of event data. The main focus was to model a graph database in Neo4j that can be used for finding recommendations with different Cypher queries. A literature study was later conducted to find the advantages and disadvantages a graph database could have for event data. This information could then answer how further development of the system could work.

The result was a system implemented with the help of data from four different CSV files. The data provided were information about contacts, persons, orders, and events. This information was used to create the nodes and relationships: a total of 4.4 million nodes and around 5 million relationships between those nodes. Collaborative and content-based filtering were the main recommendation techniques used to find the best-suited recommendations. This was done with different queries in Cypher.

The main conclusion is that a graph database in Neo4j is a good way to implement a recommender system with event data. The result shows that the collaborative filtering approach is a major factor in the system's success in finding relevant information; the approach of letting other contacts decide what the original contact wants is proven to work well with event data. The result also shows that the recommendation is more of an indication, because it returns what supposedly would be the preferences of a contact. A solution for a better recommender system was found, which adds another layer to the content-based filtering in the form of categorized events.
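The thesis's actual queries are not reproduced in the abstract; the following is a minimal sketch of the kind of collaborative-filtering Cypher query described above, run through the official Neo4j Python driver. The node labels (Person, Event), relationship type (ATTENDED), property names, and connection credentials are all illustrative assumptions, not the study's schema.

```python
from neo4j import GraphDatabase  # pip install neo4j

# Hypothetical schema: (:Person)-[:ATTENDED]->(:Event). Recommend events that
# people with overlapping attendance went to, which the target has not seen.
COLLABORATIVE_QUERY = """
MATCH (p:Person {id: $person_id})-[:ATTENDED]->(e:Event)<-[:ATTENDED]-(other:Person),
      (other)-[:ATTENDED]->(rec:Event)
WHERE NOT (p)-[:ATTENDED]->(rec)
RETURN rec.name AS event, count(*) AS score
ORDER BY score DESC LIMIT 10
"""

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
with driver.session() as session:
    for record in session.run(COLLABORATIVE_QUERY, person_id=42):
        print(record["event"], record["score"])
driver.close()
```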
1053

Linked data performance in different databases: Comparison between SQL and NoSQL databases

Chavez Alcarraz, Erick, Moraga, Manuel January 2014 (has links)
Meepo AB was investigating the possibility of developing a social rating and recommendation service. In a recommendation service, user ratings are collected in a database; this data is then used in recommendation algorithms to create individual user recommendations. The purpose of this study was to find out which demands are put on a DBMS (database management system) powering a recommendation service, what impact NoSQL databases have on the performance of recommendation services compared to traditional relational databases, and which DBMS is most suited for storing the data needed to host a recommendation service. Five distinct NoSQL and relational DBMSs were examined; from these, three candidates were chosen for a closer comparison. Following a study of recommendation algorithms and services, a test suite was created to compare DBMS performance in different areas using a data set of 100 million ratings. The results show that MongoDB had the best performance in most use cases, while Neo4j and MySQL struggled with queries spanning the whole data set. This study, however, never compared performance with real production code; to get a better comparison, more research is needed. We recommend new performance tests for MongoDB and Neo4j using implementations of recommendation algorithms, a larger data set, and more powerful hardware.
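The test suite itself is not reproduced in the abstract; as a rough illustration, a whole-data-set query of the kind that separated the candidates could be timed against MongoDB as sketched below. The database, collection, and field names are assumptions, not the study's actual code.

```python
import time
from pymongo import MongoClient  # pip install pymongo

# Hypothetical rating documents: {"user": int, "movie": int, "rating": float}.
client = MongoClient("mongodb://localhost:27017")
ratings = client["benchmark"]["ratings"]

start = time.perf_counter()
# Average rating per movie -- a query spanning the whole data set, the kind
# the study found Neo4j and MySQL struggled with at 100 million ratings.
pipeline = [{"$group": {"_id": "$movie", "avg": {"$avg": "$rating"}}}]
result = list(ratings.aggregate(pipeline, allowDiskUse=True))
elapsed = time.perf_counter() - start
print(f"{len(result)} groups in {elapsed:.2f}s")
```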
1054

Communication system between screwdrivers and asphalt rollers

Thorstensson, Simon, Sterner, Anton January 2018 (has links)
A company producing mostly asphalt rollers has a hard time quality-assuring its bolted connections. On the company's assembly line, the different parts are put together using screwdrivers. These screwdrivers therefore need to be properly calibrated to meet the high standards on the bolted connections.

The purpose of this research has been to come up with a cost-effective solution for the manufacturing company, so that it can digitally keep track of its screwdrivers and determine the need for calibration. The aim is to achieve full control over the quality of the connected screw joints.

Much of the research focused on Industry 4.0 and IoT (Internet of Things), two of the most recognized buzzwords in production industries today. A product development process was used to develop the proposed solution to the identified problems.

The result of the research and the product development process is an Excel-based database where information about the screwdrivers is held. The database can determine when a screwdriver needs to be calibrated, based on the total number of tightenings it has performed. The total number of performed tightenings is derived from the total number of produced asphalt rollers, which is retrieved from a production software within the company. The database also stores timestamps for calibration and service intervals for each screwdriver.

The initial research focused on new screwdrivers; however, an inventory of the screwdrivers made it clear that most of those on the manufacturing line had only basic functions. With this information, the focus shifted to include both new and old screwdrivers. The final solution uses existing information from the company to determine when calibration is needed. Most manufacturing companies today still use old tools, and this solution can help them transition to a more digitalized industry. It does not require tools to be compatible with a software suite, while still offering the modern functions of newer screwdrivers.
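The Excel logic is not published in the abstract; a minimal sketch of the calibration rule it describes might look like the following, where the tightenings-per-roller figure and the calibration threshold are invented placeholder numbers.

```python
# Sketch of the described rule: tightenings are inferred from production
# counts, and calibration is flagged past a threshold. All numbers invented.
TIGHTENINGS_PER_ROLLER = 120      # tightenings one screwdriver performs per roller
CALIBRATION_THRESHOLD = 50_000    # tightenings allowed between calibrations

def needs_calibration(rollers_since_calibration: int) -> bool:
    tightenings = rollers_since_calibration * TIGHTENINGS_PER_ROLLER
    return tightenings >= CALIBRATION_THRESHOLD

print(needs_calibration(400))  # False: 48,000 tightenings
print(needs_calibration(420))  # True:  50,400 tightenings
```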
1055

Social media analytics and the role of twitter in the 2014 South Africa general election: a case study

Singh, Asheen January 2018 (has links)
A dissertation submitted to the Faculty of Science, University of the Witwatersrand, Johannesburg, in fulfilment of the requirements for the degree of Master of Science, University of the Witwatersrand, Johannesburg, 2018 / Social network sites such as Twitter have created vibrant and diverse communities in which users express their opinions and views on a variety of topics such as politics. Extensive research has been conducted in countries such as Ireland, Germany and the United States, in which text mining techniques have been used to obtain information from politically oriented tweets. The purpose of this research was to determine if text mining techniques can be used to uncover meaningful information from a corpus of political tweets collected during the 2014 South African General Election. The Twitter Application Programming Interface was used to collect tweets related to the three major political parties in South Africa, namely: the African National Congress (ANC), the Democratic Alliance (DA) and the Economic Freedom Fighters (EFF). The text mining techniques used in this research are: sentiment analysis, clustering, association rule mining and word cloud analysis. In addition, a correlation analysis was performed to determine if there exists a relationship between the total number of tweets mentioning a political party and the total number of votes obtained by that party. The VADER (Valence Aware Dictionary for sEntiment Reasoning) sentiment classifier was used to determine the public's sentiment towards the three main political parties. This revealed an overwhelmingly neutral sentiment of the public towards the ANC, DA and EFF. The result produced by the VADER sentiment classifier was significantly greater than any of the baselines in this research. The K-Means clustering algorithm was used to successfully cluster the corpus of political tweets into political-party clusters. Clusters containing tweets relating to the ANC and EFF were formed. However, tweets relating to the DA were scattered across multiple clusters. A fairly strong relationship was discovered between the number of positive tweets that mention the ANC and the number of votes the ANC received in the election. Due to the lack of data, no conclusions could be made for the DA or the EFF. The apriori algorithm uncovered numerous association rules, some of which were found to be interesting. The results have also demonstrated the usefulness of word cloud analysis in providing easy-to-understand information from the tweet corpus used in this study. This research has highlighted the many ways in which text mining techniques can be used to obtain meaningful information from a corpus of political tweets. This case study can be seen as a contribution to a research effort that seeks to unlock the information contained in textual data from social network sites. / MT 2018
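VADER is available as an off-the-shelf Python package, so classifying a tweet the way the study describes can be sketched as follows. The example tweet and the ±0.05 compound-score cutoffs are common conventions, not details taken from the dissertation.

```python
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer  # pip install vaderSentiment

analyzer = SentimentIntensityAnalyzer()
# Invented example tweet; the study's corpus is not reproduced here.
tweet = "Great turnout at the rally today!"
scores = analyzer.polarity_scores(tweet)
print(scores)  # {'neg': ..., 'neu': ..., 'pos': ..., 'compound': ...}

# A common convention: compound >= 0.05 is positive, <= -0.05 is negative,
# otherwise neutral -- consistent with the mostly neutral tweets reported above.
label = ("positive" if scores["compound"] >= 0.05
         else "negative" if scores["compound"] <= -0.05
         else "neutral")
print(label)
```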
1056

Analyzing Sensitive Data with Local Differential Privacy

Tianhao Wang (10711713) 30 April 2021 (has links)
Vast amounts of sensitive personal information are collected by companies, institutions and governments. A key technological challenge is how to effectively extract knowledge from data while preserving the privacy of the individuals involved. In this dissertation, we address this challenge from the perspective of privacy-preserving data collection and analysis. We focus on a technique called local differential privacy (LDP) and study several aspects of it.

In particular, the thesis serves as a comprehensive study of multiple aspects of the LDP field. We investigated the following seven problems: (1) We studied LDP primitives, i.e., the basic mechanisms that are used to build LDP protocols. (2) We then studied the problem when the domain size is very big (e.g., larger than $2^{32}$), where finding the values with high frequency is a challenge, because one needs to enumerate through all values. (3) Another interesting setting is when each user possesses a set of values, instead of a single private value. (4) With the basic problems visited, we then aim to make LDP protocols practical for real-world scenarios. We investigated the case where each user's data is high-dimensional (e.g., in a census survey, each user has multiple questions to answer), and the goal is to recover the joint distribution among the attributes. (5) We also built a system for companies to issue SQL queries over data protected under LDP, where each user is associated with some public weights and holds some private values; an LDP version of the values is sent to the server from each user. (6) To further increase the accuracy of LDP, we study how to add post-processing steps to protocols to make them consistent while achieving high accuracy for a wide range of tasks, including frequencies of individual values, frequencies of the most frequent values, and frequencies of subsets of values. (7) Finally, we investigate a different model of LDP called the shuffler model. While users still use LDP algorithms to report their sensitive data, there now exists a semi-trusted shuffler that shuffles the users' reports and then sends them to the server. This model provides better utility, but at the cost of requiring the additional trust that the shuffler does not collude with the server.
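As a concrete illustration of problem (1), the sketch below implements generalized randomized response, a standard LDP primitive in which each user perturbs their value locally before reporting it. This is a textbook mechanism, not necessarily the exact protocol studied in the dissertation, and the domain size and epsilon are arbitrary.

```python
import math
import random

def generalized_randomized_response(value: int, domain_size: int, epsilon: float) -> int:
    """Report `value` truthfully with probability p = e^eps / (e^eps + d - 1);
    otherwise report a uniformly random *other* value from the domain."""
    p = math.exp(epsilon) / (math.exp(epsilon) + domain_size - 1)
    if random.random() < p:
        return value
    other = random.randrange(domain_size - 1)
    return other if other < value else other + 1  # skip the true value

# Each user perturbs their private value locally before sending it.
reports = [generalized_randomized_response(v, domain_size=10, epsilon=1.0)
           for v in [3] * 10_000]
print(reports.count(3) / len(reports))  # close to p, about 0.232 here
```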
1057

Evaluation of CockroachDB in a cloud-native environment

Håkansson, Kristina, Rosenqvist, Andreas January 2021 (has links)
The increased demand for large databases that scale easily and stay consistent requires service providers to find new solutions for storing data. One solution that has emerged is cloud-native databases. Service providers who can effectively transition to cloud-native databases will benefit from new enterprise applications, industrial automation, the Internet of Things (IoT), and consumer services such as gaming and AR/VR. This consequently changes the requirements on a database's architecture and infrastructure in terms of compatibility with services deployed in a cloud-native environment; this is where CockroachDB comes into the picture. CockroachDB is relatively new and is built from the ground up to run in a cloud-native environment. It is built up of nodes that work as individual machines, and these nodes form a cluster. The authors of this report aim to evaluate the characteristics of the Cockroach database to understand what it offers companies that are in a cloud-infrastructure transition phase. Within the scope of these characteristics, the report focuses on performance, throughput, stress testing, version hot-swapping, horizontal/vertical scaling, and node disruptions. To do this, a CockroachDB database was deployed on a Kubernetes cluster, on which simulated traffic was conducted. For the throughput measurement, the TPC-C transaction processing benchmark was used. For scaling, version hot-swapping, and node disruptions, an experimental method was performed. The result of the study confirms the expected outcome: CockroachDB does in fact scale easily, both horizontally and vertically, with minimal effort. It also shows that throughput remains the same when the cluster is scaled up and out, since CockroachDB, unlike some other databases, does not have a master write node. CockroachDB also has built-in functionality to handle configuration changes like version hot-swapping and node disruptions. This study concluded that CockroachDB lives up to its promises regarding the subjects handled in the report, and can be seen as a robust, easily scalable database that can be deployed in a cloud-native environment.
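Because CockroachDB speaks the PostgreSQL wire protocol, application-side access can be sketched with an ordinary Postgres driver, as below. This is not the TPC-C harness used in the study; the connection settings assume a local insecure test cluster, and the table is invented.

```python
import psycopg2  # pip install psycopg2-binary

# CockroachDB is PostgreSQL-wire compatible, so standard Postgres drivers
# work; host, port, and credentials here are placeholders for a test cluster.
conn = psycopg2.connect(host="localhost", port=26257,
                        user="root", dbname="defaultdb", sslmode="disable")
conn.autocommit = True
with conn.cursor() as cur:
    cur.execute("CREATE TABLE IF NOT EXISTS kv (k INT PRIMARY KEY, v STRING)")
    cur.execute("UPSERT INTO kv VALUES (1, 'hello')")  # UPSERT is CockroachDB SQL
    cur.execute("SELECT v FROM kv WHERE k = 1")
    print(cur.fetchone()[0])
conn.close()
```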
1058

Henry Stewart Talks: The Biomedical & Life Sciences Collection

Weyant, Emily C., Woodward, Nakia J. 01 January 2021 (has links)
Henry Stewart Talks: The Biomedical & Life Sciences Collection is a subscription database containing a variety of lectures on basic science and medical topics. Lectures in this database may be used as a supplement to existing college courses. Additional features of Henry Stewart Talks include several full courses available to faculty upon request and syllabus assistance to link course goals to lectures in the database. Other aspects of Henry Stewart Talks include evidence and expert transparency and ADA compliance of content.
1059

LabVIEW™ Database Interfacing For Robotic Control

Gebregziabher, Netsanet 26 July 2006 (has links)
Submitted to the faculty of the School of Informatics in partial fulfillment of the requirements for the degree Master of Science in Chemical Informatics (Laboratory Informatics Specialization), Indiana University, May 2006 / The Zymark™ System is a lab automation workstation that uses the Caliper Life Sciences (Hopkinton, MA) Zymate XP robot. At Indiana University-Purdue University Indianapolis, a Zymate is used in a course, INFO I510 Data Acquisition and Laboratory Automation, to demonstrate the fundamentals of laboratory robotics. This robot has been re-engineered to function with National Instruments™ graphical software program LabVIEW™. LabVIEW is an excellent tool for robotic control: based on changing conditions, it can dynamically use data from any source to modify the operating parameters of a robot. Such dynamically changing information must be stored so that it is readily accessible. For example, there is a need to continuously store and update the calibration data of the robot, populate the setting of each axis and positioning inside the workplace, and also store robot positioning information. This can be achieved by using a database, which allows robotic control data to be easily searched and accessed. To address this need, an interface was developed to allow full, dynamic communication between any LabVIEW program (called "virtual instruments," or VIs) and the database. This was accomplished by developing a set of subVIs that can be dropped into the calling robotic-control VIs. With these subVIs, a user can create table and column information, delete a table, retrieve table information by clicking a particular table name on the user interface, or query using any SQL-specific combination of columns or tables within the database. For robot functionality, subVIs were created to store and retrieve data such as calibration data points and regression calculations. / Chemical Informatics
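LabVIEW VIs are graphical and cannot be quoted as text, but the database operations the subVIs wrap can be sketched in Python against SQLite for illustration. The table layout and values below are invented; this is an analogue of the described functionality, not the thesis's implementation.

```python
import sqlite3

# Python analogue of the described subVI operations: create a table, store
# calibration data points, and retrieve them for a regression calculation.
conn = sqlite3.connect("robot.db")
cur = conn.cursor()
cur.execute("""CREATE TABLE IF NOT EXISTS calibration (
                   axis TEXT, position REAL, encoder_counts INTEGER)""")
cur.executemany("INSERT INTO calibration VALUES (?, ?, ?)",
                [("x", 0.0, 0), ("x", 100.0, 4096)])
conn.commit()
# Retrieve the stored calibration points for one axis, as a robotic-control
# program would before computing a regression over them.
cur.execute("SELECT position, encoder_counts FROM calibration WHERE axis = ?", ("x",))
print(cur.fetchall())
conn.close()
```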
1060

Building a Database with Background Equivalent Concentrations to Predict Spectral Overlaps in ICP-MS

Liu, Fang 18 May 2017 (has links)
No description available.
