11

Real-time road traffic events detection and geo-parsing

Kumar, Saurabh 08 August 2018 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / In the 21st century, the number of vehicles on the road keeps growing while road infrastructure remains limited. Congestion and slow-moving traffic therefore create daily challenges for the average commuter; in the United States alone, they cost the average driver $1200 every year in fuel and time. Some positive steps, including (a) the introduction of push notification systems and (b) the deployment of more law enforcement personnel, have been taken toward better traffic management, but these methods have limitations and require extensive planning. Another way to deal with traffic problems is to track congested areas in a city using social media, so that law enforcement resources can be re-routed to those areas in real time. Given the ever-increasing number of smartphones, social media can serve as a source of information for tracking traffic-related incidents. Social media sites allow users to share opinions and information; platforms like Twitter, Facebook, and Instagram are very popular and let users post whatever they want as text and images. Facebook users alone generate millions of posts per minute, so abundant data, including news, trends, events, opinions, and product reviews, is produced daily. Organizations worldwide already use this data for marketing; it can also be used to analyze traffic-related events such as congestion, construction work, and slow-moving traffic. The motivation behind this research is therefore to use social media posts to extract traffic-relevant information, with effective and proactive traffic administration as the primary focus. I propose an intuitive two-step process for retrieving traffic-related information from Twitter users' posts in real time: a text classifier first filters the data down to posts that contain traffic information, and a Part-Of-Speech (POS) tagger then extracts the geolocation information. A prototype of the proposed system is implemented using a distributed microservices architecture.
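A minimal sketch of the two-step idea described in this abstract, assuming a pre-trained scikit-learn-style classifier `clf` and NLTK's tokenizer and POS tagger; the label name "traffic" and the proper-noun heuristic are illustrative assumptions, not the author's implementation.

```python
# Sketch only: 1) filter posts with a text classifier, 2) POS-tag the survivors
# to pull out candidate location phrases (proper nouns).
# Requires NLTK's "punkt" and "averaged_perceptron_tagger" resources.
import nltk

def extract_traffic_locations(tweets, clf):
    """Return (tweet, candidate_locations) pairs for traffic-related tweets."""
    results = []
    for text in tweets:
        # Step 1: keep only posts the classifier labels as traffic-related
        # ("traffic" is an assumed label name).
        if clf.predict([text])[0] != "traffic":
            continue
        # Step 2: POS-tag the tweet and collect proper nouns (NNP/NNPS)
        # as rough geolocation candidates, e.g. street or neighborhood names.
        tokens = nltk.word_tokenize(text)
        tagged = nltk.pos_tag(tokens)
        locations = [word for word, tag in tagged if tag in ("NNP", "NNPS")]
        results.append((text, locations))
    return results
```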
12

Hierarchical Anomaly Detection for Time Series Data

Sperl, Ryan E. 07 June 2020 (has links)
No description available.
13

Entropy-driven Clustering of Streaming Data

Nagesh Rao, Disha 23 August 2022 (has links)
No description available.
14

Visualizing Error in Real-Time Video Streaming Data for a Monitoring System

Aditya Wardana, I Wayan Kurniawan January 2019 (has links)
The aim of this master thesis is to investigate the affordances and limitations of using information visualization methods to visualize errors in real-time video streaming data. The study was carried out at the Red Bee Media company and followed several steps, including user research, prototyping, and user evaluation. The user research produced design requirements and basic tasks for the prototype. The prototype had to follow the design requirements and use information visualization techniques to visualize the error data. The prototype was then evaluated by 5 expert users, all Red Bee Media employees with 1.5 to 3 years of experience working with the existing Red Bee Media system. The results show that the prototype obtained a higher SUS score than the Red Bee Media monitoring system. Based on a comparison questionnaire, the prototype also provided better visualization for each basic task than the Red Bee Media monitoring system. The comments from the user evaluation have been categorized into 4 labels, which highlight several usability aspects that need attention when developing a video monitoring system.
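For context on the comparison above, this is the standard System Usability Scale (SUS) scoring formula; the five evaluators' actual questionnaire responses are not reproduced here, and the example answers below are made up.

```python
# Standard SUS scoring: 10 Likert items (1-5); odd-numbered (positive) items
# contribute (answer - 1), even-numbered (negative) items contribute (5 - answer);
# the sum is scaled by 2.5 to give a 0-100 score.
def sus_score(responses):
    """responses: list of 10 Likert answers (1-5) for one participant."""
    if len(responses) != 10:
        raise ValueError("SUS uses exactly 10 items")
    total = 0
    for i, r in enumerate(responses, start=1):
        total += (r - 1) if i % 2 == 1 else (5 - r)
    return total * 2.5

# Example with invented answers from one participant:
print(sus_score([4, 2, 5, 1, 4, 2, 5, 1, 4, 2]))  # 80.0
```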
15

Efficient and parallel evaluation of XQuery

Li, Xiaogang 22 February 2006 (has links)
No description available.
16

Predicting Indoor Carbon Dioxide Concentration using Online Machine Learning : Adaptive ventilation control for exhibition halls

Carlsson, Filip, Egerhag, Edvin January 2022 (has links)
A problem that exhibition halls have is the balance between maintaining good indoor air quality and minimizing energy waste due to the naturally slow decrease of CO2 concentration, which causes Heating, Ventilation and Air-Conditioning systems to keep ventilating empty halls when occupants have left the vicinity. Several studies have been made on CO2 prediction and on occupancy prediction based on CO2 for smaller spaces such as offices and schools, but few studies address bigger venues where larger groups of people gather. An online machine learning model using the River library was developed to tackle this problem by predicting the CO2 concentration ahead of time. Five datasets were used for training and prediction, three with real data and two with simulated data. The results from this model were compared with three already developed traditional models in order to evaluate how an online machine learning model performs against traditional models. The online machine learning model successfully predicted CO2 one hour ahead of time, considerably faster than the traditional models, achieving an R² score of up to 0.95.
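A minimal sketch of the online-learning setup described above, using the River library's incremental learn/predict API; the feature names, placeholder stream, and choice of a linear model are assumptions for illustration, not the thesis's actual model or datasets.

```python
# Online (incremental) CO2 prediction with River: predict each sample before
# its label is seen, then update the model, tracking a prequential R^2.
from river import linear_model, preprocessing, metrics

model = preprocessing.StandardScaler() | linear_model.LinearRegression()
r2 = metrics.R2()

def stream_of_samples():
    # Placeholder for a real sensor stream: each sample pairs current readings
    # with the CO2 level observed one hour later (the prediction target).
    yield {"co2_now": 650.0, "occupancy": 120, "hour": 14}, 830.0
    yield {"co2_now": 830.0, "occupancy": 40, "hour": 15}, 610.0

for features, co2_in_one_hour in stream_of_samples():
    y_pred = model.predict_one(features)        # predict before seeing the label
    r2.update(co2_in_one_hour, y_pred)          # update the running R^2 metric
    model.learn_one(features, co2_in_one_hour)  # then update the model

print(r2)
```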
17

Erbium : Reconciling languages, runtimes, compilation and optimizations for streaming applications / Erbium : réconcilier les langages, les supports d'exécution, la compilation, et les optimisations pour calculs sur des flux de données

Miranda, Cupertino 11 February 2013 (has links)
Hit by diminishing returns in sequential performance and by thermal limitations, the microprocessor industry has turned decisively to chip multiprocessors, a shift that has brought old, hard problems back to the forefront of software development. Compilers are one of the key pieces of the puzzle for continuing to translate Moore's law into effective performance gains, gains that are no longer attainable without exploiting thread-level parallelism. Yet research on parallel systems has concentrated on language and architectural aspects, and enormous potential remains in compiling, optimizing, and adapting parallel programs to exploit the hardware efficiently. This thesis addresses these challenges by presenting: Erbium, a low-level streaming data-flow language supporting multi-producer, multi-consumer task communication; a very efficient parallel runtime for x86 architectures, with variants for other kinds of architectures; an integration of the language into a compiler, illustrated as an intermediate representation in GCC; and a study of the language primitives and their dependences, allowing compilers to optimize Erbium programs both through transformations specific to parallel programs and through generalized forms of classical optimizations such as partial redundancy elimination and dead code elimination.
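As a generic illustration of the multi-producer, multi-consumer streaming pattern that Erbium's communication primitives target, the following uses plain Python threads and a bounded queue; it is not Erbium syntax, its runtime, or its GCC integration.

```python
# Two producer tasks and two consumer tasks communicating over one bounded stream.
import queue
import threading

stream = queue.Queue(maxsize=8)   # bounded channel between tasks
SENTINEL = None

def producer(pid, n_items):
    for i in range(n_items):
        stream.put((pid, i))      # blocks when the buffer is full (back-pressure)

def consumer(results):
    while True:
        item = stream.get()
        if item is SENTINEL:
            break
        results.append(item)

results = []
producers = [threading.Thread(target=producer, args=(p, 5)) for p in range(2)]
consumers = [threading.Thread(target=consumer, args=(results,)) for _ in range(2)]
for t in producers + consumers:
    t.start()
for t in producers:
    t.join()
for _ in consumers:
    stream.put(SENTINEL)          # one sentinel per consumer for a clean shutdown
for t in consumers:
    t.join()
print(len(results))               # 10 items consumed in total
```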
18

Approximate Clustering Algorithms for High Dimensional Streaming and Distributed Data

Carraher, Lee A. 22 May 2018 (has links)
No description available.
19

Approximation of OLAP queries on data warehouses

Cao, Phuong Thao 20 June 2013 (has links) (PDF)
We study approximate answers to OLAP queries on data warehouses. We consider the relative answers to OLAP queries on a schema as distributions with the L1 distance, and approximate the answers without storing the entire data warehouse. We first introduce three specific methods: uniform sampling, measure-based sampling, and a statistical model. We also introduce an edit distance between data warehouses, with edit operations adapted to data warehouses. Then, in the setting of OLAP data exchange, we study how to sample each source and combine the samples to approximate any OLAP query. We next consider a streaming context, where a data warehouse is built from streams of different sources; we show a lower bound on the size of the memory necessary to approximate queries, and in this case we approximate OLAP queries with a finite memory. We also describe a method, based on decision trees, to discover statistical dependencies, a new notion we introduce. We apply these methods to two data warehouses. The first simulates sensor data, providing weather parameters over time and location from different sources. The second is a collection of RSS feeds from web sites on the Internet.
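A toy illustration of the first method mentioned above, uniform sampling: a GROUP BY SUM over a fact table is approximated from a uniform sample, with each sampled measure scaled by N/n. The schema and data are invented for illustration; the thesis's measure-based sampling, statistical model, and streaming lower bounds are not shown.

```python
# Approximate SUM(measure) per group from a uniform sample of the fact table.
import random
from collections import defaultdict

def approximate_group_sum(fact_rows, sample_size, group_key, measure):
    """fact_rows: list of dicts; returns an estimated SUM(measure) per group."""
    n = min(sample_size, len(fact_rows))
    sample = random.sample(fact_rows, n)
    scale = len(fact_rows) / n            # Horvitz-Thompson-style scaling N/n
    estimate = defaultdict(float)
    for row in sample:
        estimate[row[group_key]] += row[measure] * scale
    return dict(estimate)

# Example: invented weather-sensor facts grouped by location.
facts = [{"location": random.choice(["north", "south"]),
          "temperature": random.gauss(15, 5)} for _ in range(10_000)]
print(approximate_group_sum(facts, 500, "location", "temperature"))
```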
20

Automata methods and techniques for graph-structured data

Shoaran, Maryam 23 April 2011 (has links)
Graph-structured data (GSD) is a popular model to represent complex information in a wide variety of applications such as social networks, biological data management, digital libraries, and traffic networks. The flexibility of this model allows the information to evolve and easily integrate with heterogeneous data from many sources. In this dissertation we study three important problems on GSD. A consistent theme of our work is the use of automata methods and techniques to process and reason about GSD. First, we address the problem of answering queries on GSD in a distributed environment. We focus on regular path queries (RPQs), given by regular expressions matching paths in graph data; RPQs are the building blocks of almost any mechanism for querying GSD. We present a fault-tolerant, message-efficient, and truly distributed algorithm for answering RPQs. Our algorithm works for the larger class of weighted RPQs on weighted GSD. Second, we consider the problem of answering RPQs on incomplete GSD, where different data sources are represented by materialized database views. We explore the connection between “certain answers” (CAs) and answers obtained from “view-based rewritings” (VBRs) for RPQs. CAs are answers that can be obtained on each database consistent with the views. Computing all of the CAs for RPQs is NP-hard, and one has to resort to an algorithm that is exponential in the size of the data, i.e., the view materializations. On the other hand, VBRs are query reformulations in terms of the view definitions. They can be used to obtain query answers in polynomial time in the size of the data. These answers are CAs, but unfortunately for RPQs, not all of the CAs can be obtained in this way. In this work, we show the surprising result that for RPQs under local semantics, using VBRs to answer RPQs gives all the CAs. The importance of this result is that under such semantics, the CAs can be obtained in polynomial time in the size of the data. Third, we focus on XML, an important special case of GSD. The scenario we consider is streaming XML between exchanging parties. The problem we study is flexible validation of streaming XML under the realistic assumption that the schemas of the exchanging parties evolve, and thus diverge from one another. We represent schemas by using Visibly Pushdown Automata (VPAs), which recognize Visibly Pushdown Languages (VPLs). We model evolution for XML by defining formal language operators on VPLs. We show that VPLs are closed under the defined language operators, and this enables us to expand the schemas (for XML) in order to account for flexible or constrained evolution. / Graduate
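A small sketch of how a regular path query is classically evaluated, by running the query automaton and the graph in product with a breadth-first search; this only makes the RPQ notion concrete and is not the distributed, fault-tolerant, or weighted algorithm developed in the dissertation.

```python
# Product-construction evaluation of an RPQ given as an NFA over edge labels.
from collections import deque

def eval_rpq(graph, nfa, start_nodes, nfa_start, nfa_final):
    """
    graph: dict node -> list of (label, node) edges
    nfa:   dict state -> list of (label, state) transitions (query automaton)
    Returns pairs (source, target) connected by a path matching the query.
    """
    answers, seen = set(), set()
    work = deque((s, s, nfa_start) for s in start_nodes)
    while work:
        src, node, state = work.popleft()
        if (src, node, state) in seen:
            continue
        seen.add((src, node, state))
        if state in nfa_final:
            answers.add((src, node))
        # Advance graph edge and automaton transition on matching labels.
        for label, next_node in graph.get(node, []):
            for lbl, next_state in nfa.get(state, []):
                if lbl == label:
                    work.append((src, next_node, next_state))
    return answers

# Example: query (knows)+ on a tiny social graph.
graph = {"a": [("knows", "b")], "b": [("knows", "c")], "c": []}
nfa = {0: [("knows", 1)], 1: [("knows", 1)]}
print(eval_rpq(graph, nfa, ["a"], 0, {1}))  # {('a', 'b'), ('a', 'c')}
```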
