1 |
Forecasting Trajectory Data: A Study by Experimentation
Kamisetty Jananni Narasimha, Shiva Sai Sri Harsha Vardhan, January 2017
Context. Advances in location-acquisition and mobile computing techniques have generated massive amounts of spatial trajectory data. The volume accumulated by telecommunication operators is huge, and analyzing it with the right tool or method can uncover patterns and connections that can be used to improve telecom services. Forecasting trajectory data, that is, predicting the next location of users, is one such analysis. It can be used to produce synthetic data and to determine the network capacity a cell tower will need in the future. Objectives. The objectives of this thesis are, first, to find a new application for the CWT (Collapsed Weighted Tensor) method; second, to modify the CWT method to predict the location of a user; and third, to provide a suitable method for predicting users' locations over a period of time in the given Telenor dataset. Methods. The work was carried out by implementing the modified CWT method. Because the given Telenor dataset contains missing time stamps, it cannot be determined which time stamp a location predicted by the modified CWT method belongs to. The modified CWT method is therefore implemented in two variants: (1) replacing missing values with the first value in the dataset, and (2) replacing missing values with the second value in the dataset. Both variants are implemented and compared to determine which predicts users' locations with the smaller error. Results. The results are obtained under the assumption that the Telenor data for one week will be the same as for the next week. A random sample of users is selected, and both variants are applied to them. RMSD values and computational time are calculated for each variant and each selected user. Conclusion. Based on the analysis of the results, it can be concluded, first, that the CWT method has been modified and used to predict a user's location at the next time stamp; second, that the method can be extended to predict over a period of time; and finally, that the modified CWT method predicts the user's location with minimal error when missing values are replaced by the first value in the dataset.
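The imputation and error measure mentioned in this abstract can be illustrated with a minimal sketch. It does not reproduce the CWT method itself. It assumes that locations are encoded as numeric values, that "first value" means the first observed value in a user's series, and that the following week's observations serve as ground truth; the function names and sample data are hypothetical.

```python
import math

def fill_missing_with_first(series):
    """Replace None entries with the first observed (non-missing) value."""
    first = next((v for v in series if v is not None), None)
    if first is None:
        raise ValueError("series has no observed values")
    return [first if v is None else v for v in series]

def rmsd(predicted, observed):
    """Root-mean-square deviation between two equal-length numeric sequences."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("sequences must be non-empty and equal in length")
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(predicted))

# Hypothetical numeric location values for one user; None marks a missing time stamp.
week1 = [3, None, 5, None, 4]
week2 = [3, 4, 5, 3, 6]  # assumed ground truth for the comparison

filled = fill_missing_with_first(week1)
error = rmsd(filled, week2)
print(filled, error)
```

Swapping in a different fill rule (for example, the second observed value) only requires changing `fill_missing_with_first`, which makes the two variants easy to compare on the same users.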
|
2 |
Da modelagem conceitual à representação lógica de trajetórias em SGBDOR e sistemas de DW / From conceptual modeling to logical representation of trajectories in SGBDOR and DW systems
Leal, Bruno de Carvalho, January 2011
LEAL, Bruno de Carvalho. Da modelagem conceitual à representação lógica de trajetórias em SGBDOR e sistemas de DW. 2011. 120 leaves. Dissertation (Master's in Computer Science), Universidade Federal do Ceará, Fortaleza-CE, 2011.
With the growing number of mobile devices equipped with geographic location services, it has become increasingly feasible, both economically and technically, to capture the paths (i.e., trajectories) of moving objects. Many interesting applications have been developed to exploit the analysis of moving-object trajectories. For example, in delivery vehicle management systems, vehicles can be monitored and analyses can be run to support strategic decisions. In general, trajectories can be analyzed from two perspectives: real-time and historical. In addition, trajectory applications share a common need for a more structured record of movement. Such a record makes it possible to manipulate trajectories as first-class objects, to add any semantics the application requires, and to create robust and efficient methods for aggregating sets of trajectories so that complex analyses can be performed. This work extends earlier work on the conceptual modeling of trajectories by generalizing the idea of stops and moves and by defining a set of aggregation functions for trajectories. It also proposes two modeling approaches, both based on meta-schemas, for designing trajectory schemas for transactional and multidimensional environments. To demonstrate and validate our contributions, we present a real case study on delivery truck trajectories. The experimental results show that the modeling approaches offer the flexibility needed to handle the complexity of trajectory semantics in real-time and historical analyses.
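The idea of stops and moves can be made concrete with a small sketch. This is one common way to operationalize a stop (a run of points that stays within a radius of its first point for a minimum duration), not the generalized definition proposed in the dissertation; the function name, parameters, and sample points are hypothetical.

```python
import math

def detect_stops(points, max_radius, min_duration):
    """Return (start_time, end_time) pairs for runs of points that stay
    within max_radius of the run's first point for at least min_duration.

    points: list of (t, x, y) tuples sorted by time.
    """
    stops = []
    i = 0
    while i < len(points):
        t0, x0, y0 = points[i]
        j = i
        # Extend the run while the next point stays close to the anchor point.
        while (j + 1 < len(points)
               and math.hypot(points[j + 1][1] - x0, points[j + 1][2] - y0) <= max_radius):
            j += 1
        if points[j][0] - t0 >= min_duration:
            stops.append((t0, points[j][0]))
            i = j + 1  # continue after the stop
        else:
            i += 1     # the anchor point belongs to a move
    return stops

# Hypothetical delivery-truck samples: (seconds, x in metres, y in metres).
truck = [(0, 0, 0), (60, 1, 0), (120, 0, 1), (180, 50, 50),
         (240, 100, 100), (300, 100, 101), (360, 101, 100)]
stops = detect_stops(truck, max_radius=5, min_duration=100)
print(stops)
```

Everything between two consecutive stops can then be treated as a move, and aggregation functions (for example, total stop time per truck) can be computed over the resulting intervals.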
|
3 |
Uma abordagem distribuída para preservação de privacidade na publicação de dados de trajetória / A distributed approach for privacy preservation in the publication of trajectory data
Felipe Timbó Brito, 17 December 2015
Advancements in mobile computing techniques along with the pervasiveness of location-based services have generated a great amount of trajectory data. These data can be used for various analysis purposes such as traffic flow analysis, infrastructure planning, and understanding of human behavior. However, publishing this amount of trajectory data may lead to serious risks of privacy breach.
Quasi-identifiers are trajectory points that can be linked to external information and used to identify the individuals associated with trajectories. Therefore, by analyzing quasi-identifiers, a malicious user may be able to trace anonymous trajectories back to individuals with the aid of location-aware social networking applications, for example. Most existing trajectory data anonymization approaches were proposed for centralized computing environments, so they usually perform poorly when anonymizing large trajectory data sets. In this work we propose a distributed and efficient strategy that adopts the $k^m$-anonymity privacy model and uses the scalable MapReduce paradigm, which allows finding quasi-identifiers in large volumes of data. We also present a technique that minimizes the loss of information by selecting key locations from the quasi-identifiers to be suppressed. Experimental evaluation results demonstrate that our proposed approach for trajectory data anonymization is more scalable and efficient than existing works in the literature.
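A single-machine sketch may help clarify what finding quasi-identifiers under $k^m$-anonymity involves. It is not the distributed MapReduce strategy of the thesis. It assumes that a quasi-identifier is any ordered subsequence of at most $m$ locations that occurs in fewer than $k$ trajectories; the function name and sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations

def find_violations(trajectories, k, m):
    """Return subsequences of length <= m that occur in fewer than k trajectories."""
    support = Counter()
    for traj in trajectories:
        seen = set()
        for size in range(1, m + 1):
            # combinations() preserves the order of locations within the trajectory.
            seen.update(combinations(traj, size))
        support.update(seen)  # each trajectory counts at most once per subsequence
    return {seq: n for seq, n in support.items() if n < k}

# Hypothetical trajectories over locations "a", "b", "c".
trajectories = [
    ("a", "b", "c"),
    ("a", "b"),
    ("a", "c"),
]
violations = find_violations(trajectories, k=2, m=2)
print(violations)
```

The counting step is naturally parallel: in a MapReduce setting, a mapper could emit each subsequence once per trajectory and a reducer could sum the counts, which is presumably why the paradigm suits this problem.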
|
4 |
Da modelagem conceitual à representação lógica de trajetórias em SGBDOR e sistemas de DW / From Conceptual Modeling to Logical Representation of Trajectories in SGBDOR and DW Systems
Bruno de Carvalho Leal, 12 August 2011
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / With the growing number of mobile devices equipped with geographic location services, it has become increasingly feasible, both economically and technically, to capture the paths (i.e., trajectories) of moving objects. Many interesting applications have been developed to exploit the analysis of moving-object trajectories. For example, in delivery vehicle management systems, vehicles can be monitored and analyses can be run to support strategic decisions. In general, trajectories can be analyzed from two perspectives: real-time and historical. In addition, trajectory applications share a common need for a more structured record of movement. Such a record makes it possible to manipulate trajectories as first-class objects, to add any semantics the application requires, and to create robust and efficient methods for aggregating sets of trajectories so that complex analyses can be performed. This work extends earlier work on the conceptual modeling of trajectories by generalizing the idea of stops and moves and by defining a set of aggregation functions for trajectories. It also proposes two modeling approaches, both based on meta-schemas, for designing trajectory schemas for transactional and multidimensional environments. To demonstrate and validate our contributions, we present a real case study on delivery truck trajectories. The experimental results show that the modeling approaches offer the flexibility needed to handle the complexity of trajectory semantics in real-time and historical analyses.
|
5 |
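Algoritmos de calibração e segmentação de trajetórias de objetos móveis com critérios não-supervisionado e semi-supervisionado / Calibration and segmentation algorithms for moving-object trajectories with unsupervised and semi-supervised criteria
SOARES JÚNIOR, Amílcar, 10 March 2016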
The popularization of technologies for capturing geolocated data has increased the amount of trajectory data available for analysis. Moving objects' trajectories are generated from the positions of an object that moves in geographical space during a certain interval of time. For many applications, it is necessary to partition trajectories into smaller pieces, named segments, each representing a behavior that is relevant from the application's point of view. The literature reports many studies that propose trajectory segmentation approaches. However, there is little discussion of which algorithms are best suited to a given domain, or of which input parameter values yield the best performance in that domain. Most trajectory segmentation algorithms use pre-defined criteria to perform this task. Only a few works use criteria for which the characteristics of the segments are not known a priori, and this topic is not well explored in the literature. Another open question is how to use a small number of labeled segments to induce a segmentation algorithm to find similar behaviors in unseen trajectories. This thesis proposal aims to address these questions.
First, we propose the GEnetic Algorithm based on Roc analysis (GEAR) and the Iterated F-Race for Trajectory Segmentation Algorithms (I/F-Race-TSA), two methods that help find the best configuration (i.e. input parameter values) of trajectory segmentation algorithms. Second, we propose the Greedy Randomized Adaptive Search Procedure for Unsupervised Trajectory Segmentation (GRASP-UTS), which aims to solve the trajectory segmentation problem when the segmentation criterion is not determined a priori. Last, we propose the GRASP for Semi-supervised Trajectory Segmentation (GRASP-SemTS). GRASP-SemTS uses a small amount of labeled data to induce the trajectory segmentation algorithm to find similar behaviors in unseen trajectories. Experiments were conducted with the proposed methods and algorithms in distinct domains using real-world trajectory data. Results showed that GEAR and I/F-Race-TSA are capable of automatically calibrating the input parameter values for a given application domain. GRASP-UTS and GRASP-SemTS performed better than other segmentation algorithms from the literature, contributing important results to the field.
|
6 |
UTILIZING BIG TRAJECTORY DATA FOR URBAN VISUAL ANALYTICS AND ACCESSIBILITY STUDIES
Kamw, Farah Shleemon, 17 April 2019
No description available.
|
7 |
Implementing Differential Privacy for Privacy Preserving Trajectory Data Publication in Large-Scale Wireless Networks
Stroud, Caleb Zachary, 14 August 2018
Wireless networks collect vast amounts of log data concerning usage of the network. This data helps inform operational needs related to performance, maintenance, etc., but it is also useful to outside researchers analyzing network operation and user trends. Releasing such information to outside researchers poses a threat to the privacy of users. The competing needs for utility and privacy must both be addressed. This thesis studies differential privacy as a means of releasing high-utility data to researchers while maintaining user privacy. The focus is specifically on physical user trajectories in authentication manager log data, since this is a rich type of data that is useful for trend analysis. Authentication manager log data is produced when devices connect to physical access points (APs), and trajectories are sequences of these spatiotemporal connections from one AP to another for the same device. This goal is pursued with a variable-length n-gram model that creates a synthetic database which can be easily ingested by researchers. We found that the chosen algorithm has shortcomings when applied to the chosen data, but differential privacy itself can still be used to release sanitized datasets while maintaining utility if the data has low sparsity. / Master of Science / Wireless internet networks store historical logs of user devices' interactions with the network. For example, when a phone or other wireless device connects, data is stored by the Internet Service Provider (ISP) about the device, username, time, and location of connection. A database of this type of data can help researchers analyze user trends in the network, but the data contains personally identifiable information for the users. We propose and analyze an algorithm which can release this data in a high-utility manner for researchers while maintaining user privacy.
This is based on a verifiable approach to privacy called differential privacy. This algorithm is found to provide utility and privacy protection for datasets with many users compared to the size of the network.
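The core idea of adding calibrated noise to n-gram counts can be sketched as follows. This is a simplified, fixed-length illustration using the Laplace mechanism, not the variable-length n-gram model studied in the thesis. The function names, the `max_grams` contribution cap, and the sample AP identifiers are assumptions. A complete mechanism would also have to account for n-grams that never appear in the data, which this sketch ignores.

```python
import random
from collections import Counter

def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_ngram_counts(trajectories, n, epsilon, max_grams, rng):
    """Count n-grams of access points and perturb each count with Laplace noise.

    Each trajectory contributes at most max_grams n-grams, so adding or
    removing one trajectory changes the count vector by at most max_grams
    in L1 norm; the noise scale is therefore max_grams / epsilon.
    """
    counts = Counter()
    for traj in trajectories:
        grams = [tuple(traj[i:i + n]) for i in range(len(traj) - n + 1)]
        counts.update(grams[:max_grams])  # cap each device's contribution
    scale = max_grams / epsilon
    return {gram: c + laplace_noise(scale, rng) for gram, c in counts.items()}

# Hypothetical AP sequences for two devices.
logs = [["ap1", "ap2", "ap3"], ["ap1", "ap2"]]
released = noisy_ngram_counts(logs, n=2, epsilon=1.0, max_grams=5,
                              rng=random.Random(0))
print(released)
```

A smaller `epsilon` gives stronger privacy but noisier counts, which is one way to see why sparse data (many n-grams with small true counts) loses utility quickly.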
|
8 |
INTEGRATING CONNECTED VEHICLE DATA FOR OPERATIONAL DECISION MAKING
Rahul Suryakant Sakhare, 26 April 2023
<p>Advancements in technology have increased the availability of richer and more frequent information about traffic conditions and about external factors that impact traffic, such as weather and emergency response. Most newer vehicles are equipped with sensors that transmit their data back to the original equipment manufacturer (OEM) at near real-time fidelity. A growing number of such connected vehicles (CV), together with the advent of third-party collectors of data from various OEMs, has made big data for traffic commercially available. Agencies maintaining and managing surface transportation are presented with opportunities to leverage such big data for efficiency gains. The focus of this dissertation is enhancing the use of CV data, and of applications derived from fusing it with other datasets, to extract meaningful information that will aid agencies in efficient, data-driven decision making to improve network-wide mobility and safety performance.</p>
<p>One of the primary concerns of CV data for agencies is data sampling, particularly during low-volume overnight hours. An evaluation of over 3 billion CV records in May 2022 in Indiana has shown an overall CV penetration rate of 6.3% on interstates and 5.3% on non-interstate roadways. Fusion of CV traffic speeds with precipitation intensity from NOAA’s High-Resolution Rapid-Refresh (HRRR) data over 42 unique rainy days has shown reduction in the average traffic speed by approximately 8.4% during conditions classified as very heavy rain compared to no rain. </p>
<p>The aggregate and disaggregate analyses performed during this study enable agencies and automobile manufacturers to answer the often-asked question of what rain intensity it takes to begin impacting traffic speeds. Proactive measures, such as advance warnings that improve motorists' situational awareness and enhance roadway safety, should be considered during very heavy rain periods, wind events, and low daylight conditions.</p>
<p>Scalable methodologies that can be used to systematically analyze hard braking and speed data were also developed. This study demonstrated both quantitatively and qualitatively how CV data provides an opportunity for near real-time assessment of work zone operations using metrics such as congestion, location-based speed profiles and hard braking. The availability of data across different states and ease of scalability makes the methodology implementable on a state or national basis for tracking any highway work zone with little to no infrastructure investment. These techniques can provide a nationwide opportunity in assessing the current guidelines and giving feedback in updating the design procedures to improve the consistency and safety of construction work zones on a national level. </p>
<p>CV data was also used to evaluate the impact of queue warning trucks sending digital alerts. Based on 370 hours of queueing with queue warning trucks present and 58 hours without them, hard-braking events were found to decrease by approximately 80% when queue warning trucks were used to alert motorists of impending queues, thus improving work zone safety.</p>
<p>The ubiquity of CV data providers creates emerging opportunities to identify and measure traffic shock waves, and their forming or recovery speeds, anywhere across a roadway network. A methodology for identifying different shock waves was presented. Across the case studies, typical backward forming shock wave speeds ranged from 1.75 to 11.76 mph, whereas backward recovery shock wave speeds ranged from 5.78 to 16.54 mph. The significance of this is illustrated with a case study of a secondary crash, which suggested that accelerating the clearance by 9 minutes could have prevented the secondary crash at the back of the queue. The capability to identify and measure shock wave speeds can be used by various stakeholders for traffic management decision-making that takes a holistic view of both on-scene risk and the risk at the back of the queue. Near real-time estimation of shock waves using CV data can inform travel time prediction models and serve as input to navigation systems to identify alternate route choices ahead of a driver's time of arrival.</p>
<p>The overall contribution of this thesis is developing scalable methodologies and evaluation techniques to extract valuable information from CV data that aids agencies in operational decision making.</p>
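<p>One simple way to estimate a shock wave speed, sketched below, is to fit a straight line to the back-of-queue position over time and read off its slope. This is an illustrative assumption, not the methodology of the dissertation; the function name and the observations are hypothetical.</p>

```python
def shock_wave_speed_mph(observations):
    """Least-squares slope of back-of-queue position (miles) against time (hours).

    observations: list of (hours, mile_marker) pairs.
    A negative slope means the queue tail is moving toward lower mile markers.
    """
    n = len(observations)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_t = sum(t for t, _ in observations) / n
    mean_x = sum(x for _, x in observations) / n
    num = sum((t - mean_t) * (x - mean_x) for t, x in observations)
    den = sum((t - mean_t) ** 2 for t, _ in observations)
    if den == 0:
        raise ValueError("observations must span more than one time")
    return num / den

# Hypothetical back-of-queue locations inferred from slow-moving CV records:
# (hours since the incident, mile marker of the queue tail).
queue_tail = [(0.0, 100.0), (0.25, 98.5), (0.5, 97.0), (0.75, 95.5)]
speed = shock_wave_speed_mph(queue_tail)
print(speed)
```

<p>In this made-up example the queue tail moves 1.5 miles every 15 minutes, so the fitted speed has a magnitude of 6 mph, inside the range of backward forming shock wave speeds quoted in the abstract.</p>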
|
9 |
Developing a Cohesive Space-Time Information Framework for Analyzing Movement Trajectories in Real and Simulated Environments
January 2011
abstract: In today's world, unprecedented amounts of data on individual mobile objects have become available due to advances in location-aware technologies and services. Studying the spatio-temporal patterns, processes, and behavior of mobile objects is important for extracting useful information and knowledge about mobile phenomena. Potential applications across a wide range of fields include urban and transportation planning, Location-Based Services, and logistics. This research is designed to contribute to the state of the art in tracking and modeling mobile objects, specifically targeting three challenges in investigating spatio-temporal patterns and processes: 1) a lack of space-time analysis tools; 2) a lack of studies about empirical data analysis and context awareness of mobile objects; and 3) a lack of studies about how to evaluate and test agent-based models of complex mobile phenomena. Three studies are proposed to investigate these challenges. The first develops an integrated data analysis toolkit for exploring spatio-temporal patterns and processes of mobile objects; the second investigates two movement behaviors, 1) theoretical random walks and 2) human movements in urban space collected by GPS; and the third contributes to the research challenge of evaluating the form and fit of Agent-Based Models of human movement in urban space. The main contribution of this work is the conceptualization and implementation of a Geographic Knowledge Discovery approach for extracting high-level knowledge from low-level datasets about mobile objects. This allows better understanding of space-time patterns and processes of mobile objects by revealing their complex movement behaviors, interactions, and collective behaviors. In detail, this research proposes a novel analytical framework that integrates time geography, trajectory data mining, and 3D volume visualization.
In addition, a toolkit that utilizes the framework is developed and used for investigating theoretical and empirical datasets about mobile objects. The results showed that the framework and the toolkit demonstrate a great capability to identify and visualize clusters of various movement behaviors in space and time. / Dissertation/Thesis / Ph.D. Geography 2011
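As a small illustration of the theoretical random walks mentioned in the abstract, the sketch below generates a two-dimensional lattice walk as a space-time path of (t, x, y) points. It is a generic textbook construction, not the toolkit described in the dissertation; the names and the seed are hypothetical.

```python
import math
import random

STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def random_walk(n_steps, rng):
    """Return a space-time path [(t, x, y), ...] of a unit-step lattice walk."""
    x = y = 0
    path = [(0, 0, 0)]
    for t in range(1, n_steps + 1):
        dx, dy = rng.choice(STEPS)
        x += dx
        y += dy
        path.append((t, x, y))
    return path

def net_displacement(path):
    """Straight-line distance between the first and last positions."""
    _, x0, y0 = path[0]
    _, x1, y1 = path[-1]
    return math.hypot(x1 - x0, y1 - y0)

walk = random_walk(100, random.Random(42))
print(len(walk), net_displacement(walk))
```

Treating time as a third coordinate in this way is what allows a path to be drawn in a space-time cube, the basic construct of time geography.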
|
10 |
Literature Study and Assessment of Trajectory Data Mining Tools / Litteraturstudie och utvärdering av verktyg för datautvinning från rörelsebanedata
Kihlström, Petter, January 2015
With the development of technologies such as Global Navigation Satellite Systems (GNSS), mobile computing, and Information and Communication Technology (ICT), the procedure of sampling positional data has lately been significantly simplified. This enables the aggregation of large amounts of moving-object data (i.e. trajectories) containing potential information about the moving objects. Within Knowledge Discovery in Databases (KDD), automated processes for realizing this information, called trajectory data mining, have been implemented. The objectives of this study are to examine 1) how trajectory data mining tasks are defined at an abstract level, 2) what type of information it is possible to extract from trajectory data, 3) what solutions trajectory data mining tools implement for different tasks, 4) how tools use visualization, and 5) what the limiting aspects of input data are and how those limitations are treated. The topic is examined in a literature review, in which a large number of academic papers found through Google searches were screened for information relevant to the above objectives. The literature review found that there are several challenges along the process of arriving at useful knowledge about moving objects. For example, the discrete modelling of movements as polylines is associated with an inherent uncertainty, since the location between two sampled positions is unknown. To reduce this uncertainty and prepare raw data for mining, data often needs to be pre-processed. The nature of the pre-processing depends on the sampling rate and accuracy of the raw input data as well as on the requirements of the specific mining method. A major challenge is also to define relevant knowledge and effective methods for extracting it from the data. Furthermore, conveying mining results to users is an important function.
Presenting results in an informative way, both at the level of individual trajectories and of sets of trajectories, is a vital but far from trivial task, for which visualization is an effective approach. Abstractly defined instructions for data mining are formally denoted as tasks. There are four main categories of mining tasks: 1) managing uncertainty, 2) extrapolation, 3) anomaly detection, and 4) pattern detection. The account of tasks given in this study provides a basis for assessing the tools used to execute them. To arrive at useful results, the dimensions of comparison were selected to cover the essential parts of the knowledge discovery process. The measures were chosen so that the results correctly reflect the 1) sophistication, 2) user friendliness, and 3) flexibility of the tools. The focus of this thesis is freely available tools, the range of which proved to be very small and fragmented. The tools found and reported on are MoveMine 2.0, MinUS, GeT_Move and M-Atlas. The tools are reviewed entirely on the basis of their documentation. The performance of the tools proved to vary along all dimensions except visualization and graphical user interface, which all tools provide. Overall the systems perform well on user-friendliness, fairly well on sophistication, and poorly on flexibility. However, since the range of tasks that the tools are intended to solve varies, it might not be appropriate to compare the tools in terms of better or worse. This thesis further provides some theoretical insights for users regarding requirements on their knowledge, both of the technical aspects of tools and of the nature of the moving objects.
Furthermore, the future of trajectory data mining is discussed in terms of constraints on information extraction and requirements for the development of tools, with emphasis on a more robust open source solution. Finally, this thesis as a whole can be regarded as providing guidance on which trajectory mining tools to use depending on the application. Complementary work comparing the actual performance of the tools in use is desirable.
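The uncertainty inherent in polyline models can be seen in a minimal sketch of the usual pre-processing assumption: that an object moves in a straight line at constant speed between two sampled positions. The function name and the sample data are hypothetical, and the true position between samples may of course differ from the interpolated one.

```python
from bisect import bisect_right

def interpolate_position(samples, t):
    """Estimate (x, y) at time t by linear interpolation between samples.

    samples: list of (time, x, y) tuples with strictly increasing times.
    """
    times = [s[0] for s in samples]
    if not samples or t < times[0] or t > times[-1]:
        raise ValueError("t lies outside the sampled time range")
    i = bisect_right(times, t)
    if i == len(samples):  # t equals the last sample time
        return tuple(samples[-1][1:])
    t0, x0, y0 = samples[i - 1]
    t1, x1, y1 = samples[i]
    w = (t - t0) / (t1 - t0)  # fraction of the segment travelled
    return (x0 + w * (x1 - x0), y0 + w * (y1 - y0))

# Hypothetical samples: (seconds, x, y).
track = [(0, 0.0, 0.0), (10, 10.0, 5.0), (20, 10.0, 15.0)]
print(interpolate_position(track, 5))
print(interpolate_position(track, 15))
```

The lower the sampling rate, the longer each segment and the larger the region the object could actually have visited between samples, which is why pre-processing requirements depend on sampling rate and accuracy.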
|