161

MAINFRAME: Military acquisition inspired framework for architectural modeling and evaluation

Zellers, Eric M. 27 May 2016 (has links)
Military acquisition programs have long been criticized for the exponential growth in program costs required to generate modest improvements in capability. One of the most promising reform efforts to address this trend is the open system architecture initiative, which uses modular design principles and commercial interface standards as a means to reduce the cost and complexity of upgrading systems over time. While conceptually simple, this effort has proven to be exceptionally difficult to implement in practice. This difficulty stems, in large part, from the fact that open systems trade additional cost and risk in the early phases of development for the option to infuse technology at a later date, but the benefits provided by this option are inherently uncertain. Practical implementation therefore requires a decision support framework to determine when these uncertain, future benefits are worth the cost and risk assumed in the present. The objective of this research is to address this gap by developing a method to measure the expected costs, benefits and risks associated with open systems. This work is predicated on three assumptions: (1) the purpose of future technology infusions is to keep pace with the uncertain evolution of operational requirements, (2) successful designs must justify how future upgrades will be used to satisfy these requirements, and (3) program managers retain the flexibility to adapt prior decisions as new information is made available over time. The analytical method developed in this work is then applied to an example scenario for an aerial Intelligence, Surveillance, and Reconnaissance platform with the potential to upgrade its sensor suite in future increments. Final results demonstrate that the relative advantages and drawbacks between open and integrated system architectures can be presented in the context of a cost-effectiveness framework that is currently used by acquisition professionals to manage complex design decisions.
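As a rough illustration of the trade described in this abstract (an up-front premium in exchange for the option to upgrade cheaply when requirements shift), the sketch below compares expected life-cycle costs with a small Monte Carlo simulation. The cost figures, the requirement-shift probability, and the helper names are invented for illustration and are not taken from the dissertation.

```python
import random

def simulate_requirement_shift(years, p_shift=0.3):
    """Return the year (if any) in which the operational requirement evolves."""
    for year in range(years):
        if random.random() < p_shift:
            return year
    return None  # no shift within the planning horizon

def expected_lifecycle_cost(open_architecture, years=10, trials=10_000):
    """Monte Carlo estimate of expected life-cycle cost for one architecture.

    Illustrative-only assumptions: an open system pays an up-front premium but
    upgrades cheaply; an integrated system is cheaper initially but a
    requirement shift forces an expensive redesign.
    """
    initial = 120.0 if open_architecture else 100.0  # $M, assumed
    upgrade = 15.0 if open_architecture else 60.0    # $M per infusion, assumed
    total = 0.0
    for _ in range(trials):
        cost = initial
        if simulate_requirement_shift(years) is not None:
            cost += upgrade  # adapt to the evolved requirement
        total += cost
    return total / trials

if __name__ == "__main__":
    print("open architecture      :", round(expected_lifecycle_cost(True), 1))
    print("integrated architecture:", round(expected_lifecycle_cost(False), 1))
```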
162

Implantation de pratiques et d'outils d'intelligence d'affaires pour supporter la prise de décision dans le sport compétitif : deux exemples venant du football universitaire / Implementing business intelligence practices and tools to support decision-making in competitive sport: two examples from university football

Bourdon, Adrien January 2017 (has links)
Over the past ten years, business intelligence and analytics (BI&A) have become topics of interest for information systems (IS) research as well as for practitioners in the field. BI&A initiatives have generated benefits for many organizations across sectors such as finance, insurance, entertainment, and communications. A relatively new domain where BI&A has made its appearance is competitive sport. Institutions, sports organizations, and elite athletes are seeking to leverage data and technology to improve their performance at various levels. While the use of BI&A tools in competitive sports has recently received greater media coverage, academic research in the field is still at an early stage. In this study, we use two situations in university football to highlight the components of a conceptual framework for value creation in competitive sports. We present one methodology for optimizing the collegiate recruiting process of a university organization, and another for optimizing decision-making in a specific game situation.
163

Health Data Analytics: Data and Text Mining Approaches for Pharmacovigilance

Liu, Xiao January 2016 (has links)
Pharmacovigilance is defined as the science and activities relating to the detection, assessment, understanding, and prevention of adverse drug events (WHO 2004). Post-approval adverse drug events are a major health concern. They account for about 700,000 emergency department visits, 120,000 hospitalizations, and $75 billion in medical costs annually (Yang et al. 2014). However, certain adverse drug events are preventable if detected early. Timely and accurate pharmacovigilance in the post-approval period is an urgent goal of the public health system. The availability of various sources of healthcare data for analysis in recent years opens new opportunities for data-driven pharmacovigilance research. In attempting to leverage this emerging healthcare big data, pharmacovigilance research faces several challenges. Most studies in pharmacovigilance focus on structured and coded data, and therefore miss important textual data from patient social media and clinical documents in EHRs. Most prior studies develop drug safety surveillance systems using a single data source with only one data mining algorithm. The performance of such systems is hampered by the bias in data and the pitfalls of the data mining algorithms adopted. In my dissertation, I address two broad research questions: 1) How do we extract rich adverse drug event-related information from textual data for active drug safety surveillance? 2) How do we design an integrated pharmacovigilance system to improve the decision-making process for drug safety regulatory intervention? To these ends, the dissertation comprises three essays. The first essay examines how to develop a high-performance information extraction framework for patient reports of adverse drug events in health social media. I found that medical entity extraction, drug-event relation extraction, and report source classification are necessary components for this task. In the second essay, I address the scalability issue of using social media for pharmacovigilance by proposing a distant supervision approach for information extraction. In the last essay, I develop a MetaAlert framework for pharmacovigilance with advanced text mining and data mining techniques to provide timely and accurate detection of adverse drug reactions. Models, frameworks, and design principles proposed in these essays advance not only pharmacovigilance research, but also more broadly contribute to health IT, business analytics, and design science research.
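To give a flavour of the distant-supervision idea mentioned for the second essay (training labels are projected from a known drug-event knowledge base onto unlabeled text rather than annotated by hand), here is a minimal sketch; the `KNOWN_PAIRS` seed set, the tokenization, and the labeling rule are assumptions, not the dissertation's actual pipeline.

```python
import re

# Assumed seed knowledge base of known drug -> adverse event pairs
KNOWN_PAIRS = {
    ("warfarin", "bleeding"),
    ("metformin", "nausea"),
}

def distant_label(sentence: str):
    """Label a sentence as a candidate drug-event relation if it mentions
    both members of a known pair (a simplistic distant-supervision rule)."""
    tokens = set(re.findall(r"[a-z]+", sentence.lower()))
    for drug, event in KNOWN_PAIRS:
        if drug in tokens and event in tokens:
            return (drug, event, 1)  # positive training example
    return None                       # unlabeled / negative candidate

posts = [
    "Started warfarin last month and noticed unusual bleeding.",
    "Metformin works fine for me, no side effects so far.",
]
training_examples = [lab for p in posts if (lab := distant_label(p))]
print(training_examples)  # [('warfarin', 'bleeding', 1)]
```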
164

Predictive analytics and data management in beef cattle production medicine

Abell, Kaitlynn M. January 1900 (has links)
Doctor of Philosophy / Department of Diagnostic Medicine/Pathobiology / Robert L. Larson / Bradley J. White / Utilization of data analytics allows for rapid and real-time decision making in the food animal production industry. The objective of my research was to implement and utilize different data analytic strategies in multiple sectors of the beef cattle industry in order to inform management, health, and performance strategies. A retrospective analysis using reproductive and genomic records demonstrated that a bull in a multiple-sire pasture will sire a larger number of calves than the other bulls in the same pasture. A further study was performed to determine whether behavior differences existed among bulls in a multiple-sire pasture and whether accelerometers could predict breeding behaviors. Machine learning classifiers were applied to accelerometer data to predict the behavior events of lying, standing, walking, and mounting. The classifiers were able to accurately predict lying and standing, but walking and mounting were predicted with lower accuracy due to the extremely low prevalence of these behaviors. Finally, a form of meta-analysis new to the veterinary literature, a mixed treatment comparison, was able to accurately identify differences among metaphylactic antimicrobials in outcomes of bovine respiratory disease morbidity, mortality, and retreatment morbidity. The meta-analysis was not successful in determining the effects of metaphylactic antimicrobials on performance outcomes.
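A minimal sketch of the kind of behavior classifier described above, using scikit-learn on synthetic accelerometer-style features; the feature construction, class counts, and model choice are assumptions made for illustration, not the study's actual methods. The deliberately small "mounting" class mimics the low-prevalence problem noted in the abstract.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)
# Synthetic per-window features (e.g., mean/variance of acceleration axes);
# rare behaviors are under-represented to mimic class imbalance.
counts = {"lying": 500, "standing": 400, "walking": 60, "mounting": 10}
X = np.vstack([rng.normal(loc=i, scale=1.0, size=(n, 4))
               for i, (_, n) in enumerate(counts.items())])
y = np.concatenate([[b] * n for b, n in counts.items()])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
clf = RandomForestClassifier(n_estimators=200, class_weight="balanced",
                             random_state=0).fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te)))
```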
165

The effect of quality metrics on the user watching behaviour in media content broadcast

Setterquist, Erik January 2016 (has links)
Understanding the effects of quality metrics on user behavior is important for the increasing number of content providers seeking to maintain a competitive edge. The two data sets used are gathered from a provider of live streaming and a provider of video-on-demand streaming. The important quality and non-quality features are identified using both correlation metrics and relative importance scores from machine learning methods. A model that can predict and simulate user behavior is developed and tested. A time series model, a machine learning model, and a combination of both are compared. Results indicate that both quality and non-quality features are important in understanding user behavior, and that the importance of quality features diminishes over time. For short prediction horizons, the model using quality features performs slightly better than the model without them.
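The two ways of ranking features mentioned above (correlation with the target versus model-based relative importance) can be sketched as follows; the feature names, data, and model are invented for illustration and do not come from the thesis.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)
n = 2000
df = pd.DataFrame({
    "buffering_ratio": rng.uniform(0, 0.2, n),   # quality feature (assumed)
    "startup_delay_s": rng.uniform(0, 10, n),    # quality feature (assumed)
    "content_age_days": rng.uniform(0, 365, n),  # non-quality feature (assumed)
})
# Synthetic target: watch time shaped mostly by quality in this toy example
df["watch_minutes"] = (30 - 80 * df["buffering_ratio"]
                       - 0.5 * df["startup_delay_s"]
                       - 0.01 * df["content_age_days"]
                       + rng.normal(0, 2, n))

features = ["buffering_ratio", "startup_delay_s", "content_age_days"]
print("correlation with watch time:")
print(df[features].corrwith(df["watch_minutes"]))

model = GradientBoostingRegressor(random_state=0).fit(df[features], df["watch_minutes"])
print("model-based relative importance:")
print(dict(zip(features, model.feature_importances_.round(3))))
```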
166

Sběr sémanticky obohacených clickstreamů / The gathering of semantically enriched clickstreams

Bača, Roman January 2009 (has links)
The aim of this thesis is to introduce readers to the area of web mining and familiarize them with tools for data mining on the web. The main emphasis is placed on the analytical software Piwik, which is compared with other currently available analytical tools. The thesis also aims to create compact documentation of Piwik, the largest part of which is devoted to a newly programmed plugin. The principle of retrieving information from user behavior on the web is described first in general terms and then more concretely in terms of how this new plugin implements it.
167

Scalable Discovery and Analytics on Web Linked Data

Abdelaziz, Ibrahim 07 1900 (has links)
Resource Description Framework (RDF) provides a simple way for expressing facts across the web, leading to Web linked data. Several distributed and federated RDF systems have emerged to handle the massive amounts of RDF data available nowadays. Distributed systems are optimized to query massive datasets that appear as a single graph, while federated systems are designed to query hundreds of decentralized and interlinked graphs. This thesis starts with a comprehensive experimental study of state-of-the-art RDF systems. It identifies a set of research problems for improving the state of the art, including: supporting the emerging RDF analytics required by many modern applications, querying linked data at scale, and enabling discovery on linked data. Addressing these problems is the focus of this thesis. First, we propose Spartex, a versatile framework for complex RDF analytics. Spartex extends SPARQL to seamlessly combine generic graph algorithms with SPARQL queries. Spartex implements a generic SPARQL operator as a vertex-centric program that interprets SPARQL queries and executes them efficiently using a built-in optimizer. We demonstrate that Spartex scales to datasets with billions of edges, and is at least as fast as the state-of-the-art specialized RDF engines. For analytical tasks, Spartex is an order of magnitude faster than existing alternatives. To address the scalability limitation of federated RDF engines, we propose Lusail, a scalable system for querying geo-distributed RDF graphs. Lusail follows a two-tier strategy: (i) locality-aware decomposition of the query into subqueries to maximize the computations at the endpoints and minimize intermediary results, and (ii) selectivity-aware execution to reduce network latency and increase parallelism. Our experiments on billions of triples show that Lusail outperforms existing systems by orders of magnitude in scalability and response time. Finally, enabling discovery on linked data is challenging due to the prior knowledge required to formulate SPARQL queries. To address these challenges, we develop novel techniques to (i) predict semantically equivalent SPARQL queries from a set of keywords by leveraging word embeddings, and (ii) generate fine-grained and non-blocking query plans to get fast and early results.
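To illustrate the locality-aware decomposition idea behind Lusail, the sketch below groups the triple patterns of a federated query by the endpoint assumed to answer them, so that each endpoint can evaluate as large a subquery as possible locally and ship fewer intermediate results. The predicate-to-endpoint map and the grouping rule are simplifications for illustration, not Lusail's actual algorithm.

```python
from collections import defaultdict

# Assumed mapping from predicate to the endpoint that can answer it
PREDICATE_LOCATION = {
    "foaf:name":       "http://dbpedia.org/sparql",
    "dbo:birthPlace":  "http://dbpedia.org/sparql",
    "drugbank:target": "http://drugbank.example/sparql",  # hypothetical endpoint
}

def decompose(triple_patterns):
    """Group triple patterns by endpoint so each group can run as one local
    subquery, minimizing intermediate results shipped over the network."""
    groups = defaultdict(list)
    for s, p, o in triple_patterns:
        groups[PREDICATE_LOCATION[p]].append((s, p, o))
    return dict(groups)

query = [
    ("?person", "foaf:name", "?name"),
    ("?person", "dbo:birthPlace", "?city"),
    ("?drug",   "drugbank:target", "?person"),
]
for endpoint, patterns in decompose(query).items():
    print(endpoint, "->", patterns)
```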
168

Estimating Bus Passengers' Origin-Destination of Travel Route Using Data Analytics on Wi-Fi and Bluetooth Signals

Jalali, Shahrzad 16 May 2019 (has links)
Accurate estimation of the Origin and Destination (O-D) of passengers has been an essential objective for public transit agencies because knowledge of passenger flow enables them to forecast ridership and plan bus schedules and routes. However, obtaining O-D information in traditional ways, such as conducting surveys, cannot fulfill today's requirements of intelligent transportation and route planning in smart cities. Estimating bus passengers' O-D using Wi-Fi and Bluetooth signals detected from their mobile devices is the primary objective of this project. For this purpose, we collected anonymized passengers' data using the SMATS TrafficBox™ sensor provided by "SMATS Traffic Solutions" company. We then performed pre-processing steps including data cleaning, feature extraction, and data normalization, and built various models using data mining techniques. The main challenge in this project was to distinguish between passengers' and non-passengers' signals, since the sensor captures all signals in its surrounding environment, including substantial noise from devices outside of the bus. To address this challenge, we applied hierarchical and K-Means clustering algorithms to separate passengers' from non-passengers' signals automatically. By assigning GPS data to passengers' signals, we could find commuters' O-D. Moreover, we developed a second method based on an online analysis of sequential data, where specific thresholds were set to recognize passengers' signals in real time. This method could create the O-D matrix online. Finally, in the validation phase, we compared the ground-truth data with the estimated O-D matrices from both approaches and calculated their accuracy. Based on the final results, our proposed approaches can detect more than 20% of passengers (compared to the 5% detection rate of traditional survey-based methods) and estimate the origin and destination of passengers with an accuracy of about 93%. With such promising results, these approaches are suitable alternatives to traditional, time-consuming ways of obtaining O-D data. This enables public transit companies to enhance their service offering by efficiently planning and scheduling bus routes, improving ride comfort, and lowering operating costs of urban transportation.
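A rough sketch of the clustering step described above (separating on-board devices from ambient noise) might look like the following; the two per-device features and the rule for choosing the passenger cluster are assumptions for illustration, not the project's actual feature set.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)
# Per-device features: [number of detections, minutes between first and last detection]
passengers     = np.column_stack([rng.integers(15, 60, 40), rng.uniform(10, 40, 40)])
non_passengers = np.column_stack([rng.integers(1, 4, 200),  rng.uniform(0, 3, 200)])
X = np.vstack([passengers, non_passengers]).astype(float)

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
# Assume the cluster whose devices stay visible longer corresponds to passengers
passenger_cluster = np.argmax([X[kmeans.labels_ == k, 1].mean() for k in range(2)])
is_passenger = kmeans.labels_ == passenger_cluster
print(f"{is_passenger.sum()} of {len(X)} devices classified as passengers")
```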
169

Outlier Detection In Big Data

Cao, Lei 29 March 2016 (has links)
The dissertation focuses on scaling outlier detection to work both on huge static as well as on dynamic streaming datasets. Outliers are patterns in the data that do not conform to the expected behavior. Outlier detection techniques are broadly applied in applications ranging from credit fraud prevention and network intrusion detection to stock investment tactical planning. For such mission-critical applications, a timely response is often of paramount importance. Yet processing outlier detection requests is algorithmically complex and resource-intensive. In this dissertation we investigate the challenges of detecting outliers in big data -- in particular those caused by the high velocity of streaming data, the large volume of static data, and the large cardinality of the input parameter space for tuning outlier mining algorithms. Effective optimization techniques are proposed to assure the responsiveness of outlier detection in big data. In this dissertation we first propose a novel optimization framework called LEAP to continuously detect outliers over data streams. The continuous discovery of outliers is critical for a large range of online applications that monitor high-volume, continuously evolving streaming data. LEAP encompasses two general optimization principles that utilize the rarity of the outliers and the temporal priority relationships among stream data points. Leveraging these two principles, LEAP not only continuously delivers outliers with respect to a set of popular outlier models, but also provides near real-time support for processing powerful outlier analytics workloads composed of large numbers of outlier mining requests with various parameter settings. Second, we develop a distributed approach to efficiently detect outliers over massive-scale static data sets. In this big data era, as the volume of data advances to new levels, the power of distributed compute clusters must be employed to detect outliers in a short turnaround time. In this research, our approach optimizes key factors determining the efficiency of distributed data analytics, namely communication costs and load balancing. In particular, we prove that the traditional frequency-based load balancing assumption is not effective. We thus design a novel cost-driven data partitioning strategy that achieves load balancing. Furthermore, we abandon the traditional approach of using one detection algorithm for all compute nodes and instead propose a novel multi-tactic methodology that adaptively selects the most appropriate algorithm for each node based on the characteristics of the data partition assigned to it. Third, traditional outlier detection systems process each individual outlier detection request, instantiated with a particular parameter setting, one at a time. This is not only prohibitively time-consuming for large datasets, but also tedious for analysts as they explore the data to home in on the most appropriate parameter setting or on the desired results. We thus design an interactive outlier exploration paradigm that is not only able to answer traditional outlier detection requests in near real-time, but also offers innovative outlier analytics tools to assist analysts to quickly extract, interpret, and understand the outliers of interest. Our experimental studies, including performance evaluations and user studies conducted on real-world datasets (stock, sensor, moving object, and geolocation data), confirm both the effectiveness and efficiency of the proposed approaches.
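To make the streaming setting concrete, here is a minimal distance-based detector over a sliding window: a point is flagged when it has fewer than k neighbours within radius r among the points currently in the window. This is a textbook baseline under assumed parameters, not the LEAP framework itself.

```python
import random
from collections import deque

def stream_outliers(stream, window=100, r=3.0, k=3):
    """Yield (index, value) for points with fewer than k neighbours
    within distance r inside the current sliding window."""
    recent = deque(maxlen=window)
    for i, x in enumerate(stream):
        neighbours = sum(1 for y in recent if abs(x - y) <= r)
        if len(recent) == recent.maxlen and neighbours < k:
            yield i, x
        recent.append(x)

random.seed(0)
data = [random.gauss(0, 1) for _ in range(1000)]
data[500] = 25.0                    # inject an obvious outlier
print(list(stream_outliers(data)))  # expected to flag index 500
```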
170

Analysis Guided Visual Exploration of Multivariate Data

Yang, Di 04 May 2007 (has links)
Visualization systems traditionally focus on graphical representation of information. They tend not to provide integrated analytical services that could aid users in tackling complex knowledge discovery tasks. Users' exploration in such environments is usually impeded due to several problems: 1) Valuable information is hard to discover when too much data is visualized on the screen. 2) They have to manage and organize their discoveries offline, because no systematic discovery management mechanism exists. 3) Their discoveries based on visual exploration alone may lack accuracy. 4) They have no convenient access to the important knowledge learned by other users. To tackle these problems, it has been recognized that analytical tools must be introduced into visualization systems. In this paper, we present a novel analysis-guided exploration system, called the Nugget Management System (NMS). It leverages the collaborative effort of human comprehensibility and machine computations to facilitate users' visual exploration process. Specifically, NMS first extracts the valuable information (nuggets) hidden in datasets based on the interests of users. Given that similar nuggets may be re-discovered by different users, NMS consolidates the nugget candidate set by clustering based on their semantic similarity. To solve the problem of inaccurate discoveries, data mining techniques are applied to refine the nuggets to best represent the patterns existing in datasets. Lastly, the resulting well-organized nugget pool is used to guide users' exploration. To evaluate the effectiveness of NMS, we integrated NMS into XmdvTool, a freeware multivariate visualization system. User studies were performed to compare the users' efficiency and accuracy in completing tasks on real datasets, with and without the help of NMS. Our user studies confirmed the effectiveness of NMS. Keywords: Visual Analytics, Visual Knowledge
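As a toy illustration of the consolidation step described above (merging near-duplicate discoveries by clustering their textual descriptions), the sketch below compares TF-IDF vectors of hypothetical nugget summaries; the sample nuggets and the similarity threshold are assumptions, not NMS's actual representation.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical textual summaries of nuggets discovered by different users
nuggets = [
    "high horsepower correlates with low fuel economy",
    "vehicles with high horsepower tend to have poor fuel economy",
    "Japanese cars cluster around low weight and high mpg",
]
vectors = TfidfVectorizer().fit_transform(nuggets)
similarity = cosine_similarity(vectors)

# Treat nuggets above an assumed similarity threshold as duplicates to consolidate
threshold = 0.3
for i in range(len(nuggets)):
    for j in range(i + 1, len(nuggets)):
        if similarity[i, j] > threshold:
            print(f"candidate duplicates: {nuggets[i]!r} <-> {nuggets[j]!r}")
```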
