441

An Exploratory Statistical Method For Finding Interactions In A Large Dataset With An Application Toward Periodontal Diseases

Lambert, Joshua 01 January 2017 (has links)
It is estimated that periodontal disease affects up to 90% of the adult population. Given the complexity of the host environment, many factors contribute to expression of the disease. Age, gender, socioeconomic status, smoking status, and race/ethnicity are all known risk factors, as are a handful of known comorbidities. Certain vitamins and minerals have been shown to be protective against the disease, while some toxins and chemicals have been associated with an increased prevalence. The role of toxins, chemicals, vitamins, and minerals in relation to the disease is believed to be complex and potentially modified by the known risk factors. A large, comprehensive dataset from the 1999-2003 National Health and Nutrition Examination Survey (NHANES) contains full and partial mouth examinations of subjects for measurement of periodontal diseases, as well as patient demographic information and approximately 150 environmental variables. In this dissertation, a Feasible Solution Algorithm (FSA) is used to investigate statistical interactions of these various chemical and environmental variables related to periodontal disease. This sequential algorithm can be applied to traditional statistical modeling methods to explore two- and three-way interactions related to the outcome of interest. FSA can also be used to identify unique subgroups of patients in whom periodontitis is most (or least) prevalent. In this dissertation, FSA is used to explore the NHANES data and suggest interesting relationships between the toxins, chemicals, vitamins, minerals, and known risk factors that have not been previously identified.
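The sequential search FSA performs is only described at a high level in the abstract. As a rough, hedged sketch (the function name, the correlation criterion, and the random-restart scheme are illustrative assumptions, not the dissertation's implementation), a feasible-solution search for a two-way interaction might look like:

```python
import numpy as np

def fsa_two_way(X, y, n_starts=25, seed=0):
    """FSA-style greedy search for the product interaction most related to y.

    From each random starting pair of predictors, repeatedly exchange one
    member of the pair for the single best alternative until no exchange
    improves the criterion (here: absolute correlation of the product term
    with the outcome). Each endpoint is a 'feasible solution': no
    one-variable exchange can improve it. The best endpoint over all
    random starts is returned.
    """
    n, p = X.shape
    rng = np.random.default_rng(seed)

    def crit(i, j):
        prod = X[:, i] * X[:, j]
        if prod.std() == 0.0:
            return 0.0
        return abs(np.corrcoef(prod, y)[0, 1])

    best_pair, best_val = None, -1.0
    for _ in range(n_starts):
        cur = list(rng.choice(p, size=2, replace=False))
        val = crit(*cur)
        improved = True
        while improved:
            improved = False
            for pos in (0, 1):              # try exchanging each member of the pair
                fixed = cur[1 - pos]
                for cand in range(p):
                    if cand == fixed:
                        continue
                    v = crit(cand, fixed)
                    if v > val + 1e-12:
                        val, cur[pos], improved = v, cand, True
        if val > best_val:
            best_pair, best_val = tuple(int(c) for c in sorted(cur)), val
    return best_pair, best_val
```

Because each start only guarantees a local optimum, practical FSA runs use many random starts and report the solutions that recur.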
442

ESSAYS ON EXTERNAL FORCES IN CAPITAL MARKETS

Painter, Marcus 01 January 2019 (has links)
In the first chapter, I find that counties more likely to be affected by climate change pay more in underwriting fees and initial yields to issue long-term municipal bonds than counties unlikely to be affected by climate change. This difference disappears when comparing short-term municipal bonds, implying that the market prices climate change risk for long-term securities only. Higher issuance costs for climate-risk counties are driven by bonds with lower credit ratings. Investor attention is a driving factor, as the difference in issuance costs between bonds issued by climate-affected and non-climate-affected counties increases after the release of the 2006 Stern Review on climate change. In the second chapter, I document the investment value of alternative data and examine how market participants react to the data's dissemination. Using satellite images of the parking lots of US retailers, I find that a long-short trading strategy based on growth in car count earns an alpha of 1.6% per month. I then show that, after the release of the satellite data, hedge fund trades are more sensitive to growth in car count and are more profitable in affected stocks. Conversely, individual investor demand becomes less sensitive to growth in car count and less profitable in affected stocks. Further, the increase in information asymmetry between investors due to the availability of alternative data leads to a decrease in the liquidity of affected firms.
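The second chapter's trading strategy is a standard long-short sort on a signal, here growth in car count. A minimal sketch of one rebalance, with illustrative quantile and weighting choices that are not taken from the essay:

```python
import numpy as np

def long_short_return(signal, next_ret, q=0.2):
    """One rebalance of an equal-weighted long-short sort.

    Rank stocks by the signal (e.g. year-over-year growth in parking-lot
    car count), go long the top quantile and short the bottom quantile,
    and return the spread in next-period returns.
    """
    signal = np.asarray(signal, dtype=float)
    next_ret = np.asarray(next_ret, dtype=float)
    lo, hi = np.quantile(signal, [q, 1 - q])
    long_leg = next_ret[signal >= hi].mean()    # average return of top-quantile names
    short_leg = next_ret[signal <= lo].mean()   # average return of bottom-quantile names
    return long_leg - short_leg
```

Repeating this each month and regressing the spread on standard factor returns is what produces the monthly alpha estimate cited above.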
443

DATA COLLECTION FRAMEWORK AND MACHINE LEARNING ALGORITHMS FOR THE ANALYSIS OF CYBER SECURITY ATTACKS

Unknown Date (has links)
The integrity of network communications is constantly being challenged by increasingly sophisticated intrusion techniques. Attackers are shifting to stealthier and more complex forms of attack in an attempt to bypass known mitigation strategies. In addition, many detection methods for popular network attacks have been developed using outdated or non-representative attack data. To effectively develop modern detection methodologies, there is a need to acquire data that fully encompasses the behaviors of persistent and emerging threats. When collecting modern-day network traffic for intrusion detection, substantial amounts of traffic can be collected, much of which consists of relatively few attack instances compared to normal traffic. This skewed distribution between normal and attack data can lead to high levels of class imbalance. Machine learning techniques can be used to aid in attack detection, but large imbalances between normal (majority) and attack (minority) instances can lead to inaccurate detection results. Includes bibliography. Dissertation (Ph.D.)--Florida Atlantic University, 2019. FAU Electronic Theses and Dissertations Collection.
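One common remedy for the class imbalance described above is to rebalance the training set before fitting a model. A minimal sketch of random undersampling of the majority class (the dissertation may use different or additional techniques, such as class weighting or synthetic oversampling):

```python
import numpy as np

def random_undersample(X, y, ratio=1.0, seed=0):
    """Balance a skewed dataset by undersampling the majority class.

    Keeps every minority (attack) instance and draws a random subset of
    majority (normal) instances so that the majority:minority ratio is at
    most `ratio`.
    """
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    majority = classes[np.argmax(counts)]
    min_idx = np.flatnonzero(y == minority)
    maj_idx = np.flatnonzero(y == majority)
    n_keep = min(len(maj_idx), int(ratio * len(min_idx)))
    keep = np.concatenate([min_idx, rng.choice(maj_idx, n_keep, replace=False)])
    rng.shuffle(keep)                      # avoid class-ordered training data
    return X[keep], y[keep]
```

The trade-off is that discarded majority instances carry information; on very large traffic captures that loss is often acceptable in exchange for a usable class balance.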
444

Organizational Success in the Big Data Era: Development of the Albrecht Data-Embracing Climate Scale (ADEC)

Albrecht, Lauren Rebecca 01 September 2016 (has links)
In today’s information age, technological advances in virtually every industry allow organizations both big and small to create and store more data than ever before. Though data are highly abundant, they are still often underutilized as a resource for improving organizational performance. The popularity of and intrigue around big data specifically have opened up new opportunities to study how organizations embrace evidence and use it to improve their business. The focus of big data has mainly been on specific technologies, techniques, or its use in everyday life; what has been critically missing from the conversation, however, is the consideration of culture and climate to support effective data use in organizations. Currently, many organizations want to develop a data-embracing climate or make their existing climates more data-informed. The purpose of this project was to develop a scale to assess the current state of data usage in organizations, which can be used to help organizations measure how well they manage, share, and use data to make informed decisions. I defined the phenomenon of a data-embracing climate based on a review of a broad range of business, computer science, and industrial-organizational psychology literature. Using this definition, I developed a scale to measure the newly defined construct by first conducting an exploratory factor analysis, then an item retranslation task, and finally a confirmatory factor analysis. This research provides support for the reliability and validity of the Albrecht Data-Embracing Climate Scale (ADEC); however, this new area of research would benefit from replication of these results and further support for the new construct. Implications for science and practice are discussed. I sought to make a valuable contribution to the field of I-O psychology and to create a useful instrument for researchers and practitioners in multiple and diverse fields.
I hope others will benefit from this scale to measure how organizations use evidence from data to make informed decisions and gain a competitive advantage beyond intuition alone. Do not cite without express permission from the author.
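Scale-development work of this kind typically reports an internal-consistency reliability estimate alongside the factor analyses. As a hedged illustration of one standard computation (not necessarily the statistic used for the ADEC), Cronbach's alpha for a respondents-by-items score matrix:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, k_items) score matrix.

    alpha = k/(k-1) * (1 - sum of item variances / variance of total score).
    Values near 1 indicate the items behave as a coherent scale.
    """
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()   # sum of per-item variances
    total_var = items.sum(axis=1).var(ddof=1)     # variance of the summed score
    return k / (k - 1) * (1 - item_vars / total_var)
```

Perfectly parallel items give alpha of exactly 1, while mutually independent items give alpha near 0, which is what makes the statistic a quick first check on a new instrument.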
445

Essays in History and Spatial Economics with Big Data

Lee, Sun Kyoung January 2019 (has links)
This dissertation contains three essays in history and spatial economics with big data. As part of my dissertation, I develop a modern machine-learning-based approach to connecting large datasets. Merging several massive databases and matching the records within them presents challenges, some straightforward and others more complex. I employ artificial intelligence and machine learning technologies to link, and then analyze, massive amounts of historical US federal census, Department of Labor, and Bureau of Labor Statistics data. The transformation of the US economy over this period was remarkable: from a rural economy at the beginning of the 19th century to an industrial nation by its end. More strikingly, after lagging behind the technological frontier for most of the nineteenth century, the United States entered the twenty-first century as the global technology leader and the richest nation in the world. Results from this dissertation reveal how people lived and how businesses operated. They tell us about the past that led us to where we are now in terms of people, geography, prices and wages, wealth, revenue, output, capital, the numbers and types of workers, urbanization, migration, and industrialization. As part of this endeavor, the first chapter studies how the benefits of improving urban mass transit infrastructure are shared across workers with different skills. It exploits a unique historical setting to estimate the impact of urban transportation infrastructure: the introduction of mass public transit in late nineteenth- and twentieth-century New York City. I link individual-level US census data to investigate how urban transit infrastructure differentially affects the welfare of workers with heterogeneous skills. My second chapter measures immigrants' role in the US rise as an economic power.
In particular, this chapter focuses on a potential mechanism by which immigrants might have spurred economic prosperity: the transfer of new knowledge. This is the first project to use advances in quantitative spatial theory along with advanced big-data techniques to understand the contribution of immigrants to the process of U.S. economic growth. The key benefit of this approach is that it links modern theory with massive amounts of microeconomic data about individual immigrants, their locations, and their occupations, to address questions that are extremely difficult to assess otherwise. Specifically, the dataset will help researchers understand the extent to which the novel ideas and expertise immigrants brought to U.S. shores drove the nation’s emergence as an industrial and technological powerhouse. My third chapter exploits advances in data digitization and machine learning to study intergenerational mobility in the United States before World War II. Using machine learning techniques, I construct a massive database of multiple generations of fathers and sons. This allows us to identify “lands of opportunity”: locations and times in American history where kids had a chance to move up the income ladder. I find that intergenerational mobility elasticities were relatively stable during 1880-1940, that there are regional disparities in the opportunities kids had to move up, and that these geographic disparities in intergenerational mobility have evolved over time.
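The record-linkage step behind these chapters can be illustrated with a toy sketch: block candidate pairs on a stable field, score each pair with a string-similarity feature, and keep the best match above a threshold. Everything here (the field names, difflib as the similarity measure, the fixed threshold) is an illustrative stand-in for the learned matching model the dissertation describes:

```python
from difflib import SequenceMatcher

def link_records(census_a, census_b, threshold=0.85):
    """Toy record-linkage pass between two lists of census records.

    Blocks on birth year (to avoid an all-pairs comparison), scores
    candidate pairs by string similarity of the full name, and keeps the
    best-scoring match above the threshold.
    """
    # index the second census by birth year for blocking
    by_year = {}
    for j, rec in enumerate(census_b):
        by_year.setdefault(rec["birth_year"], []).append(j)

    links = {}
    for i, rec in enumerate(census_a):
        best_j, best_s = None, threshold
        for j in by_year.get(rec["birth_year"], []):
            s = SequenceMatcher(None, rec["name"], census_b[j]["name"]).ratio()
            if s >= best_s:
                best_j, best_s = j, s
        if best_j is not None:
            links[i] = best_j
    return links
```

A trained classifier replaces the fixed threshold with a score learned from many such features (name, age, birthplace, household members), which is what makes linkage robust to transcription noise in historical records.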
446

Value as a Motivating Factor for Collaboration: The case of a collaborative network for wind asset owners for potential big data sharing

Kenjangada Kariappa, Ganapathy, Bjersér, Marcus January 2019 (has links)
The world's need for energy is increasing even as we realize the consequences of existing unsustainable methods of energy production. Wind power is a potential partial solution, but it is a relatively new source of energy. Advances in technology and innovation can be part of the solution, but the wind energy industry has embraced them too slowly due to, among other reasons, a lack of incentives in terms of the added value provided. Collaboration and big data may provide a key to overcoming this. However, to our knowledge, this research area has received little attention, especially in the context of the wind energy industry. The purpose of this study is to explore value as a motivating factor for potential big data collaboration via a collaborative network, in the context of the collaborative network for wind asset owners O2O WIND International. A cross-sectional, multi-method, qualitative, in-depth single case study was conducted. The data collected and analyzed are based on four semi-structured interviews and a rich set of documentary secondary data on 25 of the participants in the collaborative network, comprising 3866 pages of documents and 124 web pages. The main findings are as follows. The 25 participants of the collaborative network were evaluated, and their approaches to three different types of value were visualized through a novel model: a three-dimensional value-approach space. From this visualization, clusters of participants can be distinguished, yielding 6 different approaches to value among the 25 participants. Furthermore, 14 different categories of value that participants consider possible to create through the collaborative network have been identified. These values have been categorized based on fundamental types of value, their dimensions, and four value processes, and analyzed for patterns and similarities.
The classification results in a unique categorization of the participants of a collaborative network. These categories can serve as customer segments for the focal firm of the collaborative network to target. The interviews yielded insights into the current state of the industry, existing and future market problems and needs, and existing and future market opportunities. Possible business-model implications of our findings, both for the focal firm behind the collaborative network O2O WIND International and for the participants in the collaboration, are then discussed. We conclude that big data and collaborative networks have potential for value creation in the wind power sector, if the business models of those involved take them into account. However, more research is necessary, and suggestions for future work are made.
447

A big data analytics framework to improve healthcare service delivery in South Africa

Mgudlwa, Sibulela January 2018 (has links)
Thesis (MTech (Information Technology))--Cape Peninsula University of Technology, 2018. / Healthcare facilities in South Africa accumulate big data daily. However, this data is not being utilised to its full potential. The healthcare sector still uses traditional methods to store, process, and analyse data, and there are currently no big data analytics tools in use in the South African healthcare environment. This study was conducted to establish what factors hinder the effective use of big data in the South African healthcare environment. To fulfil the objectives of this research, qualitative methods were followed. Using the case study method, two healthcare organisations were selected as cases, which enabled the researcher to find similarities between the cases and move towards generalisation. The data collected in this study was analysed using Actor-Network Theory (ANT). Through the application of ANT, the researcher was able to uncover the factors influencing big data analytics in the healthcare environment. ANT was essential to the study, as it brought out the different interactions that take place between human and non-human actors, resulting in big data. From the analysis, findings were drawn and interpreted. The interpretation of the findings led to the framework presented in Figure 5.5, developed to guide the healthcare sector of South Africa towards the selection of appropriate big data analytics tools. The contribution of this study is twofold: theoretical and practical. Theoretically, the developed framework acts as a guide towards the selection of big data analytics tools. Practically, this guide can be used by South African healthcare practitioners to gain a better understanding of big data analytics and how it can be used to improve healthcare service delivery.
448

Bringing interpretability and visualization with artificial neural networks

Gritsenko, Andrey 01 August 2017 (has links)
Extreme Learning Machine (ELM) is a training algorithm for the Single-Layer Feed-forward Neural Network (SLFN). In theory, ELM differs from other training algorithms in the existence of an explicitly given solution, a consequence of the immutability of the randomly initialized weights. In practice, ELMs achieve performance similar to that of other state-of-the-art training techniques while taking much less time to train a model. Experiments show that training an ELM can be up to 5 orders of magnitude faster than the standard error back-propagation algorithm. ELM is a recently developed technique that has proved its efficiency in classic regression and classification tasks, including multi-class cases. In this thesis, extensions of ELMs to problems that are not typical for Artificial Neural Networks (ANNs) are presented. The first extension, described in the third chapter, allows ELMs to produce probabilistic outputs for multi-class classification problems. The standard way of solving this type of problem is based on a 'majority vote' over the classifier's raw outputs. This approach can raise issues if the penalty for misclassification differs between classes; in such cases, probabilistic outputs are more useful. In the scope of this extension, two methods are proposed, along with an alternative way of interpreting probabilistic outputs. ELMs also prove useful for non-linear dimensionality reduction and visualization, based on repeated re-training and re-evaluation of the model. The fourth chapter introduces adaptations of ELM-based visualization for classification and regression tasks. A set of experiments has been conducted to show that these adaptations provide better visualization results, which can then be used to perform classification or regression on previously unseen samples. Shape registration of 3D models with non-isometric distortion is an open problem in 3D computer graphics and computational geometry.
The fifth chapter discusses a novel approach to this problem that introduces a similarity metric for spectral descriptors. In practice, this approach has been implemented in two methods. The first uses a Siamese Neural Network to embed the original spectral descriptors into a lower-dimensional metric space in which the Euclidean distance provides a good measure of similarity. The second method uses Extreme Learning Machines to learn the similarity metric directly on the original spectral descriptors. Over a set of experiments, the consistency of the proposed approach for solving the deformable registration problem has been demonstrated.
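The closed-form training that makes ELM fast can be sketched in a few lines: draw the hidden-layer weights at random, leave them fixed, and solve for the output weights by least squares on the hidden activations. A minimal illustration (not the thesis code; the layer size and tanh activation are assumptions):

```python
import numpy as np

def elm_train(X, T, n_hidden=50, seed=0):
    """Train a minimal Extreme Learning Machine.

    Hidden weights are random and never updated (the 'immutability of
    initialized weights' above); only the output weights are fit, in
    closed form, which is why training avoids iterative back-propagation.
    """
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))   # fixed random input weights
    b = rng.standard_normal(n_hidden)                 # fixed random biases
    H = np.tanh(X @ W + b)                            # hidden activations
    beta, *_ = np.linalg.lstsq(H, T, rcond=None)      # closed-form output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Apply a trained ELM to new inputs."""
    return np.tanh(X @ W + b) @ beta
```

The single least-squares solve replaces the many gradient passes of back-propagation, which is the source of the large speedups reported above.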
449

Shared and distributed memory parallel algorithms to solve big data problems in biological, social network and spatial domain applications

Sharma, Rahil 01 December 2016 (has links)
Big data refers to information that cannot be processed and analyzed using traditional approaches and tools due to the 4 V's: sheer Volume, the Velocity at which data is received and processed, and data Variety and Veracity. Today, massive volumes of data originate in domains such as geospatial analysis and biological and social networks. Hence, scalable algorithms for efficient processing of this massive data are a significant challenge in the field of computer science. One way to achieve such efficient and scalable algorithms is by using shared- and distributed-memory parallel programming models. In this thesis, we present a variety of such algorithms to solve problems in the domains mentioned above. We solve five problems that fall into two categories. The first group of problems deals with community detection. Detecting communities in real-world networks is of great importance because communities consist of patterns that can be viewed as independent components, each of which has distinct features and can be detected based on network structure. For example, communities in social networks can help target users for marketing purposes, provide recommendations for users to connect with or communities and forums to join, etc. We develop a novel sequential algorithm to accurately detect community structures in biological protein-protein interaction networks, where a community corresponds to a functional module of proteins. Generally, such sequential algorithms are computationally expensive, which makes them impractical for large real-world networks. To address this limitation, we develop a new, highly scalable Symmetric Multiprocessing (SMP) based parallel algorithm to detect high-quality communities in large subsections of social networks like Facebook and Amazon. Due to the SMP architecture, however, our algorithm cannot process networks larger than the RAM of a single machine.
With the increasing size of social networks, community detection has become even more difficult, since a network can reach hundreds of millions of vertices and edges. Processing such massive networks requires several hundred gigabytes of RAM, which is only possible with a distributed infrastructure. To address this, we develop a novel hybrid (shared + distributed memory) parallel algorithm to efficiently detect high-quality communities in massive Twitter and .uk domain networks. The second group of problems deals with efficiently processing spatial Light Detection and Ranging (LiDAR) data. LiDAR data is widely used in forest and agricultural crop studies, landscape classification, 3D urban modeling, etc. Technological advancements in LiDAR sensors have enabled highly accurate and dense LiDAR point clouds, resulting in massive data volumes that pose computational challenges for processing and storage. We develop the first published landscape-driven data reduction algorithm, which uses the slope map of the terrain as a filter to reduce the data without sacrificing accuracy. Our algorithm is highly scalable and adopts a shared-memory parallel architecture. We also develop a parallel interpolation technique used to generate highly accurate continuous terrains, i.e., Digital Elevation Models (DEMs), from discrete LiDAR point clouds.
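The slope-map filtering idea can be illustrated with a toy sequential sketch: grid the point cloud, estimate a per-cell slope, keep every point in steep cells (where terrain detail matters), and thin flat ones. All parameters and the per-cell slope estimate here are illustrative assumptions, not the algorithm from the thesis:

```python
import numpy as np

def slope_filter(points, cell=1.0, slope_thresh=0.2, flat_keep=0.1, seed=0):
    """Slope-driven thinning of a LiDAR point cloud given as (x, y, z) rows.

    Grids the points in the xy-plane, estimates per-cell slope as the
    elevation range over the cell size, keeps all points in steep cells,
    and keeps only a random fraction of points in flat cells.
    """
    rng = np.random.default_rng(seed)
    ij = np.floor(points[:, :2] / cell).astype(int)
    keep = np.zeros(len(points), dtype=bool)

    # group point indices by grid cell
    cells = {}
    for idx, key in enumerate(map(tuple, ij)):
        cells.setdefault(key, []).append(idx)

    for idxs in cells.values():
        z = points[idxs, 2]
        slope = (z.max() - z.min()) / cell
        if slope >= slope_thresh:
            keep[idxs] = True                                  # steep: keep all
        else:
            n = max(1, int(flat_keep * len(idxs)))             # flat: thin heavily
            keep[rng.choice(idxs, n, replace=False)] = True
    return points[keep]
```

In a shared-memory parallel version, the per-cell loop is the natural unit of work, since cells are independent of one another.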
450

From Crisis to Crisis: A Big Data, Antenarrative Analysis of How Social Media Users Make Meaning During and After Crisis Events

Bair, Adam R. 01 May 2016 (has links)
This dissertation examines how individuals use social media to respond to crisis situations, both during and after the event. Using rhetorical criticism together with David Boje’s theories and concepts regarding antenarrative, a process of making sense of past, present, and future events, I explored how social media users make sense of and respond to a crisis. Specifically, my research was guided by three major questions: Are traditional, pre-social-media image-repair strategies effective in social media environments? How do participants use social media in crisis events, and how does this usage shape the rhetorical framing of a crisis? How might organizations effectively adapt traditional crisis communication plans for use in social media during future crisis events? These questions were applied to four case studies to provide a range of insights about not only how individuals respond to a crisis but also what strategies organizations use to present information about it. The cases were carefully selected to include a variety of crisis types and responses: a business (H&R Block) communicating with clients about a software error; a governmental organization (the NTSB) presenting information about the cause of an airplane crash and about missteps in its response; a governmental agency (the CDC) responding to a global health crisis with various audiences and types of responses; and an activist movement (Black Lives Matter) attempting to unify social media users to lobby for change and highlight the scope of the issues to the nation. Analyses of these cases show not only how individuals and groups used social media to make sense of crisis events, but also what rhetorical strategies were used to respond to a crisis situation.
Understanding how individuals and groups make sense of crises will provide additional insight for information designers, public relations professionals, organizations and businesses, and individuals using social media to effect change.
