  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
161

Semantic Keyword Search on Large-Scale Semi-Structured Data

January 2016 (has links)
abstract: Keyword search provides a simple and user-friendly mechanism for information search, and has become increasingly popular for accessing structured or semi-structured data. However, two open issues of keyword search on semi-structured data are not yet well addressed by existing work. First, while a growing amount of investigation has been done in this important area, most existing work concentrates on efficiency rather than search quality and may fail to deliver high-quality results from a semantic perspective. The majority of existing work generates minimal sub-graph results that are oblivious to the entity and relationship semantics embedded in the data and in the user query. Other studies define results as subtrees or subgraphs that contain all query keywords but are not necessarily "minimal"; this construction method, however, suffers from the same semantic misalignment between data and user query. This work studies how to define results that capture users' search intention, and how to generate such search-intention-aware results. Second, most existing research cannot handle large-scale structured data. As data volume has grown rapidly in recent years, efficiently processing keyword queries on large-scale structured data has become important. MapReduce is widely acknowledged as an effective programming model for processing big data. For keyword query processing on a data graph, graph algorithms are first proposed that efficiently return query results consistent with users' search intention; these algorithms are then migrated to MapReduce to support big data. For keyword query processing on a schema graph, a keyword query is first transformed into multiple SQL queries, which are then run on the structured data.
It is therefore crucial to find the optimal way to execute a SQL query using MapReduce, minimizing processing time. In this work, a system called SOSQL is developed that generates the optimal MapReduce execution plan for a SQL query Q in O(n^2) time, where n is the number of input tables of Q. / Dissertation/Thesis / Doctoral Dissertation Computer Science 2016
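The abstract above quotes an O(n^2) planning cost in the number of input tables. As an illustration only (SOSQL's actual planner is not reproduced here, and the table names and cost model below are invented), a greedy left-deep join-ordering heuristic shows how an O(n^2) planner over n tables can look:

```python
# Illustrative sketch: a greedy left-deep join-order heuristic.
# Each of the n steps scans the remaining tables for the cheapest
# next join, giving O(n^2) cost evaluations overall. This is NOT
# the SOSQL algorithm, only a toy with the same complexity shape.

def greedy_join_order(tables, join_cost):
    """tables: dict name -> row count; join_cost(left_rows, right_rows) -> float."""
    remaining = dict(tables)
    current = min(remaining, key=remaining.get)  # start from the smallest table
    order = [current]
    rows = remaining.pop(current)
    while remaining:
        # O(n) scan for the cheapest next table -> O(n^2) total.
        nxt = min(remaining, key=lambda t: join_cost(rows, remaining[t]))
        order.append(nxt)
        rows = join_cost(rows, remaining.pop(nxt))
    return order

# Toy cost model: estimated join output size, standing in for MapReduce cost.
cost = lambda l, r: l * r / 1000
plan = greedy_join_order({"orders": 5000, "users": 100, "items": 800}, cost)
```

Real planners use much richer cost models (shuffle volume, number of MapReduce rounds), but the quadratic scan structure is the same.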
162

An Information Based Optimal Subdata Selection Algorithm for Big Data Linear Regression and a Suitable Variable Selection Algorithm

January 2017 (has links)
abstract: This article proposes a new information-based optimal subdata selection (IBOSS) algorithm, the Squared Scaled Distance Algorithm (SSDA), based on the invariance of the determinant of the information matrix under orthogonal transformations, especially rotations. Extensive simulation results show that the new IBOSS algorithm retains the nice asymptotic properties of IBOSS and yields a larger determinant of the subdata information matrix. It has the same order of time complexity as the D-optimal IBOSS algorithm; however, it exploits vectorized calculation, avoiding for-loops, and is approximately 6 times as fast as the D-optimal IBOSS algorithm in R. The robustness of SSDA is studied from three aspects: nonorthogonality, inclusion of interaction terms, and variable misspecification. A new, accurate variable selection algorithm is proposed to support the use of IBOSS algorithms when a large number of variables is present with only a few important variables among them. By aggregating random-subsample results, this variable selection algorithm is much more accurate than the LASSO method applied to the full data. Since its time complexity depends only on the number of variables, it is also very computationally efficient when the number of variables is fixed (and not massively large) as the sample size n increases. More importantly, by using subsamples it solves the problem that the full data cannot be held in memory when a data set is too large. / Dissertation/Thesis / Masters Thesis Statistics 2017
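The D-optimal IBOSS baseline that SSDA refines can be sketched as follows, under the assumption (taken from the published IBOSS literature, not from this thesis) that roughly k/(2p) rows with the smallest and largest values of each covariate are kept; SSDA's rotation-based refinement is not reproduced here:

```python
import numpy as np

# Hedged sketch of the D-optimal IBOSS idea: for each column, keep the
# rows with the most extreme values, since extreme design points
# enlarge the information matrix X'X and hence its determinant.

def iboss_doptimal(X, k):
    """Select ~k rows of X (n x p): k/(2p) smallest and k/(2p) largest
    values per column, chosen without replacement across columns."""
    n, p = X.shape
    r = k // (2 * p)
    available = np.ones(n, dtype=bool)
    chosen = []
    for j in range(p):
        idx = np.where(available)[0]
        order = idx[np.argsort(X[idx, j])]          # sort remaining rows by column j
        picks = np.concatenate([order[:r], order[-r:]])  # r smallest + r largest
        chosen.extend(picks.tolist())
        available[picks] = False                    # no row selected twice
    return np.array(chosen)

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))
sub = iboss_doptimal(X, 100)   # 100 row indices: 25 low + 25 high per column
```

Each column is processed with one vectorized sort, which is the kind of for-loop-free computation the abstract credits for SSDA's speedup in R.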
163

Implicações do fenômeno big data na análise para inteligência estratégica

Nesello, Priscila 10 April 2014 (has links)
A large amount of data is produced daily by commercial and financial operations, social media, mobile devices, sensors, and other equipment embedded in the physical world. This phenomenon gave rise to the term big data, and its effects can be felt by companies, science, and government. However, it is strategic intelligence, not information itself, that helps managers extract value from large volumes of data. This requires transforming information dispersed in the environment into structured knowledge that is useful for decision making in organizations. It is a complex process: despite the tools and techniques available, the intelligence professional must be able to handle the cognitive complexity inherent in analysis. In this context, the objective of this work was to examine how the big data phenomenon affects the analysis process in strategic intelligence activity. The research addressed how the big data phenomenon is perceived by the interviewees in their analytical activities in strategic intelligence, and proposed an analysis of its implications. To this end, a qualitative exploratory study was conducted. Brazilian professionals residing in the states of Rio Grande do Sul, Rio de Janeiro, Distrito Federal, and São Paulo were interviewed; they were selected through agents with experience, knowledge, and standing in the fields of strategic intelligence and/or big data. The interview guide was structured around the dimensions of the big data phenomenon and its effects on analytical activities in the strategic intelligence process. Content analysis was used to analyze the data. The results indicate that the volume of big data contributes to understanding collection methods but hinders mastery of the subject matter.
Other findings reveal that, for some interviewees, big data is already part of professional practice in more elaborate analyses and in specific projects. For others, however, big data is not yet a reality, and no need is perceived to use large volumes of data in analysis. This also points to a paradox between how knowledge is produced in the big data field and how it is used in the professional field of strategic intelligence: most work on big data comes from productive organizations rather than academia, yet intelligence professionals do not yet perceive the value of the phenomenon for their professional practice.
164

Contribuição das redes de T.I. para formulação da estratégia e gestão das empresas /

Leão, Mauricio Teixeira. January 2014 (has links)
Advisor: José Paulo Alves Fusco / Committee: José de Souza Rodrigues / Committee: Ethel Cristina Chiari da Silva / Abstract: This research studied the influence of information technology (IT) networks on the formulation of company strategy and management. New techniques for smartphone application (app) development, data-capture systems, and systems for analyzing large volumes of unstructured data such as video, e-mails, blog posts, tweets, texts, and images are generating new demands on companies. A new race in the digital world is under way: companies everywhere are striving to handle these large masses of data and extract information that gives them competitive advantages. This research also assessed the likely impacts of the development of IT networks, their associated technologies, and social networks on organizations' business models, and how this development may be reflected in the configuration of supply chains, both upstream and downstream.
Articles by many of the leading researchers whose work relates to the objectives of this study were analyzed, drawn from important international journals. Qualitative research was also carried out through three case studies at three companies in the industrial and service sectors. Evidence was found that the use of IT networks, combined with and/or supporting business networks and social networks, is reflected in management strategies and in the design of supply chains. / Master's
165

Real-time decision support systems in a selected big data environment

Muchemwa, Regis Fadzi January 2016 (has links)
Thesis (MTech (Business Information Systems))--Cape Peninsula University of Technology, 2016. / The emergence of big data (BD) has rendered existing conventional business intelligence (BI) tools inefficient and ineffective for real-time decision support systems (DSS). This inefficiency and ineffectiveness are perceived when business users must make decisions based on stale and sometimes incomplete data sets, which potentially leads to slow and poor decision making. In recent years, industry and academia have developed new technologies to process BD, such as Hadoop, Spark, in-memory databases and NoSQL databases. These new technologies have proliferated to the extent that organisations face the challenge of determining which are most suitable for real-time DSS requirements. Because BD is still a new concept, no standard guidelines or frameworks are available to assist in the evaluation and comparison of BD technologies. This research explores factors that influence the selection of technologies appropriate for real-time DSSs in a BD environment, and further proposes evaluation criteria that can be used to compare and select these technologies. To achieve this aim, a literature analysis is conducted to understand the concepts of BD, real-time DSSs and related technologies. Both qualitative and quantitative research techniques are used: interviews are conducted with BI experts who have BD knowledge and experience, and experimental research is carried out in a computer laboratory. The purpose of the interviews is to ascertain which technologies are being used for BD analytics, and which evaluation criteria organisations apply when choosing such a technology. Furthermore, a comparative laboratory experiment compares three tools that run on Hadoop, namely Hive, Impala and Spark.
The purpose of the experiment is to test whether system performance differs across the three tools when analysing the same data set with the same computer resources. The empirical results reveal nine main factors that influence the selection of technologies appropriate for real-time DSS in a BD environment, and ten application-independent evaluation criteria. Furthermore, the experiment results indicate that system performance, in terms of latency, differs significantly among the three tools compared.
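A generic latency-benchmark harness in the spirit of this experiment might look like the sketch below; the "engines" are placeholder callables with artificial delays, not real Hive, Impala or Spark clients, and the query string is purely illustrative:

```python
import statistics
import time

# Sketch of a latency comparison harness: any query engine is
# represented by a callable that executes a query; we time repeated
# runs and report the median to damp outliers.

def benchmark(run_query, query, repeats=5):
    """Return median wall-clock latency (seconds) for one query."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query(query)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)

# Placeholder "engines" simulating different per-query latencies.
engines = {
    "fast_engine": lambda q: time.sleep(0.001),
    "slow_engine": lambda q: time.sleep(0.005),
}
results = {name: benchmark(fn, "SELECT COUNT(*) FROM t")
           for name, fn in engines.items()}
```

A real comparison would additionally fix the data set, warm caches consistently, and test statistical significance of the latency differences, as the thesis does.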
166

Game Analytics och Big Data

Erlandsson, Niklas January 2016 (has links)
Game Analytics is a field that has emerged in recent years. Game developers can analyze how customers use their products down to every button press, which can produce large amounts of data; the challenge lies in making sense of it all. The challenges of game data are often described with the same characteristics used to define Big Data: volume, velocity and variability.
This suggests the potential for a fruitful collaboration. The purpose of this study is to analyze and evaluate what possibilities Big Data offers for developing the Game Analytics field. To fulfill this purpose, a literature review and semi-structured interviews with people active in the gaming industry were conducted. The results show that the sources agree that valuable information can be found in the data that can be stored, especially in the monetary, general, and game-core values. More advanced analysis can uncover other interesting patterns as well, but the predominant practice is to stick to the simple variables and refrain from digging deeper. This is not because data handling or storage would be too tedious or difficult, but because the analysis is too risky an investment: even with someone ready to take on all the challenges game data presents, there is not enough trust in the answers or in how useful they might be. Visions of the future within the field are modest; the near future seems to hold mostly efficiency improvements and a broadening of the field to reach more people, which does not impose any new demands on data handling.
167

Aplicação da arquitetura lambda na construção de um ambiente big data educacional para análise de dados

Mendes, Renê de Ávila 09 February 2017 (has links)
Properly handling the volume, velocity and variety dimensions of data in educational contexts is a major concern for educational institutions, and researchers in both Educational Data Mining and Learning Analytics have cooperated to address this challenge, popularly called Big Data. Hardware developments have increased computing power, storage capacity and energy efficiency. New technologies in databases, file systems and distributed systems, together with advances in data transmission, data management, data analysis and visualization, have sought to overcome the challenge of processing, storing and analyzing large volumes of data, and the impossibility of simultaneously meeting the requirements of consistency, availability and partition tolerance. Although defining the architecture is the main task in designing a Big Data system, no objective guidelines were found in the literature for selecting the architecture and tools for implementing Big Data systems.
This research analyzes the main architectures for both batch and stream processing and uses one of them to build a Big Data environment, providing important guidance to researchers, technicians and managers. Academic data and logs from the Moodle Virtual Learning Environment of an academic unit of a higher education institution are used.
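The batch/speed/serving-layer split of the Lambda Architecture named in this record's title can be sketched minimally as follows (an illustrative in-memory model of the pattern, not the environment built in the thesis):

```python
# Minimal sketch of the Lambda Architecture pattern: a batch layer
# periodically recomputes views over the immutable master dataset,
# a speed layer holds events that arrived since the last batch run,
# and the serving layer merges both on read. All names illustrative.

class LambdaStore:
    def __init__(self):
        self.master = []        # immutable master dataset (batch layer input)
        self.recent = []        # speed layer: events not yet batch-processed
        self.batch_view = {}    # precomputed event counts per key

    def ingest(self, key):
        self.recent.append(key)

    def run_batch(self):
        # Batch layer: absorb recent events and recompute the full view.
        self.master.extend(self.recent)
        self.recent = []
        self.batch_view = {}
        for key in self.master:
            self.batch_view[key] = self.batch_view.get(key, 0) + 1

    def query(self, key):
        # Serving layer: batch view plus the real-time delta.
        return self.batch_view.get(key, 0) + self.recent.count(key)

store = LambdaStore()
for e in ["login", "view", "login"]:
    store.ingest(e)
store.run_batch()
store.ingest("login")            # arrives after the batch run
answer = store.query("login")    # batch view (2) + speed-layer delta (1)
```

In a production deployment the batch layer would be something like Hadoop MapReduce and the speed layer a stream processor, but the merge-on-read contract is the same.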
168

Response to Technological Innovation: The Impact of STEM Graduates on Employment Opportunities in Accounting Services Firms

He, Annette 01 January 2018 (has links)
What is the effect of STEM (science, technology, engineering, math) degrees on employment in accounting services? Many accounting firms are beginning to rely on recent technological developments such as big data and artificial intelligence. Although firms have traditionally hired professionals from pure accounting backgrounds, technology is creating new demand for skills in data analytics, computer science, statistics, and related areas. This thesis analyzes the impact of increasing employment diversity, since one way to maximize the potential of technological innovation is to focus recruiting on STEM graduates. It presents an empirical analysis of the relationship between STEM degrees and accounting services employment, compared against other variables that have affected or will affect accounting services employment in the twenty-first century: Sarbanes-Oxley regulations, accounting degrees, public research and development funding, and unemployment rates. The conclusions from this analysis suggest future educational implications for accountants.
169

Big data of tree species distributions: how big and how good?

Serra-Diaz, Josep M., Enquist, Brian J., Maitner, Brian, Merow, Cory, Svenning, Jens-C. 15 January 2018 (has links)
Background: Trees play crucial roles in the biosphere and in societies worldwide, with a total of 60,065 tree species currently identified. An increasingly large amount of data on tree species occurrences is being generated worldwide, from inventories to pressed plants. While many of these data are available in big databases, several challenges hamper their use, notably geolocation problems and taxonomic uncertainty. Further, we lack a complete picture of the data coverage and quality of open/public databases of tree occurrences. Methods: We combined data from five major aggregators of occurrence data (Global Biodiversity Information Facility, Botanical Information and Ecological Network v.3, DRYFLOR, RAINBIO and Atlas of Living Australia) by creating a workflow to integrate, assess and control the quality of tree species occurrence data for species distribution modeling. We further assessed the coverage (the extent of geographical data) of five economically important tree families (Arecaceae, Dipterocarpaceae, Fagaceae, Myrtaceae, Pinaceae). Results: Globally, we identified 49,206 tree species (84.69% of the total tree species pool) with occurrence records. The total number of occurrence records was 36.69 M, of which 6.40 M could be considered high-quality records for species distribution modeling. The results show that Europe, North America and Australia have considerable spatial coverage of tree occurrence data. Conversely, key biodiverse regions such as South-East Asia, central Africa and parts of the Amazon are still characterized by gaps in geographical open-public data. Such gaps are found even for economically important tree families, although their overall ranges are covered. Only 15,140 species (26.05%) had at least 20 high-quality records.
Conclusions: Our geographical coverage analysis shows that a wealth of easily accessible data exists on tree species occurrences worldwide, but regional gaps and coordinate errors are abundant. Assessing tree distributions will therefore require accurate occurrence quality-control protocols, as well as key collaborations and data aggregation, especially from national forest inventory programs, to improve the currently available public data.
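The kind of coordinate quality control such a workflow applies before species distribution modeling can be sketched as follows; the field names, species, and filter rules here are illustrative assumptions, not the paper's actual protocol:

```python
# Hedged sketch of basic coordinate quality control for occurrence
# records: drop records with missing coordinates, out-of-range
# values, or the common (0, 0) geolocation placeholder.

def clean_occurrences(records):
    """Keep records whose lat/lon are present, in range, and plausible."""
    cleaned = []
    for rec in records:
        lat, lon = rec.get("lat"), rec.get("lon")
        if lat is None or lon is None:
            continue                                  # missing coordinates
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            continue                                  # out of valid range
        if lat == 0 and lon == 0:
            continue                                  # frequent placeholder error
        cleaned.append(rec)
    return cleaned

raw = [
    {"species": "Quercus robur", "lat": 48.1, "lon": 11.6},
    {"species": "Quercus robur", "lat": None, "lon": 11.6},    # missing
    {"species": "Pinus sylvestris", "lat": 95.0, "lon": 10.0}, # out of range
    {"species": "Fagus sylvatica", "lat": 0.0, "lon": 0.0},    # placeholder
]
good = clean_occurrences(raw)   # only the first record survives
```

Fuller pipelines also check country/coordinate agreement, taxonomic name resolution, and duplicate records, which is where most of the 36.69 M → 6.40 M reduction reported above would come from.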
170

Improvements on Scientific System Analysis

Grupchev, Vladimir 01 January 2015 (has links)
Thanks to the advancement of modern computer simulation systems, many scientific applications generate, and require the manipulation of, large volumes of data. Scientific exploration relies substantially on effective and accurate data analysis; the sheer size of the generated data, however, poses big challenges for analyzing such systems. In this dissertation we propose novel techniques, and apply some known designs in novel ways, to improve scientific data analysis. We develop an efficient method to compute an analytical query called the spatial distance histogram (SDH), exploiting special heuristics to process SDH efficiently and accurately. We further develop a mathematical model of the mechanism leading to errors, which gives rise to a new approximate algorithm with an improved time/accuracy tradeoff. Known MS analysis systems follow a pull-based design, where the executed queries demand the data they need; such a design introduces redundant, high-volume I/O traffic as well as CPU/data latency. To remedy these issues, we design and implement a push-based system, which uses a sequential scan-based I/O framework to push the loaded data to a number of pre-programmed queries. The efficiency of the proposed system, as well as of the approximate SDH algorithms, is supported by the results of extensive experiments on MS-generated data.
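The spatial distance histogram (SDH) query itself can be stated as a naive O(n^2) pairwise computation; the dissertation's contribution lies in avoiding exactly that cost, so the sketch below (with invented bucket parameters) only defines the query being approximated:

```python
import numpy as np

# Brute-force SDH for reference: count all pairwise point distances
# into fixed-width buckets. This naive version is O(n^2) in the
# number of points; the heuristics described above approximate it.

def sdh_bruteforce(points, bucket_width, num_buckets):
    hist = np.zeros(num_buckets, dtype=np.int64)
    n = len(points)
    for i in range(n):
        # Vectorized distances from point i to all later points.
        d = np.linalg.norm(points[i + 1:] - points[i], axis=1)
        idx = np.minimum((d // bucket_width).astype(int), num_buckets - 1)
        np.add.at(hist, idx, 1)                 # accumulate bucket counts
    return hist

rng = np.random.default_rng(1)
pts = rng.random((200, 3))                      # 200 random 3-D points
hist = sdh_bruteforce(pts, bucket_width=0.25, num_buckets=8)
# hist sums to the number of distinct pairs, 200 * 199 / 2
```

Approximate SDH algorithms typically partition space into a tree of cells and resolve whole cell pairs into buckets at once, trading a bounded error for sub-quadratic time.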
