61 |
Big data analýzy a statistické zpracování metadat v archivu obrazové zdravotnické dokumentace / Big Data Analysis and Metadata Statistics in Medical Images Archives
Pšurný, Michal. January 2017.
This diploma thesis addresses big data issues in healthcare, focusing on picture archiving and communication systems (PACS). The DICOM format stores images together with a header that can hold other valuable information. The thesis maps metadata from 1215 studies.
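A minimal sketch (not the pipeline described in the thesis, which does not name its tooling here) of how one header tag could be pulled from a DICOM export and aggregated for statistics; the pydicom library, the "./pacs_export" folder and the .dcm naming convention are illustrative assumptions.

    # Sketch only: aggregate one header tag (Modality) across an exported archive.
    from collections import Counter
    from pathlib import Path

    import pydicom  # third-party DICOM reader

    def collect_modality_counts(root: str) -> Counter:
        """Count the Modality header value across all DICOM files under root."""
        counts = Counter()
        for path in Path(root).rglob("*.dcm"):
            # stop_before_pixels=True reads only the metadata header, not pixel data
            ds = pydicom.dcmread(path, stop_before_pixels=True)
            counts[ds.get("Modality", "UNKNOWN")] += 1
        return counts

    if __name__ == "__main__":
        print(collect_modality_counts("./pacs_export"))  # hypothetical export folder

The same loop can collect any other header tags of interest (study date, manufacturer, patient sex) into a table for further statistical processing.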
|
62 |
The Influence of a Return of Native Grasslands upon the Ecology and Distribution of Small Rodents in Big Bend National Park
Baccus, John T. 08 1900.
In the southwestern United States there is a delicate balance between the existing grasslands and the rodent fauna. The purpose of this investigation was to determine the influence of secondary succession of native grasslands upon the ecology and distribution of small rodents. Two methods were used to determine the rodent species present: plot quadrats and trap lines with Sherman live traps.
|
63 |
Are HiPPOs losing power in organizational decision-making? : An exploratory study on the adoption of Big Data Analytics
Moquist Sundh, Ellinor. January 2019.
Background: In the past decades, big data (BD) has become a buzzword associated with the opportunities of gaining competitive advantage and enhanced business performance. However, data in a vacuum is not valuable; its value can be harnessed only when it is used to drive decision-making. Consequently, big data analytics (BDA) is required to generate insights from BD. Nevertheless, many companies struggle to adopt BDA and create value from it; organizations need to deal with the hard work necessary to benefit from analytics initiatives. Therefore, businesses need to understand how they can effectively manage the adoption of BDA to reach decision-making quality. The study answers the following research questions: What factors could influence the adoption of BDA in decision-making? How can the adoption of BDA affect the quality of decision-making? Purpose: The purpose of this study is to explore the opportunities and challenges of adopting big data analytics in organizational decision-making. Method: Data is collected through interviews based on a theoretical framework. The empirical findings are deductively and inductively analysed to answer the research questions. Conclusion: To harness value from BDA, companies need to deal with several challenges and develop capabilities, leading to decision-making quality. The major challenges of BDA adoption are talent management, leadership focus, organizational culture, technology management, regulation compliance and strategy alignment. Companies should aim to develop capabilities regarding knowledge exchange, collaboration, process integration, routinization, flexible infrastructure, big data source quality and decision-maker quality. Potential opportunities generated by the adoption of BDA, leading to improved decision-making quality, are automated decision-making, predictive analytics and more confident decision makers.
|
64 |
Reconhecimento de traços de personalidade com base em textos / Personality traits recognition through texts
Silva, Barbara Barbosa Claudino da. 27 February 2018.
We present research in the field of Natural Language Processing on personality recognition from texts in Portuguese. Using texts from the social network Facebook together with the Big Five personality model, we built a corpus labeled with the personality traits of its authors and, after identifying the most relevant features for personality recognition, built computational models based on those features. Using lexicon-based methods, such as the LIWC dictionary and psycholinguistic attributes, as well as methods derived from the text itself, such as bag of words and distributed representations of words and documents, we obtained models for personality recognition without the need for the methods most commonly used for this task, such as inventories or interviews with psychologists. The results of the distributed-representation methods are slightly better than the results using the LIWC dictionary, with the advantage of not requiring language-specific resources.
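As an illustration of the bag-of-words route mentioned above (not the thesis's actual corpus or models), a minimal sketch for classifying a single Big Five trait from short texts; the example posts, labels and pipeline choices are placeholders.

    # Sketch only: one Big Five trait, bag-of-words features, logistic regression.
    # Posts and labels are invented placeholders, not data from the thesis corpus.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    posts = [
        "adoro sair com meus amigos no fim de semana",
        "prefiro ficar em casa lendo um bom livro",
        "festa hoje! quem vem comigo?",
        "passei o dia inteiro estudando sozinho",
    ]
    labels = [1, 0, 1, 0]  # 1 = high extraversion, 0 = low (illustrative)

    model = make_pipeline(CountVectorizer(ngram_range=(1, 2)), LogisticRegression())
    model.fit(posts, labels)
    print(model.predict(["vamos todos sair hoje a noite"]))

Swapping CountVectorizer for document embeddings would correspond to the distributed-representation setting that the abstract reports as slightly stronger.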
|
66 |
Is nerd the new sexy? Um estudo sobre a recepção da série televisiva The Big Bang Theory / Is nerd the new sexy? A study on the reception of the television series The Big Bang Theory
Silva, Soraya Madeira da. January 2016.
SILVA, Soraya Madeira da. Is nerd the new sexy? Um estudo sobre a recepção da série televisiva The Big Bang Theory. 2016. 178f. – Dissertação (Mestrado) – Universidade Federal do Ceará, Instituto de Cultura e Arte, Programa de Pós-graduação em Comunicação Social, Fortaleza (CE), 2016.
This research investigates people's relationship with the TV show The Big Bang Theory and their perception of whether they consider themselves, or are considered, nerds. This group, long seen and treated as a pariah of society, has gained fame in recent years and had its image reformulated in the media. The work first addresses the nerd profile, analyzing its history, characteristics and media representations in products such as TV series and films, in order to reflect on what it means to be a nerd today. For this analysis, the authors Nugent (2008), Goffman (1988), Fernando and Rios (2001) and Bourdieu (1983) are used to identify the group's distinguishing characteristics, its stigmatization in society and its relationship with consumption and the media. A discussion of the connection between communication and culture follows, drawing on authors such as Caune (2004), Thompson (2001), Schulman (2004) and Morley (1996), among others, to highlight the importance of cultural studies within the scope of this research. Production and consumption are intertwined when we look at cultural products conveyed in mass media, so TV series, their classification, their relationship with the public and the importance of the characters that compose them are analyzed as elements connecting product and audience. Jost (2012), Esquenazi (2010), Seger (2006), Davis (2001) and Field (2001) are used to explain the production processes of TV series and character creation, which are fundamental to understanding the success of the American sitcom The Big Bang Theory, broadcast on CBS (USA) and Warner Channel (Brazil). After a detailed analysis of the sitcom's characters, the results of the research carried out for this work are presented. As methodology, a structured survey with both quantitative and qualitative approaches was applied to a random sample of 600 people, in order to investigate their consumption habits, favorite TV series, connection with characters, perceptions of The Big Bang Theory and their views on considering themselves, or being considered, nerds by others. The research concludes that people's relationship with the cultural products they consume is based on affection and identification with the plot and characters of the story. Regarding The Big Bang Theory, differing opinions are presented on the stereotyping of the characters and the evolution of the narrative. Finally, it is concluded that being a nerd, or being considered one, still carries a great deal of negativity for those outside the group, but becomes an empowering factor for those within it. This identity is constructed through the heavy consumption of cultural products that aim to establish an emotional connection with these people and offer a projection of the narrative of their lives.
|
67 |
An Explorative Study on the Perceived Challenges and Remediating Strategies for Big Data among Data Practitioners
Soprano, Olga; Pilipiec, Patrick. January 2020.
Background: Worldwide, new data are generated exponentially. The emergence of the Internet of Things has resulted in products designed first and foremost to generate data. Big data are valuable, as they have the potential to create business value. Therefore, many organizations are now heavily investing in big data. Despite this incredible interest, big data analytics involves many challenges that need to be overcome. A taxonomy of these challenges, created from the literature, is available. However, this taxonomy fails to represent the view of data practitioners. Little is known about what practitioners do, what problems they have, and how they view the relationship between analysis and organizational innovation. Objective: The purpose of this study was twofold. First, it investigated what data practitioners consider the main challenges of big data that may prevent the creation of organizational innovation. Second, it investigated what strategies these data practitioners recommend to remediate these challenges. Methodology: A survey using semi-structured interviews was performed to investigate what data practitioners view as the challenges of big data and what strategies they recommend to remediate those challenges. The study population was heterogeneous and consisted of 10 participants selected using purposive sampling. The interviews were conducted between February 27, 2020 and March 24, 2020. Thematic analysis was used to analyze the transcripts. Results: Ninety per cent of the data practitioners experienced working with low-quality, unstructured, and incomplete data as a very time-consuming process. Various challenges related to the organizational aspects of analyzing data emerged, such as a lack of experienced human resources, insufficient knowledge among management about the process and value of big data, a lack of understanding of the role of data scientists, and issues related to communication and collaboration between employees and departments. Seventy per cent of the participants experienced insufficient time to learn new technologies and techniques. In addition, twenty per cent of practitioners experienced challenges related to accessing data, but those challenges were primarily reported by consultants. Twenty per cent argued that organizations do not use a proper data-driven approach. However, none of the practitioners experienced difficulties with data policies, because these had already been taken care of by the legal department. Nevertheless, uncertainties still exist about what data can and cannot be used for analysis. The findings are only partially consistent with the taxonomy. More specifically, the reported challenges of data policies, industry structure, and access to data differ significantly. Furthermore, the challenge of data quality was not addressed in the taxonomy, but it was perceived as a major challenge by practitioners. Conclusion: The data practitioners only partially agreed with the taxonomy of challenges. The dimensions of access to data, data policies, and industry structure were not considered challenges to creating organizational innovation. Instead, practitioners emphasized that the dimension of organizational change and talent, and to a lesser extent also the dimension of technology and techniques, involve significant challenges that can severely impact the creation of organizational innovation using big data. In addition, novel and significant challenges such as data quality were identified. Furthermore, for each dimension, the practitioners recommended relevant strategies that may help others mitigate the challenges of big data analytics and use big data to create business value.
|
68 |
Assessment of Factors Influencing Intent-to-Use Big Data Analytics in an Organization: A Survey Study
Madhlangobe, Wayne. 01 January 2018.
The central question was how the relationship between Trust-in-Technology and Intent-to-Use Big Data Analytics in an organization is mediated by both Perceived Risk and Perceived Usefulness. Big Data Analytics is quickly becoming a critically important driver for business success, and many organizations are increasing their Information Technology budgets for Big Data Analytics capabilities. The Technology Acceptance Model stands out as a critical theoretical lens, primarily due to its assessment approach and its predictive capacity to explain individual behaviors in the adoption of technology. In this study, the use of Big Data Analytics was considered a voluntary act and was therefore well aligned with the Theory of Reasoned Action and the Technology Acceptance Model. Both theories have validated the relationships between beliefs, attitudes, intentions and usage behavior. Predicting intent-to-use Big Data Analytics is a broad phenomenon covering multiple disciplines in the literature, so a robust methodology was employed to explore the richness of the topic. A deterministic philosophical approach was applied using a survey method as an exploratory study, a variant of the mixed-methods sequential exploratory design. The research approach consisted of two phases: instrument development and quantitative analysis. The instrument development phase was anchored in a systematic literature review to develop an instrument and ended with a pilot study. The pilot study was instrumental in improving the tool and in switching from a planned covariance-based SEM approach to PLS-SEM for data analysis. A total of 277 valid observations were collected. PLS-SEM was leveraged for data analysis because of the prediction focus of the study and the requirement to assess both reflective and formative measures in the same research model. The measurement and structural models were tested using the PLS algorithm, with R², f², and Q² used as the basis for assessing acceptable fit. Based on the valid structural model and after running the bootstrapping procedure, Perceived Risk was found to have no mediating effect on the relationship between Trust-in-Technology and Intent-to-Use, whereas Perceived Usefulness has a full mediating effect. Level of education, training, experience and the perceived capability of analytics within an organization are good predictors of Trust-in-Technology.
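The study itself relies on PLS-SEM, but the underlying logic of a bootstrapped mediation test can be sketched with ordinary least squares on simulated data; everything below (variable names, effect sizes, the simulated sample) is an assumption for illustration only, not the study's analysis.

    # Sketch only: bootstrapped indirect effect (a*b) with OLS on simulated data.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n = 277  # same sample size as the study; values are synthetic
    trust = rng.normal(size=n)
    usefulness = 0.6 * trust + rng.normal(scale=0.8, size=n)      # path a (assumed)
    intent = 0.5 * usefulness + 0.1 * trust + rng.normal(size=n)  # paths b and c' (assumed)

    def indirect_effect(t, u, i):
        a = sm.OLS(u, sm.add_constant(t)).fit().params[1]                        # t -> u
        b = sm.OLS(i, sm.add_constant(np.column_stack([u, t]))).fit().params[1]  # u -> i, controlling for t
        return a * b

    boot = []
    for _ in range(2000):
        idx = rng.integers(0, n, n)  # resample with replacement
        boot.append(indirect_effect(trust[idx], usefulness[idx], intent[idx]))
    print("95% bootstrap CI for the indirect effect:", np.percentile(boot, [2.5, 97.5]))

A confidence interval that excludes zero indicates mediation; full versus partial mediation is then judged by whether the direct path remains significant.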
|
69 |
Querying graphs with data
Vrgoc, Domagoj. January 2014.
Graph data is becoming more and more pervasive. Indeed, services such as social networks or the Semantic Web can no longer rely on the traditional relational model, as its structure is somewhat too rigid for the applications they have in mind. For this reason we have seen a continuous shift towards more non-standard models: first semi-structured data in the 1990s, then XML in the 2000s, but even such models seem too restrictive for new applications that require navigational properties naturally modelled by graphs. Social networks fit the graph model by their very design: users are nodes and their connections are specified by graph edges. The W3C, on the other hand, describes RDF, the model underlying the Semantic Web, using graphs. The situation is quite similar with crime detection networks and the tracking of workflow provenance; they all have graphs built into their definitions. With the pervasiveness of graph data, the question of querying and maintaining it has emerged as one of the main priorities, in both a theoretical and an applied sense. Currently there seem to be two approaches to handling such data. On the one hand, to extract the actual data, practitioners use traditional relational languages that completely disregard the various navigational patterns connecting the data. What makes this data interesting in modern applications, however, is precisely its ability to compactly represent the intricate topological properties that envelop the data. To overcome this issue, several languages that allow querying graph topology have been proposed and extensively studied. The problem with these languages is that they concentrate on navigation only, thus disregarding the data that is actually stored in the database. What we propose in this thesis is the ability to do both. Namely, we will study how query languages can be designed to allow specifying not only how the data is connected, but also how data changes along the paths and patterns connecting it. To this end we will develop several query languages and show how adding different data manipulation capabilities and different navigational features affects the complexity of the main reasoning tasks. The story here is somewhat similar to the early success of the relational data model, where theoretical considerations led to a better understanding of what makes certain tasks more challenging than others. Here we aim for languages that are both efficient and capable of expressing a wide variety of queries of interest to several groups of practitioners. To do so, we will analyse how different requirements affect the language at hand and, in the end, provide a good base of primitives whose inclusion in a language should be considered, based on the applications one has in mind. Namely, we consider how adding a specific operation, mechanism, or capability to the language affects the practical tasks that such an addition is meant to tackle. In the end we arrive at several languages, all with their pros and cons, giving a good overview of how specific capabilities of the language affect the design goals and thus providing a sound basis for practitioners to choose from, based on their requirements.
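A toy illustration, not one of the formal query languages developed in the thesis, of what querying both topology and data means: starting from a node, navigate only along nodes that carry the same data value as the start. The graph and its data values are made up.

    # Sketch only: find nodes reachable from `start` along paths on which every
    # node carries the same data value as the start node.
    from collections import deque

    # hypothetical data graph: node -> (data value, list of successors)
    graph = {
        "a": (1, ["b", "c"]),
        "b": (1, ["d"]),
        "c": (2, ["d"]),
        "d": (1, []),
    }

    def same_value_reachable(start: str) -> set[str]:
        value = graph[start][0]
        seen, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for succ in graph[node][1]:
                # keep navigating only while the data value matches the start's
                if succ not in seen and graph[succ][0] == value:
                    seen.add(succ)
                    queue.append(succ)
        return seen - {start}

    print(same_value_reachable("a"))  # {'b', 'd'}

A purely navigational language could express reachability here, but not the comparison of data values along the path, which is exactly the combination the thesis studies.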
|
70 |
Randomized coordinate descent methods for big data optimization
Takac, Martin. January 2014.
This thesis consists of 5 chapters. We develop new serial (Chapter 2), parallel (Chapter 3), distributed (Chapter 4) and primal-dual (Chapter 5) stochastic (randomized) coordinate descent methods, analyze their complexity and conduct numerical experiments on synthetic and real data of huge sizes (GBs/TBs of data, millions/billions of variables). In Chapter 2 we develop a randomized coordinate descent method for minimizing the sum of a smooth and a simple nonsmooth separable convex function and prove that it obtains an ε-accurate solution with probability at least 1 - p in at most O((n/ε) log(1/p)) iterations, where n is the number of blocks. This extends recent results of Nesterov [43], which cover the smooth case, to composite minimization, while at the same time improving the complexity by a factor of 4 and removing ε from the logarithmic term. More importantly, in contrast with the aforementioned work, in which the author achieves the results by applying the method to a regularized version of the objective function with an unknown scaling factor, we show that this is not necessary, thus achieving the first true iteration complexity bounds. For strongly convex functions the method converges linearly. In the smooth case we also allow for arbitrary probability vectors and non-Euclidean norms. Our analysis is also much simpler. In Chapter 3 we show that the randomized coordinate descent method developed in Chapter 2 can be accelerated by parallelization. The speedup, as compared to the serial method, and referring to the number of iterations needed to approximately solve the problem with high probability, is equal to the product of the number of processors and a natural and easily computable measure of separability of the smooth component of the objective function. In the worst case, when no degree of separability is present, there is no speedup; in the best case, when the problem is separable, the speedup is equal to the number of processors. Our analysis also works in the mode where the number of coordinates being updated at each iteration is random, which allows for modeling situations with a variable (busy or unreliable) number of processors. We demonstrate numerically that the algorithm is able to solve huge-scale l1-regularized least squares problems with a billion variables. In Chapter 4 we extend coordinate descent to a distributed environment. We initially partition the coordinates (features or examples, based on the problem formulation) and assign each partition to a different node of a cluster. At every iteration, each node picks a random subset of the coordinates from those it owns, independently from the other computers, and in parallel computes and applies updates to the selected coordinates based on a simple closed-form formula. We give bounds on the number of iterations sufficient to approximately solve the problem with high probability, and show how they depend on the data and on the partitioning. We perform numerical experiments with a LASSO instance described by a 3TB matrix. Finally, in Chapter 5, we address the issue of using mini-batches in stochastic optimization of Support Vector Machines (SVMs). We show that the same quantity, the spectral norm of the data, controls the parallelization speedup obtained for both primal stochastic subgradient descent (SGD) and stochastic dual coordinate ascent (SDCA) methods and use it to derive novel variants of mini-batched (parallel) SDCA.
Our guarantees for both methods are expressed in terms of the original nonsmooth primal problem based on the hinge loss. Our results in Chapters 2 and 3 are cast for blocks (groups of coordinates) instead of coordinates, and hence the methods are better described as block coordinate descent methods. While the results in Chapters 4 and 5 are not formulated for blocks, they can be extended to this setting.
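A toy serial version of randomized coordinate descent for l1-regularized least squares, with the closed-form soft-thresholding update per coordinate; it illustrates the idea behind Chapter 2, not the thesis's block, parallel or distributed variants, and the problem data below is synthetic.

    # Sketch only: serial randomized coordinate descent for
    #     min_x 0.5*||Ax - b||^2 + lam*||x||_1
    # using exact per-coordinate minimization (soft thresholding).
    import numpy as np

    def soft_threshold(z, t):
        return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

    def rcd_lasso(A, b, lam, iters=20000, seed=0):
        rng = np.random.default_rng(seed)
        n = A.shape[1]
        x = np.zeros(n)
        residual = A @ x - b
        col_sq = (A ** 2).sum(axis=0)  # ||A_j||^2, the per-coordinate curvature
        for _ in range(iters):
            j = rng.integers(n)  # pick one coordinate uniformly at random
            # exact minimization over coordinate j with all others held fixed
            c = col_sq[j] * x[j] - A[:, j] @ residual
            x_j_new = soft_threshold(c, lam) / col_sq[j]
            residual += A[:, j] * (x_j_new - x[j])  # cheap incremental residual update
            x[j] = x_j_new
        return x

    if __name__ == "__main__":
        rng = np.random.default_rng(1)
        A = rng.normal(size=(200, 50))
        x_true = np.zeros(50)
        x_true[:5] = rng.normal(size=5)
        b = A @ x_true + 0.01 * rng.normal(size=200)
        x_hat = rcd_lasso(A, b, lam=0.1)
        print("recovered support:", np.flatnonzero(np.abs(x_hat) > 1e-3))

The parallel and distributed methods of Chapters 3 and 4 update many such coordinates at once, with the admissible degree of parallelism governed by the separability of the smooth term.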
|