151

Approches collaboratives pour la classification des données complexes / Collaborative approaches for complex data classification

Rabah, Mazouzi 12 December 2016 (has links)
This thesis addresses collaborative classification in the context of complex data, and in particular of Big Data. Drawing on several computational paradigms, we propose new approaches that exploit large-scale, high-performance computing technologies. In this setting we build massive classifier ensembles, in the sense that the number of elementary classifiers composing the multiple-classifier system can be very high. At that scale, conventional methods of interaction between classifiers no longer apply, and we had to propose new forms of interaction that are not constrained to take every classifier's prediction into account when building the global prediction. This choice confronted us with two problems. The first is the ability of our approaches to scale; the second concerns the diversity that must be created and maintained within the system to ensure its performance. We therefore studied the distribution of classifiers in a cloud-computing environment; such a multiple-classifier system can be massive, and its properties are those of a complex system. Regarding data diversity, we proposed an approach that enriches the training data with synthetic data generated from analytical models describing part of the phenomenon under study; mixing real and synthetic data reinforces the training of the classifiers. The experiments conducted showed great potential for substantially improving classification results.
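As a rough illustration of the partial-interaction idea described above (a global prediction built without polling every member of a massive ensemble), here is a minimal sketch in Python, assuming bootstrap-trained decision trees and a random-subset vote. It is a toy under stated assumptions, not the method developed in the thesis.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(42)
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# A "massive" pool of weak classifiers, each trained on a bootstrap sample.
pool = []
for _ in range(500):
    idx = rng.integers(0, len(X_tr), len(X_tr))
    pool.append(DecisionTreeClassifier(max_depth=3).fit(X_tr[idx], y_tr[idx]))

# Partial interaction: the global prediction polls only a random subset of
# the pool rather than every classifier, keeping aggregation cost bounded.
subset = rng.choice(len(pool), size=50, replace=False)
votes = np.mean([pool[i].predict(X_te) for i in subset], axis=0)
y_pred = (votes >= 0.5).astype(int)
print("subset-vote accuracy:", (y_pred == y_te).mean())
```

Because only a fixed-size subset of the pool is polled per prediction, the aggregation cost stays constant as the pool grows, which hints at the scalability concern the abstract raises.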
152

Discovery of novel prognostic tools to stratify high risk stage II colorectal cancer patients utilising digital pathology

Caie, Peter David January 2015 (has links)
Colorectal cancer (CRC) patients are stratified by the Tumour, Node and Metastasis (TNM) staging system for clinical decision making. Additional genomic markers have limited utility in some cases where precise targeted therapy may be available; thus, classical clinicopathological staging remains the mainstay of the assessment of this disease. Surgical resection is generally considered curative for Stage II patients; however, 20-30% of these patients experience disease recurrence and disease specific death. It is imperative to identify these high risk patients in order to assess whether further treatment or detailed follow up could benefit their overall survival. The aim of this thesis was to categorise Stage II CRC patients into high and low risk of disease specific death through novel image based analysis algorithms. Firstly, an image analysis algorithm was developed to quantify and assess, through immunofluorescence, the prognostic value of three histopathological features: lymphatic vessel density (LVD), lymphatic vessel invasion (LVI) and tumour budding (TB). Image analysis standardises their quantification and removes observer variability. All three histopathological features were found to be predictors of CRC specific death within the training set (n = 50): TB (HR = 5.7; 95% CI, 2.38-13.8), LVD (HR = 5.1; 95% CI, 2.04-12.99) and LVI (HR = 9.9; 95% CI, 3.57-27.98). Only TB (HR = 2.49; 95% CI, 1.03-5.99) and LVI (HR = 2.46; 95% CI, 1.00-6.05), however, were significant predictors of disease specific death in the validation set (n = 134). Image analysis was further employed to characterise TB and quantify intra-tumoural heterogeneity: tumour subpopulations within CRC tissue sections were segmented to quantify the differential expression of biomarkers associated with epithelial-mesenchymal transition and aggressive disease. Secondly, a novel histopathological feature, 'Sum Area Large Tumour Bud' (ALTB), was identified through immunofluorescence coupled to a novel tissue phenomics approach. The tissue phenomics approach created a complex phenotypic fingerprint consisting of multiple parameters extracted from the unbiased segmentation of all objects within a digitised image; data mining was employed to identify the significant parameters within the phenotypic fingerprint. ALTB was found to be a more significant predictor of disease specific death than LVI or TB in both the training set (HR = 20.2; 95% CI, 4.6-87.9) and the validation set (HR = 4; 95% CI, 1.5-11.1). Finally, ALTB was combined with two parameters, 'differentiation' and 'pT stage', exported from the original patient pathology report to form an integrative pathology score. The integrative pathology score was highly significant at predicting disease specific death within the validation set (HR = 7.5; 95% CI, 3-18.5). In conclusion, image analysis allows the standardised quantification of set histopathological features and of the heterogeneous expression of biomarkers. A novel image based histopathological feature combined with classical pathology allows the highly significant stratification of Stage II CRC patients into high and low risk of disease specific death.
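For readers unfamiliar with the hazard ratios (HR) quoted above, the following minimal sketch shows how such figures are typically estimated with a Cox proportional hazards model, here using the lifelines library on synthetic stand-in data; the column names and all values are assumptions for illustration, not the study's data.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(7)
n = 134  # same order of magnitude as the validation set

# Synthetic stand-in data: dichotomized image features and survival outcomes.
df = pd.DataFrame({
    "ALTB_high": rng.integers(0, 2, n),     # hypothetical high/low ALTB flag
    "pT_stage": rng.integers(3, 5, n),      # hypothetical pT3/pT4 coding
    "time_months": rng.exponential(60, n),  # follow-up time
    "death_event": rng.integers(0, 2, n),   # 1 = disease specific death observed
})

# Fit the Cox model; the exp(coef) column of the summary is the hazard ratio,
# reported with its 95% confidence interval as in the abstract.
cph = CoxPHFitter()
cph.fit(df, duration_col="time_months", event_col="death_event")
cph.print_summary()
```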
153

Os dados como base à criação de um método de planejamento de propaganda / Data as the basis for creating an advertising planning method

Lima, Carlos Eduardo de. January 2018 (has links)
Advisor: Francisco Machado Filho / Committee: Marcos Americo / Committee: Nirave Reigota Caram / Abstract: This study aims to identify the many transformations that advertising planning has undergone since the advent of the Internet and of communication and information technologies based on big data, Machine Learning, clustering, and other data intelligence tools. A historical and documentary survey of advertising planning models and creative briefs was undertaken. It proved essential to trace a brief historical account of how the planning discipline was conceived for the planner and of how this process was developed in Brazil, as well as its evolution. It was also necessary to define concepts around big data and innovation, in order to identify how they affect the structure and methodologies used by planning until now. The goal is thus to understand how planners are being driven to develop new skills spanning different disciplines, beyond those already applied in the research and creation process of planning. Field research was conducted through in-depth interviews with heads and directors of planning at communication agencies and with market players recognized for their competence and experience in advertising planning. This research therefore proposes a planning method that, through tools based on software and applications, enables planning professionals to generate innovative ideas and propose new ways of thinking to agencies. / Master's
154

MACHINE LEARNING ON BIG DATA FOR STOCK MARKET PREDICTION

Fallahi, Faraz 01 August 2017 (has links)
In recent decades, the rapid development of information technology in the big data field has introduced new opportunities to explore the large amounts of data available online. The Global Database of Events, Language, and Tone (GDELT) is the largest, most comprehensive, and highest-resolution open source database of human society, including more than 440 million entries that capture information about events covered by local, national, and international news sources since 1979 in over 100 languages. GDELT constructs a catalog of human societal-scale behavior and beliefs across all countries of the world, connecting every person, organization, location, count, theme, news source, and event across the planet into a single massive network that captures what is happening around the world, what its context is, who is involved, and how the world is feeling about it, every single day. Stock market prediction, meanwhile, has long been an attractive topic and is extensively studied by researchers in different fields, with numerous studies of the correlation between stock market fluctuations and data sources derived from the historical data of major world stock indices or from external information such as social media and news. Support Vector Machines (SVM) and Logistic Regression are two of the most widely used machine learning techniques in recent studies. The main objective of this research project is to investigate whether information derived from the GDELT project improves the accuracy of stock market trend prediction, specifically for the next day's price changes. The research is based on data sets of events from the GDELT database and on daily prices of Bitcoin and several other stocks and indices from Yahoo Finance, all from March 2015 to May 2017. Several machine learning algorithms, specifically classification algorithms, are applied to the generated data sets, first using only features derived from historical market prices and then adding features derived from external sources, in this case GDELT. Performance is then evaluated for each model over a range of parameters. Experimental results show that using information gained from GDELT has a direct positive impact on prediction accuracy. Keywords: Machine Learning, Stock Market, GDELT, Big Data, Data Mining
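The experimental comparison described (price-only features versus price plus GDELT-derived features, across classifiers) can be sketched as follows. This is a hedged illustration on synthetic data: the feature names gdelt_event_count and avg_tone, and all values, are assumptions, not the thesis's actual pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n_days = 500

# Hypothetical daily data: closing prices plus GDELT-derived signals.
close = np.cumsum(rng.normal(0, 1, n_days)) + 100.0
gdelt_event_count = rng.poisson(50, n_days)  # assumed external feature
avg_tone = rng.normal(0, 2, n_days)          # assumed external feature

returns = np.diff(close) / close[:-1]
# Label: does the price go up the next day?
y = (returns[1:] > 0).astype(int)

# Price-only features: today's return and a 5-day moving average of returns.
ma5 = np.convolve(returns, np.ones(5) / 5, mode="same")
X_price = np.column_stack([returns[:-1], ma5[:-1]])
# Augmented features: add the GDELT-derived signals for the same days.
X_full = np.column_stack([X_price, gdelt_event_count[1:-1], avg_tone[1:-1]])

for name, X in [("price only", X_price), ("price + GDELT", X_full)]:
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, shuffle=False, test_size=0.3)
    for model in (LogisticRegression(max_iter=1000), SVC()):
        acc = accuracy_score(y_te, model.fit(X_tr, y_tr).predict(X_te))
        print(f"{name:>14} | {type(model).__name__:>18} | acc={acc:.3f}")
```

On real data, an accuracy gain for the augmented feature set is the kind of evidence the abstract reports for GDELT's usefulness.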
155

Caracterização e modelagem multivariada do desempenho de sistemas de arquivos paralelos / Characterization and multivariate modeling of parallel file system performance

Inacio, Eduardo Camilo January 2015 (has links)
Master's dissertation - Universidade Federal de Santa Catarina, Centro Tecnológico, Graduate Program in Computer Science, Florianópolis, 2015. / Abstract: The amount of digital data generated daily has increased significantly. Consequently, applications need to handle ever-larger volumes of data, in a variety of formats and from a variety of sources, at high velocity, a problem known as Big Data. Since storage devices have not kept pace with the performance evolution of processors and main memory, they become the bottleneck of these applications. Parallel file systems are software solutions that have been widely adopted to mitigate the input and output (I/O) limitations of current computing platforms. However, the efficient use of these storage solutions depends on understanding their behavior under different conditions of use. This is a particularly challenging task because of the multivariate nature of the problem: the overall performance of the system depends on the relationships and influence of a large set of variables. This dissertation proposes an analytical multivariate model to represent storage performance behavior in parallel file systems for different configurations and workloads. An extensive set of experiments, executed in four real computing environments, was conducted in order to identify a significant number of relevant variables, to determine the influence of these variables on overall system performance, and to build and evaluate the proposed model. As a result of the characterization effort, the effect of three factors not explored in previous works is presented. The model evaluation, comparing the behavior and values estimated by the model with those measured in real environments for different usage scenarios, showed that the proposed model successfully represents system performance. Although some deviations were found in the values estimated by the model, the accuracy of its predictions was considered acceptable given the significantly larger number of usage scenarios evaluated in this research compared to previous proposals in the literature.
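As a toy analogue of the multivariate modeling described above, the sketch below fits a regression of throughput against a few hypothetical configuration and workload variables (request size, client count, stripe count) on synthetic measurements. It illustrates only the general shape of such a model; the dissertation's actual variables and model form may differ.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n = 200

# Hypothetical experiment matrix: request size (KiB), client count, stripe count.
req_kib = rng.choice([4, 64, 256, 1024], n)
clients = rng.integers(1, 33, n)
stripes = rng.choice([1, 2, 4, 8], n)

# Synthetic throughput (MB/s) with diminishing returns, standing in for
# real measurements collected on the storage system.
thr = (20 * np.log2(req_kib) + 5 * stripes + 30 * np.log2(clients)
       + rng.normal(0, 10, n))

# Log-transformed multivariate model: throughput ~ f(request size, clients, stripes).
X = np.column_stack([np.log2(req_kib), np.log2(clients), stripes])
model = LinearRegression().fit(X, thr)

print("coefficients:", model.coef_, "intercept:", model.intercept_)
print("predicted MB/s for 256 KiB, 16 clients, 4 stripes:",
      model.predict([[np.log2(256), np.log2(16), 4]])[0])
```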
156

Mapeamento de qualidade de experiência (QoE) através de qualidade de serviço (QoS) focado em bases de dados distribuídas / Mapping quality of experience (QoE) through quality of service (QoS) with a focus on distributed databases

Souza, Ramon Hugo de January 2017 (has links)
Doctoral thesis - Universidade Federal de Santa Catarina, Centro Tecnológico, Graduate Program in Computer Science, Florianópolis, 2017. / Abstract: The lack, until now, of a congruent conceptualization of quality of service (QoS) for databases (DBs) is the factor that drove the study resulting in this thesis. Defining QoS as a simple check of whether a node is at risk of failure due to the number of accesses, as some commercial systems did at the time of this thesis's bibliometric survey, is an oversimplification of such a complex concept. Other works that claim to deal with these concepts are not mathematically exact and lack concrete definitions, or definitions of sufficient quality to be used or replicated, which makes their application or even their verification infeasible. The focus of this study is on distributed databases (DDBs), in such a way that the conceptualization developed here is also at least partially compatible with non-distributed DB models. The new QoS definitions are used to handle the correlated concept of quality of experience (QoE), in a system-level approach focused on QoS completeness. Although QoE is a multidimensional concept that is hard to measure, the focus is kept on a measurable approach, so that DDB systems can perform self-evaluation. The idea of self-evaluation arises from the need to identify problems amenable to self-correction. With QoS statistically well defined, behavior and behavioral trends can be analyzed to infer future states, allowing the correction process to begin, by statistical prediction, before unexpected states are reached. The general objective of this thesis is therefore the definition of QoS and QoE metrics, focused on DDBs, under the hypothesis that QoE can be defined statistically on the basis of QoS for system-level purposes. Both concepts are new to DDBs when treated as exact, measurable metrics. With these concepts defined, an architectural recovery model is presented and tested to demonstrate the results of using the defined metrics for behavioral prediction.
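A minimal sketch of the self-evaluation idea follows, assuming a hypothetical latency-based QoS completeness metric and a simple linear trend as the statistical predictor; the thesis's actual metrics and predictor are defined more rigorously.

```python
import numpy as np

def qos_completeness(latencies_ms, slo_ms=50.0):
    """Fraction of requests meeting the latency SLO in a window (assumed QoS metric)."""
    latencies_ms = np.asarray(latencies_ms)
    return float((latencies_ms <= slo_ms).mean())

def predict_breach(qos_history, threshold=0.95, horizon=5):
    """Fit a linear trend to recent QoS samples and flag a predicted breach."""
    t = np.arange(len(qos_history))
    slope, intercept = np.polyfit(t, qos_history, 1)
    future = slope * (len(qos_history) + horizon) + intercept
    return future < threshold, future

# Hypothetical per-window QoS samples drifting downward.
history = [0.99, 0.985, 0.98, 0.97, 0.968, 0.96]
breach, projected = predict_breach(history)
print(f"projected QoS in 5 windows: {projected:.3f}, trigger recovery: {breach}")
```

The point of the sketch is the ordering: the recovery process is triggered by the projected value, before the system actually reaches the unexpected state.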
157

Aplicação de ETL para a integração de dados com ênfase em big data na área de saúde pública / Application of ETL for data integration with an emphasis on big data in public health

Pinto, Clícia dos Santos 05 March 2015 (has links)
Turning stored data into useful information has become an ever larger and more complex challenge as the volume of data produced every day grows. In recent years, Big Data concepts and technologies have been widely adopted as a solution for managing large amounts of data in different domains. This work concerns the use of ETL (extract, transform, load) techniques in the development of a preprocessing module for probabilistic record linkage across databases in the Public Health area. The use of Spark's distributed processing engine ensures handling adequate to the Big Data context in which this research is embedded, generating answers in a timely manner.
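A minimal PySpark sketch of the kind of ETL preprocessing such a module might perform before probabilistic linkage: standardizing names and deriving a blocking key. The column names and the soundex-plus-birth-year key are assumptions for illustration, not the dissertation's actual design.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("linkage-preprocess").getOrCreate()

# Hypothetical patient records; column names are assumptions for illustration.
df = spark.createDataFrame(
    [("1", "Maria  da Silva", "1980-03-12"), ("2", "MARIA SILVA", "1980-03-12")],
    ["id", "name", "birth_date"],
)

# ETL-style standardization: trim, lowercase, collapse repeated whitespace.
clean = df.withColumn(
    "name_norm",
    F.regexp_replace(F.trim(F.lower(F.col("name"))), r"\s+", " "),
)

# Blocking key (phonetic code + birth year) to limit the candidate pairs
# that the probabilistic comparison step must examine.
keyed = clean.withColumn(
    "block_key",
    F.concat_ws("|", F.soundex(F.col("name_norm")), F.substring("birth_date", 1, 4)),
)

keyed.show(truncate=False)
```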
158

Uma proposta de modelo conceitual para uso de Big Data e Open data para Smart Cities / A proposed conceptual model for the use of Big Data and Open Data for Smart Cities

Klein, Vinicius Barreto January 2015 (has links)
Master's dissertation - Universidade Federal de Santa Catarina, Centro Tecnológico, Graduate Program in Knowledge Engineering and Management, Florianópolis, 2015. / Abstract: We currently live in a context in which society produces a high volume of data, generated by many different sources, in different formats and schemas, and at ever greater speed. This corresponds to the big data phenomenon. Contributing to it, the open data movement adds further data sources produced by today's society. Big data and open data can serve as input for the generation of knowledge, and smart cities can benefit from this process. Smart cities represent a concept that involves using ICTs (information and communication technologies) as a means of improving the quality of life in today's urban centers. The idea is motivated mainly by the many problems faced by the inhabitants of these cities, such as poor management of natural resources, high levels of air pollution, heavy traffic, and crime, caused mainly by the high concentration of people in these places. In this context, the objective of this dissertation is to identify the main data sources and their characteristics and to connect them to the needs of smart cities. A conceptual model for smart cities that uses big data and open data as data sources was developed. To this end, an exploratory study of the related topics was first carried out and organized into the theoretical foundation. Competency questions and other practices of the ontoKEM method for ontology development were then applied to guide the construction of the model, and the questions were answered with the aid of Bunge's CESM model. Finally, the model, organized in layers, was proposed and verified in a usage scenario, from which discussions, results, and suggestions for future work are presented.
159

Contribuição das redes de T.I. para formulação da estratégia e gestão das empresas / The contribution of I.T. networks to strategy formulation and company management

Leão, Mauricio Teixeira [UNESP] 20 August 2014 (has links) (PDF)
This research studied the influence of information technology (IT) networks on the formulation of corporate strategy and management. New techniques for smartphone application (app) development, data capture systems, and systems for analyzing large volumes of unstructured data such as video, e-mails, blog posts, tweets, texts, and images are generating new demands on companies. A new race in the digital world is under way: companies everywhere are striving to process these large masses of data and extract information that confers competitive advantage. It is also within the scope of this research to assess the likely impacts of the development of IT networks, their associated technologies, and social networks on organizations' business models, and how this development may be reflected in the configuration of supply chains, both upstream and downstream. Articles from many leading researchers whose work relates to the objectives of this study were analyzed from important global journals. Qualitative research was also conducted, with three case studies at three companies from the industrial and service sectors. Evidence was found that the use of IT networks, combined with and/or supporting business networks and social networks, is reflected in management strategies and also in the design of supply chains.
160

A framework for scoring and tagging NetFlow data

Sweeney, Michael John January 2019 (has links)
With the increase in link speeds and the growth of the Internet, the volume of NetFlow data generated has increased significantly over time, and processing these volumes has become a challenge, more specifically a Big Data challenge. With the advent of technologies and architectures designed to handle Big Data volumes, researchers have investigated their application to the processing of NetFlow data. This work builds on prior work, in which a scoring methodology for identifying anomalies in NetFlow was proposed, by proposing and implementing a system that allows automatic, real-time scoring through the adoption of Big Data stream processing architectures. The first part of the research looks at event detection using the scoring approach, implemented as a number of individual, standalone components, each responsible for detecting and scoring a single type of flow trait. The second part is the implementation of these scoring components in a framework, named Themis, capable of handling high volumes of data with low processing latency. This was tackled using tools, technologies, and architectural elements from the world of Big Data stream processing. The framework was shown to achieve good flow throughput at low processing latency on a single low-end host. This successful demonstration on a single host opens the way to leveraging the scaling capabilities afforded by the architectures and technologies used, and supports the possibility of using the framework for real-time threat detection on NetFlow data from larger networked environments.
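The component-per-trait design described above can be pictured with a small sketch: each standalone component scores one flow trait, and the overall score per flow is an aggregate of the per-trait scores. Field names and thresholds here are illustrative assumptions, not Themis's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class Flow:
    """Minimal NetFlow-style record (fields are assumptions for illustration)."""
    src_ip: str
    dst_ip: str
    dst_port: int
    byte_count: int

class PortScoreComponent:
    """Standalone component scoring a single trait: traffic to uncommon ports."""
    COMMON_PORTS = {53, 80, 123, 443}

    def score(self, flow: Flow) -> float:
        return 0.0 if flow.dst_port in self.COMMON_PORTS else 1.0

class VolumeScoreComponent:
    """Another single-trait component: unusually large flows."""
    def score(self, flow: Flow, threshold: int = 10_000_000) -> float:
        return 1.0 if flow.byte_count > threshold else 0.0

def process(stream, components):
    """Combine per-trait scores into an overall anomaly score per flow."""
    for flow in stream:
        total = sum(c.score(flow) for c in components)
        yield flow, total

flows = [Flow("10.0.0.1", "192.0.2.9", 443, 1200),
         Flow("10.0.0.2", "192.0.2.7", 6667, 25_000_000)]
for flow, s in process(flows, [PortScoreComponent(), VolumeScoreComponent()]):
    print(f"{flow.src_ip} -> {flow.dst_ip}:{flow.dst_port} score={s}")
```

In a stream processing deployment, each component would run as its own stage, which is what lets the design scale out across hosts.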
