41

Breaking Hash-Tag Detection Algorithm for Social Media (Twitter)

January 2015 (has links)
abstract: In trading, volume is a measure of how much of a stock has been exchanged in a given period of time. Since every stock is distinct and has a different number of shares outstanding, volume can be compared against a stock's own historical volume to spot changes. It is likewise used to confirm price trends and breakouts, and to spot potential reversals. In my thesis, I hypothesize that the concept of trading volume can be extrapolated to social media (Twitter). The influence of social media, especially Twitter, on financial markets has grown markedly in the past couple of years. With the growth of its usage by news channels, financial experts, and pundits, the global economy does seem to hinge on 140 characters. By analyzing the number of tweets hash-tagged with a stock, a strong relation can be established between the number of people talking about the stock and its trading volume. In my work, I make this relation explicit and detect a breakout state when the volume moves beyond a characterized support or resistance level. / Dissertation/Thesis / Masters Thesis Computer Science 2015
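As a rough illustration of the breakout idea, the sketch below flags intervals where per-hashtag tweet volume escapes a rolling support/resistance band. The window length and the 2-sigma band are assumptions made for the example, not parameters from the thesis.

```python
# A minimal sketch of the volume-breakout idea applied to tweet counts.
# Window size and the 2-sigma band are illustrative assumptions.
from statistics import mean, stdev

def detect_breakouts(tweet_counts, window=24, k=2.0):
    """Flag intervals where hashtag volume breaks its recent band.

    tweet_counts: list of per-interval counts of tweets hash-tagged
    to a stock (e.g. hourly counts of #AAPL).
    Returns a list of (index, count, 'up'|'down') breakout events.
    """
    events = []
    for i in range(window, len(tweet_counts)):
        history = tweet_counts[i - window:i]
        mu, sigma = mean(history), stdev(history)
        resistance = mu + k * sigma        # upper band: unusual chatter
        support = max(mu - k * sigma, 0)   # lower band, floored at zero
        if tweet_counts[i] > resistance:
            events.append((i, tweet_counts[i], "up"))
        elif tweet_counts[i] < support:
            events.append((i, tweet_counts[i], "down"))
    return events
```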
42

Preenchimento automático de formulários na web oculta / Automatically filling in hidden web forms

Kantorski, Gustavo Zanini January 2014 (has links)
A large portion of the information on the Web is stored inside online databases and is accessible only after a user submits a query through a search interface. The portion of the Web where this information resides is called the Hidden Web or Deep Web, and it is generally inaccessible to traditional search engine crawlers. Since the only way to access Hidden Web pages is through query submission, many works have focused on how to fill in form fields automatically, aiming at increasing the amount of distinct information retrieved from behind Web forms. This thesis presents an automatic solution for selecting values for fields in Web forms, combining heuristics and machine learning techniques to improve the selection. It also proposes a categorization of existing form filling techniques and a comparative analysis of the state of the art. Experiments were conducted on real Web forms from several domains, and the results indicate that our approach significantly outperforms a baseline method in terms of coverage, without additional computational cost.
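The value-selection step might look something like the sketch below, which blends a heuristic prior over candidate values with a score from a learned classifier. The linear blend, the alpha weight, and the stub classifier are illustrative assumptions; the thesis's actual method is richer.

```python
# A minimal sketch of value selection for a form field, combining a
# heuristic prior with a learned score. Weights are illustrative
# assumptions, not the thesis's model.

def select_value(field_label, candidates, model_score, alpha=0.5):
    """Pick the candidate value with the best combined score.

    candidates: dict mapping candidate value -> heuristic prior in [0, 1]
    (e.g. how often the value co-occurred with this label in past crawls).
    model_score: callable (label, value) -> probability from a classifier.
    """
    def combined(value):
        heuristic = candidates[value]
        learned = model_score(field_label, value)
        return alpha * heuristic + (1 - alpha) * learned

    return max(candidates, key=combined)

# Hypothetical usage with a stub in place of a trained classifier:
stub = lambda label, value: 0.9 if value != "" else 0.1
best = select_value("state", {"RS": 0.6, "SP": 0.3, "": 0.1}, stub)
```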
43

On the application of focused crawling for statistical machine translation domain adaptation

Laranjeira, Bruno Rezende January 2015 (has links)
Statistical Machine Translation (SMT) is highly dependent on the availability of parallel corpora for training. However, such resources can be hard to find, especially for under-resourced languages or very specific domains such as dermatology. One way to work around this is to use comparable corpora, a much more abundant kind of resource, which can be acquired by applying Focused Crawling (FC) algorithms. In this work we propose novel approaches to FC, some based on n-grams and others on the expressive power of multiword expressions. We also assess the viability of using FC to perform domain adaptation for generic SMT systems, and whether there is a correlation between the quality of the FC algorithms and that of the SMT systems built from the collected data. Results indicate that FC is indeed a good way to acquire comparable corpora for SMT domain adaptation, and that there is a correlation between the quality of the two processes.
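One way the n-gram-based relevance scoring could work is sketched below: a page is scored against a domain description by the Jaccard overlap of their n-gram sets, and high-scoring pages have their out-links prioritized on the frontier. The trigram choice and the Jaccard measure are assumptions for illustration, not the thesis's exact formulation.

```python
# A minimal sketch of an n-gram based page scorer for focused crawling.
# The thesis also explores multiword-expression based variants.

def ngrams(text, n=3):
    """Set of word n-grams in a text, as tuples of lowercase tokens."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def domain_relevance(page_text, domain_text, n=3):
    """Jaccard overlap between the page's and the domain's n-gram sets."""
    page, domain = ngrams(page_text, n), ngrams(domain_text, n)
    if not page or not domain:
        return 0.0
    return len(page & domain) / len(page | domain)

# Pages scoring above a threshold are kept, and their out-links are
# pushed onto the crawl frontier ordered by this score.
```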
44

Skrapa försäljningssidor på nätet : Ett ramverk för webskrapningsrobotar / Scraping online sales sites: A framework for web scraping robots

Karlsson, Emil, Edberg, Mikael January 2016 (has links)
Today the internet offers a large number of sales websites where new listings are posted all the time. We see a need for a tool that monitors these websites around the clock to track how much is being sold and what is being sold. Creating a program that monitors websites is time-consuming, so we have built a framework that simplifies the creation of web scrapers focused on list-based sales websites. Several frameworks for web scraping exist, but very few focus solely on this type of website.
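A framework of this kind could expose an interface like the following sketch, where the user supplies only the site-specific parsing and the framework drives the polling loop and new-listing detection. All class and method names here are hypothetical, not the thesis's API.

```python
# A minimal sketch of a list-based sales-site scraping framework:
# the user plugs in parsing, the framework handles monitoring.
import time
from abc import ABC, abstractmethod
from urllib.request import urlopen

class ListingScraper(ABC):
    """Subclass per sales site; the framework drives the polling."""

    def __init__(self, url, interval_seconds=3600):
        self.url = url
        self.interval = interval_seconds
        self.seen = set()   # listing ids already reported

    @abstractmethod
    def parse_listings(self, html):
        """Return an iterable of (listing_id, title, price) tuples."""

    def poll_forever(self):
        while True:
            html = urlopen(self.url).read().decode("utf-8", "replace")
            for listing_id, title, price in self.parse_listings(html):
                if listing_id not in self.seen:   # only report new ads
                    self.seen.add(listing_id)
                    print(f"new listing: {title} at {price}")
            time.sleep(self.interval)
```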
46

Link Extraction for Crawling Flash on the Web

Antelius, Daniel January 2015 (has links)
The set of web pages not reachable using conventional web search engines is usually called the hidden or deep web. One client-side hurdle for crawling the hidden web is Flash files. This thesis presents a tool for extracting links from Flash files up to version 8 to enable web crawling. The files are both parsed and selectively interpreted to extract links. The purpose of the interpretation is to simulate the normal execution of Flash in the Flash runtime of a web browser. The interpretation is a low level approach that allows the extraction to occur offline and without involving automation of web browsers. A virtual machine is implemented and a set of limitations is chosen to reduce development time and maximize the coverage of interpreted byte code. Out of a test set of about 3500 randomly sampled Flash files the link extractor found links in 34% of the files. The resulting estimated web search engine coverage improvement is almost 10%.
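For contrast with the interpreter-based approach described above, the sketch below shows the naive baseline it improves upon: grepping the decompressed SWF body for URL-shaped strings. This misses links that Flash assembles at runtime from string fragments, which is exactly what byte-code interpretation recovers.

```python
# A deliberately crude baseline, NOT the thesis's method: scan the
# (decompressed) SWF body for URL-shaped byte strings.
import re
import zlib

URL_PATTERN = re.compile(rb"https?://[\x21-\x7e]+")

def naive_swf_links(path):
    with open(path, "rb") as f:
        data = f.read()
    if data[:3] == b"CWS":   # compressed SWF: zlib body follows the
        data = data[:8] + zlib.decompress(data[8:])   # 8-byte header
    return [m.group().decode("ascii", "replace")
            for m in URL_PATTERN.finditer(data)]
```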
47

Human Interactions on Online Social Media : Collecting and Analyzing Social Interaction Networks

Erlandsson, Fredrik January 2018 (has links)
Online social media, such as Facebook, Twitter, and LinkedIn, provide users with services that enable them to interact both globally and instantly. Social media interactions grow constantly in volume, which calls for selection mechanisms to find and analyze interesting data. These interactions can be modeled as interaction networks, enabling network-based and graph-based methods for modeling and understanding users' behavior on social media. Such methods can also benefit the field of complex networks, for instance in finding initial seeds for the information cascade model. This thesis investigates how to efficiently collect user-generated content and interactions from online social media sites. A novel data collection method, developed through exploratory research including prototyping, is presented as part of the research results.

Analysis of social data requires data that covers all the interactions in a given domain, which has proven difficult to obtain in previous work. An additional contribution is a novel crawling method that extracts all social interactions from Facebook. Over the last few years, we have collected 280 million posts from public pages on Facebook using this method. The collected posts include 35 billion likes and 5 billion comments from 700 million users. This is the largest research dataset of social interactions on Facebook, enabling further and more accurate research in social network analysis.

With the extracted data, it is possible to illustrate interactions between different users who are not necessarily connected. Methods using the same data to identify and cluster different opinions in online communities have also been developed and evaluated. Furthermore, a proposed method is used and validated for finding appropriate seeds for information cascade analyses and for identifying influential users. The conducted research indicates that the data mining approach of association rule learning can identify influential users with high accuracy. The same method can also identify seeds in an information cascade setting, with no significant difference from other network-based methods. Finally, the privacy-related consequences of posting online are an important area for users to consider; methods to protect user privacy and mitigate privacy risks are therefore presented.
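The association rule idea for spotting influential users might be sketched as follows: each post becomes a "transaction" of the users who interacted with it, and rules between users are kept when support and confidence clear a threshold. The thresholds and the interpretation of rule direction are assumptions for illustration, not the thesis's tuned setup.

```python
# A minimal sketch of association rule learning over interaction data:
# a rule u -> v with high confidence suggests v's engagement tends to
# accompany u's. Thresholds are illustrative assumptions.
from collections import Counter
from itertools import permutations

def mine_rules(posts, min_support=3, min_conf=0.6):
    """posts: list of sets of user ids that interacted with each post."""
    single = Counter()
    pair = Counter()
    for users in posts:
        for u in users:
            single[u] += 1
        for u, v in permutations(users, 2):
            pair[(u, v)] += 1

    rules = []
    for (u, v), n in pair.items():
        if n >= min_support and n / single[u] >= min_conf:
            rules.append((u, v, n / single[u]))   # rule u -> v
    return rules

# Users that take part in many strong rules are candidates for
# influential users and information-cascade seeds.
```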
48

Diseño e implementación de sistema distribuido y colaborativo de peticiones HTTP/S / Design and implementation of a distributed and collaborative HTTP/S request system

Pulgar Romero, Francisco Leonardo January 2018 (has links)
Thesis submitted for the degree of Ingeniero Civil en Computación / Today there are many computers and devices with idle computational capacity that could potentially be put to use. A large number of projects exist in which people voluntarily donate their computing power to help with problems such as rendering 3D animations, running experiment simulations, studying mathematical conjectures, optimizing variables and parameters in machine learning, studying the structure of proteins and molecules, classifying galaxies, and predicting the weather, among countless other possible applications in both research and industry. This need for processing power and computational resources has led to technologies such as grid computing: a distributed computing system that coordinates computers with different hardware and software to solve common tasks in parallel. The goal of this thesis is the creation of a distributed grid system in which devices communicate with a central server to collect data from the internet, thereby using the idle capacity of those devices and providing volunteer help to anyone who needs to collect data from the internet. The work implements a user and device administration system built with Django, an HTTP/S query distribution system built with Tornado, and a client program, written in Python, that runs on the devices to solve tasks and send back results. These three systems communicate with one another to distribute the HTTP/S queries, but remain independent of each other, which improves the scalability and fault tolerance of the overall system. Finally, tests and experiments were run on the different components to gather data on the system's behavior and to identify its advantages and disadvantages. The results show that as the number of devices collaborating on a task increases, the time needed to complete the task decreases; they also show a direct correlation between the response time of an HTTP/S query and the physical distance between the device making the query and the web server.
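The volunteer-side client could be as simple as the following polling loop, which asks the coordinator for a pending HTTP/S task, fetches the target URL, and posts the body back. The endpoint paths and JSON shape are hypothetical; the thesis's actual protocol between the Django/Tornado servers and the Python client may differ.

```python
# A minimal sketch of the volunteer-side worker. Endpoint paths and
# payload format are assumptions for illustration.
import json
import time
from urllib.request import Request, urlopen

SERVER = "http://localhost:8000"   # hypothetical coordinator address

def work_loop():
    while True:
        task = json.load(urlopen(f"{SERVER}/task/next"))
        if not task:                  # no pending work: back off
            time.sleep(10)
            continue
        body = urlopen(task["url"]).read()   # run the HTTP/S query
        report = Request(                    # send the result back
            f"{SERVER}/task/{task['id']}/result",
            data=body,
            headers={"Content-Type": "application/octet-stream"},
            method="POST",
        )
        urlopen(report)
```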
49

Intelligent Event Focused Crawling

Farag, Mohamed Magdy Gharib 23 September 2016 (has links)
There is a need for an integrated event focused crawling system to collect Web data about key events. When an event occurs, many users try to locate the most up-to-date information about it, yet there is little systematic collecting and archiving anywhere of information about events. We propose intelligent event focused crawling for automatic event tracking and archiving, as well as effective access. We extend traditional focused (topical) crawling techniques in two directions: modeling and representing events, and webpage source importance. We developed an event model that captures key event information (topical, spatial, and temporal) and incorporated it into the focused crawler algorithm. For the focused crawler to leverage the event model in predicting a webpage's relevance, we developed a function that measures the similarity between two event representations based on textual content. Although the textual content provides a rich set of features, we proposed an additional source of evidence that allows the focused crawler to better estimate the importance of a webpage by considering its website. We estimated webpage source importance as the ratio of relevant to non-relevant webpages found while crawling a website, and combined the textual content information and source importance into a single relevance score. For the focused crawler to work well, it needs a diverse set of high quality seed URLs (URLs of relevant webpages that link to other relevant webpages). Although manual curation of seed URLs guarantees quality, it requires exhaustive manual labor. We therefore proposed an automated approach that curates seed URLs from social media content, leveraging the richness of social media posts about events to extract URLs that can serve as seeds for further focused crawling. We evaluated our system through four series of experiments, using recent events: the Orlando shooting, Ecuador earthquake, Panama papers, California shooting, Brussels attack, Paris attack, and Oregon shooting. In the first series, our proposed event model representation, used to predict webpage relevance, outperformed the topic-only approach in precision, recall, and F1-score. In the second series, using harvest ratio to measure the ability to collect relevant webpages, our event model-based focused crawler outperformed the state-of-the-art focused crawler (best-first search). The third series evaluated the effectiveness of our proposed webpage source importance: the focused crawler with source importance collected roughly the same number of relevant webpages as the one without it, but from a smaller set of sources. The fourth series provides guidance to archivists on the effectiveness of curating seed URLs from social media content (tweets) using different selection methods. / Ph. D.
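The combined relevance score might be sketched as below: a cosine similarity between the event representation and the candidate page, blended with the source's running share of relevant pages. The blending weight, and the normalization of the source ratio to [0, 1], are assumptions for the example rather than the dissertation's tuned values.

```python
# A minimal sketch of blending textual event similarity with webpage
# source importance into one relevance score.
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity of two bag-of-words Counters."""
    common = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in common)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def relevance(page_tokens, event_tokens, site_stats, site, beta=0.7):
    """site_stats[site] = (relevant_count, non_relevant_count),
    updated as the crawl labels pages from that website."""
    text_score = cosine(Counter(page_tokens), Counter(event_tokens))
    rel, non_rel = site_stats.get(site, (0, 0))
    # Normalized share of relevant pages; 0.5 when nothing is known yet.
    source_score = rel / (rel + non_rel) if rel + non_rel else 0.5
    return beta * text_score + (1 - beta) * source_score
```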
50

Étude de la mobilité quadrupède en position ventrale chez le nouveau-né et le nourrisson humain / Very early crawling: a study of quadrupedal mobility in the prone position in newborns and human infants

Forma, Vincent 28 November 2016 (has links)
Self-produced locomotion is a key stage in infant development, which usually begins with hands-and-knees crawling in the second semester of life. From birth, however, newborns are already capable of autonomous propulsion from a prone position. This precocious form of quadrupedalism remains largely unstudied, in part because most researchers consider these creeping movements a mere reflex, destined to dissipate as cortical development progresses. Under such an interpretation, this creeping "reflex" would have no link with mature bipedal walking, would not recruit the upper limbs, and would serve mainly as a mechanism by which newborns reach the maternal breast. Contrary to this point of view, a handful of authors have observed that these patterns of locomotion seem complex and might persist in some form until the age of 2-3 months in an adapted context. These observations invite us to consider the possibility that such primitive locomotion might be directly involved in the emergence of quadrupedal and bipedal gait. The present thesis examines the various characteristics (particularly kinematic) of this prone mobility, from birth to about six months of age. To this end, we describe the creation of an experimental tool, the CrawliSkate, that frees the newborn's arms and facilitates propulsion. We present three studies showing that neonatal prone mobility goes beyond a simple stereotyped reflex, involves coordination between the upper and lower limbs, and can be partially modified at birth at a supra-spinal level through visual stimulation. Lastly, we demonstrate that this pattern of locomotion persists, albeit with heavy modification, throughout the first semester of life.
