Global ETD Search

231	Sistema de recomendação de objeto de aprendizagem baseado em postagens extraídas do ambiente virtual de aprendizagem Silva, Reinaldo de Jesus da January 2016 (has links) Os fóruns de discussões apresentam-se com umas das ferramentas de interação utilizadas nos ambientes virtuais de aprendizagem (AVAs). Esta pesquisa tem como objetivo propor um sistema computacional para recomendação de Objeto de Aprendizagem (OA), levando em consideração as postagens feitas de dentro dos fóruns de um Ambiente Virtual de Aprendizagem (AVA). A metodologia utilizada foi a pesquisa qualitativa, dos tipos descritiva e explicativa. Esse sistema identifica as palavras-chave nos fóruns de um AVA; usam as palavras-chave como indícios dos interesses dos usuários; classifica (atributos pesos) as palavras mais relevantes (Hot Topics); submete a um mecanismo de busca (repositório), neste trabalho foram usados os motores de busca, para fins de teste e oferece os resultados da busca aos usuários. As contribuições deste sistema para os sujeitos participantes desta pesquisa são: recomendação automática de OA para os alunos e professores; aplicação de mineração de dados para sistema gestão educacional; técnica de mineração de textos, utilizando algoritmo TFPDF (Term Frequency Proportional Document Frequency) e integração do AVA com repositório digital. Para validar o sistema de recomendação de OA em um AVA foi desenvolvido protótipo do sistema com uma amostra, contendo vinte e cinco alunos e cinco professores de duas turmas das disciplinas de Modelagem de Banco de Dados e Interface de Usuários e Sistemas Computacionais do curso de Engenharia de Computação da Universidade Estadual do Maranhão. O estudo realizado sobre o tema, e relatado nessa tese, tem como foco a recomendação de OA nos fóruns de um AVA. 
A avaliação e validação realizadas, através de protótipo do sistema com professores e alunos evidenciaram que o sistema de recomendação de Web Services (RECOAWS) proposto atende às expectativas e pode apoiar professores e alunos, nas suas atividades pedagógicas, dentro dos fóruns. / Discussion forums are one of the interaction tools used in virtual learning environments (VLEs). This research proposes a computational system for recommending Learning Objects (LOs) that takes into account the posts made in the forums of a VLE. The methodology was qualitative research of the descriptive and explanatory types. The system identifies keywords in the forums of a VLE; uses those keywords as evidence of users' interests; ranks the most relevant words (Hot Topics) by assigning weights; submits them to a search mechanism (a repository; in this work, search engines were used for testing purposes); and offers the search results to users. The contributions of this system to the participants in this study are: automatic recommendation of LOs for students and teachers; application of data mining to an educational management system; a text mining technique using the TF*PDF algorithm (Term Frequency * Proportional Document Frequency); and integration of the VLE with a digital repository. To validate the LO recommendation system in a VLE, a prototype was developed and evaluated with a sample of twenty-five students and five teachers from two classes, in the courses Database Modeling and User Interface and Computational Systems, of the Computer Engineering program at the State University of Maranhão. The study reported in this thesis focuses on LO recommendation in the forums of a VLE. 
The evaluation and validation performed by the prototype system with teachers and students showed that the Web Services recommendation system (RecoaWS) proposed meets expectations and can support teachers and students in their educational activities within the forums. Ambiente virtual Objeto de aprendizagem Aluno Professor Virtual learning environment Learning object Text mining Recommendation
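As an illustration of the TF*PDF weighting named in record 231, the following is a minimal sketch of the scheme as it is commonly formulated: within each channel, a term's frequency is normalized by the Euclidean norm of all term frequencies in that channel and multiplied by the exponential of the share of documents containing the term; the per-channel values are then summed. Treating each forum as a channel and each post as a document is an assumption made here for illustration, and the function name `tf_pdf` and the sample data are hypothetical; the thesis's actual implementation is not described in the abstract.

```python
import math
from collections import Counter


def tf_pdf(channels):
    """Compute TF*PDF weights.

    channels: dict mapping a channel name (here, a forum) to a list of
    documents (here, posts), each document being a list of tokens.
    Returns a dict mapping each term to its summed weight.
    """
    weights = Counter()
    for docs in channels.values():
        if not docs:
            continue  # an empty channel contributes nothing
        # Frequency of each term in the channel (all occurrences).
        term_freq = Counter(tok for doc in docs for tok in doc)
        # Number of documents in the channel that contain each term.
        doc_freq = Counter(tok for doc in docs for tok in set(doc))
        # Euclidean norm of the channel's term-frequency vector.
        norm = math.sqrt(sum(f * f for f in term_freq.values()))
        for term, freq in term_freq.items():
            weights[term] += (freq / norm) * math.exp(doc_freq[term] / len(docs))
    return dict(weights)


# Hypothetical forums with already-tokenized posts.
forums = {
    "database_modeling": [["sql", "join"], ["sql", "index"]],
    "user_interface": [["layout", "sql"]],
}
hot_topics = sorted(tf_pdf(forums).items(), key=lambda kv: kv[1], reverse=True)
```

The highest-weighted terms in `hot_topics` would play the role of the Hot Topics that the abstract says are submitted to a search engine.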
232	Mineração textual e produção de fanfictions : processos desencadeadores de oportunidades de letramento no ensino de língua estrangeira Barcellos, Patrícia da Silva Campelo Costa January 2013 (has links) Esta tese tem por objetivo investigar como o letramento em língua estrangeira (LE) pode ser apoiado pelo uso de um recurso digital passível de auxiliar os processos de leitura e produção textual. Assim, a presente pesquisa baseia-se nos estudos de Feldman e Sanger (2006) acerca da mineração de textos e nas pesquisas de Black (2007, 2009) sobre a incorporação de um gênero textual característico da internet (fanfiction) na aprendizagem de línguas. Através da utilização de um recurso de mineração de texto (Sobek), a partir do qual ocorre a extração dos termos mais recorrentes em um texto, os participantes deste estudo criaram narrativas, em meio digital. Os doze alunos participantes da pesquisa utilizaram a ferramenta Sobek como mediadora da produção de histórias conhecidas como fanfictions, nas quais novas tramas são criadas a partir de elementos culturais já reconhecidos na mídia. Os informantes eram seis graduandos em Letras e seis alunos de um curso de extensão, ambos os grupos na Universidade Federal do Rio Grande do Sul (UFRGS). Na tarefa proposta, cada aprendiz leu uma fanfiction de sua escolha, publicada na web, e utilizou a ferramenta de mineração para formar grafos com os termos mais recorrentes da história. Durante tal processo, o aluno tinha oportunidade de fazer associações entre as expressões do texto, de modo a formar, na ferramenta Sobek, uma imagem em rede (grafo) que representasse termos recorrentes nesse gênero textual (tais como o uso de tempos verbais no passado e adjetivos para caracterizar personagens e contexto). Posteriormente, esse grafo foi repassado a um colega, que assim iniciou seu processo de composição com base nessa imagem representativa do texto. 
A partir da análise dos dados, observou-se que a utilização da ferramenta digital deu suporte à produção textual em LE, e sua subsequente prática de letramento, visto que os autores se apoiaram no recurso de mineração para criar suas narrativas fanfiction. / This doctoral thesis aims at investigating how literacy in a foreign language (FL) may be supported by the use of a digital resource which can help the processes of reading and writing. Thus, the present research is based on studies by Feldman and Sanger (2006) about text mining, and on research by Black (2007, 2009) about the incorporation of a textual genre characteristic of the Internet (fanfiction) in language learning. Through the use of a text mining resource (Sobek), which promotes the extraction of frequent terms present in a text, the participants of this study created narratives, in digital media. The twelve students who participated in the research used the tool Sobek to mediate the production of stories known as fanfictions, in which new plots are created from cultural elements already recognized in the media. The participants were six undergraduate students of Languages and six students who were part of an extension course, both groups at the Federal University of Rio Grande do Sul (UFRGS). In the proposed task, each student read a fanfiction of his/her choice, which was published on a website, and used the mining tool to develop graphs with the recurrent terms found in the story. During this process, the student had the opportunity to make associations between expressions from the text, using the software Sobek, so as to form an image (graph) that represented terms used in this textual genre (such as the use of verbal tenses in the past and adjectives to describe characters and context). Later, this graph was forwarded to a peer, who then began his/her writing process based on this picture originated from a text. 
From the data analysis, it was observed that the use of a digital tool supported the text production in the FL, and its following practice of literacy, as the authors relied on the mining resource to create their fanfictions. Letramento Língua estrangeira Produção textual Ficção Conto Romance Text mining Fanfiction Foreign language literacy
233	Mineração de textos aplicada na previsão e detecção de eventos adversos no Hospital de Clínicas de Porto Alegre Silva, Daniel Antonio da January 2017 (has links) Este trabalho apresenta os resultados de uma pesquisa que teve como objetivo avaliar o desempenho de métodos de mineração de textos na previsão e detecção de Eventos Adversos (EA). A primeira etapa foi a revisão sistemática da literatura que buscou identificar os métodos de mineração de textos e as áreas da saúde que esses estão sendo aplicados para prever e detectar EA. Após essa etapa foi realizada uma aplicação de métodos de mineração de textos para prever Infecções do Sítio Cirúrgico (ISC) a partir do texto livre de descrições cirúrgicas no Hospital de Clínicas de Porto Alegre (HCPA). Por fim, métodos de mineração de textos foram aplicados para detectar ISC a partir do texto das evoluções de pacientes 30 (trinta) dias após uma cirurgia. Como resultados, destaca-se a identificação dos melhores métodos de pré-processamento e mineração de textos para prever e detectar ISC no HCPA, podendo ser aplicados a outros EA. O método Stochastic Gradient Descent (SGD) apresentou o melhor desempenho, 79,7% de ROC-AUC na previsão de EA. Já para detecção de EA o melhor método foi o Logistic Regression, com desempenho 80,6% de ROC-AUC. Os métodos de mineração de textos podem ser usados para apoiar de maneira eficaz a previsão e detecção de EA, direcionando ações de vigilância para a melhoria da segurança do paciente. / This work presents the results of a research that aimed to evaluate the performance of text mining methods in the prediction and detection of Adverse Events (AE). The first step was the systematic review of the literature that sought to identify the methods of text mining and the health areas they are being applied to predict and detect AE. 
After this step, text mining methods were applied to predict Surgical Site Infections (SSI) from the free text of surgical descriptions at Hospital de Clínicas de Porto Alegre (HCPA). Finally, text mining methods were applied to detect SSI from the text of patient progress notes 30 (thirty) days after surgery. The main result is the identification of the best pre-processing and text mining methods for predicting and detecting SSI at HCPA, which can also be applied to other AE. Stochastic Gradient Descent (SGD) achieved the best performance in predicting AE, with a ROC-AUC of 79.7%. For detecting AE, the best method was Logistic Regression, with a ROC-AUC of 80.6%. Text mining methods can be used to effectively support the prediction and detection of AE by directing surveillance actions to improve patient safety. Mineração de dados Controle de infecções Hospital de Clínicas de Porto Alegre Adverse Events Surgical Infection Text Mining
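The ROC-AUC figures reported in record 233 can be read as the probability that a classifier scores a randomly chosen positive case above a randomly chosen negative one. The sketch below computes that quantity directly by pairwise comparison; the function name `roc_auc` and the labels and scores are invented for illustration and are not data from the thesis.

```python
def roc_auc(labels, scores):
    """ROC-AUC by pairwise comparison of positive and negative examples.

    labels: iterable of 0/1 ground-truth values (1 = event occurred).
    scores: iterable of classifier scores, higher meaning "more likely 1".
    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical example: 2 infections among 5 surgeries.
example_labels = [1, 0, 1, 0, 0]
example_scores = [0.9, 0.4, 0.35, 0.2, 0.8]
example_auc = roc_auc(example_labels, example_scores)  # 4 of 6 pairs ranked correctly
```

The pairwise form is quadratic in the number of examples, so it suits small illustrations; library implementations use a rank-based computation that gives the same value.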
234	Uso do minerador de textos sobek como ferramenta de apoio à compreensão textual Epstein, Daniel January 2017 (has links) A presente tese tem por objetivo investigar os efeitos do uso do minerador de textos Sobek no processo de leitura e compreensão textual de estudantes. Este minerador de textos é capaz de extrair informações relevantes de textos e representá-las de forma gráfica. Esta tese está apoiada nas teorias de aprendizagem significativa, de uso de mapas conceituais para representação de conhecimento e em pesquisas que apontam que representações gráficas de palavras auxiliam na leitura de textos e na sua decodificação. De acordo com a pesquisa de David Ausubel, a aprendizagem significativa ocorre através da assimilação de novos conceitos e ideias e associação destas ao conhecimento que a pessoa já possui. Através da utilização de um minerador de textos com representação gráfica de informações, busca-se apresentar aos estudantes uma representação visual de textos. Esta representação se assemelha a de um mapa conceitual, de forma a auxiliar no processo de compreensão e assimilação de informações pelos estudantes. Nesta representação, ligações entre termos considerados relevantes pelo minerador auxiliam no entendimento destes termos e simbolizam relações presentes no texto, fato esse que pode auxiliar os estudantes a compreenderem melhor o texto e relacionarem novas informações àquelas que já possuem Nesta tese, foi realizado um estudo para auxiliar estudantes nas atividades relacionadas ao letramento. A pesquisa se caracteriza como mista (qualitativa e quantitativa). A coleta de dados se deu a partir da aplicação de questionários com professores e alunos, além de avaliações com o objetivo de verificar contribuições do uso da ferramenta a partir de seu uso do ponto de vista do letramento. Como resultado, encontramos que estudantes que utilizaram o Sobek obtiveram um número mais elevado de respostas corretas nas atividades de interpretação de textos. 
Em média, os alunos acertaram 66% das questões quando utilizando o minerador de textos Sobek, contra apenas 47% das questões que eram respondidas sem o apoio do minerador. Outro resultado apresentado é o alto grau de satisfação de alunos e professores quanto à tecnologia e seu uso em sala de aula. Além destes resultados, obtivemos uma avaliação acerca da capacidade do minerador de textos de extrair termos considerados relevantes ao texto. / This thesis investigates the effects of using the Sobek text miner to improve literacy. Sobek is a tool capable of extracting relevant information from texts and representing it graphically. The thesis is supported by meaningful learning theory, concept map theory, and research indicating that graphical representations of words may improve reading ability and word decoding. According to David Ausubel, meaningful learning occurs through the assimilation of new concepts and ideas and their association with what the person already knows. Using text mining with graphical representation of information, we seek to provide students with a graphical representation of a text. This representation is similar to a concept map, helping students assimilate and comprehend the information. In Sobek’s representation, the links between terms considered relevant to text comprehension may help students better understand the meaning of each term and show relationships present in the text, improving comprehension of context. Furthermore, the links between terms may help information assimilation, since they relate new information to previously known information. This project conducted a study using the Sobek text miner in the classroom to support students’ literacy. To assess the tool’s possible benefits in reading and comprehension activities, we designed a series of classroom activities. To evaluate those activities, qualitative and quantitative approaches were used. 
The study was conducted in two primary schools, with students from the 5th and 8th grades. Interviews were also conducted with the teachers and students, asking them about the tool's main functions and its ability to help students from a literacy point of view. The study shows that students answered more questions correctly when using Sobek than when no support technology was used: on average, 66% of the questions with Sobek against 47% without it. Also, both students and teachers approved of the software and agreed that it improves students’ text comprehension. The thesis also describes an evaluation of Sobek's capability to extract terms considered relevant for text comprehension. Tecnologia educacional Aprendizagem significativa Compreensão de texto Text Comprehension Sobek Reading Meaningful learning Text mining Graphs Literacy
235	Expressão da ciência nas políticas públicas relativas à obesogenicidade nos Estados Unidos da América Finocchio, Caroline Pauletto Spanhol January 2014 (has links) A obesidade decorre de um processo multifatorial que envolve aspectos biológicos, comportamentais e ambientais. Atualmente, o tema, por sua dimensão e universalidade, tem despertado o interesse coletivo, sobretudo da ciência, dos governos e da mídia. Um visível esforço está sendo empreendido com vistas ao controle dessa pandemia, com chamamento à responsabilidade de todos os stakeholders, entre eles os atores do Agronegócio mundial. Com o propósito de evidenciar os fundamentos científicos dessas iniciativas e as interrelações entre os agentes envolvidos, buscou-se identificar as dimensões disciplinares presentes nas publicações da FAO/WHO, do governo e da mídia dos Estados Unidos sobre obesogenicidade. Para tanto, foi realizada a análise documental das publicações divulgadas em meio eletrônico por cada um dos agentes, utilizando a mineração de texto. Para a construção dos argumentos que norteiam a pesquisa foram utilizadas as Teorias do Agendamento, do Enquadramento e Priming. Para a construção da estrutura analítica utilizada na mineração de texto foram utilizados 4.648 artigos científicos disponíveis no Portal Web of Science que abordam o tema. Além disso, para caracterizar a dimensão Agronegócio foram coletados todos os artigos publicados no Agribusiness International Journal e no International Food and Agribusiness Management Review no período de 2003-2013. Após a coleta, foram construídos os dicionários de palavras representativos de cada dimensão disciplinar e do Agronegócio, utilizados no escaneamento dos documentos. A base de dados foi composta por 3.342 políticas introduzidas ou promulgadas pelos estados norte-americanos, 1.168 artigos jornalísticos publicados no The New York Times e no The Washington Post e 67 publicações da FAO/WHO publicados no período de 2003-2013. 
Os resultados indicaram que a mídia tem enquadrado frequentemente a temática sob a perspectiva das Ciências da Saúde, seguida da Multidisciplinar e Agronegócio. Já para o governo, as dimensões disciplinares mais frequentes são Multidisciplinar, Agronegócio e as Ciências da Saúde. Na FAO/WHO as Ciências da Saúde, Multidisciplinar e Agronegócio são as mais frequentes. Mesmo considerando as diferenças quanto ao enquadramento do tema pelos stakeholders, nota-se a existência de alguma semelhança entre esses enquadramentos, evidenciada pela similaridade entre as Ciências da Saúde, Multidisciplinar e as Ciências da Vida. Destaca-se ainda que a participação do Agronegócio é expressiva nos instrumentos políticos dos Estados Unidos, sugerindo o seu papel no crescimento da obesidade coletiva e na sua responsabilidade frente à desejada reversão dessa tendência mundial. / Obesity results from a multifactorial process involving biological, behavioral, and environmental aspects. Today, the scale and universality of obesity have attracted widespread interest, especially among the scientific community, governments, and the media. A visible effort is being made to control this pandemic, with a call for responsibility by all stakeholders, including the actors of global agribusiness. Aiming to highlight the scientific foundations of these initiatives and the interrelationships between those involved, we sought to identify the disciplinary dimensions present in publications on obesogenicity by the FAO/WHO and by the government and media of the United States. To this end, a documentary analysis of the publications disseminated electronically by each of these agents was conducted using text mining. Agenda-setting, framing, and priming theories were used to construct the arguments that guide this research. To build the analytical framework used in text mining, 4,648 scientific articles on the topic available on the Web of Science portal were used. 
Furthermore, to characterize the agribusiness dimension, all articles published in the International Agribusiness Journal and the International Food and Agribusiness Management Review from 2003 to 2013 were collected. Subsequently, dictionaries of words representative of each disciplinary dimension and of agribusiness were constructed and used to scan the documents. The database comprised 3,342 policies introduced or enacted by U.S. states, 1,168 news articles published in The New York Times and The Washington Post, and 67 FAO/WHO publications, all from 2003 to 2013. The results indicated that the media has often framed the issue from the perspective of health sciences, followed by multidisciplinary and agribusiness. For the government, the most frequent disciplinary dimensions are multidisciplinary, agribusiness, and health sciences. In the FAO/WHO publications, health sciences, multidisciplinary, and agribusiness are the most frequent. Even considering the differences in how stakeholders frame the issue, there is some similarity between these framings, shown by the joint occurrence of health sciences, life sciences, and multidisciplinary. The participation of agribusiness is also significant in the policy instruments of the United States, suggesting its role in the growth of collective obesity and its responsibility in the desired reversal of this global trend. Obesidade Política pública Agronegócio Análise de dados Agenda-setting Obesogenic environment Text mining Public policy Media Agribusiness
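The dictionary-based scanning described in record 235 can be pictured with a small sketch: each disciplinary dimension has a word list, and a document is profiled by the share of dictionary hits that fall in each dimension. The dictionaries, the sample sentence, and the function name `scan` below are hypothetical stand-ins; the thesis built its dictionaries from scientific articles and the abstract does not say how hits were counted or normalized.

```python
import re
from collections import Counter

# Hypothetical, heavily abbreviated dictionaries (illustration only).
DICTIONARIES = {
    "health_sciences": {"disease", "diabetes", "clinical"},
    "agribusiness": {"farm", "crop", "food", "supply"},
}


def scan(text, dictionaries=DICTIONARIES):
    """Return, per dimension, its share of all dictionary hits in the text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for tok in tokens:
        for dimension, words in dictionaries.items():
            if tok in words:
                counts[dimension] += 1
    total = sum(counts.values())
    # With no hits at all, every dimension gets a share of 0.0.
    return {d: (counts[d] / total if total else 0.0) for d in dictionaries}


profile = scan("Farm subsidies shape the food supply and diabetes risk.")
```

Comparing such profiles across government policies, news articles, and FAO/WHO publications is one simple way to see which dimension dominates each source's framing.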
236	Linking clinical records to the biomedical literature Alnazzawi, Noha Abdulkareem D. January 2016 (has links) Narrative information in Electronic Health Records (EHRs) contains a wealth of clinical information about treatments, diagnosis, medication and family history. In addition, the scientific literature represents a rich source of information that summarises the latest results and new research findings relevant to different diseases. These two textual sources often contain different types of valuable phenotypic information that may be complementary to each other. Combining details from each source thus has the potential to be useful in uncovering new disease-phenotypic associations. In turn, these associations can help to identify patients with high risk factors, and they can be useful in developing solutions to control the causes responsible for the development of different diseases. However, clinicians at the point of care have limited time to review the large volume of potentially useful information that is locked away in unstructured text format. This in turn limits the utility of this “raw” information to clinical practitioners and computerised applications. Accordingly, the provision of automated and efficient means to extract, combine and present phenotype information that may be scattered amongst a large number of different textual sources in an easily digestible format is a prerequisite to the effective use and comprehensive understanding of details contained within both the records and the literature. The development of such facilities can in turn help in deriving information about disease correlations and supporting clinical decisions. This thesis is the first comprehensive study focussing on extracting and integrating phenotypic information from two different biomedical sources using Text Mining (TM) techniques. 
In this research, we describe our work on (1) extracting phenotypic information from both EHRs and the biomedical literature; (2) extracting the relations between phenotypic information and distilling them from EHRs using an event-based approach; and (3) using normalisation methods to link the phenotypic information found in EHRs with associated mentions found in the literature as a first step towards the automatic integration of information from these heterogeneous sources. 610.28
237	As dimensões disciplinares na comunicação científica em biocombustíveis Gomes, Janaína January 2009 (has links) A comunicação científica constitui o substrato da pesquisa científica. Por meio dela se configuram os campos de legitimação do conhecimento. Este trabalho de tese se dedicou ao estudo do campo dos biocombustíveis através da comunicação científica. O referido campo de pesquisa envolve diferentes áreas do conhecimento e se refere às demandas energéticas da sociedade pós-industrial. Foram analisados dez anos da comunicação científica para se estabelecer as dimensões disciplinares sobre as quais essa discussão se sustenta. Para tanto, dois métodos de pesquisa foram combinados. Utilizou-se a bibliometria e a análise de conteúdo quantitativa, através de técnicas de text mining. A análise bibliométrica foi realizada com dados quantitativos sobre a comunicação científica, disponíveis na base Web of Science. A análise de conteúdo quantitativa foi feita com textos completos dos artigos e revisões científicas sobre biocombustíveis, utilizando-se o software Wordstat. Os dados bibliométicos apresentaram um alto grau de interdisciplinaridade expresso pela inter-relação de 132 áreas do conhecimento. Ademais, observou-se a predominância das áreas da Química (1.513 artigos e revisões), Engenharias (1.157) e Ciências Agrárias (1.029), configurando um campo com inserção eminentemente tecnológica. Na análise de conteúdo foi possível revelar uma inserção muito significativa das Ciências Sociais na argumentação dos artigos e revisões analisados. Com os dados obtidos foi possível dividir o campo dos biocombustíveis em três grupos de dimensões disciplinares, que o contextualizam. No primeiro grupo, de maior abrangência, participam as dimensões disciplinares das Ciências Agrárias, das Ciências Sociais e das Ciências Ambientais. No segundo grupo, que constitui a base tecnológica do campo, se expressam as dimensões disciplinares da Química, da Engenharia e da Microbiologia. 
O terceiro grupo, de expressão emergente, reúne as dimensões disciplinares da Biologia e Bioquímica, Ciências Animais e Vegetais, Biologia Molecular e Genética, Economia, Ciência dos Materiais, Nanociências e Nanotecnologia, Geociências, Física, Humanidades, Ciências Multidisciplinares, Matemática e Ciências da Computação. Infere-se que o primeiro grupo de dimensões disciplinares encerra os componentes que justificam socialmente o progresso do campo dos biocombustíveis, enquanto o segundo grupo representa a base tecnológica em que se sustenta essa temática de pesquisa. O terceiro grupo representa as áreas emergentes. No trabalho, formula-se uma métrica para a aferição da expressão da Interdisciplinaridade, útil também para outros campos de pesquisa. / The scientific communication on biofuels published from 1998 to 2007 was analysed using a combination of bibliometric methods and content analysis techniques. The analysis characterized this field of research as interdisciplinary with marked social relevance. The bibliometric study showed that 132 different, interacting areas of knowledge contribute to this research field. Content analysis placed the field in the context of three groups of disciplinary dimensions. The first group, of broader influence, includes Agricultural Sciences, Social Sciences, and Environmental Sciences. The second group, which makes up the technological base of the field, includes the disciplinary dimensions of Chemistry, Engineering, and Microbiology. The third group contains the disciplinary dimensions with emergent expression in the field of biofuels, namely Biology and Biochemistry, Animal and Plant Sciences, Molecular Biology and Genetics, Economics, Materials Science, Nanosciences and Nanotechnology, Geosciences, Physics, Humanities, Multidisciplinary Sciences, Mathematics, and Computer Science. 
One can infer from the study that the first group of disciplinary dimensions comprises the elements that socially validate the progress of research in the field of biofuels. Furthermore, this work presents a metric for measuring the expression of interdisciplinarity in a research field, useful for analysing the biofuel research field and others as well. Comunicação científica Biocombustíveis Interdisciplinarity Agroenergy Social communication Content analysis Text mining
238	Assimetria de informação e sinalização na cadeia da carne bovina Ceolin, Alessandra Carla January 2011 (has links) A presente pesquisa tem como objetivo identificar e analisar os principais Mecanismos de Sinalização utilizados na cadeia da carne bovina brasileira, bem como, verificar o grau de ocorrências dessas sinalizações pela Ciência, pelo Governo e pela Mídia, por meio de processos de mineração de textos e descoberta do conhecimento. Embasando-se na Economia da Informação como teoria-chave dessa análise e concentrando-se nas teorias da Assimetria de Informação, Seleção Adversa e Sinalização e, também, da revisão das interações entre Ciência, Mídia e Governo, foram definidas cinco dimensões, sob as quais os sinais sobre a carne bovina são enquadrados: Econômica, Comunicação e Marketing, Qualidade, Institucional e Sistema de Produção. Para análise, foram coletados documentos textuais em formato eletrônico ao longo de cinco anos. A busca dos documentos deu-se em bases de dados de publicações científicas, em portais do Governo e em arquivos dos jornais e magazines disponíveis na rede mundial de computadores, a partir de palavras-chave relacionadas à cadeia da carne bovina. Foram selecionados 4.281 artigos científicos para compor a base de dados da Ciência, 730 documentos para a base de dados do Governo e 5.439 documentos na base de dados da Mídia, totalizando 10.450 documentos selecionados. Esses documentos foram armazenados e classificados no software QDA Miner segundo as variáveis fonte (Ciência, Governo e Mídia) e ano (2005, 2006, 2007, 2008 e 2009). Para extrair o conhecimento das bases textuais, foi elaborada uma estrutura de análise constituída pelas dimensões, códigos (sinais) e palavras-chave. Na sequência foram analisados e codificados os documentos com os respectivos sinalizadores. 
Aplicando-se a estrutura de análise do software, foram utilizadas representações gráficas elaboradas a partir da frequência relativa das ocorrências de cada código nas três fontes de informação. De acordo com os resultados apresentados foi possível verificar que os termos e dimensões utilizados pela Ciência, Governo e Mídia diferem nas frequências em praticamente todos os códigos, demonstrando pouca sinergia entre as fontes de informação. No decorrer do período analisado, percebeu-se maior oscilação na frequência das publicações que incluíam os sinais Sanidade e Sustentabilidade, os quais foram mais enfatizados nos últimos anos da presente análise. Os códigos Certificação e Rastreabilidade que pareciam ter maior destaque para fornecer informações sobre aspectos ligados à carne bovina brasileira, de acordo com estudos existentes, não são os mais emitidos, de acordo com esta pesquisa. A maior identidade da Ciência ao longo do período de análise foi para a Dimensão Sistema de Produção, demonstrando que as publicações científicas possuem um caráter mais voltado à melhoria da produção em si, como alimentação, genética, reprodução, idade de abate, dentre outros. Em relação ao Governo, observou-se maior identidade dessa fonte de informação para a Dimensão Institucional, com exceção do ano de 2007, onde se percebe maiores ênfases das publicações do Governo nas Dimensões Qualidade e Sistema de Produção. Já em relação ao que se observa na Mídia, há maior interesse em publicações nas Dimensões Qualidade e Sistema de Produção. Ao analisar os resultados entre as três fontes de informação, verifica-se maior proximidade entre o que se observa na Mídia e no Governo, visto a predominância da Dimensão Sistema de Produção na Ciência. Analisando comparativamente as fontes de informação foi possível observar que o maior grau de similaridade entre os códigos foi observado nos documentos publicados pelo Governo. 
Embora os códigos adotados para sinalizar informações sobre a cadeia da carne bovina sejam os mesmos, enquadrados igualmente em cinco Dimensões, determinados sinalizadores são utilizados com maior destaque, similaridades e agrupamentos de códigos e, de acordo com as peculiaridades de cada fonte de informação investigada (Ciência, Governo e Mídia). / The goal of this research is to identify and analyze the main signaling mechanisms used in the Brazilian beef chain, and to measure how often these signals are emitted by science, government, and media, through text mining and knowledge discovery processes. Using Information Economics as the key theory of this analysis, concentrating on the theories of Information Asymmetry, Adverse Selection, and Signaling, and reviewing the interactions between science, media, and government, five dimensions were defined under which signals about beef are framed: economic, communication and marketing, quality, institutional, and production system. For the analysis, electronic text documents covering a five-year period were collected. The documents were searched for in scientific publication databases, government portals, and newspaper and magazine archives available on the Internet, using keywords related to the beef chain. In total, 10,450 documents were selected: 4,281 scientific articles for the science database, 730 documents for the government database, and 5,439 documents for the media database. These documents were stored and classified in the QDA Miner software according to the variables source (science, government, and media) and year (2005, 2006, 2007, 2008, and 2009). To extract knowledge from the text databases, an analysis framework composed of dimensions, codes (signals), and keywords was built. The documents were then analyzed and coded with the respective signals. 
Applying the analysis framework in the software, graphical representations were produced from the relative frequency of occurrence of each code in the three sources of information. The results show that the terms and dimensions used by science, government and media differ in frequency in practically all codes, demonstrating little synergy between the information sources. Over the period analyzed, the frequency of publications that included the animal health (Sanidade) and sustainability signals oscillated the most, and these signals were emphasized more in the final years of the analysis. According to this research, the certification and traceability signals, which existing studies suggested were the most prominent in supplying information on aspects of Brazilian beef, are not the most frequently emitted. Over the analysis period, science was most closely identified with the production system dimension, showing that scientific publications focus on improving production itself: feeding, genetics, breeding, slaughter age, among others. The government was most closely identified with the institutional dimension, with the exception of 2007, when government publications placed the greatest emphasis on the quality and production system dimensions. In the media, the quality and production system dimensions are the most emphasized in publications. Comparing the results across the three sources of information, a closer connection between media and government is perceived, given the predominance of the production system dimension in science. Comparative analysis of the sources of information revealed that the greatest degree of similarity between codes was found in the documents published by the government. 
Although the codes adopted to signal information about the beef chain are the same, and are framed equally in five dimensions, each source of information investigated (science, government and media) emphasizes certain signals and shows its own similarities and groupings of codes, according to its peculiarities. Agronegócio Cadeia produtiva Carne Informação Economia da informação Information Text mining Information asymmetry Signaling Beef
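As an illustration of the relative-frequency step described in the abstract above, here is a minimal Python sketch. It is not the thesis's QDA Miner workflow: the function name, the code labels and the toy data are all hypothetical, and it assumes each document has already been coded with a list of signals.

```python
from collections import Counter


def relative_code_frequencies(coded_docs):
    """Return {source: {code: share}} for an iterable of (source, codes) pairs.

    Each share is the number of occurrences of a code in a source divided by
    the total number of code occurrences in that source.
    """
    counts = {}
    for source, codes in coded_docs:
        counts.setdefault(source, Counter()).update(codes)

    result = {}
    for source, counter in counts.items():
        total = sum(counter.values())
        result[source] = (
            {code: n / total for code, n in counter.items()} if total else {}
        )
    return result


# Hypothetical toy data: (source, codes assigned to one document).
sample = [
    ("science", ["production_system", "production_system", "quality"]),
    ("government", ["institutional", "quality"]),
    ("media", ["quality", "production_system"]),
    ("government", ["institutional", "institutional"]),
]

freqs = relative_code_frequencies(sample)
for source, shares in sorted(freqs.items()):
    print(source, {code: round(share, 2) for code, share in shares.items()})
```

Normalizing within each source makes sources of very different sizes (here, 730 government documents against 5,439 media documents) comparable on the same scale.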
239	Context-Aware Adaptive Hybrid Semantic Relatedness in Biomedical Science January 2016 (has links) abstract: Text mining of biomedical literature and clinical notes is a very active field of research in biomedical science. Semantic analysis is one of the core modules of many Natural Language Processing (NLP) solutions. Methods for calculating the semantic relatedness of two concepts can be very useful in solutions to problems such as relationship extraction, ontology creation and question answering [1–6]. Several techniques exist for calculating the semantic relatedness of two concepts. These techniques use different knowledge sources and corpora. So far, researchers have attempted to find the best hybrid method for each domain by manually combining semantic relatedness techniques and data sources. This work attempts to eliminate the need for manually combining semantic relatedness methods for each new context or resource by proposing an automated method that searches for the combination of semantic relatedness techniques and resources yielding the best semantic relatedness score in every context. This may help the research community find the best hybrid method for each context, given the available algorithms and resources. / Dissertation/Thesis / Doctoral Dissertation Biomedical Informatics 2016 Computer science Health sciences Nanotechnology biomedical informatics natural language processing semantic analysis semantic relatedness text mining
240	A Study of Text Mining Framework for Automated Classification of Software Requirements in Enterprise Systems January 2016 (has links) abstract: Text Classification is a rapidly evolving area of Data Mining, while Requirements Engineering is a less-explored area of Software Engineering that deals with the process of defining, documenting and maintaining a software system's requirements. When researchers blended these two streams, the resulting work aimed to automate the classification of software requirements statements into categories that developers can easily comprehend, for faster development and delivery; until now this was mostly done manually by software engineers, a tedious job. However, most of that research focused on classifying non-functional requirements, which pertain to intangible features such as security, reliability, quality and so on. Automatically classifying functional requirements, those pertaining to how the system will function, is a challenging task, especially when they belong to different, large enterprise systems. This requires exploiting text mining capabilities. This thesis investigates the results of text classification applied to functional software requirements by creating a framework in R and using algorithms and techniques such as k-nearest neighbors, support vector machines, boosting, bagging, maximum entropy, neural networks and random forests in an ensemble approach. The study was conducted by collecting and visualizing relevant enterprise data that had previously been classified manually and was subsequently used to train the model. Key components for training included the frequency of terms in the documents and the level of cleanliness of the data. The model was applied to test data and validated by studying and comparing parameters such as precision, recall and accuracy. 
/ Dissertation/Thesis / Masters Thesis Engineering 2016 Computer science Engineering data analytics R requirements classification text classification text mining
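The validation metrics named in the abstract above can be sketched as follows. This is a minimal Python illustration (the thesis itself reports a framework in R), and the function name, category labels and toy predictions are hypothetical. Precision and recall are computed for one chosen category treated as the positive class; accuracy is computed over all categories.

```python
def precision_recall_accuracy(y_true, y_pred, positive):
    """Compute precision and recall for one class, plus overall accuracy."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = correct / len(pairs) if pairs else 0.0
    return precision, recall, accuracy


# Hypothetical manually assigned labels and model predictions.
y_true = ["billing", "billing", "reporting", "security", "billing"]
y_pred = ["billing", "reporting", "reporting", "billing", "billing"]

p, r, a = precision_recall_accuracy(y_true, y_pred, positive="billing")
print(f"precision={p:.2f} recall={r:.2f} accuracy={a:.2f}")
```

Reporting precision and recall per category, rather than accuracy alone, shows whether a classifier is merely favoring the most common requirement category.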
