141 |
Normalização textual de conteúdo gerado por usuário / User-generated content text normalization
Bertaglia, Thales Felipe Costa, 18 August 2017
Conteúdo Gerado por Usuário (CGU) é a denominação dada ao conteúdo criado de forma espontânea por indivíduos comuns, sem vínculos com meios de comunicação. Esse tipo de conteúdo carrega informações valiosas e pode ser explorado por diversas áreas do conhecimento. Muito do CGU é disponibilizado em forma de textos: avaliações de produtos, comentários em fóruns sobre filmes e discussões em redes sociais são exemplos. No entanto, a linguagem utilizada em textos de CGU diverge, de várias maneiras, da norma culta da língua, dificultando seu processamento por técnicas de PLN. A linguagem de CGU é fortemente ligada à língua utilizada no cotidiano, contendo, assim, uma grande quantidade de ruídos. Erros ortográficos, abreviações, gírias, ausência ou mau uso de pontuação e de capitalização são alguns ruídos que dificultam o processamento desses textos. Diversos trabalhos relatam perda considerável de desempenho ao testar ferramentas do estado-da-arte de PLN em textos de CGU. A Normalização Textual é o processo de transformar palavras ruidosas em palavras consideradas corretas e pode ser utilizada para melhorar a qualidade de textos de CGU. Este trabalho relata o desenvolvimento de métodos e sistemas que visam a (a) identificar palavras ruidosas em textos de CGU, (b) encontrar palavras candidatas a sua substituição, e (c) ranquear os candidatos para realizar a normalização. Para a identificação de ruídos, foram propostos métodos baseados em léxicos e em aprendizado de máquina, com redes neurais profundas. A identificação automática apresentou resultados comparáveis ao uso de léxicos, comprovando que este processo pode ser feito com baixa dependência de recursos. Para a geração e ranqueamento de candidatos, foram investigadas técnicas baseadas em similaridade lexical e word embeddings. Concluiu-se que o uso de word embeddings é altamente adequado para normalização, tendo atingido os melhores resultados.
Todos os métodos propostos foram avaliados com base em um córpus de CGU anotado no decorrer do projeto, contendo textos de diferentes origens: fóruns de discussão, reviews de produtos e publicações no Twitter. Um sistema, Enelvo, combinando todos os métodos foi implementado e comparado a um outro sistema normalizador existente, o UGCNormal. Os resultados obtidos pelo sistema Enelvo foram consideravelmente superiores, com taxa de correção entre 67% e 97% para diferentes tipos de ruído, com menos dependência de recursos e maior flexibilidade na normalização. / User-Generated Content (UGC) is the name given to content created spontaneously by ordinary individuals with no ties to the media. This type of content carries valuable information and can be exploited by several areas of knowledge. Much UGC is made available as text: product reviews, forum comments about movies, and discussions on social networks are examples. However, the language used in UGC texts departs in many ways from the standard norm of the language, which makes it difficult for NLP techniques to process. UGC language is closely tied to everyday language and therefore contains a large amount of noise. Spelling mistakes, abbreviations, slang, and missing or misused punctuation and capitalization are some of the kinds of noise that make these texts difficult to process. Several studies report a considerable loss of performance when state-of-the-art NLP tools are tested on UGC texts. Textual Normalization is the process of turning noisy words into words considered correct, and it can be used to improve the quality of UGC texts. This work reports the development of methods and systems that aim to (a) identify noisy words in UGC texts, (b) find candidate words to replace them, and (c) rank the candidates to perform the normalization. For the identification of noisy words, methods based on lexicons and on machine learning with deep neural networks were proposed.
The automatic identification produced results comparable to the use of lexicons, showing that this process can be done with low dependence on resources. For the generation and ranking of candidates, techniques based on lexical similarity and word embeddings were investigated. It was concluded that word embeddings are highly suitable for normalization, having achieved the best results. All proposed methods were evaluated on a UGC corpus annotated during the project, containing texts from different sources: discussion forums, product reviews and tweets. A system, Enelvo, combining all methods was implemented and compared to an existing normalization system, UGCNormal. The results obtained by Enelvo were considerably better, with a correction rate between 67% and 97% for different types of noise, less dependence on resources, and greater flexibility in normalization.
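The three steps named in the abstract, (a) noise identification, (b) candidate generation and (c) candidate ranking, can be illustrated with a minimal lexicon-based sketch. This is not the Enelvo implementation: the toy lexicon, the function names, the similarity threshold and the use of `difflib.SequenceMatcher` as the lexical similarity measure are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical toy lexicon; a real system would load a full dictionary.
LEXICON = {"você", "não", "muito", "bom", "produto", "também", "que"}

def is_noisy(token, lexicon=LEXICON):
    """Step (a): flag a token as noisy when it is absent from the lexicon."""
    return token.lower() not in lexicon

def similarity(a, b):
    """Lexical similarity in [0, 1] based on matching character blocks."""
    return SequenceMatcher(None, a, b).ratio()

def rank_candidates(token, lexicon=LEXICON, threshold=0.5):
    """Steps (b) and (c): keep lexicon words similar enough to the token,
    then sort them from most to least similar (ties broken alphabetically)."""
    scored = [(similarity(token.lower(), word), word) for word in lexicon]
    kept = [(score, word) for score, word in scored if score >= threshold]
    return [word for score, word in sorted(kept, key=lambda p: (-p[0], p[1]))]

def normalize(text, lexicon=LEXICON):
    """Replace each noisy token with its best candidate, if one exists."""
    out = []
    for token in text.split():
        if is_noisy(token, lexicon):
            candidates = rank_candidates(token, lexicon)
            out.append(candidates[0] if candidates else token)
        else:
            out.append(token)
    return " ".join(out)
```

A token with no sufficiently similar lexicon entry is left unchanged, which keeps the sketch from overcorrecting names or out-of-vocabulary words. A ranking based on word embeddings, which the abstract reports as the best-performing approach, would replace `similarity` with a vector-based score.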
142 |
Persuasiveness in the discourse of wine : The rhetoric of Robert Parker
Hommerberg, Charlotte, January 2011
The primary purpose of this study is to explore a case of remarkably powerful contemporary rhetoric, namely Robert Parker’s wine writing, which has had an unprecedented impact in the world of prestigious wine for more than two decades. Parker, an American autodidact who gave up his career in law to become a full-time wine critic, is considered the most influential critic of all time. This background motivates the approach of the current enquiry, which targets the persuasiveness in Parker’s writing. The investigation strives to bring to the fore both explicit and implicit elements of his wine reviews that have the potential to contribute to rhetorical success. The material selected for analysis comprises a corpus of reviews extracted from Parker’s extensive body of wine writing. The texts are studied against the backdrop of socio-cultural and institutional frames. Considerable importance is assigned to the fact that the reviews occur within a strictly specialized field of discourse with a highly conventionalized configuration. This hermeneutic enquiry approaches the topic from three analytical perspectives, designed to highlight persuasiveness in representations, argumentation and appraisal. The presentation reports on schematic patterns in Parker’s discourse as well as close interpretation of individual texts. The analysis of representations shows that both visual and verbal representations contribute to the persuasiveness of the text. The argumentative exploration of Parker’s discourse, which is assisted by the analytical tools of pragma-dialectics, demonstrates that the reviews involve rational argumentation on several subordinate levels, given in support of assessments and recommendations. Finally, the perspective of appraisal draws on the analytical resources provided by the Appraisal model to shed light on the way in which the audience is positioned to respond with respect to emotional, associative and perceptual values.
The results indicate that the persuasiveness of Parker’s discourse arises as a result of concordance among an intricate array of interrelated factors. The audience is recurrently demonstrated to play a crucial role as co-constructors of the message. The present study also has methodological outcomes, presenting a novel combination of analytical methods to perform contextually situated discourse analysis. In addition, the material is allowed to challenge the theoretical ideas and notions that are addressed.
143 |
Begreppet Likabehandling, en social konstruktion : en kvalitativ textanalys av fem västsvenska skolors likabehandlingsplaner / The concept of Equality of treatment, a social construction : a qualitative text analysis of five western Swedish school’s equality of treatment plan
Rasmussen, Anna, Karlsson, Annelie, January 2013
Syftet med studien var att studera hur begreppet likabehandling konstrueras genom språkets användning i likabehandlingsplaner upprättade i en västsvensk kommun läsåret 2012/2013. I förhållande till detta formulerades följande frågeställningar. Hur konstrueras begreppet likabehandling genom språkets användning i likabehandlingsplanerna, vilka likheter eller skillnader kan urskiljas i hur begreppet likabehandling konstrueras genom språkets användning i de olika likabehandlingsplanerna? Studien tar sin utgångspunkt i diskrimineringslagens föreskrifter om att en likabehandlingsplan ska upprättas i utbildningsverksamheter samt i tidigare forskning om likabehandling i relation till begrepp som rättvisa, diskriminering och värdering. Studiens teoretiska utgångspunkter är socialkonstruktionism och symbolisk interaktionism. Likabehandlingsplanerna har analyserats med kvalitativ textanalys inspirerad av diskursanalytiska perspektiv samt med inspiration av kvantitativ innehållsanalys. Fokus ligger på grammatik och ordval som är frekvent förekommande i planerna. Resultatet visar att skolorna använder sig av ord som likabehandlingsplan, diskriminering, känna, mål, åtgärd och ansvar för att konstruera likabehandling. Det finns också tecken på att vissa av dessa ord ska inneha samma innebörd för samtliga deltagare i processen. Skolorna presenterar en verklighetsyn där det krävs handling för att uppnå likabehandling dock är skolorna till stor del inte överens om vilka handlingar det gäller. Det finns tydliga tecken på att skolorna definierar situationen kring likabehandling på olika sätt vilket kan komma att påverka hur det praktiska likabehandlingsarbetet ser ut. / The purpose of the study was to examine how the concept of equality of treatment is constructed through the use of language in equality of treatment plans drawn up in a western Swedish municipality during the school year of 2012/2013. Questions were formulated in relation to the aim of the study.
How is the concept of equality of treatment constructed by the language use in equality of treatment plans, and what similarities or differences can be seen in how the concept of equality of treatment is constructed through language use in the different equality of treatment plans? The study is based on the sections of the discrimination law that prescribe that an equality of treatment plan should be established in every educational system, as well as on previous research on equality of treatment in relation to concepts such as justice, discrimination and values. The study is also based on social constructionism and symbolic interactionism. The equality of treatment plans were analyzed using qualitative text analysis inspired by discourse analysis perspectives and quantitative content analysis focused on grammar and vocabulary that are frequent in the plans. The result shows that the schools use the words equality of treatment plan, discrimination, feel, goals, action, and responsibility to construct equality of treatment. There are also signs that some of these words are meant to hold the same meaning for all participants in the process. The schools present a reality where action is needed to achieve equality of treatment, but the schools do not agree on the type of actions. There are clear signs that the schools define the situation regarding equality of treatment in different ways, which may affect the practical work of equality of treatment.
144 |
Edith Wharton's View of Women: Lily Bart in The House of Mirth
Johansson, Monique, January 2011
In this essay I plan to show how Wharton, through Lily, criticised society, and more specifically its expectations of women. My thesis is that Wharton and her character Lily exposed the upper class society of New York, and its ruthlessness, by voicing a woman’s point of view. Therefore, the main purpose here is to reveal the complexity of the lives women led in order to fulfil society’s expectations and I thereby plan to explore what it was like living in a world governed by strict rules of conduct.
145 |
Den nya ämnesplanen i moderna språk : vad innebär den för förändringar och hur tolkas den?
Plaza, Cajsa, January 2012
The purpose of this research is to study the new curriculum for modern languages in upper secondary school and to compare the 2011 curriculum with the one from 2000 in order to find differences in content and wording. The research also aims to investigate teachers' interpretations and understanding of the new curriculum. The study is divided into two sub-studies, based on text analysis and interviews respectively. The text analysis of policy documents is carried out in parallel with some aspects found in the background materials of the upper secondary school reform, which can be summarized as precision and globalization. My results also show that the subject presentation in the new curriculum for modern languages was reduced in scope and summarized. The interview study connects to the aspects that emerged in the text analysis. The results of the interview study are that teachers find it difficult to interpret the new curriculum in an equivalent way, which probably has to do with the fact that they have not received sufficient or equivalent preparation in the new policy documents. The general opinion of the teachers is that they have noticed the attempts at precision in the new curriculum, but they would like even more precision. The new curriculum has changed their teaching mostly with regard to the new grading scale.
146 |
The Colonizer and the Colonized in Kazuo Ishiguro's Novels, An Artist of the Floating World and The Remains of the Day
Johansson, Monique, January 2012
This essay investigates the colonized self in Kazuo Ishiguro’s An Artist of the Floating World and The Remains of the Day, by analyzing the novels from a postcolonial perspective. Furthermore, it discusses how and why Masuji Ono and Mr. Stevens are affected by Japanese imperialism and British colonialism. Through a close reading of the novels, this essay argues that the protagonists are ‘colonized’ by their own countries, and eventually also ‘imperialized,’ or influenced, by America following the Second World War. Ono is ‘colonized’ by his colleague Matsuda, while Mr. Stevens is ‘colonized’ by his employer, Mr. Darlington. Later on, they are both ‘imperialized’ through the American occupation and influence.
147 |
Predicting weight loss in blogs using computerized text analysis
Chung, Cindy Kyuah, 16 October 2009
An increasing number of people are turning to online blogging communities
devoted to self-change for smoking, shopping, and other behaviors. To understand
processes underlying effective self-change, the current project tracked the language and
social dynamics of a dieting blog community using computerized text analysis. Three
research questions were asked: What predicts weight loss in blogs? What changes in
blogging predict weight loss? Can we predict dropping out or successful weight loss
based on the first two entries? A community of blogs devoted to weight loss was
examined (n = 2530). Most bloggers were female, and on average, around 30 years old,
approximately 200 pounds, with a goal weight of about 140 pounds. A sample of blogs
by females that had blogged at least 15 entries within the first 15 weeks of blogging
resulted in a total of 186 blogs, representing over 9,200 entries for analysis.
Computerized text analysis was used to examine language for rates of self-focus,
emotionality, cognitive processing, keeping food diaries, and social support. Rates of blogging were assessed by word counts, number of active weeks, and mean entries per
week. Social support was assessed through the use of social words, the size of the social
network, along with the positivity and negativity of the comments. The discrepancy
between start and goal weight was also assessed. The results suggested that having larger
weight loss goals and blogging about personal events were more effective weight loss
strategies than keeping an online food intake diary. The degree to which bloggers were
socially integrated with the blog community was found to be a potent predictor of weight
loss. Online components of behavioral treatment programs could encourage dieters to
browse and comment on other dieters’ progress, and to share personal narratives rather
than simply focusing on the benefits of food intake diaries, nutrition, and exercise. The
current project points to the power of computerized text analytic tools to address
important theoretical and practical social psychological issues that are evolving on the
internet. Specifically, language analysis methods can identify which dimensions of blogging communities can help or hinder self-change processes.
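The language measures mentioned in this abstract (rates of self-focus, social words and so on) are typically computed as the share of an entry's words that fall into a category. The following is a minimal sketch of that idea only; the word lists, category names and function name are hypothetical and are not the dictionaries or software used in the study.

```python
import re

# Hypothetical mini word lists; real text-analysis dictionaries are far larger.
CATEGORIES = {
    "self_focus": {"i", "me", "my", "mine", "myself"},
    "social": {"friend", "friends", "family", "we", "together"},
}

def category_rates(entry, categories=CATEGORIES):
    """Return each category's share of all words in the entry, as a percentage."""
    words = re.findall(r"[a-z']+", entry.lower())
    total = len(words)
    if total == 0:
        # An empty entry has no words, so every rate is zero.
        return {name: 0.0 for name in categories}
    return {
        name: 100.0 * sum(word in vocab for word in words) / total
        for name, vocab in categories.items()
    }
```

Expressing each category as a percentage of the entry's word count makes long and short blog entries comparable, which matters when rates are later related to an outcome such as weight loss.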
148 |
憂鬱傾向者之微博書寫分析 / Search for Depress Tendency: An Analysis on Chinese Micro-Blog Texts
任喆鸝, Ren, Zhe Li, Unknown Date
透過對十位已確認之憂鬱症患者之微博關係圈進行滾雪球,發現 127憂鬱傾向者,共爬取憂鬱傾向者之微博文本20748則,作為文本分析之數據集,並運用內容分析、質化分析、詞頻分析及詞語共現等多種方法分析文本。
分析結果顯示:(1)透過對文本進行語調、情緒、主題及憂鬱程度的編碼後,我們發現憂鬱傾向者在微博之書寫含62%的負面語調及25.1%的憂鬱文本,其中,負面及憂鬱程度較高的書寫主題是「自我」、「親情」、「自殺」及「睡眠障礙」。(2)深入對「自我」及「親情」憂鬱書寫的質化分析後,發現他們不同於一般人的心理特質,其中,「自我厭惡」及「不被理解」是他們心中最難以釋懷的角落。(3)由於「自殺」、「睡眠障礙」屬於憂鬱人群特徵,經過分析發現透過主題關聯詞的共現詞組有助於辨識憂鬱人群,其中,「睡眠障礙」共現詞的憂鬱文本辨識度達74%,「自殺」共現詞的憂鬱文本辨識度達34%,未來透過機器的方式,可進一步優化該方法,提升憂鬱文本的辨識度。 / This research aims to answer the following questions: (1) What are the characteristics of micro-blog writing by people with depressive tendencies? (2) How can such texts be identified in social media? Ten Weibo users with confirmed depression were chosen as starting points for snowball sampling of their Weibo networks, through which 127 users with depressive tendencies were located. A total of 20,748 messages from this group of users were collected as the dataset. Multiple methods were applied to analyze the texts: content analysis, qualitative text analysis, word frequency analysis and word co-occurrence analysis.
The results indicate that: (1) After coding the texts for tone, mood, theme and degree of depression, we find that the micro-blog writing of people with depressive tendencies contains 62% negative tone and 25.1% depressive text. The writing themes with the highest negativity and degree of depression are "self", "family", "suicide" and "sleep disorder". (2) An in-depth qualitative analysis of depressive writing about "self" and "family" reveals psychological traits that differ from those of the general population; "self-loathing" and "not being understood" are what these users find hardest to let go of. (3) Because "suicide" and "sleep disorder" are characteristic of depressed people, the analysis finds that groups of words co-occurring with theme-related words help to identify depressive text. The identification rate for depressive text is 74% with the "sleep disorder" co-occurrence words and 34% with the "suicide" co-occurrence words. In the future, the method can be further optimized by computational means to raise the identification rate of depressive text.
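The word co-occurrence step described in point (3) can be sketched as counting which words appear in the same message as a theme keyword, and then flagging messages where the keyword appears together with one of those words. This is only an illustration under stated assumptions: messages are taken to be already segmented into tokens (Chinese text would need a word segmenter first), and the function names and the flagging rule are hypothetical rather than the thesis's actual procedure.

```python
from collections import Counter

def cooccurring_words(messages, keyword):
    """Count, per word, the number of messages in which it appears
    together with the keyword. Each message is a list of tokens."""
    counts = Counter()
    for tokens in messages:
        if keyword in tokens:
            counts.update(set(tokens) - {keyword})
    return counts

def flag_messages(messages, keyword, cowords):
    """Flag messages containing the keyword plus at least one co-word."""
    return [keyword in tokens and bool(set(tokens) & cowords) for tokens in messages]
```

Requiring a co-occurring word in addition to the theme keyword is one way to narrow matches to messages that use the keyword in a relevant context; the identification rates reported above come from the thesis, not from this sketch.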
149 |
Barn, familj och klass : en jämförande studie av läseböcker i svenskämnet för årskurs 3, från 1950 till 2009
Andreasson Nielsen, Karin, January 2011
Reading books for the subject of Swedish are something that pupils encounter early in primary school, and have done ever since the 1800s. Since World War II there have been major changes in society. In 1962 a school reform turned the school into a nine-year system of compulsory education. It is therefore interesting to see what reading books contain with regard to notions of children, family and class, and how these have developed from the 1950s until the present time. This study aims to analyze and compare how notions of children, family and class are produced in primary school reading books for pupils in third grade between 1950 and 2009. The research questions were: What are the notions of children, family and class in the reading books? What differences and similarities exist between the reading books from the 1950s and those of recent decades? What could be the reason? I have chosen to apply qualitative text analysis originating in idea analysis, because my aim was to track, analyze and compare the concepts, perceptions and representations in the reading books for third grade. My results show that the children portrayed in the books from 1950-1965 are well mannered. Their relationships with adults are strong, and they learn from their parents, grandparents and relatives. The study also shows that the children portrayed in the reading books from 1976 onward make their own choices and play more prominent roles in the stories. The results show that the nuclear family is seen as an ideal. The study also shows that there are children in the book from 2009 who experience economic inequality in the form of clothes and other belongings. Overall, I find that the middle class is the most strongly represented in the books, but the class markers in the stories are relatively vague.
150 |
Like, share and tag : A comparative study of UNDP Stockholm and UNDP New York’s usage of Facebook as a communication tool / Gilla, dela och tagga : En komparativ studie av UNDP Stockholm och UNDP New Yorks användande av Facebook som kommunikationsverktyg
Petersson, Victor, January 2012
The purpose of this research is to study how the UNDP offices in Stockholm and New York use Facebook to set the agenda regarding the Millennium Development Goals, and how the offices communicate and present the goals to the public. The research is based on publications from the two Facebook groups “millenniemålen – åtta mål för en bättre värld” and “United Nations Development Programme – UNDP” published between November 2011 and April 2012. The publications were categorized and analyzed using content analysis, a method whose categorization of data enables me to compare the two offices' publication rates as well as the number of publications focusing on each MDG. Text analysis of 24 publications allowed me to detect patterns and to analyze the two-way communication taking place. The text analysis provides an understanding of how the organizations work with setting the agenda for the Millennium Development Goals, and of how a relationship is created through communication. The theoretical standpoint of the thesis draws on agenda-setting, strategic communication and Public Relations (PR), as the two offices work on raising public awareness, which requires building a relationship with the audience. This study shows that the two offices communicate the Millennium Development Goals to the public differently, with different results. The New York office interacts with its followers on Facebook, directing readers to engage in the set topics through questions and statements, and creating dialogue between organization and reader as well as between readers. UNDP Stockholm uses Facebook as a gateway to its webpage, where information is presented as a news article with little or no chance for the reader to respond.
The agenda is set through the publications, but the publications lack tools for showing whether the agenda has been embraced by the readers. / Syftet med denna uppsats är att undersöka hur UNDP kontoren i Stockholm och New York använder Facebook för att sätta agendan på millenniemålen, men även visa hur kontoren kommunicerar och presenterar målen till besökarna/följarna av sidorna. Undersökningen är baserad på publikationer gjorda på de två kontorens Facebook sidor “millenniemålen – åtta mål för en bättre värld” och “United Nations Development Programme – UNDP” under tidsperioden november 2011 till april 2012. Publikationerna blev kategoriserade och analyserade med hjälp av innehållsanalys, en metod som genom sin kategoriseringsprocess möjliggör jämförelse mellan de två kontorens sätt att sätta agendan och arbeta med publiken, men även nödvändig information som millenniemål i fokus i enskilda publikationer. Text analys applicerades på 24 publikationer för att mer djupgående förstå hur de två kontoren jobbar för att skapa dialog samt en relation till läsarna genom den satta agendan. Hur UNDP kontoren adresserar läsarna samt överbyggande teman för texterna blev synliga genom denna analys metod. Som teoretisk grund använder jag ”agenda-setting”, strategisk kommunikation samt Public Relations, också refererad till som PR. Båda kontoren har till uppgift att uppmärksamma allmänheten på UNDPs agenda, en uppgift som innebär relationsskapande till sin publik. Resultatet visar att de två kontoren skiljer sig åt när det gäller att kommunicera millenniemålen. New York kontorets interagerande med sin publik genom Facebook, där de skapar dialog kring ett satt ämne mellan organisationen och publiken, men även mellan publiken själv, visar på en fungerande strategi att skapa intresse kring målen. Interaktionen visar även att publiken engagerar sig i de publikationerna som blivit gjorda, med andra ord den agendan som blivit satt.
UNDP Stockholm använder Facebook som en ”gateway” till sin hemsida, där informationen blir presenterad som en nyhetsartikel med få inbjudningar till dialog eller chans att respondera. Även om agendan är tydlig i texterna, finns det inga bevis på att läsarna är mottagliga för den.