Global ETD Search

11	Automatic Patent Classification Yehe, Nala January 2020 Patents have great research value and also benefit the industrial, commercial, legal, and policymaking communities. Effective analysis of patent literature can reveal important technical details and relationships, explain business trends, suggest novel industrial solutions, and inform crucial investment decisions. Patent documents therefore deserve careful analysis. Patent analysts generally need expertise in several fields, including information retrieval, data processing, text mining, field-specific technology, and business intelligence. In practice, it is difficult to find and train an analyst who meets the requirements of all these disciplines in a relatively short time. Patent classification is also crucial in processing patent applications because it lets people manage and maintain patent texts better and more flexibly. In recent years, the number of patents worldwide has increased dramatically, which makes designing an automatic patent classification system very important. Such a system can replace time-consuming manual classification and give patent analysis managers an effective way to manage patent texts. This thesis designs a patent classification system based on data mining methods and machine learning techniques and uses the KNIME software to conduct a comparative analysis of different machine learning methods and different parts of a patent. The purpose of the thesis is to classify patents automatically using text data processing methods and machine learning techniques. The work has two main parts: data preprocessing and the application of machine learning techniques.
The research questions are: Which part of a patent performs best as input data for automatic classification? And which of the implemented machine learning algorithms performs best at classifying IPC keywords? The thesis uses design science research as its method. It applies the following machine learning techniques on the KNIME platform: decision tree, XGBoost linear, XGBoost tree, SVM, and random forest. The implementation comprises data collection, data preprocessing, feature word extraction, and the application of classification techniques. A patent document consists of many parts, such as the description, abstract, and claims. In this thesis, these three parts are fed to the models separately, and their performance is then compared. Based on the results of these three experiments, we suggest using the description in the classification system because it shows the best performance in English patent text classification. The abstract can serve as an auxiliary basis for classification. However, classification based on the claims, as proposed by some scholars, did not achieve good performance in our research. In addition, the BoW and TF-IDF methods can be used together to extract feature words efficiently. We also found that the SVM and XGBoost techniques perform better than the other techniques in the automatic patent classification system. Engineering and Technology, Electrical Engineering and Electronics
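The combination of BoW and TF-IDF mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis workflow, which was built in KNIME; it is a plain-Python illustration that assumes a crude alphabetic tokenizer, the unsmoothed weighting tf × ln(N/df), and three invented one-sentence "patent" texts. The function names `tokenize`, `bag_of_words`, and `tfidf` are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Crude tokenizer (an assumption): lowercase, keep alphabetic runs only.
    cleaned = "".join(c.lower() if c.isalpha() else " " for c in text)
    return cleaned.split()


def bag_of_words(docs):
    # BoW step: one term-count table per document.
    return [Counter(tokenize(d)) for d in docs]


def tfidf(bows):
    # TF-IDF step: relative term frequency times ln(N / document frequency).
    n_docs = len(bows)
    doc_freq = Counter()
    for bow in bows:
        doc_freq.update(bow.keys())  # each document counts once per term
    weighted = []
    for bow in bows:
        total = sum(bow.values())
        weighted.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in bow.items()
        })
    return weighted


# Invented sample texts, not taken from the thesis data.
docs = [
    "A battery cell with a lithium anode and a polymer separator.",
    "A method for routing packets in a wireless network.",
    "A battery charger circuit with a wireless network interface.",
]
bows = bag_of_words(docs)
weights = tfidf(bows)
```

Terms that occur in every document, such as "a", receive a weight of zero, while terms confined to one document, such as "lithium", are weighted highest. The resulting weight dictionaries could then be turned into feature vectors for a classifier such as an SVM.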
12	[pt] DIFERENCIAÇÕES DE GÊNERO NA CARACTERIZAÇÃO DE PERSONAGENS: UMA PROPOSTA METODOLÓGICA E PRIMEIROS RESULTADOS / [en] GENDER REPRESENTATIONS ON CHARACTERS DESCRIPTION: A METHODOLOGICAL PROPOSAL AND EARLY RESULTS FLAVIA MARTINS DA ROSA P DA SILVA 10 August 2021 [en] This work presents a methodology that combines quantitative, distant-reading data with detailed close reading in discourse analysis, offering new views of the data and diverse perspectives of analysis. The methodology uses resources common in corpus-based linguistics, such as frequency lists, preferences, categorization, and the reading of concordance lines.
Its application is demonstrated on public-domain works of Brazilian literature compiled in a corpus of approximately 5 million words, semantically and morphosyntactically annotated, using computational tools that enable searches based on lexical-syntactic patterns of the Portuguese language. The purpose is to identify how male and female characters are portrayed in those texts, building a general view of how women and men are constructed through language. The study proceeds on two fronts: observing the predicates used to describe the characters and the actions the characters perform, distinguishing male from female characters, comparing them, and analyzing the differences critically. [en] BRAZILIAN LITERATURE [en] GENDER ROLES [en] TEXT DATA MINING [en] DIGITAL HUMANITIES [en] DISCOURSE ANALYSIS [en] CORPUS LINGUISTICS
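Two of the corpus resources named in this abstract, frequency lists and concordance lines, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample sentence is invented rather than drawn from the corpus, the tokenizer is a simple regular expression that ignores the semantic and morphosyntactic annotation the study relies on, and the names `tokenize`, `frequency_list`, and `concordance` are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    # \w+ matches Unicode letters in Python 3, so accented words stay whole.
    return re.findall(r"\w+", text.lower())


def frequency_list(tokens):
    # Frequency list: (word, count) pairs, most frequent first.
    return Counter(tokens).most_common()


def concordance(tokens, keyword, width=3):
    # Concordance lines: each hit with up to `width` tokens of context per side.
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


# Invented sample sentence, not taken from the corpus.
text = "A moça era bonita e tímida. O moço era forte e calado."
tokens = tokenize(text)
for left, key, right in concordance(tokens, "era", width=2):
    print(f"{left:>10} [{key}] {right}")
```

Reading the context to the right of a copular verb such as "era" is one simple way to surface the predicates attached to each character, which can then be compared across male and female characters.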
