91 |
[en] TEXT CATEGORIZATION: CASE STUDY: PATENT'S APPLICATION DOCUMENTS IN PORTUGUESE / [pt] CATEGORIZAÇÃO DE TEXTOS: ESTUDO DE CASO: DOCUMENTOS DE PEDIDOS DE PATENTE NO IDIOMA PORTUGUÊS
NEIDE DE OLIVEIRA GOMES 08 January 2015
Text categorizers built with machine learning techniques now achieve good results, making automatic text categorization viable. This study defines several models for categorizing patent applications written in Portuguese. A committee of six models using a variety of techniques was proposed for this setting. The text base consisted of 1157 abstracts of patent applications filed at INPI by national applicants and distributed across several categories. Among the models proposed for the processing step of text categorization, we highlight the one developed for Method 01, k-Nearest-Neighbor (k-NN), a model also used for patent categorization in English. For the other models, methods were selected that are not the traditional ones in the patent domain. Four models use algorithms in which the categories are represented by centroid vectors. One model explores the High Order Bit (HOB) technique together with the k-NN algorithm, with k set to all training documents. Two techniques were implemented for the pre-processing step: Porter's stemming algorithm and the StemmerPortuguese algorithm, both modified from the original. Pre-processing also included stopword removal and the treatment of compound terms. The indexing step relied mainly on a term-weighting technique, a modified term frequency-inverse document frequency (TF-IDF). The similarity or distance measures used were cosine, Jaccard, DICE, Similarity Measure, and HOB. Results were obtained with the relevance and rank prediction techniques. Among the methods implemented in this work, we highlight the traditional k-NN, which gave good results although it demands considerable computation time.
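The pipeline named in the abstract (TF-IDF weighting, cosine similarity, k-NN voting) can be sketched roughly as follows. This is a minimal illustration using the textbook TF-IDF formula, not the modified weighting, the stemmers, or the committee described in the thesis; the function names and the toy corpus are assumptions made for the example.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (term -> weight) per tokenized document."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log(n / count) for term, count in df.items()}
    vectors = [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]
    return vectors, idf

def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def knn_predict(query_tokens, train_vecs, train_labels, idf, k=3):
    """Label a query by majority vote among its k most similar training documents."""
    # Terms unseen in training get weight 0.0 (hypothetical choice for this sketch).
    query = {t: c * idf.get(t, 0.0) for t, c in Counter(query_tokens).items()}
    ranked = sorted(zip(train_vecs, train_labels),
                    key=lambda pair: cosine(query, pair[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy corpus (illustrative only, not taken from the INPI data set).
train_docs = [
    ["motor", "combustao", "pistao"],
    ["motor", "eletrico", "bateria"],
    ["farmaco", "composto", "molecula"],
    ["composto", "quimico", "molecula"],
]
train_labels = ["mechanics", "mechanics", "chemistry", "chemistry"]
train_vecs, idf = tfidf_vectors(train_docs)
predicted = knn_predict(["molecula", "composto", "novo"], train_vecs, train_labels, idf)
```

Because every query is compared against every training document, the cost grows with the size of the training set, which is consistent with the abstract's remark that k-NN demands considerable computation time.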
|
92 |
Probabilistic Regression using Conditional Generative Adversarial Networks
Oskarsson, Joel January 2020
Regression is a central problem in statistics and machine learning with applications everywhere in science and technology. In probabilistic regression the relationship between a set of features and a real-valued target variable is modelled as a conditional probability distribution. There are cases where this distribution is very complex and not properly captured by simple approximations, such as assuming a normal distribution. This thesis investigates how conditional Generative Adversarial Networks (GANs) can be used to properly capture more complex conditional distributions. GANs have seen great success in generating complex high-dimensional data, but less work has been done on their use for regression problems. This thesis presents experiments to better understand how conditional GANs can be used in probabilistic regression. Different versions of GANs are extended to the conditional case and evaluated on synthetic and real datasets. It is shown that conditional GANs can learn to estimate a wide range of different distributions and be competitive with existing probabilistic regression models.
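To illustrate why a normal approximation can miss a complex conditional distribution, the sketch below draws targets from a toy bimodal distribution and compares the probability mass near the mean with what a fitted Gaussian would predict. The data-generating process, function names, and parameter values are assumptions chosen for the example; no GAN is trained here and nothing is taken from the thesis experiments.

```python
import random
from statistics import NormalDist, fmean, pstdev

def sample_targets(x, n, rng):
    """Draw n targets from a toy bimodal p(y | x): y is near +x or near -x."""
    return [rng.choice((-1.0, 1.0)) * x + rng.gauss(0.0, 0.1) for _ in range(n)]

def mass_near_mean(samples, half_width):
    """Return (empirical, Gaussian-fit) probability mass within half_width of the mean."""
    mu, sigma = fmean(samples), pstdev(samples)
    empirical = sum(abs(y - mu) <= half_width for y in samples) / len(samples)
    fit = NormalDist(mu, sigma)
    gaussian = fit.cdf(mu + half_width) - fit.cdf(mu - half_width)
    return empirical, gaussian

rng = random.Random(0)  # fixed seed so the run is repeatable
ys = sample_targets(x=2.0, n=10_000, rng=rng)
empirical, gaussian = mass_near_mean(ys, half_width=0.5)
```

In this toy setting the fitted Gaussian assigns a substantial share of its mass to the region around the mean, where almost no samples fall. A sampling-based model such as a conditional GAN is not tied to a single-mode shape of this kind.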
|
93 |
The boxing discourse in late Georgian England
Ungar, Ruti 12 November 2012
The study examines the discourse on boxing in English society from circa 1780 to 1820. It shows that this discourse was a site of struggle between diverse notions of gender, class, race, and nation. Boxing was a central arena for the opposition between civic humanism and politeness, and for the struggle between two diametrically opposed ideals of manliness: the strong, corporeal ideal epitomized by the boxers versus the feminine and sensitive polite ideal. Boxing took on an important role in the debates on the place of the working class in the body politic. Conservatives perceived boxing as a counter-revolutionary measure and a way to mobilise the masses in defence of their country without granting them political rights. Radicals viewed it as a tool for empowering the workers, educating them about their rights, and legitimizing their claims for emancipation. Boxing was also a site of struggle between conflicting notions of race and differing ideas of national identity, specifically between one that saw the nation as ethnically homogeneous and another, more cultural understanding of national identity that was more inclusive of minorities.
|
94 |
Development of Sensors and Microcontrollers for Underwater Robots
Jebelli, Ali January 2014
Small autonomous underwater robots are now strongly preferred for remote exploration of unknown and unstructured environments. Such robots allow the exploration and monitoring of underwater environments where a long-term underwater presence is required to cover a large area. Reducing the robot's size, embedding the electrical board inside it, and reducing cost are some of the challenges facing designers of autonomous underwater robots. As a key device for the reliable operation and decision process of autonomous underwater robots, this thesis proposes a relatively fast and cost-effective controller based on fuzzy logic and the proportional-integral-derivative (PID) method. It efficiently models the nonlinear system behaviors that are largely present in robot operation and for which mathematical models are difficult to obtain. To evaluate its response, the fault-finding test approach was applied and the response of each robot task was depicted under different operating conditions. The robot's performance when combining all control programs and including sensors was also investigated as the number of program codes and inputs was increased.
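The proportional-integral-derivative part of such a controller can be sketched as below. This is a generic discrete PID loop driving a toy depth model, assuming hypothetical gains, time step, and plant; it omits the fuzzy-logic layer entirely and is not the controller implemented in the thesis.

```python
class PID:
    """Textbook discrete PID controller with a fixed time step."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None  # no derivative term on the very first update

    def update(self, setpoint, measurement):
        """Return the control signal for the current setpoint and measurement."""
        error = setpoint - measurement
        self.integral += error * self.dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

def simulate_depth(controller, target, steps):
    """Toy plant (an assumption): the depth rate equals the control signal."""
    depth = 0.0
    for _ in range(steps):
        depth += controller.update(target, depth) * controller.dt
    return depth

# Hypothetical gains and time step, chosen only so the toy loop settles.
final_depth = simulate_depth(PID(kp=2.0, ki=0.5, kd=0.1, dt=0.05), target=5.0, steps=2000)
```

A fuzzy layer of the kind the abstract mentions would typically adjust the gains or the output from rule-based membership functions; how the thesis combines the two is not stated here.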
|
95 |
Začleňování fotovoltaických elektráren do elektrizační soustavy / Integration of Photovoltaic Power Plants in the Electricity System
Michl, Pavel January 2010
The thesis discusses the integration of photovoltaic power stations into the electric network. The first part describes the conditions for connecting small sources to the distribution system, including administrative requirements, the feasibility study, and requirements for energy meters, measuring and control devices, switching devices, and protection. The second part describes the issues of the photovoltaic system itself. It covers the generation of solar radiation and the reduction of its intensity before it reaches the earth's surface. The amount of electric power produced depends on the climatic conditions of the given area, the season, and so on. This work also discusses the types of photovoltaic cells and their actual efficiency. Inverters are another important component of the photovoltaic system, and their parameters have a great influence on its overall efficiency. Different methods of connecting the photovoltaic panels to the inverters are presented along with their advantages and disadvantages. The supporting structure of the photovoltaic panels and, where applicable, the transformer are further important components. The work also analyzes the methods of connecting a photovoltaic power station to the low-voltage and medium-voltage distribution network, the storage of electric energy, and the options for selling the energy produced. A large number of connected photovoltaic power stations has negative effects on the electric network. The third part contains the design of a photovoltaic power plant with a capacity of 516.24 kWp on a selected site in southern Bohemia. Project documentation for the site is also prepared. It covers the layout of the photovoltaic panels and the design of the inverters to achieve optimal power loading. This part also contains a calculation of the photovoltaic system losses, the design of the transformer, and the calculation of the cable connection to the distribution system. A feasibility study of connecting the power plant to the distribution system is also conducted. The plant's output will be fed into the distribution point Řípov (110/22 kV). The calculation results show that this photovoltaic power plant can be connected to the distribution system. The final part contains an economic assessment of the plant's operation and a calculation of the payback period. The economic return is influenced by a wide range of values. The operating economics are calculated for several variants. At the feed-in price valid for 2010, the payback period is 7 years without a bank loan and 12 years with one. A sharp reduction of the feed-in price is expected for 2011, and the calculation assumes a decline of 30 %. This would extend the payback period to 11 years without a bank loan or to 22 years with one. The bank loan is assumed to cover 80 % of the investment.
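The payback figures above can be read against a simple payback calculation, sketched below. The abstract does not give the investment, revenue, or cost figures, nor the cash-flow model actually used, so every number here is a placeholder and the function is a hypothetical helper. The sketch only shows one possible reason a 30 % price cut can lengthen payback by more than a factor of 1/0.7: operating costs do not shrink with revenue.

```python
def simple_payback_years(investment, annual_revenue, annual_costs):
    """Years needed for the net annual cash flow to repay the investment."""
    net = annual_revenue - annual_costs
    if net <= 0:
        raise ValueError("net annual cash flow is not positive; no payback")
    return investment / net

# Placeholder values in arbitrary currency units (not from the thesis).
investment = 100.0
revenue = 16.0
operating_costs = 2.0

baseline = simple_payback_years(investment, revenue, operating_costs)
after_cut = simple_payback_years(investment, revenue * 0.7, operating_costs)
ratio = after_cut / baseline  # larger than 1 / 0.7 because costs stay fixed
```

Loan financing would add interest payments to the annual costs, which lengthens the payback period further. The abstract states only that the loan covers 80 % of the investment, so the loan terms are left out of this sketch.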
|