311 |
Getting Things in Order: An Introduction to the R package seriation
Hahsler, Michael, Hornik, Kurt, Buchta, Christian January 2007
Seriation, i.e., finding a linear order for a set of objects given data and a loss or merit function, is a basic problem in data analysis. Because of the problem's combinatorial nature, it is hard to solve for all but very small sets. Nevertheless, both exact solution methods and heuristics are available. In this paper we present the package seriation, which provides the infrastructure for seriation with R. The infrastructure comprises data structures to represent linear orders as permutation vectors, a wide array of seriation methods using a consistent interface, a method to calculate the value of various loss and merit functions, and several visualization techniques which build on seriation. To illustrate how easily the package can be applied to a variety of problems, a comprehensive collection of examples is presented. / Series: Research Report Series / Department of Statistics and Mathematics
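As an illustration of why seriation is tractable only for very small sets, the following minimal sketch solves it exactly by trying every permutation. It is written in Python rather than R and does not use or imitate the seriation package's interface; the loss function (sum of dissimilarities between neighbours in the order) and the toy data are assumptions chosen for the example.

```python
from itertools import permutations

def path_length(order, dist):
    """Loss: sum of dissimilarities between neighbours in the linear order."""
    return sum(dist[a][b] for a, b in zip(order, order[1:]))

def seriate_exhaustive(dist):
    """Return a permutation with the smallest path length (exact, O(n!))."""
    n = len(dist)
    return min(permutations(range(n)), key=lambda order: path_length(order, dist))

# Illustrative data: four objects located at positions 0, 3, 1, 2 on a line.
positions = [0, 3, 1, 2]
dist = [[abs(p - q) for q in positions] for p in positions]

best = seriate_exhaustive(dist)
best_loss = path_length(best, dist)
```

The search visits n! permutations, so it stops being practical after roughly ten objects, which is where heuristics become necessary.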
|
312 |
Statistické srovnání výsledků perkutánních, ureteroskopických a robotických operací pro obstrukci ureteropelvické junkce. / Statistical evaluation of percutaneous, ureteroscopic and robotic surgeries for ureteropelvic junction obstruction
Masarovičová, Martina January 2008
The aim of this diploma thesis is to statistically analyze a sample of patients who were hospitalized and treated for ureteropelvic junction obstruction at the urological department of ÚNV Prague in the last 20 years, and to determine the optimal treatment method. Evaluating the surgical techniques from the surgical and economic points of view creates a comprehensive picture of the advantages and disadvantages of each method and helps all parties involved to decide in case of doubt. Statistical analysis is an appropriate instrument here: it helps to find answers, but it also opens room for discussion.
|
313 |
NETWORK AND TOPOLOGICAL ANALYSIS OF SCHOLARLY METADATA: A PLATFORM TO MODEL AND PREDICT COLLABORATION
Lance C Novak (7043189) 15 August 2019
The scale of the scholarly community complicates searches within scholarly databases,
necessitating keywords to index the topics of any given work. As a result, an author’s choice of
keywords affects the visibility of each publication, making the sum of these choices a key
representation of the author’s academic profile. As such, the underlying network of investigators
is often viewed through the lens of their keyword networks. Current keyword networks connect
publications only if they use the exact same keyword, meaning uncontrolled keyword choice
prevents connections despite semantic similarity. Computational understanding of semantic
similarity has already been achieved through the process of word embedding, which transforms
words to numerical vectors with context-correlated values. The resulting vectors preserve semantic
relations and can be analyzed mathematically. Here we develop a model that uses embedded
keywords to construct a network which circumvents the limitations caused by uncontrolled
vocabulary. The model pipeline begins with a set of faculty, whose publications and keywords
are retrieved through the SCOPUS API. These keywords are processed and then embedded. This
work develops a novel method of network construction that leverages the interdisciplinarity of
each publication, resulting in a unique network construction for any given set of publications.
Post-construction, the network is visualized and analyzed with topological data analysis (TDA). TDA is
used to calculate the connectivity and the holes within the network, referred to as the zero and first
homology. These homologies inform how each author connects and where publication data is
sparse. This platform has successfully modelled collaborations within the biomedical department
at Purdue University and provides insight into potential future collaborations.
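The idea of linking keywords by embedding similarity and then counting connected pieces and holes can be sketched as follows. This is only an illustration under stated assumptions, not the thesis's construction: the two-dimensional vectors are invented stand-ins for real word embeddings, the cosine threshold is arbitrary, and the network is treated as a plain graph (no filled triangles), for which the first Betti number is edges minus vertices plus components.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def betti_numbers(vectors, threshold):
    """Betti-0 and Betti-1 of the graph linking keywords with cosine >= threshold."""
    names = list(vectors)
    parent = {name: name for name in names}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = 0
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if cosine(vectors[a], vectors[b]) >= threshold:
                edges += 1
                parent[find(a)] = find(b)
    components = len({find(name) for name in names})
    cycles = edges - len(names) + components  # independent loops in a graph
    return components, cycles

# Hypothetical 2-D "embeddings"; real word embeddings have hundreds of dimensions.
embeddings = {
    "genomics": (1.0, 0.1),
    "proteomics": (0.9, 0.2),
    "imaging": (0.1, 1.0),
    "microscopy": (0.2, 0.9),
}
strict = betti_numbers(embeddings, threshold=0.9)
loose = betti_numbers(embeddings, threshold=0.0)
```

With the strict threshold the keywords fall into two separate clusters, while the loose threshold connects everything and introduces loops; semantically close keywords are linked even though no two strings match exactly.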
|
314 |
[en] THE DEMOCRATIC ELITISM AND DISCOURSES OF THE BRAZILIAN SUPREME COURT / [pt] O ELITISMO DEMOCRÁTICO E DISCURSOS DO STF
SHANDOR TOROK MOREIRA 08 January 2013
[en] How does the Brazilian Supreme Court (BSC) reconstruct the relation
between State and Citizenship in contemporary Brazil, especially with regard to
national democracy? The public discourse produced by the BSC when
deciding certain cases was investigated through the lenses of two theoretical
models of democracy, democratic elitism and participatory publics, in search of
evidence of abuse of discursive power by the Court. Such abuse is characterized
by the influence of the democratic elitism framework and its consequent potential
to reproduce and reinforce an institutional design unable to counteract the
problematic action repertoire of the Brazilian political elite.
|
315 |
Data analysis and visualization of the 360degrees interactional datasets
Lozano Prieto, David January 2019
There is increasing interest in using 360-degree video in medical education, and recent efforts are starting to explore how nursing students experience and interact with such videos. However, once these interactions have been recorded in a database, there are few ways to analyze them. A reliable method is needed that can manage the collected data and visualize the insights it contains. The main goal of this thesis is therefore to design an approach to analyze and visualize this kind of data, allowing teachers in health care education and medical specialists to understand the collected data in a meaningful way. To find the most suitable solution, several meetings with nursing teachers were held to draft the structure of an application that implements the approach. The application was then used to analyze data collected in a study carried out in December. Finally, the application was evaluated through a questionnaire answered by a group of medical specialists involved in education. The initial outcomes of this testing and evaluation indicate that the application achieves the main goals of the project. It has also prompted ideas that should help to improve the 360-degree video experience and its evaluation in nursing education, providing an additional tool to analyze, compare and assess students.
|
316 |
Statistical Consulting at Draper Laboratory
Richard, Noelle M. 27 August 2014
"This Master’s capstone was conducted in conjunction with Draper Laboratory, a non-profit research and development organization in Cambridge, Massachusetts. During a three month period, the author worked for the Microfabrication Department, assisting with projects related to statistics and quality control. The author gained real-world experience in data collection and analysis, and learned a new statistical software. Statistical methods covered in this report include regression analysis, control charts and capability, Gage R & R studies, and basic exploratory data analysis."
|
317 |
Predicting the area of industry : Using machine learning to classify SNI codes based on business descriptions, a degree project at SCB / Att prediktera näringsgrensindelning : Ett examensarbete om tillämpning av maskininlärning för att klassificera SNI-koder utifrån företagsbeskrivningar hos SCB
Dahlqvist-Sjöberg, Philip, Strandlund, Robin January 2019
This study is part of an experimental project at Statistics Sweden, which aims to use natural language processing and machine learning to predict the area-of-industry codes of Swedish businesses from their business descriptions. The response to predict consists of the 30 most frequent of the 88 main groups of Swedish standard industrial classification (SNI) codes, each of which represents a unique area of industry. The business description text was transformed into numerical features with the bag-of-words model. SNI codes are set when companies are founded, and due to the human factor, errors can occur. Using data from the Swedish Companies Registration Office, the purpose is to determine whether gradient boosting can provide high enough classification accuracy to automatically set correct SNI codes where they differ from the registered ones. Today these corrections are made manually. The best gradient boosting model correctly classified 52 percent of the observations, which is not considered high enough to implement automatic code correction in a production environment.
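The bag-of-words step mentioned in the abstract can be sketched in a few lines. This is a minimal illustration, not the thesis's preprocessing: the two English descriptions are invented, tokenization is a plain whitespace split, and the helper names `build_vocabulary` and `bag_of_words` are hypothetical.

```python
from collections import Counter

def build_vocabulary(documents):
    """Map every distinct lower-cased token to a column index, in sorted order."""
    tokens = sorted({tok for doc in documents for tok in doc.lower().split()})
    return {tok: idx for idx, tok in enumerate(tokens)}

def bag_of_words(document, vocabulary):
    """Count vector for one description; tokens outside the vocabulary are ignored."""
    counts = Counter(document.lower().split())
    vector = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = n
    return vector

# Invented business descriptions, for illustration only.
descriptions = [
    "retail sale of bread and pastry",
    "software consulting and software development",
]
vocabulary = build_vocabulary(descriptions)
features = [bag_of_words(d, vocabulary) for d in descriptions]
```

Each row of `features` is a fixed-length count vector that a classifier such as a gradient boosting model could take as input; word order is discarded, which is the defining simplification of the model.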
|
318 |
Integration med ett klick? : En studie av gigekonomins effekt på flyktingars arbetsmarknadsmöjligheter [Integration with one click? A study of the gig economy's effect on refugees' labour market opportunities]
Björk, Agnes, Bizas, Aliki January 2019
Since 2015, when the number of asylum seekers in Sweden reached record levels, the question of how refugees are to integrate into society has been raised. The consensus is that establishment in the labour market is an important factor, and so-called "simple jobs" have therefore been proposed as a way to increase employment among refugees. At the same time, a new phenomenon can be observed in the labour market: the gig economy. Could the simple jobs of the gig economy be the solution? The objective of this paper is to analyse the effect of the gig economy on refugees' labour market outcomes in terms of employment and income. The study uses panel data analysis to isolate the effect of the establishment of gig economy companies on refugees' employment and income. The main estimate of the study shows that the existence of a gig economy in a municipality increases employment among refugees by 5.2 percentage points. However, after multiple robustness tests the effect cannot be considered robust. Therefore, the study cannot show that it is specifically the gig economy that caused this effect.
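One common way to "isolate" an effect in panel data is to remove each unit's own average before estimating a slope (the within, or fixed-effects, estimator). The abstract does not state which specification the study used, so the sketch below is only an assumption-laden illustration: the municipalities, the numbers and the single-regressor model without time effects are all invented.

```python
from collections import defaultdict

def within_estimator(rows):
    """Slope of y on x after removing each unit's own mean (unit fixed effects).

    rows: iterable of (unit, x, y) observations.
    """
    by_unit = defaultdict(list)
    for unit, x, y in rows:
        by_unit[unit].append((x, y))

    num = den = 0.0
    for obs in by_unit.values():
        x_mean = sum(x for x, _ in obs) / len(obs)
        y_mean = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            num += (x - x_mean) * (y - y_mean)
            den += (x - x_mean) ** 2
    return num / den

# Invented panel: (municipality, gig economy present 0/1, employment rate in %).
panel = [
    ("A", 0, 40.0), ("A", 1, 45.0),
    ("B", 0, 50.0), ("B", 1, 55.0),
    ("C", 0, 30.0), ("C", 0, 30.0),  # never exposed, contributes no variation
]
effect = within_estimator(panel)
```

Because level differences between municipalities are subtracted out, only changes within a municipality over time identify the estimate. Robustness tests of the kind the abstract mentions would typically vary this specification, for example by adding time effects or controls.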
|
319 |
Classifying RGB Images with multi-colour Persistent Homology
Byttner, Wolf January 2019
In image classification, pictures of the same type of object can have very different pixel values. Traditional norm-based metrics therefore fail to identify objects in the same category. Topology is a branch of mathematics that studies spaces up to homeomorphism, discarding length. With topology, we can discover patterns in the image that are invariant to rotation, translation and warping. Persistent homology is a new approach in applied topology that studies the presence of continuous regions and holes in an image. It has been used successfully for image segmentation and classification [12]. However, current approaches in image classification require a grayscale image to generate the persistence modules, which means that information encoded in the colour channels is lost. This thesis investigates whether the red, green and blue colour channels of an RGB image hold additional information that could help algorithms classify pictures. We apply two recent methods, one by Adams [2] and the other by Hofer [25], to the CUB-200-2011 birds dataset [40] and find that Hofer’s method produces significant results. Additionally, a modified method based on Hofer’s that uses the RGB colour channels produces significantly better results than the baseline, with over 48 % of images correctly classified compared to 44 %, and with a more significant improvement at lower resolutions. This indicates that colour channels do provide significant new information and that generating one persistence module per colour channel is a viable approach to RGB image classification.
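The claim that grayscale conversion can lose topological information is easy to see on a toy image. The sketch below computes only 0-dimensional sublevel-set persistence (connected regions, not holes) on a pixel grid with 4-connectivity, once per colour channel and once on the channel average. It is not the method of Adams or Hofer cited in the abstract, and the 1×3 "image" is an invented example.

```python
def persistence_0d(channel):
    """0-dimensional sublevel-set persistence of a 2D grid (4-connectivity).

    Returns (birth, death) pairs; the oldest component never dies and is
    reported with death = None.
    """
    rows, cols = len(channel), len(channel[0])
    pixels = sorted((channel[r][c], r, c) for r in range(rows) for c in range(cols))
    parent, birth, pairs = {}, {}, []

    def find(p):
        # Union-find lookup with path halving.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for value, r, c in pixels:  # add pixels from darkest to brightest
        parent[(r, c)] = (r, c)
        birth[(r, c)] = value
        for nb in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if nb not in parent:
                continue
            a, b = find((r, c)), find(nb)
            if a == b:
                continue
            # Elder rule: the younger component (larger birth value) dies here.
            young, old = (a, b) if birth[a] > birth[b] else (b, a)
            if birth[young] < value:  # skip zero-persistence pairs
                pairs.append((birth[young], value))
            parent[young] = old
    roots = {find(p) for p in parent}
    pairs.extend((birth[root], None) for root in roots)
    return pairs

# Invented 1x3 image whose channels cancel out when averaged.
red, green, blue = [[0, 4, 0]], [[4, 0, 4]], [[2, 2, 2]]
gray = [[(r + g + b) / 3 for r, g, b in zip(*rows)] for rows in zip(red, green, blue)]

diagrams = {name: persistence_0d(ch)
            for name, ch in (("red", red), ("green", green), ("blue", blue), ("gray", gray))}
```

Here the red channel contains two dark regions separated by a bright pixel, which shows up as a finite pair, whereas the averaged image is constant and has no such feature. Keeping one diagram per channel preserves that distinction.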
|
320 |
A relação entre índice de sentimento de mercado e as taxas de retorno das ações: uma análise com dados em painel / The relationship between market sentiment index and stock returns: a panel data analysis
Yoshinaga, Claudia Emiko 09 December 2009
In classical finance theory, investor sentiment is not considered an important factor in asset pricing. Although the existence of investor sentiment is not denied, theories usually assume that in competitive financial markets quasi-rational behavior is quickly offset by rational agents. The main goal of this thesis is to investigate the relationship between market sentiment and future stock returns. A methodology is proposed to create a sentiment index specific to the Brazilian market using principal component analysis. To analyze the relationship between this sentiment index and future stock returns, a pricing model including this variable was estimated for the period from 1999 to 2008. The sample consisted of non-financial firms listed on BOVESPA, subject to a minimum trading liquidity that ensured sufficient and representative observations to validate the results. The pricing model was estimated by GMM, considering the sentiment index, systematic risk (market beta) and factors such as firm size, market-to-book ratio, leverage, momentum and revenue growth. Different estimation procedures were applied to obtain coefficient estimates that are less affected by spurious influences such as unobserved heterogeneity, outliers or possible endogeneity of the regressors. The results of the empirical study suggest that sentiment is a relevant factor in asset pricing in the Brazilian market. A negative and statistically significant relationship between the sentiment index and stock returns was found consistently across different model specifications.
These findings suggest a reversal pattern in stock returns: after a period of positive sentiment, the impact on returns in the following period is negative, and vice versa.
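Building an index from the first principal component of several sentiment proxies can be sketched as below. The abstract does not list the proxies used, so the three series here (`turnover`, `ipo_count`, `advance_decline`) and their values are purely hypothetical, and power iteration on the correlation matrix is just one simple way to obtain the leading component.

```python
import math

def standardize(column):
    """Rescale a series to zero mean and unit sample standard deviation."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / (len(column) - 1))
    return [(v - mean) / std for v in column]

def first_principal_component(columns, iterations=200):
    """Loadings and scores of the first principal component (power iteration)."""
    z = [standardize(col) for col in columns]
    n, k = len(z[0]), len(z)
    # Correlation matrix of the proxies.
    corr = [[sum(a * b for a, b in zip(z[i], z[j])) / (n - 1) for j in range(k)]
            for i in range(k)]
    w = [1.0] * k
    for _ in range(iterations):
        w = [sum(corr[i][j] * w[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    # The index for each period is the loading-weighted sum of the proxies.
    index = [sum(w[i] * z[i][t] for i in range(k)) for t in range(n)]
    return w, index

# Hypothetical proxies observed over six periods.
turnover = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ipo_count = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
advance_decline = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]

loadings, sentiment_index = first_principal_component(
    [turnover, ipo_count, advance_decline])
```

The resulting series summarizes the common movement of the proxies in a single variable, which could then enter a pricing regression as a regressor alongside beta and firm characteristics.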
|