  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

Development of a data-driven marketing strategy for an online pharmacy

Holmér, Gelaye Worku, Gamage, Ishara H. January 2022
The term electronic commerce (e-commerce) refers to a business model that allows companies and individuals to buy and sell goods and services over the internet. This thesis focuses on online pharmacies, a segment of the e-commerce market. Although internet pharmacies remain subject to the same stringent rules imposed on all pharmacies, which limit the scope for market growth, the segment has grown notably in the past decades. The main goal of this thesis is to develop a data-driven marketing strategy based on the daily sales data of a Sweden-based online pharmacy. The data analysis comprises exploratory data analysis (EDA) and market basket analysis (MBA) using the Apriori algorithm, together with the application of marketing frameworks and theories from a data-driven standpoint. In addition, the thesis proposes a conceptual framework for a digital marketing strategy based on the RACE framework (reach, act, convert, and engage). The analysis led to the following data-driven marketing strategy: special attention should be paid to association rules with a high lift ratio; products in a high gross profit margin percentile (GPMP) should have a volume-based marketing strategy that focuses on lower prices on subsequent items; and price bundling is the best marketing strategy for low-GPMP products. Practical ideas mentioned in the thesis include optimizing keyword search for high-GPMP product types and sending reminder emails and push alerts to avoid cart abandonment. Online pharmacies can use the findings and recommendations to extract knowledge that supports decisions ranging from raising overall order size and planning marketing campaigns to increasing the sales of products with a high gross profit margin.
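The lift ratio mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's analysis: it scores item pairs only (the full Apriori algorithm searches larger itemsets level by level), and the function name `pair_rules`, the `min_support` threshold and the baskets are hypothetical.

```python
from itertools import combinations


def pair_rules(baskets, min_support=0.2):
    """Score pairwise association rules x -> y by support, confidence and lift.

    Pairs only: this is a simplification of Apriori, which would go on to
    build larger itemsets from the frequent pairs.
    """
    n = len(baskets)
    sets = [set(b) for b in baskets]
    items = sorted(set().union(*sets))
    # Support of a single item: share of baskets that contain it.
    support = {i: sum(i in s for s in sets) / n for i in items}
    rules = []
    for a, b in combinations(items, 2):
        s_ab = sum(a in s and b in s for s in sets) / n
        if s_ab < min_support:
            continue  # infrequent pair, skipped as Apriori would prune it
        for x, y in ((a, b), (b, a)):
            confidence = s_ab / support[x]   # estimate of P(y | x)
            lift = confidence / support[y]   # > 1 means bought together more than by chance
            rules.append((x, y, s_ab, confidence, lift))
    # Highest lift first
    return sorted(rules, key=lambda r: r[4], reverse=True)


# Hypothetical baskets, not data from the thesis
baskets = [
    ["painkiller", "vitamin_c"],
    ["painkiller", "vitamin_c", "plaster"],
    ["sunscreen", "plaster"],
    ["painkiller", "vitamin_c"],
    ["sunscreen"],
]
for x, y, s, c, l in pair_rules(baskets):
    print(f"{x} -> {y}: support={s:.2f} confidence={c:.2f} lift={l:.2f}")
```

A lift above 1 marks items bought together more often than their individual popularity would predict, which is the kind of rule the abstract recommends prioritizing.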
12

Graph-based Modern Nonparametrics For High-dimensional Data

Wang, Kaijun January 2019
Developing nonparametric statistical methods and inference procedures for large, high-dimensional data has been a challenging frontier problem in statistics. In recent years, a radically different viewpoint on this problem has clearly been on the rise: "graph-based nonparametrics," which is the main research focus of this dissertation. The basic idea consists of two steps: (i) a representation step, which encodes the given data as graphs, and (ii) an analysis step, which applies statistical methods to the graph-transformed problem to systematically tackle various types of data structures. Under this general framework, this dissertation develops two major research directions. Chapter 2, based on Mukhopadhyay and Wang (2019a), introduces a new nonparametric method for the high-dimensional k-sample comparison problem that is distribution-free, robust, and continues to work even when the dimension of the data is larger than the sample size. The proposed theory is based on modern LP-nonparametric tools and previously unexplored connections with spectral graph theory. The key is to construct a specially designed weighted graph from the data and to reformulate the k-sample problem as a community detection problem. The procedure is shown to possess various desirable properties, along with a characteristic exploratory flavor that has practical consequences. The numerical examples show surprisingly good performance of our method under a broad range of realistic situations. Chapter 3, based on Mukhopadhyay and Wang (2019b), revisits some foundational questions about network modeling that are still unsolved. In particular, we present a unified statistical theory of the fundamental spectral graph methods (e.g., Laplacian, modularity, diffusion map, regularized Laplacian, and the Google PageRank model), which are often viewed as spectral heuristics that work empirically for unexplained reasons. Despite half a century of research, this question has remained one of the most formidable open issues, if not the core problem, in modern network science. Our approach integrates modern nonparametric statistics, mathematical approximation theory (of integral equations), and computational harmonic analysis in a novel way to develop a theory that unifies and generalizes the existing paradigm. From a practical standpoint, it is shown that this perspective can provide adequate guidance for designing next-generation computational tools for large-scale problems. As an example, we describe the high-dimensional change-point detection problem. Chapter 4 discusses further extensions and applications of our methodologies to regularized spectral clustering and spatial graph regression problems. The dissertation concludes with a discussion of two important areas for future study. / Statistics
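The two-step idea in the abstract (encode the data as a weighted graph, then analyse the graph) can be sketched generically. This is not the LP-graph construction of the dissertation: the Gaussian-kernel weights, the function name `fiedler_labels`, and the toy one-dimensional sample are assumptions chosen only to show how a Laplacian eigenvector can split pooled observations into two communities.

```python
import math


def fiedler_labels(points, iters=300):
    """Split pooled 1-D observations into two communities via the graph Laplacian.

    Representation step: Gaussian-kernel weighted graph (an assumed similarity).
    Analysis step: sign of the Fiedler vector, i.e. the Laplacian eigenvector
    belonging to the second-smallest eigenvalue.
    """
    n = len(points)
    w = [[0.0 if i == j else math.exp(-(points[i] - points[j]) ** 2)
          for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in w]
    c = 2.0 * max(deg)  # upper bound on the largest eigenvalue of L = D - W

    # Power iteration on (c*I - L), kept orthogonal to the constant vector,
    # converges to the Fiedler vector of L.
    v = [float(i) for i in range(n)]
    for _ in range(iters):
        mean = sum(v) / n
        v = [x - mean for x in v]  # remove the constant component
        v = [(c - deg[i]) * v[i] + sum(w[i][j] * v[j] for j in range(n))
             for i in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]

    mean = sum(v) / n
    return [1 if x - mean > 0 else 0 for x in v]


# Toy pooled sample: four observations near 0 and four near 5 (hypothetical)
pooled = [0.0, 0.1, 0.2, 0.3, 5.0, 5.1, 5.2, 5.3]
print(fiedler_labels(pooled))
```

If the detected communities line up with the original sample labels, the samples look different; a formal k-sample test would still need a reference distribution, which this sketch does not provide.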
13

Revisão taxonômica das espécies brasileiras de Alternanthera Forssk (Amaranthaceae Juss.) / Taxonomic revision of the Brazilian species of Alternanthera Forssk (Amaranthaceae Juss.)

Souza, Luisa Ramos Senna 10 January 2015
Amaranthaceae is a family of about 180 genera and 2,500 species, divided into 8 subfamilies and distributed in tropical and temperate areas of both hemispheres. It represents the most species-rich lineage within the Caryophyllales. In Brazil, 158 species in 27 genera are recognized, three of the genera being endemic, and most of the genera belong to Gomphrenoideae. This subfamily includes the two largest genera represented in Brazil, Gomphrena with 45 species and Alternanthera with 36 species. Alternanthera is a monophyletic group of about 100 species with a pantropical distribution, characterized by sessile or pedunculate axillary inflorescences in which the partial inflorescence unit is reduced to a single flower. The flowers are sessile or pedicellate, protected by a bract and two bracteoles, bisexual, with (4-)5 tepals and (4-)5 stamens whose filaments are fused into a basal tube, free above the tube and alternating with pseudostaminodes. The latest revision of Alternanthera for Brazil was prepared for the Flora Brasiliensis more than 160 years ago, which fully justifies a new revision, the principal objective of this thesis. The thesis is divided into four chapters, following an introduction.
Chapter 1 presents a morphological study of the Alternanthera species of Brazil based on 107 characters, of which 99 were considered informative and were evaluated using the DELTA program. The result is a description of the vegetative and floral organs of Alternanthera that highlights those most important for the taxonomy of the group, together with a discussion of the terminology used for its morphology. Chapter 2 studies the Alternanthera brasiliana complex of five species using exploratory statistical analysis. Only four species were recognized in the complex, with a proposal to synonymize Alternanthera ramosissima under A. brasiliana. Chapter 3 contains the text of a paper on the new species Alternanthera catingae, submitted for publication in Phytotaxa. Chapter 4 is a revision of the Brazilian species of Alternanthera, based on the study of more than 1,900 specimens deposited in 19 herbaria and using the typological species concept. In addition to the traditional revisionary methodology, other analytical tools were used to evaluate the species occurring in Brazil. We recognized 35 species, one of which, A. catingae, is new to science, synonymized six species whose types are from Brazil, and made one new combination. The work is an important contribution to the study of the Brazilian Amaranthaceae, especially toward the goal of putting the Flora of Brazil online in 2020.
14

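A influência da atmosfera criada pelo ambiente do ponto de venda sobre o comportamento de compra: uma pesquisa exploratória sobre a pinkbiju unidade metrô paulista / The influence of the atmosphere created by the point-of-sale environment on purchase behavior: an exploratory study of the Pinkbiju store at the Paulista metro station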

Santos, Thiago Alves dos 16 September 2014
This study investigated how the variables that make up a store's layout influence consumer purchase behavior, considering the point of sale under study, its location, and its market setting: the retail of costume jewelry and accessories. To that end, it combined bibliographic research on purchase behavior and the point of sale with an exploratory environmental analysis of a specific store, applying an attitudinal instrument during data collection in order to draw a parallel between the variables that make up the store layout and consumer purchase behavior. It was concluded that all the variables that constitute the layout influence consumer purchase behavior, among them the colors, the lighting, and the visual and physical accessibility of the product.
15

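Contribuições da pesquisa operacional para gestão da produção e operações: uma análise exploratória da literatura / Contributions of operations research to production and operations management: an exploratory analysis of the literature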

Oliveira, Felipe Fernandes de 10 May 2011
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / This research investigates how the use of Operations Research (OR) tools as a support for decision making in Production and Operations Management (POM) evolved over three decades (the 1980s, 1990s and 2000s). Hypothesis tests were performed to verify the proportional growth of a given OR area over the decades to the detriment of the areas of facility layout, capacity planning, production scheduling and inventory management. Six journals were selected, and more than 800 of their articles were classified and analyzed as the basis for the exploratory literature review. Possible directions for future research are also discussed, and comparisons are made with other literature reviews. As a result, the areas of heuristics and simulation were found to have the largest number of contributions in all the POM areas studied.
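The abstract does not say which hypothesis test was used. One plausible choice for checking whether an area's share of articles grew between two decades is a one-sided two-proportion z-test, sketched below. The function name `two_proportion_z` and all counts are hypothetical, not figures from the dissertation.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """One-sided z-test of H1: p2 > p1 using the pooled standard error.

    x1 of n1 articles fall in the area in the earlier decade,
    x2 of n2 in the later decade.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    # Upper-tail probability of the standard normal distribution
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 30 of 200 articles in the 1980s, 60 of 250 in the 2000s
z, p = two_proportion_z(30, 200, 60, 250)
print(f"z = {z:.2f}, one-sided p = {p:.4f}")
```

A small p-value would support the claim that the area's share grew; with several areas and decades tested at once, a correction for multiple comparisons would also be advisable.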
16

GrAPHiST : Une approche d’analyse exploratoire pour l’identification des dynamiques des phénomènes spatio-temporels. / GrAPHiST: An exploratory analysis approach for the identification of dynamics of spatio-temporal phenomena.

Gautier, Jacques 02 October 2018
Datasets that describe spatio-temporal phenomena are becoming ever more numerous. These new data can be very different from those usually observed when studying such phenomena. Analysing them through a hypothetico-deductive approach, as is mostly done in statistics and GIS, can overlook unsuspected but relevant information about the dynamics of these spatio-temporal phenomena.
It can therefore be worthwhile simply to display the data, to observe what they have to show, before analysing them. This is the principle of exploratory data analysis: the user is allowed to explore the data freely through visual representations in order to highlight unsuspected structures or relationships. Today, exploratory analysis is possible notably through visualization environments that integrate different interactive graphic and cartographic representations.
Visualization environments are mostly developed in an ad hoc manner, within a particular thematic field. However, the constant emergence of new data calls for analysis methods that can be applied to phenomena of different kinds. Depending on the problem to which the phenomena belong, the analysis focuses on different dynamics. Analysing a meteorological phenomenon for forecasting purposes implies a focus on the cyclic recurrences of the phenomenon. Analysing the evolution of a population in order to set public policies implies analysing the phenomenon over the long term and across different spatial areas.
Our objective is to propose a method for the exploratory analysis of spatio-temporal phenomena and their dynamics that is independent of the topic. To this end, we propose a geovisualization environment, GrAPHiST (Géovisualisation pour l'Analyse des PHenomenes Spatio-Temporels; geovisualization for the analysis of spatio-temporal phenomena), which allows the analysis of different dynamics at different spatial and temporal (linear or cyclic) scales. Developing this environment requires examining how change in space is modelled, the nature of the spatio-temporal dynamics to be studied, and the visual and interactive tools that allow them to be identified.
The contributions of our research are therefore at several levels:
- a generic model of spatio-temporal phenomena in the form of event series;
- new graphical and interactive representation methods that support searching for and identifying spatio-temporal dynamics, notably: interactive temporal diagrams for visually searching for cyclic recurrences in spatio-temporal data; symbology rules for visualizing the relationships between the temporal and spatial components of phenomena; and new methods for representing aggregates of nearby events, which make it possible to identify structures in their spatio-temporal distribution;
- the formalization of an exploratory analysis approach for spatio-temporal dynamics, broken down into several scenarios according to the purpose of the analysis.
We validate our approach by applying it to the analysis of several datasets. The objective is to verify that dynamics related to linear or cyclic time can be identified with GrAPHiST, and to illustrate the generic character of the approach as well as the analysis opportunities offered by the environment.
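The distinction between linear and cyclic time scales can be made concrete with a small sketch. It does not reproduce GrAPHiST: the event tuples, coordinates and function names are hypothetical, and the aggregation is the simplest one that shows how the same event series yields two different views.

```python
from collections import Counter
from datetime import datetime

# An event series modelled as (timestamp, longitude, latitude) tuples.
# The values are made up for illustration.
events = [
    (datetime(2015, 1, 12), 2.35, 48.85),
    (datetime(2015, 7, 3), 2.40, 48.80),
    (datetime(2016, 1, 20), 2.30, 48.90),
    (datetime(2017, 1, 5), 2.36, 48.86),
    (datetime(2017, 8, 14), 2.41, 48.79),
]


def linear_bins(series):
    """Linear time: count events per calendar year, keeping years distinct."""
    return Counter(t.year for t, _, _ in series)


def cyclic_bins(series):
    """Cyclic time: count events per month of the year, pooling all years."""
    return Counter(t.month for t, _, _ in series)


print("per year :", dict(linear_bins(events)))
print("per month:", dict(cyclic_bins(events)))
```

In the linear view the events look evenly spread, while the cyclic view shows a recurrence in January, the kind of pattern that the interactive temporal diagrams are meant to surface visually.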
17

Machine Learning and Multivariate Statistical Tools for Football Analytics

Malagón Selma, María del Pilar 05 October 2023
[ES] Esta tesis doctoral se centra en el estudio, implementación y aplicación de técnicas de aprendizaje automático y estadística multivariante en el emergente campo de la analítica deportiva, concretamente en el fútbol. Se aplican procedimientos comunmente utilizados y métodos nuevos para resolver cuestiones de investigación en diferentes áreas del análisis del fútbol, tanto en el ámbito del rendimiento deportivo como en el económico. Las metodologías empleadas en esta tesis enriquecen las técnicas utilizadas hasta el momento para obtener una visión global del comportamiento de los equipos de fútbol y pretenden ayudar al proceso de toma de decisiones. Además, la metodología se ha implementado utilizando el software estadístico libre R y datos abiertos, lo que permite la replicabilidad de los resultados. Esta tesis doctoral pretende contribuir a la comprensión de los modelos de aprendizaje automático y estadística multivariante para la predicción analítica deportiva, comparando su capacidad predictiva y estudiando las variables que más influyen en los resultados predictivos de estos modelos. Así, siendo el fútbol un juego de azar donde la suerte juega un papel importante, se proponen metodologías que ayuden a estudiar, comprender y modelizar la parte objetiva de este deporte. Esta tesis se estructura en cinco bloques, diferenciando cada uno en función de la base de datos utilizada para alcanzar los objetivos propuestos. El primer bloque describe las áreas de estudio más comunes en la analítica del fútbol y las clasifica en función de los datos utilizados. Esta parte contiene un estudio exhaustivo del estado del arte de la analítica del fútbol. Así, se recopila parte de la literatura existente en función de los objetivos alcanzados, conjuntamente con una revisión de los métodos estadísticos aplicados. Estos modelos son los pilares sobre los que se sustentan los nuevos procedimientos aquí propuestos. 
El segundo bloque consta de dos capítulos que estudian el comportamiento de los equipos que alcanzan la Liga de Campeones o la Europa League, descienden a segunda división o permanecen en mitad de la tabla. Se proponen varias técnicas de aprendizaje automático y estadística multivariante para predecir la posición de los equipos a final de temporada. Una vez realizada la predicción, se selecciona el modelo con mejor precisión predictiva para estudiar las acciones de juego que más discriminan entre posiciones. Además, se analizan las ventajas de las técnicas propuestas frente a los métodos clásicos utilizados hasta el momento. El tercer bloque consta de un único capítulo en el que se desarrolla un código de web scraping para facilitar la recuperación de una nueva base de datos con información cuantitativa de las acciones de juego realizadas a lo largo del tiempo en los partidos de fútbol. Este bloque se centra en la predicción de los resultados de los partidos (victoria, empate o derrota) y propone la combinación de una técnica de aprendizaje automático, random forest, y la regresión Skellam, un método clásico utilizado habitualmente para predecir la diferencia de goles en el fútbol. Por último, se compara la precisión predictiva de los métodos clásicos utilizados hasta ahora con los métodos multivariantes propuestos. El cuarto bloque también comprende un único capítulo y pertenece al área económica del fútbol. En este capítulo se aplica un novedoso procedimiento para desarrollar indicadores que ayuden a predecir los precios de traspaso. En concreto, se muestra la importancia de la popularidad a la hora de calcular el valor de mercado de los jugadores, por lo que este capítulo propone una nueva metodología para la recogida de información sobre la popularidad de los jugadores. En el quinto bloque se revelan los aspectos más relevantes de esta tesis para la investigación y la analítica en el fútbol, incluyendo futuras líneas de trabajo. 
/ [CA] Aquesta tesi doctoral se centra en l'estudi, implementació i aplicació de tècniques d'aprenentatge automàtic i estadística multivariant en l'emergent camp de l'analítica esportiva, concretament en el futbol. S'apliquen procediments comunament utilitzats i mètodes nous per a resoldre qu¿estions d'investigació en diferents àrees de l'anàlisi del futbol, tant en l'àmbit del rendiment esportiu com en l'econòmic. Les metodologies emprades en aquesta tesi enriqueixen les tècniques utilitzades fins al moment per a obtindre una visió global del comportament dels equips de futbol i pretenen ajudar al procés de presa de decisions. A més, la metodologia s'ha implementat utilitzant el programari estadístic lliure R i dades obertes, la qual cosa permet la replicabilitat dels resultats. Aquesta tesi doctoral pretén contribuir a la comprensió dels models d'aprenentatge automàtic i estadística multivariant per a la predicció analítica esportiva, comparant la seua capacitat predictiva i estudiant les variables que més influeixen en els resultats predictius d'aquests models. Així, sent el futbol un joc d'atzar on la sort juga un paper important, es proposen metodologies que ajuden a estudiar, comprendre i modelitzar la part objectiva d'aquest esport. Aquesta tesi s'estructura en cinc blocs, diferenciant cadascun en funció de la base de dades utilitzada per a aconseguir els objectius proposats. El primer bloc descriu les àrees d'estudi més comuns en l'analítica del futbol i les classifica en funció de les dades utilitzades. Aquesta part conté un estudi exhaustiu de l'estat de l'art de l'analítica del futbol. Així, es recopila part de la literatura existent en funció dels objectius aconseguits, conjuntament amb una revisió dels mètodes estadístics aplicats. Aquests models són els pilars sobre els quals se sustenten els nous procediments ací proposats. 
El segon bloc consta de dos capítols que estudien el comportament dels equips que aconsegueixen la Lliga de Campions o l'Europa League, descendeixen a segona divisió o romanen a la meitat de la taula. Es proposen diverses tècniques d'aprenentatge automàtic i estadística multivariant per a predir la posició dels equips a final de temporada. Una vegada realitzada la predicció, se selecciona el model amb millor precisió predictiva per a estudiar les accions de joc que més discriminen entre posicions. A més, s'analitzen els avantatges de les tècniques proposades enfront dels mètodes clàssics utilitzats fins al moment. El tercer bloc consta d'un únic capítol en el qual es desenvolupa un codi de web scraping per a facilitar la recuperació d'una nova base de dades amb informació quantitativa de les accions de joc realitzades al llarg del temps en els partits de futbol. Aquest bloc se centra en la predicció dels resultats dels partits (victòria, empat o derrota) i proposa la combinació d'una tècnica d'aprenentatge automàtic, random forest, i la regressió Skellam, un mètode clàssic utilitzat habitualment per a predir la diferència de gols en el futbol. Finalment, es compara la precisió predictiva dels mètodes clàssics utilitzats fins ara amb els mètodes multivariants proposats. El quart bloc també comprén un únic capítol i pertany a l'àrea econòmica del futbol. En aquest capítol s'aplica un nou procediment per a desenvolupar indicadors que ajuden a predir els preus de traspàs. En concret, es mostra la importància de la popularitat a l'hora de calcular el valor de mercat dels jugadors, per la qual cosa aquest capítol proposa una nova metodologia per a la recollida d'informació sobre la popularitat dels jugadors. En el cinqué bloc es revelen els aspectes més rellevants d'aquesta tesi per a la investigació i l'analítica en el futbol, incloent-hi futures línies de treball. 
/ [EN] This doctoral thesis focuses on studying, implementing, and applying machine learning and multivariate statistics techniques in the emerging field of sports analytics, specifically in football. Commonly used procedures and new methods are applied to answer research questions in different areas of football analytics, covering both sports performance and economics. The methodologies used in this thesis enrich the techniques used so far to obtain a global view of the behaviour of football teams and are intended to support decision-making. In addition, the methodology was implemented using the free statistical software R and open data, which makes the results reproducible. This doctoral thesis aims to contribute to the understanding of the behaviour of machine learning and multivariate models for sports prediction, comparing their predictive capacity and studying the variables that most influence their predictions. Since football is a game in which luck plays an important role, this document proposes methodologies that help to study, understand, and model the objective part of the sport. The thesis is structured into five blocks, distinguished by the database used to achieve the proposed objectives. The first block describes the most common study areas in football analytics and classifies them according to the available data. This part contains an exhaustive study of the state of the art in football analytics: part of the existing literature is compiled according to the objectives achieved, with a review of the statistical methods applied. These methods are the pillars on which the new procedures proposed here are based. The second block consists of two chapters that study the behaviour of teams with respect to their ranking at the end of the season: top (qualifying for the Champions League or Europa League), middle, or bottom (relegated to a lower division). 
Several machine learning and multivariate statistical techniques are proposed to predict the teams' position at the end of the season. Once the prediction has been made, the model with the best predictive accuracy is selected to study the game actions that discriminate most between positions. In addition, the advantages of the proposed techniques over the classical methods used so far are analysed. The third block consists of a single chapter in which web scraping code is developed to facilitate the retrieval of a new database with quantitative information on the game actions carried out over time in football matches. This block focuses on predicting match outcomes (win, draw, or loss) and proposes combining a machine learning technique, random forest, with the Skellam regression model, a classical method commonly used to predict goal difference in football. Finally, the predictive accuracy of the classical methods used so far is compared with that of the proposed multivariate methods. The fourth block also comprises a single chapter and pertains to the economic side of football. This chapter applies a novel procedure to develop indicators that help predict transfer fees. Specifically, it shows the importance of popularity when calculating players' market value, and it therefore proposes a new methodology for collecting information on players' popularity. The fifth block presents the aspects of this thesis most relevant to research and football analytics, including future lines of work. / Malagón Selma, MDP. (2023). Machine Learning and Multivariate Statistical Tools for Football Analytics [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/197630
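
The Skellam distribution mentioned in the abstract above is the distribution of the difference between two independent Poisson counts, which is why it suits goal difference. The sketch below is only an illustration of that idea, not the thesis's model: it assumes fixed, hypothetical scoring rates (`home_rate`, `away_rate`) instead of rates estimated by regression or random forest, and it sums the joint Poisson probabilities directly rather than calling a Skellam routine.

```python
import math


def poisson_pmf(k: int, rate: float) -> float:
    """Probability of exactly k goals for a Poisson scoring rate."""
    return math.exp(-rate) * rate**k / math.factorial(k)


def outcome_probabilities(home_rate: float, away_rate: float, max_goals: int = 25):
    """Return (win, draw, loss) probabilities for the home team.

    The goal difference of two independent Poisson counts follows a
    Skellam distribution; here its mass is accumulated by summing the
    joint probabilities, truncated at max_goals per team.
    """
    home = [poisson_pmf(k, home_rate) for k in range(max_goals + 1)]
    away = [poisson_pmf(k, away_rate) for k in range(max_goals + 1)]
    win = draw = loss = 0.0
    for h, p_home in enumerate(home):
        for a, p_away in enumerate(away):
            p = p_home * p_away
            if h > a:
                win += p
            elif h == a:
                draw += p
            else:
                loss += p
    return win, draw, loss


# Hypothetical scoring rates, chosen only for illustration.
print(outcome_probabilities(home_rate=1.6, away_rate=1.1))
```

In a regression setting, the two rates would depend on match covariates; the fixed values here only show how win, draw, and loss probabilities follow from a pair of rates.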
18

Detection of driver sleepiness during daylight and darkness

Eklind, Johanna, Meyerson, Amanda January 2023 (has links)
Driver sleepiness is a serious problem worldwide. It is therefore of interest to develop reliable sleepiness detection systems for vehicles, and such systems can use both physiological data and driver performance data. Driver sleepiness has many possible causes; one factor of interest is the light condition of the environment, specifically daylight and darkness. Daylight and darkness have been shown to affect human sleepiness in general, so it is important to investigate their effect on driver sleepiness independently of other factors. This thesis aimed to investigate whether light condition is a parameter that should be considered when developing an in-vehicle sleepiness detection system. This was done by investigating whether the course of sleepiness was affected by daylight and darkness, and whether adding light condition as a parameter to a classification model improved sleepiness classification performance. The study was based on data collected from driving simulator tests conducted by the Swedish National Road and Transport Research Institute (VTI). Test subjects drove in simulated daylight and darkness, both during daytime while rested and during nighttime while sleep-deprived. An exploratory and statistical analysis was conducted of several sleepiness indicators extracted from physiological data and simulator data. Three different classification models were implemented. The indicators pointed to a higher level of driver sleepiness at night than during the day, as well as an increase with time on task. However, no clear trends indicated that daylight and darkness had affected the driver's sleepiness. The classification models showed a marginal improvement when light condition was included as a feature, but it was not large enough to support any specific conclusion about the effect. 
The conclusion was that no effect of daylight and darkness on the course of driver sleepiness could be seen in this thesis. Adding light condition as a feature did not significantly improve the classification models' performance. In summary, further investigation of the effect of daylight and darkness on driver sleepiness is needed.
19

O professor de matemática e a análise exploratória de dados no ensino médio

Cardoso, Ricardo 24 May 2007 (has links)
Secretaria da Educação do Estado de São Paulo / Statistics has stood out lately for its usefulness in almost every area of human knowledge. Existing research and dissertations on the subject suggest the need to deepen our knowledge of the difficulties in the teaching of this discipline. Our main question is whether Brazilian public school teachers teach Descriptive Statistics in high school, and whether they are able to use effectively the basic statistical concepts (Data Organization, Measures of Central Tendency, Quantiles, and Dispersion) to solve everyday practical problems. The aim of this research is to verify the level of knowledge mobilization among high school teachers. According to LINS (2004, p.54), "we know that the general impression persists, not systematically documented by research, that the mathematical training of the licentiate, largely similar to that of the future bachelor's graduate, does not contribute substantially to the training of that future professional, other than by reinforcing the routines of expository classes". We will try to diagnose the level of knowledge that high school teachers have of the Statistics content in the school curriculum. Based on BIFI's questionnaire (2006, p.54), we will verify whether the high school Mathematics teacher is able to calculate, justify, and relate the measures described. The answers to the activities will be analyzed with the aid of the C.H.I.C. software.
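
The descriptive measures named in the abstract above (central tendency, quantiles, dispersion) can be computed with Python's standard `statistics` module. This is a minimal sketch for orientation only: the function name `describe` and the sample scores are hypothetical and do not come from the dissertation or from BIFI's questionnaire.

```python
import statistics


def describe(values):
    """Summarize a data set with basic descriptive statistics."""
    ordered = sorted(values)  # data organization: sort before summarizing
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "mode": statistics.mode(ordered),
        # Three cut points (Q1, Q2, Q3) splitting the data into quarters.
        "quartiles": statistics.quantiles(ordered, n=4),
        "range": ordered[-1] - ordered[0],
        "stdev": statistics.stdev(ordered),  # sample standard deviation
    }


# Hypothetical test scores, used only to exercise the function.
scores = [4, 7, 7, 8, 5, 9, 6, 7, 10, 3]
for name, value in describe(scores).items():
    print(name, value)
```

Note that `statistics.quantiles` defaults to the "exclusive" method, so the quartiles may differ slightly from those obtained with other textbook conventions.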
20

Caracterização de fertilizantes forticados com pós de aciaria através da análise exploratória de dados

Lourdes, Ângela Maria Ferreira de Oliveira 28 July 2014 (has links)
CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / Industrial residues such as steelmaking flue dust are one source of the essential elements used in fertilizers. These residues are a viable alternative for supplying mineral nutrients to plants, but they may also contain highly toxic elements. In this work, samples of fertilizers, flue dust, and fertilizers spiked with different concentrations of flue dust were characterized using flame atomic absorption spectrometry (F AAS), flame atomic emission spectrometry (F AES), infrared spectroscopy (IR), Raman spectroscopy, and X-ray diffraction (XRD). For the determination of Cr, K, Na, Pb, and Zn by F AAS and F AES, an extraction procedure using an ultrasound bath was optimized. The optimum extraction conditions were 50 mg of fertilizer, 5.00 mL of diluted aqua regia (50% v/v), and six 20-minute sonication steps interspersed with 1 minute of stirring. The results agreed at the 95% confidence level for all analytes with the reference method adopted (standard addition), with adequate precision (RSD < 10%) and accuracy (recoveries of 90 to 103% for the fertilizers and 85.0 to 105.4% for the flue dust). IR and Raman spectroscopy proved effective for mineralogical analysis, and their results were confirmed by XRD; together, the three techniques revealed the organic and inorganic composition of the samples. Exploratory data analysis using PCA (principal component analysis) and HCA (hierarchical cluster analysis) on the F AAS and F AES data differentiated samples spiked with 5% or more flue dust from non-spiked samples. Combining these data chemometrically with the IR spectra allowed classification even in the presence of pure flue dust, and the IR and Raman spectral data together allowed classification of samples spiked with as little as 1% flue dust.
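
The precision and accuracy figures quoted in the abstract above rest on two simple formulas: relative standard deviation, RSD = 100 × s / mean, and recovery = 100 × measured / expected. The sketch below illustrates them under stated assumptions: the replicate values and the reference value are hypothetical, not data from the dissertation, and only the 10% RSD threshold is taken from the abstract.

```python
import statistics


def rsd_percent(replicates):
    """Relative standard deviation (%) of replicate measurements."""
    return 100.0 * statistics.stdev(replicates) / statistics.mean(replicates)


def recovery_percent(measured, expected):
    """Recovery (%) of a measured value against a reference value."""
    return 100.0 * measured / expected


# Hypothetical replicate results (mg/kg) and reference value.
replicates = [48.2, 50.1, 49.5]
reference = 50.0

rsd = rsd_percent(replicates)
recovery = recovery_percent(statistics.mean(replicates), reference)

# The abstract reports RSD < 10% as adequate precision.
print(f"RSD = {rsd:.2f}% (precision adequate: {rsd < 10})")
print(f"Recovery = {recovery:.1f}%")
```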
