1 |
Delikvence mládeže a její hodnotové souvislosti / Juvenile delinquency and its moral aspects
Průšová, Barbora. January 2014
This thesis analyses youth delinquency through Per-Olof H. Wikström's Situational Action Theory, modelling data from the International Self-Report Delinquency Study 3 (ISRD-3). Its main aim is to introduce and evaluate this theoretical-empirical model for explaining youth delinquency. The work is split into three main parts: theoretical, methodological and empirical. The first defines the basic concepts and presents Wikström's Situational Action Theory as applied to delinquency. The methodological part describes the ISRD-3 survey, the basic indicators of the sample and the data collection methods used, and then explains how the individual explanatory variables in the model were operationalized. The empirical part is dedicated to multidimensional analysis of the data and evaluation of the concept. The results demonstrate the success of the analytical model and its applicability as a default theory in the examination of youth delinquency.
|
2 |
Delikvence mládeže a její hodnotové souvislosti / Juvenile delinquency and its moral aspects
Průšová, Barbora. January 2014
This thesis analyses youth delinquency through Per-Olof H. Wikström's Situational Action Theory, modelling data from the International Self-Report Delinquency Study 3 (ISRD-3). Its main aim is to introduce and evaluate this theoretical-empirical model for explaining youth delinquency. The work is split into three main parts: theoretical, methodological and empirical. The first defines the basic concepts and presents Wikström's Situational Action Theory as applied to delinquency. The methodological part describes the ISRD-3 survey, the basic indicators of the sample and the data collection methods used. The empirical part explains how the individual explanatory variables in the model were operationalized; it is also dedicated to multidimensional analysis of the data and evaluation of the concept. The results demonstrate the success of the analytical model and its applicability as a default theory in the examination of youth delinquency. Key words: youth delinquency, Situational Action Theory, multidimensional data analysis
|
3 |
Modélisation et exécution des applications d'analyse de données multi-dimentionnelles sur architectures distribuées. / Modelling and executing multidimensional data analysis applications over distributed architectures.
Pan, Jie. 13 December 2010
Des quantités de données colossalles sont générées quotidiennement. Traiter de grands volumes de données devient alors un véritable challenge pour les logiciels d'analyse des données multidimensionnelles. De plus, le temps de réponse exigé par les utilisateurs de ces logiciels devient de plus en plus court, voire intéractif. Pour répondre à cette demande, une approche basée sur le calcul parallèle est une solution. Les approches traditionnelles reposent sur des architectures performantes, mais coûteuses, comme les super-calculateurs. D'autres architectures à faible coût sont également disponibles, mais les méthodes développées sur ces architectures sont souvent bien moins efficaces. Dans cette thèse, nous utilisons un modèle de programmation parallèle issu du Cloud Computing, dénommé MapReduce, pour paralléliser le traitement des requêtes d'analyse de données multidimensionnelles afin de bénéficier de mécanismes de bonne scalabilité et de tolérance aux pannes. Dans ce travail, nous repensons les techniques existantes pour optimiser le traitement de requête d'analyse de données multidimensionnelles, y compris les étapes de pré-calcul, d'indexation, et de partitionnement de données. Nous avons aussi résumé le parallélisme de traitement de requêtes. Ensuite, nous avons étudié le modèle MapReduce en détail. Nous commençons par présenter le principe de MapReduce et celles du modèle étendu, MapCombineReduce. En particulier, nous analysons le coût de communication pour la procédure de MapReduce. Après avoir présenté le stockage de données qui fonctionne avec MapReduce, nous présentons les caractéristiques des applications de gestion de données appropriées pour le Cloud Computing et l'utilisation de MapReduce pour les applications d'analyse de données dans les travaux existants. Ensuite, nous nous concentrons sur la parallélisation des Multiple Group-by query, une requête typique utilisée dans l'exploration de données multidimensionnelles. 
Nous présentons la mise en oeuvre de l'implémentation initiale basée sur MapReduce et une optimisation basée sur MapCombineReduce. Selon les résultats expérimentaux, notre version optimisée montre un meilleur speed-up et une meilleure scalabilité que la version initiale. Nous donnons également une estimation formelle du temps d'exécution pour les deux implémentations. Afin d'optimiser davantage le traitement du Multiple Group-by query, une phase de restructuration de données est proposée pour optimiser les jobs individuels. Nous re-definissons l'organisation du stockage des données, et nous appliquons les techniques suivantes, le partitionnement des données, l'indexation inversée et la compression des données, au cours de la phase de restructuration des données. Nous redéfinissons les calculs effectués dans MapReduce et dans l'ordonnancement des tâches en utilisant cette nouvelle structure de données. En nous basant sur la mesure du temps d'exécution, nous pouvons donner une estimation formelle et ainsi déterminer les facteurs qui impactent les performances, telles que la sélectivité de requête, le nombre de mappers lancés sur un noeud, la distribution des données « hitting », la taille des résultats intermédiaires, les algorithmes de sérialisation adoptée, l'état du réseau, le fait d'utiliser ou non le combiner, ainsi que les méthodes adoptées pour le partitionnement de données. Nous donnons un modèle d'estimation des temps d'exécution et en particulier l'estimation des valeurs des paramètres différents pour les exécutions utilisant le partitionnement horizontal. Afin de soutenir la valeur-unique-wise-ordonnancement, qui est plus flexible, nous concevons une nouvelle structure de données compressées, qui fonctionne avec un partitionnement vertical. Cette approche permet l'agrégation sur une certaine valeur dans un processus continu. / Along with the development of hardware and software, more and more data is generated at a rate much faster than ever. 
Processing large volumes of data is becoming a challenge for data analysis software. Additionally, interactive operational data analysis tools demand short response times. To address these issues, people look for solutions based on parallel computing. Traditional approaches rely on expensive high-performance hardware, such as supercomputers. Approaches using commodity hardware have been less investigated. In this thesis, we aim to use commodity hardware to resolve these issues. We propose to use MapReduce, a parallel programming model originating from Cloud Computing, to parallelize multidimensional analytical query processing and so benefit from its good scalability and fault-tolerance mechanisms. In this work, we first revisit the existing techniques for optimizing multidimensional data analysis queries, including pre-computing, indexing, data partitioning, and query processing parallelism. Then, we study the MapReduce model in detail. The basic idea of MapReduce and the extended MapCombineReduce model are presented. In particular, we analyse the communication cost of a MapReduce procedure. After presenting the data storage that works with MapReduce, we discuss the features of data management applications suitable for Cloud Computing, and the use of MapReduce for data analysis applications in existing work. Next, we focus on the MapReduce-based parallelization of the Multiple Group-by query, a typical query used in multidimensional data exploration. We present the initial MapReduce-based implementation and a MapCombineReduce-based optimization. According to the experimental results, our optimized version shows better speed-up and better scalability than the initial version. We also give a formal execution time estimation for both the initial implementation and the optimized one. In order to further optimize the processing of the Multiple Group-by query, a data restructuring phase is proposed to optimize individual job execution. 
We redesign the organization of data storage, applying data partitioning, inverted indexing and data compression techniques during the data restructuring phase. We redefine the MapReduce job's calculations and the job scheduling so that they rely on the new data structure. Based on measurements of execution time, we give a formal estimation and identify the factors that affect performance, including query selectivity, the number of mappers running concurrently on one node, the distribution of hitting data, intermediate output size, the serialization algorithms adopted, network status, whether or not a combiner is used, and the data partitioning methods. We give an estimation model for the query processing's execution time, and specifically estimate the values of various parameters for query processing based on horizontal data partitioning. In order to support more flexible distinct-value-wise job scheduling, we design a new compressed data structure, which works with vertical partitioning. It allows the aggregations over one certain distinct value to be performed within one continuous process.
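The difference between the MapReduce and MapCombineReduce variants described in the abstract can be illustrated with a small in-memory simulation. This is only a sketch under stated assumptions, not the thesis's implementation: it assumes the aggregate is a SUM, that each group-by in the "multiple group-by" is over a single column, and that each list of records stands in for one mapper's input split. All function and column names (`map_record`, `combine`, `reduce_all`, `run_multiple_group_by`, `color`, `size`, `sales`) are hypothetical.

```python
from collections import defaultdict


def map_record(record, group_by_columns, measure):
    """Map step: emit one ((column, value), measure) pair per group-by column."""
    for column in group_by_columns:
        yield (column, record[column]), record[measure]


def combine(pairs):
    """Combine step: pre-aggregate a single mapper's output before the shuffle."""
    partial = defaultdict(float)
    for key, value in pairs:
        partial[key] += value
    return list(partial.items())


def reduce_all(pairs):
    """Reduce step: sum all values that share the same (column, value) key."""
    totals = defaultdict(float)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def run_multiple_group_by(partitions, group_by_columns, measure, use_combiner):
    """Simulate one job; return the aggregates and the number of shuffled pairs."""
    shuffled = []
    for partition in partitions:  # each partition plays the role of one mapper
        mapper_output = [
            pair
            for record in partition
            for pair in map_record(record, group_by_columns, measure)
        ]
        if use_combiner:
            mapper_output = combine(mapper_output)
        shuffled.extend(mapper_output)
    return reduce_all(shuffled), len(shuffled)


# Hypothetical toy data: two mapper inputs, two group-by columns, one measure.
partitions = [
    [{"color": "red", "size": "S", "sales": 10.0},
     {"color": "red", "size": "M", "sales": 5.0}],
    [{"color": "blue", "size": "S", "sales": 7.0},
     {"color": "red", "size": "S", "sales": 3.0}],
]
plain, shuffled_plain = run_multiple_group_by(partitions, ["color", "size"], "sales", False)
combined, shuffled_combined = run_multiple_group_by(partitions, ["color", "size"], "sales", True)
```

Both runs produce the same aggregates, but the combiner run hands fewer intermediate pairs to the reduce step, which is the kind of communication saving the abstract attributes to MapCombineReduce.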
|
4 |
Město pro byznys: Vícerozměrná statistická analýza a možné návrhy na zdokonalení projektu / Město pro byznys: Multi-dimensional statistical analysis and the possible suggestions on how to improve the project
Krajča, Marek. January 2014
The main objective of this diploma thesis is multidimensional data analysis. The analysed data come from the comparative research project Město pro byznys 2013 (in English, The City for Business 2013). A further goal is to propose changes that could improve the project. The methods used for multidimensional data analysis are exploratory analysis, principal component analysis, factor analysis and cluster analysis. Multi-criteria decision analysis is used, among other methods, to propose the changes.
|
5 |
Transdisciplinary and inter-relationships between evaluation and development of asynchronous learning through university course participants narratives in discussion forums / Transdisciplinaridade e inter-relações entre avaliação e desenvolvimento da aprendizagem assíncrona através de narrativas de cursistas universitários em fóruns de discussão
Maria Iracema Pinho de Sousa. 10 December 2015
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / Entre as mudanças globais enfrentadas nas pouco mais de três últimas décadas, destacam-se as maneiras como a informação, no espaço digital e na Web, é acessada, inter-relacionada e ressignificada, no aspecto da construção de novos conhecimentos e saberes. Estas mudanças estão significativamente vinculadas à quebra de paradigmas e crises entre as áreas de conhecimento. Neste cenário, a sociedade enfrenta uma inesperada corrida de rápidas transformações, fusões e nascimento de novas áreas de conhecimento interdisciplinares, carreando o processo educacional para uma contínua e desafiante crise. Progressivamente, o uso pedagógico das TIC ocupa os espaços da outrora educação conservadora presencial e instrucionista, outorgando cenários favorecedores à integração pedagógica entre: as mídias, o construtivismo, a autonomia, a criatividade e o aprender juntos. Nas próximas décadas, decorrerão mudanças significativas nas propostas de avaliação, notadamente voltadas para o desenvolvimento da aprendizagem colaborativa, em espaços assíncronos, o que suscita profundas reflexões, concernentes às transposições teórico-metodológicas e práticas, que incidirão sobre a ressignificação e relações entre o desenvolvimento da aprendizagem e sua avaliação. A presente investigação, de caráter qualitativo e de pesquisa-ação, se insere no cenário do desenvolvimento de atividades pedagógicas colaborativas, expressas por narrativas, postadas em Fóruns de discussão, durante o transcurso de uma disciplina de Informática na Educação, ofertada presencialmente pelo Departamento de Fundamentos da Educação da FACED/UFC, em que majoritariamente participaram estudantes de graduação. A proposta didática da disciplina se baseia nos pressupostos da teoria da aprendizagem significativa ausubeliana, no mapeamento cognitivo, na visão de Okada, e no estar junto virtual, segundo Valente e Almeida. 
Foram escolhidos dois de seus Fóruns temáticos disciplinares, que abordavam as temáticas de estudo: o uso pedagógico de mapas conceituais e pressupostos da avaliação formativa da aprendizagem, numa visão construtivista (AUSUBEL, FERNANDES, MASETTO, PERRENOUD, ZABALA). Parte do referencial teórico da Tese permeia a avaliação na forma clássica e contribuições não clássicas, que apontam para a necessidade de se repensar a avaliação, numa ótica do construtivismo e da aprendizagem assíncrona. Os conjuntos de narrativas, postados pelos estudantes, são constituídos por dados multidimensionais fortemente entrelaçados, que foram mapeados e analisados, sob a forma de categorias, à luz do referencial teórico e objetivos adotados na Tese, sob o foco da análise textual discursiva (MORAES, GALIAZZI) e da análise qualitativa de dados multidimensionais, fazendo-se o uso do software CHIC (ALMOULOUD, PRADO, VALENTE). Em seguida, partindo-se de uma árvore de similaridade, gerada pelo CHIC, foram estabelecidas inter-relações hierárquicas e relacionais, entre as categorias, organizadas em três eixos temáticos, e as narrativas dos cursistas, de modo a analisar, numa ótica do referencial teórico da Tese, como se desenvolve a aprendizagem e como a avaliação, qualitativa e formativa, pode estar associada ao desenvolvimento da aprendizagem. Da análise multidimensional das narrativas dos cursistas, numa ótica transdisciplinar, emergiram indícios preliminares de inter-relações hierárquicas e relacionais, entre os três campos de conhecimentos desenvolvimento da aprendizagem, avaliação e saberes pedagógicos e tecnológicos, e as ações pedagógicas vivenciadas nos Fóruns TelEduc, que questionam se repensar os pressupostos da avaliação da aprendizagem e a convergência dos futuros caminhos das práticas pedagógicas e avaliativas. / Among the global challenges faced during the last three decades, it is important to point
out the different ways in which information, in the digital space and on the Web, is accessed,
interrelated and given new meaning in the construction of new
knowledge. These changes are significantly linked to the breaking of paradigms and to crises between the
areas of knowledge. In this scenario, society faces an unexpected rush of rapid
transformations, mergers and the birth of new areas of interdisciplinary knowledge, leading the
educational process into a continuous and challenging crisis. Progressively, the pedagogical use
of ICT occupies the space of the once conservative, instructionist classroom education,
granting scenarios that favor the pedagogical integration of media, constructivism,
autonomy, creativity and learning together. In the coming decades, significant
changes will take place in assessment proposals, notably focused on the development of collaborative
learning in asynchronous spaces, which raises profound reflections concerning the theoretical,
methodological and practical transpositions that will bear on the redefinition of, and
relationships between, learning development and its assessment. This research, which is
qualitative action research, is set in the scenario of the development of collaborative
educational activities, expressed by narratives posted in the TelEduc discussion forums
during a course on Informatics in Education, offered by the
Department of Education Foundations of FACED/UFC, attended mostly by undergraduate
students. The didactic proposal of the course is based on the assumptions of
Ausubel's theory of meaningful learning, on cognitive mapping, in Okada's view, and on being
together in the virtual space, according to Valente and Almeida. Two of the course's
thematic forums were chosen, which addressed the topics of study: the educational use of
concept maps and the assumptions of formative assessment of learning, from a
constructivist view (AUSUBEL; FERNANDES; MASETTO; PERRENOUD; ZABALA).
Part of the thesis's theoretical framework covers assessment in its classical form and
non-classical contributions, which point to the need to rethink assessment from the
perspective of constructivism and asynchronous learning. The sets of narratives posted by
students are made up of strongly intertwined multidimensional data, which were mapped and
analyzed in the form of categories, in light of the theoretical framework and
objectives adopted in this thesis, from the perspective of discursive textual analysis
(MORAES; GALIAZZI) and of multidimensional analysis of qualitative data, making
use of the CHIC software (ALMEIDA; ALMOULOUD; VALENTE). Then, starting from a
similarity tree generated by CHIC, hierarchical and relational interrelationships were
established between the categories, organized into three thematic axes, and the narratives of the
course participants, in order to analyze, from the perspective of the theoretical
framework of the thesis, how learning develops and how qualitative
and formative assessment may be associated with learning development. From the multidimensional
analysis of the narratives of the course participants, from a transdisciplinary
perspective, preliminary evidence emerged of hierarchical and relational
interrelationships between the knowledge fields of learning development, assessment and
technological pedagogical content knowledge, and the collaborative pedagogical actions
experienced in the TelEduc forums, which prompt a rethinking of the assumptions of the assessment
of learning and of the possible future paths of pedagogical and assessment practices.
|
6 |
Aceleração de uma variação do problema k-nearest neighbors / Acceleration of a variation of the K-nearest neighbors problem
Morais Neto, Jorge Peixoto de. 29 January 2014
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - CAPES / Let $M$ be a metric space and let $P$ be a subset of $M$. The well-known k-nearest neighbors
problem (KNN) consists in finding, given $q \in M$, the $k$ elements of $P$ which are closest to
$q$ according to the metric of $M$. We discuss a variation of KNN for a particular class of
pseudo-metric spaces, described as follows. Let $m \in \mathbb{N}$ be a natural number and let $d$ be
the Euclidean distance in $\mathbb{R}^m$. Given $p \in \mathbb{R}^m$,
$$p := (p_1, \ldots, p_m),$$
let $C(p)$ be the set of the $m$ rotations of $p$'s coordinates:
$$C(p) := \{(p_1, \ldots, p_m),\ (p_2, \ldots, p_m, p_1),\ \ldots,\ (p_m, p_1, \ldots, p_{m-1})\}.$$
We define the special distance $d_e$ as
$$d_e(p, q) := \min_{p' \in C(p)} d(p', q).$$
$d_e$ is a pseudo-metric, and $(\mathbb{R}^m, d_e)$ is a pseudo-metric space. The class of pseudo-metric
spaces under discussion is
$$\{(\mathbb{R}^m, d_e) \mid m \in \mathbb{N}\}.$$
The brute force approach is too costly for instances of practical size. We present a more
efficient solution employing parallelism, the FFT (fast Fourier transform) and the fast
elimination of unfavorable training vectors. We describe a program, named CyclicKNN,
which implements this solution. We report the speedup of this program over serial brute
force search, processing reference datasets. / Seja $M$ um espaço métrico e $P$ um subconjunto de $M$. O conhecido problema k vizinhos
mais próximos (k-nearest neighbors, KNN) consiste em encontrar, dado $q \in M$, os $k$
elementos de $P$ mais próximos de $q$ conforme a métrica de $M$. Abordamos uma variação
do problema KNN para uma classe particular de espaços pseudo-métricos, descrita a
seguir. Seja $m \in \mathbb{N}$ um natural e seja $d$ a distância euclidiana em $\mathbb{R}^m$. Dado um vetor
$p \in \mathbb{R}^m$,
$$p := (p_1, \ldots, p_m),$$
seja $C(p)$ o conjunto das $m$ rotações das coordenadas de $p$:
$$C(p) := \{(p_1, \ldots, p_m),\ (p_2, \ldots, p_m, p_1),\ \ldots,\ (p_m, p_1, \ldots, p_{m-1})\}.$$
Definimos a distância especial $d_e$ como
$$d_e(p, q) := \min_{p' \in C(p)} d(p', q).$$
$d_e$ é uma pseudo-métrica, e $(\mathbb{R}^m, d_e)$ é um espaço pseudo-métrico. A classe de espaços
pseudo-métricos abordada é
$$\{(\mathbb{R}^m, d_e) \mid m \in \mathbb{N}\}.$$
A solução por força bruta é cara demais para instâncias de tamanho prático. Nós apresentamos
uma solução mais eficiente empregando paralelismo, a FFT (transformada rápida
de Fourier) e a eliminação rápida de vetores de treinamento desfavoráveis. Desenvolvemos
um programa, chamado CyclicKNN, que implementa essa solução. Reportamos
o speedup desse programa em comparação com a força bruta sequencial, processando
bases de dados de referência.
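One plausible way the FFT can enter the computation of the rotation-minimizing distance defined in this abstract is through circular cross-correlation. The squared Euclidean distance between a rotation of p and q equals |p|^2 + |q|^2 minus twice their dot product, so the minimum over the m rotations is reached at the maximum of the circular cross-correlation, which an FFT yields for all rotations at once. The sketch below is an illustration of that identity, not the CyclicKNN program itself: it assumes m is a power of two (to keep the hand-written radix-2 FFT short), and all function names are hypothetical.

```python
import cmath
import math


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def cyclic_distance_brute(p, q):
    """Minimum Euclidean distance between q and any rotation of p, in O(m^2)."""
    m = len(p)
    return min(math.dist(p[s:] + p[:s], q) for s in range(m))


def cyclic_distance_fft(p, q):
    """Same distance via circular cross-correlation, in O(m log m)."""
    m = len(p)
    if m < 1 or m != len(q) or (m & (m - 1)) != 0:
        raise ValueError("this sketch assumes equal lengths that are a power of two")
    spectrum_p = fft([complex(v) for v in p])
    spectrum_q = fft([complex(v) for v in q])
    # corr[s] = sum_i p[(i + s) mod m] * q[i], i.e. the dot product of the
    # s-th rotation of p with q, recovered with one inverse transform.
    product = [a * b.conjugate() for a, b in zip(spectrum_p, spectrum_q)]
    corr = [c.real / m for c in fft(product, inverse=True)]
    squared = sum(v * v for v in p) + sum(v * v for v in q) - 2.0 * max(corr)
    return math.sqrt(max(squared, 0.0))  # clamp tiny negative rounding errors
```

For a single query the FFT route replaces m separate distance evaluations with three transforms; the abstract's other ingredients (parallelism and early elimination of unfavorable training vectors) are not modelled here.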
|