  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

Towards the Automatic Classification of Student Answers to Open-ended Questions

Alvarado Mantecon, Jesus Gerardo 24 April 2019
One of the main research challenges in the context of Massive Open Online Courses (MOOCs) is the effective automation of the evaluation of text-based assessments. Text-based assessments, such as essay writing, have been shown to be better indicators of a higher level of understanding than machine-scored assessments (e.g., multiple-choice questions). Nonetheless, due to the rapid growth of MOOCs, text-based evaluation has become a difficult task for human markers, creating the need for automated grading systems. In this thesis, we focus on the automated short answer grading (ASAG) task, which automatically classifies natural language answers to open-ended questions as correct or incorrect. We propose an ensemble supervised machine learning approach that relies on two types of classifiers: a response-based classifier, which centers on feature extraction from available responses, and a reference-based classifier, which considers the relationships between responses, model answers and questions. For each classifier, we explored a set of features based on words and entities. For the response-based classifier, we tested and compared five features: traditional n-gram models, entity URIs (Uniform Resource Identifiers) and entity mentions, both extracted using a semantic annotation API, entity mention embeddings based on GloVe, and entity URI embeddings extracted from Wikipedia. For the reference-based classifier, we explored fourteen features: cosine similarity between sentence embeddings from student answers and model answers, the number of overlapping elements (words, entity URIs, entity mentions) between student answers and model answers or question text, the Jaccard similarity coefficient between student answers and model answers or question text (based on words, entity URIs or entity mentions), and a sentence embedding representation.
We evaluated our classifiers on three datasets, two of which belong to the SemEval ASAG competition (Dzikovska et al., 2013). Our results show that, in general, reference-based features perform much better than response-based features in terms of accuracy and macro-average F1 score. Within the reference-based approach, we observe that the S6 embedding representation, which considers the question text, the student answer and the model answer, generated the best-performing models. Nonetheless, combining it with other similarity features helped build more accurate classifiers. As for response-based classifiers, models based on traditional n-gram features remained the best. Finally, we combined our best reference-based and response-based classifiers using an ensemble learning model. Our ensemble classifiers combining both approaches achieved the best results on one of the evaluation datasets, but underperformed on the remaining two. We also compared our two best classifiers with some of the main state-of-the-art results from the SemEval competition. Our final embedded meta-classifier outperformed the top-ranking result on the SemEval Beetle dataset, and our top classifier on SemEval SciEntBank, trained on reference-based features, obtained second position. In conclusion, the reference-based approach, powered mainly by sentence-level embeddings and other similarity features, generated the most efficient models on two out of three datasets, and the ensemble model was the best on the SemEval Beetle dataset.
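The overlap, Jaccard and cosine similarity features named in this abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the `tokens` helper is a hypothetical stand-in for the word and entity extraction the author describes, and the example sentences and vectors are invented for illustration.

```python
import math
import re


def tokens(text: str) -> set[str]:
    """Lowercase word tokens (hypothetical stand-in for word/entity extraction)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_count(a: set[str], b: set[str]) -> int:
    """Number of elements shared by the two sets."""
    return len(a & b)


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity coefficient: |A intersect B| / |A union B|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two embedding vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


# Invented student answer and model answer.
student = tokens("The bulb lights because the circuit is closed")
model = tokens("The bulb is lit because the circuit is closed")

features = {
    "overlap": overlap_count(student, model),
    "jaccard": jaccard(student, model),
}
```

In a reference-based classifier of the kind described, such values would be computed for each student answer against the model answer or question text and passed to the learner as numeric features.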
12

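Aplicação da arquitetura lambda na construção de um ambiente big data educacional para análise de dados / Application of the lambda architecture in building an educational big data environment for data analysis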

Mendes, Renê de Ávila 09 February 2017
Dealing properly with the volume, velocity and variety dimensions of data in educational contexts is a major concern for educational institutions, and researchers in both Educational Data Mining and Learning Analytics have cooperated to address this challenge, popularly called Big Data. Hardware developments have increased computing power, storage capacity and energy efficiency. New technologies in databases, file systems and distributed systems, as well as developments in data transmission, management, analysis and visualization techniques, have tried to overcome the challenge of processing, storing and analyzing large volumes of data, and the impossibility of simultaneously meeting the requirements of consistency, availability and partition tolerance. Although defining the architecture is the main task in the design of a Big Data system, no objective guidelines for selecting the architecture and the tools for implementing Big Data systems were found in the literature.
The present research aims to analyze the main architectures for both batch and stream processing and to use one of them to build a Big Data environment, providing important guidance to researchers, technicians and managers. Academic data and logs from the Moodle virtual learning environment of an academic unit of a higher education institution are used.
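As a rough illustration of the lambda architecture named in the title, the sketch below combines a batch view with a real-time view at query time. It is a toy, in-memory sketch under stated assumptions, not the environment built in the thesis: the event tuples, the `SpeedLayer` class and the `query` function are hypothetical names, and a real deployment would use distributed storage and stream-processing tools.

```python
from collections import Counter

# Hypothetical Moodle-style log events: (student_id, action).
master_dataset = [("s1", "view"), ("s2", "view"), ("s1", "submit")]
recent_events = [("s1", "view"), ("s3", "view")]


def batch_view(events):
    """Batch layer: recompute the per-student event count from the full dataset."""
    return Counter(student for student, _ in events)


class SpeedLayer:
    """Speed layer: incrementally count events not yet absorbed by a batch run."""

    def __init__(self):
        self.view = Counter()

    def ingest(self, event):
        student, _ = event
        self.view[student] += 1


def query(batch, realtime, student):
    """Serving layer: answer a query by merging the batch and real-time views."""
    return batch[student] + realtime[student]


batch = batch_view(master_dataset)
speed = SpeedLayer()
for event in recent_events:
    speed.ingest(event)

total_s1 = query(batch, speed.view, "s1")
```

The design point is that the batch view can be rebuilt from scratch at any time, while the speed layer only has to cover the events that arrived since the last batch run.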
13

Minería de datos educacionales: modelos de predicción del desempeño escolar en alumnos de enseñanza básica / Educational data mining: models for predicting school performance in primary school students

Molen Moris, Johan van der January 2013
Mathematical Civil Engineer / In recent years, an opportunity has opened up to analyze students' skills and performance more precisely. Online practice systems and intelligent tutors have gradually begun to proliferate, making it possible to record a large amount of valuable information about student learning. Educational Data Mining (EDM) is a field of study dedicated to developing mathematical methods for analyzing data from education-related environments and extracting as much information as possible in order to better understand students, teachers and related actors, with the aim of improving educational processes. This thesis addresses the problem of predicting a student's performance given historical data collected from the student's interaction with an online computer-based practice system. This challenge has recently become one of the most important within EDM, as evidenced by the growing number of related publications and the great interest it has attracted from universities and government bodies. In this work, we analyze the stored records of more than half a million online exercises completed weekly during 2011 by 805 students in 23 fourth-grade classes at 13 vulnerable schools, exploring several of the most widely used approaches to this problem and proposing new variants to improve the results and to help detect anomalous observations, which could include instances of "gaming the system". Additionally, we study the problem of determining how certain content affects other content. This is an Educational Data Mining problem that is central to curriculum design and lesson planning. Usually this network of causal influences is built from expert opinion.
Some experts contribute by making explicit the logical dependencies among content items, and others draw on their personal experience of teaching that content. However, it is very important to contrast those opinions with the learning process that actually takes place in the classroom and to build causal networks from empirical evidence. We use the data and data mining techniques to automatically generate the first empirically constructed causal network of curriculum content. Finally, we report an analysis of the impact of online practice on performance in the SIMCE test. Measurements under laboratory conditions show that practice increases learning. However, school implementations have not shown positive impacts. This work presents the experience of vulnerable schools where students complete dozens of mathematics exercises per week in an online system. The mathematics SIMCE score rose significantly, by more than three times the historical increase achieved nationally in 2011. Furthermore, the classes that completed more exercises achieved a larger increase in SIMCE, independently of teacher and school effects.
14

Caracterização de alunos em ambientes de ensino online: estendendo o uso da DAMICORE para minerar dados educacionais / Characterization of students in online learning environments: extending the use of DAMICORE to educational data mining

Luis Fernando de Souza Moro 04 May 2015
With the popularization of technological resources in education, a huge amount of data on the interactions between students and these resources is stored. Analyzing this data in order to characterize students is a very important task, since the results can help teachers in the teaching and learning process. However, because the tools used for this characterization are complex and unintuitive, education professionals end up not using them, which makes deploying such tools in educational environments unfeasible. Within this context, this master's dissertation aimed to analyze data from an intelligent tutoring system, MathTutor, which offers specific mathematics exercises, in order to identify behavioral patterns of the students who interacted with the system during a given period. The analysis was carried out through an Educational Data Mining (EDM) process using the DAMICORE tool, with the aim of quickly and effectively generating information useful for characterizing the students. The analysis followed several phases of the knowledge discovery in databases process: "selection", "preprocessing", "data mining", and "evaluation and interpretation".
In the "data mining" phase, the DAMICORE tool was used to find patterns, which were then studied in the "evaluation and interpretation" phase. This analysis revealed behavioral patterns of the students, for example whether male students perform better or worse than female students, and which students will perform well or poorly in the final stages of the teaching process. The main result is that one of the hypotheses formulated, "Students who performed well in the immediate posttest showed two of the following three behaviors: few interactions in the intervention, little time interacting with the system in the intervention, and few misconceptions in the pretest", had its accuracy confirmed on the data used in this research. It was thus concluded that using DAMICORE in an educational context can help teachers infer their students' performance, giving them the opportunity to carry out pedagogical interventions that help students with possible difficulties and to present new challenges to those who find the subject easy.
15

Educational Data Mining : En kvalitativ studie med inriktning på dataanalys för att hitta mönster i närvarostatistik / Educational Data Mining : A qualitative study focusing on data analysis to find patterns in presence statistics

Borg, Olivia January 2019
The study focuses on finding patterns in attendance statistics for students who are absent from school. The information provided by the results can then be used as a basis for decision-making by schools or by other organizations interested in EDM for attendance statistics. The work took a qualitative approach with a case study consisting of a literature study and an implementation. The literature study was used to gain an understanding of common approaches within EDM, which then formed the basis for the implementation, which followed the CRISP-DM methodology. The result was five patterns identified through data analysis. The patterns show absence over time and per subject and can form the basis for future decision-making.
16

Predicting Student Performance in Programming Courses Using Test Unit Snapshot Data / Förutsägelse av Studentprestationer i Programmeringskurser med hjälp av Snapshot-data för Testenheter

Elia, Sanherib January 2023
Predicting student performance is an important topic in academia, especially in a programming context, where identifying struggling students allows teachers to offer early and continuous assistance to help them improve their performance. It is thus essential to analyze student programming behavior to detect at-risk students. This thesis uses data generated by 220 students in a master's-level programming course at a large European university. The students run unit tests to test their code when solving assignments, and a snapshot is taken of each test as it is executed. Unit testing is a method of testing software in which individual units of source code are tested for correctness. A data set with simple features is derived from a database of snapshots and labeled with the students' grades. Then the machine learning models support vector machine (SVM), naive Bayes (NB), random forest, and neural networks with one, two and three hidden layers are trained and evaluated, and their performance is compared. The results show that SVM and neural network models are likely the best-performing all-rounders, with naive Bayes a possible choice depending on the goal. The thesis contributes by training machine learning models on students' programming behavior. By arming teachers with models such as these, more students who need assistance can get timely support and thus improve their performance. Future work can improve the models by using or combining other types of student data as features, or by using a larger data set.
17

Tracing Knowledge and Engagement in Parallel by Observing Behavior in Intelligent Tutoring Systems

Schultz, Sarah E 27 January 2015
Two of the major goals in Educational Data Mining are determining students’ state of knowledge and determining their affective state. It is useful to be able to determine whether a student is engaged with a tutor or task in order to adapt to his/her needs and necessary to have an idea of the students' knowledge state in order to provide material that is appropriately challenging. These two problems are usually examined separately and multiple methods have been proposed to solve each of them. However, little work has been done on examining both of these states in parallel and the combined effect on a student’s performance. The work reported in this thesis explores ways to observe both behavior and performance in order to more fully understand student state.
18

Modeling Student Retention in an Environment with Delayed Testing

Li, Shoujing 24 April 2013
Over the last two decades, the field of educational data mining (EDM) has focused on predicting the correctness of a student's next response to a question (e.g., [2, 6] and the 2010 KDD Cup), in other words, predicting short-term student performance. Student modeling has been widely used for making such inferences. Although performing well on the immediate next problem is an indicator of mastery, it is far from the only criterion. For example, the Pittsburgh Science of Learning Center's theoretical framework focuses on robust learning (e.g., [7, 10]), which includes the ability to transfer knowledge to new contexts, preparation for future learning of related skills, and retention, the ability of students to remember the knowledge they learned over a long time period. Especially for a cumulative subject such as mathematics, robust learning, particularly retention, is more important than short-term indicators of mastery. The Automatic Reassessment and Relearning System (ARRS) is a platform we developed and deployed on September 1st, 2012, which is mainly used by middle-school math teachers and their students. This system can help students better retain knowledge by automatically assigning tests to students, giving students an opportunity to relearn the skill when necessary, and generating reports for teachers. After deploying and testing the system for about seven months, we had collected 287,424 data points from 6,292 students. We created several models that predict students' retention performance using a variety of features, and discovered which features were important for predicting correctness on a delayed test. We found that the strongest predictor of retention was a student's initial speed of mastering the content. The most striking finding was that students who struggled to master the content (took over 8 practice attempts) showed very poor retention, only 55% correct, after just one week.
Our results will help us advance our understanding of learning and potentially improve ITS.
19

Ressources et parcours pour l'apprentissage du langage Python : aide à la navigation individualisée dans un hypermédia épistémique à partir de traces / Resources and paths to learn Python language : supporting individualized navigation into an epistemic hypermedia through traces

Miled, Mahdi 26 November 2014
This research mainly concerns support for individualized navigation in an epistemic hypermedia. We have a number of resources that can be formalized as a directed acyclic graph (DAG): the graph of epistemes. After surveying resource and learning-path environments and methods for visualization and navigation, tracing, adaptation and data mining, we present an approach that correlates design or editing activities with those dedicated to using and navigating the resources. The approach aims to provide mechanisms for individualizing navigation in an environment that is meant to evolve. We then built prototypes to put the graph of epistemes to the test.
One of these prototypes was integrated into an existing platform. This epistemic hypermedia, named HiPPY, offers resources and learning paths for learning the Python language. It relies on a graph of epistemes, dynamic navigation and a personalized knowledge assessment. The prototype was the subject of an experiment that allowed us to evaluate the principles introduced and to analyze certain uses.
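To make the idea of navigating a directed acyclic graph of resources concrete, here is a minimal Python sketch. It is only an assumption about how such a graph could be used, not the HiPPY implementation: the episteme names, the `learning_path` and `next_available` functions, and the notion of a "mastered" set are all hypothetical. It relies on `graphlib.TopologicalSorter` from the standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical graph of epistemes: each node maps to its prerequisites.
prerequisites = {
    "variables": set(),
    "conditionals": {"variables"},
    "loops": {"variables"},
    "functions": {"conditionals", "loops"},
}


def learning_path(graph, mastered):
    """Order the epistemes so prerequisites come first, skipping mastered ones."""
    order = TopologicalSorter(graph).static_order()
    return [episteme for episteme in order if episteme not in mastered]


def next_available(graph, mastered):
    """Epistemes not yet mastered whose prerequisites are all mastered."""
    return sorted(
        episteme
        for episteme, required in graph.items()
        if episteme not in mastered and required <= mastered
    )


path = learning_path(prerequisites, {"variables"})
suggestions = next_available(prerequisites, {"variables"})
```

Here `path` is one valid full ordering for a learner who has already mastered `variables`, while `suggestions` lists only what that learner can start immediately, which is closer to an individualized navigation aid.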
20

Using Differential Sequence Mining to Associate Patterns of Interactions in Concept Mapping Activity with Dimensions of Collaborative Process

January 2015
Computer-supported collaborative learning (CSCL) has made great inroads in classroom teaching, marked by the use of tools and technologies to support and enhance collaborative learning. Computer-mediated learning environments produce large amounts of data capturing student interactions, which can be used to analyze students’ learning behaviors (Martinez-Maldonado et al., 2013a). The analysis of the process of collaboration is an active area of research in CSCL. Contributing to this area, Meier et al. (2007) defined nine dimensions and gave a rating scheme to assess the quality of collaboration. This thesis aims to extract and examine frequent patterns of students’ interactions that characterize strong and weak groups across these dimensions. To achieve this, an exploratory data mining technique, differential sequence mining, was employed using data from a collaborative concept mapping activity in which collaboration among students was facilitated by an interactive tabletop. The results associate frequent patterns of the collaborative concept mapping process with some of the dimensions assessing the quality of collaboration. The analysis associating these patterns with the dimensions of collaboration is theoretically grounded, considering aspects of collaborative learning, concept mapping, communication, group cognition and information processing. The results are preliminary but still demonstrate the potential of associating frequent patterns of interactions with strong and weak groups across specific dimensions of collaboration, which is relevant for students, teachers, and researchers monitoring the process of collaborative learning. The frequent patterns for strong groups reflected conformance to the process of conversation for dimensions related to the “communication” aspect of collaboration.
In terms of the concept mapping sub-processes, the frequent patterns for strong groups reflect the presentation phase of conversation, with processes like talking and sharing individual maps while constructing the group’s concept map, followed by short utterances that represent the acceptance phase. For the “joint information processing” aspect of collaboration, the frequent patterns for strong groups were marked by learners building more upon each other’s work. In terms of the concept mapping sub-processes, the frequent patterns were marked by learners adding links to each other’s concepts or working with each other’s concepts while revising the group concept map. / Dissertation/Thesis / Masters Thesis Computer Science 2015
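The core idea of contrasting pattern frequencies between strong and weak groups can be sketched as follows. This is a deliberately simplified illustration, not the differential sequence mining procedure used in the thesis: it only compares the share of sequences containing each contiguous action pattern and omits any statistical testing. The action labels, function names and the `min_gap` threshold are hypothetical, and both groups are assumed to be non-empty.

```python
from collections import Counter


def ngrams(sequence, n):
    """Set of contiguous action patterns of length n found in one sequence."""
    return {tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1)}


def support(sequences, n):
    """Fraction of sequences in a group that contain each pattern."""
    counts = Counter()
    for sequence in sequences:
        counts.update(ngrams(sequence, n))
    return {pattern: count / len(sequences) for pattern, count in counts.items()}


def differential_patterns(strong, weak, n=2, min_gap=0.5):
    """Patterns whose support differs between the groups by at least min_gap.

    Positive gaps are more typical of strong groups, negative of weak groups.
    """
    s, w = support(strong, n), support(weak, n)
    gaps = {p: s.get(p, 0.0) - w.get(p, 0.0) for p in set(s) | set(w)}
    return {p: gap for p, gap in gaps.items() if abs(gap) >= min_gap}


# Invented interaction logs, one action sequence per group session.
strong_groups = [["talk", "share", "add_link"], ["talk", "share", "talk"]]
weak_groups = [["add_link", "add_link", "talk"], ["talk", "add_link", "add_link"]]

result = differential_patterns(strong_groups, weak_groups, n=2, min_gap=0.75)
```

With this toy data, talking followed by sharing appears in every strong-group session and in no weak-group session, so it is reported with a positive gap, while repeated link adding is reported with a negative gap.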
