71 |
Continuous Assessment in Agile Learning using Visualizations and Clustering of Activity Data to Analyze Student Behavior
January 2016
abstract: Software engineering education today is a technologically advanced and rapidly evolving discipline. Because students in this discipline not only design but also build new technology, it is important that they receive a hands-on learning experience in the form of project-based courses. To maximize the learning benefit, students must conduct project-based learning activities in a consistent rhythm, or cadence. Project-based courses that are augmented with a system of frequent, formative feedback help students constantly evaluate their progress and lead them away from a deadline-driven approach to learning.
One aspect of this research evaluates the use of a tool that tracks student activity as a means of providing frequent, formative feedback. This thesis measures the impact of the tool on student compliance with the learning process. A personalized dashboard with quasi-real-time visual reports and notifications is provided to undergraduate and graduate software engineering students. The impact of these visual reports on compliance is measured using the log traces of dashboard activity and a survey instrument administered multiple times during the course.
A second aspect of this research is the application of learning analytics to understand patterns of student compliance. This research employs unsupervised machine learning algorithms to identify unique patterns of student behavior observed in the context of a project-based course. Analyzing and labeling these patterns can help instructors understand typical student characteristics. Further, understanding these behavioral patterns can assist an instructor in making timely, targeted interventions. In this research, datasets comprising students' daily activity and graded scores from an undergraduate software engineering course are used to identify unique patterns of student behavior. / Dissertation/Thesis / Masters Thesis Engineering 2016
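The abstract does not say which clustering algorithm was applied to the activity data. As a minimal sketch, assuming plain k-means and two made-up features per student (weekly commits and task updates, both hypothetical), grouping students by activity level could look like this:

```python
from math import dist

def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points seed the centroids (deterministic)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids

# Hypothetical data: (commits, task updates) per student for one week.
activity = [(1, 0), (9, 8), (0, 1), (10, 9), (1, 1), (8, 10)]
labels, centroids = kmeans(activity, k=2)
print(labels)     # cluster index per student
print(centroids)  # one "typical" activity profile per cluster
```

An instructor would still have to label the resulting clusters (for example "steady" versus "deadline-driven"); the algorithm only groups similar activity profiles.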
|
72 |
Mining Signed Social Networks Using Unsupervised Learning Algorithms
January 2017
abstract: Due to the vast resources brought by social media services, social data mining has received increasing attention in recent years. The availability of sheer amounts of user-generated data presents data scientists with both opportunities and challenges. Opportunities are presented by additional data sources. The abundant link information in social networks could provide another rich source for deriving implicit information for social data mining. However, the vast majority of existing studies focus on positive links between users, while negative links are also prevalent in real-world social networks, such as distrust relations in Epinions and foe links in Slashdot. Though recent studies show that negative links have some added value over positive links, it is difficult to employ them directly because their characteristics are distinct from those of positive interactions. Another challenge is that label information is rather limited in social media, as the labeling process requires human attention and may be very expensive. Hence, alternative criteria are needed to guide the learning process for many tasks such as feature selection and sentiment analysis.
To address the above-mentioned issues, I study two novel problems in signed social network mining: (1) unsupervised feature selection in signed social networks; and (2) unsupervised sentiment analysis with signed social networks. To tackle the first problem, I propose a novel unsupervised feature selection framework, SignedFS. In particular, I model positive and negative links simultaneously for user preference learning, and then embed the user preference learning into feature selection. To study the second problem, I incorporate explicit sentiment signals in textual terms and implicit sentiment signals from signed social networks into a coherent model, SignedSenti. Empirical experiments on real-world datasets corroborate the effectiveness of these two frameworks on the tasks of feature selection and sentiment analysis. / Dissertation/Thesis / Masters Thesis Computer Science 2017
|
73 |
Suporte ao diagnóstico da doença de Alzheimer a partir de imagens de ressonância magnética / Diagnostic support for Alzheimer's disease through magnetic resonance imaging
Padovese, Bruno Tavares [UNESP] 15 May 2017
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) / Abstract: The initial stages of Alzheimer's disease are easily confused with the normal aging process. Additionally, the methodology involved in the diagnosis by radiologists can be subjective and difficult to document. In this scenario, the development of accessible approaches capable of supporting the early diagnosis of Alzheimer's disease is crucial. Various approaches have been employed with this objective, especially using brain MRI scans. Although satisfactory accuracy has been achieved, most of the approaches require very specific pre-processing steps based on the brain anatomy. In this work, we present a novel image retrieval approach for supporting the diagnosis of Alzheimer's disease, based on general-purpose features and an unsupervised post-processing step. The brain MRI scans are processed and retrieved through general visual features without any pre-processing step. Two rank-based unsupervised distance learning algorithms were used to improve the effectiveness of the initial results: the RL-Sim and ReckNN algorithms. Experimental results demonstrate that the proposed approach can achieve effective retrieval results, making it suitable for aiding the diagnosis of Alzheimer's disease. / CNPq: 154034/2016-9
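The abstract names RL-Sim and ReckNN but does not describe their steps. The sketch below is not either algorithm; it only illustrates the general idea of rank-based unsupervised post-processing, assuming a simple rule in which items that are mutual top-k neighbours of the query are promoted. The function names and the toy data are hypothetical.

```python
def ranked_lists(dist_matrix):
    """For each item, all item indices sorted by increasing distance."""
    return [sorted(range(len(row)), key=lambda j: (row[j], j)) for row in dist_matrix]

def reciprocal_rerank(dist_matrix, k):
    """Promote items that are mutual top-k neighbours of each query."""
    ranks = ranked_lists(dist_matrix)
    topk = [set(r[:k]) for r in ranks]
    reranked = []
    for q, r in enumerate(ranks):
        # j is a reciprocal neighbour if q and j appear in each other's top-k.
        mutual = [j for j in r if j in topk[q] and q in topk[j]]
        rest = [j for j in r if j not in mutual]
        reranked.append(mutual + rest)
    return reranked

# Toy 1-D "feature" values standing in for image descriptors.
positions = [0, 4, 6, 20, 35]
D = [[abs(a - b) for b in positions] for a in positions]

print(ranked_lists(D)[3])          # initial ranking for query 3
print(reciprocal_rerank(D, 3)[3])  # ranking after the reciprocal check
```

In this toy case item 2 is the query's nearest neighbour by distance, but the query is not among item 2's own nearest neighbours, so the reciprocal item 4 is moved ahead of it. No labels are used, which is what makes the step unsupervised.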
|
74 |
Truncated Signed Distance Fields Applied To Robotics
Canelhas, Daniel Ricão January 2017
This thesis is concerned with topics related to dense mapping of large-scale three-dimensional spaces. In particular, the motivating scenario of this work is one in which a mobile robot with limited computational resources explores an unknown environment using a depth camera. To this end, low-level topics such as sensor noise, map representation, interpolation, bit-rates, and compression are investigated, and their impacts on more complex tasks, such as feature detection and description, camera tracking, and mapping, are evaluated thoroughly. A central idea of this thesis is the use of truncated signed distance fields (TSDF) as a map representation, and a comprehensive yet accessible treatise on this subject is the first major contribution of this dissertation. The TSDF is a voxel-based representation of 3D space that enables dense mapping with high surface quality and robustness to sensor noise, making it a good candidate for use in grasping, manipulation and collision avoidance scenarios. The second main contribution of this thesis deals with the way in which information can be efficiently encoded in TSDF maps. The redundant way in which voxels represent continuous surfaces and empty space is one of the main impediments to applying TSDF representations to large-scale mapping. This thesis proposes two algorithms for enabling large-scale 3D tracking and mapping: a fast on-the-fly compression method based on unsupervised learning, and a parallel algorithm for lifting a sparse scene-graph representation from the dense 3D map. The third major contribution of this work consists of thorough evaluations of the impacts of low-level choices on higher-level tasks. Examples of these are the relationship between gradient estimation methods and feature detector repeatability, and the effects of voxel bit-rate, interpolation strategy and compression ratio on camera tracking performance. Each evaluation thus leads to a better understanding of the trade-offs involved, which translates into direct recommendations for future applications, depending on their particular resource constraints.
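To make the TSDF representation concrete, here is a minimal sketch of the commonly used weighted running-average update, restricted to the voxels along a single camera ray. It assumes unit weight per measurement and a hypothetical weight cap; the abstract does not confirm that the thesis uses exactly this scheme.

```python
def tsdf_update(tsdf, weight, voxel_depths, measured_depth, trunc, max_weight=64.0):
    """Fuse one depth measurement into the voxels along a single camera ray."""
    for i, z in enumerate(voxel_depths):
        sdf = measured_depth - z       # positive in front of the surface
        if sdf < -trunc:
            continue                   # far behind the surface: leave untouched
        d = min(1.0, sdf / trunc)      # truncate and normalise to [-1, 1]
        w = weight[i]
        tsdf[i] = (w * tsdf[i] + d) / (w + 1.0)   # weighted running average
        weight[i] = min(w + 1.0, max_weight)
    return tsdf, weight

# Four voxels at increasing depth along the ray, all initially "free space".
depths = [0.5, 1.0, 1.5, 2.0]
tsdf, weight = tsdf_update([1.0] * 4, [0.0] * 4, depths, measured_depth=1.0, trunc=0.5)
print(tsdf)    # the zero crossing marks the surface at depth 1.0
print(weight)  # the last voxel is occluded and keeps weight 0
```

Averaging repeated measurements is what gives the representation its robustness to sensor noise, and the long runs of identical values in free space are the redundancy that the abstract identifies as an obstacle to large-scale mapping.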
|
75 |
Object Instance Detection and Dynamics Modeling in a Long-Term Mobile Robot Context
Bore, Nils January 2017
In recent years, simple service robots such as autonomous vacuum cleaners and lawn mowers have become commercially available and increasingly common. The next generation of service robots should perform more advanced tasks, such as cleaning up objects. Robots then need to learn to robustly navigate and manipulate cluttered environments, such as an untidy living room. In this thesis, we focus on representations for tasks such as general cleaning and fetching of objects. We discuss requirements for these specific tasks, and argue that solving them would be generally useful because of their object-centric nature. We rely on two fundamental insights in our approach to understanding environments on a fine-grained level. First, many of today's robot map representations are limited to the spatial domain, and ignore that there is a time axis that constrains how much an environment may change during a given period. We argue that it is of critical importance to also consider the temporal domain. By studying the motion of individual objects, we can enable tasks such as general cleaning and object fetching. The second insight is that mobile robots are becoming robust enough to patrol the same environment for several months. They can therefore collect large amounts of data from individual environments. With more data, unsupervised learning of models without human involvement becomes feasible, allowing the robot to adapt to changes in the environment and to scenarios that the designer could not foresee. We view these capabilities as vital for robots to become truly autonomous. The combination of unsupervised learning and dynamics modeling creates an interesting symbiosis: the dynamics vary between different environments and between the objects in one environment, and learning can capture these variations. A major difficulty when modeling environment dynamics is that the whole environment cannot be observed at one time, since the robot is moving between different places; objects may therefore be moved long distances between two observations. We demonstrate how this can be dealt with in a principled manner, by modeling several modes of object movement. The resulting system is fully probabilistic and can keep track of all objects in the robot's environment. We also demonstrate methods for detection and learning of objects and structures in the static parts of the maps. Using the complete system, we can represent and learn many aspects of the full environment. In real-world experiments, we demonstrate that our system can keep track of varied objects in large and highly dynamic environments. / QC 20171213
|
76 |
Automatic induction of verb classes using clustering
Sun, Lin January 2013
Verb classifications have attracted a great deal of interest in both linguistics and natural language processing (NLP). They have proved useful for important tasks and applications, including e.g. computational lexicography, parsing, word sense disambiguation, semantic role labelling, information extraction, question-answering, and machine translation (Swier and Stevenson, 2004; Dang, 2004; Shi and Mihalcea, 2005; Kipper et al., 2008; Zapirain et al., 2008; Rios et al., 2011). Particularly useful are classes which capture generalizations about a range of linguistic properties (e.g. lexical, (morpho-)syntactic, semantic), such as those proposed by Beth Levin (1993). However, full exploitation of such classes in real-world tasks has been limited because no comprehensive or domain-specific lexical classification is available. This thesis investigates how Levin-style lexical semantic classes could be learned automatically from corpus data. Automatic acquisition is cost-effective when it involves either no or minimal supervision and it can be applied to any domain of interest where adequate corpus data is available. We improve on earlier work on automatic verb clustering. We introduce new features and new clustering methods to improve the accuracy and coverage. We evaluate our methods and features on well-established cross-domain datasets in English, on a specific domain of English (the biomedical) and on another language (French), reporting promising results. Finally, our task-based evaluation demonstrates that the automatically acquired lexical classes enable new approaches to some NLP tasks (e.g. metaphor identification) and help to improve the accuracy of existing ones (e.g. argumentative zoning).
|
77 |
Detecção de novidade com aplicação a fluxos contínuos de dados / Novelty detection with application to data streams
Eduardo Jaques Spinosa 20 February 2008
In this work, novelty detection is treated as the problem of identifying emerging concepts in data that may be presented in a continuous flow. Considering the intrinsic relationship between time and novelty and the challenges imposed by data streams, a novel approach is proposed. OLINDDA, an OnLIne Novelty and Drift Detection Algorithm, goes beyond one-class classification and focuses on the unsupervised continuous learning of novel concepts. Having learned an initial description of a normal concept, it proceeds to the analysis of new data, treating them as a continuous flow where novel concepts may appear at any time. Through the use of clustering techniques, OLINDDA may employ several validation criteria to evaluate clusters in terms of their cohesiveness and representativeness. Clusters considered valid produce concepts that may be merged, and whose knowledge is continuously incorporated. The technique is experimentally evaluated with artificial and real data. The one-class classification module is compared to other novelty detection techniques, and the whole approach is analyzed from various aspects through the temporal evolution of several metrics. Results reinforce the importance of continuous detection of novel concepts, as well as the difficulties and challenges of the unsupervised learning of novel concepts in data streams.
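The loop described in the abstract (learn a normal concept, buffer examples it does not explain, cluster them, validate the cluster, incorporate it) can be sketched as follows. This is a simplified illustration, not OLINDDA itself: the spherical cluster model, the single cohesiveness test and the parameters `min_size` and `max_radius` are all assumptions made for the example.

```python
from math import dist

def centroid(points):
    return tuple(sum(col) / len(points) for col in zip(*points))

def make_cluster(points):
    """Summarise a group of points as a centre and a covering radius."""
    c = centroid(points)
    return {"center": c, "radius": max(dist(p, c) for p in points)}

def is_known(x, clusters):
    return any(dist(x, cl["center"]) <= cl["radius"] for cl in clusters)

def process_stream(stream, normal_clusters, min_size=3, max_radius=1.0):
    """Return (novel concepts found, examples still unexplained)."""
    known = list(normal_clusters)
    unknown, novel = [], []
    for x in stream:
        if is_known(x, known):
            continue                      # explained by an existing concept
        unknown.append(x)
        if len(unknown) >= min_size:
            candidate = make_cluster(unknown)
            if candidate["radius"] <= max_radius:   # cohesive enough to be valid
                known.append(candidate)             # knowledge is incorporated
                novel.append(candidate)
                unknown = []
    return novel, unknown

normal = [make_cluster([(0, 0), (1, 0), (0, 1), (1, 1)])]
stream = [(0.5, 0.4), (5, 5), (5, 6), (6, 5), (5.5, 5.5), (20, 20)]
novel, unknown = process_stream(stream, normal)
print(len(novel), unknown)
```

Once the three points near (5, 5) form a valid cluster, the later point (5.5, 5.5) is treated as known, while the isolated point (20, 20) stays in the buffer until more evidence arrives.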
|
78 |
Técnicas de aprendizado não supervisionado baseadas no algoritmo da caminhada do turista / Unsupervised learning techniques based on the tourist walk algorithm
Carlos Humberto Porto Filho 07 November 2017
In the last decades, the amount of data stored in digital format has grown exponentially, leading to an increasing need for computational tools that help generate knowledge from these data. The Machine Learning field provides several techniques capable of identifying patterns in these data sets. Among these techniques, we highlight Unsupervised Machine Learning, where the objective is to classify the entities into mutually exclusive clusters based on the similarity between the instances. Clusters are not predefined, hence the unsupervised element. Organizing data into clusters that make sense is one of the most fundamental ways of understanding and learning. Cluster analysis is the study of methods for clustering and is divided into hierarchical and partitional approaches. A hierarchical clustering is a nested sequence of partitions, whereas a partitional clustering has only one partition. Here we are interested in techniques based on a deterministic partially self-avoiding walk, known as the tourist walk. Based on the hypothesis that it is possible to use the tourist walk as an unsupervised machine learning technique, we have implemented a hierarchical algorithm based on the tourist walk proposed by Campiteli et al. (2006). We evaluate this algorithm using different sets of medical images and compare it with traditional hierarchical techniques. We also propose a new algorithm for partitional clustering based on the tourist walk, called Tourist Walk Partitional Clustering (TWPC). The results showed that the hierarchical technique based on the tourist walk is able to identify clusters in sets of medical images through a tree that does not impose a binary structure, has a smaller number of hierarchies, and is invariant to scale transformation, resulting in a more organized structure. Even though the tree is not directly based on the distances of the data but on a ranking of neighbors, it still preserves a correlation between its cophenetic distances and the actual distances between the data. The proposed partitional clustering method TWPC was able to find, in an efficient way, arbitrary shapes of clusters with inter-cluster and intra-cluster variations. In addition, the algorithm has the following advantages: it is deterministic; it operates based on local interactions, without the need for a priori knowledge of all the items in the set; it is capable of incorporating the concepts of noise and outliers; and it works with a ranking of neighbors, which can be built through any measure.
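For readers unfamiliar with the tourist walk, the sketch below shows the usual rule: from the current point, move to the nearest point that is not among the `memory` most recently visited ones. Every walk ends in a cycle (the attractor) after an initial transient. The grouping function is only one simple way to turn walks into clusters and is an assumption made for illustration; it is neither the hierarchical method of Campiteli et al. nor TWPC.

```python
from math import dist

def tourist_walk(points, start, memory, max_steps=1000):
    """Return (transient, attractor) as lists of point indices."""
    path = [start]
    seen_states = {}
    for step in range(max_steps):
        recent = tuple(path[-memory:])       # forbidden points, current included
        if recent in seen_states:            # same state again: a cycle has closed
            first = seen_states[recent]
            return path[:first], path[first:step]
        seen_states[recent] = step
        current = path[-1]
        candidates = [j for j in range(len(points)) if j not in recent]
        if not candidates:
            break                            # memory covers every point
        nxt = min(candidates, key=lambda j: (dist(points[current], points[j]), j))
        path.append(nxt)
    return path, []

def group_by_attractor(points, memory):
    """Group starting points whose walks end in the same attractor."""
    groups = {}
    for s in range(len(points)):
        _, attractor = tourist_walk(points, s, memory)
        groups.setdefault(frozenset(attractor), []).append(s)
    return sorted(groups.values())

points = [(0,), (1,), (3,), (7,), (8,)]      # toy 1-D data
print(tourist_walk(points, 2, memory=1))
print(group_by_attractor(points, memory=1))
```

The walk only ever compares distances from the current point, so it depends on the ranking of neighbours and not on the absolute scale of the data, which matches the scale invariance and the local-interaction property mentioned in the abstract.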
|
79 |
Perceptual learning in speech reveals pathways of processing
Munson, Cheyenne Michele 01 December 2011
Listeners use perceptual learning to rapidly adapt to manipulated speech input. Examination of this learning process can reveal the pathways used during speech perception. By assessing generalization of perceptually learned categorization boundaries, others have used perceptual learning to help determine whether abstract units are necessary for listeners and models of speech perception. Here we extend this approach to address the inverse issue of specificity. In these experiments we have sought to discover the levels of specificity for which listeners can learn variation in phonetic contrasts. We find that (1) listeners are able to learn multiple voicing boundaries for different pairs of phonemic contrasts relying on the same feature contrast. (2) Listeners generalize voicing boundaries to untrained continua with the same onset as the trained continua, but generalization to continua with different onsets depends on previous experience with other continua sharing this different onset. (3) Listeners can learn different voicing boundaries for continua with the same CV onset, which suggests that boundaries are lexically-specific. (4) Listeners can learn different voicing boundaries for multiple talkers even when they are not given instructions about talkers and their task does not require talker identification. (5) Listeners retain talker-specific boundaries after training on a new boundary for a second talker, but generalize boundaries across talkers when they have no previous experience with a talker. These results were obtained using a new paradigm for unsupervised perceptual learning in speech. They suggest that models of speech perception must be highly flexible in order to accommodate both specificity and generalization of perceptually learned categorization boundaries.
|
80 |
Identifying Crime Hotspot: Evaluating the suitability of Supervised and Unsupervised Machine learning
Hussein, Abdul Aziz 05 October 2021
No description available.
|