281 |
Academic Recommendation System Based on the Similarity Learning of the Citation Network Using Citation Impact
Alshareef, Abdulrhman M., 29 April 2019
With today's large and rapidly growing volume of scientific publications, exploring recent studies in a given research area and building effective scientific collaborations have become more challenging than ever before. The growth of scientific production makes it increasingly difficult to identify the most relevant papers to cite or to find an appropriate conference or journal to which to submit a paper. As a result, authors and publishers rely on different analytical approaches to measure the relationships within the citation network. Different parameters have been used, such as the impact factor, the number of citations, and co-citation, to assess the impact of the produced research. However, any single assessment factor captures only one level of relationship exploration, since it does not reflect the effect of the other factors. In this thesis, we propose an approach to measure the Academic Citation Impact, which helps to identify the impact of articles, authors, and venues within their extended nearby citation network. We combine content similarity with bibliometric indices to evaluate the citation impact of articles, authors, and venues in their surrounding citation network. Using article metadata, we calculate the semantic similarity between any two articles in the extended network. We then use the similarity score and bibliometric indices to evaluate the impact of articles, authors, and venues within their extended nearby citation network.
Furthermore, we propose an academic recommendation model that identifies the latent preferences within the citation network of a given article, in order to expose the concealed connections between academic objects (articles, authors, and venues) in that network. To reveal the degree of trust for collaboration between academic objects, we use similarity learning to estimate a collaborative confidence score that represents the anticipation of a prospective relationship between academic objects within a scientific community. We conducted an offline experiment on real-world datasets to measure the accuracy of delivering personalized recommendations based on the user's selection preferences. Our evaluation results show a potential improvement in recommendation quality compared to baseline recommendation algorithms that consider co-citation information.
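A minimal sketch of the kind of combination this abstract describes, not the thesis's actual formulation: content similarity from TF-IDF vectors of article metadata, mixed with a normalized citation count. The toy articles, the citation numbers, and the mixing weight alpha are all assumptions; the single citation index merely stands in for the family of bibliometric indices (impact factor, co-citation, citation counts) the thesis draws on.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy article metadata (title/abstract text) and citation counts: hypothetical values.
articles = {
    "A": ("citation network similarity learning for recommendation", 120),
    "B": ("semantic similarity of scholarly articles using metadata", 45),
    "C": ("deep learning for image segmentation", 300),
}

texts = [t for t, _ in articles.values()]
cites = np.array([c for _, c in articles.values()], dtype=float)

# Content similarity from TF-IDF vectors of the metadata.
tfidf = TfidfVectorizer().fit_transform(texts)
sim = cosine_similarity(tfidf)            # sim[i, j] in [0, 1]

# Normalized citation index: one bibliometric signal among several possible.
cite_idx = cites / cites.max()

# Combined impact of article j as seen from article i's neighborhood:
# a weighted mix of content similarity and citation strength (alpha is an assumption).
alpha = 0.6
impact = alpha * sim + (1 - alpha) * cite_idx[None, :]
print(np.round(impact, 3))
```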
|
282 |
Uma proposta de estruturação e integração de processamento de cores em sistemas artificiais de visão / A proposal for structuration and integration of color processing in artificial vision systems
Moreira, Jander, 05 July 1999
This thesis describes an approach to color information processing in the biologically inspired artificial vision system named Cyvis-1. Considering that most of the current literature on image segmentation deals with gray-level images, color information remains an incipient area, and this motivated the present research. This work defines the color subsystem within the underlying philosophy of Cyvis-1, whose main principles include hierarchy, modularity, processing specialization, multilevel integration, effective representation of visual information, and high-level knowledge integration. The color subsystem is introduced within this framework, with a proposed segmentation technique for color images based on self-organizing maps for classifying image pixels. The number of classes in the image is determined through an unsupervised clustering approach, so no human intervention is needed. This segmentation produces maps of the regions found and an edge map derived from the regions. A second contribution of this work is a comparative study of the edge maps produced by several edge-oriented segmentation techniques. A reference edge map is used as the standard segmentation against which the other edge maps are compared, and the behavior of the techniques is analyzed through a set of local attributes based on intensity and color contrasts. As a consequence of this comparison, a combined edge map is also proposed, based on conditionally selecting techniques according to their local performance. Finally, integrating the above topics, a structure for the color module is proposed, together with modules for image acquisition, shape analysis, and polyhedral object recognition. In this context, the integration with the stereo subsystem is accomplished, providing the three-dimensional data essential for object recognition. For each part of this work, evaluation procedures are proposed to validate the results, demonstrating and characterizing the efficiency and limitations of each.
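As a rough illustration of the segmentation idea described above (a hedged sketch, not the Cyvis-1 implementation): a tiny self-organizing map trained on pixel colors, a region map from nearest prototypes, and an edge map derived from region boundaries. The grid size, learning schedule, and random stand-in image are assumptions, and the thesis's unsupervised determination of the number of classes is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_som(pixels, grid=(4, 4), iters=2000, lr0=0.5, sigma0=1.5):
    """Train a small 2-D self-organizing map on RGB pixel samples."""
    h, w = grid
    weights = rng.random((h * w, 3))                      # one RGB prototype per node
    coords = np.array([(i, j) for i in range(h) for j in range(w)], dtype=float)
    for t in range(iters):
        x = pixels[rng.integers(len(pixels))]
        lr = lr0 * (1 - t / iters)                        # decaying learning rate
        sigma = sigma0 * (1 - t / iters) + 1e-3           # shrinking neighborhood
        bmu = np.argmin(((weights - x) ** 2).sum(axis=1))  # best-matching unit
        d2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
        nbh = np.exp(-d2 / (2 * sigma ** 2))              # neighborhood kernel
        weights += lr * nbh[:, None] * (x - weights)
    return weights

# Hypothetical image: random RGB values standing in for a real picture.
image = rng.random((64, 64, 3))
pixels = image.reshape(-1, 3)

prototypes = train_som(pixels)
# Segment: label each pixel with its nearest prototype (region map).
labels = np.argmin(((pixels[:, None, :] - prototypes[None, :, :]) ** 2).sum(-1), axis=1)
region_map = labels.reshape(64, 64)

# A simple edge map derived from the regions: mark label changes between neighbors.
edges = (np.diff(region_map, axis=0, prepend=region_map[:1]) != 0) | \
        (np.diff(region_map, axis=1, prepend=region_map[:, :1]) != 0)
```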
|
283 |
Sistema de visión computacional estereoscópico aplicado a un robot cilíndrico accionado neumáticamente / Stereoscopic computer vision system applied to a pneumatically actuated cylindrical robot
Ramirez Montecinos, Daniela Elisa, January 2017
In the industrial arena, robots are an important part of the technological resources available for manipulation tasks in manufacturing, assembly, the transportation of dangerous waste, and a variety of other applications. Specialized computer vision systems have entered the market to solve problems that other technologies have been unable to address. This document analyzes a stereo vision system used to provide the center of mass of an object in three dimensions. This kind of application uses two or more cameras aligned along the same axis, which makes it possible to measure the depth of a point in space. The stereoscopic system described here measures the position of an object by combining 2D recognition, in which the coordinates of the center of mass are calculated using image moments, with the disparity found by comparing two images, one from the right camera and one from the left. This turns the system into a 3D viewfinder of reality, emulating the human eyes, which are capable of distinguishing depth with good precision.
The proposed stereo vision system is integrated into a 5-degree-of-freedom pneumatic robot, which can be programmed using the GRAFCET method by means of commercial software. The cameras are mounted in the lateral plane of the robot to ensure that all the pieces in the robot's work area can be observed. For the implementation, an algorithm for recognition and position measurement is developed using open-source tools in C++, which keeps the system as open as possible once it is integrated with the robot. The work is validated by taking samples of the objects to be manipulated and generating robot trajectories to determine whether or not the object can be manipulated by the end effector. The results show that it is possible to manipulate pieces in a visually crowded space with acceptable precision. However, the precision reached does not allow the robot to perform tasks that require higher accuracy, such as the assembly of small parts in manufacturing or welding applications.
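The geometry this abstract relies on is compact enough to sketch: the center of mass comes from raw image moments, and depth follows the pinhole-stereo relation Z = f·B/d. The focal length, baseline, principal point, and toy masks below are assumed values, not the thesis's calibration.

```python
import numpy as np

# Camera parameters: hypothetical values for the sketch.
FOCAL_PX = 700.0      # focal length in pixels
BASELINE_M = 0.12     # distance between the two cameras, in meters

def centroid_from_moments(mask):
    """Center of mass of a binary object mask via raw image moments."""
    ys, xs = np.nonzero(mask)
    m00 = len(xs)                        # zeroth moment = area in pixels
    if m00 == 0:
        raise ValueError("empty mask")
    return xs.mean(), ys.mean()          # (m10/m00, m01/m00)

def depth_from_disparity(x_left, x_right):
    """Pinhole stereo: Z = f * B / d, with disparity d in pixels."""
    d = x_left - x_right
    if d <= 0:
        raise ValueError("non-positive disparity")
    return FOCAL_PX * BASELINE_M / d

# Toy example: the same object segmented in both rectified views.
left_mask = np.zeros((480, 640), bool);  left_mask[200:240, 300:340] = True
right_mask = np.zeros((480, 640), bool); right_mask[200:240, 265:305] = True

xl, yl = centroid_from_moments(left_mask)
xr, _ = centroid_from_moments(right_mask)
z = depth_from_disparity(xl, xr)
# Back-project the centroid to 3-D camera coordinates (left camera frame);
# 320, 240 is an assumed principal point.
x3 = (xl - 320) * z / FOCAL_PX
y3 = (yl - 240) * z / FOCAL_PX
print(f"object center ~ ({x3:.3f}, {y3:.3f}, {z:.3f}) m")
```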
|
284 |
Slowness and sparseness for unsupervised learning of spatial and object codes from naturalistic data
Franzius, Mathias, 27 June 2008
This thesis introduces a hierarchical model for unsupervised learning from naturalistic video sequences. The model is based on the principles of slowness and sparseness, and different approaches and implementations of these principles are discussed. A variety of neuron classes in the hippocampal formation of rodents and primates codes for different aspects of the space surrounding the animal, including place cells, head direction cells, spatial view cells, and grid cells. In the main part of this thesis, video sequences from a virtual-reality environment are used to train the hierarchical model. The model reproduces the behavior of most known hippocampal neuron types that code for space. The type of representation the model generates is determined mostly by the movement statistics of the simulated animal. The model approach is not limited to spatial coding: in an application to invariant object recognition, artificial clusters of spheres or rendered fish are presented to the model, and the resulting representations allow a simple, independent extraction of the identity, position, and viewing angle of the presented object.
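The slowness principle itself has a compact linear form: project the whitened input onto the directions whose temporal derivative has minimal variance. The sketch below illustrates only that principle on a toy mixture; the thesis's model is hierarchical and nonlinear.

```python
import numpy as np

def linear_sfa(x, n_out=1):
    """Minimal linear Slow Feature Analysis.
    x: (T, d) signal. Returns the n_out slowest output signals."""
    x = x - x.mean(axis=0)
    # Whiten the input to unit covariance.
    evals, evecs = np.linalg.eigh(np.cov(x, rowvar=False))
    whiten = evecs / np.sqrt(np.maximum(evals, 1e-12))
    z = x @ whiten
    # Slowness: minimize the variance of the discrete temporal derivative.
    z_dot = np.diff(z, axis=0)
    d_evals, d_evecs = np.linalg.eigh(np.cov(z_dot, rowvar=False))
    return z @ d_evecs[:, :n_out]            # eigh is ascending: slowest first

# Toy demo: recover a slow sine hidden in a fast mixture.
t = np.linspace(0, 2 * np.pi, 2000)
slow, fast = np.sin(t), np.sin(37 * t)
mixed = np.stack([slow + 0.5 * fast, 0.5 * slow - fast], axis=1)
out = linear_sfa(mixed)[:, 0]                # slow sine, up to sign and scale
print("correlation with slow source:", round(abs(np.corrcoef(out, slow)[0, 1]), 3))
```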
|
285 |
Robot Motion and Task Learning with Error Recovery
Chang, Guoting, January 2013
The ability to learn is essential for robots to function and perform services within a dynamic human environment. Robot programming by demonstration facilitates learning through a human teacher without the need to develop new code for each task that the robot performs. For learning to be generalizable, the robot needs to grasp the underlying structure of the task being learned, which requires appropriate knowledge abstraction and representation. The goal of this thesis is to develop a learning by imitation system that abstracts knowledge from human demonstrations of a task and represents the abstracted knowledge in a hierarchical framework. The system performs both action and object recognition based on video stream data at the lower level of the hierarchy, while the sequence of actions and object states observed is reconstructed at the higher level to form a coherent representation of the task. Furthermore, error recovery capabilities are included to improve robustness to unexpected situations during task execution.

The first part of the thesis focuses on motion learning, which allows the robot both to recognize the actions for task representation at the higher level of the hierarchy and to perform the actions needed to imitate the task. To learn actions efficiently, they are segmented into meaningful atomic units called motion primitives. These motion primitives are then modeled using dynamic movement primitives (DMPs), a dynamical-system model that can robustly generate motion trajectories to arbitrary goal positions while maintaining the overall shape of the demonstrated trajectory. The DMPs also contain weight parameters that reflect the shape of the motion trajectory. These weight parameters are clustered using affinity propagation (AP), an efficient exemplar-based clustering algorithm, to determine groups of similar motion primitives and thus perform motion recognition. The combination of DMPs and AP was experimentally verified on two separate motion data sets for its ability to recognize and generate motion primitives.

The second part of the thesis outlines how the task representation is created and used for imitating observed tasks. This includes object and object-state recognition using simple computer vision techniques, as well as the automatic construction of a Petri net (PN) model to describe an observed task. Tasks are composed of a sequence of actions with specific pre-conditions, i.e. object states required before an action can be performed, and post-conditions, i.e. object states that result from the action. PNs inherently encode the pre-conditions and post-conditions of a particular event, i.e. an action, and can model tasks as a coherent sequence of actions and object states. In addition, PNs are very flexible, modeling a variety of tasks, including tasks that involve both sequential and parallel components. The automatic PN creation process was tested on a sequential two-block stacking task and on a three-block stacking task involving both sequential and parallel components. The PN provides a meaningful representation of the observed tasks that a robot can use to imitate them.

Lastly, error recovery capabilities are added to the learning by imitation system to allow the robot to readjust the sequence of actions needed during task execution.
The error recovery component handles two types of errors: unexpected but known situations, and unexpected, unknown situations. In the case of unexpected but known situations, the learning system searches the PN to identify the known situation and the actions needed to complete the task. This ability is useful not only for error recovery from known situations, but also for human-robot collaboration, where the human unexpectedly helps to complete part of the task. In the case of situations that are both unexpected and unknown, the robot prompts the human demonstrator to teach it how to recover from the error to a known state. By observing the error recovery procedure and automatically extending the PN with the error recovery information, the situation encountered becomes part of the known situations, and the robot can autonomously recover from the error in the future. This error recovery approach was tested successfully on errors encountered during the three-block stacking task.
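A sketch of the DMP machinery described in this abstract may help make it concrete. This is a conventional one-dimensional Ijspeert-style formulation with assumed gains and basis counts, not the thesis's implementation; segmentation into primitives, AP clustering, and the Petri-net layer are omitted.

```python
import numpy as np

class DMP1D:
    """Minimal 1-D discrete dynamic movement primitive.
    Gains, basis count, and width heuristic are conventional defaults (assumptions)."""

    def __init__(self, n_bf=20, alpha=25.0, beta=6.25, alpha_s=3.0):
        self.n_bf, self.alpha, self.beta, self.alpha_s = n_bf, alpha, beta, alpha_s
        self.c = np.exp(-alpha_s * np.linspace(0, 1, n_bf))  # basis centers in phase s
        self.h = n_bf ** 1.5 / self.c                        # heuristic basis widths
        self.w = np.zeros(n_bf)

    def imitate(self, y, dt):
        """Fit the forcing-term weights to one demonstrated trajectory y(t).
        These weights reflect the trajectory's shape (what AP would cluster)."""
        self.x0, self.g, self.tau = y[0], y[-1], len(y) * dt
        yd = np.gradient(y, dt)
        ydd = np.gradient(yd, dt)
        s = np.exp(-self.alpha_s * np.arange(len(y)) * dt / self.tau)
        # Target forcing term from the transformation-system equation.
        f_tgt = self.tau**2 * ydd - self.alpha * (self.beta * (self.g - y) - self.tau * yd)
        xi = s * (self.g - self.x0)                          # forcing-term scaling
        psi = np.exp(-self.h * (s[:, None] - self.c) ** 2)   # (T, n_bf) basis activations
        # Locally weighted regression: one weight per basis function.
        self.w = (psi * (xi * f_tgt)[:, None]).sum(0) / ((psi * (xi**2)[:, None]).sum(0) + 1e-10)

    def rollout(self, dt, goal=None):
        """Reproduce the motion, optionally toward a new goal (shape is preserved)."""
        g = self.g if goal is None else goal
        y, v, s, path = self.x0, 0.0, 1.0, []
        while s > 1e-2:
            psi = np.exp(-self.h * (s - self.c) ** 2)
            f = (psi @ self.w) / (psi.sum() + 1e-10) * s * (g - self.x0)
            v += dt * (self.alpha * (self.beta * (g - y) - v) + f) / self.tau
            y += dt * v / self.tau
            s += dt * (-self.alpha_s * s) / self.tau         # canonical (phase) system
            path.append(y)
        return np.array(path)

# Demonstration: a minimum-jerk-like reach from 0 to 1, then replay to a new goal.
t = np.linspace(0, 1, 200)
demo = t**3 * (10 - 15 * t + 6 * t**2)
dmp = DMP1D()
dmp.imitate(demo, dt=t[1] - t[0])
replay = dmp.rollout(dt=t[1] - t[0], goal=2.0)   # same shape, rescaled to goal 2.0
# Clustering the weight vectors of many such DMPs (e.g. with
# sklearn.cluster.AffinityPropagation) would group similar motion primitives.
```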
|
286 |
Automated Recognition of 3D CAD Model Objects in Dense Laser Range Point Clouds
Bosche, Frederic, January 2008
There is a shift in the Architectural/Engineering/Construction and Facility Management (AEC&FM) industry toward performance-driven projects. Assuring good performance requires efficient and reliable performance control processes. However, the current state of the AEC&FM industry is that control processes are inefficient because they generally rely on manually intensive, inefficient, and often inaccurate data collection techniques.

Critical performance control processes include progress tracking and dimensional quality control. These rely in particular on the accurate and efficient collection of the as-built three-dimensional (3D) status of project objects. However, currently available techniques for as-built 3D data collection are extremely inefficient and provide partial and often inaccurate information. These limitations have a negative impact on the quality of decisions made by project managers and consequently on project success.

This thesis presents an innovative approach for Automated 3D Data Collection (A3dDC). The approach takes advantage of Laser Detection and Ranging (LADAR), 3D Computer-Aided Design (CAD) modeling, and registration technologies. Its performance is investigated with a first set of experimental results obtained with real-life data. A second set of experiments then analyzes the feasibility of implementing, based on the developed approach, automated project performance control (APPC) applications such as automated project progress tracking and automated dimensional quality control. Finally, other applications are identified, including planning for scanning and strategic scanning.
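The recognition idea, matching as-planned CAD surface points against a registered as-built scan, can be pictured as a coverage test. A hedged sketch: the tolerance, coverage threshold, and toy "wall" below are assumptions, and the registration step the thesis performs is taken as given here.

```python
import numpy as np
from scipy.spatial import cKDTree

def object_recognized(model_pts, scan_pts, tol=0.02, min_coverage=0.5):
    """Decide whether a CAD object appears in a registered laser scan.

    model_pts: points sampled from the object's CAD surface (as-planned).
    scan_pts:  the as-built point cloud, already registered to the model frame.
    The object counts as recognized if enough model points have a scan point
    within `tol` meters; both thresholds are assumed values for this sketch."""
    dists, _ = cKDTree(scan_pts).query(model_pts, k=1)
    coverage = float((dists < tol).mean())
    return coverage >= min_coverage, coverage

rng = np.random.default_rng(1)
# Toy "wall": 500 model points sampled on a 1 m x 1 m plane.
model = np.column_stack([rng.random(500), rng.random(500), np.zeros(500)])
# The scan sees roughly 70% of the wall, with 5 mm measurement noise (assumed).
visible = model[model[:, 0] < 0.7]
scan = visible + rng.normal(0.0, 0.005, visible.shape)

ok, cov = object_recognized(model, scan)
print(f"recognized={ok}, coverage={cov:.2f}")   # expect roughly 0.7
```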
|
288 |
Interactive text response for assistive robotics in the home
Ajulo, Morenike, 18 May 2010
In a home environment, there are many tasks that a human may need to accomplish. These activities, ranging from picking up a telephone to clearing rooms in the house, share the common theme of fetching. Such tasks can only be completed correctly with consideration of many factors, including an understanding of what the human wants, recognition of the correct item in the environment, and manipulation and grasping of the object of interest.
The focus of this work is on one aspect of this problem: decomposing an image scene so that a task-specific object of interest can be identified. In this work, communication between human and robot is represented using a feedback formalism, involving the back-and-forth transfer of textual information between the human and the robot until the robot has received all the information necessary to recognize the task-specific object of interest. We name this new communication mechanism Interactive Text Response (ITR), which we believe provides a novel contribution to the field of Human-Robot Interaction.
The methodology involves capturing a view of the scene containing an object of interest. The robot then makes inquiries, based on its current understanding of the scene, to disambiguate between objects in the scene. In this work, we discuss the development of ITR in human-robot interaction, and the understanding of variability, ease of recognition, clutter, and workload needed to develop an interactive robot system.
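The back-and-forth exchange described above can be caricatured as a question-selection loop over object attributes. Everything in this sketch, the attribute set, the greedy even-split rule, and the scripted human replies, is an assumption for illustration rather than the actual ITR mechanism.

```python
from collections import Counter

# Hypothetical scene: candidate objects with simple textual attributes.
scene = [
    {"name": "mug",   "color": "red",   "shape": "cylinder", "location": "table"},
    {"name": "mug",   "color": "blue",  "shape": "cylinder", "location": "shelf"},
    {"name": "phone", "color": "black", "shape": "box",      "location": "table"},
]
ATTRS = ("color", "shape", "location")

def best_question(candidates):
    """Greedy choice: ask about the attribute whose largest answer bucket is
    smallest, i.e. the question that splits the candidates most evenly."""
    def worst_bucket(attr):
        return max(Counter(c[attr] for c in candidates).values())
    return min(ATTRS, key=worst_bucket)

# Scripted answers standing in for the human's textual replies; the target
# object here is the red mug on the table.
answers = {"color": "red", "shape": "cylinder", "location": "table"}

remaining = scene
while len(remaining) > 1:
    attr = best_question(remaining)
    reply = answers[attr]
    print(f"Robot: what is the object's {attr}?  Human: {reply}")
    remaining = [c for c in remaining if c[attr] == reply]

print("Identified object:", remaining[0])
```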
|
289 |
Discriminative object categorization with external semantic knowledge
Hwang, Sung Ju, 25 September 2013
Visual object category recognition is one of the most challenging problems in computer vision. Even assuming that we can obtain a near-perfect instance-level representation with advances in visual input devices and low-level vision techniques, object categorization remains a difficult problem because it requires drawing boundaries between instances in a continuous world, where the boundaries are defined solely by human conceptualization. Object categorization is essentially a perceptual process that takes place in a human-defined semantic space. In this semantic space, categories reside not in isolation but in relation to others: some categories are similar, grouped, or co-occurring, and some are not. Despite this semantic nature of object categorization, however, most of today's automatic visual category recognition systems rely only on the category labels when training discriminative recognition models with statistical machine learning techniques. In many cases, this can mislead the recognition model into learning incorrect associations between visual features and semantic labels by overfitting to training-set biases, which limits the model's predictive power on new test instances.

Using semantic knowledge has great potential to benefit object category recognition. First, semantic knowledge can guide the training model to learn correct associations between visual features and categories. Second, semantics provide much richer information beyond the membership information given by the labels, in the form of inter-category and category-attribute distances, relations, and structures. Finally, semantic knowledge scales well, as the relations between categories grow with an increasing number of categories. My goal in this thesis is to learn discriminative models for categorization that leverage semantic knowledge for object recognition, with a special focus on the semantic relationships among different categories and concepts. To this end, I explore three semantic sources, namely attributes, taxonomies, and analogies, and I show how to incorporate them into the original discriminative model as a form of structural regularization. In particular, for each form of semantic knowledge I present a feature learning approach that defines a semantic embedding to support the object categorization task. The regularization penalizes models that deviate from the known structures according to the semantic knowledge provided.

The first semantic source I explore is attributes, which are human-describable semantic characteristics of an instance. While existing work treated them as mid-level features that did not introduce new information, I focus on their potential as a means to better guide the learning of object categories, by enforcing the object category classifiers to share features with attribute classifiers in a multitask feature learning framework. This approach essentially discovers the common low-dimensional features that support predictions in both semantic spaces. I then move on to the semantic taxonomy, another valuable source of semantic knowledge. The merging and splitting criteria for the categories in a taxonomy are human-defined, and I aim to exploit this implicit semantic knowledge.
Specifically, I propose a tree of metrics (ToM) that learns metrics capturing granularity-specific similarities at the different nodes of a given semantic taxonomy, with a regularizer that isolates granularity-specific disjoint features. This captures the intuition that the features used to discriminate the parent class should differ from the features used for the child classes. The learned metrics can then be used for hierarchical classification. A single taxonomy can be limiting in that its structure may not be optimal for hierarchical classification, and no single semantic taxonomy may perfectly align with the visual distributions. I therefore propose a way to overcome this limitation by leveraging multiple taxonomies as semantic sources and combining the complementary information acquired across multiple semantic views and granularities. This allows us, for example, to synthesize semantics from both 'Biological' and 'Appearance'-based taxonomies when learning the visual features.

Finally, going beyond the previous two pairwise-similarity-based models to more complex semantic relations, I exploit analogies, which encode the relational similarities between two related pairs of categories. Specifically, I use analogies to regularize a discriminatively learned semantic embedding space for categorization, such that the displacements between the two category embeddings in both category pairs of the analogy are enforced to be the same. Such a constraint allows a more confusable pair of categories to benefit from the clear separation in the matched pair of categories that shares the same relation.

All of these methods are evaluated on challenging public datasets and are shown to effectively improve recognition accuracy over purely discriminative models, while also guiding the recognition to be more consistent with human semantic perception. Further, the applications of the proposed methods are not limited to visual object categorization in computer vision; they can be applied to any classification problem where domain knowledge exists about the relationships or structures between the classes. Possible applications outside the visual recognition domain include document classification in natural language processing and gene-based animal or protein classification in computational biology.
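The analogy constraint in particular reduces to a simple penalty on embedding displacements. A sketch under toy assumptions (random embeddings, hypothetical categories); in the thesis this term would regularize a discriminatively trained embedding rather than stand alone.

```python
import numpy as np

def analogy_penalty(E, analogies):
    """Structural regularizer for analogies a:b :: c:d, as described above:
    penalize any mismatch between the displacements E[a]-E[b] and E[c]-E[d].
    In training, this term would be added, scaled by a hyperparameter, to the
    discriminative categorization loss."""
    total = 0.0
    for a, b, c, d in analogies:
        diff = (E[a] - E[b]) - (E[c] - E[d])
        total += float(diff @ diff)
    return total

# Toy category embeddings; the indices and categories are hypothetical:
# 0: lion, 1: lioness, 2: tiger, 3: tigress  ->  lion:lioness :: tiger:tigress
rng = np.random.default_rng(0)
E = rng.normal(size=(4, 8))
analogies = [(0, 1, 2, 3)]
print("penalty before:", round(analogy_penalty(E, analogies), 3))

# Gradient steps on the penalty alone, as a sanity check that the constraint
# drives the two displacements to agree.
for _ in range(100):
    d = (E[0] - E[1]) - (E[2] - E[3])
    E[0] -= 0.1 * d          # gradient of ||d||^2 w.r.t. each embedding
    E[1] += 0.1 * d
    E[2] += 0.1 * d
    E[3] -= 0.1 * d
print("penalty after: ", round(analogy_penalty(E, analogies), 3))   # near zero
```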
|
290 |
Active visual category learning
Vijayanarasimhan, Sudheendra, 02 June 2011
Visual recognition research develops algorithms and representations to autonomously recognize visual entities such as objects, actions, and attributes. The traditional protocol involves manually collecting training image examples, annotating them in specific ways, and then learning models to explain the annotated examples. However, this is a rather limited way to transfer human knowledge to visual recognition systems, particularly considering the immense number of visual concepts that are to be learned.
I propose new forms of active learning that facilitate large-scale transfer of human knowledge to visual recognition systems in a cost-effective way. The approach is cost-effective in the sense that the division of labor between the machine learner and the human annotators respects any cues regarding which annotations would be easy (or hard) for either party to provide. The approach is large-scale in that it can deal with a large number of annotation types, multiple human annotators, and huge pools of unlabeled data. In particular, I consider three important aspects of the problem:
(1) cost-sensitive multi-level active learning, where the expected informativeness of any candidate image annotation is weighed against the predicted cost of obtaining it in order to choose the best annotation at every iteration (a sketch of this criterion follows the list).
(2) budgeted batch active learning, a novel active learning setting that perfectly suits automatic learning from crowd-sourcing services where there are multiple annotators and each annotation task may vary in difficulty.
(3) sub-linear time active learning, where one needs to retrieve those points that are most informative to a classifier in time that is sub-linear in the number of unlabeled examples, i.e., without having to exhaustively scan the entire collection.
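As a hedged sketch of aspect (1): the cost-sensitive selection rule can be caricatured as entropy per unit of annotation cost over an unlabeled pool. The Gaussian toy data, synthetic costs, and the simple entropy criterion are assumptions; the thesis develops a decision-theoretic criterion over multiple annotation types rather than this single-level rule.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy pool: two Gaussian classes in 2-D; only four points start out labeled.
X = np.vstack([rng.normal(-1.0, 1.0, (200, 2)), rng.normal(1.0, 1.0, (200, 2))])
y = np.array([0] * 200 + [1] * 200)
labeled = [0, 1, 398, 399]
# Hypothetical per-example annotation costs (e.g. predicted labeling effort).
cost = rng.uniform(1.0, 5.0, 400)

clf = LogisticRegression()
for _ in range(20):
    clf.fit(X[labeled], y[labeled])
    pool = np.array([i for i in range(400) if i not in set(labeled)])
    p = clf.predict_proba(X[pool])[:, 1]
    # Informativeness: entropy of the predicted label distribution.
    ent = -(p * np.log(p + 1e-12) + (1 - p) * np.log(1 - p + 1e-12))
    # Cost-sensitive value: expected information per unit annotation cost.
    best = pool[np.argmax(ent / cost[pool])]
    labeled.append(int(best))                 # "purchase" this annotation

print("pool accuracy after 20 queries:", round(clf.score(X, y), 3))
```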
Using the proposed solutions for each aspect, I then demonstrate a complete end-to-end active learning system for scalable, autonomous, online learning of object detectors. The approach provides state-of-the-art recognition and detection results, while using minimal total manual effort. Overall, my work enables recognition systems that continuously improve their knowledge of the world by learning to ask the right questions of human supervisors.
|