31

Academic Recommendation System Based on the Similarity Learning of the Citation Network Using Citation Impact

Alshareef, Abdulrhman M. 29 April 2019 (has links)
In today's large and rapidly growing volume of scientific publications, exploring recent studies in a given research area and building effective scientific collaborations have become more challenging than ever before. The growth of scientific production has made it increasingly difficult to identify the most relevant papers to cite or to find an appropriate conference or journal to which to submit a paper. As a result, authors and publishers rely on different analytical approaches to measure relationships within the citation network. Different parameters, such as the impact factor, the number of citations, and co-citation, have been used to assess the impact of a research publication. However, any single assessment factor captures only one level of the relationship, since it does not reflect the effect of the other factors. In this thesis, we propose an approach to measure academic citation impact that helps identify the impact of articles, authors, and venues within their extended nearby citation network. We combine content similarity with bibliometric indices to evaluate the citation impact of articles, authors, and venues in their surrounding citation network. Using article metadata, we calculate the semantic similarity between any two articles in the extended network. We then use the similarity score and bibliometric indices to evaluate the impact of articles, authors, and venues within their extended nearby citation network. Furthermore, we propose an academic recommendation model that identifies latent preferences in the citation network of a given article in order to expose concealed connections between academic objects (articles, authors, and venues) in that network. To reveal the degree of trust for collaboration between academic objects, we use similarity learning to estimate a collaborative confidence score that represents the anticipation of a prospective relationship between academic objects within a scientific community. We conducted an offline experiment on real-world datasets to measure the accuracy of delivering personalized recommendations based on the user's selection preferences. Our evaluation results show a potential improvement in recommendation quality compared to baseline recommendation algorithms that rely on co-citation information.
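A minimal sketch of the kind of scoring described above, blending content similarity with a bibliometric indicator. TF-IDF cosine similarity stands in for the thesis's semantic-similarity measure, and the normalised citation count, the weighting scheme, and all names are illustrative assumptions rather than the author's actual formulation:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def citation_impact_scores(abstracts, citation_counts, alpha=0.5):
    """Score each article in a citation neighbourhood by blending its content
    similarity to the seed article (index 0) with a normalised citation count."""
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(abstracts)
    similarity = cosine_similarity(tfidf[:1], tfidf)[0]          # similarity to the seed article
    max_citations = max(citation_counts) or 1
    bibliometric = [c / max_citations for c in citation_counts]  # simple bibliometric index
    return [alpha * s + (1 - alpha) * b for s, b in zip(similarity, bibliometric)]
```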
32

Especificação, instanciação e experimentação de um arcabouço para criação automática de ligações hipertexto entre informações homogêneas / Specification, instantiation and experimentation of a framework intended to support the task of automatic creation of hypertext links between homogeneous repositories

Macedo, Alessandra Alaniz 02 July 2004 (has links)
Com a evolução da informática, diferentes meios de comunicação passaram a explorar a Web como um meio de divulgação de suas informações. Diferentes fontes de informações, diferentes estilos de escrita e a curiosidade nata do ser humano despertam o interesse de leitores por conhecer mais de um relato sobre um mesmo tema. Para que a leitura de diferentes relatos com conteúdo similar seja possível, leitores precisam procurar, ler e analisar informações fornecidas por diferentes fontes de informação. Essa atividade, além de exigir grande investimento de tempo, sobrecarrega cognitivamente usuários. Faz parte das pesquisas da área de Hipermídia investigar mecanismos que apóiem usuários no processo de identificação de informações em repositórios homogêneos, sejam eles disponibilizados na Web ou não. No contexto desta tese, repositórios com informações de conteúdo homogêneo são aqueles cujas informações tratam do mesmo assunto. Esta tese tem por objetivo investigar a especificação, a instanciação e a experimentação de um arcabouço para apoiar a tarefa de criação automática de ligações hipertexto entre repositórios homogêneos. O arcabouço proposto, denominado CARe (Criação Automática de Relacionamentos), é representado por um conjunto de classes que realizam a coleta de informações a serem relacionadas e que processam essas informações para a geração de índices. Esses índices são relacionados e utilizados na criação automática de ligações hipertexto entre a informação original. A definição do arcabouço se deu após uma fase de análise de domínio na qual foram identificados requisitos e construídos componentes de software. Nessa fase, vários protótipos também foram construídos de modo iterativo / With the evolution of the Internet, distinct communication media have turned to the Web as a channel for publishing their information. An immediate consequence is an abundance of information sources and writing styles on the Web. This, combined with the inherent curiosity of human beings, has led Web users to look for more than a single account of the same subject. To read separate accounts of the same subject, readers need to search, read and analyze information provided by different sources. Besides consuming a great amount of time, that activity imposes a cognitive overhead on users. Research in the hypermedia field has investigated mechanisms for supporting users in the process of identifying information in homogeneous repositories, whether available on the Web or not. In this thesis, homogeneous repositories are those containing information that describes the same subject. This thesis investigates the specification, instantiation and experimentation of a framework intended to support the task of automatically creating hypertext links between homogeneous repositories. The proposed framework, called CARe (Automatic Creation of Relationships), is composed of a set of classes, methods and relationships that gather the information to be related and process that information to generate indexes. Those indexes are related to one another and used to automatically create hypertext links among distinct excerpts of the original information. The framework was defined after a domain analysis phase in which requirements were identified and software components were built. In that same phase, several prototypes were also developed in an iterative prototyping process.
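A minimal sketch of the link-creation idea described above, assuming TF-IDF indexing and cosine similarity as the relatedness measure between two repositories; the function name, threshold, and parameters are illustrative and not taken from CARe:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def propose_links(docs_a, docs_b, threshold=0.3):
    """Propose hypertext links between two repositories on the same subject:
    index both collections, compare every cross-repository pair of documents,
    and keep the pairs whose content similarity exceeds the threshold."""
    matrix = TfidfVectorizer().fit_transform(docs_a + docs_b)
    similarity = cosine_similarity(matrix[:len(docs_a)], matrix[len(docs_a):])
    return [(i, j, float(similarity[i, j]))
            for i in range(similarity.shape[0])
            for j in range(similarity.shape[1])
            if similarity[i, j] >= threshold]
```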
33

Organizational Identity and Community Values: Determining Meaning in Post-secondary Education Social Media Guideline and Policy Documents

Pasquini, Laura Anne 08 1900 (has links)
With the increasing use of social media by students, researchers, administrative staff, and faculty in post-secondary education (PSE), a number of institutions have developed guideline and policy documents to set standards for social media use. Social media platforms and applications have the potential to increase communication channels, support learning, enhance research, and encourage community engagement at PSE institutions. As social media implementation and administration have developed in PSE, there has been minimal assessment of the substance of social media guideline and policy documents. The first objective of this research study was to examine an accessible, online database (corpus) comprised of 24,243 atomic social media guideline and policy text documents from 250 PSE institutions representing 10 countries in order to identify central attributes. To determine text meaning through topic extraction, a rotated latent semantic analysis (rLSA) method was applied. The second objective of this investigation was to determine whether the distribution of topics identified in the corpus differs by the geographic location of the PSE institution. To analyze the diverging topics, the researcher utilized an iterative consensus-building algorithm. Through the maximum term frequencies, LSA determined a rotated 36-factor solution that identified common attributes and topics shared among the 24,243 social media guideline and policy atomic documents. This initial finding produced a list of 36 universal topics discussed in social media guidelines and policies across all 250 PSE institutions from 10 countries. In addition, chi-squared tests, which compared expected and observed document term counts, identified differences in the distribution of content-related factors between US and non-US PSE institutions. This work offers a concrete approach for analyzing unstructured text data on the topic of social media guidance. It resulted in a comprehensive list of recommendations for developing social media guidelines and policies, and a database of social media guideline and policy documents for the PSE sector and other related organizations. Additionally, this research stimulated important theoretical development regarding how organizations socially construct a semantic structure within a community of practice. By assessing the community of practice, comprised of the 250 PSE institutions that direct social media use, the corpus of documents provided unstructured data with which to evaluate the community. The spontaneous participation and reification process of the social media guideline and policy document corpus reaffirmed that a corpus-creating community of practice can instinctively form a knowledge-sharing organization that provides meaning, values, and identity. These findings should stimulate further research contributions, and they provide practitioners and scholars with tools to measure, understand, and assess semantic space for other artifacts developed within a community of practice in other industries, organizations, or distributed associations.
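A minimal sketch of the rotated-LSA topic extraction described above: TF-IDF term weighting, a truncated SVD, and a varimax rotation of the term loadings. The stop-word handling, helper names, and rotation details are assumptions for illustration; a real run would use the full 24,243-document corpus rather than a toy input:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

def varimax(loadings, gamma=1.0, max_iter=100, tol=1e-6):
    """Varimax rotation of a term-by-factor loading matrix."""
    p, k = loadings.shape
    rotation = np.eye(k)
    variance = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        tmp = rotated ** 3 - (gamma / p) * rotated @ np.diag((rotated ** 2).sum(axis=0))
        u, s, vt = np.linalg.svd(loadings.T @ tmp)
        rotation = u @ vt
        if s.sum() < variance * (1 + tol):   # stop when the criterion no longer improves
            break
        variance = s.sum()
    return loadings @ rotation

def rotated_lsa_topics(documents, n_factors=36, top_terms=10):
    """Return the highest-loading terms for each rotated LSA factor."""
    vectorizer = TfidfVectorizer(stop_words="english")
    tfidf = vectorizer.fit_transform(documents)
    svd = TruncatedSVD(n_components=n_factors, random_state=0).fit(tfidf)
    loadings = varimax(svd.components_.T)        # rotate the term loadings
    terms = np.array(vectorizer.get_feature_names_out())
    return [terms[np.argsort(-np.abs(loadings[:, f]))[:top_terms]].tolist()
            for f in range(n_factors)]
```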
34

The predictability problem

Ong, James Kwan Yau January 2007 (has links)
Wir versuchen herauszufinden, ob das subjektive Maß der Cloze-Vorhersagbarkeit mit der Kombination objektiver Maße (semantische und n-gram-Maße) geschätzt werden kann, die auf den statistischen Eigenschaften von Textkorpora beruhen. Die semantischen Maße werden entweder durch Abfragen von Internet-Suchmaschinen oder durch die Anwendung der Latent Semantic Analysis gebildet, während die n-gram-Wortmaße allein auf den Ergebnissen von Internet-Suchmaschinen basieren. Weiterhin untersuchen wir die Rolle der Cloze-Vorhersagbarkeit in SWIFT, einem Modell der Blickkontrolle, und wägen ab, ob andere Parameter den der Vorhersagbarkeit ersetzen können. Unsere Ergebnisse legen nahe, dass ein computationales Modell, welches Vorhersagbarkeitswerte berechnet, nicht nur Maße beachten muss, die die Relatiertheit eines Wortes zum Kontext darstellen; das Vorhandensein eines Maßes bezüglich der Nicht-Relatiertheit ist von ebenso großer Bedeutung. Obwohl hier jedoch nur Relatiertheits-Maße zur Verfügung stehen, sollte SWIFT ebensogute Ergebnisse liefern, wenn wir Cloze-Vorhersagbarkeit mit unseren Maßen ersetzen. / We try to determine whether it is possible to approximate the subjective Cloze predictability measure with two types of objective measures, semantic and word n-gram measures, based on the statistical properties of text corpora. The semantic measures are constructed either by querying Internet search engines or by applying Latent Semantic Analysis, while the word n-gram measures solely depend on the results of Internet search engines. We also analyse the role of Cloze predictability in the SWIFT eye movement model, and evaluate whether other parameters might be able to take the place of predictability. Our results suggest that a computational model that generates predictability values not only needs to use measures that can determine the relatedness of a word to its context; the presence of measures that assert unrelatedness is just as important. However, even though only relatedness measures are available here, we predict that SWIFT should perform just as well when we replace Cloze predictability with our measures.
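The combination of objective measures described above lends itself to a simple regression sketch: fit a model that maps a relatedness score and an n-gram measure onto empirical Cloze predictability. The linear model, feature names, and toy numbers below are illustrative assumptions, not the estimation procedure actually used in the thesis:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def fit_predictability_model(lsa_relatedness, ngram_logfreq, cloze):
    """Fit a linear model approximating Cloze predictability from objective measures.

    lsa_relatedness : relatedness of each target word to its preceding context (e.g. an LSA cosine)
    ngram_logfreq   : log frequency of the word n-gram ending in the target word
    cloze           : empirical Cloze predictability of each target word
    """
    X = np.column_stack([lsa_relatedness, ngram_logfreq])
    return LinearRegression().fit(X, np.asarray(cloze))

# Hypothetical toy data: three target words with their objective measures and Cloze scores.
model = fit_predictability_model([0.42, 0.10, 0.75], [-3.1, -6.0, -2.2], [0.55, 0.05, 0.80])
print(model.coef_, model.intercept_)
```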
35

Tinklalapio navigavimo asociacijų analizės ir prognozavimo modelis / A model for analyzing and predicting the scent of a web site

Kučaidze, Artiom 08 September 2009 (has links)
Darbe, remiantis informacijos paieškos teorija, bandoma sukurti tinklalapio navigavimo asociacijų analizės ir prognozavimo modelį. Šio modelio tikslas – simuliuoti potencialių tinklalapio vartotojų informacijos paieškos kelius turint apibrėžtą informacinį tikslą. Modelis kuriamas apjungiant LSA, SVD algoritmus ir koreliacijos koeficientų skaičiavimus. LSA algoritmas naudojamas kuriant semantines erdves, o koreliacijos koeficientų skaičiavimai naudojami statistikoje. Kartu jie leidžia tinklalapio navigavimo asociacijų analizės ir prognozavimo modeliui analizuoti žodžių semantinį panašumą. Darbo eigoje išskiriamos pagrindinės problemos, su kuriomis gali susidurti tinklalapio lankytojai sudarant tinklalapio navigavimo asociacijas – tai yra konkurencijos tarp nuorodų problema, klaidinančių nuorodų problema ir nesuprantamų nuorodų problema. Demonstruojama kaip sukurtas modelis atpažįsta ir analizuoja šias problemas. / In this work we develop a model for analyzing and predicting the scent of a web site, based on information foraging theory. The goal of the model is to simulate potential web site users and their information foraging paths given a specific information goal. The model is built by combining the LSA and SVD algorithms with correlation coefficient calculations. The LSA algorithm is used to create semantic spaces, and the correlation coefficients are used for the statistical analysis; together they make it possible to analyze the semantic similarity of words. The work also identifies the main problems that visitors can encounter while following the scent of a web site: competition between links, misleading links, and incomprehensible links. We demonstrate how the model recognizes and analyzes these problems.
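A minimal sketch of the kind of LSA-based scent estimate described above: build a semantic space from the site's text, then rank the links on a page by their similarity to the user's information goal. The TF-IDF/TruncatedSVD pipeline and the function signature are illustrative assumptions rather than the model's exact construction:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

def link_scent(site_texts, goal, link_labels, n_components=100):
    """Rank link labels by semantic similarity to the user's information goal in an LSA space."""
    vectorizer = TfidfVectorizer()
    lsa = TruncatedSVD(n_components=n_components, random_state=0)
    lsa.fit(vectorizer.fit_transform(site_texts))      # semantic space built from the site's pages
    vectors = lsa.transform(vectorizer.transform([goal] + link_labels))
    scores = cosine_similarity(vectors[:1], vectors[1:])[0]
    return sorted(zip(link_labels, scores), key=lambda pair: -pair[1])
```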
36

Μελέτη και συγκριτική αξιολόγηση μεθόδων δόμησης περιεχομένου ιστοτόπων : εφαρμογή σε ειδησεογραφικούς ιστοτόπους / Study and comparative evaluation of methods for structuring website content: application to news sites

Στογιάννος, Νικόλαος-Αλέξανδρος 20 April 2011 (has links)
Η κατάλληλη οργάνωση του περιεχομένου ενός ιστοτόπου, έτσι ώστε να αυξάνεται η ευρεσιμότητα των πληροφοριών και να διευκολύνεται η επιτυχής ολοκλήρωση των τυπικών εργασιών των χρηστών, αποτελεί έναν από τους πρωταρχικούς στόχους των σχεδιαστών ιστοτόπων. Οι υπάρχουσες τεχνικές του πεδίου Αλληλεπίδρασης-Ανθρώπου Υπολογιστή που συνεισφέρουν στην επίτευξη αυτού του στόχου συχνά αγνοούνται εξαιτίας των απαιτήσεών τους σε χρονικούς και οικονομικούς πόρους. Ειδικότερα για ειδησεογραφικούς ιστοτόπους, τόσο το μέγεθος τους όσο και η καθημερινή προσθήκη και τροποποίηση των παρεχόμενων πληροφοριών, καθιστούν αναγκαία τη χρήση αποδοτικότερων τεχνικών για την οργάνωση του περιεχομένου τους. Στην εργασία αυτή διερευνούμε την αποτελεσματικότητα μίας μεθόδου, επονομαζόμενης AutoCardSorter, που έχει προταθεί στη βιβλιογραφία για την ημιαυτόματη κατηγοριοποίηση ιστοσελίδων, βάσει των σημασιολογικών συσχετίσεων του περιεχομένου τους, στο πλαίσιο οργάνωσης των πληροφοριών ειδησεογραφικών ιστοτόπων. Για το σκοπό αυτό διενεργήθηκαν πέντε συνολικά μελέτες, στις οποίες πραγματοποιήθηκε τόσο ποσοτική όσο και ποιοτική σύγκριση των κατηγοριοποιήσεων που προέκυψαν από συμμετέχοντες σε αντίστοιχες μελέτες ταξινόμησης καρτών ανοικτού και κλειστού τύπου, με τα αποτελέσματα της τεχνικής AutoCardSorter. Από την ανάλυση των αποτελεσμάτων προέκυψε ότι η AutoCardSorter παρήγαγε ομαδοποιήσεις άρθρων που βρίσκονται σε μεγάλη συμφωνία με αυτές των συμμετεχόντων στις μελέτες, αλλά με σημαντικά αποδοτικότερο τρόπο, επιβεβαιώνοντας προηγούμενες παρόμοιες μελέτες σε ιστοτόπους άλλων θεματικών κατηγοριών. Επιπρόσθετα, οι μελέτες έδειξαν ότι μία ελαφρώς τροποποιημένη εκδοχή της AutoCardSorter τοποθετεί νέα άρθρα σε προϋπάρχουσες κατηγορίες με αρκετά μικρότερο ποσοστό συμφωνίας συγκριτικά με τον τρόπο που επέλεξαν οι συμμετέχοντες. Η εργασία ολοκληρώνεται με την παρουσίαση κατευθύνσεων για την βελτίωση της αποτελεσματικότητας της AutoCardSorter, τόσο στο πλαίσιο οργάνωσης του περιεχομένου ειδησεογραφικών ιστοτόπων όσο και γενικότερα. / The proper structuring of a website's content, so as to increase the findability of the information provided and to ease the completion of typical user tasks, is one of the primary goals of website designers. The existing methods from the field of Human-Computer Interaction that assist designers in this are often neglected because of their demands on time and resources. This is even more true for news sites, whose size and daily content updates call for more efficient techniques for organizing their content. In this thesis we investigate the effectiveness of a method called AutoCardSorter, which has been proposed in the literature for the semi-automatic categorisation of web pages based on the semantic similarity of their content, in the context of organizing the information of news sites. To this end we conducted five studies in which the categorisations produced by participants in corresponding open and closed card-sorting studies were compared, both quantitatively and qualitatively, with the results of AutoCardSorter. The analysis showed that AutoCardSorter produced groupings of articles in close agreement with those of the study participants, but in a significantly more efficient way, confirming previous similar studies on websites of other themes (e.g. travel, education). Moreover, the studies showed that a slightly modified version of the method places new articles into pre-existing categories with considerably lower agreement with the participants' choices. The thesis concludes with directions for improving the effectiveness of AutoCardSorter, both in the context of organizing news site content and in general.
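A minimal sketch of the semi-automatic categorisation idea behind a tool like AutoCardSorter: compute pairwise content similarity between articles and cut a hierarchical clustering of the resulting distances into the desired number of categories. TF-IDF similarity and average-linkage clustering are stand-in assumptions; the actual tool's similarity measure and algorithm may differ:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from scipy.cluster.hierarchy import linkage, fcluster

def cluster_articles(article_texts, n_categories):
    """Group articles into categories by hierarchically clustering pairwise content similarity."""
    similarity = cosine_similarity(TfidfVectorizer(stop_words="english").fit_transform(article_texts))
    distance = 1.0 - similarity
    # Condensed distance vector (upper triangle) as expected by scipy's linkage.
    condensed = distance[np.triu_indices_from(distance, k=1)]
    tree = linkage(condensed, method="average")
    return fcluster(tree, t=n_categories, criterion="maxclust")
```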
38

Text mining Twitter social media for Covid-19 : Comparing latent semantic analysis and latent Dirichlet allocation

Sheikha, Hassan January 2020 (has links)
In this thesis, Twitter social media data is mined for information about the Covid-19 outbreak during the month of March, from the 3rd to the 31st. 100,000 tweets were collected from Harvard's open-source data and recreated using Hydrate. This data is analyzed further using different natural language processing (NLP) methodologies, such as term frequency-inverse document frequency (TF-IDF), lemmatizing, tokenizing, Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA). The results of the LSA and LDA algorithms are dimensionally reduced data that are then clustered using the clustering algorithms HDBSCAN and K-Means for later comparison. Different methodologies are used to determine the optimal parameters for the algorithms. This is all done in the Python programming language, as there are libraries supporting this research, the most important being scikit-learn. The frequent words of each cluster are then displayed and compared with factual data regarding the outbreak to discover whether there are any correlations. The factual data is collected by the World Health Organization (WHO) and visualized in graphs on ourworldindata.org. Correlations with the results are also looked for in news articles to find significant moments that might have affected the top words in the clustered data. The news articles with good timelines used for correlating incidents are those of NBC News and The New York Times. The results show no direct correlations with the data reported by WHO; however, looking into the timelines reported by news sources, some correlation can be seen with the clustered data. Also, the combination of LDA and HDBSCAN yielded the most desirable results in comparison to the other combinations of dimensionality reduction and clustering. This was largely due to the use of GridSearchCV on LDA to determine the ideal parameters for the LDA models on each dataset, as well as how well HDBSCAN clusters its data in comparison to K-Means.
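A minimal sketch of the LDA-plus-HDBSCAN pipeline described above, using scikit-learn's GridSearchCV to pick LDA parameters and the hdbscan package for clustering; the parameter grid and vectorizer settings are illustrative assumptions, not the thesis's exact configuration:

```python
import hdbscan  # third-party package: pip install hdbscan
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import GridSearchCV

def lda_hdbscan_clusters(tweets):
    """Reduce tweets to topic proportions with a grid-searched LDA model,
    then cluster the topic-space representation with HDBSCAN (-1 marks noise)."""
    counts = CountVectorizer(stop_words="english", min_df=5).fit_transform(tweets)
    grid = GridSearchCV(
        LatentDirichletAllocation(random_state=0),
        param_grid={"n_components": [5, 10, 20], "learning_decay": [0.5, 0.7, 0.9]},
        cv=3,  # scored with LDA's default approximate log-likelihood
    )
    grid.fit(counts)
    doc_topics = grid.best_estimator_.transform(counts)
    return hdbscan.HDBSCAN(min_cluster_size=15).fit_predict(doc_topics)
```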
39

Automatická tvorba tezauru z wikipedie / Acquiring Thesauri from Wikipedia

Novák, Ján January 2011 (has links)
This thesis deals with the automatic acquisition of thesauri from Wikipedia. It describes Wikipedia as a suitable data set for thesaurus acquisition and presents methods for computing the semantic similarity of terms. The thesis also describes the design and implementation of a system for automatic thesaurus acquisition. Finally, the implemented system is evaluated using standard metrics such as precision and recall.
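A minimal sketch of the kind of evaluation mentioned above, comparing the related terms proposed for a thesaurus entry against a reference (gold) entry; the example terms are hypothetical:

```python
def precision_recall(predicted, gold):
    """Precision and recall of predicted related terms against a gold thesaurus entry."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall

# Hypothetical example: related terms proposed for "car" vs. a reference thesaurus entry.
print(precision_recall({"vehicle", "automobile", "engine"}, {"vehicle", "automobile", "truck"}))
```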
40

Získávání skrytých znalostí z online dat souvisejících s vysokými školami / Mining hidden knowledge from online data related to universities

Hlaváč, Jakub January 2019 (has links)
Social networks are a popular form of communication. They are also used by universities to simplify the provision of information and to reach prospective students. Study stays abroad are likewise a popular form of education, but students encounter a number of obstacles when pursuing them. The results of this work can help universities make their social network communication more efficient and better support studies abroad. In this work, Facebook data related to Czech universities and questionnaire data from the Erasmus programme were analyzed in order to find useful knowledge. The main emphasis was on the textual content of the communication. Statistical and machine learning methods were used, mainly feature selection, topic modeling and clustering. The results reveal interesting and popular topics discussed on the social networks of Czech universities. The main problems students face in relation to their studies abroad were also identified, and some of them were compared across countries and universities.
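A minimal sketch of the feature-selection step mentioned above, using a chi-squared test to surface the terms most associated with a given grouping of posts; the function name, labels, and parameters are illustrative assumptions:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, chi2

def top_discriminative_terms(texts, labels, k=20):
    """Select the k terms that best separate the labeled groups of posts (chi-squared test)."""
    vectorizer = CountVectorizer(stop_words="english")
    X = vectorizer.fit_transform(texts)
    selector = SelectKBest(chi2, k=k).fit(X, labels)
    terms = vectorizer.get_feature_names_out()
    return [terms[i] for i in selector.get_support(indices=True)]

# Hypothetical usage: posts labeled by the university page they were collected from.
# print(top_discriminative_terms(posts, universities, k=20))
```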
