Global ETD Search

11	A longitudinal study of academic web links : identifying and explaining change Payne, Nigel January 2007 (has links) A problem common to all current web link analyses is that, as the web is continuously evolving, any web-based study may be out of date by the time it is published in academic literature. It is therefore important to know how web link analysis results vary over time, with a low rate of variation lengthening the amount of time corresponding to a tolerable loss in quality. Moreover, given the lack of research on how academic web spaces change over time, from an information science perspective it would be interesting to see what patterns and trends could be identified by longitudinal research, and the study of university web links seems to provide a convenient means by which to do so. The aim of this research is to identify and track changes in three academic webs (UK, Australia and New Zealand) over time, tracking various aspects of academic webs including site size and overall linking characteristics, and to provide theoretical explanations of the changes found. This should therefore provide some insight into the stability of previous and future webometric analyses. Alternative Document Models (ADMs), created with the purpose of reducing the extent to which anomalies occur in counts of web links at the page level, have been used extensively within webometrics as an alternative to using the web page as the basic unit of analysis. This research carries out a longitudinal study of ADMs in an attempt to ascertain which model gives the most consistent results when applied to the UK, Australia and New Zealand academic web spaces over the last six years. The results show that the domain ADM gives the most consistent results, with the directory ADM also giving more reliable results than are evident when using the standard page model. 
Aggregating at the site (or university) level appears to provide less consistent results than using the page as the standard unit of measure, and this finding holds true over all three academic webs and for each time period examined over the last six years. The question of whether university web sites publish the same kind of information and use the same kind of hyperlinks year on year is important from the perspective of interpreting the results of academic link analyses, because changes in link types over time would also force interpretations of link analyses to change over time. This research uses a link classification exercise to identify temporal changes in the distribution of different types of academic web links, using three academic web spaces in the years 2000 and 2006. Significant increases in ‘research oriented’, ‘social/leisure’ and ‘superficial’ links were identified, as well as notable decreases in the ‘technical’ and ‘personal’ links. Some of the changes identified may be explained by general changes in the management of university web sites and some by more widespread Internet trends, e.g., dynamic pages, blogs and social networking. The increase in the proportion of research-oriented links is particularly hopeful for future link analysis research. Identifying quantitative trends in the UK, Australian and New Zealand academic webs from 2000 to 2005 revealed that the number of static pages and links in each of the three academic webs appears to have stabilised as far back as 2001. This stabilisation may be partly due to an increase in dynamic pages, which are normally excluded from webometric analyses. In response to the problem for webometricians due to the constantly changing nature of the Internet, the results presented here are encouraging evidence that webometrics for academic spaces may have a longer-term validity than would have been previously assumed. 
The relationship between university inlinks and research activity indicators over time was examined, as well as the reasons for individual universities experiencing significant increases and decreases in inlinks over the last six years. The findings indicate that between 66% and 70% of outlinks remain the same year on year for all three academic web spaces, although this stability conceals large individual differences. Moreover, there is evidence of a level of stability over time for university site inlinks when measured against research. Surprisingly, however, inlink counts can vary significantly from year to year for individual universities, for reasons unrelated to research, underlining that webometric results should be interpreted cautiously at the level of individual universities. Therefore, on average since 2001 the university web sites of the UK, Australia and New Zealand have been relatively stable in terms of size and linking patterns, although this hides a constant renewing of old pages and areas of the sites. In addition, the proportion of research-related links seems to be slightly increasing. Whilst the former suggests that webometric results are likely to have a surprisingly long shelf-life, perhaps closer to five years than one year, the latter suggests that webometrics is going to be increasingly useful as a tool to track research online. While there have already been many studies involving academic web spaces, and much work has been carried out on the web from a longitudinal perspective, this thesis concentrates on filling a critical gap in current webometric research by combining the two and undertaking a longitudinal study of academic webs. In comparison with previous web-related longitudinal studies this thesis makes a number of novel contributions. 
Some of these stem from extending established webometric results, either by introducing a longitudinal aspect (looking at how various academic web metrics such as research activity indicators, site size or inlinks change over time) or by their application to other countries. Other contributions are made by combining traditional webometric methods (e.g. combining topical link classification exercises with longitudinal study) or by identifying and examining new areas for research (for example, dynamic pages and non-HTML documents). No previous web-based longitudinal studies have focused on academic links and so the main findings that (for UK, Australian and New Zealand academic webs between 2000 and 2006) certain academic link types exhibit changing patterns over time, approximately two-thirds of outlinks remain the same year on year and the number of static pages and links appears to have stabilised are both significant and novel. 025.04
12	Effective web service discovery using a combination of a semantic model and a data mining technique Bose, Aishwarya January 2008 (has links) With the advent of Service Oriented Architecture, Web services have gained tremendous popularity. Because so many Web services are available, finding an appropriate Web service that meets the user's requirement is a challenge. This warrants the need to establish an effective and reliable process of Web service discovery. A considerable body of research has emerged to develop methods to improve the accuracy of Web service discovery to match the best service. The process of Web service discovery results in suggesting many individual services that partially fulfil the user’s interest. Considering the semantic relationships of the words used in describing the services, as well as the input and output parameters, can lead to accurate Web service discovery. Appropriate linking of individual matched services should fully satisfy the requirements which the user is looking for. This research proposes to integrate a semantic model and a data mining technique to enhance the accuracy of Web service discovery. A novel three-phase Web service discovery methodology has been proposed. The first phase performs match-making to find semantically similar Web services for a user query. In order to perform semantic analysis on the content present in the Web service description language document, the support-based latent semantic kernel is constructed using an innovative concept of binning and merging on a large quantity of text documents covering diverse domains of knowledge. The use of a generic latent semantic kernel constructed with a large number of terms helps to find the hidden meaning of the query terms which otherwise could not be found. Sometimes a single Web service is unable to fully satisfy the requirement of the user. 
In such cases, a composition of multiple inter-related Web services is presented to the user. The task of checking the possibility of linking multiple Web services is done in the second phase. Once the feasibility of linking Web services is checked, the objective is to provide the user with the best composition of Web services. In the link analysis phase, the Web services are modelled as nodes of a graph and an all-pairs shortest-path algorithm is applied to find the optimum path at the minimum cost for traversal. The third phase, system integration, integrates the results from the preceding two phases by using an original fusion algorithm in the fusion engine. Finally, the recommendation engine, which is an integral part of the system integration phase, makes the final recommendations, including individual and composite Web services, to the user. In order to evaluate the performance of the proposed method, extensive experimentation has been performed. Results of the proposed support-based semantic kernel method of Web service discovery are compared with the results of the standard keyword-based information-retrieval method and a clustering-based machine-learning method of Web service discovery. The proposed method outperforms both information-retrieval and machine-learning based methods. Experimental results and statistical analysis also show that the best Web service compositions are obtained by considering 10 to 15 Web services that are found in phase-I for linking. Empirical results also ascertain that the fusion engine boosts the accuracy of Web service discovery by combining the inputs from both the semantic analysis (phase-I) and the link analysis (phase-II) in a systematic fashion. Overall, the accuracy of Web service discovery with the proposed method shows a significant improvement over traditional discovery methods.
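The link analysis phase described in this abstract (services as graph nodes, an all-pairs shortest-path search for the cheapest composition) can be illustrated with a minimal sketch. It assumes the Floyd-Warshall algorithm, which the abstract does not name, and uses hypothetical service names and link costs; it is not the thesis's implementation.

```python
from math import inf


def all_pairs_shortest_paths(nodes, edges):
    """Floyd-Warshall: edges maps (u, v) to the cost of the directed link u -> v."""
    dist = {u: {v: (0 if u == v else inf) for v in nodes} for u in nodes}
    nxt = {u: {v: None for v in nodes} for u in nodes}
    for (u, v), cost in edges.items():
        if cost < dist[u][v]:
            dist[u][v] = cost
            nxt[u][v] = v
    # Allow each node k in turn to act as an intermediate hop.
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
                    nxt[i][j] = nxt[i][k]
    return dist, nxt


def path(nxt, src, dst):
    """Rebuild the cheapest chain of services from src to dst ([] if unreachable)."""
    if src == dst:
        return [src]
    if nxt[src][dst] is None:
        return []
    hops = [src]
    while src != dst:
        src = nxt[src][dst]
        hops.append(src)
    return hops


# Hypothetical services; an edge means one service's output can feed the next.
services = ["geocode", "weather", "route", "report"]
link_costs = {
    ("geocode", "weather"): 1.0,
    ("geocode", "route"): 4.0,
    ("weather", "route"): 1.5,
    ("route", "report"): 1.0,
    ("weather", "report"): 5.0,
}

dist, nxt = all_pairs_shortest_paths(services, link_costs)
best = path(nxt, "geocode", "report")
print(best, dist["geocode"]["report"])
```

Computing all pairs at once suits a setting where many queries reuse the same service graph; for a single source and target, a single-source search would do less work.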
13	Trilhas de comunicação científica : links de postagens de pesquisadores brasileiros nos blogs de ciência / Trails of scientific communication: links of posts of Brazilian researchers in science blogs Sousa, Rodrigo Silva Caxias de January 2011 (has links) The aim of this study is to interpret the use of links in blog posts by Brazilian researchers. The investigation begins by examining the links of the blogs included in the Anel de Blogs Científicos (Ring of Science Blogs). It is carried out by assembling the clusters of link networks derived from the blogrolls of the science blogs within the Anel de Blogs Científicos. Afterwards, the 640 links included in the content of the posts were classified according to categories of functions and motivations obtained from previous studies and categories that emerged from the phenomenon studied. Finally, the motivations for inserting links in blog posts and the functions that those links fulfill were inferred through Content Analysis of the contexts in which the links were inserted, considering the locations to which the reader is taken on following them. 
Results indicate that the concept of personal journals, in which messages have a limited number of characters and are presented in reverse chronological order, is confirmed only for the latter feature, which results from the software used to compose the posts. The low incidence of links to researchers' blogs indicates low connectivity among the blogs of the different authors who make up the sample, both within the Anel and within the areas to which they belong, and this is reinforced by the scarcity of links between posts and comments and by the absence of trackback links among the comments on the selected posts. The first hypothesis guiding the study was refuted, in that the links in the posts do not indicate a rearticulation of the dialogue among researchers, lay people and science journalists, and so do not allow that dialogue to be reordered or extended to society more widely. The second hypothesis was confirmed, since the links show that the documents and information sources they connect are a hybrid of the use and sharing of information, drawn both from scientific sources and documents and from sources and documents not traditionally characterized as part of the cycle of scientific production. The third hypothesis was refuted, because the data analyzed indicate that the use of links by Brazilian researchers is not based on functions and motivations aimed at streamlining the production and communication of research results through blogs. Blog científico Comunicação científica Webometria Análise de links Blogs Science communication Link analysis Webometrics
15	Segmentação dos usuários de cartão de crédito por meio da análise de cesto de compras / Segmentation of credit card clients by market basket analysis Pedro Daniel Tavares 17 January 2012 (has links) The objective of this master's dissertation is to develop a segmentation model based on clients' demonstrated consumption behavior, using Link Analysis (association analysis) and Market Basket Analysis techniques applied to data from clients' credit card statements. The proposed model was then used to test the predictability of clients' next transactions on a validation sample. The research is motivated by three pillars: the scientific, technological and marketing contexts. In the scientific context, although studies have been published that associate credit card use with customer segmentation profiles, no published studies use data from the card usage itself as a source of information about the client. The most likely reason is the difficulty of collecting the data such research requires. This work was made feasible by the support of a large Brazilian financial institution, on the condition that the analysis covers only anonymized client data and reveals no strategic information about the institution. In the technological context, with information technology developing steadily, credit card transactions are processed online in real time, so information is exchanged between the merchant and the card issuer at the moment the charge is posted and accepted by the final consumer. This allows promotional actions to be carried out along the whole credit card value chain, generating more value for clients and companies. 
In the marketing context, Brazil has shown high growth rates in the credit card market in recent decades, with cards replacing older means of payment and store credit. In Brazil in particular, purchases paid by credit card in installments, with or without interest, are common, which contributes to the replacement of other forms of credit. The research concludes that, given knowledge of a client's consumption, market basket analysis can be applied to forecast the client's next transactions, in order to segment clients and encourage them to take up a particular offer. Cartão de crédito Segmentação de mercado Credit card Customer relationship management Link analysis Market basket analysis Segmentation
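As an illustration of the market basket analysis this abstract relies on, the following is a minimal sketch of pairwise association rules scored by support and confidence. The merchant categories, thresholds and function name are hypothetical, and the dissertation's actual model and data are not described in enough detail here to reproduce.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, min_support=0.4, min_confidence=0.6):
    """Return rules (lhs, rhs, support, confidence) over item pairs.

    support    = share of baskets containing both items
    confidence = share of baskets with lhs that also contain rhs
    """
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # ignore repeats within one basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    # Strongest rules first; ties broken alphabetically for stable output.
    return sorted(rules, key=lambda r: (-r[3], r[0], r[1]))


# Hypothetical data: merchant categories seen on each client's statement.
baskets = [
    ["fuel", "supermarket", "pharmacy"],
    ["fuel", "supermarket"],
    ["supermarket", "pharmacy"],
    ["fuel", "restaurant"],
    ["fuel", "supermarket", "restaurant"],
]

rules = pair_rules(baskets)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

A rule such as "pharmacy -> supermarket" with high confidence is the kind of signal that could be used to predict a client's next transaction category or to group clients by shared rules.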
17	A Comparison of Katz-eig and Link-analysis for Implicit Feedback Recommender Systems / En jämförelse av Katz-eig och Link-analysis för rekommendationssystem med implicit återkoppling Hietala, Jonas January 2015 (has links) Recommendations are becoming more and more important in a world with an abundance of possible choices, and e-commerce and content providers feature recommendations prominently. Recommendations based on explicit feedback, where users give feedback with, for example, ratings, have been a popular research subject. Implicit feedback recommender systems, which passively collect information about users, are an area of growing interest. They make it possible to generate recommendations based purely on a user's interaction history without requiring any explicit input from the user, which is commercially useful for a wide range of businesses. This thesis builds a recommender system based on implicit feedback using the recommendation algorithms katz-eig and link-analysis, and analyzes and implements strategies for learning optimized parameters for different datasets. The resulting system forms the foundation for Comordo Technologies' commercial recommender system. katz-eig link analysis recommender systems machine learning Computer Sciences Datavetenskap (datalogi)
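18	A Study on Social Information Search and Analysis on the Web by Diversity Computation / 多様性計算に基づくウェブ上のソーシャル情報の検索と分析に関する研究 Shoji, Yoshiyuki 23 March 2015 (has links) Kyoto University / 0048 / New-system doctorate by coursework / Doctor of Informatics / Degree No. Kō 19119 / Informatics Doctorate No. 565 / 新制||情||99 (University Library) / 32070 / Department of Social Informatics, Graduate School of Informatics, Kyoto University / (Chief examiner) Prof. Katsumi Tanaka, Prof. Masatoshi Yoshikawa, Prof. Sadao Kurohashi / Conferred under Article 4, Paragraph 1 of the Degree Regulations / DFAM Information retrieval Web search Social media Diversity Link analysis Web 2.0 007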
19	Collaboration between UK universities : a machine-learning based webometric analysis Kenekayoro, Patrick January 2014 (has links) Collaboration is essential for some types of research, which is why some agencies include collaboration among the requirements for funding research projects. Studying collaborative relationships is important because analyses of collaboration networks can give insights into knowledge-based innovation systems, the roles that different organisations play in a research field and the relationships between scientific disciplines. Co-authored publication data is widely used to investigate collaboration between organisations, but this data is not free and thus may not be accessible to some researchers. Hyperlinks have some similarities with citations, so hyperlink data may be used as an indicator to estimate the extent of collaboration between academic institutions and may be able to show types of relationships that are not present in co-authorship data. However, it has been shown that using raw hyperlink counts for webometric research can sometimes produce unreliable results, so researchers have attempted to find alternative counting methods and have tried to identify the reasons why hyperlinks may have been created in academic websites. This thesis uses machine learning techniques, an approach that has not previously been widely used in webometric research, to automatically classify hyperlinks and text in university websites in an attempt to filter out irrelevant hyperlinks when investigating collaboration between academic institutions. Supervised machine learning methods were used to automatically classify the web page types that can be found in Higher Education Institutions’ websites. The results were assessed to see whether automatically filtered hyperlink data gave better results than raw hyperlink data in terms of identifying patterns of collaboration between UK universities.
Unsupervised learning methods were used to automatically identify groups of university departments that are collaborating or that may benefit from collaborating, based on their co-appearance in research clusters. Results show that the machine learning methods used in this thesis can automatically identify both the source and target web page categories of hyperlinks in university websites with up to 78% accuracy, which increases the possibility of more effective hyperlink classification or of identifying the reasons why hyperlinks may have been created in university websites, if those reasons can be inferred from the relationship between the source and target page types. When machine learning techniques were used to filter out of the hyperlink data those hyperlinks that may not have been created because of collaboration, the correlation between hyperlink data and other collaboration indicators increased. This emphasises the possibility of using machine learning methods to make hyperlink data a more reliable data source for webometric research. The reasons for university name mentions in the different web page types found in an academic institution’s website are broadly the same as the reasons for link creation, which means that classification based on inter-page relationships may also be used to improve name mention data for webometrics research. Clustering research groups based on the text in their homepages may be useful for identifying research groups or departments with similar research interests. This may be valuable to policy makers for monitoring research fields, based on the sizes of the identified clusters, and for identifying future collaborators, based on co-appearances in clusters, if identical research interests are a factor that can influence the choice of a future collaborator.
In conclusion, this thesis shows that machine learning techniques can be used to significantly improve the quality of hyperlink data for webometrics research, and can also be used to analyse other web-based data to give additional insights that may be beneficial for webometrics studies. 378.1
20	[en] XHITS: EXTENDING THE HITS ALGORITHM FOR DISTILLATION OF BROAD SEARCH TOPIC ON WWW / [pt] XHITS: ESTENDENDO O ALGORITMO HITS PARA EXTRAÇÃO DE TÓPICOS NA WWW FRANCISCO BENJAMIM FILHO 20 September 2005 (has links) The network structure of a hyperlinked environment can be a rich source of information about the content of that environment. Jon Kleinberg developed a set of algorithms, popularly known as HITS (Hyperlink Induced Topic Search), for extracting this information from the hyperlink structure of the WWW. The aim of these algorithms is the distillation of broad search topics through the discovery of pages that are authoritative on those topics. The notion of authority is based on the relationship, induced by the hyperlink structure, between the set of relevant authoritative pages and the set of pages that point to them, called hubs. Hubs and authorities thus exhibit what could be called a mutually reinforcing relationship: a good hub is a page that points to many good authorities, and a good authority is a page that is pointed to by many good hubs. In this work, we present the XHITS (Extended Hyperlink Induced Topic Search) algorithm, an extension of HITS that introduces new concepts into the mutually reinforcing relationship in order to improve the ranking of authorities. In XHITS, a good authority is a page that is pointed to by many good hubs and by some good portals, and that points to good novels; a good hub is a page that points to many good authorities and some good novels, and that is pointed to by some good portals; a good novel is a page that is pointed to by good authorities, some good hubs and some good portals; and a good portal is a page that points to some good authorities, some good hubs and some good novels. In addition, we show that XHITS converges and present experimental results that indicate improved precision of the hyperdocuments retrieved. AUTHORITY LINK ANALYSIS HUBS HITS XHITS
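The mutually reinforcing relationship between hubs and authorities described in this abstract can be sketched as a short power iteration. This is a minimal illustration of basic HITS only, not of the XHITS extension. The function names `hits` and `normalize`, the toy graph and the fixed iteration count are assumptions made for this example and are not taken from the thesis.

```python
from math import sqrt


def normalize(scores):
    """Scale scores to unit Euclidean length (left unchanged if all zero)."""
    norm = sqrt(sum(v * v for v in scores.values()))
    if norm == 0.0:
        return dict(scores)
    return {page: v / norm for page, v in scores.items()}


def hits(links, iterations=50):
    """Basic HITS on a graph given as {page: [pages it links to]}.

    Returns (hub, auth), two dicts mapping each page to a score.
    The iteration count is an arbitrary illustrative choice.
    """
    pages = set(links) | {dst for targets in links.values() for dst in targets}
    hub = {page: 1.0 for page in pages}
    auth = {page: 1.0 for page in pages}
    for _ in range(iterations):
        # A good authority is pointed to by good hubs:
        # authority score = sum of hub scores of the pages linking to it.
        new_auth = {page: 0.0 for page in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auth[dst] += hub[src]
        # A good hub points to good authorities:
        # hub score = sum of authority scores of the pages it links to.
        new_hub = {
            page: sum(new_auth[dst] for dst in links.get(page, []))
            for page in pages
        }
        auth = normalize(new_auth)
        hub = normalize(new_hub)
    return hub, auth


if __name__ == "__main__":
    # Hypothetical toy graph: h1 and h2 act as hubs, a1 and a2 as authorities.
    toy = {"h1": ["a1", "a2"], "h2": ["a1"], "a1": [], "a2": []}
    hub, auth = hits(toy)
    for page in sorted(toy):
        print(page, round(hub[page], 3), round(auth[page], 3))
```

In the toy graph, `a1` receives links from both hubs and therefore ends with a higher authority score than `a2`, while `h1` links to both authorities and ends with a higher hub score than `h2`. Normalizing after each round only keeps the scores bounded; it does not change the ranking.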
