  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Ranking and its applications on web search. / 排序算法及其在網絡搜索中的應用 / Pai xu suan fa ji qi zai wang luo sou suo zhong de ying yong

January 2011 (has links)
Wang, Wei. Thesis (M.Phil.)--Chinese University of Hong Kong, 2011. Includes bibliographical references (p. 106-122). Abstracts in English and Chinese.

Contents: Abstract; Acknowledgement.
1. Introduction: Overview; Thesis Contributions; Thesis Organization.
2. Background and Literature Review: Label Ranking in Machine Learning (Label Ranking; Semi-Supervised Learning; The Development of Label Ranking); Question Retrieval in Community Question Answering (Question Retrieval; Basic Question Retrieval Models; The Development of Question Retrieval Models); Ranking through CTR by Building Click Models (Click Models' Importance; A Simple Example of a Click Model; The Development of Click Models).
3. Semi-Supervised Label Ranking: Motivation (The Limitations of Supervised Label Ranking); Label Ranking and Semi-Supervised Learning Framework (Label Ranking and Semi-Supervised Learning Setup; Information Gain Decision Tree for Label Ranking; Instance-Based Label Ranking; Mallows Model Decision Tree for Label Ranking); Experiments (Dataset Description; Experimental Results; Discussion); Summary.
4. An Application of Label Ranking: Motivation (The Limitations of Traditional Question Retrieval); Intention Detection Using Label Ranking (Question Intention Detection; Label Ranking Algorithms; Some Other Learning Algorithms); Improved Question Retrieval Using Label Ranking (Question Retrieval Models; Improved Question Retrieval Model); Experimental Setup (Experiment Objective; Experiment Design; Dataset Description; Question Features); Experiment Results and Comments (Question Classification; Classification-Enhanced Question Retrieval); Summary.
5. Ranking by CTR in Click Models: Motivation (The Importance of Relational Influence in Click Models); Click Models in Sponsored Search (A Brief Review of Click Models); Collaborating Influence Identification from Data Analysis (Quantity Analysis; Psychology Interpretation; Applications Being Influenced); Incorporating Collaborating Influence into CCM (Dependency Analysis of CCM; Extended CCM; Algorithms); Incorporating Collaborating Influence into TCM (TCM; Extended TCM; Algorithms); Experiment (Dataset Description; Experimental Setup; Evaluation Metrics; Baselines; Performance on RMS; Performance on Click Perplexity; Performance on Log-Likelihood; Significance Discussion; Sensitivity Analysis); Summary.
6. Conclusion and Future Work: Conclusion; Future Work.
Bibliography.
2

An analysis of sports coverage on Canadian television station websites

Fan, Ying 05 1900 (has links)
Following the early days of the Internet and the World Wide Web, news media in Canada went on to develop their own news websites with the intention of meeting the online needs of media audiences, expanding their audience reach, and adding to revenue production and profitability both on- and offline. Web strategies have varied somewhat across the different media, but anecdotal evidence suggests that sports content has been important for both print and television. This thesis focused on the latter, sports content on television network websites, and was undertaken to evaluate how Canadian television stations are using the Internet and web technologies to feature sports news and information. Only a few studies specific to sports television websites have been done, and these have mainly focused on American news stations. The research objective of the thesis was to systematically examine the web presence of sports content on Canadian television websites by conducting a content analysis of identifiably unique sites in the Canadian context. A site analysis protocol was developed through an iterative process. An initial instrument was constructed drawing on past research in this area. In particular, prior work by Bates et al. (1996 & 1997), Pines (1999), Bucy, Lang, Potter & Grabe (1999), and Sparks (2001) provided systematic measures for examining the web presence of television stations. Ha & James's definition of interactivity (1998) was also useful, as was the work of Cho (1999) and Rogers & Thorson (2000) on Internet advertising. The initial instrument was evaluated and modified during a series of trial scans. The final instrument focused on five areas: body of the home page, types of content, presentation mechanisms, interactivity, and advertising. A systematic site analysis was conducted from August to October 2003, and a total of twenty-one sports home pages were analyzed.
Three websites (TSN, Leafs TV and The Score) were found to have a good balance across the five areas evaluated in the study. The results of independent-samples t-tests showed that general television networks had more top sports news items and hyperlinks to other news items than sports specialty networks. By comparison, sports specialty networks tended to have more sport-related search engines and greater efficiency of space. CBC's "Sports Forums", configured on its sports home page, gave the public broadcaster the highest quotient for interactivity in comparison with the twenty private networks and stations in the study. Advertising was present on all of the sites, and the findings point to an increasing interest among televisual and sports website media in producing revenue through web-based advertising. / Education, Faculty of / Kinesiology, School of / Graduate
3

Uma abordagem para a identificação automática de problemas de usabilidade em interfaces de sistemas web através de reconhecimento de padrões

Santana, Gisele Alves 11 April 2013 (has links)
CAPES / Recently, some systems have been migrated to the web platform. Many services and applications, including power-system simulation and planning systems and automation systems, are developed with Internet-based interfaces. Usability is the main characteristic of an interface and is associated with the functionalities of a system; it describes how well a product can be used for its intended purposes by its users with effectiveness, efficiency and satisfaction. This work presents the application of pattern recognition techniques to the automatic detection and classification of usability problems in the interface of a web system. The initial focus of the work is on identifying potential usability problems in web forms. The potential usability problems of a web form are defined from recommendations described in the literature. The tasks performed by the user are obtained by analyzing the user interaction stored in log files. The classification of which tasks are performed as expected and which are considered potential usability problems is carried out by an Artificial Neural Network.
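The log-to-classifier pipeline this abstract describes can be sketched in a few lines. The feature names (time spent on a field, corrections made, focus backtracks), the toy data, and the thresholds are illustrative assumptions, not the thesis's actual protocol; a single-neuron network (logistic regression trained by gradient descent) stands in for the full Artificial Neural Network:

```python
import numpy as np

# Each row is one user task extracted from an interaction log.
# Features (illustrative only): [seconds on field, corrections, focus backtracks]
X = np.array([
    [2.0, 0, 0], [3.0, 1, 0], [2.5, 0, 1],    # tasks performed as expected
    [20.0, 6, 4], [15.0, 5, 3], [25.0, 7, 5], # potential usability problems
], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1], dtype=float)

# Standardize features so gradient descent behaves well.
X = (X - X.mean(axis=0)) / X.std(axis=0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Single-neuron "network": logistic regression via batch gradient descent
# on the cross-entropy loss.
rng = np.random.default_rng(0)
w, b = rng.normal(size=3) * 0.01, 0.0
for _ in range(2000):
    p = sigmoid(X @ w + b)            # predicted problem probability
    grad_w = X.T @ (p - y) / len(y)   # gradient w.r.t. weights
    grad_b = (p - y).mean()           # gradient w.r.t. bias
    w -= 0.5 * grad_w
    b -= 0.5 * grad_b

labels = ["expected" if p < 0.5 else "potential problem"
          for p in sigmoid(X @ w + b)]
print(labels)
```

On this small, separable toy set the classifier recovers the two task categories; the thesis's network would instead be trained on features derived from real log files.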
6

How effective are college based websites at providing students with the information necessary to make an informed college choice?

Escatiola, Joanne Ambat 01 January 2007 (has links)
The purpose of the project was to develop a rubric for assessing whether a selected group of college websites, chosen to represent most of what is available to students, meets the requirements necessary for students to make an informed college choice. The project was undertaken as a way to determine whether these sites, individually or as a whole, present enough information for students to make a choice that aligns correctly with their college aspirations.
7

Classificação de sites a partir das análises estrutural e textual

Ribas, Oeslei Taborda 28 August 2013 (has links)
With the wide use of the web nowadays, and with its constant growth, the task of automatic website classification has gained increasing importance, since on many occasions it is necessary to block access to specific sites, as in the case of access to adult-content sites in elementary and secondary schools. Different studies in the literature have proposed new methods for classifying sites, with the goal of increasing the rate of correctly categorized pages. This work aims to contribute to current classification methods by comparing four aspects involved in the classification process: classification algorithms, dimensionality (number of attributes considered), attribute evaluation metrics, and the selection of textual and structural attributes present in web pages. The vector model is used to represent texts, and a classical machine learning approach is taken to the classification task. Several metrics are used to select the most relevant terms, and classification algorithms from different paradigms are compared: probabilistic (Naïve Bayes), decision trees (C4.5), instance-based learning (KNN, K-nearest neighbors) and Support Vector Machines (SVM). The experiments were performed on a dataset containing sites in two languages, Portuguese and English. The results show that it is possible to obtain a classifier with good accuracy using only the information from the anchor text of hyperlinks; in the experiments, the classifier based on this information achieved an F-measure of 99.59%.
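One of the paradigms compared above, a probabilistic (Naïve Bayes) classifier over anchor-text terms, can be sketched as follows. The two categories and the tiny training corpus are invented for illustration and bear no relation to the thesis's bilingual dataset; the sketch is a multinomial Naïve Bayes with add-one (Laplace) smoothing:

```python
from collections import Counter
import math

# Toy anchor-text corpus: (category, anchor text). Illustrative only.
train = [
    ("news",  "latest news headlines"),
    ("news",  "breaking news world report"),
    ("games", "play free online games"),
    ("games", "download game demo"),
]

classes = sorted({c for c, _ in train})
word_counts = {c: Counter() for c in classes}   # per-class term frequencies
doc_counts = Counter(c for c, _ in train)       # per-class document counts
for c, text in train:
    word_counts[c].update(text.split())
vocab = {w for counts in word_counts.values() for w in counts}

def predict(text):
    """Return the class maximizing log P(class) + sum log P(word | class)."""
    scores = {}
    for c in classes:
        total = sum(word_counts[c].values())
        score = math.log(doc_counts[c] / len(train))  # log prior
        for w in text.split():
            # Laplace smoothing keeps unseen words from zeroing the score.
            score += math.log((word_counts[c][w] + 1) / (total + len(vocab)))
        scores[c] = score
    return max(scores, key=scores.get)

print(predict("world news today"))   # classifies an unseen anchor text
```

The thesis additionally weighs term-selection metrics and compares this paradigm against C4.5, KNN and SVM; this fragment only shows why anchor text alone can carry a strong class signal.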
