Global ETD Search

1	Induction-Based Approach to Personalized Search Engines
Alhalabi, Wadee Saleh, 09 May 2008
In a document retrieval system, stored documents are compared with a query and with one another to find the document most similar to the query. The most similar document receives a higher weight than the others. When more than one document is proposed to the user, the documents have to be sorted by weight. Once a recommender system presents the results, the user may open any document of interest. When two different recommender systems propose two different document lists, there is a need to determine which list is more efficient. The "Search Engine Ranking Efficiency Evaluation Tool" (SEREET) was developed for this purpose. The tool assesses the efficiency of each document list and assigns it a numerical value. The value approaches 100% when ranking efficiency is high, meaning that the list contains more relevant documents and that they are sorted by their relevance to the user. The value approaches 0% when ranking efficiency is poor and none of the presented documents interest the user. The dissertation proposes a model for evaluating ranking efficiency and proves it mathematically. Many search engine mechanisms have been proposed to assess the relevance of a web page. They focus on keyword frequency, page usage, link analysis, and various combinations of these. These methods have been tested and used to provide the user with the most interesting web pages according to his or her preferences. Collaborative filtering is a new approach, developed in this dissertation, for retrieving the documents most interesting to the user according to his or her interests.
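The abstract does not give SEREET's formula, so the following is only a minimal sketch of a score with the two properties it describes: close to 100% when relevant documents fill the list and sit at the top, and 0% when nothing in the list is relevant. The function name `ranking_efficiency`, the logarithmic position discount, and the `max_relevance` parameter are assumptions made for illustration and are not taken from the dissertation.

```python
import math


def ranking_efficiency(relevances, max_relevance=1.0):
    """Score a ranked list of relevance grades on a 0-100 scale.

    Hypothetical stand-in for a ranking-efficiency measure, NOT the
    SEREET formula. `relevances[i]` is the user's relevance grade for
    the document at rank i + 1, between 0 and `max_relevance`.
    """
    if not relevances:
        return 0.0
    # Assumption: earlier ranks count more, via a log2 position discount.
    discounts = [1.0 / math.log2(rank + 1) for rank in range(1, len(relevances) + 1)]
    achieved = sum(r * d for r, d in zip(relevances, discounts))
    # Best case: every position holds a fully relevant document.
    best = max_relevance * sum(discounts)
    return 100.0 * achieved / best


if __name__ == "__main__":
    print(ranking_efficiency([1, 1, 1]))  # all relevant
    print(ranking_efficiency([1, 0, 0]))  # one relevant document, ranked first
    print(ranking_efficiency([0, 0, 1]))  # one relevant document, ranked last
    print(ranking_efficiency([0, 0, 0]))  # nothing relevant
```

Normalizing against a list in which every position is fully relevant, rather than against the best reordering of the same list, is a deliberate choice here: it lets the score drop when the list contains few relevant documents, even if they are perfectly ordered.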
Building a user profile is essential for identifying the user's interests and placing each user in a suitable category, which collaborative filtering requires. Inference tools such as time spent on a web page, mouse movement, page scrolling, and mouse clicks were investigated. The dissertation shows that the most efficient and sufficient tool is the time a user spends on a web page. To eliminate errors, the system introduces a low threshold and a high threshold for each user; when the time spent on a web page falls outside these thresholds, an error is reported. The SEREET tool, which measures the efficiency of a search engine ranking list, is one of the dissertation's contributions. After considerable work, the conclusion was that the time spent on a web page is the most important factor in determining a user's interest in that page, and that it is sufficient on its own, without support from other tools such as mouse movement or page scrolling. The results show that implicit rating is a satisfactory measure and can replace explicit rating. A new filtering technique was introduced to design a fully functional recommender system. The linear vector algorithm introduced in the dissertation improves on the vector space algorithm (VSA) in time complexity and efficiency. The use of machine learning enhances the efficiency of the retrieved list. The machine learning algorithm is trained on positive and negative examples, which are required to improve the system's error rate. The results show that the number of these examples increases proportionally with the error rate of the system.
Collaborative Filters; Search Engine; SEREET
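The per-user low and high dwell-time thresholds described in this abstract can be sketched as follows. The function name `implicit_rating`, the use of seconds as the unit, and the linear mapping from dwell time to a 0-to-1 interest score are assumptions for illustration; the abstract only states that readings outside the thresholds are reported as errors.

```python
def implicit_rating(seconds_on_page, low_threshold, high_threshold):
    """Turn time spent on a page into an implicit interest score.

    Hypothetical sketch: returns None when the dwell time falls outside
    the user's [low_threshold, high_threshold] window (treated as an
    error reading, e.g. an accidental click or an abandoned tab), and
    otherwise a score between 0.0 and 1.0.
    """
    if low_threshold >= high_threshold:
        raise ValueError("low_threshold must be smaller than high_threshold")
    if seconds_on_page < low_threshold or seconds_on_page > high_threshold:
        return None  # outside the window: discard the reading as an error
    # Assumption: interest grows linearly with dwell time inside the window.
    return (seconds_on_page - low_threshold) / (high_threshold - low_threshold)


if __name__ == "__main__":
    # Example thresholds for one user (made-up values): 5 s and 65 s.
    for seconds in (2, 5, 35, 65, 1000):
        print(seconds, implicit_rating(seconds, 5, 65))
```

Returning `None` instead of clamping keeps error readings out of the user profile entirely, which matches the abstract's description of reporting an error rather than correcting the value.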
2	[en] HYBRID RECOMMENDATION SYSTEM BASED ON COLLABORATIVE FILTERING AND FUZZY NUMBERS / [pt] SISTEMA HÍBRIDO DE RECOMENDAÇÃO DE PRODUTOS COM USO DE FILTROS COLABORATIVOS E NÚMEROS FUZZY
MIGUEL ANGELO GASPAR PINTO, 17 November 2021
[pt] Virtual retail has been an important sector for stimulating the economy, with transactions in 2010 worth around R$10.6 billion. Stores in this segment have no customer or inventory restrictions, but their consumers have little patience and many other stores at their disposal, so the item of interest must be found and made visible quickly. To address this problem, recommendation algorithms were developed that generate product lists targeted at the user. Collaborative filtering algorithms are widely used in virtual retail, but they suffer from problems caused by the scale and sparsity of the database. Content-based algorithms can be less sensitive to database size, but their effectiveness depends on user data that are commonly unavailable. This thesis proposes a hybrid algorithm that uses both collaborative filtering and a content-based algorithm to enable good recommendations on large, sparse databases. The content-based algorithm uses fuzzy numbers and marketing techniques to guide its recommendations based only on the items purchased by the user, without requiring any other personal data about the user. The proposed algorithm was tested on synthetic and real databases and compared with a standard collaborative filter to evaluate its performance. The results show that the proposed hybrid algorithm outperformed the standard collaborative filter on both databases and was invariant to database sparsity.
/ [en] Virtual retail has been an important sector of the Brazilian economy: it was a USD 6.23 billion market in 2010, with 30 percent expansion in that period. Companies in this segment don't face client or product restrictions due to physical limitations. On the other hand, consumers in this kind of retail have several options to buy from and little patience to keep searching on the same website. Companies need to decide which item to show the consumer before he or she leaves for the next competitor. Several recommendation algorithms were developed to generate product lists directed at the consumer. Collaborative filtering algorithms are now widespread in virtual retail, but they have problems caused precisely by the huge quantity of data that exists in virtual retail. Content-based algorithms are less sensitive to the size of the database, but their effectiveness depends on user data, which usually are not available. This thesis proposes a hybrid algorithm that uses both collaborative filtering and a content-based algorithm to allow recommendations on huge, sparse databases. The content-based algorithm uses fuzzy numbers and marketing techniques to guide the recommendation using only the items bought by the user, without the need for further personal data from the consumer. The proposed algorithm was tested on both artificial and real databases and compared with a benchmark collaborative filter. The results show that the proposed hybrid algorithm performs better than the benchmark collaborative filter on both databases, generating good results and exhibiting sparsity invariance. The proposed algorithm also solves problems of initialization and neighborhood transitivity, and handles cases where new users or items are inserted into the database.
[pt] MARKETING [pt] ALGORITMOS BASEADOS EM CONTEUDO [pt] FILTROS COLABORATIVOS [pt] RECOMENDACAO [pt] NUMEROS FUZZY [en] MARKETING [en] CONTENT BASED ALGORITHMS [en] COLLABORATIVE FILTERS [en] RECOMMENDATIONS [en] FUZZY NUMBERS
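The abstract says the content-based part uses fuzzy numbers built only from the items a user bought, and that it is combined with a collaborative filter, but it does not say how. As a rough sketch under stated assumptions, the code below models one attribute of a user's purchases (price, chosen here purely as an example) as a triangular fuzzy number and blends the resulting content score with a collaborative-filtering score. The names `triangular_membership` and `hybrid_score`, the choice of a triangular shape, and the fixed weight `cf_weight` are all hypothetical and not taken from the thesis.

```python
def triangular_membership(x, low, peak, high):
    """Degree (0.0 to 1.0) to which x belongs to the triangular fuzzy number (low, peak, high)."""
    if not low <= peak <= high:
        raise ValueError("expected low <= peak <= high")
    if x == peak:
        return 1.0
    if x <= low or x >= high:
        return 0.0
    if x < peak:
        return (x - low) / (peak - low)   # rising edge
    return (high - x) / (high - peak)     # falling edge


def hybrid_score(cf_score, content_score, cf_weight=0.5):
    """Blend a collaborative-filtering score with a content-based score.

    Assumption: a fixed linear weight. When the purchase data is very
    sparse, a caller might lower cf_weight to lean on the content score.
    """
    if not 0.0 <= cf_weight <= 1.0:
        raise ValueError("cf_weight must be between 0 and 1")
    return cf_weight * cf_score + (1.0 - cf_weight) * content_score


if __name__ == "__main__":
    # Made-up example: the user's past purchases cost between 10 and 90, typically 50.
    content = triangular_membership(30, low=10, peak=50, high=90)
    print(content)
    print(hybrid_score(cf_score=0.8, content_score=content, cf_weight=0.5))
```

Because the content score here needs only the user's own purchase history, it stays available for a new user with a handful of purchases, which is the situation where a pure collaborative filter has the least to work with.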
