31 |
A High-quality Digital Library Supporting Computing Education: The Ensemble Approach / Chen, Yinlin, 28 August 2017
Educational Digital Libraries (DLs) are complex information systems which are designed to support individuals' information needs and information seeking behavior. To have a broad impact on the communities in education and to serve for a long period, DLs need to structure and organize the resources in a way that facilitates the dissemination and the reuse of resources. Such a digital library should meet defined quality dimensions in the 5S (Societies, Scenarios, Spaces, Structures, Streams) framework - including completeness, consistency, efficiency, extensibility, and reliability - to ensure that a good quality DL is built.
In this research, we addressed both external and internal quality aspects of DLs. For internal qualities, we focused on completeness and consistency of the collection, catalog, and repository. We developed an application pipeline to acquire user-generated computing-related resources from YouTube and SlideShare for an educational DL. We applied machine learning techniques to transfer what we learned from the ACM Digital Library dataset. We built classifiers to catalog resources from these two new domains according to the ACM Computing Classification System, and evaluated the classifiers using Amazon Mechanical Turk. For external qualities, we focused on efficiency, scalability, and reliability in DL services. We proposed cloud-based designs and applications to ensure and improve these qualities in DL services. The experimental results show that our proposed methods are promising for enhancing and enriching an educational digital library.
This work received support from ACM, as well as the National Science Foundation under Grant Numbers DUE-0836940, DUE-0937863, and DUE-0840719, and IMLS LG-71-16-0037-16. / Ph. D. / Educational Digital Libraries (DLs) are designed to serve users finding educational materials. To have a broad impact on the communities in education for a long period, DLs need to structure and organize the resources in a way that facilitates their dissemination and reuse. Such a digital library should be built on a well-defined framework to ensure that the services it provides are of good quality.
In this research, we focused on the quality aspects of DLs. We developed an application pipeline to acquire resources contributed by the users from YouTube and SlideShare for an educational DL. We applied machine learning techniques to build classifiers in order to catalog DL collections using a uniform classification system: the ACM Computing Classification System. We also used Amazon Mechanical Turk to evaluate the classifier’s prediction result and used the outcome to improve classifier performance. To ensure efficiency, scalability, and reliability in DL services, we proposed cloud-based designs and applications to enhance DL services. The experimental results show that our proposed methods are promising for enhancing and enriching an educational digital library.
|
32 |
Feedback de relevância orientado a termos: um novo método para ordenação de resultados de motores de busca. / Term-oriented relevance feedback: a novel ranking method for search engines. / Hattori, Fernando, 23 May 2016
The Vector Space Model is the most widely used information retrieval model in digital library systems. Algorithms developed for this model use relevance information obtained from users (called feedback) to try to improve search results. However, existing relevance feedback algorithms are neither global nor permanent: the information obtained from the feedback is discarded at each new user session, or it does not modify the documents as a whole (the changes are local). This work presents a relevance feedback method called term-oriented, in which users' feedback modifies the vector representations of terms. These modifications are global and permanent, influencing subsequent searches. Experiments with the ClueWeb09 dataset give evidence that this method improves the quality of search results compared with the traditional Vector Space Model.
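The idea of feedback that persists in the term vectors can be illustrated with a minimal Python sketch. It is not the thesis's algorithm: the abstract does not give the update rule, so the multiplicative boost in `feedback`, the `rate` parameter, the class name `TermVectorIndex`, and the binary query weights are all assumptions made for illustration. The sketch only shows how a change stored in the index, rather than in a user session, affects later searches.

```python
import math
from collections import Counter


class TermVectorIndex:
    """Toy Vector Space Model index that stores one weight vector per term."""

    def __init__(self, docs):
        # docs: list of token lists; the list position is the document id.
        n_docs = len(docs)
        doc_freq = Counter(term for doc in docs for term in set(doc))
        # term_vectors[term][doc_id] = TF-IDF weight of the term in that document.
        self.term_vectors = {}
        for doc_id, doc in enumerate(docs):
            for term, tf in Counter(doc).items():
                idf = math.log(n_docs / doc_freq[term]) + 1.0
                self.term_vectors.setdefault(term, {})[doc_id] = tf * idf

    def search(self, query):
        """Return ids of documents sharing a term with the query, best first."""
        # Each query term counts with weight 1 (an assumption of this sketch).
        dots = Counter()
        for term in query:
            for doc_id, weight in self.term_vectors.get(term, {}).items():
                dots[doc_id] += weight
        # Document norms are recomputed so that stored feedback is reflected.
        squared_norms = Counter()
        for vector in self.term_vectors.values():
            for doc_id, weight in vector.items():
                squared_norms[doc_id] += weight * weight
        # The query norm is the same for every document, so it is left out.
        score = {d: dots[d] / math.sqrt(squared_norms[d]) for d in dots}
        return sorted(score, key=lambda d: (-score[d], d))

    def feedback(self, query, relevant_doc_id, rate=1.0):
        """Hypothetical update rule: boost each query term for the document."""
        for term in set(query):
            vector = self.term_vectors.get(term, {})
            if relevant_doc_id in vector:
                vector[relevant_doc_id] *= 1.0 + rate
```

Because `feedback` edits the stored term vectors, a judgment given for one query also changes the ranking for other queries that share a term, which is the sense in which the modification is global and permanent.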
|
33 |
Arquitetura da informação para biblioteca digital personalizável / Information architecture for a customizable digital library / Camargo, Liriane Soares de Araújo de, January 2004
Advisor: Silvana Aparecida Borsetti Gregorio Vidotti / Committee: Plácida Leopoldina Ventura Amorin da Costa Santos / Committee: Edberto Ferneda / Abstract: Retrieving and disseminating information on the Web is currently difficult because the information may be unstructured and unorganized with respect to accepted standards for information organization, storage, and retrieval. Two resources can reduce these difficulties: digital libraries, which offer efficient simultaneous and remote access to information, and personalization services, which give users a personalized interaction based on their profiles. The problem in providing these resources lies in the cost and difficulty of developing this type of library, given the large number of processes and elements involved in building it. In this context, an information architecture for customizable digital libraries is proposed to address the following problems: the scarcity of specialized literature on information architecture for digital libraries; the lack of technological and informational elements that allow fast and accurate access to the required information; and the limited use of content and interface personalization services for different kinds of users. The architecture is composed of processes and elements from Information Science and Computer Science that are shared by most digital libraries. It also contains generic elements, which give it the flexibility to be adapted and modified according to the characteristics of each digital library. The aim of the proposed architecture is to help developers and designers build Web sites, particularly customizable digital libraries, that satisfy users' needs.
To supplement the project, the processes and elements most used by and common to this type of information unit (digital libraries) were evaluated. The results showed that the majority of the elements are used by digital libraries and are relevant to building them.
This study is descriptive research with a theoretical and methodological approach from the Information Science field, carried out through a study of bibliographical data... / Master's
|
35 |
Geração automática de metadados: uma contribuição para a Web semântica. / Automatic metadata generation: a contribution to the semantic Web. / Eveline Cruz Hora Gomes Ferreira, 05 April 2006
This thesis contributes to the Semantic Web area, within the scope of document representation and indexing, by defining a context-based model for automatic metadata generation from unstructured (txt) textual documents in Portuguese. A broad theoretical overview of subjects related to the creation of semantic digital environments is also presented. As recommended by SemanticWeb.org, the textual documents studied here were automatically converted into semantically annotated Web pages, using Dublin Core as the standard for defining metadata elements and RDF/XML as the standard for representing the documents and describing the metadata elements. Of the fifteen Dublin Core metadata elements, nine were generated automatically by the Model and six were generated semi-automatically. The Description and Subject elements required the most complex algorithms and were obtained through statistical, text mining, and natural language processing techniques. The main purpose of the evaluation was to observe how the documents converted to RDF/XML behaved when submitted to an information retrieval process. The Description and Subject elements were evaluated exhaustively, since they are chiefly responsible for capturing the semantics of textual documents.
The diversity of contexts, the complexity of the problems related to the Portuguese language, and the new concepts introduced by the standards and technologies of the Semantic Web were among the main challenges faced in building the proposed Model. Although the techniques used to explore the contents of the documents are not very new, the innovative elements introduced by the Semantic Web offered advances that made it possible to obtain important results in this thesis. As demonstrated here, combining those techniques with the standards and technologies recommended for the Semantic Web can mitigate one of the largest problems of the current Web, and one of the strongest reasons for implementing the Semantic Web: the tendency of search engines to flood users with irrelevant results because they do not take into account the specific context the user wants. It is therefore important that studies and research continue in all areas related to the implementation of the Semantic Web, opening the way for more functional information systems to be designed.
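The output format described in this abstract, Dublin Core elements serialized as RDF/XML, can be sketched in Python with the standard library. This is an illustration only, not the thesis's Model: the function name `to_rdf_xml`, the example URI, and the metadata values are hypothetical, and the sketch covers serialization alone, with the element values (such as Subject and Description) assumed to have been produced elsewhere.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# The fifteen elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def to_rdf_xml(resource_uri, metadata):
    """Serialize {element: [values]} as an RDF/XML description of one resource."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": resource_uri}
    )
    for element, values in metadata.items():
        if element not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {element}")
        # Repeated elements (for example several subjects) become siblings.
        for value in values:
            child = ET.SubElement(description, f"{{{DC_NS}}}{element}")
            child.text = value
    return ET.tostring(root, encoding="unicode")
```

A call such as `to_rdf_xml("http://example.org/doc/1", {"title": ["Um documento"], "subject": ["metadados"]})` returns a single `rdf:RDF` element containing one `rdf:Description` with `dc:title` and `dc:subject` children.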
|
36 |
Indigenous language usage in a digital library: He hautoa kia ora tonu ai. / Keegan, Te Taka Adrian Gregory, January 2007
The research described in this thesis examines indigenous language usage in a digital library environment that has been accessed via the Internet. By examining discretionary use of the Māori Niupepa and Hawaiian Nūpepa digital libraries this research investigates how indigenous languages were used in these electronic environments in 2005. The results provide encouragement and optimism to people who are striving to retain, revitalise and develop the use of indigenous languages in information technologies. The Transaction Log Analysis (TLA) methods used in this research serve as an example of how web logs can be used to provide significant information about language usage in a bilingual online information system. Combining the TLA with user feedback has provided insights into how and why clients use indigenous languages in their information retrieval activities. These insights, in turn, show good practice that is relevant not only to those working with indigenous languages, indigenous peoples or multilingual environments, but to all information technology designers who strive for universal usability. This thesis begins by describing the importance of using indigenous languages in electronic environments and suggests that digital libraries can provide an environment to support and encourage the use of such languages. TLA is explained in the context of this study and is then used to analyse aspects of te reo Māori usage in the Niupepa digital library environment in 2005. TLA also indicates that te reo Māori was used by international clients and that this usage differed from te reo Māori usage by national (Aotearoa) clients. Findings further reveal that the default language setting of the Niupepa digital library had a considerable impact on te reo Māori usage. When the default language was set to te reo Māori, not only were there more requests in te reo Māori but there was also a higher usage of te reo Māori in the information retrieval activities.
TLA of the Hawaiian Nūpepa digital library indicated that the Hawaiian language was also used in a digital library. These results confirm that indigenous languages were used in digital library environments. Feedback from clients suggests reasons why indigenous languages were used in this environment. These reasons include the indigenous language content of the digital library, the indigenous language default language setting of the digital library and a stated desire by the clients to use the indigenous language. The key findings raise some interface design issues and support the claim that digital libraries can provide an environment to support the use of indigenous languages.
|
37 |
Clustering Articles in a Literature Digital Library Based on Content and Usage / Ting, Kang-Di, 10 August 2004
A literature digital library is one of the most important resources for preserving the assets of civilization. To make information search more effective and efficient, many systems provide a browsing interface that eases the task of searching for articles. A browsing interface is associated with a subject directory, which guides users to articles that meet their information needs. A subject directory contains a set (or a hierarchy) of subject categories, each containing a number of similar articles. How to group articles in a literature digital library is the theme of this thesis.
Previous work used either document classification or document clustering to assign articles to a set of clusters based on their content. We observed that the articles that meet a single user's information need do not necessarily fall in a single cluster. In this thesis, we propose to use both Web logs and article content in clustering articles. We propose two hybrid approaches: a document categorization based method and a document clustering based method. These alternatives were compared with other content-based methods. We found that the document categorization based method effectively reduces the number of click-throughs required, at the expense of a slight increase in entropy, which measures the content heterogeneity of each generated cluster.
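Entropy as a measure of cluster heterogeneity can be made concrete with a short Python sketch. The abstract does not say how entropy was computed, so this assumes the common formulation: each article carries one subject category label, a cluster's entropy is the Shannon entropy of its label distribution, and the overall figure is the size-weighted average over clusters. The function names are hypothetical.

```python
import math
from collections import Counter


def cluster_entropy(labels):
    """Shannon entropy (in bits) of the category labels inside one cluster."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    # A cluster whose articles all share one category has entropy 0.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def weighted_entropy(clusters):
    """Average cluster entropy, weighting each cluster by its number of articles."""
    n_articles = sum(len(labels) for labels in clusters)
    if n_articles == 0:
        return 0.0
    return sum(len(labels) / n_articles * cluster_entropy(labels) for labels in clusters)
```

Under this formulation, a lower value means more homogeneous clusters, so a "slight increase of entropy" corresponds to clusters that mix subject categories a little more.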
|
38 |
Employing Social Networks for Recommendation in a Literature Digital Library / Liao, Yi-fan, 04 August 2006
Interpersonal relationships are an important kind of relation, and recommendation is a popular mechanism. In an age of information overload, traditional information searching mechanisms, which require users to specify keywords, are ineffective and impractical. A variety of recommendation techniques have been proposed, and many have been implemented in real systems, especially in online stores. Among the recommendation techniques proposed in the literature, the content-based and collaborative filtering approaches have been broadly adopted by membership stores that maintain users' long-term interests. For short-term interests, the content-based approach is by far the most popular. However, most of the proposed recommendation approaches do not treat social information as an important factor. In this study, we propose several social network-based recommendation approaches that take into account the similarities of items with respect to their social closeness in order to meet users' short-term interests. Our experimental results show that social network-based approaches perform better than their content-based counterpart if the user's short-term interest profile contains articles of similar content. The content-based approach, however, performs better when the articles in the profile are diverse in content. In addition, in contrast to the content-based approach, the social network-based approach is less likely to recommend articles that readers do not value. Finally, the desired articles recommended by the content-based approach are very different from those recommended by the social network-based approach. This suggests developing a hybrid recommendation method that uses both content and social networks when making recommendations.
|
39 |
Combining Social Networks and Content for Recommendation in a Literature Digital Library / Huang, Yu-chin, 24 July 2008
In an age of information overload, traditional information searching mechanisms are ineffective and impractical. As e-commerce becomes more popular, using information technology to discover the latent demand of customers becomes an important issue. Hence, a variety of recommendation techniques have been proposed and many of them have been implemented in real systems, mostly in online stores. Among these techniques, the content-based and collaborative filtering approaches are the most broadly adopted and have proved successful. Recently, a social network-based recommendation approach has been proposed that takes into account the similarities of items with respect to their social closeness. The social network-based approach performs better than the content-based approach in some scenarios, and it can also avoid recommending articles that have high content similarity to a user's favorite articles but low quality. Therefore, we propose three hybrid approaches (Switching, Proportional, and Fusion) that combine the content-based and social network-based approaches to achieve better performance. Our experimental results show that although the proposed approaches have pros and cons under different scenarios, in general they achieve better performance than the individual approaches. In addition, we generate synthetic articles whose content is highly similar to articles in our collection to evaluate the fidelity of each approach. The experimental results show that approaches incorporating social network information are less likely to recommend these fake articles.
|
40 |
Möglichkeiten des neuen WWW-Standards XML / Possibilities of the new WWW standard XML / Kreulich, Klaus, 28 October 1998
Overview of the use of XML; new possibilities for digital libraries; introduction to the basic concepts of XML and SGML
|