  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
91

New techniques for efficiently discovering frequent patterns

Jin, Ruoming, January 2005 (has links)
Thesis (Ph. D.)--Ohio State University, 2005. / Title from first page of PDF file. Document formatted into pages; contains xvii, 170 p.; also includes graphics. Includes bibliographical references (p. 160-170). Available online via OhioLINK's ETD Center
92

Information Retrieval Strategies of Millennial Undergraduate Students in Web and Library Database Searches

Porter, Brandi 01 January 2009 (has links)
Millennial students make up a large portion of undergraduates at colleges and universities, and they have a variety of online resources available for academic information searches, primarily Web-based and library-based online information retrieval systems. The content, ease of use, and required search techniques differ between the two systems. Students often prefer searching the Web, but in doing so they often miss higher-quality materials that may be available only through their library. Furthermore, each system uses different information retrieval algorithms for producing results, so proficiency in one search system may not transfer to another. Web-based information retrieval systems are unable to search and retrieve many resources available in libraries and other proprietary information retrieval systems, often referred to as the Invisible Web. These are resources that are not available to the general public and are password protected (from anyone not considered an affiliated user of that particular organization). They are often licensed to libraries by third-party vendors or publishers and include fee-based access to content. Consequently, millennial students who rely on Web-based information retrieval systems may not be accessing many of the scholarly resources available to them. This study investigated how millennial students approach searches for the same topic in both systems. The goal was to build upon theories of why students search using various techniques, why they often choose the Web for their searches, and what can be done to improve library online information retrieval systems. Mixed qualitative methods of data gathering were used to elicit this information. The investigation showed that millennial undergraduate students lacked detailed search strategies and often used the same search techniques regardless of system or subject.
Students displayed greater familiarity and ease of use with Web-based IR systems than with library-based online IR systems. Results suggested search design enhancements to library online information retrieval systems, such as better natural-language searching and easier linking to full-text articles. Design enhancements based on millennial search strategies should encourage students to use library-based information retrieval systems more often.
93

The Cluster Hypothesis: A Visual/Statistical Analysis

Sullivan, Terry 05 1900 (has links)
By allowing judgments based on a small number of exemplar documents to be applied to a larger number of unexamined documents, clustered presentation of search results represents an intuitively attractive possibility for reducing the cognitive resource demands on human users of information retrieval systems. However, clustered presentation of search results is sensible only to the extent that naturally occurring similarity relationships among documents correspond to topically coherent clusters. The Cluster Hypothesis posits just such a systematic relationship between document similarity and topical relevance. To date, experimental validation of the Cluster Hypothesis has proved problematic, with collection-specific results both supporting and failing to support this fundamental theoretical postulate. The present study consists of two computational information visualization experiments, representing a two-tiered test of the Cluster Hypothesis under adverse conditions. Both experiments rely on multidimensionally scaled representations of interdocument similarity matrices. Experiment 1 is a term-reduction condition, in which descriptive titles are extracted from Associated Press news stories drawn from the TREC information retrieval test collection. The clustering behavior of these titles is compared to the behavior of the corresponding full text via statistical analysis of the visual characteristics of a two-dimensional similarity map. Experiment 2 is a dimensionality reduction condition, in which inter-item similarity coefficients for full text documents are scaled into a single dimension and then rendered as a two-dimensional visualization; the clustering behavior of relevant documents within these unidimensionally scaled representations is examined via visual and statistical methods. Taken as a whole, results of both experiments lend strong though not unqualified support to the Cluster Hypothesis. 
In Experiment 1, semantically meaningful 6.6-word document surrogates systematically conform to the predictions of the Cluster Hypothesis. In Experiment 2, the majority of the unidimensionally scaled datasets exhibit a marked nonuniformity of distribution of relevant documents, further supporting the Cluster Hypothesis. Results of the two experiments are profoundly question-specific. Post hoc analyses suggest that it may be possible to predict the success of clustered searching based on the lexical characteristics of users' natural-language expression of their information need.
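Both experiments rest on multidimensionally scaled representations of interdocument similarity matrices. As a minimal sketch of that step (not the study's actual pipeline), here is classical MDS applied to a tiny hypothetical 3-document cosine-similarity matrix, with similarities converted to dissimilarities before embedding:

```python
import numpy as np

def classical_mds(dist, k=2):
    """Classical multidimensional scaling: embed a distance matrix in k dims."""
    n = dist.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (dist ** 2) @ J           # double-centred squared distances
    vals, vecs = np.linalg.eigh(B)           # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:k]         # keep the top-k eigenpairs
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))

# Toy interdocument cosine similarities (hypothetical, not TREC data)
sim = np.array([[1.0, 0.9, 0.2],
                [0.9, 1.0, 0.3],
                [0.2, 0.3, 1.0]])
coords = classical_mds(1.0 - sim)            # dissimilarity = 1 - similarity
# Documents 0 and 1 land near each other on the map; document 2 sits apart,
# which is the kind of visual clustering the study then analyzes statistically.
```

If the Cluster Hypothesis holds, relevant documents should form such tight neighborhoods in the scaled map rather than scattering uniformly.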
94

Learning hash codes for multimedia retrieval

Chen, Junjie 28 August 2019 (has links)
The explosive growth of multimedia data in online media repositories and social networks has created high demand for fast and accurate large-scale multimedia retrieval services. Hashing, which codes high-dimensional data into a low-dimensional binary space, has proved effective for such retrieval applications. Despite recent progress, how to learn hashing models that make the best trade-off between retrieval efficiency and accuracy remains an open research issue. This thesis research aims to develop hashing models that are effective for image and video retrieval. An unsupervised hashing model called APHash is first proposed to learn hash codes for images by exploiting the distribution of the data. To reduce the underlying computational complexity, a methodology based on an asymmetric similarity matrix is explored and found effective. In addition, the deep learning approach to learning hash codes for images is also studied. In particular, a novel deep model called DeepQuan is proposed, which incorporates product quantization methods into an unsupervised deep model. Rather than adopting only the quadratic loss as the optimization objective, as most related deep models do, DeepQuan optimizes the data representations and their quantization codebooks to explore the clustering structure of the underlying data manifold, where introducing a weighted triplet loss into the learning objective is found to be effective. Furthermore, the case where some labeled data are available for learning is also considered.
To alleviate the high training cost (which is especially crucial for a large-scale database), another hashing model named Similarity Preserving Deep Asymmetric Quantization (SPDAQ) is proposed for both image and video retrieval, in which the compact binary codes and quantization codebooks for all items in the database can be explicitly learned in an efficient manner. All the proposed hashing methods have been rigorously evaluated on benchmark datasets and found to outperform related state-of-the-art methods.
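The abstract does not give enough detail to reproduce APHash or DeepQuan; as a minimal illustration of the core idea they build on (binary codes whose Hamming distance tracks similarity in the original space), here is a classic random-hyperplane LSH sketch over made-up vectors:

```python
import numpy as np

rng = np.random.default_rng(0)

def lsh_codes(X, n_bits=32):
    """Random-hyperplane hashing: the sign pattern of random projections
    gives binary codes whose Hamming distance approximates angular distance."""
    planes = rng.standard_normal((X.shape[1], n_bits))
    return (X @ planes > 0).astype(np.uint8)

def hamming(a, b):
    """Number of differing bits between two binary codes."""
    return int(np.count_nonzero(a != b))

X = np.array([[1.0, 0.0, 0.2],
              [0.9, 0.1, 0.3],   # similar to row 0
              [0.0, 1.0, 0.0]])  # dissimilar to row 0
codes = lsh_codes(X)
# Similar vectors share most bits, so retrieval reduces to cheap bit comparisons.
```

Learned hashing models such as those in the thesis aim to beat this data-oblivious baseline by fitting the codes to the data distribution.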
95

Complex Proteoform Identification Using Top-Down Mass Spectrometry

Kou, Qiang 12 1900 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / Proteoforms are distinct protein molecule forms created by variations in genes, gene expression, and other biological processes. Many proteoforms contain multiple primary structural alterations, including amino acid substitutions, terminal truncations, and post-translational modifications. These primary structural alterations play a crucial role in determining protein functions: proteoforms from the same protein with different alterations may exhibit different functional behaviors. Because top-down mass spectrometry directly analyzes intact proteoforms and provides complete sequence information of proteoforms, it has become the method of choice for the identification of complex proteoforms. Although instruments and experimental protocols for top-down mass spectrometry have been advancing rapidly in the past several years, many computational problems in this area remain unsolved, and the development of software tools for analyzing such data is still at a very early stage. In this dissertation, we propose several novel algorithms for challenging computational problems in proteoform identification by top-down mass spectrometry. First, we present two approximate spectrum-based protein sequence filtering algorithms that quickly find a small number of candidate proteins from a large proteome database for a query mass spectrum. Second, we describe mass graph-based alignment algorithms that efficiently identify proteoforms with variable post-translational modifications and/or terminal truncations. Third, we propose a Markov chain Monte Carlo method for estimating the statistical significance of identified proteoform spectrum matches. These are the first efficient algorithms that take into account three types of alterations: variable post-translational modifications, unexpected alterations, and terminal truncations in proteoform identification.
As a result, they are more sensitive and powerful than other existing methods that consider only one or two of the three types of alterations. All the proposed algorithms have been incorporated into TopMG, a complete software pipeline for complex proteoform identification. Experimental results showed that TopMG yields significantly more identifications than other existing methods in proteome-level top-down mass spectrometry studies. TopMG will facilitate the application of top-down mass spectrometry in many areas, such as the identification and quantification of clinically relevant proteoforms and the discovery of new proteoform biomarkers. / 2019-06-21
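The dissertation's spectrum-based filters are far more sophisticated than this, but the filtering idea (quickly discarding database candidates whose theoretical mass is far from the observed mass) can be sketched with a toy, entirely hypothetical protein set and a small residue-mass table:

```python
# Monoisotopic residue masses in daltons for a few amino acids; a full tool
# would use all twenty residues plus modification masses.
RESIDUE_MASS = {"G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276}
WATER = 18.01056  # mass of H2O added once per intact chain

def proteoform_mass(seq):
    """Theoretical monoisotopic mass of an unmodified sequence."""
    return sum(RESIDUE_MASS[a] for a in seq) + WATER

def filter_candidates(observed_mass, proteins, tol_ppm=10.0):
    """Keep proteins whose theoretical mass lies within tol_ppm of the
    observed precursor mass -- a crude stand-in for spectrum-based filtering."""
    hits = []
    for name, seq in proteins.items():
        m = proteoform_mass(seq)
        if abs(m - observed_mass) / observed_mass * 1e6 <= tol_ppm:
            hits.append(name)
    return hits

proteins = {"P1": "GASP", "P2": "GGGG", "P3": "AASS"}  # hypothetical database
target = proteoform_mass("GASP")
print(filter_candidates(target, proteins))  # → ['P1']
```

A mass-only filter like this fails exactly when proteoforms carry unexpected alterations, which is why the thesis filters on spectral evidence instead.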
96

Strategies for students to seek information on the web: an action research

Cheng, Chung-chee., 鄭頌慈. January 2003 (has links)
published_or_final_version / abstract / toc / Education / Master / Master of Science in Information Technology in Education
97

Socio-aware random walk search and replication in peer-to-peer networks

Xie, Jing, 謝靜 January 2009 (has links)
published_or_final_version / Electrical and Electronic Engineering / Master / Master of Philosophy
98

Efficient Incremental View Maintenance for Data Warehousing

Chen, Songting 20 December 2005 (has links)
"Data warehousing and on-line analytical processing (OLAP) are essential elements for decision support applications. Since most OLAP queries are complex and are often executed over huge volumes of data, the solution in practice is to employ materialized views to improve query performance. One important issue for utilizing materialized views is to maintain the view consistency upon source changes. However, most prior work focused on simple SQL views with distributive aggregate functions, such as SUM and COUNT. This dissertation proposes to consider broader types of views than previous work. First, we study views with complex aggregate functions such as variance and regression. Such statistical functions are of great importance in practice. We propose a workarea function model and design a generic framework to tackle incremental view maintenance and answering queries using views for such functions. We have implemented this approach in a prototype system of IBM DB2. An extensive performance study shows significant performance gains by our techniques. Second, we consider materialized views with PIVOT and UNPIVOT operators. Such operators are widely used for OLAP applications and for querying sparse datasets. We demonstrate that the efficient maintenance of views with PIVOT and UNPIVOT operators requires more generalized operators, called GPIVOT and GUNPIVOT. We formally define and prove the query rewriting rules and propagation rules for such operators. We also design a novel view maintenance framework for applying these rules to obtain an efficient maintenance plan. Extensive performance evaluations reveal the effectiveness of our techniques. Third, materialized views are often integrated from multiple data sources. Due to source autonomicity and dynamicity, concurrency may occur during view maintenance. We propose a generic concurrency control framework to solve such maintenance anomalies. 
This solution extends previous work in that it solves the anomalies under both source data and schema changes and thus achieves full source autonomicity. We have implemented this technique in a data warehouse prototype developed at WPI. The extensive performance study shows that our techniques put little extra overhead on existing concurrent data update processing techniques while allowing for this new functionality."
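The workarea idea for one complex aggregate can be sketched briefly: variance is maintainable from three sufficient statistics (count, sum, sum of squares), so source inserts and deletes update the view in constant time without rescanning the base data. This is an illustrative toy, not the dissertation's actual framework or its DB2 implementation:

```python
class VarianceView:
    """Incrementally maintained variance: keep (count, sum, sum of squares)
    as the 'workarea' and derive the aggregate on demand."""

    def __init__(self):
        self.n = 0      # count
        self.s = 0.0    # running sum
        self.ss = 0.0   # running sum of squares

    def insert(self, x):
        self.n += 1
        self.s += x
        self.ss += x * x

    def delete(self, x):
        self.n -= 1
        self.s -= x
        self.ss -= x * x

    def variance(self):
        """Population variance: E[x^2] - E[x]^2."""
        if self.n == 0:
            return None
        return self.ss / self.n - (self.s / self.n) ** 2

v = VarianceView()
for x in [2.0, 4.0, 6.0]:
    v.insert(x)
v.delete(4.0)        # a source change propagated without touching base tuples
print(v.variance())  # → 4.0 (values 2.0 and 6.0 remain)
```

Distributive aggregates like SUM and COUNT need only one such statistic each; the point of the workarea model is that algebraic aggregates like variance and regression decompose the same way.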
99

Adaptive Scheduling Algorithm Selection in a Streaming Query System

Pielech, Bradford Charles 13 January 2004 (has links)
Many modern applications process queries over unbounded streams of data. These applications include tracking financial data from international markets, intrusion detection in networks, monitoring remote sensors, and monitoring patients' vital signs. These data streams arrive in real time, are unbounded in length, and have unpredictable arrival patterns due to external uncontrollable factors such as network congestion or, in the case of remote sensors, weather. This thesis presents a novel technique for adapting the execution of stream queries that, to my knowledge, is not present in any other continuous query system to date. This thesis hypothesizes that utilizing a single scheduling algorithm to execute a continuous query, as is done in other state-of-the-art continuous query systems, is not sufficient because existing scheduling algorithms all have inherent flaws or tradeoffs. Thus, one scheduling algorithm cannot optimally meet an arbitrary set of Quality of Service (QoS) requirements. Therefore, to meet the unique demands of specific monitoring applications, an adaptive strategy selector guidable by QoS requirements was developed. The adaptive strategy selector monitors the effects of its behavior on its environment through a feedback mechanism, with the aim of exploiting previously beneficial behavior and exploring alternative behavior. The feedback mechanism is guided by qualitatively comparing how well each algorithm has met the QoS requirements. The next scheduling algorithm is then chosen by spinning a roulette wheel, where each candidate is chosen with a probability equal to its performance score. The adaptive algorithm is general: it can employ any candidate scheduling algorithm and react to any combination of quality of service preferences. As part of this thesis, the Raindrop system was developed as an exploratory test bed in which to conduct an experimental study.
In that experimental study, the adaptive algorithm was shown to outperform single scheduling algorithms for many QoS combinations and data arrival patterns.
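The roulette-wheel step described above can be sketched as follows; the scheduling-algorithm names and performance scores are hypothetical, standing in for the scores produced by the QoS feedback mechanism:

```python
import random

def roulette_select(scores, rng=random.Random(42)):
    """Pick a candidate with probability proportional to its performance
    score (roulette-wheel selection over the QoS feedback scores)."""
    total = sum(scores.values())
    r = rng.uniform(0.0, total)
    acc = 0.0
    for name, score in scores.items():
        acc += score
        if r <= acc:
            return name
    return name  # guard against floating-point rounding at the wheel's edge

# Hypothetical per-algorithm scores from the most recent feedback cycle
scores = {"FIFO": 0.2, "Chain": 0.5, "RoundRobin": 0.3}
picks = [roulette_select(scores) for _ in range(1000)]
# "Chain" is selected roughly half the time, yet weaker algorithms still get
# occasional trials -- the exploit/explore balance the thesis describes.
```

Because every candidate keeps a nonzero probability, a previously poor algorithm can be rediscovered when stream conditions change.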
100

Shape description and retrieval for 3D model search engine.

January 2014 (has links)
The large number of 3D models on the Internet encourages the development of 3D model search engines. In this dissertation, we present a 3D model retrieval system that supports both 3D model queries and sketch queries. / For the 3D model query based retrieval system, we propose two new 3D model descriptors, named the Sphere Image and the Bag-of-View-Words (BoVW) descriptor. The Sphere Image is defined as a collection of view features. A viewpoint of a 3D model is regarded as a "pixel": (1) the position of the viewpoint is the coordinate of the "pixel"; (2) the feature descriptor of the projected view is the value of the "pixel". We also propose a probabilistic graphical model for 3D model matching, and develop a 3D model retrieval system to test our approach. The BoVW descriptor describes a 3D model by measuring the occurrences of its projected views. An adaptive clustering method is applied to reduce the redundancy of the projected views of each 3D model. A 3D model is then represented by a multi-resolution histogram that combines several BoVW descriptors at different levels; the codebook is obtained by unsupervised learning. We also propose a new pyramid matching method for 3D model comparison.
We have conducted experiments based on the SHape REtrieval Contest (SHREC) 2012 Generic 3D model benchmark and the Princeton Shape Benchmark (PSB). Experimental results indicate that our system outperforms some state-of-the-art 3D model retrieval systems with respect to retrieval precision and computational cost. / For the sketch query based retrieval system, we propose a Bigger Exposure Opportunity Views (BEOV) descriptor and a Shape-Ring descriptor for representing the 3D model candidates and the sketch query, respectively. The BEOV descriptor represents a 3D model by several characteristic views, namely those most likely to be exposed to a viewer. The Shape-Ring descriptor preserves the features of the contour and the inside detail of the sketch query and the BEOV. Experiments have been conducted on the SHape REtrieval Contest (SHREC) 2012 and SHREC 2013 sketch track data sets. Our approach outperforms existing 3D model retrieval methods in terms of retrieval precision and computational cost. / Ding, Ke. / Thesis (Ph.D.) Chinese University of Hong Kong, 2014. / Includes bibliographical references (leaves 107-120). / Abstracts also in Chinese.
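A minimal sketch of the bag-of-view-words idea, assuming the clustering step has already assigned each projected view a cluster id ("view word"); plain histogram intersection stands in here for the thesis's multi-level pyramid matching, and all view-word assignments are invented for illustration:

```python
import numpy as np

def bovw_histogram(view_words, vocab_size):
    """Normalized histogram counting how often each 'view word'
    (cluster id of a projected view) occurs for one 3D model."""
    h = np.bincount(view_words, minlength=vocab_size).astype(float)
    return h / h.sum()

def hist_intersection(h1, h2):
    """Similarity between two normalized BoVW histograms (1.0 = identical)."""
    return float(np.minimum(h1, h2).sum())

# Hypothetical view-word ids for the projected views of three models
model_a = bovw_histogram(np.array([0, 0, 1, 2, 2, 2]), vocab_size=4)
model_b = bovw_histogram(np.array([0, 1, 2, 2, 2, 2]), vocab_size=4)  # similar mix
model_c = bovw_histogram(np.array([3, 3, 3, 3, 1, 1]), vocab_size=4)  # different views
# model_a scores higher against model_b than against model_c, so b ranks first.
```

The pyramid variant in the thesis repeats this comparison at several codebook resolutions and combines the per-level scores, rewarding matches found at finer levels.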
