171

Online searchers in Australia : backgrounds, experience, attitudes, behaviours, styles and satisfaction

Byrne, Alex, n/a January 1988 (has links)
Online searchers in Australia were studied through six sets of variables: backgrounds, experience, attitudes, behaviours, styles and satisfaction. A mailed questionnaire attracted a response rate of 84.5 per cent. Respondents were drawn equally from academic and special libraries. Those in special libraries tended to be more satisfied with their searches, and favoured adaptability but not preplanning. Those whose organisations levied charges appeared to search less often and to have less faith in controlled vocabularies. A minority with computational backgrounds tended to have more searching experience. Many respondents searched infrequently and had conducted low total numbers of searches. Those searching more often were less cost conscious, and more in favour of trial-and-error and reviewing retrieved titles. Searchers who had conducted more searches favoured trial-and-error, browsing and reviewing retrieved titles. Controlled vocabularies, adaptability (related to a disinclination to review retrieved titles), trial-and-error and browsing were favoured. Fidel's conceptualist style tended to be adopted by those favouring trial-and-error. Her operationalist style was considered routine and positively related to perceived user satisfaction with searches. Some concern about cost was related to a tendency to plan alternative strategies.
172

Effective web crawlers

Ali, Halil, hali@cs.rmit.edu.au January 2008 (has links)
Web crawlers are the component of a search engine that must traverse the Web, gathering documents into a local repository for indexing by a search engine so that they can be ranked by their relevance to user queries. Whenever data is replicated in an autonomously updated environment, there are issues with maintaining up-to-date copies of documents. When documents are retrieved by a crawler and have subsequently been altered on the Web, the effect is an inconsistency in user search results. While the impact depends on the type and volume of change, many existing algorithms do not take the degree of change into consideration, instead using simple measures that consider any change as significant. Furthermore, many crawler evaluation metrics do not consider index freshness or the amount of impact that crawling algorithms have on user results. Most of the existing work makes assumptions about the change rate of documents on the Web, or relies on the availability of a long history of change. Our work investigates approaches to improving index consistency: detecting meaningful change, measuring the impact of a crawl on collection freshness from a user perspective, developing a framework for evaluating crawler performance, determining the effectiveness of stateless crawl ordering schemes, and proposing and evaluating the effectiveness of a dynamic crawl approach. Our work is concerned specifically with cases where few or no past change statistics are available from which predictions can be made. Our work analyses different measures of change and introduces a novel approach to measuring the impact of recrawl schemes on search engine users. Our schemes detect important changes that affect user results. Other well-known and widely used schemes have to retrieve around twice the data to achieve the same effectiveness as our schemes. Furthermore, while many studies have assumed that the Web changes according to a model, our experimental results are based on real web documents. We analyse various stateless crawl ordering schemes that have no past change statistics with which to predict which documents will change, none of which, to our knowledge, has been tested to determine effectiveness in crawling changed documents. We empirically show that the effectiveness of these schemes depends on the topology and dynamics of the domain crawled and that no one static crawl ordering scheme can effectively maintain freshness, motivating our work on dynamic approaches. We present our novel approach to maintaining freshness, which uses the anchor text linking documents to determine the likelihood of a document changing, based on statistics gathered during the current crawl. We show that this scheme is highly effective when combined with existing stateless schemes. When we combine our scheme with PageRank, our approach allows the crawler to improve both freshness and quality of a collection. Our scheme improves freshness regardless of which stateless scheme it is used in conjunction with, since it uses both positive and negative reinforcement to determine which document to retrieve. Finally, we present the design and implementation of Lara, our own distributed crawler, which we used to develop our testbed.
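The dynamic approach summarised above — estimating how likely a page is to have changed from the anchor text observed during the current crawl, and blending that signal with a stateless ordering such as PageRank — might be sketched roughly as follows. The class and method names, the smoothing, and the weighting are illustrative assumptions, not the implementation in the Lara crawler.

```python
# Hedged sketch: prioritising URLs for recrawl by combining a stateless score
# (e.g. PageRank) with a change-likelihood estimate derived from anchor text
# observed during the *current* crawl. Names and weights are illustrative.
import heapq
from collections import defaultdict

class DynamicCrawlQueue:
    def __init__(self, static_scores, alpha=0.5):
        self.static_scores = static_scores          # e.g. PageRank per URL (assumed precomputed)
        self.alpha = alpha                          # blend between static score and change signal
        self.votes = defaultdict(lambda: [0, 0])    # url -> [changed votes, unchanged votes]

    def observe_anchor(self, target_url, anchor_text, previous_anchor_text):
        """Positive reinforcement if anchor text pointing at a page has changed
        since the last crawl, negative reinforcement if it has not."""
        if anchor_text != previous_anchor_text:
            self.votes[target_url][0] += 1
        else:
            self.votes[target_url][1] += 1

    def change_likelihood(self, url):
        changed, unchanged = self.votes[url]
        return (changed + 1) / (changed + unchanged + 2)   # smoothed estimate

    def priority(self, url):
        return (self.alpha * self.static_scores.get(url, 0.0)
                + (1 - self.alpha) * self.change_likelihood(url))

    def next_batch(self, frontier, k):
        """Pick the k frontier URLs most worth recrawling next."""
        return heapq.nlargest(k, frontier, key=self.priority)
```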
173

Advanced Intranet Search Engine

Narayan, Nitesh January 2009 (has links)
Information retrieval has been a pervasive part of human society since its existence. With the advent of the internet and the World Wide Web it became an extensive area of research and a major focus, which led to the development of various search engines to locate the desired information, mostly for globally connected computer networks, viz. the internet. But there is another major part of computer networking, viz. the intranet, which has not seen much advancement in information retrieval approaches, in spite of being a major source of information within a large number of organizations. The most common technique for intranet-based search engines is still merely database-centric. Thus, in practice, intranets are unable to avail themselves of the benefits of the sophisticated techniques that have been developed for internet-based search engines without exposing their data to commercial search engines. In this Master's-level thesis we propose a "state of the art" architecture for an advanced intranet search engine which is capable of dealing with the continuously growing size of an intranet's knowledge base. This search engine employs lexical processing of documents, where documents are indexed and searched based on standalone terms or keywords, along with semantic processing of the documents, where the context of the words and the relationships among them are given more importance. Combining lexical and semantic processing of the documents gives an effective approach to handling navigational queries along with research queries, in contrast to modern search engines, which use either lexical processing or semantic processing (or one as the major component) of the documents. We give equal importance to both approaches in our design, taking the best of both worlds. This work also takes into account various widely acclaimed concepts, like inference rules, ontologies and active feedback from the user community, to continuously enhance and improve the quality of search results, along with the possibility of inferring and deducing new knowledge from the existing knowledge, while preparing for the advent of the semantic web.
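One way to read the combined lexical-plus-semantic ranking described in the abstract above is as a weighted sum of a keyword-overlap score and a similarity score over concept representations. The sketch below is only an illustration of that reading: the term-frequency scoring, the hand-made concept vectors and the equal weighting are placeholder assumptions, not the architecture proposed in the thesis.

```python
# Hedged sketch: blending a lexical (keyword) score with a semantic
# (concept-vector) score. Vectors and weights are toy placeholders.
import math
from collections import Counter

def lexical_score(query, doc_text):
    """Fraction of query terms that appear in the document (toy keyword overlap)."""
    q_terms = query.lower().split()
    doc_terms = Counter(doc_text.lower().split())
    return sum(1 for t in q_terms if doc_terms[t] > 0) / max(len(q_terms), 1)

def cosine(u, v):
    """Cosine similarity between two sparse concept vectors (dicts)."""
    dot = sum(u[k] * v.get(k, 0.0) for k in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def combined_score(query_vec, doc_vec, query, doc_text, w_lex=0.5):
    """Equal importance to lexical and semantic evidence, as the abstract suggests."""
    return w_lex * lexical_score(query, doc_text) + (1 - w_lex) * cosine(query_vec, doc_vec)

# Usage with hypothetical concept vectors:
q_vec = {"travel": 0.8, "policy": 0.6}
d_vec = {"travel": 0.7, "reimbursement": 0.5, "policy": 0.4}
print(combined_score(q_vec, d_vec, "travel policy",
                     "corporate travel policy and reimbursement rules"))
```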
174

The Combinatorics of Heuristic Search Termination for Object Recognition in Cluttered Environments

Grimson, W. Eric L. 01 May 1989 (has links)
Many recognition systems use constrained search to locate objects in cluttered environments. Earlier analysis showed that the expected search is quadratic in the number of model and data features, if all the data comes from one object, but is exponential when spurious data is included. To overcome this, many methods terminate search once an interpretation that is "good enough" is found. We formally examine the combinatorics of this, showing that correct termination procedures dramatically reduce search. We provide conditions on the object model and the scene clutter such that the expected search is quartic. These results are shown to agree with empirical data for cluttered object recognition.
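As a toy illustration of the termination strategy analysed above, the sketch below runs a constrained interpretation-tree search that assigns model features to data features (or a wildcard) and stops as soon as a threshold number of model features is accounted for. The pairwise consistency predicate and the one-dimensional "features" in the usage example are stand-ins; the contribution summarised in the abstract is the combinatorial analysis of when such termination keeps the expected search polynomial, not any particular implementation.

```python
# Hedged sketch: constrained search with early termination once `threshold`
# model features are matched consistently. `consistent` stands in for the
# pairwise geometric constraints used in real recognition systems.
WILDCARD = None

def search(model, data, consistent, threshold, assignment=None):
    assignment = assignment or []
    matched = sum(1 for a in assignment if a is not WILDCARD)
    if matched >= threshold:                    # interpretation is "good enough": terminate
        return list(assignment)
    if len(assignment) == len(model):
        return None                             # branch exhausted without success
    i = len(assignment)                         # next model feature to assign
    for d in list(range(len(data))) + [WILDCARD]:
        if d is not WILDCARD and any(
            a is not WILDCARD and not consistent(model[i], data[d], model[j], data[a])
            for j, a in enumerate(assignment)
        ):
            continue                            # violates a pairwise constraint
        result = search(model, data, consistent, threshold, assignment + [d])
        if result is not None:
            return result
    return None

# Example with 1-D "features" where consistency means preserving pairwise distance:
consistent = lambda m1, d1, m2, d2: abs(abs(m1 - m2) - abs(d1 - d2)) < 0.1
print(search([0.0, 1.0, 2.5], [5.0, 6.0, 7.5, 9.9], consistent, threshold=3))
```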
175

On the Verification of Hypothesized Matches in Model-Based Recognition

Grimson, W. Eric L., Huttenlocher, Daniel P. 01 May 1989 (has links)
In model-based recognition, ad hoc techniques are used to decide if a match of data to model is correct. Generally an empirically determined threshold is placed on the fraction of model features that must be matched. We rigorously derive conditions under which to accept a match, relating the probability of a random match to the fraction of model features accounted for, as a function of the number of model features, number of image features and the sensor noise. We analyze some existing recognition systems and show that our method yields results comparable with experimental data.
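As a rough illustration of the kind of relationship described above, the snippet below picks the smallest acceptable fraction of matched model features such that a purely random interpretation is unlikely to reach it. The binomial model of random matches is an assumed stand-in chosen for simplicity; it is not the bound derived in the paper.

```python
# Hedged sketch: choose a match-acceptance threshold so that random matches
# are unlikely. The binomial model is an illustrative stand-in.
from math import comb

def random_match_tail(m, p, k):
    """P(at least k of m model features are matched by chance),
    assuming each is independently matched with probability p."""
    return sum(comb(m, i) * p**i * (1 - p)**(m - i) for i in range(k, m + 1))

def acceptance_fraction(m, n, noise_area, image_area, delta=0.001):
    """Smallest fraction f such that a random interpretation accounts for
    f*m features with probability < delta. p grows with clutter (n image
    features) and with the size of the sensor-noise error region."""
    p = min(1.0, n * noise_area / image_area)
    for k in range(m + 1):
        if random_match_tail(m, p, k) < delta:
            return k / m
    return 1.0

print(acceptance_fraction(m=20, n=100, noise_area=0.001, image_area=1.0))
```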
176

Type-omega DPLs

Arkoudas, Konstantine 16 October 2001 (has links)
Type-omega DPLs (Denotational Proof Languages) are languages for proof presentation and search that offer strong soundness guarantees. LCF-type systems such as HOL offer similar guarantees, but their soundness relies heavily on static type systems. By contrast, DPLs ensure soundness dynamically, through their evaluation semantics; no type system is necessary. This is possible owing to a novel two-tier syntax that separates deductions from computations, and to the abstraction of assumption bases, which is factored into the semantics of the language and allows for sound evaluation. Every type-omega DPL properly contains a type-alpha DPL, which can be used to present proofs in a lucid and detailed form, exclusively in terms of primitive inference rules. Derived inference rules are expressed as user-defined methods, which are "proof recipes" that take arguments and dynamically perform appropriate deductions. Methods arise naturally via parametric abstraction over type-alpha proofs. In that light, the evaluation of a method call can be viewed as a computation that carries out a type-alpha deduction. The type-alpha proof "unwound" by such a method call is called the "certificate" of the call. Certificates can be checked by exceptionally simple type-alpha interpreters, and thus they are useful whenever we wish to minimize our trusted base. Methods are statically closed over lexical environments, but dynamically scoped over assumption bases. They can take other methods as arguments, they can iterate, and they can branch conditionally. These capabilities, in tandem with the bifurcated syntax of type-omega DPLs and their dynamic assumption-base semantics, allow the user to define methods in a style that is disciplined enough to ensure soundness yet fluid enough to permit succinct and perspicuous expression of arbitrarily sophisticated derived inference rules. We demonstrate every major feature of type-omega DPLs by defining and studying NDL-omega, a higher-order, lexically scoped, call-by-value type-omega DPL for classical zero-order natural deduction---a simple choice that allows us to focus on type-omega syntax and semantics rather than on the subtleties of the underlying logic. We start by illustrating how type-alpha DPLs naturally lead to type-omega DPLs by way of abstraction; present the formal syntax and semantics of NDL-omega; prove several results about it, including soundness; give numerous examples of methods; point out connections to the lambda-phi calculus, a very general framework for type-omega DPLs; introduce a notion of computational and deductive cost; define several instrumented interpreters for computing such costs and for generating certificates; explore the use of type-omega DPLs as general programming languages; show that DPLs do not have to be type-less by formulating a static Hindley-Milner polymorphic type system for NDL-omega; discuss some idiosyncrasies of type-omega DPLs such as the potential divergence of proof checking; and compare type-omega DPLs to other approaches to proof presentation and discovery. Finally, a complete implementation of NDL-omega in SML-NJ is given for users who want to run the examples and experiment with the language.
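The mechanism sketched in this abstract — soundness enforced dynamically by evaluating deductions against an assumption base, rather than statically by a type system — can be caricatured in a few lines. The toy below is only meant to convey the flavour of assumption-base semantics; it is not NDL-omega, and the representation of propositions is invented for the example.

```python
# Hedged toy: a deduction primitive that only succeeds if its premises are
# already in the assumption base, so an unsound "proof" fails at evaluation time.
class AssumptionBase:
    def __init__(self, assumptions=()):
        self.props = set(assumptions)

    def assume(self, p):
        self.props.add(p)

    def modus_ponens(self, antecedent, implication):
        """`implication` is a tuple ('if', antecedent, consequent)."""
        tag, ante, cons = implication
        if tag != 'if' or ante != antecedent:
            raise ValueError("implication does not match antecedent")
        if antecedent not in self.props or implication not in self.props:
            raise ValueError("premise not in assumption base")  # dynamic soundness check
        self.props.add(cons)                                    # conclusion becomes available
        return cons

ab = AssumptionBase({'rain', ('if', 'rain', 'wet')})
print(ab.modus_ponens('rain', ('if', 'rain', 'wet')))   # -> 'wet'
```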
177

Constraint-directed search : a case study of job-shop scheduling /

Fox, Mark, January 1983 (has links)
Thesis (Ph. D.)--Carnegie-Mellon University, 1983. / Bibliography: p. 147-153.
178

Bootstrap Learning of Heuristic Functions

Jabbari Arfaee, Shahab 11 1900 (has links)
We investigate the use of machine learning to create effective heuristics for single-agent search. Our method aims to generate a sequence of heuristics from a given weak heuristic h_0 and a set of unlabeled training instances using a bootstrapping procedure. The training instances that can be solved using h_0 provide training examples for a learning algorithm that produces a heuristic h_1 that is expected to be stronger than h_0. If h_0 is so weak that it cannot solve any of the given instances, we use random walks backward from the goal state to create a sequence of successively more difficult training instances, starting with ones that are guaranteed to be solvable by h_0. The bootstrap process is then repeated using h_i in place of h_(i-1) until a sufficiently strong heuristic is produced. We test this method on the 15- and 24-sliding tile puzzles, the 17-, 24-, and 35-pancake puzzles, Rubik's Cube, and the 15- and 20-blocks world. In every case our method produces heuristics that allow IDA* to solve randomly generated problem instances quickly with solutions very close to optimal. The total time for the bootstrap process to create strong heuristics for large problems is several days. To make the process efficient when only a single test instance needs to be solved, we look for a balance between the time spent learning better heuristics and the time needed to solve the test instance using the current set of learned heuristics. We alternate between the execution of two threads, namely the learning thread (to learn better heuristics) and the solving thread (to solve the test instance). The solving thread is split up into sub-threads. The first solving sub-thread aims at solving the instance using the initial heuristic. When a new heuristic is learned in the learning thread, an additional solving sub-thread is started which uses the new heuristic to try to solve the instance. The total time by which we evaluate this process is the sum of the times used by both threads up to the point when the instance is solved in one sub-thread. The experimental results of this method on large search spaces demonstrate that single instances of large problems are solved substantially faster than the total time needed for the bootstrap process, while the solutions obtained are still very close to optimal.
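The bootstrapping procedure described above reduces to a loop of the following shape: solve what the current heuristic can solve, learn a stronger heuristic from those solutions, and repeat. In the sketch below the solver, learner and easier-instance generator are injected as placeholder functions, and the parallel learning/solving threads discussed in the abstract are ignored.

```python
# Hedged sketch of the bootstrap loop: solve what the current heuristic can,
# learn a stronger heuristic from those solutions, and repeat. All callables
# are assumed/injected placeholders, not the thesis's implementation.
def bootstrap(h0, instances, solve, learn, make_easier_instances,
              min_solved=50, max_iterations=10):
    h = h0
    for _ in range(max_iterations):
        labelled = []
        for inst in instances:
            solution = solve(inst, h)                    # e.g. IDA* with heuristic h and a time limit
            if solution is not None:
                labelled.append((inst, len(solution)))   # solution cost used as training label
        if len(labelled) < min_solved:
            # h is too weak: generate easier instances by random walks backward
            # from the goal, guaranteed solvable with the current heuristic.
            instances = make_easier_instances(h)
            continue
        h = learn(labelled, h)                           # produce h_i from instances solved with h_(i-1)
    return h
```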
179

Enterprise Users and Web Search Behavior

Lewis, April Ann 01 May 2010 (has links)
This thesis describes an analysis of user web query behavior associated with Oak Ridge National Laboratory's (ORNL) Enterprise Search System (hereafter, the ORNL Intranet). The ORNL Intranet provides users with a means to search all kinds of data stores for relevant business and research information using a single query. The Global Intranet Trends for 2010 Report suggests the biggest current obstacle for corporate intranets is "findability and Siloed content". Intranets differ from the internet in the way they create, control, and share content, which can often make it difficult and sometimes impossible for users to find information. Stenmark (2006) first noted that studies of corporate internal search behavior are lacking and appealed for more published research on the subject. This study employs mature internet web query transaction log analysis (TLA) techniques to examine how corporate intranet users at ORNL search for information. The focus of the study is to better understand general search behaviors and to identify unique trends associated with query composition and vocabulary. The results are compared to published intranet studies. A literature review suggests only a handful of intranet-based web search studies exist, and each focuses largely on a single aspect of intranet search. This implies that the ORNL study is the first to comprehensively analyze a corporate intranet user web query corpus and provide the results to the public. This study analyzes 65,000 user queries submitted to the ORNL intranet from September 17, 2007 through December 31, 2007. A granular relational data model first introduced by Wang, Berry, and Yang (2003) for Web query analysis was adopted and modified for data mining and analysis of the ORNL query corpus. The ORNL query corpus is characterized using Zipf distributions, descriptive word statistics, and mutual information. User search vocabulary is analyzed using frequency distributions and probability statistics. The results showed that ORNL users searched for unique types of information. ORNL users are uncertain how best to formulate queries and do not use search interface tools to narrow search scope. Special domain language comprised 38% of the queries. The average number of results returned per query was too high, and 16.34% of queries returned no hits.
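A small example of the kind of transaction-log analysis described above — building a rank-frequency distribution over query terms and estimating how closely it follows a Zipf law — is sketched here. The toy query log and the plain whitespace tokenisation are invented for illustration; the study itself uses a relational data model adapted from Wang, Berry, and Yang (2003).

```python
# Hedged sketch: rank-frequency statistics for query terms and a rough Zipf
# exponent estimated by least squares on the log-log rank-frequency curve.
import math
from collections import Counter

def term_frequencies(queries):
    """Return [(term, frequency), ...] ordered by rank (most frequent first)."""
    terms = Counter()
    for q in queries:
        terms.update(q.lower().split())
    return terms.most_common()

def zipf_exponent(ranked):
    """Slope of log(frequency) against log(rank); Zipf's law predicts about 1."""
    xs = [math.log(r) for r in range(1, len(ranked) + 1)]
    ys = [math.log(f) for _, f in ranked]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return -slope

query_log = ["travel policy", "timesheet", "travel reimbursement", "timesheet login", "travel"]
ranked = term_frequencies(query_log)
print(ranked, zipf_exponent(ranked))
```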
180

Using Rabin-Karp fingerprints and LevelDB for faster searches

Deighton, Richard A. 01 December 2012 (has links)
This thesis presents the results of a study into using fingerprints generated according to the Rabin-Karp Algorithm, together with the LevelDB key-value database, to achieve text search times below GREP, a standard command-line UNIX text search tool. Text search is a set of algorithms that find a string of characters, called a Search Pattern, in a much larger string of characters in a document we call a text file. The Rabin-Karp Algorithm iterates through a text file, converting character strings into fingerprints at each location. A fingerprint numerically represents a window-length string of characters to the left of its location. The algorithm compares the calculated fingerprint to the Search Pattern's fingerprint. When fingerprints are not equal, we can guarantee that the corresponding strings do not match; when they are equal, the strings probably match. A verification process confirms matches by checking the respective characters. Our application emerges after making the following major changes to the Rabin-Karp Algorithm. First, we employ a two-step technique rather than one. During step 1, the preprocessing step, we calculate and store fingerprints in a LevelDB database called an Index Database. This is our first major change. Step 2, the matching step, is our second unique change. We use the Index Database to look up the Search Pattern's fingerprint and gather its set of locations. Finally, we allow the pattern to be any length relative to the window length. We even created an equation to check whether the difference in length is too great for the fingerprint's number system base. We facilitated our performance experiments by first building our application and testing it against GREP for a wide range of different parameters. Our conclusions and recommendations show that although we currently outperform GREP in only about half the cases, we identify some promising opportunities to modify parts of our application so that we can outperform GREP in all instances. / UOIT
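The two-step scheme described in this abstract — precompute a rolling Rabin-Karp fingerprint at every position of the text and store the positions in an index during preprocessing, then answer a search by looking up the pattern's fingerprint and verifying candidates character by character — might look roughly like the sketch below. A plain dictionary stands in for the LevelDB Index Database, and the window length, base and modulus are arbitrary illustrative choices, not the parameters used in the thesis.

```python
# Hedged sketch: Rabin-Karp rolling fingerprints with an index built in a
# preprocessing step (a dict stands in for LevelDB) and a lookup-plus-verify
# matching step. Parameters are illustrative.
BASE, MOD, WINDOW = 256, (1 << 61) - 1, 8

def fingerprints(text, w=WINDOW):
    """Yield (position, fingerprint) for every length-w window of text."""
    if len(text) < w:
        return
    high = pow(BASE, w - 1, MOD)
    fp = 0
    for ch in text[:w]:
        fp = (fp * BASE + ord(ch)) % MOD
    yield 0, fp
    for i in range(1, len(text) - w + 1):
        # Roll the window: drop the leftmost character, append the next one.
        fp = ((fp - ord(text[i - 1]) * high) * BASE + ord(text[i + w - 1])) % MOD
        yield i, fp

def build_index(text):
    """Preprocessing step: fingerprint -> list of positions (LevelDB stand-in)."""
    index = {}
    for pos, fp in fingerprints(text):
        index.setdefault(fp, []).append(pos)
    return index

def search(pattern, text, index):
    """Matching step: look up the fingerprint of the pattern's first WINDOW
    characters, then verify the full pattern character by character."""
    if len(pattern) < WINDOW:
        return [i for i in range(len(text) - len(pattern) + 1)
                if text[i:i + len(pattern)] == pattern]   # fall back to a direct scan
    _, fp = next(fingerprints(pattern[:WINDOW]))
    return [p for p in index.get(fp, [])
            if text[p:p + len(pattern)] == pattern]

text = "the quick brown fox jumps over the lazy dog"
idx = build_index(text)
print(search("brown fox", text, idx))   # -> [10]
```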
