331 |
CSI in the Web 2.0 Age: Data Collection, Selection, and Investigation for Knowledge Discovery
Fu, Tianjun January 2011
The growing popularity of Web 2.0 media has created massive amounts of user-generated content such as online reviews, blog articles, shared videos, forum threads, and wiki pages. Such content provides insights into web users' preferences and opinions, online communities, and knowledge generation, and it presents opportunities for many knowledge discovery problems. However, several challenges need to be addressed: data collection procedures must deal with the unique characteristics and structures of various Web 2.0 media; advanced data selection methods are required to identify data relevant to specific knowledge discovery problems; and interactions between Web 2.0 users, which are often embedded in user-generated content, require effective methods for identification, modeling, and analysis. In this dissertation, I intend to address these challenges and focus on three types of knowledge discovery tasks: (data) collection, selection, and investigation. Organized in this "CSI" framework, five studies that explore and propose solutions to these tasks for particular Web 2.0 media are presented. In Chapter 2, I study focused and hidden Web crawlers and propose a novel crawling system for Dark Web forums that addresses several issues unique to hidden web data collection. In Chapter 3, I explore the use of both topical and sentiment information in web crawling. This information is also used to label nodes in web graphs that are employed by a graph-based tunneling mechanism to improve collection recall. Chapter 4 extends the work in Chapter 3 by exploring whether other graph comparison techniques can be used in tunneling for focused crawlers. A subtree-based tunneling method that can scale up to large graphs is proposed and evaluated. Chapter 5 examines the usefulness of user-generated content in online video classification. Three types of text features are extracted from the collected user-generated content and used by several feature-based classification techniques to demonstrate the effectiveness of the proposed text-based video classification framework. Chapter 6 presents an algorithm to identify forum user interactions and shows how they can be used for knowledge discovery. The algorithm uses a range of system and linguistic features and adopts several similarity-based methods to account for interactional idiosyncrasies.
|
332 |
Effectively Visualizing Library Data
Phetteplace, Eric 20 December 2012
As libraries collect more and more data, it is worth taking some time to analyze the data we collect and effectively present it. This article details how to use visualization to investigate trends and make compelling arguments with data.
|
333 |
A DECENTRALIZED ADAPTIVE CONTROL SCHEME FOR ROBOTIC MANIPULATORS.
Koenig, Mark A. January 1985
No description available.
|
334 |
Knowledge discovery from distributed aggregate data in data warehouses and statistical databases
Páircéir, Rónán January 2002
No description available.
|
335 |
Mis-reporting of food intake by UK adults
O'Reilly, Leona January 2001
No description available.
|
336 |
Performance and complexity of lattice codes for the Gaussian channel
Sheppard, J. A. January 1996
No description available.
|
337 |
An analysis of cost efficiency in English acute hospitals
Jacobs, Rowena January 2002
No description available.
|
338 |
The specification and implementation of an Extended Relational Model and its application within an Integrated Project Support Environment
Earl, A. N. January 1988
No description available.
|
339 |
Acquisition and storage of data from manual and automated ultrasonic systems
Towert, D. S. W. M. January 1988
No description available.
|
340 |
PONI : an intelligent alarm system for respiratory and circulatory management in the operating rooms
Matsiras, Paul V. January 1989
No description available.
|