Global ETD Search

1	Automatic identification and removal of low quality online information Webb, Steve. January 2008 (has links) Thesis (Ph.D)--Computing, Georgia Institute of Technology, 2009. / Committee Chair: Pu, Calton; Committee Member: Ahamad, Mustaque; Committee Member: Feamster, Nick; Committee Member: Liu, Ling; Committee Member: Wu, Shyhtsun Felix. Part of the SMARTech Electronic Thesis and Dissertation Collection.
2	Automatic identification and removal of low quality online information Webb, Steve 17 November 2008 (has links) The advent of the Internet has generated a proliferation of online information-rich environments, which provide information consumers with an unprecedented amount of freely available information. However, the openness of these environments has also made them vulnerable to a new class of attacks called Denial of Information (DoI) attacks. Attackers launch these attacks by deliberately inserting low quality information into information-rich environments to promote that information or to deny access to high quality information. These attacks directly threaten the usefulness and dependability of online information-rich environments, and as a result, an important research question is how to automatically identify and remove this low quality information from these environments. The first contribution of this thesis research is a set of techniques for automatically recognizing and countering various forms of DoI attacks in email systems. We develop a new DoI attack based on camouflaged messages, and we show that spam producers and information consumers are entrenched in a spam arms race. To break free of this arms race, we propose two solutions. One solution involves refining the statistical learning process by associating disproportionate weights to spam and legitimate features, and the other solution leverages the existence of non-textual email features (e.g., URLs) to make the classification process more resilient against attacks. The second contribution of this thesis is a framework for collecting, analyzing, and classifying examples of DoI attacks in the World Wide Web. We propose a fully automatic Web spam collection technique and use it to create the Webb Spam Corpus -- a first-of-its-kind, large-scale, and publicly available Web spam data set. Then, we perform the first large-scale characterization of Web spam using content and HTTP session analysis. Next, we present a lightweight, predictive approach to Web spam classification that relies exclusively on HTTP session information. The final contribution of this thesis research is a collection of techniques that detect and help prevent DoI attacks within social environments. First, we provide detailed descriptions for each of these attacks. Then, we propose a novel technique for capturing examples of social spam, and we use our collected data to perform the first characterization of social spammers and their behaviors. Denial of information Email spam Web spam Social spam Applied machine learning Information security Spam filtering (Electronic mail) World Wide Web Online social networks Computer networks Security measures

Search results

Automatic identification and removal of low quality online information

Automatic identification and removal of low quality online information