Global ETD Search

1	Battling the Internet water army: detection of hidden paid posters. Chen, Cheng 04 July 2012. Online social media, such as news websites and community question answering (CQA) portals, have made useful information accessible to more people. However, many online comment areas and communities are flooded with fraudulent information. These messages come from a special group of online users called online paid posters, known in China as the "Internet water army", who represent a new type of online job opportunity. Online paid posters get paid for posting comments or articles on different online communities and websites for hidden purposes, e.g., to influence the opinion of other people towards certain social events or business markets. Though an interesting strategy in business marketing, paid posters may have a significant negative effect on online communities, since the information from paid posters is usually not trustworthy. We thoroughly investigate the behavioral patterns of online paid posters based on real-world trace data from the social comments of a business conflict. We design and validate a new detection mechanism, including both non-semantic analysis and semantic analysis, to identify potential online paid posters. Using supervised and unsupervised approaches, our test results with real-world datasets show very promising performance. / Graduate / online paid posters, machine learning, spam detection, behavioral pattern
2	Efficient Spam Detection across Online Social Networks Xu, Hailu January 2016. No description available. Computer Science
3	Methods for Analyzing the Evolution of Email Spam Nachenahalli Bhuthegowda, Bharath Kumar 11 January 2019. Email spam has grown steadily and has become a major problem for users, email service providers, and many other organizations. Many adversarial methods have been proposed to combat spam, and various studies have examined the evolution of email spam by finding evolution patterns and trends in historical spam data and by incorporating spam filters. In this thesis, we try to understand the evolution of email spam and how we can build better classifiers that remain effective against adaptive adversaries such as spammers. We compare various methods for analyzing the evolution of spam emails by combining spam filters with a spam dataset. We explore the trends based on the weights of the features learned by the classifiers and the accuracies of the classifiers trained and tested in different settings. We also evaluate the effectiveness of the classifier trained in adversarial settings on synthetic data. Adversarial classification, Email spam, Evolution of email spam, Machine learning, Spam detection
4	Transfer Learning for BioImaging and Bilingual Applications January 2015. Discriminative learning when training and test data belong to different distributions is a challenging and complex task. Often we have very little or no labeled data from the test or target distribution, but we may have plenty of labeled data from one or multiple related sources with different distributions. Due to its capability of migrating knowledge from related domains, transfer learning has been shown to be effective for cross-domain learning problems. In this dissertation, I carry out research along this direction with a particular focus on designing efficient and effective algorithms for BioImaging and Bilingual applications. Specifically, I propose deep transfer learning algorithms which combine transfer learning and deep learning to improve image annotation performance. Firstly, I propose to generate deep features for the Drosophila embryo images via pretrained deep models and build linear classifiers on top of the deep features. Secondly, I propose to fine-tune the pretrained model with a small number of labeled images. The time complexity and performance of the deep transfer learning methodologies are investigated. Promising results demonstrate the knowledge transfer ability of the proposed deep transfer algorithms. Moreover, I propose a novel Robust Principal Component Analysis (RPCA) approach to process the noisy images in advance. In addition, I present a two-stage re-weighting framework for general domain adaptation problems. The distribution of the source domain is mapped towards the target domain in the first stage, and an adaptive learning model is proposed in the second stage to incorporate label information from the target domain if it is available. The proposed model is then applied to tackle the cross-lingual spam detection problem at LinkedIn's website. Our experimental results on real data demonstrate the efficiency and effectiveness of the proposed algorithms. / Dissertation/Thesis / Doctoral Dissertation Computer Science 2015 / Computer science, Bilingual Spam Detection, BioImage Annotation, Deep Learning, Transfer Learning
5	Detekce nevyžádaných zpráv v mobilní komunikaci a na sociálních sítích / Detection of SPAM Messages in Mobile Communication and Social Networks Jaroš, Ján January 2014. This thesis deals with spam in mobile and social networks. It focuses on spam in SMS messages and on the web service Twitter. The theoretical part provides a brief overview of those two media and information about what spam is, how to defend against it, and where it comes from. It also lists methods for spam detection, many of which have their roots in the filtering of email communication. The rest of the thesis covers the design and implementation of an application for spam detection in SMS and Twitter messages, and the evaluation of its performance.
6	EVALUATION OF VISUAL ANALYTICS WITH APPLICATION TO SOCIAL SPAMBOT LABELING Mosab Abdulaziz Khayat (8992520) 23 June 2020. Visual analytics (VA) solutions emerged in the past decade and tackled many problems in a variety of domains. The power of combining the abilities of human and machine creates fertile ground for new solutions to grow. However, the rise of these hybrid solutions complicates the process of evaluation. Unlike automated solutions, the behavior of VA solutions depends on the user who operates them. This creates a dimension of variability in measured performance. The existence of a human, on the other hand, allows researchers to borrow evaluation methods from domains such as sociology. The challenge in these methods, however, lies in gathering and analyzing qualitative data to build valid evidence of usefulness. This thesis tackles the challenge of evaluating the usefulness of VA solutions. We survey existing evaluation methods that have been used to assess VA solutions. We then analyze these methods in terms of the validity and generalizability of their findings, as well as the feasibility of using them. Subsequently, we propose an evaluation framework which suggests evaluating VA solutions based on judgment analysis theory. The analysis provided by our framework is capable of quantitatively assessing the performance of a solution while providing a reason for the captured performance. We have conducted multiple case studies in the social spambot labeling domain to apply our theoretical discussion. We have developed a VA solution that tackles the social spambot labeling problem, then used this solution to apply existing evaluation methods and showcase some of their limitations. Furthermore, we have used our solution to show the benefit yielded by our proposed evaluation framework. Computer Engineering, Visual analytics, visualization methods, Evaluation, evaluation instrument, Spam detection
7	Towards Building a Versatile Tool for Social Media Spam Detection Abdel Halim, Jalal 15 June 2023. No description available. Computer Science, spam detection, social media, bert, sequential model, live detection, selenium, classifiers
8	Realizace spamového filtru na bázi umělého imunitního systému / Spam Filter Implementation on the Basis of Artificial Immune Systems Neuwirth, David January 2009. Unsolicited e-mails are a major problem in e-mail communication today. Several methods exist that can detect spam and distinguish it from wanted messages. The theoretical part of the master's thesis introduces ways of detecting unsolicited messages using artificial immune systems. It presents and then analyses several artificial immune system methods that can assist in the fight against spam. The practical part of the master's thesis deals with the implementation of a spam filter based on artificial immune systems. The project ends with a comparison of the effectiveness of the newly designed spam filter and one that uses common methods for spam detection.
9	Spam Analysis and Detection for User Generated Content in Online Social Networks Tan, Enhua 23 July 2013. No description available. Computer Engineering, Computer Science, user generated content, online social networks, user behavior, stretched exponential distribution, spam filtering, spam detection, spam classification, decision tree, social graph, user-link graph, Sybil attack, community detection, BARS, UNIK
10	Fake Mass-Produced Advertisements Detection on Global Online Adult Service Websites / Detektering av Falska Massproducerade Annonser på Globala Webbplatser som Erbjuder Eskorttjänster Pokropek, Ernest January 2023. A significant number of sex trafficking victims are advertised on online adult services, which are currently being flooded with spam. Investigators rely on online adult services to track cases of sex trafficking; however, the ever-increasing volume of spam makes their task progressively more difficult. This thesis presents a machine learning-based approach for detecting fake mass-produced advertisements on global online adult service websites. The objective is to aid investigators in tracking sex trafficking by developing a robust spam classifier that minimizes false positives on genuine ads while effectively identifying mass-produced spam. This objective is of utmost importance, as it allows spam to be filtered out effectively while ensuring that genuine ads are not mistakenly labeled as spam and so remain included in crucial investigations. The research involved cleaning advertisement text, generating text embeddings using sentence-BERT, clustering them with DBSCAN, and feature engineering for classification using a random forest classifier. A dataset of two million advertisements was used for training and evaluation. The study achieved the crucial goal of minimizing false positives, ensuring that genuine ads are not misclassified as spam. By employing innovative techniques and carefully engineered features, the classifier demonstrates a high level of recall in distinguishing mass-produced spam from authentic ads. Furthermore, the investigation identified key markers of mass-produced spam, such as geographical spread and frequent use of profane language. This research fills a significant research gap, as no previous attempts had been made to classify spam on these websites. The findings not only contribute to the field of machine learning but also provide a comprehensive overview of fraudulent advertisement features, making sex trafficking investigations more efficient. By equipping investigators with a reliable tool to navigate the vast amount of data associated with global online adult service websites, this work plays a crucial role in combating sex trafficking and ensuring the integrity of the investigative process. Machine learning, Spam detection, Mass-produced spam, Global adult online services, Computer Sciences, Computer Engineering, Computer and Information Sciences
