Paper title
Visual Spoofing in Content-Based Spam Detection
Authors
Abstract
Although spam classification is often regarded as a solved problem, current spam filters still contain vulnerabilities that are easy to exploit. We present one such vulnerability: an attacker can replace some characters in a message with the corresponding characters from a different alphabet. These characters are visually similar to the originals, yet have different Unicode code points. With this approach, spammers can create messages that bypass existing spam filters. Moreover, we show that the same approach can be used to evade plagiarism detection and to defeat other applications that use natural language processing for the automatic analysis of text documents.
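
The substitution described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `HOMOGLYPHS` table, the `spoof` function, and the toy `naive_keyword_filter` are hypothetical names introduced here, and the table covers only a handful of Latin letters that have well-known Cyrillic look-alikes. A real content-based filter is far more elaborate than a keyword match; the point is only that the spoofed text looks the same to a reader while consisting of different code points.

```python
# Hypothetical mapping from Latin letters to visually similar Cyrillic letters.
# Escape sequences are used so the distinct code points are visible in the source.
HOMOGLYPHS = {
    "a": "\u0430",  # CYRILLIC SMALL LETTER A
    "c": "\u0441",  # CYRILLIC SMALL LETTER ES
    "e": "\u0435",  # CYRILLIC SMALL LETTER IE
    "o": "\u043e",  # CYRILLIC SMALL LETTER O
    "p": "\u0440",  # CYRILLIC SMALL LETTER ER
    "x": "\u0445",  # CYRILLIC SMALL LETTER HA
    "y": "\u0443",  # CYRILLIC SMALL LETTER U
}


def spoof(text: str) -> str:
    """Replace every character that has an entry in HOMOGLYPHS; keep the rest."""
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in text)


def naive_keyword_filter(text: str, blocklist=("free", "prize")) -> bool:
    """Toy stand-in for a content-based filter: flag text containing a blocked word."""
    lowered = text.lower()
    return any(word in lowered for word in blocklist)


original = "free prize"
spoofed = spoof(original)

# The two strings render almost identically but differ at the code-point level.
print(original, [hex(ord(ch)) for ch in original])
print(spoofed, [hex(ord(ch)) for ch in spoofed])

# The toy filter flags the original text but misses the spoofed one.
print(naive_keyword_filter(original), naive_keyword_filter(spoofed))
```

Under these assumptions, the toy filter catches the original string and misses the spoofed one, because the blocked words no longer appear as sequences of Latin code points. The same mismatch would affect any text-analysis step that compares raw code points without first mapping look-alike characters to a common form.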