Abstract
Telephone spam and scams have become an increasingly prevalent problem in many countries around the world. To detect telephone spam effectively, we present a novel detection system that combines unsupervised and supervised machine learning methods to mine new, previously unknown spam numbers from large datasets of call detail records (CDRs), starting from a small seed of confirmed spam phone numbers. Our experimental results show that the system can expand the initial seed of known spam numbers by up to about 250% while aiming for zero false positives. In our research on telephony spam, we noticed that most previously proposed mitigation methods can be evaded by robocalls that use caller ID spoofing. To fight such mass robocalls, we propose a novel prototype of a virtual assistant (VA) application for smartphones that automatically vets incoming calls. The VA system can pick up an incoming call and screen it with deep learning modules, without interrupting the user, to determine whether the call is unwanted. Furthermore, we perform a comprehensive investigation of a specific type of telephony scam that tricks victims into calling the scammers, namely the Technical Support Scam (TSS). Because modern TSS websites intentionally create text content that is highly similar to that of benign technical support pages, current content-based models are not sufficient to detect them. We first report the major components of the TSS ecosystem, based on an investigation of the TSS underground market. With this understanding, we build a multi-source pipeline to collect a large number of ground-truth TSS websites from both the search results and the ads of mainstream search engines. After analyzing the characteristics of TSS websites along multiple dimensions, such as the prominence of phone numbers in the web page layout, historical changes in phone numbers, and the importance of backlinks in promoting the rank of TSS websites, we propose a novel topic-agnostic model to detect TSS in search results and make suggestions for fighting TSS in ads. The experimental results show that our proposed defense is effective.