Understanding and Defending Against Telephone Scams with Large-Scale Data Analytics and Machine Learning Systems

Liu, Jienan

Abstract

Telephone spam and scams have become an increasingly prevalent problem in many countries around the world. To detect telephone spam effectively, we present a novel detection system that combines unsupervised and supervised machine learning methods to mine new, previously unknown spam numbers from large datasets of call detail records (CDRs), starting from a small seed of confirmed spam phone numbers. Our experimental results show that the system can expand the initial seed of known spam numbers by up to about 250% while aiming for zero false positives. In our research on telephony spam, we noticed that most previously proposed mitigation methods can be evaded by robocalls that leverage caller ID spoofing. To fight such mass robocalls, we propose a novel prototype of a virtual assistant (VA) application for smartphones that automatically vets incoming calls. The VA system can pick up an incoming call and screen it with deep learning modules, without interrupting the user, to determine whether the call is unwanted. Furthermore, we perform a comprehensive investigation of a specific type of telephony scam that tricks victims into calling scammers, namely the Technical Support Scam (TSS). Because modern TSS websites intentionally create text content that is highly similar to that of benign technical support pages, current content-based models are not sufficient to detect TSS websites. We first report the major components of the TSS ecosystem, based on an investigation of the TSS underground market. With this understanding, we build a multi-source pipeline to collect a large number of ground-truth TSS websites from both search results and ads on mainstream search engines. After analyzing the characteristics of TSS websites along multiple dimensions, such as phone number prominence in the web page layout, historical phone number changes, and backlink importance in TSS website rank promotion, we propose a novel topic-agnostic model to detect TSS in search results and make suggestions for fighting TSS in ads. The experimental results show that our proposed defense is effective.

Details

Record ID

4365

Record Created

2024-12-05

Title

Understanding and Defending Against Telephone Scams with Large-Scale Data Analytics and Machine Learning Systems

Author

Liu, Jienan

Contributor

Perdisci, Roberto (Advisor)
Lee, Kyu Hyung (Committee Member)
Guan, Le (Committee Member)

College or School

Computer Sciences

Subjects

Computer science

Content Type

Dissertation

Pagination

125

File Format

pdf

Language

English

Degree Type

Doctor of Philosophy (PhD)

Name of Granting Institution

University of Georgia

Year Degree Granted

2021-12

Keywords

Machine Learning; Robocall; Tech Support Scam; Telephone Blacklist; Telephony Spam

Record Appears in

Electronic Theses and Dissertations > Doctoral Dissertation
All Resources
Doctoral

System Control Number

9949420827602959
