Abstract
Scam phone calls have become a serious security issue in telephony networks. A 2015 survey indicates that 27 million U.S. phone users lost approximately $7.4 billion to scam calls [1]. Unfortunately, users are not able to detect whether an unknown incoming call is from a fraud campaign or a legitimate user, and currently there is no single effective way to handle this dilemma.

We propose a novel three-layer automatic scam labeling system which, for the first time, uses an online dataset of user complaints to identify spam and scam campaigns. The system utilizes topic modeling methods to draw the most frequent fraud campaigns from the text, and separates the extracted information into high-quality and low-quality topics using our proposed word intrusion with repeated sampling method. Finally, the system maps the high-quality topics back to the related phone numbers and groups the numbers based on campaign intention.
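The final layer, mapping high-quality topics back to phone numbers, can be illustrated with a minimal sketch. It is not the system's implementation: the complaint records, topic word sets, topic names, and the helpers `best_topic` and `group_numbers` are all hypothetical, and a simple word-overlap score stands in for the assignment a real topic model would produce.

```python
from collections import defaultdict

# Hypothetical toy data: (phone number, complaint text) pairs.
complaints = [
    ("555-0101", "caller claimed to be the irs and demanded tax payment"),
    ("555-0102", "irs agent threatened arrest over unpaid tax"),
    ("555-0103", "free cruise vacation prize offer"),
    ("555-0101", "said i owe tax money to the irs"),
    ("555-0104", "hello hello nobody answered"),
]

# Hypothetical topics, each given as a set of top words.
# In a real pipeline these would come from a topic model.
topics = {
    "tax_scam": {"irs", "tax", "arrest", "payment"},
    "travel_prize": {"cruise", "vacation", "prize", "free"},
    "noise": {"hello", "nobody", "answered"},
}

# Assumption: an earlier quality check kept only these topics.
high_quality = {"tax_scam", "travel_prize"}


def best_topic(text, topics):
    """Return the topic sharing the most words with the text, or None."""
    words = set(text.split())
    scores = {name: len(words & top) for name, top in topics.items()}
    name = max(scores, key=scores.get)
    return name if scores[name] > 0 else None


def group_numbers(complaints, topics, high_quality):
    """Group phone numbers under the high-quality topic of their complaints."""
    groups = defaultdict(set)
    for number, text in complaints:
        topic = best_topic(text, topics)
        if topic in high_quality:
            groups[topic].add(number)
    return {topic: sorted(numbers) for topic, numbers in groups.items()}


campaigns = group_numbers(complaints, topics, high_quality)
print(campaigns)
```

In this toy run, numbers whose complaints match only the low-quality topic are dropped, and a number reported several times appears once in its campaign group.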