How SMS stakeholders can reduce spam and phishing messages on behalf of mobile operators

First, let’s address some conflation — it’s leading to some confusion amongst operators and vendors.

Paul Walsh

Published in METACERT, Jul 16, 2021 (8 min read)

(Conflation is the merging of two or more sets of information, texts, ideas, opinions, etc., into one, often in error).

SPAM vs Phishing

Spam = annoying
Phishing = dangerous

Spam can be defined as unwanted messages sent by people and entities trying to promote or sell unwanted products and services.

Phishing can be defined as the technique used to impersonate a person or entity on the Internet. That’s it. Nothing more. Phishing should never be defined as the theft of personal information — that’s just one of the desired outcomes. Phishing refers only to the technique — not the outcome.

It’s not wrong to combine “Phishing” and “Spam” inside security solutions. But it’s wrong to assume both are classified in the same way. They are not. Classifying “Phishing” URLs requires a unique set of skills, experience, technology, techniques and insights. Classifying pornography-related URLs also requires unique attributes.

How to reduce “SPAM”

I’m not an expert in anti-spam, so this is not an exhaustive list. I’m an expert in URL Classification, and I’m sure my own team would add many more insights below if I asked. We built software and services that listen to the entire Twitter firehose, detect signals attributed to phishing-related scams, and then automatically classify the URLs used inside a tweet. We did the same for misinformation and fake news. With enough data points and well-defined AI and machine learning rules, picking up signals is pretty easy. What’s mathematically impossible, however, is making a determination about a URL with no other data points. It’s IMPOSSIBLE.

I’m sure SMS Firewall vendors will have more insights too.

AI and machine learning are great for identifying digital fingerprints associated with software and human behavioral patterns. This approach is well suited to SMS traffic. Some signals to measure might include sender ID, how the message was sent, message content, URL structure, and domain name creation date.
Sender ID verification is useful when traffic comes from a specific source.
Implement a rules-based engine to detect suspicious behavior associated with sender IDs, message content and URLs. Regex is good for some of the URL work, but only when it’s combined with other data points. Regex alone can’t make a determination on the fly without failing, or without introducing so many false positives that everyone would get upset.
Detect messages that contain URLs that are classified as “dangerous”. This will require the integration of at least one cybersecurity threat intelligence feed. The more feeds you integrate the better. Any non-security company that thinks it can build a meaningful threat intelligence system and feed internally is unaware of what it doesn’t know. Almost every cybersecurity product or service you pay for relies on third-party threat feeds. Very few security vendors own their own threat intelligence system. See below.
Detect messages that contain domains registered within the last n days. It’s extremely unlikely that any brand or bank would have time to build an entire website within 7 days of registering a new domain name. You shouldn’t assume all traffic with a newly registered domain name is dangerous though. A decision tree is needed to make a better determination about the likely behavior of the sender. The number of recipients is a signal that can be combined with domain age to help make an informed determination. Note: every URL used in the FluBot SMS messages belonged to domains that were at least a year old. While detecting newly registered domains is good for blocking “spam”, it’s not reliable enough to be a stand-alone solution for phishing.
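
To make the signals above concrete, here is a minimal sketch of how a rules engine might combine them. Everything in it is an assumption for illustration: the `Message` fields, the thresholds, the `KNOWN_DANGEROUS_HOSTS` set standing in for a licensed threat feed, and the scoring itself. It is not how any particular SMS firewall works.

```python
import re
from dataclasses import dataclass
from datetime import date
from urllib.parse import urlparse

# Hypothetical stand-in for one or more licensed threat intelligence feeds.
KNOWN_DANGEROUS_HOSTS = {"bad.example"}

# Illustrative regex signal: a raw IPv4 address used as the host.
IP_HOST = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


@dataclass
class Message:
    sender_id: str
    url: str
    domain_created: date   # assumed to come from a WHOIS-style lookup
    recipient_count: int   # how many handsets the sender targeted


def classify(msg, today, max_age_days=7, bulk_threshold=1000):
    """Return 'block', 'review' or 'allow' by combining several signals."""
    host = urlparse(msg.url).hostname or ""

    # A feed hit is decisive on its own.
    if host in KNOWN_DANGEROUS_HOSTS:
        return "block"

    # Weaker signals only count in combination.
    score = 0
    if (today - msg.domain_created).days <= max_age_days:
        score += 1  # newly registered domain
    if msg.recipient_count >= bulk_threshold:
        score += 1  # bulk sending pattern
    if IP_HOST.match(host):
        score += 1  # regex signal, never decisive alone

    if score >= 2:
        return "block"
    return "review" if score == 1 else "allow"


today = date(2021, 7, 16)
old_domain = Message("INFO", "https://old-site.example/track", date(2019, 1, 1), 3)
print(classify(old_domain, today))  # an unknown URL on an old domain is allowed
```

Note the last line: a message with an unknown URL on a domain that is years old scores zero and is allowed. That is the FluBot case described above, and it is why these rules help with spam but cannot be the whole answer for phishing.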

How to kill SMS Phishing

To be blunt, if any company in the world could effectively and reliably stop phishing-led attacks by detecting and blocking dangerous URLs, 2020 wouldn’t be the worst year in history for phishing. And Proofpoint would promote a solution on this page.

Zero Trust is the only strategy that can kill SMS-led phishing attacks because it’s the only way to stop any kind of cyberattack that involves a deceptive URL. This article has more information to explain why, in more detail.
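
One way to read Zero Trust for URLs is default-deny: treat every link as untrusted unless its domain has been verified, rather than trying to spot the dangerous ones. The sketch below assumes that reading. The `VERIFIED_HOSTS` set, the function names and the simple URL regex are all hypothetical, and the regex only catches links that start with `http://` or `https://`.

```python
import re
from urllib.parse import urlparse

# Hypothetical registry of domains whose owners have been verified.
VERIFIED_HOSTS = {"example-bank.com"}

# Simplified URL matcher; scheme-less links would need extra handling.
URL_PATTERN = re.compile(r"https?://\S+")


def is_verified(url, verified_hosts=VERIFIED_HOSTS):
    """True only if the host is a verified domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == v or host.endswith("." + v) for v in verified_hosts)


def should_deliver(message_text):
    """Default-deny: every URL in the message must be verified."""
    urls = URL_PATTERN.findall(message_text)
    return all(is_verified(u) for u in urls)


print(should_deliver("Log in at https://example-bank.com/login"))
print(should_deliver("Log in at https://example-bank.com.evil.example/login"))
```

The second message is rejected without the system ever having seen that URL before, because the decision depends on what is known to be good, not on what is known to be bad.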

SMS Phishing is not sophisticated

Phishing was first discovered on the AOL network in 1995. Let’s see how sophisticated it has become.

1990s

Phishing = deceptive URL that links to a counterfeit webpage or download.

Desired outcome = ego boost, and a little financial gain.

Cost = high, because domains were expensive and fake webpages had to be coded by hand.

2021

Phishing = deceptive URL that links to a counterfeit webpage or download.

Desired outcome = my mother couldn’t get a blood test in Ireland. The HSE was hit with a ransomware attack because their security vendors failed to protect employees from a targeted email phishing attack. See what I did there? I turned the tables on the security industry instead of blaming victims — who are not the security experts.

If you were sold an alarm that made it impossible for a criminal to walk through your front door, who should we question when we see criminals walking through your front door? What if everyone was sold the same front door protection? At what point would we try a different approach?

The only thing that has become more sophisticated is the desired outcome of the phishing attack. FluBot malware is sophisticated, but the SMS messages that contain a deceptive URL are NOT sophisticated.

The SMS-led phishing messages that we see today are less sophisticated than the phishing techniques adopted in the 1990s because there are lots of tools and services that make it fast, cheap and easy for threat actors to set up and launch campaigns.

How you can test like a hacker

Go to phishtank.com and grab a few Phishing URLs for free. Send one of the URLs to any MSISDN on any network. If your message reaches the handset, you know it’s not protected. Don’t test URLs that are already classified as dangerous by the network, because smart criminals no longer use them. If operators have the protection that some people think, why are they still warning subscribers not to open links?

I guarantee you with 99.9% confidence, that every message with an unknown deceptive URL will get through to the handset. Why does the word “unknown” not make all of this obvious? I don’t understand. If you can block something, it’s not unknown. 🤪

I often hear that operators can block over 98% of all phishing attacks. This is not true. What is true is that they can block over 98% of all spam. Some of that spam will include phishing URLs, but only the ones that were already classified as dangerous, and criminals don’t care because they no longer use those URLs. It’s like collecting 98% of all used single-use water bottles and saying you can stop people from using single-use water bottles. All they have to do is buy more from the shop.

Blocking 98% of home intrusions through windows is a meaningless statistic if 99% of all intrusions are via the front door.
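
The arithmetic behind that analogy is worth spelling out. The sketch below takes the two figures from the sentence above and assumes, purely for illustration, that nothing is stopped at the front door. The function and channel names are made up.

```python
def overall_block_rate(share_by_channel, block_rate_by_channel):
    """Weighted share of all intrusions that are actually stopped."""
    return sum(
        share * block_rate_by_channel.get(channel, 0.0)
        for channel, share in share_by_channel.items()
    )


# 99% of intrusions come through the front door, 1% through windows.
share = {"front_door": 0.99, "windows": 0.01}

# 98% of window intrusions are blocked; front door assumed unprotected.
block = {"windows": 0.98}

rate = overall_block_rate(share, block)
print(f"{rate:.2%} of all intrusions blocked")
```

Under those assumptions, a headline figure of 98% translates to just under 1% of all intrusions being stopped.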

How the cybersecurity industry does things

There are thousands of security vendors in the world, but only a small handful own a URL threat intelligence system. That’s a fancy name for a database of URLs with classification information. Security vendors that offer products and services typically license one or more feeds from threat intelligence providers. The vast majority of security vendors who build solutions for email, browsers, and other channels don’t own the URL data; they license it. So whenever there is a report of a suspicious URL, it’s the threat intelligence researchers that do all the work, not necessarily the security vendor supplying the solution.

Those of us responsible for URL Classification prefer a separate category for almost everything. “Phishing” and “Spam” are separate categories that could be put under one parent category. Some might even combine them. Vendors who license feeds from third-parties might get an API response that says “Spam” — but on the backend for the supplier, it’s exceptionally likely they’re combining multiple categories that required very different skills, techniques and tools to identify and classify those URLs.
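
As a rough illustration of that roll-up, here is a sketch of a feed that stores fine-grained categories internally but returns only a parent label by default. The category names, the `PARENT_OF` mapping and the response shape are hypothetical and do not describe any real supplier’s API.

```python
# Hypothetical internal taxonomy: fine-grained category -> parent label.
PARENT_OF = {
    "phishing": "spam",
    "unsolicited-marketing": "spam",
    "pornography": "adult",
}


def feed_response(fine_category, detailed=False):
    """Build the API response a licensing vendor would receive."""
    parent = PARENT_OF.get(fine_category, "uncategorized")
    if detailed:
        # Only suppliers that expose the backend detail return this.
        return {"category": parent, "subcategory": fine_category}
    return {"category": parent}


print(feed_response("phishing"))
print(feed_response("phishing", detailed=True))
```

A vendor that only ever sees the first response has no way to tell that the URL was classified as phishing rather than as ordinary marketing spam.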

Some specialize only in Phishing URLs. PhishTank is so specific that it specializes in phishing URLs used in fraud and identity theft only. It doesn’t classify phishing URLs used for counterfeit webpages that encourage you to download malware, which excludes every URL used for FluBot. 99% of the security industry doesn’t even know this about PhishTank. So imagine a solution that relies only on PhishTank, on the assumption that it keeps you safe from all known phishing URLs for FluBot. This is happening today.

Some parents consider “Lingerie” related websites and “Pornography” websites as “Pornography”. When it comes to protecting their children from what they deem to be “inappropriate”, they’re not wrong — they can call it whatever they like. They get to decide what should be accessible or not. But from MetaCert’s perspective, these are two very different categories from a “classification” perspective. We would provide an API service that allows a security vendor to decide how to present options to customers — we would never allow a partner to block lingerie websites under the banner of “pornography” as it would make us look bad. But we would allow them to block lingerie if it was positioned properly.

When we had an exclusive contract with ICM Registry to automatically classify every .XXX domain as “Pornography”, we were asked by them to change the classification for non-adult websites that used a .XXX domain. We politely declined on the basis that the entire world assumes every .XXX domain is registered with adult content in mind. And the amount of work involved to determine which sites were adult oriented and which weren’t, at any given time, was and still is, cost prohibitive — probably impossible to do it reliably.

While our classification technology remains a constant, the people, skills, tools, and techniques for classifying “Phishing” URLs are unique. They are very different to those needed for any other category on the Internet, including “SPAM”.

Summary

Proofpoint Cloudmark is not the answer because it relies on being able to detect and block dangerous URLs it doesn’t know about. Read this article to find out why this approach is not reliable or effective.

Mavenir SpamShield is not the answer because Mavenir is not even close to being a cybersecurity company, and SpamShield has been tested by us over and over again and it doesn’t stop phishing URLs that are not classified already.

If you think otherwise, show me which networks they protect and I’ll show you how they don’t protect them. It’s time we called out vendors who overstate their ability to keep everyone safe from harm, instead of the organizations that they fail to protect. It’s time we show empathy towards mobile operators that are trying to tackle this problem. It’s time we help operators select a solution that does what they need, not a solution that meets ill-informed requirements.

If the cybersecurity industry can recognize the need for experts in URL Classification, isn’t it time the mobile industry did the same? SMS Firewall vendors are not cybersecurity companies, let alone experts in threat intelligence and URL Classification. I haven’t seen anything related to this subject from an SMS vendor that doesn’t make me cringe. Whenever I find a vendor with a compelling solution, I’ll update this article.