Fraud Squad Field Notes is a multipart series that will cover a variety of topics related to user fraud and waste within mobile app marketing. With insights directly from the mCent fraud team, we will discuss industry trends and methodologies and share observations from our audience data.
Thinking outside the box: Using Benford’s Law to detect fraudulent traffic
As members of the Fraud Squad team here at Jana, we work tirelessly to detect and prevent fraud on our platform, both so that we can scale our own business and so that we can continue to deliver the highest-quality audience to our clients.
But (as anyone with experience in Ad Tech can attest) detecting fraudulent traffic can be extremely tricky: by its very nature, fraudulent traffic is designed to evade detection, and sophisticated abusers are quick to circumvent even the most advanced countermeasures.
To stay ahead of the curve, we have to think creatively about fraud so that we can detect this bad behavior amidst a virtual sea of ambiguous data.
What is Benford’s Law?
In a nutshell, Benford’s Law states that, for certain randomly sampled groups of data, the distribution of the leading digit will follow a specific pattern (e.g. numbers starting with “1” will occur 30.1% of the time, those starting with “2” will occur 17.6% of the time, etc.). If you’re curious, I’d urge you to listen to the excellent RadioLab podcast on the subject.
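For the curious, those percentages come from a simple formula: the expected share of numbers whose leading digit is d is log10(1 + 1/d). Here is a minimal Python sketch that reproduces the distribution. The function name benford_expected is our own label for illustration, not part of any library.

```python
import math


def benford_expected():
    """Return the expected share of each leading digit (1-9) under Benford's Law."""
    # P(d) = log10(1 + 1/d); the nine shares sum to exactly 1.
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


expected = benford_expected()
for digit, share in expected.items():
    # Print each digit with its expected share as a percentage.
    print(f"{digit}: {share:.1%}")
```

The first two lines printed correspond to the 30.1% and 17.6% figures quoted above, and the shares keep shrinking as the leading digit grows.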
The law has a mathematical foundation and applies to an odd array of data, including electricity bills, street addresses, stock prices, population numbers, death rates, lengths of rivers, etc.
Here’s why we got curious: one application of Benford’s Law has been detecting fraud (e.g. in forensic accounting). That is, when someone is investigating financial fraud, the accounting numbers are run through Benford’s, and if their leading digits vary from the “natural distribution,” it is possibly because the numbers were cooked. It has even been admitted as evidence in court.
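To make "running numbers through Benford’s" concrete, here is a rough sketch of one common way to do it: count the leading digits in a sample and compute a chi-square statistic against the Benford shares. This is an illustration under our own assumptions, not a description of any particular forensic tool or of our production pipeline: the function name benford_chi_square is hypothetical, values are assumed to be positive integers, and both sample inputs are synthetic.

```python
import math
from collections import Counter

# Chi-square critical value for 8 degrees of freedom at the 5% level.
CRITICAL_VALUE_5PCT = 15.507


def benford_chi_square(values):
    """Chi-square statistic of observed leading digits versus Benford's Law.

    Assumes `values` are integers; zero and negative entries are skipped.
    """
    digits = [int(str(v)[0]) for v in values if v > 0]
    total = len(digits)
    if total == 0:
        raise ValueError("need at least one positive value")
    counts = Counter(digits)
    statistic = 0.0
    for d in range(1, 10):
        expected_count = total * math.log10(1 + 1 / d)
        statistic += (counts.get(d, 0) - expected_count) ** 2 / expected_count
    return statistic


# Synthetic inputs for illustration only (not real accounting or traffic data).
benford_like = [2 ** k for k in range(1000)]   # powers of 2 track Benford closely
uniform_like = list(range(100, 1000))          # every leading digit equally common

print(benford_chi_square(benford_like) < CRITICAL_VALUE_5PCT)
print(benford_chi_square(uniform_like) > CRITICAL_VALUE_5PCT)
```

A statistic above the critical value only says that the sample deviates from the Benford shares more than chance would usually explain. As with the accounting cases, that is a reason to look closer, not proof that anything was cooked.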
Curious, we began experimenting with Benford’s in a forensic exercise of our own, using our own data to see if Benford’s could detect fraud.
Using Benford’s with real data
Below is some sample data usage from our app, mCent (that is, the amount of data a member has used while logged into our platform). Group A is the Benford distribution, Group B is a segment of normal traffic, and Group C represents a segment of traffic we know to be fraudulent:
Pretty neat, eh? It’s one thing to read stories about Benford’s Law; it’s quite another to see it working in practice.
This is just one of the ways in which we’re creating new methods to use data to combat fraud and waste in the mobile app ecosystem, and we’re always thinking of the next thing to investigate! Interested in joining the fun? That’s great news, because we’re hiring!