Is Behavioral Profiling Moral?

It’s been a long time since humans were able to organically consume the information that surrounded them. We have long since passed the point where we could consume even the majority of the information relevant to us. To add to the problem, we now live in an age that demands faster interactions, which in turn requires us to process more information in less time.

This has led us to create new ways of accessing information, from book indexes to library index cards to search engines and Spotify weekly recommendations. But we are fighting a never-ending battle, and today even the most advanced search engines can’t keep up with our expectations without tailoring their results to us [1]. This is done through many different methods; one of them is Behavioral Profiling.

What is Behavioral Profiling?

Behavioral Profiling is the process of building a representation of you based on your implicit actions, i.e. your searches, the things you buy online, the ads you click on and the many other small interactions you have online every day. All this information is stored and processed to create a profile of what you’re interested in, what you want to see and what you don’t want to see. This is a great technology. It can help you navigate an ocean of information that not even the entire human race can collectively process [2]. It can help you find the product you want, tailor children’s learning material to fit their abilities or even find the music you’d like to hear. Or in Jana’s case, find apps that you ♥!
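To make the idea concrete, here is a minimal sketch of turning implicit actions into an interest profile. The event types, the weights and the `build_profile` helper are all hypothetical and chosen for illustration only; a real system would be far more involved.

```python
from collections import defaultdict

# Hypothetical weights: how strongly each implicit action signals interest.
EVENT_WEIGHTS = {"search": 1.0, "ad_click": 2.0, "purchase": 5.0}


def build_profile(events):
    """Turn (event_type, category) pairs into normalized interest scores."""
    totals = defaultdict(float)
    for event_type, category in events:
        # Unknown event types contribute nothing to the profile.
        totals[category] += EVENT_WEIGHTS.get(event_type, 0.0)

    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    # Normalize so the scores across categories sum to 1.
    return {category: weight / grand_total for category, weight in totals.items()}


events = [("search", "music"), ("purchase", "books"), ("ad_click", "music")]
profile = build_profile(events)
print(profile)
```

Even in this toy version, the choice of weights decides what the profile says about a person, which is part of why these systems deserve scrutiny.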

But you might already be able to think of ways this can be problematic. Since they came into widespread use in the industry, recommendation, targeting and personalization have been at the center of online privacy debates, and rightfully so. If a company like Google (and many others) controls the information you consume on a daily basis, it can have a big influence on how you think, how you spend, how you vote and, in general, how you interact socially. And this does happen, read here or here [4]. These effects are not necessarily intentional. Many times these are services that try to please you too aggressively and end up converging on a single aspect of your interests, unintentionally hiding information from you [3].

What Can we do?

We need data filtering; it’s a necessity of our age. But as data scientists, engineers and product managers, there are things we need to consider when creating products and services that use this technology. Most importantly, we need to think through the implications of behavioral profiles, or of any action that filters information for our users. Below is a list of questions I ask myself when working on these tools.

Is this filtering ethical?

Simply the most important question: does what we are trying to achieve deprive our users of useful information that they might want to access? It’s really not on us to decide what is and is not useful to our users. At Jana we do weekly (if not daily) surveys and user interviews to better understand what our users are looking for when they use our platform.

Are the filtering/profiling choices transparent?

We need to make sure our choices are transparent to our users, that they understand them, and that they have the option to opt out.

Are we accounting for our own errors?

I believe no such system is perfect, not only because our implementation is imperfect (in both theory and code) but also because the subject matter itself is uncertain: human interests are not static, and they evolve when exposed to the filtered content. We shouldn’t assume that our filtering is perfect. We can mitigate this by leaving room for results that are not necessarily ranked highest by our ranking algorithm. This is also a great source of constant feedback for the algorithm to improve itself.
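One simple way to leave that room is to reserve a few slots in every result list for items sampled from outside the top of the ranking. The sketch below is only an illustration, not how Jana does it; the function name `rank_with_exploration` and its `explore_slots` parameter are made up for this example.

```python
import random


def rank_with_exploration(scored_items, k, explore_slots=1, rng=None):
    """Return k items: the top-scored ones plus a few sampled from the rest.

    scored_items is a list of (item, score) pairs. explore_slots is the
    number of positions reserved for items the ranking would have left out.
    """
    rng = rng or random.Random()
    ranked = sorted(scored_items, key=lambda pair: pair[1], reverse=True)

    # Never reserve more slots than there are leftover items to fill them.
    explore_slots = min(explore_slots, k, max(len(ranked) - k, 0))

    top = ranked[: k - explore_slots]
    rest = ranked[k - explore_slots:]
    explored = rng.sample(rest, explore_slots) if explore_slots else []
    return [item for item, _ in top + explored]


items = [("a", 0.9), ("b", 0.8), ("c", 0.5), ("d", 0.3), ("e", 0.1)]
print(rank_with_exploration(items, k=3, explore_slots=1))
```

Whether users engage with the explored items is exactly the kind of feedback that tells us where the ranking was wrong.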

Are we measuring the long term effects?

We all talk about data-driven products. But what I see neglected most often is long-term control for machine learning algorithms. Many times we saturate users with information over time, or users change their interests. It’s important to iteratively evaluate the assumptions of the model.
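As one possible long-term check, we could track how varied the content shown to a user is from week to week. The sketch below uses Shannon entropy over content categories as the measure. That choice, and the `category_entropy` helper, are my assumptions for illustration; a falling value would only be a hint that the filter is narrowing, not proof.

```python
import math
from collections import Counter


def category_entropy(shown_categories):
    """Shannon entropy (in bits) of the categories shown to a user.

    Higher means more varied content; 0 means a single category.
    """
    counts = Counter(shown_categories)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical weekly logs of what one user was shown.
weekly_logs = {
    "week 1": ["music", "news", "sports", "books"],
    "week 8": ["music", "music", "music", "news"],
}
for week, shown in weekly_logs.items():
    print(week, round(category_entropy(shown), 3))
```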

Did I Mention we’re Hiring?

Here at Jana we care about people, not just numbers. If you’d like to work on complicated problems that help millions of our users access the internet more easily, we’d like to talk.

1. When I search for “Barcelona” I mean the restaurant near where I live not the city in Spain.
2. We generate more information than we can consume, so in a way some of the information generated is never, at least directly, consumed by anyone.
3. I wish Spotify would stop showing me deep-house music, Stubhub! would stop blocking me because their algorithm misclassified me as fraud and Facebook would stop showing me hoodie ads (how many hoodies can I buy?)
4. Or watch this TED talk instead.

