Managing Redis complexity in Python – Part 1: Organizing Keys

A major challenge when engineering any large software system is managing complexity. The power of Redis comes with the responsibility of managing the complexity of your data layer diligently. Failing to do so means time wasted rewriting code, chasing down bugs, and trying to explain how you lost valuable data.

After building a few mid-scale systems on Redis, I’ve learned a lot about managing complexity, mostly the hard way. To save you from making the same mistakes I did, I’ll detail what I’ve learned so you can focus on building solid applications.

In this first post, I will be discussing strategies for formatting and organizing Redis keys. I plan to touch on other topics in managing Redis complexity in future posts.

Keep keys consistent & simple

Redis does not enforce key separation past its (default) 16 numbered databases, so naming your objects well is particularly important. When choosing keys, keep simplicity in mind. A good place to start is to follow a few simple naming conventions to keep your names consistent.

One common convention is to use a colon : to separate unique IDs in a key. For example:

  • user:0001
  • user:0001:friends
  • campaign_variant:{campaign_id}:{variant_id}:config

Another convention is to use dots . to group keys into namespace/contexts. For example:

  • sessions.data
  • sessions.data.index_by_created
  • sessions.data.group_by_variant

Combining these two conventions:

  • events.count_by_var.hourly:{campaign_id}:{variant_id}
  • events.count_by_var.daily:{campaign_id}:{variant_id}

While sticking to these conventions (or any others you choose, as long as you use them consistently), try to avoid keys that are particularly long, complex or nested. Remember, as The Zen of Python states: “Simple is better than complex” and “Flat is better than nested”.
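The two conventions above can be sketched with a pair of small helpers. The function names `namespace` and `make_key` are hypothetical, chosen only for illustration; this is one possible way to build such keys, not a required one.

```python
def namespace(*parts):
    """Join namespace/context parts with dots (hypothetical helper)."""
    return ".".join(parts)


def make_key(prefix, *ids):
    """Append unique IDs to a prefix, separated by colons (hypothetical helper)."""
    return ":".join([prefix] + [str(i) for i in ids])


hourly_key = make_key(
    namespace("events", "count_by_var", "hourly"), "camp_1", "var_a"
)
print(hourly_key)  # events.count_by_var.hourly:camp_1:var_a
```

Keeping the separators in one place like this means a convention change touches two functions rather than every call site.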

Enforce a centralized key configuration

It can be extremely frustrating to have your database keys scattered across many files. It makes your data structures hard to understand and keep consistent, and it makes changes tedious and error-prone.

In my experience, making a central registry defining the keys used within a database works very well. It enforces well-structured keys while centralizing the database schema concisely.

Below is the abstract implementation of a Redis key “registry”. Don’t worry about understanding everything it’s doing on your first pass; its functionality will become clearer later in this post.

import string


class RedisKeyRegistryBase(object):
    key_config = {}  # Overridden by subclasses: maps key names to rules

    def __init__(self, key_prefix=None):
        self.key_prefix = key_prefix or ''
        self._key_multi = {}
        self._key_single = {}

        # Compile the rules into efficient runtime lookups
        formatter = string.Formatter()  # For determining the format keys
        for name, key_config in self.key_config.items():
            if key_config['type'] == 'multiple':
                parsed_template = formatter.parse(key_config['key'])
                self._key_multi[name] = (
                    # Use self.key_prefix so a missing prefix is '' not None
                    self.key_prefix + key_config['key'],
                    [x[1] for x in parsed_template if x[1]]  # Argument names
                )
            elif key_config['type'] == 'single':
                self._key_single[name] = (
                    self.key_prefix + key_config['key']
                )
            else:
                raise ValueError("Invalid key type for %r" % name)

    def get_key(self, name, *args, **kwargs):
        """ Look up the Redis key for a given name.

        Template fields may be passed by keyword or by position.
        """
        if kwargs:
            return self._key_multi[name][0].format(**kwargs)
        elif args:
            key_template, key_args = self._key_multi[name]
            # Map positional arguments onto the template's field names
            return key_template.format(**dict(zip(key_args, args)))
        else:
            return self._key_single[name]

On initialization, this class compiles an internal lookup table for quickly resolving a key “name” into a Redis key. The configuration used to compile this lookup table is defined by a subclass.

In the following subclass, key_config is configured with a dictionary of rules:

class RedisKeyRegistry(RedisKeyRegistryBase):
    key_config = {
        'sessions.data': {
            'type': 'single',
            'key': 'sessions.data',
            'data_type': 'hmap',  # optional metadata
        },
        'sessions.group_variant_by_created': {
            'type': 'multiple',
            'key': 'sessions.grp_var_by_created:{campaign}:{variant}',
        },
        'user.friends': {
            'type': 'multiple',
            'key': 'user:{user_id:06d}:friends',  # Zero-pad for sorting
            'data_type': 'set',
        },
    }


redis_keys = RedisKeyRegistry('proj1/')  # Initialize a key registry

Managing code complexity well lets you write complex code as long as it is well contained. Here, I use the Formatter class from the string module to extract the variable components of each key template while compiling the lookup table, following the same parsing rules as the str.format method. This makes defining keys more pleasant and leaves less room for errors to sneak into your code.
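To make the Formatter step concrete, here is a minimal standalone sketch of what the parsing produces for the user.friends template. The variable names are only for illustration.

```python
import string

template = "user:{user_id:06d}:friends"

# Formatter.parse yields (literal_text, field_name, format_spec, conversion)
# tuples; field_name is None for trailing literal text with no field.
parsed = list(string.Formatter().parse(template))
field_names = [field for _, field, _, _ in parsed if field]
print(field_names)  # ['user_id']

# Positional arguments can then be mapped onto the extracted field names.
key = template.format(**dict(zip(field_names, (42,))))
print(key)  # user:000042:friends
```

Because the format spec (06d here) is parsed separately from the field name, zero-padding and other formatting options do not interfere with argument lookup.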

Given the definitions above, we can now easily look up keys:

>>> redis_keys.get_key('sessions.data')
'proj1/sessions.data'

>>> redis_keys.get_key('sessions.group_variant_by_created',
                       campaign='camp_1', variant='var_a')
'proj1/sessions.grp_var_by_created:camp_1:var_a'

>>> redis_keys.get_key('sessions.group_variant_by_created', 'camp_2', 'var_b')
'proj1/sessions.grp_var_by_created:camp_2:var_b'

>>> redis_keys.get_key('user.friends', 42)
'proj1/user:000042:friends'

Key “names” cleanly decouple your code from the Redis keys they stand for. This lets you use meaningful names in your code, separate from the more efficiency-centric keys used in the database. They also give you somewhere to attach metadata to your database objects, for example to build a data explorer interface for your Redis databases.
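As a rough sketch of how that metadata might be consumed, the snippet below groups key names by their optional data_type field, which is the kind of index a data explorer could start from. The helper name `names_by_data_type` and the 'unknown' fallback label are assumptions made for this example.

```python
# Same rules as the registry configuration above, repeated so the
# example runs on its own.
key_config = {
    'sessions.data': {
        'type': 'single',
        'key': 'sessions.data',
        'data_type': 'hmap',
    },
    'sessions.group_variant_by_created': {
        'type': 'multiple',
        'key': 'sessions.grp_var_by_created:{campaign}:{variant}',
    },
    'user.friends': {
        'type': 'multiple',
        'key': 'user:{user_id:06d}:friends',
        'data_type': 'set',
    },
}


def names_by_data_type(config):
    """Group key names by 'data_type'; rules without one go under 'unknown'."""
    grouped = {}
    for name, rule in sorted(config.items()):
        grouped.setdefault(rule.get('data_type', 'unknown'), []).append(name)
    return grouped


grouped = names_by_data_type(key_config)
print(grouped['set'])  # ['user.friends']
```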

A base prefix (proj1/ here) allows you to run multiple applications/projects in the same database without conflict. This is particularly helpful when testing your data structures.
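One way a prefix can help in tests is cleanup: everything a test run wrote can be found by its prefix. The sketch below assumes a redis-py style client exposing scan_iter(match=...) and delete(...); the function name `delete_prefixed` and the 'proj1_test/' prefix are hypothetical. It also assumes the prefix contains no glob characters such as * or ?, since the match argument is a glob pattern.

```python
def delete_prefixed(client, prefix):
    """Delete every key starting with `prefix`; return how many were removed.

    `client` is assumed to behave like a redis-py client: scan_iter()
    iterates over matching keys and delete() returns the number of keys
    it removed.
    """
    removed = 0
    for key in client.scan_iter(match=prefix + '*'):
        removed += client.delete(key)
    return removed


# Hypothetical usage in a test teardown, given a connected client:
#     delete_prefixed(client, 'proj1_test/')
```

Only run something like this against a prefix reserved for tests, since it deletes every matching key.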

Conclusion

Picking good names is not easy, but it doesn’t have to be complicated. Hopefully some of this advice will help you to focus on picking good names for your keys instead of juggling complicated key dependencies and chasing bugs.

For my next post I plan on discussing the creation of solid data structures on top of Redis primitives. If you have any feedback or insight into what has worked (or not worked) for you, I’d love to hear it.

 

 
