In 2015, we hired our first data engineers here at Jana. We call them data engineers, not data scientists, because we prioritize the practical application of models in user-facing or customer-facing code.
Like most companies bringing on a new role, we needed to figure out how they would fit into our organizational structure. Broadly speaking, we had two choices: a “centralized” model, with a “Data Engineering Team” that has its own backlog, prioritization, and related processes, or an “embedded” model, where a data engineer joins each existing team as needed.
We chose the embedded model, and have been very happy with the choice. We now have five data engineers, each associated with whichever team or set of teams is most applicable. For example, one of them works primarily on our user “quality” model, another with a team that emphasizes “social graph” features, another on growth and retention, and so on.
This association has allowed each data engineer to thoroughly understand their team’s work, how their work aligns with business objectives, how the team evaluates work both for future prioritization and for past performance, and of course to get to know their teammates well. It has kept our data engineers consistently delivering user-facing and customer-facing value, which also makes them happier.
Most importantly, this embedding makes sure data engineers’ work is not marginalized as some hypothetical model or analysis. Instead, it gets realized as soon as possible to benefit our stakeholders.
One of the data engineers, coincidentally the first one we hired, is also very infrastructure-savvy. He spends a significant portion of his time on the analytics pipeline, data validation, standardization, and so on. This benefits not just every other data engineer, but every engineer and product manager as well. It has been crucial to our success, and will be the subject of a future blog post.
The data engineers also meet periodically to chat about best practices, share knowledge beyond the day-to-day work, discuss potential project approaches, and give each other feedback. This is largely informal and done at their own behest, at whatever frequency and in whatever format works for them.
Beyond the aforementioned articles, what have you found works well for fitting data engineering into a product development organization?