“MLOps? Yeah, yeah we’re doing that now. Meghan is our MLOps girl, she’s running that stuff.” As data scientists and MLOps engineers, we would all like to have a fully-fledged MLOps team that is perfectly integrated with the teams surrounding it. So what does a good MLOps team look like, then? As a kid, I would spend hours and hours with my friends doodling the perfect 11-man team for football. Romario here, Bergkamp there. And I have to admit to you, I still do it today for NBA basketball. My friends and I have heated discussions in our WhatsApp group about who deserves what position in “the best team of all time” or what positional system they should play. In this article, I will try to do the same for MLOps. There won’t be a selection of the greatest engineers and product managers ever. However, we will discuss the positions and system needed. Let me know in the comments what you think! Heated discussions are welcome :)
But first: why you should want a good MLOps team
Unfortunately, in many organizations MLOps still depends on only a handful of people, sometimes even just 1 or 2! In larger organizations where ML is done at scale, the teams grow in size. There might even be multiple teams dealing with machine learning operations because having the machine learning solutions up and running is crucial to the business’s core operations. They are, as we say, mission-critical. In the world of data, where we have come from building ETL pipelines with data engineering to the current world of ML and AI, this requires a bit of a mind shift. When I started in the field, a failed data pipeline would mean “the dashboard didn’t refresh”. We could usually debug some code in a fairly stress-free setting. And if we didn’t manage to fix it by the end of the day, there was always tomorrow.
In the current age of machine learning and AI, where pipelines are often directly tied to business operations, the stakes have been upped. A failed ML pipeline might mean the website goes down, e-mails can’t be sent, products can’t be delivered, and large workforces at distribution centers are sitting idle. It might lead to waste, loss of profit, and stress. And in the worst cases: safety, security, and health risks.
Data flows are hardly ever perfect, and ML solutions are often complex and prone to bugs. And overall, in our industry we have a lot to learn about software engineering, testing code, and building robust IT solutions. So shit is going to hit the fan, sooner or later. Shit is pretty much guaranteed to hit the fan… the question is, how bad will it be, how often will it happen, and will you be able to fix it within an acceptable lead time? The answer is: having a good MLOps team in place. Do not let it all rest on the shoulders of our girl Meghan (she might need a holiday too at some point; can your business even keep running without her?).
So what does the perfect MLOps team look like?
Strategic/executive level
Every successful ML project starts with buy-in from the strategic and/or executive level. If you don't have it, don't bother trying (fight me in the comments ;), because you will not be successful in the long run. So having a sponsor and maybe even a direct report at the highest level is key to the success of your project(s). This is because data science and MLOps are often cross-departmental, require significant investments, and tend to fundamentally change (parts of) "How we do things around here". These transformational objectives require support from the big bosses. So treat them as an (extended) part of your team with regular updates and short feedback loops. Getting the green light for an ML/MLOps project and then delivering output months later won't cut it.
Evangelist
So how did you get to the strategic/executive level? That’s right, your team had a great evangelist. The evangelists are often forgotten, which is why I am putting them high on this list. Data scientists and engineers have wonderful ideas, but it is the evangelists who get them traction and make sure resources are allocated to the teams. So they are important at the start, but also in the middle of and after a project. When things get hard during the execution phase, the evangelist makes sure life support is not cut and prevents the project from flatlining. After the project is delivered, they are the enthusiasts who fan the flames of adoption! And together with technical experts from the data side, they can be a great one-two punch, constantly educating the end users on something that might seem like a black box at first. This is not a specific role you would hire for. Your typical evangelist could hide under any type of job title, but most likely they have some of that Steve Ballmer type of energy.
Outside-to-inside person
This role is the linking pin between the technical team and the business stakeholders. The person executing it should have a primary focus on the business! Keeping business stakeholders and sponsors informed and happy will ensure sustainable appreciation and impact of ML and MLOps initiatives. Successful people in this role make great slide decks, keep a stakeholder matrix by their bedside, are skilled meeting strategists, and have many coffees and meetings around the organization to keep all the noses pointed in the same direction. A key priority for this role should be having clear project milestones (strategic roadmap) supported by the right people in the organization (stakeholder management). The outside-to-inside person ideally comes from a business background and has high likeability. The risk with this role is that a lot of tech people get promoted into it and eventually get frustrated with the slowness and stickiness of organizations: “They just don’t get it”. Beware of disgruntled developer syndrome; patience is a key skill here! This role could be executed by either one person or a bigger team. In different organizations the role will have different names, any of these or even others:
product manager
business consultant
program manager
analytics translator
business process owner/business process specialist
Inside-to-outside person
Where the outside-to-inside person is the linking pin from the business side, the inside-to-outside person does the same coming from the side of the technical team. Together they can be a dynamic duo that ensures the often-mentioned "gap between business and tech" is closed. Many organizations make the mistake of combining these two roles into one, which makes the role too complex and gives it an internal conflict of interest by design. The inside-to-outside person should be the champion and shepherd of the development team, pushing back on business wishes and pressures, while the outside-to-inside person is the business' best friend. Key priorities should be prioritizing and refining work for the development team on the micro-level (backlog) and managing and updating developer epics on the macro-level (technical roadmap). This role often pops up under the name of product owner or project manager but is sometimes also a part of a wider ML manager role. Lately, there seems to be a shift towards more hands-on product ownership, where the inside-to-outside person is still also coding and reviewing. This is a great and necessary change! But it is only possible if there is good support from an outside-to-inside person. The inside-to-outside person often gets one of the following names:
product owner
project manager
data science manager
requirements engineer
Tech lead
In an ideal world, the tech lead is the guru with the respect of all the devs, the experienced senior or principal engineer who has seen it all and knows it all. These gurus are a rare breed though, and there is a shortage of good seniors in the market. The lead should be experienced, yes, but I think it is just as important that the tech lead is a strong communicator and culture setter. The tech lead should know the capabilities, hopes, and dreams of their team well. Together with the people working on the horizontal levels (for example, chapter leads and learning & development managers), the tech lead would be wise to create a culture of continuous learning. Dev culture can be meritocratic, and this can be fine, as long as everyone builds each other up. So "Pfft, this guy didn't even know about X in the CLI" is a red flag, and alternatively "Owww, but you could also just fix this from the CLI, let me show you if you want!" is the green flag you want. I believe this starts at the top! If your tech leads adopt this attitude, while also always being open to learning new things from others, your team will flourish. Key responsibilities: solution design, quality assurance, culture setting, standard operating procedures, team learning & development, problem-solving. Ideally, the tech lead also has business acumen, but the position can be executed without it if the tech lead has strong support on the wings from the other roles.
The engineers
In an MLOps team, you might work with different types of engineers. We will discuss the different flavors of engineers, their skillsets, and their responsibilities in a future article. So stay tuned!
MLOps engineers
ML engineers
Infrastructure / Cloud / DevOps engineers
BI / ETL / Analytics engineers
Architects
Middle Management (is dead)
I don't believe in middle management for data science and MLOps. The outside-to-inside person, the inside-to-outside person, and the tech lead are the leaders here. Data science solutions can have such a profound impact on business operations and strategic approaches that I believe teams should work close to or under the strategic and executive level. Of course, this does not always scale well in larger organizations and an extra hierarchy of direct reporting might be needed to keep the governance taxonomy workable. However, I believe when doing tech in a non-tech company you will want to keep the expertise close to the strategic and executive levels to ensure more streamlined decision-making. Getting clout with the big bosses will not only lead to more efficient processes, but it will also empower your employees to be more autonomous and responsible, leading to higher job satisfaction and greater motivation to excel. And from a boardroom perspective: CEOs wanting to "transform their company with AI" had better start talking to their experts. There are a lot of meaningless slide decks out there. Be ready to challenge your C-suite to invest time, blood, sweat, and tears into the "AI revolution", because it's coming.
Beyond the labels
These roles will have different names in different organizations. It is important to be able to see through the fancy job titles and have a grasp of what different roles there are in our industry. The exact titles do not matter that much, though I would always advocate for transparency and understandability!
A common pitfall is that many organizations try to merge multiple roles into one. I have seen people in the role of Data science manager / Lead developer / Product owner, and yes, I too have been guilty of this. From my experience, it does not work well in the long run. It’s better to split up some of the responsibilities if the scale allows it. Of course, in small organizations not all roles will be present, and 1 product owner, 2 data scientists, and a valuable business case will be all you need / can afford!
Also, a very important part of team building is the role-talent fit of your team. And getting the team is just the first step. You need to define a delivery model, set up processes, ensure team performance and happiness, and then some. We will discuss these in future articles. Happy team building!