The Myth That Machine Learning Engineers Are Not Good General Software Engineers
I once attended an engineering leadership meeting at my company where the topic was how best to allocate support from our most experienced technical contributors: staff and principal engineers. As a manager of a machine learning team, my reaction was mixed. Few of the available staff or principal engineers had any meaningful experience with statistics or with developing predictive models in a production environment, so my feeling was this: if they can add capacity for devops and infrastructure projects, then sure, sign me up to rotate a staff engineer through my team. Otherwise, no thanks.
One engineering leader balked at this. This person was so determined to argue that any machine learning team, in principle, needed remedial software architecture guidance from staff engineers that they blurted out, “Your team is a bunch of fly-by-night juniors!” Ouch. It’s discouraging to have the hard-working members of your team insulted because they work within a particular domain. On my team there were veteran machine learning engineers who had delivered large-scale distributed systems at huge tech companies. Not only had they been around the block with production-quality general engineering, they were among the best general backend engineers in the company.
I was able to forgive this unfair slight to my team because I knew it reflected a pervasive bias in software engineering. Many technology leaders assume that people skilled in machine learning somehow lack more general skill sets in system design, backend architecture or engineering best practices. Meanwhile, if you’ve ever worked closely with machine learning engineers, you know the reality is the opposite. They tend to be excellent at a wide range of general engineering skills because their daily jobs require them to interoperate across so many aspects of software engineering, system administration, databases and operations.
In this essay my goal is to unpack some of the causes of this cultural bias against machine learning engineers, and to describe tactical solutions that machine learning managers can use to overcome it. I will first discuss sociological causes. These are by far the most common true reasons for unfair bias and bad attitudes towards machine learning engineers. Afterwards I will discuss other causes rooted in honest misunderstanding and technical confusion among otherwise good-faith colleagues.
One aside on nomenclature: I use the term machine learning engineer interchangeably with data scientist and I take these job titles to refer to professionals who are highly trained in the use of statistical modeling to solve vague business problems. The term data scientist sometimes invites unproductive debates about definitions. I exclude from the notion of data scientist those staff members who don’t focus on software systems to train, deploy, serve and monitor statistical model solutions in production. For example, an employee who only works in a Jupyter notebook environment or a business intelligence software tool would not be either a machine learning engineer or a data scientist. The term for such roles would be business analyst or product intelligence analyst or similar (and these folks are often tremendously valuable to their employers). These analyst roles are not addressed by this essay.
• • •
The first and most obvious reasons for bias against machine learning engineers are sociological. People tend to put others into stereotyped buckets, and machine learning engineers, who often have long academic tenures and dispositions towards research, are mentally lumped in with the stereotype of an ivory tower professor trying to stay above the fray of daily engineering concerns. The more removed a given leader is from the day-to-day realities of machine learning teams, the more prone they will be to adopt this simplistic stereotyping.
If you are facing this type of situation there are two solid tactics to pursue. One is to develop a "reality check" meeting that occurs regularly (perhaps once per month). Invite senior leaders to attend and hear a briefing on the specifics of machine learning engineer contributions.
One key point is that it doesn't necessarily matter if your preferred audience attends. Merely by establishing the existence of the meeting, you define the terms and opportunity for people to raise concerns that machine learning engineers lack generalist skills. If someone gives this criticism and they are not regularly attending your reality check meetings, you can gently point out they are not in the loop on the evidence or context. This prevents most forms of unfair biased criticism of machine learning from automatically being treated as valid in other engineering leadership meetings. In cases where the biased parties are acting in good faith but are just uninformed, this gives you an excellent chance to correct wrong perspectives and build collaboration. You can also better receive sincere feedback about real cases of skill gaps or company-specific context that your machine learning team needs to address.
The second tactic is to host an occasional "state of the union" presentation about machine learning in your organization. This kind of presentation often has the risk of over-focusing on eye-catching demos or top line measures of customer impact to foster excitement, bolster staffing requests or motivate product management. But you can also use it to highlight programs of investment in machine learning testing, architectural component design, refactoring and related topics that connect more with audiences that care about software development life cycles and engineering quality metrics. By building in a section of the presentation that focuses on metrics like code coverage, failure robustness, organizational compliance and modularity, you will create a culture where machine learning is widely understood to be a partner in all the same engineering discipline considerations that affect other teams.
The other type of sociological reason for this bias is much more insidious. Consider first that machine learning is unusually universal, far more so than nearly any other discipline of software engineering. Machine learning can provide production solutions in almost any kind of product or system, ranging from search engines to e-commerce platforms to image processing to computer security to learned database indexes to aesthetic design to predictive devops tools and more. You will encounter people who view machine learning as a threat to a specialty they have built in a certain domain. Maybe they are a product designer, and an algorithm capable of producing superior aesthetic design results threatens to usurp their authority based on curated judgment. Or they may be a backend engineer in charge of a rule-based system for applications like fraud detection or routing customers in a help center. They may view a machine learning model as a threat to their architectural control or ownership of the system’s design.
If these anxieties are held in bad faith, there’s little you can do. The other party has decided to put their personal interest in maintaining authority ahead of helping the customer or the stakeholder. They will criticize machine learning technology unfairly, claiming it is just experimental gadgetry unsuited for production use cases. The outcomes in this situation will hinge entirely on politics. If the anxious, bad-faith critic is given political power, machine learning will be deprioritized and left to fend for itself. If the machine learning team has political power, the critic can be exposed as a bad-faith actor and they will lose influence.
If you find yourself in this situation, your options truly reduce to just two: prepare your resume and steadily look for a new role so you can leave your current company if the bad-faith critic has unchallenged political power, or talk with senior leaders about the underlying fears or anxieties motivating the unfair criticism if you believe they would respond rationally. It may seem dramatic to consider leaving a role in this situation, but it’s important to recognize when you’re dealing with bad-faith counterparts who deny fairness towards machine learning and may work to undermine you instead of partnering with you to solve problems. Young managers often make the mistake of believing that with diligence and hard work, they will change minds and steer the poor culture around machine learning to a better place. You run a serious risk of burnout if you spend significant effort trying to achieve this when you’re dealing with bad-faith coworkers. The people reporting to you will be underserved as you dedicate increasingly greater amounts of your time to combating the undermining political issues. And in the end, if bad-faith coworkers have political power, it’s not likely that your effort will have much payoff. In the interim, good people from the machine learning teams are likely to depart, and it will be hard to attract high-quality replacements. This negative effect on machine learning staffing will in turn be politicized by the bad-faith counterparts. In the end, you can spend a lot of good years trying to boil the ocean at companies that have poor machine learning culture. You will be doing yourself a massive favor by pursuing a new role sooner rather than later, and it can help send a message to others that they should leave for healthier environments, or otherwise create a more urgent catalyst for culture change.
If the anxiety or fear that machine learning will threaten another team's authority is coming from a good-faith point of view the story is much more positive. The first thing to do is to set up one-on-one meetings with the insecure party. Walk them through the way the machine learning solution architecture will work and emphasize that as a domain expert in their team's system, they are a critical partner. Bring system design and integration questions that show them you are serious and the machine learning team wants to understand their expert perspective on topics like reliability, operating cost and complexity. Reinforce that there is a common business or organizational goal and all teams involved share the credit. Even if a core component involves novel efforts from a machine learning team, it takes a village to successfully deliver production engineering solutions. This is the primary spirit of modern team structures related to devops, empowerment, or other notions of cross-cutting, end-to-end partnerships: all the teams involved have the same goals, share credit for successes and share responsibility for failures.
One other tactic in the good-faith anxiety case is to bolster your team’s documentation. Consider creating end-to-end architecture docs and dedicated test plan docs. Offer resources through which others can see your architecture decision process. Use RFCs or ADRs to expose the way general software lifecycle considerations and architecture principles are integrated into the machine learning team’s workflows. If you have regular tech talks or engineering all-hands meetings, be sure your team is using them to share, and don’t forget to resurface those docs or slide decks whenever inappropriate stereotypes about machine learning engineers’ general system skills pop up.
A final consideration among the sociological reasons for unfounded bias towards machine learning engineers is the insecurity leaders feel when they know they cannot provide the resources required to solve problems. Remember that colleague who blurted out that machine learning engineers were “fly-by-night juniors” in a leadership meeting? This particular insecurity turned out to be the issue. The usual setting is a company immature enough that machine learning or data science receives insufficient dedicated support or data resources. It can also manifest as a lack of proper staffing or poor compensation for the machine learning team. This is followed by an outward projection of insecurity from leaders who know they are not in a strong position to meet the needs of their organization and may face high turnover risk.
Take for example the case of devops support. It’s widely understood that machine learning systems have highly specialized infrastructure needs. In a company where there is no devops expertise dedicated to machine learning, the machine learning team itself will likely become overloaded handling devops to an unsustainable degree.
Some leaders facing this problem know they are underserving machine learning as a use case, but they either cannot or will not dedicate staffing to solve the problem. The next place to turn is compensation. If they cannot or will not greatly increase compensation to retain machine learning engineers who are being given poor career development, they may become insecure and try to deflect blame or manufacture the false idea that the issue stems from machine learning engineers lacking generalist skills.
Compensation is typically only a short-term solution in this case as well. This feeds the insecurity, because compensation as a reward for enduring career dissatisfaction still highlights an urgent need to fix the broken career development problems. Rather than invest time in a long-term solution, insecure leaders will misrepresent machine learning engineers as lacking generalist skills as a way to argue it’s OK to let a machine learning team decline. By putting the brakes on backfilling staff, some leaders may hope that after enough neglect, the problem will solve itself in the form of a mass exodus of unhappy machine learning engineers.
In the long run, as their fear of being held accountable for a poor engineering culture and high turnover grows, this type of insecure leader will even start to question the fundamental value proposition of machine learning itself. It shows up in ways that are entirely dissonant with obviously value-additive machine learning solutions, widely publicized investment in similar capabilities among your competitors, and demand from product management for increased scope of machine learning capabilities.
This is a specious and dangerous political issue. You must openly call it out in leadership meetings, or you will risk a big rift developing between political allies of the specious critic and sincere stakeholders in machine learning technology. If a machine learning team is being underserved or under-resourced in a specific way, such as poor devops support, career development or compensation, that fact must be reiterated. Any attempt by a specious critic to subvert that situation and reframe it as a lack of general engineering skill on the part of the machine learning team has to be vigorously resisted and pedantically countered at every opportunity.
• • •
Aside from sociological issues, one other main source of perpetuating the myth is honest technical misunderstanding or technology biases.
For example, senior leaders (directors or executives), HR and recruiting staff may simply lack technical context because they are far removed from daily engineering work. They need operationally quick and easy mental models of different technology domains. They might reduce a security engineer to "cryptography" in their mind, or reduce a financial analyst to "spreadsheets", even though the reality is that those employees perform much more general software engineering tasks in their jobs. This same oversimplification applies in spades to machine learning. At best, someone may reduce a machine learning engineer to something like "notebooks", "research" or "models", but even these topics are fuzzy for most senior leaders, HR and recruiting.
To solve this, it can be helpful to make a document or slide deck that walks through the skills a typical person on your team needs, with a good layperson-friendly visual aid for each one. You can point out that most machine learning engineers have experience with many distinct backend engineering concerns, such as:
• GPU system administration and batching or queuing for GPU throughput efficiency, as well as multi-GPU distributed model training
• Query performance tuning and indexing in many database systems
• Operationalizing models in microservice frameworks and designing REST APIs
• Factoring data cleaning and transformation software
• Creating optimized algorithms in many paradigms (object-oriented, functional, compiled vs. interpreted, static vs. dynamic typing, and various language ecosystems with modern machine learning tooling)
• Extending version control best practices and collaborative development to also handle experiment tracking and reproducible model training
• Test engineering concerns that other teams don't regularly handle, such as complex statistical accuracy checking, training vs. serving skew, or stability of online learning
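To make that last point concrete, here is a minimal sketch of one such test: a statistical accuracy regression check that fails if a model's held-out accuracy drops more than a tolerance below a recorded baseline. All names here are hypothetical, and the toy "model" is a stand-in for whatever predictor a real team would load; the shape of the check, not the model, is the point.

```python
import random

def accuracy(predict, examples):
    """Fraction of (features, label) pairs the predictor gets right."""
    correct = sum(1 for features, label in examples if predict(features) == label)
    return correct / len(examples)

def check_no_regression(predict, examples, baseline, tolerance=0.02):
    """Fail if held-out accuracy drops more than `tolerance` below the
    recorded baseline -- a guard against silent model regressions."""
    acc = accuracy(predict, examples)
    assert acc >= baseline - tolerance, (
        f"accuracy {acc:.3f} fell below baseline {baseline:.3f} "
        f"(tolerance {tolerance})")
    return acc

# Hypothetical usage: synthetic labeled data and a toy threshold model.
random.seed(0)
examples = [((x,), int(x > 0.5)) for x in (random.random() for _ in range(1000))]
model = lambda features: int(features[0] > 0.5)  # matches the labeling rule

acc = check_no_regression(model, examples, baseline=0.95)
```

In a real suite, `baseline` would be versioned alongside the model artifact, and a second copy of this check run against the serving path's predictions would catch training-vs-serving skew as well.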
Once you socialize reference materials that educate colleagues about this highly diverse set of generalist skill areas that embody daily work for machine learning engineers, you can use them to redirect unfounded criticisms or oversimplifications, and even develop a culture where other leaders, HR or recruiting can help send this message on your behalf.
When the parties involved are not senior leaders, HR or recruiting, but instead come from peer engineering teams, the solution is largely the same but you may want to use examples or references more suited to that audience. You may also need to be on the lookout for programming language biases that arise in this case. The unfair bias towards machine learning engineers might actually just be a symptom of unfair bias against a certain language like Python or Scala or a bias against a web framework like Flask. The criticism is sometimes based on parochial feelings of superiority about an alternative technology and sometimes based on misunderstanding of the specific capabilities of machine learning systems within these different languages.
Below are four example articles that discuss generalist technical topics in the context of machine learning, which may help start a knowledge-sharing conversation that removes culture barriers stemming from language or tooling differences.
• “User Experience Design for APIs” - the popular deep learning library Keras is used as an example for end-to-end usability design.
• “Machine Learning: The High-Interest Credit Card of Technical Debt” - this article sits on the night table for late-night reading for practically every machine learning engineer. Its focus on good software hygiene that prevents complex sources of tech debt from the very beginning of a project is a critical, fundamental part of machine learning.
• “Effective Testing for Machine Learning Systems” - a good introduction to some of the more exotic testing requirements that machine learning systems have which other backend systems don’t.
• “Deploying Deep Learning at Trulia” - a reference architecture for serious backend considerations about structuring user APIs and prediction APIs under constraints on GPU vs. CPU task resources.
• • •
As a manager of machine learning teams, you should take pride in spending work hours addressing the basic level of respect shown to your engineers. It will boost morale, invite loyalty and commitment, and expose misunderstandings so they can be alleviated to unblock progress. It requires perceptiveness on your part to diagnose whether you are facing sociopolitical issues or ordinary technical misunderstanding. Is unfair criticism coming from a place of good faith, needing only reassurance? Or is it coming from a place of bad faith, disguising attempts to retain political authority? Is the critic insecure because they know they are underserving machine learning teams and want to subvert that fact? Once you decompose the problem along these lines, you can identify the relevant tactics we have discussed and create a plan to move the perception of machine learning engineering back to a fair and healthy position.