Should a Data Science team be multilingual?
I recently watched a good talk from posit::conf(2025), Building Multilingual Data Science Teams by Michael Thomas. Its key message is that good Data Science teams should be multilingual, i.e., able to work with more than one programming language. The talk makes some good arguments for this, but I would have liked to hear more about the downsides and trade-offs.
The three main arguments are that Data Science teams should be multilingual in order to have 1) a larger potential talent pool, 2) more tools in the toolbox, and 3) a dual offering. I believe these arguments are stronger if you are a consultant than if you work within an organisation where you collaborate with non-Data Scientists. Accordingly, the talk seems more like a pitch for why (some) consultants should be able to work with R and Python rather than why Data Scientists in general should learn and use both R and Python.
I remember the advantages of working with both R and Python from my time as a consultant. It is great not to have to turn down potential clients because you can only work in one of the two languages, but most people, luckily, do not work as consultants. Instead, most Data Scientists work in cross-functional teams, or at least in close collaboration with other teams where people use Python. If you are even remotely close to putting things into production, you will be working with software developers, AI engineers and the like, and all of them are more likely to work in Python than R.
While the arguments in the talk are for R and Python, they can also be interpreted as arguments for Python in and of itself. Earlier this year I wrote a post about why I would opt for Python over R, and I believe that most good Data Science teams today - all else equal - are better off specialising in Python rather than in both Python and R. It is not easy to switch between R and Python all the time, and while packages with similar APIs have made the switch easier (e.g., plotnine in Python mirrors ggplot2 in R), I see this as more of an argument for switching to Python than for relying on both Python and R.
That is, the easier it is to use both R and Python, the less of a comparative advantage R will have over Python. It might be easier to do certain things in R, but that is only a reason to try to make things easier in Python rather than relying on both Python and R. As large language models make it easier to turn natural language into code (and vice versa), it becomes less important to "speak" both R and Python. My impression is also that most LLMs today are much better at writing Python code than R code.
If your team works in both R and Python, tech debt can easily pile up. If you maintain your own R and Python packages, should both be updated whenever a new feature is introduced? How should updates be handled across the two languages? Should all team members be expected to work in both R and Python? And if not, does that limit who can do code reviews, and when? Being able to work with both R and Python in a Data Science team might look great on paper, but in practice it introduces specific costs to the maintenance of code and to collaboration within and between teams.
This is not to say that a Data Science team cannot be multilingual. I work in R and Python every day, and there are a lot of things that are easier for me to do in R than in Python (and vice versa). However, if I were to set up a Data Science team from scratch, I do not see any strong arguments for why such a team should be able to work with both Python and R instead of going all-in on Python.
Finally, even if a Data Science team should be multilingual, nothing says that the languages have to be Python and R. You can make equally good arguments for focusing on, say, Python and Rust, or Python and Julia. Again, it is fine if a Data Science team can use both R and Python, but I am not convinced - in 2025 - that the benefits outweigh the costs.