Learning a new programming language is an investment in human capital, so it is worth estimating the return on that investment. Requirements vary widely by industry and by job, which makes a generalizable answer difficult to find. One approach is to analyze the software skills required in job postings, which reflect current demand and may therefore indicate the general return on investment. We downloaded all data science related job posts for Switzerland on Indeed to obtain a rough idea of the popularity of each tool on the Swiss labor market.
We scraped all job listings in Switzerland matching data science related keywords (Data Scientist, Data Analyst, Big Data, Machine Learning) in June 2017. Even though this only covers the postings that were live at the time, job ads are usually kept up for several weeks, so we assume the sample is a reasonable reflection of current market demand. The search covers 848 job postings, but only about half of them name a specific software tool or programming language. Why do so many postings come without specific requirements? Often employers only state general requirements (e.g. knowledge of machine learning or data analytics), and some of the postings merely overlap with data science (and thus mention it) without explicitly targeting fully fledged data scientists.
We searched the 848 job postings for the 25 most popular data science software tools. 440 job postings mention at least one of them, with many listing several. The figure below shows the number of job posts mentioning each tool. With more than 210 job posts, SQL is the most frequently requested skill by a significant margin. Python and Java follow with around 130 mentions each.
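The counting step described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the analysis: the pattern list, the sample postings, and the function name `count_mentions` are all hypothetical, and the original list of 25 tools is not reproduced here. Each posting is counted at most once per tool, and word boundaries keep "Java" from matching "JavaScript".

```python
import re
from collections import Counter

# Hypothetical patterns for a handful of tools (the article's full list of 25
# is not given). Multi-letter names match case-insensitively; "R" stays
# case-sensitive so that a stray lowercase "r" is not counted.
TOOL_PATTERNS = {
    "SQL": r"\b(?i:sql)\b",
    "Python": r"\b(?i:python)\b",
    "Java": r"\b(?i:java)\b",
    "Hadoop": r"\b(?i:hadoop)\b",
    "R": r"\bR\b",
}

# Made-up posting texts, for illustration only.
SAMPLE_POSTINGS = [
    "Data Scientist with strong Python and SQL skills.",
    "Big Data Engineer: Java, Hadoop, SQL.",
    "Analyst with knowledge of machine learning.",
    "Statistician experienced in R and python.",
]


def count_mentions(postings, patterns=TOOL_PATTERNS):
    """Return (postings per tool, number of postings naming at least one tool)."""
    compiled = {name: re.compile(pattern) for name, pattern in patterns.items()}
    counts = Counter()
    postings_with_tool = 0
    for text in postings:
        # A posting contributes at most one count per tool.
        hits = [name for name, regex in compiled.items() if regex.search(text)]
        counts.update(hits)
        if hits:
            postings_with_tool += 1
    return counts, postings_with_tool


if __name__ == "__main__":
    counts, postings_with_tool = count_mentions(SAMPLE_POSTINGS)
    for tool, n in counts.most_common():
        print(f"{tool}: {n}")
    print(f"Postings naming at least one tool: {postings_with_tool}")
```

Plain word-boundary matching is a deliberate simplification: it will miss product names that embed a keyword (for example "MySQL" does not count as "SQL"), so a real analysis would likely need a curated list of aliases per tool.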
Is the above distribution unique to Switzerland or does it reflect worldwide trends? Robert Muenchen performed the same analysis for the US market. Places one to three turn out to be identical: SQL (18,000 jobs), Python (13,000 jobs) and Java (13,000 jobs) dominate the market. Some differences exist further down: for example, Hadoop is more popular in the US (4th) than on the Swiss market (6th). But the two graphs are very similar overall, suggesting that software trends are largely global and that demand is shaped by the technological frontier.
If you are new to data science or thinking about moving into the field, the analysis gives you a decent idea of which programming skills are likely to be particularly valuable in the near future. SQL (still) being in high demand could be a signal that many companies don’t just expect skills in data analysis but a smooth interaction with databases too. Python seems to be on the rise. Robert Muenchen shows that Python’s popularity has increased greatly over the past three years, with its growth outpacing that of the other big open source player, R. Traditional software such as SAS is stagnating. Overall, a combination of strong analytical skills in Python and R with solid knowledge of SQL looks like a great foundation for a career in the growing field of data science.
Interested in analyzing jobs on Indeed? You can access the jobs through their API. The jobbR package for R is helpful; similar tools exist for Python.
Performed a fascinating analysis you’d like to publish and share? Found a cool dataset that should be featured on our page? Contribute to our blog!