Opportunities
“There is a continuously growing number of use-cases and companies who have proven the transformative value of data,” comments Jan Teichmann, Head of Data Science at Zoopla. “Data science and machine learning models have reduced the risk and costs of B2B and created entirely new products and revenue streams for B2C. The hype has increasingly proven itself and recent progress in AI is reaching a breakthrough point from R&D to real world applications. Which industry could afford resisting the ongoing data transformation in such context and stay relevant long-term?”
Given all this, data science has become thoroughly embedded in DevOps culture, and it's become easier than ever to bring powerful analytical techniques to bear. Jan notes, “There is an emerging standard of best practice, platforms and toolkits which significantly reduced the barrier of entry and price point of a data science team. This has made data science more accessible for companies and practitioners alike.”
Yet, this has had a knock-on impact on the role of data scientists, with rapid changes taking place in the technology landscape and in how the field itself is viewed. “It is important to remember that data science is (a) still an emerging business function and (b) constantly evolving from ongoing innovation,” says Jan. “While the data science unicorns are as rare as always, the data science discipline has seen a great increase in differentiation and maturity. There are now cross-functional teams working on algorithms all the way to full-stack data products with a focus on research, commercial applications, experimentation, interpretability, algorithmic fairness and data ethics. These days there are advanced analysts, data scientists, ML/AI engineers and DataOps professionals – and the CDO [Chief Data Officer] is no longer a rarity either.”
And this of course sits within a growing commercial marketplace. “With the increasing number of vendors for data science platforms and PaaS offerings by any major cloud provider, the technological challenges in big data and data science projects can be overcome by anyone, not just high-tech companies,” adds Jan. “Data science has finally switched from hype into a more pragmatic, value-focused delivery mode.”
Key skills
So, what are the key areas of expertise that data science specialists need today? With the field covering everything from analytics to algorithms, databases and big data processing, there's quite a range: Python, MATLAB, R, SAS, SQL, NoSQL, Hadoop and Spark come in useful for data scientists; C++, Java, Perl, Python and Ruby for data engineers; and Java, Julia and Scala for machine learning engineers.
“For the majority of commercially applied teams, data scientists can stand on the shoulders of a quality open source community and toolkits and frameworks for their day-to-day work,” says Jan. “The previously required academic/scientific understanding has given way to a need of mastering real-world data infrastructure (which is usually comprised of data silos of poor quality), commercial awareness, an ability to communicate insights with wider business stakeholders and robust product thinking around proof of value.”
Despite all this, though, many specialists struggle to find jobs. Jan explains, “In the last 10 years, 85% of big data and data science projects have failed to deliver business impact and many teams have been discontinued as a consequence. While the field and industry has learned a lot from that previous hype of inflated expectations, we are still in the early days of a cautious reinvestment phase. The new data science teams grow much slower than before in a new lockstep with proving their value via delivered business impact.”
Furthermore, there's the challenge of leadership. “There is no shortage of data scientists but, at the same time, many businesses struggle to find the qualified leaders and managers who can blaze a more successful and sustainable trail this time around,” says Jan. “This does hold back investment into data science at the moment and can make it temporarily more difficult for people with data science skills to find a job.”
It's against this background that businesses are gradually building their data science teams, upskilling existing staff while also taking on new hires. It's also important for firms to have data engineers to hand, both to maintain infrastructure and oversee data collection and to make sure that data is managed effectively.
“A good data science team is a happy data science product team, fully empowered to deliver full-stack data products for the needs of internal stakeholders as well as external customers and motivated by their successful delivery of business impact,” says Jan. “There is an important link between a scalable delivery pipeline and a happy data scientist. Data scientists are motivated by developing new models to solve relevant business problems rather than the day-to-day operational responsibilities of models in production. This means that data infrastructure, data science platforming, automations and DataOps are crucial problems not just for the delivery of business outcome but also for the retention of the team long term.”
Conclusion
Given the immense value that big data can deliver for businesses, building a strong data science capability is going to be a key priority in the years ahead. As the field matures, it's becoming an operational requirement.
Because of this, specialists who are able to demonstrate multi-disciplinary skillsets (for example, crossing into cybersecurity) are likely to be well-placed to get ahead – and data experts who can prove themselves as competent managers will be in particularly high demand. The greater challenge, though, will be identifying new use-cases where data analysis can provide value for the organisation, as well as the data points that can unlock exponential growth.
Jan is a successful leader in the data transformation efforts of companies and has a track record of bringing data science into commercial production usage at scale. He previously co-founded Cambridge Energy Data Lab, where the team celebrated a successful exit with Enechange.jp, a utility comparison platform, which is now the market leader in Japan.
He now uses his skills to lead the data science team at Zoopla, where the team is driving great innovations from vast amounts of property market data, behavioural data, geo data, property images and text data sets.
Jan publishes a wide range of articles about his work and challenges as a data scientist, manager and thought leader on Medium (https://medium.com/@jan.teichm...). You can also connect with him via LinkedIn:
https://www.linkedin.com/in/janteichmann