There’s not much doubt that the potential for data science is massive. What many teams are finding is that the hurdles to realizing this potential are also massive. In this post, I look at some of the key building blocks that teams need to start with in order to realize maximum success from their data science initiatives.
“Let’s build the right data science team.” It’s very easy to say, not nearly so easy to pull off. We all understand that having data science expertise is now critical to the success of artificial intelligence (AI) initiatives, and ultimately to the business’s long-term prospects.
The problem is that this is true for your organization, for your competitors, and for pretty much every other company in every other industry. One survey found that 67% of organizations are expanding their data science teams, and between 2015 and 2019, AI-related hiring grew 74%. While the demand is pressing and widespread, the supply of top talent remains scarce. The same survey reported a shortfall of 250,000 data science experts in 2020. Following are some strategies for overcoming these obstacles.
Particularly as organizations move workloads to external clouds and automation continues to grow more widespread, many organizations will be able to redeploy some of their engineering experts and assign them to top-level, strategic data science initiatives. When done right, this can be a true win/win, where organizations build up staffing to drive key initiatives forward, and staff can build up experience in an area that will undoubtedly represent a hot job market in the long term.
These internal staff members have the advantage of understanding your business, and they have skills in IT operations, data management, and analytics that can serve as a strong foundation. Look for engineers on staff who have some relevant experience and, perhaps more importantly, a drive to learn about this area. Offer these team members opportunities for education, and give them assignments to start working with the data science staff on hand.
While it may be difficult to hire top data science expertise, it can be done. Finding and hiring a candidate who checks every box on your list of criteria may not be realistic. Start by identifying the must-haves so that you can cast a wider net.
Make sure you’re harnessing the networks of existing data science staff to learn about the teams doing innovative work. Get engaged in forums, user groups, and technology communities to build connections with organizations and people. Finally, keep in mind that an optimal team will be one that has a diverse set of backgrounds and areas of expertise. By bringing together team members with distinct strengths, backgrounds, and skills, organizations can build teams whose members complement one another and deliver strong results.
In plotting an effective data science initiative, it is important to start by assessing the data available. In this effort, it can be helpful to assess data across these categories:
By assessing the data available, and understanding its potential and downsides, teams can start to ensure they’re using the optimal mix of information to power their initiatives.
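As a rough illustration of what a first-pass data assessment could look like, the sketch below reports how complete each field is and how many distinct values it holds. The function name `profile_records`, the sample records, and the choice of metrics are all assumptions made for this example, not a prescribed method.

```python
def profile_records(records):
    """Summarize completeness and distinct values per field.

    `records` is a list of dicts; None and "" are treated as missing.
    (Hypothetical helper, written only for illustration.)
    """
    total = len(records)
    fields = sorted({key for record in records for key in record})
    report = {}
    for field in fields:
        values = [record.get(field) for record in records]
        present = [v for v in values if v not in (None, "")]
        report[field] = {
            # Share of records that actually carry a value for this field
            "completeness": len(present) / total if total else 0.0,
            # Number of different values observed
            "distinct": len(set(present)),
        }
    return report


# Made-up sample data, for demonstration only
sample = [
    {"customer_id": 1, "region": "EMEA", "email": "a@example.com"},
    {"customer_id": 2, "region": "", "email": None},
    {"customer_id": 3, "region": "EMEA"},
    {"customer_id": 4, "region": "APAC", "email": "d@example.com"},
]
report = profile_records(sample)
for field, stats in report.items():
    print(field, stats)
```

A report like this makes it easier to spot fields that are too sparse to rely on before any modeling work begins.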
As outlined above, the dearth of data science talent is a problem today, and it isn’t going to disappear any time soon. While the process will take time, the long-term solution is to democratize data science. It is only by empowering teams from across the organization to harness AI that businesses will be prepared to navigate the challenges of the future.
For some time now, analysts have been writing about the concept of the citizen data scientist. At a high level, this approach refers to capabilities and practices that allow users to extract insights from data without needing to be as skilled and technically sophisticated as expert data scientists. By establishing platforms that make it easy for non-data scientists, and even non-technologists, to access data, ask questions, and experiment with models, businesses will be able to open up an entirely unprecedented level of AI-powered innovation. To learn more, see our blog post on cultivating the citizen data scientist.
Depending on an organization’s geographic reach or its industry, there may be a number of external privacy mandates in place, including regulations like the EU’s General Data Protection Regulation (GDPR), the California Consumer Privacy Act (CCPA), the Health Insurance Portability and Accountability Act (HIPAA), and the Payment Card Industry Data Security Standard (PCI DSS), to name but a few. Complying with these types of regulations isn’t anything new. However, what is new is the need to navigate compliance while moving forward with data science initiatives.
With data science, teams gain a new level of power in terms of how data can be mined, and with this increasing power comes increasing responsibility. To start, teams must establish policies and guardrails that ensure data science initiatives don’t encroach on, or completely violate, these rules.
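One possible guardrail is to replace sensitive values with keyed tokens before data reaches analysts. The sketch below is a minimal example of that idea using Python’s standard `hmac` and `hashlib` modules. The field list `SENSITIVE_FIELDS`, the function name `pseudonymize`, and the sample key are hypothetical, and pseudonymization on its own does not make a dataset compliant with any of the regulations above.

```python
import hashlib
import hmac

# Hypothetical policy list; a real one would come from your compliance team
SENSITIVE_FIELDS = {"email", "card_number"}


def pseudonymize(record, secret_key, sensitive_fields=SENSITIVE_FIELDS):
    """Return a copy of `record` with sensitive values replaced by keyed tokens.

    The same input and key always produce the same token, so records can
    still be joined on the tokenized field without exposing the raw value.
    """
    safe = {}
    for field, value in record.items():
        if field in sensitive_fields and value is not None:
            digest = hmac.new(secret_key, str(value).encode("utf-8"), hashlib.sha256)
            safe[field] = digest.hexdigest()[:16]  # shortened token for readability
        else:
            safe[field] = value
    return safe


# Example key for demonstration only; keep real keys in a secrets manager
key = b"demo-key-not-for-production"
raw = {"customer_id": 42, "email": "jane@example.com", "region": "EMEA"}
masked = pseudonymize(raw, key)
print(masked)
```

A keyed hash is used here rather than a plain hash so that tokens cannot be reproduced by someone who lacks the key.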
Perhaps even more fundamentally, data science teams need to ensure they’re aligning with corporate standards for ethics and operating with transparency. Beyond legal risks and the potential for fines for non-compliance, failing to operate in an ethical manner can expose a business to backlash, endangering the most valuable of corporate assets: the customer and their loyalty. If not mitigated, these risks can therefore outweigh any of the potential upsides of a data science initiative.
In recent years, a wide range of workloads and services have been moved to the cloud, and data science isn’t an exception. For many organizations, the agility and scalability of these services have made cloud deployments the go-to alternative for analytics and big data initiatives from early on.
Cloud infrastructure is only the beginning, however. Now, cloud-based services like Amazon SageMaker and Microsoft Azure ML bring together a range of technologies that represent a complete data science platform, significantly streamlining the effort required to move from setup to AI-fueled intelligence.
Moving forward, advancements in automation will also offer profound advantages. Historically, data science initiatives have meant a significant amount of labor, particularly in terms of data aggregation and clean up. Automation will fuel significant improvements in these areas. These advancements will also give teams the ability to build and test models with a minimum of manual effort.
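To make the clean-up burden concrete, here is a small sketch of the kind of step that lends itself to automation: trimming stray whitespace, dropping records with no identifier, and removing duplicates. The function name `clean_records`, the `key_field` parameter, and the sample rows are assumptions for illustration, and real pipelines typically involve many more rules than this.

```python
def clean_records(records, key_field):
    """Trim string values, then drop records with a missing or repeated key.

    The first record seen for each key is kept; later duplicates are skipped.
    (Hypothetical helper, written only for illustration.)
    """
    seen = set()
    cleaned = []
    for record in records:
        # Strip leading and trailing whitespace from every string value
        normalized = {
            field: value.strip() if isinstance(value, str) else value
            for field, value in record.items()
        }
        key = normalized.get(key_field)
        if key in (None, "") or key in seen:
            continue
        seen.add(key)
        cleaned.append(normalized)
    return cleaned


# Made-up sample data, for demonstration only
rows = [
    {"order_id": "A1 ", "status": " shipped"},
    {"order_id": "A1", "status": "shipped"},
    {"order_id": "", "status": "pending"},
    {"order_id": "B2", "status": "pending "},
]
cleaned = clean_records(rows, "order_id")
print(cleaned)
```

Once rules like these are codified, they can run on every new batch of data instead of being repeated by hand.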
Today, many business leaders are looking to leverage data science, but they’re not where they want to be yet. By starting with an understanding of the core building blocks that are required, teams can set the stage for the realization of the breakthroughs AI and machine learning can provide. To learn more, be sure to view our blog post, “Putting Machine Learning Algorithms to Work.”