Hello and welcome to Open Citizen Data Science!
In many industries Data Science is still a relatively new field and as such it's likely that hiring a consultant to try new ways to harness the power of available data is seen as a relatively low-risk path.
While this is often the case, it's still easy to run into unscrupulous consulting firms advertising miracles in business performance improvements, often at what seems a very competitive price.
This can easily lead to disappointing outcomes and leave you feeling that you've been defrauded.
I went through that phase, which cost a lot of time and money. It made me realize that if you want things done properly, you have to do them internally and not rely exclusively on external consultants, although they can still be an extremely valuable asset once you find the right firm to partner with.
Data Science is a collection of tools and methods for getting statistically relevant results on problems involving data.
Just like in other disciplines, the potential benefits depend on how it’s used.
Let’s make something clear though: Data Science and AI are not magic.
To make a Data Science project work, you need a defined set of things:
- A well-defined problem: Do you want to see potential churners? Do you want to see which customers to call to sell something? Are you trying to see which parts are going to fail soon? Pick one problem at a time.
- Abundant and clean data that is produced in a consistent way and is varied enough to cover most potential factors affecting the problem.
- Lots of support from people who actually work on the problem: Data scientists without domain expertise are bound to waste time on correlations that bring little or no value.
- An organization that is actually able to act on the insights taken from the data. Often enough there will be results that go against “common wisdom”, resulting in resistance from the very same people who are supposed to benefit from the new information! Good insights need to be actionable!
- Do not expect miracles. Many business cases are about relatively rare occurrences (like monthly sales per customer in a volume market), and even a 10x improvement in predictive capability can still leave you below 50% accuracy in the final results. This is normal if, for example, what you’re looking for is something that happens only 1-2% of the time.
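The arithmetic behind that last point can be sketched in a few lines. This is only an illustration under stated assumptions: it reads "accuracy" as the share of flagged cases that turn out to be real (precision), and it reads "10x improvement" as a lift of 10 over picking cases at random. The function name `precision_after_lift` is made up for this example.

```python
def precision_after_lift(base_rate: float, lift: float) -> float:
    """Share of flagged cases that are real, given a lift over random picking.

    Assumption: random picking is right `base_rate` of the time, and the
    model multiplies that hit rate by `lift` (capped at 100%).
    """
    return min(1.0, base_rate * lift)


# Events that happen 1-2% of the time, with a model 10x better than chance
for base_rate in (0.01, 0.02):
    hit_rate = precision_after_lift(base_rate, lift=10)
    print(f"base rate {base_rate:.0%}: 10x lift -> {hit_rate:.0%} of flagged cases are real")
```

Under these assumptions, a model that is ten times better than guessing still gets only 10% to 20% of its flagged cases right, which is well below 50% and yet can be a very useful result.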
If the conditions above have been met and results have still been beyond disappointing, look for these possible red flags:
- “Senior” Data Scientists that look younger than 30. The field is not that young anymore, but many consulting companies still try to sell fresh graduates as experienced resources. The individual consultants are usually bright, but they lack experience with real-world business data, especially with typical dirty datasets, and have an algorithm-focused rather than data-focused approach.
- Focus on a “black box” solution. Anyone who just wants the data and will send you back the results is unlikely to give you good value for your money.
- Focus on customized solutions completely written in code. Customized solutions are often the mark of a start-up fresh from university or of someone who has a vested interest in keeping their product as obscure as possible in order to ensure repeat business when something goes wrong. Having customized R or Python scripts for the machine learning part is acceptable, but if they refuse any kind of integration with internal IT infrastructure for data extraction and assembly, you're dealing with little real-world experience, outdated methodologies, or a business model focused on "profitability through obscurity", as in the previous point.
- Focus on technical jargon. Deep learning is often not an optimal solution for most business cases using structured data, yet many consulting companies will try to push it, especially as it saves them time on feature engineering and on having to explain which variables affected the results. Always demand transparency: it's your money, and your investment depends on being sure the insights are based on solid assumptions!
- External data needs skilled evaluation before being bought. While in some cases additional data can make a huge difference, it shouldn’t be purchased blindly. If it’s personal data (customer profiles, for example) and it involves European customers, make sure it has been handled in accordance with GDPR, or you risk some hefty fines. Some players will also try to sell data that is actually available for free. One of the big 4 consulting companies recently tried to sell the company I work for census data that is freely available in the public domain!
- Make sure whoever tries to sell you a Data Science project has actual experience with the problem you’re trying to solve! Ask for references and follow them up, because domain expertise is extremely important. If they don't have experience in your industry, be ready to have an internal analyst guide the consultants through the data understanding process and make sure the results obtained are actually useful.
Other than that, take your time in choosing the right partner and sample as many offers as you can to improve your understanding of the required effort. If in doubt, ask for a free or very low cost Proof of Concept project before committing your budget.
Stay tuned for our next article!