
I work in the sales department of an electronics component manufacturing company, and we do data science projects using traditional algorithms like random forests (success likelihood of design projects), time series (demand forecasting), clustering (customer segmentation), etc. I am the only data scientist in the sales department.

My department sets individual KPIs such as the ones below:

a) Customer segmentation - Number of customers we recovered, and revenue gained from those customers, based on the insights that the AI project provided (e.g., one of the segments is the lost-customer segment). Our sales team follows up with lost customers to bring them back. Every year they want the AI project outcomes to exceed the previous year's achievement (just like for human salespeople).

b) Demand forecasting - Our project predicts the revenue of each part (which, summed up, gives the total sales revenue for the company). Our sales team takes the results and checks them with customers.

c) Project success prediction - How many projects actually succeed when the AI model says there is a high likelihood of success (e.g., 90%). We don't use a yes/no binary classification; instead we use a probability measure (see the calibration sketch after this list). Our sales team takes the results and checks them with customers.
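KPI (c) is essentially a calibration check: when the model says 90%, do roughly 90% of those projects succeed? A minimal sketch of how that could be measured, using scikit-learn's `calibration_curve` on hypothetical probability/outcome arrays:

```python
# Minimal sketch (hypothetical numbers): check whether predicted success
# probabilities match observed success rates, i.e. model calibration.
import numpy as np
from sklearn.calibration import calibration_curve

# predicted success probability for each past design project (hypothetical)
predicted = np.array([0.92, 0.88, 0.95, 0.35, 0.60, 0.90, 0.15, 0.81])
# whether each project actually succeeded (1) or not (0) (hypothetical)
actual = np.array([1, 1, 1, 0, 1, 1, 0, 1])

# observed success rate vs. mean predicted probability, per probability bin
frac_success, mean_pred = calibration_curve(actual, predicted, n_bins=4)
for p, f in zip(mean_pred, frac_success):
    print(f"model said ~{p:.0%} -> projects succeeded {f:.0%} of the time")
```

If the two numbers track each other, the "90% means 90%" reading that sales relies on holds.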

So, now the concern is that my department assesses the project only on tangible sales outcomes like revenue, customers gained, etc. They don't care about the effort I invest in building multiple models, code quality, documentation, insights that assist sales (with decision making), efficiency improvement, model performance, explainability, etc. In other words, they don't treat me as an IT guy; instead they treat me and the AI model as digital sales (rather than traditional sales). While I understand that every AI investment requires some sort of tangible return, and I am okay with that, treating the project as successful and eligible to be called "met expectations" only when it achieves tangible gains seems a bit incorrect to me.

Experts here, do you have any advice on how to deal with such scenarios and decide on KPIs for data science projects? Or is this the global trend for data science projects? What would be appropriate KPIs to propose for my projects, from a data scientist's perspective?

The Great

1 Answer


the effort I invest in building multiple models,

This is just part of finding a better final model that will give better measurable outcomes (revenue, customers gained, etc.), so it is covered by that.

(If you spend 10 man-days trying alternative models, but your best model is always the one you try on day 1 or day 2, then your prioritization is good, but you are wasting 80% of your effort. On the other hand, if the model you found on day 10 gives N% more sales than the one you found on day 1, e.g. based on back-testing, then you can compare the value of that N% to 9 days of your salary, and hopefully it is higher.)
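As a back-of-the-envelope version of that comparison (all figures below are hypothetical placeholders):

```python
# Hypothetical back-of-the-envelope check: did the 9 extra man-days of
# model search pay for themselves? All figures are made-up placeholders.
baseline_annual_sales = 10_000_000  # revenue attributed to the day-1 model
uplift_pct = 0.5                    # back-tested uplift of the day-10 model (N%)
extra_days = 9                      # additional man-days spent searching
daily_cost = 800                    # loaded daily cost of the data scientist

extra_revenue = baseline_annual_sales * uplift_pct / 100
extra_cost = extra_days * daily_cost

print(f"extra revenue: {extra_revenue:,.0f} vs. extra cost: {extra_cost:,.0f}")
print("worth it" if extra_revenue > extra_cost else "not worth it")
```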

code quality, documentation

These can be classed as prevention of technical debt.

If this is a large company, with a large IT department, you could ask them what they do to get technical debt measured and considered.

Note that avoidance of technical debt can often be considered premature optimization (aka a hard sell) if lots of projects are being tried and perhaps only 10% of them are expected to survive the trial. Better to see which 10% survive, then spend the time improving the code and documentation of the survivors.

insights that assist sales (with decision making)

This might be the tricky one on your list, assuming it is something you are directly involved with, rather than an automated output of the model.

I would try to keep a list of those insights, so you can show them as outputs at the end of the year.

efficiency improvement

Of the model building time? Is waiting for the model(s) to build costing money? If not, this is premature optimization. Same argument for inference. If it runs as an overnight batch process, your team doesn't care if it takes 5 seconds, 5 minutes or 5 hours. But if it cannot run between the time of getting end-of-day data and the sales meeting the next day, then optimizing it becomes very important.
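To make that rule of thumb concrete, a tiny sketch (all times hypothetical): optimization only matters once the batch no longer fits its window.

```python
# Hypothetical sketch of the rule above: the pipeline only needs optimizing
# when it cannot fit between end-of-day data arriving and the next meeting.
from datetime import datetime, timedelta

data_ready = datetime(2024, 1, 15, 19, 0)  # end-of-day data available (hypothetical)
meeting = datetime(2024, 1, 16, 9, 0)      # next morning's sales meeting (hypothetical)
batch_runtime = timedelta(hours=5)         # measured end-to-end run time (hypothetical)

window = meeting - data_ready
if batch_runtime <= window:
    print(f"runs in {batch_runtime}, window is {window}: speeding it up is premature optimization")
else:
    print(f"misses the {window} window by {batch_runtime - window}: optimize now")
```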

model performance

You mean the quality of the model's suggestions? This (like building multiple models) is already covered by the measurable targets.

If you meant speed of inference, see previous.

explainability

Is this required by regulation, or company policy? If so it is non-negotiable, and is just part of the rest of model building.

But I feel this is covered the same as "insights that assist sales" above if it is something you personally do in meetings. And if it is part of the model's output, then it is already covered by a/b/c in your question.

Darren Cook