Saturday, July 2, 2022

On goals and metrics


My kids developed a strong interest in the Roblox games. We had to establish a time limit on the use of our home laptop. Once the limit was set and put into daily use, the kids started to invent, day by day, ways of utilizing the limit in a manner that was most efficient for them. Let me give you a sample of what they came up with during the first week:

  • start the timer only when the laptop is up and running
  • start the timer only when the game is selected and loaded
  • pause the timer when I want to use the bathroom
  • pause the timer whenever there's a problem with the laptop (it hangs, no connection, etc.)
  • pause the timer when I'm lost in the game's world and I need to find my way out (!)

I immediately knew what their next ideas could be:

  • start the timer only once I have selected the world and configured my character
  • pause the timer while I'm in the process of exiting game A and choosing game B
  • my unused time limit moves on to the next day
  • ...and so on

What was originally the purpose of introducing the time limit? My goal was to limit the time the kids spend in front of the screen. The metric I started with was the elapsed time of using the laptop.

Pausing the timer when a character is lost in a maze defeats the original purpose. The other problem is that letting them add more rules would obviously make the whole time limit system more complicated and difficult to use - and this is true even when the actors in the system (my kids) have no intention of cheating. The reasonable setup was then to only observe one single rule: once the laptop is up and running, your timer starts.

I believe teams and organizations sometimes do something similar to what my kids tried to do. The difference is, within companies and organizations it can even turn into intentional attempts to cheat the system. But even without the intention to cheat, the attempts to measure the net value and the process of adding new measurement rules can be harmful. Let's review a couple of examples.

Cycle time

A team started measuring cycle time as the time elapsed from the moment a Story appears in the TODO column of their board to the moment it lands in the Done column. Soon, they figured out that the scrum board tool they used was smart enough to subtract weekends from cycle time, so they switched to that. Later on, by their own means, they started subtracting national holidays. But what about long weekends? There are always a couple of them in a year and in our team, if a national holiday falls on a Thursday or a Tuesday, everyone gets the Friday or the Monday off and goes on vacation. Counting these into the cycle time wouldn't be fair, would it?

On the other hand, the team's throughput or the average cycle time of completing a Story is what it is. The time it takes to roll out a new feature to production is what it is. They are real values. The cycle time calculated as above, using all the sophisticated rules, isn't real.
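To make the gap between the two numbers concrete, here is a minimal sketch in Python. The function names, the Story dates and the excluded days are all made up for illustration; this is not how any particular board tool computes cycle time.

```python
from datetime import date, timedelta


def calendar_cycle_time(start: date, done: date) -> int:
    """Elapsed calendar days from TODO to Done: what the customer actually waits."""
    return (done - start).days


def adjusted_cycle_time(start: date, done: date, excluded: set) -> int:
    """Elapsed days, skipping weekends and any explicitly excluded dates."""
    days = 0
    current = start
    while current < done:
        # weekday() is 0-4 for Monday-Friday, 5-6 for the weekend
        if current.weekday() < 5 and current not in excluded:
            days += 1
        current += timedelta(days=1)
    return days


# Hypothetical Story: enters TODO on Monday 2022-06-13, lands in Done on Monday 2022-06-27.
start, done = date(2022, 6, 13), date(2022, 6, 27)
# Hypothetical holiday on Thursday 2022-06-16 plus the "bridge" Friday after it.
long_weekend = {date(2022, 6, 16), date(2022, 6, 17)}

print(calendar_cycle_time(start, done))                # 14
print(adjusted_cycle_time(start, done, set()))         # 10 (weekends removed)
print(adjusted_cycle_time(start, done, long_weekend))  # 8 (long weekend removed too)
```

Every added rule makes the reported number smaller, while the feature still reaches production 14 days after the Story was picked up.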

Measured financial success of a solution

I worked on a number of software solutions where, prior to the start of development, the Marketing team or a Product Manager was able to provide financial predictions. Based on the customer funnel and/or conversations with a number of customers, a realistic prediction could be provided in a form like the sample below:

  • feature roll-out: Q4 2022
  • Q1 2023 earnings: $X
  • Q2 2023 earnings: $Y
  • Q3 2023 earnings: $Z

Based on this prediction, we can say that the Company level goal is to achieve earnings of X + Y + Z over the first three quarters of 2023. The realized earnings are the metric to be tracked. But once the solution gets into the development cycle, it is easy to lose sight of this original goal. The development team or its management are likely to come up with their own goals or success metrics that are closer to the team's daily operation. The team understands these goals and metrics better and, in good faith, assumes that good team operation inevitably leads to the Company's success. Unfortunately, that is not true: the pathologies of goal substitution and autonomous goals have been researched deeply (in general, not specifically in the Software domain). The most ubiquitous pathology in Software is that, on a team / org level, the means of achieving Company goals are treated and established as team goals. Let's look at some examples of what is not a good goal or success metric here.

  1. The solution is delivered on time (in Q4) - quite often, if it is delivered a little bit later (e.g. Q1 2023) the total earnings for the first three quarters of the year may still be achieved. Managing the expectations of the very first Q1 customers can be enough of a remedy for being somewhat late. Being on time is a good condition (one of many) for maximizing the chances of success here, but it is not a good success metric.
  2. The solution is delivered within budget - for the sale success, it does not really matter if we are somewhat below or somewhat above budget (within reason).
  3. The solution is delivered 20% below the budget - if 20% of the development costs really plays a significant role here, then the financial rationale for developing the solution is questionable.
  4. The software solution has no more than 5 minor bugs at the time of the rollout (and zero bugs of severity higher than minor) - the real situation at the beginning of Q1 may be different and it may still not invalidate the 2023 earnings goal. The expectations of the very first customers can be managed through a transparent dialogue about open bugs and the bugs can be resolved and closed during Q1. The first quarter may also not be the highest peak of the predicted adoption, etc.
  5. The software solution has at least 80% unit test coverage - this is an example of an autonomous goal shift: the goal and the 80% metric are not related to the earnings goal at all.
  6. At least 80% of the planned content of each Sprint is Done at the end of the Sprint - this is asking for predictability. It is not asking for short cycle times, good prioritization or building quality in. It is again an example of an autonomous goal shift.

This is just an example. There are features and solutions for which this type of earnings prediction cannot be made. We are not always able to manage customers' expectations easily - e.g. in a large scale B2C market, we most likely cannot. And sometimes the cost of delay is much higher and more important than in this example. But the example illustrates the phenomena of:

  • Replacing the Company level goal with team / org level conditions and metrics, which arguably relate to the Company level goal and often need to be met to achieve it, but are not the Company level goal (example: deliver on time).
  • Autonomously shifting goals and metrics to team level operational metrics that hardly or in no way relate to the Company level goal (example: always deliver at least 80% of Sprint content).

The Company level goal and hence the success criterion of the solution in this example is nothing else but realizing the Q1, Q2 and Q3 earnings in 2023. "Oh, but you know, this is a lagging indicator" - well, tough luck. It is a lagging indicator, but it is real. I'd rather use a lagging indicator than replace it with leading but deceptive metrics such as "Deliver at least 80% of Story Points in each Sprint". Now, after going through the examples, let's try to compile some more general guidance.

Good Company level goals and metrics (examples)

Earnings realized over a given period of time    this one was explained above

Accreditation of a standardizing institution       to get an official certification or otherwise be able to claim that the product implements a particular standard (related to an industry, security, health & safety, environmental impact, etc.). The metric is often binary - the product must pass some type of verification - but some standards are divided into mandatory and optional parts, so there may be some flexibility there, depending on the overall Company goal.

Adoption of a solution in a new region        this is similar to Earnings realized... but here we speak specifically about opening sales of a product or service to a geographical region or a country. The means to achieve it often include localization and adherence to the country's or the region's law. The metric is likely to be the income related to the new group of customers envisioned to buy the product or service.

Increased use (by so many %) of a function of the solution which results in cost avoidance of $X per given period of time        this is about avoiding a cost, rather than about making money. Any function that can be realized digitally instead of through traditional processing (e.g. by a call center agent, paper forms, etc.) will likely fall into this category. The savings can usually be calculated quite precisely using the average wages of a call center agent and similar factors, and they become the metric.

Achieving the same or better feature set that competitive products A, B and C already have by developing features X, Y and Z        this is likely a goal tracked with a binary metric - either we have the same features as our competitors or not. It makes sense as a Company level goal if bids depend on having the same features as the competition.

Achieving parity with an existing solution that approaches EOL        again, a goal tracked with a binary metric, but there can be more flexibility, depending on which functions of an existing system will still be used and which will be ditched together with the system, for example because of a change in a business process that we are able to implement when the old system is gone.
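For the cost avoidance goal in the list above, the calculation can be sketched as below. This is only an illustration under simple assumptions: the function name, its parameters and every number in the example are hypothetical, and a real calculation would likely include more factors than the agent's wage.

```python
def cost_avoidance(requests_per_period: int,
                   digital_pct_before: float,
                   digital_pct_after: float,
                   agent_minutes_per_request: float,
                   agent_hourly_wage: float) -> float:
    """Cost avoided in one period by moving requests from agents to the digital function.

    Assumes every request not handled digitally is handled by a call center agent
    and that the agent's wage is the only cost involved (both are simplifications).
    """
    # Requests that moved from an agent to the digital function
    shifted_requests = requests_per_period * (digital_pct_after - digital_pct_before) / 100
    # Agent time that no longer has to be paid for
    agent_hours_saved = shifted_requests * agent_minutes_per_request / 60
    return agent_hours_saved * agent_hourly_wage


# Hypothetical quarter: 10,000 requests, digital use grows from 20% to 35%,
# an agent needs 6 minutes per request and costs $20 per hour.
saved = cost_avoidance(10_000, 20, 35, 6, 20.0)
print(saved)  # 3000.0
```

The result, in dollars per period, is the metric; the increase in use is only the means of getting there.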

I believe the teams (or organizations - like a group of teams, etc.) should be given the Company goals as their own goals to work towards. I'm not talking about Sprint goals here (that's a topic I'd like to discuss separately). Rather, I mean quarterly goals, goals for the next 6 months, the next 3 Product Increments... While treating the Company goals as their own goals, the teams can still use metrics at their own level, but unlike Company goals, these should not come from the top of the management structure. Some examples of team / org level metrics are listed below.

Team or org level metrics (examples)

Cycle time               to understand the average time it takes to process something through the development cycle. Can be on User Story level and all levels up (Features, Epics).

Work in progress        to understand the amount of in-process work (also for each status of a work item) and experiment with WIP limits to shorten the cycle time.

Story points / velocity    to plan the next Sprint based on previous results (the number of Stories can be used here as well, often with better outcome).
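Why would a WIP limit shorten the cycle time? One common way to reason about it is Little's Law, which the list above does not name: over the long run, in a stable system, average cycle time equals average work in progress divided by average throughput. The sketch below uses made-up numbers and assumes the throughput stays the same after the limit is introduced, which is a simplification.

```python
def average_cycle_time(avg_wip: float, throughput_per_week: float) -> float:
    """Little's Law: average cycle time (in weeks) = average WIP / average throughput."""
    if throughput_per_week <= 0:
        raise ValueError("throughput must be positive")
    return avg_wip / throughput_per_week


# Hypothetical team finishing 4 Stories per week
print(average_cycle_time(avg_wip=12, throughput_per_week=4))  # 3.0 weeks
# Same throughput after limiting WIP to 8 Stories
print(average_cycle_time(avg_wip=8, throughput_per_week=4))   # 2.0 weeks
```

This is a tool for the team's own experiments with its process; it is not a target to be set from above.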

Bottom line

We started with a toy example of limiting the laptop use, where limiting was the goal and elapsed time was the metric. The ways my kids tried to bend the metric also occur in the software industry with other metrics, like cycle time. Extensive research has already been done on goals in the field of Organizational Theory: the organizational ability to achieve goals and the pathological phenomena around goal definition. Goals are often replaced by lower-level, helper goals, or by autonomous goals invented at a team or organization level. Both phenomena are harmful and don't bind teams to Company level goals. While some typical team level metrics, like velocity, can be helpful, they should never become team goals themselves.

Pictures used:

[1] Author's own picture of a voltmeter.
