That's not "real" scrum: Measuring and managing productivity for development teams

People always seem confused by the title of this: "What does scrum have to do with measuring productivity?" they ask. And I smile contentedly, because that's the whole point.

Scrum is supposed to be a system for managing product work, iterating and delivering value to the customer. What usually winds up happening is scrum gets used for the management of software development work as a whole, from decisions about promotion to hiring and firing to everything else. That's not what scrum is designed to do, and it shows.

Now, I love to talk about process improvement, completely agnostic of whatever process framework you're using. I would much rather have a discussion about the work you're doing and what blockers you're hitting rather than discussing abstract concepts.

However, if you keep running into the same issues and blockers over and over again, it's usually worth examining your workflows to find out if you're actually applying the theory behind your framework to the actual work you're doing. The concept of Agile specifically is not about the processes involved, but you need to know and understand the rules before you should feel comfortable breaking them.


I want to start with a quick overview of a few key terms to make sure everyone's on the same page.

  • Waterfall

    In waterfall development, every item of work is scheduled out in advance. This is fantastic for management, because they can look at the schedule to see exactly what should be worked on, and have a concrete date by which everything will be done.

    This is horrible for everyone, including management, because the schedule is predicated upon developers being unerring prophets who are able to forecast not only the exact work that needs to be done to develop a release, but also the exact amount of time said work will take.

    The ultimate delimiter of the work to be done is the schedule - usually there’s a specific release date (hopefully but not always far out enough to even theoretically get all the work done); whatever can get done by that date tends to be what’s released.

    Waterfall also suffers greatly because it’s completely inflexible. Requirements are gathered months ahead of time; any changes require completely reworking the schedule, so changes are frowned upon. Thus, when the product is released, it’s usually missing features that would have been extremely beneficial to have.

  • Agile

    Agile can be viewed as a direct response to waterfall-style development; rather than a rigid schedule, the agile approach embraces iteration and quick releases. The three primary “laws” of agile are:

    • Law of the customer - The customer is the number one priority. Rather than focusing on hitting arbitrary milestones or internal benchmarks, agile teams should be focused on delivering products to customers that bring them additional value. A single line of code changed can be more worthwhile than an entirely new product if that line brings extra value to the customer.

    • Law of small teams - Developers are grouped into small teams that are given autonomy in how they implement the features they’re working on. When work is assigned to a team, it’s not done so prescriptively. In the best agile teams, the assignment is, “Here’s the problem we have, go solve it.”

    • Law of the network - There are differing interpretations of how to implement this, but essentially I view the law of the network as “the whole organization has to buy in to what the agile teams are doing.” The entire organization doesn’t need to have the same structure as the agile teams, but neither can it be structured in a manner antithetical to their processes or outcomes.
      The easiest counterexample is a CTO who feels entitled, by virtue of their title, to step in and change stories or contribute code on a whim, overriding the team. Basically, the law of the network means “respecting the agile method, even if you’re not directly involved.”

    It’s worth noting that agile is a philosophy, not a framework in and of itself. Both kanban and scrum are implementations of the agile philosophy.

  • Kanban

    This is usually the most confusing, because both scrum and kanban can use kanban boards (the table of stories, usually denoted by physical or virtual “post-its” that represent the team’s work). A kanban board splits work up into different “stages” (e.g., to-do, doing, done) and provides a visual way to track the progression of stories.

    The primary difference between scrum and kanban as product development methodologies is that kanban does not have specific “sprints” of work - the delimiter of work is how many items are in a given status at a given time. For example, if the team limits “doing” to four cards and there are already four cards in there, no more can be added until one is moved along to the next stage (usually this means developers will pair or mob on a story to get it through).

  • Scrum

    Scrum, by contrast, delimits its work by sprints. A sprint is the collection of work the team feels is necessary to complete to deliver value. Sprints can be variable in length (though in practice, they tend to be a fixed length, which causes its own issues).

    Scrum requires each team to have at least two people - a product owner and a scrum master. Usually there are also developers, QA and devops people on the team as well, but at a minimum you need the PO and SM.

    The product owner has the vision for what the product should be - they should be in constant contact with customers, potential customers and former customers to figure out how value can be added. The scrum master’s job is to be the sandpaper for the developers - not (as the name implies) their manager or boss, but a facilitator for ceremonies and a source of coaching/guidance on stories and blockers.
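Going back to kanban’s WIP limits for a moment: the rule that a full column blocks new work can be sketched in a few lines of code. This is a minimal illustration under made-up names (`Board`, `WipLimitExceeded`), not any real kanban tool’s API:

```python
# Minimal sketch of a kanban board with a WIP limit on "doing".
# All class and column names here are hypothetical, purely for illustration.

class WipLimitExceeded(Exception):
    pass

class Board:
    def __init__(self, wip_limits):
        # e.g. {"doing": 4} caps in-progress cards at four
        self.wip_limits = wip_limits
        self.columns = {"todo": [], "doing": [], "done": []}

    def move(self, card, src, dst):
        limit = self.wip_limits.get(dst)
        if limit is not None and len(self.columns[dst]) >= limit:
            # The team must finish (or pair/mob on) an in-flight card first
            raise WipLimitExceeded(f"'{dst}' already holds {limit} cards")
        self.columns[src].remove(card)
        self.columns[dst].append(card)

board = Board(wip_limits={"doing": 4})
board.columns["todo"] = [f"story-{i}" for i in range(6)]
for card in list(board.columns["todo"])[:4]:
    board.move(card, "todo", "doing")
# A fifth move into "doing" raises WipLimitExceeded until a card
# moves along to "done" - which is exactly what forces pairing/mobbing.
```

The point of the exception rather than a queue is that the limit is a forcing function: the system refuses to let work pile up invisibly.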

Other reasons processes fail

I will note that a lot of the reasons I will list below may also apply to other product management methodologies; however, I’m specifically limiting the scope to how they impact scrum teams.

Lack of product vision

I don’t want to lay the blame entirely on product owners for this issue - very often the problem is with how the role is designed and hired for. Product owners should be the final arbiters of product decisions. They should absolutely consult design, UX and customer service experts for their opinions, but the decision ultimately lies with them.

Unfortunately, the breadth of skills required to be a good product owner is not in abundant supply, and product owners are, bafflingly, often considered afterthoughts at many organizations.

More than specific skills, though, product owners need to have a vision for what the product could be, as well as the flexibility to adapt that vision when new information comes in. Usually, this requires domain knowledge (which can be acquired, but must be acquired systematically and quickly upon hiring), steadfastness of conviction and the ability to analyze data properly to understand what customers want.

Far too often, product owners essentially turn into feature prioritizers, regurgitating nearly everything customers say they want and assigning a ranking to it. This often comes at the expense of both the product’s conceptual integrity and relationships with developers, who are supposed to be given problems to solve, not features to develop. This is the classic feature factory trap.

Mistake the rules for the reason

Far too often, people will adopt the ceremonies or trappings of scrum without actually accepting an agile mindset. This is where my favorite tagline, “that’s just waterfall with sprints” comes from.

If you’ve ever started a project by first projecting and planning how long it’s going to take you to deliver a given set of features, congratulations, you’re using waterfall.

To use scrum, you need to adopt an iterative mindset toward how you view your product. If you’re developing Facebook, you don’t say, “we’re going to build a system that allows you to have an activity feed that shows posts from your friends, groups and advertisers, and have an instant messaging product, and …”

Instead, you’d say, “we’re going to develop a platform that helps people connect to one another.” Then you’d figure out the greatest value you can add in one sprint (e.g., users can create profiles and upload their pictures). You know once you have profiles you’ll probably want the ability to post on others’ profiles, so that’s in the backlog.

That’s it. That’s the planning you do. Because once those releases get into customers’ hands, you’ll then have better ideas for how to deliver the next increment of value.

Simply because an organization has “sprints” and a “backlog” and does “retros” doesn’t mean it’s using scrum; it means it’s using the language of scrum.

Lack of discipline/iteration

Tacking on to the last point, not setting up your team for success in an agile environment can doom the product overall. Companies tend to like hiring more junior developers because they’re cheaper, without realizing that a junior developer is not just a senior developer working at 80% speed. Junior developers need mentoring and code reviews, and those things take time. If the schedule is not set up to allow that necessary training and those code quality checks to happen, the product will suffer overall.

Similarly, development teams are often kept at a far remove from everyday users and their opinions/feedback. While I by no means advocate a direct, open firehose of feedback, some organizations never let their devs see actual users using the product, which breaks the feedback loop from a UX and product design perspective.

Properly investing in the team and the processes is essential to any organization, but especially one that uses scrum.

Lack of organizational shift

The last ancillary reason I want to talk about in terms of scrum failure is a failure to align the organization with the teams that are using scrum (we’re back to the law of the network here). Scrum does not just rely on the dev team buying in; it also requires the larger organization to at least respect the principles of scrum for the team.

The most common example of this I see is when the entire dev department is using scrum, but the CTO still feels (by virtue of their title) the ability to step in and make changes or contribute code or modify stories on a whim. Just because the CTO is the manager doesn’t mean they have full control over every decision. Removing the team’s autonomy violates the fundamental principles of scrum, and usually indicates there will be other issues as well (and I guarantee that CTO will also be mad when the sprint is now unbalanced or work doesn’t get done, even though they’re the direct cause).

No. 1 reason scrum fails: It’s used for other purposes

By far, the biggest reason I see scrum failing to deliver is when the ceremonies or ideas or data generated by scrum gets used for something other than delivering value to the end users.

It’s completely understandable! Management broadly wants predictability, the ability to schedule a release months out so that marketing and sales can create content and be ready to go.

But that’s not how scrum works. Organizations are used to being able to dictate schedules for large releases of software all at once (via waterfall), and making dev deliver on those schedules. If you’re scheduling a featureset six months out, it’s almost guaranteed you’re not delivering in an agile manner.

Instead of marketing-driven development, why not flip the script and have development-driven marketing? There is absolutely no law of marketing that says you have to push a new feature the second it’s generally available. If the marketing team keeps up with what’s being planned a sprint in advance, that means they’d typically have at least a full month of lead time to prepare materials for release.

Rather than being schedulable, what dev teams should shoot for is reliability and dependability. If the dev team commits to solving an issue in a given sprint, it’d better be done within that sprint (within reason). If it’s not, it’s on the dev team to improve its process so the situation doesn’t happen again.

But why does scrum get pulled off track? Most often, it’s because data points in scrum get used to mean something else.


The two hardest problems in computer science are estimates, naming things, and zero-based indexes. Estimates are notoriously difficult to get right, especially when developing new features. Estimates get inordinately more complex when we talk about story pointing.

Story points are a value assigned to a given story. They are supposed to be relative to other stories in the sprint - e.g., a 2 is bigger than a 1, or a medium is bigger than a small, whatever. Regardless of the scale you’re using, a story point is supposed to be a measure of the story’s complexity, for prioritization purposes only.

Unfortunately, what usually winds up happening is teams adopt some sort of translation scale (either direct or indirect), something like 1 = finish in an afternoon, 2 = finish in a day, 3 = multiple days, 5 = a week, etc. But then management wants to make sure everyone is pulling their fair share, so people are gently told that 10 is the expectation for the number of points they should complete in a two-week sprint, and now we are completely off the rails.

Story points are not time estimates. Full stop.

It’s not a contract, you’re not a traffic cop trying to make your quota. Story points are estimates of the complexity of a story for you to use in prioritization. That’s it.

I actually dislike measuring sprint velocity sprint-to-sprint, because I don’t think it’s helpful in most cases. It actually distorts the meaning of a sprint. Remember, sprints are supposed to be variable in length; if your increment of value is small, have a small sprint. But because sprint review and retro have to happen every second Friday, sprints have to be two weeks. Because the sprint is two weeks, now we have two separate foci, and the scrum methodology drifts further and further away.

Campbell’s law is one of my favorite axioms. Paraphrased, it states:

The more emphasis placed on a metric, the more those being measured will be incentivized to game it.

In the case above, if developers are told they should be getting 10 points per sprint, suddenly their focus is no longer on the customer. It’s now on the number of story points they have completed. They may be disincentivized to pick up larger stories, fearing they might get bogged down. They’re almost certainly going to overestimate the complexity of stories, because now underestimates mean they’re going to be penalized in terms of hitting their targets.

This is where what I call the Concilio Corollary (itself a play on the uncertainty principle) comes into play:

You change the outcome of development by measuring it.

It’s ultimately a question of incentives and focus. If you start needing to worry about metrics other than “delivering value to the user,” then your focus drifts from same. This especially comes into play when organizations worry about individual velocity.

I don’t believe in the practice of “putting stories down” or “pick up another story when slightly blocked.” If a developer is blocked, it’s on the scrum master and the rest of the team to help them get unblocked. But I absolutely understand the desire to do so if everybody’s expected to maintain a certain momentum, and other people letting their tasks lie to help you is detrimental to their productivity stats. How could we expect teamwork to flourish in such an environment?

So how do we measure productivity?

Short answer: don’t.

Long answer: Don’t measure “productivity” as if it’s a value that can be computed from a single number. Productivity on its own is useless.

I used to work at a college of medicine, and after a big website refresh they excitedly reported how many pageviews the new site was getting. And it makes sense, because when we think of web analytics, we think pageviews and monthly visitors and time on site, all that good stuff.

Except … what’s the value of pageviews to a college? They’re not selling ads, where more views works out to more money. In fact, the entire point of the website was to get prospective students to apply. So rather than track “how many people looked at this site,” what they should have been doing was looking at “how many come to this site and then hit the ‘apply now’ button,” and comparing that to the previous incarnation.
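The pageviews-vs.-applications distinction is easy to make concrete in code. This sketch assumes a flat list of analytics events with invented event names (`pageview`, `apply_click`) and made-up numbers, purely to illustrate why the conversion rate, not the view count, is the metric that answers the actual question:

```python
# Raw traffic vs. conversions: a toy comparison of two site versions.
# Event names and counts are invented for illustration.

def conversion_rate(events):
    """Fraction of pageviews that led to an 'apply now' click."""
    views = sum(1 for e in events if e["type"] == "pageview")
    applies = sum(1 for e in events if e["type"] == "apply_click")
    return applies / views if views else 0.0

old_site = [{"type": "pageview"}] * 1000 + [{"type": "apply_click"}] * 20
new_site = [{"type": "pageview"}] * 5000 + [{"type": "apply_click"}] * 25

# The new site has 5x the pageviews of the old one...
print(conversion_rate(old_site))  # 0.02  (2% of visitors applied)
print(conversion_rate(new_site))  # 0.005 (0.5% applied)
# ...but by the metric that matters, it performs worse.
```

The comparison between incarnations, not either number in isolation, is what tells you whether the refresh moved the needle.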

First, you need to figure out what the metrics are being used for. There are any number of different reasons you might want to measure “productivity” on a development team. Some potential reasons include performance reviews, deciding who to lay off, justifying costs, figuring out where/whether to invest more, or fixing issues on the development team.

But each of those reasons has a completely different dataset you should be using to make that decision. If you’re talking about performance reviews, knowing the individual velocity of a developer is useless. If it’s a junior, taking on a 5-point story might be a huge accomplishment. If you’re looking at a principal or a senior, you might actually expect a lower velocity, because they’re spending more time pairing with other developers to mentor them or help them get unblocked.

Second, find the data that answers the question. When I worked at a newspaper, we used to have screens all over the place that showed how many pageviews specific articles were getting. Except, we didn’t sell ads based on total pageviews. We got paid a LOT of money to sell ads to people in our geographical area, and a pittance for everything else. A million pageviews usually meant we had gone viral, but most of those hits were essentially worthless to us. To properly track and incentivize for best return, we should have been tracking local pageviews as our primary metric.

Similarly, if you’re trying to justify costs for your development team, just throwing the sprint velocity out there as the number to look at might work at the beginning, but that now becomes the standard you’re measured against. And once you start having to maintain features or fix bugs, those numbers are going to go down (it’s almost always easier to complete a high-point new-feature story than a high-point maintenance story, simply because you don’t have to understand or worry about as much context).

There are a number of newer metric frameworks that have been proposed as standards for dev teams; the two most prominent are SPACE and DORA. I don’t have an inherent problem with most of these metrics, but I do want to caution against adopting them wholesale as a replacement for sprint velocity. Instead, carefully consider what you’re trying to use the data for, then select the metrics that provide that data. Please note that these are not all individual metrics; some (such as “number of handoffs”) are team-based.

SPACE breaks developer productivity into five dimensions:

• Satisfaction and well-being

◦ This involves developer satisfaction surveys, analyzing retention numbers, things of that nature. Try to quantify how your developers feel about their processes.

• Performance

◦ This might include some form of story points shipped, but would also include things like number and quality of code reviews.

• Activity

◦ Story points completed, frequency of deployments, code reviews completed, or amount of time spent coding vs. architecting, etc.

• Communication/collaboration

◦ Time spent pairing, writing documentation, Slack responses, on-call/office hours

• Efficiency/flow

◦ Time to get code reviewed, number of handoffs, time between acceptance and deployment


DORA (DevOps Research and Assessment) metrics are mostly team-based. They include:

• Frequency of deployments

• Time between acceptance and deployment

• How frequently deployments fail

• How long it takes to recover/restore from a failed deployment
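As a rough illustration, three of the four DORA metrics can be computed from a simple deployment log. The log format below is invented (real tooling would pull this from your CI/CD system), and the acceptance-to-deployment lead time is omitted because it would require acceptance timestamps as well:

```python
from datetime import datetime

# Hypothetical deployment log: (timestamp, succeeded,
# minutes_to_restore when the deploy failed). Format and numbers
# are made up purely for illustration.
deploys = [
    (datetime(2024, 1, 1), True, None),
    (datetime(2024, 1, 3), False, 45),
    (datetime(2024, 1, 5), True, None),
    (datetime(2024, 1, 8), False, 15),
    (datetime(2024, 1, 10), True, None),
]

days_observed = (deploys[-1][0] - deploys[0][0]).days or 1

deploy_frequency = len(deploys) / days_observed      # deploys per day
failures = [d for d in deploys if not d[1]]
change_failure_rate = len(failures) / len(deploys)   # fraction that fail
mean_time_to_restore = sum(d[2] for d in failures) / len(failures)  # minutes

print(round(deploy_frequency, 2), change_failure_rate, mean_time_to_restore)
```

Note that all three numbers describe the team and its pipeline, not any individual developer, which is exactly why DORA resists being misused the way individual velocity is.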

Focus on impact

But all of these metrics should be secondary, because the primary purpose of a scrum team is to deliver value. Thus, the primary metrics should measure the direct impact of the work: How much value did we deliver to customers?

This can be difficult to ascertain! It requires a lot of setup and analysis around observability, but these are things that a properly focused scrum team should already be doing. When the dev team is handed a story for a new feature, one factor of that story should be a success criterion: e.g., at least 10% of active users use this feature in the first 10 days. That measurement should be what matters most. And failing to meet that mark doesn’t mean the individual developer failed; it means some underlying assumption (whether it’s discoverability or user need) is flawed and should be corrected for the next set of iterations.
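A success criterion like the one above is straightforward to check once usage events are captured. This sketch assumes you can pull two sets of user IDs from your analytics store (the names and counts are hypothetical):

```python
# Checking a success criterion: did >= 10% of active users try the
# feature in its first 10 days? User IDs and counts are invented.

def adoption_rate(feature_users, active_users):
    """Fraction of active users who used the feature at least once."""
    if not active_users:
        return 0.0
    return len(feature_users & active_users) / len(active_users)

active_users = {f"user-{i}" for i in range(200)}   # active during the window
feature_users = {f"user-{i}" for i in range(15)}   # used the new feature

rate = adoption_rate(feature_users, active_users)
meets_criterion = rate >= 0.10
print(rate, meets_criterion)  # 0.075 False
# Below target: revisit discoverability or the underlying user-need
# assumption, then iterate - don't blame the developer who shipped it.
```

The intersection matters: counting every user who ever touched the feature, rather than active users who did, would quietly inflate the rate.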

It comes down to outcome-driven development vs. feature-driven development. In scrum, you should have autonomous teams working to build solutions that provide value to the customer. That also includes accountability for the decisions that were made, and a quick feedback loop coupled with iteration to ensure that quality is being delivered continuously.


In summation, these are the important bits:

• Buy in up and down the corporate stack - structure needs to at least enable the scrum team, not work against it

• Don’t estimate more than you need to, and relatively at that

• Know what you’re measuring and why

Now, I know individual developers are probably not in a position to take action at the level of “stop using metrics for the wrong reasons.” That’s why I have a set of individual takeaways you can use.

• Great mindset for performance review

◦ I am a terrible self-promoter, but keeping in mind the value I was creating made it easy for me come promotion time to say, “this is definitively what I did and how I added value to the team.” It made it much easier for me than trying to remember what specific stories I had worked on or which specific ideas were mine.

• Push toward alignment

◦ Try to push your leaders into finding metrics that answer the questions they’re actually asking. You may not be able to get them to abandon sprint velocity right off the bat, but the more people see useful, actionable metrics the less they focus on useless ones.

• Try to champion customer value

◦ It’s what scrum is for, so using customer value as your North Star usually helps cut through confusion and disagreement.

• Get better at knowing what you know / don't know

◦ This is literally the point of sprint retros, but sharing understanding of how the system works will help your whole team to improve the process and produce better software.

Since this post is long enough on its own, I also have a separate post from when I gave this talk in Michigan of questions people asked and my answers.

That's not REAL waterfall, it's just a babbling brook on a hill.