This is a summary of “How to Measure” in “DevEx: What Actually Drives Productivity”

Use both perceptual and systemic measurements

Perceptual metrics are self-reported and describe how developers feel (see “DevEx is both about tools and perceptions” above for why this matters). They measure human attitudes and opinions.

Workflow metrics measure systems and processes (e.g. the time it takes to build a target).

You need both to understand DevEx. For example:

  • You might have quick code reviews that still feel disruptive because review requests are pushed to developers invasively
  • Developers may feel satisfied with the build process, but measured build times may show that feedback loops could be quicker

[Table from the paper: DevEx metrics organized into three categories (Perceptions: attitudes and opinions; Workflows: system behaviors; KPIs: key metrics), with metrics grouped by Feedback Loops, Cognitive Load, and Flow State.]

Use surveys to measure perceptual data

Surveys are hard to design well, but they let you collect data quickly and, when designed correctly, establish accurate baselines.

When using surveys it’s important to:

  • Break down results by team and persona, because DevEx is highly contextual and varies dramatically by team and role
    • E.g. if you focus on aggregate results you may overlook important problems for a subset of devs, like mobile devs (see the sketch after this list)
  • Compare results against benchmarks
    • Other teams/peers in the company
    • Sentiment scores in other companies
  • Mix in transactional surveys (short surveys triggered by a specific interaction, e.g. right after a code review or a tool run, rather than on a fixed schedule)
  • Try to consult with survey development experts
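
To illustrate the first point, here is a minimal Python sketch of breaking results down by team and persona instead of reporting a single aggregate. It assumes survey results are exported as one row per respondent; the column names (team, persona, devsat_score) and the data are hypothetical.

```python
import pandas as pd

# Hypothetical survey export: one row per respondent with a 1-5
# satisfaction score. Column names are illustrative assumptions.
responses = pd.DataFrame({
    "team":         ["payments", "payments", "mobile", "mobile", "infra"],
    "persona":      ["backend", "backend", "mobile", "mobile", "sre"],
    "devsat_score": [4, 5, 2, 3, 4],
})

# The aggregate mean hides the mobile devs' pain...
print("Overall DevSat:", responses["devsat_score"].mean())

# ...while a per-team/persona breakdown surfaces it.
breakdown = (
    responses
    .groupby(["team", "persona"])["devsat_score"]
    .agg(["mean", "count"])
    .sort_values("mean")
)
print(breakdown)
```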

Example metrics

This is a generic proposal of metrics that most companies could collect to measure DevEx, based on the above and on Google’s model for measuring developer productivity (speed, ease, quality).

Speed

| Metric | Measurement | Type |
| --- | --- | --- |
| Build time | Time in seconds for a target to complete (P50 and P90), for both local and remote execution | Workflows |
| Post-commit CI | Time in minutes for each commit to get through the CI pipeline (P50/P90) | Workflows |
| Revisions that fail to land | Number of approved revisions that fail to land | Workflows |
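
Several metrics above and below report P50/P90. As a minimal sketch, assuming you already collect raw per-build durations (e.g. from CI logs), the percentiles can be computed with the standard library:

```python
import statistics

def p50_p90(samples: list[float]) -> tuple[float, float]:
    """Return the 50th and 90th percentiles of a list of durations."""
    # statistics.quantiles with n=10 returns the 9 decile cut points;
    # index 4 is the 50th percentile and index 8 is the 90th.
    deciles = statistics.quantiles(samples, n=10)
    return deciles[4], deciles[8]

# Hypothetical build durations for one target, in seconds.
build_times = [42.0, 38.5, 55.2, 61.0, 40.1, 39.9, 120.4, 44.3, 47.0, 43.2]
p50, p90 = p50_p90(build_times)
print(f"Build time: P50={p50:.1f}s P90={p90:.1f}s")
```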

Ease

| Metric | Measurement | Type |
| --- | --- | --- |
| Onboarding | Business hours to 1st and 10th PR for all new hires (P50/P90) | Workflows |
| DevSat score | Satisfaction with working as a developer | Perceptual |
| Ease of code commit | Self-reported ease of committing code | Perceptual |
| Ease of code review | Self-reported ease of reviewing code | Perceptual |
| Perceived productivity | Self-reported productivity | Perceptual |
| Tool discoverability | % of engineers using a tool in the last month | Workflows |
| Runbook quality | Self-reported perception of runbook quality | Perceptual |
| Information discoverability | Self-reported ease of finding the information a dev needs | Perceptual |
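
The onboarding metric needs a notion of elapsed business hours. A minimal sketch, assuming a Monday-Friday, 09:00-17:00 workday (the timestamps and the workday definition are illustrative assumptions); per-hire results can then be fed into a percentile helper like the one above:

```python
from datetime import datetime, timedelta

def business_hours_between(start: datetime, end: datetime) -> float:
    """Count hours falling on Mon-Fri between 09:00 and 17:00.
    For simplicity, assumes `start` is on an hour boundary."""
    hours = 0.0
    cursor = start
    while cursor < end:
        if cursor.weekday() < 5 and 9 <= cursor.hour < 17:
            hours += 1.0
        cursor += timedelta(hours=1)
    return hours

hired = datetime(2023, 5, 1, 9, 0)      # a Monday morning
first_pr = datetime(2023, 5, 3, 15, 0)  # the Wednesday afternoon after
print(business_hours_between(hired, first_pr), "business hours to 1st PR")
```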

Quality

| Metric | Measurement | Type |
| --- | --- | --- |
| Test determinism | % likelihood that a test suite does not flake | Workflows |
| Change failure rate | Number of bugs per minor and patch release | Workflows |
| Change failure rate | Number of post-merge failures per deployment | Workflows |
| Change failure rate | Number of P0s per deployment | Workflows |
| Time to resolve an incident | P50 and P90 of the number of business days taken to close an incident | Workflows |
| Size of PRs | P50 and P90 of PR size | Workflows |
| Freshness of dependencies | % of dependencies that are on their latest version | Workflows |
| Freshness of dependencies | P50 and P90 of the time since each dependency was last upgraded | Workflows |
| Freshness of documentation pages | % of unarchived documentation pages edited in the last 6 months | Workflows |
| Freshness of runbooks | % of runbooks edited in the last 6 months | Workflows |
| Testing | % of diffs that land with at least some tests | Workflows |
| Code health | Perceived code health | Perceptual |
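
For test determinism, one simple proxy is to rerun the suite repeatedly on an unchanged, known-green commit and measure the pass rate; anything below 100% indicates flakiness. A minimal sketch (the test command is an illustrative assumption):

```python
import subprocess

def non_flake_rate(command: list[str], runs: int = 20) -> float:
    """Rerun a test suite on an unchanged commit and return the
    share of runs that pass. 1.0 means no flakes were observed."""
    passes = 0
    for _ in range(runs):
        result = subprocess.run(command, capture_output=True)
        if result.returncode == 0:
            passes += 1
    return passes / runs

# Hypothetical invocation; substitute your real test command.
rate = non_flake_rate(["pytest", "tests/"], runs=20)
print(f"Test determinism: {rate:.0%} of runs passed")
```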