What Metrics Actually Matter in the AI Era

Engineering organizations have always struggled with metrics.

Leaders want visibility.
Managers want ways to measure productivity.
Teams want to understand whether they are improving.

Over time, many ways of measuring engineering work emerged.

Story points.
Velocity charts.
Burndown graphs.
Ticket completion rates.

These metrics attempt to quantify progress.

But many of them end up measuring activity rather than outcomes.

Teams appear busy.
Tickets are closed.
Dashboards look healthy.

Yet an important question often remains unanswered.

Is the system actually improving?

The Problem with Measuring Activity

Many traditional engineering metrics focus on effort.

How many tasks were completed?
How many story points were delivered?
How many tickets were closed?

These numbers create the appearance of productivity.

But they rarely reflect the real health of the system.

A team can close many tickets while gradually introducing instability.

They can deliver large amounts of work while making the system harder to understand and maintain.

In fast-moving engineering environments, activity metrics become even less meaningful.

This is especially true in an AI-driven development environment.

When generating code becomes easier, measuring activity becomes almost irrelevant.

The real question is no longer:

How much work did we do?

The real question becomes:

Did the system improve?

A New Risk in the AI Era

AI dramatically accelerates how software can be created.

Large portions of implementation can now be generated quickly.

Developers can build features faster.
Iteration cycles shrink.
New components appear more frequently.

This acceleration improves productivity, but it also introduces a new risk.

Systems can evolve faster than teams fully understand them.

Architectural boundaries may slowly erode.
Dependencies may multiply.
Codebases may grow faster than the team’s ability to reason about them.

This is not a failure of AI.

It is a natural consequence of accelerated development.

When the cost of writing code decreases, the risk of structural complexity increases.

Why Maintainability Becomes Critical

For this reason, engineering teams must pay attention to something that traditional metrics often overlook.

System maintainability.

Maintainability reflects how easily a system can evolve over time.

Can engineers understand the structure of the system?
Are architectural boundaries clear?
Can new features be added without unexpected side effects?

A healthy system allows engineers to move confidently.

Changes remain predictable.
Debugging remains manageable.
The system continues to evolve smoothly.

But when maintainability deteriorates, development eventually slows down.

Features become harder to implement.
Changes introduce unintended consequences.
Debugging becomes increasingly complex.

In the AI era, protecting maintainability becomes even more important, because systems can grow faster than ever before.

Measuring Maintainability

Maintainability is harder to measure than deployment speed or failure rates.

Metrics such as deployment frequency or recovery time can be observed directly.

Maintainability is different.

It reflects how engineers experience the system while working with it.

Although there is no single perfect number, engineering teams can still measure maintainability through a set of practical signals.

One useful signal is change scope.

In a well-structured system, most changes remain localized.

A small feature should not require modifications across many services or modules.

If engineers repeatedly need to modify multiple components to implement simple changes, it often indicates weakening architectural boundaries.

Teams can track the average number of components affected per change to detect this pattern.
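As a rough illustration, change scope can be computed from the list of files touched by each change. The sketch below assumes that the first path segment identifies the component, which is a hypothetical convention; the function names and sample paths are also invented for the example.

```python
def component_of(path: str) -> str:
    """Assumption: the first path segment names the component."""
    return path.split("/", 1)[0]


def components_touched(changed_files: list[str]) -> int:
    """Count the distinct components affected by one change."""
    return len({component_of(path) for path in changed_files})


def average_change_scope(changes: list[list[str]]) -> float:
    """Average number of components affected per change."""
    scopes = [components_touched(files) for files in changes]
    return sum(scopes) / len(scopes)


# Hypothetical sample data: each inner list is the file set of one change.
changes = [
    ["billing/invoice.py", "billing/tax.py"],                       # 1 component
    ["billing/invoice.py", "auth/session.py", "search/index.py"],   # 3 components
    ["auth/session.py", "search/query.py"],                         # 2 components
]

print(average_change_scope(changes))
```

A single value says little on its own. The trend over successive weeks or releases is what indicates whether boundaries are holding.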

Another signal is debugging time.

Healthy systems allow engineers to diagnose problems quickly.

If production incidents take longer and longer to understand and resolve, it may indicate that the system has become too complex.

Tracking the time required to identify the root cause of issues provides insight into system clarity.
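One possible way to track this is to record when each incident was detected and when its root cause was identified, then compare the typical diagnosis time month by month. The sketch below assumes ISO-formatted timestamps and uses the median to limit the influence of outliers; the data and function names are hypothetical.

```python
from datetime import datetime
from statistics import median


def diagnosis_minutes(detected: str, root_cause_found: str) -> float:
    """Minutes between detecting an incident and identifying its root cause."""
    delta = datetime.fromisoformat(root_cause_found) - datetime.fromisoformat(detected)
    return delta.total_seconds() / 60


def median_diagnosis_by_month(incidents: list[tuple[str, str]]) -> dict[str, float]:
    """Group incidents by detection month (YYYY-MM) and take the median per month."""
    by_month: dict[str, list[float]] = {}
    for detected, found in incidents:
        by_month.setdefault(detected[:7], []).append(diagnosis_minutes(detected, found))
    return {month: median(values) for month, values in sorted(by_month.items())}


# Hypothetical incident log: (detected, root cause found).
incidents = [
    ("2024-01-05T10:00", "2024-01-05T10:40"),
    ("2024-01-19T14:00", "2024-01-19T15:00"),
    ("2024-02-02T09:00", "2024-02-02T10:30"),
    ("2024-02-20T16:00", "2024-02-20T18:30"),
]

print(median_diagnosis_by_month(incidents))
```

A median that rises over several months is the pattern to watch for, rather than any single slow incident.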

A third signal is test stability.

Well-structured systems tend to produce stable automated tests.

If unrelated code changes frequently break existing tests, it often suggests tight coupling between system components.

Frequent test instability may signal declining maintainability.
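A simple approximation of this signal is the share of test runs in which a failing test belongs to a component the change did not touch. The sketch below assumes each run records which components were changed and which components own the failing tests; that data model is a hypothetical one chosen for illustration.

```python
def unrelated_failure_rate(runs: list[tuple[set[str], set[str]]]) -> float:
    """Share of runs where at least one failing test lies outside the changed components.

    Each run is (changed_components, components_with_failing_tests).
    """
    unrelated = sum(1 for changed, failed in runs if failed - changed)
    return unrelated / len(runs)


# Hypothetical CI history.
runs = [
    ({"billing"}, set()),                  # all tests passed
    ({"billing"}, {"billing"}),            # failure inside the changed component
    ({"auth"}, {"search"}),                # failure in an unrelated component
    ({"search"}, {"search", "billing"}),   # related and unrelated failures
]

print(unrelated_failure_rate(runs))
```

Failures inside the changed component are expected during development. It is the failures elsewhere that hint at coupling.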

Finally, engineering teams can include a team maintainability rating.

Periodically, engineers evaluate the maintainability of the system on a simple scale, for example from 1 to 5.

They consider questions such as:

How easy is it to understand the system architecture?
How confident do engineers feel when making changes?
How predictable are system behaviors when modified?

Although this rating is subjective, it captures an important dimension of system health that automated metrics cannot fully represent.
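The rating can be summarized with very little tooling. The sketch below assumes a 1-to-5 scale and one score per engineer for each of the three questions above; the question keys and the sample responses are hypothetical.

```python
# Assumed question keys, mirroring the three questions in the text.
QUESTIONS = ("understanding", "confidence", "predictability")


def rating_summary(responses: list[dict[str, int]]) -> dict[str, float]:
    """Average score per question across all respondents."""
    return {
        question: sum(r[question] for r in responses) / len(responses)
        for question in QUESTIONS
    }


def weakest_area(summary: dict[str, float]) -> str:
    """The question with the lowest average score."""
    return min(summary, key=summary.get)


# Hypothetical survey results from three engineers (scale 1 to 5).
responses = [
    {"understanding": 4, "confidence": 3, "predictability": 2},
    {"understanding": 2, "confidence": 3, "predictability": 2},
    {"understanding": 3, "confidence": 3, "predictability": 2},
]

summary = rating_summary(responses)
print(summary)
print(weakest_area(summary))
```

Keeping the per-question averages, rather than one blended score, makes it easier to see which aspect of the system engineers are losing confidence in.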

Together, these signals provide a practical way to observe maintainability over time.

The DORA Metrics

While maintainability reflects the long-term health of the system, engineering organizations also need metrics that measure delivery performance.

The DevOps Research and Assessment (DORA) program introduced a widely adopted set of operational metrics known as the DORA metrics.

These metrics focus on how software moves from development into production.

Four metrics became widely used.

Deployment Frequency
Lead Time for Changes
Change Failure Rate
Mean Time to Recovery

Deployment Frequency measures how often the team releases software.

Lead Time for Changes measures how long it takes for a change to move from commit to production.

Change Failure Rate measures the share of deployments that cause a failure in production.

Mean Time to Recovery measures how quickly teams restore stability when failures occur.
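The four definitions above can be sketched as calculations over a log of deployments. The record layout, field names, and sample data below are assumptions made for illustration, and real pipelines will expose this information differently. The lead time is averaged here; a median would be an equally reasonable choice.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Deployment:
    """Hypothetical deployment record."""
    committed_at: datetime
    deployed_at: datetime
    failed: bool = False
    restored_at: Optional[datetime] = None


def hours(delta: timedelta) -> float:
    return delta.total_seconds() / 3600


def dora_summary(deployments: list[Deployment], window_days: int) -> dict:
    """Compute the four metrics for deployments made within a time window."""
    total = len(deployments)
    lead_times = [hours(d.deployed_at - d.committed_at) for d in deployments]
    failures = [d for d in deployments if d.failed]
    recoveries = [
        hours(d.restored_at - d.deployed_at)
        for d in failures
        if d.restored_at is not None
    ]
    return {
        "deployments_per_day": total / window_days,
        "mean_lead_time_hours": sum(lead_times) / total,
        "change_failure_rate": len(failures) / total,
        "mean_time_to_recovery_hours": (
            sum(recoveries) / len(recoveries) if recoveries else None
        ),
    }


# Hypothetical two-day window with four deployments, one of which failed.
t0 = datetime(2024, 1, 1, 9, 0)
h = timedelta(hours=1)
deployments = [
    Deployment(t0, t0 + 2 * h),
    Deployment(t0 + 3 * h, t0 + 7 * h, failed=True, restored_at=t0 + 8 * h),
    Deployment(t0 + 24 * h, t0 + 30 * h),
    Deployment(t0 + 26 * h, t0 + 30 * h),
]

print(dora_summary(deployments, window_days=2))
```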

Together, these metrics capture the balance between speed and stability.

Fast teams deliver changes quickly.

Strong teams deliver changes safely.

The best engineering organizations achieve both.

The Five Signals That Matter

For most engineering organizations, a small set of metrics provides sufficient visibility into engineering performance.

Deployment Frequency
Lead Time for Changes
Change Failure Rate
Mean Time to Recovery
System Maintainability

The first four metrics reveal how effectively software moves into production.

The fifth reflects whether the system remains sustainable as it evolves.

Together, these signals provide a balanced view of engineering health.

They show how quickly the team moves.

They reveal how safely the system operates.

And they highlight whether the system remains maintainable over time.

In combination, they answer the most important question.

Is the engineering system improving?

Closing Part 5 — The New Software Development Lifecycle

For decades, the Software Development Lifecycle evolved around a central constraint.

Writing software was slow.

Processes, roles, and structures were designed to manage that limitation.

Testing stages existed to catch mistakes.
Release cycles existed to control risk.
Organizational structures evolved to coordinate development work.

AI shifts that constraint.

Implementation becomes dramatically faster.

But when the speed of building software changes, the entire lifecycle must evolve with it.

Quality moves earlier into development through automation.

Code review shifts from syntax inspection to architectural thinking.

CI/CD pipelines must become fully automated to support continuous delivery.

Production observability becomes a critical validation mechanism.

And engineering metrics must focus on system outcomes rather than activity.

The Software Development Lifecycle does not disappear.

But its center of gravity changes.

Less effort is spent on writing code.

More effort is spent on understanding systems, validating behavior, and protecting long-term system integrity.

Because when code is no longer the bottleneck, something else becomes visible.

The quality of engineering thinking.