Thursday, November 5, 2015

Applying and interpreting staleness metric (Advanced dependency management - part 3)

If you have been following this series from the beginning, you are starting to appreciate why this seemingly simple problem hasn't been solved yet. In Part 1, I described the background and motivation. In Part 2, I discussed the operations that need to be performed during integration and introduced a robust staleness metric. As a reminder, we're working with a graph of module dependencies that has 200 modules and is 12 levels deep. Each vertex represents a module, and each edge represents a dependency.

First, we compute the Dependency Graph Staleness Metric (DGSM) introduced previously. The value is 19,733. That number by itself gives us some idea of the amount of integration that needs to be performed, since we can interpret each point in the metric as incrementing a single dependency between two modules and running the build to verify that the downstream module is compatible with the new version. Note that there will always be some staleness in the graph: even in the most aggressively integrated system, some changes are still in the process of being propagated. But what constitutes a healthy amount of staleness?
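To make the "one point equals one dependency bump plus one build" reading concrete, here is a minimal sketch. It is not the DGSM definition from Part 2. It is a simplified stand-in that sums, over every edge, how many versions the pinned dependency lags behind the latest release. The module names and version numbers are made up for illustration.

```python
# Simplified stand-in for a staleness count, NOT the DGSM from Part 2.
# Hypothetical data: latest released version number of each module.
latest = {"core": 5, "util": 3, "web": 2}

# Hypothetical edges: (downstream, upstream) -> upstream version currently pinned.
pinned = {
    ("util", "core"): 3,
    ("web", "core"): 5,
    ("web", "util"): 1,
}

def edge_lag(edge, pinned_version):
    """Number of single-version bumps needed to bring one edge up to date."""
    _, upstream = edge
    return latest[upstream] - pinned_version

def total_lag():
    """Sum of per-edge lags: one point per bump-and-build step."""
    return sum(edge_lag(edge, version) for edge, version in pinned.items())

print(total_lag())
```

In this toy graph, each unit of the total corresponds to one "bump the dependency, run the build" step, which is the interpretation applied to the real figure of 19,733.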

Obtaining day-by-day historical data would be time consuming, but we can get some perspective on how far our system is from the ideal state by looking at the extremes of the metric. One way is to calculate the metric for a graph in which each module is one version behind. Such a graph has a DGSM value of 5,015. Dividing the current value by this baseline (19,733 / 5,015 ≈ 3.9), we can say that on average the modules in the current graph are about 4 versions behind. This tells us a bit about how the structure of the graph impacts the metric, but as we will soon find out, the majority of the modules in the graph have fewer than 4 versions.
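The "about 4 versions behind" figure is a ratio of two DGSM values. A quick check, using only the numbers quoted in this post (the variable names are mine):

```python
# Figures quoted in the post.
current_dgsm = 19_733      # DGSM of the graph as it stands today
one_behind_dgsm = 5_015    # DGSM if every module were exactly one version behind

# Ratio of the two: a rough "average versions behind" for the graph.
avg_versions_behind = current_dgsm / one_behind_dgsm

print(round(avg_versions_behind, 2))  # 3.93
```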

A better way to look at staleness is to estimate the total amount of integration performed on the graph since its inception. We can do this by assuming all dependencies point at the oldest versions. The DGSM in that case would be 73,246. This is a good approximation of the total cost of integration that the system has incurred. This graph of jars was converted to a new build system 2 years ago, at which point all dependencies were set to the latest versions and the value of the metric would have been 0. From that, we can compute the monthly staleness attrition: 73,246 spread over 24 months, or about 3,000 per month. One way to interpret the current staleness of almost 20,000 is to say that the system is between 6 and 7 months behind.
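The monthly attrition and the "months behind" estimate follow from two divisions. This sketch assumes the two years are counted as exactly 24 months and that staleness accrued evenly over that time. The variable names are mine.

```python
# Figures quoted in the post.
historic_total = 73_246   # DGSM if every dependency pointed at the oldest version
current_dgsm = 19_733     # DGSM of the graph today
months_elapsed = 24       # assumption: "2 years ago" taken as exactly 24 months

# Average staleness accrued per month, assuming a constant rate.
monthly_attrition = historic_total / months_elapsed

# How many months of accrued staleness the current value represents.
months_behind = current_dgsm / monthly_attrition

print(round(monthly_attrition))   # 3052
print(round(months_behind, 1))    # 6.5
```

Using the rounded figure of 3,000 per month instead gives about 6.6 months, so either way the estimate lands between 6 and 7 months.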

These findings are summarized in the table below.

Measure                           Value
Current DGSM                      19,733
Each module one version behind     5,015
Monthly attrition (approx.)        3,000
Historic total (2 yrs.)           73,246

Despite the tedious cost of manual integration, this graph has been kept reasonably up to date over time. Roughly three quarters of the integration work (about 73%) has been performed by developers, leaving the graph about 27% stale (current 19,733 divided by historic total 73,246). Since individual developers never had the insight into staleness that we are getting now, we can conclude that this relatively high rate of integration was the minimum necessary to deliver code into production. Continuous Integration is occurring organically in this graph.
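The split between integrated and stale work comes from a single division of the two figures in the table. Computed without rounding, it comes out close to the one-quarter and three-quarter split (variable names are mine):

```python
# Figures quoted in the post.
current_dgsm = 19_733     # staleness still outstanding
historic_total = 73_246   # total staleness accrued since the build system conversion

# Share of the accrued staleness that is still outstanding.
stale_fraction = current_dgsm / historic_total

# Share that developers have already integrated away.
integrated_fraction = 1 - stale_fraction

print(f"stale: {stale_fraction:.1%}")            # stale: 26.9%
print(f"integrated: {integrated_fraction:.1%}")  # integrated: 73.1%
```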
