Improve your software product delivery process performance using metrics (I)

This is the first of a two-article series on the topic. I recommend reading this one first. Click here to open the second one.

Introduction

Like many other physics graduates, I have a passion for Feynman. His explanations of complex concepts made them seem reachable for students like me, to the point where you develop a taste for simplicity. Producing software at scale is complex, but if you are clear about some basic and often simple concepts, and you remain passionate about simplicity, you will not just be able to better understand the management challenges ahead of you, but also to communicate them, and their potential solutions, more effectively.

As a manager and, currently, a consultant, I always feel that a significant part of my job is to focus on simple things that are often forgotten or ignored. I keep Feynman's words in mind and try to stick to them.

If you can’t explain something in simple terms, you don’t understand it.

Richard Feynman

This series of articles is an attempt to explain in simple words how to approach software product delivery process performance metrics in production environments at scale. I will use automotive as the main example because it is the industry I am currently working in.

Disclaimer

The ideas and practices you will read in these two articles are not mine. I just summarize other people's ideas, structure them and add some bits here and there based on my own experience. You will find key references on this topic at the end of this article.

Simplicity, when done wrong, leads to inaccuracy and sometimes to a disconnection from reality, which makes the concepts about to be explained inapplicable. If you see any of that in this article, please point me to it and I will work on it further. I use some of these explanations in my daily job so, in the same spirit, if you help me improve them, I would really appreciate it.

Why the software product delivery process?

When you analyze the life cycle of any software portfolio developed at scale in industries like automotive, you can identify several phases (check the image).

Of all these phases, the delivery process is the one where you can apply the most engineering by making it systematic. Continuous Delivery is becoming increasingly popular as a set of principles and practices to do so. It is also important to pair CD with a release approach that meets your needs. If continuous deployment does, then it should definitely be the selected one.

I will concentrate on the delivery phase of the product life cycle.

Approach to follow

As Continuous Delivery gains more and more traction, the understanding of any software product delivery process as an engineering process increases, especially at scale. This necessarily leads to the need to strengthen data-driven decision-making processes within organizations.

Telemetry is applied in areas like operations or customer support, for instance. Data is collected, aggregated, plotted and then used as input to support analysis and decision-making processes. This way of using data is very similar to what science was about in its early days: observation. Yes, sophisticated, but observation after all.

That is a useful approach, especially for simple systems, but it is limited when the goal is to support decisions meant to affect significant parts, or even the entirety, of the delivery process, especially when it is complex.

At scale, we know that optimizing locally does not necessarily lead to global improvements. At the same time, in order to optimize a system globally, we know that we might not need to optimize every single stage of the delivery process.

I have invested energy a few times in persuading executives and managers that a “bottom-up” approach (from local to global), frequently based on telemetry, needs to be complemented with a “top-down” (global to local) approach: a more systematic, more “scientific” one.

Sometimes I refer to this approach as applying “systems thinking”. This is due to the rejection I have witnessed among managers of anything that resembles “science”. I believe it is because they perceive it as too theoretical. Have you experienced something similar?

Summary of the overall approach

In any case, as many reputed consultants in product development often point out, what you will read about here is nothing but an application of the scientific method to problem solving. When it comes to data and how it should support decision-making processes (continuous improvement), I have noticed that framing it as metrics vs. telemetry works well.

In summary, when I try to explain the need to invest in product delivery performance metrics to support decision-making processes, using a continuous improvement approach, I simplify the message as described in the picture, which I often refer to as following a top-down vs. a bottom-up approach. Not the best label, but easy to remember and communicate, especially to executives.

Five not-so-simple steps

I rarely have more than one shot to justify the rationale behind the approach I will describe, which applies a data-driven improvement kata using software product delivery performance metrics as guidance, so I have structured it in five “simple steps”:

  1. Model your delivery process.
  2. Create a mathematical construct.
  3. Measure and analyze.
  4. Apply a data-driven improvement kata to optimize the delivery process.
  5. Add complexity to approach reality. Go back to point 1.

1. Model your system

Like scientists do, we should start with a simple model: the simplest possible abstraction of our delivery process that provides context and enough valuable information to start with. If we cannot understand the behavior and performance of a simplified abstraction of our delivery process, we will not understand the real process, which is significantly more complex.

What is the simplest model we can use to describe our delivery process? I start with this one…

Depending on how advanced the automaker is in the adoption of a software mindset, as well as on its organizational structure, the development teams might have end-to-end ownership of the software shipped in the vehicles, or they might release it (hand it over) to those in charge of the deployment in production (production in the factory, including end-of-line testing). There are other possible actors involved. To simplify, we will assume that the delivery process ends with a release to the factory.

The delivery process begins once the software is developed, that is, when the developer considers she is finished and produces a merge request to a repository which is part of the product inventory (a repository that contains software that ends up in the product or that is necessary to produce the product).

With the above in mind, we can model our delivery process as…

Now that we have described our delivery process in the simplest way possible, we will go over the rest of the steps. The goal is to use this model to describe the process and work with it. Once we understand it and are able to improve it, we can add complexity.

2. Create a mathematical construct

We will use math to explain the behavior of our system and the performance of the process involved. Our mathematical construct will be formed by three metrics: Throughput, Stability and Cost of Delay.

Throughput and stability characterize the performance of our model (delivery process) in a simple way. They are easy to understand and, as we will see later on in the coming article, they work well for both simple and complex models.

There are plenty of engineering disciplines where stability is a popular metric. Software product professionals familiar with agile know these concepts from manufacturing, for instance. Those of you familiar with physics, or with engineering areas where systems manage the flow of liquids or discrete elements, like networking or water supply systems, are already familiar with the concept of throughput.

The goal of this article is not to go deep into these metrics. I will only provide the minimum level of detail needed to understand them. At the end of the article I provide several references in case you are interested in further details and justifications. I strongly recommend you go over them.

Wikipedia mentions that Cost of Delay “combines an understanding of value with how that value leaks away over time”. It is “a way of communicating the impact of time on the outcomes we hope to achieve.”[1]
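
For intuition, a hypothetical worked example: if a feature is expected to generate €10,000 per week once shipped, every week of delay leaks €10,000 of value, so shipping it six weeks late carries a cost of delay of roughly €60,000. The numbers are made up; the point is that time has a quantifiable price.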

“If you only quantify one thing, quantify the cost of delay.” – Don Reinertsen

This series will not concentrate on this metric. CoD does not characterize our delivery process, although it is related to the other two metrics; it refers to the entire product life cycle.

Why did I mention it then?

I always recommend adding a business-related metric to whatever software product production process performance metrics you use, for at least the following reasons:

  • It relates product-level decisions to business impact, as well as business decisions to their impact at product level.
  • It relates business decisions and their impact to engineering activities and improvements.
  • Software developers make better decisions when they understand their impact on the business.
  • It helps executives understand the profound benefit that decentralization has for many types of decisions.
  • It works as an antidote against old-school managers.

There are other popular business metrics that could also be used, by the way.

Dave Farley mentioned in one of his videos, published a few months back, the following: “Stability measures the quality of the output. It refers to building the thing right.” and “Throughput measures the efficiency with which we produce the output. It refers to building the right thing.”

Stability and Throughput as metrics are becoming increasingly popular, especially after the publication of the book Accelerate, which summarizes the findings of the State of DevOps Reports up to its publication. You can find this reference in the Reads section of this site. I mentioned in a previous post why I believe they are the right ones for the job.

The measures that describe these two metrics will be:

  • Change Failure Rate provides an idea of the proportion of changes (input) that did not lead to a deployment or release (output) and therefore required remediation actions.
  • Failure recovery time provides an idea of the time required to detect and remediate a failed change until a new deployment or release (output) is produced.
  • Lead time provides an idea of the time a change (merge or pull request – input) takes to produce a deployment or release (output).
  • Frequency provides an idea of how often changes are released. Time interval is the inverse of frequency; we will use it instead of frequency for practical reasons.

One interesting point about the measures is their units. These measures have simple units. This is one of the reasons for using time interval, for instance, instead of frequency.

The units are also easily adaptable to delivery processes in different industries or for different types of products. For instance, lead times are very different for a mobile app compared to an operating system for an automotive infotainment platform.

Looking for simplicity, we will use averages and standard deviations to characterize the data sets from our measurements, taken over a period of time. Obviously this will not be the case for the Change Failure Rate.
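
To make these measures and their units concrete, here is a minimal sketch in Python covering three of the four measures (failure recovery time would additionally need detection and remediation timestamps). The data layout (merge timestamp, release timestamp, failure flag) and the values are assumptions for illustration; your SCM and binary storage will dictate the real extraction.

```python
# Minimal sketch: compute measures from hypothetical change records.
from datetime import datetime, timezone
from statistics import mean, stdev

# Each record: (merged_at, released_at, failed), all timestamps in UTC.
changes = [
    (datetime(2021, 3, 1, 9, 0, tzinfo=timezone.utc),
     datetime(2021, 3, 8, 17, 0, tzinfo=timezone.utc), False),
    (datetime(2021, 3, 2, 11, 0, tzinfo=timezone.utc),
     datetime(2021, 3, 8, 17, 0, tzinfo=timezone.utc), True),
    (datetime(2021, 3, 9, 10, 0, tzinfo=timezone.utc),
     datetime(2021, 3, 22, 17, 0, tzinfo=timezone.utc), False),
]

# Lead time: input event (merge request) to output event (release), in days.
lead_times = [(rel - mr).total_seconds() / 86400 for mr, rel, _ in changes]

# Time interval between consecutive releases (the inverse of frequency).
releases = sorted({rel for _, rel, _ in changes})
intervals = [(b - a).total_seconds() / 86400 for a, b in zip(releases, releases[1:])]

# Change failure rate: share of changes that required remediation.
failure_rate = sum(failed for _, _, failed in changes) / len(changes)

print(f"Lead time: {mean(lead_times):.1f} ± {stdev(lead_times):.1f} days")
print(f"Release interval: {mean(intervals):.1f} days")
print(f"Change failure rate: {failure_rate:.0%}")
```

Note how every measure reduces to simple units (days, a percentage), which is exactly what makes them easy to communicate.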

My recommendation is that you start with these two metrics and, step by step, introduce further ones if required. Remember that less but meaningful is more, and that, at scale, the hard part is to drive change using the metrics as part of a continuous improvement culture. In other words, people over metrics, tools, technology…

3. Measure and analyze

3.1 Quantitative analysis

In our simplified model, the different measures should not be hard to identify. For the input, we will need to dig into our SCM to find out when the code was submitted (MR). For the output, we will identify in the binary blob storage tool when the right binary was tagged as a release. Those two events are all we need to define the measures that define our metrics.

Be careful with time zones and error propagation during conversions. If you have access to data scientists or somebody with a strong competence in math, ask for advice. Consider that our model will increase in complexity and other people might be in charge of extracting the data for you. At some point you will use tools to extract, process and plot the data in real time that will need to be configured properly, so, despite the events being simple to identify, you should describe in detail how to extract the data and convert it to the right units.

Validate your results and their usefulness with simple tooling before automating the process with more complex tools. Do not be ashamed of using something like spreadsheets at first, or something even simpler.

Plot the data using a timescale on the X-axis that is appropriate for your environment. It is hard to guess which range will be the most useful, so I suggest you try different options. For simple applications that range might be weeks, while for an automotive platform it might be months or even quarters. Different visualizations might lead you to slightly different conclusions, so explore your options.
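
As a sketch of that exploration, assuming the data has been exported to a hypothetical changes.csv with merged_at and released_at columns, one could plot the lead time trend at different timescales with pandas and matplotlib (any plotting stack works):

```python
# Minimal plotting sketch; changes.csv is a hypothetical export.
import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv("changes.csv")
# Normalize everything to UTC to avoid the time zone pitfalls mentioned above.
df["merged_at"] = pd.to_datetime(df["merged_at"], utc=True)
df["released_at"] = pd.to_datetime(df["released_at"], utc=True)
df["lead_time_days"] = (df["released_at"] - df["merged_at"]).dt.total_seconds() / 86400

# Try different bucket sizes: weekly for a simple app, monthly or
# quarterly for something like an automotive platform.
for bucket in ["W", "M"]:
    trend = df.set_index("released_at")["lead_time_days"].resample(bucket).mean()
    trend.plot(label=f"mean lead time, bucket {bucket}")

plt.xlabel("release date")
plt.ylabel("lead time (days)")
plt.legend()
plt.show()
```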

Once you are able to get the right data in a reliable and consistent way, so that you can determine your Throughput and Stability at any point in time, you will be ready for further quantitative analysis.

Instead of paying too much attention at the beginning to the absolute values of the different measures, look for trends in the curves and what happened around:

  • key events for your product or organization.
  • dates when key actions were implemented in your delivery process.

Try to learn about the behavior of your delivery process and its performance before going too deep into trying to improve it. I recommend having conversations with the teams instead of jumping to early conclusions.

But in order to communicate the current state of your delivery process, as well as the target conditions, across the entire organization, you will need to turn your quantitative analysis into a qualitative one.

3.2 Qualitative analysis

The first step in moving from a quantitative to a qualitative analysis is to define scales. For each measure we will define a scale and assign a label to each range of values on that scale. To keep things simple, we will define each scale based on a single threshold.

Which value should be used as the threshold? Choose one that makes sense for your organization. Check the values and pick one that makes the rest of the qualitative analysis meaningful.

As you can see, the chosen labels are High and Low for our simplest scale, based on a single threshold (value). Once you have defined the thresholds and scales for each measure, it is time to define scenarios based on the scales you have just created.
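
As a minimal sketch (the threshold values below are hypothetical; choose ones meaningful to your organization), mapping a measured value onto such a scale is trivial:

```python
# Minimal sketch: map measured values to High/Low labels using a single
# threshold per measure. Threshold values are hypothetical.
THRESHOLDS = {"lead_time_days": 14.0, "change_failure_rate": 0.15}

def label(measure: str, value: float) -> str:
    """Return the qualitative label on a single-threshold scale."""
    return "Low" if value <= THRESHOLDS[measure] else "High"

lead_time_label = label("lead_time_days", 21.0)           # -> "High"
failure_rate_label = label("change_failure_rate", 0.08)   # -> "Low"
print(f"Scenario: lead time {lead_time_label}, "
      f"change failure rate {failure_rate_label}")
```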

Following the scenarios published by Steve Smith in his book “Measuring Continuous Delivery”, I created the representations below to summarize them. I strongly recommend you read this book. Many of the ideas you will see here and in the next article were either taken from or inspired by it.

Based on the quantitative analysis and the scales you have defined, it should be possible to identify which scenario corresponds to the delivery process of your software product life cycle. Such scenarios allow you to build a common understanding across the entire organization of the delivery process status, of where to improve it, and of the target scenario.

Take the above scenarios (in the first column) as examples. Some might not be relevant to you. At the same time, try to describe each one of them in simple words so the entire workforce understands them.

You now have all the tools to describe the performance of your delivery process, both quantitatively and qualitatively, as well as a simple way to describe your current and target condition (where you want to head). This last step is about how to get from one to the other.

4. Apply a data-driven improvement kata to optimize the delivery process.

How do we move (improve) from our current scenario to the target one? A data-driven improvement kata is the answer.

Once you introduce metrics in an organization, there is a high risk of firing up a race among different groups at different levels to make their own interpretations of the data and improve specific parts of the process locally. This behavior frequently leads to incompatible measures being taken in different parts of the system instead of improving the metrics as expected. Defining a coordinated improvement kata, including a shared understanding of the current scenario (condition), is a must.

In the same way that it was not my intention to go into detail about metrics, it is far from my intention to go into detail about the improvement kata. Please check the references section to learn more about it. A few things, though, are important to mention.

The steps of the improvement kata are, essentially:

  • Understand the direction or challenge.
  • Grasp the current condition.
  • Establish the next target condition.
  • PDCA cycle to experiment towards the target condition.

I suggest including in the first step of the kata the business and product success picture for the current cycle. Usually the business cycle is a year; the product cycle could be a month, two, a quarter… depending on the product and organization. The experimentation cycle should be restricted to a single sprint. If it takes longer, slice the experiment so it fits in a single sprint.

A data-driven improvement kata requires describing the current scenario, the target one, the hypotheses of the experiments and their results, based on the metrics from the mathematical construct (better) or proxy ones. For our example we will assume a simple organization that works in short sprints. Our improvement kata should then be structured using that cadence.

I like to represent any improvement kata through boards, as a summary. Here is a simple one you can start with. These boards should be visible to and understood by the entire workforce. Treat them as a living document.
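
As an illustration, with hypothetical content, such a board can be as simple as:

  • Challenge: reduce failed releases without slowing delivery down.
  • Current condition: lead time High, change failure rate High.
  • Target condition: change failure rate Low, lead time unchanged.
  • Experiment (this sprint): gate merges on an automated smoke test suite; hypothesis: the change failure rate drops below the threshold.
  • Result: filled in at the end of the sprint, based on the measured metrics.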

Limit the number of experiments and actions to 3 or 4 per cycle, depending on the teams you work with, especially at the beginning. Be aware that two different experiments can each have positive effects when applied individually, but not necessarily when consolidated together.

We will be able to learn a lot about our delivery process with this simple model. We will also improve the delivery process over several iterations, but at some point a richer model will be required. It will then be time to get one step closer to reality.

5. Add complexity to approach reality. Go back to point 1.

This point will be the substance of the second article of the series.

Summary

We have justified the relevance of starting simple when evaluating the performance of our delivery process. We created the simplest possible model to start our analysis from. We described that model, as well as a mathematical construct to characterize it. Some considerations were provided about how to perform the measurements and plot the results as part of a quantitative analysis.

We learned how and why to move from a quantitative to a qualitative analysis. With the qualitative analysis done, we defined a data-driven improvement kata to improve the performance of our delivery process iteratively. Such a kata is summarized in a simple board.

In essence, this is a process any organization can follow to improve the performance of its delivery process effectively. If you cannot say out loud what your Throughput and Stability were over the past quarter, last month, yesterday, today… your delivery process is not under control. In that case, it is hard to imagine that you will be able to improve it in a meaningful way.

Go to the second article of the series.

References

There are countless references to consider, but most (if not all) ideas in this and the coming article are taken from, or summarized in, the following references. Some of them are included in the Reads section of this site. The main references are:

If you want to go far, together is faster (II).

This is the second of a series of two articles describing the idea of correlating software product delivery process performance metrics with community health and collaboration metrics, as a way to engage execution managers so their teams participate in Open Source projects and Inner Source programs. The first article is called If you want to go far, together is faster (I). Please read it before this post if you haven't already. You can also watch the talk I gave at InnerSource Commons Fall 2020, which summarizes this series.

In the previous post I provided some background and described my perception of what causes resistance from managers to involving their development teams in Open Source projects and Inner Source programs. I enumerated five not-so-simple steps to reduce such resistance. This article explains those steps in some detail.

Let me start by enumerating the proposed steps again:

1.- In addition to collaboration and community health metrics, I recommend tracking product delivery process performance metrics in Open Source projects and Inner Source programs.
2.- Correlate both groups of metrics.
3.- Focus on decisions and actions that create a positive correlation between those two groups of metrics.
4.- Create a reporting strategy for developers and managers based on such positive correlation.
5.- Use that strategy to turn your story around: it is about creating positive business impact at scale through open collaboration.

The solution explained

1.- Collaboration and community health metrics as well as product delivery process performance metrics.

Most Open Source projects and Inner Source programs focus their initial metrics efforts on measuring collaboration as well as community health. There is an Open Source project hosted by the Linux Foundation focused on the definition of many types of metrics; collaboration and community health metrics are among the more mature ones. The project is called CHAOSS. You can find plenty of examples of these metrics applied in a variety of Open Source projects there too.

Inner Source programs are taking the experience developed by Open Source projects in this field and applying it internally, so many of them are using such metrics (collaboration and community health) as the basis to evaluate how successful they are. In our attempt to expand our study of these collaboration environments to areas directly related to productivity, efficiency, etc., additional metrics should be considered.

Before getting into the core ones, I have to say that many projects pay attention to code review metrics as well as defect management to evaluate productivity or performance. These metrics go in the right direction, but they are only partial and, in order to demonstrate a clear relation between collaboration and productivity or performance, for instance, they do not work very well in many cases. I will give a few examples of why.

Code review is a standard practice among Open Source projects, but at scale it is perceived by many as an inefficient activity compared to others, when knowledge transfer and mentorship are not a core goal. Pair or mob programming, as well as code review restricted to team scale, are practices perceived by many execution managers as more efficient in corporate environments.

When it comes to defect management, companies have been tracking these variables for a long time, and it will be very hard for Open Source and Inner Source evangelists to convince execution managers that what is being done in the open or in the Inner Source program is so much better, and especially cheaper, that it is worth participating. For many of these managers, cost control comes first and code sustainability later, not the other way around.

Unsurprisingly, I recommend focusing on the delivery process of software product production as a first step towards reducing the resistance of execution managers to embracing collaboration at scale. I pick the delivery process because it is deterministic, so it is simpler to apply process engineering (and thus metrics) to it than to any other stage of the product life cycle that involves development. Of all the potential metrics, throughput and stability are the essential ones.

Throughput and Stability

It is not the point of this article to go deep into these metrics. I suggest you refer to delivery or product managers at your organization who embrace Continuous Delivery principles and practices to get information about these core metrics. You can also read Steve Smith's book Measuring Continuous Delivery, which defines the metrics in detail, characterizes them and provides guidance on how to implement and use them. You can find more details about this and other interesting books in the Reads section of this site, by the way.

There are several reasons for me to recommend these two metrics. Some of them are:

  • Both metrics characterize the performance of a system that processes a flow of elements. Software product delivery can be conceived of as such a system, where the information flows in the form of code commits, packages, images…
  • Both metrics (sometimes in different forms/expressions) are widely used in other knowledge areas, in some cases for quite some time now, like networking, lean manufacturing, fluid dynamics… There is little magic behind them.
  • To me the most relevant characteristic is that, once your delivery system is modeled, both metrics can be applied at system level (globally) and at a specific stage (locally). This is extremely powerful when trying to improve the overall performance of the delivery process through local actions at specific points. You can track the effect of local improvements on the entire process.
  • Both metrics have simple units and are simple to measure. The complexity is operational when different tools are used across the delivery process. Using these metrics reduces the complexity to a technical problem.
  • Throughput and Stability are positively correlated when applying Continuous Delivery principles and practices. In addition, they can be used to track how well you are doing when moving from a discontinuous to a continuous delivery system. Several of the practices promoted by Continuous Delivery are already very popular among Open Source projects. In some cases, some would claim that they were invented there, way before Continuous Delivery was a thing in corporate environments. I love chicken-and-egg debates… but not now.

Let's assume from now on that I have convinced you that Throughput and Stability are the two metrics to focus on, in addition to the collaboration and community health metrics your Open Source or Inner Source project is already using.

If you are not convinced, by the way, even after reading S. Smith's book, you might want to check the most common references on Continuous Delivery. Dave Farley, one of the fathers of the Continuous Delivery movement, has a new series of videos you should watch. One of them deals with these two metrics.

2.- Correlate both groups of metrics

Let's assume for a moment that you have implemented such delivery process metrics in several of the projects in your Inner Source initiative, or across the delivery pipelines of your Open Source project. The following step is to introduce an improvement kata process to define and evaluate the outcome of specific actions against pre-established, high-level SMART goals. Such goals should aim for a correlation between both types of metrics (community health/collaboration and delivery process ones).

Let me give one example. It is widely understood in Open Source projects that being welcoming is a sign of good health. It is common to measure how many newcomers the project attracts over time, and their initial journey within the community, looking for their consolidation as contributors. Similar thinking is followed in Inner Source projects.

The truth is that more capacity does not always translate into higher throughput or an increase in process stability; on the contrary, it is widely accepted among execution managers that the opposite is more likely in some cases. Unless the work structure, and therefore the teams and the tooling, are oriented to embrace flexible capacity, high rates of capacity variability lead to inefficiencies. This is an example of an expected negative correlation.

In this particular case, then, the goal is to extend the actions related to increasing our number of new contributors to our delivery process, ensuring that our system is sensitive to an increase of capacity at the expected rate and that we can track it accordingly.

What do we have to do to mitigate the risk of increasing the integration failure rate due to an increase of throughput at the commit stage? Can we increase our build capacity accordingly? Can our testing infrastructure digest the increase in builds derived from increasing our development capacity, assuming we keep the number of commits per triggered build?

In summary, work on the correlation of both groups of metrics; that is, link actions that would affect both community health and collaboration metrics and delivery metrics.
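
As a minimal sketch of what correlating the two groups means in practice (the monthly figures below are invented), you can start with a plain Pearson coefficient between a collaboration series and a delivery series:

```python
# Minimal sketch: Pearson correlation between a community health metric
# and a delivery metric over the same twelve months (invented figures).
import numpy as np

new_contributors = np.array([3, 4, 4, 6, 7, 7, 9, 10, 11, 12, 14, 15])
throughput = np.array([20, 22, 21, 26, 27, 25, 30, 33, 31, 36, 38, 41])  # releases/month

r = np.corrcoef(new_contributors, throughput)[0, 1]
print(f"Pearson r = {r:.2f}")  # values close to +1 suggest a positive correlation

# Correlation is not causation: use the improvement kata experiments to
# tie specific actions to the movement of both series before concluding.
```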

3.- Focus on decisions and actions that create a positive correlation between both groups of metrics.

Some executed actions designed to increase our number of contributors might lead to a reduction in throughput or stability, others might have a positive effect on one of them but not the other (spoiler alert: at some point both will decrease), and some others will increase both of them (positive correlation).

If you work in an environment where Continuous Delivery is the norm, those behind the execution will understand which actions create a positive correlation between throughput and stability. Your job will only be to link those actions with the ones you are familiar with in the community health and collaboration space. If not, your work will be harder, but still worth it.

For our particular case, you might find, for instance, that a simple measure to digest the increasing number of commits (bug fixes) is to scale up the build capacity, if you have remaining budget. You might find, though, that you have problems doing so when reviewing acceptance criteria because you lack automation, or that your current testing-on-hardware capacity is almost fixed due to limitations in the system that manages your test benches, and additional effort is required to improve the situation.

Establishing experiments that consider not just the collaboration side but also the software delivery one, and translating into production those experiments that demonstrate a positive correlation of the target metrics (increasing all of them), might bring you surprising results: sometimes far from the common knowledge of those focused on collaboration aspects only, but closer to that of those focused on execution.

4.- Create a reporting strategy to developers and managers based on such positive correlation.

A board member of an organization I was managing once told me something I have followed ever since. It was something like…

Managers talk to managers through reports. Speak up clearly through them.

As a manager I used to put a lot of thought into the reporting strategy. I have some blog posts written about this point. Besides things like the language used or the OKRs and KPIs you base your reporting on, understanding the motivations and background of the target audience of those reports is just as important.

I suggest you pay attention to how those you want to convince about participating in Open Source or Inner Source projects report to their managers, as well as how others report to them. Are those reports time based or KPI based? Are they presented and discussed in 1:1s or in a team meeting? Usually every senior manager dealing with execution has a consolidated way of reporting and being reported to. Adapt to it instead of keeping the format we are more used to in open environments. I love reporting through a team or department blog, but it might not be the best format for this case.

After creating and evaluating many reports about community health and collaboration activities, I suggest changing how they are conceived. Instead of focusing on collaboration growth and community health first, and then on the consequences of such improvements for the organization (benefits), focus first on how product or project performance has improved while collaboration and community health have improved. In other words, change how cause and effect are presented.

The idea is to convince execution managers that, by participating in Open Source projects or Inner Source programs, their teams can learn how to be more efficient and productive in short cycles while achieving long-term goals they can present to executives. Help those managers also present both types of achievements to their executives using your own reports.

For engineers, move the spotlight away from the growth of interactions among developers and put it on the increase in stability derived from making those interactions meaningful, for instance. Or try to correlate diversity metrics with defect management results, or with reductions in change failure rates or detected security vulnerabilities, etc. Move part of your reporting focus away from team satisfaction (a common strategy within Open Source projects) and put it on team performance and productivity. They are obviously intimately related, but tech leads and other key roles within your company might be more sensitive to the latter.

In summary, you have achieved the proposed goal if execution managers can take the reports you present to them and insert them into theirs without re-interpreting the language, the figures, the datasets, the conclusions…

5.- Turn your story around.

If you manage to find positive correlations between the proposed metrics, and report on those correlations in a way that resonates with execution managers, you will have established a very powerful platform on which to build an unbeatable story around your Inner Source program or your participation in Open Source projects. Investment growth will meet less resistance, and it will be easier to infect execution units with the practices and tools promoted through the collaboration program.

Advocates and evangelists will feel more supported in their viral infection, and those responsible for these programs will gain an invaluable ally in their battles with legal, procurement, IP or risk departments, among others. Collaboration will not just be good for the developers or the company, but also clearly for the product portfolio or the services; and not just in the long run, but also in the shorter term. That is a significant difference.

Your story will be about increasing business impact through collaboration instead of about collaborating to achieve bigger business impact. Open collaboration environments increase productivity and have a tangible positive impact on the organization's products or services, so they have a clear positive business impact.

Conclusion

In order to attract execution managers to promote the participation of their departments and teams in Open Source projects and Inner Source programs, I recommend defining a different communication strategy: one that relies on reports based on results from actions that show a positive correlation between community health and collaboration metrics and delivery process performance metrics, especially throughput and stability. This idea can be summarized in the following steps, explained across these two articles:

  • Collaboration within a commercial organization matters more to management if it has a measurable positive business impact.
  • To make decisions and evaluate their impact within your Inner Source program or the FLOSS community, combine collaboration and community health metrics with delivery metrics, fundamentally throughput and stability.
  • Prioritize those decisions/actions that produce a tangible positive correlation between these two groups of metrics.
  • Report, especially to managers, based on such positive correlation.
  • Adapt your Inner Source or Open Source story: increase business impact through collaboration.

In a nutshell, it all comes down to proving that, at scale…

if you want to go far, together is faster.

Check the first article of this series if you haven't. You can also watch the recording of the talk I gave at ISC Fall 2020, where I summarized what is explained in these two articles.

I would like to thank the ISC Fall 2020 content committee and organizers for giving me the opportunity to participate in such an interesting and well-organized event.

If you want to go far, together is faster (I).

This is the first of a series of two articles describing the reasoning and the steps behind the idea of correlating software product delivery process performance metrics with community health and collaboration metrics, as a way to engage execution managers so their teams participate in Open Source projects and Inner Source programs. What you will read in these articles is an extension of a talk I gave at the Inner Source Commons Fall 2020 event.

Background

There is a very popular African proverb within the Free Software movement that says…

If you want to go fast, go alone. If you want to go far, go together.

Many of us have used it for years to promote collaboration among commercial organizations over doing things internally in a fast way, at the risk of reinventing the wheel, not following standards, reducing quality, etc.

The proverb describes an implicit OR relation between the traditional Open Source mindset, focused on longer term results obtained through extensive collaboration, and the traditional corporate mindset, where time-to-market is almost an obsession.

Early in my career I was exposed, as a manager (I do not code), to agile and, a little later, to Continuous Delivery. This second set of principles and practices had a big impact on me because of the tight, positive correlation it proposes between speed and quality. Until then, I had assumed that correlation was negative: when one increases, the other decreases, and vice versa.

For a long time, I also unconsciously assumed a negative correlation between collaboration and speed. It was not until I started working on projects at scale that I became aware of that unconscious assumption and started to question it first and challenge it later.

In my early years in Open Source I often found myself discussing with executives and managers the benefits of this “collaboration framework” and why they should adopt it. Like probably many of you, I found myself being more successful among executives and politicians than middle-layer managers.

– “No wonder they are executives” I thought more than once back then.

But time proved me wrong once again.

Problem statement: can we go faster by going together?

It was not until later in my career that I could relate to good Open Source evangelists, but especially to good sales professionals. I learned a bit about how different groups within the same organization are incentivized differently, and how you need to understand those incentives to tune your message in a way they can relate to.

Most of my arguments, and those of my colleagues back then, focused on cost reduction and collaboration, on preventing silos, on shortening innovation cycles, on sustainability, on preventing vendor lock-in, etc. Those arguments resonate very well with those responsible for strategic decisions or those managers directly involved in innovation. But they did not work well with execution managers, especially senior ones.

When I have been a manager myself in the software industry, my incentives frequently had little to do with those arguments. In some cases, my manager's incentives had little to do with such arguments either, despite us being an Open Organization. Open Source was part of the company culture, but management objectives had little to do with collaboration. Variables like productivity, efficiency, time-to-market, customer satisfaction, defect management, release cycles, yearly costs, etc., were the core incentives that drove my actions and those of the people around me.

If that was the case in the organizations I was involved with back then, imagine traditional corporations. Later on I engaged with such companies, which confirmed this intuition.

I found myself more than once arguing with my managers about priorities and incentives, goals and KPIs because, as an Open Source guy, I was for some time unable to clearly articulate the positive correlation between collaboration and efficiency, productivity, cost reduction, etc. In some cases, this inability was a factor in generating a collision that ended up with my bones out of the organization.

That positive correlation between collaboration and productivity was counter-intuitive for many middle managers I knew ten years ago. It still is for some, even in the Open Source space. Haven't you heard from managers that if you want to meet deadlines you should not work upstream, because you move more slowly? I heard that so many times that, as mentioned before, for years I believed it was true. It might be at small scale, but at big scale it is not necessarily true.

It was not until two or three years ago that I started paying attention to Inner Source. I realized that many have inherited this belief. And since they live in corporate environments, the challenge of convincing execution-related managers is bigger than in Open Source.

Inner Source programs are usually supported by executives and R&D departments, but meet resistance from middle management, especially those closer to execution units. Collaborating with other departments might be good in the long term, but it is perceived as less productive than developing in isolation. Somehow, in order to participate in Inner Source programs, they see themselves choosing between shorter-term and longer-term goals, between their incentives and those of the executives. It has little to do with their ability to “get it”.

So either their incentives are changed, and they demonstrate that the organization can still be profitable, or you need to adapt to those incentives. I believe that adapting to those incentives means, in a nutshell, providing a solid answer to the question: can we go faster by going together?

The proposed solution: if you want to go far, together is faster.

If we could find a positive correlation between efficiency/productivity and collaboration, we could replace the proverb above with something like…

And hey, technically speaking, it would still be an African proverb, since I am from the Canary Islands, right?

The idea behind the above sentence is to establish an AND relation between speed and collaboration, meeting both traditional corporate and Open Source (and Inner Source) goals.

Proving such a positive correlation could help reduce the resistance offered by middle management to practicing collaboration at scale, either within Inner Source programs or Open Source projects. They would perceive such participation as a path to meeting those longer-term goals without contradicting many of the incentives they work with and promote among the people they manage.

So the following question is: how can we do that? How can we provide evidence of such a positive correlation in a language that is familiar to those managers?

The solution summarized: ISC Fall 2020

During ISC Fall 2020, I tried to briefly explain to people running Inner Source programs a potential path to establishing such a relation in five not-so-simple steps. The core slide of my presentation enumerated them as:

1.- In addition to collaboration and community health metrics, I recommend tracking product delivery process performance metrics.
2.- Correlate both groups of metrics.
3.- Focus on decisions and actions that create a positive correlation between them.
4.- Create a reporting strategy for developers and managers based on such positive correlation.
5.- Use that strategy to turn your Inner Source/Open Source story around: it is about creating positive business impact at scale through open collaboration.

A detailed explanation of these five points can be found in the second article of this series:

If you want to go far, together is faster (II).

You can also watch the recording of my talk at ISC Fall 2020.

Agile On The Beach: my first time

During 2017 and 2018, at events like Scale Summit and PIPELINE Conference, I got several interesting references to an event in the south-west of England about lean, agile and continuous delivery that caught my attention. I decided that I could not miss it in 2019.

The 2019 edition of Agile On The Beach was the 9th one. The event has gained a good reputation among agilists for being a good mix of great content and relaxed atmosphere in a beautiful environment. This is not surprising for somebody coming from the Open Source space, but it is not such a common combination within the lean/agile/CD world.

I bought the tickets and reserved the accommodation in time (months before the event). As you know, I started at MBition in June. My employer was kind enough to make it easy for me to attend. So on July 10th I headed to Falmouth, Cornwall, UK to participate in Agile On The Beach during the following two days, coming back to Málaga on Saturday, July 13th.

I liked the event. I felt at home, as if I were at Akademy, for instance. There were some outstanding talks, a great atmosphere, talented and experienced attendees, reputed speakers, good organization and nice social activities. Yes, the trip to get there was long, but I had no delays on any of the several planes and trains I had to take (who said British trains are awful?), which allowed me to invest a few hours working each way.

The videos have not been published yet. I will add the ones I recommend you check as comments.

Update: please find the videos here.

Next year the event celebrates its 10th edition. They have promised to do something special. If you are in the UK, or are willing to invest several hours traveling to a good lean/agile/CD event, Agile On The Beach is a great one. But stay tuned: the number of tickets is limited and they are gone several months before the event starts.

Bitacora: impact mapping

Introduction

This is the fourth in a series of articles about Bitacora. Please read the previous ones to get the full context:
  1. The diary (aka bitácora): towards alignment in distributed environment.
  2. Bitacora: environment definition.
  3. Bitacora: personas

Why Impact Mapping?

The steps taken so far have been standard for me for a long time. At this point, though, depending on the nature of the product, I did not have a fixed procedure to follow until the creation of user stories. At some point I ended up creating use cases and relating them through mind maps.

A few months ago I read Impact Mapping, by Gojko Adzic, a reputed agile consultant and writer. He proposes a method to go from the business model to epic/story descriptions that I found interesting. It fills my gap with a systematic approach using a tool I love: mind maps.

It also helps avoid a common challenge with use cases. When you get your hands dirty with them, it is easy to write many that are not key to the main goal of the tool. Once you have many of them, it can become tough to relate and prioritize them.

Gojko proposes getting rid of use cases and going directly from the impact map to describing the epics/user stories. I have not been able to do so. It might be that I am simply too hooked on use cases and it is only a matter of time before I follow the new process exactly. It could also be that impact mapping is rigid compared to use cases, so you can only get rid of them in a second or third iteration of the impact mapping process…

In any case, for Bitacora I started with an impact map. Once it was mature enough, I started the use cases from scratch. My impact map ended up containing 75% of the use cases… and already structured. I assume that, in a collaborative design process, this percentage would be greater. I also assume I can get better with practice. So I am confident about adopting impact mapping exclusively. I might lose some use cases, but I gain structure. Relevant but at this point ignored use cases will probably pop up later in the design process or during development. Mistakes in prioritization due to structure deficiencies have a bigger (negative) impact along the product life cycle.

What is an Impact Map?

Mr. Adzic describes an impact map in his book as “a mind-map grown during a discussion facilitated by answering the following four questions: Why?, Who?, How?, What?”

  1. Why: the center of the map; it represents the goal of our application. It should include the desired achievement based on a key metric.
  2. Who: the first level of the mind map; it refers to the actors who can influence the outcome. These are the personas. At this point in the process it is not mandatory to describe them in detail.
  3. How: the second level of the mind map; it refers to the impact we are trying to create on the actors, that is, how the product changes their behavior.
  4. What: the third level of the mind map; it refers to the scope, that is, what we can do to achieve the desired impact. Depending on the dimension and complexity of your product/service, this third level will be closer to or further from a user story. You can either extend the What section into sub-levels (product approach) or define epics (scrum of scrums). A minimal sketch of this structure follows the list.
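
As a minimal sketch, with hypothetical content (reusing the “tag page” scope that appears later in this article), the four levels map naturally onto a nested structure:

```python
# Minimal sketch: an impact map as a nested structure, Why at the root,
# then Who -> How -> What. The content is hypothetical.
impact_map = {
    "why": "Improve alignment in a distributed team (goal plus key metric)",
    "who": {
        "Team Member": {
            "participates more based on what others did": ["tag page"],
            "uses the tool as the central point for everyday information": ["daily summary"],
        },
        "Scrum Master": {
            "follows up on team activity more easily": ["team timeline view"],
        },
    },
}

# Walking the map top-down yields candidate epics/user stories, already
# structured, which makes prioritization easier.
for actor, impacts in impact_map["who"].items():
    for impact, scope in impacts.items():
        for item in scope:
            print(f"{actor} / {impact} / {item}")
```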

You can get further information and diagrams on the website impactmapping.org. I am still unsure about the direct relation between the What section and what engineers understand as a deliverable. I prefer to reserve the concept of deliverable for a later stage (user stories), so deliverables are analyzed in detail by the product team, together with acceptance criteria. I find this arguable, though.

Bitacora Impact mapping

Once you have gone through all of the above, you should be able to understand the coming Bitacora Impact Map.

If not, I have probably done something wrong, which wouldn't surprise me 🙂 or you are not familiar with the problem to solve. Some of the ideas, like “tag page”, will be described in the user stories, so don't panic; we will get there.

Since these graphs have gone through several review processes, the partial pictures might not be 100% up to date. The general impact map should be, though.

Center and Level 1: Why? and Who?

 

The center lacks the quantitative goal. Since this is an ideal exercise, I will ignore it for now. We will go down to metrics and measuring impact at the very end.

At the end of the previous article, about personas, I wrote that for an introductory analysis we could reduce them to 5. Since I went further, I considered groups and described personas for each group, as we will use them at some point.

 

Level 2: How?

This is the second level, defining the impact for the product team members.

And this is the second level, defining the impact for the other personas.

As you can see, the impact is described in the following areas:

  • Participation: the tool should promote an increase in interactions based on what each persona did.
  • Team/project follow-up: Bitacora should make following up on a team's or project's activity easier.
  • Analysis: analyzing what the team is doing or has done should be easier with Bitacora. Self-analysis is just as important.
  • Report: Bitacora is designed to improve reporting in both the vertical and the horizontal direction (matrix).
  • For the Team Member, Bitacora should work as the central point for everyday (short-term) information.

Level 3: What?

For easier visualization I have divided this level into three different images. The first one corresponds to the Product Team:

This is the third level for the Scrum Master persona:


And this is the third image, corresponding to the other three groups:

Complete Impact Map

Here you have the complete map.

You might be wondering what the icons are for.

I mentioned earlier that one of the advantages of using impact maps is the structure. As a consequence, you can start assigning priorities to the different scopes, which is very useful before writing down the user stories.

Note: 

The result of this process should in general be features, but in certain circumstances, at this point, it could also be processes. I used the “help” icon to reflect them. This concept is arguable. I am at this point unsure whether it will survive the story description.

Series of articles

  1. The diary (aka bitácora): towards alignment in distributed environment.
  2. Bitacora: environment definition.
  3. Bitacora: personas
  4. Bitacora: Impact mapping

The following article will describe the use cases I worked with so you can compare both approaches.