BuildStream metrics: exploration

Metrics and telemetry are fundamental in any engineering activity to evaluate, learn and improve. They are also needed to consolidate a culture in which opinion and experience are continuously challenged, in which experimentation and evidence becomes the norm and not the exception, in which transparency rules so co-workers are empowered, in which data analysis leads to conversations so evaluations are shared.

Open Source projects has been traditionally reluctant to promote telemetry, based on privacy concerns. Some factor that comes to my mind are helping to change this perception:

  • As FLOSS projects grow and mature the need for information grows.
  • It is easier now to process big amounts of data while keeping high levels of anonymity.
  • The proliferation of company driven and consortium driven FLOSS projects, specially those related with SaaS/cloud technologies and products, showing how useful telemetry is. In general, corporations are less concerned about personal data privacy than many Open Source projects though.
  • The DevOps movement is spreading like a pandemic and telemetry is an essential action for practitioners.

So the last few years data analytics is becoming more popular among Open Source projects.

Finding the right metrics is frequently tough. Most of the times projects, teams or departments get drowned in data and graphs before they realize what actually matters, what does it have real business value. When you find the right metrics, somehow it means that the right questions are being asked which I find the hardest part. To identify those questions, I recommend organizations or projects to invest in exploring and learning before moving into automating the data collection, processing, plotting and irradiate the results to be analysed.

So when BuildStream is getting into its third year of life, I thought it could be interesting to invest some effort in digging into some numbers, trying to find a couple of good questions that provide value to the project and the stakeholders involved.buildstream-beaver

The outcome of this exploratory effort was published and spread across the BuildStream / BuildGrid community. The steps taken to publish the report has been:

  • Select a question to drive this exploratory effort, in my case: are we growing?
  • Select data sources: in my case, information from the ticketing system and the git repositories.
  • Collect the data: in this case, the data sets from the BuildStream ticketing system were exported from gitlab.com and the data sets from git obtained through a script developed by Valentin David.
  • Clean the data set (data integrity, duplications, etc.) in this case the data was imported into GSheets and worked there.
  • Data processing: the data was processed and metrics were defined using GSheet since the calculations in this phase were simple enough and the amount of data and processing power did not represent a challenge for the tool.
  • Plot the data: since the graphs were also simple enough, GSheet was also used for this purpose.
  • Initial analysis: the goal here was to identify trends, singularities, exceptions, etc and point them to the BuildStream community looking for debate and answers.
  • Report: provided in .pdf and .odt, it has been publishing in the BuildStream Group in gitlab.com and sent to the community mailing list. The report include several recommendations.

The data set could lead us to a deeper analysis but:

  • It would have also take me more time.
  • I wanted to involve the contributors and stakeholders early in the analysis phase.
  • Some metrics which collection, processing and plotting can be automated has been identified already so to me it is better to consolidate them to bring value to the project on regular basis than to keep exploring.

I understand that my approach is arguable but it has worked for me in the past.  The debate of just half way cooked analysis increases the buy-in in the same way that developers love to put their hands in half-broken tools. Feel free to suggest a better approach that I can try in the coming future. I would appreciate it.

Link to the report on gitlab.com. Download it.

What’s next?

I am looking forward to have a fruitful debate about the report within the BuildStream community and beyond. From there, my recommendation is to look for an external provider (it is all about providing value as fast as possible) that, working with Open Source tools, can consolidate what we’ve learnt from this process and can help us to find more and better questions… and hopefully answers.

What is BuildStream

I have been putting effort on BuildStream since May 2018. Check the project out.

Stable: not moving vs. not breaking

There are two terms that brings a heavy controversy in the Open Source world: support and stable. Both of them have their roots in the “old days” of Open Source, where its commercial impact was low and very few companies made business with it.

You probably have read a lot about maintenance vs support. This controversy is older. I first heard of it in the context of Linux based distributions. Commercial distribution had to put effort in differentiating among the two because in Open SOurce they were used indistictly but not in business. But this post is about the adjectivet stable

Stable as adjective has several meanings in English:

  • According to the Cambridge dictionary stable is, among others: firmly fixed or not likely to move or change. I am used to this meaning because of my backgroun in Physics.
  • One of the definitions provided by the Oxford dictionary is slightly different: not likely to change or fail; firmly established.

Can you see the confusion between…?

  1. It is hard to move… or move slow.
  2. It does not fail… it is hard to break.

I am not an English native speaker. Maybe because of its latin root (I am Spanish) or maybe eacuase I studied Physics, I do not provide to the adjective stable the second meaning. It has always called my attention how many people provide both meanings to the word, specially when referring to software releases. Looking at the dictionaries, I can tell why in English there is such link between both ideas.

I read today on Twitter another example of this confusion/controversy. The comment has its roots on a different topic but it has embedded this confusion, I think.

Open Source projects used to have as the main testing strategy the collaboration with a group of beta testers and power users. This strategy helped to create this relation between “not moving” and “not failing” that has become so popular. But this relation is not direct if you have a different release/deploy strategy. The way to increase stability (as hard to break) is by detecting and fixing bugs and that can be done moving fast (constantly updating), or moving slow (backporting). I will not get into which one is better. Both are possible and popular.

I think the word stable should be avoided. Many system and application developers or integrators refer to it as “move slow” as “it will not change much“. But what many users hear is “it will not break” which is why “it moves slow“.

Which adjective can we use to label a release that “moves slow“, so it is not mistaken with “it is not expected to break“?

Is this controversy present in other languages as strong as I perceive it in English? Is it just me who see this controversy?

Software events in Málaga

In May, there has been two interesting conferences in Málaga that  I have attended to:

J On The Beach

Since I moved back to Málaga, I have been trying to attend to this conference and for one reason or another, I couldn’t, until this 2018 edition. J On The Beach is the most international software event that takes place in the south of Spain on regular basis and this year it was no exception. It is a very well organised event, with good speakers, up to date topics and participants from many different countries. The conference is in English.

My expectations were high and the event met them. I had the chance to listen for the very first time to M. Hashimoto, that provided the audience an overview of Terraform. It is always a pleasure to listen to those who create the code you use. In that regard, he is becoming very popular for tools like Vagrant or Terraform itself. I enjoyed the talk very much as I did with the talks from  H. Karau and J. Amstrong. By the way, M. Hashimoto will be speaking at OSSJ 2018 in a few days so I will have the chance to see him again, but this time in Tokyo.

J On The Beach is a 3 days long event where the first one is focused in workshops and training sessions while the other two are mostly talks. After the closure a party is organised with good music and plenty of drinks.

I recommend to my readers to pay attention to next year edition. Add this event to your list. Tickets are sold fast so subscribe and pay attention to the newsletter to find out when can you get one.

OpenSouthCode

OpenSouthCode is a general purpose FOSS event, very popular among students and local hackers. Last year I talked about the FOSS automotive platforms developed by AGL and GENIVI. This year I provided an overview of CIP, the work that we are currently doing and near future plans. Check the slides for more information.

OpenSouthCode in a two days event. The first one, on a Friday, is all about workshops, meetup and training activities while the second one, on a Saturday, is reserved for talks in Spanish.

If you live in the South of Spain or happen to be around when this event takes place, I totally recommend it.

Coming Events to Málaga

There are two additional software events in Málaga to pay attention to:

Gamepolis

There are more and more international software companies coming to Málaga and several of them are gaming companies,. There are also a few Spanish ones. If you are into games development or you are a avid gamer, this will be a great event for you.

PyConES 2018

Codethink sponsored PyConES 2017 and several of my colleagues attended together with me. One of them, Pedro Álvarez was part of the organization. It took place in Cáceres, Extremadura. This edition will take place in Málaga. I plan to attend again. This is a 400 people event packed of Python developers.

Further notes

I would like to see more international FOSS events coming to Málaga. It is a great place to organise a conference: it has a big airport with connections to all major hubs in Europe, direct flights to many European cities, fast train to both, Madrid and Barcelona, many accommodations from a wide range of prices and quality, a big congress palace and hotels to host events, good weather most of the year, the beach, Granada, Ronda, Antequera or Sevilla are close enough… and an increasing number of software companies opening offices in the area. It is nice to not having to travel to attend to a good conference once in a while.

Moving from a traditional product/release focused delivery model to a rolling model

The past few weeks the GDP delivery team together with some key contributors, has been working on a not very visible but still important change. The GDP project has put the basis to turn GDP release based delivery model to a “rolling” one. My colleagues will provide in a coming post the technical details behind this change. I want to provide a higher level view of what is happening and why.

Some background

GDP was born as a “demo” project. The main goal was to provide a platform to show the software components for automotive that the different GENIVI Expert Groups were developing. This was done through a delivery model focused on publishing a stable and easy to consume version of the project every few months, a major release.

Strictly speaking, GDP is a derivative. It is based on poky and uses Yocto tools to “create” the Linux based platform, adding the different components developed by the GENIVI Alliance together with upstream software. For the defined purpose, the release centric model works fine, especially if you concentrate your effort is very specific areas of the software stack with a small number of dependencies on the other areas, and a limited number of contributions and environments where the system should work.

During this 2016, the GDP has grown significantly. We have more software, more contributors, more components and more target boards to take care of. Although the above model has not been not challenged yet, it was just a matter of time.

As I explained in two previous posts [1][2], the GDP is moving from a being a Demo to a Development Platform. Changing the mission means changing the goals and the target group, which implies the need to adjust the deliverable to meet the new expectations.

So, right after the 14th AMM, the Delivery Team decided to change the delivery model to better meet the new mission, providing developers the newest possible software with the an increasing quality threshold. At the same time, in order to increase the number of contributors, the GDP needs to provide a new solid platform every once in a while. That should be done trough a solid release.

What is a rolling delivery model?

The key idea behind a modern delivery release model is to ensure that the transition from one stable release to the next one takes an affordable effort. I will put an example to picture the idea.

Problem statement

Imagine an organization that publishes one release per year. Let’s assume that a particular release included 100 patches developed by employees and, during the lifetime of the release (1 year too), another 100 patches were added to the product as bug fixes and updates. At the end of the release lifetime, the product includes 200 patches that define the value the product provides to customers and users.

Either for technical or business reasons, a year later it is time to upgrade. Our organization has to create a new Linux based system with newer upstream code and they have to integrate the patches from the previous release plus the updates and bug fixes developed for the coming release.

After a simplification process done by engineers, the number of patches needed to be integrated in this newer base system is reduced to 150. The organization also wants to add to this new release another 100 patches that represent the new features they have been developing during the last year for this new version.

The delivery team now has to integrate 250 patches in the new base system, 150 of them coming from the previous release. One might think that the effort required to do this is 2.5 times the effort invested in the previous release. Maybe you think that the effort is not so high since some of the patches have been developed thinking about the new base system. There are many other considerations like this one that might affect the initial estimation. This example is obviously a simplification.

However any experienced release manager will tell you that moving patches integrated in an older system base onto a newer one (forward-porting) requires additional effort, beyond a linear relation with the number of patches. Forward-porting is the “road of hell“. Iterate this example a few times and you will understand why there are so many organizations out there that have as many people focusing on delivery as they have in development. They migrated to Linux base system keeping the traditional delivery model they had while working with closed source software.

Release based delivery model

Possible solutions

One of the paths to improve the situation is upstreamming those changes that affect generic components. Some companies also upstream their new features early in their development process, generally looking for wider testing, or after they have been released to customers, to increase adoption and reduce future maintenance effort. This is definetly a must do.

From the delivery perspective, the most popular way to tackle the problem though is reducing the release cycle, so the number of patches to forward-port in each release is smaller. The development time and the maintenance cycles are also smaller. The same applies to the complexity of the forward-porting activities. “Jumping” from one release to the next one is easier to do. Add automation of repetitive tasks to this recipe and you feel you have a win…. for some time.

The journey through the “road to hell” becomes more comfortable, but our organization is still getting burned, even in the case that our customers and ourselves can digest releasing frequently. We all know how expensive and stressful a release might become.

The most suitable option to achieve sustainability while scaling up the amount of software an organization can manage without releasing more often than your market can digest is to change your delivery model.

Rolling delivery models are a serious attempt to solve this problem, putting integration as the central element instead of the software itself.
This model is not new. Gentoo has been doing it forever, but it was Arch Linux who implemented it in a way that immediately attracted the attention of thousands of developers. Still it was a model with no hope beyond hardcore Linux developers. openSUSE brought this model to a new level by implementing a process which output was stable enough for a much wider audience, and compatible with the release of a more stable and a commercial releases. Nowadays there are other interesting examples out there that commercial organizations can learn from.

What is a rolling model?

It is still hard to define but essentially it is a process in which ideally you have one continuous integration pipeline as the one an only entry point for the software you plan to ship. Releases then become snapshots of all or part of the software already integrated after going through a specific stabilization, deployment and release process.
So ideally, if you release a portfolio, you integrate only once, reducing significantly the costs of having different engineers working on different versions of the same software and forward-porting, among other benefits.

Rolling delivery model

So a rolling delivery model is a lot more than a continuous integration chain, although that is the key point.

Please have in mind that this is an oversimplification. This description doesn’t go into detail on other key aspects like maintenance cycles, how upstreamming affects the process, strategies towards updating the released products, etc.

A transformation process that takes an organization from a release centric model to a rolling one is about doing less and doing it faster, so less people can handle more software with less pain, allowing more people to concentrate in creating value, developing new and better software instead of just shipping it.

Back to GENIVI

Moving from a release centric to a rolling model is hard work. Frequently it is easier to start all over again. Since the GDP is still a relatively small project, we can afford going through the transformation process step by step.
The first stage has been creating that single integration chain and treating GDP-ivi9, our latest release, and those that follow it, as a deliverable of what we call today Master. Ideally, no single patch will be added directly to the release branches. They should come from Master. That way, we reduce (ideally to zero) the effort of forward-porting of patches while putting in the hands of our contributors the latest software on a regular basis.
To do so, we are in the process of adapting our simple processes and CI system to the new model, GDP repository structure, the wiki contents, the task management structures, several key policies, our communication around the project…

The GDP will face a very interesting challenge since this model needs to be proven successful for a derivative. If we are able to move fast enough, it will come the time in which we will need to decide if GDP keeps being a derivative or it becomes upstream, that is, either GDP limits the delivery speed based on the Poky release cycle, or we work upstream with the Yocto project to increase our delivery speed.

That is a good problem to have, isn’t it?

If (almost) everything goes right, after adding a few needed services in GENIVI’s infrastructure and ensuring the updated software is in compliance with selected verification criteria, the same number of people will be able to manage and deliver more software. And once the new processes become more stable, automation will not just increase efficiency, it will boost the project by allowing GENIVI to achieve goals that only big organizations with large delivery teams can do. This is the kind of transformation that takes time to consolidate, but has a huge impact.

Based on my experience, I believe that if GENIVI is able to sustain this effort and keep a clear direction the next couple of years, the benefits of moving towards a rolling model will be noticeable even outside the industry.

This blog post was originally published in the GENIVI blog site on 2016/07/04. I have adapted the formatting to adapt it to Blogger. The content should be the same.

Embrace Open Source culture: the 5 common transformations.

Article originally published at Linkedin on May 15th 2016.
This is a story of what I have lived or witnessed a few times so far. A story of an organization that used to consume, develop and ship proprietary software for many years. At some point in time, management took the decision of using Open Source. Like in most cases, the decision was forced by its customers, providers, competitors… and by numbers.

A painful but unavoidable transformation was required.

 

1.- Open Source consumer

Engineers had to learn a new system, adapt or re-write those features that used to made the organization unique, together with many other painful actions. It was expensive at the beginning but, due to the cost reduction in licenses and the change that Linux represented in the relation with providers, in a few years it was clearly worth it. And if it wasn’t, it didn’t matter since it was what the market demanded. There was no way back

Every software organization has gone through their unique journey, but the final sentence of the story has been the same for all of them: they became Open Source consumers.

 

2.- Open Source producer

This organization gained control over its production and, by consuming Open Source, it could focus many resources in differentiation, without changing the structure, development and delivery processes. At some point, it was shipping products that involved a significant percentage of generic software taken “from internet”.

It became an Open Source producer.

You can recognise such organizations because they frequently create a specific group, usually linked to R&D, in change of bringing all the innovation that is happening “in the Open Source community” into the organization.

Little by little this organisation realised that giving fast and satisfactory answers to their customer demands became more and more expensive. They got stuck in what rapidly became an old kernel or tool chain version…. Bringing innovation from “the community” required back-porting, solving complex integration issues, incompatibilities with what your provider brings, what your customer wants.

So they have to upgrade.

This organization will be able now to take advantage of all the common features and compatibility that the new kernel, the new tool chain… brings. But, guess what, forward porting all the differentiation features this organization has developed, all the bug fixes, is so much work and so complicated that the challenge put the organization at risk.

 

 

3.- Open Source contributors

The organization feels now the downsides of becoming a blind Open Source consumer and producer. Execs feels like when a bubble explodes and they are inside of it. They has less control than they thought, which turns out to be expensive, and what is worse, they lack the expertise within the organization to gain it….

But, after struggling for some time, this organization survived, which means that it has learnt some lessons:

  • Upstream those features that are not differentiation factors any more.
  • Increase the investment in that reduced groups of rock-stars that are up to date of what’s going on “in the different communities”.
  • Invest in those Open Source projects that develop the key software you consume.
  • Even better, start your own Open Source project to promote your technologies and be perceived as a leader…
  • Reduce the upgrade cycle, so the “upgrade pain” is lower. As a side effect, the organization has the opportunity to increase the cash flow when doing two smaller upgrades instead of a big one. At the very end, the real profit comes with the first update, not with the new version, right?
This organization ended up upstreaming features when they could, normally very late, because “they do not have time”, frequently assigning that task to young inexperienced developers or, even “better”, subcontracting it, which is “cheaper”.

You can recognise that this organization has gone through the described process when attending to conferences given by any of its executives. They cannot stop talking about how much they contribute to this and that community, about how awesome the community is, how important it is to be open and share…. they are referring all the time to communities/upstream as “us” and “them”. They think they got it, they really do.

Most of them believe they are in the crest of the wave after going through this third transformation process. They are innovative, they has been able to reduce their time to market, they are gaining reputation within a variety of communities… They are not just Open Source consumers and producers any more. They are also contributors. Some of them even heavy and “successful” contributors.

But if you look closer, they have not adopted “the Open Source way”.
This organization keeps their traditional processes intact. It is managed in the same way. Decision processes are taken like when it was a proprietary company, it has not improved transparency significantly, it does not share code and practises among departments…. there is a totally different reality in front and behind its firewall, between production and R&D, between management, engineering, customer support, etc..

This organization face friction because of this reality. It still cannot move fast enough. Upgrading is still too expensive since now they have to do it more often than years ago, upstreaming goes so slow, when it happens. They cannot control the communities they are investing on…

So going through a forth transformation becomes unavoidable. Some refer to this transformation as “upstream first”.

 

4.- Upstream first or becoming a good Open Source citizen

This fourth transformation basically means that upstreaming is part of your development process, not an aside task. It also means that communities are part of your delivery strategy, not an after market topic, that R&D is a two way road where you do not just consume innovation created by others but you share yours, not just “promote it”. You really need to get involved.
This organization will learn that by becoming more open, their engineers learn more and faster, so the organization itself. It is at this stage where the organization really understand where the real value is in the software they produce compared to what is commodity…or that is what its executives and managers most likely believe, once again. 🙂

But open source (no capital letters any more) is not about being open, but about being transparent, which means that is not just about seeing what is behind the glass, but also understand it.

I believe the fictional organization I am talking about will have to take one more step, the fifth one. It will be about “becoming upstream”.

 

5.- Becoming upstream or being an open source organization

This is about understanding that, if you consume, produce and contribute Open Source, the smart thing to do is becoming an open source company. I think it is naive to pretend taking full advantage of  Open Source while keeping your traditional corporate culture, which collides with the one of those who produces most of the software you consume and ship, who are your “upstream”. You are building your business on top of them. Since you cannot control them, become “them”.

The smart thing to do is to surf the wave, not fight against it, generating friction. Any manager knows that friction is expensive, reduces focus and drives away talent.  It is bad for the business.

The required culture change to succeed in this fifth transformation involves thinking less about us (company and customers) and them (community), and more about us (ecosystem). It can’t be any more about upstream and downstream but about technology and service. It has to be less about upgrading and more about updating, less about “manage” and more about “lead” at every level of the organization, not just referred to execs and managers.

It is a transformation in which engineers are empowered, where management is more focused on collecting information for execs instead of producing it, and after decisions are taken, their key focus is alignment. A transformation in which execs get closer to where the real value is, to people, because they are the “masters”. A transformation in which engineers not just follow, they get exposed, they take responsibility and assume the consequences… getting paid for it.

An environment in which accessing to key information does not depend on your position within the organization chart, which means that power does not depend so much  on what others ignore, but decisions are taken based on shared knowledge. A culture in which transparency is the norm not the exception.

In summary, a transformation that leads to a stage in which the organization steers its ecosystem instead of driving it. So it leads it in a sustainable way.

 

A quimera?

I understand it might sound like a quimera, but:
  1. No more than it would have sounded 15 years ago any of the stories that so many CEOs or Open Source Program managers from leading corporations are telling nowadays in popular FOSS events about “their transformation”.
  2. I do not think the debate is if this fifth transformation will be needed, but about when and how to go through it.
  3. My +10 years of experience in Open Source and +17 as manager tells me that, waiting to face any of the first four transformation processes until you have no choice is an unnecessary risk. I suspect the same will apply to the fifth one.
So my message is,
  1. Consume, produce and contribute to open source being a good citizen.
  2. Embrace Open Source culture… better sooner than later.

Pic link.