[ros-users] [Discourse.ros.org] [Buildfarm] Status of Xenial migration

Steven! Ragnarök ros.discourse at gmail.com
Mon Dec 4 23:52:35 UTC 2017

@gavanderhoorn I'm very glad you asked! Making a status post has been on my list of things to do and has been consistently pre-empted so I appreciate that you prompted one.

In one sentence: it works, there are some issues inherited from the Trusty buildfarm as well as some new ones, and more changes are coming.

## Significant open issues

- [Reprepro performance has decreased significantly](https://github.com/ros-infrastructure/buildfarm_deployment/issues/165)
This is causing a significant bottleneck for low-level packages in the very large rosdistros, kinetic and indigo. Unless a farm is rebuilding the entire rosdistro, this may not be too much of a problem. Until we have time to pick up the investigation, we're managing it on the canonical buildfarm by throttling the number of concurrent jobs when low-level packages are rebuilding.

- [Systemd not restarting jenkins java process](https://github.com/ros-infrastructure/buildfarm_deployment/issues/162)
In its early days, the new buildfarm gave the Jenkins Java process too much of the total system memory, and spikes during high-load periods would trigger the oom-killer. Worse still, the buildfarm would not come back on its own. I'm pretty sure the issue here is that the init script created by the Jenkins Puppet module is not properly configured to be managed via Xenial's systemd/init compatibility. I plan to resolve this by writing an explicit systemd service for Jenkins rather than using the one built into the Puppet module.

- [Mercurial development jobs getting triggered too often](https://github.com/ros-infrastructure/ros_buildfarm/issues/488)
This appears to be a regression that has yet to be investigated. It primarily affects buildfarm instances with elastically provisioned workers.
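As a rough illustration of the Jenkins memory issue above, here is a minimal sketch of capping the Jenkins heap at a fraction of host memory so that load spikes leave headroom for everything else on the machine. The function name `jvm_heap_flag` and the default fraction of 0.5 are assumptions made for this example, not values taken from the buildfarm configuration; `-Xmx` is the standard JVM maximum-heap option.

```python
def jvm_heap_flag(total_mb: int, jenkins_fraction: float = 0.5) -> str:
    """Return a JVM max-heap flag sized as a fraction of host memory.

    Both the function name and the default fraction are hypothetical;
    pick a fraction that suits the other services on the host.
    """
    if total_mb <= 0:
        raise ValueError("total_mb must be positive")
    if not 0 < jenkins_fraction < 1:
        raise ValueError("jenkins_fraction must be between 0 and 1")
    # Round down so the heap never exceeds the requested share.
    heap_mb = int(total_mb * jenkins_fraction)
    return f"-Xmx{heap_mb}m"


if __name__ == "__main__":
    # Example: a host with 8 GiB of memory.
    print(jvm_heap_flag(8192))
```

Keep in mind that the heap is only part of a JVM's footprint (thread stacks and metaspace come on top), so the fraction should leave some slack.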

## Stability and maturity

I would like to work with the community to settle on a branching/versioning model that satisfies two needs: ours, to keep build.ros.org operating smoothly with live changes conducted properly through configuration management, and the community's, to have configuration management that doesn't change drastically week over week. I'm very open to suggestions from the community here. I'd be fine adopting [semantic versioning](https://semver.org/) outright, adopting basic versioning, or maintaining, with the help of a community team, `stable` and `latest` branches of the buildfarm deployment repositories.
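To make the semantic versioning option concrete, here is a minimal sketch of how a farm operator might decide whether a newer release is safe to pick up automatically. It assumes hypothetical `MAJOR.MINOR.PATCH` tags (no such tags are implied to exist on the repositories today), and the helper names `parse_version` and `is_safe_upgrade` are invented for this example. It ignores pre-release identifiers and the special rules semver applies to `0.x` versions.

```python
import re
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


# Accepts "1.2.3" or "v1.2.3"; pre-release/build suffixes are not handled.
_SEMVER = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def parse_version(tag: str) -> Version:
    """Parse a MAJOR.MINOR.PATCH tag into a comparable Version."""
    match = _SEMVER.match(tag)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH tag: {tag!r}")
    return Version(*(int(part) for part in match.groups()))


def is_safe_upgrade(deployed: str, candidate: str) -> bool:
    """True if candidate is not older than deployed and keeps the same major.

    Under semantic versioning, a major bump signals a breaking change,
    e.g. a restructured configuration format.
    """
    old, new = parse_version(deployed), parse_version(candidate)
    return new.major == old.major and new >= old


if __name__ == "__main__":
    print(is_safe_upgrade("1.2.0", "1.3.1"))  # minor bump, same major
    print(is_safe_upgrade("1.2.0", "2.0.0"))  # major bump
```

Under this scheme, a change to the configuration structure would be released as a new major version, so hosts that reconfigure automatically could stay within one major series until their operators opt in.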

The buildfarm_deployment repository has seen a lot of activity recently, primarily because improving and maintaining it is one of my core responsibilities. To facilitate the move to Xenial, I paid down quite a bit of technical debt in the form of duplication. I also made some refinements which had implications beyond my understanding at the time and which required further changes down the line. With outstanding issues on the Trusty buildfarm becoming increasingly pronounced, I also dropped some features from the initial "release" of the xenial branch in order to perform the migration.

The largest of the postponed features is currently in progress as [ros-infrastructure/buildfarm_deployment#167](https://github.com/ros-infrastructure/buildfarm_deployment/pull/167) and will enable deploying a ROS buildfarm on a single host, rather than the three needed to run the complete buildfarm today.

The deployment scripts had overlapping configuration values for the different roles, and in order to realize a single-host buildfarm the configuration "API"/structure will need to change as well. As a result, current configuration values will need further changes to keep up with `master` once that pull request merges. I'm happy to open a discussion on Discourse, or in a GitHub issue, to go into further detail on the branching model. Which do folks prefer?

[quote="gavanderhoorn, post:1, topic:3330"]
Theres also still a xenial branch. Is that a remnant, or is that also going to be merged at some point?
[/quote]

The xenial branches are remnants and will be removed at a future date. The buildfarm_deployment xenial branch was merged by https://github.com/ros-infrastructure/buildfarm_deployment/pull/158 and has not been deleted yet, to accommodate hosts that autoreconfigure based on the xenial branch. I think my ROS 2 farm is the primary culprit here. Per [advisory comments](https://github.com/ros-infrastructure/buildfarm_deployment/pull/158#issuecomment-338303264), I was waiting to delete the branches to give folks testing them time to move off and onto `master`, which is the branch that currently sees all new development.

[Visit Topic](https://discourse.ros.org/t/status-of-xenial-migration/3330/2) or reply to this email to respond.
