Add support for Bazel build system #3129

Closed
jwnimmer-tri opened this issue Aug 11, 2016 · 38 comments

@jwnimmer-tri
Collaborator

jwnimmer-tri commented Aug 11, 2016

The Bazel build system is a relatively new tool that provides correct, reproducible, fast builds. It is the open-source version of Google's own build tool.

I believe that Bazel would solve many of TRI's challenges in using and supporting Drake, and thus we should experiment with adding Bazel support into Drake. For starters, it should be enough to merely add Bazel support for the newer C++ core of Drake (common, math, systems_framework).

This would start as a workstation-only test, then graduate to a post-merge build only, and not in the officially supported set. If and when we demonstrate that it is reliable and useful for developers, we can consider making Bazel support official.

It would be nice to start on this soon, in order to help prepare and test the upcoming Bazel 0.5 official support for Windows in a few months, so that we can help drive it to meet our needs.

@jwnimmer-tri jwnimmer-tri added unused type: question team: software core unused team: automotive labels Aug 11, 2016
@jwnimmer-tri jwnimmer-tri self-assigned this Aug 11, 2016
@david-german-tri
Contributor

I'm in.

@jamiesnape
Contributor

jamiesnape commented Aug 12, 2016

  1. CI system would need to be completely refactored.
  2. Multiple build systems are a bad idea.
  3. Adding features to CMake that you may need in future is easy, for Bazel?

Most of your problems are due to legacy code and the unusual PODs conventions therein. If you rewrote the build system from scratch in any language it would be an improvement.

@mwoehlke-kitware
Contributor

Some critical features that I'm not seeing offhand in Bazel:

  • Ability to locate and use external libraries. ("Use" seems to be there, at least partly, but "locate" is missing.)
  • Ability to install things.
  • Any Windows support at all. (It looks like Bazel may assume a GCC-like compiler.)

To fully take advantage of it, we'd probably also have to port all of our dependencies.

@jwnimmer-tri
Collaborator Author

> Multiple build systems are a bad idea.

This is a compelling argument against attempting a port. Supporting two systems in parallel during the transition will be somewhat painful. We can perhaps localize the pain to only the bazel-porters (i.e., have a separate, non-authoritative CI that nobody but the porters cares about).

Having a good plan of action here would be an important part of this pursuit (and I don't have one yet).

> CI system would need to be completely refactored.

I'm not too worried about this. It will be effort and cost, but not risk.

> Adding features to CMake that you may need in future is easy, for Bazel?

As with SCons, Bazel lets you escape into (limited) Python if you really want to, and is open-source should you need to modify the actual core.

> Most of your problems are due to legacy code and the unusual PODs conventions therein. If you rewrote the build system from scratch in any language it would be an improvement.

I agree that's a big part of the problem, and that nuking the current build system code with a rewrite is a big win. Still, if we stick with CMake, the lack of reproducibility and caching for builds and tests is a big hole in CMake that Bazel and SCons both fill.
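
To make the caching point concrete, here is a minimal Python sketch of the general idea: key each build or test action on its command line and input contents, and skip the action when the key has been seen before. This is an illustration only, not how Bazel or SCons is actually implemented; `action_key`, `run_cached`, `fake_compile`, and the in-memory `cache` dict are all hypothetical names.

```python
import hashlib


def action_key(command, input_files):
    """Hash the command line plus the contents of every input file."""
    digest = hashlib.sha256()
    digest.update("\0".join(command).encode())
    for name in sorted(input_files):
        digest.update(b"\0" + name.encode() + b"\0")
        digest.update(input_files[name])
    return digest.hexdigest()


def run_cached(cache, command, input_files, run):
    """Return the stored result when the key matches; otherwise run and store."""
    key = action_key(command, input_files)
    if key not in cache:
        cache[key] = run(command, input_files)
    return cache[key]


# Hypothetical stand-in for a compiler; it only records that it was invoked.
calls = []


def fake_compile(command, input_files):
    calls.append(command)
    return "object-code-%d" % len(calls)


cache = {}
sources = {"math.cc": b"int f() { return 1; }"}
command = ["c++", "-c", "math.cc"]

first = run_cached(cache, command, sources, fake_compile)
second = run_cached(cache, command, sources, fake_compile)  # cache hit

sources["math.cc"] = b"int f() { return 2; }"  # edit the input
third = run_cached(cache, command, sources, fake_compile)  # re-runs
```

The second call never reaches the "compiler" because nothing it depends on has changed; the third does because the source bytes differ. The same keying applies to test results, which is where the time savings tend to be largest.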

> [Lacks the] Ability to locate and use external libraries. ("Use" seems to be there, at least partly, but "locate" is missing.)

Bazel often tends towards "build your libraries from source" (as we do with drake externals), which actually helps reproducibility, but you can also get required libraries from the system without much trouble.

And for "locate" (searching), I actually think that's a misfeature. There should be one way to build Drake-the-entirety on a given platform, which means that the dependency will be in one well-known, hard-coded place. Rooting around in system paths, the user's homedir, and finding something that "smells like" the right version of a library or tool only leads to confusing bug reports and wasted time. Or if the developer really wants a different path to some library, they can update their Bazel file to explicitly reference it.

> [Lacks the] Ability to install things.

Hmm? Are you saying we'd have to roll our own "zip up these files into a release" logic, but in the CMake case that's already built-in? That's fair, but doesn't seem like a huge difference, compared to the effort of ongoing build system upkeep and developer downtime.

> [Lacks] Any Windows support at all. (It looks like Bazel may assume a GCC-like compiler.)

This is in progress (http://bazel.io/roadmap.html), scheduled for a couple months out. Probably slips a bit more, but still relatively near-term compared to the scope of this ticket.

> To fully take advantage of it, we'd probably also have to port all of our dependencies.

Yes, this is worth a bit of effort-assessment. Dependencies that are just a pile of C++ code are easy to port (I've done it). Dependencies that have bespoke -DTHIS_AND_THAT autoconf-like logic are harder, but there are ways to cope.

@jwnimmer-tri
Collaborator Author

FYI, an active development prototype is ongoing here: https://github.com/jwnimmer-tri/drake/commits/bazelspike. It is able to build the automotive slice of the code.

The current "off the top of my head" plan is something like "allow BUILD files to be PR'd in-tree without CI yet" as we bring this up. This lets other bazel workspaces use Drake-as-a-library, without Drake-as-a-project needing to adopt bazel for its demos, tests, etc. Probably the next step after that is to get some kind of CI for Drake-as-a-library using bazel, but still rely on CMake for tests, demos, matlab, etc. I plan to iterate with David's review on a real plan offline, and then post here for comments.

@jwnimmer-tri
Collaborator Author

Next-up proposal... for C++ libraries that bazel knows about, teach CMake to obtain the list of sources from the BUILD file, instead of repeating the list in two places.

WIP at https://github.com/jwnimmer-tri/drake/tree/bazel-reuse2

@jamiesnape
Contributor

Not sure that is going to work. I am trying to work out if you would get extra cmake re-configures when file lists have not changed or not enough re-configures when they do change.

This is all kind of reversed when the purpose of CMake is to generate the build files for other systems. There is no bazel support now, but there could be.

@jwnimmer-tri
Collaborator Author

When we discussed Bazel vs CMake a few months ago during the visit, the Kitware consensus in the room at that time was that Bazel is higher-level than CMake, and the one-feeds-the-other direction should go as I've done here.

In any case, what is the best solution in CMake for a list of library sources to be dynamically computed? I had done 6a1a042 originally -- which asks a python program to emit a list of sources -- but the approach in the reuse2 branch seemed like it would scale to generating even more of the listfile content directly from the BUILD.
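
For readers following along, a rough sketch of the "python program emits a list of sources" idea might look like the following. This is not the script from 6a1a042 or the reuse2 branch; it is a minimal illustration that assumes the BUILD file spells `srcs` and `hdrs` as plain literal lists (no glob(), select(), or macros), so that Python's `ast` module can read it. The function names and the example BUILD content are hypothetical.

```python
import ast


def sources_for(build_text, target_name):
    """Return the literal srcs and hdrs of one cc_library in a BUILD file."""
    tree = ast.parse(build_text)
    for node in ast.walk(tree):
        is_cc_library = (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "cc_library"
        )
        if not is_cc_library:
            continue
        kwargs = {kw.arg: kw.value for kw in node.keywords}
        if "name" not in kwargs:
            continue
        if ast.literal_eval(kwargs["name"]) != target_name:
            continue
        paths = []
        for attr in ("srcs", "hdrs"):
            if attr in kwargs:
                # literal_eval raises ValueError on anything but a literal,
                # so a glob() here fails loudly instead of being guessed at.
                paths.extend(ast.literal_eval(kwargs[attr]))
        return paths
    raise KeyError(target_name)


def as_cmake_list(paths):
    """CMake represents a list as a semicolon-separated string."""
    return ";".join(paths)


# Hypothetical BUILD content, for illustration only.
EXAMPLE_BUILD = '''
cc_library(
    name = "common",
    srcs = ["example.cc"],
    hdrs = ["example.h"],
    deps = [],
)
'''

cmake_value = as_cmake_list(sources_for(EXAMPLE_BUILD, "common"))
print(cmake_value)
```

A listfile could, for example, capture the printed value with execute_process and OUTPUT_VARIABLE, but that only runs at configure time, which is exactly the re-configure concern raised above.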

@jwnimmer-tri
Collaborator Author

The other option I've considered is to fully omit the listfiles for core Drake (common, math, systems, solvers, etc.) and just have CMake delegate to bazel to build the core. That is probably better long term, but I was hoping to get there incrementally.

@jamiesnape
Contributor

That would certainly be cleanest.

@jamiesnape
Contributor

Then CMake would provide an external interface for CMake consumers, build CMake externals, and other legacy code. We have integrated with external tools like Ant and Gradle for Java before, so this may not be that different.

@jamiesnape
Contributor

jamiesnape commented Oct 21, 2016

(Might be an idea to create a ticket for the CMake side and assign it to Kitware since we have done similar many times before.)

@jwnimmer-tri
Collaborator Author

Yeah. I have to give more thought and discussion to the optimal path forward. For now perhaps I'll just keep the two spellings of the list-of-sources duplicated.

@jamiesnape
Contributor

Sure. My general opinion is that whatever the merits of Bazel or CMake, having two build systems duplicating each other or working in unusual ways to accommodate each other is going to cause a lot of extra maintenance at best, confusion or bugs at the worst. I also think there is some scope for upstreaming some useful Bazel support to CMake which might make things easier.

@jamiesnape
Contributor

jamiesnape commented Jun 27, 2017

CMake Removal TODO

drake-superbuild:

externals:

drake:

@jamiesnape
Contributor

jamiesnape commented Jun 28, 2017

Three proposals regarding CI for PRs:

  1. Switch LSan builds from CMake to Bazel.
  2. Switch off standalone cpplint builds (or replace with a Bazel one, but lint runs on all Bazel builds currently, anyway).
  3. Switch on at least one "everything" Bazel build.

@mwoehlke-kitware
Contributor

I was thinking that a Bazel lint-only build might be useful if we could also switch off the lint tests on the other Bazel CI builds. This would make it easier to tell lint failures from "real" failures.

...Just a thought; feel free to hate it 😄.

@jamiesnape
Contributor

At the moment switching off lint gets ugly because --test_tag_filters do not accumulate intelligently.
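
As a rough illustration of the accumulation problem: if a later --test_tag_filters replaces an earlier one instead of extending it, the filters have to be merged by hand into a single flag. The Python sketch below assumes that last-one-wins behavior; `merge_tag_filters` and the tag names are hypothetical and are not taken from Bazel or from the Drake CI scripts.

```python
def merge_tag_filters(*filter_lists):
    """Combine several tag-filter lists into one flag, dropping duplicates."""
    merged = []
    for filters in filter_lists:
        for tag in filters:
            if tag not in merged:
                merged.append(tag)
    return "--test_tag_filters=" + ",".join(merged)


# Hypothetical filters: one list from a shared config, one added by a CI job.
# A leading "-" excludes tests that carry that tag.
base_filters = ["-gurobi", "-snopt"]
job_filters = ["-lint"]

flag = merge_tag_filters(base_filters, job_filters)
print(flag)
```

Every place that wants to exclude lint then has to know about every other filter already in effect, which is the ugliness in question.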

@jwnimmer-tri
Collaborator Author

To me, it's important that CI builds match user builds, so we shouldn't add or remove tests in CI. The build variants should merely cover the matrix of supported platforms and documented command-line flags (such as --compiler=).

@jamiesnape
Contributor

Any opinion on the three proposals, and do you have a preference for the particular "everything" build? We can add others once CMake is turned down.

@jwnimmer-tri
Collaborator Author

> Switch LSan builds from CMake to Bazel.

Sure.

> Switch off standalone cpplint builds

Sure.

> Switch on at least one "everything" Bazel build.

Anything Xenial & Everything would be fine by me.

@jwnimmer-tri
Collaborator Author

The checklists upthread probably need a refresh at some point.

@jamiesnape
Contributor

We could probably split off the remaining Jenkins TODOs into separate issues and close this, I think. Other than those, I just see hermetic Gurobi remaining, and that does not seem worth keeping the issue open for now (or necessarily even bothering about).

@jwnimmer-tri
Collaborator Author

I agree we don't need an issue tracking Gurobi stuff. If it's a problem in some context, we can fix it if / when needed.

@jwnimmer-tri jwnimmer-tri assigned jamiesnape and unassigned stonier Apr 23, 2018
@jwnimmer-tri jwnimmer-tri removed the unused team: automotive label Jun 9, 2018
@RussTedrake
Contributor

@jamiesnape, @jwnimmer-tri -- I think it's safe to close this? ;-)

@jamiesnape jamiesnape removed their assignment Jun 22, 2021