Use sccache for builds #1724
In talking with kats I realized that we might be better off just trying to get sccache working with taskcluster directly instead of getting it working with travis first.
Here's an example of using secrets with taskcluster: … and upload the secret using https://tools.taskcluster.net/secrets
Also: … and https://dxr.mozilla.org/mozilla-central/source/taskcluster/scripts/builder/build-linux.sh#55
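To make that concrete, here is a minimal sketch of the glue a build task could run, assuming the task has the taskcluster-proxy feature enabled and a scope for the secret. The secret name, its JSON field names, and the proxy URL path are placeholders; `RUSTC_WRAPPER`, `SCCACHE_BUCKET`, and the AWS credential variables are sccache's standard environment variables.

```python
# Sketch only: fetch AWS credentials for sccache from the Taskcluster secrets
# service (via the in-task taskcluster proxy), then build with sccache enabled.
# The secret name and JSON layout below are hypothetical.
import json
import os
import subprocess
import urllib.request

SECRET_URL = "http://taskcluster/secrets/v1/secret/project/webrender/sccache"  # placeholder path

with urllib.request.urlopen(SECRET_URL) as resp:
    secret = json.load(resp)["secret"]  # the secrets service wraps the payload in a "secret" key

env = dict(os.environ)
env.update({
    "RUSTC_WRAPPER": "sccache",                      # route rustc invocations through sccache
    "SCCACHE_BUCKET": secret["bucket"],              # S3 bucket backing the shared cache
    "AWS_ACCESS_KEY_ID": secret["aws_access_key_id"],
    "AWS_SECRET_ACCESS_KEY": secret["aws_secret_access_key"],
})

subprocess.check_call(["cargo", "build", "--release"], env=env)
subprocess.check_call(["sccache", "--show-stats"], env=env)  # print cache hit/miss stats
```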
@jrmuizel @metajack What is the likelihood of getting someone at Mozilla who works on this stuff to look into this officially (running CI on taskcluster)? I certainly don't know enough about that stuff to look into it. We haven't been able to merge anything for ~3 days with the latest travis issues, which follows another fortnight of similar issues...
@staktrace is looking at this a bit right now and we can probably get @luser to help before he goes on PTO next week.
At the moment I only have basic linux64 taskcluster integration working. There are two steps involved: installing the github-taskcluster integration tool on the repo, and adding a .taskcluster.yml file that describes the jobs.
After doing this, each PR update or push will trigger the taskcluster job. You can see a sample one (from my taskcluster-ci commit) at https://tools.taskcluster.net/groups/GNsTmjKaQyeF5v623NM6eQ - it runs the debug and release commands on whatever the current stable rust version is. @jrmuizel said that for now we can just leave the "nightly" rust commands on travis. If we want to pin to a specific rust version, we can update the docker image to have that rust version preinstalled, and remove the rustup commands from the .taskcluster.yml script.

Next steps are figuring out how to hook up sccache and getting OS X jobs running. I think OS X is probably more important at this point, since with that we can start using taskcluster "in production", and then work on getting it faster with sccache.

I'd like to merge my taskcluster-ci branch as soon as possible, but I'm not sure what effect that will have on bors/homu and the regular workflow. We should probably set up a "maintenance window" or something where we can do the merge, and ensure things are working or roll back if they aren't. Or if there is a test repo somewhere with bors/homu we can use that to try this out.
sccache works fine, so I'm not sure what in particular is causing problems, or whether your request is actually related to sccache builds or just builds being messed up in general. It's pretty easy for us to move a particular repo over to our buildbot instances if that is what is needed.
I talked to the taskcluster folks and I have steps for setting up OS X worker machines for taskcluster, so that we can run our own CI farm. It's fairly straightforward and I have it running using my laptop as a test. I think we can rustle up some OS X machines in the Toronto office or hosted remotely somewhere and use them as dedicated CI machines for webrender. As a bonus, since the OS X setup doesn't use docker, it doesn't reset the machine state after each job is run. This means even if we just use a local sccache we should get a good speedup.
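As an illustration of that last point, here is a minimal sketch of what a job on such a persistent worker could run; the cache directory path is just a placeholder, while `RUSTC_WRAPPER` and `SCCACHE_DIR` are sccache's standard knobs for wrapping rustc and using a local disk cache.

```python
# Sketch: use sccache's local disk cache on a persistent (non-docker) OS X
# worker, where the cache directory survives between jobs.
import os
import subprocess

env = dict(os.environ)
env["RUSTC_WRAPPER"] = "sccache"
env["SCCACHE_DIR"] = os.path.expanduser("~/sccache-cache")  # illustrative path that outlives a single job

# Start the sccache server; ignore the error if one is already running
# from a previous job on this machine.
subprocess.call(["sccache", "--start-server"], env=env)

subprocess.check_call(["cargo", "build", "--release"], env=env)
subprocess.check_call(["sccache", "--show-stats"], env=env)  # print cache hit/miss stats
```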
The easiest thing to do is probably just get a dedicated mac mini or two from macstadium and expense it.
I set up the worker on the mac mini that jrmuizel rented from macstadium. It seems to be working ok. Next step to move this along is to try it on servo/webrender instead of just my clone of the repo. Whoever owns servo/webrender needs to install the github-taskcluster integration tool from https://github.com/apps/taskcluster. Then we need to get :jonasfj to add the necessary scopes to this repo, so that it can spawn the "kats-webrender-ci-osx" type worker via the "localprovisioner" TC provisioner. And then after that I can make a PR from my branch with the .taskcluster.yml file and see how bors/homu deal with it.
@metajack @larsbergstrom Is ^ something you can help with (enabling the TC tool on the WR repo)?
Should be done. |
Thanks. I got :jonasfj to add the scopes as well so we should be good to try the PR. I'll submit that shortly. While I was waiting I installed sccache on the OS X worker but I ran into a rustc internal compiler error when building WR with it. I'll investigate that more but for now let's do this without sccache.
We've seen that error before elsewhere:
I think this is because cargo creates a make-style jobserver now, and it will pass it down to rustc (for use when you use codegen-units=N). There's some weird interaction here with how the jobserver fd gets passed down and I don't quite understand it.
Yeah, I just commented in rust-lang/rust#42867 which appears to be tracking this problem.
Quick update: I made PR #1746 to get the .taskcluster.yml file merged into the webrender repo. By default this will run the CI jobs via taskcluster for PRs by "collaborators" and for pushes. (We need to set …)

I looked at the bors and homu code/docs to figure out exactly what it is they do and what integration we need there. It seems like when we run CI with travis it notifies the result to the bots via webhooks. AFAICT taskcluster-github doesn't have webhook capability yet, so we can either request that and wait for it, or just make the CI command itself call out to a webhook and report the success/failure.

The other important thing is that homu currently runs tests on the merge commit of the PR and latest master. That means we need some way of triggering the taskcluster run from homu, the same way it triggers travis/appveyor runs. I haven't looked into whether this is possible yet; it might be a feature we need to request of the taskcluster-github integration tool. Having an API to do this would also let us make things like retry requests work. Right now retrying has to be done manually via the taskcluster task page, and even then it won't update the final status of the build (I filed bug 1402136 for this).

And finally, one more thing that would be nice is if taskcluster canceled obsolete jobs when e.g. somebody pushes new commits to a PR. It doesn't do this yet; it's an optimization, but one that would be good to have. I filed bug 1402884 for this.
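One possible shape for the "CI command calls out to a webhook" idea is sketched below: the task reports its own result as a GitHub commit status, which bors/homu could then be configured to watch. The token, the "taskcluster-ci" context name, and the use of a commit status rather than a dedicated homu endpoint are assumptions, not anything taskcluster-github provides today.

```python
# Sketch: have the CI task report its own result as a GitHub commit status.
# GITHUB_TOKEN and the "taskcluster-ci" context name are placeholders.
import os
import requests

def report_status(sha: str, success: bool, target_url: str) -> None:
    """Post a commit status for `sha` that a bot could use as a merge gate."""
    resp = requests.post(
        f"https://api.github.com/repos/servo/webrender/statuses/{sha}",
        headers={"Authorization": f"token {os.environ['GITHUB_TOKEN']}"},
        json={
            "state": "success" if success else "failure",
            "context": "taskcluster-ci",   # hypothetical status context
            "target_url": target_url,      # link back to the taskcluster task group
        },
    )
    resp.raise_for_status()
```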
Add a .taskcluster.yml file to run CI using taskcluster

This is a test PR to see (a) whether taskcluster correctly picks up the PR and schedules the CI jobs, and (b) how bors/homu deal with this extra CI job. This is related to #1724
Wow, nice work. The TC builds are so fast compared to how long the normal builds take! |
This is done now, in #1789.
I realized that there's no special magic needed to make this work. It looks like the merge head is pushed as the …

I think really the next thing we want to do here is set up a webhook equivalent for taskcluster, so that it can notify the bots on success/failure. And then have the bots accept travis || taskcluster as success conditions for landing the merge.
#1871 adds "routes" to the .taskcluster.yml file which will allow us to listen for task-completion notifications. We would need code running somewhere that listens for the four tasks for a particular PR to complete successfully and uses that as a success condition for landing the PR. This can either be added to homu directly or run as a separate service that simulates a travis webhook, or something. With the mozillapulse python library doing most of the work it shouldn't be too hard to glue things together.
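As a rough illustration of that glue service (written against kombu directly rather than the mozillapulse wrapper), a listener could bind a queue to the taskcluster task-completed exchange on pulse and filter on the custom route. The pulse credentials, queue name, and route pattern below are placeholders, not what #1871 actually defines.

```python
# Sketch: listen on Mozilla Pulse for taskcluster task-completed messages that
# carry a custom route, and treat them as the success signal for a PR.
# Credentials, queue name, and the route pattern are placeholders.
from kombu import Connection, Exchange, Queue

task_completed = Exchange(
    "exchange/taskcluster-queue/v1/task-completed", type="topic", passive=True
)
queue = Queue(
    "queue/PULSE_USER/webrender-ci",          # pulse expects queue names under your pulse user
    exchange=task_completed,
    routing_key="route.project.webrender.#",  # placeholder for the route added in .taskcluster.yml
)

def on_message(body, message):
    task_id = body["status"]["taskId"]        # task-completed payloads carry a status object
    print("task completed:", task_id)
    message.ack()
    # ...record the completion; once all four tasks for the PR are green,
    # notify homu (e.g. via the commit-status idea sketched earlier).

with Connection("amqps://PULSE_USER:PULSE_PASSWORD@pulse.mozilla.org:5671") as conn:
    with conn.Consumer(queue, callbacks=[on_message]):
        while True:
            conn.drain_events()
```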
We are running CI on TaskCluster now. Do we still need this open @staktrace @jrmuizel?
I think we can close it. Until rust-lang/rust#42867 is solved we probably won't get sccache to work on the OS X builder anyway, and it might not be worth the effort unless we start building up a backlog again.
mozilla/sccache#179 has some instructions.
When I last looked at these I believe I couldn't figure out how exactly to communicate the encrypted AWS key to sccache. The details are fuzzy though.