From 3f5dd282764f55525637a4ad3fa5f504faf316e2 Mon Sep 17 00:00:00 2001
From: Adam Bergmark
Date: Mon, 19 Dec 2016 19:37:42 +0100
Subject: [PATCH] More detailed descriptions of curator workflows.

---
 CURATORS.md | 175 +++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 160 insertions(+), 15 deletions(-)

diff --git a/CURATORS.md b/CURATORS.md
index 7b5f8997..0af3d9f3 100644
--- a/CURATORS.md
+++ b/CURATORS.md
@@ -4,7 +4,7 @@ Originally this was handled largely by Michael Snoyman, but now we are a team of
 4 people handling requests weekly in rotation. Curation activities are mostly
 automated, and do not take up a significant amount of time.

-## Workflow
+## Workflow overview

 This section sketches out at a high level how the entire Stackage build/curation
 process works:
@@ -23,10 +23,30 @@ process works:

 The typical story on pull requests is: If Travis accepts it and the
 author only added packages under his/her own name, merge it. If the
-build later fails (see below), then block the package until it's
-fixed.
+build later fails (see "Adding Debian packages for required system tools or libraries"),
+then block the package until it's fixed.
+
+If benchmarks, haddocks, or test suites fail at this point we
+typically also block the package until these issues are fixed. This is in
+order to add packages with a clean slate.

 Optionally we can check if packdeps says the package is up to date.
+Visit http://packdeps.haskellers.com/feed?needle=
+
+Builds may fail because of unrelated bounds changes. If this happens,
+first add any version bounds to get master into a passing state (see
+"Fixing bounds issues"), then re-run the Travis build.
+
+A common issue is that authors submit newly uploaded packages; it can
+take up to an hour before this has synced across the stack
+infrastructure.
You can usually compare the versions of the package in
+https://github.com/commercialhaskell/all-cabal-metadata/tree/master/packages/
+to what's on Hackage to see if this is the case. Wait an hour and
+re-run the pull request.
+
+Tests also commonly fail due to missing test files, and sometimes due
+to doctest limitations. You can point the maintainer to
+https://github.com/bergmark/blog/blob/master/2016/package-faq.md

 ## Fixing bounds issues

@@ -36,20 +56,121 @@ issue on the Stackage repo about the problem, and modifying the
 build-constraints.yaml file to work around it in one of the ways below. Be sure
 to refer to the issue for workarounds added to that file.

-* __Temporary upper bounds__ Most common technique, just prevent a new version of a library from being included immediately
-* __Skipping tests and benchmarks__ If the upper bound is only in a test suite or benchmark, you can add the relevant package to skipped-tests or skipped-benchmarks. For example, if conduit had an upper bound on criterion for a benchmark, you could added conduit as a skipped benchmark.
-* __Excluding packages__ In an extreme case of a non-responsive maintainer, you can remove the package entirely from Stackage. We try to avoid that whenever possible
+### Temporary upper bounds
+
+The most common technique: just prevent a new version of a library from
+being included immediately. This also applies when only benchmarks
+and tests are affected.
+
+* Copy the stackage-curator output and create a new issue, see e.g.
+https://github.com/fpco/stackage/issues/2108
+
+* Add a new entry under the "Stackage upper bounds" section of `build-constraints.yaml`. For the above example it would be:
+
+```yaml
+ "Stackage upper bounds":
+ # https://github.com/fpco/stackage/issues/2108
+ - pipes < 4.3.0
+```
+
+* Commit (message e.g. "Upper bound for #2108")
+* Optionally: Verify with `stackage-curator check` locally
+* Push
+* Verify that everything works on the build server (you can restart the build or wait for it to run again)
+
+Sometimes releases for different packages are tightly coupled. Then it
+can make sense to combine them into one issue, as in
+https://github.com/fpco/stackage/issues/2143.
+
+If a dependency that is not explicitly in stackage is causing test or
+benchmark failures you can skip or expect them to fail (see "Skipping
+tests and benchmarks" and "Expecting test/benchmark/haddock
+failures"). Bonus points for reporting this upstream to that package's
+maintainer.
+
+### Lifting upper bounds
+
+You can try this when you notice that a package has been updated. You
+can also periodically try to lift bounds (I think it's good to do this
+at the start of your week /@bergmark)
+
+If stackage-curator is happy, commit the change ("Remove upper bounds and close #X").
+
+### Amending upper bounds
+
+With the `pipes` example above there was later a new release of
+`pipes-safe` that required the **newer** version of `pipes`. You can
+add that package to the same upper bounds section
+(e.g. https://github.com/fpco/stackage/commit/6429b1eb14db3f2a0779813ef2927085fa4ad673)
+as we want to lift them simultaneously.
+
+### Skipping tests and benchmarks
+
+Sometimes tests and benchmark dependencies are forgotten or not cared
+for. To disable compilation for them, add the package to `skipped-tests` or
+`skipped-benchmarks`. If a package is added to these sections its tests or
+benchmarks won't be compiled, and their dependencies won't be taken into account.
+
+There are sub sections under these headers that are used to group types
+of failures together, and also to document what type of failures
+exist.
+
+### Expecting test/benchmark/haddock failures
+
+The difference from the `skipped` sections is that items listed here
+are compiled and their dependencies are taken into account.
These
+sections also have sub sections with groups and descriptions.
+
+One big category of test suites in this section is those requiring
+running services. We don't want to run those, but we do want to check
+dependencies and compile them.
+
+If there are no version bounds that would fix the issue or if you
+can't figure it out, file an issue
+(e.g. https://github.com/fpco/stackage/issues/2133) to ask the
+maintainer for help.
+
+### Waiting for new releases
+
+Sometimes there is a failure reported on a (now possibly closed) issue
+on an external tracker. If an issue gets resolved but there is no
+Hackage release yet, we'd like to get notified when it's uploaded.
+
+Add the package with its current version to the
+`tell-me-when-its-released` section. This will cause the build to stop
+when the new version is out.
+
+### Excluding packages
+
+In an extreme case of a non-responsive maintainer, you can remove the
+package entirely from Stackage. We try to avoid that whenever
+possible.
+
+This typically happens when we move to a new major GHC release or when
+there are only a few packages waiting for updates on an upper bounds
+issue.
+
+Comment out the offending packages in the "packages" section and add
+a comment saying why each was disabled:
+
+```
+ # - swagger # BLOCKED aeson 1.0
+```
+

 ## Updating the content of the Docker image used for building

 ### Adding Debian packages for required system tools or libraries
 Additional (non-Haskell) system libraries or tools should be added to `stackage/debian-bootstrap.sh`.
-Committing the changes to a branch should trigger a DockerHub. Normally only the nightly branch needs to be updated
+Committing the changes to a branch should trigger a DockerHub build. Normally only the `nightly` branch needs to be updated
 since new packages are not added to the current lts release.

 Use [Ubuntu Package content search](http://packages.ubuntu.com/) to determine which package provides particular dev files (it defaults to xenial which is the version used to build Nightly).

-Note we generally don't install/run services needed for testsuites in the docker images - packages with tests requiring some system service can be add to expected-test-failures.
+Note that we generally don't install/run services needed for test suites in the docker images - packages with tests requiring some system service can be added to `expected-test-failures`.
+It's good to inform the maintainer of any disabled tests (commenting in the PR is sufficient).
+
+If a new package fails to build because of missing system libraries, we often ask the maintainer to help figure out what to install.

 ### Upgrading GHC version
 The Dockerfile contains information on which GHC versions should be used. You
@@ -106,7 +227,7 @@ we're just not there yet.
 /opt/stackage-build/stackage/automated/build.sh lts-3.0
 ```

-Recommended: run these from inside a `screen` session. If you get version bound
+Recommended: run these from inside a `tmux` session. If you get version bound
 problems on nightly or LTS major, you need to fix build-constraints.yaml (see
 info above). For an LTS minor bump, you'll typically want to use the
 `CONSTRAINTS` environment variable, e.g.:
@@ -126,13 +247,13 @@ If a build fails for bounds reasons, see all of the advice above. If the code
 itself doesn't build, or tests fail, open up an issue and then either put in a
 version bound to avoid that version or something else. It's difficult to give
 universal advice on how to solve things, since each situation is unique. Let's
-develop this advice over time. For now: if you're not sure, ask Michael for
-guidance.
+develop this advice over time. For now: if you're not sure, ask for guidance.

 __`NOPLAN=1`__ If you wish to rerun a build without recalculating a
 build plan, you can set the environment variable `NOPLAN=1`.
This is
 useful for such cases as an intermittent test failure, out of memory
-condition, or manually tweaking the plan file.
+condition, or manually tweaking the plan file. This is the default for
+LTS builds.

 ### Timing

@@ -146,9 +267,33 @@ LTS minor bumps typically are run on Sundays.
 ### Website sync debugging (and other out of disk space errors)

 * You can detect the problem by running `df`. If you see that `/` is out of space, we have a problem
-* There are many temp files inside `/home/ubuntu/stackage-server-cron` that can be cleared out occasionally
-* You can then manually run `/home/ubuntu/stackage-server-cron.sh`, or wait for the cron job to do it
+* (outdated) There are many temp files inside `/home/ubuntu/stackage-server-cron` that can be cleared out occasionally
+* (outdated) You can then manually run `/home/ubuntu/stackage-server-cron.sh`, or wait for the cron job to do it

 ### Wiping the cache

-Sometimes the cache can get corrupted which might manifest as `can't load .so/.DLL`. You can wipe the nightly cache and rebuild everything by doing `rm -rf /opt/stackage-build/stackage/automated/nightly`.
+Sometimes the cache can get corrupted which might manifest as `can't load .so/.DLL`.
+You can wipe the nightly cache and rebuild everything by doing
+`rm -rf /var/stackage/stackage/automated/nightly`.
+Replace `nightly` with `lts7` to wipe the LTS 7 cache.
+
+## Local curator setup
+
+We don't run the full stackage build locally as that might take too
+much time. Some steps, on the other hand, are much faster to do
+yourself.
+
+It's useful to be able to modify constraints locally before pushing to
+the repository. To do this, first install stackage-curator:
+`git clone git@github.com:fpco/stackage-curator.git && cd stackage-curator && stack install`
+or get the Linux binary: https://s3.amazonaws.com/stackage-travis/stackage-curator/stackage-curator.bz2
+(it's a good idea to upgrade stackage-curator at least at the start of your week as curator).
+Then clone the stackage repo `git clone git@github.com:fpco/stackage.git`.
+Inside it run `stack update && stackage-curator check` to get new packages and do dependency resolution.
+
+This can be used to make sure all version bounds are in place
+(including for test suites and benchmarks), to check whether bounds
+can be lifted, and to get `tell-me-when-its-released` notifications.
+
+Notably this doesn't build anything, so you won't see any compilation
+errors for builds/tests/benchmarks.
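
The following is a minimal sketch, not part of the patch, of the "Temporary upper bounds" step the patch describes: a hypothetical shell helper, `add_upper_bound`, that inserts an issue comment and a bound directly below the "Stackage upper bounds" header. The function name, the choice to indent entries four spaces deeper than the header, and the choice to put new entries at the top of the section are all assumptions made for illustration. Only `build-constraints.yaml`, the section name, and `stackage-curator check` come from the text.

```shell
#!/usr/bin/env bash
# Sketch only: record a temporary upper bound in build-constraints.yaml.
# Assumptions (not taken from the patch): the helper name, the 4-space
# extra indentation, and inserting at the top of the section.
set -euo pipefail

# Usage: add_upper_bound FILE ISSUE_URL CONSTRAINT
add_upper_bound() {
  local file=$1 issue_url=$2 constraint=$3
  local tmp
  tmp=$(mktemp)
  awk -v issue="$issue_url" -v constraint="$constraint" '
    { print }                                   # copy every input line
    # Insert the new entry right below the first matching section header.
    !done && /"Stackage upper bounds":/ {
      match($0, /^ */)                          # RLENGTH = header indentation
      pad = sprintf("%" (RLENGTH + 4) "s", "")  # header indent + 4 spaces
      print pad "# " issue                      # comment linking the issue
      print pad "- " constraint                 # the bound itself
      done = 1
    }
    END { if (!done) exit 1 }                   # fail if the header is missing
  ' "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
  mv "$tmp" "$file"
}
```

For the `pipes` example this might be called as `add_upper_bound build-constraints.yaml https://github.com/fpco/stackage/issues/2108 'pipes < 4.3.0'`, followed by `stackage-curator check` before committing. The helper leaves the file untouched and returns non-zero if the section header is not found.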