more

2016-05-14 16:31:32 +02:00 · 2016-05-14 16:31:32 +02:00 · 3250dc4fac
commit 3250dc4fac
parent 26a6171265
1 changed files with 107 additions and 4 deletions
--- a/doc/posts/Announcement.anansi
+++ b/doc/posts/Announcement.anansi
@@ -10,13 +10,62 @@ an API description.
 This is much closer to the traditional use of `QuickCheck`. The most obvious
 use-case is checking that properties hold of an *entire* server rather than of
-individual endpoints.
+individual endpoints. (But there are other uses that you can skip to if they
 sound more interesting.)
 ## `serverSatisfies`
-There are a variety of best practices in writing web APIs that aren't always
+A useful guideline when writing and maintaining software is that, if there isn't
-obvious. As a running example, let's use a simple service that allows adding,
+a test for a behaviour or property, sooner or later that property will be broken.
-removing, and querying biological species. Our SQL schema is:
+Another important perspective is that tests are a form of documentation - the
 present developer telling future ones "this matters, and should be this way".
 The advantage of using tests for this form of documentation is that there's
 simply too much information to convey, some of it only relevant to very specific
 use cases, and rather than overload developers with an inexhaustible quantity of
 details that would be hard to keep track of or remember, tests are a mechanism
 of reminding developers of *only the relevant information, at the right time*.
 <<EXAMPLE>>.
 We might hope that we could use tests to communicate the wide array of best
 practices that have developed around APIs. About to return a top-level integer
 in JSON? A test should say that's bad practice. About to not catch exceptions
 and give a more meaningful HTTP status code? Another test there to stop you.
 Traditionally, in web services these things get done at the level of *individual*
 endpoints. But this means that if a developer who hasn't had extensive experience with web
 programming best practices writes a *new* endpoint which *does* return a top-level
 integer literal, there's no test there to stop her. Code review might help, but
 code review is much more error prone than tests, and really only meant for those
 things that are too subtle to automate. (Indeed, if code review were such a reliable
 defense mechanism against bugs and bad code, why have tests and linters at all?)
 The problem, then, with thinking about tests as only existing at the level of individual
 endpoints is that there are no tests *for* tests - tests that check that new
 behaviour and tests conform to higher-level, more general best practices.
 `servant-quickcheck` aims to solve that. It allows describing properties that
 *all* endpoints must satisfy. If a new endpoint comes along, it too will be
 tested for that property, without any further work.
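As a rough illustration of the idea (a toy model, *not* the `servant-quickcheck` API), the sketch below treats an API as a plain list of endpoints and checks one predicate against all of them. Every name in it (`Endpoint`, `violations`, `notTopLevelInt`, the sample `api`) is hypothetical, and a fixed list of inputs stands in for generated requests.

```haskell
-- Toy model only: none of these names come from servant-quickcheck.
import Data.Char (isDigit)

-- | A simplified HTTP response: a status code and a raw body.
data Response = Response
  { status :: Int
  , body   :: String
  } deriving (Show, Eq)

-- | An endpoint is a name plus a handler from some input to a response.
data Endpoint = Endpoint
  { endpointName :: String
  , handler      :: Int -> Response
  }

-- | Best practice: handlers should not answer with a 500.
not500 :: Response -> Bool
not500 r = status r /= 500

-- | Best practice: the body must not be a bare top-level integer.
notTopLevelInt :: Response -> Bool
notTopLevelInt r = null (body r) || not (all isDigit (body r))

-- | Names of the endpoints that violate the predicate on any sample input.
violations :: (Response -> Bool) -> [Int] -> [Endpoint] -> [String]
violations p inputs eps =
  [ endpointName e | e <- eps, any (not . p . handler e) inputs ]

-- | A hypothetical API: the second endpoint returns a top-level integer.
api :: [Endpoint]
api =
  [ Endpoint "getSpecies"   (\n -> Response 200 ("{\"id\":" ++ show n ++ "}"))
  , Endpoint "countSpecies" (\n -> Response 200 (show (abs n)))
  ]
```

Here `violations notTopLevelInt [0..10] api` evaluates to `["countSpecies"]`, while `violations not500 [0..10] api` is empty. Adding a new `Endpoint` to `api` puts it under both checks with no further work, which is the point being made above.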
 Why isn't this idea already popular? Well, most web frameworks don't have a
 reified description of APIs. When you don't know what the endpoints of an
 application are, and what request body they expect, trying to generate arbitrary
 requests is almost entirely going to result in 404s (not found) and 400s (bad
 request). Maybe one in a thousand requests will actually test a handler. Not
 very useful.
 `servant` applications, on the other hand, have a machine-readable API description
 already available. And they already associate "correct" requests with particular
 types. It's a small step, therefore, to generate 'arbitrary' values for these
 requests, and all of them will go through to your handlers. (Note: all of the
 uses of `servant-quickcheck` work with applications *not* written with servant-server -
 and indeed not *in Haskell* - but the API must be described with the servant
 DSL.)
 Let's see how this works in practice.  As a running example, let's use a simple
 service that allows adding, removing, and querying biological species. Our SQL
 schema is:
 :d schema.sql
@@ -150,6 +199,60 @@ instance Arbitrary Species where
  arbitrary = Species <$> arbitrary <*> arbitrary
 :
 But this fails in quite a few ways.
 ### Why best practices are good
 As a side note: you might have wondered "why bother with API best practices?".
 It can seem like a lot of extra work (beyond just getting the feature done)
 for dubious benefit. And indeed, the relevance of discoverability, for
 example, is unclear, since not that many tools use it.
 But `servant-quickcheck` both makes it *easier* to conform to best practices,
 and exemplifies their advantage. If we pick 201 (Success, the 'resource' was
 created), rather than the more generic 200 (Success), `servant-quickcheck` knows
 this means there should be some representation of the rec°as a response
 ## `serversEqual`
 There's another very appealing application of the ability to generate "sensible"
 arbitrary requests. It's testing that two applications are equal. Generate arbitrary
 requests, send them to both servers (in the same order), and check that the responses
 are equivalent.  (This was, in fact, one of the first applications of
 `servant-client`, albeit in a much more manual way, when we rewrote in Haskell a
 microservice originally written in Python.) Generally with rewrites, even if there's some
 behaviour that isn't optimal, if a lot of things already depend on that service,
 it makes sense to first mimic *exactly* the original behaviour, and only then
 aim for improvements.
 `servant-quickcheck` provides a single function, `serversEqual`, that attempts
 to verify the equivalence of servers. Since some aspects of responses might not
 be relevant (for example, whether the `Server` header is the same, or whether
 two JSON responses have the same formatting), it allows you to provide a custom
 equivalence function. Other than that, you need only provide an API type and two
 URLs for testing, and `serversEqual` handles the rest.
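To make the role of the custom equivalence concrete, here is a minimal sketch that does *not* use the real `serversEqual`. Servers are modelled as pure functions, so statefulness and request ordering are not captured, and all names (`Server`, `equivalent`, `firstMismatch`, `oldServer`, `newServer`) are hypothetical.

```haskell
-- Toy model only: none of these names come from servant-quickcheck.
import Data.Char (isSpace)
import Data.List (find)

type Request = String

-- | A simplified HTTP response.
data Response = Response
  { status  :: Int
  , headers :: [(String, String)]
  , body    :: String
  } deriving Show

-- | A server is modelled as a pure function from request to response.
type Server = Request -> Response

-- | Custom equivalence: same status and same body up to whitespace.
-- Headers are ignored. Dropping all whitespace is a crude stand-in for
-- "same JSON, different formatting"; it also alters string contents.
equivalent :: Response -> Response -> Bool
equivalent a b =
  status a == status b && strip (body a) == strip (body b)
  where
    strip = filter (not . isSpace)

-- | The first request, if any, on which the two servers disagree.
firstMismatch
  :: (Response -> Response -> Bool) -> Server -> Server -> [Request] -> Maybe Request
firstMismatch eq s1 s2 = find (\r -> not (eq (s1 r) (s2 r)))

-- | Two hypothetical servers differing only in formatting and a header.
oldServer, newServer :: Server
oldServer r = Response 200 [("Server", "python")] ("{ \"path\": \"" ++ r ++ "\" }")
newServer r = Response 200 [("Server", "warp")]   ("{\"path\":\"" ++ r ++ "\"}")
```

With these definitions, `firstMismatch equivalent oldServer newServer ["/species", "/species/1"]` is `Nothing`, whereas comparing bodies byte for byte (`\a b -> body a == body b`) reports `Just "/species"`. That difference is why the equivalence is left for the caller to choose.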
 ## Future directions: benchmarking
 What else could benefit from tooling that can automatically generate sensible
 (*vis-a-vis* a particular application's expectations) requests?
 One area is extensive automatic benchmarking. Currently we use tools such as
 `ab`, `wrk`, `httperf` in a very manual way - we pick a particular request that
 we are interested in, and write a request that gets made thousands of times.
 But now we can have a multiplicity of requests to benchmark with! This allows
 *finding* slow endpoints, as well as (I would imagine, though I haven't actually
 tried this yet) synchronization issues that make threads wait for too long (such
 as waiting on an MVar that's not really needed), and bad asymptotics with respect
 to some other type of request.
 (On this last point, imagine not having an index in a database for "people",
 and having a tool that discovers that the latency on a search by first name
 grows linearly with the number of POST requests to a *different* endpoint! We'd
 need to do some work to do this well, possibly involving some machine learning, but
 it's an interesting and probably useful idea.)
 **Note**: This post is an anansi literate file that generates multiple source
 files. They are: