stackage-server/app/stackage-server-cron.hs
Alexey Kuleshevich f5e147ab97
Integration with Pantry and usage of new stackage-snapshots:
* Moved all extensions into modules that are using them, rather than globally,
  since they mess up ghci session and introduce conflicts among
  packages. Removed those from `.ghci` file as well
* Redesigned the schema to use Pantry and moved it into its own module
* Switched all of the db and cron related stuff to RIO. Yesod part is
  still on classy-prelude
* Got pantry to update stackage-server database from hackage
* Got import of stackage-snapshots implemented
* Moved some logic from all-cabal-tool
* Switched everything from plain `Text` to `PackageNameP`, `VersionP`, etc.
* Fixed haddock, so it now does proper redirects and pipes the docs
  correctly. Also implemented piping of json files from S3 bucket,
  so index-doc.json is also served by stackage-server thus making
  Ctrl+S feature work properly on haddock. Fix for commercialhaskell/stackage#4301
* Import of modules is done through cabal file parsing, which slows
  down the initial import process drastically, but incremental update
  is not a problem.
* Just as with modules, dependencies are also imported from cabal file.
* In general improved type safety by introducing a few data types:
  eg. `ModuleNameP`, `HackageCabalInfo`, and many more.
* Implemented pulling of the deprecation map from Hackage and storing it in the db
* Implementation of forward/backward dependencies within a snapshot only.
* Drastically improved performance of cron import job, by checking which
  snapshots are not up to date
* Implemented pulling haddock list from S3 bucket. Modules that have
  documentation are marked from the availability of actual haddock. This
  process happens concurrently with snapshots loading.
* Rearranged modules a bit:
  * github related functions went into their own module
  * cron related functions were moved from Database to Cron module
  * Split up some functions to reduce individual complexity
* Parallelized package loading in cron job
* Implemented parsed cabal file caching.
* All queries were rewritten with esqueleto
* Syntactic improvements:
  * Added stylish-haskell config
  * Formatted all imports and extensions with stylish-haskell.
  * Fixed inconsistent indentation across all modules
* Many improvements to the package page as well as a few others.
* Reimplemented hoogledb creation.
* Dropped dependency on tar in favor of tar-conduit
* Added cli for stackage-server-cron
* Add cabal sha and size to the package page
* Fixed links in hoogle searches. Improved type safety for a hoogle handler
* stackage-server-cron is customizable with cli arguments

Final adjustments for the new stackage server release:

* Upgrade to lts-13.16.
* Stackage server related code has been merged into pantry. Made the code
  compatible with the newer version of pantry
* Added cli '--snapshots-repo'
* Add readme to package page
* Adjust snapshots expected format:
  * Added `publish-time`
  * Removed `name` field
  * `compiler` field is now in the `resolver` field with fallback to
    the root
2019-04-30 17:10:33 +03:00


{-# LANGUAGE RecordWildCards #-}
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE NoImplicitPrelude #-}
import Options.Applicative
import RIO
import RIO.List as L
import RIO.Text as T
import Stackage.Database.Cron
import Stackage.Database.Github

-- | Read an option argument as 'T.Text'.
readText :: ReadM T.Text
readText = T.pack <$> str

-- | Parse one of the supported verbosity names into a 'LogLevel'.
readLogLevel :: ReadM LogLevel
readLogLevel =
    maybeReader $ \case
        "debug" -> Just LevelDebug
        "info" -> Just LevelInfo
        "warn" -> Just LevelWarn
        "error" -> Just LevelError
        _ -> Nothing
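
-- | Parse an @account/name@ argument, splitting on the first slash and
-- rejecting input with a missing slash or an empty name.
readGithubRepo :: ReadM GithubRepo
readGithubRepo =
    maybeReader $ \str' ->
        case L.span (/= '/') str' of
            (grAccount, '/':grName)
                | not (L.null grName) -> Just GithubRepo {..}
            _ -> Nothing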

optsParser :: Parser StackageCronOptions
optsParser =
    StackageCronOptions <$>
    switch
        (long "force-update" <> short 'f' <>
         help
             "Initiate a force update, where all snapshots will be updated regardless of \
             \whether their yaml files from stackage-snapshots repo have been updated or not.") <*>
    option
        readText
        (long "download-bucket" <> value haddockBucketName <> metavar "DOWNLOAD_BUCKET" <>
         help
             ("S3 Bucket name where things like haddock and current hoogle files should \
              \be downloaded from. Default is: " <>
              T.unpack haddockBucketName)) <*>
    option
        readText
        (long "upload-bucket" <> value haddockBucketName <> metavar "UPLOAD_BUCKET" <>
         help
             ("S3 Bucket where hoogle db and snapshots.json file will be uploaded to. Default is: " <>
              T.unpack haddockBucketName)) <*>
    switch
        (long "do-not-upload" <>
         help "Stop hoogle db and snapshots.json from being generated and uploaded") <*>
    option
        readLogLevel
        (long "log-level" <> metavar "LOG_LEVEL" <> short 'l' <> value LevelInfo <>
         help "Verbosity level (debug|info|warn|error). Default level is 'info'.") <*>
    option
        readGithubRepo
        (long "snapshots-repo" <> metavar "SNAPSHOTS_REPO" <>
         value (GithubRepo repoAccount repoName) <>
         help
             ("Github repository with snapshot files. Default is '" ++
              repoAccount ++ "/" ++ repoName ++ "'."))
  where
    repoAccount = "commercialhaskell"
    repoName = "stackage-snapshots"

main :: IO ()
main = do
    hSetBuffering stdout LineBuffering
    hSetBuffering stderr LineBuffering
    opts <-
        execParser $
        info
            (optsParser <*
             abortOption ShowHelpText (long "help" <> short 'h' <> help "Display this message."))
            (header "stackage-cron - Keep stackage.org up to date" <>
             progDesc
                 "Uses github.com/commercialhaskell/stackage-snapshots repository as a source \
                 \for keeping stackage.org up to date. Amongst other things are: update of hoogle db \
                 \and its upload to S3 bucket, use stackage-content for global-hints" <>
             fullDesc)
    stackageServerCron opts
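
The `account/name` splitting done by `readGithubRepo` can be tried in isolation. The following is a minimal sketch using only `base`; the `Repo` type, its fields `repoOwner` and `repoTitle`, and the function `parseRepo` are hypothetical stand-ins introduced for illustration and are not part of `Stackage.Database.Github`.

```haskell
module Main where

-- Hypothetical stand-in for the GithubRepo record; the real type lives in
-- Stackage.Database.Github and is not reproduced here.
data Repo = Repo
    { repoOwner :: String
    , repoTitle :: String
    } deriving (Eq, Show)

-- Split the input on the first '/' and require a non-empty part after it,
-- mirroring the pattern match used by readGithubRepo.
parseRepo :: String -> Maybe Repo
parseRepo input =
    case span (/= '/') input of
        (owner, '/':name)
            | not (null name) -> Just (Repo owner name)
        _ -> Nothing

main :: IO ()
main = do
    -- Well-formed input: yields a Just value.
    print (parseRepo "commercialhaskell/stackage-snapshots")
    -- No slash at all: yields Nothing.
    print (parseRepo "commercialhaskell")
    -- Slash present but empty name: yields Nothing.
    print (parseRepo "commercialhaskell/")
```

Under this logic only the part after the slash is checked for emptiness, so an input such as `/stackage-snapshots` is accepted with an empty owner, and any further slashes end up inside the name. Whether that is acceptable for `--snapshots-repo` is a design choice; a stricter reader would also reject an empty account.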