Add some basic info on overnight luigi tasks

Lance Edgar 2022-11-23 10:45:28 -06:00
parent d32bfce883
commit 5aca67709e


@ -2,10 +2,67 @@
Scheduled Tasks
===============

By "scheduled task" we mean a command which runs automatically at a
pre-determined time. See also :doc:`commands` and :doc:`scripts`.

For now, Rattail assumes that any scheduled tasks will be manually
configured to run via `cron`_. See your operating system's
documentation for details.

.. _cron: https://en.wikipedia.org/wiki/Cron

(In the future, Rattail may add a `dedicated scheduler daemon`_, but
it is not yet a priority.)

.. _dedicated scheduler daemon: https://redmine.rattailproject.org/issues/13

Luigi
-----

`Luigi`_ is a "task runner" of sorts. Rattail uses it to help
orchestrate a "batch" of commands, such as is often needed for
overnight automation (aka "EOD").

.. _Luigi: https://luigi.readthedocs.io/en/stable/

The general idea is to maintain a Python module containing all of the
individual tasks which need to run. Rattail can then invoke the
module in such a way that Luigi runs all the tasks, keeping track of
which tasks have (or have not yet) run, and in which order they should
run.

Overnight
~~~~~~~~~

Often there is just one "overnight" module, which contains all tasks.
But in some cases you may want a couple of separate modules, to run at
different times. A common example:

* run "all" (typical EOD) tasks, starting at 1am
* run "backups" for all servers, starting at 4am

In this example the "all" task may include a dozen or more individual
"steps" (which, confusingly, are also called "tasks"). Maybe data is
exported from one system to another, reports are run, etc. Then the
"backups" task runs later, and may itself contain multiple steps, or
perhaps just one.

There are two main benefits to wiring up overnight tasks as modules to
be run by Luigi.

The first is that each individual step will (typically) start
immediately after the previous one completes. This means all steps
are "packed in" as tightly as possible, which helps keep overall run
time down. By contrast, scheduling each step separately often involves
guessing how long each will take; for example you might schedule them
5 minutes apart just to be safe, even if a given step only needs 2
minutes. (So a related benefit is simply avoiding the hassle of
scheduling all those steps separately to begin with.)

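To put rough numbers on the "packed in" point, the sketch below
compares the finish time for steps run back-to-back against the same
steps started at fixed intervals. The function names and the step
durations are invented for illustration only.

```python
from datetime import datetime, timedelta


def packed_finish(start, durations):
    """Finish time when each step starts as soon as the previous one ends."""
    return start + sum(durations, timedelta())


def spaced_finish(start, durations, gap):
    """Finish time when step i is scheduled at start + i * gap, regardless
    of when the earlier steps actually finished."""
    return max(start + gap * i + d for i, d in enumerate(durations))
```

For instance, four steps taking 2, 3, 1 and 4 minutes need 10 minutes
when packed, but 19 minutes when started 5 minutes apart.
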
The other benefit is to allow for *restarting* the overnight tasks.
Luigi can keep track of which steps have run thus far, so if there is
a failure, which you then address, you can restart the overnight task
and Luigi will pick up where it left off. (Steps which already ran
successfully will be skipped.)
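
The restart behavior can be pictured with a small stand-alone sketch.
This is *not* Luigi's API; it only mimics the underlying idea (a step
is skipped if evidence of its earlier success exists) using plain
marker files. The ``run_steps`` function and the ``.done`` file naming
are assumptions for this example.

```python
from pathlib import Path


def run_steps(steps, marker_dir):
    """Run (name, func) pairs in order, skipping steps already marked done.

    Returns the names of the steps actually executed during this call.
    """
    marker_dir = Path(marker_dir)
    executed = []
    for name, func in steps:
        marker = marker_dir / f"{name}.done"
        if marker.exists():
            continue  # this step succeeded in an earlier run
        func()  # an exception here aborts the run and leaves no marker
        marker.write_text("done\n")
        executed.append(name)
    return executed
```

If a step raises an error, calling ``run_steps`` again with the same
``marker_dir`` after fixing the problem resumes at the failed step
instead of starting over.
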