rattail-manual/docs/data/batch/overview.rst
Lance Edgar db50895459 Add more docs for Feature Layer
not complete, but progress..
2022-03-20 13:57:06 -05:00

147 lines
5.8 KiB
ReStructuredText

==========
Overview
==========
What is a "batch" anyway? That seems to be one of those terms which means
something specific in various contexts, but the meanings don't always line up.
So let's define it within the Rattail context:
A "batch" is essentially any set of "pending" data, which must be "reviewed" by
an authorized user, and ultimately then "executed" - which causes the data set
to go into production by way of updating the various systems involved.
This means that if you do not specifically need a review/execute workflow, then
you probably do not need a batch. It also means that if you *do* need a batch,
then you will probably also need a web app, to give the user an actual way to
review/execute. (See also :doc:`/web/index`.) Most of the interesting parts
of a batch however do belong to the Data Layer, so are described here.
Beyond that vague conceptual definition, a "batch" in Rattail also implies the
use of its "batch framework" - meaning there is a "batch handler" involved etc.
And here is something hopefully worth 1000 words:
.. image:: /_static/batch-pattern.png
Batch Workflow
--------------
All batches have the following basic workflow in common:
* batch is created
* if applicable, batch is initially populated from some data set
* batch is reviewed and if applicable, further modified
* batch is executed
In nearly all cases, the execution requires user intervention, meaning a batch
is rarely "created and executed" automatically.
Batch vs. Importer
------------------
Let's say you have a CSV of product data and you wish to import that to the
``product`` table of your Rattail DB. Most likely you will use an "importer"
only, in which case the data goes straight in.
But let's say you would prefer to bring the CSV data into a "batch" within the
DB instead, and then a user must review and execute that batch in order for the
data to truly update your ``product`` table.
That is the essential difference between an importer and a batch - a batch will
always require user intervention (review/execute) whereas a plain importer
never does.
Now, that said, what if you have a working importer but you would rather
"interrupt" its process and introduce a user review/execute step before the
data was truly imported to the target system? Well that is also possible, and
we refer to that as an "importer batch" (creative, I know).
..
Batch Archetypes
----------------
Here are what you might call "archetypes" for all possible batches.
Batch from Importer
-------------------
aka. "importer batch"
This is relatively less common in practice, but it serves as a good starting
point since we already mentioned it.
With an "importer batch" the data in question always comes from whatever the
"source" (host) is for the importer. The schema for the batch data table is
kept dynamic so that "any" importer can produce a batch, theoretically. Each
importer run will produce a new table in the DB, columns for which will depend
on what was provided by the importer run. Such dynamic batch data tables are
kept in a separate (``batch``) schema within the DB, to avoid cluttering the
``default`` schema. (Note that the ``batch`` schema must be explicitly created
if you need to support importer batches.)
The execution step for this type of batch is also relatively well-defined.
Since the batch itself is merely "pausing" the processing which the importer
normally would do, when the batch is executed it merely "unpauses" and the
remainder of the importer logic continues. The only other difference being,
the importer will not read data from its normal "source" but instead will
simply read data from the batch - but from there it will continue as normal to
write data to the target/local system.
Batch from User-Provided File
-----------------------------
More typically a batch will be "created" when a user uploads a data file. With
this type of batch, the data obviously comes from the uploaded file, but it may
be supplemented via SQL lookups etc. The goal being, take what the user gave
us but then load the "full picture" into the batch data tables.
User then will review/execute as normal. What such a batch actually does when
executed, can vary quite a bit.
Note that in some cases, a user-provided file can be better handled via
importer only, with no batch at all unless you need the review/execute
workflow.
Batch from User-Provided Query
------------------------------
In this pattern the user starts by filtering e.g. the Products table grid
according to their needs. When they're happy with results, they create a new
batch using that same query. So this defines the data set for the batch - it
is simply all products matching the query.
Note that Products is the only table/query supported for this method thus far.
It is possible to create multiple batch types with this method.
User then will review/execute as normal. What such a batch actually does when
executed, can vary quite a bit. A common example might be to create a Labels
batch from product query, execution of which will print labels for all products
in the batch.
Batch from Custom Data Set
--------------------------
In this pattern the logic used to obtain the data is likely hard-coded within
the batch handler.
For instance let's say you need a batch which allowed the user to generate
"Please Come Back!" coupons for all active members who have not shopped in the
store in the past 6 months.
You should not require the user to provide a data file etc. because the whole
point of this batch is to identify relevant members and issue coupons. If the
batch itself can identify the members then no need for the user to do that.
So here the batch handler logic would a) create a new empty batch, then b)
identify members not seen in 6 months, adding each as a row to the batch,
then c) user reviews and executes per usual, at which point d) coupons are
issued.