Add some more datasync docs

Lance Edgar 2022-04-06 17:02:43 -05:00
parent db50895459
commit 002cae6515
6 changed files with 77 additions and 6 deletions


@@ -11,7 +11,7 @@ Data Layer
 other
 auth
 importing/index
-sync
+sync/index
 versioning
 batch/index
 autocomplete


@@ -1,5 +0,0 @@
Data Real-Time Sync
===================
TODO


@@ -0,0 +1,6 @@
===========
Consumers
===========
TODO

docs/data/sync/index.rst Normal file

@@ -0,0 +1,11 @@
Real-Time Data Sync
===================

.. toctree::
   :maxdepth: 2
   :caption: Contents:

   overview
   watchers
   consumers


@@ -0,0 +1,25 @@
==========
Overview
==========
Rattail provides a "datasync" daemon, which is meant to run in the
background and handle automatic sync of data between systems in "near
real-time".

The datasync daemon "watches" one or more systems for changes, and any
changes found are then "consumed" by one or more other systems.  The
daemon spawns a separate thread for each watcher, as well as for each
consumer.  There is no limit on how many watchers or consumers you can
configure, other than machine resources.
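
The thread-per-watcher and thread-per-consumer layout can be sketched
roughly as follows.  This is only an illustration and not Rattail's
actual implementation: the function names are hypothetical, and the
in-memory queues are an assumption made to keep the example
self-contained.

```python
import queue
import threading


def run_watcher(get_changes, consumer_queues, cycles):
    """Poll the (hypothetical) source and hand each change to every consumer."""
    for _ in range(cycles):
        for change in get_changes():
            for q in consumer_queues:
                q.put(change)
    # Tell each consumer that this watcher is finished.
    for q in consumer_queues:
        q.put(None)


def run_consumer(q, handle):
    """Process changes from the queue until the stop marker (None) arrives."""
    while True:
        change = q.get()
        if change is None:
            break
        handle(change)


def run_datasync(get_changes, handlers, cycles=1):
    """Start one watcher thread plus one thread per consumer, then wait."""
    queues = [queue.Queue() for _ in handlers]
    threads = [threading.Thread(target=run_watcher,
                                args=(get_changes, queues, cycles))]
    threads += [threading.Thread(target=run_consumer, args=(q, handle))
                for q, handle in zip(queues, handlers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Each consumer gets its own queue here, so a slow consumer does not
hold up the others.  A real daemon would poll indefinitely rather than
for a fixed number of cycles.
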

Whereas an importer can be thought of as a "full sync" between systems
(see :doc:`../importing/index`), datasync is more of a "single record
sync": each record changed in a given system is synced individually to
the consumer system(s).

Ideally, each datasync consumer "corresponds" to, and leverages, an
existing importer (or exporter).  The consumer may need some extra
logic to facilitate this, but then the importer can be responsible for
the actual record sync.  This means the actual "sync" logic is defined
only once, and is effectively shared by the importer and the consumer.
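
As a rough sketch of that division of labor, the consumer below only
looks up the changed record and then delegates to the importer.  The
class and method names are hypothetical and do not reflect the real
Rattail importer API; plain dicts stand in for the source and target
systems.

```python
class ProductImporter:
    """Hypothetical importer: owns the logic for syncing one record."""

    def __init__(self, target):
        self.target = target  # dict standing in for the target system

    def import_record(self, record):
        self.target[record["id"]] = record

    def delete_record(self, record_id):
        self.target.pop(record_id, None)

    def import_all(self, records):
        """Full sync: push every source record through the same logic."""
        for record in records:
            self.import_record(record)


class ProductConsumer:
    """Hypothetical consumer: looks up the changed record, then delegates."""

    def __init__(self, source, importer):
        self.source = source  # dict standing in for the watched system
        self.importer = importer

    def process_change(self, record_id, deleted=False):
        if deleted or record_id not in self.source:
            self.importer.delete_record(record_id)
        else:
            self.importer.import_record(self.source[record_id])
```

Both the full import (``import_all``) and the single-record path
(``process_change``) end up in ``import_record``, so the sync logic
lives in one place.
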


@@ -0,0 +1,34 @@
==========
Watchers
==========
A datasync "watcher" is responsible for checking a given system to see
if any records are in need of sync (e.g. have changed recently).

If the given system has a SQL DB, and the applicable tables include a
"last modified" column, then the simplest and most common approach is
for the watcher to query each table, looking for records which were
modified since the last check.  There are a couple of downsides to
this approach:

* each table must be queried separately, to find all "changed" records
* "deleted" records cannot be detected in this way
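
A minimal sketch of the "last modified" approach, using SQLite from
the Python standard library.  The ``id`` and ``modified`` column names
are assumptions made for illustration; a real system will have its own
schema.

```python
import sqlite3


def find_modified(conn, tables, last_check):
    """Return (table, id) pairs modified after ``last_check``.

    Each table needs its own query, and rows deleted since the last
    check are simply absent, so they cannot be reported.
    """
    changed = []
    for table in tables:
        # Table names cannot be bound as parameters; they must come
        # from trusted configuration, never from user input.
        rows = conn.execute(
            f"SELECT id FROM {table} WHERE modified > ? ORDER BY id",
            (last_check,))
        changed.extend((table, row[0]) for row in rows)
    return changed
```

Timestamps are compared as text here, which only works when they are
stored in a sortable format such as ``YYYY-MM-DD HH:MM:SS``.
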

Rattail itself uses a different approach.  The Rattail DB includes a
``change`` table, which in normal circumstances is empty.  If so
configured, Rattail can insert a record into this ``change`` table
whenever (almost) *any* data record is changed or deleted.  A datasync
watcher can then look for records in the ``change`` table, instead of
having to query each applicable table separately.  The watcher also
removes ``change`` records as they are processed.
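
The read-then-remove cycle might look roughly like this.  The column
names used for the ``change`` table below are assumptions made for
illustration and may not match the real Rattail schema; SQLite stands
in for the actual database.

```python
import sqlite3


def process_changes(conn, consume):
    """Pass each queued change to ``consume``, then remove it."""
    rows = conn.execute(
        'SELECT id, table_name, record_key, deleted FROM "change" ORDER BY id'
    ).fetchall()
    for change_id, table_name, record_key, deleted in rows:
        consume(table_name, record_key, bool(deleted))
        # Only remove the change once it has been handed off.
        conn.execute('DELETE FROM "change" WHERE id = ?', (change_id,))
    conn.commit()
    return len(rows)
```

Because a single table records changes for every data table, one query
finds them all, and a ``deleted`` flag lets deletions be reported too.
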

A similar effect can be achieved if the watched system has a SQL DB in
which you can install triggers (and tables).  In this case you can
create a new table (e.g. ``datasync``) and then add triggers for the
various tables you need to watch.  When any record is changed or
deleted, the trigger should add a record to your new ``datasync``
table.  Then the actual watcher can check your ``datasync`` table, and
remove any records found once they are processed.
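
A sketch of the trigger pattern, again using SQLite for the sake of a
self-contained example.  The ``product`` table, the ``datasync``
columns and the trigger names are all hypothetical, and trigger syntax
varies between database engines.

```python
import sqlite3

SCHEMA = """
CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);

-- Queue table filled by the triggers below.
CREATE TABLE datasync (
    id INTEGER PRIMARY KEY,
    table_name TEXT NOT NULL,
    record_id INTEGER NOT NULL,
    deleted INTEGER NOT NULL DEFAULT 0
);

CREATE TRIGGER product_insert AFTER INSERT ON product
BEGIN
    INSERT INTO datasync (table_name, record_id) VALUES ('product', NEW.id);
END;

CREATE TRIGGER product_update AFTER UPDATE ON product
BEGIN
    INSERT INTO datasync (table_name, record_id) VALUES ('product', NEW.id);
END;

CREATE TRIGGER product_delete AFTER DELETE ON product
BEGIN
    INSERT INTO datasync (table_name, record_id, deleted)
    VALUES ('product', OLD.id, 1);
END;
"""


def make_watched_db():
    """Create an in-memory database with the triggers installed."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn
```

Each watched table needs its own set of triggers, but the watcher
itself only ever reads from the ``datasync`` table.
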

Those are just the patterns used thus far; more are likely possible.