docs: overhaul intro wording

This commit is contained in:
Lance Edgar 2025-07-13 09:39:25 -05:00
parent 2b16c5573e
commit 45dabce956
2 changed files with 41 additions and 11 deletions


@ -30,6 +30,7 @@ exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']
intersphinx_mapping = {
'python': ('https://docs.python.org/3/', None),
'rattail-manual': ('https://docs.wuttaproject.org/rattail-manual/', None),
'wuttjamaican': ('https://docs.wuttaproject.org/wuttjamaican/', None),
}


@ -2,22 +2,51 @@
WuttaSync
=========
This package adds data import / export and real-time sync utilities
for the `Wutta Framework <https://wuttaproject.org>`_.  It provides a
"batteries included" way to handle data sync between an arbitrary
source and target.

*(NB. the real-time sync has not been added yet.)*

This builds on, and depends on, :doc:`WuttJamaican
<wuttjamaican:index>`, for the sake of a common :term:`config object`
and :term:`handler` interface.

It was originally designed for import to / export from the :term:`app
database`, but **both** the source and the target can be "anything" -
e.g. a CSV or Excel file, a cloud API, another database.
The primary use cases in mind are:

* keep operational data in sync between various business systems
* import data from a user-specified file
* export to file

The basic idea is as follows:

* read a data set from the "source"
* read the corresponding data from the "target"
* compare the two data sets
* where they differ, create / update / delete records on the target

This isn't really meant to replace typical ETL tools; it is smaller
in scale and (hopefully) more flexible.
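The steps above can be sketched as a simple diff-and-apply pass.  This
is only an illustration of the concept, not the WuttaSync API; the
dict-shaped records and the ``key`` field are assumptions made for the
example:

```python
def diff_data_sets(source_records, target_records, key="id"):
    """Compare source vs. target records (keyed on an assumed ``key``
    field) and report what the target needs: creates, updates, deletes.
    """
    src = {rec[key]: rec for rec in source_records}
    tgt = {rec[key]: rec for rec in target_records}

    # present in source but not target -> create on target
    create = [src[k] for k in src.keys() - tgt.keys()]
    # present in both but differing -> update on target
    update = [src[k] for k in src.keys() & tgt.keys() if src[k] != tgt[k]]
    # present in target but not source -> delete from target
    delete = [tgt[k] for k in tgt.keys() - src.keys()]
    return create, update, delete
```

Note that an export to an empty file is just this same pass with an
empty target data set, so every source record lands in the "create"
bucket.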
In some cases (e.g. export to CSV) the target has no meaningful data,
so all source records are simply "created" on / written to the target.

While it of course supports import / export to / from the Wutta
:term:`app database`, it may be used for any "source → target" data
flow.
.. note::

   You may have already guessed that this approach may not work for
   "big data" - and indeed, it is designed for "small" data sets,
   ideally 500K records or fewer.  It reads both (source and target)
   data sets into memory, so that is the limiting factor.

   You can work around this to some extent by limiting the data sets
   to a particular date range (or other "partitionable" aspect of the
   data), and only syncing that portion.

   However this is not meant to be an ETL engine involving a data
   lake / warehouse.  It is for more "practical" concerns where some
   disparate "systems" must be kept in sync, or basic import from /
   export to file.
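The date-range workaround mentioned in the note might look like the
following sketch.  It assumes dict-shaped records carrying a
``changed`` date field; neither is part of the WuttaSync API:

```python
from datetime import date

def partition_by_date(records, start, end, key="changed"):
    # Keep only records whose date falls within [start, end),
    # so each sync run handles a manageable slice of the data.
    return [rec for rec in records if start <= rec[key] < end]
```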
The general "source → target" concept can be used for both import and
export, since "everything is an import" from the target's perspective.

In addition to the import / export framework proper, a CLI framework
is also provided.

A "real-time sync" framework is also (eventually) planned, similar to
the one developed in the Rattail Project;
cf. :doc:`rattail-manual:data/sync/index`.
.. toctree::