feat: Add bulk sync operations and advanced filtering#345

Draft
jdrew82 wants to merge 6 commits into develop from feature/bulk-sync-and-filtering

Conversation

jdrew82 (Contributor) commented Mar 18, 2026

Summary

Adds 9 new features for bulk sync operations and advanced filtering of synced objects, organized in 3 phases. All changes are fully backwards-compatible — every new parameter defaults to None/False, preserving existing behavior.

Phase 1: Diff-side Filtering

Feature 8 — Diff.filter() / Diff.exclude()

Post-diff, pre-sync manipulation of computed diffs. filter() returns a new Diff containing only the matching elements; exclude() returns a new Diff with the matching elements removed. Both recurse through child DiffElements.

diff = dst.diff_from(src)
filtered = diff.filter(actions={"create", "update"}, model_types={"device"})
dst.sync_from(src, diff=filtered)

Files: diffsync/diff.py
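
To illustrate the recursive behaviour described above, here is a minimal sketch using a toy tree rather than the real Diff/DiffElement classes. The Element class and filter_elements helper are hypothetical stand-ins. Keeping a non-matching parent when one of its children matches is an assumption of this sketch, not something the PR description confirms.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Element:
    """Hypothetical stand-in for a DiffElement: one object plus its children."""

    model_type: str
    name: str
    action: Optional[str]  # "create", "update", "delete", or None for no change
    children: List["Element"] = field(default_factory=list)


def filter_elements(elements, actions=None, model_types=None):
    """Return a new list of elements matching the criteria, recursing into children."""
    kept = []
    for el in elements:
        children = filter_elements(el.children, actions, model_types)
        matches = (actions is None or el.action in actions) and (
            model_types is None or el.model_type in model_types
        )
        # Assumption: a non-matching parent survives (with its action cleared)
        # when it still has matching children, so the tree stays connected.
        if matches or children:
            kept.append(
                Element(el.model_type, el.name, el.action if matches else None, children)
            )
    return kept


tree = [
    Element("site", "nyc", None, [
        Element("device", "sw1", "create"),
        Element("device", "sw2", "delete"),
    ])
]
filtered = filter_elements(tree, actions={"create", "update"}, model_types={"device"})
```

The input tree is left untouched; only the returned copy has the delete of sw2 dropped.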

Feature 6 — model_types parameter

Restricts diff/sync to specific model type names. Applied at both top_level iteration and child recursion.

dst.diff_from(src, model_types={"site", "device"})

Files: diffsync/__init__.py, diffsync/helpers.py

Feature 7 — sync_attrs / exclude_attrs

Whitelist or blacklist specific attributes per model type during diff calculation.

dst.sync_from(src, sync_attrs={"device": {"role"}})  # only diff role
dst.sync_from(src, exclude_attrs={"device": {"last_seen"}})  # skip last_seen

Files: diffsync/__init__.py, diffsync/helpers.py
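
A rough sketch of how attribute-level allow and deny lists could narrow a diff. The diff_attrs helper and its return shape are hypothetical and only mirror the parameter names above; the actual implementation in diffsync/helpers.py may differ.

```python
def diff_attrs(model_type, src_attrs, dst_attrs, sync_attrs=None, exclude_attrs=None):
    """Return {attr: (dst_value, src_value)} for the attributes that differ."""
    names = set(src_attrs) | set(dst_attrs)
    if sync_attrs and model_type in sync_attrs:
        # Allow list: only these attributes are compared.
        names &= set(sync_attrs[model_type])
    if exclude_attrs and model_type in exclude_attrs:
        # Deny list: these attributes are never compared.
        names -= set(exclude_attrs[model_type])
    return {
        name: (dst_attrs.get(name), src_attrs.get(name))
        for name in sorted(names)
        if src_attrs.get(name) != dst_attrs.get(name)
    }


src = {"role": "spine", "last_seen": "2026-03-18"}
dst = {"role": "leaf", "last_seen": "2026-03-01"}

only_role = diff_attrs("device", src, dst, sync_attrs={"device": {"role"}})
no_last_seen = diff_attrs("device", src, dst, exclude_attrs={"device": {"last_seen"}})
```

In this sketch, model types that are absent from either mapping are diffed on all attributes.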

Feature 5 — filters query predicates

Per-model-type callable predicates to filter which objects participate in diffs.

dst.diff_from(src, filters={"device": lambda d: d.role == "spine"})

Files: diffsync/__init__.py, diffsync/helpers.py
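
As a sketch of the predicate idea, assuming objects are grouped by model type name before diffing (the Device class and apply_filters helper below are hypothetical):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """Hypothetical model used only for this illustration."""

    name: str
    role: str


def apply_filters(objects_by_type, filters=None):
    """Keep only the objects whose model type predicate (if any) returns true."""
    filters = filters or {}
    result = {}
    for model_type, objects in objects_by_type.items():
        predicate = filters.get(model_type)
        # Model types without a predicate pass through unchanged.
        result[model_type] = [
            obj for obj in objects if predicate is None or predicate(obj)
        ]
    return result


inventory = {
    "device": [Device("sw1", "spine"), Device("sw2", "leaf")],
    "site": ["nyc"],
}
participating = apply_filters(inventory, filters={"device": lambda d: d.role == "spine"})
```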

Phase 2: Sync-side Enhancements

Feature 9 — sync_filter callback

Callback checked before each CRUD operation. Return False to skip the operation.

dst.sync_from(src, sync_filter=lambda action, model_type, ids, attrs: action != "delete")

Files: diffsync/__init__.py, diffsync/helpers.py
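
A minimal sketch of the gating logic, assuming the callback receives (action, model_type, ids, attrs) as in the example above. The run_operations helper is hypothetical, and treating any falsy return value as "skip" is an assumption of this sketch.

```python
def run_operations(operations, sync_filter=None):
    """Apply each pending operation unless sync_filter rejects it."""
    performed = []
    for action, model_type, ids, attrs in operations:
        if sync_filter is not None and not sync_filter(action, model_type, ids, attrs):
            continue  # the callback vetoed this operation
        performed.append((action, model_type, ids))
    return performed


pending = [
    ("create", "device", {"name": "sw1"}, {"role": "spine"}),
    ("delete", "device", {"name": "sw9"}, {}),
]
performed = run_operations(
    pending,
    sync_filter=lambda action, model_type, ids, attrs: action != "delete",
)
```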

Feature 4 — Structured sync_complete operations summary

sync_complete() now receives an operations kwarg with a structured dict of all CRUD operations performed. Backwards-compatible with existing overrides via try/except TypeError fallback.

def sync_complete(self, source, diff, flags=None, logger=None, operations=None):
    # operations = {"device": {"create": [{"ids": ..., "attrs": ..., "model": ...}], ...}}
    ...

Files: diffsync/__init__.py, diffsync/helpers.py
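
The try/except TypeError fallback mentioned above can be sketched roughly as follows. The adapter classes and the call_sync_complete helper are hypothetical; the real call site in diffsync/helpers.py may be structured differently.

```python
class LegacyAdapter:
    """Hypothetical adapter whose override predates the operations kwarg."""

    def __init__(self):
        self.seen = []

    def sync_complete(self, source, diff, flags=None, logger=None):
        self.seen.append("legacy call")


class NewAdapter:
    """Hypothetical adapter that accepts the new operations kwarg."""

    def __init__(self):
        self.seen = []

    def sync_complete(self, source, diff, flags=None, logger=None, operations=None):
        self.seen.append(operations)


def call_sync_complete(adapter, source, diff, operations):
    """Pass operations when the override accepts it, otherwise fall back."""
    try:
        adapter.sync_complete(source, diff, operations=operations)
    except TypeError:
        # Older overrides reject the unknown keyword, so retry without it.
        adapter.sync_complete(source, diff)


legacy, new = LegacyAdapter(), NewAdapter()
summary = {"device": {"create": []}}
call_sync_complete(legacy, "src", "diff", summary)
call_sync_complete(new, "src", "diff", summary)
```

One caveat with this pattern: a TypeError raised inside a new-style override's own body would also trigger the fallback and cause a second call, so reviewers may want to confirm how the PR handles that case.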

Phase 3: Bulk & Parallel Execution

Feature 1 — Bulk CRUD methods

Optional create_bulk(), update_bulk(), delete_bulk() class methods on DiffSyncModel. Default implementations loop over individual calls. Override for batch API/DB operations.

Store-level add_bulk(), update_bulk(), remove_bulk() on BaseStore, LocalStore, and RedisStore (with Redis pipeline optimization).

Files: diffsync/__init__.py, diffsync/store/__init__.py, diffsync/store/local.py, diffsync/store/redis.py
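
A sketch of the "default loops, override batches" pattern on the model side. The classes and the (adapter, items) signature below are assumptions for illustration, not the PR's actual signatures.

```python
class FakeAdapter:
    """Hypothetical adapter that just records the backend calls made."""

    def __init__(self):
        self.calls = []


class Model:
    """Hypothetical stand-in for a DiffSyncModel subclass."""

    def __init__(self, ids, attrs):
        self.ids = ids
        self.attrs = attrs

    @classmethod
    def create(cls, adapter, ids, attrs):
        adapter.calls.append(("create", ids))
        return cls(ids, attrs)

    @classmethod
    def create_bulk(cls, adapter, items):
        """Default: fall back to one create() call per item."""
        return [cls.create(adapter, ids, attrs) for ids, attrs in items]


class BatchedModel(Model):
    @classmethod
    def create_bulk(cls, adapter, items):
        """Override: issue a single batched backend call for all items."""
        adapter.calls.append(("create_bulk", [ids for ids, _ in items]))
        return [cls(ids, attrs) for ids, attrs in items]


items = [({"name": "sw1"}, {"role": "spine"}), ({"name": "sw2"}, {"role": "leaf"})]
looped, batched = FakeAdapter(), FakeAdapter()
Model.create_bulk(looped, items)
created = BatchedModel.create_bulk(batched, items)
```

With the default, the backend sees one call per object; with the override it sees a single call, which is where a batch API or a bulk DB insert would go.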

Feature 2 — batch_size parameter

batch_size is plumbed through to DiffSyncSyncer and is reserved for future chunked dispatch of bulk operations.

Files: diffsync/__init__.py, diffsync/helpers.py
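
One way the future chunked dispatch might split pending items, shown purely as an illustration. The chunked helper is hypothetical and is not part of this PR.

```python
from itertools import islice


def chunked(items, batch_size=None):
    """Yield lists of at most batch_size items; None means one single batch."""
    iterator = iter(items)
    if batch_size is None:
        batch = list(iterator)
        if batch:
            yield batch
        return
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


batches = list(chunked(range(5), batch_size=2))
```

Each batch could then be handed to a bulk method such as create_bulk() in turn.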

Feature 3 — concurrent parallel sync

concurrent=True parallelizes the sync of independent top-level subtrees using ThreadPoolExecutor. Thread safety is achieved via:

  • threading.local() for per-element syncer state (model_class, action, logger)
  • threading.Lock on LocalStore mutation methods (add, update, remove_item)
  • threading.Lock on operations summary dict

dst.sync_from(src, concurrent=True, max_workers=4)

Files: diffsync/__init__.py, diffsync/helpers.py, diffsync/store/local.py
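
The thread-safety approach listed above can be sketched with the standard library alone. The sync_subtrees helper and its data shapes are hypothetical; the sketch only shows per-thread state in threading.local() and a Lock guarding a shared summary dict.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def sync_subtrees(subtrees, max_workers=4):
    """Process independent subtrees in parallel and collect a shared summary."""
    operations = {}
    operations_lock = threading.Lock()
    state = threading.local()  # each worker thread sees its own attributes

    def sync_one(subtree):
        model_type, names = subtree
        state.model_type = model_type  # per-thread, so no cross-talk between workers
        for name in names:
            # The shared dict is only ever mutated while holding the lock.
            with operations_lock:
                operations.setdefault(state.model_type, []).append(name)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() consumes the results so worker exceptions are re-raised here.
        list(pool.map(sync_one, subtrees))
    return operations


subtrees = [
    ("site", [f"site{i}" for i in range(50)]),
    ("device", [f"dev{i}" for i in range(100)]),
    ("site", [f"extra{i}" for i in range(25)]),
]
summary = sync_subtrees(subtrees, max_workers=4)
```

The order of entries within each list depends on thread scheduling, so only the counts are deterministic.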

New API Surface

| Class / Method | New Parameters / Methods |
| --- | --- |
| Adapter.diff_from() / diff_to() | model_types, filters, sync_attrs, exclude_attrs |
| Adapter.sync_from() / sync_to() | model_types, filters, sync_attrs, exclude_attrs, sync_filter, batch_size, concurrent, max_workers |
| Adapter.sync_complete() | operations |
| DiffSyncModel | create_bulk(), update_bulk(), delete_bulk() |
| Diff | filter(), exclude() |
| BaseStore | add_bulk(), update_bulk(), remove_bulk() |

Test plan

  • All 133 existing tests pass unchanged
  • 36 new tests added in tests/unit/test_new_features.py
  • Tests cover: Diff filter/exclude, model_types scoping, sync_attrs/exclude_attrs, query predicates, sync_filter callback, sync_complete operations, bulk CRUD, concurrent sync, and integration combinations
  • Review thread safety under high concurrency
  • Test with RedisStore backend
  • Performance benchmark with large datasets (1000+ models)

🤖 Generated with Claude Code

jdrew82 and others added 3 commits March 18, 2026 09:13
Add 9 new features organized in 3 phases:

Phase 1 - Diff-side Filtering:
- Diff.filter()/exclude() for post-diff, pre-sync manipulation
- model_types parameter to scope diffs/syncs to specific model types
- sync_attrs/exclude_attrs for attribute-level diff control
- filters parameter for per-model-type query predicates

Phase 2 - Sync-side Enhancements:
- sync_filter callback to approve/reject individual CRUD operations
- Structured operations summary passed to sync_complete()

Phase 3 - Bulk & Parallel Execution:
- Bulk CRUD methods (create_bulk/update_bulk/delete_bulk) on DiffSyncModel
- Store-level bulk methods (add_bulk/update_bulk/remove_bulk)
- Thread-safe LocalStore with locking for concurrent access
- concurrent flag for parallel sync of independent subtrees

All new parameters default to None/False for full backwards compatibility.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Replace test_new_features.py with three files following project conventions:
- test_diff_filtering.py — Diff.filter() and Diff.exclude()
- test_diffsync_diff_and_sync_parameters.py — model_types, filters,
  sync_attrs, exclude_attrs, sync_filter, concurrent, and combinations
- test_diffsync_model_bulk.py — bulk CRUD and store bulk operations

Converted from class-based to flat function style with Apache license
headers, matching the existing test structure.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@jdrew82 jdrew82 added the "status: internal review" and "type: enhancement" labels Mar 18, 2026
jdrew82 and others added 3 commits March 18, 2026 09:41
Apply linter auto-fixes (unused imports, whitespace, line length) and
add missing batch_size/concurrent/max_workers arg descriptions to
sync_to() docstring.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
The linter renamed adapter to _adapter (unused argument), but this is
a public API meant to be overridden by subclasses that will use the
parameter. Use noqa: ARG003 to suppress the unused-argument lint
instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>