
max / audiofiles

CONTRIBUTING: M018 row_id hashing + replay-safety guidance

- Updated Tables-synced list: smart_folders out (dropped in M015), in: user_config and edit_history.
- Added a paragraph in §Sync Changelog Triggers explaining the M018 hashed-row_id contract, the per-device row_id_salt, and the canonical-PK-in-data convention for DELETE triggers, so future synced tables follow the same pattern.
- Added a paragraph in §Database/Inline Migrations on replay safety (`IF NOT EXISTS`, `DROP IF EXISTS`, `INSERT OR IGNORE`), the exception for M001/M002, and the regression test that catches non-idempotent CREATEs.

Plus a note on the hash_row_id custom SQLite function registered on Database::open.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Author: Max Johnson <me@maxj.phd> · 2026-06-03 02:44 UTC
Commit: b11674da1ea1b4f67ae4a3fffd8072b88a23f422
Parent: 0bacd94
2 files changed, +8 insertions, -2 deletions
M CONTRIBUTING.md +7 -1
@@ -193,10 +193,16 @@ CREATE TABLE samples (
193 193
194 194 Production uses a file-backed SQLite database. Tests use `:memory:`.
195 195
196 + When adding a migration, make it **replay-safe**: every `CREATE TABLE / INDEX / TRIGGER` should be `IF NOT EXISTS` (or preceded by `DROP IF EXISTS` for triggers whose body changes), and any seed insert should be `INSERT OR IGNORE`. The `migration_replay_from_version_two_against_full_schema` test in `db.rs` rolls `user_version` back to 2 and re-runs every migration from M003 onward against a populated schema — non-idempotent CREATEs fail it. M001 (initial schema) and M002 (`DROP TABLE tags; ALTER tags_v2 RENAME TO tags`) are inherently one-shot and excluded from the replay test.
197 +
198 + The connection registers a custom `hash_row_id(salt, key)` SQLite function on open (rusqlite `functions` feature). It's used by the M018 sync triggers; if you write a migration that creates new sync triggers, prefer it for any row_id that would otherwise leak user content.
199 +
196 200 ### Sync Changelog Triggers
197 201
198 202 Every synced table has triggers that insert into `sync_changelog` on INSERT/UPDATE/DELETE. A `sync_state` row (`applying_remote = '1'`) suppresses triggers during pull operations to prevent recursion.
199 203
204 + Per migration M018 (2026-06-02), `sync_changelog.row_id` is hashed via `hash_row_id(row_id_salt, canonical_key)` for sensitive tables (samples, audio_analysis, tags, collection_members) so the server never sees raw sample hashes or tag strings. The salt is generated per device, stored in `sync_state`, never synced. DELETE triggers also emit the canonical PK in the encrypted `data` field, which `resolve::apply_delete` reads to reconstruct WHERE clauses without parsing the (now-opaque) row_id. When adding a new synced table, follow the same pattern: wrap row_id in `hash_row_id(...)` if it carries user content, and emit the canonical PK into `data` for DELETE.
205 +
200 206 ### rusqlite + async
201 207
202 208 `rusqlite::Connection` is `!Send`. In the sync crate (which uses tokio), all database operations go through `tokio::task::spawn_blocking`. In core (sync-only), no async runtime is needed.
@@ -205,7 +211,7 @@ Every synced table has triggers that insert into `sync_changelog` on INSERT/UPDA
205 211
206 212 Cloud sync is optional. The `SyncManager` coordinates push/pull:
207 213
208 - - **Tables synced** (in FK-safe order): `vfs`, `samples`, `collections`, `vfs_nodes`, `audio_analysis`, `tags`, `collection_members`, `smart_folders`
214 + - **Tables synced** (in FK-safe order): `vfs`, `samples`, `collections`, `vfs_nodes`, `audio_analysis`, `tags`, `collection_members`, `user_config`, `edit_history` (smart_folders merged into `collections.filter_json` in M015 and the standalone table dropped)
209 215 - **Delete order** is reversed (children first)
210 216 - **Column whitelist:** `table_columns()` restricts which columns sync to prevent schema drift
211 217 - **Blob sync:** Sample files sync to cloud storage for VFS entries with `sync_files = true`
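
The replay-safety rule added to CONTRIBUTING.md in this commit can be illustrated with a minimal sketch. It uses Python's standard `sqlite3` module rather than the project's Rust/rusqlite code, and every table, index, and trigger name in it (`demo_items`, `demo_log`, `demo_items_ai`) is hypothetical. It is not the project's migration runner or its replay test; it only shows why `IF NOT EXISTS`, `DROP ... IF EXISTS`, and `INSERT OR IGNORE` let the same migration body run twice against a populated schema.

```python
import sqlite3

# Hypothetical migration body; none of these names come from the real schema.
REPLAY_SAFE_SQL = """
CREATE TABLE IF NOT EXISTS demo_items (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE INDEX IF NOT EXISTS idx_demo_items_name ON demo_items(name);
CREATE TABLE IF NOT EXISTS demo_log (item_id INTEGER, op TEXT);

-- The trigger body may change between versions, so drop and recreate it.
DROP TRIGGER IF EXISTS demo_items_ai;
CREATE TRIGGER demo_items_ai AFTER INSERT ON demo_items
BEGIN
    INSERT INTO demo_log (item_id, op) VALUES (NEW.id, 'INSERT');
END;

-- Seed rows use INSERT OR IGNORE so a replay does not hit the PK constraint.
INSERT OR IGNORE INTO demo_items (id, name) VALUES (1, 'default');
"""

# The same table without IF NOT EXISTS, for contrast.
ONE_SHOT_SQL = "CREATE TABLE demo_items (id INTEGER PRIMARY KEY, name TEXT);"


def replay_safe_row_count():
    """Apply the replay-safe migration twice and return the seed row count."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(REPLAY_SAFE_SQL)
    conn.executescript(REPLAY_SAFE_SQL)  # second run simulates a replay
    (count,) = conn.execute("SELECT COUNT(*) FROM demo_items").fetchone()
    conn.close()
    return count


def one_shot_replay_fails():
    """Return True if replaying the non-idempotent CREATE raises an error."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(ONE_SHOT_SQL)
    try:
        conn.executescript(ONE_SHOT_SQL)
    except sqlite3.OperationalError:
        return True
    finally:
        conn.close()
    return False
```

In this sketch the replay-safe body leaves exactly one seed row after two runs, while the plain `CREATE TABLE` raises on the second run, which is the kind of failure the replay test described above is meant to surface.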
M todo.md +1 -1
@@ -36,7 +36,7 @@ Launch shipped 2026-06-01 (see `/Users/max/Code/launchplan_final.md`). Post-laun
36 36 ### Repo hygiene (launchplan §2.3)
37 37 - [ ] Remove `crates/audiofiles-app/tests/harness/mod.rs.bak` if it ever reappears (deleted this session as part of the fix commit).
38 38 - [x] **Audit `docs/` for stale plans.** Subagent-graded 2026-06-02: `database_schema.md` rewritten for the 12→18 migration count + M013/M015/M018 schema (source_path, filter_json, smart_folders dropped, row_id hashing, row_id_salt). `architecture.md` updated for the same plus the m4a/alac/caf/bwf extension list and the atomic-write helper in §Export System. `description.md` swapped "Smart Folders" for "Dynamic Collections" + added Cmd+, About entry. `troubleshooting.md` fixed migration count, sync table list, row_id_salt note, and a wrong mirror path (`~/.audiofiles-mirror/` → `~/audiofiles-mirror/`). `design-system.md`, `loose-files-mode.md`, `ml_classifier.md`, `plugin_authoring.md`, `trial-mode.md` were graded CURRENT and left alone.
39 - - [ ] `CONTRIBUTING.md` walkthrough against current build commands.
39 + - [x] **CONTRIBUTING.md walkthrough.** Tables-synced list updated (smart_folders out, edit_history + user_config in). Added a Sync Changelog Triggers paragraph on M018 row_id hashing + canonical-PK-in-data, with guidance for new synced tables. Added a replay-safety paragraph in the Database section (`IF NOT EXISTS` / `DROP IF EXISTS` / `INSERT OR IGNORE`, exempting M001/M002, pointing at the regression test) and a note on the `hash_row_id` SQLite function. Build commands table at the end was already accurate.
40 40
41 41 ## Audit deltas to revisit
42 42
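
The M018 contract described in the CONTRIBUTING.md hunk (hashed `row_id`, canonical PK carried in `data` for DELETE) can be sketched as follows. This is an illustration under several assumptions, not the project's implementation: it uses Python's `sqlite3` instead of rusqlite, it assumes `hash_row_id` is SHA-256 over the salt and key, it assumes `sync_state` is a key/value table, the column layouts of `tags` and `sync_changelog` are invented, and `data` is left unencrypted for readability.

```python
import hashlib
import sqlite3


def hash_row_id(salt, key):
    # Assumed construction (SHA-256 hex digest); the real function may differ.
    return hashlib.sha256(f"{salt}:{key}".encode("utf-8")).hexdigest()


# Hypothetical, simplified schema; column names are illustrative only.
SCHEMA = """
CREATE TABLE sync_state (key TEXT PRIMARY KEY, value TEXT);
CREATE TABLE tags (
    sample_hash TEXT NOT NULL,
    tag         TEXT NOT NULL,
    PRIMARY KEY (sample_hash, tag)
);
CREATE TABLE sync_changelog (
    id INTEGER PRIMARY KEY, table_name TEXT, op TEXT, row_id TEXT, data TEXT
);

CREATE TRIGGER tags_ad AFTER DELETE ON tags
-- Skip logging while remote changes are being applied (pull operations).
WHEN COALESCE(
    (SELECT value FROM sync_state WHERE key = 'applying_remote'), '0'
) != '1'
BEGIN
    INSERT INTO sync_changelog (table_name, op, row_id, data)
    VALUES (
        'tags',
        'DELETE',
        -- Opaque row_id: the server only ever sees the salted hash.
        hash_row_id(
            (SELECT value FROM sync_state WHERE key = 'row_id_salt'),
            OLD.sample_hash || '|' || OLD.tag
        ),
        -- Canonical PK travels in data (encrypted in the real system).
        OLD.sample_hash || '|' || OLD.tag
    );
END;
"""


def delete_changelog_entry():
    """Delete one tag row and return the (row_id, data) changelog entry."""
    conn = sqlite3.connect(":memory:")
    # Allow app-defined functions inside triggers on builds where the
    # default is off; older SQLite versions ignore unknown pragmas.
    conn.execute("PRAGMA trusted_schema = ON")
    conn.create_function("hash_row_id", 2, hash_row_id)
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO sync_state VALUES ('row_id_salt', 'device-salt')")
    conn.execute("INSERT INTO tags VALUES ('abc123', 'kick')")
    conn.execute("DELETE FROM tags WHERE sample_hash = 'abc123' AND tag = 'kick'")
    row = conn.execute("SELECT row_id, data FROM sync_changelog").fetchone()
    conn.close()
    return row
```

The point of the design is that the receiving side can rebuild its `WHERE` clause from `data` alone, so `row_id` never needs to be parsed and can stay opaque.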