max / audiofiles

Update docs, audit review, and Cargo.lock for v0.4.0

Audit review updated for Run 15 (grade A, 688 tests).
Add new docs: schema reference, smoke test checklist, test plan, troubleshooting guide, unsafe mode explainer.
Add fuzz report.
CONTRIBUTING: add Rhai hook style section.
Cargo.lock: version bumps to 0.4.0.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Author: Max J. <87768334+MaxJMath@users.noreply.github.com> · 2026-04-26 19:43 UTC
Commit: 5c22707b6a03850776341883c75c3d29d141abde
Parent: 60f1d01
10 files changed, +966 insertions, -72 deletions
@@ -215,6 +215,10 @@ Cloud sync is optional. The `SyncManager` coordinates push/pull:
215 215
216 216 TOML manifests in `plugins/devices/` define hardware constraints. Optional Rhai scripts in `hooks/` run sandboxed.
217 217
218 + ### Hook Style
219 +
220 + Optional Rhai hooks follow the cross-project style guide at `_meta/docs/rhai_style.md`. Run `_meta/scripts/lint-rhai.sh` to check formatting. Key points: 4-space indent, `snake_case` functions, `UPPER_CASE` constants, header comment block.
221 +
218 222 ### Manifest Contract
219 223
220 224 ```toml
M Cargo.lock +7 -7
@@ -380,7 +380,7 @@ checksum = "1505bd5d3d116872e7271a6d4e16d81d0c8570876c8de68093a09ac269d8aac0"
380 380
381 381 [[package]]
382 382 name = "audiofiles-app"
383 - version = "0.3.6"
383 + version = "0.4.0"
384 384 dependencies = [
385 385 "audiofiles-browser",
386 386 "audiofiles-core",
@@ -409,7 +409,7 @@ dependencies = [
409 409
410 410 [[package]]
411 411 name = "audiofiles-bench"
412 - version = "0.3.6"
412 + version = "0.4.0"
413 413 dependencies = [
414 414 "audiofiles-core",
415 415 "rayon",
@@ -419,7 +419,7 @@ dependencies = [
419 419
420 420 [[package]]
421 421 name = "audiofiles-browser"
422 - version = "0.3.6"
422 + version = "0.4.0"
423 423 dependencies = [
424 424 "audiofiles-core",
425 425 "audiofiles-rhai",
@@ -449,7 +449,7 @@ dependencies = [
449 449
450 450 [[package]]
451 451 name = "audiofiles-core"
452 - version = "0.3.6"
452 + version = "0.4.0"
453 453 dependencies = [
454 454 "bs1770",
455 455 "dirs",
@@ -471,7 +471,7 @@ dependencies = [
471 471
472 472 [[package]]
473 473 name = "audiofiles-rhai"
474 - version = "0.3.6"
474 + version = "0.4.0"
475 475 dependencies = [
476 476 "audiofiles-core",
477 477 "dirs",
@@ -485,7 +485,7 @@ dependencies = [
485 485
486 486 [[package]]
487 487 name = "audiofiles-sync"
488 - version = "0.3.6"
488 + version = "0.4.0"
489 489 dependencies = [
490 490 "audiofiles-core",
491 491 "base64",
@@ -506,7 +506,7 @@ dependencies = [
506 506
507 507 [[package]]
508 508 name = "audiofiles-train"
509 - version = "0.3.6"
509 + version = "0.4.0"
510 510 dependencies = [
511 511 "audiofiles-core",
512 512 "rand 0.8.5",
@@ -1,30 +1,31 @@
1 1 # audiofiles -- Code Audit Review
2 2
3 - **Last audited:** 2026-04-06 (seventeenth audit, Run 13 cross-project)
4 - **Previous audit:** 2026-03-28 (sixteenth audit, Run 12 cross-project)
3 + **Last audited:** 2026-04-18 (nineteenth audit, Run 15 cross-project)
4 + **Previous audit:** 2026-04-15 (eighteenth audit, Run 14 cross-project)
5 5
6 6 ## Overall Grade: A
7 7
8 - Run 13 cross-project audit. 704 tests. 0 clippy warnings. v0.3.3. Grade A (maintained). ML classifier expanded: 4 embedded RF models (drum 7-class 87.3% CV, bass 3-class 99.4% CV, synth 4-class 99.5% CV, vocal placeholder). 28 SampleClass variants (18 broad + 10 sub-classes). Multi-model OnceLock loading. Parameterized training binary. Maintainability splits completed (service.rs, state/mod.rs). Dead code/duplication audit clean.
8 + Run 15 cross-project audit. 688 tests (all pass). 0 clippy warnings. v0.3.6. Grade A (stable). ~40,219 LOC. Minor issues only: Relaxed atomic ordering in cancel flag (worker.rs:84,147), export CTE duplication. Previous unfixed items remain LOW severity (sidebar.rs unwraps, updater.rs URL trust).
9 9
10 10 ## Scorecard
11 11
12 12 | Dimension | Grade | Notes |
13 13 |-----------|:-----:|-------|
14 - | Code Quality | A | Clippy clean (0 warnings). No raw SQL in UI layer. Consistent error types. Zero production unwraps in sync/service.rs and theme.rs (verified). 2 guarded unwraps in sidebar.rs (safe but redundant). Disciplined `?` propagation. |
14 + | Code Quality | A | Clippy clean (0 warnings). No raw SQL in UI layer. Consistent error types. Zero production unwraps in sync/service.rs and theme.rs (verified). 2 guarded unwraps in sidebar.rs (safe but redundant). |
15 15 | Architecture | A | 5-crate workspace: core (sync DB/store), browser (state + UI + backend trait), app, sync, rhai. Backend trait cleanly abstracts data access. Core crate has zero UI/async dependencies. |
16 - | Testing | A | 704 tests, all passing. Core: ~410 (incl. 25 classifier tests, 3 e2e), browser: 183, app: 22, sync: 27, rhai: 34, train: ~28. VP-tree: 10 tests (incl. brute-force correctness). FingerprintIndex: 4 tests (incl. index-vs-linear parity). SimilarityIndex: 5 tests (incl. ranking-matches-linear, sorted output). App: updater state machine (8), API key persistence (7), tray icon (2), audio (5). drag_out/ has no tests (platform FFI, manual testing). |
17 - | Security | A | All SQL parameterized. LIKE wildcards escaped. Hash validated. Column whitelists in sync. 17 unsafe blocks in drag_out/ (all platform FFI: objc2 macOS, COM Windows). Drag-out filenames sanitized (path separators + traversal rejected). OTA updater HTTPS-only endpoint but trusts server-provided download URL (opens in browser, no auto-install). applying_remote cleared on startup. |
18 - | Performance | A | try_lock on cpal audio callback. LEFT JOIN enriched queries (no N+1). 7+ indexes. WAL mode. Background workers for import/analysis/export. Pre-computed waveforms. VP-tree indexes for both similarity (O(log n), fixed normalization) and fingerprint search (O(log n) + NCC verification). Sub-millisecond queries at 100K samples. |
19 - | Documentation | A | Every module has //! docs. Public functions have /// docs. SAFETY comments on unsafe blocks. architecture.md and README created. All pub functions now have /// doc comments. |
20 - | Dependencies | A | nih-plug removed (was the only git-pinned dep). All remaining deps use semver. No unused deps. |
21 - | Frontend | A | egui patterns clean. TOML theme system with 17 bundled themes (audiofiles default) + custom loading. "af/" logo in Recursive Mono Bold via embedded font. Waveform painter with click-to-seek. Keyboard shortcuts. try_lock from GUI thread. import_screens split into directory module. file_list.rs (689 LOC, high branch density) is the largest UI file. |
22 - | Type Safety | A | `VfsId`, `NodeId`, `SmartFolderId`, `CollectionId` i64 newtypes via `define_i64_id!` macro. `SampleHash(String)` validated newtype. `NodeType::parse()` returns explicit error. Good domain enums. Typed error hierarchy. |
23 - | Observability | A | `tracing` in all crates with EnvFilter subscriber (env-configurable log levels). 115 `#[instrument(skip_all)]` annotations across 50+ files. Core, rhai, sync, browser all instrumented. Density on par with GO/BB/PoM/MT. |
24 - | Concurrency | A | Correct `try_lock()` on audio thread with silence fallback. Workers in own threads with separate DB connections. `Mutex<Database>` in DirectBackend. `spawn_blocking` for sync DB ops. Single-lock .take() pattern in cancel operations (eliminates double-lock TOCTOU). |
25 - | Resilience | A | Worker Drop with clean Shutdown+join. Per-file error reporting during import. Audio stream failure non-fatal. Sync optional. Tray failure non-fatal. Atomic migrations. CASCADE foreign keys. `applying_remote` cleared on startup (`service.rs:102-110`). |
26 - | API Consistency | A | 47-method Backend trait with uniform `BackendResult<T>`. Consistent naming (list_*, create_*, delete_*, get_*, start_*, cancel_*). Core functions follow consistent patterns. |
27 - | Codebase Size | A | ~24.5K LOC across 5 crates + train for ~18 major features + ML classifier + cloud sync + Rhai scripting + native drag-out. Lean with excellent feature density. No dead code. Maintainability splits: service.rs→service/, state/mod.rs→state/ submodules. Largest files: browser/backend/direct.rs (983), core/vfs.rs (1,081, flat SQL exempt). 4 embedded RF models total 7.8MB. |
16 + | Testing | A+ | 688 tests, all passing. Core: ~410 (incl. 25 classifier tests, 3 e2e), browser: 183, app: 22, sync: 27, rhai: 34, train: ~28. VP-tree: 10 tests. FingerprintIndex: 4 tests. SimilarityIndex: 5 tests. App: updater state machine (8), API key persistence (7), tray icon (2), audio (5). |
17 + | Security | A | All SQL parameterized. LIKE wildcards escaped. Hash validated. Column whitelists in sync. 17 unsafe blocks in drag_out/ (all platform FFI). Drag-out filenames sanitized. OTA updater HTTPS-only endpoint but trusts server-provided download URL (opens in browser, no auto-install). |
18 + | Performance | A- | try_lock on cpal audio callback. LEFT JOIN enriched queries (no N+1). 7+ indexes. WAL mode. Background workers for import/analysis/export. Pre-computed waveforms. VP-tree indexes for similarity and fingerprint search. |
19 + | Documentation | A | Every module has //! docs. Public functions have /// docs. SAFETY comments on unsafe blocks. architecture.md and README. |
20 + | Dependencies | A | All deps use semver. No unused deps. No git-pinned deps. |
21 + | Frontend | A- | egui patterns clean. TOML theme system with bundled themes + custom loading. Waveform painter with click-to-seek. Keyboard shortcuts. try_lock from GUI thread. file_list.rs (689 LOC, high branch density) is the largest UI file. |
22 + | Type Safety | A | `VfsId`, `NodeId`, `SmartFolderId`, `CollectionId` i64 newtypes via macro. `SampleHash(String)` validated newtype. Good domain enums. Typed error hierarchy. |
23 + | Observability | A- | `tracing` in all crates with EnvFilter subscriber. 115 `#[instrument(skip_all)]` annotations across 50+ files. |
24 + | Concurrency | A | Correct `try_lock()` on audio thread with silence fallback. Workers in own threads with separate DB connections. Single-lock .take() pattern eliminates double-lock TOCTOU. Relaxed atomic ordering in cancel flag (worker.rs:84,147) -- minor. |
25 + | Resilience | A- | Worker Drop with clean Shutdown+join. Per-file error reporting during import. Audio stream failure non-fatal. Sync optional. Tray failure non-fatal. Atomic migrations. `applying_remote` cleared on startup. |
26 + | API Consistency | A | 47-method Backend trait with uniform `BackendResult<T>`. Consistent naming patterns. |
27 + | Migration Safety | A | Inline migrations, all additive. CASCADE foreign keys. |
28 + | Codebase Size | A- | ~40,219 LOC across 5 crates + train for ~18 major features + ML classifier + cloud sync + Rhai scripting + native drag-out. 4 embedded RF models total 7.8MB. |
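Review note on the Type Safety row: a minimal sketch of what an i64 newtype macro could look like. The macro name `define_i64_id!` appears in the previous revision of this row, but the macro body, the derives, and the `node_label` helper below are assumptions for illustration, not the project's code.

```rust
// Sketch only: the body of this macro is assumed, not copied from the repo.
macro_rules! define_i64_id {
    ($name:ident) => {
        #[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, PartialOrd, Ord)]
        pub struct $name(pub i64);

        impl From<i64> for $name {
            fn from(raw: i64) -> Self {
                Self(raw)
            }
        }
    };
}

define_i64_id!(VfsId);
define_i64_id!(NodeId);

/// Hypothetical helper. Because the IDs are distinct types, swapping the
/// two arguments is a compile error rather than a silent bug.
pub fn node_label(vfs: VfsId, node: NodeId) -> String {
    format!("vfs {} / node {}", vfs.0, node.0)
}
```

Calling `node_label(NodeId(7), VfsId(1))` would not compile, which is the property the scorecard is crediting.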
28 29
29 30 ## Module Heatmap
30 31
@@ -38,59 +39,45 @@ Run 13 cross-project audit. 704 tests. 0 clippy warnings. v0.3.3. Grade A (maint
38 39
39 40 ### Cold Spots
40 41
41 - All previous cold spots resolved. Two new LOW findings:
42 + Earlier cold spots are resolved except for two carried-over LOW findings (items 1-2). Run 15 adds two more LOW findings (items 3-4):
42 43
43 - 1. ~~**applying_remote crash recovery**~~ -- RESOLVED (cleared at top of perform_sync())
44 - 2. ~~**eprintln! in browser crate**~~ -- RESOLVED (migrated to tracing)
45 - 3. ~~**import_screens.rs (668 LOC)**~~ -- RESOLVED (split into directory module)
46 - 4. ~~**No E2E integration tests**~~ -- RESOLVED (3 e2e tests added)
47 - 5. ~~**Theme include_str! paths**~~ -- RESOLVED (themes moved into crate, paths fixed)
48 - 6. **sidebar.rs unwraps (LOW)** -- Lines 133, 135: `.unwrap()` on `collection_rename_target` inside an `if let Some()` guard. Safe (guard ensures value exists) but stylistically redundant. Could use the bound variable directly.
49 - 7. **updater.rs URL trust (LOW)** -- OTA updater stores `update.url` from server response without validation and passes it to `open::that()`. The updater only opens URLs in the browser (no auto-download/install), so risk is limited to the MNW server being compromised. Acceptable for alpha.
44 + 1. **sidebar.rs unwraps (LOW)** -- Lines 133, 135: `.unwrap()` on `collection_rename_target` inside an `if let Some()` guard. Safe (guard ensures value exists) but stylistically redundant. UNFIXED.
45 + 2. **updater.rs URL trust (LOW)** -- OTA updater stores `update.url` from server response without validation and passes it to `open::that()`. Risk limited to MNW server compromise. UNFIXED. Acceptable for alpha.
46 + 3. **Relaxed atomic ordering in cancel flag (LOW)** -- worker.rs:84,147 uses `Ordering::Relaxed` for the cancel flag atomic. On x86 this is fine (strong memory model), but on ARM it could theoretically allow a stale read. `Ordering::Acquire`/`Release` would be more correct.
47 + 4. **Export CTE duplication (LOW)** -- Minor code duplication in export CTEs. Not a correctness issue.
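Review note on item 1: the redundant-unwrap pattern and its usual replacement can be sketched as follows. `SidebarState`, the `Option<i64>` field type, and both function names are hypothetical; only the field name `collection_rename_target` comes from the audit text.

```rust
/// Hypothetical stand-in for the sidebar state. The field type is assumed.
pub struct SidebarState {
    pub collection_rename_target: Option<i64>,
}

/// Shape the audit describes: the guard proves the value exists, then the
/// body unwraps the same field a second time.
pub fn rename_prompt_redundant(state: &SidebarState) -> Option<String> {
    if let Some(_) = state.collection_rename_target {
        let id = state.collection_rename_target.unwrap();
        return Some(format!("Rename collection {id}"));
    }
    None
}

/// Same behavior, using the variable bound by the guard instead.
pub fn rename_prompt_bound(state: &SidebarState) -> Option<String> {
    if let Some(id) = state.collection_rename_target {
        return Some(format!("Rename collection {id}"));
    }
    None
}
```

Both versions return the same result for every input; the second simply cannot panic if the guard is later edited.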
50 48
51 49 ## Mandatory Surprise
52 50
53 - ### Finding (Run 13): Multi-model classifier uses generalized prediction
51 + ### Finding (Run 15): Relaxed Ordering in atomic cancel flag
54 52
55 - The Layer 2 classifier expansion (bass, synth, vocal) avoids code duplication by extracting `predict_with_model()` as a generic function parameterized by model, class array, and fallback. Each broad class routes to its own `OnceLock<RandomForestModel>` + `include_bytes!` pair. Empty models (vocal placeholder) gracefully fall back to the broad class — no panics, no special-casing. The training binary was similarly generalized with `TargetConfig` struct and `--target` flag, making `NUM_CLASSES` dynamic rather than const.
53 + In `worker.rs:84` and `worker.rs:147`, the worker cancel flag uses `Ordering::Relaxed` for both store and load operations. On x86_64, this is equivalent to `Acquire`/`Release` due to the strong memory model, so there is no practical bug. On ARM (including Apple Silicon), `Relaxed` provides no ordering guarantees -- a worker could theoretically continue processing one more item after cancellation is signaled.
56 54
57 - **Verdict:** Clean extension point. Adding a new Layer 2 classifier requires only: training data, `cargo run -p audiofiles-train -- --target X`, and a class array constant.
55 + The window is tiny (one iteration of the work loop), and the consequence is benign (one extra file processed before stopping). But the intent is clearly "stop as soon as possible," and `Acquire`/`Release` would express that intent correctly on all architectures.
58 56
59 - ### Previous finding (Run 12): sync/service.rs has zero production unwraps
57 + **Verdict:** Technically imprecise but practically harmless. The fix is a one-line change (`Relaxed` -> `Release` on store, `Relaxed` -> `Acquire` on load).
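Review note: the one-line fix proposed in the verdict could look roughly like this. `CancelFlag` and `process_until_cancelled` are hypothetical names, not taken from `worker.rs`. Whether the stronger ordering is strictly needed depends on whether the flag also publishes other data to the worker; the sketch only mirrors the change the audit suggests.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical cancel token shared between the UI thread and a worker.
#[derive(Clone, Default)]
pub struct CancelFlag(Arc<AtomicBool>);

impl CancelFlag {
    pub fn new() -> Self {
        Self::default()
    }

    /// Release: writes made before this call are visible to any thread
    /// that later observes the flag with an Acquire load.
    pub fn cancel(&self) {
        self.0.store(true, Ordering::Release);
    }

    /// Acquire: pairs with the Release store in `cancel`.
    pub fn is_cancelled(&self) -> bool {
        self.0.load(Ordering::Acquire)
    }
}

/// Handles items until the flag is observed; returns how many were handled.
/// The flag is checked once per item, so at most the in-flight item finishes.
pub fn process_until_cancelled<I>(items: I, flag: &CancelFlag) -> usize
where
    I: Iterator<Item = u32>,
{
    let mut done = 0;
    for _item in items {
        if flag.is_cancelled() {
            break;
        }
        done += 1;
    }
    done
}
```

Cloning the flag shares the same underlying atomic, so the UI side keeps one clone and each worker thread gets another.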
60 58
61 - **Impressive.**
59 + ### Previous finding (Run 13): Multi-model classifier generalized prediction
62 60
63 - Previous audit agents flagged sync/service.rs (1,437 lines) as having "15+ unwraps in critical paths." Line-by-line verification found zero production `.unwrap()` calls. All uses are in the `#[cfg(test)]` module (line 806+). Production code consistently uses `?` operator, `unwrap_or()`, `unwrap_or_else()`, and explicit match/if-let patterns. The same verification on theme.rs (877 lines) also found zero production unwraps.
64 -
65 - **Verdict:** Impressive. Error handling discipline across the largest files is exemplary.
66 -
67 - ### Finding (previous): applying_remote flag stuck after crash silently drops changes
68 -
69 - **Resolved.** Flag cleared at top of `perform_sync()` (`service.rs:102-110`).
61 + Clean extension point for Layer 2 classifiers. Adding a new classifier requires only training data, a training run, and a class array constant. Verdict: Clean design.
70 62
71 63 ## Strengths
72 64
73 65 - **Content-addressed storage is elegant.** SHA-256 dedup, CASCADE deletes, recursive CTE queries. SampleStore + VFS separation enables unlimited virtual hierarchies over one flat blob store.
74 -
75 - - **Backend trait abstraction is well-designed.** 47-method trait surface that maps 1:1 to what BrowserState needs. DirectBackend wraps Mutex<Database> + SampleStore + worker handles. No leaked abstractions.
76 -
77 - - **SyncKit integration is clean.** FK-safe ordering, column whitelists, changelog triggers, blob sync scaffolding. OAuth2 PKCE auth flow properly separated. Scheduler with exponential backoff.
78 -
66 + - **Backend trait abstraction is well-designed.** 47-method trait surface that maps 1:1 to what BrowserState needs. No leaked abstractions.
67 + - **SyncKit integration is clean.** FK-safe ordering, column whitelists, changelog triggers, blob sync scaffolding. OAuth2 PKCE auth flow properly separated.
79 68 - **Audio thread safety is correct.** The cpal audio callback uses `try_lock()` on `Mutex<PreviewPlayback>` and falls back to silence. No heap allocations on audio thread.
80 -
81 - - **Rhai scripting is well-sandboxed.** No filesystem access from scripts. Host API exposes only sample metadata. User plugins can override bundled ones. Four hook points with clean compilation/execution.
82 -
83 - - **Core functions live in core.** Sync-only, no async, no UI dependencies. All database queries, analysis, search, export -- everything computation-heavy is in core. Browser delegates through Backend trait.
69 + - **Rhai scripting is well-sandboxed.** No filesystem access from scripts. Host API exposes only sample metadata.
70 + - **Core functions live in core.** Sync-only, no async, no UI dependencies. All computation-heavy code is in core. Browser delegates through Backend trait.
84 71
85 72 ## Weaknesses
86 73
87 - - **drag_out/ has no automated tests** -- Platform FFI code (macOS objc2, Windows COM) is tested manually only. Hard to unit test (requires windowed app context), but integration test coverage is zero.
74 + - **drag_out/ has no automated tests** -- Platform FFI code tested manually only. Hard to unit test.
88 75 - **17 unsafe blocks in drag_out/** -- All justified (platform FFI). Each has SAFETY comments.
89 - - **file_list.rs branch density** -- 689 lines with high branching (egui immediate-mode). Could benefit from extracting row rendering into a helper.
76 + - **file_list.rs branch density** -- 689 lines with high branching (egui immediate-mode). Could benefit from extracting row rendering.
90 77
91 78 ## Action Items
92 79
93 - No new action items filed — all findings are LOW severity (acceptable for alpha).
80 + No new action items filed -- all findings are LOW severity (acceptable for alpha).
94 81
95 82 Previously resolved:
96 83 - ~~CRITICAL: applying_remote crash recovery~~ -- RESOLVED
@@ -101,15 +88,15 @@ Previously resolved:
101 88
102 89 ## Metrics Over Time
103 90
104 - | Metric | 6th (03-11) | 7th (03-13) | Adversarial (03-13) | 11th (03-18) | 12th (03-19) | 13th (03-19) | 14th (03-22) | ML (03-26) | Run 12 (03-28) | Run 13 (04-06) |
105 - |--------|:-----------:|:-----------:|:-------------------:|:------------:|:------------:|:------------:|:------------:|:----------:|:--------------:|:--------------:|
106 - | Overall | A- | A- | A- | A- | A | A | A | A | A | A |
107 - | LOC | 25.6K | 25.6K | 25.6K | ~25K | ~23K | ~23.5K | ~23.5K | ~24.5K | ~24.5K | ~25K |
108 - | Tests | 518 | 532 | 557 | 566 | 535 | 560 | 585 | 610 | 611 | 704 |
109 - | Crates | 7 + xtask | 7 + xtask | 7 + xtask | 7 + xtask | 5 | 5 | 5 | 5 + train | 5 + train | 5 + train |
110 - | Clippy | 2 (trivial) | 2 (trivial) | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
111 - | Unwrap (prod) | ~1 | 7 (all init) | 7 (all init) | 2 (sidebar, guarded) | 2 (sidebar, guarded) | 2 (sidebar) | 2 (sidebar) | 2 (sidebar) | 2 (sidebar) | 2 (sidebar) |
112 - | Unsafe | 2 (test) | 2 (test) | 2 (test) | 2 (test) | 17 (FFI) | 17 (FFI) | 17 (FFI) | 17 (FFI) | 17 (FFI) | 17 (FFI) |
91 + | Metric | 6th (03-11) | 7th (03-13) | Adversarial (03-13) | 11th (03-18) | 12th (03-19) | 13th (03-19) | 14th (03-22) | ML (03-26) | Run 12 (03-28) | Run 13 (04-06) | Run 14 (04-15) | Run 15 (04-18) |
92 + |--------|:-----------:|:-----------:|:-------------------:|:------------:|:------------:|:------------:|:------------:|:----------:|:--------------:|:--------------:|:--------------:|:--------------:|
93 + | Overall | A- | A- | A- | A- | A | A | A | A | A | A | A | A |
94 + | LOC | 25.6K | 25.6K | 25.6K | ~25K | ~23K | ~23.5K | ~23.5K | ~24.5K | ~24.5K | ~25K | ~40.2K | ~40.2K |
95 + | Tests | 518 | 532 | 557 | 566 | 535 | 560 | 585 | 610 | 611 | 704 | 688 | 688 |
96 + | Crates | 7 + xtask | 7 + xtask | 7 + xtask | 7 + xtask | 5 | 5 | 5 | 5 + train | 5 + train | 5 + train | 5 + train | 5 + train |
97 + | Clippy | 2 (trivial) | 2 (trivial) | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
98 + | Unwrap (prod) | ~1 | 7 (all init) | 7 (all init) | 2 (sidebar, guarded) | 2 (sidebar, guarded) | 2 (sidebar) | 2 (sidebar) | 2 (sidebar) | 2 (sidebar) | 2 (sidebar) | 2 (sidebar) | 2 (sidebar) |
99 + | Unsafe | 2 (test) | 2 (test) | 2 (test) | 2 (test) | 17 (FFI) | 17 (FFI) | 17 (FFI) | 17 (FFI) | 17 (FFI) | 17 (FFI) | 17 (FFI) | 17 (FFI) |
113 100
114 101 ---
115 102
@@ -123,21 +110,17 @@ See [audit_history.md](./audit_history.md) for full chronological audit log.
123 110
124 111 ### Overall Grade: A
125 112
126 - Minimal but appropriate doc set for the project's current stage. No inaccuracies found. description.md is an intentional placeholder. Documentation is proportional to the project's development status.
113 + Minimal but appropriate doc set for the project's current stage. No inaccuracies found. Documentation is proportional to the project's development status.
127 114
128 115 ### Document Heatmap
129 116
130 117 | Document | Status | Last Verified | Notes |
131 118 |----------|:------:|:-------------:|-------|
132 - | docs/todo.md | Current | 2026-04-06 | Active task list (Layer 2 classifiers, UX polish) |
119 + | docs/todo.md | Current | 2026-04-18 | Active task list |
133 120 | docs/architecture.md | Current | 2026-03-28 | System design + 5-crate workspace |
134 121 | docs/competition.md | Current | 2026-03-04 | Competitive analysis |
135 122 | docs/human_testing.md | Current | 2026-03-04 | Manual QA checklist |
136 - | docs/audit_review.md | Current | 2026-04-06 | Code audit history |
137 -
138 - ### Stale References Found (This Audit)
139 -
140 - None.
123 + | docs/audit_review.md | Current | 2026-04-18 | Code audit history |
141 124
142 125 ### Action Items
143 126
@@ -0,0 +1,234 @@
1 + # Schema — audiofiles
2 +
3 + SQLite database. Inline migrations tracked via `PRAGMA user_version` (current: 12). PRAGMA foreign_keys=ON enforced.
4 +
5 + ## Table Map
6 +
7 + | Domain | Tables | Purpose |
8 + |--------|--------|---------|
9 + | Samples | 2 | Content-addressed storage + analysis results |
10 + | VFS | 3 | Virtual file system trees + smart folders |
11 + | Organization | 3 | Tags, collections, collection members |
12 + | Analysis | 2 | Waveform data, audio fingerprints |
13 + | Edit History | 1 | Non-destructive edit tracking |
14 + | Config | 1 | App preferences |
15 + | SyncKit | 2 | Changelog + state |
16 +
17 + ---
18 +
19 + ## Content-Addressed Storage
20 +
21 + ### samples
22 + The primary entity. **SHA-256 hash is the primary key** — the same file always produces the same row. Re-importing a file with an existing hash is a no-op (dedup by design).
23 +
24 + | Column | Type | Notes |
25 + |--------|------|-------|
26 + | hash | TEXT PK | SHA-256 of file content |
27 + | original_name | TEXT | Filename at import time |
28 + | file_extension | TEXT | e.g., 'wav', 'flac', 'mp3' |
29 + | file_size | INTEGER | Bytes |
30 + | import_date | TEXT | ISO 8601 |
31 + | last_modified | TEXT | |
32 + | cloud_only | INTEGER | Boolean — evicted from local, stored in SyncKit blob |
33 + | duration | REAL | Seconds (set during analysis) |
34 +
35 + **Index:** idx_samples_name.
36 +
37 + **Storage:** Files stored in a content-addressed directory structure: `store/{hash[0..2]}/{hash[2..4]}/{hash}`.
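Review note: the layout `store/{hash[0..2]}/{hash[2..4]}/{hash}` can be sketched as a small path builder. The function name `store_path` and the validation rule (exactly 64 lowercase hex characters) are assumptions; the schema doc only specifies the directory shape.

```rust
use std::path::{Path, PathBuf};

/// Builds `root/{hash[0..2]}/{hash[2..4]}/{hash}` for a SHA-256 hex digest.
/// Returns `None` unless `hash` is exactly 64 lowercase hex characters
/// (this validation rule is assumed, not taken from store.rs).
pub fn store_path(root: &Path, hash: &str) -> Option<PathBuf> {
    let is_lower_hex = |b: u8| b.is_ascii_digit() || (b'a'..=b'f').contains(&b);
    if hash.len() != 64 || !hash.bytes().all(is_lower_hex) {
        return None;
    }
    // Input is ASCII-only at this point, so byte-range slicing is safe.
    Some(root.join(&hash[0..2]).join(&hash[2..4]).join(hash))
}
```

Rejecting anything that is not plain hex also keeps path separators and `..` out of the resulting path.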
38 +
39 + ### audio_analysis
40 + Analysis results. One row per sample, populated after analysis completes.
41 +
42 + | Column | Type | Notes |
43 + |--------|------|-------|
44 + | hash | TEXT PK FK → samples CASCADE | |
45 + | bpm | REAL | Detected tempo |
46 + | musical_key | TEXT | e.g., 'C major', 'F# minor' |
47 + | duration | REAL | Seconds |
48 + | sample_rate | INTEGER | Hz |
49 + | channels | INTEGER | |
50 + | peak_db / rms_db / lufs | REAL | Loudness measurements |
51 + | is_loop | INTEGER | Boolean — loop point detected |
52 + | classification | TEXT | 'kick', 'snare', 'hihat', 'bass', etc. |
53 + | classification_confidence | REAL | 0.0–1.0 |
54 + | spectral_centroid / spectral_flatness / spectral_rolloff / spectral_bandwidth | REAL | Spectral features |
55 + | zero_crossing_rate / centroid_variance / crest_factor / attack_time | REAL | Time-domain features |
56 + | onset_strength | REAL | Transient detection |
57 + | analyzed_at | TEXT | |
58 +
59 + **Indexes:** bpm, musical_key, duration, classification — for filter queries.
60 +
61 + ---
62 +
63 + ## Virtual File System
64 +
65 + ### vfs
66 + Named virtual file systems. Users can have multiple (e.g., "Main Library", "Project X").
67 +
68 + | Column | Type | Notes |
69 + |--------|------|-------|
70 + | id | INTEGER PK | |
71 + | name | TEXT UNIQUE | |
72 + | sync_files | INTEGER | Boolean — sync blobs to SyncKit cloud |
73 + | created_at / modified_at | TEXT | |
74 +
75 + ### vfs_nodes
76 + Tree structure of directories and sample references. A sample can appear in multiple directories (hard-link semantics).
77 +
78 + | Column | Type | Notes |
79 + |--------|------|-------|
80 + | id | INTEGER PK | |
81 + | vfs_id | INTEGER FK → vfs CASCADE | |
82 + | parent_id | INTEGER FK → vfs_nodes CASCADE | Self-ref; NULL = root |
83 + | sample_hash | TEXT FK → samples CASCADE | NULL for directories |
84 + | name | TEXT | Display name in this location |
85 + | node_type | TEXT | CHECK: 'directory' or 'sample' |
86 + | created_at | TEXT | |
87 +
88 + **Constraint:** UNIQUE(vfs_id, parent_id, name) — no duplicate names in same directory.
89 + **Indexes:** parent (tree traversal), vfs (list all nodes), hash (find all locations of a sample).
90 +
91 + ### smart_folders
92 + Saved queries that dynamically list matching samples.
93 +
94 + | Column | Type | Notes |
95 + |--------|------|-------|
96 + | id | INTEGER PK | |
97 + | vfs_id | INTEGER FK → vfs CASCADE | |
98 + | name | TEXT | |
99 + | query_json | TEXT | Serialized filter criteria |
100 +
101 + ---
102 +
103 + ## Organization
104 +
105 + ### tags
106 + Sample tags. Normalized to lowercase. A sample can have many tags.
107 +
108 + | Column | Type | Notes |
109 + |--------|------|-------|
110 + | sample_hash | TEXT FK → samples CASCADE | |
111 + | tag | TEXT | Lowercase normalized |
112 +
113 + **PK:** (sample_hash, tag).
114 + **Indexes:** hash (all tags for a sample), tag (all samples with a tag).
115 +
116 + ### collections
117 + Named groups of samples (playlists, kits, project collections).
118 +
119 + | Column | Type | Notes |
120 + |--------|------|-------|
121 + | id | INTEGER PK | |
122 + | name | TEXT UNIQUE | |
123 + | description | TEXT | |
124 +
125 + ### collection_members
126 + Many-to-many between collections and samples.
127 +
128 + - **PK:** (collection_id, sample_hash)
129 + - **FK:** collection_id → collections CASCADE, sample_hash → samples CASCADE
130 +
131 + ---
132 +
133 + ## Analysis Artifacts
134 +
135 + ### waveform_data
136 + Pre-computed waveform envelopes for display. Stored as BLOB for fast retrieval.
137 +
138 + | Column | Type | Notes |
139 + |--------|------|-------|
140 + | hash | TEXT PK FK → samples CASCADE | |
141 + | num_buckets | INTEGER | Resolution (typically 512 or 1024) |
142 + | peak_data | BLOB | Packed float pairs (min, max) per bucket |
143 + | sample_rate | INTEGER | |
144 + | duration | REAL | |
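Review note: one plausible encoding for `peak_data`, shown as a round-trip sketch. The schema only says "packed float pairs (min, max) per bucket"; the 32-bit width, little-endian byte order, and both function names below are assumptions.

```rust
/// Packs (min, max) pairs as consecutive little-endian f32 values,
/// 8 bytes per bucket. Width and byte order are assumed.
pub fn pack_peaks(peaks: &[(f32, f32)]) -> Vec<u8> {
    let mut out = Vec::with_capacity(peaks.len() * 8);
    for &(min, max) in peaks {
        out.extend_from_slice(&min.to_le_bytes());
        out.extend_from_slice(&max.to_le_bytes());
    }
    out
}

/// Inverse of `pack_peaks`. Returns `None` if the blob length is not a
/// multiple of 8 bytes, which would indicate a truncated blob.
pub fn unpack_peaks(blob: &[u8]) -> Option<Vec<(f32, f32)>> {
    if blob.len() % 8 != 0 {
        return None;
    }
    let pairs = blob
        .chunks_exact(8)
        .map(|c| {
            let min = f32::from_le_bytes([c[0], c[1], c[2], c[3]]);
            let max = f32::from_le_bytes([c[4], c[5], c[6], c[7]]);
            (min, max)
        })
        .collect();
    Some(pairs)
}
```

Under this layout the blob length should equal `num_buckets * 8`, which gives a cheap consistency check against the `num_buckets` column.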
145 +
146 + ### fingerprints
147 + Audio fingerprints for similarity search (VP-tree nearest neighbor).
148 +
149 + | Column | Type | Notes |
150 + |--------|------|-------|
151 + | hash | TEXT PK FK → samples CASCADE | |
152 + | envelope | BLOB | Peak envelope vector |
153 + | sample_rate | INTEGER | |
154 +
155 + ---
156 +
157 + ## Edit History
158 +
159 + ### edit_history
160 + Non-destructive edit tracking. Each edit produces a new sample (new hash). The history links source → result.
161 +
162 + | Column | Type | Notes |
163 + |--------|------|-------|
164 + | id | INTEGER PK AUTOINCREMENT | |
165 + | source_hash | TEXT | Original sample hash |
166 + | result_hash | TEXT | New sample hash after edit |
167 + | operation | TEXT | 'trim', 'reverse', 'normalize', 'fade', 'gain' |
168 + | params_json | TEXT | Operation parameters |
169 + | created_at | TEXT | |
170 +
171 + **Indexes:** source_hash (undo chain), result_hash (provenance).
172 +
173 + **Design:** Edits never modify existing files. Trim a sample → a new file with a new hash is created. Undo = delete the result and its edit_history entry.
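Review note: the source → result links form a chain that can be walked backwards to find where a sample came from. The sketch below models `edit_history` rows in memory; the `Edit` struct and the `provenance` function are hypothetical, and the cycle guard is a precaution in case two edits ever round-trip to identical bytes and therefore to the same hash.

```rust
use std::collections::HashSet;

/// Hypothetical in-memory mirror of one edit_history row.
pub struct Edit {
    pub source_hash: String,
    pub result_hash: String,
    pub operation: String,
}

/// Walks result -> source links starting at `hash` and returns the ancestor
/// hashes, nearest first. If several rows share a result_hash, the first one
/// in `edits` is followed. Stops when no source is recorded or a hash repeats.
pub fn provenance(edits: &[Edit], hash: &str) -> Vec<String> {
    let mut chain = Vec::new();
    let mut seen = HashSet::new();
    seen.insert(hash.to_string());
    let mut current = hash.to_string();

    while let Some(source) = edits
        .iter()
        .find(|e| e.result_hash == current)
        .map(|e| e.source_hash.clone())
    {
        if !seen.insert(source.clone()) {
            break; // cycle guard: this hash was already visited
        }
        chain.push(source.clone());
        current = source;
    }
    chain
}
```

In SQL the same walk would typically be a recursive CTE over the `result_hash` index; the in-memory version is only meant to show the direction of the links.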
174 +
175 + ---
176 +
177 + ## Configuration
178 +
179 + ### user_config
180 + App-wide key-value preferences (theme, default VFS, analysis settings, etc.).
181 +
182 + | Column | Type | Notes |
183 + |--------|------|-------|
184 + | key | TEXT PK | |
185 + | value | TEXT | |
186 +
187 + ---
188 +
189 + ## SyncKit Infrastructure
190 +
191 + ### sync_state
192 + Key-value store for sync configuration. Same schema as GO/BB:
193 + - `device_id` — unique device identifier
194 + - `pull_cursor` — last-pulled sequence number
195 + - `auto_sync_enabled` / `sync_interval_minutes`
196 + - `applying_remote` — suppresses changelog triggers during pull
197 + - `last_sync_at` / `initial_snapshot_done`
198 +
199 + ### sync_changelog
200 + Local change log. Synced tables have INSERT/UPDATE/DELETE triggers that write here (when `applying_remote` != '1'). Full row data serialized as JSON.
201 +
202 + | Column | Type | Notes |
203 + |--------|------|-------|
204 + | id | INTEGER PK | |
205 + | table_name | TEXT | |
206 + | op | TEXT | 'INSERT', 'UPDATE', 'DELETE' |
207 + | row_id | TEXT | |
208 + | timestamp | TEXT | |
209 + | data | TEXT | Full row as JSON |
210 + | pushed | INTEGER | 0 = unpushed |
211 +
212 + **Index:** idx_changelog_pushed.
213 +
214 + **Synced tables:** samples, audio_analysis, vfs, vfs_nodes, tags, collections, collection_members, smart_folders, edit_history, waveform_data, fingerprints.
215 +
216 + ---
217 +
218 + ## Cascade Rules
219 +
220 + - **CASCADE everywhere:** Deleting a sample cascades to audio_analysis, tags, collection_members, waveform_data, fingerprints, and all vfs_nodes referencing it. Deleting a VFS cascades to all its nodes and smart folders. Deleting a collection cascades to its members.
221 + - **No SET NULL or RESTRICT** — the schema is strictly hierarchical with samples as the root entity.
222 +
223 + ## Content-Addressing Implications
224 +
225 + - **No UPDATE on samples.hash** — hash is immutable. An "edit" creates a new sample.
226 + - **Dedup is automatic** — importing the same file twice produces the same hash, which is rejected as a duplicate PK.
227 + - **Cloud eviction:** When `cloud_only=1`, the local file is deleted but the row persists. The sample can be re-downloaded from SyncKit blob storage.
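Review note: the dedup rule above amounts to insert-if-absent keyed by content hash. A minimal in-memory sketch, assuming the first imported row wins and the second import changes nothing; `SampleRow` and `import_sample` are hypothetical names, and the hash is passed in rather than computed.

```rust
use std::collections::HashMap;

/// Hypothetical subset of a samples row.
#[derive(Debug, Clone, PartialEq)]
pub struct SampleRow {
    pub original_name: String,
    pub file_size: u64,
}

/// Inserts a row keyed by content hash. Returns `true` if the sample was new
/// and `false` if the hash already existed, in which case the first row is
/// kept unchanged (assumed behavior).
pub fn import_sample(
    samples: &mut HashMap<String, SampleRow>,
    hash: &str,
    row: SampleRow,
) -> bool {
    if samples.contains_key(hash) {
        return false;
    }
    samples.insert(hash.to_string(), row);
    true
}
```

One consequence of this rule is that `original_name` reflects the first import only, so the same audio imported later under a different filename keeps the earlier name.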
228 +
229 + ## Key Paths
230 +
231 + - `crates/audiofiles-core/src/db.rs` — migration runner + Database struct
232 + - `crates/audiofiles-core/src/store.rs` — content-addressed file storage
233 + - `crates/audiofiles-core/src/vfs.rs` — VFS operations
234 + - `crates/audiofiles-core/src/search.rs` — filter/search queries
@@ -0,0 +1,114 @@
1 + # Smoke Test Checklist — audiofiles
2 +
3 + Pre-release manual verification. Run after building a new version.
4 +
5 + ## Launch & Basics
6 +
7 + - [ ] App launches without error
8 + - [ ] Main window renders (egui)
9 + - [ ] Theme loads correctly (embedded TOML)
10 + - [ ] No panic in console output
11 +
12 + ## Import
13 +
14 + - [ ] Import a single audio file (WAV) — sample appears in library
15 + - [ ] Import a directory of mixed formats (WAV, MP3, FLAC, AIFF)
16 + - [ ] Verify content-addressed storage: re-import same file → no duplicate (same hash)
17 + - [ ] Import a zero-byte file — rejected with error message
18 + - [ ] Import a non-audio file — rejected gracefully
19 +
20 + ## Analysis
21 +
22 + - [ ] Select samples, run analysis
23 + - [ ] Progress indicator shows batch progress
24 + - [ ] BPM, key, loudness, classification populated after analysis
25 + - [ ] `smart_skip` works: non-rhythmic samples skip BPM/key (if enabled)
26 + - [ ] Cancel mid-analysis — batch stops after current sample finishes
27 + - [ ] Re-analyze a sample — values update
28 +
29 + ## VFS (Virtual File System)
30 +
31 + - [ ] Create a new VFS
32 + - [ ] Create directories within VFS
33 + - [ ] Link samples to directories
34 + - [ ] Navigate between directories
35 + - [ ] Rename a directory
36 + - [ ] Move a sample between directories
37 + - [ ] Delete a directory (samples not deleted from store)
38 +
39 + ## Search & Filter
40 +
41 + - [ ] Search by filename text
42 + - [ ] Filter by duration range
43 + - [ ] Filter by BPM range
44 + - [ ] Filter by classification (e.g., Kick, Snare)
45 + - [ ] Filter by tag
46 + - [ ] Combined filters work together
47 +
48 + ## Tags
49 +
50 + - [ ] Add a tag to a sample
51 + - [ ] Add tags in bulk (select multiple, apply tag)
52 + - [ ] Remove a tag
53 + - [ ] Search by tag
54 +
55 + ## Audio Preview
56 +
57 + - [ ] Click a sample — audio plays
58 + - [ ] Click another — previous stops, new one plays
59 + - [ ] Waveform displays during playback
60 +
61 + ## Drag and Drop
62 +
63 + - [ ] Drag samples out of the app into a DAW or file manager
64 + - [ ] Multiple file drag works
65 + - [ ] Verify temp symlinks created in `/tmp/audiofiles-drag-{pid}/`
66 +
67 + ## Export
68 +
69 + - [ ] Select samples, export to a directory
70 + - [ ] Export with format conversion (e.g., WAV → FLAC)
71 + - [ ] Metadata sidecar files generated (if enabled)
72 + - [ ] Verify exported files are playable
73 +
74 + ## Edit Operations
75 +
76 + - [ ] Trim a sample — result saved as new hash
77 + - [ ] Reverse a sample
78 + - [ ] Normalize a sample
79 + - [ ] Undo an edit operation
80 +
81 + ## Collections
82 +
83 + - [ ] Create a collection
84 + - [ ] Add samples to collection
85 + - [ ] Remove samples from collection
86 +
87 + ## Sync (if configured)
88 +
89 + - [ ] Log in to sync (MNW account)
90 + - [ ] Verify metadata syncs to another device
91 + - [ ] Blob sync: samples with `sync_files=true` upload to cloud
92 + - [ ] Cloud-only eviction: sample marked `cloud_only` after eviction
93 + - [ ] Re-download a cloud-only sample
94 +
95 + ## Bulk Rename
96 +
97 + - [ ] Select multiple samples
98 + - [ ] Apply rename pattern (e.g., `{class}_{bpm}_{name}`)
99 + - [ ] Preview shows expected names
100 + - [ ] Execute rename — names update in VFS
101 +
102 + ## Platform-Specific
103 +
104 + ### macOS
105 + - [ ] Drag-out works (NSPasteboardItem)
106 + - [ ] App appears in Dock correctly
107 +
108 + ### Windows
109 + - [ ] Drag-out works (OLE/COM)
110 + - [ ] Installer/MSI works
111 +
112 + ### Linux
113 + - [ ] AppImage launches
114 + - [ ] Drag-out works (X11/Wayland symlink fallback)
@@ -0,0 +1,173 @@
1 + # Test Plan — audiofiles
2 +
3 + ## Overview
4 +
5 + ~260 tests across 6 crates. Unit tests (inline) + integration tests (e2e pipeline with real audio). All use in-memory SQLite.
6 +
7 + ## Test Architecture
8 +
9 + **Unit tests:** Inline `#[cfg(test)]` modules in 44+ source files. Synchronous (no tokio in core).
10 +
11 + **Integration tests:** `crates/audiofiles-core/tests/` — end-to-end pipeline tests using programmatically generated audio (sine waves).
12 +
13 + **ML validation:** `crates/audiofiles-core/tests/classify_validation.rs` — tests classification accuracy against labeled drum samples.
14 +
15 + **DB pattern:** `Database::open_in_memory()` creates a fresh SQLite database with inline migrations. `tempfile::TempDir` for isolated sample storage.
16 +
17 + **No mocks:** Tests use real functions with temporary directories. E2E tests generate real audio.
18 +
19 + ## Running Tests
20 +
21 + ```bash
22 + # All tests (all crates)
23 + cargo test
24 +
25 + # Specific crate
26 + cargo test -p audiofiles-core # Core logic + analysis
27 + cargo test -p audiofiles-browser # UI state machine
28 + cargo test -p audiofiles-rhai # Plugin engine
29 + cargo test -p audiofiles-sync # Sync state
30 +
31 + # Specific module
32 + cargo test analysis::classify::tests
33 + cargo test state::tests
34 +
35 + # E2E pipeline
36 + cargo test e2e_import_analyze_search_tag_export
37 +
38 + # ML classification validation (requires labeled test data)
39 + cargo test classify_drum_samples
40 +
41 + # Benchmarks
42 + cargo bench -p audiofiles-bench
43 + ```
44 +
45 + ## What's Covered
46 +
47 + ### Integration Tests (`crates/audiofiles-core/tests/`)
48 +
49 + | File | Tests | What's Tested |
50 + |------|-------|---------------|
51 + | `e2e_pipeline.rs` | 3 | Full import-analyze-search-tag-export pipeline, analysis roundtrip (440Hz sine → spectral centroid verification), multi-sample search with dedup |
52 + | `classify_validation.rs` | 1 | ML classification accuracy on labeled drum samples (kick/snare/hihat/cymbal/clap/tom/percussion). Reports per-class precision/recall/F1. |
53 +
54 + ### Unit Tests by Crate
55 +
56 + **audiofiles-core** (~180 tests):
57 +
58 + | Module | What's Tested |
59 + |--------|---------------|
60 + | `analysis/classify.rs` | 24 tests: rule-based classification (kick/hihat/snare/noise/bass/impact/ambience/foley/texture), tag format, string roundtrip, feature vector layout |
61 + | `analysis/spectral.rs` | Spectral feature extraction |
62 + | `analysis/mfcc.rs` | MFCC feature extraction |
63 + | `analysis/bpm.rs` | Tempo detection |
64 + | `analysis/loudness.rs` | Peak/RMS/LUFS measurement |
65 + | `analysis/loop_detect.rs` | Loop point detection |
66 + | `analysis/waveform.rs` | Waveform envelope |
67 + | `analysis/decode.rs` | Audio decoding |
68 + | `analysis/config.rs` | Analysis configuration |
69 + | `analysis/worker.rs` | Worker thread protocol |
70 + | `analysis/suggest.rs` | Tag suggestions |
71 + | `export/encode.rs` | WAV, AIFF, FLAC, MP3 encoding |
72 + | `export/convert.rs` | Sample rate/bit depth conversion |
73 + | `export/sanitize.rs` | Filename sanitization |
74 + | `export/dither.rs` | Dithering algorithms |
75 + | `export/profile.rs` | Device profile matching |
76 + | `edit/trim.rs` | Sample trimming |
77 + | `edit/reverse.rs` | Sample reversal |
78 + | `edit/gain.rs` | Amplitude scaling |
79 + | `edit/fade.rs` | Fade in/out curves |
80 + | `edit/normalize.rs` | Normalization |
81 + | `vfs.rs` | Virtual file system operations |
82 + | `tags.rs` | Tag add/remove/list |
83 + | `search.rs` | Text + property + tag filters |
84 + | `store.rs` | Content-addressed storage (hash validation) |
85 + | `similarity.rs` | Fingerprint-based similarity (VP-tree) |
86 + | `fingerprint.rs` | Peak envelope extraction |
87 + | `vp_tree.rs` | Vantage point tree implementation |
88 + | `collections.rs` | Collection management |
89 + | `smart_folders.rs` | Smart folder rules |
90 + | `rename.rs` | Bulk rename patterns |
91 + | `id_types.rs` | Type-safe IDs (SampleHash, VfsId, NodeId) |
92 + | `util.rs` | File extension extraction, filename parsing |
93 +
94 + **audiofiles-browser** (~104 tests):
95 +
96 + `state/tests.rs` — Comprehensive UI state machine tests:
97 + - Selection (6): single, toggle, extend, all, clear, keyboard nav
98 + - Bulk operations & undo (18): tag add/remove, move, delete, undo stack
99 + - Export flow (7): configuration, state transitions, errors
100 + - Import & analysis (18): import workflows, analysis config, errors
101 + - Navigation & filtering (25): VFS switching, directory nav, search
102 + - Column config & sort (14): sort toggles, case-insensitive sorting
103 + - Rename patterns (13): preview, pattern tokens, deduplication
104 +
105 + **audiofiles-rhai** (34 tests):
106 +
107 + | Module | What's Tested |
108 + |--------|---------------|
109 + | `engine.rs` | Sandboxed engine creation, operation limits |
110 + | `loader.rs` | Manifest parsing and loading |
111 + | `hooks.rs` | Hook function execution |
112 + | `manifest.rs` | TOML manifest parsing |
113 + | `registry.rs` | Function registry |
114 + | `bundled.rs` | Bundled plugin handling |
115 + | `host_api.rs` | Host API exposure |
116 +
117 + **audiofiles-sync** (8 tests): Sync changelog state machine.
118 +
119 + **audiofiles-app** (12 tests): License validation, audio device discovery, updater.
120 +
121 + ### Test Helpers
122 +
123 + `crates/audiofiles-core/src/test_helpers.rs`:
124 + ```rust
125 + pub fn insert_fake_sample(db: &Database, hash: &str)
126 + pub fn insert_sample_with_analysis(db, hash, name, vfs_id, bpm, key, duration, class) -> NodeId
127 + ```
128 +
129 + ## What's Not Tested
130 +
131 + | Area | Reason |
132 + |------|--------|
133 + | GUI rendering | egui immediate-mode; no automated UI test framework. Manual verification. |
134 + | Platform drag-drop FFI | macOS objc2, Windows COM/OLE. Manual per-platform testing. |
135 + | Audio playback (cpal) | Requires audio hardware. Tested manually. |
136 + | MIDI input | Requires MIDI hardware. |
137 + | SyncKit cloud sync E2E | Requires MNW server. Sync state machine is unit-tested. |
138 + | Distribution artifacts | DMG, MSI, AppImage — tested during release builds. |
139 + | Large file handling | Multi-gigabyte imports; tested manually. |
140 +
141 + ## Adding New Tests
142 +
143 + ### Core unit test
144 + ```rust
145 + #[cfg(test)]
146 + mod tests {
147 + use super::*;
148 +
149 + #[test]
150 + fn my_test() {
151 + let dir = tempfile::tempdir().unwrap();
152 + let db = Database::open_in_memory().unwrap();
153 + // No async, no external deps
154 + }
155 + }
156 + ```
157 +
158 + ### E2E test
159 + ```rust
160 + #[test]
161 + fn my_e2e_test() {
162 + let dir = tempfile::tempdir().unwrap();
163 + let db = Database::open_in_memory().unwrap();
164 + // Generate audio, import, analyze, assert
165 + }
166 + ```
167 +
168 + ## Key Paths
169 +
170 + - `crates/audiofiles-core/tests/` — E2E pipeline + ML validation tests
171 + - `crates/audiofiles-core/src/test_helpers.rs` — Fixture helpers
172 + - `crates/audiofiles-browser/src/state/tests.rs` — UI state tests (104 tests)
173 + - `crates/audiofiles-rhai/src/` — Plugin engine tests
M docs/todo.md +1 -1
@@ -3,7 +3,7 @@
3 3 ## Status
4 4 Done: All pre-beta phases. Active: None. Next: Vocal layer 2, sample forge (phases 10-16).
5 5
6 - v0.3.6. Audit grade A. 611 tests.
6 + v0.4.0. Audit grade A. 611 tests.
7 7
8 8 ---
9 9
@@ -0,0 +1,143 @@
1 + # Troubleshooting — audiofiles
2 +
3 + ## Analysis Pipeline Stalls
4 +
5 + **Symptoms:** Analysis batch stuck at a percentage, no progress events.
6 +
7 + ### Decision Tree
8 +
9 + 1. **Analysis shows error for specific sample?**
10 + - "Probe failed" → Unsupported or corrupted audio format. Convert to WAV/MP3 in Audacity.
11 + - "No audio track" → File has no playable audio stream.
12 + - "Decoder failed" → Codec not supported by Symphonia. Convert to WAV.
13 + - "No audio data" → File decoded to zero samples (silent or truncated).
14 +
15 + 2. **Batch stuck with no error?**
16 + - Worker thread may be stuck on a very large file
17 + - Cancel the batch (UI cancel button sends `WorkerCommand::Cancel` + sets `AtomicBool`)
18 + - Cancel takes effect after the current sample finishes processing
19 + - If the cancel has not taken effect within 30s, quit and restart the app
20 +
21 + 3. **Partial results (some fields null)?**
22 + - `smart_skip` enabled: BPM/key/loop intentionally skipped for non-rhythmic/non-pitched samples (drums, ambience, textures)
23 + - If unexpected: disable `smart_skip` in analysis config and re-analyze
24 +
25 + **WAV fallback:** WAV files that fail the Symphonia probe automatically fall back to the `hound` library.
26 +
27 + **Analysis cap:** By default, expensive analyses (STFT, BPM, key) process only the first 30 seconds (`max_analysis_seconds`). Loudness and fingerprint always use the full signal.
28 +
29 + ## Content-Addressed Storage Corruption
30 +
31 + **Symptoms:** "Sample not found" errors, files inaccessible.
32 +
33 + **Storage model:** Files stored as `{sha256-hash}.{ext}` in flat directory. Hash IS the primary key.
34 +
35 + | Symptom | Cause | Fix |
36 + |---------|-------|-----|
37 + | "Sample not found: {hash}" | File missing from disk or DB | Check if file exists: `ls ~/.config/audiofiles/samples/{hash}.*`. If missing, re-import from original source. |
38 + | "Invalid hash: expected 64 lowercase hex chars" | DB corruption (hash field modified) | Find bad rows: `SELECT hash FROM samples WHERE LENGTH(hash) != 64 OR hash GLOB '*[^0-9a-f]*'`, then delete them |
39 + | "Cannot import zero-byte file" | Empty file | Skip — only valid audio files with content can be imported |
40 + | Hash mismatch (file changed on disk) | Disk corruption or manual file edit | Delete corrupted file, remove DB row, re-import original |
41 +
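The storage model and the "Invalid hash" row above can be illustrated with a small sketch. It assumes only what the error message states (64 lowercase hex characters) and the `{hash}.{ext}` naming; `is_valid_sample_hash` and `store_file_name` are hypothetical names, not the project's actual API.

```rust
/// Returns true when `hash` has the shape the error message describes:
/// exactly 64 lowercase hex characters (a SHA-256 digest in hex).
/// Hypothetical helper for illustration only.
pub fn is_valid_sample_hash(hash: &str) -> bool {
    hash.len() == 64
        && hash
            .bytes()
            .all(|b| b.is_ascii_digit() || (b'a'..=b'f').contains(&b))
}

/// Builds the flat-store file name `{hash}.{ext}`, or `None` for a bad hash.
pub fn store_file_name(hash: &str, ext: &str) -> Option<String> {
    if is_valid_sample_hash(hash) {
        Some(format!("{hash}.{ext}"))
    } else {
        None
    }
}
```

A length check alone would accept uppercase or non-hex strings of 64 characters, which is why the sketch also checks the character set.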
42 + **Verify storage integrity:**
43 + ```bash
44 + # Count files on disk vs DB
45 + ls ~/.config/audiofiles/samples/ | wc -l
46 + sqlite3 ~/.config/audiofiles/audiofiles.db "SELECT COUNT(*) FROM samples"
47 + # Numbers should match (approximately — cloud_only samples have no local file)
48 + ```
49 +
50 + ## VFS Inconsistencies
51 +
52 + **Symptoms:** Files disappear from browser, "node not found" errors, duplicate names.
53 +
54 + | Error | Cause | Fix |
55 + |-------|-------|-----|
56 + | "Node not found: {id}" | Node deleted or cross-VFS reference | Refresh view (navigate away and back) |
57 + | "VFS not found: {id}" | VFS deleted | Create new VFS or check DB |
58 + | "Name conflict: {name}" | Duplicate name at same directory level | Rename one of the duplicates |
59 + | "Invalid node name" | Contains `/`, `\`, null bytes, or is `.`/`..` | Use standard filenames |
60 + | "Move would create circular parent reference" | Moving node under its own descendant | Move to a different folder |
61 +
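The "Invalid node name" and "circular parent reference" rows above describe two checks that can be sketched as follows. This is a minimal illustration, not the project's code: the function names, the `u64` node IDs, the in-memory parent map, and the rejection of empty names are all assumptions.

```rust
use std::collections::HashMap;

/// Rejects the names listed in the table: anything containing `/`, `\`,
/// or a null byte, plus `.` and `..`. Rejecting the empty string is an
/// extra assumption of this sketch.
pub fn is_valid_node_name(name: &str) -> bool {
    if name.is_empty() || name == "." || name == ".." {
        return false;
    }
    !name.chars().any(|c| c == '/' || c == '\\' || c == '\0')
}

/// Returns true if placing `node` under `new_parent` would make it its own
/// ancestor. `parent_of` maps each node ID to its parent (None = root).
pub fn would_create_cycle(
    parent_of: &HashMap<u64, Option<u64>>,
    node: u64,
    new_parent: Option<u64>,
) -> bool {
    let mut cur = new_parent;
    // Walk up from the target parent; the step cap guards against data
    // that already contains a loop.
    for _ in 0..=parent_of.len() {
        match cur {
            Some(id) if id == node => return true,
            Some(id) => cur = parent_of.get(&id).copied().flatten(),
            None => return false,
        }
    }
    true // ran out of steps: existing data already loops, so refuse the move
}
```

Walking upward from the destination is enough, because a cycle can only appear if the moved node is already an ancestor of its new parent.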
62 + **Broken mirror symlinks** (Unix only): If VFS mirror has dead symlinks, re-run mirror sync (idempotent) or delete `~/.audiofiles-mirror/` and restart.
63 +
64 + **Orphaned samples** (files in store not referenced by any VFS):
65 + ```sql
66 + SELECT hash FROM samples
67 + WHERE hash NOT IN (SELECT DISTINCT sample_hash FROM vfs_nodes WHERE sample_hash IS NOT NULL)
68 + AND hash NOT IN (SELECT DISTINCT sample_hash FROM collection_members WHERE sample_hash IS NOT NULL);
69 + ```
70 +
71 + ## Drag-and-Drop Platform Issues
72 +
73 + ### macOS
74 + | Symptom | Cause | Fix |
75 + |---------|-------|-----|
76 + | Drag doesn't start | App not in foreground, no key window | Click window to focus, try again |
77 + | Drag ignored on re-attempt | Previous drag still in progress (DRAG_ACTIVE flag) | Wait for first drag to complete |
78 +
79 + ### Windows
80 + | Symptom | Cause | Fix |
81 + |---------|-------|-----|
82 + | Drag doesn't start | OleInitialize failed or CF_HDROP allocation failed | Restart app. If persistent, check COM initialization. |
83 + | Drag fails with many files | Temp symlink creation fails | Check permissions and free space on the system temp directory |
84 +
85 + ### All platforms
86 + - Drag creates temp symlinks in `/tmp/audiofiles-drag-{pid}/`
87 + - Name collisions auto-resolved with `(1)`, `(2)` suffixes
88 + - Large batch drags (100+ files) may be slow — try fewer files
89 +
90 + ## Database Issues
91 +
92 + **Database location:** `~/.config/audiofiles/audiofiles.db` (platform-dependent)
93 +
94 + **12 inline migrations** tracked via `PRAGMA user_version`. Run automatically on app startup.
95 +
96 + | Symptom | Cause | Fix |
97 + |---------|-------|-----|
98 + | App won't start | Migration failed or corrupt DB | Delete DB file and restart (fresh DB). Re-import samples. |
99 + | "database is locked" | Another process or backup tool | `lsof \| grep audiofiles.db`, close competing process |
100 + | "FOREIGN KEY constraint failed" | Data corruption (orphaned references) | Fix: `DELETE FROM vfs_nodes WHERE sample_hash NOT IN (SELECT hash FROM samples)` |
101 + | Slow after large import | WAL file too large | Restart app (checkpoints WAL) or: `sqlite3 audiofiles.db 'PRAGMA wal_checkpoint(TRUNCATE)'` |
102 +
103 + **Sync changelog stuck:**
104 + ```sql
105 + -- Check if applying_remote flag is stuck
106 + SELECT * FROM sync_state WHERE key='applying_remote';
107 + -- If value is '1', fix:
108 + UPDATE sync_state SET value='0' WHERE key='applying_remote';
109 + ```
110 +
111 + ## Sync Issues
112 +
113 + **What syncs:** VFS, samples (metadata), collections, vfs_nodes, audio_analysis, tags, collection_members, smart_folders. Sync order respects FK relationships.
114 +
115 + **Blob sync:** Sample audio files sync to cloud storage for VFS entries with `sync_files = true`. The `cloud_only` flag marks samples whose local blobs have been evicted.
116 +
117 + | Symptom | Cause | Fix |
118 + |---------|-------|-----|
119 + | "Auth error: token expired" | SyncKit token expired | Re-authenticate in sync settings |
120 + | "Sync client error: connection timeout" | Network down or server unreachable | Check internet, verify server URL |
121 + | Sample shows "cloud only" but won't download | Upload failed or remote file deleted | Re-trigger sync. If file truly lost, re-import from original. |
122 + | Changes don't appear on other devices | Push failed or changelog empty | Check `sync_changelog` table for unpushed entries |
123 + | Changelog growing unbounded | Old entries never cleaned | `DELETE FROM sync_changelog WHERE pushed = 1 AND timestamp < datetime('now', '-30 days')` |
124 +
125 + ## Diagnostics Checklist
126 +
127 + ```bash
128 + # Database integrity
129 + sqlite3 ~/.config/audiofiles/audiofiles.db "PRAGMA integrity_check"
130 +
131 + # Sample count: disk vs DB
132 + ls ~/.config/audiofiles/samples/ | wc -l
133 + sqlite3 ~/.config/audiofiles/audiofiles.db "SELECT COUNT(*) FROM samples"
134 +
135 + # Pending sync changes
136 + sqlite3 ~/.config/audiofiles/audiofiles.db "SELECT COUNT(*) FROM sync_changelog WHERE pushed = 0"
137 +
138 + # WAL file size (should be <50MB normally)
139 + ls -lh ~/.config/audiofiles/audiofiles.db-wal
140 +
141 + # Broken mirror symlinks (Unix)
142 + find ~/.audiofiles-mirror -type l ! -exec test -e {} \; -print 2>/dev/null
143 + ```
@@ -0,0 +1,91 @@
1 + # Unsafe Mode
2 +
3 + ## Overview
4 +
5 + Per-vault opt-in mode where audiofiles references samples at their original disk location instead of copying them into the vault's `samples/` directory. Trades safety for disk space savings.
6 +
7 + In normal mode, every import copies the file into `vault/samples/<hash>.<ext>`. In unsafe mode, the database records the original path and no copy is made. The file stays where the user put it.
8 +
9 + ## Enabling
10 +
11 + Unsafe mode is a vault-level setting, toggled in vault settings. Changing it only affects future imports — samples already in the vault are not moved or copied retroactively.
12 +
13 + A vault stores its mode as a preference row:
14 +
15 + | Key | Value | Default |
16 + |-----|-------|---------|
17 + | `unsafe_mode` | `0` or `1` | `0` |
18 +
19 + When unsafe mode is on, the vault settings UI shows a persistent warning:
20 + > **Unsafe mode is on.** Samples are not copied into this vault. Moving, renaming, or deleting originals will break references.
21 +
22 + ## Import Behavior
23 +
24 + ### Normal Mode (unchanged)
25 +
26 + 1. Hash file → copy to `samples/<hash>.<ext>` → insert `samples` row
27 + 2. File is self-contained in the vault forever
28 +
29 + ### Unsafe Mode
30 +
31 + 1. Hash file → record original absolute path → insert `samples` row
32 + 2. No copy is made
33 + 3. The `samples` row gets an additional `source_path TEXT` populated with the original absolute path
34 +
35 + If a sample with the same hash already exists (duplicate detection still works), skip it as usual regardless of mode.
36 +
37 + ## Schema Change
38 +
39 + Add one column to `samples`:
40 +
41 + ```sql
42 + ALTER TABLE samples ADD COLUMN source_path TEXT;
43 + ```
44 +
45 + - `NULL` → normal mode sample (blob lives in `samples/`)
46 + - Non-`NULL` → unsafe mode sample (blob lives at this path)
47 +
48 + This is the only way to tell which mode a sample was imported under. A vault in unsafe mode can contain a mix of both kinds if the mode was toggled between imports.
49 +
50 + ## Playback and Access
51 +
52 + When resolving a sample's file path:
53 +
54 + 1. If `source_path` is `NULL`, use `vault/samples/<hash>.<ext>` (current behavior)
55 + 2. If `source_path` is non-`NULL`, use `source_path`
56 +
57 + ## Graceful Recovery
58 +
59 + Unsafe mode makes a best-effort attempt to handle files that have gone missing. It does not try to be clever.
60 +
61 + ### On Access (Playback, Preview, Export, Analysis)
62 +
63 + If `source_path` points to a file that no longer exists:
64 +
65 + 1. Check `vault/samples/<hash>.<ext>` as a fallback — the user may have re-imported in normal mode or manually placed the file there
66 + 2. If the fallback also misses, mark the sample as **unavailable** in the UI (grayed out, struck-through name, tooltip: "Original file not found at `<path>`")
67 + 3. Do not delete the metadata row — the sample keeps its tags, VFS position, and analysis data
68 +
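The resolution order described in "Playback and Access" and the fallback above can be sketched as one function. This is an illustration under stated assumptions: `Resolved`, `resolve_sample_path`, and the injected `exists` check are hypothetical, and reporting a missing normal-mode file as unavailable is a choice of this sketch rather than something the spec states.

```rust
use std::path::{Path, PathBuf};

/// Outcome of resolving a sample to a file on disk (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum Resolved {
    Available(PathBuf),
    Unavailable,
}

/// Tries `source_path` first (unsafe-mode samples), then falls back to
/// `vault/samples/<hash>.<ext>`. `exists` is injected so the logic can be
/// exercised without touching the filesystem.
pub fn resolve_sample_path(
    vault_root: &Path,
    hash: &str,
    ext: &str,
    source_path: Option<&Path>,
    exists: impl Fn(&Path) -> bool,
) -> Resolved {
    let in_vault = vault_root.join("samples").join(format!("{hash}.{ext}"));
    if let Some(original) = source_path {
        if exists(original) {
            return Resolved::Available(original.to_path_buf());
        }
    }
    if exists(&in_vault) {
        Resolved::Available(in_vault)
    } else {
        // Metadata row is kept; the UI would show the sample as unavailable.
        Resolved::Unavailable
    }
}
```

In real use the `exists` argument would likely be `|p| p.exists()`; passing it in keeps the decision logic separate from I/O.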
69 + ### Relocate
70 +
71 + Provide a **Relocate** action on unavailable samples:
72 +
73 + - User picks a new file
74 + - App verifies the SHA-256 hash matches the sample's `hash`
75 + - If it matches, update `source_path` to the new location
76 + - If it doesn't match, reject with: "Hash mismatch — this is a different file"
77 +
78 + No batch relocate, no folder scanning, no automatic search. Keep it manual and simple.
79 +
80 + ### Vault Integrity Check
81 +
82 + Add a "Check vault" action (in vault settings, next to the unsafe mode toggle) that scans all `source_path` entries and reports how many are valid vs. missing. Informational only — it does not fix anything, just gives the user a count.
83 +
84 + ## What This Does NOT Do
85 +
86 + - Does not watch the filesystem for changes
87 + - Does not auto-relocate or search for moved files
88 + - Does not create symlinks or hardlinks
89 + - Does not support relative paths — `source_path` is always absolute
90 + - Does not convert existing samples between modes (no retroactive copy-in or copy-out)
91 + - Does not sync `source_path` values via SyncKit — paths are device-local and meaningless on other machines
A flaws.md +152
@@ -0,0 +1,152 @@
1 + # audiofiles — Fuzz Report (2026-04-24)
2 +
3 + Confidence: **MEDIUM** that code is correct overall.
4 +
5 + The codebase has strong foundations (content-addressed storage, parameterized SQL,
6 + sandboxed Rhai, channel-based workers), but systematic fuzzing found 28 issues.
7 +
8 + ## Fixed (mechanical)
9 +
10 + These have been patched in this pass.
11 +
12 + ### F1. Spectral flatness NaN (was CRITICAL)
13 + `analysis/spectral.rs:156` — `(-10.0f64).ln()` is NaN (ln of negative). Should
14 + be `(1e-10f64).ln()`. Propagated NaN through classifier and into DB for any quiet
15 + audio. **Fixed**: replaced with `(1e-10_f64).ln()`.
16 +
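To illustrate F1, here is a minimal sketch of a spectral flatness computation with a log floor. It is not the code in `analysis/spectral.rs`; the function name, the `EPS` constant, and the silence handling are assumptions chosen to show why the floor must be a small positive number.

```rust
/// Floor applied before taking logs, so zero-power bins stay finite.
const EPS: f64 = 1e-10;

/// Spectral flatness: geometric mean over arithmetic mean of the power
/// spectrum, in [0, 1]. Hypothetical sketch, not the project's function.
pub fn spectral_flatness(power: &[f64]) -> f64 {
    if power.is_empty() {
        return 0.0;
    }
    let n = power.len() as f64;
    // ln of a floored value is finite; ln of a negative constant is NaN.
    let mean_log = power.iter().map(|&p| p.max(EPS).ln()).sum::<f64>() / n;
    let mean = power.iter().sum::<f64>() / n;
    if mean <= EPS {
        return 0.0; // treat (near-)silence as non-flat instead of dividing by ~0
    }
    (mean_log.exp() / mean).min(1.0)
}
```

With the floor in place, quiet or silent input yields a finite value, so nothing non-finite reaches the classifier or the database.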
17 + ### F2. device_sample_rate == 0 guard (was SERIOUS)
18 + `app/audio.rs` `fill_preview` and `browser/instrument.rs` `render_voices` both
19 + divide by `device_sample_rate`/`device_sr`. A virtual audio device reporting
20 + rate 0 would panic on OOB index. **Fixed**: early return when rate is 0.
21 +
22 + ### F3. NaN/Inf gain corrupts audio (was SERIOUS)
23 + `edit/gain.rs:9` — `NaN.clamp(-1.0, 1.0)` returns NaN. Non-finite `db` produces
24 + NaN or Inf scale, silently writing a corrupted file. **Fixed**: early return when
25 + `!db.is_finite()`.
26 +
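The F3 guard can be sketched like this. The function name, the in-place `f32` buffer, and the boolean return value are assumptions for illustration; the second check (a finite `db` that still overflows to an infinite scale) goes slightly beyond what the report describes.

```rust
/// Applies a gain in decibels in place and clamps to [-1.0, 1.0].
/// Returns false and leaves the buffer untouched if the gain is unusable.
/// Hypothetical sketch, not the code in `edit/gain.rs`.
pub fn apply_gain_db(samples: &mut [f32], db: f32) -> bool {
    if !db.is_finite() {
        return false; // NaN or +/-Inf in, nothing written
    }
    let scale = 10.0f32.powf(db / 20.0);
    if !scale.is_finite() {
        return false; // a very large finite dB value overflows to Inf
    }
    for s in samples.iter_mut() {
        *s = (*s * scale).clamp(-1.0, 1.0);
    }
    true
}
```

The guard has to come before the multiply because `clamp` passes NaN through unchanged, so it cannot repair a sample after the fact.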
27 + ### F4. Fade off-by-one (was MINOR)
28 + `edit/fade.rs:41,61` — `t = i / fade_frames` never reached 1.0 for fade-in or
29 + 0.0 for fade-out. Audible click at the boundary for short fades. **Fixed**: changed
30 + denominator to `(fade_frames - 1).max(1)`. Tests updated.
31 +
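The F4 denominator change can be seen in a small linear fade-in sketch. It assumes a mono `f32` buffer and a hypothetical function name; only the `(fade_frames - 1).max(1)` denominator comes from the report.

```rust
/// Linear fade-in over the first `fade_frames` samples of a mono buffer.
/// Hypothetical sketch; only the denominator mirrors the described fix.
pub fn fade_in_linear(samples: &mut [f32], fade_frames: usize) {
    let n = fade_frames.min(samples.len());
    if n == 0 {
        return; // also avoids `0 - 1` underflow on usize
    }
    // Dividing by n - 1 lets the last faded frame reach exactly 1.0.
    let denom = (n - 1).max(1) as f32;
    for (i, s) in samples.iter_mut().take(n).enumerate() {
        *s *= i as f32 / denom;
    }
}
```

With a denominator of `n`, the last frame of a five-frame fade would be scaled by 0.8 and the next frame by 1.0, which is the step that produced the click.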
32 + ### F5. hound fallback div-by-zero (was MINOR)
33 + `browser/preview.rs:203` — `estimate_duration` hound fallback didn't guard against
34 + `spec.sample_rate == 0`. **Fixed**: return None when sample_rate is 0.
35 +
36 + ### F6. NaN sample propagation (was NOTE)
37 + `browser/preview.rs` `interleaved_to_stereo` — corrupted compressed audio can
38 + decode to NaN/Inf. `NaN.clamp()` doesn't help. **Fixed**: sanitize non-finite
39 + samples to 0.0 during interleaving.
40 +
41 + ### F7. Trial parse failure granted 30 days (was MINOR)
42 + `app/license.rs:218` — unparseable `first_launch_date` returned 30 (full trial).
43 + **Fixed**: returns 0.
44 +
45 + ### F8. Trial expiry never enforced (was SERIOUS)
46 + `app/main.rs:485-487` — checked `trial_state.is_some()` instead of whether trial
47 + was still active. Expired trials got full access. **Fixed**: uses
48 + `trial_days_remaining(t) > 0`.
49 +
50 + ### F9. VFS rename allows duplicate root names (was SERIOUS)
51 + `core/vfs.rs:289` — `rename_node` didn't check for sibling name conflicts. The
52 + `UNIQUE(vfs_id, parent_id, name)` constraint treats each NULL `parent_id` as
53 + distinct, so two root nodes could share a name. **Fixed**: added
54 + `check_sibling_name_conflict` that handles both root (IS NULL) and non-root cases,
55 + excluding the node's own ID.
56 +
57 + ### F10. VFS move allows duplicate names and cross-VFS moves (was SERIOUS)
58 + `core/vfs.rs:306` — `move_node` checked for cycles but not name conflicts at the
59 + target parent, and allowed moves across VFS boundaries (leaving `vfs_id` stale).
60 + **Fixed**: rejects cross-VFS moves, checks sibling name conflicts at destination.
61 +
62 + ---
63 +
64 + ## Open (conceptual — needs design input)
65 +
66 + ### ~~O4. Sync push marks skipped entries as pushed (was SERIOUS)~~ FIXED
67 + `sync/upload.rs:206-249` — `filter_map` dropped changelog entries with unknown ops,
68 + but then ALL entries up to `max_id` got `pushed = 1`. Dropped entries were
69 + permanently lost. **Fixed**: now tracks which IDs were actually included in the
70 + push and only marks those as pushed. Skipped entries stay `pushed = 0` and will be
71 + retried next sync cycle.
72 +
73 + ### O5. AIFF header u32 overflow (SERIOUS)
74 + `export/encode_aiff.rs:28` — `num_frames * channels * bytes_per_sample` uses u32
75 + arithmetic. Overflows once the sample data exceeds 4 GiB (about 4.5 hours of stereo 24-bit at 44.1kHz), producing a
76 + corrupt header. The AIFF spec limits chunk sizes to u32, so the fix is to detect
77 + and return an error rather than silently wrapping.
78 +
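The "detect and return an error" approach proposed for O5 could look like the following sketch. The function name and parameter types are assumptions; the point is only that `checked_mul` turns a silent wrap into a value the caller must handle.

```rust
/// Size in bytes of the AIFF sound data, or `None` if it does not fit in
/// the u32 chunk-size field. Hypothetical helper, not the project's code.
pub fn aiff_sound_data_len(
    num_frames: u32,
    channels: u16,
    bytes_per_sample: u16,
) -> Option<u32> {
    num_frames
        .checked_mul(u32::from(channels))?
        .checked_mul(u32::from(bytes_per_sample))
}
```

A caller would map `None` to an export error (for example "file too large for AIFF") before writing any header bytes.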
79 + ### O6. Preview decode thread race (MINOR)
80 + `browser/preview.rs:280-300` — Boolean cancel flag for decode threads. Rapid
81 + preview switching can clear the flag before the old thread checks it, causing two
82 + threads to append to the same buffer (garbled audio, not UB). Fix: generation
83 + counter (AtomicU64) instead of boolean.
84 +
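The generation-counter fix suggested for O6 can be sketched as below. `PreviewGeneration` and its methods are hypothetical names; the sketch shows only the token logic, not the decode threads or the shared buffer.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Shared counter bumped once per preview request (hypothetical type).
#[derive(Clone, Default)]
pub struct PreviewGeneration(Arc<AtomicU64>);

impl PreviewGeneration {
    /// Called when a new preview starts; returns the token that the new
    /// decode thread keeps for its whole lifetime.
    pub fn begin(&self) -> u64 {
        self.0.fetch_add(1, Ordering::SeqCst) + 1
    }

    /// Decode threads call this before appending; a stale token means a
    /// newer preview has started and the thread should stop.
    pub fn is_current(&self, token: u64) -> bool {
        self.0.load(Ordering::SeqCst) == token
    }
}
```

Unlike a boolean that is set and then cleared, the counter only moves forward, so an old thread can never see its token become valid again after a rapid switch.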
85 + ### O7. No memory bound on preview decode (MINOR)
86 + `browser/preview.rs` `decode_to_f32` — loads entire file into memory. A long file
87 + with incorrect/missing duration metadata bypasses the streaming threshold and
88 + causes OOM. Needs a byte-size or sample-count cap.
89 +
90 + ### O8. Partial migration failure leaves unrecoverable schema (MINOR)
91 + `core/db.rs:669` — If a multi-ALTER-TABLE migration fails partway, the columns
92 + already added aren't rolled back. Next startup retries and crashes on "duplicate
93 + column name". Fix: SQLite has no `ADD COLUMN IF NOT EXISTS`; run each migration in a transaction, or check `PRAGMA table_info` before each `ADD COLUMN`.
94 +
95 + ### O9. VFS mirror dot-stripping causes silent collisions (MINOR)
96 + `core/vfs_mirror.rs:186` — `.hidden` sanitizes to `hidden`. If both exist as
97 + siblings, one is silently skipped in the mirror with no warning.
98 +
99 + ### O10. Mirror walkdir follows symlinks (MINOR)
100 + `core/vfs_mirror.rs:169` — `path.is_dir()` follows symlinks. A manually-created
101 + directory symlink inside the mirror could cause infinite traversal or escape the
102 + mirror root. Fix: use `symlink_metadata()`.
103 +
104 + ### O11. Orphan cleanup not transactional (MINOR)
105 + `core/store.rs:215` — `remove_orphaned_samples` queries then deletes in a loop
106 + without a transaction. Between query and delete, a new VFS link could reference
107 + the "orphan". Mitigated by `Mutex<Database>` in `DirectBackend` but not in the
108 + core API contract.
109 +
110 + ### O12. ~~Export path traversal via `..` in VFS names~~ (NOT A BUG)
111 + `validate_node_name` already rejects `.` and `..` as node names.
112 +
113 + ### O13. Flat export filename collisions without naming rules (MINOR)
114 + `export/resolve.rs:87-114` — Dedup suffix (`_2`, `_3`) only runs when
115 + `naming_rules` is `Some`. Without it, same-named files in different VFS dirs
116 + silently overwrite each other in flat export.
117 +
118 + ### O14. Trial bypass by deleting trial.json (MINOR)
119 + `app/license.rs:201-213` — No server-side trial anchor. Deleting `trial.json`
120 + resets the 30-day clock. Machine ID exists but isn't used for trial tracking.
121 + Fix: register trial start on server keyed to machine_id.
122 +
123 + ### O15. Deterministic dither seed (MINOR)
124 + `export/encode.rs:30` — Constant seed `0xDEAD_BEEF` means every 16-bit export has
125 + the same dither pattern. Not corrupt, but defeats dither's purpose. May be
126 + intentional for reproducibility.
127 +
128 + ### O16. LUFS normalize can clip (MINOR)
129 + `edit/normalize.rs:56` — LUFS-based gain can push peaks above 1.0 (LUFS measures
130 + loudness, not peaks). No limiter or clamp applied after scaling.
131 +
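One possible mitigation for O16 is to cap the linear gain so the peak never exceeds 1.0. This is a sketch of that single option, not a decision recorded in the report; the function name and the choice to undershoot the loudness target (rather than add a limiter) are assumptions.

```rust
/// Returns the gain to apply: the requested linear gain, reduced if needed
/// so that the loudest sample lands at or below 1.0. Hypothetical helper.
pub fn clip_safe_gain(requested_gain: f32, samples: &[f32]) -> f32 {
    let peak = samples.iter().fold(0.0f32, |m, s| m.max(s.abs()));
    if peak == 0.0 {
        return requested_gain; // silence cannot clip
    }
    requested_gain.min(1.0 / peak)
}
```

The trade-off is that a capped gain leaves the result quieter than the LUFS target; a true-peak limiter would hit the target at the cost of altering the waveform.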
132 + ### O17. VP-tree stack overflow on identical items (MINOR)
133 + `core/vp_tree.rs:162` — All-identical feature vectors cause O(n)-deep recursion.
134 + 10k+ identical items overflow the stack. Fix: iterative build or depth cap.
135 +
136 + ---
137 +
138 + ## Notes (low risk, awareness only)
139 +
140 + - **N1.** `app/main.rs:148` — SyncKit API key stored as plaintext file. No OS
141 + keychain integration.
142 + - **N2.** `app/updater.rs:122` — No signature verification on OTA update metadata.
143 + HTTPS provides transport security but not integrity if server is compromised.
144 + - **N3.** `app/main.rs:1035` — Update download URL validated only as `https://`,
145 + not domain-whitelisted.
146 + - **N4.** `sync/download.rs:66-100` — Failed blob download leaves `cloud_only=1`
147 + until next sync cycle (self-healing but confusing intermediate state).
148 + - **N5.** `core/vault.rs:153` — Path normalization fallback can allow duplicate
149 + vault entries when paths involve `..`.
150 + - **N6.** `browser/preview.rs:114` — Missing sample rate defaults to 44100 silently.
151 + - **N7.** `analysis/worker.rs:152` — Non-monotonic progress reporting due to
152 + relaxed atomic ordering under rayon parallelism (cosmetic only).