# Balanced Breakfast -- Architecture

When it comes to media and news, it's good to be a picky eater.

## Positioning

The only native desktop feed reader with a user-scriptable plugin system for arbitrary sources. Rhai plugins let users write custom source adapters (no other reader is extensible this way). First-class Hacker News and arXiv support. Free with no limits, no account required, local-first. Cross-platform native (macOS, Windows, Linux via Tauri 2). Target users: power users/developers consuming content from many sources, privacy-conscious local-first users, technical professionals who value keyboard shortcuts and hackability.

## System Overview

Balanced Breakfast is a desktop feed aggregator built with Tauri 2. It unifies RSS, Atom, JSON Feed, Hacker News, arXiv, and other sources into a single timeline. The backend is a Rust workspace with four library crates and one application crate. The frontend is vanilla HTML/CSS/JS served by Tauri's webview. Feeds are fetched by Rhai script plugins ("bussers"), stored in SQLite, and presented through a three-panel layout (sources, items, detail).

## Workspace Layout

| Crate | Path | Role |
|-------|------|------|
| bb-interface | `crates/bb-interface/` | Leaf crate. Shared types for the plugin contract: `FeedItem`, `FetchResult`, `ConfigSchema`, `ConfigField`, `BusserCapabilities`, `BusserConfig`. No internal dependencies. |
| bb-core | `crates/bb-core/` | Orchestrator and plugin runtime. Coordinates plugins, database, and feed scheduling. Contains the Rhai engine setup, plugin manager, config encryption (`crypto`), and URL tracker stripping (`url_cleaner`). Depends on bb-interface. |
| bb-feed | `crates/bb-feed/` | Feed aggregation and ordering. `FeedGenerator` reads items from the DB and applies filters (source, unread, starred, search, tags, query feed conditions). `OrderBy` sorts results (chronological, score, unread-first, starred-first). Depends on bb-interface. |
| bb-db | `crates/bb-db/` | SQLite persistence via sqlx. Repository types for feeds, items, tags, busser state, user config, and query feeds. FTS5 full-text search. Depends on bb-interface. |
| src-tauri | `src-tauri/` | Tauri 2 desktop shell. Thin command wrappers over the library crates, app state management, background tasks (auto-fetch, stale cleanup), sync scheduler, and the vanilla JS frontend. Depends on all four library crates. |

Dependency flow: `bb-interface` (leaf) --> `bb-core`, `bb-feed`, `bb-db` --> `src-tauri` (root).

## Orchestrator

The `Orchestrator` (`bb-core::orchestrator`) is the central coordination point. It owns the `Database` and holds the `PluginManager` behind an `Arc<RwLock<PluginManager>>`. Its responsibilities:

- **Plugin lifecycle** -- load `.rhai` scripts from the plugins directory, initialize them with config from the DB, and provide fetch/shutdown operations.
- **Fetch execution** -- call a plugin's `fetch()`, strip tracking parameters from item URLs and HTML bodies, upsert results into the DB via the items repository, and record success/failure on the feed.
- **Circuit breaker** -- after 10 consecutive fetch failures (`CIRCUIT_BREAKER_THRESHOLD`), the feed is marked `circuit_broken` and excluded from auto-fetch until manually reset.
- **Secret management** -- holds an optional AES-256-GCM key. On startup, encrypts any plaintext Secret fields in existing feed configs (migration from legacy plaintext).
- **Fetch-all** -- iterates all loaded plugins and fetches each, collecting total item counts.

The orchestrator does not own the fetch scheduler or background tasks. Those are managed by `AppState` in the Tauri layer.

## Plugin System (Rhai)

Plugins are `.rhai` text files dropped into the plugins directory. The Rhai engine is configured with safety limits and host functions.

### Plugin Contract

Every plugin must define four functions:

- `id()` -- returns a unique string identifier (e.g. `"rss"`, `"hackernews"`)
- `name()` -- returns a human-readable display name
- `config_schema()` -- returns a map describing configuration fields (key, label, field_type, required, default, options)
- `fetch(config, cursor)` -- returns `{ items: [...], has_more: bool, next_cursor: string? }`

An optional `capabilities()` function can declare pagination support, custom fetch intervals, auth requirements, etc.
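
To make the `fetch(config, cursor)` return shape concrete, the following Rust sketch shows how a host could drain a paginated plugin by feeding each `next_cursor` back in. It is an illustration under assumptions: `FetchPage`, `drain`, and the `max_pages` guard are hypothetical names, not the actual `bb-interface` types, and items are reduced to plain strings.

```rust
/// One page of results, mirroring the map a plugin's `fetch()` returns.
/// Hypothetical stand-in for the real `FetchResult` in `bb-interface`.
#[derive(Debug, Clone, PartialEq)]
pub struct FetchPage {
    pub items: Vec<String>, // item titles only, to keep the sketch small
    pub has_more: bool,
    pub next_cursor: Option<String>,
}

/// Calls `fetch` repeatedly, passing the previous page's cursor, until the
/// plugin reports `has_more == false`, omits a cursor, or `max_pages`
/// (an assumed safety guard) is reached.
pub fn drain<F>(mut fetch: F, max_pages: usize) -> Vec<String>
where
    F: FnMut(Option<&str>) -> FetchPage,
{
    let mut all = Vec::new();
    let mut cursor: Option<String> = None;
    for _ in 0..max_pages {
        let page = fetch(cursor.as_deref());
        all.extend(page.items);
        if !page.has_more || page.next_cursor.is_none() {
            break;
        }
        cursor = page.next_cursor;
    }
    all
}
```

The first call passes `None` as the cursor, which corresponds to a plugin being asked for its first page.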

### Sandboxing

- **Operations cap:** `max_operations(100_000)` -- a typical RSS fetch costs 1k-5k ops; this catches infinite loops.
- **Expression depth:** `max_expr_depths(128, 128)` -- prevents stack overflows from deeply nested or recursive scripts.
- **HTTP timeout:** 15 seconds per request.
- **Response size:** 2 MB cap per response body.
- **Request limit:** 100 HTTP requests per `fetch()` invocation. Counter resets at the start of each fetch.
- **URL validation:** only `http://` and `https://` schemes are allowed; localhost, `127.0.0.1`, `[::1]`, `0.0.0.0`, the private RFC 1918 ranges (`10.x`, `172.16-31.x`, `192.168.x`), and the link-local range `169.254.x` are blocked.
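
The URL validation rule can be sketched in dependency-free Rust as follows. This is not the project's code: `is_url_allowed` is a hypothetical name, the parsing is hand-rolled for brevity (lowercase schemes only), and the sketch blocks all of `127.0.0.0/8` rather than just `127.0.0.1`. It also checks only literal hosts, so it does not cover hostnames that resolve to internal addresses.

```rust
use std::net::{Ipv4Addr, Ipv6Addr};

/// Returns true if a plugin may request `url` under the rules listed above.
pub fn is_url_allowed(url: &str) -> bool {
    // Only http:// and https:// schemes are accepted.
    let rest = match url
        .strip_prefix("http://")
        .or_else(|| url.strip_prefix("https://"))
    {
        Some(rest) => rest,
        None => return false,
    };

    // The authority ends at the first '/', '?' or '#'.
    let authority = rest
        .split(|c: char| matches!(c, '/' | '?' | '#'))
        .next()
        .unwrap_or("");
    // Drop any userinfo ("user:pass@").
    let host_port = authority.rsplit('@').next().unwrap_or("");

    // Bracketed IPv6 literal: block loopback, reject anything unparseable.
    if let Some(v6) = host_port.strip_prefix('[') {
        let addr = v6.split(']').next().unwrap_or("");
        return addr
            .parse::<Ipv6Addr>()
            .map(|ip| !ip.is_loopback())
            .unwrap_or(false);
    }

    let host = host_port.split(':').next().unwrap_or("").to_ascii_lowercase();
    if host.is_empty() || host == "localhost" {
        return false;
    }

    // Dotted-quad hosts are checked against the blocked ranges.
    if let Ok(ip) = host.parse::<Ipv4Addr>() {
        let blocked = ip.is_loopback() // 127.0.0.0/8
            || ip.is_unspecified()     // 0.0.0.0
            || ip.is_private()         // 10/8, 172.16/12, 192.168/16
            || ip.is_link_local();     // 169.254/16
        return !blocked;
    }
    true
}
```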

### Host Functions

Functions registered into the Rhai engine for scripts to call:

| Function | Description |
|----------|-------------|
| `http_get(url)` | Fetch URL, return response body as string |
| `http_get_json(url)` | Fetch URL, parse JSON, return as Dynamic |
| `parse_json(str)` | Parse a JSON string |
| `parse_xml(str)` | Parse XML into a simplified `{tag, text, attrs, children}` structure |
| `parse_feed(str)` | Auto-detect RSS/Atom/JSON Feed, return `{title, link, entries}` |
| `parse_datetime(str)` | Parse ISO 8601 or RFC 2822 date to Unix timestamp |
| `timestamp_now()` | Current UTC timestamp (seconds) |
| `html_to_text(html)` | Strip HTML, render as plain text (80-char width) |
| `extract_article(html)` | Readability extraction: returns `{title, content, text}` |
| `truncate(text, max)` | Truncate with ellipsis |
| `str_contains`, `str_split`, `str_replace`, `str_trim` | String utilities |
| `strip_tracking(url)` | Remove utm_*, fbclid, gclid, etc. from a URL |
| `parse_int(str)` | Parse string to integer (returns the unit value `()` on failure) |
| `debug_print(val)` | Log to tracing at debug level |
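
As an illustration of what a host function such as `strip_tracking(url)` has to do, here is a minimal Rust sketch. It is an assumption-laden stand-in rather than the real `url_cleaner`: the blocklist contains only the parameters named in this document, and the query string is split by hand instead of using a URL-parsing crate.

```rust
/// Removes common tracking parameters from a URL's query string.
/// Sketch only: the real blocklist is likely longer than the four
/// patterns used here (utm_*, fbclid, gclid, msclkid).
pub fn strip_tracking(url: &str) -> String {
    // Set the fragment aside so it survives untouched.
    let (without_fragment, fragment) = match url.split_once('#') {
        Some((head, frag)) => (head, Some(frag)),
        None => (url, None),
    };
    let (base, query) = match without_fragment.split_once('?') {
        Some((base, query)) => (base, query),
        None => return url.to_string(), // no query string, nothing to strip
    };

    // Keep every key=value pair whose key is not a known tracker.
    let kept: Vec<&str> = query
        .split('&')
        .filter(|pair| !pair.is_empty())
        .filter(|pair| {
            let key = pair.split('=').next().unwrap_or("").to_ascii_lowercase();
            !(key.starts_with("utm_")
                || matches!(key.as_str(), "fbclid" | "gclid" | "msclkid"))
        })
        .collect();

    let mut cleaned = base.to_string();
    if !kept.is_empty() {
        cleaned.push('?');
        cleaned.push_str(&kept.join("&"));
    }
    if let Some(frag) = fragment {
        cleaned.push('#');
        cleaned.push_str(frag);
    }
    cleaned
}
```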

### Config Field Types

Plugins declare their configuration schema with these field types: `Text`, `TextArea`, `Secret`, `Url`, `Number`, `Toggle`, `Select`. Fields marked `Url` become feed subscriptions; other fields become key-value options passed to `fetch()`.
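
The split between `Url` fields and all other fields might look roughly like the sketch below. The variant names come from the list above, but the `ConfigField` layout and the `split_config` helper are hypothetical and far simpler than the real `bb-interface` types (no label, default, or options).

```rust
use std::collections::BTreeMap;

/// Field types a plugin can declare in its config schema.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FieldType {
    Text,
    TextArea,
    Secret,
    Url,
    Number,
    Toggle,
    Select,
}

/// Hypothetical, pared-down schema entry.
#[derive(Debug, Clone)]
pub struct ConfigField {
    pub key: String,
    pub field_type: FieldType,
}

/// Splits submitted values into feed subscription URLs (from `Url` fields)
/// and key-value options (from every other field type).
pub fn split_config(
    schema: &[ConfigField],
    values: &BTreeMap<String, String>,
) -> (Vec<String>, BTreeMap<String, String>) {
    let mut subscriptions = Vec::new();
    let mut options = BTreeMap::new();
    for field in schema {
        if let Some(value) = values.get(&field.key) {
            if field.field_type == FieldType::Url {
                subscriptions.push(value.clone());
            } else {
                options.insert(field.key.clone(), value.clone());
            }
        }
    }
    (subscriptions, options)
}
```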

### Bundled Plugins

Three plugins ship with the app: `rss.rhai` (RSS/Atom/JSON Feed), `hackernews.rhai` (HN stories), `arxiv.rhai` (arXiv papers). A `reader.rhai` plugin extracts article content from URLs using the readability algorithm.

## Feed Aggregation

The `FeedGenerator` (`bb-feed::generator`) reads items from the database, applies filters and ordering, and returns paginated results.

**Filtering** combines SQL-level and in-memory strategies:
- Source, unread, starred, and FTS5 search are pushed into SQL for accurate LIMIT/OFFSET pagination.
- Item-level tags, feed-level tags, and query feed conditions (title/author/body contains, equals, not_contains, matches_regex) run in-memory after the SQL query.

**Ordering** is applied in-memory after filtering:
- `Chronological` -- newest first (default)
- `Score` -- highest score first, with chronological tiebreak
- `UnreadFirst` -- unread items before read, chronological within each group
- `StarredFirst` -- starred items before unstarred, chronological within each group

**Pagination** fetches `page_size + 1` items to detect whether more pages exist, then truncates to the exact page size.
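
The ordering modes and the `page_size + 1` trick can be sketched in a few lines of Rust. The `OrderBy` variant names are taken from the list above; the `Item` struct, its fields, and the helper functions are hypothetical, and "chronological within each group" is read here as newest first.

```rust
use std::cmp::Ordering;

/// Hypothetical, pared-down item with only the fields ordering needs.
#[derive(Debug, Clone, PartialEq)]
pub struct Item {
    pub id: u32,
    pub published: i64, // Unix timestamp
    pub score: i64,
    pub is_read: bool,
    pub is_starred: bool,
}

#[derive(Debug, Clone, Copy)]
pub enum OrderBy {
    Chronological,
    Score,
    UnreadFirst,
    StarredFirst,
}

fn compare(a: &Item, b: &Item, order: OrderBy) -> Ordering {
    let newest_first = b.published.cmp(&a.published);
    match order {
        OrderBy::Chronological => newest_first,
        OrderBy::Score => b.score.cmp(&a.score).then(newest_first),
        // `false < true`, so unread items (is_read == false) sort first.
        OrderBy::UnreadFirst => a.is_read.cmp(&b.is_read).then(newest_first),
        // Reversed comparison, so starred items (true) sort first.
        OrderBy::StarredFirst => b.is_starred.cmp(&a.is_starred).then(newest_first),
    }
}

/// Sorts items in place according to the selected mode.
pub fn sort_items(items: &mut [Item], order: OrderBy) {
    items.sort_by(|a, b| compare(a, b, order));
}

/// `fetched` holds up to `page_size + 1` rows. The extra row is never
/// shown; its presence only signals that another page exists.
pub fn paginate(mut fetched: Vec<Item>, page_size: usize) -> (Vec<Item>, bool) {
    let has_more = fetched.len() > page_size;
    fetched.truncate(page_size);
    (fetched, has_more)
}
```

Fetching one extra row avoids a separate `COUNT(*)` query just to decide whether to show a "next page" control.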

## Database Layer

SQLite via sqlx with compile-time migrations (10 migrations). The `Database` struct holds a connection pool (`max_connections: 16`) and provides typed repository accessors.

### Tables

| Table | Purpose |
|-------|---------|
| `feeds` | Registered feed subscriptions. Keyed by UUID, linked to a busser_id. Tracks config JSON, enabled state, last_fetch, health counters, and circuit breaker state. |
| `feed_items` | All fetched items. Deduplicated by `external_id` (UNIQUE). Stores bite display fields, full content, metadata (score, tags as JSON array), and user state (is_read, is_starred). |
| `feed_items_fts` | FTS5 virtual table in external-content mode. Indexes title, body, and bite_text. Kept in sync via INSERT/UPDATE/DELETE triggers. |
| `feed_tags` | User-assigned tags on feeds (many-to-many). |
| `busser_state` | Plugin key-value state (cursors, tokens, pagination markers). Keyed by `(busser_id, key)`. |
| `user_config` | Key-value preferences (theme, welcome flag). Synced via changelog triggers. |
| `query_feeds` | Saved filter rules that act as virtual sources. Rules stored as JSON array. Synced via changelog triggers. |
| `sync_state` | Single-row sync metadata (device_id, pull_cursor, auto_sync settings). |
| `sync_changelog` | Local changes pending push. Written by triggers on feeds, feed_tags, user_config, query_feeds, and feed_items (user state only). |

### Repositories

- `FeedsRepository` -- CRUD, enable/disable, last_fetch updates, fetch failure recording, circuit breaker management
- `ItemsRepository` -- upsert (dedup by external_id), read/star toggling, paginated listing (by busser, by feed, unread, starred), FTS5 search, counts, stale item deletion
- `TagsRepository` -- per-feed tag assignment, distinct tag listing, bulk feed-tag pairs
- `StateRepository` -- busser key-value state (get/set/delete by busser_id + key)
- `ConfigRepository` -- user_config key-value pairs (get/set/delete)
- `QueryFeedsRepository` -- query feed CRUD (create/update/delete/list)

FTS5 queries are sanitized by wrapping each search term in double quotes to prevent syntax injection (`AND`, `OR`, `NOT`, `NEAR` operators). The `^` prefix and `*` suffix characters are stripped.
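
A minimal Rust sketch of this sanitization step is shown below. `sanitize_fts_query` is a hypothetical name, and dropping embedded double quotes is an assumption of the sketch; the text above only specifies per-term quoting and the removal of `^` and `*`.

```rust
/// Turns raw user input into an FTS5 MATCH expression made only of
/// double-quoted terms, so operators such as AND/OR/NOT/NEAR are treated
/// as plain words.
pub fn sanitize_fts_query(input: &str) -> String {
    input
        .split_whitespace()
        .map(|term| {
            // Strip the `^` and `*` operators. Embedded quotes are also
            // dropped here (assumption) so a term cannot close its own
            // quoting.
            term.chars()
                .filter(|c| !matches!(*c, '^' | '*' | '"'))
                .collect::<String>()
        })
        .filter(|term| !term.is_empty())
        .map(|term| format!("\"{}\"", term))
        .collect::<Vec<_>>()
        .join(" ")
}
```

Adjacent quoted terms are implicitly ANDed by FTS5, so a multi-word search still requires every word to match.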

## Sync Integration

Balanced Breakfast integrates with the SyncKit client SDK for cross-device sync. The `sync_service` module handles push/pull of local changes.

**What gets synced:**
- Feed subscriptions (feeds table: config, enabled state, health counters)
- Feed tags
- User config (preferences)
- Query feeds (saved filter rules)
- Feed item user state (is_read, is_starred changes only -- not item content)

**How it works:** SQLite triggers on synced tables write changes to `sync_changelog`. The sync engine pushes unpushed entries in batches of 500, pulls remote changes using a cursor, and applies them in FK-safe order (parents before children for upserts, children before parents for deletes). An `applying_remote` flag in `sync_state` suppresses trigger firing during remote change application to prevent echo loops.

The sync scheduler runs on a configurable interval (default 15 minutes). Encryption is E2E via the SyncKit client's ChaCha20-Poly1305 with keys stored in the OS keychain.
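
The FK-safe apply order described under "How it works" can be illustrated with a small sort. Everything here is a sketch under assumptions: the `Change` and `Op` types are hypothetical, `feeds` is assumed to be the parent of `feed_tags` and `feed_items`, and applying all upserts before all deletes is a choice made by the sketch rather than something the source states.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Op {
    Upsert,
    Delete,
}

/// Hypothetical remote change: just the target table and the operation.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Change {
    pub table: &'static str,
    pub op: Op,
}

/// Depth of a table in the foreign-key graph; parents get lower numbers.
/// Assumed relationships, not read from the real schema.
fn fk_depth(table: &str) -> u8 {
    match table {
        "feeds" => 0,
        "feed_tags" | "feed_items" => 1,
        _ => 0, // user_config, query_feeds: assumed to have no FK parents
    }
}

/// Orders changes so that no row references a missing parent:
/// upserts run parents-first, deletes run children-first.
pub fn fk_safe_order(mut changes: Vec<Change>) -> Vec<Change> {
    changes.sort_by_key(|c| match c.op {
        Op::Upsert => (0u8, fk_depth(c.table)),
        Op::Delete => (1u8, u8::MAX - fk_depth(c.table)),
    });
    changes
}
```

Because `sort_by_key` is a stable sort, changes to the same table with the same operation keep their original relative order.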

## Security Model

- **Plugin secrets at rest:** AES-256-GCM encryption. Encrypted format: `bb_enc:v1:<base64(nonce[12] || ciphertext || tag[16])>`. Key stored in `encryption.key` with 0600 permissions (Unix). Backward-compatible: unencrypted values pass through on decrypt.
- **FTS5 query sanitization:** User search input is quoted per-word to prevent FTS5 operator injection. Special characters (`^`, `*`) are stripped.
- **URL validation:** Rhai HTTP host functions block non-HTTP schemes and requests to localhost/internal addresses.
- **Response size limits:** 2 MB cap on HTTP response bodies prevents memory exhaustion.
- **URL tracking removal:** utm_*, fbclid, gclid, msclkid, and other tracking parameters stripped from item URLs and body HTML on ingest.
- **Sync encryption:** E2E via SyncKit (ChaCha20-Poly1305 + Argon2 key derivation). Server never sees plaintext.
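
The framing of the encrypted secret format can be sketched without any crypto dependency. The snippet below only recognizes the `bb_enc:v1:` prefix and splits already-decoded bytes into nonce, ciphertext, and tag; base64 decoding and the AES-256-GCM call itself are left out, and all names (`encrypted_payload`, `Sealed`, `split_sealed`) are hypothetical.

```rust
/// Prefix that marks an encrypted config value.
pub const PREFIX: &str = "bb_enc:v1:";
const NONCE_LEN: usize = 12;
const TAG_LEN: usize = 16;

/// Returns the base64 payload if `value` is in the encrypted format.
/// `None` means the value is legacy plaintext and passes through as-is.
pub fn encrypted_payload(value: &str) -> Option<&str> {
    value.strip_prefix(PREFIX)
}

/// The three parts of a decoded payload: nonce || ciphertext || tag.
pub struct Sealed<'a> {
    pub nonce: &'a [u8],
    pub ciphertext: &'a [u8],
    pub tag: &'a [u8],
}

/// Splits base64-decoded bytes into their parts, or returns `None` if the
/// input is too short to contain both a nonce and a tag.
pub fn split_sealed(decoded: &[u8]) -> Option<Sealed<'_>> {
    if decoded.len() < NONCE_LEN + TAG_LEN {
        return None;
    }
    let (nonce, rest) = decoded.split_at(NONCE_LEN);
    let (ciphertext, tag) = rest.split_at(rest.len() - TAG_LEN);
    Some(Sealed { nonce, ciphertext, tag })
}
```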

## Concurrency Model

- **Tokio async runtime** (multi-threaded) drives all I/O: database queries, HTTP fetches, sync operations.
- **`Arc<RwLock<PluginManager>>`** -- the orchestrator holds the plugin manager behind a Tokio RwLock. Read lock for fetches and schema queries; write lock only during plugin loading.
- **`Arc<AppState>`** -- shared across Tauri commands and background tasks. Managed by Tauri's state system.
- **AbortHandles** -- background tasks (auto-fetch loop, stale cleanup) store their `AbortHandle` in `AppState` behind `std::sync::Mutex`. On shutdown or task replacement, existing handles are aborted.
- **Auto-fetch loop** -- checks every 60 seconds which plugins are due for a fetch based on their last_fetch timestamp and configured interval.
- **Stale cleanup** -- runs every 6 hours, deleting read (non-starred) items older than 30 days.
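
The decisions made by the two background tasks reduce to simple predicates, sketched here in Rust. The struct, field, and function names are hypothetical. Two behaviors are assumptions of the sketch: a feed that has never been fetched counts as due immediately, and item age is measured from a single timestamp passed in by the caller.

```rust
/// Hypothetical view of the scheduling fields on a feed.
pub struct FeedSchedule {
    pub last_fetch: Option<i64>, // Unix seconds; None = never fetched
    pub interval_secs: i64,
    pub circuit_broken: bool,
}

/// Auto-fetch check: is this feed due at time `now` (Unix seconds)?
pub fn is_due(feed: &FeedSchedule, now: i64) -> bool {
    if feed.circuit_broken {
        return false; // excluded from auto-fetch until manually reset
    }
    match feed.last_fetch {
        None => true,
        Some(last) => now - last >= feed.interval_secs,
    }
}

/// Stale cleanup threshold: 30 days, in seconds.
pub const STALE_AFTER_SECS: i64 = 30 * 24 * 60 * 60;

/// Stale cleanup check: only read, non-starred items older than the
/// threshold are eligible for deletion.
pub fn is_stale(item_ts: i64, is_read: bool, is_starred: bool, now: i64) -> bool {
    is_read && !is_starred && now - item_ts > STALE_AFTER_SECS
}
```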

## Frontend Architecture

The frontend is vanilla HTML/CSS/JS served by Tauri's webview. There is no build step or bundler.

- **Tauri commands** act as thin wrappers: each command extracts parameters, calls the orchestrator or feed generator, and returns a serialized response. All business logic lives in the library crates.
- **Tauri events** notify the frontend of background activity: `auto-fetch-complete` (new items available), `auto-fetch-error`, `feed-circuit-broken`.
- **JS files** live in `src-tauri/frontend/js/`. Communication with Rust is via `window.__TAURI__`.

## Feed Health Tracking

Each feed tracks `consecutive_failures` and `last_error`. On fetch success, failures reset to 0. On failure, the counter increments. Health status:

- **Green (healthy):** 0 consecutive failures
- **Yellow (degraded):** 1-9 consecutive failures
- **Red (circuit broken):** 10 or more consecutive failures trip the circuit breaker; the feed is excluded from auto-fetch until manually reset via `reset_circuit_breaker`
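
The health states above form a small state machine, sketched here in Rust. `CIRCUIT_BREAKER_THRESHOLD` and `reset_circuit_breaker` are names from this document; the `FeedHealth` struct, the `Health` enum, and the choice to clear the failure counter on manual reset are assumptions of the sketch.

```rust
pub const CIRCUIT_BREAKER_THRESHOLD: u32 = 10;

#[derive(Debug, PartialEq, Eq)]
pub enum Health {
    Green,
    Yellow,
    Red,
}

/// Hypothetical in-memory view of a feed's health columns.
#[derive(Debug, Default)]
pub struct FeedHealth {
    pub consecutive_failures: u32,
    pub last_error: Option<String>,
    pub circuit_broken: bool,
}

impl FeedHealth {
    /// A successful fetch resets the failure counter.
    pub fn record_success(&mut self) {
        self.consecutive_failures = 0;
    }

    /// A failed fetch increments the counter and may trip the breaker.
    pub fn record_failure(&mut self, error: &str) {
        self.consecutive_failures += 1;
        self.last_error = Some(error.to_string());
        if self.consecutive_failures >= CIRCUIT_BREAKER_THRESHOLD {
            self.circuit_broken = true;
        }
    }

    /// Manual reset; also clears the counter (assumption of this sketch).
    pub fn reset_circuit_breaker(&mut self) {
        self.circuit_broken = false;
        self.consecutive_failures = 0;
    }

    /// Maps the counters to the Green/Yellow/Red status.
    pub fn status(&self) -> Health {
        if self.circuit_broken || self.consecutive_failures >= CIRCUIT_BREAKER_THRESHOLD {
            Health::Red
        } else if self.consecutive_failures > 0 {
            Health::Yellow
        } else {
            Health::Green
        }
    }
}
```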

## Key Design Decisions

- **Rhai over WASM/Lua:** Rhai is a Rust-native scripting language with easy type bridging and built-in safety limits. No FFI boundary, no separate runtime. Plugins are plain text files, not compiled artifacts.
- **Single orchestrator:** All coordination flows through one struct. No message passing between crates; the orchestrator calls methods directly. Simpler than an actor model for this scale.
- **SQL-first filtering with in-memory fallback:** Simple filters (source, unread, starred, search) use SQL for correct pagination. Complex filters (regex, tag intersection, query feed conditions) run in-memory. This avoids dynamic SQL generation while keeping common paths fast.
- **External-content FTS5:** The FTS index references `feed_items` by rowid with no data duplication. Triggers keep it in sync. This saves disk space compared to a full-copy FTS table.
- **Dedup by external_id:** Items use `external_id` (UNIQUE) for deduplication on upsert. The busser provides the ID; the DB enforces uniqueness.
- **Changelog-based sync:** SQLite triggers write changes to `sync_changelog` rather than diffing snapshots. This captures intent (INSERT/UPDATE/DELETE) and works naturally with the SyncKit push/pull model.

## Key Paths

| Component | Path |
|-----------|------|
| Workspace manifest | `Cargo.toml` |
| Plugin interface types | `crates/bb-interface/src/` |
| Orchestrator | `crates/bb-core/src/orchestrator.rs` |
| Plugin manager | `crates/bb-core/src/plugin_manager.rs` |
| Rhai runtime | `crates/bb-core/src/rhai_plugin/` |
| Host functions | `crates/bb-core/src/rhai_plugin/host_functions.rs` |
| Type conversions | `crates/bb-core/src/rhai_plugin/conversions.rs` |
| Secret encryption | `crates/bb-core/src/crypto.rs` |
| URL cleaner | `crates/bb-core/src/url_cleaner.rs` |
| Feed generator | `crates/bb-feed/src/generator.rs` |
| Ordering/filtering | `crates/bb-feed/src/ordering.rs` |
| Database layer | `crates/bb-db/src/` |
| Repositories | `crates/bb-db/src/repository.rs` |
| Migrations | `migrations/sqlite/` (001-010) |
| Tauri app state | `src-tauri/src/state.rs` |
| Tauri commands | `src-tauri/src/commands/` |
| Sync service | `src-tauri/src/sync_service.rs` |
| Bundled plugins | `plugins/` |
| Frontend JS | `src-tauri/frontend/js/` |
| Frontend CSS | `src-tauri/frontend/css/` |