# Balanced Breakfast -- Architecture

When it comes to media and news, it's good to be a picky eater.

## Positioning

The only native desktop feed reader with a user-scriptable plugin system for arbitrary sources. Rhai plugins let users write custom source adapters (no other reader is extensible this way). First-class Hacker News and arXiv support. Free with no limits, no account required, local-first. Cross-platform native (macOS, Windows, Linux via Tauri 2).

Target users: power users/developers consuming content from many sources, privacy-conscious local-first users, technical professionals who value keyboard shortcuts and hackability.

## System Overview

Balanced Breakfast is a desktop feed aggregator built with Tauri 2. It unifies RSS, Atom, JSON Feed, Hacker News, arXiv, and other sources into a single timeline.

The backend is a Rust workspace with four library crates and one application crate. The frontend is vanilla HTML/CSS/JS served by Tauri's webview. Feeds are fetched by Rhai script plugins ("bussers"), stored in SQLite, and presented through a three-panel layout (sources, items, detail).

## Workspace Layout

| Crate | Path | Purpose |
|-------|------|---------|
| bb-interface | `crates/bb-interface/` | Leaf crate. Shared types for the plugin contract: `FeedItem`, `FetchResult`, `ConfigSchema`, `ConfigField`, `BusserCapabilities`, `BusserConfig`. No internal dependencies. |
| bb-core | `crates/bb-core/` | Orchestrator and plugin runtime. Coordinates plugins, database, and feed scheduling. Contains the Rhai engine setup, plugin manager, config encryption (`crypto`), and URL tracker stripping (`url_cleaner`). Depends on bb-interface. |
| bb-feed | `crates/bb-feed/` | Feed aggregation and ordering. `FeedGenerator` reads items from the DB and applies filters (source, unread, starred, search, tags, query feed conditions). `OrderBy` sorts results (chronological, score, unread-first, starred-first). Depends on bb-interface. |
| bb-db | `crates/bb-db/` | SQLite persistence via sqlx. Repository types for feeds, items, tags, busser state, user config, and query feeds. FTS5 full-text search. Depends on bb-interface. |
| src-tauri | `src-tauri/` | Tauri 2 desktop shell. Thin command wrappers over the library crates, app state management, background tasks (auto-fetch, stale cleanup), sync scheduler, and the vanilla JS frontend. Depends on all four library crates. |

Dependency flow: `bb-interface` (leaf) --> `bb-core`, `bb-feed`, `bb-db` --> `src-tauri` (root).

## Orchestrator

The `Orchestrator` (`bb-core::orchestrator`) is the central coordination point. It owns the `Database` and a `PluginManager` behind an `Arc<RwLock<PluginManager>>`. Its responsibilities:

- **Plugin lifecycle** -- load `.rhai` scripts from the plugins directory, initialize them with config from the DB, and provide fetch/shutdown operations.
- **Fetch execution** -- call a plugin's `fetch()`, strip tracking parameters from item URLs and HTML bodies, upsert results into the DB via the items repository, and record success/failure on the feed.
- **Circuit breaker** -- after 10 consecutive fetch failures (`CIRCUIT_BREAKER_THRESHOLD`), the feed is marked `circuit_broken` and excluded from auto-fetch until manually reset.
- **Secret management** -- holds an optional AES-256-GCM key. On startup, encrypts any plaintext Secret fields in existing feed configs (migration from legacy plaintext).
- **Fetch-all** -- iterates all loaded plugins and fetches each, collecting total item counts.

The orchestrator does not own the fetch scheduler or background tasks. Those are managed by `AppState` in the Tauri layer.

## Plugin System (Rhai)

Plugins are `.rhai` text files dropped into the plugins directory. The Rhai engine is configured with safety limits and host functions.

### Plugin Contract

Every plugin must define four functions:

- `id()` -- returns a unique string identifier (e.g.
`"rss"`, `"hackernews"`)
- `name()` -- returns a human-readable display name
- `config_schema()` -- returns a map describing configuration fields (key, label, field_type, required, default, options)
- `fetch(config, cursor)` -- returns `{ items: [...], has_more: bool, next_cursor: string? }`

An optional `capabilities()` function can declare pagination support, custom fetch intervals, auth requirements, etc.

### Sandboxing

- **Operations cap:** `max_operations(100_000)` -- a typical RSS fetch costs 1k-5k ops; this catches infinite loops.
- **Expression depth:** `max_expr_depths(128, 128)` -- prevents stack overflows from deeply nested or recursive scripts.
- **HTTP timeout:** 15 seconds per request.
- **Response size:** 2 MB cap per response body.
- **Request limit:** 100 HTTP requests per `fetch()` invocation. Counter resets at the start of each fetch.
- **URL validation:** only `http://` and `https://` schemes are allowed; localhost, `127.0.0.1`, `[::1]`, `0.0.0.0`, the private RFC 1918 ranges (`10.x`, `172.16-31.x`, `192.168.x`), and the link-local range `169.254.x` are blocked.
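
The URL validation rule above can be illustrated with a small standard-library sketch. This is not the project's actual code: `is_url_allowed` and `is_blocked_v4` are hypothetical names, the URL is split by hand rather than with a URL-parsing crate, and DNS names are accepted without resolution, which a real implementation might handle differently.

```rust
use std::net::{IpAddr, Ipv4Addr};

/// Hypothetical helper: returns true when a plugin would be allowed to fetch `url`.
/// Assumption: scheme matching is case-sensitive and hostnames are not resolved.
fn is_url_allowed(url: &str) -> bool {
    // Only http:// and https:// schemes pass.
    let rest = match url
        .strip_prefix("http://")
        .or_else(|| url.strip_prefix("https://"))
    {
        Some(rest) => rest,
        None => return false,
    };

    // The authority is everything before the first '/', '?', or '#'.
    let authority = rest
        .split(|c: char| matches!(c, '/' | '?' | '#'))
        .next()
        .unwrap_or("");
    // Drop any userinfo ("user:pass@") in front of the host.
    let host_port = authority.rsplit('@').next().unwrap_or("");

    // Separate the host from the port, handling bracketed IPv6 literals.
    let host = if let Some(inner) = host_port.strip_prefix('[') {
        match inner.split_once(']') {
            Some((h, _)) => h,
            None => return false, // unterminated bracket
        }
    } else {
        host_port.split(':').next().unwrap_or("")
    };

    if host.is_empty() || host.eq_ignore_ascii_case("localhost") {
        return false;
    }

    match host.parse::<IpAddr>() {
        Ok(IpAddr::V4(ip)) => !is_blocked_v4(ip),
        Ok(IpAddr::V6(ip)) => !(ip.is_loopback() || ip.is_unspecified()),
        Err(_) => true, // a DNS name; left unresolved in this sketch
    }
}

/// Loopback, 0.0.0.0, RFC 1918 private ranges, and 169.254.x link-local.
fn is_blocked_v4(ip: Ipv4Addr) -> bool {
    ip.is_loopback() || ip.is_unspecified() || ip.is_private() || ip.is_link_local()
}
```

In this sketch, `is_url_allowed("https://example.com/feed.xml")` is accepted, while `http://192.168.1.1/` and `file:///etc/passwd` are rejected. Checking only the literal host leaves DNS names that resolve to internal addresses uncovered; whether the real host functions guard against that is not stated here.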
### Host Functions

Functions registered into the Rhai engine for scripts to call:

| Function | Description |
|----------|-------------|
| `http_get(url)` | Fetch URL, return response body as string |
| `http_get_json(url)` | Fetch URL, parse JSON, return as Dynamic |
| `parse_json(str)` | Parse a JSON string |
| `parse_xml(str)` | Parse XML into a simplified `{tag, text, attrs, children}` structure |
| `parse_feed(str)` | Auto-detect RSS/Atom/JSON Feed, return `{title, link, entries}` |
| `parse_datetime(str)` | Parse ISO 8601 or RFC 2822 date to Unix timestamp |
| `timestamp_now()` | Current UTC timestamp (seconds) |
| `html_to_text(html)` | Strip HTML, render as plain text (80-char width) |
| `extract_article(html)` | Readability extraction: returns `{title, content, text}` |
| `truncate(text, max)` | Truncate with ellipsis |
| `str_contains`, `str_split`, `str_replace`, `str_trim` | String utilities |
| `strip_tracking(url)` | Remove utm_*, fbclid, gclid, etc. from a URL |
| `parse_int(str)` | Parse string to integer (returns UNIT on failure) |
| `debug_print(val)` | Log to tracing at debug level |

### Config Field Types

Plugins declare their configuration schema with these field types: `Text`, `TextArea`, `Secret`, `Url`, `Number`, `Toggle`, `Select`. Fields marked `Url` become feed subscriptions; other fields become key-value options passed to `fetch()`.

### Bundled Plugins

Three plugins ship with the app: `rss.rhai` (RSS/Atom/JSON Feed), `hackernews.rhai` (HN stories), `arxiv.rhai` (arXiv papers). A `reader.rhai` plugin extracts article content from URLs using the readability algorithm.

## Feed Aggregation

The `FeedGenerator` (`bb-feed::generator`) reads items from the database, applies filters and ordering, and returns paginated results.

**Filtering** combines SQL-level and in-memory strategies:

- Source, unread, starred, and FTS5 search are pushed into SQL for accurate LIMIT/OFFSET pagination.
- Item-level tags, feed-level tags, and query feed conditions (title/author/body contains, equals, not_contains, matches_regex) run in-memory after the SQL query.

**Ordering** is applied in-memory after filtering:

- `Chronological` -- newest first (default)
- `Score` -- highest score first, with chronological tiebreak
- `UnreadFirst` -- unread items before read, chronological within each group
- `StarredFirst` -- starred items before unstarred, chronological within each group

**Pagination** fetches `page_size + 1` items to detect whether more pages exist, then truncates to the exact page size.

## Database Layer

SQLite via sqlx with compile-time migrations (10 migrations). The `Database` struct holds a connection pool (`max_connections: 16`) and provides typed repository accessors.

### Tables

| Table | Purpose |
|-------|---------|
| `feeds` | Registered feed subscriptions. Keyed by UUID, linked to a busser_id. Tracks config JSON, enabled state, last_fetch, health counters, and circuit breaker state. |
| `feed_items` | All fetched items. Deduplicated by `external_id` (UNIQUE). Stores bite display fields, full content, metadata (score, tags as JSON array), and user state (is_read, is_starred). |
| `feed_items_fts` | FTS5 virtual table in external-content mode. Indexes title, body, and bite_text. Kept in sync via INSERT/UPDATE/DELETE triggers. |
| `feed_tags` | User-assigned tags on feeds (many-to-many). |
| `busser_state` | Plugin key-value state (cursors, tokens, pagination markers). Keyed by `(busser_id, key)`. |
| `user_config` | Key-value preferences (theme, welcome flag). Synced via changelog triggers. |
| `query_feeds` | Saved filter rules that act as virtual sources. Rules stored as JSON array. Synced via changelog triggers. |
| `sync_state` | Single-row sync metadata (device_id, pull_cursor, auto_sync settings). |
| `sync_changelog` | Local changes pending push. Written by triggers on feeds, feed_tags, user_config, query_feeds, and feed_items (user state only). |

### Repositories

- `FeedsRepository` -- CRUD, enable/disable, last_fetch updates, fetch failure recording, circuit breaker management
- `ItemsRepository` -- upsert (dedup by external_id), read/star toggling, paginated listing (by busser, by feed, unread, starred), FTS5 search, counts, stale item deletion
- `TagsRepository` -- per-feed tag assignment, distinct tag listing, bulk feed-tag pairs
- `StateRepository` -- busser key-value state (get/set/delete by busser_id + key)
- `ConfigRepository` -- user_config key-value pairs (get/set/delete)
- `QueryFeedsRepository` -- query feed CRUD (create/update/delete/list)

FTS5 queries are sanitized by wrapping each search term in double quotes to prevent syntax injection (`AND`, `OR`, `NOT`, `NEAR` operators). The `^` prefix and `*` suffix characters are stripped.

## Sync Integration

Balanced Breakfast integrates with the SyncKit client SDK for cross-device sync. The `sync_service` module handles push/pull of local changes.

**What gets synced:**

- Feed subscriptions (feeds table: config, enabled state, health counters)
- Feed tags
- User config (preferences)
- Query feeds (saved filter rules)
- Feed item user state (is_read, is_starred changes only -- not item content)

**How it works:** SQLite triggers on synced tables write changes to `sync_changelog`. The sync engine pushes unpushed entries in batches of 500, pulls remote changes using a cursor, and applies them in FK-safe order (parents before children for upserts, children before parents for deletes). An `applying_remote` flag in `sync_state` suppresses trigger firing during remote change application to prevent echo loops.

The sync scheduler runs on a configurable interval (default 15 minutes). Encryption is E2E via the SyncKit client's ChaCha20-Poly1305 with keys stored in the OS keychain.

## Security Model

- **Plugin secrets at rest:** AES-256-GCM encryption.
Encrypted format: `bb_enc:v1:`. Key stored in `encryption.key` with 0600 permissions (Unix). Backward-compatible: unencrypted values pass through on decrypt.
- **FTS5 query sanitization:** User search input is quoted per-word to prevent FTS5 operator injection. Special characters (`^`, `*`) are stripped.
- **URL validation:** Rhai HTTP host functions block non-HTTP schemes and requests to localhost/internal addresses.
- **Response size limits:** 2 MB cap on HTTP response bodies prevents memory exhaustion.
- **URL tracking removal:** utm_*, fbclid, gclid, msclkid, and other tracking parameters stripped from item URLs and body HTML on ingest.
- **Sync encryption:** E2E via SyncKit (ChaCha20-Poly1305 + Argon2 key derivation). Server never sees plaintext.

## Concurrency Model

- **Tokio async runtime** (multi-threaded) drives all I/O: database queries, HTTP fetches, sync operations.
- **`Arc<RwLock<PluginManager>>`** -- the orchestrator holds the plugin manager behind a Tokio RwLock. Read lock for fetches and schema queries; write lock only during plugin loading.
- **`Arc`** -- shared across Tauri commands and background tasks. Managed by Tauri's state system.
- **AbortHandles** -- background tasks (auto-fetch loop, stale cleanup) store their `AbortHandle` in `AppState` behind `std::sync::Mutex`. On shutdown or task replacement, existing handles are aborted.
- **Auto-fetch loop** -- checks every 60 seconds which plugins are due for a fetch based on their last_fetch timestamp and configured interval.
- **Stale cleanup** -- runs every 6 hours, deleting read (non-starred) items older than 30 days.

## Frontend Architecture

The frontend is vanilla HTML/CSS/JS served by Tauri's webview. There is no build step or bundler.

- **Tauri commands** act as thin wrappers: each command extracts parameters, calls the orchestrator or feed generator, and returns a serialized response. All business logic lives in the library crates.
- **Tauri events** notify the frontend of background activity: `auto-fetch-complete` (new items available), `auto-fetch-error`, `feed-circuit-broken`.
- **JS files** live in `src-tauri/frontend/js/`. Communication with Rust is via `window.__TAURI__`.

## Feed Health Tracking

Each feed tracks `consecutive_failures` and `last_error`. On fetch success, failures reset to 0. On failure, the counter increments.

Health status:

- **Green (healthy):** 0 consecutive failures
- **Yellow (degraded):** 1-9 consecutive failures
- **Red (circuit broken):** 10 or more consecutive failures trip the circuit breaker; the feed is excluded from auto-fetch until manually reset via `reset_circuit_breaker`

## Key Design Decisions

- **Rhai over WASM/Lua:** Rhai is a Rust-native scripting language with easy type bridging and built-in safety limits. No FFI boundary, no separate runtime. Plugins are plain text files, not compiled artifacts.
- **Single orchestrator:** All coordination flows through one struct. No message passing between crates; the orchestrator calls methods directly. Simpler than an actor model for this scale.
- **SQL-first filtering with in-memory fallback:** Simple filters (source, unread, starred, search) use SQL for correct pagination. Complex filters (regex, tag intersection, query feed conditions) run in-memory. This avoids dynamic SQL generation while keeping common paths fast.
- **External-content FTS5:** The FTS index references `feed_items` by rowid with no data duplication. Triggers keep it in sync. This saves disk space compared to a full-copy FTS table.
- **Dedup by external_id:** Items use `external_id` (UNIQUE) for deduplication on upsert. The busser provides the ID; the DB enforces uniqueness.
- **Changelog-based sync:** SQLite triggers write changes to `sync_changelog` rather than diffing snapshots. This captures intent (INSERT/UPDATE/DELETE) and works naturally with the SyncKit push/pull model.
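
The health thresholds from the Feed Health Tracking section can be sketched as a small state holder. This is an illustrative model only: `FeedHealth` and `HealthStatus` are hypothetical types, the fields mirror the column names mentioned in this document, and leaving `last_error` untouched on success is an assumption rather than documented behavior.

```rust
/// Threshold named in this document (`CIRCUIT_BREAKER_THRESHOLD`).
const CIRCUIT_BREAKER_THRESHOLD: u32 = 10;

/// Hypothetical enum for the green/yellow/red states.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum HealthStatus {
    Healthy,
    Degraded,
    CircuitBroken,
}

/// Hypothetical in-memory view of a feed's health fields.
#[derive(Debug, Default)]
struct FeedHealth {
    consecutive_failures: u32,
    last_error: Option<String>,
    circuit_broken: bool,
}

impl FeedHealth {
    /// A successful fetch resets the failure counter.
    /// Assumption: `last_error` is kept for display and not cleared here.
    fn record_success(&mut self) {
        self.consecutive_failures = 0;
    }

    /// A failed fetch increments the counter and trips the breaker at the threshold.
    fn record_failure(&mut self, error: &str) {
        self.consecutive_failures += 1;
        self.last_error = Some(error.to_string());
        if self.consecutive_failures >= CIRCUIT_BREAKER_THRESHOLD {
            self.circuit_broken = true;
        }
    }

    /// Manual reset: the only path that clears a tripped breaker in this sketch.
    fn reset_circuit_breaker(&mut self) {
        self.consecutive_failures = 0;
        self.circuit_broken = false;
    }

    /// Maps the counters onto the three health states.
    fn status(&self) -> HealthStatus {
        if self.circuit_broken {
            HealthStatus::CircuitBroken
        } else if self.consecutive_failures > 0 {
            HealthStatus::Degraded
        } else {
            HealthStatus::Healthy
        }
    }
}
```

In this model a feed stays in the degraded state through nine consecutive failures and trips on the tenth. A later successful manual fetch resets the counter but leaves the breaker tripped until `reset_circuit_breaker` is called.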
## Key Paths

| What | Where |
|------|-------|
| Workspace manifest | `Cargo.toml` |
| Plugin interface types | `crates/bb-interface/src/` |
| Orchestrator | `crates/bb-core/src/orchestrator.rs` |
| Plugin manager | `crates/bb-core/src/plugin_manager.rs` |
| Rhai runtime | `crates/bb-core/src/rhai_plugin/` |
| Host functions | `crates/bb-core/src/rhai_plugin/host_functions.rs` |
| Type conversions | `crates/bb-core/src/rhai_plugin/conversions.rs` |
| Secret encryption | `crates/bb-core/src/crypto.rs` |
| URL cleaner | `crates/bb-core/src/url_cleaner.rs` |
| Feed generator | `crates/bb-feed/src/generator.rs` |
| Ordering/filtering | `crates/bb-feed/src/ordering.rs` |
| Database layer | `crates/bb-db/src/` |
| Repositories | `crates/bb-db/src/repository.rs` |
| Migrations | `migrations/sqlite/` (001-010) |
| Tauri app state | `src-tauri/src/state.rs` |
| Tauri commands | `src-tauri/src/commands/` |
| Sync service | `src-tauri/src/sync_service.rs` |
| Bundled plugins | `plugins/` |
| Frontend JS | `src-tauri/frontend/js/` |
| Frontend CSS | `src-tauri/frontend/css/` |