max / pter

chore: move internal todo to private docs store
Author: Max J. <87768334+MaxJMath@users.noreply.github.com> · 2026-05-21 01:40 UTC
Commit: ba15f33b4c50ba7160533d5603e2b79a25317d85
Parent: f595794
1 file changed, +0 insertions, -95 deletions
D docs/todo.md -95
@@ -1,95 +0,0 @@
1 - # pter - Todo
2 -
3 - Done: Phases 1-5 (except publish). Active: None. Next: cargo publish when ready.
4 -
5 - v0.1.0. 116 tests.
6 -
7 - ---
8 -
9 - ## Phase 1: Core Conversion
10 -
11 - ### Done
12 - - [x] Crate scaffold (Cargo.toml, MIT license, README)
13 - - [x] HTML element to markdown conversion (p, h1-h6, strong, em, a, img, ul/ol/li, blockquote, pre/code, hr, br, del, sup, sub)
14 - - [x] Tracking pixel detection (1x1 img, empty src, data URI, inline style)
15 - - [x] Hidden element skipping (display:none, visibility:hidden)
16 - - [x] Whitespace normalization (collapse blank lines, trim)
17 - - [x] Script/style/head stripping
18 - - [x] Entity decoding (via html5ever)
19 - - [x] Link deduplication (text matches URL)
20 - - [x] Nested list indentation
21 - - [x] Nested blockquote rendering
22 - - [x] Pre/code block rendering (no double-wrap)
23 -
24 - ---
25 -
26 - ## Phase 2: Email Layout Unwrapping
27 -
28 - ### Done
29 - - [x] Layout table detection heuristic (layout vs data table)
30 - - [x] Single-cell table unwrapping
31 - - [x] Multi-column table linearization
32 - - [x] Data table rendering as markdown table
33 - - [x] Nested layout table recursion
34 - - [x] font-size:0 / line-height:0 / height:0+overflow:hidden spacer detection
35 - - [x] role="presentation" detection
36 -
37 - ### Deferred
38 - - [ ] Outlook conditional comment stripping (client-specific, low cross-platform value)
39 -
40 - ---
41 -
42 - ## Phase 3: Reply Chain Detection
43 -
44 - ### Done
45 - - [x] Reply boundary abstraction (`is_reply_boundary` predicate)
46 - - [x] Structural markers (type=cite)
47 - - [x] CSS class markers (gmail_quote, divRplyFwdMsg, yahoo_quoted, protonmail_quote, tutanota_quote, moz-cite-prefix, zmail_extra)
48 - - [x] Attribution text detection (On ... wrote:, Forwarded message, Original Message, Begin forwarded message, French/German variants)
49 - - [x] Attribution line preservation above quote blocks
50 - - [x] Quote depth rendering via temporary buffer + `>` prefix
51 - - [x] Outlook separator detection (From/Sent/To/Subject blocks)
52 - - [x] Heuristic: div with attribution text followed by blockquote
53 - - [x] Previous sibling text scanning for attribution
54 -
55 - ---
56 -
57 - ## Phase 4: Integration
58 -
59 - ### Done
60 - - [x] GoingsOn: pter::convert() replaces strip_html in imap_client.rs extract_body_with_html()
61 - - [x] GoingsOn: removed ~230 lines of hand-rolled HTML stripping code + 30 tests (covered by pter)
62 - - [x] GoingsOn: path dep added to src-tauri/Cargo.toml
63 - - [x] Balanced Breakfast: pter::convert() replaces html2text in html_to_text + extract_article Rhai host functions
64 - - [x] Balanced Breakfast: html2text dependency removed from bb-core/Cargo.toml
65 - - [x] Both projects compile clean, BB tests pass (153 tests)
66 -
67 - ---
68 -
69 - ## Phase 5: Polish + Publish
70 -
71 - ### Done
72 - - [x] Property-based testing with proptest (7 fuzz strategies: never panics, no HTML leak, valid UTF-8, no triple newlines, no trailing whitespace, arbitrary bytes, whitespace-only)
73 - - [x] Edge case hardening (24 tests: empty, whitespace-only, deeply nested divs/blockquotes/lists, malformed HTML, unicode, large input, empty table cells, nested link formatting)
74 - - [x] Benchmarks with criterion (simple: 4µs, newsletter: 15µs, reply chain: 10µs, 100 sections: 101µs)
75 -
76 - ### Remaining
77 - - [ ] cargo publish to crates.io
78 - - [ ] Update GO and BB to crates.io version
79 -
80 - ---
81 -
82 - ## Key Paths
83 -
84 - | What | Where |
85 - |------|-------|
86 - | Public API | `src/lib.rs` |
87 - | Conversion pipeline | `src/convert.rs` |
88 - | Element classification | `src/elements.rs` |
89 - | Table handling | `src/tables.rs` |
90 - | Reply detection | `src/replies.rs` |
91 - | Whitespace normalization | `src/whitespace.rs` |
92 - | Integration tests | `tests/integration.rs` |
93 - | Edge case tests | `tests/edge_cases.rs` |
94 - | Property-based tests | `tests/proptest.rs` |
95 - | Benchmarks | `benches/convert_bench.rs` |
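
Note on the removed todo: its whitespace normalization item (collapse blank lines, trim) and the proptest invariants (no triple newlines, no trailing whitespace, whitespace-only input) describe a small, checkable contract. The sketch below is one way such a pass could look. It is not taken from `src/whitespace.rs`; the function name `normalize_whitespace` and its exact behavior (leading indentation kept, at most one blank line between blocks, no final newline) are assumptions for illustration only.

```rust
/// Hypothetical helper, not pter's actual implementation.
///
/// Trims trailing whitespace from every line, collapses each run of blank
/// lines into a single blank line, and drops blank lines at the start and
/// end of the text. Leading indentation is preserved so that nested list
/// items keep their structure.
pub fn normalize_whitespace(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    // Set when one or more blank lines were seen after some content.
    let mut pending_blank = false;

    for line in input.lines() {
        let line = line.trim_end();
        if line.is_empty() {
            // Ignore blank lines before the first content line.
            if !out.is_empty() {
                pending_blank = true;
            }
            continue;
        }
        if pending_blank {
            // Emit exactly one blank line, however many were in the input.
            out.push('\n');
            pending_blank = false;
        }
        out.push_str(line);
        out.push('\n');
    }

    // Drop the final newline so the result has no trailing whitespace.
    if out.ends_with('\n') {
        out.pop();
    }
    out
}
```

Under these assumptions, `normalize_whitespace("a  \n\n\n\nb\n")` yields `"a\n\nb"`, and whitespace-only input yields an empty string. Deferring the blank line until the next content line arrives is what keeps trailing blank lines out of the result without a second pass.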