Building log 2026-05-06
This week's work

The most common AI tell is not em-dashes

Built a mechanical audit for the writing this week, ran it across the twenty guides, and found that the AI tell that manual review misses most often is not em-dashes. It is parallel sentence openings: three or four in a row starting with the same word. Then Sander reviewed one of the guides and caught three patterns the script will never reach.

AppKeep’s writing rulebook lives at docs/WRITING_RULES.md. It is the filtered version of Karolina’s writing framework, calibrated for a stressed homeowner reading on a phone after a contractor sent a quote they cannot make sense of. For months we were enforcing it by hand. This week we built the script that enforces it mechanically, ran it across the twenty guides, and found things manual review had missed entirely.

The script is site/scripts/audit-content.mjs. It walks the guides, glossary, landing page, and email templates and flags every em-dash beyond the one-per-piece budget, every empty intensifier (actually, literally, genuinely), every lazy rhetorical opener, every padding closer, every parallel sentence opening of three or more in a row, and every five-word phrase that appears three or more times in a file. Run it with npm run audit:content from site/.
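To make the shape of the checks concrete, here is a minimal sketch of the two simplest ones, the em-dash budget and the intensifier list. It is illustrative only, not the real audit-content.mjs: the budget of one and the three intensifiers come from the description above, while the function name and output shape are made up for the example.

```js
// Illustrative sketch only, not the real audit-content.mjs.
// Assumes a budget of one em-dash per piece and the intensifier list above.
const EM_DASH_BUDGET = 1;
const INTENSIFIERS = ['actually', 'literally', 'genuinely'];

export function checkMechanicalTells(text) {
  const violations = [];

  // Em-dash budget: count U+2014 and flag anything past the first.
  const emDashes = (text.match(/\u2014/g) ?? []).length;
  if (emDashes > EM_DASH_BUDGET) {
    violations.push({ rule: 'em-dash-budget', count: emDashes, budget: EM_DASH_BUDGET });
  }

  // Empty intensifiers: flag every whole-word, case-insensitive occurrence.
  for (const word of INTENSIFIERS) {
    const hits = (text.match(new RegExp(`\\b${word}\\b`, 'gi')) ?? []).length;
    if (hits > 0) {
      violations.push({ rule: 'empty-intensifier', word, count: hits });
    }
  }

  return violations;
}
```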

What the script found that manual review had not

Em-dashes were not the worst of it. We had run an em-dash purge across the guides weeks ago. The dominant AI tell across the writing was something the previous passes had not been specifically looking for: parallel sentence openings, three or four consecutive sentences inside a paragraph each starting with the same word. “By price alone, A is the cheapest. By annual save rate, B is cheapest. By schedule reliability, B and D both commit. By extras handling, B and D have named pricing.” Four sentences carrying the same content as a normally-varied paragraph would, dressed up as parallel structure.

It was nearly everywhere. “When the brief asks for warranty length, contractors give it. When the brief asks for named extras pricing, they declare it…” in how-contractors-think. “The first signal is that a good contractor… The second signal is that… The third signal is…” in how-to-spot-a-good-contractor. “The store warranty covers… The manufacturer’s warranty covers… The product warranty covers… The workmanship warranty covers…” in what-your-warranty-actually-covers.

Once the pattern is named you cannot unsee it. The fix is to vary the openers, or, when the content is a list of N parallel items, convert to a bullet list and let structure do the work that the parallel sentences were doing.
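Detecting the run mechanically is simpler than fixing it. A sketch under naive assumptions (sentences split on terminal punctuation, lowercase comparison of the first word, a run threshold of three); the real script's internals may differ:

```js
// Illustrative sketch: flag runs of three or more consecutive sentences
// in a paragraph that open with the same word.
export function findParallelOpenings(paragraph, minRun = 3) {
  // Naive sentence split: terminal punctuation followed by whitespace.
  const sentences = paragraph.split(/(?<=[.!?])\s+/).filter(Boolean);
  const openers = sentences.map(
    (s) => (s.trim().split(/\s+/)[0] ?? '').toLowerCase().replace(/[^a-z']/g, '')
  );

  const runs = [];
  let start = 0;
  for (let i = 1; i <= openers.length; i++) {
    if (i === openers.length || openers[i] !== openers[start]) {
      if (i - start >= minRun && openers[start]) {
        runs.push({ opener: openers[start], length: i - start });
      }
      start = i;
    }
  }
  return runs;
}
```

Run against the four “By price alone…” sentences quoted above, it reports a single run of four on “by”.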

The rules were fighting the writing in two places

The framework had a sixty-word target for guide ledes and an eight-hundred to fifteen-hundred-word target for full guides. Both turned out to be too tight against the writing once a guide had to answer a search query in full prose with multi-country breakdowns and inline source citations. A cost guide's lede with Netherlands, Finland, and Sweden price bands plus three source links honestly lands at one hundred and twenty words. The methodology guides run two thousand to three thousand words because the layered argument requires it.

We dropped both targets. The real rule is in No padding: every word, every claim, every sentence earns its place. Word counts were proxies. The mechanical version of the real rule lives in the script as parallel-opening detection plus repeated-phrase detection. Together they catch what word counts were trying to catch.
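The repeated-phrase half reduces to a sliding five-word window over the file. Another illustrative sketch, with the window size and threshold taken from the description above and an assumed normalization step:

```js
// Illustrative sketch: count every five-word window in a file and report
// phrases that appear three or more times. The lowercasing and punctuation
// stripping are assumptions, not necessarily what audit-content.mjs does.
export function findRepeatedPhrases(text, windowSize = 5, minCount = 3) {
  const words = text
    .toLowerCase()
    .replace(/[^a-z'\s]/g, ' ')
    .split(/\s+/)
    .filter(Boolean);

  const counts = new Map();
  for (let i = 0; i + windowSize <= words.length; i++) {
    const phrase = words.slice(i, i + windowSize).join(' ');
    counts.set(phrase, (counts.get(phrase) ?? 0) + 1);
  }

  return [...counts.entries()]
    .filter(([, count]) => count >= minCount)
    .map(([phrase, count]) => ({ phrase, count }));
}
```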

What the script cannot see

Sander reviewed how-contractors-think.md mid-week and caught three patterns the script will never reach. One: the guide argued from two incompatible moral framings in the same piece. The dominant frame was charitable (“contractors answer the question they are asked, not malicious”), but the extras section called one contractor “honest” and another “vague,” which is a moral judgement the rest of the guide spends paragraphs refusing. Two: a list of two outcomes that excluded the most-common real-world outcome (the bad-faith case the guide was dancing around). Three: a causal claim with a confound, attributing a price drop to a clearer brief when the same revised quote also had a site visit attached to it.

These are interpretive, not mechanical. The script knows what an em-dash is. It does not know what charitable framing is, or which third outcome the writer skipped to keep the frame clean. Sander does. The audit is two layers now: the script catches the mechanical tells; the human review catches the framing slips.

We folded the patterns into the rulebook anyway. Charitable framing for third parties as a new section. Committed vs uncommitted as the structural distinction in place of honest vs vague. Three new entries under Never: outcome-list completeness, unsupported confidence qualifiers, and causal confounds.

What stays

The script runs before any new content ships. It is wired into the package as npm run audit:content, with a reference doc at docs/CONTENT_AUDIT.md that lists what is checked, what counts as a known false positive, and the recalibrations made along the way. The rulebook in docs/WRITING_RULES.md carries the new patterns and the recalibrated length rule. The voice tests stay human, because they have to.
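The wiring itself is just an npm script in site/package.json pointing at the audit file, roughly this shape (the exact entry may differ):

```json
{
  "scripts": {
    "audit:content": "node scripts/audit-content.mjs"
  }
}
```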

The first time we ran the audit it flagged thirty-nine violations across forty content surfaces. After two passes it sits at twenty-seven, all medium or low, most of them structural lists that the script arguably misreads as parallel openings. Em-dashes across the writing are at zero. The pattern that manual review had missed is now visible every time we look.