<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
    <title>pikachus</title>
    <subtitle>Owen&#x27;s blog</subtitle>
    <link rel="self" type="application/atom+xml" href="https://pikach.us/atom.xml"/>
    <link rel="alternate" type="text/html" href="https://pikach.us"/>
    <generator uri="https://www.getzola.org/">Zola</generator>
    <updated>2026-03-23T00:00:00+00:00</updated>
    <id>https://pikach.us/atom.xml</id>
    <entry xml:lang="en">
        <title>Mechanical empathy for machines</title>
        <published>2026-03-23T00:00:00+00:00</published>
        <updated>2026-03-23T00:00:00+00:00</updated>
        
        <author>
          <name>
            
              Unknown
            
          </name>
        </author>
        
        <link rel="alternate" type="text/html" href="https://pikach.us/lab-notes/mechanical-empathy-for-machines/"/>
        <id>https://pikach.us/lab-notes/mechanical-empathy-for-machines/</id>
        
        <content type="html" xml:base="https://pikach.us/lab-notes/mechanical-empathy-for-machines/">&lt;p&gt;LLMs keep getting more capable, but humans stay in the loop, bottlenecking potential output. Moving us out of the loop is the implicit goal. But it’s hard when we target human-oriented outcomes – UIs, visual layouts, things humans spot-check. How do you close the iteration loop without a human squinting at the screen?&lt;&#x2F;p&gt;
&lt;p&gt;Enter intermediate representations. An IR (in this context) is a machine-readable representation between the human interface and the application. Two requirements: it must be data-only (serializable), and computing the human-visible layer from it must be a pure function. Fidelity through determinism.&lt;&#x2F;p&gt;
&lt;p&gt;I’ve been building something called FrameTape that applies this idea to TUI animations. It records a sequence of widget states, renders each into an off-screen buffer, and computes metrics: smoothness (coefficient of variation of frame-to-frame deltas), coverage (fraction of cells that changed), periodicity (distance between first and last frame for loop detection). Then it exposes assertions – WCAG contrast ratios, smoothness thresholds, coverage minimums – that agents can run autonomously.&lt;&#x2F;p&gt;
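&lt;p&gt;For a concrete feel, here’s a minimal sketch of metrics like these, treating each frame as a flat buffer of cells. It’s illustrative, not FrameTape’s actual code – the names and exact definitions (e.g. coverage as the fraction of cell positions that ever change) are my assumptions.&lt;&#x2F;p&gt;

```python
import statistics

def frame_deltas(frames):
    # number of cells that differ between consecutive frames
    return [sum(1 for x, y in zip(a, b) if x != y)
            for a, b in zip(frames, frames[1:])]

def smoothness(deltas):
    # coefficient of variation of frame-to-frame deltas; lower is smoother
    mean = statistics.mean(deltas)
    return statistics.pstdev(deltas) / mean if mean else 0.0

def coverage(frames):
    # fraction of cell positions that ever change during the recording
    width = len(frames[0])
    changed = sum(1 for i in range(width)
                  if len(set(f[i] for f in frames)) > 1)
    return changed / width

def periodicity(frames):
    # distance between first and last frame; 0 means a perfect loop
    return sum(1 for x, y in zip(frames[0], frames[-1]) if x != y)
```

&lt;p&gt;An agent can assert thresholds on these numbers and re-record until they hold.&lt;&#x2F;p&gt;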
&lt;p&gt;The point isn’t the metrics themselves. It’s that the iteration loop becomes machine-native. An agent can tweak easing curves, re-record, measure, and converge – no human watching the animation and saying “that feels janky.” The IR makes the subjective objective.&lt;&#x2F;p&gt;
&lt;p&gt;There’s something after the coding agent. Cursor gave way to Claude. The pattern is always more autonomy, more decoupling. Separate intent from execution. The question is what the next layer of indirection looks like.&lt;&#x2F;p&gt;
</content>
        
    </entry>
    <entry xml:lang="en">
        <title>Newsletter #2: Source Code Is the New Assembly</title>
        <published>2026-03-15T00:00:00+00:00</published>
        <updated>2026-03-15T00:00:00+00:00</updated>
        
        <author>
          <name>
            
              Unknown
            
          </name>
        </author>
        
        <link rel="alternate" type="text/html" href="https://pikach.us/blog/newsletter-2026-03/"/>
        <id>https://pikach.us/blog/newsletter-2026-03/</id>
        
        <content type="html" xml:base="https://pikach.us/blog/newsletter-2026-03/">&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Preamble&lt;&#x2F;em&gt;: It’s time for my second newsletter. About a month and a half since the last one – a bit over my monthly target cadence. I’ve got some interesting stuff today.&lt;&#x2F;p&gt;
&lt;&#x2F;blockquote&gt;
&lt;h1 id=&quot;source-code-is-the-new-assembly&quot;&gt;Source Code Is the New Assembly&lt;&#x2F;h1&gt;
&lt;p&gt;Line 4,892 was off, and it was already 11pm. I rubbed my eyes; at what point does exhaustion outweigh my principles? I was too tired to even name the problem, let alone fix it.&lt;&#x2F;p&gt;
&lt;p&gt;I’d traded coding fatigue for reviewing fatigue. More productive, but not nearly enough. And more sinister.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;the-trough&quot;&gt;The Trough&lt;&#x2F;h2&gt;
&lt;p&gt;I had a reasonable plan: make the AI follow rules, such that I could orchestrate it into larger and larger tasks. My friend Cyril warned me against building multi-agent from the get-go, so of course I nodded and kept building. After all, is not experience the best teacher? For weeks it accreted endlessly… a hydra of features and edge cases begetting their own edge cases. The thing meant to save me time was costing me more of it, and I could feel it failing before I admitted it. I think most builders know. You just keep going because the sunk cost whispers that the next fix will be the one that makes it click.&lt;&#x2F;p&gt;
&lt;p&gt;But the deeper problem wasn’t the tool I was building. It was me. My reviews got sloppier as the day wore on. Frustration yielded to YOLOing the model against tasks and hoping for the best. The style guides I’d written were being ignored by the model, and I wasn’t even noticing. It was draining me and giving it more hours wasn’t working.&lt;&#x2F;p&gt;
&lt;p&gt;&lt;strong&gt;It would take everything I had. I had to give it nothing.&lt;&#x2F;strong&gt;&lt;&#x2F;p&gt;
&lt;h2 id=&quot;first-principles&quot;&gt;First Principles&lt;&#x2F;h2&gt;
&lt;p&gt;I stepped back and stared at the shape of the problem. LLMs had moved me from writing code to reviewing code, but my time was still the bottleneck. I could generate ten times more code, but I had to review ten times more code. Better, but another linear function of my time. Not enough.&lt;&#x2F;p&gt;
&lt;p&gt;So what are agents actually good at? They’re persistent. They don’t get sloppy at 4pm. They can grind on a problem. But they need a quantitative target – not “follow this style guide,” but something they can pursue mechanically, over and over, without qualitative judgment.&lt;&#x2F;p&gt;
&lt;p&gt;&lt;strong&gt;What if I gave my code a score, and the agent just… made the score go up?&lt;&#x2F;strong&gt;&lt;&#x2F;p&gt;
&lt;p&gt;Not a feeling — a number. You run your code through analysis, and out comes X.&lt;&#x2F;p&gt;
&lt;p&gt;Now imagine you could break that score down. Not just “the codebase scores X” but “this file is dragging it down, and within that file, these three functions are the worst offenders.” You can point at the pain.&lt;&#x2F;p&gt;
&lt;p&gt;Now hand that breakdown to an AI: fix the worst thing. Re-score. Better. Fix the next worst thing. Keep going.&lt;&#x2F;p&gt;
&lt;p&gt;That’s the whole loop. Score, find the hotspots, fix, re-score, repeat.&lt;&#x2F;p&gt;
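&lt;p&gt;The harness itself can be tiny. A sketch, where &lt;code&gt;score&lt;&#x2F;code&gt;, &lt;code&gt;hotspots&lt;&#x2F;code&gt;, and &lt;code&gt;fix&lt;&#x2F;code&gt; stand in for the analyzer and the agent (every name here is illustrative):&lt;&#x2F;p&gt;

```python
def train(codebase, score, hotspots, fix, min_delta=0.01, max_epochs=20):
    # score: codebase -> loss; hotspots: worst offenders first;
    # fix: ask the agent to rewrite one hotspot. All three are stand-ins.
    history = [score(codebase)]
    for _ in range(max_epochs):
        worst = hotspots(codebase)[0]      # where the pain lives
        codebase = fix(codebase, worst)    # agent rewrites that spot
        history.append(score(codebase))
        if min_delta > history[-2] - history[-1]:
            break                          # converged: diminishing returns
    return codebase, history
```

&lt;p&gt;With real metrics behind &lt;code&gt;score&lt;&#x2F;code&gt; and an LLM behind &lt;code&gt;fix&lt;&#x2F;code&gt;, that’s the whole harness.&lt;&#x2F;p&gt;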
&lt;h2 id=&quot;the-link&quot;&gt;The Link&lt;&#x2F;h2&gt;
&lt;p&gt;This pattern has a name. It’s how we train neural networks.&lt;&#x2F;p&gt;
&lt;p&gt;The score (inverted, so it goes down as quality goes up) is called a loss function. The breakdown of what’s responsible for the loss is called the gradient. The loop of scoring and improving is called training. Each pass through the loop is an epoch.&lt;&#x2F;p&gt;
&lt;p&gt;&lt;strong&gt;I’m training my codebase the way we train neural nets.&lt;&#x2F;strong&gt;&lt;&#x2F;p&gt;
&lt;h2 id=&quot;goodhart-s-tension&quot;&gt;Goodhart’s Tension&lt;&#x2F;h2&gt;
&lt;p&gt;There’s an obvious problem with optimizing for a score: gaming it. Tell an AI to minimize code, and the optimal solution is to delete everything.&lt;&#x2F;p&gt;
&lt;p&gt;But real problems can be subtler than the absurd version. I hit this early. The optimizer, trying to minimize function complexity, started shattering functions into tiny single-use helpers. Each function looked simpler on paper, but the logic was scattered across dozens of fragments. The codebase became harder to read, harder to follow, harder to reason about. It was trimming muscle, not fat.&lt;&#x2F;p&gt;
&lt;p&gt;The fix was a competing score. I added a measure of ‘code economy’, which penalized scattering. This incentivized &lt;em&gt;balancing&lt;&#x2F;em&gt; function simplicity and cohesive logic. Trim the fat, not the muscle.&lt;&#x2F;p&gt;
&lt;p&gt;The tension between scores isn’t a flaw. It IS the system. You don’t want any single metric maximized. You want the equilibrium where competing goals hold each other honest.&lt;&#x2F;p&gt;
&lt;p&gt;Btw, this trap has a name: Goodhart’s Law. When a measure becomes a target, it ceases to be a good measure. The antidote, it turns out, is not better measures – I want to keep those few and simple – but tension between them.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;the-loop&quot;&gt;The Loop&lt;&#x2F;h2&gt;
&lt;p&gt;Here’s some messy networking code I ran through the loop.&lt;&#x2F;p&gt;
&lt;p&gt;The loss breakdown pointed at two things: duplication across similar functions, and state complexity. I didn’t tell it to look for these; the loss function found them.&lt;&#x2F;p&gt;
&lt;p&gt;Four passes. Notice the loss drops:&lt;&#x2F;p&gt;
&lt;table&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Epoch&lt;&#x2F;th&gt;&lt;th style=&quot;text-align: right&quot;&gt;Loss&lt;&#x2F;th&gt;&lt;th style=&quot;text-align: right&quot;&gt;Delta&lt;&#x2F;th&gt;&lt;&#x2F;tr&gt;&lt;&#x2F;thead&gt;&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;baseline&lt;&#x2F;td&gt;&lt;td style=&quot;text-align: right&quot;&gt;0.49&lt;&#x2F;td&gt;&lt;td style=&quot;text-align: right&quot;&gt;–&lt;&#x2F;td&gt;&lt;&#x2F;tr&gt;
&lt;tr&gt;&lt;td&gt;epoch 1&lt;&#x2F;td&gt;&lt;td style=&quot;text-align: right&quot;&gt;0.32&lt;&#x2F;td&gt;&lt;td style=&quot;text-align: right&quot;&gt;-0.17&lt;&#x2F;td&gt;&lt;&#x2F;tr&gt;
&lt;tr&gt;&lt;td&gt;epoch 2&lt;&#x2F;td&gt;&lt;td style=&quot;text-align: right&quot;&gt;0.13&lt;&#x2F;td&gt;&lt;td style=&quot;text-align: right&quot;&gt;-0.19&lt;&#x2F;td&gt;&lt;&#x2F;tr&gt;
&lt;tr&gt;&lt;td&gt;epoch 3&lt;&#x2F;td&gt;&lt;td style=&quot;text-align: right&quot;&gt;0.11&lt;&#x2F;td&gt;&lt;td style=&quot;text-align: right&quot;&gt;-0.02&lt;&#x2F;td&gt;&lt;&#x2F;tr&gt;
&lt;tr&gt;&lt;td&gt;epoch 4&lt;&#x2F;td&gt;&lt;td style=&quot;text-align: right&quot;&gt;0.09&lt;&#x2F;td&gt;&lt;td style=&quot;text-align: right&quot;&gt;-0.02&lt;&#x2F;td&gt;&lt;&#x2F;tr&gt;
&lt;&#x2F;tbody&gt;&lt;&#x2F;table&gt;
&lt;p&gt;Big gains early, then diminishing returns (the same shape as a neural net’s loss curve!). We can stop when the losses converge. This is when we’ve maximized tension between our competing principles.&lt;&#x2F;p&gt;
&lt;p&gt;The state cleanup illustrates what it found. Before, the state model was a bag of flags:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color-scheme: light dark; color: light-dark(#657B83, #839496); background-color: light-dark(#FDF6E3, #002B36);&quot;&gt;&lt;code data-lang=&quot;rust&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: light-dark(#93A1A1, #586E75);font-style: italic;&quot;&gt;&#x2F;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#93A1A1, #586E75);font-style: italic;&quot;&gt; Before: each flag is independently true&#x2F;false,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: light-dark(#93A1A1, #586E75);font-style: italic;&quot;&gt;&#x2F;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#93A1A1, #586E75);font-style: italic;&quot;&gt; each Option is independently Some&#x2F;None.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: light-dark(#93A1A1, #586E75);font-style: italic;&quot;&gt;&#x2F;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#93A1A1, #586E75);font-style: italic;&quot;&gt; 8 booleans + 3 Options = 2^11 = 2,048 representable states.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: light-dark(#93A1A1, #586E75);font-style: italic;&quot;&gt;&#x2F;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#93A1A1, #586E75);font-style: italic;&quot;&gt; Most of those states are nonsensical -- authenticated&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: light-dark(#93A1A1, #586E75);font-style: italic;&quot;&gt;&#x2F;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#93A1A1, #586E75);font-style: italic;&quot;&gt; but socket closed? handshake complete but not connected?&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: light-dark(#93A1A1, #586E75);font-style: italic;&quot;&gt;&#x2F;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#93A1A1, #586E75);font-style: italic;&quot;&gt; Nothing in the code prevents it.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: light-dark(#859900, #859900);&quot;&gt;pub&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#586E75, #93A1A1);font-weight: bold;&quot;&gt; struct&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#CB4B16, #CB4B16);&quot;&gt; Connection&lt;&#x2F;span&gt;&lt;span&gt; {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: light-dark(#859900, #859900);&quot;&gt;    pub&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#268BD2, #268BD2);&quot;&gt; is_connected&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#859900, #859900);&quot;&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#CB4B16, #CB4B16);&quot;&gt; bool&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: light-dark(#859900, #859900);&quot;&gt;    pub&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#268BD2, #268BD2);&quot;&gt; socket_open&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#859900, #859900);&quot;&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#CB4B16, #CB4B16);&quot;&gt; bool&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: light-dark(#859900, #859900);&quot;&gt;    pub&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#268BD2, #268BD2);&quot;&gt; is_authenticated&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#859900, #859900);&quot;&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#CB4B16, #CB4B16);&quot;&gt; bool&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: light-dark(#859900, #859900);&quot;&gt;    pub&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#268BD2, #268BD2);&quot;&gt; handshake_complete&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#859900, #859900);&quot;&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#CB4B16, #CB4B16);&quot;&gt; bool&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: light-dark(#859900, #859900);&quot;&gt;    pub&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#268BD2, #268BD2);&quot;&gt; socket_id&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#859900, #859900);&quot;&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#CB4B16, #CB4B16);&quot;&gt; Option&lt;&#x2F;span&gt;&lt;span&gt;&amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#CB4B16, #CB4B16);&quot;&gt;u64&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: light-dark(#859900, #859900);&quot;&gt;    pub&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#268BD2, #268BD2);&quot;&gt; last_error&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#859900, #859900);&quot;&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#CB4B16, #CB4B16);&quot;&gt; Option&lt;&#x2F;span&gt;&lt;span&gt;&amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#CB4B16, #CB4B16);&quot;&gt;String&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: light-dark(#93A1A1, #586E75);font-style: italic;&quot;&gt;    &#x2F;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#93A1A1, #586E75);font-style: italic;&quot;&gt; ...&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;}&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;After: five states, enforced by the type system.&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color-scheme: light dark; color: light-dark(#657B83, #839496); background-color: light-dark(#FDF6E3, #002B36);&quot;&gt;&lt;code data-lang=&quot;rust&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: light-dark(#93A1A1, #586E75);font-style: italic;&quot;&gt;&#x2F;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#93A1A1, #586E75);font-style: italic;&quot;&gt; After: exactly 5 states. The impossible combinations&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: light-dark(#93A1A1, #586E75);font-style: italic;&quot;&gt;&#x2F;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#93A1A1, #586E75);font-style: italic;&quot;&gt; don&amp;#39;t just go unchecked -- they become inexpressible.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: light-dark(#93A1A1, #586E75);font-style: italic;&quot;&gt;&#x2F;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#93A1A1, #586E75);font-style: italic;&quot;&gt; socket_id only exists when the connection is Ready.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: light-dark(#93A1A1, #586E75);font-style: italic;&quot;&gt;&#x2F;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#93A1A1, #586E75);font-style: italic;&quot;&gt; last_error only exists when something went wrong.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: light-dark(#93A1A1, #586E75);font-style: italic;&quot;&gt;&#x2F;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#93A1A1, #586E75);font-style: italic;&quot;&gt; The type system holds the truth, not the programmer&amp;#39;s memory.&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: light-dark(#859900, #859900);&quot;&gt;pub&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#586E75, #93A1A1);font-weight: bold;&quot;&gt; enum&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#CB4B16, #CB4B16);&quot;&gt; Connection&lt;&#x2F;span&gt;&lt;span&gt; {&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: light-dark(#CB4B16, #CB4B16);&quot;&gt;    Idle&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: light-dark(#CB4B16, #CB4B16);&quot;&gt;    Draining&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: light-dark(#CB4B16, #CB4B16);&quot;&gt;    Ready&lt;&#x2F;span&gt;&lt;span&gt; {&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#268BD2, #268BD2);&quot;&gt; socket_id&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#859900, #859900);&quot;&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#CB4B16, #CB4B16);&quot;&gt; u64&lt;&#x2F;span&gt;&lt;span&gt; }&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: light-dark(#CB4B16, #CB4B16);&quot;&gt;    Backoff&lt;&#x2F;span&gt;&lt;span&gt; {&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#268BD2, #268BD2);&quot;&gt; retry_after_ms&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#859900, #859900);&quot;&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#CB4B16, #CB4B16);&quot;&gt; u64&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#268BD2, #268BD2);&quot;&gt; last_error&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#859900, #859900);&quot;&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#CB4B16, #CB4B16);&quot;&gt; String&lt;&#x2F;span&gt;&lt;span&gt; }&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span style=&quot;color: light-dark(#CB4B16, #CB4B16);&quot;&gt;    Error&lt;&#x2F;span&gt;&lt;span&gt; {&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#268BD2, #268BD2);&quot;&gt; last_error&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#859900, #859900);&quot;&gt;:&lt;&#x2F;span&gt;&lt;span style=&quot;color: light-dark(#CB4B16, #CB4B16);&quot;&gt; String&lt;&#x2F;span&gt;&lt;span&gt; }&lt;&#x2F;span&gt;&lt;span&gt;,&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;}&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;From 2,048 representable states to 5. Any LLM could make this change if you pointed at the struct and asked. The point is the system identified this as the highest-value target without being told. The loss function pointed here. The gradient said &lt;em&gt;this is where the pain lives&lt;&#x2F;em&gt;. The agent made the fix. I never said “look at the state model.”&lt;&#x2F;p&gt;
&lt;p&gt;That’s what the loop buys you at scale. Not better one-shot refactors – better &lt;em&gt;prioritization&lt;&#x2F;em&gt;. It finds the things a human reviewer misses at 4pm, across a whole codebase, without getting tired.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;side-quest-llm-vision&quot;&gt;Side Quest: LLM Vision&lt;&#x2F;h2&gt;
&lt;p&gt;The loop gives the LLM two things: a loss to reduce and a location responsible for it. I wanted to see what it sees.&lt;&#x2F;p&gt;
&lt;p&gt;You used to read code to find problems. Now the loss finds problems and points at code. But maybe that isn’t surprising. Performance engineers don’t read every function looking for the slow one – they profile, see the hotspot, go straight there. Same idea. I built a flame graph for code quality: instead of “where is the CPU time going?” it shows “where is the complexity going?” You can explore it for the example above here:&lt;&#x2F;p&gt;
&lt;div class=&quot;asciinema-container&quot;&gt;
&lt;script
  src=&quot;https:&#x2F;&#x2F;asciinema.org&#x2F;a&#x2F;vlyBU5qMA3BkxYVb.js&quot;
  id=&quot;asciicast-vlyBU5qMA3BkxYVb&quot;
  async
&gt;&lt;&#x2F;script&gt;
&lt;&#x2F;div&gt;
&lt;h2 id=&quot;human-out-the-loop&quot;&gt;Human Out the Loop&lt;&#x2F;h2&gt;
&lt;p&gt;So what do you think about when you’re not staring at line 4,892?&lt;&#x2F;p&gt;
&lt;p&gt;This has happened before. Assembly programmers went up the stack. They stopped writing instructions by hand and started writing in higher-level languages. Assembly didn’t disappear – it became intermediate. We just stopped looking at it. I haven’t been writing code for a while now… and I don’t miss it.&lt;&#x2F;p&gt;
&lt;p&gt;&lt;strong&gt;Source code is starting to feel like the new assembly.&lt;&#x2F;strong&gt;&lt;&#x2F;p&gt;
&lt;hr &#x2F;&gt;
&lt;p&gt;P.S.&lt;&#x2F;p&gt;
&lt;p&gt;I’d love to hear what you think.&lt;&#x2F;p&gt;
&lt;p&gt;What’s next for me: I want to push this further and start orchestrating larger work streams on top of the loop. More files, more services, more complex dependency graphs. I want to find the walls. I also have some interesting questions ahead of me that I don’t have answers to yet – how do I show this to people? What’s the right UX? Open source or closed? If you have opinions on any of that, I’m genuinely asking.&lt;&#x2F;p&gt;
&lt;p&gt;A few things I’ve been enjoying lately:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;x.com&#x2F;KatanaLarp&#x2F;status&#x2F;2029928471632224486&quot;&gt;Your LLM Doesn’t Write Correct Code. It Writes Plausible Code.&lt;&#x2F;a&gt; – an analysis of an LLM reimplementation of SQLite. Honest, helpful, and a good reminder that LLMs target compilation, not correctness.&lt;&#x2F;li&gt;
&lt;li&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Project_Hail_Mary&quot;&gt;Project Hail Mary&lt;&#x2F;a&gt; – fun sci-fi. Almost put it down in the beginning, but glad I stuck with it. A bit “written for TV” feeling. Kept wishing it would turn into &lt;em&gt;The Three Body Problem&lt;&#x2F;em&gt; instead.&lt;&#x2F;li&gt;
&lt;li&gt;Blizzard stealth-released the first Diablo 2 expansion in over twenty-five years. Took over my life for a week or two.&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;p&gt;Till next time.&lt;&#x2F;p&gt;
</content>
        
    </entry>
    <entry xml:lang="en">
        <title>Determinism is the gradient</title>
        <published>2026-03-05T00:00:00+00:00</published>
        <updated>2026-03-05T00:00:00+00:00</updated>
        
        <author>
          <name>
            
              Unknown
            
          </name>
        </author>
        
        <link rel="alternate" type="text/html" href="https://pikach.us/lab-notes/determinism-is-the-gradient/"/>
        <id>https://pikach.us/lab-notes/determinism-is-the-gradient/</id>
        
        <content type="html" xml:base="https://pikach.us/lab-notes/determinism-is-the-gradient/">&lt;p&gt;Three connected thoughts that keep sharpening.&lt;&#x2F;p&gt;
&lt;p&gt;&lt;strong&gt;Against the grain.&lt;&#x2F;strong&gt; We’re bolting LLMs onto decades of human-oriented software process. Style guides, reviewer agents, architecture prompts. It feels like teaching, but LLMs don’t learn. Every conversation is a cold start. You’re not building understanding, you’re performing it on repeat. Lossy compression of something that resists compression. When you find yourself banging your head against the wall, step back. This is not the way.&lt;&#x2F;p&gt;
&lt;p&gt;&lt;strong&gt;Source and object.&lt;&#x2F;strong&gt; If the LLM is the compiler, code is object code. What’s the new source? Something upstream of syntax – mental models, specs, constraints, intent. The things you already care about but currently express &lt;em&gt;through&lt;&#x2F;em&gt; code.&lt;&#x2F;p&gt;
&lt;p&gt;&lt;strong&gt;Determinism unlocks descent.&lt;&#x2F;strong&gt; Specs and validation are a loss function – complexity, duplication, performance, correctness. LLMs solve problems the way they were trained: iteration, descent. This works when the gradient is clean. Nondeterminism is noise – flaky tests, environment state, race conditions. Enough noise and descent becomes a random walk. The unlock isn’t smarter models. It’s making the environment simulable. WALs, replay, pure functions, hermetic state. Reduce the noise to zero and let the machine grind.&lt;&#x2F;p&gt;
&lt;p&gt;Practical implications: organize code into deterministic runtimes. Inject nondeterminism only at boundaries. Separate rendering from internal state (UIs are hard for machines to introspect; internal state as data is easily verifiable). Follow TigerBeetle’s determinism principles – simulation and replay. Let machines load bugs and iterate to solution in a provable manner.&lt;&#x2F;p&gt;
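&lt;p&gt;A toy sketch of that separation (names are illustrative, not from any particular codebase): keep the core a pure function of state and event, log the events, and any failure replays bit-for-bit.&lt;&#x2F;p&gt;

```python
def step(state, event):
    # Pure core: the same (state, event) always yields the same next state.
    # Nondeterminism (clocks, sockets, randomness) stays at the boundary
    # and enters only as data on the event itself.
    kind, payload = event
    if kind == "deposit":
        return {"balance": state["balance"] + payload}
    if kind == "withdraw":
        return {"balance": state["balance"] - payload}
    return state

def replay(initial, log):
    # A write-ahead log of events is a complete, deterministic repro.
    state = initial
    for event in log:
        state = step(state, event)
    return state
```

&lt;p&gt;Replaying the same log twice must produce identical state – that invariant is what keeps the loss signal clean.&lt;&#x2F;p&gt;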
&lt;p&gt;Don’t look at code. Look at loss.&lt;&#x2F;p&gt;
</content>
        
    </entry>
    <entry xml:lang="en">
        <title>Merge first, bisect later</title>
        <published>2026-02-26T00:00:00+00:00</published>
        <updated>2026-02-26T00:00:00+00:00</updated>
        
        <author>
          <name>
            
              Unknown
            
          </name>
        </author>
        
        <link rel="alternate" type="text/html" href="https://pikach.us/lab-notes/merge-first-bisect-later/"/>
        <id>https://pikach.us/lab-notes/merge-first-bisect-later/</id>
        
        <content type="html" xml:base="https://pikach.us/lab-notes/merge-first-bisect-later/">&lt;p&gt;Software dev from first principles: manual code review doesn’t make sense anymore. It’s too much of a bottleneck. Rather merge and &lt;code&gt;git bisect&lt;&#x2F;code&gt; problems. This shifts risk right towards prod, which means we’ll see more products around software lifecycle infrastructure (canary deploys, automated rollback, observability) and fewer around PR review bots.&lt;&#x2F;p&gt;
&lt;p&gt;Related: I distrust tools I don’t want to use anymore. s&#x2F;cursor&#x2F;claude&#x2F;. Each one felt permanent until it didn’t. The pattern is always the same – more autonomy, less supervision.&lt;&#x2F;p&gt;
</content>
        
    </entry>
    <entry xml:lang="en">
        <title>Newsletter #1: Up and Stumbling</title>
        <published>2026-02-01T00:00:00+00:00</published>
        <updated>2026-02-01T00:00:00+00:00</updated>
        
        <author>
          <name>
            
              Unknown
            
          </name>
        </author>
        
        <link rel="alternate" type="text/html" href="https://pikach.us/blog/newsletter-2026-01/"/>
        <id>https://pikach.us/blog/newsletter-2026-01/</id>
        
        <content type="html" xml:base="https://pikach.us/blog/newsletter-2026-01/">&lt;p&gt;Hello friends, welcome to my first newsletter. Here’s where I’ll keep you apprised of what I’ve been up to – and thinking about. I’m writing this for a couple reasons: to keep myself accountable to writing and to make sure my friends and peers (you) still keep me top of mind.&lt;&#x2F;p&gt;
&lt;p&gt;It’s been almost two months since I left Grafana. I planned (and did, more on that later) to incorporate a new company in January to avoid any tax headaches for 2025. This created a nice delineation between visiting friends and family in December, plus a bit of tinkering, and getting the ball rolling in January.&lt;&#x2F;p&gt;
&lt;h1 id=&quot;before-you-dive-in&quot;&gt;Before you dive in&lt;&#x2F;h1&gt;
&lt;p&gt;I’ll aim to write these with some regularity. A few asks to keep in mind as you read:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;Where should I write this next?&lt;&#x2F;li&gt;
&lt;li&gt;How was the structure and length of the newsletter?&lt;&#x2F;li&gt;
&lt;li&gt;If anyone is connected to a cloud&#x2F;LLM provider’s startup program, I’d love some more cloud credits.&lt;&#x2F;li&gt;
&lt;li&gt;Any ideas I should know about?&lt;&#x2F;li&gt;
&lt;li&gt;Connect me with someone?&lt;&#x2F;li&gt;
&lt;li&gt;If a version of you were in my shoes, what would the you today tell them? Candidness, please.&lt;&#x2F;li&gt;
&lt;li&gt;Please follow&#x2F;like&#x2F;retweet me on &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;x.com&#x2F;castle_vanity&quot;&gt;twitter&lt;&#x2F;a&gt; when I post&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;h1 id=&quot;where-my-head-is-at&quot;&gt;Where my Head is at&lt;&#x2F;h1&gt;
&lt;p&gt;Most of all, I’m trying to separate stimulus from response. Disentangling myself from the firefighting and unrelenting Slack notifications. I’ve known this for a while and want to find my mojo again. My best inspirations come in the quiet, empty breaks in time. So for a while, I’ll give myself that.&lt;&#x2F;p&gt;
&lt;p&gt;Swinging at every pitch is nearly as bad as never swinging, though. I do not intend to be lazy, but restrained.&lt;&#x2F;p&gt;
&lt;h1 id=&quot;timeline&quot;&gt;Timeline&lt;&#x2F;h1&gt;
&lt;p&gt;To avoid this becoming too long, here’s a glance at what kept me busy these months, then I’ll dive into some themes:&lt;&#x2F;p&gt;
&lt;p&gt;December:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;Took apart OpenAI’s &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;openai&#x2F;codex&quot;&gt;Codex&lt;&#x2F;a&gt; repo and reassembled it into a server&#x2F;client runtime&lt;&#x2F;li&gt;
&lt;li&gt;Built a media server for myself&lt;&#x2F;li&gt;
&lt;li&gt;Linked the two with a chatbot, allowing agentic tool use to quickly look up quotes from my media library (via &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;lancedb.com&quot;&gt;LanceDB&lt;&#x2F;a&gt;) and dynamically cut GIFs for my favorite quotes.&lt;&#x2F;li&gt;
&lt;li&gt;Did some thinking about how I wanted to operate a lean, AI-assisted startup in the new year.&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;p&gt;January:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;Retooled my local dev env &amp;amp; workflows&lt;&#x2F;li&gt;
&lt;li&gt;Incorporated under MoonHeron, Inc. (moonheron.com)&lt;&#x2F;li&gt;
&lt;li&gt;Some foundational work for a “pocket data analyst”&lt;&#x2F;li&gt;
&lt;li&gt;Designing a cost-effective and UX-friendly data layer (stream-oriented database)&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;h1 id=&quot;incorporation&quot;&gt;Incorporation&lt;&#x2F;h1&gt;
&lt;p&gt;This one felt great. I’ve always wanted to start my own company, and now I have. After some research and whatnot, I’ve chosen a C Corp in Delaware. I’ve taken care of some tax optimizations too: 83(b) election + QSBS (Qualified Small Business Stock) prep.&lt;&#x2F;p&gt;
&lt;p&gt;I did this all via &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;stripe.com&#x2F;atlas&quot;&gt;Stripe Atlas&lt;&#x2F;a&gt;, Stripe’s product for making startup incorporation easy. Having followed them for years, I’d always planned to use it, and it did not disappoint. Incorporating via Atlas has been &lt;em&gt;great&lt;&#x2F;em&gt;: both straightforward and enlightening. Cannot recommend it highly enough.&lt;&#x2F;p&gt;
&lt;p&gt;I started incorporation two weeks ago and my 83(b) was filed with the IRS yesterday. Banking via &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;mercury.com&quot;&gt;Mercury&lt;&#x2F;a&gt; has been up about a week as well. There are some other things here and there: Google Workspace (you can now find me at owen@moonheron.com!), cloud credits (I got a few grand from AWS via their Atlas partnership), etc.&lt;&#x2F;p&gt;
&lt;p&gt;As for the name MoonHeron: I needed one, ideally a bit memorable with an open domain. Nothing deep here; I’ve always loved herons and the moon. I also got some pretty nice artwork for the homepage.&lt;&#x2F;p&gt;
&lt;hr &#x2F;&gt;
&lt;p&gt;&lt;em&gt;Technical workflow stuff ahead — &lt;a href=&quot;https:&#x2F;&#x2F;pikach.us&#x2F;blog&#x2F;newsletter-2026-01&#x2F;#a-pocket-data-analyst&quot;&gt;skip to “A pocket data analyst”&lt;&#x2F;a&gt; if you’re here for a product.&lt;&#x2F;em&gt;&lt;&#x2F;p&gt;
&lt;h1 id=&quot;retooling&quot;&gt;Retooling&lt;&#x2F;h1&gt;
&lt;p&gt;A combination of free time and software dev’s changing landscape made this a natural moment to reexamine my tools. I tend to do this every few years as things change and I want to try new things. Here’s some new additions:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;zed.dev&quot;&gt;Zed&lt;&#x2F;a&gt;&lt;&#x2F;strong&gt;: Replaced Cursor &amp;amp; VSCode. Performant, opinionated, well-documented. I don’t miss Cursor’s extra AI features — I do most AI work in Claude Code anyway.&lt;&#x2F;li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;tailscale.com&quot;&gt;Tailscale&lt;&#x2F;a&gt;&lt;&#x2F;strong&gt;: VPN without the VPN hassle. OAuth in from anywhere, letting me reach my home network, media server, etc.&lt;&#x2F;li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;ghostty.org&quot;&gt;Ghostty&lt;&#x2F;a&gt;&lt;&#x2F;strong&gt;: Replaced Alacritty+tmux. Zig-based terminal with native GPU acceleration. Sensible defaults, low learning curve.&lt;&#x2F;li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;worktrunk.dev&quot;&gt;Worktrunk&lt;&#x2F;a&gt;&lt;&#x2F;strong&gt;: Replaced my &lt;code&gt;git worktree&lt;&#x2F;code&gt; shell scripts. Essential for bouncing between Claude Code sessions.&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;p&gt;Writing code today is so different. I almost never use Cursor anymore — I suspect it’s a stepping stone, not a destination. Their feature releases usually involve workflows that already exist in terminal-based tools like &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;claude.ai&#x2F;code&quot;&gt;Claude Code&lt;&#x2F;a&gt; and &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;openai.com&#x2F;codex&#x2F;&quot;&gt;Codex&lt;&#x2F;a&gt;, just with a new GUI to learn. I’m using Claude exclusively these days, with some bells and whistles on top.&lt;&#x2F;p&gt;
&lt;h1 id=&quot;less-is-more-focusing-on-token-density&quot;&gt;Less is More: Focusing on ‘Token Density’&lt;&#x2F;h1&gt;
&lt;p&gt;I recently moved away from a documentation-heavy approach, where I had built a multi-folder style guide for my LLMs: tooling choices, architectural patterns, glossaries. These were much better than nothing, but I’ve found the more instruction you give, the less fidelity to any single rule you get. So I rewrote all my supporting agent documentation to favor succinct, illustrative points. I think of this as “token density”. My own docs are similar to &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;tigerstyle.dev&quot;&gt;tigerstyle.dev&lt;&#x2F;a&gt;, the style guide of the TigerBeetle project.&lt;&#x2F;p&gt;
&lt;h1 id=&quot;roleplaying-agent-orchestration&quot;&gt;Roleplaying: Agent Orchestration&lt;&#x2F;h1&gt;
&lt;p&gt;I wanted to optimize my workflow to parallelize LLM output and minimize bottlenecks — both the number of times I need to intervene and the delay when I do (especially as I bounce between Claude sessions).&lt;&#x2F;p&gt;
&lt;p&gt;The old approach: collaboratively plan a feature with Claude, let it build, then repeatedly remind it of the same things — “keep in mind X”, “ensure tests pass”, “can we simplify control flow”. Every repetition was a break in throughput, especially when the next step was trivial.&lt;&#x2F;p&gt;
&lt;p&gt;So I started writing specialized agents with isolated responsibilities. This let me parallelize them and invoke each one for a single set of principles. This part was not novel: AI coding tools all have specialized agents. But I didn’t want to keep nudging them along with the same patterns: ‘design &amp;lt;x&amp;gt;’, ‘review the design in light of our style guide’, ‘keep addressing reviewer feedback until you are both happy, then show me’, etc. The next step was having something orchestrate them the way I would — so I modeled how I think about problems, wrapped it in a state machine, and exposed it as a system prompt. Now Claude loops over itself until it needs my review. Combined with the simpler style guide, this gave me fewer bottlenecks and more efficient execution. Here’s roughly what it looks like:&lt;&#x2F;p&gt;
&lt;pre class=&quot;giallo&quot; style=&quot;color-scheme: light dark; color: light-dark(#657B83, #839496); background-color: light-dark(#FDF6E3, #002B36);&quot;&gt;&lt;code data-lang=&quot;plain&quot;&gt;&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;Workflow:&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;                 ┌───────────┐&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;            ┌───►│  designer │◄───────────────────────────────┐&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;            │    └─────┬─────┘                                │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;            │          │                                      │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;            │          ▼                                      │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;            │    ┌───────────┐                                │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;     revise │    │standardizer                                │ rethink&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;            │    │ reviewer  │                                │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;            │    └─────┬─────┘                                │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;            │          │                                      │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;            └──────────┤ issues?                              │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;                       │                                      │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;                       │ approved                             │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;                       ▼                                      │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;                 ┌───────────┐                                │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;            ┌───►│implementer│◄───────────┐                   │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;            │    └─────┬─────┘            │                   │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;            │          │                  │                   │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;            │          ▼                  │ simplified        │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;            │    ┌───────────┐            │                   │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;       fix  │    │ verifier  │            │                   │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;            │    │standardizer            │                   │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;            │    │ reviewer  │            │                   │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;            │    └─────┬─────┘            │                   │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;            │          │                  │                   │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;            └──────────┤ issues?          │                   │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;                       │                  │                   │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;                       │ complex?         │                   │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;                       ▼                  │                   │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;                 ┌───────────┐            │                   │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;                 │ simplifier├────────────┴───────────────────┘&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;                 └─────┬─────┘&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;                       │&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;                       │ clean&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;                       ▼&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;
&lt;span class=&quot;giallo-l&quot;&gt;&lt;span&gt;                     done&lt;&#x2F;span&gt;&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;I usually run this ‘orchestrator’ version of Claude for longer tasks and when I’m switching back and forth between Claudes. It still has its issues (namely, it stops following the state machine after a while), but it’s useful.
&lt;em&gt;Note&lt;&#x2F;em&gt;: I’m working on a more robust version to address some of these concerns, but that’ll be in the next letter.&lt;&#x2F;p&gt;
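&lt;p&gt;For the curious, the system prompt encodes that diagram as a transition table that Claude walks itself through. Here’s a heavily abridged sketch — the real version carries per-agent instructions, so treat the wording below as illustrative only:&lt;&#x2F;p&gt;
&lt;pre&gt;&lt;code data-lang=&quot;plain&quot;&gt;You are an orchestrator. Track STATE and follow these transitions
until STATE = done, then stop and ask for my review.

STATE = design:    invoke designer, then STATE = design-review
STATE = design-review:
    invoke standardizer + reviewer on the design
    issues   -&gt; STATE = design (revise)
    approved -&gt; STATE = implement
STATE = implement: invoke implementer, then STATE = verify
STATE = verify:
    invoke verifier + standardizer + reviewer
    issues   -&gt; STATE = implement (fix)
    complex  -&gt; STATE = simplify
STATE = simplify:
    invoke simplifier
    simplified -&gt; STATE = implement
    rethink    -&gt; STATE = design
    clean      -&gt; STATE = done
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;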
&lt;h1 id=&quot;a-pocket-data-analyst&quot;&gt;A pocket data analyst&lt;&#x2F;h1&gt;
&lt;p&gt;Talking with some friends outside my sphere helped me realize we need to make AI analysis more quantitative and programmable.&lt;&#x2F;p&gt;
&lt;p&gt;First, I was talking to a friend of mine in DC who works for a communications strategy consultancy. He had a problem where they’d accumulate proprietary, unstructured datasets, but continually analyzing them wasn’t cost-effective. It sounded like he wanted ad-hoc ETL (extract, transform, load) pipelines for unstructured data.&lt;&#x2F;p&gt;
&lt;p&gt;Later that week I met up with the founder of a previous startup I worked for. He was using Claude Code to build one-off business intelligence experiments for his job, collecting arbitrary datasets, then quantifying and visualizing them. While the workflows were ad-hoc (a new codebase for each experiment), AI had dramatically reduced the barrier to analyzing his own business.&lt;&#x2F;p&gt;
&lt;p&gt;I spin up Claude research plans daily to familiarize myself with anything I can think of: from researching companies to collecting restaurant lists (in Mexico City this week!). I started thinking about bringing AI’s analytical capabilities to more structured and long-running tasks. And from there, making them programmable.&lt;&#x2F;p&gt;
&lt;p&gt;I expect this trend will be beneficial everywhere, but more pronounced for small&#x2F;medium businesses that don’t have data analytics teams on tap.&lt;&#x2F;p&gt;
&lt;p&gt;Claude&#x2F;ChatGPT already support research modes, but there’s a big gap between that and the features users would need to actually use them as &lt;em&gt;quantitative&lt;&#x2F;em&gt; analysts. I think they’d need:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;Audit trails: cite sources so claims are verifiable.&lt;&#x2F;li&gt;
&lt;li&gt;Collaborative introspection: explore and sample the dataset; build execution plans with the user.&lt;&#x2F;li&gt;
&lt;li&gt;Iterative refinement: as new edge cases are discovered, escalate them and update the execution plan&#x2F;protocol.&lt;&#x2F;li&gt;
&lt;li&gt;Human in the loop: in the same way a senior would advise a junior, the human should be able to guide and redirect the analysis process, injecting expertise as needed. Move the user up the value chain.&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;p&gt;I’ve been putting together the foundation for this — ‘pocket data analyst’ is the placeholder text on moonheron.com.&lt;&#x2F;p&gt;
&lt;p&gt;One of the harder problems: the data layer. If you want to parallelize LLM-based data processing, what would it look like? LLM computation is unpredictable — it winds and folds back on itself, more of a graph with cycles than one without. You’d need:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;arbitrary connections (agent&amp;lt;&amp;gt;agent communication)&lt;&#x2F;li&gt;
&lt;li&gt;asynchronous work. pause&#x2F;resume and waiting for other agents or human feedback&lt;&#x2F;li&gt;
&lt;li&gt;parallelization, distributing and collecting work as necessary.&lt;&#x2F;li&gt;
&lt;li&gt;cheap, short-lived, on-demand data ‘sinks’ for ad-hoc analytical workloads.&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
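&lt;p&gt;To make those requirements concrete, here’s the shape of the primitive set I keep sketching. Pure pseudocode; every name is hypothetical and nothing here is implemented:&lt;&#x2F;p&gt;
&lt;pre&gt;&lt;code data-lang=&quot;plain&quot;&gt;sink   = data_layer.create_sink(ttl=&quot;24h&quot;)   # cheap, short-lived, on-demand
stream = sink.stream(&quot;extraction&quot;)

stream.append(agent_a, msg)                  # arbitrary agent&lt;&gt;agent connections
sub = stream.subscribe(agent_b)

token = sub.park(until=human_feedback)       # asynchronous: pause&#x2F;resume work
sub.resume(token)

shards  = stream.partition(by=doc_id)        # distribute work in parallel
results = collect(shards)                    # ...and gather it back
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;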
&lt;p&gt;I expect these workloads will be parallelized and predictable, and it’s no surprise the agent infrastructure layer is where so many companies seem to be spending a lot of their effort.&lt;&#x2F;p&gt;
&lt;h1 id=&quot;agent-infrastructure&quot;&gt;Agent Infrastructure&lt;&#x2F;h1&gt;
&lt;p&gt;&lt;em&gt;More technical research ahead — &lt;a href=&quot;https:&#x2F;&#x2F;pikach.us&#x2F;blog&#x2F;newsletter-2026-01&#x2F;#what-i-m-reading&quot;&gt;skip to “What I’m reading”&lt;&#x2F;a&gt; if you don’t want the deep dive.&lt;&#x2F;em&gt;&lt;&#x2F;p&gt;
&lt;p&gt;The switch to asynchronous and unbounded computation with arbitrary connections between nodes is challenging and novel to me. It’s requiring me to rethink a lot from first principles. I think there’s going to be a &lt;em&gt;huge&lt;&#x2F;em&gt; amount of value for new infrastructure projects that can sensibly seize this opportunity. Here are a couple that I think have interesting applications in the post-agent world:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;restatedev&#x2F;restate&quot;&gt;restate&lt;&#x2F;a&gt;: A decoupled data layer and asynchronous runtime for the Actor Model (e.g. Erlang, Akka). Some interesting applications for agent systems. I’d bet against this approach though – the Actor Model has been a bit too obtuse to gain wide adoption historically.&lt;&#x2F;li&gt;
&lt;li&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;tursodatabase&#x2F;agentfs&quot;&gt;agentfs&lt;&#x2F;a&gt;: sqlite backed embedded ‘file system’ for agents&lt;&#x2F;li&gt;
&lt;li&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;neon.com&#x2F;&quot;&gt;neon (acq. Databricks)&lt;&#x2F;a&gt;: copy-on-write, storage&#x2F;compute-decoupled Postgres with instant branching. Basically Iceberg for SQL.&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;h1 id=&quot;catching-up-with-databases-again&quot;&gt;Catching up with databases again&lt;&#x2F;h1&gt;
&lt;p&gt;I said I’d never build another database, and I held that conviction for a month and a half. Now I’m not so sure. Time away from everything let me not force it, and a couple weeks ago I started researching the new approaches projects are taking. I swear I blinked and three days had passed. By and large I’m seeing object stores used to decouple storage and compute, plus embedded operating models. Here are a few favorites:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;turbopuffer.com&quot;&gt;turbopuffer&lt;&#x2F;a&gt;: Really cool little project. Closed source, but they’re building an object-storage-first vector DB. Where they really differentiate themselves is a focus on an &lt;em&gt;online&lt;&#x2F;em&gt; model: continual ingestion without heavy reindexing, based on the really cool &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;arxiv.org&#x2F;abs&#x2F;2410.14452&quot;&gt;“SPFresh” paper&lt;&#x2F;a&gt;.&lt;&#x2F;li&gt;
&lt;li&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;slatedb.io&quot;&gt;slatedb&lt;&#x2F;a&gt;: This thing is awesome: basically an object-storage-backed RocksDB. For those unfamiliar, RocksDB is an embedded key-value database, commonly used as a building block in other DBs. The diskless, object storage approach circumvents replication needs and keeps the operating model simple. I see myself using this in the near future.&lt;&#x2F;li&gt;
&lt;li&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;tigerbeetle.com&quot;&gt;tigerbeetle&lt;&#x2F;a&gt;: Few things have resonated with me as much as this project, particularly its focus on determinism, including fully replicable testing. Their style guide struck a chord and became the base for my agent documentation rewrite.&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;h1 id=&quot;thinking-about-a-new-streaming-data-layer&quot;&gt;Thinking about a new streaming data layer&lt;&#x2F;h1&gt;
&lt;p&gt;A compelling question:&lt;&#x2F;p&gt;
&lt;blockquote&gt;
&lt;p&gt;‘What would streaming data look like in the modern era, from first principles, if Kafka didn’t exist?’&lt;&#x2F;p&gt;
&lt;&#x2F;blockquote&gt;
&lt;p&gt;There are a few players in the new-age streaming space, including &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;warpstream.com&quot;&gt;Warpstream&lt;&#x2F;a&gt;, &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;buf.build&#x2F;product&#x2F;bufstream&quot;&gt;Bufstream&lt;&#x2F;a&gt;, and the new YC-backed startup &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;s2.dev&quot;&gt;s2.dev&lt;&#x2F;a&gt;. They all do some really great things:&lt;&#x2F;p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Warpstream&lt;&#x2F;strong&gt;: pioneered the object storage approach targeting Kafka replacement&lt;&#x2F;li&gt;
&lt;li&gt;&lt;strong&gt;Bufstream&lt;&#x2F;strong&gt;: Kafka replacement with typed streams. Uses their protobuf roots to expose type info, writes to iceberg+parquet. Safer, better compression.&lt;&#x2F;li&gt;
&lt;li&gt;&lt;strong&gt;s2.dev&lt;&#x2F;strong&gt;: Most interesting — discards Kafka entirely, makes streams a lightweight primitive. Object storage backed, per-stream configs. Way more approachable; opens use cases for non-distributed-systems engineers (conversation histories, audit logs, config changes). This could support use cases we were asked about in Grafana Loki but could never satisfy.&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;p&gt;These all have great ideas and I started designing a project to combine them: object-storage-backed, RF1 (replication factor 1), schema-optional, lightweight streams, avoiding the Kafka API, writing to Iceberg&#x2F;Parquet, plus a little BYOC (bring your own cloud) on top. Eventually, I started having some doubts: who is the target user, the one who helps build traction? How was it materially differentiated from s2? I did a little research into these companies and it didn’t seem like they had compelling funding&#x2F;revenue trajectories, which soured me on the idea. I was putting this together because it was fun, not due to a real need.&lt;&#x2F;p&gt;
&lt;p&gt;Time to get out of the weeds, but I’d like to know what you think about this space.&lt;&#x2F;p&gt;
&lt;h1 id=&quot;what-i-m-reading&quot;&gt;What I’m reading&lt;&#x2F;h1&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Breakneck:_China%27s_Quest_to_Engineer_the_Future&quot;&gt;Breakneck - Dan Wang&lt;&#x2F;a&gt;. “America is run by lawyers, and China is run by engineers”. I’ve read his annual letters for years and had it preordered. Cannot recommend highly enough. Candid, playful. If you’re not paying attention to the dance between China and the US and you’d like to, start here. Heavy on technology policy and the cultural impetus behind it.&lt;&#x2F;li&gt;
&lt;li&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;materializedview.io&quot;&gt;materializedview.io&lt;&#x2F;a&gt;. Wow, a lot of candid, no-fluff thoughts on data engineering. Immediately intuitive for those in our shoes, and I’m agreeing with a lot of it.&lt;&#x2F;li&gt;
&lt;li&gt;&lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Stranger_in_a_Strange_Land&quot;&gt;Stranger in a Strange Land&lt;&#x2F;a&gt;. Stalled about 2&#x2F;3 through, but it hasn’t been unenjoyable. Shows its age like a Mike Myers movie.&lt;&#x2F;li&gt;
&lt;&#x2F;ul&gt;
&lt;hr &#x2F;&gt;
&lt;p&gt;That’s all for now. I’m still idea hopping and trying to follow my nose to the intersection of what’s interesting to me and valuable to everyone.&lt;&#x2F;p&gt;
&lt;p&gt;One reflection: posting is painful. I need to stay on that horse — publishing, tweeting, meeting people.&lt;&#x2F;p&gt;
&lt;p&gt;If you made it this far, don’t forget my asks at the top — especially intros and a follow on &lt;a rel=&quot;external&quot; href=&quot;https:&#x2F;&#x2F;x.com&#x2F;castle_vanity&quot;&gt;twitter&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
&lt;p&gt;Till next time,
Owen&lt;&#x2F;p&gt;
</content>
        
    </entry>
</feed>
