LLM Neuroanatomy III: Why RYS Works — The Language-Agnostic Middle

Probing a 27B model shows its middle layers organise by meaning, not by language or format — weak evidence against the strong Sapir-Whorf hypothesis, and the reason RYS works.

Mar 26, 2026 LLMs, Research

LLM Neuroanatomy: How I Topped the LLM Leaderboard Without Changing a Single Weight

In mid-2024, the HuggingFace Open LLM Leaderboard was the Colosseum for Open-Weight AI. Thousands of models were battling it out, submitted by both well-funded labs with teams of PhDs and fine-tuni...

Mar 10, 2026 LLMs, Research

Building the Beam Universe Splitter II: Building a Quantum LLM

TL;DR: Here is the NotebookLM Podcast audio version. In Part 1, I built a Quantum Random Number Generator out of a pair of old lab-equipment photomultiplier tubes, a 50:50 beam splitter, and an F...

Jul 22, 2026 hardware, LLMs, quantum mechanics

Building the Beam Universe Splitter I: A Quantum Magic 8-Ball

Sometimes I get weird projects stuck in my head. I’ve always wanted to do a “quantum” hardware project; in my PhD and postdoc I’d messed about making fluorescent dyes and proteins that were technic...

Jun 25, 2026 hardware, FPGA, quantum mechanics

2x GH200 for LLM inference, Part 3: GLM-5.2, expert offload, and the CPU question

Part 1 measured the dual GH200 workstation as a memory system. Part 2 used those measurements to explain why DeepSeek V4 Flash can be fast in vLLM when the model layout fits the hardw...

Jun 17, 2026 LLMs, workstations

Building & Benchmarking: LLMs on a 16GB Jetson Orin NX for Hermes Agent

Small AI computers are usually sold with large dreams and shitty memory buses. I have a ridiculous server that pulls a few kilowatts, but I wanted a local Hermes Agent box that cou...

Jun 9, 2026 LLMs, workstations

2x GH200 for LLM inference, Part 2: vLLM, DeepSeek V4 Flash/Pro, and MTP

A while back I did some optimisation on my Hopper system for MiniMax M2.1, and this was followed by some deeper GH200 benchmarking, where I measured the machine as a memory-shuffling ...

Jun 8, 2026 LLMs, workstations

What 2x GH200 delivers: memory paths for LLM inference

This article is mostly for me, as a way to record the peculiarities of my server; but it might come in handy for the ~3 other people running a home Grace-Hopper server? In a previous ...

Apr 25, 2026 LLMs, workstations

LLM Neuroanatomy II: Modern LLM Hacking and hints of a Universal Language?

In Part 1, I described how duplicating a block of seven middle layers in Qwen2-72B — no weight changes, no training — produced the #1 model on the HuggingFace Open LLM Leaderboard. The method, whic...

Mar 22, 2026 LLMs, Research

Arrhenius Integrals, IR Lasers, and Cooking Proteins

May 2016, Munich. I had just joined NanoTemper Technologies as a Bioanalytics Scientist. If you aren’t familiar with NanoTemper, they build high-end biophysical instruments. At the ti...

Feb 25, 2026 protein engineering, biophysics