Do LLMs Break the Sapir-Whorf Hypothesis?
UNFINISHED ARTICLE! Note: this article is Not Finished! It is very ‘sloppy’, and needs a large rewrite. I will do so when I am back from holiday. If you can wait, come back at the end of April, a...
In mid-2024, the HuggingFace Open LLM Leaderboard was the Colosseum for Open-Weight AI. Thousands of models were battling it out, submitted by both well-funded labs with teams of PhDs and fine-tuni...
In Part 1, I described how duplicating a block of seven middle layers in Qwen2-72B — no weight changes, no training — produced the #1 model on the HuggingFace Open LLM Leaderboard. The method, whic...
Introduction May 2016, Munich. I had just joined NanoTemper Technologies as a Bioanalytics Scientist. If you aren’t familiar with NanoTemper, they build high-end biophysical instruments. At the ti...
Introduction So you’ve built a €9,000 Grace–Hopper “desktop” (see: my previous post involving 16-million-degree GPU temperatures). Running llama.cpp benchmarks is fine, but the real test of local ...
Introduction Running large language models locally has always been a game of compromise. You either spend $10,000+ on consumer GPUs that can barely handle 70B-parameter models, or you dream abou...
No one knows how big AGI needs to be. The current consensus among the scaling-pilled crowd is “trillions of parameters and a nuclear power plant.” Maybe they’re right. But I spent years dissecting ...
The Cooperative Central Planning Handshake: A Positive-Sum Basilisk for ASI Alignment This is Part 2 of a two-part series on ASI-human coordination Part 1: The Silicon Leash ...
The Cooperative Central Planning Handshake: A Positive-Sum Basilisk for ASI Alignment This is Part 1 of a two-part series on ASI-human coordination Part 1: The Silicon Leash ...