AI is paid by the word. That’s not a metaphor — tokens are literally what you pay for. That’s why AI loves to be verbose: generating more tokens is a feature, not a bug. Verbose token streams are literally how these models reason, and they are how most AI is monetized. “Vibe coding,” “thinking,” “planning,” “agents” … layers upon layers of less and less visible internal narrative that you pay for. This only gets worse as the price per token (the number the pricing page shows) falls and the Jevons paradox kicks in: cheaper tokens mean more tokens consumed, not fewer. Almost no attention goes to tokens used per outcome.
We do have an out here. Instead of “thinking,” give the model good “indices” of what to look up. Instead of giving your AI intern an unlimited budget to think, give them good reference material. In AI terms this means “embeddings” — a kind of semantic compression. Most AI systems already use them, but almost no one uses them as references to offload thinking. I call this approach “flat embeddings.” It’s like a map: it’s really hard to make a good map, but really easy to use one. A “flat embedding” may be hard to build, but it’s well worth it because it is extremely easy to use and reuse. IMO most embedding work focuses on compression, not on ease of retrieval.
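A minimal sketch of the general idea: embed the reference material once up front, then answer lookups with cheap similarity search rather than paying a model to generate its way to the answer. The toy bag-of-words “embedding,” the document names, and the `lookup` helper here are all hypothetical illustrations, not LGND’s actual method; a real system would use a learned embedding model.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy stand-in for an embedding: a bag-of-words count vector.
    # A real system would call a learned embedding model here.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Making the "map" is the expensive, one-time step:
# embed every reference document up front.
docs = {
    "returns": "how to process a customer return and refund",
    "shipping": "estimated shipping times and carrier options",
    "warranty": "warranty coverage and repair policy",
}
index = {doc_id: embed(text) for doc_id, text in docs.items()}

def lookup(query: str) -> str:
    # Using the map is cheap: one embedding plus a few dot products,
    # instead of an open-ended stream of "thinking" tokens.
    q = embed(query)
    return max(index, key=lambda doc_id: cosine(q, index[doc_id]))

print(lookup("what is the refund policy for a return"))  # → returns
```

The point of the sketch is the cost asymmetry: the index is built once, and every subsequent lookup reuses it for roughly the cost of a single embedding call.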
The next wave of AI shouldn’t be another agentic layer on top of an already bloated stack. It should flatten it. And part of this will involve “flat embeddings” — which is exactly the kind of IP we are building at LGND AI for Earth.
Originally posted on LinkedIn.