Anthropic Natural Language Autoencoders

Updated 16 May 2026

https://www.anthropic.com/research/natural-language-autoencoders

Interesting technique.

Take an activation vector, use words as the latent space, and map back to an activation vector. Reading the words in the middle helps figure out what the model is thinking internally.
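
To make the round trip concrete for myself, here is a toy sketch of the shape of the idea: activation -> words -> activation, then check how close the reconstruction is. Everything in it is an assumption for illustration and is not taken from the linked article: the four-word vocabulary `VOCAB`, its hand-picked one-hot directions, and the helpers `encode`, `decode`, and `cosine` are all hypothetical. A real system would presumably use learned models for both directions rather than a fixed lookup table.

```python
import math

# Hypothetical toy vocabulary: each word owns one direction in a 4-d "activation" space.
VOCAB = {
    "cat":   [1.0, 0.0, 0.0, 0.0],
    "sleep": [0.0, 1.0, 0.0, 0.0],
    "code":  [0.0, 0.0, 1.0, 0.0],
    "rain":  [0.0, 0.0, 0.0, 1.0],
}

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def encode(activation, vocab, k=2):
    """Activation -> text: keep the k words whose directions align best."""
    ranked = sorted(vocab, key=lambda w: dot(activation, vocab[w]), reverse=True)
    return " ".join(ranked[:k])

def decode(text, vocab):
    """Text -> activation: sum the directions of the words in the text."""
    dim = len(next(iter(vocab.values())))
    out = [0.0] * dim
    for word in text.split():
        for i, x in enumerate(vocab[word]):
            out[i] += x
    return out

def cosine(a, b):
    """Cosine similarity, used here as a simple reconstruction score."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

activation = [0.9, 0.8, 0.1, -0.2]         # made-up vector standing in for a real activation
text = encode(activation, VOCAB)           # the word bottleneck
reconstruction = decode(text, VOCAB)       # back to activation space
print(text, round(cosine(activation, reconstruction), 3))
```

The words in `text` are the readable part; the cosine score says how much of the original vector survived the trip through them. In this toy, the small components along "code" and "rain" are lost, which is the kind of information a short text bottleneck drops.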