Christopher Olah
Pioneer of interpretability research at Anthropic. His essays on neural network internals (circuits, features, mechanistic interpretability) are foundational reading.