Research
We approach AI as a scientific discipline
We publish our research openly across interpretability, alignment, and evaluation, because understanding how these systems work is a prerequisite for making them safe.
Interpretability
We map the internal representations of Kili models to understand the mechanisms behind their outputs, moving beyond black-box evaluation toward genuine transparency.
Alignment
We develop training methodologies, including Constitutional training, that systematically steer models toward honesty, helpfulness, and harmlessness without sacrificing capability.
Evaluation
We design rigorous benchmarks that measure reliability, reasoning depth, and agentic performance across long-horizon, multi-step tasks in realistic environments.