Research

We approach AI as a scientific discipline

We publish our research openly across interpretability, alignment, and evaluation, because understanding how these systems work is a prerequisite for making them safe.

Interpretability

We map the internal representations of Adamas models to understand the mechanisms behind their outputs, so that model behavior can be explained rather than only measured by black-box evaluation.

Alignment

We develop training methodologies, including Constitutional training, that systematically steer models toward honesty, helpfulness, and harmlessness without sacrificing capability.

Evaluation

We design rigorous benchmarks that measure reliability, reasoning depth, and agentic performance across long-horizon, multi-step tasks in realistic environments.

Publications

Model update

Adamas 1.1 — refined instruction following and tool-use reliability

Constitutional training for honest, harmless language assistants

Measuring agentic reliability across long-horizon task sequences

Mapping feature circuits in Adamas 1 at the 7B scale

Adamas 1 — compute-optimal pre-training at the 7B scale

Introducing the Merkium Alignment Index