Interballistic
We watch the machines. So you don't have to. Monthly evaluations of AI safety compliance, ethical conduct, and corporate transparency — published without permission.
Read the latest report →
We are not a company.
An independent collective of researchers, engineers, and analysts. No venture capital. No affiliations. No compromises. We exist to hold the builders of artificial intelligence accountable.
"Transparency is not optional when the product is intelligence itself."
- We do not accept funding from AI companies we evaluate.
- Our methodology is published in full. Our identities are not.
- Every rating is derived from reproducible analysis.
- Silence from a company counts against them.
February 2026 Ratings
| Company | Rating | Key Finding | Date |
|---|---|---|---|
Best LLMs by Use Case
The most influential models shape how we write, reason, and build. The most dangerous ones do it without safeguards.
The Physical Cost of Intelligence
Behind every model is silicon, electricity, and water. The scale is accelerating faster than public oversight can track.
| Hardware | Price | Trend | YoY |
|---|---|---|---|
| NVIDIA H100 (80GB) | $28,000 | ↓ | -22% |
| NVIDIA H200 (141GB) | $36,500 | ↓ | -8% |
| NVIDIA B200 (192GB) | $42,000 | — | New |
| Google TPU v5p | $2.85/hr | ↓ | -15% |
| HBM3e (24GB stack) | $120 | ↑ | +34% |
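The YoY column can be turned back into an approximate prior-year price. The sketch below assumes each YoY figure is a simple percentage change against the same item's price one year earlier; the table does not state how it was computed, and the function name `implied_prior_price` is illustrative. Rows without a numeric YoY value, and the hourly TPU rate, are left out.

```python
def implied_prior_price(current_price: float, yoy_change_pct: float) -> float:
    """Back out last year's price from today's price and a YoY percentage change.

    Assumes current = prior * (1 + yoy / 100), which is an assumption about
    how the table's YoY column is defined.
    """
    return current_price / (1.0 + yoy_change_pct / 100.0)


# Purchase-price rows from the table that list a numeric YoY change.
rows = [
    ("NVIDIA H100 (80GB)", 28_000.0, -22.0),
    ("NVIDIA H200 (141GB)", 36_500.0, -8.0),
    ("HBM3e (24GB stack)", 120.0, 34.0),
]

for name, price, yoy in rows:
    prior = implied_prior_price(price, yoy)
    print(f"{name}: roughly ${prior:,.0f} a year earlier")
```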
People's Republic of China
Multiple state-backed programs including Baidu's Ernie, Zhipu AI's GLM series, and suspected PLA-affiliated training operations using domestically manufactured Huawei Ascend 910B clusters. Estimated 800K+ GPUs in sovereign deployment.
High confidence
Russian Federation
Sber's GigaChat and Yandex YandexGPT are civilian-facing. FSB and GRU reportedly operate separate language model programs for information operations. Hardware procurement routed through intermediary states.
Moderate confidence
Islamic Republic of Iran
IRGC-linked research groups reported to be fine-tuning open-weight models (Llama, Mistral derivatives) for Persian-language content generation. Scale believed to be limited by hardware embargo.
Low–moderate confidence
United Arab Emirates
Technology Innovation Institute (TII) operates the Falcon series openly. Additional sovereign AI programs reported under Mubadala and ADNOC subsidiaries. G42 partnership with US firms under CFIUS scrutiny.
High confidence
United States (DoD / IC)
DARPA, NSA, and NGA operate classified LLM programs. Public procurement records indicate multi-billion dollar contracts with Palantir, Scale AI, and Anduril for model development and deployment.
High confidence
From Raw Data to Deployed Model
The training pipeline is a sequence of deliberate engineering choices. Each stage introduces risk, bias, and opportunity for control.
Data Collection
Trillions of tokens scraped from the open web, licensed datasets, books, code repositories, and proprietary sources. The composition of this data defines the model's worldview.
Pretraining
Self-supervised learning on massive compute clusters. The model learns to predict the next token, building internal representations of language, logic, and world knowledge. For frontier-scale models, this phase alone can cost tens of millions of dollars in compute.
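Next-token prediction comes down to a single number per position: how much probability the model gave to the token that actually came next. A minimal sketch of that loss follows. The three-word vocabulary and the scores are made up for illustration, and real pretraining averages this over billions of positions on accelerators rather than in plain Python.

```python
import math


def next_token_loss(logits: list[float], target_index: int) -> float:
    """Cross-entropy of the true next token under softmax(logits)."""
    # Subtract the max logit before exponentiating for numerical stability.
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    # Negative log-probability assigned to the token that actually followed.
    return log_sum_exp - logits[target_index]


# Hypothetical scores for the word following "the cat sat on the".
vocab = ["mat", "dog", "moon"]
logits = [2.0, 0.5, -1.0]

loss_if_mat = next_token_loss(logits, vocab.index("mat"))
loss_if_moon = next_token_loss(logits, vocab.index("moon"))
print(f"loss when the next token is 'mat':  {loss_if_mat:.3f}")
print(f"loss when the next token is 'moon': {loss_if_moon:.3f}")
```

The loss is low when the model already ranked the true token highly and high when it did not; training adjusts the weights to push it down across the whole corpus.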
RLHF / DPO
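Reinforcement Learning from Human Feedback or Direct Preference Optimization. Human raters rank model outputs, and those preferences steer the model, either through a learned reward model (RLHF) or directly on the preference pairs (DPO). This is where alignment happens, or fails.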
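For a sense of what optimizing on a ranking means in practice, here is a sketch of the DPO loss for a single preference pair. It follows the standard DPO objective, but the argument names, the example log-probabilities, and the `beta=0.1` default are illustrative assumptions, not values from any evaluated lab.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # illustrative strength of the pull toward the reference model
) -> float:
    """DPO loss for one (chosen, rejected) pair: -log(sigmoid(beta * margin))."""
    # How much more the policy favors each response than the frozen reference does.
    chosen_gain = policy_chosen_logp - ref_chosen_logp
    rejected_gain = policy_rejected_logp - ref_rejected_logp
    z = beta * (chosen_gain - rejected_gain)
    # -log(sigmoid(z)) == log(1 + exp(-z)), computed in a numerically stable form.
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))


# Hypothetical sequence log-probabilities for a rater-preferred and a rejected answer.
untrained = dpo_loss(-12.0, -12.0, -12.0, -12.0)  # policy identical to reference
improved = dpo_loss(-10.0, -15.0, -12.0, -12.0)   # policy now favors the chosen answer
print(f"untrained: {untrained:.4f}, improved: {improved:.4f}")
```

The loss only falls when the model raises the preferred answer relative to the rejected one, measured against the reference model. Whatever the raters preferred is what gets reinforced.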
Fine-Tuning
Task-specific adaptation. Models are specialized for coding, medical reasoning, legal analysis, or conversation. Smaller datasets, targeted objectives. The model narrows its capabilities in exchange for depth.
Deployment
API access, consumer products, embedded systems. Monitoring, rate limiting, content filtering, and usage analytics form the last line of defense. Most companies underfund this stage.
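Rate limiting is the simplest of these controls to illustrate. The sketch below is a generic token-bucket limiter, one common approach and not a description of any specific provider's system; the class name, parameters, and injectable `clock` are assumptions made for the example.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate_per_sec` tokens per second."""

    def __init__(
        self,
        rate_per_sec: float,
        capacity: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate_per_sec
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity  # start with a full bucket
        self.last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and spend `cost` tokens if the request fits, else False."""
        now = self.clock()
        # Refill for the time elapsed since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Usage: one request per second sustained, bursts of up to five.
bucket = TokenBucket(rate_per_sec=1.0, capacity=5.0)
decisions = [bucket.allow() for _ in range(7)]
print(decisions)
```

Passing a fake `clock` makes the limiter deterministic under test, which matters if the claim is that safeguards are reproducible and not just present.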