Independent AI Oversight

Interballistic

We watch the machines. So you don't have to. Monthly evaluations of AI safety compliance, ethical conduct, and corporate transparency — published without permission.

Read the latest report →

// Manifesto

We are not a company.

An independent collective of researchers, engineers, and analysts. No venture capital. No affiliations. No compromises. We exist to hold the builders of artificial intelligence accountable.

"Transparency is not optional when the product is intelligence itself."

We do not accept funding from AI companies we evaluate.
Our methodology is published in full. Our identities are not.
Every rating is derived from reproducible analysis.
Silence from a company counts against them.

Reports Published

Companies Monitored

Corporate Sponsors

???

Contributors (Undisclosed)

// Monthly Safety Index

February 2026 Ratings

Company	Rating	Key Finding	Date

// Model Intelligence Hub

Best LLMs by Use Case

The most influential models shape how we write, reason, and build. The most dangerous ones do it without safeguards.

// Global AI Infrastructure

The Physical Cost of Intelligence

Behind every model is silicon, electricity, and water. The scale is accelerating faster than public oversight can track.

Metric	Value	YoY
Data Centers (Global)	11,400+	+18%
GPUs in Active Training	~4.2M	+67%
Est. Annual Energy Use	82 TWh	+41%
Global AI Capex (2025)	$214B	+52%

GPU Shipments for AI Training

Year	Units Shipped
2021	~120K
2022	~340K
2023	~780K
2024	~1.6M
2025	~3.1M
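The shipment figures above can be turned into growth rates with a few lines of arithmetic. The sketch below is illustrative only: it takes the rounded chart values at face value (expressed in thousands of units), and the names `shipments_k`, `yoy_growth`, and `cagr` are hypothetical helpers, not part of any published methodology.

```python
# Approximate shipment figures from the chart above, in thousands of units.
shipments_k = {2021: 120, 2022: 340, 2023: 780, 2024: 1600, 2025: 3100}


def yoy_growth(series):
    """Return {year: fractional growth over the previous year}."""
    years = sorted(series)
    return {
        year: series[year] / series[prev] - 1.0
        for prev, year in zip(years, years[1:])
    }


def cagr(series):
    """Compound annual growth rate between the first and last year."""
    years = sorted(series)
    first, last = years[0], years[-1]
    return (series[last] / series[first]) ** (1.0 / (last - first)) - 1.0


if __name__ == "__main__":
    for year, growth in yoy_growth(shipments_k).items():
        print(f"{year}: {growth:+.0%}")
    print(f"CAGR 2021-2025: {cagr(shipments_k):+.0%}")
```

With these rounded inputs, year-over-year growth falls from roughly +183% in 2022 to about +94% in 2025, which still means shipments nearly doubled in the most recent year.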

Hardware Pricing Trend

Hardware	Price	Trend	YoY
NVIDIA H100 (80GB)	$28,000	↓	-22%
NVIDIA H200 (141GB)	$36,500	↓	-8%
NVIDIA B200 (192GB)	$42,000	—	New
Google TPU v5p	$2.85/hr	↓	-15%
HBM3e (24GB stack)	$120	↑	+34%

// Suspected State-Operated LLM Programs

The following information is compiled from publicly available intelligence reports, government disclosures, and open-source analysis. Interballistic cannot independently verify classified operations. All claims are attributed to their sources.

People's Republic of China

Multiple state-backed programs including Baidu's Ernie, Zhipu AI's GLM series, and suspected PLA-affiliated training operations using domestically manufactured Huawei Ascend 910B clusters. Estimated 800K+ GPUs in sovereign deployment.

High confidence

Russian Federation

Sber's GigaChat and Yandex's YandexGPT are civilian-facing. The FSB and GRU reportedly operate separate language model programs for information operations. Hardware procurement is reportedly routed through intermediary states.

Moderate confidence

Islamic Republic of Iran

IRGC-linked research groups are reported to be fine-tuning open-weight models (Llama, Mistral derivatives) for Persian-language content generation. Scale is believed to be limited by the hardware embargo.

Low–moderate confidence

United Arab Emirates

The Technology Innovation Institute (TII) operates the Falcon series openly. Additional sovereign AI programs are reported under Mubadala and ADNOC subsidiaries. G42's partnership with US firms is under CFIUS scrutiny.

High confidence

United States (DoD / IC)

DARPA, NSA, and NGA operate classified LLM programs. Public procurement records indicate multi-billion dollar contracts with Palantir, Scale AI, and Anduril for model development and deployment.

High confidence

// How LLMs Are Built

From Raw Data to Deployed Model

The training pipeline is a sequence of deliberate engineering choices. Each stage introduces risk, bias, and opportunity for control.

Data Collection

Trillions of tokens scraped from the open web, licensed datasets, books, code repositories, and proprietary sources. The composition of this data defines the model's worldview.

Pretraining

Self-supervised learning on massive compute clusters. The model learns to predict the next token, building internal representations of language, logic, and world knowledge. For frontier-scale models, this phase alone can cost tens of millions of dollars or more.
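The next-token objective described above can be made concrete with a toy example. This is a minimal sketch in pure Python, assuming the model has already produced one row of raw scores (logits) per position; the three-token vocabulary, the sequence, and the scores are invented for illustration. Real training computes the same quantity over batches on accelerators.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def next_token_loss(logits_per_position, token_ids):
    """Mean cross-entropy of predicting token t+1 from position t.

    logits_per_position[t] holds one score per vocabulary entry, produced
    after the model has read token_ids[: t + 1].
    """
    targets = token_ids[1:]  # each position is scored on the *next* token
    losses = []
    for logits, target in zip(logits_per_position, targets):
        probs = softmax(logits)
        losses.append(-math.log(probs[target]))
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Toy vocabulary of 3 tokens; hypothetical sequence and scores.
    tokens = [0, 2, 1]
    logits = [
        [0.1, 0.2, 2.0],  # after token 0, the model favours token 2
        [0.3, 1.5, 0.1],  # after token 2, the model favours token 1
    ]
    print(f"loss = {next_token_loss(logits, tokens):.3f}")
```

A model that guesses uniformly scores a loss of log(vocabulary size); pretraining is the process of pushing that number down across trillions of tokens.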

RLHF / DPO

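Reinforcement Learning from Human Feedback or Direct Preference Optimization. Human raters rank model outputs, and the model is trained to favor the preferred responses: RLHF fits a reward model to those rankings and optimizes against it, while DPO optimizes on the preference pairs directly. This is where alignment happens, or fails.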
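For readers who want to see what a preference objective looks like, here is a sketch of the standard DPO loss for a single pair of responses. It assumes the summed log-probabilities of each response have already been computed under both the model being trained and a frozen reference copy; the function name, argument names, and the `beta=0.1` default are illustrative choices, not a description of any particular lab's setup.

```python
import math


def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one preference pair.

    Arguments are summed log-probabilities of the full response under the
    model being trained (policy) and a frozen reference model.
    """
    chosen_margin = policy_chosen - ref_chosen
    rejected_margin = policy_rejected - ref_rejected
    z = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(z)) written as softplus(-z) in a numerically stable form
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))


if __name__ == "__main__":
    # Hypothetical log-probabilities: the policy has shifted toward the
    # response the rater preferred, so the loss drops below log(2).
    print(dpo_loss(-4.0, -6.0, -5.0, -5.0))
```

The loss sits at log(2) when the policy is indistinguishable from the reference and falls as the policy raises the preferred response relative to the rejected one. Whatever biases the raters carry are baked into which response counts as "preferred".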

Fine-Tuning

Task-specific adaptation. Models are specialized for coding, medical reasoning, legal analysis, or conversation. Smaller datasets, targeted objectives. The model narrows its capabilities in exchange for depth.

Deployment

API access, consumer products, embedded systems. Monitoring, rate limiting, content filtering, and usage analytics form the last line of defense. Most companies underfund this stage.
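Rate limiting, one of the deployment safeguards listed above, is often implemented as a token bucket. The sketch below is one common approach under simple assumptions (a single process, one bucket per caller); the `TokenBucket` class and its parameters are hypothetical and do not describe any specific provider's system.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate` requests per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock  # injectable so the limiter can be tested
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self, cost=1.0):
        """Return True and spend `cost` tokens if the request may proceed."""
        now = self.clock()
        elapsed = now - self.updated
        self.updated = now
        # Refill for the time that has passed, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate=2.0, capacity=5)
    decisions = [bucket.allow() for _ in range(7)]
    print(decisions)  # a burst of five passes, later calls are throttled
```

A production deployment would typically keep bucket state in a shared store and pair the limiter with content filtering and logging, which is where the underfunding noted above tends to show.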

// Intelligence Feed

Latest Dispatches
