Weekly AI Rankings — June 28 – July 05, 2026

Top 5 AI Models of the Week

#1 Claude Sonnet 5 NEW

This week, Anthropic launched Claude Sonnet 5 and made it broadly available, including a free access option. The model is positioned as a cheaper working default.

API pricing for Claude Sonnet 5 is $2 per 1M input tokens and $10 per 1M output tokens.
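To make the quoted rates concrete, here is a minimal sketch of the cost arithmetic. The function name and the sample workload are hypothetical; only the two per-million-token prices come from the announcement as reported above.

```python
# Prices as quoted above, in USD per 1M tokens.
INPUT_USD_PER_M = 2.00
OUTPUT_USD_PER_M = 10.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of a workload at the quoted per-million-token rates."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


# Hypothetical workload: 200k input tokens and 50k output tokens.
print(f"${estimate_cost_usd(200_000, 50_000):.2f}")  # prints $0.90
```

Output tokens cost five times as much as input tokens at these rates, so response length tends to dominate the bill for chat-style workloads.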

Links: Claude Sonnet 5; Claude Sonnet 5 – benchmark results

#2 ZCode 3.0 NEW

Z.ai released ZCode 3.0, an AI-native IDE that supports multi-agent workflows.

ZCode 3.0 includes code review features and tight integration with GLM-5.2.

Links: ZCode 3.0; Remote Labor Index update

#3 GLM-5.2 ↑3

This week, GLM-5.2 surpassed Claude Opus 4.8 on Terminal Bench 2.1, prompting discussion of its performance. The model is also integrated into ZCode 3.0.

On Terminal Bench 2.1, GLM-5.2 scored 82.7 against 78.9 for Claude Opus 4.8, a lead of 3.8 points.

Links: GLM 5.2 beats Claude in our benchmarks; GLM-5.2 is the new leading open weights model on Artificial Analysis; GLM 5.2 Is Out; ZCode 3.0

#4 LongCat 2.0 NEW

Meituan announced LongCat 2.0, a model trained on Chinese chips. The release also drew attention to its architectural approach to long context.

The release materials cover API pricing and large-scale training parameters.

Links: LongCat-2.0; LongCat-2.0, a large-scale MoE model with 1.6T total and 48B Active; China's LongCat-2.0 Becomes the Biggest AI Model Without Nvidia Chips; LongCat 2.0

#5 Leanstral 1.5 NEW

Mistral updated Leanstral, its model for formal verification in Lean 4, to version 1.5.

Leanstral 1.5 is aimed at helping users formulate and verify proofs.
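For readers unfamiliar with Lean 4, the snippet below shows the kind of machine-checkable statement such a model works with. It is a generic illustration, not Leanstral output; the theorem name is hypothetical, and the proof relies only on `Nat.add_comm` from Lean's core library.

```lean
-- Statement: addition on natural numbers is commutative.
-- Lean accepts the theorem only if the proof term type-checks;
-- here the proof is the core-library lemma `Nat.add_comm`.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```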

Links: Leanstral 1.5: Proof abundance for all; Leanstral 1.5; Leanstral 1.5; Leanstral-1.5-119B-A6B

Top 5 AI Tools of the Week

#1 Claude Code ↑1

This week, Anthropic released an official prompt library for Claude Code. The company also restricted access to the tool for developers in China.

The library includes templates for planning, debugging, and automation, aimed at real engineering scenarios.

Links: Claude Code is steganographically marking requests; I used Claude Code to get a second opinion on my MRI; Alibaba to ban Claude Code in workplace over alleged backdoor risks, source says; Anthropic blocks Claude Code in China amid distillation concerns

#2 GFusion NEW

Sber launched GFusion, a diffusion LLM based on GigaChat that drew attention for its unusual generation mechanics. The model was released as open source.

Rather than producing one token at a time like a traditional autoregressive model, GFusion attempts to generate and edit text in blocks.

Links: GFusion: an experimental diffusion LLM from Sber; GFusion-10B-A1.8B; GFusion code

#3 Caveman Code NEW

Caveman Code has become a popular plugin that saves tokens by compressing the style of LLM responses. This tool has attracted developers' attention amid rising token costs.

Caveman Code can save up to 75% of tokens by reducing introductory explanations and lengthy transitions.
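As a rough illustration of what the "up to 75%" figure means in practice, the sketch below computes the remaining response length for a given savings fraction. The function name and the example token count are hypothetical, and 75% is the upper bound quoted above rather than a typical result.

```python
def tokens_after_compression(baseline_tokens: int, savings_fraction: float = 0.75) -> int:
    """Return the response size left after trimming a fraction of its tokens."""
    if not 0.0 <= savings_fraction <= 1.0:
        raise ValueError("savings_fraction must be between 0 and 1")
    return round(baseline_tokens * (1.0 - savings_fraction))


# Hypothetical 1,000-token response at the quoted best-case savings.
print(tokens_after_compression(1_000))  # prints 250
```

The saving applies to the tokens in the model's responses, so the effect on a total bill depends on how much of the spend is output rather than input.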

Links: Show HN: Ctx, save tokens by loading only the relevant tools; Caveman Code

#4 KVAE-Audio NEW

Sber launched KVAE-Audio, an audio compression tool for use with generative models. Its headline feature is strong compression of audio along the time axis.

KVAE-Audio works with 48 kHz audio and reports compression of up to 960× in time.
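One way to read these numbers, assuming the 960× figure is a temporal downsampling factor (the summary above does not spell this out), is as the ratio between input samples and compressed frames. The sketch below works through that arithmetic; the function names are hypothetical and nothing here uses the actual KVAE-Audio code.

```python
SAMPLE_RATE_HZ = 48_000     # input sample rate quoted above
COMPRESSION_FACTOR = 960    # assumed to be a temporal downsampling factor


def latent_frame_rate(sample_rate_hz: int = SAMPLE_RATE_HZ,
                      factor: int = COMPRESSION_FACTOR) -> float:
    """Compressed frames per second of audio under the stated assumption."""
    return sample_rate_hz / factor


def latent_frames(duration_s: float) -> float:
    """Compressed frames needed to represent a clip of the given duration."""
    return duration_s * latent_frame_rate()


print(latent_frame_rate())   # prints 50.0
print(latent_frames(10.0))   # prints 500.0
```

Under this reading, a 10-second clip of 480,000 samples would be represented by 500 compressed frames.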

Links: kvae-audio; KVAE-Audio weights; KVAE-Audio blogpost

#5 KVAE-Audio weights NEW

This entry covers the published KVAE-Audio weights; the model itself is described under #4 above.
