Evaluating Mistral OCR 4 on PulseBench-Tab

Sid and Ritvik

June 25, 2026

When Mistral released OCR 4, the obvious question for us, working in this space, was a narrow but vital one: how does it handle tables? Table extraction is one of the toughest parts of document understanding, so we ran OCR 4 through the full PulseBench-Tab document set the same day it shipped.

PulseBench-Tab is our open benchmark for multilingual table extraction, built with academic contributions from S&P Global and spanning 1,820 tables across nine languages. It scores models on T-LAG, a metric designed to reward structurally faithful extraction rather than surface-level text overlap. That is why it tends to expose the gap between models that read a page and models that actually reconstruct the grid underneath it.

Here is where OCR 4 lands against the rest of the field:

PulseBench-Tab T-LAG scores across leading document models on multilingual table extraction.

Model	T-LAG Score
Pulse Ultra 2	93.5%
Mistral OCR 4	86.2%
GPT-5.4 Pro	84.9%
GPT-5.4	84.9%
Claude Opus 4.6	84.5%
Gemini 3.1	81.5%
LlamaParse (Agentic)	79.8%
Reducto (Agentic)	79.5%
Reducto	71.8%

At 86.2 T-LAG, OCR 4 places second on the board behind Pulse Ultra 2, the current state of the art. It finishes ahead of the frontier general-purpose models, including GPT-5.4 Pro, Claude Opus 4.6, and Gemini 3.1, as well as the agentic parsers.

The gap that remains is the one that matters for the documents our customers actually run, where Pulse Ultra 2 holds a roughly seven-point lead at 93.5 T-LAG. That margin comes from the cases PulseBench-Tab was built to stress, including merged cells, nested headers, footnoted figures, and tables that break across pages or languages. Those are exactly the structures where a few points of T-LAG translate into the difference between a clean extraction and one that needs a human to repair it, and they remain the frontier of the problem rather than a solved corner of it.
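For readers who want to check the margins themselves, here is a minimal sketch that recomputes each model's gap to the top score from the table above. The scores are copied from that table; the helper name `gap_to_leader` is our own illustrative choice and is not part of the PulseBench-Tab tooling.

```python
# T-LAG scores (percent) copied from the results table in this post.
scores = {
    "Pulse Ultra 2": 93.5,
    "Mistral OCR 4": 86.2,
    "GPT-5.4 Pro": 84.9,
    "GPT-5.4": 84.9,
    "Claude Opus 4.6": 84.5,
    "Gemini 3.1": 81.5,
    "LlamaParse (Agentic)": 79.8,
    "Reducto (Agentic)": 79.5,
    "Reducto": 71.8,
}


def gap_to_leader(scores):
    """Return each model's deficit to the top score, in T-LAG points.

    Hypothetical helper for illustration only.
    """
    top = max(scores.values())
    # Round to one decimal place to match the precision of the table.
    return {name: round(top - value, 1) for name, value in scores.items()}


gaps = gap_to_leader(scores)

# Print models from smallest to largest gap behind the leader.
for name, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{name:22s} {gap:4.1f}")
```

On these numbers the gap between Pulse Ultra 2 and Mistral OCR 4 comes out to 7.3 points, which is the "roughly seven-point" lead described above.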

The broader takeaway is that table extraction accuracy keeps climbing across the field, which is great for clients and everyone building on document AI. It also confirms why we keep PulseBench-Tab public and keep running new models through it. The full benchmark, methodology, and document set are available for anyone who wants to reproduce these numbers or test against their own corpus.
