Introducing Pulse Ultra 2

Sid and Ritvik

April 21, 2026

After a year of building a proprietary training corpus, a dedicated human annotation pipeline, and an entirely new model architecture, we're releasing Pulse Ultra 2: our highest-accuracy document extraction model to date.

Pulse Ultra 2 is built on a completely rearchitected pipeline:

Reduced network hops. The previous Pulse architecture routed documents through multiple discrete stages, each a separate service call: OCR, layout detection, table detection, cell extraction, and structure reconstruction. Pulse Ultra 2 collapses these stages into a unified end-to-end architecture. Fewer hops, fewer failure points, fewer places for errors to compound.

End-to-end model architecture. Instead of chaining independent models that each solve a subtask, Pulse Ultra 2 processes the full page in a single forward pass that jointly handles layout understanding, cell detection, text recognition, and structure prediction. This eliminates the error propagation problem, where an upstream OCR mistake cascades into a downstream structure error.
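To make the compounding effect concrete, here is a rough back-of-the-envelope sketch. The per-stage accuracies are hypothetical numbers chosen only for illustration, not Pulse measurements, and the calculation assumes stage errors are independent, which real pipelines only approximate.

```python
import math

# Hypothetical per-stage accuracies for a chained pipeline (illustrative only).
stage_accuracy = {
    "ocr": 0.98,
    "layout_detection": 0.98,
    "table_detection": 0.98,
    "cell_extraction": 0.98,
    "structure_reconstruction": 0.98,
}

# If errors are independent, a page is fully correct only when every stage
# succeeds, so the per-stage accuracies multiply.
chained = math.prod(stage_accuracy.values())

print(f"{chained:.4f}")  # 0.9039
```

Under these assumptions, five stages that are each 98% accurate yield a chain that is right only about 90% of the time, which is the kind of loss a single jointly trained model aims to avoid.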

Merged cell handling. Colspan and rowspan detection was the single biggest accuracy bottleneck in the previous generation. Pulse Ultra 2 models cell spanning as a native output rather than a post-processing heuristic applied after grid detection.
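As an illustration of what treating spans as a first-class output can look like, the sketch below represents each cell with its own rowspan and colspan and expands the cells into a dense grid. The `Cell` type and `expand_to_grid` helper are hypothetical names invented for this example. They are not the Pulse output schema or API.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """Hypothetical cell record: position plus native span information."""
    row: int
    col: int
    text: str
    rowspan: int = 1
    colspan: int = 1


def expand_to_grid(cells, n_rows, n_cols):
    """Fill every grid position a cell covers with that cell's text."""
    grid = [[None] * n_cols for _ in range(n_rows)]
    for cell in cells:
        for r in range(cell.row, cell.row + cell.rowspan):
            for c in range(cell.col, cell.col + cell.colspan):
                if grid[r][c] is not None:
                    raise ValueError(f"cells overlap at ({r}, {c})")
                grid[r][c] = cell.text
    return grid


# A small table with a two-row header: "Region" spans two rows,
# "Revenue" spans two columns.
cells = [
    Cell(0, 0, "Region", rowspan=2),
    Cell(0, 1, "Revenue", colspan=2),
    Cell(1, 1, "Q1"),
    Cell(1, 2, "Q2"),
    Cell(2, 0, "EMEA"),
    Cell(2, 1, "10"),
    Cell(2, 2, "12"),
]
grid = expand_to_grid(cells, n_rows=3, n_cols=3)
```

When the spans arrive with the cells, the consumer never has to guess which neighboring grid slots belong together, which is the guesswork a post-processing heuristic has to do.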

Multilingual OCR. We rebuilt text recognition for CJK (Chinese, Japanese, Korean), Arabic and other right-to-left scripts, and Cyrillic. Our previous models were trained predominantly on Latin-script documents. Pulse Ultra 2 was trained on a multilingual corpus spanning 100+ languages.

Table boundary detection. Improved handling of multi-table pages, borderless tables, and tables embedded in complex layouts. These gains come from both architecture and data improvements.

Given the model's high computational demand, we're rolling out Pulse Ultra 2 with rate limits for non-enterprise accounts while we scale capacity. Reach out to our team using the link here for higher rate limits or dedicated capacity.
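If you are calling the model from a script while rate limits are in effect, retrying with exponential backoff is a common client-side pattern. The sketch below is generic: `RateLimited` and `call_with_backoff` are hypothetical names, and how a rate-limited response is actually signaled is an assumption to check against the docs.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical error raised by your own client code on a rate-limit response."""


def call_with_backoff(request, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call request(), retrying with exponential backoff when rate-limited."""
    for attempt in range(max_attempts):
        try:
            return request()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of retries, surface the error to the caller
            # Wait 1s, 2s, 4s, ... plus up to 10% random jitter so that
            # many clients do not retry in lockstep.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
```

Here `request` is any zero-argument callable that performs one extraction call and raises `RateLimited` when the service asks you to slow down.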

Link to the docs here
