February 21, 2025
3 min read

Why Financial Institutions Are Abandoning Legacy OCR Tools

Following our recent funding announcement, we've been overwhelmed by messages from enterprises struggling with legacy OCR tools. Financial institutions are dominating our inbound interest, and it's easy to see why.

After extensively testing AWS Textract, Google Gemini, Azure Document Intelligence, and other solutions for processing complex financial documents (10-Ks, earnings reports, CIMs), we've uncovered the following critical failure modes:

  • Nested-table “blindness”: missing columns, shuffled column order, and similar structural errors
  • Failed or inaccurate extraction of pie charts and bar graphs into structured output
  • Poor numerical precision, with occasionally missing currency markers and inconsistent formatting

We’ve attached an example below comparing these providers/LLMs with Pulse so you can judge for yourself! The best test is the visual test. 

Ground truth source: Blackstone

As you can see in our comparison with Gemini 2.0 Flash on a private equity CIM, the LLM shifts columns and hallucinates ~50% of values. We have many examples like this, across all industries, so let us know if you want to see more!
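For readers who want to put a number on a comparison like this, here is a minimal sketch of a cell-level check between a ground-truth table and an extracted one. It is only an illustration, not necessarily how the comparison above was scored: the function names and the sample figures are hypothetical, and tables are assumed to be plain lists of rows of strings.

```python
from typing import List

Table = List[List[str]]


def normalize(cell: str) -> str:
    """Collapse whitespace so cosmetic spacing differences are not counted as errors."""
    return " ".join(cell.split())


def cell_mismatch_rate(truth: Table, extracted: Table) -> float:
    """Fraction of ground-truth cells whose extracted counterpart differs or is missing."""
    total = 0
    wrong = 0
    for r, truth_row in enumerate(truth):
        # A missing row in the extraction counts every cell in that row as wrong.
        extracted_row = extracted[r] if r < len(extracted) else []
        for c, expected in enumerate(truth_row):
            total += 1
            actual = extracted_row[c] if c < len(extracted_row) else None
            if actual is None or normalize(actual) != normalize(expected):
                wrong += 1
    return wrong / total if total else 0.0


# Hypothetical data: the extraction dropped one column, shifting values left.
truth = [
    ["Revenue", "$1,200", "$1,350"],
    ["EBITDA", "$300", "$340"],
]
shifted = [
    ["Revenue", "$1,350", ""],
    ["EBITDA", "$340", ""],
]

rate = cell_mismatch_rate(truth, shifted)
print(f"Mismatched cells: {rate:.0%}")
```

Because the comparison is on exact strings, a dropped currency marker ("300" instead of "$300") counts as a mismatch, which is the stricter behavior you would likely want for financial figures.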
