Compare the cost of running Textract, Bedrock Data Automation, or a Bedrock LLM across your actual document volume.
AWS offers three paths for Intelligent Document Processing, and each one prices differently. Amazon Textract bills per page with rates that vary by feature — basic OCR runs around $1.50 per 1,000 pages, while structured extraction with forms or tables can reach $50 to $65 per 1,000 pages. Amazon Bedrock Data Automation uses flat per-document pricing, handles classification and extraction through a single API, and is the recommended starting point for most new IDP builds. Bedrock with an LLM (Nova, Claude, or a custom model) gives you the most flexibility but prices on tokens, so cost depends heavily on document length and extraction complexity. This calculator covers all three so you can compare approaches against your actual workload before committing to an architecture.
The estimate includes the primary processing cost for your selected approach: Textract page-processing fees, Bedrock Data Automation per-document charges, or Bedrock LLM token costs for extraction. Amazon S3 storage and data transfer are not included. AWS Lambda and Step Functions costs for workflow orchestration are not included (at typical document volumes these are minor, generally under five percent of processing cost).
Document complexity affects cost more than volume. A single-page invoice processed with AnalyzeExpense costs less than a multi-page contract processed with AnalyzeDocument Forms plus Queries. If your workflow uses multiple Textract features on the same page, each feature is billed separately.
Cost varies by service and document type, but the best way to know which approach works for your application is to test with your own documents. Our free IDP POC guide walks you through testing Textract, Bedrock Data Automation, and Bedrock LLMs in your own AWS account — no code or complex setup required. Use this calculator alongside it to understand what the solution that works best for you will actually cost.
We've pulled together a complete resource for testing Intelligent Document Processing in your own AWS account. Use our guide to learn the strengths of different IDP services and get step-by-step instructions for testing with your own documents.
Textract bills per page, but the rate depends entirely on which API you use. DetectDocumentText (basic OCR) runs about $1.50 per 1,000 pages. AnalyzeDocument with Forms costs $50 per 1,000 pages. Tables add $15 per 1,000 pages on top of that. AnalyzeExpense — the purpose-built API for invoices and receipts — runs $8 to $10 per 1,000 pages and is generally more accurate for financial documents than using AnalyzeDocument with Forms. If you're calling multiple features on the same page, each one is billed separately. Volume discounts apply above one million pages per month.
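The per-feature billing described above can be turned into a quick estimate. This is a minimal sketch, not the calculator's actual logic. The rate table uses the approximate prices quoted in this section (AnalyzeExpense is taken at the upper $10 figure), the function and key names are hypothetical, and volume discounts above one million pages are deliberately not modeled.

```python
# Approximate USD rates per 1,000 pages, copied from the figures quoted above.
# Illustrative only: confirm against current AWS pricing before relying on them.
RATES_PER_1000_PAGES = {
    "detect_text": 1.50,  # DetectDocumentText (basic OCR)
    "forms": 50.00,       # AnalyzeDocument with Forms
    "tables": 15.00,      # Tables, billed on top of any other feature used
    "expense": 10.00,     # AnalyzeExpense, upper end of the $8-$10 range
}


def textract_monthly_cost(pages: int, features: list[str]) -> float:
    """Estimate monthly cost when every listed feature runs on every page.

    Each feature is billed separately, so the per-page rates simply add up.
    Volume discounts are not modeled in this sketch.
    """
    if pages < 0:
        raise ValueError("pages must be non-negative")
    rate_per_1000 = sum(RATES_PER_1000_PAGES[name] for name in features)
    return round(pages / 1000 * rate_per_1000, 2)


if __name__ == "__main__":
    # 20,000 pages with Forms and Tables on every page: 20 * (50 + 15)
    print(textract_monthly_cost(20_000, ["forms", "tables"]))  # 1300.0
```

Running the same 20,000 pages through "detect_text" alone comes to $30, which shows how much the choice of API drives the total.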
Bedrock Data Automation (BDA) is AWS's managed IDP service, released in late 2024. It handles document classification, data extraction, and summarization through a single API (no prompt engineering or pipeline orchestration required). Pricing is per document processed, based on page count, making costs straightforward to forecast. BDA is often a recommended starting point for new IDP builds.
Textract is worth evaluating when you process high volumes of the same standardized document type (invoices, receipts, ID documents, or mortgage packages) and have an engineering team comfortable building a custom pipeline. At 50,000+ pages per month with consistent document formats, Textract's per-page rates and volume discounts often make it more cost-efficient than BDA's flat pricing. Textract also makes sense when you need deterministic output (same document always returns the same result) or when your compliance requirements restrict use of managed generative AI services. For variable document types, novel formats, or teams without dedicated ML engineering capacity, BDA is often the faster option.
That said, there's no substitute for testing with your own documents: results with these services have varied widely across the clients we've worked with.
Only the Bedrock LLM mode includes model inference costs. When you select a foundation model for extraction, the calculator adds token-based inference costs using current AWS pricing for that model. The Textract and BDA modes do not add a separate inference charge because those services use AWS-managed models that are already priced into the per-page or per-document rate.
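Token-based pricing can be approximated with simple arithmetic. The sketch below is an assumption about how such an estimate might be structured, not the calculator's implementation. The function name, the tokens-per-page figure, and both per-1,000-token prices are hypothetical placeholders; substitute the current AWS price for the model you select.

```python
def llm_extraction_cost(
    documents: int,
    pages_per_document: float,
    input_tokens_per_page: float,
    output_tokens_per_document: float,
    input_price_per_1k_tokens: float,
    output_price_per_1k_tokens: float,
) -> float:
    """Estimate monthly token cost for LLM-based extraction.

    Input tokens grow with document length; output tokens grow with how
    much data is extracted from each document.
    """
    input_tokens = documents * pages_per_document * input_tokens_per_page
    output_tokens = documents * output_tokens_per_document
    cost = (
        input_tokens / 1000 * input_price_per_1k_tokens
        + output_tokens / 1000 * output_price_per_1k_tokens
    )
    return round(cost, 2)


if __name__ == "__main__":
    # All numbers below are made-up placeholders, not real model prices.
    estimate = llm_extraction_cost(
        documents=10_000,
        pages_per_document=3,
        input_tokens_per_page=1_000,
        output_tokens_per_document=500,
        input_price_per_1k_tokens=0.001,
        output_price_per_1k_tokens=0.004,
    )
    print(estimate)  # 50.0
```

Doubling the page count doubles the input side of the estimate, which is why document length matters so much in this mode.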
For straightforward workflows (one document type, moderate volume, and BDA or a single Textract API), this calculator gets you close enough to build a business case.
Cost modeling gets harder when you're processing multiple document types with different pipeline branches, combining Textract with Bedrock for hybrid extraction, modeling human-in-the-loop review frequency, or sizing for production volumes with SLA requirements. Those architectures have compounding cost paths that depend on design decisions made upstream.
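To see why these cost paths compound, consider a blended estimate across document types with different processing and review costs. This is a rough sketch under stated assumptions: the DocumentType fields, the shares, the per-document costs, and the review rates are all hypothetical values chosen for illustration, and a real pipeline would have more branches.

```python
from dataclasses import dataclass


@dataclass
class DocumentType:
    """Hypothetical description of one branch of a multi-type pipeline."""
    name: str
    share: float            # fraction of monthly volume (shares sum to 1.0)
    processing_cost: float  # USD per document for automated extraction
    review_rate: float      # fraction of documents sent to a human reviewer
    review_cost: float      # USD per document reviewed


def blended_monthly_cost(volume: int, doc_types: list[DocumentType]) -> float:
    """Expected monthly cost: automated processing plus expected review cost."""
    total = 0.0
    for doc in doc_types:
        per_document = doc.processing_cost + doc.review_rate * doc.review_cost
        total += volume * doc.share * per_document
    return round(total, 2)


if __name__ == "__main__":
    # Placeholder figures only; none of these come from AWS pricing.
    mix = [
        DocumentType("invoice", 0.7, 0.01, 0.05, 0.50),
        DocumentType("contract", 0.3, 0.20, 0.20, 2.00),
    ]
    print(blended_monthly_cost(10_000, mix))  # 2045.0
```

In this made-up mix, contracts are 30 percent of volume but account for most of the total, because a higher processing cost and a higher review rate multiply together.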
The other complication is a service that doesn't perform as well as expected on your documents. That is less a question of cost than of performance, but it is another scenario where outside guidance helps.
Tech 42's IDP Accelerate Program includes architecture review and AWS cost modeling as part of the two-week engagement, and it's often funded through AWS programs.