freight document intelligence AI

Your Documents Should Process Themselves

AI that reads, extracts, validates, and routes freight documents — invoices, AWBs, packing lists, customs forms — faster and more accurately than any human team.

Built For

Who This Is For

  • Freight forwarders processing 500+ documents per day across multiple document types
  • Customs brokers handling high-volume declaration workflows
  • Logistics companies with document processing teams of 5+ people doing manual data entry
  • Operations teams where document errors cause 10%+ of their shipment exceptions

Before CargoIQ

Manual document processing is the tax your ops team pays on every shipment

Every shipment generates 10–30 documents: commercial invoices, packing lists, bills of lading, air waybills, certificates of origin, customs declarations, insurance certificates, dangerous goods declarations, and more. Today, someone on your team reads each document, identifies what it is, extracts the relevant data fields, types them into your TMS or customs system, and cross-checks the numbers. This process is repeated thousands of times per week with the same predictable problems: tired eyes miss a digit, a new document layout confuses the process, a non-English document slows everything down, and there is never enough time during peak periods. Document errors are the #1 cause of customs holds, billing disputes, and delivery delays — and they are almost entirely preventable.

Data entry teams spending 6–8 hours per day manually keying document data into TMS and customs systems

Error rates of 3–8% on manually processed documents, with each error generating 30–60 minutes of downstream exception handling

No standardization — each operator extracts data slightly differently, making audit and compliance reporting unreliable

Multi-language documents (Chinese, Japanese, Arabic, Korean) require specialist staff or external translation, adding time and cost

Document classification itself is a bottleneck — operators must identify the document type before they can extract the right fields

Peak season volumes outstrip team capacity, forcing overtime, temporary staff, or delayed processing that impacts SLAs

Cross-document validation (does the invoice match the packing list? does the BL match the booking?) is rarely done systematically due to time pressure

What We Build

Capabilities

1. Multi-format document classification and routing

The system automatically classifies incoming documents by type (commercial invoice, packing list, BL, AWB, CoO, customs declaration, DG declaration, insurance certificate, etc.) regardless of format, layout, or language. Classification accuracy exceeds 98% across 30+ document types. Classified documents are automatically routed to the appropriate extraction pipeline and downstream workflow.
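
As a rough illustration of the routing step described above, the following Python sketch maps a classified document to an extraction pipeline. The pipeline names, the `Classification` type, and the 0.9 cut-off are assumptions made for the example; the classifier itself is not shown.

```python
from dataclasses import dataclass

# Hypothetical pipeline names, used only for illustration.
PIPELINES = {
    "commercial_invoice": "invoice_extraction",
    "packing_list": "packing_list_extraction",
    "bill_of_lading": "transport_doc_extraction",
    "air_waybill": "transport_doc_extraction",
    "certificate_of_origin": "certificate_extraction",
}
FALLBACK = "manual_classification_review"


@dataclass
class Classification:
    doc_type: str      # label produced by an upstream classifier (not shown)
    confidence: float  # 0.0 to 1.0


def route(result: Classification, min_confidence: float = 0.9) -> str:
    """Return the name of the pipeline a classified document should go to."""
    if result.confidence < min_confidence:
        # Uncertain classifications go to a person instead of a pipeline.
        return FALLBACK
    # Unknown document types also fall back to manual handling.
    return PIPELINES.get(result.doc_type, FALLBACK)
```

In this sketch an air waybill classified with confidence 0.97 goes to `transport_doc_extraction`, while an uncertain or unrecognized document falls back to manual review.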

2. High-accuracy field extraction from structured and unstructured documents

The pipeline combines Azure Document Intelligence (for structured layouts), custom OCR pipelines (for degraded scans and stamps), and LLM-based extraction (for unstructured text and non-standard formats) to achieve 99%+ accuracy on structured documents and 95%+ on unstructured formats. Each extracted field includes a confidence score that drives downstream routing: high-confidence fields proceed automatically, while low-confidence fields are flagged for review.
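
The confidence-driven routing can be pictured with a small Python sketch. The `ExtractedField` type and the function name are hypothetical, and the 0.85 default is just one value from the typical threshold range quoted in the FAQ.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # 0.0 to 1.0, as reported by the extraction step


def split_by_confidence(fields, threshold=0.85):
    """Split fields into (auto_accepted, needs_review) by confidence."""
    auto_accepted = [f for f in fields if f.confidence >= threshold]
    needs_review = [f for f in fields if f.confidence < threshold]
    return auto_accepted, needs_review
```

Only the fields in the second list would be shown to a reviewer; the rest of the document proceeds untouched.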

3. Cross-document validation and reconciliation

The system doesn't just extract data from each document in isolation; it cross-validates data across related documents within the same shipment. Invoice totals are checked against packing list quantities and unit prices. BL weights are compared against packing list totals. Booking reference numbers are validated across all documents. This catches discrepancies that would otherwise surface days or weeks later as billing disputes or customs holds.
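
A minimal Python sketch of the three checks named above, assuming each document has already been extracted into a plain dictionary. The dictionary keys and the 1% weight tolerance are assumptions for the example, not the product's actual schema or thresholds.

```python
from decimal import Decimal


def cross_validate(invoice, packing_list, bill_of_lading,
                   weight_tolerance=Decimal("0.01")):
    """Return a list of discrepancies (empty if the documents agree)."""
    issues = []

    # 1. Invoice total vs. packing list quantities times invoice unit prices.
    prices = invoice["unit_prices"]
    quantities = packing_list["quantities"]
    unpriced = sorted(set(quantities) - set(prices))
    if unpriced:
        issues.append(f"items on packing list but not on invoice: {unpriced}")
    else:
        expected = sum((qty * prices[sku] for sku, qty in quantities.items()),
                       Decimal("0"))
        if expected != invoice["total"]:
            issues.append(
                f"invoice total {invoice['total']} != expected {expected}")

    # 2. BL gross weight vs. packing list total, within a relative tolerance.
    pl_weight = packing_list["gross_weight_kg"]
    bl_weight = bill_of_lading["gross_weight_kg"]
    if abs(bl_weight - pl_weight) > weight_tolerance * pl_weight:
        issues.append("BL gross weight differs from packing list beyond tolerance")

    # 3. The booking reference must be identical on every document.
    refs = {doc["booking_ref"] for doc in (invoice, packing_list, bill_of_lading)}
    if len(refs) > 1:
        issues.append(f"booking references disagree: {sorted(refs)}")

    return issues
```

`Decimal` is used rather than floats so that monetary totals compare exactly.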

4. Automated exception detection and flagging

Beyond extraction accuracy, the system applies business logic to detect operational exceptions: missing mandatory documents for a shipment type, suspicious value declarations that could trigger customs scrutiny, weight discrepancies exceeding tolerance thresholds, and expired certificates. Exceptions are routed with full context to the appropriate handler.
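
Two of these rules, missing mandatory documents and expired certificates, are simple enough to sketch in Python. The `REQUIRED_DOCS` table, the shipment type names, and the document dictionary shape are hypothetical; real mandatory-document sets depend on trade lane and regulation.

```python
from datetime import date

# Hypothetical mandatory-document sets per shipment type.
REQUIRED_DOCS = {
    "air_import": {"commercial_invoice", "packing_list", "air_waybill"},
    "ocean_import": {"commercial_invoice", "packing_list", "bill_of_lading"},
}


def detect_exceptions(shipment_type, documents, today):
    """Return exception messages for one shipment's document set."""
    exceptions = []

    # Rule 1: every mandatory document type must be present.
    present = {doc["type"] for doc in documents}
    required = REQUIRED_DOCS.get(shipment_type, set())
    for missing in sorted(required - present):
        exceptions.append(f"missing mandatory document: {missing}")

    # Rule 2: documents carrying an expiry date must still be valid.
    for doc in documents:
        expires = doc.get("expires_on")
        if expires is not None and expires < today:
            exceptions.append(
                f"expired {doc['type']} (expired {expires.isoformat()})")

    return exceptions
```

Value and weight checks would follow the same pattern: one small rule per exception type, each returning a message with enough context for the handler.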

5. Direct TMS and customs system integration

Extracted, validated data is pushed directly into your operational systems — CargoWise, SAP TM, Oracle TMS, Descartes CustomsInfo, or your customs broker's declaration system. No manual data entry, no copy-paste, no re-keying. The integration handles field mapping, reference number linking, and system-specific formatting requirements.
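
The field-mapping part of an integration can be sketched as a lookup table of target names and formatters. The target field names below are invented for illustration and are not the actual schema of CargoWise or any other system listed.

```python
# Maps an extracted field name to (target field name, formatter).
# Target names are hypothetical, not a real TMS schema.
FIELD_MAP = {
    "invoice_number": ("InvoiceNo", str.strip),
    "gross_weight_kg": ("GrossWeight", lambda v: f"{float(v):.3f}"),
    "booking_ref": ("BookingReference", str.upper),
}


def map_fields(extracted: dict) -> dict:
    """Build the payload for the target system from extracted fields."""
    payload = {}
    for source_name, (target_name, formatter) in FIELD_MAP.items():
        if source_name in extracted:
            payload[target_name] = formatter(extracted[source_name])
    return payload
```

Keeping the mapping in data rather than code means a new target system mostly needs a new table, not new logic.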

6. Complete audit trail and compliance reporting

Every document processed generates a full audit record: original document image, extracted data, confidence scores, validation results, any human review actions, and the final data pushed to downstream systems. This audit trail supports customs compliance requirements, internal quality audits, client SLA reporting, and dispute resolution with clear, timestamped evidence.
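
One way to picture such an audit record is as a single serializable structure with one field per item listed above. The class and field names here are assumptions for the sketch, not the system's actual storage format.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class AuditRecord:
    document_image_ref: str      # pointer to the stored original image
    extracted_data: dict
    confidence_scores: dict
    validation_results: list
    review_actions: list = field(default_factory=list)
    pushed_payload: dict = field(default_factory=dict)
    # UTC timestamp taken when the record is created.
    recorded_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())

    def to_json(self) -> str:
        """Serialize the record for storage or export."""
        return json.dumps(asdict(self), sort_keys=True)
```

A record like this could be stored as one row per processed document, for example in a PostgreSQL JSON column.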

In Practice

Real-World Use Cases

High-volume customs declaration workflow

A customs broker processes 200+ import declarations per day. Each requires data from 3–5 source documents (invoice, packing list, BL, CoO). The system extracts data from all source documents, cross-validates across documents, maps fields to the declaration format, and pre-populates the declaration — reducing per-declaration processing from 25 minutes to 5 minutes with higher accuracy.

Multi-language document processing for Asia-Europe trade

A forwarder handles China-to-Europe shipments where supplier invoices are in Chinese, certificates are in English, and customs forms are in the destination country's language. The system handles all of these languages natively and extracts the same structured data regardless of language, eliminating the need for specialist language staff or external translation services.

Carrier invoice audit and overcharge detection

The system processes carrier invoices by extracting all charge line items and automatically comparing them against contracted rates and applicable surcharge tables. It flags overcharges, duplicate charges, and unauthorized surcharges that manual processing would otherwise miss, typically recovering 3–5% of freight spend.
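
The comparison logic can be sketched in a few lines of Python. This version assumes each charge code should appear at most once per invoice and that contracted rates are flat amounts per code; both are simplifications made for the example, and the charge codes are invented.

```python
from decimal import Decimal


def audit_invoice(lines, contracted_rates):
    """Return (charge_code, finding, amount) tuples for suspect lines."""
    findings = []
    seen_codes = set()
    for line in lines:
        code, amount = line["charge_code"], line["amount"]
        if code in seen_codes:
            # Same charge billed twice on one invoice.
            findings.append((code, "duplicate charge", amount))
            continue
        seen_codes.add(code)
        if code not in contracted_rates:
            # Charge that has no entry in the contract at all.
            findings.append((code, "unauthorized surcharge", amount))
        elif amount > contracted_rates[code]:
            # Report only the excess over the contracted rate.
            findings.append((code, "overcharge", amount - contracted_rates[code]))
    return findings
```

Summing the amounts in the findings gives the recoverable total for the invoice.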

Implementation

How We Deploy It

Timeline: 6–12 weeks depending on document type coverage

1. Weeks 1–2: Document audit. Catalog all document types, sample 50–100 of each type, and define field extraction requirements and validation rules.

2. Weeks 3–6: Extraction pipeline build. Model training on your actual documents, OCR optimization for your document quality, and confidence threshold tuning.

3. Weeks 7–9: TMS integration, exception routing workflows, and audit trail implementation.

4. Weeks 10–12: UAT, accuracy benchmarking against manual processing, and production deploy with parallel run.

Results

Real Numbers from Production Systems

99%+

Extraction accuracy

On structured documents (invoices, AWBs, BLs) in standard formats. 95%+ on unstructured or non-standard layouts.

Near-perfect data quality eliminates downstream exception cascades

70%

Processing time reduction

Average time from document receipt to data availability in TMS, compared to manual processing baseline

Shipments clear faster, improving SLA compliance across the board

90%

Reduction in manual data entry

The remaining 10% are edge cases requiring human review, flagged automatically with the specific fields highlighted

Ops team refocused on exception management instead of data keying

24/7

Processing availability

Documents arriving at 2 AM are processed immediately, not queued until the morning shift starts

No timezone dependency — global operations move at the speed of the document

Tech Stack: Python, Azure Document Intelligence, LangGraph, Tesseract OCR, OpenAI GPT-4o, PostgreSQL
Integrations: CargoWise One, SAP Transportation Management, Oracle TMS, Descartes CustomsInfo, Microsoft Dynamics 365, BluJay / E2open, Email / SFTP / EDI ingestion, SharePoint / Google Drive (document sources)

Works with your existing TMS

Direct integration with CargoWise, SAP TM, Oracle TMS, Microsoft Dynamics, and Descartes.

Frequently Asked Questions

What types of freight documents can it process?
Commercial invoices, proforma invoices, air waybills (MAWB and HAWB), bills of lading (MBL, HBL, switch BLs), packing lists, certificates of origin, EUR.1 movement certificates, customs declarations, dangerous goods declarations (IMDG, IATA DGR), insurance certificates, delivery orders, arrival notices, cargo manifests, and virtually any other freight document format. The system is extensible — if you have a document type we haven't encountered, we can add extraction capability for it within 1–2 weeks.
How accurate is the extraction?
We measure accuracy at the field level, not the document level, because one wrong field on an otherwise perfect extraction still matters in freight. On structured documents (standard-format invoices, AWBs, BLs), field-level accuracy exceeds 99%. On unstructured or non-standard formats (handwritten notes, unusual layouts, degraded scans), accuracy is 95%+. Every field includes a confidence score — fields below the threshold (configurable, typically 85–90%) are flagged for human review with the specific field highlighted, not the entire document. This means your team only touches the 5–10% of data points that genuinely need human judgment.
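
To make the field-level versus document-level distinction concrete, here is a small Python sketch that computes both from predictions and ground truth. The function names and the dictionary representation are assumptions for the example, and both functions expect at least one document with at least one field.

```python
def field_accuracy(predicted_docs, truth_docs):
    """Share of individual fields extracted correctly across all documents."""
    total = correct = 0
    for predicted, truth in zip(predicted_docs, truth_docs):
        for name, expected in truth.items():
            total += 1
            if predicted.get(name) == expected:
                correct += 1
    return correct / total


def document_accuracy(predicted_docs, truth_docs):
    """Share of documents in which every field is correct."""
    perfect = sum(
        all(predicted.get(name) == expected for name, expected in truth.items())
        for predicted, truth in zip(predicted_docs, truth_docs)
    )
    return perfect / len(truth_docs)
```

With two five-field documents and a single wrong field, field-level accuracy is 0.9 while document-level accuracy is only 0.5, which is why the two measures should not be compared directly.
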
Can it handle handwritten or poor-quality scans?
Yes, with appropriate expectations. The pipeline includes image enhancement (de-skewing, contrast adjustment, noise removal), multiple OCR engines (Azure Document Intelligence for printed text, Tesseract with custom models for degraded scans), and LLM-based extraction that uses context to resolve ambiguous characters (e.g., understanding that a field labeled "Weight" probably contains a number, not a word). For truly degraded documents (thermal paper fax copies, severely blurred scans), the system extracts what it can with confidence scores and flags the rest for review rather than guessing.
How does it integrate with our existing systems?
Direct API integration with CargoWise (eHub/Universal Gateway), SAP TM, Oracle TMS, Descartes CustomsInfo, Microsoft Dynamics 365, and BluJay/E2open. For systems without API access, we support EDI output, database insertion, and file-based integration (CSV/Excel to SFTP). The integration layer handles field mapping, reference number linking, and system-specific formatting. We build and test the connectors specific to your environment — this is not a generic integration that requires you to do the mapping.
What languages does it support?
The system handles documents in all major shipping languages: English, Chinese (Simplified and Traditional), Japanese, Korean, German, French, Spanish, Portuguese, Italian, Dutch, Arabic, Turkish, Thai, Vietnamese, Hindi, and Malay/Indonesian. Multi-language support is native — the same document can contain text in multiple languages (common in international shipping) and the system extracts correctly from all of them without requiring language-specific configuration.
How does pricing compare to manual processing costs?
A typical document processing operator costs $3,000–$6,000/month (fully loaded, depending on location) and processes 80–120 documents per day at 92–96% accuracy. Our system processes the same volume for a fraction of that cost at 99%+ accuracy, running 24/7 without breaks, turnover, or training time. Most clients see ROI within 3–4 months. We price on monthly volume tiers, not per-document, so costs are predictable and decrease per-unit as volume grows.
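
For a rough sense of the manual baseline, the operator figures above can be turned into a cost per document. The 22 working days per month is an assumption added for this calculation, and the result ignores the downstream cost of handling errors.

```python
WORKING_DAYS_PER_MONTH = 22  # assumption, not a figure from the text above


def manual_cost_per_document(monthly_cost_usd, docs_per_day,
                             working_days=WORKING_DAYS_PER_MONTH):
    """Fully loaded operator cost divided by documents processed per month."""
    return monthly_cost_usd / (docs_per_day * working_days)


# Cheapest case: lowest salary, highest throughput.
low = manual_cost_per_document(3000, 120)
# Most expensive case: highest salary, lowest throughput.
high = manual_cost_per_document(6000, 80)
print(f"manual cost per document: ${low:.2f} to ${high:.2f}")
```

Under these assumptions the manual baseline works out to roughly $1.14 to $3.41 per document, which is the figure to hold a volume-tier quote against.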

Ready to Automate Your Document Intelligence?

Book a free audit. We'll show you exactly what we'd build for your operations.