How Kaaj Uses Generative AI to Classify Loan Documents and Turbo-Charge Underwriting

Shivi Sharma

A single small business loan credit package can contain 20-plus PDFs, and every minute an underwriter spends renaming “01.03.24-final(3).pdf” is a minute not spent pricing risk or winning business. Kaaj fixes that with a hybrid of generative AI, both vision-language models (VLMs) and large language models (LLMs), and proven traditional ML. The system reads files, tags them against our 20+ business-friendly categories, and routes each one to the right verification step. The result: “unknown” rates cut by 78%, borrowers getting an answer in minutes rather than days, and cleaner data for credit teams to underwrite on.

What Is Document Chaos and How Does It Slow Underwriting?

Manual triage is the hidden tax on every deal—yet few credit teams track it. Let’s unpack the pain.

Volume: a Monday-morning fire-hose

Picture this: it’s 8:05 a.m. and a deal lands in your submissions inbox with a ZIP file labeled “Acme_Corp_Small_Business_Loan.zip.” Inside sit 28 PDFs plus a handful of stray JPEGs. Some are four-page bank statements; others are ID scans, tax returns, invoices, insurance certs, and the obligatory “Final-revised-actual-statement-v3(1).pdf.” Even opening the ZIP can take minutes on a remote desktop. Multiply that by dozens of deals per week and the hours vanish fast.

Variety: layouts that change weekly

Templates drift: last quarter the logo was top-left; now it’s a floating watermark. A file named “BankStatement-Feb.pdf” might actually be a pay stub because a busy borrower grabbed the wrong attachment. Analysts open—and reopen—files just to be sure.

Human drag: the silent cost center

Renaming, reordering, and pushing each file to the correct folder eats two to three hours per analyst per day. At a fully loaded salary of $90k, and assuming an eight-hour workday, that’s roughly $22k to $34k a year, per seat, spent on glorified file management before any real underwriting begins.
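The per-seat figure is simple proportional arithmetic. Here is a minimal sketch of it, assuming an eight-hour workday (the article does not state the day length); the function name is illustrative only.

```python
# Back-of-the-envelope cost of manual triage per analyst seat.
FULLY_LOADED_SALARY = 90_000  # USD per year, from the text
WORKDAY_HOURS = 8             # assumption, not stated in the article


def triage_cost(hours_per_day: float) -> float:
    """Share of annual salary spent on file management."""
    return FULLY_LOADED_SALARY * hours_per_day / WORKDAY_HOURS


low, high = triage_cost(2), triage_cost(3)
print(f"${low:,.0f} to ${high:,.0f} per seat per year")
# prints "$22,500 to $33,750 per seat per year"
```

Two hours a day gives the low end of the range; three hours a day pushes the cost above $33k per seat.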

Error risk: the audit-time nightmare

Mis-filed docs create audit gaps that surface months later. A missing tax return can trigger a frantic “can you resend page 14?” email days before funding. Worse, if a critical ID is buried under “other,” KYC checks can be skipped entirely. Regulators don’t accept “we couldn’t find it.”

The net result: longer turnaround times, higher cost per loan, and frustrated borrowers who may shop elsewhere.

Why Accurate Classification Matters

Accurate labels aren’t vanity; they power every downstream decision. Let’s zoom in.

1. Fraud & risk controls

Each document type activates a unique rule set. A bank statement triggers balance-variance math; a passport triggers face-match and expiration-date checks. Mislabel a file and the corresponding control never runs, opening the door to synthetic identities or doctored balances.
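One way to picture the link between labels and controls is a lookup table keyed by document type. The sketch below is illustrative, not Kaaj's implementation: the check names and the manual-review fallback are assumptions, and only the two document types named above are included.

```python
# Hypothetical mapping from document label to the controls it activates.
CHECKS_BY_TYPE: dict[str, list[str]] = {
    "Bank Statement": ["balance_variance"],
    "Passport": ["face_match", "expiration_date"],
}


def checks_for(label: str) -> list[str]:
    """Return the controls for a label; unmapped labels go to manual review (assumed)."""
    return CHECKS_BY_TYPE.get(label, ["manual_review"])


# A passport mislabeled as a bank statement never reaches face-match.
print(checks_for("Passport"))        # controls for the true label
print(checks_for("Bank Statement"))  # controls the mislabeled file would get
```

Because the label is the lookup key, a wrong label silently selects the wrong controls rather than raising an error, which is why classification accuracy matters so much here.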

2. Regulatory compliance

Auditors love tidy trails: document type → review step → outcome. When files are correctly tagged, proving “reasonable care” takes minutes, not days.

3. Downstream automation

Modern underwriting resembles an assembly line:

Classification → Tailored OCR → Structured JSON → Credit Model → Decision

Botch step #1 and every robot arm downstream stalls. Correct tags let extraction scripts pick the right template, feed clean data to risk models, and push decisions in minutes—not business days.
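The first three steps of that assembly line can be sketched as a label-driven dispatch. This is a simplified illustration under stated assumptions: the extractor functions are placeholders for tailored OCR, the `needs_review` status is invented for the example, and the credit-model and decision steps are left out.

```python
import json


# Placeholder extractors standing in for "tailored OCR" (hypothetical).
def extract_bank_statement(text: str) -> dict:
    return {"type": "Bank Statement", "raw_length": len(text)}


def extract_invoice(text: str) -> dict:
    return {"type": "Invoice", "raw_length": len(text)}


EXTRACTORS = {
    "Bank Statement": extract_bank_statement,
    "Invoice": extract_invoice,
}


def run_pipeline(label: str, text: str) -> str:
    """Classification label -> tailored extraction -> structured JSON."""
    extractor = EXTRACTORS.get(label)
    if extractor is None:
        # No template matches this label, so downstream steps cannot proceed.
        return json.dumps({"status": "needs_review", "label": label})
    return json.dumps({"status": "extracted", "fields": extractor(text)})


print(run_pipeline("Invoice", "Invoice #1001 ..."))
print(run_pipeline("Other", "unreadable scan"))
```

A file that arrives with the wrong label either picks the wrong template or stalls in review, which is the "robot arm" failure described above.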

4. Borrower experience & brand equity

Nothing shouts “legacy lender” like repeated requests for the same doc. Auto-classification reduces back-and-forth, shortens SLAs, and keeps five-star reviews flowing.

Bottom line: Miss the classification step and you pay in fraud losses, fines, extra head-count, and churn.

Limitations of Traditional Approaches

Challenge	Why Rule-Based or Classic ML Falls Short
Layout drift	Small template tweaks break brittle regex rules.
Low-text scans	OCR alone can miss logos, stamps, or handwriting.
Edge cases	The “other” bucket balloons, forcing manual review.
Scaling new types	Adding a label can take weeks of data gathering and retraining.

Curious how Kaaj solves these issues? Book a live demo to see it in action.

Inside Kaaj’s AI Pipeline

We’re English-first on purpose—our lenders serve U.S. markets, so every research hour goes into squeezing maximum accuracy from English-language files.

Stage	What happens	Why it matters
1. Visual checkpoint	Lightweight vision pass spots obvious layouts and logos.	Classifies the bulk of documents almost instantly.
2. Contextual read	A language model scans headings and key phrases.	Adds confidence and picks up ambiguous cases the visual pass flags.
3. Deep multimodal review	A multimodal model looks at image + text + layout together—but only when needed.	Handles tough edge cases (low-quality scans, uncommon formats) without slowing the whole queue.

Smart routing sends each document only as deep into the pipeline as it needs to go. That delivers high accuracy at high speed and keeps compute spend below the cost of manual indexing, while still driving “unknown” rates toward zero.
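The three stages can be read as a confidence-gated cascade: try the cheapest model first and escalate only when it is unsure. The sketch below shows that pattern under stated assumptions. The 0.9 threshold, the stub stage functions, and the document fields (`has_bank_logo`, `text`) are all hypothetical and not taken from Kaaj's system.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Prediction:
    label: str
    confidence: float
    stage: str


# A stage takes a document and returns (label, confidence).
Stage = Callable[[dict], tuple[str, float]]


def visual_pass(doc: dict) -> tuple[str, float]:
    # Stub: pretend a recognizable bank logo is a strong visual signal.
    return ("Bank Statement", 0.95) if doc.get("has_bank_logo") else ("Other", 0.30)


def contextual_read(doc: dict) -> tuple[str, float]:
    # Stub: pretend a known form heading is a strong textual signal.
    return ("Tax Returns", 0.92) if "Form 1120" in doc.get("text", "") else ("Other", 0.40)


def multimodal_review(doc: dict) -> tuple[str, float]:
    # Stub: the expensive model, reached only when earlier stages are unsure.
    return ("Invoice", 0.91) if "Invoice" in doc.get("text", "") else ("Other", 0.20)


STAGES: list[tuple[str, Stage]] = [
    ("visual", visual_pass),
    ("contextual", contextual_read),
    ("multimodal", multimodal_review),
]


def classify(doc: dict, threshold: float = 0.9) -> Prediction:
    """Run stages cheapest-first and stop at the first confident answer."""
    for name, stage in STAGES:
        label, confidence = stage(doc)
        if confidence >= threshold:
            return Prediction(label, confidence, name)
    # No stage was confident: fall back to the catch-all label.
    return Prediction("Other", 0.0, "fallback")


print(classify({"has_bank_logo": True}).stage)
print(classify({"text": "Form 1120 U.S. Corporation Income Tax Return"}).stage)
```

In a design like this, the threshold is the cost lever: raising it sends more documents to the slower stages, while lowering it trades accuracy for speed.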

Kaaj.ai’s Readable Document Taxonomy

Category	Friendly Names
Financial Statements	Bank Statement, Financial Statement, Financial Projections
Credit & Risk Reports	Credit Report, PayNet Report, Business Web Presence Report
Applications & Packages	Loan Application, Submission Package
Revenue Docs	Invoice
Identity Documents	Passport, Social Security Card, Commercial Driver’s License
Government & Tax	IRS SS-4 EIN Letter, Tax Returns
Business Formation	Articles of Incorporation, Bylaws / Operating Agreement, Secretary-of-State Certificate
Branding	Company Logo
Other	Catch-all for truly unrecognized items (flagged for taxonomy review)

Synonyms such as “P&L” or “Profit and Loss” automatically map to Financial Statement, so analysts see only the clean labels.
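Synonym mapping of this kind can be as simple as a normalized dictionary lookup. Here is a minimal sketch using the two synonyms named above; the `normalize` helper and its case and whitespace handling are assumptions for illustration.

```python
# Synonyms from the text, keyed in lowercase with single spaces.
SYNONYMS = {
    "p&l": "Financial Statement",
    "profit and loss": "Financial Statement",
}


def normalize(raw_label: str) -> str:
    """Map a raw label to its clean taxonomy name; pass unknown labels through."""
    key = " ".join(raw_label.lower().split())  # lowercase, collapse whitespace
    return SYNONYMS.get(key, raw_label.strip())


print(normalize("P&L"))
print(normalize("  Profit   and Loss "))
print(normalize("Invoice"))
```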

Business Impact at a Glance

78% lower “unknown” rate in the first month of go-live with pilot lenders.
Hours reclaimed per analyst each week—time now spent assessing risk, not rearranging PDFs.
Audit-ready explainability with confidence scores and token-level heat maps for every prediction.
Drop-in deployment via APIs; no engineering team required.

Burning Questions (FAQ)

Q1. What is a vision-language model (VLM)?
A neural network trained to understand images and the text inside them at the same time—ideal for low-quality scans or documents with important visual cues (e.g., watermarks, stamps).

Q2. How accurate is Kaaj.ai?
Internal benchmarks show more than 99% macro-accuracy across our 20-item taxonomy on real customer data.
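For readers unfamiliar with the term, the sketch below assumes “macro-accuracy” means the unweighted mean of per-class accuracy, so rare document types count as much as common ones. That reading is an assumption, and the toy labels are made up for illustration; they are not benchmark data.

```python
from collections import defaultdict


def macro_accuracy(y_true: list[str], y_pred: list[str]) -> float:
    """Unweighted mean of per-class accuracy (each class counts equally)."""
    if not y_true:
        raise ValueError("need at least one labeled example")
    correct: dict[str, int] = defaultdict(int)
    total: dict[str, int] = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        correct[truth] += int(truth == pred)
    return sum(correct[c] / total[c] for c in total) / len(total)


# Toy data: 8 invoices all correct, 2 passports with one miss.
y_true = ["Invoice"] * 8 + ["Passport"] * 2
y_pred = ["Invoice"] * 8 + ["Passport", "Invoice"]

plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain)                           # 0.9
print(macro_accuracy(y_true, y_pred))  # 0.75
```

The gap between 0.9 and 0.75 shows why a macro-averaged figure is the stricter one to report: a single missed passport is not hidden behind a large pile of correctly labeled invoices.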

Q3. Does it replace my existing OCR or extraction tool?
It can sit in front of any downstream system. Feed Kaaj.ai PDFs and it returns the file plus a clean label; your current extractors can keep doing their job, only faster.

Q4. Can I add a new document type?
Yes. Flag examples in the UI; the system promotes frequent “other” files to a first-class label without full-scale retraining.

Q5. Is my data secure?
Kaaj.ai is SOC 2 Type II compliant, with bank-grade security controls in place.

Ready to See It in Action?

Upload a messy credit package and watch Kaaj.ai sort, label, and route every page in seconds. Book a live demo and turn document chaos into a competitive edge.
