METHODOLOGY · AI ACCURACY

How accurate is Trueleveler? Radically transparent.

Dual-model AI. Cross-validated findings. Published limitations. Here is exactly how every analysis gets made — and what you can do to get the best results on the documents you upload.


AI Models Cross-Validate

Engines Independently Tuned

Validation Stages

Documents Stored

§ 01 THE METHOD · DUAL-MODEL CROSS-VALIDATION

Two AI models. One reconciled answer.

Step 01

Document parsing

Every upload is parsed into structured data. Tables, line items, clauses, and terms are extracted and normalized — regardless of the source format.

Step 02

Independent analysis

Two AI models (Claude by Anthropic and Gemini by Google) analyze each document independently. Neither sees the other's output during this stage.

Step 03

Cross-validation

Results are compared and reconciled. Where both models agree, confidence is highest. Where they disagree, the system flags the discrepancy for closer review.
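
The comparison in this step can be pictured with a minimal sketch. It assumes each model returns findings keyed by a source reference, which is an illustrative assumption; the `reconcile` function, the `Reconciled` record, and the `agreed` and `flagged` labels are hypothetical names, not Trueleveler's actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reconciled:
    source_ref: str  # hypothetical key: where in the document the finding points
    status: str      # "agreed" or "flagged"
    values: tuple    # the value(s) reported by the two models


def reconcile(model_a: dict[str, str], model_b: dict[str, str]) -> list[Reconciled]:
    """Compare two independent sets of findings, keyed by source reference."""
    results = []
    for ref in sorted(model_a.keys() | model_b.keys()):
        a, b = model_a.get(ref), model_b.get(ref)
        if a is not None and a == b:
            # Both models report the same value: highest confidence.
            results.append(Reconciled(ref, "agreed", (a,)))
        else:
            # One model missed the item, or the two disagree: flag for review.
            results.append(Reconciled(ref, "flagged", (a, b)))
    return results


# Made-up findings from two hypothetical model runs.
findings_a = {"line 12": "unit price 4.50", "clause 7.2": "net 30"}
findings_b = {"line 12": "unit price 4.50", "clause 7.2": "net 45"}
for item in reconcile(findings_a, findings_b):
    print(item.source_ref, item.status, item.values)
```

In this sketch the payment-term clause is flagged because the two runs disagree, while the matching unit price is marked as agreed.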

Step 04

Structured output

Final results are presented with confidence indicators, source references, and clear explanations so you can quickly verify findings against the original documents.

§ 02 ACCURACY BY ENGINE · WHAT EACH ENGINE CATCHES

What each engine catches best.

Accuracy depends on document quality, format, and complexity. All engines deliver consistently high accuracy on well-formatted documents. Below is what each engine is designed to identify — and the conditions where it performs best.

Engine	What It Catches	Accuracy Factor
Bid Leveling	Line-item discrepancies, scope gaps, missing items, unit price outliers, math errors, and bid-to-bid inconsistencies across multiple proposals.	Highest on structured bid tables with clear line items.
Contract Review	Risky clauses, indemnification gaps, payment term issues, insurance requirements, change order provisions, and termination conditions.	Best on standard contract formats (AIA, ConsensusDocs, NEC, JCT).
Submittal Extractor	Submittal requirements from specifications, section references, responsible parties, due dates, and approval workflows.	Strongest on CSI-formatted specifications.
Bid Scope Compliance	Bid-to-spec misalignments, missing scope items, qualification conflicts, and exclusion gaps when comparing a bid against project requirements.	Most effective with clear scope-of-work definitions.
RFQ Generator	Generates comprehensive RFQs from project specs with appropriate scope, terms, and evaluation criteria for each trade package.	Quality improves with detailed project specifications.
Change Order Review	Pricing reasonableness, scope justification, markup compliance, schedule impact, and alignment with original contract terms.	Best with itemized change order breakdowns.
Pay App Review	Overbilling, schedule-of-values discrepancies, retainage errors, percent-complete mismatches, and stored materials issues.	Most accurate with standard AIA G702/G703 formats.
Document Compare	Clause changes, added/removed terms, modified pricing, scope alterations, and hidden revisions between document versions.	Best comparing documents of the same type and format.
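
Two of the Bid Leveling checks in the table, math errors and unit price outliers, come down to simple arithmetic. The sketch below is an illustration only: the function names, the one-cent tolerance, and the 1.5x median threshold are assumptions chosen for the example, not the engine's actual rules.

```python
from statistics import median


def math_error(quantity: float, unit_price: float, stated_total: float,
               tolerance: float = 0.01) -> bool:
    """True when quantity x unit price does not match the stated line total."""
    return abs(quantity * unit_price - stated_total) > tolerance


def price_outliers(unit_prices: dict[str, float], ratio: float = 1.5) -> list[str]:
    """Bidders priced above median * ratio or below median / ratio (assumed rule)."""
    mid = median(unit_prices.values())
    return sorted(bidder for bidder, price in unit_prices.items()
                  if price > mid * ratio or price < mid / ratio)


# The same line item priced by three hypothetical bidders.
prices = {"Bidder A": 42.0, "Bidder B": 45.0, "Bidder C": 90.0}
print(price_outliers(prices))          # ['Bidder C']
print(math_error(120, 42.0, 5040.0))   # False: 120 x 42.00 = 5,040.00
print(math_error(120, 45.0, 5040.0))   # True: 120 x 45.00 = 5,400.00
```

The median is used here rather than the mean so that a single extreme bid does not shift the baseline it is being compared against.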

§ 03 KNOWN LIMITATIONS · WHERE HUMAN REVIEW STAYS ESSENTIAL

What the AI cannot do.

No AI system is perfect. Here are the areas where Trueleveler has known limitations and where human review remains essential.

Handwriting

Handwritten notes

Handwritten annotations, margin notes, and hand-drawn markups are not reliably extracted. If critical terms exist only in handwritten form, they may be missed.

Scan Quality

Severely damaged scans

Documents that are heavily skewed, extremely low resolution, or have significant portions obscured will produce incomplete results. Clean re-scans dramatically improve output.

Format

Highly unusual formats

Proprietary or non-standard contract formats, bespoke bid structures, and highly customized templates may reduce accuracy. Standard industry formats yield the best results.

Context

Implied context

The AI analyzes what is written in the document. Industry-specific verbal agreements, local customs, or context that exists outside the document cannot be considered.

Code

Local code compliance

While the AI understands general construction standards, it does not verify compliance with specific local building codes, municipal regulations, or jurisdiction-specific requirements.

Language

Multi-language documents

Documents that mix multiple languages on the same page may reduce extraction quality. Single-language documents, particularly in English, produce the most reliable results.

§ 04 QUALITY FACTORS · WHAT AFFECTS YOUR RESULTS

What affects your accuracy.

You can significantly improve your results by understanding what factors affect AI analysis quality. These are the levers that matter most.

Format

Document format

Native PDFs (digitally created) produce far better results than scanned documents. When you have the option, always upload the digital original rather than a scanned copy.

Scan

Scan quality

When scans are necessary, use 300+ DPI, ensure pages are straight, and avoid dark edges or fold marks. Color scans outperform black-and-white for documents with highlighted sections.
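
To check a scan against the 300 DPI guideline before uploading, divide the image's pixel dimensions by the physical page size. The sketch below assumes a US Letter page (8.5 x 11 inches) by default; the function names are hypothetical and not part of any Trueleveler tool.

```python
def effective_dpi(width_px: int, height_px: int,
                  page_width_in: float = 8.5, page_height_in: float = 11.0) -> float:
    """Lower of the horizontal and vertical resolution, in dots per inch."""
    return min(width_px / page_width_in, height_px / page_height_in)


def meets_guideline(width_px: int, height_px: int, minimum_dpi: float = 300.0) -> bool:
    """True when the scan reaches the recommended minimum resolution."""
    return effective_dpi(width_px, height_px) >= minimum_dpi


print(effective_dpi(2550, 3300))    # 300.0
print(meets_guideline(2550, 3300))  # True
print(meets_guideline(1275, 1650))  # False (150 DPI)
```

For other page sizes, such as A4 or large-format drawings, pass the actual dimensions in inches instead of the defaults.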

Template

Standard formats

Industry-standard templates (AIA, ConsensusDocs, CSI-formatted specs, standard bid forms) produce the highest accuracy. The AI recognizes these patterns immediately.

Language

Document language

English-language documents produce the strongest results. All major European languages are supported with consistently high accuracy, but English remains the benchmark.

Completeness

Document completeness

Complete documents with all pages included yield better analysis than partial uploads. Missing pages, especially scope descriptions or pricing schedules, will create gaps in findings.

Tables

Table structure

Clear, well-structured tables with consistent column headers and row formatting are parsed with high fidelity. Merged cells, nested tables, and irregular layouts reduce extraction accuracy.

§ 05 CONTINUOUS IMPROVEMENT · GETTING BETTER, EVERY ANALYSIS

Getting better with every analysis.

Trueleveler improves continuously through multiple feedback channels. Your usage helps make the platform more accurate for everyone — without ever storing or training on your documents.

Prompt engineering

Our engineering team continuously refines the prompts and instructions that guide each AI model, improving how they parse, interpret, and cross-validate construction documents.

Edge case library

When users report unexpected results, we add those document patterns to our internal test suite. This prevents regressions and ensures known edge cases are handled correctly.
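
As a rough picture of how a reported pattern could become a regression check, consider the sketch below. Everything in it is hypothetical: `extract_total` is a stand-in for a real extraction step, and the edge cases are made-up examples rather than entries from Trueleveler's internal suite.

```python
import re
from typing import Optional

# Hypothetical edge cases: (description, input text, expected extracted total).
EDGE_CASES = [
    ("plain total", "Total: 1250.00", 1250.00),
    ("thousands separator", "Total: 1,250.00", 1250.00),
    ("currency symbol", "TOTAL $1,250.00", 1250.00),
]


def extract_total(text: str) -> Optional[float]:
    """Stand-in extractor: find the first number that follows the word 'total'."""
    match = re.search(r"total\D*([\d,]+(?:\.\d+)?)", text, flags=re.IGNORECASE)
    return float(match.group(1).replace(",", "")) if match else None


def run_suite() -> list[str]:
    """Return the descriptions of any edge cases that no longer pass."""
    return [name for name, text, expected in EDGE_CASES
            if extract_total(text) != expected]


print(run_suite())  # [] when every known edge case still passes
```

Each newly reported pattern would be appended to the case list, so that a later change that breaks it shows up as a named failure.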

Model updates

As Anthropic and Google release improved versions of Claude and Gemini, we evaluate each upgrade and integrate those that improve construction document understanding while keeping results consistent.

User feedback loop

Every analysis includes feedback options. When you flag an inaccuracy or confirm a finding, that signal feeds into our quality pipeline to prioritize the most impactful improvements.

§ 06 IMPORTANT NOTICE · AI ANALYTICAL AID, NOT REPLACEMENT

Always verify against source documents.

AI ANALYTICAL AID

Not a replacement for professional judgment.

Trueleveler is an AI-powered analytical aid designed to augment — not replace — the expertise of construction professionals. All AI-generated findings should be verified against the original source documents before making project decisions. No AI system can guarantee 100% accuracy, and results should be treated as a highly capable first pass that accelerates your review process, not as a final determination. For contractual, legal, or financial decisions, always consult with qualified professionals.

SEE THE ACCURACY · TRY IT YOURSELF

Upload your own document. Compare to your review.

No credit card to get started. Run a real analysis against the documents you know best — then judge the accuracy for yourself.
