DeQA-Doc-Color: Document Image Color Quality Assessment

DeQA-Doc-Color is a vision-language model specialized in assessing the color quality of document images. It evaluates color fidelity, saturation, white balance, and color-related artifacts in scanned or photographed documents.

Model Family

This model is part of the DeQA-Doc family, which includes three specialized models:

| Model | Description | HuggingFace |
|-------|-------------|-------------|
| DeQA-Doc-Overall | Overall document quality | mapo80/DeQA-Doc-Overall |
| DeQA-Doc-Color | Color quality assessment (this model) | mapo80/DeQA-Doc-Color |
| DeQA-Doc-Sharpness | Sharpness/clarity assessment | mapo80/DeQA-Doc-Sharpness |

Quick Start

import torch
from transformers import AutoModelForCausalLM
from PIL import Image

# Load the model
model = AutoModelForCausalLM.from_pretrained(
    "mapo80/DeQA-Doc-Color",
    trust_remote_code=True,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Score an image
image = Image.open("document.jpg").convert("RGB")
score = model.score([image])
print(f"Color Quality Score: {score.item():.2f} / 5.0")

What Does Color Quality Measure?

The color quality score evaluates:

  • Color Fidelity: How accurately colors are reproduced
  • White Balance: Neutral whites without color casts (yellow, blue tints)
  • Saturation: Appropriate color intensity (not washed out or oversaturated)
  • Color Artifacts: Absence of color bleeding, banding, or chromatic aberration
  • Uniformity: Consistent color reproduction across the document

Score Interpretation

| Score Range | Quality Level | Typical Issues |
|-------------|---------------|----------------|
| 4.5 - 5.0 | Excellent | Perfect color reproduction |
| 3.5 - 4.5 | Good | Minor color shifts, slight tinting |
| 2.5 - 3.5 | Fair | Noticeable color cast, uneven colors |
| 1.5 - 2.5 | Poor | Strong color distortion, washed out |
| 1.0 - 1.5 | Bad | Severe color problems, unusable |
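
For automated pipelines, the table above can be expressed as a small bucketing helper. The thresholds mirror the table; the color_quality_level function is illustrative and not part of the model API.

def color_quality_level(score: float) -> str:
    # Thresholds follow the score interpretation table above
    if score >= 4.5:
        return "Excellent"
    if score >= 3.5:
        return "Good"
    if score >= 2.5:
        return "Fair"
    if score >= 1.5:
        return "Poor"
    return "Bad"

print(color_quality_level(3.8))  # Good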

Batch Processing

images = [
    Image.open("doc1.jpg").convert("RGB"),
    Image.open("doc2.jpg").convert("RGB"),
    Image.open("doc3.jpg").convert("RGB"),
]

scores = model.score(images)
for i, score in enumerate(scores):
    print(f"Document {i+1} Color Score: {score.item():.2f} / 5.0")

Use Cases

  • Scanner Calibration: Detect when scanners need color calibration
  • Photo Document QA: Flag photos with poor lighting/white balance
  • Color-Critical Documents: Verify color accuracy for maps, charts, branded materials
  • Archive Preservation: Identify documents with color degradation
  • Print Quality Control: Verify color reproduction in printed documents

Example: Detect Color Issues

import torch
from transformers import AutoModelForCausalLM
from PIL import Image

model = AutoModelForCausalLM.from_pretrained(
    "mapo80/DeQA-Doc-Color",
    trust_remote_code=True,
    torch_dtype=torch.float16,
    device_map="auto",
)

def diagnose_color_quality(image_path):
    img = Image.open(image_path).convert("RGB")
    score = model.score([img]).item()

    if score >= 4.5:
        diagnosis = "Excellent color quality"
    elif score >= 3.5:
        diagnosis = "Good - minor color issues"
    elif score >= 2.5:
        diagnosis = "Fair - consider color correction"
    elif score >= 1.5:
        diagnosis = "Poor - needs color correction or rescan"
    else:
        diagnosis = "Bad - severe color problems, rescan required"

    return score, diagnosis

score, diagnosis = diagnose_color_quality("scanned_document.jpg")
print(f"Score: {score:.2f}/5.0 - {diagnosis}")

Multi-Dimensional Quality Assessment

Combine with other DeQA-Doc models for comprehensive assessment:

import torch
from transformers import AutoModelForCausalLM
from PIL import Image

# Load all three models
models = {
    "overall": AutoModelForCausalLM.from_pretrained(
        "mapo80/DeQA-Doc-Overall", trust_remote_code=True,
        torch_dtype=torch.float16, device_map="auto"
    ),
    "color": AutoModelForCausalLM.from_pretrained(
        "mapo80/DeQA-Doc-Color", trust_remote_code=True,
        torch_dtype=torch.float16, device_map="auto"
    ),
    "sharpness": AutoModelForCausalLM.from_pretrained(
        "mapo80/DeQA-Doc-Sharpness", trust_remote_code=True,
        torch_dtype=torch.float16, device_map="auto"
    ),
}

def full_quality_report(image_path):
    img = Image.open(image_path).convert("RGB")

    scores = {}
    for name, model in models.items():
        scores[name] = model.score([img]).item()

    return scores

report = full_quality_report("document.jpg")
print(f"Overall:   {report['overall']:.2f}/5.0")
print(f"Color:     {report['color']:.2f}/5.0")
print(f"Sharpness: {report['sharpness']:.2f}/5.0")

Model Architecture

  • Base Model: mPLUG-Owl2 (LLaMA2-7B + ViT-L Vision Encoder)
  • Vision Encoder: CLIP ViT-L/14 (1024 visual tokens via Visual Abstractor)
  • Language Model: LLaMA2-7B
  • Training: Full fine-tuning on document color quality datasets
  • Input Resolution: Images are resized to 448x448 (with aspect ratio preservation)

Technical Details

| Property | Value |
|----------|-------|
| Model Size | ~16 GB (float16) |
| Parameters | ~7.2B |
| Input | RGB images (any resolution) |
| Output | Color quality score (1.0 - 5.0) |
| Inference | ~2-3 seconds per image on A100 |

Hardware Requirements

| Setup | VRAM Required | Recommended |
|-------|---------------|-------------|
| Full precision (fp32) | ~32 GB | A100, H100 |
| Half precision (fp16) | ~16 GB | A100, A40, RTX 4090 |
| With CPU offload | ~8 GB GPU + RAM | RTX 3090, RTX 4080 |
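
For the CPU-offload row, GPU memory can be capped with the max_memory argument that transformers/accelerate accept in from_pretrained. The 8 GiB budget below matches the table and is only an example, not a tested configuration.

import torch
from transformers import AutoModelForCausalLM

# Keep roughly 8 GiB of weights on GPU 0 and offload the rest to CPU RAM
model = AutoModelForCausalLM.from_pretrained(
    "mapo80/DeQA-Doc-Color",
    trust_remote_code=True,
    torch_dtype=torch.float16,
    device_map="auto",
    max_memory={0: "8GiB", "cpu": "32GiB"},
)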

Installation

pip install torch transformers accelerate pillow sentencepiece protobuf

Note: Use transformers>=4.36.0 for best compatibility.

Limitations

  • Optimized for document images (may not generalize to natural photos)
  • Color assessment is relative to training data distribution
  • Black & white documents may receive lower scores (use Overall model instead)
  • Requires GPU with sufficient VRAM for efficient inference

Credits & Attribution

This model is based on the DeQA-Doc project by Junjie Gao et al., which won the VQualA 2025 DIQA (Document Image Quality Assessment) Challenge.

Original Repository: https://github.com/Junjie-Gao19/DeQA-Doc

All credit for the research, training methodology, and model architecture goes to the original authors.

Citation

If you use this model in your research, please cite the original paper:

@inproceedings{deqadoc,
  title={{DeQA-Doc}: Adapting {DeQA-Score} to Document Image Quality Assessment},
  author={Gao, Junjie and Liu, Runze and Peng, Yingzhe and Yang, Shujian and Zhang, Jin and Yang, Kai and You, Zhiyuan},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision Workshop},
  year={2025},
}

ArXiv: https://arxiv.org/abs/2507.12796

License

Apache 2.0
