DeQA-Doc-Color: Document Image Color Quality Assessment

DeQA-Doc-Color is a vision-language model specialized in assessing the color quality of document images. It evaluates color fidelity, saturation, white balance, and color-related artifacts in scanned or photographed documents.

Model Family

This model is part of the DeQA-Doc family, which includes three specialized models:

| Model | Description | HuggingFace |
|-------|-------------|-------------|
| DeQA-Doc-Overall | Overall document quality | mapo80/DeQA-Doc-Overall |
| DeQA-Doc-Color | Color quality assessment (this model) | mapo80/DeQA-Doc-Color |
| DeQA-Doc-Sharpness | Sharpness/clarity assessment | mapo80/DeQA-Doc-Sharpness |

Quick Start

import torch
from transformers import AutoModelForCausalLM
from PIL import Image

# Load the model
model = AutoModelForCausalLM.from_pretrained(
    "mapo80/DeQA-Doc-Color",
    trust_remote_code=True,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Score an image
image = Image.open("document.jpg").convert("RGB")
score = model.score([image])
print(f"Color Quality Score: {score.item():.2f} / 5.0")

What Does Color Quality Measure?

The color quality score evaluates:

  • Color Fidelity: How accurately colors are reproduced
  • White Balance: Neutral whites without color casts (yellow, blue tints)
  • Saturation: Appropriate color intensity (not washed out or oversaturated)
  • Color Artifacts: Absence of color bleeding, banding, or chromatic aberration
  • Uniformity: Consistent color reproduction across the document

Score Interpretation

| Score Range | Quality Level | Typical Issues |
|-------------|---------------|----------------|
| 4.5 - 5.0 | Excellent | Perfect color reproduction |
| 3.5 - 4.5 | Good | Minor color shifts, slight tinting |
| 2.5 - 3.5 | Fair | Noticeable color cast, uneven colors |
| 1.5 - 2.5 | Poor | Strong color distortion, washed out |
| 1.0 - 1.5 | Bad | Severe color problems, unusable |
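
For automated pipelines, the table above can be expressed as a small bucketing helper. The thresholds mirror the table; the color_quality_level function is illustrative and not part of the model API.

def color_quality_level(score: float) -> str:
    # Thresholds follow the score interpretation table above
    if score >= 4.5:
        return "Excellent"
    if score >= 3.5:
        return "Good"
    if score >= 2.5:
        return "Fair"
    if score >= 1.5:
        return "Poor"
    return "Bad"

print(color_quality_level(3.8))  # Good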

Batch Processing

images = [
    Image.open("doc1.jpg").convert("RGB"),
    Image.open("doc2.jpg").convert("RGB"),
    Image.open("doc3.jpg").convert("RGB"),
]

scores = model.score(images)
for i, score in enumerate(scores):
    print(f"Document {i+1} Color Score: {score.item():.2f} / 5.0")

Use Cases

  • Scanner Calibration: Detect when scanners need color calibration
  • Photo Document QA: Flag photos with poor lighting/white balance
  • Color-Critical Documents: Verify color accuracy for maps, charts, branded materials
  • Archive Preservation: Identify documents with color degradation
  • Print Quality Control: Verify color reproduction in printed documents

Example: Detect Color Issues

import torch
from transformers import AutoModelForCausalLM
from PIL import Image

model = AutoModelForCausalLM.from_pretrained(
    "mapo80/DeQA-Doc-Color",
    trust_remote_code=True,
    torch_dtype=torch.float16,
    device_map="auto",
)

def diagnose_color_quality(image_path):
    img = Image.open(image_path).convert("RGB")
    score = model.score([img]).item()

    if score >= 4.5:
        diagnosis = "Excellent color quality"
    elif score >= 3.5:
        diagnosis = "Good - minor color issues"
    elif score >= 2.5:
        diagnosis = "Fair - consider color correction"
    elif score >= 1.5:
        diagnosis = "Poor - needs color correction or rescan"
    else:
        diagnosis = "Bad - severe color problems, rescan required"

    return score, diagnosis

score, diagnosis = diagnose_color_quality("scanned_document.jpg")
print(f"Score: {score:.2f}/5.0 - {diagnosis}")

Multi-Dimensional Quality Assessment

Combine with other DeQA-Doc models for comprehensive assessment:

import torch
from transformers import AutoModelForCausalLM
from PIL import Image

# Load all three models
models = {
    "overall": AutoModelForCausalLM.from_pretrained(
        "mapo80/DeQA-Doc-Overall", trust_remote_code=True,
        torch_dtype=torch.float16, device_map="auto"
    ),
    "color": AutoModelForCausalLM.from_pretrained(
        "mapo80/DeQA-Doc-Color", trust_remote_code=True,
        torch_dtype=torch.float16, device_map="auto"
    ),
    "sharpness": AutoModelForCausalLM.from_pretrained(
        "mapo80/DeQA-Doc-Sharpness", trust_remote_code=True,
        torch_dtype=torch.float16, device_map="auto"
    ),
}

def full_quality_report(image_path):
    img = Image.open(image_path).convert("RGB")

    scores = {}
    for name, model in models.items():
        scores[name] = model.score([img]).item()

    return scores

report = full_quality_report("document.jpg")
print(f"Overall:   {report['overall']:.2f}/5.0")
print(f"Color:     {report['color']:.2f}/5.0")
print(f"Sharpness: {report['sharpness']:.2f}/5.0")

Model Architecture

  • Base Model: mPLUG-Owl2 (LLaMA2-7B + ViT-L Vision Encoder)
  • Vision Encoder: CLIP ViT-L/14 (1024 visual tokens via Visual Abstractor)
  • Language Model: LLaMA2-7B
  • Training: Full fine-tuning on document color quality datasets
  • Input Resolution: Images are resized to 448x448 (with aspect ratio preservation)

Technical Details

| Property | Value |
|----------|-------|
| Model Size | ~16 GB (float16) |
| Parameters | ~7.2B |
| Input | RGB images (any resolution) |
| Output | Color quality score (1.0 - 5.0) |
| Inference | ~2-3 seconds per image on A100 |

Hardware Requirements

| Setup | VRAM Required | Recommended |
|-------|---------------|-------------|
| Full precision (fp32) | ~32 GB | A100, H100 |
| Half precision (fp16) | ~16 GB | A100, A40, RTX 4090 |
| With CPU offload | ~8 GB GPU + RAM | RTX 3090, RTX 4080 |
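
For the CPU-offload row, GPU memory can be capped with the max_memory argument that transformers/accelerate accept in from_pretrained. The 8 GiB budget below matches the table and is only an example, not a tested configuration.

import torch
from transformers import AutoModelForCausalLM

# Keep roughly 8 GiB of weights on GPU 0 and offload the rest to CPU RAM
model = AutoModelForCausalLM.from_pretrained(
    "mapo80/DeQA-Doc-Color",
    trust_remote_code=True,
    torch_dtype=torch.float16,
    device_map="auto",
    max_memory={0: "8GiB", "cpu": "32GiB"},
)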

Installation

pip install torch transformers accelerate pillow sentencepiece protobuf

Note: Use transformers>=4.36.0 for best compatibility.

Limitations

  • Optimized for document images (may not generalize to natural photos)
  • Color assessment is relative to training data distribution
  • Black & white documents may receive lower scores (use Overall model instead)
  • Requires GPU with sufficient VRAM for efficient inference

Credits & Attribution

This model is based on the DeQA-Doc project by Junjie Gao et al., which won the VQualA 2025 DIQA (Document Image Quality Assessment) Challenge.

Original Repository: https://github.com/Junjie-Gao19/DeQA-Doc

All credit for the research, training methodology, and model architecture goes to the original authors.

Citation

If you use this model in your research, please cite the original paper:

@inproceedings{deqadoc,
  title={{DeQA-Doc}: Adapting {DeQA-Score} to Document Image Quality Assessment},
  author={Gao, Junjie and Liu, Runze and Peng, Yingzhe and Yang, Shujian and Zhang, Jin and Yang, Kai and You, Zhiyuan},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision Workshop},
  year={2025},
}

ArXiv: https://arxiv.org/abs/2507.12796

License

Apache 2.0
