BiCLIP

Domain Canonicalization via Structured Geometric Transformation.

BiCLIP addresses the “modality gap” in vision-language models such as CLIP and SigLIP. By introducing a structured bilinear transformation matrix, we achieve state-of-the-art domain adaptation with extreme parameter efficiency.

BiCLIP realigns visual features to the textual manifold.
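
As a rough illustration of what a bilinear realignment could look like, the NumPy sketch below scores image-text pairs as normalize(xW) · normalize(t). The dense d×d matrix `W`, the function names, and the logit scale of 100 are assumptions made for illustration; the structured form that BiCLIP actually uses is not specified here. With `W` set to the identity, the score reduces to ordinary zero-shot cosine similarity.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along the given axis."""
    return x / np.maximum(np.linalg.norm(x, axis=axis, keepdims=True), eps)

def bilinear_logits(image_feats, text_feats, W, scale=100.0):
    """Hypothetical bilinear scoring.

    image_feats: (n, d) visual embeddings
    text_feats:  (c, d) class-prompt embeddings
    W:           (d, d) transformation applied to the visual side (assumed dense)
    """
    mapped = l2_normalize(image_feats @ W)           # realign visual features
    return scale * mapped @ l2_normalize(text_feats).T

# Random stand-ins for encoder outputs (not real CLIP features).
rng = np.random.default_rng(0)
n, c, d = 4, 3, 8
image_feats = rng.normal(size=(n, d))
text_feats = rng.normal(size=(c, d))

# Baseline: plain cosine similarity, as in zero-shot classification.
zero_shot = 100.0 * l2_normalize(image_feats) @ l2_normalize(text_feats).T

# With W = I the bilinear score matches the zero-shot baseline.
identity_logits = bilinear_logits(image_feats, text_feats, np.eye(d))
```

Starting from the identity is a natural choice in this sketch because any learned change to `W` then measures a departure from zero-shot behaviour.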

Key Highlights

  • SOTA Performance: +15.2% average improvement over zero-shot CLIP across 11 benchmarks.
  • Extreme Gains: Up to +42% improvement on specialized domains like EuroSAT.
  • Geometric Insight: Validates that domain shift can be recovered via canonical transformations.
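
The geometric claim can be pictured with a synthetic toy case: if a domain shift were a pure orthogonal transform of the feature space, a single matrix would undo it. The sketch below is not BiCLIP's training procedure. It swaps in the classical orthogonal Procrustes solution (via SVD) on random data, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 64, 8

# Synthetic "canonical" features, plus a random orthogonal matrix
# standing in for a domain shift (an assumption of this toy example).
canonical = rng.normal(size=(n, d))
shift, _ = np.linalg.qr(rng.normal(size=(d, d)))
shifted = canonical @ shift

# Orthogonal Procrustes: find the orthogonal W minimizing
# ||shifted @ W - canonical||_F, given by U @ Vt from the SVD below.
u, _, vt = np.linalg.svd(shifted.T @ canonical)
W = u @ vt

recovered = shifted @ W
max_error = np.abs(recovered - canonical).max()
print(f"max abs recovery error: {max_error:.2e}")
```

In this idealized setting the recovered matrix is the transpose of the shift, so the canonical features are restored up to floating-point error. Real domain shift is not guaranteed to be this clean.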

References

  1. BiCLIP: Domain Canonicalization via Structured Geometric Transformation
    Pranav Mantini and Shishir K Shah
    arXiv preprint arXiv:2603.08942, 2026