[Summary] MUVERA: Multi-Vector Retrieval via Fixed Dimensional Encodings
TL;DR Recent advancements in vector retrieval demonstrate that mapping queries and documents to multiple vectors and performing multi-vector retrieval surpasses common single-vector retrieval methods. However, multi-vector retrieval is computationally intensive as it does not use linear operations like the inner product. MUVERA is a method designed to accelerate multi-vector retrieval: It converts sets of vectors into single Fixed Dimensional Encodings (FDEs). The inner product of two FDEs approximates the Chamfer Similarity score, allowing standard, highly-optimized Maximum Inner Product Search (MIPS) solvers to generate a candidate list, following by exact similarity calculation on this small set of candidates for precise ranking....