    Implement two-tower retrieval for large-scale candidate generation

    A Google Cloud reference architecture demonstrating an end-to-end two-tower retrieval system for large-scale candidate generation that uses Vertex AI and vector similarity search concepts to learn and serve semantic similarity between entities.

    About this tool

    • Type: Reference architecture / technical guide
    • Provider: Google Cloud
    • Category: Curated resource list / Architecture pattern
    • Topics: RAG, semantic search, large-scale recommendation, two-tower retrieval
    • URL: https://cloud.google.com/architecture/implement-two-tower-retrieval-large-scale-candidate-generation

    Overview

    This reference architecture describes how to design and implement an end-to-end two-tower retrieval system for large-scale candidate generation using Google Cloud, Vertex AI, and vector similarity search. It focuses on learning and serving semantic similarity between entities (for example, web queries and candidate items) in large-scale recommendation and personalization systems with low-latency serving requirements.

    Primary audience: data scientists and machine learning engineers building large-scale recommenders and retrieval systems.

    Key Concepts

    • Two-tower (dual encoder) model: Two separate neural networks (towers) independently encode:

      • Query tower: user/query features
      • Candidate tower: item/candidate features

      These towers produce embeddings that can be compared via vector similarity.
    • Candidate generation stage:

      • First stage in a two- or multi-stage recommender architecture.
      • Filters millions of items down to hundreds of relevant candidates.
      • Designed for high throughput and low latency.
    • Ranking stage:

      • Second stage that re-ranks the candidate subset.
      • Produces the final set of recommended items.
    • Semantic similarity learning:

      • Learns similarity between <query, candidate> pairs.
      • Uses vector similarity search to efficiently retrieve nearest neighbors in embedding space.
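As a rough illustration of the two-tower idea above, the sketch below encodes a query and a candidate with two independent toy "towers" and compares the resulting embeddings with a dot product. Everything here is an assumption made for illustration: the helper `make_tower`, the feature sizes, and the fixed random linear maps stand in for trained neural networks and do not come from the reference architecture.

```python
import math
import random


def make_tower(in_dim, out_dim, seed):
    """Return a toy tower: a fixed random linear map followed by L2 normalization.

    A real tower would be a trained neural network; this is only a stand-in.
    """
    rng = random.Random(seed)
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]

    def encode(features):
        raw = [sum(w * x for w, x in zip(row, features)) for row in weights]
        norm = math.sqrt(sum(v * v for v in raw)) or 1.0  # avoid division by zero
        return [v / norm for v in raw]

    return encode


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical sizes: 4 query features, 6 candidate features, 3-dimensional embeddings.
query_tower = make_tower(in_dim=4, out_dim=3, seed=0)
candidate_tower = make_tower(in_dim=6, out_dim=3, seed=1)

# Each tower sees only its own features; the towers meet only at the similarity score.
query_embedding = query_tower([1.0, 0.0, 0.5, 0.2])
candidate_embedding = candidate_tower([0.3, 0.9, 0.0, 0.1, 0.7, 0.4])
score = dot(query_embedding, candidate_embedding)
print(f"similarity score: {score:.3f}")
```

Because the towers never see each other's inputs, candidate embeddings can be computed ahead of time and only the query needs to be encoded at request time.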

    Architecture

    The architecture demonstrates how to:

    • Train a two-tower model on Google Cloud.
    • Deploy each tower separately for different serving and deployment tasks.
    • Use vector similarity search for large-scale retrieval from a large corpus of candidate items.

    High-level components (as shown in the diagrams and description):

    • A training pipeline that learns a dual-encoder (two-tower) model.
    • A query tower service that encodes queries or user context at request time.
    • A candidate tower service that encodes items/candidates, typically offline or periodically.
    • A vector index (vector similarity search) storing candidate embeddings for large-scale approximate nearest neighbor (ANN) search.
    • A candidate generation service that retrieves top candidates from the vector index.
    • A downstream ranking component that further filters and sorts the retrieved candidates.
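The flow through these components can be sketched in a few lines, under stated assumptions. The example below uses hand-written embeddings in place of tower outputs, an in-memory dictionary in place of a managed vector index, and an exact top-k scan in place of approximate nearest neighbor search. The function names (`build_index`, `retrieve`, `rank`) and the `popularity` signal are hypothetical and are not part of any Google Cloud API.

```python
import heapq


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def build_index(candidate_embeddings):
    """Offline step: store candidate embeddings keyed by item ID."""
    return dict(candidate_embeddings)


def retrieve(index, query_embedding, k):
    """Online step: exact top-k by dot product (a stand-in for ANN search)."""
    scored = ((dot(query_embedding, emb), item_id) for item_id, emb in index.items())
    return [item_id for _, item_id in heapq.nlargest(k, scored)]


def rank(candidates, ranking_score):
    """Downstream step: re-order the retrieved candidates with a separate scorer."""
    return sorted(candidates, key=ranking_score, reverse=True)


# Hypothetical candidate embeddings, as if produced offline by the candidate tower.
index = build_index({
    "item_a": [1.0, 0.0],
    "item_b": [0.8, 0.6],
    "item_c": [0.0, 1.0],
    "item_d": [-1.0, 0.0],
})

# Hypothetical query embedding, as if produced at request time by the query tower.
query_embedding = [0.9, 0.1]
candidates = retrieve(index, query_embedding, k=2)

# Hypothetical ranking signal; a real ranker would be a richer model.
popularity = {"item_a": 0.2, "item_b": 0.7}
ranked = rank(candidates, lambda item_id: popularity[item_id])
print(candidates, ranked)
```

The exact scan costs time linear in the corpus size, which is why a corpus of millions of items calls for an approximate index instead.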

    Products Used

    The reference architecture uses the following Google Cloud products (as stated or implied):

    • Vertex AI
      • Model training for the two-tower architecture
      • Model deployment for serving each tower independently
    • Vector similarity search (Vertex AI Vector Search or equivalent Google Cloud vector indexing) for large-scale nearest neighbor retrieval in embedding space

    Use Cases

    • Large-scale recommendation systems that must:
      • Serve recommendations with low latency
      • Handle millions of candidate items
      • Support two-stage or multi-stage architectures (candidate generation + ranking)
    • Personalization scenarios where you need to learn semantic similarity between:
      • Search queries and documents/items
      • Users and items
      • Any <query, candidate> entity pairs

    Features

    • End-to-end reference design for a two-tower retrieval system on Google Cloud.
    • Separation of concerns:
      • Independent training and deployment of query and candidate towers.
    • Scalable candidate generation:
      • Efficiently filters a very large candidate pool to a manageable set of hundreds of items.
    • Low-latency retrieval:
      • Uses vector similarity search to support fast, high-throughput queries.
    • Semantic retrieval:
      • Embedding-based matching of queries and candidates via learned semantic similarity.
    • Integration with multi-stage recommenders:
      • Designed to feed downstream ranking models.
    • Guidance on modeling and data preparation (via linked resources):
      • Refers to “Scaling deep retrieval with TensorFlow Recommenders and Vector Search” for details on model design, problem framing, and data preparation.
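This page does not state which training objective the towers use. One common choice for dual encoders, assumed here purely for illustration, is an in-batch softmax loss: each query's paired candidate is treated as the positive, and the other candidates in the same batch act as negatives. The function name `in_batch_softmax_loss` and the toy embeddings are hypothetical.

```python
import math


def in_batch_softmax_loss(query_embs, candidate_embs):
    """Mean cross-entropy where query i's positive is candidate i.

    All other candidates in the batch serve as negatives for query i.
    """
    losses = []
    for i, query in enumerate(query_embs):
        logits = [sum(a * b for a, b in zip(query, cand)) for cand in candidate_embs]
        peak = max(logits)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy batch of two <query, candidate> pairs with 2-dimensional embeddings.
queries = [[1.0, 0.0], [0.0, 1.0]]
aligned = in_batch_softmax_loss(queries, [[1.0, 0.0], [0.0, 1.0]])
misaligned = in_batch_softmax_loss(queries, [[0.0, 1.0], [1.0, 0.0]])
print(f"aligned: {aligned:.4f}, misaligned: {misaligned:.4f}")
```

The loss is lower when each query embedding points toward its own candidate than when the pairs are swapped, which is the signal that pulls matching pairs together during training.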

    Pricing

    The reference architecture is a free documentation page. Costs arise only from the underlying Google Cloud services you use (for example, Vertex AI and vector search), which are priced separately in Google Cloud’s standard pricing documentation.

    Information

    Website: cloud.google.com
    Published: Dec 25, 2025

    Categories: Curated Resource Lists

    Tags: #Rag, #Semantic Search, #architectures

    Similar Products

    DataRobot Vector Database

    DataRobot Vector Database is a managed vector store capability within the DataRobot AI Platform that allows users to create, register, deploy, and update vector databases for AI workloads, including RAG and semantic search. It integrates with NVIDIA NIM embeddings and supports both built-in and bring-your-own embeddings for building production-grade vector search solutions.

    Verba

    Verba is a community-driven, open-source Retrieval-Augmented Generation (RAG) application that provides an end-to-end, user-friendly interface for building RAG workflows on top of a vector database, showcasing practical semantic search and retrieval patterns with Weaviate.

    Building Applications with Vector Databases

    DeepLearning.AI course teaching six practical vector database applications using Pinecone, including RAG for LLMs, recommender systems, and hybrid search combining images and text.

    LangChain & Vector Databases in Production

    Free comprehensive course from Activeloop with 60+ lessons and 10+ practical projects, teaching production-ready LLM applications with vector databases, trusted by 10,000+ engineers.

    Mastering Multimodal RAG

    A course focused on mastering multimodal Retrieval Augmented Generation (RAG) and embeddings, which are fundamental components often stored and managed by vector databases.

    Awesome papers and technical blogs on vector DB

    A curated collection of papers and technical blogs focused on vector databases, semantic-based vector search, and approximate nearest neighbor search (ANN Search). These resources are essential for understanding and building large-scale information retrieval systems and vector databases.

    All product names, logos, and brands are the property of their respective owners. All company, product, and service names used in this repository, related repositories, and associated websites are for identification purposes only. The use of these names, logos, and brands does not imply endorsement, affiliation, or sponsorship. This directory may include content generated by artificial intelligence.
    Copyright © 2025 Awesome Vector Databases. All rights reserved.·Terms of Service·Privacy Policy·Cookies