GraphQL Federation: Building Distributed Supergraph APIs for Microservices

In a microservices architecture, frontend teams face a recurring problem: to render a single product page, they must orchestrate calls to the product service, user service, inventory service, review service, and recommendation service. GraphQL Federation addresses this with a unified supergraph: one endpoint that composes data from every service, so frontend teams can fetch what a screen needs without coordinating with each service team.

Md Sanwar Hossain | March 2026 | 13 min read

The Problem: API Explosion in Microservices
Federation v2 Architecture: Router, Subgraphs, Supergraph
Entities and @key: How Services Share Types
@requires and @provides for Computed Fields
Netflix DGS Framework: Spring Boot Implementation
N+1 Problem in Federated GraphQL: DataLoader Pattern
Apollo Router vs Apollo Gateway: Performance Comparison
Schema Composition and Breaking Change Detection
Failure Scenarios
When NOT to Use GraphQL Federation
Conclusion

The Problem: API Explosion in Microservices

A mid-size e-commerce platform migrated to microservices and immediately created a problem for their frontend team. Loading the order detail page required:

GET /users/{id} — user name, email, preferences
GET /orders/{id} — order details, line items
GET /products/{ids} — product details for each line item (N+1!)
GET /inventory/{productIds} — stock levels
GET /reviews?productId={id} — reviews for each product
GET /recommendations?userId={id} — personalised upsells

Six API calls with waterfall dependencies. Mobile clients on 3G connections saw 2–4 second page loads. The BFF (Backend for Frontend) pattern was considered — but creating a dedicated BFF for every client type meant duplicating business logic and adding another service for every team to maintain.

GraphQL Federation solved this with a single query: the Router decomposes it, fans out to the appropriate subgraphs (in parallel where dependencies allow), and returns a single composed response.

Federation v2 Architecture: Router, Subgraphs, Supergraph

┌─────────────────────────────────────────────────────────────┐
│                     CLIENTS                                 │
│         Web App    Mobile App    Partner API                │
└───────────────────────────┬─────────────────────────────────┘
                            │ Single GraphQL query
┌───────────────────────────▼─────────────────────────────────┐
│                  APOLLO ROUTER                              │
│  - Query planning (decomposes supergraph query)            │
│  - Subgraph fan-out (parallel where possible)              │
│  - Response composition                                    │
│  - Auth, rate limiting, caching, tracing                   │
└────┬──────────────┬────────────────┬────────────┬───────────┘
     │              │                │            │
┌────▼─────┐ ┌──────▼──────┐  ┌─────▼────┐  ┌───▼─────────┐
│  User    │ │   Order     │  │ Product  │  │  Review     │
│ Service  │ │  Service    │  │ Service  │  │  Service    │
│(subgraph)│ │ (subgraph)  │  │(subgraph)│  │ (subgraph)  │
└──────────┘ └─────────────┘  └──────────┘  └─────────────┘

The supergraph is the composed schema representing the union of all subgraph schemas. Clients query the supergraph. The Router holds the query plan — a directed execution graph that determines which subgraphs to call, in what order, with what inputs. The Router never touches a database; it only orchestrates subgraph queries.

Entities and @key: How Services Share Types

The most important concept in Federation is the entity. An entity is a type that can be referenced and extended across multiple subgraphs. It is identified by a @key directive specifying its unique identifier fields.

User Service Subgraph (entity owner)

type User @key(fields: "id") {
    id: ID!
    name: String!
    email: String!
    profilePictureUrl: String
    createdAt: DateTime!
}

type Query {
    user(id: ID!): User
    me: User
}
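
To make the role of @key concrete, here is a minimal, framework-free sketch of how a subgraph might resolve entity references. In Federation, the Router asks a subgraph for entities by sending representations that carry `__typename` plus the key fields. The class name `EntityResolutionSketch`, the in-memory user data, and the method names are assumptions for illustration only, not part of any library.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical, framework-free model of entity resolution inside a subgraph.
final class EntityResolutionSketch {

    // One reference resolver per entity type, looked up by __typename.
    private final Map<String, Function<Map<String, Object>, Object>> resolvers = new HashMap<>();

    void register(String typeName, Function<Map<String, Object>, Object> resolver) {
        resolvers.put(typeName, resolver);
    }

    // Resolves each representation in order and returns results in the same order.
    List<Object> resolveEntities(List<Map<String, Object>> representations) {
        List<Object> result = new ArrayList<>();
        for (Map<String, Object> rep : representations) {
            String typeName = (String) rep.get("__typename");
            Function<Map<String, Object>, Object> resolver = resolvers.get(typeName);
            if (resolver == null) {
                throw new IllegalArgumentException("No resolver for entity type: " + typeName);
            }
            result.add(resolver.apply(rep));
        }
        return result;
    }

    record User(String id, String name) {}

    // Builds a toy user subgraph backed by an in-memory map (illustrative data).
    static EntityResolutionSketch userSubgraph() {
        Map<String, User> usersById = Map.of(
            "u1", new User("u1", "Ada"),
            "u2", new User("u2", "Grace"));
        EntityResolutionSketch subgraph = new EntityResolutionSketch();
        // The key field "id" from @key(fields: "id") is all the resolver needs.
        subgraph.register("User", rep -> usersById.get((String) rep.get("id")));
        return subgraph;
    }

    public static void main(String[] args) {
        List<Object> users = userSubgraph().resolveEntities(List.of(
            Map.of("__typename", "User", "id", "u2"),
            Map.of("__typename", "User", "id", "u1")));
        System.out.println(users);
    }
}
```

The point of the sketch is that the key fields are the whole contract: any subgraph that knows a user's id can hand the Router a reference, and the owning subgraph can turn that reference back into a full object.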

Order Service Subgraph (entity extension)

extend type User @key(fields: "id") {
    id: ID! @external           # Owned by user-service
    orders(first: Int = 10): [Order!]!
    orderCount: Int!
}

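type Order @key(fields: "id") {
    id: ID!
    userId: ID!
    user: User!        # Returned as a reference (id only); other fields come from user-service
    status: OrderStatus!
    lineItems: [LineItem!]!
    totalAmount: Float!
    createdAt: DateTime!
}

type LineItem {
    productId: ID!
    quantity: Int!
    unitPrice: Float!
    product: Product   # Resolved via product-service entity
}

# Reference-only stub so this subgraph can return Product keys
extend type Product @key(fields: "id") {
    id: ID! @external
}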

type Query {
    order(id: ID!): Order
    orders(userId: ID!): [Order!]!
}

With this schema, a client can query:

query OrderDetailPage($orderId: ID!) {
    order(id: $orderId) {
        id
        status
        totalAmount
        user {            # Resolved by user-service
            name
            email
        }
        lineItems {
            quantity
            product {     # Resolved by product-service
                name
                imageUrl
                inventory { # Resolved by inventory-service
                    stockLevel
                }
                reviews(first: 3) { # Resolved by review-service
                    rating
                    body
                }
            }
        }
    }
}

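The Router decomposes this query: it fetches the order from order-service, then fans out to the other four subgraphs (user, product, inventory, and review) in parallel where dependencies allow, and returns a single composed response. What was six or more HTTP calls with waterfall dependencies becomes one round-trip from the client's perspective.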
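
The fan-out described above can be sketched as a small dependency graph. This is a simplified model, not the Router's actual planner: the class `QueryPlanSketch`, the `stages` method, and the assumption that every downstream subgraph only needs keys returned by order-service are all illustrative.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a query plan: each fetch lists the fetches it depends on.
final class QueryPlanSketch {

    // Groups fetches into stages; every fetch within a stage can run in parallel.
    static List<List<String>> stages(Map<String, List<String>> dependsOn) {
        List<List<String>> stages = new ArrayList<>();
        List<String> done = new ArrayList<>();
        while (done.size() < dependsOn.size()) {
            List<String> ready = new ArrayList<>();
            for (Map.Entry<String, List<String>> fetch : dependsOn.entrySet()) {
                boolean pending = !done.contains(fetch.getKey());
                if (pending && done.containsAll(fetch.getValue())) {
                    ready.add(fetch.getKey());
                }
            }
            if (ready.isEmpty()) {
                throw new IllegalStateException("Cyclic dependency between fetches");
            }
            stages.add(ready);
            done.addAll(ready);
        }
        return stages;
    }

    // Assumed dependencies for the OrderDetailPage query: order-service returns
    // the user and product keys that the other four subgraphs need.
    static Map<String, List<String>> orderDetailPlan() {
        Map<String, List<String>> plan = new LinkedHashMap<>();
        plan.put("order-service", List.of());
        plan.put("user-service", List.of("order-service"));
        plan.put("product-service", List.of("order-service"));
        plan.put("inventory-service", List.of("order-service"));
        plan.put("review-service", List.of("order-service"));
        return plan;
    }

    public static void main(String[] args) {
        List<List<String>> stages = stages(orderDetailPlan());
        for (int i = 0; i < stages.size(); i++) {
            System.out.println("Stage " + (i + 1) + ": " + stages.get(i));
        }
    }
}
```

Under these assumptions the plan has two stages: one fetch to order-service, then four fetches that can run concurrently. If a subgraph needed a field owned by another (as with @requires), it would simply list that subgraph as a dependency and move to a later stage.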

@requires and @provides for Computed Fields

Sometimes a field in service B requires a field from service A that the Router would not normally fetch. @requires solves this:

# In the review service subgraph
extend type Product @key(fields: "id") {
    id: ID! @external
    category: String @external        # Owned by product-service

    # This field requires category from product-service to compute
    similarProductReviews: [Review!]! @requires(fields: "category")
}

# The Router will fetch Product.category from product-service
# before calling review-service to resolve similarProductReviews

@provides is complementary: it tells the Router that the resolver for a particular field can also return certain fields of an entity owned by another subgraph, avoiding an extra round-trip to the owning service:

type Order @key(fields: "id") {
    id: ID!
    lineItems: [LineItem!]!
}

type LineItem {
    # Order service provides product name alongside line items
    # Router doesn't need to call product-service for just the name
    product: Product @provides(fields: "name")
}

extend type Product @key(fields: "id") {
    id: ID! @external
    name: String @external
}

Netflix DGS Framework: Spring Boot Implementation

Netflix DGS (Domain Graph Service) is a widely used Spring Boot GraphQL framework with built-in Apollo Federation support:

<!-- pom.xml -->
<dependency>
    <groupId>com.netflix.graphql.dgs</groupId>
    <artifactId>graphql-dgs-spring-boot-starter</artifactId>
    <version>8.4.0</version>
</dependency>
<dependency>
    <groupId>com.netflix.graphql.dgs</groupId>
    <artifactId>graphql-dgs-federation-graphql-java-support</artifactId>
    <version>8.4.0</version>
</dependency>

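// Order service DGS data fetcher with Federation entity resolver
import com.netflix.graphql.dgs.DgsComponent;
import com.netflix.graphql.dgs.DgsData;
import com.netflix.graphql.dgs.DgsDataFetchingEnvironment;
import com.netflix.graphql.dgs.DgsEntityFetcher;
import com.netflix.graphql.dgs.DgsQuery;
import com.netflix.graphql.dgs.InputArgument;
import com.netflix.graphql.dgs.exceptions.DgsEntityNotFoundException;
import org.dataloader.DataLoader;

import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;

@DgsComponent
public class OrderDataFetcher {

    private final OrderRepository orderRepository;

    // Constructor injection; OrderRepository is this service's own repository
    public OrderDataFetcher(OrderRepository orderRepository) {
        this.orderRepository = orderRepository;
    }

    @DgsQuery
    public Order order(@InputArgument String id) {
        return orderRepository.findById(UUID.fromString(id))
            .orElseThrow(() -> new DgsEntityNotFoundException("Order not found: " + id));
    }

    // Entity fetcher: the Router sends User representations so this subgraph
    // can resolve the fields it contributes (User.orders, User.orderCount).
    // The @key field (id) arrives in the values map.
    @DgsEntityFetcher(name = "User")
    public User resolveUser(Map<String, Object> values) {
        // We only need to return a stub User with id for further resolution
        return new User((String) values.get("id"));
    }

    @DgsData(parentType = "User", field = "orders")
    public CompletableFuture<List<Order>> userOrders(DgsDataFetchingEnvironment dfe) {
        User user = dfe.getSource();
        // Use DataLoader to batch multiple user.orders requests;
        // "ordersForUser" must be registered as a @DgsDataLoader in this service
        DataLoader<String, List<Order>> loader = dfe.getDataLoader("ordersForUser");
        return loader.load(user.getId());
    }
}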

N+1 Problem in Federated GraphQL: DataLoader Pattern

Federation introduces a new N+1 vector. When a query returns 50 line items, the Router sends the 50 product references to the subgraph in a single entity request, but a naive resolver can still issue 50 individual product lookups, one per reference. DataLoader batches these into a single lookup:

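import com.netflix.graphql.dgs.DgsComponent;
import com.netflix.graphql.dgs.DgsData;
import com.netflix.graphql.dgs.DgsDataFetchingEnvironment;
import com.netflix.graphql.dgs.DgsDataLoader;
import org.dataloader.DataLoader;
import org.dataloader.MappedBatchLoader;

import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.stream.Collectors;

@DgsDataLoader(name = "products")
public class ProductBatchLoader implements MappedBatchLoader<String, Product> {

    private final ProductRepository productRepository;

    public ProductBatchLoader(ProductRepository productRepository) {
        this.productRepository = productRepository;
    }

    @Override
    public CompletionStage<Map<String, Product>> load(Set<String> productIds) {
        return CompletableFuture.supplyAsync(() -> {
            // Single query for all product IDs
            List<Product> products = productRepository.findAllById(
                productIds.stream().map(UUID::fromString).toList()
            );
            return products.stream()
                .collect(Collectors.toMap(p -> p.getId().toString(), p -> p));
        });
    }
}

@DgsComponent
public class LineItemDataFetcher {

    @DgsData(parentType = "LineItem", field = "product")
    public CompletableFuture<Product> lineItemProduct(DgsDataFetchingEnvironment dfe) {
        LineItem lineItem = dfe.getSource();
        DataLoader<String, Product> loader = dfe.getDataLoader("products");
        return loader.load(lineItem.getProductId()); // batched automatically
    }
}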
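
To show why batching removes the N+1 pattern, here is a self-contained sketch of the idea behind a DataLoader, using only the JDK. It is a simplified stand-in, not the java-dataloader implementation: the `BatchLoader` class, its `dispatch` method, and the fake product data are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Hypothetical, simplified stand-in for a DataLoader: queue keys, then fetch once.
final class BatchingSketch {

    static final class BatchLoader<K, V> {
        private final Function<Set<K>, Map<K, V>> batchFn;
        private final Map<K, CompletableFuture<V>> pending = new LinkedHashMap<>();

        BatchLoader(Function<Set<K>, Map<K, V>> batchFn) {
            this.batchFn = batchFn;
        }

        // Queues the key; duplicate keys share one future.
        CompletableFuture<V> load(K key) {
            return pending.computeIfAbsent(key, k -> new CompletableFuture<>());
        }

        // Runs the batch function once for all queued keys and completes the futures.
        void dispatch() {
            Map<K, CompletableFuture<V>> batch = new LinkedHashMap<>(pending);
            pending.clear();
            if (batch.isEmpty()) {
                return;
            }
            Map<K, V> values = batchFn.apply(new LinkedHashSet<>(batch.keySet()));
            batch.forEach((key, future) -> future.complete(values.get(key)));
        }
    }

    // Returns {number of backend calls, number of resolved products}.
    static int[] run() {
        List<Set<String>> backendCalls = new ArrayList<>();
        BatchLoader<String, String> products = new BatchLoader<String, String>(ids -> {
            backendCalls.add(ids); // stands in for one "WHERE id IN (...)" query
            Map<String, String> byId = new LinkedHashMap<>();
            for (String id : ids) {
                byId.put(id, "Product " + id);
            }
            return byId;
        });

        List<CompletableFuture<String>> futures = new ArrayList<>();
        for (int i = 1; i <= 50; i++) {
            futures.add(products.load("p" + i)); // one load per line item
        }
        products.dispatch();
        long resolved = futures.stream().filter(f -> f.join() != null).count();
        return new int[] {backendCalls.size(), (int) resolved};
    }

    public static void main(String[] args) {
        int[] result = run();
        System.out.println(result[0] + " backend call(s) for " + result[1] + " products");
    }
}
```

Fifty `load` calls collapse into one call to the batch function. In a real DGS service the framework decides when to dispatch; the explicit `dispatch()` here only makes that step visible.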

Apollo Router vs Apollo Gateway: Performance Comparison

Choosing between Apollo Router (Rust-based, stable since v1.0) and Apollo Gateway (Node.js) is a key production decision:

Apollo Router: written in Rust, 5–10x lower memory footprint, sub-millisecond query planning overhead, supports Rhai scripts for custom logic, recommended for all new deployments
Apollo Gateway: Node.js, higher memory usage (~200MB vs ~20MB for Router), slower cold start, but supports JavaScript plugins for teams with existing Node.js expertise

At 10,000 queries/second, Apollo Router consumes ~120MB RAM and adds ~0.5ms planning overhead. Apollo Gateway at the same load consumes ~800MB and adds ~3–5ms. For latency-sensitive APIs, the Router is the clear choice.

Schema Composition and Breaking Change Detection

# CI pipeline schema validation with Rover CLI
# Install Rover
curl -sSL https://rover.apollo.dev/nix/latest | sh

# Check subgraph composition (run in CI before every subgraph deployment)
rover subgraph check my-graph@production \
  --name order-service \
  --schema ./src/main/resources/graphql/schema.graphqls

# Publish updated subgraph schema to Apollo Registry
rover subgraph publish my-graph@production \
  --name order-service \
  --schema ./src/main/resources/graphql/schema.graphqls \
  --routing-url https://order-service.internal/graphql

Breaking changes that rover subgraph check can flag include: removing a field, changing a field's return type from non-nullable to nullable, making an optional argument required, removing an entity's @key, and removing or renaming an argument. Non-breaking additions (new fields, new types) pass automatically.
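
The kind of comparison a schema check performs can be illustrated with a heavily simplified diff. This is not Rover's algorithm, which works on full schemas and real client traffic; the class `SchemaDiffSketch`, the flat `"Type.field" -> type` map, and the labels it emits are assumptions for illustration. The sketch treats a removed field, and a return type that becomes nullable, as breaking.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, heavily simplified schema diff; real checks cover far more cases.
final class SchemaDiffSketch {

    // Each map goes from "Type.field" to its output type, e.g. "Order.status" -> "OrderStatus!".
    static List<String> breakingChanges(Map<String, String> before, Map<String, String> after) {
        List<String> problems = new ArrayList<>();
        // TreeMap gives a stable, alphabetical report order.
        for (Map.Entry<String, String> field : new TreeMap<>(before).entrySet()) {
            String name = field.getKey();
            String oldType = field.getValue();
            String newType = after.get(name);
            if (newType == null) {
                problems.add("REMOVED " + name);
            } else if (oldType.endsWith("!") && oldType.equals(newType + "!")) {
                // Clients written against a non-null field may not handle null.
                problems.add("NOW_NULLABLE " + name);
            } else if (!stripNonNull(oldType).equals(stripNonNull(newType))) {
                problems.add("TYPE_CHANGED " + name);
            }
        }
        // Fields that exist only in `after` are additions and are not reported.
        return problems;
    }

    private static String stripNonNull(String type) {
        return type.endsWith("!") ? type.substring(0, type.length() - 1) : type;
    }

    // Illustrative before/after versions of part of the Order type.
    static List<String> example() {
        Map<String, String> before = Map.of(
            "Order.status", "OrderStatus!",
            "Order.totalAmount", "Float!",
            "Order.userId", "ID!");
        Map<String, String> after = Map.of(
            "Order.status", "OrderStatus!",
            "Order.totalAmount", "Float",   // became nullable
            "Order.note", "String");        // new field; Order.userId was removed
        return breakingChanges(before, after);
    }

    public static void main(String[] args) {
        example().forEach(System.out::println);
    }
}
```

Running the example reports the nullable `Order.totalAmount` and the removed `Order.userId`, while the added `Order.note` passes silently.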

Failure Scenarios

Subgraph Down

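When a subgraph becomes unavailable, the Router returns partial data for fields resolvable from healthy subgraphs. Fields that depend on the unavailable subgraph come back as null, with a matching entry in the response's errors array (a null in a non-nullable field propagates up to the nearest nullable parent). Set per-subgraph timeouts so a slow subgraph fails fast, and enable the Router's health check endpoint: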

# router.yaml
traffic_shaping:
  all:
    timeout: 30s
  subgraphs:
    order-service:
      timeout: 5s

health_check:
  enabled: true
  listen: 0.0.0.0:8088
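
The partial-data behaviour can be modelled in a few lines. This is a conceptual sketch, not Router code: the class `PartialResponseSketch`, the `Response` record, and the error string format are assumptions for illustration, and the failing subgraph is simulated with an already-failed future.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

// Hypothetical model of composing a partial response when one subgraph fails.
final class PartialResponseSketch {

    record Response(Map<String, Object> data, List<String> errors) {}

    // Joins every fetch; a failed fetch yields a null field plus an error entry.
    static Response compose(Map<String, CompletableFuture<Object>> fetchesByField) {
        Map<String, Object> data = new LinkedHashMap<>();
        List<String> errors = new ArrayList<>();
        fetchesByField.forEach((field, fetch) -> {
            try {
                data.put(field, fetch.join());
            } catch (RuntimeException e) {
                // join() wraps the original failure in a CompletionException.
                Throwable cause = e.getCause() != null ? e.getCause() : e;
                data.put(field, null);
                errors.add(field + ": " + cause.getMessage());
            }
        });
        return new Response(data, errors);
    }

    // One healthy subgraph and one simulated outage.
    static Response example() {
        Map<String, CompletableFuture<Object>> fetches = new LinkedHashMap<>();
        fetches.put("order", CompletableFuture.<Object>completedFuture(
            Map.of("id", "o1", "status", "SHIPPED")));
        fetches.put("reviews", CompletableFuture.<Object>failedFuture(
            new IllegalStateException("review-service unavailable")));
        return compose(fetches);
    }

    public static void main(String[] args) {
        Response response = example();
        System.out.println("data   = " + response.data());
        System.out.println("errors = " + response.errors());
    }
}
```

The client still receives the order, sees `reviews` as null, and can inspect the errors list to decide whether to retry or render a degraded page.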

Schema Composition Failure

If a subgraph publishes a schema that is incompatible with the supergraph (e.g., two services define conflicting types), composition fails. The Router continues serving the last valid supergraph. Composition errors are surfaced in Apollo Studio. Prevention: always run rover subgraph check in CI before publishing.

When NOT to Use GraphQL Federation

Simple REST APIs with <5 services: the operational overhead of Router deployment, schema composition, and Apollo Registry is not justified
Teams without GraphQL expertise: Federation adds substantial complexity on top of base GraphQL — learn base GraphQL first
Write-heavy APIs: GraphQL mutations that span federated services introduce distributed transaction complexity; prefer REST + Saga pattern for heavy write workloads
Microservices with very different SLAs: if one subgraph has 99.0% availability, the supergraph's effective availability for queries crossing that subgraph cannot exceed 99.0%
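
The availability point in the last item can be made concrete with a short calculation. Assuming subgraph failures are independent (a simplification), the chance that a query succeeds end to end is the product of the availabilities of every subgraph it touches. The class and method names below are illustrative.

```java
// Hypothetical helper for estimating the availability of a cross-subgraph query.
final class AvailabilitySketch {

    // Probability that every subgraph a query touches is up, assuming independent failures.
    static double effectiveAvailability(double... subgraphAvailabilities) {
        double result = 1.0;
        for (double availability : subgraphAvailabilities) {
            if (availability < 0.0 || availability > 1.0) {
                throw new IllegalArgumentException("Availability must be between 0 and 1");
            }
            result *= availability;
        }
        return result;
    }

    public static void main(String[] args) {
        // Two subgraphs at 99.9% and one at 99.0%.
        double combined = effectiveAvailability(0.999, 0.999, 0.990);
        System.out.printf("Effective availability: %.2f%%%n", combined * 100);
    }
}
```

With two subgraphs at 99.9% and one at 99.0%, the combined figure is about 98.8%, which is below the weakest subgraph on its own. Queries that avoid the weak subgraph are not affected.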

Key Takeaways

GraphQL Federation eliminates the frontend API orchestration problem by moving it into the Router — a single, purpose-built, high-performance layer
Entities with @key are the glue of the supergraph — design them carefully, as changing key fields is a breaking change
DataLoader is mandatory in federated GraphQL; without it, entity resolution becomes an N+1 avalanche
Apollo Router (Rust) is 5–10x more efficient than Apollo Gateway (Node.js) — use it for all new production deployments
Run rover subgraph check in every CI pipeline; never allow breaking schema changes to reach the production supergraph undetected
Netflix DGS provides mature Federation v2 support for Spring Boot teams and handles the federation boilerplate so you can focus on resolvers

Conclusion

GraphQL Federation is not a simple technology: it introduces a new coordination layer, a new failure domain (the Router), and new concepts (entities, subgraphs, query planning) that every team member must understand. But the return on that investment is real: frontend teams control their own data fetching without per-screen backend coordination, microservice teams own their subgraph schemas, and the entire organisation benefits from a single, consistent, discoverable API graph. For organisations with 5+ microservices and multiple client teams, Federation is one of the strongest API architectures available today.
