GraphQL Federation: Building Distributed Supergraph APIs for Microservices
In a microservices architecture, frontend teams face a brutal reality: to render a single product page, they must orchestrate calls to the product service, user service, inventory service, review service, and recommendation service. GraphQL Federation addresses this with a unified supergraph: one endpoint and one composed schema, so frontend teams can fetch data across services without coordinating with each service team.
The Problem: API Explosion in Microservices
A mid-size e-commerce platform migrated to microservices and immediately created a problem for their frontend team. Loading the order detail page required:
- GET /users/{id}: user name, email, preferences
- GET /orders/{id}: order details, line items
- GET /products/{ids}: product details for each line item (N+1!)
- GET /inventory/{productIds}: stock levels
- GET /reviews?productId={id}: reviews for each product
- GET /recommendations?userId={id}: personalised upsells
Six API calls with waterfall dependencies. Mobile clients on 3G connections saw 2–4 second page loads. The BFF (Backend for Frontend) pattern was considered — but creating a dedicated BFF for every client type meant duplicating business logic and adding another service for every team to maintain.
GraphQL Federation solved this with a single query that the router decomposes and fans out to the appropriate subgraphs in parallel, returning a single composed response.
Federation v2 Architecture: Router, Subgraphs, Supergraph
┌─────────────────────────────────────────────────────────────┐
│                           CLIENTS                           │
│          Web App       Mobile App       Partner API         │
└───────────────────────────┬─────────────────────────────────┘
                            │ Single GraphQL query
┌───────────────────────────▼─────────────────────────────────┐
│                        APOLLO ROUTER                        │
│ - Query planning (decomposes supergraph query)              │
│ - Subgraph fan-out (parallel where possible)                │
│ - Response composition                                      │
│ - Auth, rate limiting, caching, tracing                     │
└────┬──────────────┬────────────────┬────────────┬───────────┘
     │              │                │            │
┌────▼─────┐ ┌──────▼──────┐   ┌─────▼────┐   ┌───▼─────────┐
│   User   │ │    Order    │   │ Product  │   │   Review    │
│ Service  │ │   Service   │   │ Service  │   │   Service   │
│(subgraph)│ │ (subgraph)  │   │(subgraph)│   │ (subgraph)  │
└──────────┘ └─────────────┘   └──────────┘   └─────────────┘
The supergraph is the composed schema representing the union of all subgraph schemas. Clients query the supergraph. For each incoming operation, the Router builds a query plan: a directed execution graph that determines which subgraphs to call, in what order, and with what inputs. The Router never touches a database; it only orchestrates subgraph queries.
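To make the idea of a query plan concrete, here is a minimal Java sketch that models a plan as a tree of fetch, sequence, and parallel nodes. The names `QueryPlanSketch`, `PlanNode`, `Fetch`, `Sequence`, `Parallel`, `stages`, and `fetchCount` are illustrative assumptions, not the Router's internal types, and the example plan is just one plausible shape for the four subgraphs in the diagram.

```java
import java.util.List;

/** Hypothetical model of a query plan; not the Router's internal types. */
final class QueryPlanSketch {

    /** A node in the plan tree. */
    interface PlanNode {}

    /** One request to a single subgraph. */
    record Fetch(String subgraph) implements PlanNode {}

    /** Steps that must run in order because later steps need earlier results. */
    record Sequence(List<PlanNode> steps) implements PlanNode {}

    /** Independent branches that can run at the same time. */
    record Parallel(List<PlanNode> branches) implements PlanNode {}

    /** Sequential round trips on the critical path of the plan. */
    static int stages(PlanNode node) {
        if (node instanceof Sequence seq) {
            return seq.steps().stream().mapToInt(QueryPlanSketch::stages).sum();
        }
        if (node instanceof Parallel par) {
            return par.branches().stream().mapToInt(QueryPlanSketch::stages).max().orElse(0);
        }
        return 1; // a single Fetch
    }

    /** Total subgraph requests issued by the plan. */
    static int fetchCount(PlanNode node) {
        if (node instanceof Sequence seq) {
            return seq.steps().stream().mapToInt(QueryPlanSketch::fetchCount).sum();
        }
        if (node instanceof Parallel par) {
            return par.branches().stream().mapToInt(QueryPlanSketch::fetchCount).sum();
        }
        return 1; // a single Fetch
    }

    /** Example: load the order, then the user in parallel with product followed by reviews. */
    static PlanNode examplePlan() {
        return new Sequence(List.of(
            new Fetch("order-service"),
            new Parallel(List.of(
                new Fetch("user-service"),
                new Sequence(List.of(
                    new Fetch("product-service"),
                    new Fetch("review-service")))))));
    }

    public static void main(String[] args) {
        PlanNode plan = examplePlan();
        System.out.println("fetches=" + fetchCount(plan) + " stages=" + stages(plan));
    }
}
```

In this sketch the plan issues four subgraph requests but only three of them sit on the critical path, which is the saving that parallel fan-out buys.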
Entities and @key: How Services Share Types
The most important concept in Federation is the entity. An entity is a type that can be referenced and extended across multiple subgraphs. It is identified by a @key directive specifying its unique identifier fields.
User Service Subgraph (entity owner)
type User @key(fields: "id") {
  id: ID!
  name: String!
  email: String!
  profilePictureUrl: String
  createdAt: DateTime!
}
type Query {
  user(id: ID!): User
  me: User
}
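Under the hood, the Router resolves entities by sending a subgraph a list of representations through the `_entities` field, each carrying a `__typename` plus the `@key` fields. The Java sketch below shows one way a subgraph could dispatch those representations to per-type resolvers. `EntityDispatchSketch`, `register`, and `resolveAll` are hypothetical names for illustration; frameworks such as DGS do this wiring for you.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch of how a subgraph could answer an _entities request. */
final class EntityDispatchSketch {

    /** Minimal entity type for the example. */
    record User(String id, String name) {}

    // Maps a __typename to the function that loads that entity from its key fields.
    private final Map<String, Function<Map<String, Object>, Object>> resolvers = new HashMap<>();

    void register(String typename, Function<Map<String, Object>, Object> resolver) {
        resolvers.put(typename, resolver);
    }

    /** Resolves each representation, returning results in the same order as the input. */
    List<Object> resolveAll(List<Map<String, Object>> representations) {
        List<Object> entities = new ArrayList<>();
        for (Map<String, Object> representation : representations) {
            String typename = (String) representation.get("__typename");
            Function<Map<String, Object>, Object> resolver = resolvers.get(typename);
            if (resolver == null) {
                throw new IllegalArgumentException("No entity resolver for " + typename);
            }
            entities.add(resolver.apply(representation));
        }
        return entities;
    }
}
```

Keeping the output in input order matters because the Router matches each returned entity to its representation by position.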
Order Service Subgraph (entity extension)
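# Federation v2 also accepts a plain `type User @key(fields: "id")` here,
# without `extend` or `@external` on the key field.
extend type User @key(fields: "id") {
  id: ID! @external # Owned by user-service
  orders(first: Int = 10): [Order!]!
  orderCount: Int!
}
type Order @key(fields: "id") {
  id: ID!
  userId: ID!
  user: User! # Stub carrying only id; name and email come from user-service
  status: OrderStatus!
  lineItems: [LineItem!]!
  totalAmount: Float!
  createdAt: DateTime!
}
type LineItem {
  productId: ID!
  quantity: Int!
  unitPrice: Float!
  product: Product # Resolved via product-service entity
}
# Reference-only stub so this subgraph can return Product keys
type Product @key(fields: "id", resolvable: false) {
  id: ID!
}
type Query {
  order(id: ID!): Order
  orders(userId: ID!): [Order!]!
}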
With this schema, a client can query:
query OrderDetailPage($orderId: ID!) {
  order(id: $orderId) {
    id
    status
    totalAmount
    user { # Resolved by user-service
      name
      email
    }
    lineItems {
      quantity
      product { # Resolved by product-service
        name
        imageUrl
        inventory { # Resolved by inventory-service
          stockLevel
        }
        reviews(first: 3) { # Resolved by review-service
          rating
          body
        }
      }
    }
  }
}
The Router decomposes this query, fans out to five subgraphs (order, user, product, inventory, and review), running fetches in parallel where dependencies allow, and returns a single composed response. What was 6+ sequential HTTP calls becomes one round-trip from the client's perspective.
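The fan-out for this query can be sketched in Java with `CompletableFuture`. This is only an illustration of the dependency ordering, not how the Router is implemented: `FanOutSketch`, `fetch`, and `orderDetail` are hypothetical names, and `fetch` is a stand-in that builds a string instead of sending a GraphQL request over HTTP.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

/** Hypothetical sketch of dependency-aware fan-out; subgraph calls are simulated. */
final class FanOutSketch {

    /** Stand-in for a subgraph request; a real router would send GraphQL over HTTP. */
    static CompletableFuture<String> fetch(String subgraph, String input) {
        return CompletableFuture.supplyAsync(() -> subgraph + "(" + input + ")");
    }

    static Map<String, String> orderDetail(String orderId) {
        // Stage 1: everything else depends on the order.
        CompletableFuture<String> order = fetch("order-service", orderId);

        // Stage 2: user and product lookups only need the order, so they start together.
        CompletableFuture<String> user = order.thenCompose(o -> fetch("user-service", o));
        CompletableFuture<String> product = order.thenCompose(o -> fetch("product-service", o));

        // Stage 3: inventory and reviews both need product keys and run in parallel.
        CompletableFuture<String> inventory = product.thenCompose(p -> fetch("inventory-service", p));
        CompletableFuture<String> reviews = product.thenCompose(p -> fetch("review-service", p));

        // Wait for the leaves; their upstream futures are complete by then.
        CompletableFuture.allOf(user, inventory, reviews).join();

        Map<String, String> response = new LinkedHashMap<>();
        response.put("order", order.join());
        response.put("user", user.join());
        response.put("product", product.join());
        response.put("inventory", inventory.join());
        response.put("reviews", reviews.join());
        return response;
    }
}
```

Five subgraph requests are issued, but the client waits for only three sequential stages.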
@requires and @provides for Computed Fields
Sometimes a field in service B requires a field from service A that the Router would not normally fetch. @requires solves this:
# In the review service subgraph
extend type Product @key(fields: "id") {
  id: ID! @external
  category: String @external # Owned by product-service
  # This field requires category from product-service to compute
  similarProductReviews: [Review!]! @requires(fields: "category")
}
# The Router will fetch Product.category from product-service
# before calling review-service to resolve similarProductReviews
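On the subgraph side, the effect of `@requires` is that the representation passed to the entity resolver carries the required field alongside the key. The Java sketch below assumes a representation map with `id` and `category` entries and an in-memory review list. `RequiresSketch` and its `Review` record are hypothetical, and a real resolver would query the review store instead.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a resolver that depends on a @requires field. */
final class RequiresSketch {

    record Review(String productId, String category, int rating) {}

    private final List<Review> reviews;

    RequiresSketch(List<Review> reviews) {
        this.reviews = List.copyOf(reviews);
    }

    /**
     * The representation holds __typename, the @key field (id), and the
     * @requires field (category) that the Router fetched from product-service.
     */
    List<Review> similarProductReviews(Map<String, Object> representation) {
        String productId = (String) representation.get("id");
        String category = (String) representation.get("category");
        if (category == null) {
            throw new IllegalStateException("category missing from representation");
        }
        // Reviews in the same category, excluding the product itself.
        return reviews.stream()
            .filter(r -> r.category().equals(category))
            .filter(r -> !r.productId().equals(productId))
            .toList();
    }
}
```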
@provides works in the other direction: it tells the Router that a particular field in this subgraph can also return certain fields owned by another subgraph, so the Router can skip the extra round-trip to the owning service:
type Order @key(fields: "id") {
  id: ID!
  lineItems: [LineItem!]!
}
type LineItem {
  # Order service provides product name alongside line items
  # Router doesn't need to call product-service for just the name
  product: Product @provides(fields: "name")
}
extend type Product @key(fields: "id") {
  id: ID! @external
  name: String @external
}
Netflix DGS Framework: Spring Boot Implementation
Netflix DGS (Domain Graph Service) is a widely used Spring Boot GraphQL framework with built-in Apollo Federation support:
<!-- pom.xml -->
<!-- Check these coordinates and versions against the DGS release you target. -->
<dependency>
  <groupId>com.netflix.graphql.dgs</groupId>
  <artifactId>graphql-dgs-spring-boot-starter</artifactId>
  <version>8.4.0</version>
</dependency>
<dependency>
  <groupId>com.netflix.graphql.dgs</groupId>
  <artifactId>graphql-dgs-federation-graphql-java-support</artifactId>
  <version>8.4.0</version>
</dependency>
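// Order service DGS data fetcher with Federation entity resolver
import com.netflix.graphql.dgs.DgsComponent;
import com.netflix.graphql.dgs.DgsData;
import com.netflix.graphql.dgs.DgsDataFetchingEnvironment;
import com.netflix.graphql.dgs.DgsEntityFetcher;
import com.netflix.graphql.dgs.DgsQuery;
import com.netflix.graphql.dgs.InputArgument;
import com.netflix.graphql.dgs.exceptions.DgsEntityNotFoundException;
import org.dataloader.DataLoader;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;

@DgsComponent
public class OrderDataFetcher {
    private final OrderRepository orderRepository;

    // Constructor injection so the final field is always initialised
    public OrderDataFetcher(OrderRepository orderRepository) {
        this.orderRepository = orderRepository;
    }

    @DgsQuery
    public Order order(@InputArgument String id) {
        return orderRepository.findById(UUID.fromString(id))
            .orElseThrow(() -> new DgsEntityNotFoundException("Order not found: " + id));
    }

    // Entity resolver: called by Router to resolve User.orders
    // The @key field (user.id) is provided by the Router
    @DgsEntityFetcher(name = "User")
    public User resolveUser(Map<String, Object> values) {
        // We only need to return a stub User with id for further resolution
        return new User((String) values.get("id"));
    }

    @DgsData(parentType = "User", field = "orders")
    public CompletableFuture<List<Order>> userOrders(DgsDataFetchingEnvironment dfe) {
        User user = dfe.getSource();
        // Use DataLoader to batch multiple user.orders requests
        DataLoader<String, List<Order>> loader = dfe.getDataLoader("ordersForUser");
        return loader.load(user.getId());
    }
}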
N+1 Problem in Federated GraphQL: DataLoader Pattern
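Federation introduces a new N+1 vector. The Router batches the entity lookups for 50 line items into a single _entities request to product-service, but a naive resolver inside the subgraph still runs once per item and can issue 50 individual product queries. DataLoader batches these into a single query: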
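import com.netflix.graphql.dgs.DgsDataLoader;
import org.dataloader.MappedBatchLoader;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.stream.Collectors;

@DgsDataLoader(name = "products")
public class ProductBatchLoader implements MappedBatchLoader<String, Product> {
    private final ProductRepository productRepository;

    public ProductBatchLoader(ProductRepository productRepository) {
        this.productRepository = productRepository;
    }

    @Override
    public CompletionStage<Map<String, Product>> load(Set<String> productIds) {
        return CompletableFuture.supplyAsync(() -> {
            // Single query for all product IDs
            List<Product> products = productRepository.findAllById(
                productIds.stream().map(UUID::fromString).toList()
            );
            return products.stream()
                .collect(Collectors.toMap(p -> p.getId().toString(), p -> p));
        });
    }
}

// In a @DgsComponent class: resolve LineItem.product through the loader
@DgsData(parentType = "LineItem", field = "product")
public CompletableFuture<Product> lineItemProduct(DgsDataFetchingEnvironment dfe) {
    LineItem lineItem = dfe.getSource();
    DataLoader<String, Product> loader = dfe.getDataLoader("products");
    return loader.load(lineItem.getProductId()); // batched automatically
}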
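To show what "batched automatically" means mechanically, here is a stripped-down batching loader in plain Java. It is not the java-dataloader implementation that DGS uses, only a sketch of the same idea: `load` records the key and hands back a pending future, and `dispatch` resolves every pending key with one call to the batch function. `BatchingSketch` and its method names are hypothetical, and the class is not thread-safe.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

/** Hypothetical single-threaded sketch of the batching idea behind DataLoader. */
final class BatchingSketch<K, V> {

    private final Function<Set<K>, Map<K, V>> batchFunction;
    // Keys requested since the last dispatch, each with the future handed to the caller.
    private final Map<K, CompletableFuture<V>> pending = new LinkedHashMap<>();

    BatchingSketch(Function<Set<K>, Map<K, V>> batchFunction) {
        this.batchFunction = batchFunction;
    }

    /** Queues the key; repeated keys share one future, so they are fetched once. */
    CompletableFuture<V> load(K key) {
        return pending.computeIfAbsent(key, k -> new CompletableFuture<>());
    }

    /** Resolves every queued key with a single call to the batch function. */
    void dispatch() {
        if (pending.isEmpty()) {
            return;
        }
        Map<K, CompletableFuture<V>> batch = new LinkedHashMap<>(pending);
        pending.clear();
        Map<K, V> values = batchFunction.apply(batch.keySet());
        // Keys missing from the result complete with null.
        batch.forEach((key, future) -> future.complete(values.get(key)));
    }
}
```

With 50 line items, 50 calls to `load` followed by one `dispatch` produce a single batch call, which is the behaviour the `@DgsDataLoader` above relies on.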
Apollo Router vs Apollo Gateway: Performance Comparison
Apollo Router (Rust-based, v1.0+ stable) vs Apollo Gateway (Node.js) is a critical production choice:
- Apollo Router: written in Rust, 5–10x lower memory footprint, sub-millisecond query planning overhead, supports Rhai scripts for custom logic, recommended for all new deployments
- Apollo Gateway: Node.js, higher memory usage (~200MB vs ~20MB for Router), slower cold start, but supports JavaScript plugins for teams with existing Node.js expertise
At 10,000 queries/second, Apollo Router consumes ~120MB RAM and adds ~0.5ms planning overhead. Apollo Gateway at the same load consumes ~800MB and adds ~3–5ms. For latency-sensitive APIs, the Router is the clear choice.
Schema Composition and Breaking Change Detection
# CI pipeline schema validation with Rover CLI
# Install Rover
curl -sSL https://rover.apollo.dev/nix/latest | sh
# Check subgraph composition (run in CI before every subgraph deployment)
rover subgraph check my-graph@production \
  --name order-service \
  --schema ./src/main/resources/graphql/schema.graphqls
# Publish updated subgraph schema to Apollo Registry
rover subgraph publish my-graph@production \
  --name order-service \
  --schema ./src/main/resources/graphql/schema.graphqls \
  --routing-url https://order-service.internal/graphql
Breaking changes detected by Rover include: removing a field, changing a field's return type from non-nullable to nullable, making an optional argument required, removing an entity's @key, and renaming arguments. Non-breaking additions (new fields, new types) pass automatically.
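The kind of comparison such a check performs can be sketched in a few lines of Java. This is not Rover's algorithm, only an illustration for the output fields of a single type, where each field maps to its GraphQL type string. `SchemaDiffSketch`, `breakingChanges`, and `baseType` are hypothetical names, and the sketch treats a field that goes from non-nullable to nullable as breaking because clients may not handle the new null.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of breaking-change detection for one type's output fields. */
final class SchemaDiffSketch {

    /** Compares field name -> type string maps and lists the breaking differences. */
    static List<String> breakingChanges(Map<String, String> oldFields,
                                        Map<String, String> newFields) {
        List<String> problems = new ArrayList<>();
        // TreeMap gives a stable, alphabetical report order.
        for (Map.Entry<String, String> entry : new TreeMap<>(oldFields).entrySet()) {
            String name = entry.getKey();
            String oldType = entry.getValue();
            String newType = newFields.get(name);
            if (newType == null) {
                problems.add("removed field: " + name);
            } else if (!baseType(oldType).equals(baseType(newType))) {
                problems.add("changed type: " + name);
            } else if (oldType.endsWith("!") && !newType.endsWith("!")) {
                problems.add("became nullable: " + name);
            }
            // Nullable -> non-nullable output fields and brand-new fields are allowed.
        }
        return problems;
    }

    /** Strips the trailing non-null marker, e.g. "String!" -> "String". */
    private static String baseType(String type) {
        return type.endsWith("!") ? type.substring(0, type.length() - 1) : type;
    }
}
```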
Failure Scenarios
Subgraph Down
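When a subgraph becomes unavailable, the Router returns partial data for fields resolvable from healthy subgraphs. Fields requiring the unavailable subgraph return null, with a matching entry in the response's errors array; if such a field is non-nullable, the null propagates up to the nearest nullable parent. Configure the Router with per-subgraph timeouts and a health check endpoint: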
# router.yaml
traffic_shaping:
  all:
    timeout: 30s
  subgraphs:
    order-service:
      timeout: 5s
health_check:
  enabled: true
  listen: 0.0.0.0:8088
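The partial-data behaviour can be illustrated with a small Java sketch: each top-level field is fetched with a timeout, and a failed or slow fetch becomes a null value plus an error entry instead of failing the whole response. This only mirrors the shape of the result, not the Router's implementation. `PartialResponseSketch`, `Response`, and `compose` are hypothetical names, and the suppliers stand in for subgraph calls.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical sketch of composing a response when some subgraphs fail. */
final class PartialResponseSketch {

    /** Mirrors the GraphQL response shape: data plus a list of errors. */
    record Response(Map<String, Object> data, List<String> errors) {}

    static Response compose(Map<String, Supplier<Object>> fieldFetchers, long timeoutMillis) {
        // Start every fetch first so they run concurrently, each with its own timeout.
        Map<String, CompletableFuture<Object>> inFlight = new LinkedHashMap<>();
        fieldFetchers.forEach((field, fetcher) -> inFlight.put(field,
            CompletableFuture.supplyAsync(fetcher).orTimeout(timeoutMillis, TimeUnit.MILLISECONDS)));

        Map<String, Object> data = new LinkedHashMap<>();
        List<String> errors = new ArrayList<>();
        inFlight.forEach((field, future) -> {
            try {
                data.put(field, future.join());
            } catch (CompletionException e) {
                // Failure or timeout: keep the field, null it out, and record an error.
                data.put(field, null);
                errors.add(field + " unavailable");
            }
        });
        return new Response(data, errors);
    }
}
```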
Schema Composition Failure
If a subgraph publishes a schema that is incompatible with the supergraph (e.g., two services define conflicting types), composition fails. The Router continues serving the last valid supergraph. Composition errors are surfaced in Apollo Studio. Prevention: always run rover subgraph check in CI before publishing.
When NOT to Use GraphQL Federation
- Simple REST APIs with <5 services: the operational overhead of Router deployment, schema composition, and Apollo Registry is not justified
- Teams without GraphQL expertise: Federation adds substantial complexity on top of base GraphQL — learn base GraphQL first
- Write-heavy APIs: GraphQL mutations across federated services introduce distributed transaction complexity; prefer REST + Saga pattern for heavy write workloads
- Microservices with very different SLAs: if one subgraph has 99.0% availability, the supergraph's effective availability for queries crossing that subgraph cannot exceed 99.0%
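The availability point in the last bullet is simple arithmetic. Assuming subgraph failures are independent and a query needs every subgraph it touches, the combined availability is the product of the individual figures. The Java helper below is a hypothetical illustration, and the example percentages are assumptions chosen for the arithmetic, not measurements.

```java
/** Hypothetical helper: combined availability of a query that needs every listed subgraph. */
final class AvailabilitySketch {

    /** Multiplies the availabilities, assuming independent failures. */
    static double combined(double... availabilities) {
        double result = 1.0;
        for (double availability : availabilities) {
            result *= availability;
        }
        return result;
    }

    public static void main(String[] args) {
        // Two subgraphs at 99.9% and one at 99.0%.
        System.out.printf("%.4f%n", combined(0.999, 0.999, 0.99));
    }
}
```

Two subgraphs at 99.9% and one at 99.0% give roughly 98.8% for queries that cross all three, which is below the weakest subgraph on its own.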
Key Takeaways
- GraphQL Federation eliminates the frontend API orchestration problem by moving it into the Router — a single, purpose-built, high-performance layer
- Entities with @key are the glue of the supergraph; design them carefully, as changing key fields is a breaking change
- DataLoader is mandatory in federated GraphQL; without it, entity resolution becomes an N+1 avalanche
- Apollo Router (Rust) is 5–10x more efficient than Apollo Gateway (Node.js); use it for all new production deployments
- Run rover subgraph check in every CI pipeline; never allow breaking schema changes to reach the production supergraph undetected
- Netflix DGS provides strong Federation v2 support for Spring Boot teams; it handles the federation boilerplate so you focus on resolvers
Conclusion
GraphQL Federation is not a simple technology: it introduces a new coordination layer, a new failure domain (the Router), and new concepts (entities, subgraphs, query planning) that every team member must understand. But the return on that investment is real: frontend teams gain autonomy over data fetching without per-page backend coordination, microservice teams keep ownership of their subgraph schemas, and the entire organisation benefits from a single, consistent, discoverable API graph. For organisations with 5+ microservices and multiple client teams, Federation is one of the strongest API architectures available today.