GraphQL APIs: what hurts later if you skip it at design time
People remember “ask for only the fields you need.” Fewer think about how many times the resolvers run. A query that looks fine in GraphiQL can turn into a DB shredder under load. This is not a syntax tutorial; it is the stuff we wish we had nailed when shipping internal and partner GraphQL APIs.
The schema is a contract, not just types
As the SDL grows, so do hallway debates: when is this nullable, who pays for this edge? Mark expensive fields (external APIs, heavy aggregates) during review, or split them into dedicated queries, and you argue less later.
Nothing stops a client from nesting user { posts { author { posts { ... } } } } unless you add guardrails.
Naive resolvers invite N+1
Classic pattern: User.posts loads the posts in one query, then Post.author loads each author one by one. Fifty posts, fifty author fetches on top of the first query. It hides in dev, where the data set is small, and explodes in staging.
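To make the arithmetic concrete, here is a minimal sketch that counts queries against a fake in-memory store. Everything in it (findPosts, findUserById, the two-user data set) is made up for illustration; a real resolver would hit a database or another service.

```typescript
// Hypothetical in-memory "database" that counts every query it serves.
type N1User = { id: number; name: string };
type N1Post = { id: number; authorId: number };

const n1Users: N1User[] = [
  { id: 1, name: "ada" },
  { id: 2, name: "grace" },
];
const n1Posts: N1Post[] = Array.from({ length: 50 }, (_, i) => ({
  id: i + 1,
  authorId: (i % 2) + 1,
}));

let n1QueryCount = 0;

function findPosts(): N1Post[] {
  n1QueryCount++; // one query for the whole list
  return n1Posts;
}

function findUserById(id: number): N1User | undefined {
  n1QueryCount++; // one query per call
  return n1Users.find((user) => user.id === id);
}

// Naive resolution: load the list, then look up each author separately,
// the way a per-field Post.author resolver would.
function resolveFeedNaively(): { postId: number; authorName: string }[] {
  return findPosts().map((post) => ({
    postId: post.id,
    authorName: findUserById(post.authorId)?.name ?? "unknown",
  }));
}

const naiveFeed = resolveFeedNaively();
console.log(`${naiveFeed.length} posts, ${n1QueryCount} queries`); // 50 posts, 51 queries
```

Note that only two distinct authors exist here, yet the naive path still issues fifty author lookups.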
DataLoader is closer to table stakes than a “trick”
It batches the individual loads issued while resolving a single request and deduplicates repeated keys. Think minimum viable production, not optional optimization.
Create loaders per request and throw them away. A global singleton loader leaks cache across requests and breeds weird bugs.
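The sketch below shows the per-request idea with a hand-rolled stand-in, not the real dataloader package. TinyLoader, fetchUsersByIds, and createRequestContext are hypothetical names, and the batch function is assumed to be async and to return values in the same order as the keys it receives.

```typescript
// Hand-rolled stand-in for a DataLoader-style batcher (illustration only).
type BatchFn<K, V> = (keys: readonly K[]) => Promise<V[]>;
type Pending<K, V> = {
  key: K;
  resolve: (value: V) => void;
  reject: (reason: unknown) => void;
};

class TinyLoader<K, V> {
  private readonly batchFn: BatchFn<K, V>;
  private readonly cache = new Map<K, Promise<V>>();
  private queue: Pending<K, V>[] = [];

  constructor(batchFn: BatchFn<K, V>) {
    this.batchFn = batchFn;
  }

  load(key: K): Promise<V> {
    // Repeated keys reuse the same promise for the lifetime of this loader.
    const cached = this.cache.get(key);
    if (cached !== undefined) return cached;

    const promise = new Promise<V>((resolve, reject) => {
      // The first key in an empty queue schedules one flush on the microtask queue.
      if (this.queue.length === 0) {
        Promise.resolve().then(() => this.flush());
      }
      this.queue.push({ key, resolve, reject });
    });
    this.cache.set(key, promise);
    return promise;
  }

  private flush(): void {
    const batch = this.queue;
    this.queue = [];
    this.batchFn(batch.map((item) => item.key)).then(
      (values) => batch.forEach((item, i) => item.resolve(values[i] as V)),
      (error) => batch.forEach((item) => item.reject(error)),
    );
  }
}

// Hypothetical data source that counts batch calls.
const loaderUsers = new Map<number, string>([[1, "ada"], [2, "grace"]]);
let loaderBatchCalls = 0;

async function fetchUsersByIds(ids: readonly number[]): Promise<string[]> {
  loaderBatchCalls++;
  return ids.map((id) => loaderUsers.get(id) ?? "unknown");
}

// Call this once per incoming request so no cache outlives the request.
function createRequestContext(): { userLoader: TinyLoader<number, string> } {
  return { userLoader: new TinyLoader(fetchUsersByIds) };
}
```

Within one context, loading ids 1, 2, 1, 2 in the same tick should reach fetchUsersByIds once with the keys [1, 2]. A second request gets a fresh loader and therefore a fresh cache, which is the point of building it in the context factory.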
Depth and cost limits catch honest mistakes too
An accidentally deep query from an internal tool can peg the server. graphql-depth-limit and graphql-query-complexity (or similar) set a ceiling. That is as much about team safety as it is about security.
Tune numbers to your service; the win is agreeing why those numbers exist.
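As a rough illustration of what such a ceiling checks, here is a sketch over a simplified selection tree. FieldNode is a made-up shape, not the graphql-js AST, and the additive cost model is deliberately simplistic (real complexity libraries typically also account for list sizes and arguments). The limits in the example are arbitrary.

```typescript
// Simplified, hypothetical selection tree: a field and its sub-selections.
type FieldNode = { name: string; cost?: number; children?: FieldNode[] };

// Depth counts the field itself plus its deepest chain of sub-selections.
function selectionDepth(field: FieldNode): number {
  const children = field.children ?? [];
  if (children.length === 0) return 1;
  return 1 + Math.max(...children.map((child) => selectionDepth(child)));
}

// Cost is the sum of per-field costs; fields without a cost count as 1.
function selectionCost(field: FieldNode): number {
  const own = field.cost ?? 1;
  const nested = (field.children ?? []).reduce(
    (sum, child) => sum + selectionCost(child),
    0,
  );
  return own + nested;
}

// Returns a list of human-readable violations; empty means the query passes.
function checkLimits(
  root: FieldNode,
  limits: { maxDepth: number; maxCost: number },
): string[] {
  const problems: string[] = [];
  const depth = selectionDepth(root);
  const cost = selectionCost(root);
  if (depth > limits.maxDepth) {
    problems.push(`depth ${depth} exceeds limit ${limits.maxDepth}`);
  }
  if (cost > limits.maxCost) {
    problems.push(`cost ${cost} exceeds limit ${limits.maxCost}`);
  }
  return problems;
}

// user { posts { author { posts { title } } } }, with posts marked as expensive.
const nestedQuery: FieldNode = {
  name: "user",
  children: [{
    name: "posts",
    cost: 10,
    children: [{
      name: "author",
      children: [{ name: "posts", cost: 10, children: [{ name: "title" }] }],
    }],
  }],
};

console.log(checkLimits(nestedQuery, { maxDepth: 4, maxCost: 20 }));
// [ 'depth 5 exceeds limit 4', 'cost 23 exceeds limit 20' ]
```

Returning the violations instead of throwing keeps the messages available for logs, which helps when the team later argues about whether a limit is too tight.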
Without per-field auth, you leak more than under REST
One URL, dozens of fields. You end up guarding Query.me vs public lists, User.email for self vs admin, and so on. Those checks live at the resolver layer, not in a single middleware.
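One way to keep those checks readable is a small wrapper around individual resolvers. The sketch below uses a simplified resolver signature (parent, context) instead of the usual four arguments, and the names guard, isSelfOrAdmin, and Viewer are assumptions for illustration. Returning null for a denied field is one possible policy; throwing a typed error is another.

```typescript
// Hypothetical request context: who is asking, if anyone.
type Viewer = { id: string; role: "user" | "admin" } | null;
type AuthContext = { viewer: Viewer };
type UserRecord = { id: string; name: string; email: string };

// Simplified resolver shape for the sketch.
type Resolver<P, R> = (parent: P, ctx: AuthContext) => R;

// Wraps a resolver so it only runs when the rule allows it.
function guard<P, R>(
  allowed: (parent: P, ctx: AuthContext) => boolean,
  resolve: Resolver<P, R>,
): Resolver<P, R | null> {
  return (parent, ctx) => (allowed(parent, ctx) ? resolve(parent, ctx) : null);
}

const isSelfOrAdmin = (user: UserRecord, ctx: AuthContext): boolean =>
  ctx.viewer !== null &&
  (ctx.viewer.id === user.id || ctx.viewer.role === "admin");

const userResolvers = {
  // Public field: no guard.
  name: (user: UserRecord): string => user.name,
  // Sensitive field: visible to the user themselves or to an admin.
  email: guard(isSelfOrAdmin, (user: UserRecord) => user.email),
};

const ada: UserRecord = { id: "u1", name: "Ada", email: "ada@example.com" };

console.log(userResolvers.email(ada, { viewer: { id: "u1", role: "user" } })); // ada@example.com
console.log(userResolvers.email(ada, { viewer: { id: "u2", role: "user" } })); // null
console.log(userResolvers.email(ada, { viewer: null })); // null
```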
Field-level Redis is not the default answer
Samples love Redis keys per field. Production fights invalidation after mutations. Read-mostly public data can work; high-churn per-user graphs often get low hit rates and high bug rates. CDNs and HTTP caching rarely map cleanly to POST + GraphQL either.
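The invalidation problem is easy to reproduce with a toy version. In this sketch a Map stands in for Redis and another Map for the database; resolveDisplayName, renameUser, and the key format are invented for the example.

```typescript
// Stand-ins: one Map for the source of truth, one for the field cache.
const cacheDemoDb = new Map<string, string>([["u1", "Ada"]]);
const fieldCache = new Map<string, string>();

const displayNameKey = (userId: string): string => `User:${userId}:displayName`;

// Field resolver that reads through the cache.
function resolveDisplayName(userId: string): string {
  const key = displayNameKey(userId);
  const hit = fieldCache.get(key);
  if (hit !== undefined) return hit;
  const value = cacheDemoDb.get(userId) ?? "unknown";
  fieldCache.set(key, value);
  return value;
}

// Mutation; the flag makes the forgotten-invalidation case visible.
function renameUser(userId: string, newName: string, invalidate: boolean): void {
  cacheDemoDb.set(userId, newName);
  if (invalidate) fieldCache.delete(displayNameKey(userId));
}

const beforeRename = resolveDisplayName("u1"); // "Ada", now cached
renameUser("u1", "Ada L.", false);
const staleRead = resolveDisplayName("u1"); // still "Ada": the cache was never told
renameUser("u1", "Ada Lovelace", true);
const freshRead = resolveDisplayName("u1"); // "Ada Lovelace"
```

With one field and one mutation this is trivial. The trouble starts when every mutation has to know every cached field it can affect.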
Subscriptions are mostly an infrastructure topic
In-memory PubSub demos are easy; multiple app instances need a shared bus (Redis, etc.) or subscribers on the wrong box never see events.
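The failure mode can be simulated in a few lines. Here each AppInstance represents one server process, and InMemoryPubSub is a toy bus; passing the same bus object to two instances stands in for a shared broker such as Redis. All names are hypothetical.

```typescript
type Listener = (payload: string) => void;

// Toy bus: delivers only to listeners registered on this same object.
class InMemoryPubSub {
  private readonly listeners = new Map<string, Listener[]>();

  subscribe(topic: string, listener: Listener): void {
    const current = this.listeners.get(topic) ?? [];
    this.listeners.set(topic, [...current, listener]);
  }

  publish(topic: string, payload: string): void {
    for (const listener of this.listeners.get(topic) ?? []) {
      listener(payload);
    }
  }
}

// One simulated server process holding client subscriptions.
class AppInstance {
  readonly received: string[] = [];
  private readonly bus: InMemoryPubSub;

  constructor(bus: InMemoryPubSub) {
    this.bus = bus;
  }

  openSubscription(topic: string): void {
    this.bus.subscribe(topic, (payload) => this.received.push(payload));
  }

  runMutation(topic: string, payload: string): void {
    this.bus.publish(topic, payload);
  }
}

// Each instance has its own bus: the subscriber sits on the wrong box.
const isolatedA = new AppInstance(new InMemoryPubSub());
const isolatedB = new AppInstance(new InMemoryPubSub());
isolatedB.openSubscription("postAdded");
isolatedA.runMutation("postAdded", "post-1");
console.log(isolatedB.received); // []

// Both instances share one bus: the event arrives.
const sharedBus = new InMemoryPubSub();
const sharedA = new AppInstance(sharedBus);
const sharedB = new AppInstance(sharedBus);
sharedB.openSubscription("postAdded");
sharedA.runMutation("postAdded", "post-1");
console.log(sharedB.received); // [ 'post-1' ]
```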
Pick one error shape and stick to it
Decide how you use extensions.code, which HTTP status you return, and whether stack traces are masked in prod. Pick a house style per service so clients do not fork into special cases per field.
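A house style can be as small as one formatting function that every error passes through. The sketch below is one possible shape, not a server library's API: the code list, appError, and formatError are assumptions, and how you hook it into your server depends on the framework.

```typescript
// Hypothetical closed set of codes the service promises to clients.
type ErrorCode = "UNAUTHENTICATED" | "FORBIDDEN" | "NOT_FOUND" | "INTERNAL";

// Errors the service raises on purpose, tagged so they can be recognized.
type AppError = { kind: "AppError"; code: ErrorCode; message: string };

function appError(code: ErrorCode, message: string): AppError {
  return { kind: "AppError", code, message };
}

function isAppError(value: unknown): value is AppError {
  return (
    typeof value === "object" &&
    value !== null &&
    (value as { kind?: unknown }).kind === "AppError"
  );
}

// The single shape clients see.
type ClientError = {
  message: string;
  extensions: { code: ErrorCode; stack?: string };
};

function formatError(error: unknown, isProduction: boolean): ClientError {
  // Deliberate errors pass through with their own code and message.
  if (isAppError(error)) {
    return { message: error.message, extensions: { code: error.code } };
  }
  // Anything else is unexpected: mask the details in production.
  const extensions: ClientError["extensions"] = { code: "INTERNAL" };
  if (!isProduction && error instanceof Error && error.stack) {
    extensions.stack = error.stack;
  }
  const detail = error instanceof Error ? error.message : String(error);
  return {
    message: isProduction ? "Internal server error" : detail,
    extensions,
  };
}
```

In production, formatError(new Error("connection refused"), true) yields a generic message with code INTERNAL and no stack, while formatError(appError("NOT_FOUND", "No such post"), true) keeps its message and code.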
Observe at resolver granularity
Field-level timings surface surprise hot spots. Whether you use managed tracing or OpenTelemetry, know the names of the expensive fields.
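If no tracing is in place yet, a wrapper like the following is enough to get per-field numbers. It is a sketch under assumptions: timed, Timing, and the sink array are made-up names, the clock is injectable so the wrapper can be tested, and a real setup would more likely emit spans than push to an array.

```typescript
type Timing = { field: string; ms: number };

// Wraps a resolver and records how long each call took, sync or async.
function timed<A extends unknown[], R>(
  field: string,
  sink: Timing[],
  resolve: (...args: A) => R,
  now: () => number = Date.now,
): (...args: A) => R {
  return (...args: A): R => {
    const start = now();
    const record = (): void => {
      sink.push({ field, ms: now() - start });
    };

    let result: R;
    try {
      result = resolve(...args);
    } catch (error) {
      record(); // a throwing resolver still costs time
      throw error;
    }

    if (result instanceof Promise) {
      // Async resolver: record when the promise settles either way.
      result.then(record, record);
    } else {
      record();
    }
    return result;
  };
}

// Usage: wrap the fields you suspect, keyed by "Type.field".
const fieldTimings: Timing[] = [];
const postResolvers = {
  commentCount: timed(
    "Post.commentCount",
    fieldTimings,
    (post: { id: number }) => post.id * 2, // placeholder for an expensive lookup
  ),
};

postResolvers.commentCount({ id: 21 });
console.log(fieldTimings.map((t) => t.field)); // [ 'Post.commentCount' ]
```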
Closing
GraphQL is fine; it just asks you to encode operational rules earlier than many REST setups. Schema discipline, DataLoader, depth/cost limits, field auth, errors, and observability: if you defer all of that, clients come to depend on the graph as it is and refactors get scary. Boring guardrails up front age better than heroic fixes later.