Managing split DNS in a multi-tenant Kubernetes setup

Fabián Sellés Rosa

In previous posts, we’ve explored various aspects of SCHIP, our Kubernetes-based PaaS at Adevinta, such as how we avoided an outage caused by running out of IPs in EKS and how we migrated to Terraform. Today, I want to share a behind-the-scenes look at how we tackled a unique challenge when one of our tenants requested a new feature.

Most of our applications are stateless microservices, many of which write to or read from a queue, typically a Kafka queue.

While we have an internal managed Kafka offering, several tenants decided to experiment with a third-party managed Kafka service, and that’s when things got interesting.

I won’t name the third-party provider, but let’s just say some details might sound… Confluent.

The tenant wanted a serverless Kafka model, which required deploying an AWS VPC PrivateLink (a feature we use extensively in our PaaS) and injecting a split DNS zone. The goal: direct all traffic for `*.eu-west-1.kafka.managed.service.` to the PrivateLink endpoint.

Of course, the tenant also needed to maintain connectivity to our in-house Kafka during migration. So, both Kafka environments had to be accessible.

Why does DNS matter for Kafka?

Kafka clients connect to brokers using hostnames from the broker list (bootstrap servers). When a client initiates a connection, it relies on DNS to resolve these hostnames to IP addresses. This is crucial: DNS abstracts broker locations, supports failover, and enables dynamic scaling. If DNS is misconfigured — especially in split DNS scenarios — clients may fail to connect or reach the wrong endpoints, impacting reliability and security.
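To make the dependency concrete, here is a minimal, hypothetical client configuration delivered to pods as a ConfigMap: every hostname in `bootstrap.servers`, and every broker hostname the client later receives in cluster metadata, is resolved through whatever DNS the pod sees.

```yaml
# Hypothetical example: Kafka client settings mounted into pods via a ConfigMap.
# The hostnames below are resolved by the pod's DNS; where they resolve to
# decides whether traffic reaches our in-house Kafka or the managed service.
apiVersion: v1
kind: ConfigMap
metadata:
  name: kafka-client-config
  namespace: default
data:
  client.properties: |
    bootstrap.servers=lkc-AAAA.eu-west-1.kafka.managed.service:9092
    security.protocol=SSL
```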

The Multitenancy Dilemma

SCHIP is a multitenant platform. We learned that several tenants might want to use this third-party Kafka in the near future, so we needed a solution that worked for one tenant without affecting others or increasing our team’s workload.

What followed was a journey of trial and error. Here’s how we navigated it:

Attempt 1: Route53 Split DNS

We tried creating a split DNS at the Route53 level. This works for a single-tenant, isolated VPC, but our clusters are multitenant and share VPCs to simplify NAT management (a story for another day). So, this approach was out.
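For context, split-horizon DNS in Route53 means attaching a private hosted zone to specific VPCs. A hypothetical CloudFormation sketch shows why this breaks down for us: the zone is scoped to the whole VPC, so every tenant sharing that VPC would inherit the override.

```yaml
# Hypothetical sketch: a Route53 private hosted zone is associated with VPCs,
# not with individual tenants, so all workloads in a shared VPC would see it.
Resources:
  KafkaSplitZone:
    Type: AWS::Route53::HostedZone
    Properties:
      Name: eu-west-1.kafka.managed.service
      VPCs:
        - VPCId: vpc-0123456789abcdef0   # placeholder: shared multi-tenant VPC
          VPCRegion: eu-west-1
```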

Attempt 2: Pod DNSConfig

Could we use the Kubernetes-native DNS fields in the pod spec, such as `dnsConfig` and `hostAliases`? Tenants can modify these themselves, so it seemed promising. But `hostAliases` only accepts concrete hostnames mapped to fixed IPs, and we needed a wildcard for dynamic DNS records. Dead end.
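As a rough illustration of the limitation (values are hypothetical), `hostAliases` only writes exact hostname-to-IP entries into the pod’s `/etc/hosts`; a wildcard like `*.eu-west-1.kafka.managed.service` simply cannot be expressed:

```yaml
# Hypothetical pod spec: hostAliases needs literal hostnames and fixed IPs.
apiVersion: v1
kind: Pod
metadata:
  name: kafka-client
spec:
  hostAliases:
    - ip: "10.0.12.34"                               # placeholder IP
      hostnames:
        - "lkc-AAAA.eu-west-1.kafka.managed.service" # one exact name, no wildcards
  containers:
    - name: app
      image: my-kafka-app:latest                     # placeholder image
```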

Attempt 3: Kafka Proxy

What if we built a Kafka proxy? Apps would connect to the proxy, which would handle all the DNS magic. Technically possible (think Envoy), but it would introduce a single point of failure and increase our team’s toil.

Not ideal: developing and operating a proxy would add yet another component to our portfolio.

Attempt 4: Custom DNS Resolver

A refinement: create a custom DNS resolver for the client to discover brokers. But this had similar drawbacks to the proxy idea — more toil, more moving parts.

Attempt 5: CoreDNS Override Hardcoding Broker Addresses

The root of the problem was DNS resolution. Our clusters use in-cluster CoreDNS and a node-local DNS cache. We could modify CoreDNS to override broker FQDNs to the PrivateLink address. But broker hostnames are dynamic — if the third-party service recycles them, we risk downtime.

Example CoreDNS config:

```
. {
    # ... other plugins ...
    template IN CNAME lkc-AAAA.eu-west-1.kafka.managed.service {
        answer "{{ .Name }} 60 IN CNAME pl-12345.eu-west-1.vpce.amazonaws.com."
        fallthrough
    }
    forward . /etc/resolv.conf
    # ... other plugins ...
}
```

This works, but only our team can modify cluster-level DNS config — not the tenants. More toil.

Attempt 6: CoreDNS with Meta Plugin

We liked the previous approach’s containment at the Kubernetes level. What if tenants could control the endpoint using pod labels and a wildcard?

```
template IN ANY kafkas.managed.service {
    answer "{{ .Name }} 60 IN CNAME {{.Meta \"kubernetes/client-label/kafka-dedicated-tenant\"}}.kafka.managed.service"
}
rewrite stop name exact mykafka.kafkas.managed.service. pl-12345.eu-west-1.vpce.amazonaws.com.
```

Tenants add a label (e.g., `kafka-dedicated-tenant: mykafka`) to their pods, and CoreDNS overrides the resolution to the PrivateLink address. This was a great compromise: manageable toil for us, flexibility for tenants.
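A minimal sketch of what the tenant side would have looked like under this approach (names and image are illustrative):

```yaml
# Hypothetical tenant pod: the label below is what the CoreDNS template above
# reads through its metadata ("kubernetes/client-label/kafka-dedicated-tenant").
apiVersion: v1
kind: Pod
metadata:
  name: kafka-consumer
  labels:
    kafka-dedicated-tenant: mykafka
spec:
  containers:
    - name: app
      image: my-kafka-app:latest   # placeholder image
```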

At SCHIP, we run a local DNS cache on every node, for good reasons. But there was a catch: that node-local DNS cache sits between the app and CoreDNS in our setup, and we worried it would cache these queries, breaking things if two pods on the same node needed different Kafka brokers.

We thought: let’s move the config to node-local DNS. But node-local DNS doesn’t support the Meta plugin. So close…

Attempt 7: NodeLocal DNS with Template Plugin

We needed a solution compatible with node-local DNS. Luckily, the template plugin was available, and we had experience with it.

```
template IN ANY svc.cluster.local {
    match ".+\.aws\.private\.kafka\.managed\.(.+)\.svc\.cluster\.local"
    answer "{{ .Name }} 60 IN CNAME kafka-upstream.{{ index .Match 1 }}.svc.cluster.local"
    fallthrough
}
```

How does this work?

- The template plugin intercepts DNS queries for `*.aws.private.kafka.managed.<something>.svc.cluster.local`.

- It responds with a CNAME to `kafka-upstream.<something>.svc.cluster.local`, where `<something>` is captured from the query.

- If there’s no match, CoreDNS continues processing as usual.
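For context, here is a simplified and partly hypothetical sketch of where this snippet sits in our setup: inside the node-local DNS cache’s Corefile, delivered as a ConfigMap (the real one carries more plugins and the standard placeholders).

```yaml
# Simplified, partly hypothetical sketch of the node-local DNS ConfigMap.
apiVersion: v1
kind: ConfigMap
metadata:
  name: node-local-dns
  namespace: kube-system
data:
  Corefile: |
    cluster.local:53 {
        cache 30
        template IN ANY svc.cluster.local {
            match ".+\.aws\.private\.kafka\.managed\.(.+)\.svc\.cluster\.local"
            answer "{{ .Name }} 60 IN CNAME kafka-upstream.{{ index .Match 1 }}.svc.cluster.local"
            fallthrough
        }
        forward . __PILLAR__CLUSTER__DNS__
    }
    .:53 {
        cache 30
        forward . __PILLAR__UPSTREAM__SERVERS__
    }
```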

What do tenants need to do?

- Create an `ExternalName` service pointing at the PrivateLink endpoint, giving them full control over it, with no toil for our team:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: kafka-upstream
  namespace: default
spec:
  externalName: vpce-559334461673652-uvan60je.vpce-svc-00df17f473adaf690.eu-west-1.vpce.amazonaws.com.
  type: ExternalName
```

Tenants also need to set the `ndots` value in their pod DNS config:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test
  namespace: cpr-dev
spec:
  template:
    spec:
      dnsConfig:
        options:
          - name: ndots
            value: "7"
```

Why set `ndots: 7`?

Broker names in this scheme are deeply nested, e.g. `lkc-AAAA.aws.private.kafka.managed.foo.svc.cluster.local`. The `ndots` option controls whether the resolver tries such a name as-is or first walks the pod’s search domains; setting it to 7 makes these long names resolve the way the template above expects, avoiding unnecessary lookups and potential connection failures.

Since modifying `dnsConfig` on every workload is tedious, we also provide a Kyverno `ClusterPolicy`, applied through its mutating webhook, to automate it:

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: inject-ndots-if-annotation-present
spec:
  admission: true
  background: false
  validationFailureAction: Enforce
  rules:
    - name: inject-ndots-dns-annotation
      match:
        resources:
          kinds:
            - Deployment
            - StatefulSet
            - DaemonSet
      preconditions:
        all:
          - key: "{{request.object.metadata.annotations.\"context.adevinta.io/extended-ndots\" || \"\"}}"
            operator: NotEquals
            value: ""
          - key: "{{request.object.metadata.annotations.\"context.adevinta.io/extended-ndots\" || \"\"}}"
            operator: GreaterThan
            value: "5"
      mutate:
        patchStrategicMerge:
          spec:
            template:
              spec:
                dnsConfig:
                  options:
                    - name: ndots
                      value: "{{request.object.metadata.annotations.\"context.adevinta.io/extended-ndots\"}}"
```

After installing this policy, any Deployment, StatefulSet, or DaemonSet carrying the annotation `context.adevinta.io/extended-ndots` with a value greater than 5 is mutated at admission time, injecting the corresponding `dnsConfig` into its pod template.
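For example, a tenant only needs to annotate their workload. A minimal, hypothetical Deployment:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kafka-consumer
  namespace: cpr-dev
  annotations:
    # Any value greater than 5 triggers the mutation above, which injects
    # the matching ndots option into the pod template's dnsConfig.
    context.adevinta.io/extended-ndots: "7"
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kafka-consumer
  template:
    metadata:
      labels:
        app: kafka-consumer
    spec:
      containers:
        - name: app
          image: my-kafka-app:latest   # placeholder image
```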

This journey through Kafkian SplitDNS in a multitenant Kubernetes environment was full of twists and turns. Ultimately, we found a solution that empowers tenants, minimises toil, and keeps our platform flexible for future needs. If you’re facing similar challenges, I hope this story helps you metamorphose your own Kafka connection!
