Chapter 14: Network Namespaces And Virtual Ethernet
A process can have the same filesystem view as another process and still be on a different network. On Linux, that separation is a network namespace: a separate network stack with its own interfaces, addresses, routes, firewall state, sysctls, procfs networking views, port numbers, and abstract UNIX domain socket namespace.
That scope matters because container networking is not one mechanism. A namespace gives the process its own network stack. A veth pair connects that stack to a peer device in another namespace. A bridge, route table, overlay, cloud route, eBPF datapath, or direct device assignment decides where packets go after that.
What A Network Namespace Holds
A network namespace owns network devices, IPv4 and IPv6 protocol stacks, routing tables, firewall rules, /proc/net, /sys/class/net, /proc/sys/net, port numbers, and the abstract UNIX domain socket namespace. Four ordinary Linux objects compose that state:
| Object | Kernel-facing meaning | Container relevance |
|---|---|---|
| Link | A network interface, backed by struct net_device. | A veth end, bridge, loopback device, and physical NIC are all links. |
| Address | An IPv4 or IPv6 address associated with an interface index. | A pod IP is an address on an interface in the pod namespace. |
| Route | A rule for selecting an output device, table, and optional next hop. | A namespace needs routes before traffic can leave its local link. |
| Neighbor | A mapping from protocol address to link-layer address. | Ethernet delivery still needs ARP or neighbor discovery after route lookup. |
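A short userspace sketch makes the first two objects concrete: getifaddrs(3) returns the links and addresses visible in the calling process's network namespace, and nothing from any other namespace:

#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <ifaddrs.h>
#include <netdb.h>

int main(void)
{
    struct ifaddrs *ifas, *ifa;
    char host[NI_MAXHOST];
    if (getifaddrs(&ifas) == -1) {
        perror("getifaddrs");
        return 1;
    }
    for (ifa = ifas; ifa; ifa = ifa->ifa_next) {
        if (!ifa->ifa_addr)
            continue;
        int family = ifa->ifa_addr->sa_family;
        if (family != AF_INET && family != AF_INET6)
            continue;
        /* An address is always attached to a named link. */
        if (getnameinfo(ifa->ifa_addr,
                        family == AF_INET ? sizeof(struct sockaddr_in)
                                          : sizeof(struct sockaddr_in6),
                        host, sizeof(host), NULL, 0, NI_NUMERICHOST) == 0)
            printf("%-8s %s\n", ifa->ifa_name, host);
    }
    freeifaddrs(ifas);
    return 0;
}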
The kernel object behind an interface is struct net_device. It is larger than the fields a user normally sees, but the common inspection vocabulary is visible in the struct: MTU, flags, interface index, name, protocol-specific state, and hardware address.
unsigned int mtu;                 /* maximum transmission unit */
unsigned int flags;               /* IFF_UP, IFF_LOOPBACK, ... */
int ifindex;                      /* the interface index */
char name[IFNAMSIZ];              /* "eth0", "lo", ... */
struct in_device __rcu *ip_ptr;   /* IPv4 protocol-specific state */
const unsigned char *dev_addr;    /* hardware (MAC) address */
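Those same fields are readable from userspace without netlink. A small sketch using the netdevice(7) ioctls, with eth0 standing in for any interface name:

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>

int main(void)
{
    struct ifreq ifr;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);   /* any socket works as an ioctl handle */
    if (fd < 0) { perror("socket"); return 1; }
    memset(&ifr, 0, sizeof(ifr));
    strncpy(ifr.ifr_name, "eth0", IFNAMSIZ - 1);   /* illustrative interface name */
    if (ioctl(fd, SIOCGIFINDEX, &ifr) == 0)
        printf("ifindex: %d\n", ifr.ifr_ifindex);
    if (ioctl(fd, SIOCGIFMTU, &ifr) == 0)
        printf("mtu:     %d\n", ifr.ifr_mtu);
    if (ioctl(fd, SIOCGIFFLAGS, &ifr) == 0)
        printf("flags:   %#x (IFF_UP=%d)\n", ifr.ifr_flags & 0xffff,
               !!(ifr.ifr_flags & IFF_UP));
    if (ioctl(fd, SIOCGIFHWADDR, &ifr) == 0) {
        unsigned char *a = (unsigned char *)ifr.ifr_hwaddr.sa_data;
        printf("hwaddr:  %02x:%02x:%02x:%02x:%02x:%02x\n",
               a[0], a[1], a[2], a[3], a[4], a[5]);
    }
    close(fd);
    return 0;
}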
Modern Linux network configuration usually travels over rtnetlink. The message names line up with the same objects:
RTM_NEWLINK = 16,
RTM_NEWADDR = 20,
RTM_NEWROUTE = 24,
RTM_NEWNEIGH = 28,
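A dump request shows the shape of that traffic. The sketch below (error handling trimmed) opens a NETLINK_ROUTE socket, sends RTM_GETLINK with NLM_F_DUMP, and prints the interface index carried in each RTM_NEWLINK reply:

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

int main(void)
{
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
    if (fd < 0) { perror("socket"); return 1; }
    struct {
        struct nlmsghdr nlh;
        struct ifinfomsg ifm;
    } req;
    memset(&req, 0, sizeof(req));
    req.nlh.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
    req.nlh.nlmsg_type = RTM_GETLINK;                 /* "dump all links" */
    req.nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
    req.ifm.ifi_family = AF_UNSPEC;
    if (send(fd, &req, req.nlh.nlmsg_len, 0) < 0) { perror("send"); return 1; }
    char buf[16384];
    int done = 0;
    while (!done) {
        ssize_t n = recv(fd, buf, sizeof(buf), 0);
        if (n <= 0)
            break;
        for (struct nlmsghdr *h = (struct nlmsghdr *)buf;
             NLMSG_OK(h, n); h = NLMSG_NEXT(h, n)) {
            if (h->nlmsg_type == NLMSG_DONE) { done = 1; break; }
            if (h->nlmsg_type == RTM_NEWLINK) {       /* one reply per link */
                struct ifinfomsg *ifm = NLMSG_DATA(h);
                printf("link: ifindex %d flags %#x\n",
                       ifm->ifi_index, ifm->ifi_flags);
            }
        }
    }
    close(fd);
    return 0;
}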
Three reminders before plugins enter the picture: an IP address belongs to an interface, not a process; a route picks an output path but does not deliver a frame; on Ethernet, the neighbor subsystem still resolves the link-layer destination.
A local bridge with veth pairs into the host namespace is one CNI layout among several; chapter 15 covers the CNI ADD/DEL contract, and chapter 16 covers the Kubernetes pod model.
Network Namespace State
In the kernel, a network namespace is represented by struct net. A few fields are enough to show what that namespace owns:
struct user_namespace *user_ns;
struct list_head dev_base_head;
struct proc_dir_entry *proc_net;
The namespace has an owning user namespace, a device list, and procfs state.
Creating a new network namespace is tied to CLONE_NEWNET. In the kernel path that copies network namespace state for a new task, no flag means "keep the old namespace." The flag means "allocate and initialize a new struct net."
if (!(flags & CLONE_NEWNET))
        return get_net(old_net);   /* no flag: keep sharing the old namespace */
net = net_alloc();                 /* flag set: allocate a new struct net */
rv = setup_net(net);               /* initialize its per-namespace state */
That is the boundary runtimes and tools cross when they use clone(2), unshare(2), or setns(2) with network namespace flags.
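A short sketch of crossing that boundary from userspace: unshare(2) with CLONE_NEWNET detaches the calling process into a fresh namespace, where only an initially-down loopback device exists. setns(2) works the same way against a file descriptor opened from /proc/<pid>/ns/net. Both need privilege (CAP_SYS_ADMIN):

#define _GNU_SOURCE
#include <stdio.h>
#include <sched.h>
#include <ifaddrs.h>

int main(void)
{
    if (unshare(CLONE_NEWNET) == -1) {     /* allocate a new struct net */
        perror("unshare");
        return 1;
    }
    struct ifaddrs *ifas, *ifa;            /* enumerate the new namespace */
    if (getifaddrs(&ifas) == -1) { perror("getifaddrs"); return 1; }
    for (ifa = ifas; ifa; ifa = ifa->ifa_next)
        printf("%s\n", ifa->ifa_name);     /* typically just "lo", still down */
    freeifaddrs(ifas);
    return 0;
}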
Device Lifetime
A physical network device belongs to one network namespace at a time. Moving a physical NIC into a container namespace is possible, but it takes the device away from the host namespace while it is there. When the namespace is torn down, which happens once its last process exits and no file descriptor or bind mount keeps it alive, physical devices move back to the initial network namespace.
Virtual Ethernet devices behave differently. A veth pair is destroyed when the namespace that owns it is freed. That makes veth useful for container lifetimes: a runtime or plugin can create a pair, move one end into the target namespace, keep the other end on the host, and let cleanup remove the virtual link when the namespace is gone or when the plugin deletes the attachment.
The veth(4) model is direct: packets transmitted on one end are received on the other.
veth Pairs
The veth driver encodes the pair relationship as peer pointers. During creation it registers both devices and stores the peer pointer in each direction:
priv = netdev_priv(dev);
rcu_assign_pointer(priv->peer, peer);   /* this end points at its peer */
priv = netdev_priv(peer);
rcu_assign_pointer(priv->peer, dev);    /* and the peer points back */
Transmit begins by looking up that peer:
rcv = rcu_dereference(priv->peer);   /* the packet is delivered to the other end */
One end of the veth can be named eth0 inside a pod namespace. The other end can have a host-generated name and live in the host namespace. The kernel treats them as two Ethernet devices joined back to back. What happens outside the host end depends on the surrounding network implementation.
The host end might be attached to a Linux bridge. It might be routed directly. It might be consumed by an eBPF datapath. It might be part of an overlay system.
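Whatever the host side does, putting the container-side end into its namespace is a single rtnetlink operation. The sketch below assumes the pair already exists (created by a runtime, or by hand with ip link add veth-host type veth peer name veth-pod; the names are illustrative). It sends RTM_NEWLINK with an IFLA_NET_NS_FD attribute, which is what ip link set ... netns does, and it needs CAP_NET_ADMIN:

#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/socket.h>
#include <net/if.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s <ifname> <target-pid>\n", argv[0]);
        return 1;
    }
    unsigned idx = if_nametoindex(argv[1]);
    if (!idx) { perror("if_nametoindex"); return 1; }
    char path[64];
    snprintf(path, sizeof(path), "/proc/%s/ns/net", argv[2]);
    int nsfd = open(path, O_RDONLY);       /* the target namespace, by pid */
    if (nsfd < 0) { perror("open"); return 1; }
    struct {
        struct nlmsghdr  nlh;
        struct ifinfomsg ifm;
        struct rtattr    rta;              /* IFLA_NET_NS_FD */
        unsigned int     fd;
    } req;
    memset(&req, 0, sizeof(req));
    req.nlh.nlmsg_len = sizeof(req);
    req.nlh.nlmsg_type = RTM_NEWLINK;      /* modify the existing link */
    req.nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
    req.ifm.ifi_family = AF_UNSPEC;
    req.ifm.ifi_index = (int)idx;
    req.rta.rta_type = IFLA_NET_NS_FD;
    req.rta.rta_len = RTA_LENGTH(sizeof(unsigned int));
    req.fd = (unsigned int)nsfd;
    int nl = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
    if (nl < 0) { perror("socket"); return 1; }
    if (send(nl, &req, req.nlh.nlmsg_len, 0) < 0) { perror("send"); return 1; }
    char buf[512];                         /* the kernel acks with NLMSG_ERROR */
    ssize_t n = recv(nl, buf, sizeof(buf), 0);
    if (n > 0) {
        struct nlmsghdr *h = (struct nlmsghdr *)buf;
        if (h->nlmsg_type == NLMSG_ERROR) {
            struct nlmsgerr *e = NLMSG_DATA(h);
            if (e->error) {
                fprintf(stderr, "netlink: %s\n", strerror(-e->error));
                return 1;
            }
        }
    }
    printf("moved %s into the netns of pid %s\n", argv[1], argv[2]);
    return 0;
}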
Bridges
A Linux bridge is a layer-2 forwarding device. In the local bridge pattern, the bridge sits in the host network namespace, host-side veth ends are enslaved to it, and the bridge can also hold a gateway address for the container subnet. Packets between containers on the same bridge can be forwarded at layer 2. Packets leaving that subnet need routing, forwarding, and sometimes masquerade.
The bridge interface code shows the relationship between an enslaved device and the bridge. When a device is added as a bridge port, the kernel registers a receive handler on the device and marks it as a bridge port:
err = netdev_rx_handler_register(dev, br_get_rx_handler(dev), p);   /* frames received on the port now enter the bridge */
dev->priv_flags |= IFF_BRIDGE_PORT;                                 /* mark the device as enslaved */
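The same enslavement can be driven from userspace with two long-stable bridge ioctls; the names here are illustrative, and modern tooling sends the equivalent rtnetlink messages instead:

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>
#include <linux/sockios.h>

int main(void)
{
    int fd = socket(AF_LOCAL, SOCK_STREAM, 0);   /* any socket as an ioctl handle */
    if (fd < 0) { perror("socket"); return 1; }
    /* Create the bridge: equivalent to "ip link add cni-demo0 type bridge". */
    if (ioctl(fd, SIOCBRADDBR, "cni-demo0") < 0) { perror("SIOCBRADDBR"); return 1; }
    /* Enslave the host-side veth end to it. */
    struct ifreq ifr;
    memset(&ifr, 0, sizeof(ifr));
    strncpy(ifr.ifr_name, "cni-demo0", IFNAMSIZ - 1);
    ifr.ifr_ifindex = if_nametoindex("veth-host");  /* illustrative name */
    if (!ifr.ifr_ifindex) { perror("if_nametoindex"); return 1; }
    if (ioctl(fd, SIOCBRADDIF, &ifr) < 0) { perror("SIOCBRADDIF"); return 1; }
    close(fd);
    return 0;
}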
From inside the namespace, the bridge is invisible: the process sees only the pod-side veth, addresses, routes, and resolver configuration.
Routes, NAT, And Firewall State
Network namespaces have their own route tables and firewall state. A simple bridge setup usually needs an address inside the container namespace, a default route pointing at the bridge gateway, forwarding on the host, and a route or NAT rule for traffic beyond the local subnet.
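Inside the container namespace, the default route is one ioctl away (or one RTM_NEWROUTE message). A sketch using SIOCADDRT, assuming the bridge holds the gateway address 10.22.0.1:

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <net/route.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) { perror("socket"); return 1; }
    struct rtentry rt;
    memset(&rt, 0, sizeof(rt));
    struct sockaddr_in *sin;
    sin = (struct sockaddr_in *)&rt.rt_dst;      /* destination 0.0.0.0 ... */
    sin->sin_family = AF_INET;
    sin = (struct sockaddr_in *)&rt.rt_genmask;  /* ... with mask 0.0.0.0 */
    sin->sin_family = AF_INET;
    sin = (struct sockaddr_in *)&rt.rt_gateway;  /* via the bridge address */
    sin->sin_family = AF_INET;
    inet_pton(AF_INET, "10.22.0.1", &sin->sin_addr);
    rt.rt_flags = RTF_UP | RTF_GATEWAY;
    if (ioctl(fd, SIOCADDRT, &rt) < 0) { perror("SIOCADDRT"); return 1; }
    close(fd);
    return 0;
}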
Local bridge demos commonly masquerade outbound traffic because the external network does not know how to route back to the container subnet. Kubernetes makes a stronger cluster-level promise: pods can communicate with pods on other nodes without NAT at the Kubernetes layer. A plugin may still use NAT for selected features, such as host ports or private egress, but pod-to-pod reachability is the contract.
The CNI bridge plugin exposes the local-bridge choice as configuration. When ipMasq is enabled, the plugin installs masquerade for the configured networks.
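A representative configuration for the plugin, with illustrative name, bridge, and subnet values:

{
  "cniVersion": "1.0.0",
  "name": "demo-net",
  "type": "bridge",
  "bridge": "cni-demo0",
  "isGateway": true,
  "ipMasq": true,
  "ipam": {
    "type": "host-local",
    "subnet": "10.22.0.0/16"
  }
}

With isGateway set, the plugin puts the subnet's gateway address on the bridge; with ipMasq set, it adds the masquerade rule for traffic leaving the subnet.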
DNS Is Separate
A network namespace does not configure DNS. Processes still read resolver configuration from /etc/resolv.conf, and orchestrators arrange /etc/hosts, hostnames, and search domains.
In Kubernetes, pod DNS belongs above the Linux namespace boundary. The pod gets a network namespace and IP address, but kubelet and the runtime arrange files such as /etc/resolv.conf, while the cluster DNS add-on supplies service and pod records according to Kubernetes DNS rules. A CNI result can include DNS fields, but that is not the same thing as the cluster's DNS policy.
Chapter 15 picks up the CNI contract — the spec that tells a plugin which namespace to modify and how to report what it did.
Sources And Further Reading
- Linux netdevice(7): https://man7.org/linux/man-pages/man7/netdevice.7.html
- Linux rtnetlink(7): https://man7.org/linux/man-pages/man7/rtnetlink.7.html
- Linux ip-route(8): https://man7.org/linux/man-pages/man8/ip-route.8.html
- Linux arp(7): https://man7.org/linux/man-pages/man7/arp.7.html
- Linux packet(7): https://man7.org/linux/man-pages/man7/packet.7.html
- Linux network_namespaces(7): https://man7.org/linux/man-pages/man7/network_namespaces.7.html
- Linux veth(4): https://man7.org/linux/man-pages/man4/veth.4.html
- Linux resolv.conf(5): https://man7.org/linux/man-pages/man5/resolv.conf.5.html
- Linux kernel struct net_device: https://github.com/torvalds/linux/blob/57b8e2d666a31fa201432d58f5fe3469a0dd83ba/include/linux/netdevice.h
- Linux kernel rtnetlink: https://github.com/torvalds/linux/blob/57b8e2d666a31fa201432d58f5fe3469a0dd83ba/net/core/rtnetlink.c
- Linux kernel IPv4 routing: https://github.com/torvalds/linux/blob/57b8e2d666a31fa201432d58f5fe3469a0dd83ba/net/ipv4/route.c
- Linux kernel neighbor subsystem: https://github.com/torvalds/linux/blob/57b8e2d666a31fa201432d58f5fe3469a0dd83ba/net/core/neighbour.c
- Linux kernel struct net: https://github.com/torvalds/linux/blob/57b8e2d666a31fa201432d58f5fe3469a0dd83ba/include/net/net_namespace.h
- Linux kernel network namespace setup: https://github.com/torvalds/linux/blob/57b8e2d666a31fa201432d58f5fe3469a0dd83ba/net/core/net_namespace.c
- Linux kernel veth driver: https://github.com/torvalds/linux/blob/57b8e2d666a31fa201432d58f5fe3469a0dd83ba/drivers/net/veth.c
- Linux bridge interface code: https://github.com/torvalds/linux/blob/57b8e2d666a31fa201432d58f5fe3469a0dd83ba/net/bridge/br_if.c
- Kubernetes networking model: https://kubernetes.io/docs/concepts/services-networking/
- CNI bridge plugin: https://github.com/containernetworking/plugins/blob/v1.9.1/plugins/main/bridge/bridge.go