A review of basic Docker knowledge and core concepts.
About Docker#
Docker is an open-source application container engine that leverages containerization technology to greatly accelerate the building, testing, and deployment of applications. Its core idea is to package the application and all its dependencies into a standardized unit called a "container," which can run on any system that supports Docker, achieving unprecedented portability and environmental consistency.
Compared to traditional virtualization technologies (e.g., KVM), Docker containers are more efficient and lightweight. Virtual machines must run a complete operating system for each application, while Docker shares the host system's kernel, significantly reducing resource overhead and startup time. This architecture not only improves resource utilization but also isolates applications from one another, allowing each container to run independently of other containers.
Docker is not just a tool; it has led a revolution in the "cloud-native" era, profoundly changing how enterprises manage application delivery and deployment cycles.
Core Architecture of Docker Engine#
Docker Engine is the core of the Docker platform. It is a client-server application responsible for creating, running, and managing containers. This architecture decouples the execution of Docker commands from their actual effects through an API.
Client-Server#
Docker Engine is essentially an open-source containerization technology used to build and containerize applications. It mainly consists of the following parts:
- Server: A long-running daemon process named `dockerd`, responsible for core tasks such as building images, running containers, and managing networks and storage.
- REST API: `dockerd` exposes its functionality through a REST API, which is used to interact with it. The API can be accessed via UNIX sockets or network interfaces.
- Client: A command-line tool named `docker`. Users issue commands through `docker`, which are converted into REST API requests and sent to `dockerd` for execution.
This architecture makes Docker very flexible; for example, `docker` can run locally while `dockerd` runs on a remote server, with the two communicating over the network. Additionally, since all operations go through the API, automation tools and scripts can drive Docker directly, without cumbersome wrapping of local CLI commands.
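As a quick sketch of this decoupling, the CLI can be pointed at a remote daemon via the `-H` flag or the `DOCKER_HOST` environment variable (the host and user below are placeholders):

```bash
# Talk to a remote dockerd over SSH (hostname and user are hypothetical).
export DOCKER_HOST=ssh://ubuntu@docker-host.example.com
docker ps

# Or per command, without the environment variable:
docker -H ssh://ubuntu@docker-host.example.com ps
```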
Main Components#
Docker Daemon (dockerd)#
`dockerd` is a background service running on the host machine. It listens for requests from Docker clients (via the Docker API) and manages Docker objects such as images, containers, networks, and volumes. It is the core element of the Docker architecture, responsible for building, running, and distributing Docker containers. `dockerd` can use runtimes compliant with OCI (Open Container Initiative) standards, such as containerd and CRI-O, to run containers.
OCI (Open Container Initiative) is an open governance structure (project) supported by the Linux Foundation, aimed at creating open industry standards around container formats and runtimes. It was initiated in June 2015 by Docker, CoreOS, and other leaders in the container industry.
containerd is an industry-standard core container runtime that manages the complete container lifecycle on its host system, including image transfer and storage, container execution and supervision, low-level storage, and network attachment.
CRI-O is a tool that implements the Kubernetes Container Runtime Interface (CRI), allowing Kubernetes to use runtimes compliant with OCI (Open Container Initiative) standards to manage pods and containers.
The daemon's configuration can be managed through the `daemon.json` file. It can listen for API requests via Unix sockets (defaulting to `/var/run/docker.sock`), TCP sockets, or file descriptors (fd).
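A minimal sketch of a `daemon.json` (typically at `/etc/docker/daemon.json` on Linux); the values here are illustrative, not a recommended production setup:

```bash
# Write an illustrative daemon.json, then restart the daemon to apply it.
sudo tee /etc/docker/daemon.json <<'EOF'
{
  "hosts": ["unix:///var/run/docker.sock", "tcp://0.0.0.0:2376"],
  "log-driver": "json-file",
  "log-opts": { "max-size": "10m", "max-file": "3" }
}
EOF
sudo systemctl restart docker
# Note: on systemd installs, "hosts" can conflict with a -H flag already set
# in the docker.service unit file; configure the listen address in one place only.
```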
Docker Engine API (REST API)#
- The Docker Engine API is a REST API provided by `dockerd`. The Docker client communicates with `dockerd` through this API, which specifies the interface for applications to send commands to the daemon, managing tasks such as container building, running, and distribution.
- This API is based on HTTP and supports access via Unix sockets or TLS-encrypted TCP connections. For encrypted sockets, only TLS 1.0 and higher versions are supported; SSLv3 and lower are rejected for security reasons.
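Because the API is plain HTTP, it can be exercised directly with `curl` over the Unix socket; the version segment in the path (v1.43 here) may differ on your installation:

```bash
# List running containers via the Engine API (equivalent to `docker ps`).
curl --unix-socket /var/run/docker.sock http://localhost/v1.43/containers/json

# Query the daemon version (works without a version prefix):
curl --unix-socket /var/run/docker.sock http://localhost/version
```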
Docker Client (Docker CLI)#
Docker CLI is the primary way to interact with Docker, typically a command-line tool whose commands start with `docker`. Users send commands to `dockerd` through this CLI tool. A single Docker CLI can interact with multiple `dockerd` instances.
Core Concepts#
Container#
A container is a standardized software unit that packages application code and all its dependencies together, ensuring that the application can run quickly and reliably when migrating from one computing environment to another. This self-contained feature ensures that applications that work correctly on a developer's laptop will also work correctly in a private data center or public cloud.
Essentially, containers virtualize at the operating system level, allowing applications to run in isolated user spaces while sharing the host operating system's kernel. Containers are not just simple software packages; they are packaged in a universal way, and this standardization is a key driving factor for their widespread adoption.
The core of container technology lies in its concept of "operating system virtualization," which sharply contrasts with the hardware virtualization employed by virtual machines (VMs). Virtual machines create a complete virtual hardware environment on physical hardware through a hypervisor, with each VM having its own independent operating system. In contrast, containers create isolated runtime environments directly on the host operating system's kernel.
Containerization#
Containerization is a software deployment process that packages the application code and all files, libraries, configurations, and binaries required to run it into a single executable image. This image then runs as a container. The process isolates the application, allowing it to share only the operating system kernel with the host, enabling a single software package to run on various devices or operating systems.
The typical lifecycle of containerization usually includes three stages:
- Develop: Developers define the application's dependencies in a container image file when committing source code. This file is typically stored alongside the source code.
- Build: The image is built and published to a container repository (Container Repository), where it is version-controlled and tagged as immutable. This stage effectively creates the container image.
- Deploy: The containerized application runs locally, in a continuous integration/continuous delivery (CI/CD) pipeline, or in a production environment.
Containerization technology effectively addresses the issue of inconsistent application behavior across different environments due to differences in operating systems, library versions, or configurations, known as the "it works on my machine" problem.
The "immutability" of container images is the cornerstone of achieving deployment consistency and reliability. This means that once an image is built, it will not be modified. All container instances derived from that image will be identical. This feature encourages operational practices to shift from patching live systems to replacing old systems with new, updated images. This immutability ensures that every version instance of the application is identical, eliminating configuration drift, making deployments highly predictable and repeatable. This, in turn, simplifies rollback and debugging processes.
Core Working Principles of Container Technology#
Operating System-Level Virtualization Principles#
Container technology relies on operating system-level virtualization, a lightweight form of virtualization whose core idea is that the operating system kernel allows multiple isolated user space instances to exist. These isolated instances are the containers. Unlike hardware virtualization used by virtual machines—which requires a hypervisor to simulate hardware for each guest operating system—operating system-level virtualization builds an abstraction layer directly in the host operating system's kernel.
All containers share the same host operating system kernel. This means that the operating system user space running inside the container (e.g., a specific Linux distribution) must be compatible with the host's kernel (e.g., Linux containers run on Linux hosts). Nevertheless, each container has its own independent root filesystem and can run different distributions in user space (e.g., running different Linux distributions on the same Linux kernel). Since programs inside containers communicate with the kernel through normal system call interfaces, operating system-level virtualization incurs almost no performance overhead, which is a significant advantage over virtual machines.
The shared kernel architecture is a double-edged sword: it is the primary reason containers achieve high efficiency (low overhead, high density) and a core concern for security (kernel vulnerabilities may affect all containers).
Moreover, the compatibility requirement for the host operating system kernel (e.g., Linux containers run on Linux, Windows containers run on Windows) means that containers are not a universal solution for running any operating system on any host like virtual machines.
Core Kernel Mechanisms: Namespaces and Control Groups (cgroups)#
- Namespaces: Provide process isolation. They partition kernel resources so that one set of processes sees one set of resources while another set sees a different set. This gives processes the illusion of running in their own independent systems. The Linux kernel implements several types of namespaces:
  - PID (Process ID) Namespace: Isolates process IDs. Each PID namespace can have its own process with PID 1.
  - NET (Network) Namespace: Isolates network interfaces, IP addresses, routing tables, port numbers, and other network resources.
  - MNT (Mount) Namespace: Isolates filesystem mount points, allowing processes in different namespaces to have different filesystem hierarchy views.
  - UTS (UNIX Timesharing System) Namespace: Isolates hostnames and domain names.
  - IPC (Inter-Process Communication) Namespace: Isolates System V IPC objects (such as message queues, semaphores, shared memory).
  - User Namespace: Isolates user and group IDs, allowing a process to have root privileges within a namespace while not having those privileges on the host.
- Control Groups (cgroups): Used to manage and limit resource usage (such as CPU, memory, I/O, network bandwidth) for a group of processes. Processes can be organized into a hierarchical tree structure. Key cgroup controllers include CPU (regulates CPU cycle allocation), memory (limits memory usage), devices (controls device access), I/O (regulates I/O rates), and freezer (pauses and resumes process groups). A short demonstration follows this list.
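Both mechanisms surface directly as `docker run` flags; a minimal sketch, assuming a Linux host with the `nginx` and `busybox` images available:

```bash
# Resource limits (enforced via cgroups) expressed as docker flags.
docker run -d --name limited --memory 256m --cpus 0.5 nginx:latest

# Reuse another container's network namespace instead of creating a new one:
docker run --rm --network container:limited busybox ip addr

# Inspect the namespaces of the container's main process from the host:
pid=$(docker inspect --format '{{.State.Pid}}' limited)
sudo ls -l /proc/"$pid"/ns
```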
Container platforms utilize these features to create and manage containers.
The fine-grained features of namespaces allow for precisely tuned isolation. This means that not all containers need the same level of isolation; certain containers can share specific namespaces (e.g., network namespaces) for particular purposes, providing flexibility but also increasing complexity and potential security risks if not managed properly.
Container Engine and Runtime#
Container Engine (or container runtime) is the software responsible for creating and running containers from container images. It acts as an intermediary between containers and the operating system, providing and managing the resources required for containers. Common container engines include Docker Engine, containerd, and CRI-O.
Modern container engines typically separate high-level management functions (such as image pulling, API handling) from low-level container execution:
- containerd: This is an industry-standard core container runtime responsible for managing the complete container lifecycle on its host system, from image transfer and storage to container execution and supervision, to low-level storage and network attachment management. Docker has donated containerd to the Cloud Native Computing Foundation (CNCF).
- runc: This is a low-level command-line tool compliant with OCI specifications for creating and running containers. It is used by higher-level runtimes like containerd.
The container runtime is responsible for setting up namespaces and cgroups for containers and then running application processes within them.
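To make the layering concrete, `runc` can be driven by hand, following the convention from the runc README; a sketch, assuming `runc` is installed and a `busybox` image is available to export a root filesystem from:

```bash
# Export a root filesystem from an image, generate a default OCI config, run it.
mkdir -p mycontainer/rootfs
docker export "$(docker create busybox)" | tar -C mycontainer/rootfs -xf -

cd mycontainer
runc spec            # writes a default config.json next to rootfs/
sudo runc run demo   # creates and starts an OCI container named "demo"
```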
Image#
Understanding Container Images#
A container image is a lightweight, standalone, executable software package that contains everything needed to run an application: code, runtime, system tools, system libraries, and settings. It represents the binary data that encapsulates the application and all its software dependencies. Images are read-only templates; when a container runs, it is an instance of that image.
Based on the principle of immutability, all containers derived from the same image are identical.
Images are typically created from a text file named `Dockerfile`, which contains instructions for assembling the image. An image name can include the registry hostname, path, name, and a tag or digest (e.g., `fictional.registry.example/imagename:tag`). Tags indicate different versions (e.g., `v1.42.0`), while digests are unique hash values of the image content, guaranteeing the immutability of that content.
Using content-based digests to identify images provides strong guarantees of image integrity and precise version control. This is crucial for security (ensuring that the pulled image has not been tampered with) and reproducibility (ensuring consistency in builds and deployments), especially in complex software supply chains. In Kubernetes CI/CD practices, it is recommended to use immutable image tags (often resolved to digests).
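A sketch of resolving a tag to its digest and pinning to it; the final digest value is a placeholder to be substituted with the real output:

```bash
# Resolve a tag to its content digest, then pin to the digest.
docker pull nginx:latest
docker inspect --format '{{index .RepoDigests 0}}' nginx:latest
# Prints something like nginx@sha256:<64-hex-digest>; pulling by that digest
# is guaranteed to return byte-identical image content:
docker pull nginx@sha256:<digest-from-above>   # placeholder, not a real hash
```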
Layered Architecture of Container Images#
Container images consist of a series of layers. Each instruction in the Dockerfile (such as `RUN`, `COPY`, `ADD`) typically creates a new layer. These layers stack together, with each layer representing filesystem changes (adding, modifying, or deleting files) relative to the previous layer.
Once created, layers are immutable. When a container runs, a writable layer ("container layer") is added on top of the immutable image layers. This layered architecture brings many benefits, such as:
- Efficiency: Layers can be shared between images. If multiple images share the same base layer (e.g., operating system layer, runtime layer), these layers only need to be stored once and downloaded once on the host. This saves disk space and network bandwidth.
- Faster Builds: Docker can cache layers. If an instruction and its context in the Dockerfile have not changed, existing layers in the cache can be reused, speeding up the image build process.
The layered filesystem is a key optimization that makes containers practical. Without it, each image would be monolithic, and the advantages of fast downloads and efficient storage would be lost, severely hindering container adoption. With it, many different images can share common base layers.
A poorly structured Dockerfile may lead to bloated images, with intermediate layers containing unnecessary data, or cache misses that slow down build speeds.
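The layering is easy to observe: `docker history` lists one row per layer, along with the instruction that created it and the size it contributes; a quick sketch, assuming a pulled `nginx:latest` image:

```bash
# Each row corresponds to an image layer: the creating instruction and its size.
docker history nginx:latest

# Compare total disk usage with per-image sizes; shared layers are stored once:
docker system df -v
```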
Registry#
A Container Repository stores container images (typically the tagged versions of a single image). A Container Registry is a collection of one or more repositories, usually adding features such as authentication, authorization, and storage management. Registries can be public (e.g., Docker Hub) or private (for internal teams or controlled sharing).
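A sketch of the push/pull cycle against a private registry; the registry hostname and image name are hypothetical:

```bash
# Tag a local image for a private registry and push it.
docker login registry.example.com
docker tag myapp:1.0 registry.example.com/team/myapp:1.0
docker push registry.example.com/team/myapp:1.0

# Any authorized host can then pull it back:
docker pull registry.example.com/team/myapp:1.0
```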
Volume#
Docker Volume provides a data storage and management solution independent of the container lifecycle, addressing the data loss issues caused by the temporary nature of containers.
Understanding Volume#
Docker Volume is a storage mechanism managed by Docker for persisting container data. It is essentially a specific directory in the host's filesystem, but its creation, lifecycle, and management are handled by the Docker engine. On Linux systems, these Volumes are typically stored under the `/var/lib/docker/volumes/` directory. A key principle is that non-Docker processes should not directly modify these Docker-managed files and directories, to avoid potential data corruption or management conflicts.
Volumes can be mounted and used by one or more containers. When a container is deleted, its associated Volume continues to exist by default, ensuring data persistence. Volumes can be named (Named Volume) or anonymous (Anonymous Volume). Named volumes have a clear name specified by the user, making them easier to manage and reference, while anonymous volumes are assigned a random, unique name by Docker upon creation.
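A minimal sketch of the basic lifecycle, assuming the `postgres:16` image is available:

```bash
# Create a named volume and mount it into a container.
docker volume create appdata
docker run -d --name db -v appdata:/var/lib/postgresql/data \
  -e POSTGRES_PASSWORD=example postgres:16

# The volume outlives the container:
docker rm -f db
docker volume ls               # appdata is still listed
docker volume inspect appdata  # shows the host-side mountpoint
```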
Importance and Advantages of Volume#
Docker Volume plays a crucial role in containerized applications, with its importance and advantages reflected in several aspects:
- Data Persistence: This is the core function of Volume. The storage layer of Docker containers is temporary by default; any filesystem changes made inside the container are lost when the container stops or is deleted. The lifecycle of a Volume is independent of the container, meaning that even if the container is deleted, the data stored in the Volume will remain by default, achieving persistent storage of critical data.
- Data Sharing and Reuse: Volumes can be shared and reused between multiple containers. This means different containers can access and operate on the same data, making it very suitable for scenarios where collaboration on data is needed in microservices architectures.
- Performance Improvement: Using Volumes typically provides better I/O performance compared to direct read/write operations in the container's writable layer. This is because Volume read/write operations usually act directly on the host's filesystem, bypassing the additional abstraction layers and copy-on-write overhead introduced by container storage drivers (such as OverlayFS, AUFS).
- Easier Backup and Migration: Since Volumes are uniformly managed by Docker and their data is stored in a specific location on the host, backing up, restoring, and migrating data in Volumes becomes relatively simple and straightforward.
- Independent of Container Image: Modifications to data in a Volume do not affect the original image used by the container. This aligns with the best practices that Docker images should remain stateless and immutable, allowing for broader reuse of images.
- Support for Volume Drivers: Docker Volume supports integration with various external storage systems through Volume drivers, such as cloud storage services (AWS S3, Azure Files), network file systems (NFS), etc.
- Cross-Platform Compatibility: Volumes can work on both Linux and Windows containers, providing a cross-operating system solution for persistent data.
These advantages make Volume the officially recommended method for data persistence in Docker. It not only addresses the issue of data persistence but also enhances the reliability and manageability of Docker applications in production environments by providing standardized management interfaces and good support for external storage.
Working Mechanism#
How Volume Achieves Data Persistence#
The core mechanism by which Docker Volume achieves data persistence is by mapping a specified path (mount point) inside the container to a specific directory managed by Docker on the host's filesystem. When a container writes data to its internal mount point, that data is actually written to the corresponding directory on the host, thus detaching it from the container's own temporary writable layer.
Specifically, its working mechanism can be summarized as follows:
- Creation and Mapping: When a Volume is created (whether explicitly via the `docker volume create` command or implicitly when running a container), Docker allocates a directory for that Volume in a specific storage location on the host (such as `/var/lib/docker/volumes/` on Linux).
- Bypassing the Union File System (UFS): Volume reads and writes bypass the union filesystem managed by the container storage driver. A container's writable layer is typically implemented with copy-on-write on top of a UFS, which incurs some performance overhead. In contrast, a Volume directs I/O operations straight to the host's filesystem, reducing abstraction layers and thus improving performance.
- Independent Lifecycle: The data in a Volume is stored on the host, and its lifecycle is independent of any containers that use it. Even if all containers using a Volume are deleted, the Volume and its data remain on the host by default until explicitly deleted.
This mechanism is similar to the "mounting" concept in operating systems.
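The host-side location can be inspected directly; on a typical Linux installation the data lives in a `_data` subdirectory:

```bash
# Where a named volume lives on the host:
docker volume create demo
docker volume inspect --format '{{.Mountpoint}}' demo
# -> /var/lib/docker/volumes/demo/_data on a typical Linux install
```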
Data Filling and Overwriting Mechanism#
When a Volume is mounted to a directory inside a container, Docker's handling of any existing data in that directory depends on the Volume's state (whether it is empty) and the mount options.
- Mounting an Empty Volume to an Existing Container Directory:
- If an empty Volume (whether newly created or already existing but empty) is mounted to a path in the container that already has existing files or directories, Docker will by default copy the existing files and directories in that path inside the container to this empty Volume.
- If this copying behavior is not desired, the `volume-nocopy` option can be used (with the `--mount` syntax), or the corresponding `nocopy` variant of the `-v` syntax, to prevent it.
- Mounting a Non-Empty Volume to a Container Directory (regardless of whether the container directory is empty):
- If a non-empty Volume (i.e., one that already contains data) is mounted to a path in the container, any existing content at that path inside the container will be obscured (hidden, not deleted) by the content of the Volume.
- For Docker containers, once a Volume is mounted, there is no direct "unmount" operation to restore the obscured files; typically, the container needs to be recreated without mounting that Volume to access the original files.
- Specifying a Non-Existent Volume When Starting a Container:
- If a name for a non-existent Volume is specified when starting a container, Docker will automatically create an empty Volume for you, then apply the logic of "mounting an empty Volume to an existing container directory," meaning that the data at the mount point inside the container will be copied to the newly created Volume.
These data filling and overwriting mechanisms provide convenience for initializing Volume data while also requiring users to be mindful of their behavior to avoid accidental overwriting or data loss.
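A sketch of both behaviors, using `busybox`'s populated `/etc` as the pre-existing container directory:

```bash
# Default: an empty named volume mounted over a non-empty image path is
# seeded with the files already present at that path.
docker run --rm -v seeded:/etc busybox ls /etc     # passwd, group, ...

# Suppress the copy with the volume-nocopy option of --mount:
docker run --rm \
  --mount type=volume,source=unseeded,target=/etc,volume-nocopy \
  busybox ls /etc                                  # empty
```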
Types of Volume#
Docker Volume is mainly divided into named volumes and anonymous volumes, which differ in their creation and referencing methods.
- Named Volumes
Named volumes are Volumes that are assigned a clear, easily recognizable name at creation. This is the type of Volume officially recommended by Docker, especially in development and production environments.
The introduction of named volumes means that Volumes are no longer merely appendages to containers but can be independently managed and maintained data storage units.
- Anonymous Volumes
Anonymous volumes are Volumes that are not assigned a clear name at creation. When they are initialized, Docker assigns them a randomly generated, globally unique ID (usually a long hash string) as their "name."
Anonymous volumes provide the capability for data persistence, but due to their management inconvenience, Docker and the community typically recommend using named volumes in scenarios where data persistence is needed.
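The difference is visible in how the volume is declared and how it shows up afterwards; a sketch assuming the `postgres:16` image:

```bash
# Named volume: explicitly created and easy to reference.
docker volume create pgdata
docker run -d --name pg1 -v pgdata:/var/lib/postgresql/data \
  -e POSTGRES_PASSWORD=example postgres:16

# Anonymous volume: only the container path is given; Docker generates the name.
docker run -d --name pg2 -v /var/lib/postgresql/data \
  -e POSTGRES_PASSWORD=example postgres:16
docker volume ls   # the anonymous volume appears as a random 64-character ID
```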
Advanced Applications and Features#
Data Sharing Mechanism#
- Sharing Between Containers:
This is a core function of Volumes. Multiple containers can simultaneously mount and access the same named Volume. When one container writes data to that shared Volume, other containers that have mounted the same Volume can immediately see those changes. This mechanism is crucial for building microservices applications that require collaboration.
- Sharing Between Container and Host:
  - Through Volumes: Docker-managed Volumes are stored in a specific location in the host's filesystem (usually `/var/lib/docker/volumes/`). Although Docker does not recommend non-Docker processes modifying these Docker-managed files directly, processes on the host (such as backup tools or monitoring agents) can technically access these paths. Note, however, that operating on Volume content directly from the host may bypass Docker's management mechanisms and introduce risks. Docker documentation explicitly states that if bidirectional access to files from both the host and the container is needed, bind mounts are the more appropriate choice.
  - Through Bind Mounts: This is the more common and recommended way to achieve direct, bidirectional file sharing between a container and the host. Users explicitly map a directory or file on the host into the container. This method is highly transparent: modifications to the shared data by either the host or the container are immediately visible to the other.
In microservices and distributed application architectures, the sharing capability of Volumes is a key enabler. It allows state, configuration, or intermediate processing data to be decoupled and shared between different service components (containers).
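A sketch of both sharing styles, assuming `busybox` and `nginx` images and a local `./site` directory:

```bash
# Two containers sharing one named volume.
docker volume create shared
docker run -d --name writer -v shared:/data busybox \
  sh -c 'while true; do date >> /data/log.txt; sleep 5; done'
docker run --rm -v shared:/data busybox cat /data/log.txt

# Bind mount: bidirectional sharing with an explicit host directory.
docker run --rm -v "$PWD/site:/usr/share/nginx/html:ro" -p 8080:80 nginx:latest
```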
Volume Drivers#
Volume drivers are a core component of Docker's storage architecture, providing Docker Volumes with great flexibility and scalability, enabling integration with various storage backends.
- Working Principle: Volume drivers act as a bridge between the Docker engine and specific storage systems. When Docker needs to perform operations on a Volume (such as creating, deleting, mounting, unmounting, getting paths, etc.), it calls the corresponding Volume driver plugin to implement these functions.
- Default Driver (`local`): Docker ships with a default Volume driver named `local`, which stores Volume data on the Docker host's local filesystem. For single-host deployments and local development environments, the `local` driver is usually sufficient.
- Built-in and Third-Party Drivers: Besides the `local` driver, the Docker ecosystem supports many other built-in or third-party Volume drivers. These drivers can store Volume data in:
- Network File Systems (NFS): Allowing multiple Docker hosts to share Volumes stored on NFS servers.
- Cloud Storage Services: Such as AWS Elastic Block Store (EBS), AWS S3 (via specific drivers like Cloudstor), Azure Disk Storage, Azure Files, Google Persistent Disk, etc. This enables containerized applications to leverage persistent, highly available, and scalable storage services provided by cloud platforms.
- Distributed File Systems: Such as GlusterFS, Ceph, etc.
- Block Storage Devices: Storage Area Network (SAN) devices connected via protocols like iSCSI, Fibre Channel, etc.
The Volume driver mechanism reflects the openness and pluggable architecture of the Docker platform regarding storage. It allows Docker to not be limited to local storage but to integrate into complex enterprise data center environments or seamlessly connect with cloud-native storage services. This capability is crucial for large-scale deployments of stateful Docker applications in production environments, as it allows enterprises to choose the most suitable storage solutions based on their needs (performance, cost, availability, compliance, etc.) and integrate them with Docker workloads.
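Even the built-in `local` driver accepts mount options, so an NFS-backed volume needs no extra plugin; a sketch where the server address and export path are hypothetical:

```bash
# An NFS-backed volume via the built-in local driver
# (nfs.example.com and /exports/data are placeholders).
docker volume create \
  --driver local \
  --opt type=nfs \
  --opt o=addr=nfs.example.com,rw \
  --opt device=:/exports/data \
  nfsdata

docker run --rm -v nfsdata:/data busybox ls /data
```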
Networks#
Understanding Docker Networks#
Docker networks are a core component of the Docker platform, specifically designed to meet the communication needs of containerized applications. The fundamental goal is to provide a mechanism for containers to communicate with each other, for containers to communicate with their host, and for containers to communicate with external networks (such as the internet or services within other local area networks).
In short, Docker networks are responsible for managing and orchestrating all network interactions of containers.
Docker networks can be viewed as a group of two or more devices (primarily containers and hosts) that can communicate with each other, physically or virtually. Docker includes a dedicated networking subsystem that handles all communication between containers, Docker hosts, and external users.
Network Model#
Docker's networking functionality is implemented based on a specification called the Container Network Model (CNM). CNM is an open design specification that defines the fundamental building blocks and interactions of Docker networks, providing Docker with a flexible and pluggable network architecture.
The core components of CNM include Sandbox, Endpoint, and Network.
- Sandbox: A sandbox represents the network stack configuration of a container. It contains network-related configuration such as the container's network interfaces (e.g., `eth0`), routing tables, and DNS settings. Each container has its own independent network sandbox, ensuring isolation at the network level between containers. The sandbox is typically implemented with the operating system's network namespace technology.
Network namespaces are a powerful feature provided by the Linux kernel that allows the creation of multiple isolated network stacks. It can be understood as creating multiple independent, virtual network environments under the same operating system kernel.
Docker leverages Linux network namespace technology to achieve network isolation between containers and between containers and hosts.
- Endpoint: An endpoint is a virtual network interface whose primary role is to connect the sandbox (i.e., the container's network stack) to one or more networks. An endpoint can only belong to one network and can only belong to one sandbox.
- Network: A network is a collection of endpoints that can directly communicate with each other. Conceptually, a network is the basic unit for implementing container connections and isolation in Docker. Docker allows the creation of various types of networks (e.g., bridge networks, overlay networks), each implemented by specific network drivers and providing specific communication capabilities for containers connected to that network.
These three components work together to form the foundation of Docker networks. When a container needs to connect to a network, the Docker engine creates a sandbox for that container (if it does not already exist), then creates an endpoint and connects one end of this endpoint to the container's sandbox and the other end to the specified network. In this way, the container gains the ability to communicate within that network.
The design of CNM is not just a detail of Docker's internal implementation; it provides a standardized model. This modeling approach gives Docker networks a high degree of pluggability and scalability.
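The three CNM objects can be seen from the CLI; a sketch assuming the `busybox` image:

```bash
# Create a network, attach a container, and look at the resulting CNM objects.
docker network create appnet
docker run -d --name app --network appnet busybox sleep 3600

docker network inspect appnet   # the Network: its endpoints, subnet, driver
docker exec app ip addr         # the Sandbox: eth0 + lo as the container sees them
```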
Network Drivers#
Docker network drivers are specific modules that implement the CNM specification, responsible for creating and managing networks and defining the communication rules and behaviors between containers and the host system, as well as between containers.
Overview#
Network drivers play a role as both the connection layer and implementation layer in the Docker architecture. They serve as a bridge between the Docker engine and the underlying network infrastructure (whether physical or virtual). When users issue commands to create networks or connect containers to networks, the Docker engine calls the corresponding network driver to execute these operations.
Each network driver encapsulates a specific set of network technologies and logic. For example, the `bridge` driver builds on the Linux kernel's bridging capabilities and `iptables` rules, while the `overlay` driver relies on VXLAN tunneling technology and key-value storage to achieve cross-host communication.
Bridge Network Driver#
The Bridge network driver is Docker's most commonly used and default network mode. When Docker is installed, a default network named `bridge` is created automatically, backed by a virtual bridge device on the host (named `docker0`). All newly started containers connect to this `bridge` network by default unless explicitly attached to another network at startup.
- Working Principle, Default Bridge
The Bridge driver creates a software-implemented virtual bridge on the Docker host. This virtual bridge behaves like a physical switch: every container connected to it receives an IP address and can communicate with the others within this virtual network. A container connects to the bridge through a pair of virtual Ethernet devices (a veth pair), with one end inside the container's network namespace (usually named `eth0`) and the other end attached to the virtual bridge on the host.
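A sketch of a user-defined bridge (where, unlike the default bridge, Docker's embedded DNS resolves container names), assuming `nginx` and `busybox` images:

```bash
# Containers on the same user-defined bridge reach each other by name.
docker network create --driver bridge mynet
docker run -d --name api --network mynet nginx:latest
docker run --rm --network mynet busybox ping -c 2 api

# On the host, each container shows up as a veth device attached to the bridge:
ip link show type veth
```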
Host Network Driver#
The Host network driver provides a special network mode that completely removes the network isolation layer between the container and the Docker host. When a container uses the Host network mode, it no longer has its own independent network namespace but directly shares and uses the host's network namespace.
- Mechanism, Performance Impact, and Reduced Isolation
In this mode, the container is not assigned its own independent IP address. Instead, it directly uses the host's IP address and port space. This means that if an application running in a container in Host mode listens on a certain port (e.g., port 80), that application will directly serve on port 80 of the corresponding IP address of the host, visible to the external network.
All Docker command parameters related to port mapping (such as `-p HOST_PORT:CONTAINER_PORT`, `--publish`, `-P`, `--publish-all`) are ignored in Host mode.
- Performance Advantages: The most significant advantage of Host mode is network performance. Since the container uses the host's network stack directly, packets do not pass through network address translation (NAT) or the Docker engine's userland proxy. Eliminating these intermediate layers removes their overhead and latency, making network communication in Host mode nearly equivalent to running the application directly on the host.
- Reduced Isolation and Security Risks: Since the container shares the host's network, it no longer benefits from the network isolation Docker provides by default. Processes inside the container can directly access all of the host's network interfaces, and other processes on the host can directly reach ports the container listens on. This direct exposure poses significant security risks, as a vulnerability in the container may directly impact the host.
The userland proxy refers to the `docker-proxy` process in the Docker engine, which is responsible for port forwarding in certain network scenarios.
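A sketch of Host mode on a Linux host, assuming the `nginx` image and a free port 80:

```bash
# Host mode: the container binds straight to the host's ports.
docker run -d --name web --network host nginx:latest
curl -s http://localhost:80/ | head -n 4   # served by the container, no -p needed

# Port-mapping flags would be discarded here; Docker prints a warning that
# published ports are ignored when --network host is used.
```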
None Network Driver#
The None network driver provides the highest level of network isolation for Docker containers. When a container is configured to use the `none` network driver, Docker does not configure any external network interfaces for it, apart from the required loopback interface (usually `lo`).
This means that containers connected to the `none` network:
- Do not have external network interfaces like `eth0`.
- Are not assigned IP addresses (except for the loopback address `127.0.0.1`).
- Cannot communicate with the Docker host (except with themselves through the loopback interface).
- Cannot communicate with other containers.
- Cannot access external networks (such as the internet).
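These restrictions are easy to verify, assuming the `busybox` image:

```bash
# A container on the none network has only the loopback interface.
docker run --rm --network none busybox ip addr
# -> only "lo" is listed: no eth0, no routes, no connectivity.
docker run --rm --network none busybox ping -c 1 8.8.8.8   # "network unreachable"
```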
Overlay Network Driver#
The Overlay network driver is a solution specifically designed for cross-host environments, with the core goal of enabling containers running on different physical or virtual machines to communicate as if they were on the same local area network.
This is crucial for building and managing distributed applications, especially when deploying microservices architectures in Docker Swarm cluster mode.
- Working Principle (VXLAN): Overlay networks achieve cross-host communication by creating a virtual Layer 2 network on top of an existing physical network (typically a Layer 3 network), usually with VXLAN (Virtual Extensible LAN) tunneling. VXLAN encapsulates the original Ethernet frames (the packets sent by containers) inside UDP packets and transmits them over the underlying host network. When such a UDP packet reaches the target host, the Docker engine there decapsulates it, extracts the original Ethernet frame, and delivers it to the target container. In this way, even containers distributed across different physical network segments behave as if they were attached to the same virtual Layer 2 network and can communicate using IP addresses in the same subnet.
- Docker Swarm Integration: Overlay networks are a core networking feature and the default choice in Docker Swarm mode. When a Swarm cluster is initialized or nodes join it, Docker automatically creates some predefined overlay networks (such as the `ingress` network, which handles the routing mesh for external traffic). Users can also easily create custom overlay networks and deploy Swarm services onto them. The Swarm manager coordinates and manages the overlay network configuration, ensuring correct routing of container traffic across all nodes in the cluster.
- Service Discovery: Docker Swarm's overlay networks have a powerful service discovery mechanism built in. When a service (composed of one or more container replicas) is deployed onto an overlay network, Swarm assigns it a virtual IP (VIP), and Swarm's embedded DNS server resolves the service name to this VIP. Other containers or services on the same network can therefore reach the target service by its service name alone, without knowing which nodes its replicas run on or their specific container IP addresses.
- Load Balancing: Docker Swarm utilizes overlay networks and its ingress routing mesh to provide automatic load balancing for services published externally. When external requests reach any published port on a node in the cluster, Swarm distributes the traffic to a healthy running container replica of that service, regardless of which node that replica is located on. For inter-service communication within the cluster, Swarm also load balances requests made by service names across multiple replicas of the service.
The Overlay network driver provides a solid foundation for building and operating complex distributed containerized applications through its powerful cross-host communication capabilities, deep integration with Swarm, and built-in service discovery and load balancing features.
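A single-node sketch of these pieces together, assuming a host where Swarm mode can be enabled:

```bash
# Overlay network, service replicas, and name-based discovery in Swarm mode.
docker swarm init
docker network create --driver overlay --attachable appmesh
docker service create --name web --network appmesh --replicas 3 \
  --publish 8080:80 nginx:latest

# Swarm's embedded DNS resolves the service name to its virtual IP (VIP):
docker run --rm --network appmesh busybox nslookup web
```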
Macvlan Network Driver#
- Containers as Physical Network Devices
The Macvlan network driver allows assigning an independent MAC (Media Access Control) address to Docker containers and directly bridging their virtual network interfaces to the physical network interface of the host. This makes containers appear as independent physical devices at the physical network level, with their own MAC and IP addresses, allowing them to participate directly in the local area network where the host resides, communicating with other physical devices or virtual machines without going through the host's NAT or port mapping.
- Limitations and Considerations
- Host and Container Communication: By default, containers using the Macvlan network cannot directly communicate with their host. This is because traffic from the container goes directly out through the parent physical interface, and the host's network stack cannot intercept this traffic directed to itself. If host communication with Macvlan containers is needed, additional network configuration is usually required.
- Network Device Support: The host's physical network interface (or its driver) needs to support promiscuous mode to receive frames sent to multiple MAC addresses.
- IP Address and MAC Address Management: Careful planning of IP and MAC address allocation is needed to avoid conflicts with other devices in the existing network. Too many MAC addresses may also stress network switches (exhausting the MAC address table) or trigger security policies (such as port security restrictions).
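A sketch of creating a macvlan network; the subnet, gateway, and parent interface are examples and must match your LAN:

```bash
# A macvlan network bridged to the host's physical NIC
# (subnet, gateway, and parent interface are placeholders).
docker network create -d macvlan \
  --subnet=192.168.1.0/24 \
  --gateway=192.168.1.1 \
  -o parent=eth0 \
  lan0

docker run --rm --network lan0 --ip 192.168.1.50 busybox ip addr
# The container gets its own MAC and LAN IP, reachable from other machines
# on the segment, but not, by default, from this host itself.
```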