Set up a Docker Swarm

This lab sets up a Docker Swarm, deploys a basic Docker Service, then scales the Service up and performs a rolling update.

Requirements

To create a Swarm you require:

A minimum of 2 Docker nodes that can communicate over the network on:

  • 2377/TCP (secure client-to-Swarm communication)
  • 7946/TCP and 7946/UDP (control plane communication among nodes)
  • 4789/UDP (VXLAN-based overlay networking)
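Reachability of the TCP ports can be spot-checked from another node. The following is a minimal sketch, assuming bash (for its /dev/tcp redirection) and the coreutils timeout command are available; check_tcp_port is a hypothetical helper name, not a Docker command. Nothing listens on these ports until a node is in Swarm mode, so the check is only meaningful against a node that has already initialized or joined a Swarm, and UDP traffic cannot be probed this way.

```shell
#!/usr/bin/env bash
# Sketch: probe a TCP port on a remote node using bash's /dev/tcp redirection.
# check_tcp_port is a hypothetical helper name, not part of Docker.
check_tcp_port() {
  local host="$1" port="$2"
  # Succeeds only if a TCP connection can be opened within 2 seconds.
  timeout 2 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Usage (replace the address with one of your own Swarm nodes):
#   for port in 2377 7946; do
#     if check_tcp_port 192.168.0.13 "$port"; then
#       echo "${port}/tcp reachable"
#     else
#       echo "${port}/tcp not reachable"
#     fi
#   done
```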

Initialize the Swarm on the first manager node

A node with Docker installed runs as a single-engine node. To make it part of a multi-engine cluster, you need to configure Docker to run in Swarm mode. Log on to the first Docker node that you want to be a manager.

The IP address and join details in this example will differ for each setup.

$ docker swarm init --advertise-addr 192.168.0.13:2377 --listen-addr 192.168.0.13:2377
Swarm initialized: current node (uuukt57mhnr02p7q0wk8kw10l) is now a manager.

To add a worker to this swarm, run the following command:

    docker swarm join --token SWMTKN-1-0btobek7ebvbtk4c3ncs2laux2yqpz4roiufsl2cdwftt3dqat-a0uofotfgakzpixrt6ol6zm77 192.168.0.13:2377

To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.

In the init command:

  • --advertise-addr configures the address and port that are advertised to the other nodes in this Swarm
  • --listen-addr is the address and port on which the node accepts Swarm traffic
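Running the init command on a node that is already in a Swarm returns an error, so a script may want to check first. The following is a minimal sketch, assuming the Docker CLI is on the PATH; init_swarm_if_needed is a hypothetical helper name, and the address is the example one used in this lab.

```shell
#!/usr/bin/env bash
# Sketch: initialize the Swarm only if this node is not already in one.
# init_swarm_if_needed is a hypothetical helper, not a Docker command.
init_swarm_if_needed() {
  local addr="$1"
  local state
  # .Swarm.LocalNodeState reports "active" once the node is in a Swarm.
  state="$(docker info --format '{{.Swarm.LocalNodeState}}')"
  if [ "$state" = "active" ]; then
    echo "node is already part of a swarm"
  else
    docker swarm init --advertise-addr "$addr" --listen-addr "$addr"
  fi
}

# Usage:
#   init_swarm_if_needed 192.168.0.13:2377
```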

Add additional manager nodes

On the original manager node, run the following command to retrieve the join command and token, which you can copy and paste onto new manager nodes to join them.

$ docker swarm join-token manager
To add a manager to this swarm, run the following command:

    docker swarm join --token SWMTKN-1-0btobek7ebvbtk4c3ncs2laux2yqpz4roiufsl2cdwftt3dqat-7kz5u6t75ptrqpc3kygwh7ssi 192.168.0.13:2377

Now log on to each additional manager node and paste the command.

You can check that your managers are registered by running the following command.

$ docker node ls
ID                            HOSTNAME   STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
uuukt57mhnr02p7q0wk8kw10l *   node1      Ready     Active         Leader           20.10.17
iwft3pma29oo12nfddru1deid     node2      Ready     Active         Reachable        20.10.17
swhtvpxigheh744unqc09hj7z     node3      Ready     Active         Reachable        20.10.17

Add worker nodes

Repeat the same process for the worker nodes, but use the worker token instead. On a manager node, run the following command, copy the join command from the output, and paste it onto each worker node to join it to the Swarm.

$ docker swarm join-token worker
To add a worker to this swarm, run the following command:

    docker swarm join --token SWMTKN-1-0btobek7ebvbtk4c3ncs2laux2yqpz4roiufsl2cdwftt3dqat-a0uofotfgakzpixrt6ol6zm77 192.168.0.13:2377
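If you would rather script this than copy and paste, the token alone can be printed with the -q (--quiet) flag of join-token. The following is a minimal sketch, assuming it runs on a manager node; print_join_command is a hypothetical helper name, and the manager address is the example one from this lab.

```shell
#!/usr/bin/env bash
# Sketch: print a ready-to-paste join command for a given role.
# print_join_command is a hypothetical helper; run it on a manager node.
print_join_command() {
  local role="$1" manager_addr="$2"
  case "$role" in
    manager|worker) ;;
    *) echo "role must be 'manager' or 'worker'" >&2; return 1 ;;
  esac
  local token
  # -q makes join-token print only the token instead of the full instructions.
  token="$(docker swarm join-token -q "$role")" || return 1
  echo "docker swarm join --token ${token} ${manager_addr}"
}

# Usage:
#   print_join_command worker 192.168.0.13:2377
#   print_join_command manager 192.168.0.13:2377
```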

On any of the manager nodes, check that your worker nodes have joined. Nodes with nothing in the MANAGER STATUS column are workers, and the * marks the node you are logged in to and running the command from.

$ docker node ls
ID                            HOSTNAME   STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
uuukt57mhnr02p7q0wk8kw10l     node1      Ready     Active         Leader           20.10.17
iwft3pma29oo12nfddru1deid     node2      Ready     Active         Reachable        20.10.17
swhtvpxigheh744unqc09hj7z *   node3      Ready     Active         Reachable        20.10.17
7gr0zv9cw0dcwb6gg6mcxodxv     node4      Ready     Active                          20.10.17
blpcfyi7hole1dx792zq17pvt     node5      Ready     Active                          20.10.17
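The same "empty MANAGER STATUS means worker" rule can be applied in a script. The following is a minimal sketch, assuming the listing is produced with a custom --format that separates hostname and manager status with a | character; workers_from_listing is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: list worker hostnames from "hostname|manager status" lines on stdin.
# workers_from_listing is a hypothetical helper, not a Docker command.
workers_from_listing() {
  # Workers have an empty manager status field.
  awk -F '|' '$2 == "" { print $1 }'
}

# Usage on a manager node:
#   docker node ls --format '{{.Hostname}}|{{.ManagerStatus}}' | workers_from_listing
```

For a one-off check, docker node ls --filter role=worker gives a similar result directly.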

Create an overlay network

This command creates a new Docker overlay network called grinntec-net which is available to all Docker nodes in the same Swarm.

$ docker network create -d overlay grinntec-net
o8iaoig5den15ha3tvir17cpc

$ docker network ls
NETWORK ID     NAME              DRIVER    SCOPE
9ae4bca4962e   bridge            bridge    local
62dd78760eae   docker_gwbridge   bridge    local
o8iaoig5den1   grinntec-net      overlay   swarm
8c93d83c35df   host              host      local
adt8wii4ba2s   ingress           overlay   swarm
2c5a36e59aea   none              null      local
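Creating a network whose name already exists fails, so a repeatable setup script may want to check first. The following is a minimal sketch, assuming it runs on a manager node; ensure_overlay_network is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: create the overlay network only if it does not exist yet.
# ensure_overlay_network is a hypothetical helper, not a Docker command.
ensure_overlay_network() {
  local name="$1"
  # docker network inspect exits non-zero when the network does not exist.
  if docker network inspect "$name" >/dev/null 2>&1; then
    echo "network ${name} already exists"
  else
    docker network create -d overlay "$name"
  fi
}

# Usage:
#   ensure_overlay_network grinntec-net
```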

Deploy a container as a Service

While logged in to one of the manager nodes, run the following command to deploy an image to the Swarm. The command creates a new service named web-fe, attaches it to the overlay network called grinntec-net, publishes port 8080 and forwards that traffic to port 8080 in the container, requests two replicas, and uses the nginx image. Note that the stock nginx image listens on port 80 by default, so publish with -p 8080:80 if you want to reach its default page.

$ docker service create --name web-fe --network grinntec-net -p 8080:8080 --replicas 2 nginx
t9h19g5mhucy6v7evhmz344er
overall progress: 2 out of 2 tasks 
1/2: running   [==================================================>] 
2/2: running   [==================================================>] 
verify: Service converged

The service definition is stored as the desired state of the service. The Swarm continually monitors the web-fe tasks, and if the observed state does not match the desired state, it works to bring the two back in line. This typically happens when a worker node goes offline and the number of replicas has dropped below the number required by the desired state of web-fe.

You can see the observed state with:

$ docker service ls
ID             NAME      MODE         REPLICAS   IMAGE          PORTS
t9h19g5mhucy   web-fe    replicated   2/2        nginx:latest   *:8080->8080/tcp

Or get a more detailed view using:

$ docker service ps web-fe
ID             NAME       IMAGE          NODE      DESIRED STATE   CURRENT STATE           ERROR     PORTS
j22zc3m5pzpz   web-fe.1   nginx:latest   node4     Running         Running 8 minutes ago             
1t2ayyc4v12b   web-fe.2   nginx:latest   node3     Running         Running 8 minutes ago
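The REPLICAS column shows observed versus desired state as running/desired, so a script can compare the two numbers. The following is a minimal sketch; replicas_converged is a hypothetical helper name, and it assumes the value has the form shown above, optionally followed by an annotation after a space.

```shell
#!/usr/bin/env bash
# Sketch: succeed when a REPLICAS value such as "2/2" shows observed == desired.
# replicas_converged is a hypothetical helper, not a Docker command.
replicas_converged() {
  local replicas="${1%% *}"        # drop any annotation after the first space
  local running="${replicas%%/*}"  # number before the slash
  local desired="${replicas##*/}"  # number after the slash
  [ "$running" = "$desired" ]
}

# Usage on a manager node (the name filter matches by prefix, so this
# assumes web-fe is the only service whose name starts with "web-fe"):
#   if replicas_converged "$(docker service ls --filter name=web-fe --format '{{.Replicas}}')"; then
#     echo "web-fe has converged"
#   fi
```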

Scale the service up

We’re seeing a lot of traffic, so let’s scale up from 2 to 10 instances.

$ docker service scale web-fe=10
web-fe scaled to 10
overall progress: 10 out of 10 tasks 
1/10: running   [==================================================>] 
2/10: running   [==================================================>] 
3/10: running   [==================================================>] 
4/10: running   [==================================================>] 
5/10: running   [==================================================>] 
6/10: running   [==================================================>] 
7/10: running   [==================================================>] 
8/10: running   [==================================================>] 
9/10: running   [==================================================>] 
10/10: running   [==================================================>] 
verify: Service converged 

$ docker service ps web-fe
ID             NAME        IMAGE          NODE      DESIRED STATE   CURRENT STATE                ERROR     PORTS
2u7oj8ere0jh   web-fe.1    nginx:latest   node2     Running         Running 3 minutes ago                  
450ic1m7ij0f   web-fe.2    nginx:latest   node1     Running         Running 3 minutes ago                  
iaui9yyp9sar   web-fe.3    nginx:latest   node1     Running         Running about a minute ago             
m8htwrriw7zh   web-fe.4    nginx:latest   node3     Running         Running about a minute ago             
lbl27juxx05t   web-fe.5    nginx:latest   node4     Running         Running about a minute ago             
64sfczr1jr63   web-fe.6    nginx:latest   node3     Running         Running about a minute ago             
onogf97lkbpt   web-fe.7    nginx:latest   node4     Running         Running about a minute ago             
ihqio6csb2et   web-fe.8    nginx:latest   node5     Running         Running about a minute ago             
h4dx06cn1ewx   web-fe.9    nginx:latest   node5     Running         Running about a minute ago             
6sp9aux50ykg   web-fe.10   nginx:latest   node2     Running         Running about a minute ago
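To see how the tasks are spread across the nodes, the NODE column can be tallied. The following is a minimal sketch, assuming one node name per line on stdin; tasks_per_node is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: count tasks per node from a list of node names (one per line) on stdin.
# tasks_per_node is a hypothetical helper, not a Docker command.
tasks_per_node() {
  # Skip blank lines, tally each node name, then sort for stable output.
  awk 'NF { count[$1]++ } END { for (node in count) print node, count[node] }' | sort
}

# Usage on a manager node:
#   docker service ps web-fe --filter desired-state=running --format '{{.Node}}' | tasks_per_node
```

With the listing above, each of the five nodes ends up with two tasks.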

Update the Service with a new version

A new version of the image has been released, so we need to push it out to all 10 instances of the Service without downtime. The following command updates the web-fe service to use the nginx:stable image, with a rolling update configuration that updates 2 tasks at a time and waits 20 seconds between each batch. This rolling update strategy helps to minimize downtime during the update process.

$ docker service update --image nginx:stable --update-parallelism 2 --update-delay 20s web-fe
web-fe
overall progress: 10 out of 10 tasks 
1/10: running   [==================================================>] 
2/10: running   [==================================================>] 
3/10: running   [==================================================>] 
4/10: running   [==================================================>] 
5/10: running   [==================================================>] 
6/10: running   [==================================================>] 
7/10: running   [==================================================>] 
8/10: running   [==================================================>] 
9/10: running   [==================================================>] 
10/10: running   [==================================================>] 
verify: Service converged 
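The parallelism and delay settings put a floor on how long a rollout takes. The following is a rough sketch of that arithmetic, assuming the update proceeds in batches with one delay between consecutive batches; min_rollout_delay is a hypothetical helper name, and the real duration is longer because it also includes image pulls and task start-up.

```shell
#!/usr/bin/env bash
# Sketch: estimate the minimum time (in seconds) spent waiting between batches.
# min_rollout_delay is a hypothetical helper, not a Docker command.
min_rollout_delay() {
  local tasks="$1" parallelism="$2" delay="$3"
  # A parallelism of 0 updates every task at once, so there is no waiting.
  if [ "$parallelism" -eq 0 ]; then
    echo 0
    return 0
  fi
  # Number of batches, rounded up.
  local batches=$(( (tasks + parallelism - 1) / parallelism ))
  # One delay between each pair of consecutive batches.
  echo $(( (batches - 1) * delay ))
}

# 10 tasks, 2 at a time, 20 seconds apart: 5 batches, 4 delays.
min_rollout_delay 10 2 20   # prints 80
```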

The state of the Service during the update is shown below. Old tasks are shut down as their replacements start, while the remaining tasks keep running so that the Service stays available.

$ docker service ps web-fe
ID             NAME            IMAGE          NODE      DESIRED STATE   CURRENT STATE                 ERROR                              PORTS
4qrjsjqzdgiu   web-fe.1        nginx:stable   node2     Running         Running 1 second ago                                             
2u7oj8ere0jh    \_ web-fe.1    nginx:latest   node2     Shutdown        Shutdown 3 seconds ago                                           
5mzx1icqp7x7   web-fe.2        nginx:stable   node1     Running         Running 53 seconds ago                                           
450ic1m7ij0f    \_ web-fe.2    nginx:latest   node1     Shutdown        Shutdown 54 seconds ago                                          
d2pyk0zl77lu   web-fe.3        nginx:stable   node2     Running         Running 1 second ago                                             
1ggalbjr0g9e    \_ web-fe.3    nginx:latest   node2     Shutdown        Failed 3 seconds ago          "error while removing network:…"   
24ifzyhu96hd   web-fe.4        nginx:stable   node3     Running         Running about a minute ago                                       
7c6owd7kvcf3    \_ web-fe.4    nginx:latest   node1     Shutdown        Shutdown about a minute ago                                      
y0dxcybkmklq   web-fe.5        nginx:stable   node5     Running         Running 27 seconds ago                                           
pdujac0801kl    \_ web-fe.5    nginx:latest   node5     Shutdown        Shutdown 29 seconds ago                                          
8vv4hmm7iqfa   web-fe.6        nginx:stable   node3     Running         Running 27 seconds ago                                           
m6u7nu8qijkr    \_ web-fe.6    nginx:latest   node3     Shutdown        Shutdown 29 seconds ago                                          
z6op7153ufpj   web-fe.7        nginx:stable   node5     Running         Running 53 seconds ago                                           
fbcqknuqlcgn    \_ web-fe.7    nginx:latest   node5     Shutdown        Shutdown 54 seconds ago                                          
1gmx8zp4czde   web-fe.8        nginx:stable   node1     Running         Running about a minute ago                                       
aeh626vdw93r    \_ web-fe.8    nginx:latest   node3     Shutdown        Shutdown about a minute ago                                      
opi9vpc6lz8w   web-fe.9        nginx:stable   node4     Running         Running about a minute ago                                       
2pnqlhxpv3ov    \_ web-fe.9    nginx:latest   node4     Shutdown        Shutdown about a minute ago                                      
pp8ib1c5v0r9   web-fe.10       nginx:stable   node4     Running         Running about a minute ago                                       
1zx5hidgt9dj    \_ web-fe.10   nginx:latest   node4     Shutdown        Shutdown about a minute ago 

Last modified January 16, 2025: Update docker-swarm.md (1519d6d)