Docker swarm keeps a tasks.db file which can grow large...
Some of our deployments suddenly failed mysteriously:
docker: Error response from daemon: OCI runtime create failed: container_linux.go:346: starting container process caused "process_linux.go:319: getting the final child's pid from pipe caused \"EOF\"": unknown.
Various commands showed something was wrong:
# docker service ls
insdns4o4bu8   xxx_1   replicated   0/1   xxx   *:8000->8000/tcp
dggj6i4actyn   xxx_2   replicated   0/1   xxx   *:8001->8001/tcp

# docker node ls
ID                            HOSTNAME    STATUS   AVAILABILITY   MANAGER STATUS
ux3enftr6krhlrp8bbodo2q1l *   wsdocker6   Down     Active
Running systemctl status docker finally gave the right hint:
swarm component could not be started before timeout was reached
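To check for this message without reading the status output by eye, a small filter can help. This is only a sketch: the function name has_swarm_timeout is made up for illustration, and it matches nothing more than the exact message quoted above.

```shell
# Hypothetical helper: reads log text on stdin and succeeds (exit code 0)
# if it contains the swarm startup timeout message quoted above.
has_swarm_timeout() {
    grep -q "swarm component could not be started before timeout was reached"
}

# Example usage (needs a systemd host with Docker installed):
#   systemctl status docker --no-pager --full 2>&1 | has_swarm_timeout \
#       && echo "swarm startup timeout found"
```

The helper only reads stdin, so it works equally well on saved log files.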
That led me to this issue comment, which finally resolved the problem.
Turns out docker swarm has a tasks.db file which can get corrupted somehow.
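Since the file can also grow large, it may be worth checking its size before digging further. The sketch below assumes the file lives at /var/lib/docker/swarm/worker/tasks.db, which is where it usually sits with the default Docker data root; that path and the helper name file_size_bytes are assumptions, not something confirmed above.

```shell
# Assumed default location of the swarm task database; override TASKS_DB
# if your Docker data root is not /var/lib/docker.
TASKS_DB="${TASKS_DB:-/var/lib/docker/swarm/worker/tasks.db}"

# Hypothetical helper: print the size of the given file in bytes.
file_size_bytes() {
    wc -c < "$1" | tr -d ' '
}

# Example usage (reading under /var/lib/docker usually needs root):
#   file_size_bytes "$TASKS_DB"
```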