
Docker daemon floods logs with false errors: Error getting service... service ... not found #45316

Closed
filosof86 opened this issue Apr 13, 2023 · 1 comment
Labels
area/logging area/swarm kind/bug status/0-triage

Comments

@filosof86

Description

Hi!

I recently noticed that, with swarm mode initialized, the Docker daemon puts a lot of messages into the system logs about a service not being found after it looks up that service by a task ID.

level=error msg="Error getting service 1qqzep2cj7d3mdut1zq3ufbal: service 1qqzep2cj7d3mdut1zq3ufbal not found"
level=error msg="Error getting service l1g3p7hev5gh6sds2z6t9ztqh: service l1g3p7hev5gh6sds2z6t9ztqh not found"
level=error msg="Error getting service tg34g7fxxo1i40kycu4i6zeu1: service tg34g7fxxo1i40kycu4i6zeu1 not found"
level=error msg="Error getting service waa3id190ievujjflz8of28e6: service waa3id190ievujjflz8of28e6 not found"
.....
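
For reference, the volume of these messages can be gauged straight from the journal (a sketch, assuming the daemon logs to journald under the docker.service unit as in the reproduction steps below):

> journalctl -u docker.service --since "1 hour ago" | grep -c "Error getting service"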

An example, and my investigation based on it:

Take some container (in my case a task of the 'es_test' service):

> docker container ls
CONTAINER ID   IMAGE                 COMMAND                  CREATED          STATUS          PORTS                NAMES
2f8d6e5d217f   elasticsearch:2.4.1   "/docker-entrypoint.…"   24 minutes ago   Up 24 minutes   9200/tcp, 9300/tcp   es_test.1.p438c59lx1k1g7rq4etfyh32c

Its task id is p438c59lx1k1g7rq4etfyh32c
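
For what it's worth, the task ID can also be read from the container itself via the swarm task label (a sketch using the container ID from the listing above):

> docker inspect -f '{{ index .Config.Labels "com.docker.swarm.task.id" }}' 2f8d6e5d217f
p438c59lx1k1g7rq4etfyh32c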

With docker daemon debug logging enabled, we see the following picture:

Apr 12 16:03:52 dockerd[385]: time="2023-04-12T16:03:52.495601968Z" level=debug msg="Calling GET /v1.42/containers/p438c59lx1k1g7rq4etfyh32c/json"
Apr 12 16:03:52  dockerd[385]: time="2023-04-12T16:03:52.496073189Z" level=debug msg="Calling GET /v1.42/images/p438c59lx1k1g7rq4etfyh32c/json"
Apr 12 16:03:52 dockerd[385]: time="2023-04-12T16:03:52.496467414Z" level=debug msg="Calling GET /v1.42/networks/p438c59lx1k1g7rq4etfyh32c"
Apr 12 16:03:52  dockerd[385]: time="2023-04-12T16:03:52.496983849Z" level=debug msg="error handling rpc" error="rpc error: code = NotFound desc = network p438c59lx1k1g7rq4etfyh32c
Apr 12 16:03:52  dockerd[385]: time="2023-04-12T16:03:52.497905335Z" level=debug msg="Calling GET /v1.42/volumes/p438c59lx1k1g7rq4etfyh32c"
Apr 12 16:03:52  dockerd[385]: time="2023-04-12T16:03:52.497928245Z" level=debug msg="Probing all drivers for volume with name: p438c59lx1k1g7rq4etfyh32c"
Apr 12 16:03:52  dockerd[385]: time="2023-04-12T16:03:52.498083737Z" level=debug msg="error handling rpc" error="rpc error: code = NotFound desc = volume p438c59lx1k1g7rq4etfyh32c
Apr 12 16:03:52  dockerd[385]: time="2023-04-12T16:03:52.498957461Z" level=debug msg="Calling GET /v1.42/info"
Apr 12 16:03:52  dockerd[385]: time="2023-04-12T16:03:52.506947659Z" level=debug msg="Calling GET /v1.42/services/p438c59lx1k1g7rq4etfyh32c?insertDefaults=true"
Apr 12 16:03:52  dockerd[385]: time="2023-04-12T16:03:52.507196801Z" level=debug msg="error handling rpc" error="rpc error: code = NotFound desc = service p438c59lx1k1g7rq4etfyh32c
Apr 12 16:03:52  dockerd[385]: time="2023-04-12T16:03:52.507612298Z" level=error msg="Error getting service p438c59lx1k1g7rq4etfyh32c: service p438c59lx1k1g7rq4etfyh32c not found"
Apr 12 16:03:52  dockerd[385]: time="2023-04-12T16:03:52.507948866Z" level=debug msg="Calling GET /v1.42/tasks/p438c59lx1k1g7rq4etfyh32c"
Apr 12 16:03:52  dockerd[385]: time="2023-04-12T16:03:52.574177273Z" level=debug msg="Calling HEAD /_ping"
....

AFAICS, docker probes the ID p438c59lx1k1g7rq4etfyh32c assuming it to be a container, then an image, a network, and so on.
None of these entities are found, and at the default logging level we get one of these messages: error msg="Error getting service p438c59lx1k1g7rq4etfyh32c: service p438c59lx1k1g7rq4etfyh32c not found"

At the end of the probing, docker eventually resolves the ID as a task and moves on:

Apr 12 16:03:52  dockerd[385]: time="2023-04-12T16:03:52.507612298Z" level=error msg="Error getting service p438c59lx1k1g7rq4etfyh32c: service p438c59lx1k1g7rq4etfyh32c not found"
Apr 12 16:03:52  dockerd[385]: time="2023-04-12T16:03:52.507948866Z" level=debug msg="Calling GET /v1.42/tasks/p438c59lx1k1g7rq4etfyh32c"
Apr 12 16:03:52 dockerd[385]: time="2023-04-12T16:03:52.574177273Z" level=debug msg="Calling HEAD /_ping"
....

This happens for every task ID in the system, which is how I end up flooded with "Error getting service..." messages.
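
If it helps, the same lookup chain can be replayed by hand against the engine API (a rough sketch, assuming the default /var/run/docker.sock socket and API version 1.42 as in the debug log above); per that log, the GET on the services endpoint with a task ID is what emits the error line:

ID=p438c59lx1k1g7rq4etfyh32c
# walk the same chain the debug log shows: container, image, network, volume, service, task
for path in "containers/$ID/json" "images/$ID/json" "networks/$ID" "volumes/$ID" "services/$ID" "tasks/$ID"; do
    curl -s --unix-socket /var/run/docker.sock "http://localhost/v1.42/$path" | head -c 120
    echo
done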

Can you please help me sort out this issue? Is this expected behavior, or something I can adjust in the configuration?

Thank you in advance!

Reproduce

  1. Start docker swarm
  2. Deploy some services
  3. Take a look at the docker daemon logs (e.g. journalctl -u docker.service | tail -n50); a sketch of these steps follows below
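
A minimal end-to-end sketch of the steps above (the service name and image are only examples, and I'm assuming the probe chain in the debug log corresponds to a generic inspect by ID):

# 1. start a single-node swarm and deploy a service
docker swarm init
docker service create --name es_test elasticsearch:2.4.1
# 2. inspect something by its task ID to trigger the probe chain
docker inspect $(docker service ps es_test -q | head -n1)
# 3. the error line shows up in the daemon logs
journalctl -u docker.service | tail -n 50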

Expected behavior

Docker should not try to look up the various entity types using the task ID, and thus should not emit these errors.

docker version

> docker version
Client: Docker Engine - Community
 Version:           23.0.1
 API version:       1.42
 Go version:        go1.19.5
 Git commit:        a5ee5b1
 Built:             Thu Feb  9 19:51:00 2023
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          23.0.1
  API version:      1.42 (minimum version 1.12)
  Go version:       go1.19.5
  Git commit:       bc3805a
  Built:            Thu Feb  9 19:48:42 2023
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.18
  GitCommit:        2456e983eb9e37e47538f59ea18f2043c9a73640
 runc:
  Version:          1.1.4
  GitCommit:        v1.1.4-0-g5fd4c4d
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

docker info

Client:
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.10.2
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.16.0
    Path:     /usr/libexec/docker/cli-plugins/docker-compose
  scan: Docker Scan (Docker Inc.)
    Version:  v0.23.0
    Path:     /usr/libexec/docker/cli-plugins/docker-scan

Server:
 Containers: 22
  Running: 6
  Paused: 0
  Stopped: 16
 Images: 12
 Server Version: 23.0.1
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: active
  NodeID: o2iug7etpp9mmjrsxisk3tsem
  Is Manager: true
  ClusterID: p48ag1uej4mf3jurzz40ymyos
  Managers: 1
  Nodes: 1
  Default Address Pool: 10.11.0.0/16  
  SubnetSize: 24
  Data Path Port: 4789
  Orchestration:
   Task History Retention Limit: 5
  Raft:
   Snapshot Interval: 10000
   Number of Old Snapshots to Retain: 0
   Heartbeat Tick: 1
   Election Tick: 10
  Dispatcher:
   Heartbeat Period: 5 seconds
  CA Configuration:
   Expiry Duration: 3 months
   Force Rotate: 0
  Autolock Managers: false
  Root Rotation In Progress: false
  Node Address: 10.0.0.1
  Manager Addresses:
   10.0.0.1:2377
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 2456e983eb9e37e47538f59ea18f2043c9a73640
 runc version: v1.1.4-0-g5fd4c4d
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
 Kernel Version: 3.10.0-1160.88.1.0.1.el7.x86_64
 Operating System: Oracle Linux
 OSType: linux
 Architecture: x86_64
 CPUs: 4
 Total Memory: 14.45GiB
 Name: host.test
 ID: 0051033b-1698-4915-9d5b-7a5db33c5850
 Docker Root Dir: /docker_var/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
 Default Address Pools:
   Base: 10.19.0.0/16, Size: 21

Additional Info

No response

@filosof86 filosof86 added kind/bug status/0-triage labels Apr 13, 2023
@corhere
Contributor

corhere commented May 3, 2023

This issue has been fixed in v23.0.2.

@corhere corhere closed this as not planned May 3, 2023