Telegraf docker error - just me ? #625

Open
SimonMcN opened this issue Nov 9, 2022 · 4 comments

SimonMcN commented Nov 9, 2022

I saw this error in my logs:
2022-11-09T14:31:00Z E! [inputs.docker] Error in plugin: Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/containers/json?filters=%7B%22status%22%3A%7B%22running%22%3Atrue%7D%7D&limit=0": dial unix /var/run/docker.sock: connect: permission denied

After some investigation I found this:
docker/compose#1532 (comment)

Is this something to do with file permissions, please?

SimonMcN commented Nov 9, 2022

I ended up with the compose service definition below, which seems to work. The user clause is the pertinent part.

  telegraf:
    container_name: telegraf
    build: ./.templates/telegraf/.
    image: telegraf:latest
    restart: unless-stopped
    user: telegraf:998
    environment:
    - TZ=Etc/UTC
    - HOST_ETC=/hostfs/etc
    - HOST_PROC=/hostfs/proc
    - HOST_SYS=/hostfs/sys
    - HOST_VAR=/hostfs/var
    - HOST_RUN=/hostfs/run
    - HOST_MOUNT_PREFIX=/hostfs
    ports:
    - "8092:8092/udp"
    - "8094:8094/tcp"
    - "8125:8125/udp"
    volumes:
    - ./volumes/telegraf:/etc/telegraf
    - /var/run/docker.sock:/var/run/docker.sock:ro
    - /:/hostfs:ro
    depends_on:
    - influxdb

@Paraphraser

Well, here's my service definition:

  telegraf:
    container_name: telegraf
    build: ./.templates/telegraf/.
    hostname: iotstack
    restart: unless-stopped
    environment:
      - TZ=Australia/Sydney
    ports:
      - "8092:8092/udp"
      - "8094:8094/tcp"
      - "8125:8125/udp"
    volumes:
      - ./volumes/telegraf:/etc/telegraf
      - /var/run/docker.sock:/var/run/docker.sock:ro
    depends_on:
      - influxdb
      - mosquitto

I just did a clean-slate install:

  • terminate the container
  • remove the image
  • erase the persistent store
  • up the container (force a rebuild and complete re-initialisation)

The result in the log:

$ docker logs telegraf
2022-11-09T23:23:39Z I! Using config file: /etc/telegraf/telegraf.conf
2022-11-09T23:23:39Z I! Starting Telegraf 1.24.3
2022-11-09T23:23:39Z I! Available plugins: 221 inputs, 9 aggregators, 26 processors, 20 parsers, 57 outputs
2022-11-09T23:23:39Z I! Loaded inputs: cpu disk diskio docker file kernel mem processes swap system
2022-11-09T23:23:39Z I! Loaded aggregators: 
2022-11-09T23:23:39Z I! Loaded processors: 
2022-11-09T23:23:39Z I! Loaded outputs: influxdb
2022-11-09T23:23:39Z I! Tags enabled: host=iotstack
2022-11-09T23:23:39Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"iotstack", Flush Interval:10s

Metrics are also turning up in InfluxDB so that's working too.

I'm not saying this is the answer, but problems involving docker.sock usually turn out to be caused by an incomplete installation of Docker in the Raspbian environment and, in particular, by not having done:

$ sudo usermod -G docker -a $USER
$ sudo usermod -G bluetooth -a $USER
$ sudo reboot

You can also get away with just a logout and login rather than the reboot.

To the best of my recollection, the current user not being a member of group docker is something that shows up at docker-compose time rather than the container seeming to come up OK but moaning internally. You didn't mention anything like that happening so that's quite puzzling.

The mechanism by which the current user gains access to docker.sock by being a member of the docker group is (to my eye) fairly straightforward:

$ ls -al /var/run/docker.sock
srw-rw---- 1 root docker 0 Nov  3 12:09 /var/run/docker.sock

members of docker get rw access
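
As a rough illustration of that mechanism, here is a minimal sketch of a check that could be run on the host. The function name in_owning_group is made up for this example, and it assumes GNU stat (as found on Raspberry Pi OS). It only compares group ownership against the current process's groups; it does not inspect the mode bits.

```shell
#!/bin/sh
# Hypothetical helper (not part of Docker or IOTstack): succeed if the
# current process belongs to the group that owns the given path,
# e.g. /var/run/docker.sock.
in_owning_group() {
    # numeric GID of the path's owning group (GNU stat syntax)
    gid=$(stat -c '%g' "$1" 2>/dev/null) || return 2
    # id -G lists every group the current process is a member of
    for g in $(id -G); do
        [ "$g" = "$gid" ] && return 0
    done
    return 1
}
```

Calling in_owning_group /var/run/docker.sock returns 0 if the current login session is in the socket's group, 1 if it is not, and 2 if the path cannot be examined. A group added with usermod only shows up here after a fresh login.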

How your user statement solves the problem is something I can't explain. On my system, there is no telegraf in /etc/passwd and group 998 is i2c which pi (my $USER) is a member of. As I read the permissions above, you either need to be root or a member of docker:

$ grep docker /etc/group
docker:x:995:pi

So, on my system at least, pi is the only member of docker.

Beats me!

Anyway, perhaps try the usermod commands, comment-out the user clause, and see what happens.

@Paraphraser

Just as an experiment, I removed the current user from the docker group, logged out/in to let it take effect, and tried to bring up telegraf:

$ docker-compose up -d telegraf
permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/containers/json?all=1&filters=%7B%22label%22%3A%7B%22com.docker.compose.project%3Diotstack%22%3Atrue%7D%7D": dial unix /var/run/docker.sock: connect: permission denied

That's the same error but it lacks the "2022-11-09T14:31:00Z E! [inputs.docker] Error in plugin: Got " preamble which shows yours is coming from docker logs telegraf.
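
The distinction between the two messages can be sketched as a small classifier. This is only an illustration: the function name error_origin is hypothetical, and the patterns are taken from the two error lines quoted in this thread, not from any documented log format.

```shell
#!/bin/sh
# Hypothetical helper: guess where a "permission denied" line came from,
# based on the two messages quoted in this thread.
error_origin() {
    case "$1" in
        # Telegraf's own log prefix: the error was raised inside the container
        *'E! [inputs.docker]'*)
            echo "container" ;;
        # Bare message: printed by docker or docker-compose on the host
        *'permission denied while trying to connect to the Docker daemon socket'*)
            echo "host" ;;
        *)
            echo "unknown" ;;
    esac
}
```

A line from docker logs telegraf would be classified as "container", while the output of a failed docker-compose up would be classified as "host".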

Of course, not being a member of docker pretty much prevents anything from working unless I use sudo.

When I try:

$ sudo docker ps
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES

my container hasn't even come up. I suppose I could try to follow-through using sudo on everything but I don't want to do that because it just creates myriad other problems which I'll then have to unpick.

So, have you also been using sudo to run docker and docker-compose commands? That should never be needed. Maybe read this for some context.

If this combination (not being in docker group and using sudo to run docker commands) does actually turn out to explain your problem, please let me know. I'll add some extra words to that doco page to emphasise that needing to use sudo in that situation indicates a deeper problem.

I still can't explain how user: telegraf:998 solves this…
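
One possible explanation, offered as an assumption rather than anything confirmed in this thread: a bind-mounted socket keeps its host numeric owner and group inside the container, so if the Telegraf process in the container is not running as root, user: telegraf:998 would work whenever 998 happens to be the GID of the docker group on the reporter's host (it is 995 on the system shown above). A minimal sketch for checking that on the host, assuming GNU stat and using a made-up function name:

```shell
#!/bin/sh
# Hypothetical helper: print a Compose "user:" value whose group matches
# the numeric GID that owns the given socket path on the host.
compose_user_for_socket() {
    # numeric GID of the socket's owning group (GNU stat syntax)
    gid=$(stat -c '%g' "$1" 2>/dev/null) || return 1
    echo "user: telegraf:$gid"
}
```

Running compose_user_for_socket /var/run/docker.sock on the affected host and comparing the result with the user clause in the compose file would support or rule out this theory.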
