Basic Station kills itself after a container restart #13

Open
gilleswaeber opened this issue Nov 12, 2023 · 4 comments

@gilleswaeber

Hi, I had an issue where the Basic Station process would die immediately after a container restart with the message:

Killing process 29

This seems to be caused by killOldPid (https://github.com/lorabasics/basicstation/blob/master/src-linux/sys_linux.c#L366), which reads a PID from a file (/var/tmp/station.pid) and kills that process if it is still running, to avoid having multiple station processes running at the same time.

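Since start.sh always runs the same sequence of commands, the station process always gets the same PID after a restart. The PID stored in the file therefore matches the newly started process itself, and there is no check for that case in the code, so the station kills itself.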
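
To illustrate the kind of check that appears to be missing, here is a minimal shell sketch. It is not the Basic Station code (which is C), and the function name `should_kill_old_pid` is hypothetical. It only shows the idea: a recorded PID should be signalled only if it is numeric, alive, and not the PID of the caller.

```shell
#!/bin/sh
# Illustration only, NOT the Basic Station implementation.
# should_kill_old_pid PIDFILE SELF_PID   (hypothetical helper)
# Succeeds only if PIDFILE names a live process other than SELF_PID.
should_kill_old_pid() {
    pidfile=$1
    self=$2
    [ -r "$pidfile" ] || return 1          # no readable PID file: nothing to kill
    old=$(cat "$pidfile")
    case $old in
        ''|*[!0-9]*) return 1 ;;           # empty or non-numeric content
    esac
    [ "$old" -ne "$self" ] || return 1     # stale file points at the caller itself
    kill -0 "$old" 2>/dev/null             # succeeds only if the process exists and can be signalled
}
```

A caller would run something like `should_kill_old_pid /var/tmp/station.pid "$$" && kill "$(cat /var/tmp/station.pid)"`, so a stale file that happens to contain the caller's own PID is ignored.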

It might be better to fix it in the lorabasics repo somehow, but for now adding the following in the start script fixed it.

rm -r /var/tmp/ 2> /dev/null
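
A narrower variant of the same workaround could remove only the leftover PID file rather than the whole directory. This is a sketch, not something tested against the image: the function name `clear_stale_pidfile` is hypothetical, and the default path assumes the PID file is station.pid inside the /var/tmp/ directory used above.

```shell
#!/bin/sh
# Remove only a leftover PID file instead of the whole temp directory.
# The default path is an assumption; pass another path if your setup differs.
clear_stale_pidfile() {
    pidfile=${1:-/var/tmp/station.pid}
    rm -f -- "$pidfile"                    # -f: no error if the file does not exist
}
```

In start.sh this would be called once, as `clear_stale_pidfile`, before the station binary is launched.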
@xoseperez
Owner

Hi! Thank you for reporting the problem.
I have not tested it yet, but I have a question: even though the PID of the process inside the container might always be the same, how does the PID file persist across reboots?

@gilleswaeber
Author

gilleswaeber commented Nov 13, 2023

Thanks for your reply.
For the PID: inside the container, Linux assigns PIDs in ascending order, starting with 1 for the first process. The exact PID may vary between configurations, but since the start.sh script is the only thing starting new processes, the station gets the same PID whenever the same number of processes are started before it. On another device, the station consistently has PID 30.

For the persisting data: this seems to be a quirk of docker compose when restarting (as far as I can tell it's not in the docs, but it's mentioned e.g. here: https://stackoverflow.com/questions/69369205/how-to-return-one-container-to-a-clean-state-in-docker-compose). Data seems to persist after a docker compose restart or when the container restarts because of a failure, but not after a down and up or after a full system reboot.

The easiest way I found to reproduce the issue is to observe the logs with docker logs basicstation --since 2m -f and then type
docker exec basicstation pkill station in another tab (with restart: unless-stopped or always in docker-compose.yml).

@xoseperez
Owner

I have tested the setup as you suggested and it worked as expected: no error after the reboot. Could it be specific to a certain docker version?

Also, I don't see how the issue you linked applies here. There is no persistent storage in the image except what is defined in the docker compose file (if any).

Anyway, adding the line you suggest will do no harm, so I'm OK with it. But I fear the issue might be due to something different...

@gilleswaeber
Author

That's weird. I checked the Debian Dockerfiles for balena and they don't seem to define a volume explicitly anywhere.
It could be specific to the host OS then, or the docker version, or something else... but it would seem reasonable to assume that this is a docker issue, I'll see if I can make a simpler repro and submit a report there.

My setup, for the record: Debian 11 (buster), Docker 24.0 (tested with both a normal setup and rootless), armhf architecture (tested with a RPi 3B and a BeagleBone Green)

@xoseperez xoseperez self-assigned this Feb 22, 2024