Error creating machine #114

Open
julianboehne opened this issue Sep 19, 2023 · 13 comments

@julianboehne

Hello,
I used this driver with different autoscaling images, like this one. I'm running Docker locally on my Windows system, and an error occurs when I run this command:

docker-machine create \
  --driver hetzner \
  --hetzner-api-token=******** \
  --hetzner-server-location=fsn1 \
  --hetzner-image=ubuntu-22.04 \
  --hetzner-server-type=cx11 \
  GitLab-Docker-Machine

The server starts on the Hetzner Cloud, but I get this SSH error:

Error creating machine: Error running provisioning: Error running "DEBIAN_FRONTEND=noninteractive sudo -E apt-get install -y  curl": ssh command error:
command : DEBIAN_FRONTEND=noninteractive sudo -E apt-get install -y  curl
err     : exit status 255

I also tried to connect to the server with ssh from inside the Docker container, and that works well.
What can I do?

@JonasProgrammer
Owner

Hi, the exit status comes from the commands run by docker-machine to provision an already existing server, i.e. the driver has nothing to do with this.

Can you try running DEBIAN_FRONTEND=noninteractive sudo -E apt-get install -y curl directly on the working SSH connection? Perhaps you'll get more error output then.

@julianboehne
Author

Alright, I tested the command directly over ssh and it works well. I searched for this command in the docker-machine repo and found it there. But why does this command work over ssh and fail when run by docker-machine?
When I try to debug in Docker with docker-machine -D, I get the following new error:

About to run SSH command:

                if ! grep -xq '.*\sGitLab-Docker-Machine' /etc/hosts; then
                        if grep -xq '127.0.1.1\s.*' /etc/hosts; then
                                sudo sed -i 's/^127.0.1.1\s.*/127.0.1.1 GitLab-Docker-Machine/g' /etc/hosts;
                        else 
                                echo '127.0.1.1 GitLab-Docker-Machine' | sudo tee -a /etc/hosts; 
                        fi
                fi
SSH cmd err, output: <nil>:

@JonasProgrammer
Owner

Very strange indeed.

Can you perhaps try docker-machine ssh after the server was created? You don't actually need to wait for the provisioning process to fail, just don't kill it right after creation (waiting for the next output after 'Waiting for the server to come up' should be fine).
Despite not being provisioned, the machine's access credentials etc. should be available at this stage, so docker-machine ssh should work (fingers crossed).

If the command runs successfully even then, I'm out of ideas right now. I have seen a fair share of driver-related stuff, but heisenbugs involving seemingly simple shell commands are a first...

@julianboehne
Author

No more ideas,

I stopped the creation process before the "Detecting the provisioner..." step. I tried the docker-machine ssh command and everything works fine. Thanks for the great help, but at the moment I don't have any new ideas for fixing this issue.

@mrjackv

mrjackv commented Sep 22, 2023

@julianboehne I encountered the same issue yesterday and figured out the problem.
It's a combination of:

  • Many servers (~100) trying to brute force the ssh password
  • The default ssh MaxStartups being set relatively low (10:30:100)

That means that every time docker-machine tries to issue a command via ssh, there's a good chance the connection will be dropped.
I've "solved" the problem by using the following cloud-init:

#cloud-config
package_update: true
packages:
  - fail2ban
bootcmd:
  # Temporarily disable ssh, otherwise docker-machine will try to install
  # its stuff before we're done running the cloud-init
  - systemctl disable --now ssh.service
write_files:
- path: /etc/fail2ban/jail.local
  content: |
    [sshd]
    enabled = true
    mode = aggressive
- path: /etc/ssh/sshd_config.d/custom.conf
  content: |
    MaxStartups 300:30:1000
    PasswordAuthentication no
runcmd:
  - systemctl enable --now fail2ban.service
  - systemctl enable --now ssh.service
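
As a rough illustration of why the default MaxStartups of 10:30:100 matters here, the sketch below estimates the chance that sshd refuses a new connection for a given number of pending unauthenticated connections. It assumes the start:rate:full semantics described in sshd_config(5): no drops below start, a drop chance of rate percent at start, rising linearly to 100 percent at full. The function name drop_percent is hypothetical, and the sample counts are made up for illustration, not measured on a real server.

```shell
#!/bin/sh
# drop_percent START RATE FULL PENDING
# Prints the approximate percentage chance that sshd drops a new
# connection while PENDING unauthenticated connections are open.
# Hypothetical helper; the formula is an assumption based on the
# MaxStartups description in sshd_config(5).
drop_percent() {
  start=$1 rate=$2 full=$3 pending=$4
  if [ "$pending" -lt "$start" ]; then
    echo 0
  elif [ "$pending" -ge "$full" ]; then
    echo 100
  else
    # Linear ramp from RATE percent at START to 100 percent at FULL.
    echo $(( rate + (100 - rate) * (pending - start) / (full - start) ))
  fi
}

drop_percent 10 30 100 55     # default limits, 55 pending: prints 65
drop_percent 300 30 1000 55   # raised limits, 55 pending: prints 0
```

Under these assumptions, a few dozen brute-force connections are enough to make most docker-machine SSH commands fail with the default limits, while the raised limits in the cloud-init above leave them untouched.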

@julianboehne
Author

@mrjackv where do I find the cloud-init, or where do I need to create it?

@mrjackv

mrjackv commented Sep 22, 2023

You need to save the contents to a file and then use the command line option --hetzner-user-data-file=<path to file> when running docker-machine create
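
A minimal sketch of those two steps, assuming a POSIX shell. The file location and the HETZNER_API_TOKEN shell variable are placeholders chosen for this example, and the cloud-config shown is abbreviated; substitute the full file from the comment above. The create call is guarded so the script does nothing remote unless docker-machine and a token are actually present.

```shell
#!/bin/sh
# Placeholder location for the user-data file; any readable path works.
# An absolute path avoids ambiguity about the current working directory.
USER_DATA_FILE="${USER_DATA_FILE:-$(mktemp -d)/cloud-init.yml}"

# Abbreviated cloud-config for illustration only; use the full
# contents from the earlier comment in practice.
cat > "$USER_DATA_FILE" <<'EOF'
#cloud-config
write_files:
- path: /etc/ssh/sshd_config.d/custom.conf
  content: |
    MaxStartups 300:30:1000
    PasswordAuthentication no
EOF

# Sanity check: the first line must be the #cloud-config marker.
head -n 1 "$USER_DATA_FILE"

# Only attempt creation when docker-machine and a token are available.
# HETZNER_API_TOKEN is an assumed variable name for this sketch.
if command -v docker-machine >/dev/null 2>&1 && [ -n "${HETZNER_API_TOKEN:-}" ]; then
  docker-machine create \
    --driver hetzner \
    --hetzner-user-data-file="$USER_DATA_FILE" \
    --hetzner-api-token="$HETZNER_API_TOKEN" \
    --hetzner-server-location=fsn1 \
    --hetzner-image=ubuntu-22.04 \
    --hetzner-server-type=cx11 \
    GitLab-Docker-Machine
fi
```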

@julianboehne
Author

I tried it like this:

docker-machine create \
  --driver hetzner \
  --hetzner-user-data-file=usr/cloud/cloud-init \
  --hetzner-api-token=*********** \
  --hetzner-server-location=fsn1 \
  --hetzner-image=ubuntu-22.04 \
  --hetzner-server-type=cx11 \
  GitLab-Docker-Machine

But it didn't work. Did I misunderstand something?

@JonasProgrammer
Owner

Thanks for the observation @mrjackv. Despite maintaining the driver, I run mostly on Hetzner metal, so I don't always have insight into what is currently happening in the cloud world. Input from users on the front line is therefore invaluable.

@julianboehne Did you get any kind of error message or did it just exhibit the same behavior as initially described in this issue?

@julianboehne
Author

After trying to create the machine, I observed the following error in the output of docker-machine ls:

NAME                    ACTIVE   DRIVER    STATE     URL                       SWARM   DOCKER    ERRORS
GitLab-Docker-Machine   -        hetzner   Running   tcp://49.13****:2376           Unknown   Unable to query docker version: Cannot connect to the docker engine endpoint

Some additional information:

  • I'm using Docker Desktop 4.23.0 on Windows 10 (I also tried lower Docker versions)
  • I tried different approaches to fix this problem, like this or this one, but they also produced the same error

@JonasProgrammer
Owner

Can you try to SSH into the machine after creation to see whether the Docker daemon is actually running (systemctl status docker.service or something similar)?

Other than that, do you have a firewall configured (either on the machine itself or via Hetzner)? Could you also post the output of systemctl cat docker.service and systemctl cat docker.socket?

@julianboehne
Author

Docker is not running on the Hetzner Server I created. To my knowledge, I have not configured any firewalls.

root@GitLab-Docker-Machine:~# systemctl status docker.service
Unit docker.service could not be found.
root@GitLab-Docker-Machine:~# systemctl cat docker.service
No files found for docker.service.
root@GitLab-Docker-Machine:~# systemctl cat docker.socket
No files found for docker.socket.

@gschafra

Debug output when doing docker-machine -D regenerate-certs GitLab-Docker-Machine -f:

package: action=install name=curl
(GitLab-Docker-Machine) Calling .GetSSHHostname
(GitLab-Docker-Machine) Calling .GetSSHPort
(GitLab-Docker-Machine) Calling .GetSSHKeyPath
(GitLab-Docker-Machine) Calling .GetSSHKeyPath
(GitLab-Docker-Machine) Calling .GetSSHUsername
Using SSH client type: external
Using SSH private key: /root/.docker/machine/machines/GitLab-Docker-Machine/id_rsa (-rw-------)
&{[-F /dev/null -o ConnectionAttempts=3 -o ConnectTimeout=10 -o ControlMaster=no -o ControlPath=none -o LogLevel=quiet -o PasswordAuthentication=no -o ServerAliveInterval=60 -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null [email protected] -o IdentitiesOnly=yes -i /root/.docker/machine/machines/GitLab-Docker-Machine/id_rsa -p 22] /usr/bin/ssh <nil>}
About to run SSH command:
DEBIAN_FRONTEND=noninteractive sudo -E apt-get install -y  curl
SSH cmd err, output: exit status 255: 
Error running "DEBIAN_FRONTEND=noninteractive sudo -E apt-get install -y  curl": ssh command error:
command : DEBIAN_FRONTEND=noninteractive sudo -E apt-get install -y  curl
err     : exit status 255
output  :

Running the corresponding ssh command directly in the terminal works like a charm.
