since approx 2018-06-05, in-docker-container ansible-container build fails with "ansible.errors.AnsibleError: the role '<rolename>' was not found in <rolespath>" on different roles depending on environment #942
Comments
I took a look at your issue, using the POC repo from another issue as a base (https://github.com/Nexlo/ansible-test) and extending it to use two roles:
On a clean virtualenv (py2, base OS Ubuntu 16.04 LTS), without the mentioned Dockerfile, it works like a charm for me. This makes me think the issue might not be in ansible-container but in your environment (i.e. the combination of dockerized ansible-container + conductor + container). Perhaps you could create a POC repository for the issue, using https://github.com/Nexlo/ansible-test above as a basis? As an option, try building with --no-container-cache.
If it gets better, please comment here.
Voronenko, I do appreciate you looking at this. (At this time it seems the support that ansible-container users might get post-2fa778a is ourselves!) My original writeup is reeeeely long and the working/not-working scenarios are buried in too much other text:
I agree that the main factor is the dockerized ansible-container setup. The thing that strikes me as very strange is that the dockerized configuration was working fine for the month or two that I was using it successfully before approx June 5. And on the server that was working previously, the roles that now cannot be found are the roles I added after June 5; all the previously found and previously working roles still work. Would you mind trying running an ansible-container build in a container? Here's a minimal ubuntu:xenial Dockerfile that should run ansible-container successfully (mount in /var/run/docker.sock and your ansible code):
pip freeze output in both in-virtualenv working and global-env nonworking ubuntu situations is:
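One way to double-check that two environments really do carry the same packages is to compare their saved pip freeze outputs programmatically. The following is a minimal sketch, not part of ansible-container; the function names are hypothetical, and it only considers pinned `name==version` lines.

```python
def parse_freeze(text):
    """Map package name -> version for the 'name==version' lines of pip freeze output."""
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and entries without a pinned version
        # (for example editable installs).
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        packages[name.lower()] = version
    return packages


def freeze_differences(text_a, text_b):
    """Return {name: (version_a, version_b)} for packages that differ.

    None in either position means the package is absent from that environment.
    """
    a, b = parse_freeze(text_a), parse_freeze(text_b)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }
```

An empty result from `freeze_differences` would support the claim that the Python package sets match, which points the search toward something outside pip (paths, mounts, or the docker side).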
On the one hand, I can confirm the issue (i.e. in some circumstances a role is not found if it is mapped to a path other than the one on the original host); on the other hand, the whole approach is erroneous:
i.e. my summary at this point: I would not do it that way, and would instead go with local Python and ansible-container in a virtualenv.
The docker process starts to find the mentioned roles and even tries to build. I would not build docker from docker with a mapped sock. Using a TCP port? Who knows; it seems more reliable, since at least it will send the context there. Hope that helps.
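The path-mapping concern above can be illustrated with a small sketch. With /var/run/docker.sock mounted into a container, the Docker daemon on the host resolves bind-mount source paths against the host filesystem, so a roles path that exists only inside the container, or at a different location than on the host, may not resolve if it is passed on as a bind mount. The helper names and the `mounts` mapping below are hypothetical and are not part of ansible-container.

```python
import posixpath


def host_path_for(container_path, mounts):
    """Translate a path seen inside the container to the matching host path.

    mounts maps container mount points to host directories (hypothetical
    values supplied by the caller). Returns None when the path is not under
    any mount, meaning the host daemon cannot see it at all.
    """
    path = posixpath.normpath(container_path)
    # Check the longest (most specific) mount point first.
    for mount_point in sorted(mounts, key=len, reverse=True):
        mp = posixpath.normpath(mount_point)
        if path == mp or path.startswith(mp.rstrip("/") + "/"):
            rel = posixpath.relpath(path, mp)
            host_dir = mounts[mount_point]
            return host_dir if rel == "." else posixpath.join(host_dir, rel)
    return None


def same_on_host(container_path, mounts):
    """True only when the host sees the path at the identical location."""
    return host_path_for(container_path, mounts) == posixpath.normpath(container_path)
```

If `same_on_host` is false for the directory given to --roles-path, mounting the project at the same absolute path inside the container as on the host is one thing to try.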
Your suggestions and analysis give me some good ideas for investigating a workaround or alternate approaches. I'll report back if anything ends up successful. (The idea of curl-ing the docker binary directly into the image comes from how the conductor images are created: "docker history --no-trunc ansible/container-conductor-centos-7:0.9.2".) Thank you.
Your comment about the conductor is right. So this is rather an API compatibility issue.
ISSUE TYPE
container.yml
This is a reasonably small example I created to demonstrate the problem. (Yes it fails.)
Individual roles have a tasks/main.yml of the form
substitute BASE for ONE, TWO, THREE, FOUR to match role
OS / ENVIRONMENT
The environment for a virtualenv ansible-container install direct on ubuntu xenial:
Believed-identical environment configured as a Dockerfile-built docker container "FROM ubuntu:xenial":
(I have tried a "FROM centos:7" version as well - no difference.)
My environments are set up pinned to 0.9.2 with various workarounds applied as I encountered the need for them (ubuntu paths below):
pip docker==2.7.0 is a workaround that I can't find a reference for now (?!?!)
the sed filters workaround addresses the ansible-container bug described in moby/moby#34121
the sed return is a workaround for #762
SUMMARY
Heads up: The observed behavior is strikingly similar to #673 but does not involve any cloud-enabled roles; all roles requested confirmed to exist on the filesystem in the single path specified in --roles-path option.
I have many services, each with many different roles listed. Prior to 2018-06-05 everything was working fine on a particular docker host. On 2018-06-05 I added an extra role to my services, at the end of the list (e.g. "BuildBox/Configuration4"), which resulted in different failures depending on the environment.
In a direct-on-iron ansible-container virtualenv environment created after the problem date, an "ansible-container build" call completes fine.
Depending on the docker host I run an ansible-container docker image on, I get an error like:
The <AC_ROLES_PATH> is the path provided in the ansible-container --roles-path option.
The missing <NOTFOUNDROLE> role is, at times:
In all cases I can confirm all roles are present on the local / in-container filesystem before the ansible-container call.
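The presence check described above can be automated. This is a minimal sketch, assuming every role has a tasks/main.yml as in the example roles; `missing_roles` is a hypothetical helper, not an ansible-container function.

```python
import os


def missing_roles(roles_path, role_names):
    """Return the roles that lack a tasks/main.yml under roles_path."""
    missing = []
    for name in role_names:
        # Role names may contain a slash (e.g. "BuildBox/Configuration4"),
        # so split on "/" before joining the path components.
        main_yml = os.path.join(roles_path, *name.split("/"), "tasks", "main.yml")
        if not os.path.isfile(main_yml):
            missing.append(name)
    return missing
```

Running it with the same directory that is passed to --roles-path, both on the host and inside the container, shows whether the two views of the filesystem agree before ansible-container is invoked.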
The fact that on the working-before-2018-06-05 docker host, I can delete the recently-added last role and build successfully suggests that some caching is happening and maybe some intermediary tool changed (c.f. #673) but I am unable to determine what and where.
Failures are not affected by the presence/absence of --debug and/or --use-local-python.
STEPS TO REPRODUCE
Create an on-iron virtualenv and set up environment as shown above
Create a Dockerfile with ansible-container environment as shown above
Set up the container.yml and various roles as described above
Run:
EXPECTED RESULTS
working build, direct on-iron
ACTUAL RESULTS
debug output above, for ansible-container run in docker container on host, varies depending on host