
Target remote Incus host #16

Open
wants to merge 8 commits into base: main

Conversation

mttjohnson
Contributor

@mttjohnson mttjohnson commented Dec 12, 2024

I want to allow this project to target an Incus host, such as a remote host, that is not expected to be on the localhost of the system I'm running the commands from.

Allow Terraform/OpenTofu to target remote host

  • creation of variables with defaults that can easily be overridden
  • removal of hard-coded values in modules to allow variable overrides for hosts that need different values
  • separation of the incus provider into its own file where connection details are specified (see the sketch after this list)
  • .opentofu-version file recording the version tested/used, which can be automatically loaded with tenv
  • updated README with additional guidance for getting started
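
For context, here is a minimal sketch of what the separated provider file looks like with these variables (it mirrors the provider_incus.tf change proposed in this pull request):

provider "incus" {
  remote {
    name    = var.incus_host_name    # e.g. "incus"
    scheme  = var.incus_host_scheme  # "unix" or "https"
    address = var.incus_host_address
    port    = var.incus_host_port
    default = var.incus_host_default
  }
}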

Allow Ansible to run commands on remote host

  • ceph packages were not found for noble (ubuntu/24.04) at the https://download.ceph.com/debian-reef/dists/ repo, so I changed the dev-incus-deploy cluster in main.tf to use the newest version supported by the ceph distribution, jammy (ubuntu/22.04).
  • modified ceph.yaml playbook to run the "Ceph - Generate cluster keys and maps" play after installing ceph packages on the hosts. The play also now runs only against the server01 host. This avoids delegating and running ceph commands on the control node (localhost/127.0.0.1) and instead uses the specified installed version of the ceph tools on one of the remote hosts. To maintain the same functionality, additional tasks were added to first check for an existing file on the control node (control node stat), copy it to the remote (copy to remote) if it exists to avoid generating a new one, and fetch it back to the control node (fetch back to control node) if it doesn't already exist there, as when generating it for the first time on the remote host. A condensed sketch of this pattern follows this list.
  • modified ceph.monitors.tpl to use all groups instead of the hosts in the play, because the play now runs only on server01 instead of running on all hosts and delegating commands to the control node. This primarily impacts the "Add nodes to mon map" task.
  • additional guidance, including a Pipfile and ansible_requirements.yml that reference a known working version of Ansible and the collections needed to run the playbooks, plus reference commands in README.md for installing and running that same version of Ansible and its dependencies.
  • added some examples to README.md for launching instances on a new cluster
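
A condensed sketch of the stat/copy/generate/fetch pattern described above, shown for the mon keyring only (the full change appears in the ceph.yaml diff later in this thread):

- name: Generate mon keyring (control node stat)
  delegate_to: 127.0.0.1
  ansible.builtin.stat:
    path: '{{ task_mon_keyring }}'
  register: task_mon_keyring_stat

- name: Generate mon keyring (copy to remote)
  ansible.builtin.copy:
    src: '{{ task_mon_keyring }}'
    dest: /root/generate/task_mon_keyring
  when: task_mon_keyring_stat.stat.exists

- name: Generate mon keyring
  ansible.builtin.shell:
    cmd: ceph-authtool --create-keyring /root/generate/task_mon_keyring --gen-key -n mon. --cap mon 'allow *'
    creates: /root/generate/task_mon_keyring
  when: 'task_fsid'

- name: Generate mon keyring (fetch back to control node)
  ansible.builtin.fetch:
    src: /root/generate/task_mon_keyring
    dest: '{{ task_mon_keyring }}'
    flat: true
  when: not task_mon_keyring_stat.stat.exists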

- creation of variables with defaults that can easily be overridden
- removal of hard-coded values in modules to allow variable overrides for hosts that need different values
- separation of the incus provider into its own file where connection details are specified
- .opentofu-version file recording the version tested/used, which can be automatically loaded with tenv
- updated README with additional guidance for getting started

Signed-off-by: Matt Johnson <[email protected]>
@mttjohnson
Contributor Author

The changes here relating to using Terraform/OpenTofu to target a remote host are complete and worked fine when I tested this locally on my systems, though I intend to continue adding to this pull request to include the changes needed for the Ansible portion as well (which is why I set the status to draft).

When I initially tried to run an Ansible deploy referencing the hosts.yaml.example to install a ceph and ovn cluster on (server01, server02, server03, server04, server05), I found that ceph would not install on Ubuntu 24.04 because there is no available package for the reef or squid releases on the noble dist (https://download.ceph.com/debian-reef/dists/). I tried switching the VMs back to an older distro image, images:ubuntu/22.04, and got it installed, but ran into some other issue further down the playbook execution that I still need some time to investigate and resolve.

- ceph packages were not found for `noble` (ubuntu/24.04) at the https://download.ceph.com/debian-reef/dists/ repo, so I changed the `dev-incus-deploy` cluster in `main.tf` to use the newest version supported by the ceph distribution, `jammy` (ubuntu/22.04).

- modified `ceph.yaml` playbook to run the "Ceph - Generate cluster keys and maps" play after installing ceph packages on the hosts. The play also now runs only against the server01 host. This avoids delegating and running ceph commands on the control node (localhost/127.0.0.1) and instead uses the specified installed version of the ceph tools on one of the remote hosts. To maintain the same functionality, additional tasks were added to first check for an existing file on the control node (control node stat), copy it to the remote (copy to remote) if it exists to avoid generating a new one, and fetch it back to the control node (fetch back to control node) if it doesn't already exist there, as when generating it for the first time on the remote host.

- modified `ceph.monitors.tpl` to use all groups instead of the hosts in the play, because the play now runs only on server01 instead of running on all hosts and delegating commands to the control node. This primarily impacts the "Add nodes to mon map" task.

- additional guidance, including a `Pipfile` and `ansible_requirements.yml` that reference a known working version of Ansible and the collections needed to run the playbooks, plus reference commands in `README.md` for installing and running that same version of Ansible and its dependencies.

- Added some examples to `README.md` for launching instances on a new cluster

Signed-off-by: Matt Johnson <[email protected]>
@mttjohnson
Contributor Author

After getting this all working, I destroyed it and replayed all the steps from start to finish to test it.

Destroy existing test cluster:

rm -r ansible/data/ceph/*
rm -r ansible/data/lvmcluster/*
rm -r ansible/data/ovn/*
cd terraform
tofu destroy

Create new test cluster:

# Create VMs for test cluster
cd terraform
tofu apply -target=module.baremetal

# Display new test cluster VMs
incus list --project dev-incus-deploy

# Configure test cluster (ceph/ovn/incus)
cd ansible
ansible-playbook deploy.yaml --diff

Create some test instances on the new test cluster:

# Open a shell on one of the Incus cluster nodes
incus exec --project dev-incus-deploy server01 bash

# List all instances
incus list

# Launch a system container
incus launch images:ubuntu/22.04 ubuntu-container

# Launch a virtual machine
incus launch images:ubuntu/22.04 ubuntu-vm --vm

# Launch an application container
incus remote add oci-docker https://docker.io --protocol=oci
incus launch oci-docker:hello-world --ephemeral --console
incus launch oci-docker:nginx nginx-app-container

@mttjohnson mttjohnson marked this pull request as ready for review December 27, 2024 23:26
@mttjohnson
Contributor Author

@mkbrechtel This is the work I referenced that I needed to allow Terraform and Ansible to target Incus running from a remote system.

@mttjohnson
Contributor Author

Running the ceph commands on the remote host like this, with the correct version of the ceph tools, should help resolve some of the challenges reported in issue #11

- After additional testing, I found that this additional python package dependency may be needed for TASK [Install btrfs tools], which uses json_query

Signed-off-by: Matt Johnson <[email protected]>
Signed-off-by: Matt Johnson <[email protected]>

# resolved conflicts:
#	ansible/books/ceph.yaml
@stgraber
Copy link
Member

stgraber commented Jan 1, 2025

With the change to add automated testing yesterday, I've moved the default back to 22.04 and Ceph squid due to the noble issue you mentioned.

Sounds like for noble we need to stick to the distro version of Ceph for now...

@mttjohnson
Contributor Author

After merging the most recent changes from the main branch, I retested things locally using the squid ceph release and it worked fine for me.

This workflow is awaiting approval from a maintainer in #16

The automated tests are waiting for approval from a maintainer. Is there anything else I need to change or do for this pull request?

@@ -0,0 +1 @@
1.8.7
Member

I don't think this file should be included in the repository.

Contributor Author

I removed the .opentofu-version file for tenv and removed the comments about it that I had added to the README.md.

Comment on lines 11 to 16
# Provider variables
incus_host_name = "incus" # default: incus
incus_host_scheme = "https" # default: https
incus_host_address = "127.0.0.1" # default: 127.0.0.1
incus_host_port = 443 # default: 443
incus_host_default = true # default: true
Member

That doesn't look ideal. The default should be to use the local Incus server over the unix socket, not go over the network as going over the network would require a pre-trusted certificate.

This is what's causing the tests to fail now as Terraform is now trying to communicate with Incus through 127.0.0.1:443 which doesn't work.

Also, when accessing over HTTPS, a better example port would be 8443 as that's the port that Incus typically listens on.

Contributor Author

Thanks for pointing that out, I hadn't considered running it locally and communicating over a unix socket as that hasn't been an option for me.

I read through the provider docs more closely again, updated the variable defaults, and I expect this to work better for the tests. I wasn't sure if I should leave the scheme value in the terraform.tfvars.example file as https or if I should also have it match the default unix.
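
For illustration, a terraform.tfvars override for a remote HTTPS host would look something like this (the name and address values are placeholders, not defaults):

incus_host_name    = "my-remote"
incus_host_scheme  = "https"
incus_host_address = "incus.example.net"
incus_host_port    = 8443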

Member

I expect a majority of folks (mostly developers) will run against a local Incus daemon, so defaulting to the unix socket is preferable.

@stgraber
Member

stgraber commented Jan 9, 2025

Good, the test did its job and caught a regression ;)

@mttjohnson
Contributor Author

Are there any concerns with including the ansible/Pipfile and comments in the README.md about pipenv?

NOTE: If you need the same version of ansible this was tested with:

pyenv install 3.13.1
pipenv --python "3.13.1" install
pipenv shell
ansible-galaxy install -r ansible_requirements.yml
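
For reference, a minimal Pipfile along those lines could look like the following (a hypothetical sketch; the actual pinned versions live in the ansible/Pipfile added here):

[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
ansible = "*"

[requires]
python_version = "3.13"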

@mttjohnson mttjohnson requested a review from stgraber January 10, 2025 14:44
@stgraber
Member

Pipfile and README changes seem fine to me

- or address needs to specify the path to the unix socket

Signed-off-by: Matt Johnson <[email protected]>
@mttjohnson
Contributor Author

The tests failed with the error "The incus daemon doesn't appear to be started (socket path: 127.0.0.1)" because when scheme was set to unix, the provider took the address value and expected it to be the path to the UNIX socket, which I hadn't realized was a thing based on the provider documentation. Once I get this figured out I'm definitely making a pull request to update the provider docs so someone else doesn't trip over this as much as I am right now.

I tried reading through some of the provider code, but I don't follow enough Go to understand whether I can leave the `address` as an empty string because of https://github.com/lxc/terraform-provider-incus/blob/575ff114dcbef39f6a5d42cc0f59df3cf50b5e06/internal/provider-config/config.go#L213, whether if I want it to be `unix://` I need to specify the `address` as `//` because of https://github.com/lxc/terraform-provider-incus/blob/575ff114dcbef39f6a5d42cc0f59df3cf50b5e06/internal/provider-config/config.go#L386, or whether I could just specify the address as `unix://` and somehow that works itself out.

I ended up deciding to first try the empty string hoping that works since I don't have anything set up to try this out locally.
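
Assuming the empty string works, the resulting default remote would look something like this sketch (untested on my side):

remote {
  name    = "local"
  scheme  = "unix"
  address = ""   # empty, hoping the provider falls back to the default Incus unix socket
  default = true
}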

Signed-off-by: Matt Johnson <[email protected]>
@@ -191,6 +112,171 @@
state: present
when: '"rgw" in task_roles'

- name: Ceph - Generate cluster keys and maps
hosts: server01
Member

We can't do that; when deploying on actual servers, we can't assume the names of the servers.

scheme = var.incus_host_scheme
address = var.incus_host_address
port = var.incus_host_port
default = var.incus_host_default
Member

This isn't an actual config key for the remote definition in the provider as far as I can tell.
It's a config key in the Incus config.yml but that's not something that's useful in this context.

I'll drop that one when pushing through my branch.

Member

Actually, it looks like it's used as a way to tell the Terraform provider which remote to use by default (when not mentioned through incus_remote), so it's then likely preferable to just always set it to true in this context.

Member

At which point we can also fix its name to something we control to reduce the number of variables, say incus-deploy in this case.

Comment on lines +2 to +4
# Incus client library looks in ~/.config/incus for client.crt and client.key for authentication
generate_client_certificates = true
accept_remote_certificate = true
Member

I'm going to drop these two.

accept_remote_certificate being true allows for man-in-the-middle attacks, so we shouldn't make that the default.

generate_client_certificates isn't likely to be useful: if you need to generate a client certificate, then you can't possibly already be trusted by the remote server, so the connection will fail.

Comment on lines +16 to +39

variable "nic_method" {
type = string
default = "network"
validation {
condition = contains(["network", "nictype"], var.nic_method)
error_message = "Valid value is one of the following: network, nictype."
}
}

variable "nictype" {
type = string
default = ""
}

variable "nictype_parent" {
type = string
default = ""
}

variable "network_name" {
type = string
default = ""
}
Member

What's the use case for this?

The example used is one where the default use of network is better suited.
If this is to support a pre-existing bridge or macvlan, I think it'd be better for the user to use incus network create SOME-NETWORK --type=physical parent=EXISTING-INTERFACE and use the resulting network rather than use nictype to directly get to the NIC.
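
For example, the user could wrap an existing interface in a managed network and then point the existing network variable at it (the network and interface names below are placeholders):

incus network create uplink --type=physical parent=eth1

and then set incus_network (or the incus_network_name variable from this branch) to "uplink" in terraform.tfvars.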

@stgraber
Member

I believe I've effectively extracted everything I can from this PR at this stage.

As mentioned above, I've skipped some bits of the Terraform Incus remote definition as I don't think we need to expose all of that to the user, just scheme, address and port should do.

I also removed the nic_type bits as I would expect that everything can be done through managed networks these days but maybe I'm missing something.

The main bit missing is the Ceph reword, but that's stuck due to it currently relying on a fixed name for the initial server.

@stgraber
Member

I also now turned the remote stuff into just one incus_remote variable since, given we recommend the use of one-time tokens to set up new remotes, it's usually better done through incus remote add, with the resulting remote then used with Terraform.

But I can be a bit more flexible on that one; I just couldn't get it to work properly anymore with the more comprehensive config, so I went the easy way and only supplied the remote name :)
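
In practice that flow looks roughly like this (the remote name and token are placeholders; the one-time token is generated on the Incus server, e.g. with incus config trust add):

# on the machine running Terraform/OpenTofu
incus remote add my-remote <one-time-token>
incus remote list

# then reference the remote in terraform.tfvars
# incus_remote = "my-remote"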

@stgraber
Member

stgraber commented Jan 14, 2025

@mttjohnson I've now merged as much as I can from this PR.

Here's what's left when comparing your branch to the current main branch:

diff --git a/README.md b/README.md
index 892cc67..20dade8 100644
--- a/README.md
+++ b/README.md
@@ -3,12 +3,12 @@
 This is a collection of Ansible playbooks, Terraform configurations and scripts to deploy and operate Incus clusters.
 
 ## How to get the test setup run:
-### Install Incus and OpenTofu
-Install Incus stable or LTS on your system from the [zabbly/incus](https://github.com/zabbly/incus) release and initialize it on your local machine.
+### Install incus and OpenTofu
+Install incus stable or LTS on your system from the [zabbly/incus](https://github.com/zabbly/incus) release and initialize it on your local machine.
 
 Install [OpenTofu](https://opentofu.org/docs/intro/install/).
 
-Install the required ceph packages for Ansible on the controller, on Debian that's the `ceph-base` and `ceph-common` packages:
+Install the required ceph packages for ansible on the controller, on Debian that's the `ceph-base` and `ceph-common` packages:
 ```
 apt install --no-install-recommends ceph-base ceph-common
 ```
@@ -24,11 +24,9 @@ Init the terraform project:
 tofu init
 ```
 
-Create 5 VMs and associated networks and storage volumes for testing an Incus cluster:
-If your Incus host needs different values from the default, you may need
-to copy `terraform.tfvars.example` to `terraform.tfvars` and update the
-variables.
+If your Incus host needs different values from the default, you may need to copy `terraform.tfvars.example` to `terraform.tfvars` and update the variables.
 
+Create 5 VMs for testing with incus, ovn, and ceph cluster:
 ```
 tofu apply -target=module.baremetal
 ```
@@ -39,7 +37,7 @@ Go to the ansible directory:
 cd ../ansible/
 ```
 
-NOTE: If you need the same version of Ansible this was tested with:
+NOTE: If you need the same version of ansible this was tested with:
 ```
 pyenv install 3.13.1
 pipenv --python "3.13.1" install
@@ -51,19 +49,18 @@ Copy the example inventory file:
 ```
 cp hosts.yaml.example hosts.yaml
 ```
-NOTE: If you are connecting to a remote Incus host you will need to change the `ansible_incus_remote` variable to match the name of the Incus remote (see: `incus remote list` for a list of remote names to use).
+NOTE: If you are connecting to a remote incus host you will need to change the `ansible_incus_remote` variable to match the name of the Incus remote (see: `incus remote list` for a list of remote names to use).
 
 Run the Playbooks:
 ```
 ansible-playbook deploy.yaml
 ```
 
-NOTE: When re-deploying the same cluster (e.g. following a `terraform destroy`),
-you need to make sure to also clear any local state from the
+NOTE: When re-deploying the same cluster (e.g. following a `terraform
+destroy`), you need to make sure to also clear any local state from the
 `data` directory, failure to do so will cause Ceph/OVN to attempt
 connection to the previously deployed systems which will cause the
 deployment to get stuck.
-
 ```
 rm ansible/data/ceph/*
 rm ansible/data/lvmcluster/*
diff --git a/ansible/books/ceph.yaml b/ansible/books/ceph.yaml
index 487a7ff..54ff7c3 100644
--- a/ansible/books/ceph.yaml
+++ b/ansible/books/ceph.yaml
@@ -113,8 +113,7 @@
       when: '"rgw" in task_roles'
 
 - name: Ceph - Generate cluster keys and maps
-  hosts: all
-  order: shuffle
+  hosts: server01
   gather_facts: yes
   gather_subset:
     - "default_ipv4"
@@ -139,57 +138,144 @@
       squid: 19
   any_errors_fatal: true
   tasks:
-    - name: Generate mon keyring
+    - name: create /root/generate dir
+      ansible.builtin.file:
+        path: /root/generate
+        owner: root
+        group: root
+        mode: 0700
+        state: directory
+        
+    - name: Generate mon keyring (control node stat)
       delegate_to: 127.0.0.1
-      shell:
-        cmd: ceph-authtool --create-keyring {{ task_mon_keyring }} --gen-key -n mon. --cap mon 'allow *'
-        creates: '{{ task_mon_keyring }}'
+      ansible.builtin.stat:
+        path: '{{ task_mon_keyring }}'
+      register: task_mon_keyring_stat
+    - name: Generate mon keyring (copy to remote)
+      ansible.builtin.copy:
+        src: '{{ task_mon_keyring }}'
+        dest: /root/generate/task_mon_keyring
+      when: task_mon_keyring_stat.stat.exists
+    - name: Generate mon keyring
+      ansible.builtin.shell:
+        cmd: ceph-authtool --create-keyring /root/generate/task_mon_keyring --gen-key -n mon. --cap mon 'allow *'
+        creates: /root/generate/task_mon_keyring
       throttle: 1
       when: 'task_fsid'
-
-    - name: Generate client.admin keyring
+    - name: Generate mon keyring (fetch back to control node)
+      ansible.builtin.fetch:
+        src: /root/generate/task_mon_keyring
+        dest: '{{ task_mon_keyring }}'
+        flat: true
+      when: not task_mon_keyring_stat.stat.exists
+
+    - name: Generate client.admin keyring (control node stat)
       delegate_to: 127.0.0.1
-      shell:
-        cmd: ceph-authtool --create-keyring {{ task_client_admin_keyring }} --gen-key -n client.admin --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow *' --cap mgr 'allow *'
-        creates: '{{ task_client_admin_keyring }}'
+      ansible.builtin.stat:
+        path: '{{ task_client_admin_keyring }}'
+      register: task_client_admin_keyring_stat
+    - name: Generate client.admin keyring (copy to remote)
+      ansible.builtin.copy:
+        src: '{{ task_client_admin_keyring }}'
+        dest: /root/generate/task_client_admin_keyring
+      when: task_client_admin_keyring_stat.stat.exists
+    - name: Generate client.admin keyring
+      ansible.builtin.shell:
+        cmd: ceph-authtool --create-keyring /root/generate/task_client_admin_keyring --gen-key -n client.admin --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow *' --cap mgr 'allow *'
+        creates: /root/generate/task_client_admin_keyring
       throttle: 1
       notify: Add key to client.admin keyring
       when: 'task_fsid'
-
-    - name: Generate bootstrap-osd keyring
+    - name: Generate client.admin keyring (fetch back to control node)
+      ansible.builtin.fetch:
+        src: /root/generate/task_client_admin_keyring
+        dest: '{{ task_client_admin_keyring }}'
+        flat: true
+      when: not task_client_admin_keyring_stat.stat.exists
+
+    - name: Generate bootstrap-osd keyring (control node stat)
       delegate_to: 127.0.0.1
-      shell:
-        cmd: ceph-authtool --create-keyring {{ task_bootstrap_osd_keyring }} --gen-key -n client.bootstrap-osd --cap mon 'profile bootstrap-osd' --cap mgr 'allow r'
-        creates: '{{ task_bootstrap_osd_keyring }}'
+      ansible.builtin.stat:
+        path: '{{ task_bootstrap_osd_keyring }}'
+      register: task_bootstrap_osd_keyring_stat
+    - name: Generate bootstrap-osd keyring (copy to remote)
+      ansible.builtin.copy:
+        src: '{{ task_bootstrap_osd_keyring }}'
+        dest: /root/generate/task_bootstrap_osd_keyring
+      when: task_bootstrap_osd_keyring_stat.stat.exists
+    - name: Generate bootstrap-osd keyring
+      ansible.builtin.shell:
+        cmd: ceph-authtool --create-keyring /root/generate/task_bootstrap_osd_keyring --gen-key -n client.bootstrap-osd --cap mon 'profile bootstrap-osd' --cap mgr 'allow r'
+        creates: /root/generate/task_bootstrap_osd_keyring
       throttle: 1
       notify: Add key to bootstrap-osd keyring
       when: 'task_fsid'
-
-    - name: Generate mon map
+    - name: Generate bootstrap-osd keyring (fetch back to control node)
+      ansible.builtin.fetch:
+        src: /root/generate/task_bootstrap_osd_keyring
+        dest: '{{ task_bootstrap_osd_keyring }}'
+        flat: true
+      when: not task_bootstrap_osd_keyring_stat.stat.exists
+
+    - name: Generate mon map (control node stat)
       delegate_to: 127.0.0.1
-      shell:
-        cmd: monmaptool --create{% if task_release_majors[task_release] | default(None) %} --set-min-mon-release={{ task_release_majors[task_release] }}{% endif %} --fsid {{ task_fsid }} {{ task_mon_map }}
-        creates: '{{ task_mon_map }}'
+      ansible.builtin.stat:
+        path: '{{ task_mon_map }}'
+      register: task_mon_map_stat
+    - name: Generate mon map (copy to remote)
+      ansible.builtin.copy:
+        src: '{{ task_mon_map }}'
+        dest: /root/generate/task_mon_map
+      when: task_mon_map_stat.stat.exists
+    - name: Generate mon map
+      ansible.builtin.shell:
+        cmd: monmaptool --create{% if task_release_majors[task_release] | default(None) %} --set-min-mon-release={{ task_release_majors[task_release] }}{% endif %} --fsid {{ task_fsid }} /root/generate/task_mon_map
+        creates: /root/generate/task_mon_map
       throttle: 1
       notify: Add nodes to mon map
       when: 'task_fsid'
+    - name: Generate mon map (fetch back to control node)
+      ansible.builtin.fetch:
+        src: /root/generate/task_mon_map
+        dest: '{{ task_mon_map }}'
+        flat: true
+      when: not task_mon_map_stat.stat.exists
 
   handlers:
-    - name: Add key to client.admin keyring
-      delegate_to: 127.0.0.1
-      shell:
-        cmd: ceph-authtool {{ task_mon_keyring }} --import-keyring {{ task_client_admin_keyring }}
-
-    - name: Add key to bootstrap-osd keyring
-      delegate_to: 127.0.0.1
-      shell:
-        cmd: ceph-authtool {{ task_mon_keyring }} --import-keyring {{ task_bootstrap_osd_keyring }}
-
-    - name: Add nodes to mon map
-      delegate_to: 127.0.0.1
-      shell:
-        cmd: monmaptool --add {{ item.name }} {{ item.ip }} {{ task_mon_map }}
+    - name: Add key to client.admin keyring (remote)
+      # delegate_to: 127.0.0.1
+      ansible.builtin.shell:
+        cmd: ceph-authtool /root/generate/task_mon_keyring --import-keyring /root/generate/task_client_admin_keyring
+      listen: Add key to client.admin keyring
+    - name: Add key to client.admin keyring (fetch)
+      ansible.builtin.fetch:
+        src: /root/generate/task_mon_keyring
+        dest: '{{ task_mon_keyring }}'
+        flat: true
+      listen: Add key to client.admin keyring
+
+    - name: Add key to bootstrap-osd keyring (remote)
+      ansible.builtin.shell:
+        cmd: ceph-authtool /root/generate/task_mon_keyring --import-keyring /root/generate/task_bootstrap_osd_keyring
+      listen: Add key to bootstrap-osd keyring
+    - name: Add key to bootstrap-osd keyring (fetch)
+      ansible.builtin.fetch:
+        src: /root/generate/task_mon_keyring
+        dest: '{{ task_mon_keyring }}'
+        flat: true
+      listen: Add key to bootstrap-osd keyring
+
+    - name: Add nodes to mon map (remote)
+      ansible.builtin.shell:
+        cmd: monmaptool --add {{ item.name }} {{ item.ip }} /root/generate/task_mon_map
       loop: "{{ lookup('template', '../files/ceph/ceph.monitors.tpl') | from_yaml | default([]) }}"
+      listen: Add nodes to mon map
+    - name: Add nodes to mon map (fetch)
+      ansible.builtin.fetch:
+        src: /root/generate/task_mon_map
+        dest: '{{ task_mon_map }}'
+        flat: true
+      listen: Add nodes to mon map
 
 - name: Ceph - Set up config and keyrings
   hosts: all
diff --git a/terraform/baremetal-incus/main.tf b/terraform/baremetal-incus/main.tf
index 44ee171..e998bfd 100644
--- a/terraform/baremetal-incus/main.tf
+++ b/terraform/baremetal-incus/main.tf
@@ -17,9 +17,9 @@ resource "incus_network" "this" {
   description = "Network used to test incus-deploy (OVN uplink)"
 
   config = {
-    "ipv4.address" = var.ovn_uplink_ipv4_address
+    "ipv4.address" = var.ovn_net_uplink_ipv4_address
     "ipv4.nat"     = "true"
-    "ipv6.address" = var.ovn_uplink_ipv6_address
+    "ipv6.address" = var.ovn_net_uplink_ipv6_address
     "ipv6.nat"     = "true"
   }
 }
@@ -48,9 +48,12 @@ resource "incus_profile" "this" {
     type = "nic"
     name = "eth0"
 
-    properties = {
-      "network" = var.network
+    properties = var.nic_method == "network" ? {
+      "network" = var.network_name
       "name"    = "eth0"
+      } : {
+      "nictype" = var.nictype
+      "parent"  = var.nictype_parent
     }
   }
 
diff --git a/terraform/baremetal-incus/variables.tf b/terraform/baremetal-incus/variables.tf
index c01e26b..e5f1baf 100644
--- a/terraform/baremetal-incus/variables.tf
+++ b/terraform/baremetal-incus/variables.tf
@@ -18,16 +18,32 @@ variable "storage_pool" {
   type = string
 }
 
-variable "network" {
-  type = string
+variable "nic_method" {
+  type    = string
+  default = "network"
+}
+
+variable "nictype" {
+  type    = string
+  default = ""
+}
+
+variable "nictype_parent" {
+  type    = string
+  default = ""
+}
+
+variable "network_name" {
+  type    = string
+  default = ""
 }
 
-variable "ovn_uplink_ipv4_address" {
+variable "ovn_net_uplink_ipv4_address" {
   type    = string
   default = ""
 }
 
-variable "ovn_uplink_ipv6_address" {
+variable "ovn_net_uplink_ipv6_address" {
   type    = string
   default = ""
 }
diff --git a/terraform/main.tf b/terraform/main.tf
index 59758d8..778f15e 100644
--- a/terraform/main.tf
+++ b/terraform/main.tf
@@ -7,10 +7,16 @@ module "baremetal" {
   memory         = "4GiB"
 
   storage_pool = var.incus_storage_pool
-  network      = var.incus_network
 
-  ovn_uplink_ipv4_address = var.ovn_uplink_ipv4_address
-  ovn_uplink_ipv6_address = var.ovn_uplink_ipv6_address
+  nic_method = var.incus_nic_method
+  # network settings
+  network_name = var.incus_network_name
+  # nictype settings
+  nictype        = var.incus_nictype
+  nictype_parent = var.incus_nictype_parent
+
+  ovn_net_uplink_ipv4_address = var.baremetal_net_ipv4_address
+  ovn_net_uplink_ipv6_address = var.baremetal_net_ipv6_address
 }
 
 module "services" {
@@ -19,7 +25,12 @@ module "services" {
   project_name   = "dev-incus-deploy-services"
   instance_names = ["ceph-mds01", "ceph-mds02", "ceph-mds03", "ceph-mgr01", "ceph-mgr02", "ceph-mgr03", "ceph-rgw01", "ceph-rgw02", "ceph-rgw03"]
   image          = "images:ubuntu/24.04"
+  storage_pool   = var.incus_storage_pool
 
-  storage_pool = var.incus_storage_pool
-  network      = var.incus_network
+  nic_method = var.incus_nic_method
+  # network settings
+  network_name = var.incus_network_name
+  # nictype settings
+  nictype        = var.incus_nictype
+  nictype_parent = var.incus_nictype_parent
 }
diff --git a/terraform/provider_incus.tf b/terraform/provider_incus.tf
index 6867e14..bdd3596 100644
--- a/terraform/provider_incus.tf
+++ b/terraform/provider_incus.tf
@@ -1,6 +1,13 @@
 provider "incus" {
+  # Incus client library looks in ~/.config/incus for client.crt and client.key for authentication
+  generate_client_certificates = true
+  accept_remote_certificate    = true
+
   remote {
-    name    = var.incus_remote
-    default = true
+    name    = var.incus_host_name
+    scheme  = var.incus_host_scheme
+    address = var.incus_host_address
+    port    = var.incus_host_port
+    default = var.incus_host_default
   }
 }
diff --git a/terraform/services/main.tf b/terraform/services/main.tf
index b44719a..86ed302 100644
--- a/terraform/services/main.tf
+++ b/terraform/services/main.tf
@@ -36,9 +36,12 @@ resource "incus_profile" "this" {
     type = "nic"
     name = "eth0"
 
-    properties = {
-      "network" = var.network
+    properties = var.nic_method == "network" ? {
+      "network" = var.network_name
       "name"    = "eth0"
+      } : {
+      "nictype" = var.nictype
+      "parent"  = var.nictype_parent
     }
   }
 }
diff --git a/terraform/services/variables.tf b/terraform/services/variables.tf
index 72a92fc..0200e96 100644
--- a/terraform/services/variables.tf
+++ b/terraform/services/variables.tf
@@ -14,7 +14,26 @@ variable "storage_pool" {
   type = string
 }
 
-variable "network" {
+variable "nic_method" {
+  type    = string
+  default = "network"
+  validation {
+    condition     = contains(["network", "nictype"], var.nic_method)
+    error_message = "Valid value is one of the following: network, nictype."
+  }
+}
+
+variable "nictype" {
+  type    = string
+  default = ""
+}
+
+variable "nictype_parent" {
+  type    = string
+  default = ""
+}
+
+variable "network_name" {
   type    = string
   default = ""
 }
diff --git a/terraform/terraform.tfvars.example b/terraform/terraform.tfvars.example
index a290838..4c4a7c6 100755
--- a/terraform/terraform.tfvars.example
+++ b/terraform/terraform.tfvars.example
@@ -1,18 +1,27 @@
 # Example of terraform.tfvars
-#
-# You can copy the contents of this file to terraform.tfvars and modify
+# 
+# You can copy the contents of this file to terraform.tfvars and modify 
 # values for your own setup to operate with values different from the defaults
-#
-# A terraform.tfvars will override any other defined variables such as the ones
+# 
+# A terraform.tfvars will override any other defined variables such as the ones 
 # in *.auto.tfvars or defaults in variables.tf
 # https://opentofu.org/docs/language/values/variables/#variable-definitions-tfvars-files
 # https://opentofu.org/docs/language/values/variables/#variable-definition-precedence
 
-# Incus variables
-incus_remote         = "local"    # Name of the Incus remote to deploy on (see `incus remote list`)
-incus_storage_pool   = "default"  # Name of the storage pool to use for the VMs and volumes
-incus_network        = "incusbr0" # Name of the network to use for the VMs
+# Provider variables
+incus_host_name    = "local" # default: local
+incus_host_scheme  = "unix"  # default: unix
+incus_host_address = ""      # default: 
+incus_host_port    = 8443    # default: 8443
+incus_host_default = true    # default: true
 
-# OVN uplink configuration
-ovn_uplink_ipv4_address = "172.31.254.1/24"
-ovn_uplink_ipv6_address = "fd00:1e4d:637d:1234::1/64"
+# Incus Host variables
+incus_storage_pool   = "default"  # default: default
+incus_nic_method     = "network"  # default: network (can be: network / nictype)
+incus_network_name   = "incusbr0" # default: incusbr0 (only used with incus_nic_method = network)
+incus_nictype        = "bridged"  # default: bridged (only used with incus_nic_method = nictype)
+incus_nictype_parent = "incusbr0" # default: incusbr0 (only used with incus_nic_method = nictype)
+
+# baremetal-incus configs
+baremetal_net_ipv4_address = "172.31.254.1/24"
+baremetal_net_ipv6_address = "fd00:1e4d:637d:1234::1/64"
diff --git a/terraform/variables.tf b/terraform/variables.tf
index ee56bb8..b66631e 100755
--- a/terraform/variables.tf
+++ b/terraform/variables.tf
@@ -1,24 +1,61 @@
-variable "incus_remote" {
+variable "incus_host_name" {
   type    = string
   default = "local"
 }
+variable "incus_host_scheme" {
+  type    = string
+  default = "unix"
+}
+variable "incus_host_address" {
+  type    = string
+  default = ""
+}
+variable "incus_host_port" {
+  type    = number
+  default = 8443
+}
+variable "incus_host_default" {
+  type    = bool
+  default = true
+}
+
+
 
 variable "incus_storage_pool" {
   type    = string
   default = "default"
 }
-
-variable "incus_network" {
+variable "incus_nic_method" {
   type    = string
-  default = "incusbr0"
+  default = "network"
+  validation {
+    condition     = contains(["network", "nictype"], var.incus_nic_method)
+    error_message = "Valid value is one of the following: network, nictype."
+  }
+}
+variable "incus_network_name" {
+  description = "optional: only used with incus_nic_method set to network"
+  type        = string
+  default     = "incusbr0"
+}
+variable "incus_nictype" {
+  description = "optional: only used with incus_nic_method set to nictype"
+  type        = string
+  default     = "bridged"
+}
+variable "incus_nictype_parent" {
+  description = "optional: only used with incus_nic_method set to nictype"
+  type        = string
+  default     = "incusbr0"
 }
 
-variable "ovn_uplink_ipv4_address" {
+
+
+variable "baremetal_net_ipv4_address" {
   type    = string
   default = "172.31.254.1/24"
 }
-
-variable "ovn_uplink_ipv6_address" {
+variable "baremetal_net_ipv6_address" {
   type    = string
   default = "fd00:1e4d:637d:1234::1/64"
 }

Some notes:

  • The README changes can probably be discarded (mostly reverts of additional tweaks I've made).
  • The Terraform changes are either because of the nictype option or because of the more complex remote options. If using managed networks and existing remotes works properly, then those changes can also be discarded.
  • The Ceph playbook changes are the main bit we still need merging but that needs to be adapted to remove the hardcoded server01 and then be re-tested before merging.

@mttjohnson
Contributor Author

Thank you for all the feedback and changes to include the Terraform pieces here!

I hope to have some time within the next week to read through and address any specific comments more thoroughly, and start working on cleaning up the Ansible stuff to remove the hardcoded server01 use.
