dcnm_vrf: infinite loop in wait_for_vrf_del_ready() #351

Closed
allenrobel opened this issue Dec 3, 2024 · 0 comments · Fixed by #354
Assignees
Labels
bug Something isn't working pr_submitted PR Submitted

Comments

@allenrobel
Collaborator

allenrobel commented Dec 3, 2024

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Ansible Version and collection version

ansible [core 2.17.5]
  config file = /Users/arobel/.ansible.cfg
  configured module search path = ['/Users/arobel/repos/ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /Users/arobel/repos/ndfc-python/.venv/lib/python3.12/site-packages/ansible
  ansible collection location = /Users/arobel/repos/ansible/collections
  executable location = /Users/arobel/repos/ndfc-python/.venv/bin/ansible
  python version = 3.12.4 (main, Jun  6 2024, 18:26:44) [Clang 15.0.0 (clang-1500.3.9.4)] (/Users/arobel/repos/ndfc-python/.venv/bin/python)
  jinja version = 3.1.4
  libyaml = True

DCNM version

  • V 3.6.0

Affected module(s)

  • dcnm_vrf

Ansible Playbook

- name: Minimal reproduce for VRF deleted infinite loop
  hosts: ndfc
  vars:
    FABRIC_NAME: FABRIC_1
    FABRIC_BGP_AS: 65001
    NETWORK_NAME: NETWORK_1
    VRF_NAME: VRF_1
    LEAF_IP4: 10.1.1.2
    ATTACH_PORTS: ["Ethernet1/9"]
    SWITCH_USERNAME: admin
    SWITCH_PASSWORD: MySwitchPassword

  tasks:
    - name: Create fabric
      cisco.dcnm.dcnm_fabric:
        state: merged
        config:
        - FABRIC_NAME: "{{ FABRIC_NAME }}"
          FABRIC_TYPE: VXLAN_EVPN
          BGP_AS: "{{ FABRIC_BGP_AS }}"

    - name: Add leaf
      cisco.dcnm.dcnm_inventory:
        fabric: "{{ FABRIC_NAME }}"
        state: merged
        config:
        - seed_ip: "{{ LEAF_IP4 }}"
          auth_proto: MD5
          user_name: "{{ SWITCH_USERNAME }}"
          password: "{{ SWITCH_PASSWORD }}"
          max_hops: 0
          role: leaf
          preserve_config: False
      register: result

    - name: Wait for switch to fully reload
      pause:
        seconds: 180
      when: result.changed

    - name: deploy
      cisco.dcnm.dcnm_rest:
        method: POST
        path: "/appcenter/cisco/ndfc/api/v1/lan-fabric/rest/control/fabrics/{{ FABRIC_NAME }}/config-deploy?forceShowRun=false"
      when: result.changed

    - name: Merge VRF
      cisco.dcnm.dcnm_vrf:
        fabric: "{{ FABRIC_NAME }}"
        state: merged
        config:
        - vrf_name: "{{ VRF_NAME }}"
          vrf_id: 50101 
          adv_default_routes: false
          static_default_route: false
          vrf_template: Default_VRF_Universal
          vrf_extension_template: Default_VRF_Extension_Universal
          vlan_id: 201
          vrf_int_mtu: 9000
          attach:
          - ip_address: "{{ LEAF_IP4 }}"
            deploy: true
      register: result

    - name: deploy
      cisco.dcnm.dcnm_rest:
        method: POST
        path: "/appcenter/cisco/ndfc/api/v1/lan-fabric/rest/control/fabrics/{{ FABRIC_NAME }}/config-deploy?forceShowRun=false"
      when: result.changed

    - name: Merge Network
      cisco.dcnm.dcnm_network:
        fabric: '{{ FABRIC_NAME }}'
        state: merged
        config:
        - net_name: "{{ NETWORK_NAME }}"
          vrf_name: "{{ VRF_NAME }}"
          net_id: 30101
          net_template: Default_Network_Universal
          net_extension_template: Default_Network_Extension_Universal
          l3gw_on_border: true
          vlan_id: 101
          gw_ip_subnet: 172.16.14.1/24
          attach:
            - ip_address: "{{ LEAF_IP4 }}" 
              deploy: true
              ports: "{{ ATTACH_PORTS }}"
          deploy: true
          multicast_group_address: 239.1.1.1
      when: result.changed

    - name: deploy
      cisco.dcnm.dcnm_rest:
        method: POST
        path: "/appcenter/cisco/ndfc/api/v1/lan-fabric/rest/control/fabrics/{{ FABRIC_NAME }}/config-deploy?forceShowRun=false"

    - name: Wait for network merge to deploy
      pause:
        seconds: 60
      when: result.changed

    - name: Delete VRF
      cisco.dcnm.dcnm_vrf:
        fabric: '{{ FABRIC_NAME }}'
        state: deleted
        config:
        - vrf_name: "{{ VRF_NAME }}"

Debug Output

Because lanAttachState below is DEPLOYED, the following conditional inside the while loop is hit on every iteration. It sets state back to False, so the loop never exits.

            for vrf in self.diff_delete:
                state = False
                # ...
                while not state:
                    resp = dcnm_send(self.module, method, path)
                    state = True
                    if resp.get("DATA") is not None:
                        attach_list = resp["DATA"][0]["lanAttachList"]
                            # ...
                            if atch["lanAttachState"] != "NA":
                                self.diff_delete.update({vrf: "DEPLOYED"})
                                state = False
                                time.sleep(self.WAIT_TIME_FOR_DELETE_LOOP)
                                break
2024-12-02 14:46:59,570 - DEBUG - [dcnm.DcnmVrf.wait_for_vrf_del_ready.2784] DcnmVrf.wait_for_vrf_del_ready: attach_list: [
    {
        "vrfName": "VRF_1",
        "switchName": "cvd-2311-leaf",
        "lanAttachState": "DEPLOYED",
        "isLanAttached": true,
        "switchSerialNo": "FDO211218HB",
        "peerSerialNo": null,
        "switchRole": "leaf",
        "fabricName": "FABRIC_1",
        "ipAddress": "172.22.150.106",
        "instanceValues": "{\"loopbackIpV6Address\":\"\",\"loopbackId\":\"\",\"deviceSupportL3VniNoVlan\":\"false\",\"switchRouteTargetImportEvpn\":\"\",\"loopbackIpAddress\":\"\",\"switchRouteTargetExportEvpn\":\"\"}",
        "vlanId": 201,
        "vrfId": 50101,
        "entityName": "VRF_1"
    }
]

Expected Behavior

We should exit with an error if lanAttachState == DEPLOYED and isLanAttached == True

Actual Behavior

We loop forever.

Steps to Reproduce

Modify and run the attached playbook.

References

allenrobel added a commit that referenced this issue Dec 3, 2024
The fix entails a modification to wait_for_vrf_del_ready()

In the legitimate case (user trying to delete a VRF after having removed all network attachments) `lanAttachState` very briefly transitions to DEPLOYED before transitioning to its final state of NA.  However, in this case, `isLanAttached` (in the same data structure) is False, whereas in the illegitimate case (user hasn't removed network attachments) `isLanAttached` is True.  Hence, we can leverage `isLanAttached` to differentiate between the legitimate and illegitimate cases.

The fix adds another conditional that checks if `lanAttachState` == DEPLOYED AND `isLanAttached` == True.  If this is the case, then the user is trying to delete a VRF that still contains network attachments and we now fail immediately with an appropriate error message.
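
A minimal sketch of this check, not the module's actual code.  The `get_attach_list` callable, the `VrfStillAttachedError` exception (standing in for `fail_json()`), and the `max_polls` bound are all hypothetical names added for illustration:

```python
import time


class VrfStillAttachedError(Exception):
    """Hypothetical error standing in for the module's fail_json() call."""


def wait_for_vrf_del_ready(get_attach_list, vrf_name, wait_seconds=0.0, max_polls=20):
    """Return True once every attachment of vrf_name reports state "NA".

    get_attach_list is a hypothetical callable that stands in for the
    controller request and returns the VRF's lanAttachList.
    max_polls is an illustrative safety bound, not part of the described fix.
    """
    for _ in range(max_polls):
        attach_list = get_attach_list(vrf_name)
        for attach in attach_list:
            # Deployed and still attached: waiting will never help, so fail fast.
            if attach["lanAttachState"] == "DEPLOYED" and attach["isLanAttached"] is True:
                raise VrfStillAttachedError(
                    f"VRF {vrf_name} is still attached on {attach['switchName']}."
                )
        if all(attach["lanAttachState"] == "NA" for attach in attach_list):
            return True
        time.sleep(wait_seconds)
    return False
```

With the attach_list from the debug output above (`lanAttachState` DEPLOYED, `isLanAttached` true), this sketch raises on the first poll instead of looping.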

Other changes:

1. Add standard python logging

2. Use `ControllerVersion()` to retrieve the NDFC version and remove import for `dcnm_version_supported`

3. Use `FabricDetails()` to retrieve fabric type.

4. Modify `update_attach_params()` to improve readability by first populating the neighbor dictionary before appending it.  This way, we avoid a lot of unsightly accesses to element 0 of the list.  For example:

```python
                    if a_l["peer_vrf"]:
                        vrflite_con["VRF_LITE_CONN"][0]["PEER_VRF_NAME"] = a_l["peer_vrf"]
                    else:
                        vrflite_con["VRF_LITE_CONN"][0]["PEER_VRF_NAME"] = ""
```

Becomes:

```python
                    if a_l["peer_vrf"]:
                        nbr_dict["PEER_VRF_NAME"] = a_l["peer_vrf"]
                    else:
                        nbr_dict["PEER_VRF_NAME"] = ""
```

5. diff_for_attach_deploy() - Reduce indentation by reversing logic of conditional.

The following:

```python
                                    if wlite["IF_NAME"] == hlite["IF_NAME"]:
                                        # Lots of indented code ...
```

Becomes:

```python
                                    if wlite["IF_NAME"] != hlite["IF_NAME"]:
                                        continue
                                    # unindent the above code
```

6. get_have()

- Reduce indentation levels by reversing logic (similar to item 5 above)

7. Add method want_and_have_vrf_template_configs_differ(), see next item.

8. diff_for_create()

- Leverage want_and_have_vrf_template_configs_differ() to simplify.

9. Add method to_bool(), see next item

10. diff_for_attach_deploy()

- Simplify/shorten by leveraging to_bool()

11. In multiple places, ensure that a key exists before accessing it or deleting it.

12. Run through black

13. Several minor formatting changes for improved readability.
@allenrobel allenrobel self-assigned this Dec 4, 2024
@allenrobel allenrobel linked a pull request Dec 5, 2024 that will close this issue
@allenrobel allenrobel added pr_submitted PR Submitted bug Something isn't working labels Jan 8, 2025
mikewiebe added a commit that referenced this issue Jan 16, 2025
* Tentative fix for Issue #351


* dcnm_vrf: to_bool() fix to return correct value, or call fail_json()

The initial implementation would return True for e.g. "false", since bool() of any non-empty string is True.

1. Modify to explicitly compare against known boolean-like strings i.e. "false", "False", "true", and "True".

2. Add the caller to the error message for better debugging ability in the future.
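
A minimal sketch of the behavior described above.  The signature is an assumption, and the sketch raises `ValueError` where the module would call `fail_json()`, so that it stays self-contained:

```python
def to_bool(key, caller, value):
    """Convert a boolean-like value to bool, rejecting anything ambiguous.

    key and caller are only used to build the error message.
    """
    if isinstance(value, bool):
        return value
    # Compare explicitly; bool("false") would be True.
    if value in ("true", "True"):
        return True
    if value in ("false", "False"):
        return False
    raise ValueError(f"{caller}: {key}: {value!r} cannot be converted to boolean")
```

Anything outside the four known strings (and real booleans) is treated as an error rather than silently coerced.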

* dcnm_image_policy: fix for issue #347 (#348)

* Fix for issue 347

Manually tested this to verify.

Still need to update integration and unit tests.

* dcnm_image_policy: Update integration test

Update integration test for overridden state.

1. playbooks/roles/dcnm_image_policy/dcnm_tests.yaml

- Add vars
    - install_package_1
    - uninstall_package_1

2. test/integration/targets/dcnm_image_policy/tests/dcnm_image_policy_overridden.yaml

- Add packages.install and packages.uninstall configuration
- Verify that merged state adds these packages to the controller config
- Verify that overridden state removes packages.install and packages.uninstall
- Verify that overridden state metadata.action is "replace" instead of "update"

* dcnm_fabric: hardening (#349)

Two bits of vulnerable code found when porting to ndfc-python.

1. plugins/modules/dcnm_fabric.py

Accessing dictionary key directly can lead to a KeyError exception.

2. plugins/module_utils/fabric/replaced.py

If user omits the DEPLOY parameter from their playbook (ndfc-python) the DEPLOY key would be None, and not get popped from the payload.  This would cause NDFC to complain about an invalid key in the payload.  We need to unconditionally pop DEPLOY here, if it's present.  Hence, we've removed the value check (if DEPLOY is not None).
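
A small sketch of the unconditional pop.  The helper name `strip_deploy` is hypothetical; the point is that `dict.pop()` with a default removes the key whether its value is None or not, and does nothing if the key is absent:

```python
def strip_deploy(payload):
    """Return a copy of payload without the DEPLOY key, whatever its value."""
    cleaned = dict(payload)
    # pop() with a default never raises KeyError when the key is missing.
    cleaned.pop("DEPLOY", None)
    return cleaned
```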

* dcnm_vrf: remove bool() casts, more...

1. Removed all instances where values were cast to bool.  These potentially could result in bad results e.g. bool("false") returns True.

2. Renamed and fixed want_and_have_vrf_template_configs_differ().

Renamed to dict_values_differ()

Fix was to add a skip_keys parameter so that we can skip vrfVlanId in one of the elif()s

3. Added some debugging statements.
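
A rough sketch of what a `dict_values_differ()` with a `skip_keys` parameter could look like.  The body is an assumption, not the module's code; only the method name, `skip_keys`, and the `vrfVlanId` key come from the notes above:

```python
def dict_values_differ(dict1, dict2, skip_keys=None):
    """Return True if any key of dict1 (not in skip_keys) has a different value in dict2."""
    skip_keys = skip_keys or []
    for key, value in dict1.items():
        if key in skip_keys:
            continue
        # .get() avoids a KeyError when dict2 lacks the key.
        if value != dict2.get(key):
            return True
    return False
```

Passing `skip_keys=["vrfVlanId"]` lets a caller ignore a VLAN ID difference while still comparing every other value.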

* dcnm_vrf: More refactoring and simplifying

* Rename var for readability

* Rename var for readability

* dcnm_vrf: Avoid code duplication

1. find_dict_in_list_by_key_value() new method to generalize and consolidate duplicate code.

2. Remove a few cases of single-use vars.

3. Run through black

* Remove TODO comment

I opened an issue to track what this comment describes, so can remove the comment from the module.

#352

* dcnm_vrf: leverage get_vrf_lite_objects() everywhere

1. Replace several bits that can be replaced with a call to get_vrf_lite_objects().

2. Fix a few pylint f-string complaints.  There are many more of these, which we'll address in the next commit.  One of these required a change to an associated unit test.

* Appease pylint f-string complaints, more...

1. Appease pylint f-string complaints

2. optimize a couple conditionals

3. Change an "== True" to the preferred "is True"

4. Add a few TODO comments

* test_log_v2.py: Temporarily disable unit tests

Unit tests pass locally if the tests in the following file are disabled:

~/test/unit/module_utils/common/test_log_v2.py.

Temporarily disabling these to see if the same is seen when running the unit tests on Github.

If the same is seen, will debug why this is happening.

* Appease pylint

Fix bare-except and dangerous-default-value errors.

* Fix pep8 too-many-blank-lines

test_dcnm_vrf.py: Removed two (out of four) contiguous blank lines.

* Remove python 3.9 incompatible type hint

python 3.9 doesn't like:

def find_dict_in_list_by_key_value( ... ) -> dict | None:

Removed the type hint:

def find_dict_in_list_by_key_value( ... ):
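
For reference, the `X | None` annotation syntax is only evaluated successfully on Python 3.10 and later.  One alternative (not what this commit does, which simply drops the hint) is `typing.Optional`, which works on 3.9.  The parameter names and body below are hypothetical:

```python
from typing import Optional


def find_dict_in_list_by_key_value(search, key, value) -> Optional[dict]:
    """Return the first dict in search whose key maps to value, else None."""
    for item in search:
        if isinstance(item, dict) and item.get(key) == value:
            return item
    return None
```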

* Re-enable test_log_v2.py unit tests and "fix" UT errors

If we fail_json(), or even if we sys.exit() in main() logging setup, the unit tests fail.

The failure is a KeyError in logging.config.dictConfig when disabling logging in log_v2.py:

    def disable_logging(self):
        logger = logging.getLogger()
        for handler in logger.handlers.copy():
            try:
                logger.removeHandler(handler)
            except ValueError:  # if handler already removed
                pass
        logger.addHandler(logging.NullHandler())
        logger.propagate = False

Above, the KeyError happens here

logger.removeHandler(handler)

The value of handler when this happens is "standard"

I'm not sure why this happens ONLY when the log_v2.py unit tests are run prior to the dcnm_vrf.py unit tests (running these tests separately works).

For now, a "fix" is to pass in the except portion of the try/except block in dcnm_vrf.py main().

def main():
    try:
        log = Log()
        log.commit()
    except (TypeError, ValueError) as error:
        pass

Will investigate further, but the above works, and logging is enabled with no issue in normal use.

Am renaming __DISABLE_test_log_v2.py back to test_log_v2.py

* Appease linters

Remove unused import (sys, added to test fixes for the unit test failures).

Remove extra lines.

* Update another conditional

Modify another OR-conditional to use the preferred:

if A in (X, Y, Z):

* dcnm_vrf: dict_values_differ() use generic names

Use generic names for the two dicts.

* dcnm_vrf: Address mwiebe review comments

1. compare_properties: refactor comparison in diff_for_attach_deploy() using this new method.

2. diff_for_attach_deploy(): Leverage to_bool() to add dictionary access protection and remove try/except block.

* Address mwiebe comments part 2

1. Remove commented imports.

2. main(): Remove unused var (error)

* dcnm_vrf: Protect dictionary access

Fix KeyError hit during IT.

* dcnm_vrf: Refactor push_to_remote, validate_input

1. push_to_remote()

Refactor into

- push_diff_create_update()
- push_diff_detach()
- push_diff_undeploy()
- push_diff_delete()
- push_diff_create()
- push_diff_attach()
- push_diff_deploy()

2. validate_input()

There were only very small differences between the parameters in attach_spec, lite_spec, and vrf_spec for the different Ansible states.  Reduced code duplication by factoring the handling for these specs into the following methods and moving the Ansible-state conditional into them.

- attach_spec()
- lite_spec()
- vrf_spec()

* appease linters

* Appease linters

* More refactoring

1. update_attach_params()

Simplified by:

1. Initialize nbr_dict values to ""
2. Populate the nbr_dict values from the current item from attach["vrf_lite"]
3. Test if any values in nbr_dict are != "" after step 2
4. If no values have been updated, continue
5. De-indent the remaining code
6. (also renamed vlanId to vlan_id)

This change also required that we add "IF_NAME" to self.vrf_lite_properties.  I verified that this change will not impact the other use for this structure in diff_for_attach_deploy().
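
The steps above can be sketched roughly as follows.  The mapping from playbook keys to controller keys is hypothetical (only `peer_vrf` to `PEER_VRF_NAME` appears in the examples earlier in this message), as is the function name:

```python
# Hypothetical mapping from playbook keys to controller keys.
PLAYBOOK_TO_CONTROLLER = {
    "interface": "IF_NAME",
    "peer_vrf": "PEER_VRF_NAME",
}


def build_neighbor_dicts(vrf_lite_items):
    """Build one neighbor dict per vrf_lite item that sets at least one value."""
    neighbors = []
    for item in vrf_lite_items:
        # Step 1: start with every value set to "".
        nbr_dict = {key: "" for key in PLAYBOOK_TO_CONTROLLER.values()}
        # Step 2: copy over whatever the playbook item provides.
        for playbook_key, controller_key in PLAYBOOK_TO_CONTROLLER.items():
            if item.get(playbook_key):
                nbr_dict[controller_key] = item[playbook_key]
        # Steps 3 and 4: skip items that left every value empty.
        if not any(value != "" for value in nbr_dict.values()):
            continue
        # Step 5: the remaining code runs without extra nesting.
        neighbors.append(nbr_dict)
    return neighbors
```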

2. diff_for_create()

After the refactor of this method in the last commit, it became obvious that code in the if and else clauses were heavily duplicated.

Refactored to remove the if/else clause entirely since the only difference between these was whether we skip key "vrfVlanId" when vlan_id_want == 0.

This reduces to a simple if statement to populate skip_keys if vlan_id_want == 0.

* Fix typo

Mike's eagle-eyes caught this during review.

* dcnm_vrf: fix IT files, minor cleanup

1. Worked with Mike to fix dcnm.yaml and main.yaml in tests/integration/targets/dcnm_vrf/tasks.

2. Updated query.yaml so I could debug things.  Query IT is now working.

3. dcnm_vrf.py - Added a lot of debug statements and cleaned up a few minor things.

* Fix UT failures

1. dcnm_vrf.py: Fix misspelled var
2. test_dcnm_vrf.py: Update assert to match error message.

* Appease pylint

* Cleanup IT deleted, query, merged

NOTE: All three of these tests are working with the current set of changes up to this point.

1. Add REQUIRED VARS section

2. Update task titles for consistency

3. Add a wait_for task to the SETUP section in case any VRFs exist with vrf-lite extensions prior to running the tests.  This wait_for will be skipped if the preceding "Delete all VRFs" task result.changed value is false.

4. Standardize var names

- switch_1 - border switch
- switch_2 - border switch
- switch_3 - non-border switch
- interface_1

* Update playbooks/roles

1. Add example dynamic inventory

playbooks/roles/dynamic_inventory.py

2. Update dcnm_vrf/dcnm_tests.yaml with notes for dynamic_inventory.py

3. Add dcnm_vrf/dcnm_hosts.py

* Move dynamic_inventory.py into playbooks/files

Try to avoid sanity error (unexpected shebang) by moving dynamic_inventory.py out of playbooks/roles.

* Appease ansible sanity

According to the following link, '#!/usr/bin/env python' should be OK.

https://docs.ansible.com/ansible/latest/dev_guide/testing/sanity/shebang.html

Let's try...

* Appease linters

Fix pep8 E265: block comment should start with '# '

* dcnm_vrf: Updates to integration tests

1. Standardize integration test var names

fabric_1
switch_1
switch_2
switch_3
interface_1
interface_2
interface_3

2. All tests

- SETUP.  Add task to print all vars
- Standardize task titles to include ansible state

3. overridden.yaml

- Add a workaround for issue seen with ND 3.2.1e

In step TEST.6, NDFC issues an NX-OS CLI that immediately switches
from configure profile mode, to configure terminal; vrf context <vrf>.
This command results in FAILURE (switch accounting log).  Adding a
wait_for will not help since this all happens within step TEST.6.
A config-deploy resolves the OUT-OF-SYNC VRF status.

- Add REQUIRED VARS section

4. query.yaml

- Update REQUIRED VARS section

5. merged.yaml

- Add missing wait_for after VRF deletion
- Update REQUIRED VARS section
- Renumber tests to group sub-tests

6. deleted.yaml

- Update REQUIRED VARS section

7. dynamic_inventory.py

- Add conditional blocks to set vars based on role

* Appease linters

* Fix unprotected dictionary access

Found in IT (replaced.yaml)

* Update integration tests

1. All tests

- Added wait_for to the CLEANUP section for all tests, in case we run them immediately after each other.
- rename "result" to include the test number i.e. there is no longer a global result that is overwritten in each test.  Rather, each test has its own result.  This is a bit more maintenance perhaps, but it reduces the probability that one test asserts on an earlier test's result.

2. replaced.yaml

- Added REQUIRED VARS section
- Use standardized var names
- SETUP: print vars
- Add standardized task names to all tasks
- Renumbered tests to group them.
- Print all results just prior to their associated asserts

3. query.yaml

- Update some task titles and fix some test numbers

4. merged.yaml

- Print all results just prior to their associated asserts
- Fix a few test numbering issues

5. deleted.yaml

- Use standardized task titles

* dcnm_vrf/dcnm_tests.yaml - Include all vars

1. Include all vars used in the dcnm_vrf integration tests.

2. Update the path to dynamic_inventory.py

* Update Usage section and assign additional fabric_* vars

Address mwiebe comments by including more detailed usage and examples.

Add fabric_2, fabric_3 in case any tests require more than one fabric.

* IT: interface var naming change

1. The current interface var names did not incorporate a way to encode switch ownership.  Modified the var naming to allow for specifying multiple interfaces per switch in such a way that the switch ownership of an interface is evident.

This is documented in:

playbooks/files/dynamic_inventory.py

2. Modified all dcnm_vrf test cases to align with this convention.

- Updated test case header comments with the new usage
- Updated all test case interface vars
- Ran the following tests
  - deleted.yaml
  - overridden.yaml
  - replaced.yaml
  - query.yaml
  - sanity.yaml

3. dynamic_inventory.py

In addition to the changes above:

- Fixed the documentation for environment variable ND_ROLE (previously it was misnamed NDFC_ROLE in the documentation, but was correct -- ND_ROLE -- in the actual usage).

- Fix Markdown heading levels

* IT: Update scale.yaml

1. Use standardized task titles
2. Print results prior to each assert

* dcnm_vrf: IT dynamic_inventory.py small modifications

1.  dcnm_vrf: use switch_1, switch_2, switch_3 directly
2. Add scale role to the 'if nd_role' conditional

* dcnm_vrf: fix for #356, and for an undeploy case, simplify, more...

1. Fix case where previous commit in this PR broke undeploy.

2. Fix for issue #356

3. Update unit tests to align with changes in this commit

4. Some simplifications, including

- Add a method send_to_controller() to aggregate POST, PUT, DELETE verb handling.  This method calls dcnm_send() and then calls the response handler, etc.  This removes duplicated code throughout the module.

- Refactor vrf_lite handling out of update_attach_params() and into new method update_attach_params_extension_values()

- Never noticed this, but it appears we don't have to use inspect() with the new logging system, except in cases where fail_json() is called.  Removed inspect() from all methods that do not call fail_json()

- New method is_border_switch() to remove this code from push_diff_attach() and for future consolidation into a shared library.

- Move dcnm_vrf_paths dictionary out of the class.  These endpoints will later be moved to common/api/ep/.

- In __init__(), add self.sn_ip, built from self.ip_sn.  There were several cases where the module wanted a serial_number given an ip_address.  Added two methods that leverage self.sn_ip and self.ip_sn:

- self.serial_number_to_ip()
- self.ip_to_serial_number()

Replaced all instances where duplicated code was performing these functions.
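
A minimal sketch of the two lookups, assuming `ip_sn` is a dict keyed by IP address with serial numbers as values.  The `SwitchLookup` class is hypothetical; in the module these are attributes and methods of the existing class:

```python
class SwitchLookup:
    """Two-way lookup between switch IP addresses and serial numbers."""

    def __init__(self, ip_sn):
        self.ip_sn = dict(ip_sn)
        # Invert once so neither lookup needs to scan the mapping.
        self.sn_ip = {sn: ip for ip, sn in self.ip_sn.items()}

    def ip_to_serial_number(self, ip_address):
        """Return the serial number for ip_address, or None if unknown."""
        return self.ip_sn.get(ip_address)

    def serial_number_to_ip(self, serial_number):
        """Return the IP address for serial_number, or None if unknown."""
        return self.sn_ip.get(serial_number)
```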

* dcnm_vrf: Fix for issue #357

1. Potential fix for issue #357

If any interface in the playbook task's vrf_lite configuration does not match an interface on the switch that had extensionValues, call fail_json().

- Refactor vrf_lite processing out of push_diff_attach() and into:

- update_vrf_attach_vrf_lite_extensions()

- In update_vrf_attach_vrf_lite_extensions() verify that all interfaces in the playbook's vrf_lite section match an interface on the switch that has extensionValues.  If this check fails, call fail_json()

2. Rename get_extension_values_from_lite_object() to get_extension_values_from_lite_objects() and be explicit that the method takes a list of objects and returns a list of objects, or an empty list.

3. Add some debug statements

4. Rename vrf to vrf_name in push_to_remote()
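
The interface check in item 1 amounts to a membership test.  A rough sketch, with a hypothetical helper name and assuming `switch_interfaces` holds the names of the switch interfaces that have extensionValues:

```python
def unmatched_vrf_lite_interfaces(playbook_interfaces, switch_interfaces):
    """Return playbook interfaces that match no extension-capable switch interface."""
    known = set(switch_interfaces)
    return [name for name in playbook_interfaces if name not in known]
```

A non-empty return value is the condition under which the module would call fail_json().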

* dcnm_vrf IT: Update task titles, print results

1. Update task titles to group tests.

2. Print results before each assert stanza.

3. Increase pause after VRF deletion from 40 to 60 seconds.

* IT: Update comment for workaround.

Update the comment for test 3b to indicate that the workaround is needed only when Overlay Mode is set to "config-profile" (which is the default for new fabrics).  The issue does not happen when Overlay Mode is set to "cli".

* Uncomment dcnm_get_ip_addr_info, more...

1. Uncommenting a call to dcnm_get_ip_addr_info() after realizing it also converts serial numbers to ip addresses.

2. Added a method to break up long lists into a list of lists comprising smaller lists.  This is called in release_resources_by_id() to limit the size of the list of IDs we send to the controller to 512.  The actual size NDFC can process is somewhere between 512 and 630, but I don't know exactly what the limit is, so leaving at 512.

I checked later and, since we are processing the release of IDs per-vrf, we are not sending anywhere near a 512 item list, but get_list_of_lists() will be a noop if the length is under (in this case) 512, so no harm adding this.  And, depending on the number of switches in a fabric, this could actually be larger than 512 in some environments.
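
A minimal sketch of such a chunking helper.  The signature is an assumption; only the name get_list_of_lists() and the 512 limit come from the notes above:

```python
def get_list_of_lists(lst, size):
    """Split lst into consecutive chunks of at most size items."""
    return [lst[i : i + size] for i in range(0, len(lst), size)]
```

A list shorter than `size` comes back as a single chunk, so callers can loop over the result unconditionally.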

* Safe get in dict_values_differ() method

* Change conf_changed to module scope

Due to refactoring, conf_changed was set in diff_merge_create() and then cleared before being accessed in diff_merge_attach().  These two methods used to be part of a larger method before the refactoring, so the value of conf_changed was accessible by diff_merge_attach().

This commit does the following to rectify this.

1. Change the scope of conf_changed to class scope by renaming to self.conf_changed and initializing self.conf_changed in __init__().

2. In diff_merge_attach(), remove the line where conf_changed was initialized.

3. Rename an unrelated var (named conf_changed, but is a boolean) to configuration_changed to avoid any future confusion.

4. In diff_merge_attach() (re)initialize self.conf_changed to {}.

All Integration tests have been run with these changes and pass.

* UT: Update unit tests to accommodate issue #357

Some test cases were previously (incorrectly) passing, but started failing after the commit for issue #357.  This commit updates these test cases to (correctly) pass and adds corresponding test cases which (correctly) fail.

1. Updated test cases that previously passed incorrectly to now pass correctly.  These test cases previously passed despite using an interface that did not contain extensionValues.  Modified these test cases to use an interface WITH extensionValues.

2. Added test cases, corresponding to the above test cases, which fail due to using an interface without extensionValues.  These test cases are modified to expect fail_json() to be called.

3. Modified ALL testcases to call self.test_data.get() to retrieve their playbook.  Previously, global vars were used for their playbook.  This has a couple of advantages.  a. When a testcase (or set of testcases) is run, only the playbook fixtures that are needed are retrieved.  Previously, ALL playbook fixtures were retrieved even if only one test case was run.  b. The dict() definition is now simpler and more consistent between testcases, since the config key in the dict() will always be playbook i.e. dict(config=playbook), where previously the config key contained different vars for every testcase.

4. Fixed a reference to a non-existent fixture in delete_std_lite.

This test case was trying to access self.mock_vrf_attach_get_ext_object_dcnm_att4_only, which does not exist.  Modified it to use self.mock_vrf_attach_get_ext_object_dcnm_att2_only.

5. Ran black, isort linters.

* dcnm_vrf: diff_for_create() consistent return statements

1. The first return statement was inconsistent with the second return statement.  Fixed by adding the boolean configuration_changed to the first return statement.

2. All the other changes are due to running the black and isort linters.

* dcnm_vrf: fix for improper update of lanAttachList

In push_diff_attach(), only the last update to lan_attach_list was being appended to diff_attach_list because the update to diff_attach_list was happening outside the 'for diff_attach' loop.

The fix was to indent the append for new_diff_attach_list to be under the 'for diff_attach' loop.
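
The effect of the misplaced append can be reproduced in isolation.  The function and key names below are hypothetical and only illustrate the indentation difference:

```python
def collect_attachments_buggy(diff_attach):
    """Append sits outside the loop, so only the last item is kept."""
    collected = []
    for vrf_attach in diff_attach:
        updated = {**vrf_attach, "processed": True}
    collected.append(updated)  # runs once, after the loop has finished
    return collected


def collect_attachments_fixed(diff_attach):
    """Append sits inside the loop, so every item is kept."""
    collected = []
    for vrf_attach in diff_attach:
        updated = {**vrf_attach, "processed": True}
        collected.append(updated)
    return collected
```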

---------

Co-authored-by: mikewiebe <[email protected]>