Skip to content

fix: delete PIM rp-items before multicast group replace#232

Open
weneghawi wants to merge 1 commit intomainfrom
fix/pim-multicast-groups
Open

fix: delete PIM rp-items before multicast group replace#232
weneghawi wants to merge 1 commit intomainfrom
fix/pim-multicast-groups

Conversation

@weneghawi
Copy link
Contributor

Fix

In EnsurePIM, before calling Update (gNMI Replace), check if the current StaticRPItems config differs from the desired config. If it does, delete the existing items first in a separate transaction — then the Replace only adds new entries.

File: internal/provider/cisco/nxos/provider.go

Live Device Testing

Setup

kubernikusctl auth init --username I521907 --user-domain-name ora \
  --project-id 2cb8a03edaa444a19b56a04dec88d9db \
  --auth-url https://identity-3.eu-de-1.cloud.sap/v3 \
  --url https://kubernikus.eu-de-1.cloud.sap --name clabernetes

kubectl port-forward -n lab--i521907v1-c058dde2aecf49f1821054ef877eabcf \
  pod/switch-54749479d8-9r6l9 9339:60009

Reproduce the bug (old behavior)

Enable PIM feature:

gnmic -a localhost:9339 --skip-verify -u admin -p admin set \
  --replace-path "System/fm-items/pim-items" \
  --replace-value '{"adminSt":"enabled"}'

Push initial RP with group 225.0.0.0/8:

gnmic -a localhost:9339 --skip-verify -u admin -p admin set \
  --replace-path "System/pim-items/inst-items/dom-items/Dom-list[name=default]/staticrp-items/rp-items" \
  --replace-value '{"StaticRP-list":[{"addr":"10.0.0.1/32","rpgrplist-items":{"RPGrpList-list":[{"grpListName":"225.0.0.0/8","bidir":false,"override":false}]}}]}'

Change group to 226.0.0.0/8 — fails with the bug:

gnmic -a localhost:9339 --skip-verify -u admin -p admin set \
  --replace-path "System/pim-items/inst-items/dom-items/Dom-list[name=default]/staticrp-items/rp-items" \
  --replace-value '{"StaticRP-list":[{"addr":"10.0.0.1/32","rpgrplist-items":{"RPGrpList-list":[{"grpListName":"226.0.0.0/8","bidir":false,"override":false}]}}]}'
# ERROR: child (Rn) cannot be added to deleted objectRn=rpgrplist-[226.0.0.0/8], Commit Failed

Verify the fix (delete first, then replace)

gnmic -a localhost:9339 --skip-verify -u admin -p admin set \
  --delete "System/pim-items/inst-items/dom-items/Dom-list[name=default]/staticrp-items/rp-items"

gnmic -a localhost:9339 --skip-verify -u admin -p admin set \
  --replace-path "System/pim-items/inst-items/dom-items/Dom-list[name=default]/staticrp-items/rp-items" \
  --replace-value '{"StaticRP-list":[{"addr":"10.0.0.1/32","rpgrplist-items":{"RPGrpList-list":[{"grpListName":"226.0.0.0/8","bidir":false,"override":false}]}}]}'
# OK

Results

Step Result
Push initial RP with 225.0.0.0/8
Replace directly with 226.0.0.0/8 (old behavior) rpgrplist error
Delete rp-items + replace with 226.0.0.0/8 (fix)
Verify 226.0.0.0/8 on device
Replay same config (idempotency) ✅ No unnecessary delete

@weneghawi weneghawi requested a review from a team as a code owner March 13, 2026 19:08
@github-actions
Copy link

Merging this branch will decrease overall coverage

Impacted Packages Coverage Δ 🤖
github.com/ironcore-dev/network-operator/internal/provider/cisco/nxos 10.65% (-0.02%) 👎

Coverage by file

Changed files (no unit tests)

Changed File Coverage Δ Total Covered Missed 🤖
github.com/ironcore-dev/network-operator/internal/provider/cisco/nxos/provider.go 0.06% (-0.00%) 1547 (+4) 1 1546 (+4) 👎

Please note that the "Total", "Covered", and "Missed" counts above refer to code statements instead of lines of code. The value in brackets refers to the test coverage of that file in the old version of the code.

@hardikdr hardikdr added the area/metal-automation Automation processes within the Metal project. label Mar 14, 2026
@hardikdr hardikdr added this to Roadmap Mar 14, 2026
Copy link
Contributor

@felix-kaestner felix-kaestner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deleting the whole System/pim-items/inst-items/dom-items/Dom-list[name=default]/staticrp-items/rp-items list container in YANG is way too disruptive!

Assume the case where a user adds an additional rendezvous point, your reflect.DeepEqual would fail and so you would delete the whole list container in one request and recreate it in another request, where each request is a single transaction, so this change is not atomic.

I suggest the following implementation:

  1. Fetch all rendezvous points from the API
  2. Loop through all the rendezvous points and check whether they are still in the desired list of rendezvous points
    a. If !reflect.DeepEqual(got, want), add as Update to the request.
    b. If not desired (any longer), add as Delete to the request.

With this implementation, we would only create an additional rendezvous point in the scenario described above.

This makes use of the fact that the *StaticRP list items, also implement the gnmiext.Configurable interface and as such can be configured with the same client call.

Here is a sample implementation for this, please verify if it works as intended.

diff --git a/internal/provider/cisco/nxos/provider.go b/internal/provider/cisco/nxos/provider.go
index 34243ce7..f6f386aa 100644
--- a/internal/provider/cisco/nxos/provider.go
+++ b/internal/provider/cisco/nxos/provider.go
@@ -1662,18 +1662,17 @@ func (p *Provider) EnsurePIM(ctx context.Context, req *provider.EnsurePIMRequest
 	del := make([]gnmiext.Configurable, 0, 3)
 
 	if len(rpItems.StaticRPList) > 0 {
-		// NX-OS fails with "child (Rn) cannot be added to deleted object" when a
-		// gNMI Replace tries to update nested rpgrplist-items within a StaticRP
-		// in the same transaction (e.g. when multicastGroups change). Work around
-		// this by deleting the existing StaticRPItems first when the configuration
-		// has changed, so the Replace only needs to add entries.
-		currentRPItems := new(StaticRPItems)
-		if err := p.client.GetConfig(ctx, currentRPItems); err == nil && !reflect.DeepEqual(rpItems, currentRPItems) {
-			if err := p.client.Delete(ctx, new(StaticRPItems)); err != nil {
-				return err
+		current := new(StaticRPItems)
+		if err := p.client.GetConfig(ctx, current); err != nil && !errors.Is(err, gnmiext.ErrNil) {
+			return err
+		}
+		for _, rp := range rpItems.StaticRPList {
+			if got, ok := current.StaticRPList.Get(rp.Key()); !ok || !reflect.DeepEqual(got, rp) {
+				conf = append(conf, rp)
+			} else {
+				del = append(del, rp)
 			}
 		}
-		conf = append(conf, rpItems)
 	} else {
 		del = append(del, rpItems)
 	}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/metal-automation Automation processes within the Metal project.

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

3 participants