@olliewalsh

Adds support for deploying DCN with local storage (which is essentially
multi-stack plus spine & leaf networking) to the adoption_osp_deploy role.

@openshift-ci

openshift-ci bot commented Dec 17, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci

openshift-ci bot commented Dec 17, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign michburk for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@olliewalsh olliewalsh requested a review from fultonj December 17, 2025 15:27
@evallesp evallesp self-requested a review December 17, 2025 16:14
@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/164182cfd2c244d8a1cf9d4c183185cc

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 50m 59s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 31m 20s
❌ cifmw-crc-podified-edpm-baremetal FAILURE in 47m 51s
✔️ cifmw-crc-podified-edpm-baremetal-minor-update SUCCESS in 2h 32m 50s
✔️ cifmw-pod-zuul-files SUCCESS in 4m 42s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 8m 44s
✔️ cifmw-pod-pre-commit SUCCESS in 8m 16s
✔️ build-push-container-cifmw-client SUCCESS in 26m 16s
✔️ cifmw-molecule-adoption_osp_deploy SUCCESS in 3m 43s

# DCN network routes extracted from TripleO scenario
{% set _dcn1_stack = cifmw_adoption_osp_deploy_scenario.stacks | selectattr('stackname', 'equalto', 'dcn1') | first | default({}) %}
{% if _dcn1_stack.network_routes is defined %}
edpm_dcn1_routes:

@evallesp evallesp Dec 18, 2025

[related to first commit]
(blocking) suggestion: I see here a good candidate for a refactor:

{% for site in ['dcn1', 'dcn2'] %}
{% set _stack = cifmw_adoption_osp_deploy_scenario.stacks | selectattr('stackname', 'equalto', site) | first | default({}) %}
{% if _stack.network_routes is defined %}
edpm_{{ site }}_routes:
  {% for subnet_name, routes in _stack.network_routes.items() %}
  {{ subnet_name }}:
    {% for route in routes %}
    - destination: {{ route.ip_netmask }}
      nexthop: {{ route.next_hop }}
    {% endfor %}
  {% endfor %}
{% endif %}
{% endfor %}
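
For readers less familiar with Jinja filters, the loop suggested above can be sketched in plain Python. This is only an illustration of the data flow, not code from the PR: the function name `extract_site_routes` and the sample `stacks` list are hypothetical, and the sample route values are borrowed from the example in the commit message further down.

```python
from typing import Any


def extract_site_routes(stacks: list[dict[str, Any]], sites: list[str]) -> dict[str, Any]:
    """Build edpm_<site>_routes mappings from each stack's network_routes."""
    result: dict[str, Any] = {}
    for site in sites:
        # Rough equivalent of:
        #   selectattr('stackname', 'equalto', site) | first | default({})
        stack = next((s for s in stacks if s.get("stackname") == site), {})
        network_routes = stack.get("network_routes")
        if network_routes is None:
            # Mirrors "{% if _stack.network_routes is defined %}": emit nothing.
            continue
        result[f"edpm_{site}_routes"] = {
            subnet: [
                {"destination": route["ip_netmask"], "nexthop": route["next_hop"]}
                for route in routes
            ]
            for subnet, routes in network_routes.items()
        }
    return result


# Hypothetical sample data shaped like the scenario's "stacks" list.
stacks = [
    {"stackname": "central"},
    {
        "stackname": "dcn1",
        "network_routes": {
            "internalapidcn1": [
                {"ip_netmask": "172.17.0.0/24", "next_hop": "172.17.10.1"}
            ]
        },
    },
]

routes = extract_site_routes(stacks, ["dcn1", "dcn2"])
```

With this input only `edpm_dcn1_routes` is produced, because no `dcn2` stack is defined. Adding another site would only require extending the list of site names, which is the point of the refactor.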

@fultonj fultonj Jan 8, 2026

I think that's a reasonable suggestion. I'm testing it in my own branch here:

https://github.com/fultonj/ci-framework/tree/dcn_adoption_pr_evallesp_refactor

@evallesp

Where's dcn_nostorage file?

@evallesp evallesp left a comment

LGTM in general (second commit). I've found something that might trigger an error once we start to have more than one stack.
If not, this comment can be skipped and we can eventually fix it in a follow-up PR.

addresses:
- ip_netmask: {{ net[ip_version|default('ip_v4')] }}/{{ net[prefix_length_version|default('prefix_length_v4')] }}
{% if _stack.network_routes is defined and network_name in _stack.network_routes %}
routes:

@evallesp evallesp Dec 18, 2025

(non-blocking) concern: As this is not touched by this PR, I'm marking this comment as non-blocking.

I was checking the output of the routes, and in this PR it seems correct.
But above, at L17, the "routes:" key is inside the "{%- for route in _stack.routes %}" loop, which might end up producing two different routes sections, like:

  routes:
    - ip_netmask: first
  routes:
    - ip_netmask: second

I see here that "routes:" is properly set outside the for loop.
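
To make the concern concrete, here is a small Python sketch that imitates both template shapes with plain string building. It is not the template from the PR; the two function names are made up for illustration, and the indentation is simplified.

```python
def render_routes_key_inside_loop(routes: list[str]) -> str:
    """Imitates a template where 'routes:' sits inside the for loop."""
    lines = []
    for route in routes:
        lines.append("  routes:")  # emitted once per route: duplicate keys
        lines.append(f"    - ip_netmask: {route}")
    return "\n".join(lines)


def render_routes_key_outside_loop(routes: list[str]) -> str:
    """Imitates a template where 'routes:' is emitted once, before the loop."""
    lines = ["  routes:"]
    for route in routes:
        lines.append(f"    - ip_netmask: {route}")
    return "\n".join(lines)


broken = render_routes_key_inside_loop(["first", "second"])
fixed = render_routes_key_outside_loop(["first", "second"])

print(broken)
print("---")
print(fixed)
```

With a single route both variants render the same text, which is probably why the problem would only surface once a stack carries more than one route. How a duplicated key is handled depends on the YAML consumer: some reject it, others silently keep only one of the values.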

@fultonj fultonj Jan 8, 2026

I think that's a reasonable suggestion. I'm testing it in my own branch here:

https://github.com/fultonj/ci-framework/tree/dcn_adoption_pr_evallesp_refactor

- osp_trunk
# Let's remove the default computes, since we want to adopt the
# OSP ones
compute:

(non-blocking) question: is this something required to be there to avoid a runtime error? Should we create a task for solving this?

This is required to avoid a conflict. Setting the number of computes to 0 like this:

      # Let's remove the default computes, since we want to adopt the
      # OSP ones
      compute:
        amount: 0

is necessary because:

  1. The base scenario would otherwise create new compute VMs
  2. For adoption, we're reusing the existing OSP compute nodes (osp-compute) instead of creating new ones
  3. Without amount: 0, we'd have both the default computes AND the osp-compute nodes, which would cause conflicts

This isn't a bug or workaround that needs a task - it's the correct pattern for adoption scenarios. The libvirt_manager_patch_layout is intentionally overriding the default compute configuration to disable it while defining the OSP-specific compute nodes separately.

The comment in the code explains the intent clearly. No additional task needed.
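
As a rough illustration of the override described above, the sketch below applies a patch to a base layout with a recursive dictionary merge. This assumes merge semantics that the thread does not spell out; the `patch_layout` helper, the `vms` and `memory` keys, and all sample values are hypothetical and are not taken from the libvirt_manager role.

```python
import copy
from typing import Any


def patch_layout(base: dict[str, Any], patch: dict[str, Any]) -> dict[str, Any]:
    """Recursively merge patch into a copy of base; patch values win."""
    merged = copy.deepcopy(base)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = patch_layout(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical base scenario that would create default compute VMs.
base_layout = {"vms": {"compute": {"amount": 3, "memory": 8}}}

# Hypothetical adoption patch: disable default computes, add the OSP ones.
adoption_patch = {"vms": {"compute": {"amount": 0}, "osp-compute": {"amount": 2}}}

layout = patch_layout(base_layout, adoption_patch)
print(layout)
```

Under these assumptions the default `compute` entry survives with its other settings intact but with `amount: 0`, so no default compute VMs would be created next to the `osp-compute` nodes.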

@github-actions

github-actions bot commented Jan 3, 2026

This PR is stale because it has been open for over 15 days with no activity.
Remove stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the Stale label Jan 3, 2026
@fultonj fultonj marked this pull request as ready for review January 8, 2026 12:49
@fultonj fultonj removed the Stale label Jan 8, 2026
@fultonj

fultonj commented Jan 8, 2026

@olliewalsh would you please update the PR commits so that they look like this?

[fultonj@stybba ci-framework{dcn_adoption_pr}]$ git log --oneline | head -3
54c0bdda [adoption_osp_deploy] Extract DCN network routes from TripleO scenario into adoption vars
5b51be13 [adoption_osp_deploy] Render network-specific routes in os-net-config for DCN deployments
f72bad3d [adoption_osp_deploy] Add DCN adoption support
[fultonj@stybba ci-framework{dcn_adoption_pr}]$ 

Unfortunately, the role prefix is a requirement, and the missing prefix is causing a CI failure.

Run ./scripts/check-role-prefix.sh commit-message-file
  ./scripts/check-role-prefix.sh commit-message-file
  shell: /usr/bin/bash -e {0}
Checking latest commit message:
Extract DCN network routes from TripleO scenario into adoption vars

https://github.com/openstack-k8s-operators/ci-framework/actions/runs/20307890310/job/58330346588?pr=3570

Here's a quick way to do it.

git rebase -i 052ccead

In the editor that opens, change pick to reword (or just r) for these 3 commits:

reword f72bad3d Add DCN adoption support
reword 5b51be13 Render network-specific routes in os-net-config for DCN deployments
reword 54c0bdda Extract DCN network routes from TripleO scenario into adoption vars

Git will then prompt you for each commit message, and you can paste the [adoption_osp_deploy] prefix at the start of each one.
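
To check the reworded subjects locally before pushing, something like the following Python sketch could help. It is not the logic of ./scripts/check-role-prefix.sh, which this thread does not show; the `has_role_prefix` and `add_role_prefix` helpers and the accepted prefix format are assumptions made for illustration.

```python
import re


def has_role_prefix(subject: str, roles: set[str]) -> bool:
    """Return True if the subject starts with '[<role>] ' for a known role."""
    match = re.match(r"^\[([A-Za-z0-9_]+)\]\s+\S", subject)
    return match is not None and match.group(1) in roles


def add_role_prefix(subject: str, role: str) -> str:
    """Prepend '[<role>] ' unless the subject already carries it."""
    prefix = f"[{role}]"
    return subject if subject.startswith(prefix) else f"{prefix} {subject}"


roles = {"adoption_osp_deploy"}
subjects = [
    "Add DCN adoption support",
    "Render network-specific routes in os-net-config for DCN deployments",
    "Extract DCN network routes from TripleO scenario into adoption vars",
]

fixed_subjects = [add_role_prefix(s, "adoption_osp_deploy") for s in subjects]
for subject in fixed_subjects:
    print(has_role_prefix(subject, roles), subject)
```

The three subjects are the ones listed in the rebase todo above; after prefixing, each passes this hypothetical check.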

@fultonj

fultonj commented Jan 8, 2026

Where's dcn_nostorage file?

This PR has scenarios/adoption/dcn_nostorage.yml and it is used with the new dcn_nostorage architecture patch here:

openstack-k8s-operators/architecture#670

@fultonj

fultonj commented Jan 9, 2026

@olliewalsh I tested this patch using the refactoring that @evallesp suggested and it still worked.

Feel free to git apply evallesp_refactor.patch to pick up his suggestions.

evallesp_refactor.patch

olliewalsh and others added 4 commits January 9, 2026 21:20
Adds support for deploying DCN with local storage (which is essentially
multi-stack plus spine & leaf networking) to the adoption_osp_deploy role.

Signed-off-by: Oliver Walsh <owalsh@redhat.com>
… for DCN deployments

Modify os_net_config_overcloud.yml.j2 template to render per-network
routes from stack configuration instead of hardcoded empty routes.

Problem:
- Template had hardcoded "routes: []" for all VLAN networks (line 40)
- DCN compute nodes need routes to reach central site services
- Routes defined in data-plane-adoption's network_data.yaml.j2 were
  being ignored because ci-framework pre-generates os-net-config files
  before TripleO Heat deployment runs

Solution:
- Check if _stack.network_routes is defined and contains routes for
  the current network
- If routes exist, render them with ip_netmask and next_hop
- Otherwise, fall back to empty routes array

This enables DCN scenarios to configure cross-site routes via the
network_routes field in stack definitions (dcn_nostorage.yaml).

Example usage in stack config:
  network_routes:
    internalapidcn1:
      - ip_netmask: 172.17.0.0/24
        next_hop: 172.17.10.1

Related: Requires corresponding data-plane-adoption change to add
network_routes to dcn1/dcn2 stack definitions.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: John Fulton <fulton@redhat.com>
…o into adoption vars

Add edpm_dcn1_routes and edpm_dcn2_routes variables to adoption_vars.yaml
template. These variables extract network routes defined in the TripleO
scenario file (dcn_nostorage.yaml) and make them available for EDPM
ansible configuration.

Routes are extracted from stack.network_routes for dcn1 and dcn2 stacks,
with each route containing destination (ip_netmask) and nexthop fields.

This enables proper inter-site connectivity in DCN deployments where
compute nodes in edge sites need routes to reach control plane services
in the central site.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: John Fulton <fulton@redhat.com>
…s bug

Refactor adoption_vars.yaml.j2: Replace duplicate dcn1/dcn2 route
extraction blocks with a single loop over site names. This follows
DRY principles and scales automatically if more DCN sites are added.

Fix os_net_config_overcloud.yml.j2: Move 'routes:' outside the `for`
loop to prevent generating duplicate YAML keys when multiple routes
exist. This mirrors the correct pattern already used for network_routes
at lines 40-48.

Signed-off-by: John Fulton <fulton@redhat.com>