2023.1 Antelope Series Release Notes

stackhpc/14.0.0.83

New Features

  • Add optional support for relabelling network devices in Prometheus. Use network names as defined in kayobe, instead of network device names. Reuse of device names within an environment is not supported.

stackhpc/14.0.0.81

New Features

  • Bumped pulp repo versions for Q2 2024

  • Bumped Kolla image tags for Q2 2024

  • Bumped prometheus server from 2.38.0 to 2.51.1

  • Bumped prometheus alertmanager from 0.24.0 to 0.26.0

  • Bumped prometheus blackbox exporter from 0.23.0 to 0.25.0

  • Bumped prometheus cadvisor exporter from 0.48.0 to 0.49.1

  • Bumped prometheus haproxy exporter from 0.13.0 to 0.15.0

  • Bumped prometheus memcached exporter from 0.10.0 to 0.14.3

  • Bumped prometheus msteams from 1.5.1 to 1.5.2

  • Bumped prometheus mtail from 3.0.0-rc50 to 3.0.0-rc53

  • Bumped prometheus mysqld exporter from 0.15.0 to 0.15.1

  • Bumped prometheus node exporter from 1.4.0 to 1.7.0

  • Bumped prometheus openstack exporter from 1.6.0 to 1.7.0

  • Bumped prometheus ovn exporter from 1.0.6 to 1.0.7

  • Bumped opensearch from 2.11.1-1 to 2.13.0-1 (Rocky Linux 9)

  • Bumped opensearch from 2.12.0 to 2.13.0 (Ubuntu Jammy)

  • Bumped grafana from 10.1.5-1 to 10.4.2-1 (Rocky Linux 9)

  • Bumped grafana from 10.4.0 to 10.4.2 (Ubuntu Jammy)

Security Issues

  • Fixed CVE-2023-31047, CVE-2023-23969, CVE-2023-24580, CVE-2023-36053, CVE-2023-46695, CVE-2023-30861, CVE-2022-4899, CVE-2024-1135, GHSA-2m57-hf25-phgg, CVE-2023-0286, CVE-2023-50782, CVE-2024-26130 for OpenStack services.

    Fixed CVE-2022-41723, CVE-2023-39325 (except in prometheus-alertmanager, prometheus-msteams-exporter, prometheus-haproxy-exporter and prometheus-openstack-exporter, for which no patch is available), CVE-2021-43565, CVE-2022-27191, CVE-2022-27664, CVE-2021-38561, CVE-2022-21698, CVE-2021-4238, CVE-2022-40083, CVE-2022-41721, CVE-2021-33194, CVE-2023-2253, CVE-2023-27561, CVE-2023-28840, CVE-2024-21626, CVE-2022-32149, CVE-2023-45142, GHSA-m425-mq94-257g for prometheus server and exporters, except prometheus-libvirt-exporter and prometheus-haproxy-exporter (the source repositories of both are archived and no longer maintained).

    Fixed CVE-2023-39325, CVE-2023-45142, CVE-2023-47108, CVE-2023-49568, CVE-2023-49569, GHSA-9763-4f94-gfch, GHSA-m425-mq94-257g for grafana.

    It is advised to redeploy services with the current version of images from the StackHPC Release Train.

stackhpc/14.0.0.80

New Features

  • Supports adding CA certificates to the Tempest container trust store.

stackhpc/14.0.0.79

Bug Fixes

  • The OpenSearch backend for CloudKitty has been fixed, so the Horizon Rating panels work again.

stackhpc/14.0.0.78

New Features

  • Adds a new Prometheus alert HostNetworkBondDegraded which will be raised when at least one bond member is down.

  • Adds a new Prometheus alert HostNetworkBondSingleLink which will be raised when a bond is configured with only one member. This can happen when NetworkManager detects that a bond member is down at boot time. This alert can be disabled by setting alertmanager_warn_network_bond_single_link to false.

stackhpc/14.0.0.77

Bug Fixes

  • Adds a custom fix-houston.yml playbook to address dmesg errors, specifically: “tc mirred to Houston: device bond0-ovs is down”. This error typically appears when OVS HW offloading is enabled, often in conjunction with VF-LAG and ASAP^2. Detailed usage instructions are provided within the playbook’s comments. Additional context is available in LP#1899364 and the associated kernel patch.

stackhpc/14.0.0.76

Bug Fixes

  • Fixes appending to ca.crt in make-cert-client.sh, which caused multiple identical CA certificates to be added to /etc/kubernetes/certs/ca.crt.

stackhpc/14.0.0.73

Security Issues

  • Update Horizon on Ubuntu to include apache2 package 2.4.52-1ubuntu4.8 which fixes CVE-2023-31122.

stackhpc/14.0.0.72

New Features

  • Updates Magnum CAPI Helm driver version to v0.13.0

stackhpc/14.0.0.71

New Features

  • Updates Magnum CAPI Helm driver version to v0.11.0

  • Automatic deployment for OpenStack Capacity via a Kayobe service deploy hook using kolla admin credentials.

Upgrade Notes

  • Updates the Ansible configuration to fail on any unparsed inventory source. If you are using a separate Ansible configuration for Kolla Ansible, you may wish to add this setting in etc/kayobe/kolla/ansible.cfg.

  • OpenStack Capacity no longer uses application credentials. Please delete any previously generated application credentials.

stackhpc/14.0.0.69

Upgrade Notes

  • Ensure that your deployment has only one nova-compute-ironic service running per conductor group. See the operations / nova-compute-ironic doc for further details.

Bug Fixes

  • Adds basic support and a document explaining how to migrate to a single nova-compute-ironic instance, and how to re-deploy the instance to another machine in the event of failure. See the operations / nova-compute-ironic doc for further details.

stackhpc/14.0.0.68

Security Issues

  • The Heat container images are rebuilt with yaql 3.0.0 to include patch for vulnerability OSSN/OSSN-0093. It is recommended that you redeploy Heat services in your system with the current version of Heat images from StackHPC Release Train.

stackhpc/14.0.0.67

New Features

  • Updates Magnum CAPI Helm driver version to v0.12.0

stackhpc/14.0.0.66

Security Issues

  • Kolla container images created using the stackhpc-container-image-build.yml workflow are now automatically scanned for vulnerabilities.

stackhpc/14.0.0.64

Bug Fixes

  • The grafana image now includes the gnocchixyz-gnocchi-datasource and the grafana-opensearch-datasource plugins, which are the default upstream plugins.

stackhpc/14.0.0.62

Upgrade Notes

  • Updates Magnum CAPI Helm driver version to v0.11.0

stackhpc/14.0.0.59

Bug Fixes

  • Fix an issue with the OSD summary pie chart not showing any data.

stackhpc/14.0.0.55

Bug Fixes

  • Updates Magnum CAPI Helm driver version to v0.10.0

  • Fixes Grafana panel of top Ceph pools by capacity used. This panel was only showing the most used pool instead of as many pools as configured with the $topk variable.

stackhpc/14.0.0.54

New Features

  • The smartmon-tools playbook now ensures that the cron service is running as in some cases it may not be running by default.

Upgrade Notes

  • Update Ubuntu Jammy Zed Kolla container tags.

stackhpc/14.0.0.51

New Features

  • Adds alerts for software raid failures.

stackhpc/14.0.0.49

New Features

  • Adds a custom playbook (pulp-auth-proxy.yml) for deploying an authenticating proxy for Pulp. This can be used when building container images to avoid leaking credentials for package repositories into the built images or their metadata.

stackhpc/14.0.0.42

New Features

  • Rocky images have been rebuilt and are now based on Rocky 9.3.

stackhpc/14.0.0.40

New Features

  • Adds NVMe and S.M.A.R.T utilities to the overcloud host image built by DIB.

stackhpc/14.0.0.39

Bug Fixes

  • Removes bogus ContainerVolumeUsage alert. This rule wasn’t correctly measuring container volume IO and could cause spurious alerts.

  • Add a new reset-bls-entries.yml custom playbook which will rename existing Boot Loader Specification (BLS) entries using the current machine ID for each host. This should fix an issue with Grub not selecting the most recent kernel during boot.

stackhpc/14.0.0.36

New Features

  • Added support for Rocky Linux 9.3 repositories and Kolla containers. Made 9.3 the default version for Rocky Linux.

  • Updated Rocky Linux 9.2 pulp repo versions. Added Rocky Linux 9.3 pulp repo versions. Rebuilt Kolla containers with Rocky Linux 9.3.

Upgrade Notes

  • Bifrost Ironic debug logging is now disabled by default. Change ironic_debug to true to revert.

  • Updates Consul to 1.16.4 and Vault to 1.14.8.

Bug Fixes

  • Bumps OpenSearch heap size to 8 GB, to be identical to Elasticsearch.

stackhpc/14.0.0.31

New Features

  • StackHPC Kayobe Configuration container images for CI/CD with Kayobe Automation are now published to GitHub Container Registry (GHCR) at ghcr.io/stackhpc/stackhpc-kayobe-config. The image is tagged with the name of the release branch, e.g. stackhpc/yoga.

stackhpc/14.0.0.30

Bug Fixes

  • Previously, switchdev capabilities had to be configured manually by a user with admin privileges using the port’s binding profile. This blocked regular users from managing ports with Open vSwitch hardware offloading, as providing write access to a port’s binding profile to non-admin users introduces security risks. For example, a binding profile may contain a pci_slot definition, which denotes the host PCI address of the device attached to the VM. A malicious user can use this parameter to pass through any host device to a guest, so in many scenarios it is impossible to give regular users write access to a binding profile.

    This patch fixes this situation by translating VF capabilities reported by Libvirt to Neutron port binding profiles. Other VF capabilities are translated as well for possible future use. LP#2008238. LP#2020813.

  • The Neutron OVN DB sync operation will no longer remove OVN metadata ports in networks with Octavia OVN load balancer health monitors. A maintenance task has been added to update the existing OVN LB HM ports to the new behaviour. Specifically, the “device_owner” field will be updated from network:distributed to ovn-lb-hm:distributed. Additionally, the “device_id” will be populated during the update action. LP#2038091.

stackhpc/14.0.0.26

Bug Fixes

stackhpc/14.0.0.25

New Features

  • Adds support for deploying GitHub runners and creating GitHub workflows for use within Kayobe Automation. Two playbooks and their requirements have been added to ansible/ in addition to the relevant groups defined with some useful default variables where appropriate. Finally, documentation has been added to cover how to deploy these runners and workflows.

  • Added the stop-openstack-services.yml playbook, which can be used to stop OpenStack services across the overcloud.

Bug Fixes

  • Pin the OCI image tag used for the Ubuntu Focal base-image of Kolla image builds. This prevents packages in the image with the latest tag getting in front of StackHPC release-train package repositories. Ubuntu tag should be bumped when new packages are available in StackHPC release-train.

stackhpc/14.0.0.22

New Features

  • Updates OpenSearch to 2.11.1.

stackhpc/14.0.0.21

Bug Fixes

  • Pin the OCI image tag used for the base-image of Rocky 9 Kolla image builds. This prevents packages in the image with the latest tag getting in front of StackHPC release-train package repositories.

stackhpc/14.0.0.20

New Features

  • Added the rekey-hosts.yml playbook to automatically rotate the SSH keys on all hosts.

  • Adds support for Ubuntu Jammy and Rocky 9 to the CIS benchmark hardening playbook: cis.yml. This playbook will need to be manually applied.

  • Adds a panel in the Hardware Overview dashboard to show DWPD (Drive writes per day) for NVMEs. This is calculated by dividing the total bytes written in the past 24 hours by the drive capacity. This is currently only supported on NVMEs.

  • Adds alerts that will fire after 1 DWPD is sustained for 7 days, and a critical alert if 1 DWPD is sustained for 30 days.
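
The DWPD figure and the two alert windows described above can be illustrated with a short Python sketch. This is not the dashboard query or the Prometheus rule itself: the function names, the "warning" label for the 7-day alert, and the use of one DWPD sample per day are assumptions made for illustration only.

```python
from typing import Optional, Sequence


def drive_writes_per_day(bytes_written_24h: int, capacity_bytes: int) -> float:
    """DWPD = bytes written in the past 24 hours / drive capacity."""
    if capacity_bytes <= 0:
        raise ValueError("capacity_bytes must be positive")
    return bytes_written_24h / capacity_bytes


def _sustained(daily_dwpd: Sequence[float], days: int, threshold: float) -> bool:
    # True only if we have at least `days` samples and every one of the
    # most recent `days` samples is at or above the threshold.
    return len(daily_dwpd) >= days and all(
        value >= threshold for value in daily_dwpd[-days:]
    )


def dwpd_alert_severity(
    daily_dwpd: Sequence[float], threshold: float = 1.0
) -> Optional[str]:
    """Classify a series of daily DWPD values (oldest first).

    Assumption: the 7-day alert is labelled "warning" here; the release
    note only names the 30-day alert as critical.
    """
    if _sustained(daily_dwpd, 30, threshold):
        return "critical"
    if _sustained(daily_dwpd, 7, threshold):
        return "warning"
    return None
```

For example, a 1 TB drive that absorbed 2 TB of writes in the last day has a DWPD of 2.0, and seven consecutive days at or above 1 DWPD would fall into the lower alert tier in this sketch.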

Bug Fixes

  • Fixes display of the OpenSearch cluster health in Grafana when in yellow state.

  • Fix Grafana HAProxy dashboard when non-default Prometheus instance labels are used.

stackhpc/14.0.0.17

New Features

  • Neutron containers are now built from our StackHPC fork.

Upgrade Notes

  • Updates default Ceph images to v17.2.7 for Quincy.

  • Updates Consul to 1.16.3 and Vault to 1.14.6.

Bug Fixes

  • Fixes the bulk API of CloudKitty so that it now supports the migration from Elasticsearch to OpenSearch.

  • Fixes an issue with the growroot playbook where disks such as ‘sdp’ would become ‘sd’ due to the removal of the trailing ‘p’ when dealing with nvme devices.

  • Fixes Neutron so that load balancer FIPs are not broken on Neutron restart. See Neutron bug report.

  • Fixes an issue where Netmiko devices were sending no commands to the switch, since plug_bond_to_network is overridden in networking_generic_switch/devices/netmiko_devices/__init__.py and PLUG_BOND_TO_NETWORK is set to None. See NGS bug report.

  • Restores valid value for the flavor_id label on openstack_nova_server_status Prometheus metrics.
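
The growroot fix listed above concerns deriving a parent disk name from a partition name. The playbook itself is written in Ansible; the following Python sketch only illustrates the idea, and the function name and regular expression are hypothetical. The separator "p" is removed only when it follows a digit, as in NVMe-style names, so a disk such as sdp keeps its final letter.

```python
import re

# NVMe-style partitions look like <disk ending in a digit> + "p" + <number>,
# e.g. /dev/nvme0n1p2. The "p" is a separator only in that case.
_DIGIT_P_PARTITION = re.compile(r"^(.*\d)p\d+$")


def parent_disk(partition: str) -> str:
    """Return the disk that a partition device belongs to.

    The argument must name a partition (e.g. /dev/sda3), not a whole disk.
    """
    match = _DIGIT_P_PARTITION.match(partition)
    if match:
        # Drop both the "p" separator and the partition number.
        return match.group(1)
    # SCSI-style names (sda1, sdp1): drop only the trailing partition number.
    return partition.rstrip("0123456789")
```

With this approach /dev/sdp1 maps to /dev/sdp rather than /dev/sd, while /dev/nvme0n1p2 still maps to /dev/nvme0n1.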

stackhpc/14.0.0.16

New Features

  • Adds kolla config merging options to the Kolla custom config generation section of etc/kayobe/kolla.yml.

Upgrade Notes

  • Kolla config merging is enabled by default in the Antelope release of Kayobe. This was quite an extensive change and whilst backwards compatibility was one of the goals, there may be some situations where refactoring of your Kolla config will be necessary. Extra care should be taken if you are using the multiple environments feature. It is recommended that you carefully check the diff in the resultant Kolla configuration by following these steps to check for missing config or duplicated config options. The kolla_openstack_custom_config_environment_merging_enabled option can be set to False to revert to the old behaviour.

stackhpc/14.0.0.15

New Features

  • The Cephadm pre and post commands now support default commands with the variables cephadm_commands_pre_default and cephadm_commands_post_default. As such, any extra commands should be added to the variables cephadm_commands_pre_extra and cephadm_commands_post_extra.

  • Rocky Linux 9 image has been rebuilt with missing base packages (e.g. microcode_ctl) by installing ‘Minimal Install’ DNF group. Also cloud-init from CentOS 9 Stream has been installed with NetworkManager support.

Bug Fixes

  • Fixes an issue when live migrating instances to hosts with cgroups v2 enabled (Ubuntu Jammy and Rocky 9). See Nova bug report.

  • Fixes a race condition when launching multiple Ironic instances in parallel (as is commonly triggered when using Terraform/OpenTofu). See Nova bug report.

  • When using custom SCA policies for Wazuh, the agents are now correctly configured to allow commands to be executed from the manager.

  • Fixes an issue with Ansible Pulp modules depending on the pulp_glue Python library since the pulp.squeezer 0.0.14 release.

  • Fixes an issue with Kolla container image builds for Ubuntu where the release train package repositories could be behind the container image, leading to image build failures.

stackhpc/14.0.0.14

Bug Fixes

  • Rebuild and bump the Bifrost container for Xena to include fix for Error while running update_to_latest_versions: ‘’BIOSSetting’’ object has no attribute during Ironic database migrations on upgrade

  • Disabled custom APT configuration for non-overcloud hosts (Ubuntu Only). This resolves the issue of the seed hypervisor attempting to pull packages from the repository on the seed before it has been deployed.

stackhpc/14.0.0.12

New Features

  • This patch adds OpenStack Capacity metrics and exporters to StackHPC Kayobe Config. This includes a deployment playbook, Prometheus scrape jobs and HAProxy configurations to support this change.

  • Adds ethtool and pciutils to the overcloud host disk image.

  • Raises an alert when the count of RabbitMQ ready messages increases above a threshold.

  • Adapt threshold of RabbitMQ connection alert based on the size of the deployment to avoid spurious alerts.

  • Wazuh can now be deployed with additional custom SCA policies. Just add the policy file(s) to the directory {{ kayobe_env_config_path }}/wazuh/custom_sca_policies.

Upgrade Notes

  • Rebuilt all kolla and package repo tags to bring in kernel fixes and apply CentOS image build customisations that were previously being ignored.

  • To deploy the OpenStack Capacity Grafana dashboard, you must define OpenStack application credential variables: secrets_os_capacity_credential_id and secrets_os_capacity_credential_secret as laid out in the ‘Monitoring’ documentation.

    You must also enable the stackhpc_enable_os_capacity flag for OpenStack Capacity HAProxy and Prometheus configuration to be templated.

    You may also change the default authentication URL from the kolla_internal_fqdn and change the default OpenStack region from RegionOne with the variables: stackhpc_os_capacity_auth_url and stackhpc_os_capacity_openstack_region_name.

    To disable certificate verification for the OpenStack Capacity exporter, you can set stackhpc_os_capacity_openstack_verify to false.

stackhpc/14.0.0.9

Upgrade Notes

  • Enabled ML2/OVN by default. Checks preventing accidental migration from ML2/OVS were added in Kolla Ansible. If you are using a Neutron plugin other than ML2/OVN, set kolla_enable_ovn to false.

    OVN distributed FIP is disabled, to enable it set neutron_ovn_distributed_fip to true in etc/kayobe/kolla/globals.yml.

  • The reboot.yml custom Ansible playbook now defaults to reboot only one host at a time. Existing behaviour can be retained by setting ANSIBLE_SERIAL=0.

Security Issues

  • The Rocky 8 minor version has been bumped to 8.8 and new snapshots have been created to include fixes for Zenbleed (CVE-2023-20593) and Downfall (CVE-2022-40982). It is recommended that you update your OS packages and reboot into the new kernel as soon as possible.

  • The snapshots for Rocky 9.2 have been refreshed to include fixes for Zenbleed (CVE-2023-20593) and Downfall (CVE-2022-40982). It is recommended that you update your OS packages and reboot into the new kernel as soon as possible.

stackhpc/14.0.0.8

Upgrade Notes

  • The path used to store Wazuh certificates has changed. local_certs_path is now set to the environment directory, e.g. $KAYOBE_CONFIG_PATH/environments/<environment>/wazuh, or $KAYOBE_CONFIG_PATH/wazuh/ if not using environments. The contents of $KAYOBE_CONFIG_PATH/ansible/wazuh/certificates should be moved to the new location and the empty directory should be removed.

  • The local_custom_certs_path variable has been removed. Custom wazuh certificates should be moved to $KAYOBE_CONFIG_PATH/environments/<environment>/wazuh/wazuh-certificates/ if using environments, or $KAYOBE_CONFIG_PATH/wazuh/wazuh-certificates if not.

stackhpc/14.0.0.6

New Features

  • Provide ELRepo 9, which in turn provides packages to support be2net and mpt3sas hardware. Configuration of ELRepo 9 is disabled by default and may be enabled by setting dnf_install_elrepo_9: true.

  • Nvmemon now reports physical size of the disk.

Upgrade Notes

  • CentOS Stream 8 snapshots have been bumped and new container images are available. Make sure to sync these into your local pulp. The yum repositories must be reconfigured to exclude a buggy version of iptables. To do this use: kayobe overcloud service reconfigure -kt none -t dnf.

  • CentOS Extras has been replaced with CentOS Extras Common. You may need to use the --allowerasing option with DNF if you have packages installed from the old repo. This is a one time only thing and on the next package update you can drop this argument.

  • Configure Nova to use more modern ‘q35’ libvirt machine type rather than ‘pc’ which is considered legacy.

  • Instance labels in prometheus now use inventory hostnames rather than IPs.

Security Issues

  • Bumps CentOS Stream 8 snapshots to include fixes for Zenbleed (CVE-2023-20593) and Downfall (CVE-2022-40982). It is recommended that you update your OS packages and reboot into the new kernel as soon as possible.

  • Bumps Ubuntu repository snapshots and container images to bring in latest security patches. This includes the microcode to patch Downfall (CVE-2022-40982). Zenbleed (CVE-2023-20593) was patched in the previous snapshot bump. To apply the microcode updates, it is recommended to reboot each host after upgrading all of the packages.

Bug Fixes

  • Fixes an issue with local image builds where kolla_tag had not been set. The error had the signature:

  • Upstream package repository mirrors are now restored in Kolla container images. This makes it possible to install or update packages for debugging purposes.

stackhpc/14.0.0.1

New Features

  • Add blazar project Kolla container images. Blazar is a resource reservation service for OpenStack. Blazar enables users to reserve a specific type/amount of resources for a specific time period and it leases these resources to users based on their reservations.

  • Adds caso container images. cASO is an accounting reporter that supports Cloud Accounting Usage Records. For more information, see the upstream docs. Note that this container does not exist in upstream Kolla and is maintained downstream by StackHPC.

  • Adds code to the globals.yml file to add endpoints for the ceph_mgr_exporter. If ceph is configured correctly, managers will be under the mgrs inventory group. If this group is empty, then the variable will just be empty (the KA default). This also requires setting kolla_enable_prometheus_ceph_mgr_exporter to true.

  • The playbook hotfix-containers.yml has been added. This allows arbitrary files to be copied into, and/or arbitrary commands to be executed within, overcloud containers.

  • Support for Ubuntu 22.04 Jammy Jellyfish repositories has been added to the Yoga Release.

  • Set monitoring services to be enabled by default in the ci-multinode environment.

  • OpenSearch container images have been added.

  • Add the package repository configuration required for Rocky Linux 9 support.

    Add CI for Rocky 9 hosts.

  • Added support for Rocky Linux 9.2 repositories and made 9.2 the default version.

  • Adds support for using VMs as compute and controller nodes in the ci-multinode environment by dynamically setting the MTU of the networks in networks.yml and removing the static definition of the network interfaces for the compute and controller groups.

  • Add Wazuh deployment playbook.

  • Adds utility playbooks to build and rotate amphora images. For more details check out the Octavia section of the Operator Guide included in the documentation.

  • Brings in new neutron container images to add batching support to Networking Generic Switch. This is opt in via the ngs_batch_requests configuration option and only affects Ironic deployments that use Networking Generic Switch. See the following PR for more details.

  • Updates neutron containers to contain a version of networking-generic-switch with support for trunk ports when using DellOS 10 or Cisco switches. See this PR for more details.

  • Updates neutron containers to contain a version of networking-generic-switch with support for DellOS 10. See this PR for more details.

  • Improvements to the ci-aio automated deployment script to allow the script to successfully run on LVM-based images.

  • Added a script to the AIO environment that can be used to quickly deploy an AIO for testing.

  • Adds time information to tasks using the ansible.posix.profile_tasks callback.

  • Adds some basic tuning of Ansible, including use of 20 forks, enabling SSH pipelining, YAML-formatted output, and disabling fact variable injection.

  • The Magnum container now includes the CAPI driver.

  • Adds support for using Ceph HAProxy and Keepalived images stored in Pulp. This is enabled automatically if stackhpc_sync_ceph_images is set to true.

  • Mariabackup is now enabled by default.

  • The flag om_enable_rabbitmq_high_availability is now set to true. Adds tags for new RabbitMQ containers to update to RabbitMQ version 3.9.22.

  • Adds an etcd Kolla container image. This can be used for OpenStack service coordination as a tooz backend, or for batched processing of switch configuration in Networking Generic Switch (this requires a downstream NGS patch).

  • Adds drive temperatures to the table on the hardware overview dashboard and a timeseries to show the temperature over time.

  • Adds picker to hardware overview dashboard to select a specific host to show drive information for.

  • Adds support for synchronising HashiCorp Consul and Vault images to a local Pulp registry.

  • Adds a new variable stackhpc_pulp_sync_for_local_container_build which, when set to true, configures the local Pulp server to sync all package repositories required for building kolla containers on a local kolla build host.

  • Enable TLS for the Seed Pulp service. Set pulp_enable_tls: true and provide paths to a TLS certificate and key using pulp_cert_path and pulp_key_path respectively.

  • Adds a standard LVM configuration that is compatible with the new overcloud host image.

  • Adds the Helm client to the Magnum container.

  • Adds support for Manila in the ci-multinode environment using the CephFS native backend. This is disabled by default, but can be enabled by setting the following variables in the kayobe configuration:

    kolla_enable_manila: true
    kolla_enable_manila_backend_cephfs_native: true

  • Updated the documentation for the ci-multinode to include instructions on how to set up and test Magnum.

  • Added support for Wazuh in the ci-multinode environment.

  • Updates Prometheus Node exporter to version 1.5.0.

  • Adds NTP alerts to prometheus alertmanager.

  • Adds alerts for Octavia load balancers and amphorae. Alerts are triggered when load balancers enter the ERROR or DEGRADED states, or when amphorae enter the ERROR state.

  • Adds a new Grafana dashboard for Octavia. This dashboard is used to monitor the load balancers as well as the amphorae.

  • Adds a standard overcloud Diskimage Builder (DIB) host image configuration.

  • Prebuilt overcloud host images can now be pulled from Ark using the stackhpc_download_overcloud_host_images variable. The image is selected based on os_distribution and os_release.

  • Re-enable Pulp Ubuntu repositories.

  • Package repositories and container images for CentOS Stream based deployments have been updated. Key packages to note are:

    • Kernel

      • version: 4.18.0

      • release: 448.el8

    • Libvirt

      • version: 8.0.0

      • release: 6.module_el8.7.0+1140+ff0772f9

    • OVS

      • version: 2.17.0

      • release: 71.el8s

    • OVN

      • version: 22.09.0

      • release: 11.el8s

  • Container images for Ubuntu based deployments have been updated. Key packages to note are:

    • Libvirt

      • version: 8.0.0

      • release: 1ubuntu7.4~cloud0

    • OVS

      • version: 2.17.3

      • release: 0ubuntu0.22.04.1~cloud0

    • OVN (unchanged since last container build)

      • version: 22.03.0

      • release: 0ubuntu1~cloud0

  • Sync Rocky Linux 8.7 RPM repositories to local Pulp servers.

  • Enables SMART monitoring. Manual action is required, please see the monitoring documentation for the procedure.

  • Split cephadm_commands into cephadm_commands_pre and cephadm_commands_post commands. This allows the user to run commands that must be run before the rest of the post-deployment configuration, as well as commands that rely on resources created by the post-deployment config.

  • Updates Grafana to version 9.4.7.

  • Upgrades Pulp from 3.21 to 3.22.

  • Disables Pulp analytics.

  • Sets Pulp worker based on available CPU cores. This may improve performance when pulling container images to many hosts simultaneously.

  • Upgrades Pulp from 3.22 to 3.23.

  • Upgrades Pulp from 3.23 to 3.24.

  • Adds support for package repository snapshots via Pulp. A local Pulp server is deployed on the seed, which syncs package repositories and container images from the StackHPC Ark Pulp server. Control plane servers pull packages and container images from the local Pulp server.

  • The EPEL package repository is disabled by default. It may be enabled by setting dnf_enable_epel to true.

  • Uses StackHPC source code repositories for kolla, kolla-ansible, and bifrost.

  • Supports Kolla CentOS Stream 8 source container images.

  • Adds custom playbooks for compute host maintenance:

    • nova-compute-drain.yml

    • nova-compute-disable.yml

    • nova-compute-enable.yml

    • reboot.yml

  • Adds a custom playbook to reset the RabbitMQ cluster and restart OpenStack services that use it, rabbitmq-reset.yml.

  • Adds a custom playbook to configure swap, swap.yml.

  • Adds the Kayobe Automation Git repository as a submodule, and provides some basic configuration for it in an .automation.conf directory.

  • Adds support for deploying a Squid caching proxy as a custom container on the seed.

  • Enables Elasticsearch, Grafana, Kibana, Prometheus by default. Provides standard dashboards for Grafana and alerting rules for Prometheus.

Upgrade Notes

  • Bumped focal package versions due to unmet dependencies

  • Bumps octavia container versions

  • Bumped rocky 9 package versions due to missing snapshot

  • Updates container tags for Magnum CAPI changes.

  • Updates Ceph Pacific container image to v16.2.11.

  • Automatically install Quincy if the node is running Ubuntu 22.04, else install Pacific.

  • Increase stackhpc.cephadm collection to version 1.12.2.

  • Enables Docker live restore by default. This may be disabled by setting docker_daemon_live_restore to false in docker.yml.

  • The flag om_enable_rabbitmq_high_availability is now set to true. As this enables durable queues, RabbitMQ will need to be reset, and the services which use it restarted. Tags are added to update the RabbitMQ containers to version 3.9.22.

  • The overcloud host image build workflow now uploads the built image to SMS as well as ARK, allowing it to be tested both manually and through AIO CI jobs.

  • Updated OVN package version from 22.06 to 22.09.

  • openvswitch version has been updated to ~2.17.5 on all distributions (CentOS/Rocky9 are two patches ahead of 2.17.5). Images include fixes for CVE-2023-1668.

    Ubuntu repository versions for focal and ubuntu cloud archive have been updated to 20230515.

  • Kolla tag overrides have been refactored to allow kolla-ansible to resolve them individually by host. This means that mixed clouds can be deployed which allows for migration between distributions.

  • Don’t pull apt packages from pulp for Ubuntu Jammy until Jammy packages are published.

  • Don’t pull ceph packages from ceph official repos for Ubuntu Jammy until Jammy packages are published.

  • Updates the smartmon-tools.yml playbook to ensure that cron is installed before attempting to configure crontab.

  • Updated Ubuntu package repository versions.

Bug Fixes

  • Added the NetworkManager-config-server package to the Rocky Linux 9 deployment image. This prevents NetworkManager from automatically running DHCP on unconfigured ethernet devices and allows connections with static IP addresses to be brought up even on ethernet devices with no carrier.

  • Fixed a syntax error in Prometheus SMART monitoring rules.

  • Caps the number of Pulp API and content workers to 32 each to avoid errors on hosts with many CPUs.

  • Fixes the hardware overview dashboard to use the correct metric for displaying drive temps. Now uses an or to display whichever metric is compatible with the drives in the system. The two metrics are temperature_case_raw_value and temperature_celsius_raw_value.

  • Fixes the issue with using SAML2 federation in Keystone against NetIQ IdP.

  • Fixes internet connectivity for VMs deployed in the ci-multinode environment.

  • Fixes creation of over 1TB memory VMs on AMD with IOMMU enabled on Rocky Linux 9.

  • Fixes the smartmon script to be case insensitive when checking for the initial SMART info. This ensures that the script works correctly on systems where the output of smartctl -i is not capitalised as the script previously expected. Previously this led to badly formatted .prom files, which caused node_exporter to fail to scrape the file.

  • Fixes the InstanceDown alerting rule wait time to be consistent with the alert message. The alert message says “for 5 minutes” but the rule was set to wait for 1 minute.

  • Updates nova image to bring in a fix for parsing mdev uuids when using libvirt>=7.7. See bug for more details.

  • Add unit to LowMemory alert description.

  • Fixes Octavia health monitors not being created on cluster spawn.

  • Fixes CoreDNS for Magnum clusters crashing on startup.

  • Allows cinder-csi nodeplugin to start on the same Magnum cluster host as cinder-csi controllerplugin.

  • Corrects ClusterRole rules for Magnum cluster-autoscaler, and sets cluster-autoscaler pods to use hostNetwork.

  • Disables metadata proxy over IPv6 inside Neutron DHCP agent to work around bug 1953165.

  • Fix for nova resize API not parsing the new flavor on resize - bug 1805969.

  • Fix creation of VM instances with UEFI enabled and Secure Boot disabled.

  • Fixes documentation builds on Read the Docs.

  • Fixes synchronisation and DNF configuration of the Rocky Linux 9 CRB repository.

  • HAProxy alerting rules have been updated to use the server name that is down, rather than the name of the instance that reported the down server.
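
The Pulp worker sizing and the cap of 32 mentioned in these notes can be pictured with a small Python sketch. The exact sizing rule used by the configuration is not stated here, so the one-worker-per-core rule, the minimum of one worker, and the names below are assumptions for illustration.

```python
MAX_PULP_WORKERS = 32  # cap taken from the bug fix note above


def pulp_worker_count(cpu_cores: int, cap: int = MAX_PULP_WORKERS) -> int:
    """Scale workers with CPU cores, never below 1 and never above the cap.

    Assumption: one worker per core; the real configuration may use a
    different ratio.
    """
    return max(1, min(cpu_cores, cap))
```

On an 8-core seed this yields 8 workers, while a 128-core host is held at 32.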

Other Notes

  • Adds deployment guide docs for the new CAPI driver.

  • Reduced verbosity in etc/kayobe/pulp.yml

  • Changes the Grafana OpenStack dashboard to show HTTP status 300 as green instead of orange.

  • Adds a ci-aio environment for CI testing.

  • Adds a ci-builder environment for building Kolla container images in CI.