Zed Series Release Notes

stackhpc/13.1.0.104

Security Issues

stackhpc/13.1.0.93

Security Issues

  • Security fixes for bug 2119646: Unauthenticated access to EC2/S3 token endpoints can grant Keystone authorization.

stackhpc/13.1.0.91

Bug Fixes

  • Updated Neutron container image tags to fix CVE-2024-53916. See #2037002 for more details.

stackhpc/13.1.0.85

Bug Fixes

  • Fixes a regression when using growroot.yml and software raid where the playbook would fail to identify the correct disk.

stackhpc/13.1.0.81

New Features

  • Adds a new diagnostics.yml playbook that collects diagnostic information from hosts. The diagnostics are aggregated to a directory ($PWD/diagnostics/ by default) on localhost. The diagnostics include:

    • Docker container logs

    • Kolla configuration files

    • Log files

    The collected diagnostic information contains sensitive information such as passwords in configuration files.

stackhpc/13.1.0.80

Critical Issues

  • Fixes CVE-2024-40767 with updated container images for Nova services.

stackhpc/13.1.0.74

New Features

  • The Docker CE package for Ubuntu has been bumped from 5:24.0.6-1 to 5:25.0.0-1. This is a side effect of separating out the repos for Docker CE for Ubuntu Jammy/Focal.

Critical Issues

  • Disables password expiration and inactivity policies. These policies caused the kayobe and kolla service accounts to be locked out of the system. You should re-apply the CIS benchmark hardening playbook as soon as possible to avoid being locked out of your system.

Bug Fixes

  • Separated out repos for Docker CE for Ubuntu Jammy/Focal. This fixes a Pulp sync issue where two “identical” repository versions existed with different checksums.

  • Updates the stackhpc.hashicorp Ansible collection to 2.5.0. This brings in an idempotency fix for generating certificates.

stackhpc/13.1.0.72

Security Issues

  • Adds a custom Apt repository to address CVE-2024-6387 in OpenSSH.

stackhpc/13.1.0.70

Security Issues

  • Updates the Rocky Linux 9 SIG Security Common repository to address CVE-2024-6409 in OpenSSH.

  • Enables the Rocky Linux 9 SIG Security Common repository, which provides updated OpenSSH packages addressing CVE-2024-6387 (regreSSHion). Other packages available in this repository are currently ignored.

stackhpc/13.1.0.68

Critical Issues

  • Fixes CVE-2024-32498 with updated container images for Cinder, Glance and Nova services.

stackhpc/13.1.0.64

New Features

  • Added two alerts (warning and critical) that are triggered when the ratio (free_swap_space / total_swap_space) falls below a threshold. Each threshold can be modified by altering the value of alertmanager_node_free_swap_warning_threshold_ratio and alertmanager_node_free_swap_critical_threshold_ratio.

    Currently this solution is limited to a one-size-fits-all policy. This can cause unwanted alerts for hosts which utilise swap heavily. It is therefore recommended to tune the thresholds or apply silence rules as needed.

  • Bumped Horizon kolla image. Bumped Grafana from 10.1.5-1 to 10.4.2-1 (CentOS & Rocky Linux). Bumped Grafana from 10.4.1 to 10.4.2 (Ubuntu). Bumped Prometheus-msteams from 1.5.0 to 1.5.2.

  • Adds support for providing a CA certificate for OpenStack Capacity exporter.

  • Allows synchronising a custom list of containers to Pulp using the stackhpc_pulp_repository_container_repos_extra and stackhpc_pulp_distribution_container_extra variables.
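
The free swap alerts described in this release compare a ratio against two thresholds. The real alerts are Prometheus rules; the Python sketch below only illustrates the comparison. The function name and the default ratio values are hypothetical placeholders, not the shipped defaults.

```python
def swap_alert_level(free_swap_bytes, total_swap_bytes,
                     warning_ratio=0.25, critical_ratio=0.10):
    """Return "critical", "warning" or None for one host.

    warning_ratio and critical_ratio stand in for
    alertmanager_node_free_swap_warning_threshold_ratio and
    alertmanager_node_free_swap_critical_threshold_ratio.
    The default values here are illustrative assumptions.
    """
    if total_swap_bytes <= 0:
        # No swap configured, so there is no ratio to evaluate.
        return None
    ratio = free_swap_bytes / total_swap_bytes
    if ratio < critical_ratio:
        return "critical"
    if ratio < warning_ratio:
        return "warning"
    return None
```

A host that deliberately keeps most of its swap in use would sit below both ratios permanently, which is why tuning the thresholds or silencing such hosts is recommended.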

Security Issues

  • Fixed CVE-2023-31047 for Horizon. Fixed CVE-2023-49569 for Grafana. Fixed CVE-2022-40083 and CVE-2021-4238 for Prometheus-msteams.

stackhpc/13.1.0.63

Known Issues

  • Generate backend TLS files for network hosts. This fixes backend TLS configuration for deployments where some API services are running on network hosts.

Bug Fixes

  • Prevents raising a Ceph PgsUnclean alert because of backfilling which can frequently happen because of normal rebalancing activities, such as use of the Ceph balancer or OSD addition.

stackhpc/13.1.0.62

New Features

  • Bumped Horizon kolla image. Bumped Grafana from 10.1.5-1 to 10.4.2-1 (Rocky Linux). Bumped Grafana from 10.4.1 to 10.4.2 (Ubuntu). Bumped Prometheus-msteams from 1.5.1 to 1.5.2.

Security Issues

  • Fixed CVE-2023-31047 for Horizon. Fixed CVE-2023-49569 for Grafana. Fixed CVE-2022-40083 and CVE-2021-4238 for Prometheus-msteams.

stackhpc/13.1.0.61

Bug Fixes

  • The OpenSearch backend for CloudKitty has been fixed, so the Horizon Rating panels work again.

stackhpc/13.1.0.60

New Features

  • Adds a new Prometheus alert HostNetworkBondDegraded which will be raised when at least one bond member is down.

  • Adds a new Prometheus alert HostNetworkBondSingleLink which will be raised when a bond is configured with only one member. This can happen when NetworkManager detects that a bond member is down at boot time. This alert can be disabled by setting alertmanager_warn_network_bond_single_link to false.
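
The conditions behind the two bond alerts above can be pictured with a short Python sketch. This is an illustration of the logic as described in these notes, not the Prometheus rules themselves; the function name and the shape of its input are assumptions.

```python
def bond_alerts(member_link_up, warn_single_link=True):
    """Return the alert names that would apply to one bond.

    member_link_up: hypothetical mapping of member interface name to
    True when its link is up.
    warn_single_link: stands in for
    alertmanager_warn_network_bond_single_link.
    """
    alerts = []
    # Degraded: at least one configured member is down.
    if any(not up for up in member_link_up.values()):
        alerts.append("HostNetworkBondDegraded")
    # Single link: the bond was configured with only one member.
    if warn_single_link and len(member_link_up) == 1:
        alerts.append("HostNetworkBondSingleLink")
    return alerts
```

For example, a bond whose second member was down at boot and was therefore never added would report only HostNetworkBondSingleLink, since every member it knows about is up.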

stackhpc/13.1.0.58

New Features

  • Updates Magnum CAPI Helm driver version to v0.11.0

Upgrade Notes

  • Updates the Ansible configuration to fail on any unparsed inventory source. If you are using a separate Ansible configuration for Kolla Ansible, you may wish to add this setting in etc/kayobe/kolla/ansible.cfg.

stackhpc/13.1.0.56

Security Issues

  • The Heat container images are rebuilt with yaql 3.0.0 to include patch for vulnerability OSSN/OSSN-0093. It is recommended that you redeploy Heat services in your system with the current version of Heat images from StackHPC Release Train.

stackhpc/13.1.0.55

New Features

  • Automatic deployment for OpenStack Capacity via a Kayobe service deploy hook using kolla admin credentials.

Upgrade Notes

  • OpenStack Capacity no longer uses application credentials. Please delete any previously generated application credentials.

stackhpc/13.1.0.54

Security Issues

  • Kolla container images created using the stackhpc-container-image-build.yml workflow are now automatically scanned for vulnerabilities.

stackhpc/13.1.0.50

Bug Fixes

  • Updates Magnum CAPI Helm driver version to v0.10.0

  • Fixes Grafana panel of top Ceph pools by capacity used. This panel was only showing the most used pool instead of as many pools as configured with the $topk variable.

stackhpc/13.1.0.49

New Features

  • The smartmon-tools playbook now ensures that the cron service is running as in some cases it may not be running by default.

stackhpc/13.1.0.47

Upgrade Notes

  • Update Ubuntu Jammy Zed Kolla container tags.

stackhpc/13.1.0.45

New Features

  • Adds a custom playbook (pulp-auth-proxy.yml) for deploying an authenticating proxy for Pulp. This can be used when building container images to avoid leaking credentials for package repositories into the built images or their metadata.

stackhpc/13.1.0.40

New Features

  • Rocky images have been rebuilt and are now based on Rocky 9.3.

stackhpc/13.1.0.37

Bug Fixes

  • Removes bogus ContainerVolumeUsage alert. This rule wasn’t correctly measuring container volume IO and could cause spurious alerts.

  • Add a new reset-bls-entries.yml custom playbook which will rename existing Boot Loader Specification (BLS) entries using the current machine ID for each host. This should fix an issue with Grub not selecting the most recent kernel during boot.

stackhpc/13.1.0.36

New Features

  • Added support for Rocky Linux 9.3 repositories and Kolla containers. Made 9.3 the default version for Rocky Linux.

  • Updated Rocky Linux 9.2 pulp repo versions. Added Rocky Linux 9.3 pulp repo versions. Rebuilt Kolla containers with Rocky Linux 9.3.

Upgrade Notes

  • Bifrost Ironic debug logging is now disabled by default. Change ironic_debug to true to revert.

  • Updates Consul to 1.16.4 and Vault to 1.14.8.

Bug Fixes

  • Bumps OpenSearch heap size to 8 GB, to be identical to Elasticsearch.

stackhpc/13.1.0.35

New Features

  • StackHPC Kayobe Configuration container images for CI/CD with Kayobe Automation are now published to GitHub Container Registry (GHCR) at ghcr.io/stackhpc/stackhpc-kayobe-config. The image is tagged with the name of the release branch, e.g. stackhpc/yoga.

stackhpc/13.1.0.33

New Features

  • Adds support for deploying GitHub runners and creating GitHub workflows for use within Kayobe Automation. Two playbooks and their requirements have been added to ansible/ in addition to the relevant groups defined with some useful default variables where appropriate. Finally, documentation has been added to cover how to deploy these runners and workflows.

  • Added the stop-openstack-services.yml playbook, which can be used to stop OpenStack services across the overcloud.

Bug Fixes

  • Pin the OCI image tag used for the Ubuntu Focal base-image of Kolla image builds. This prevents packages in the image with the latest tag getting in front of StackHPC release-train package repositories. The Ubuntu tag should be bumped when new packages are available in StackHPC release-train.

stackhpc/13.1.0.32

New Features

  • Updates OpenSearch to 2.11.1.

stackhpc/13.1.0.30

Bug Fixes

  • Pin the OCI image tag used for the base-image of Rocky 9 Kolla image builds. This prevents packages in the image with the latest tag getting in front of StackHPC release-train package repositories.

stackhpc/13.1.0.28

New Features

  • Added the rekey-hosts.yml playbook to automatically rotate the SSH keys on all hosts.

  • Adds support for Ubuntu Jammy and Rocky 9 to the CIS benchmark hardening playbook: cis.yml. This playbook will need to be manually applied.

  • Adds a panel in the Hardware Overview dashboard to show DWPD (Drive writes per day) for NVMEs. This is calculated by dividing the total bytes written in the past 24 hours by the drive capacity. This is currently only supported on NVMEs.

  • Adds an alert that fires when 1 DWPD is sustained for 7 days, and a critical alert that fires when 1 DWPD is sustained for 30 days.
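
As a worked example of the DWPD calculation described above, the following Python sketch divides the bytes written in 24 hours by the drive capacity. The function names are hypothetical, and the way "sustained" is evaluated here (every daily value at or above the threshold) is an assumption rather than the definition used by the alert rules.

```python
def drive_writes_per_day(bytes_written_24h, capacity_bytes):
    """DWPD: bytes written in the past 24 hours divided by capacity."""
    if capacity_bytes <= 0:
        raise ValueError("capacity_bytes must be positive")
    return bytes_written_24h / capacity_bytes


def sustained(daily_dwpd, days, threshold=1.0):
    """True if the most recent `days` daily values all reach threshold.

    Assumption: one DWPD sample per day, oldest first.
    """
    if len(daily_dwpd) < days:
        return False
    return all(value >= threshold for value in daily_dwpd[-days:])


# A 1.92 TB drive that received 3.84 TB of writes in a day.
example_dwpd = drive_writes_per_day(3_840_000_000_000, 1_920_000_000_000)
```

Here example_dwpd is 2.0, meaning the drive was written twice over in one day.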

Bug Fixes

  • Fixes display of the OpenSearch cluster health in Grafana when in yellow state.

  • Fix Grafana HAProxy dashboard when non-default Prometheus instance labels are used.

stackhpc/13.1.0.27

Upgrade Notes

  • Updates default Ceph images to v17.2.7 for Quincy.

Bug Fixes

  • Fixes Neutron so that load balancer FIPs are not broken on Neutron restart. See Neutron bug report.

  • Fixes an issue where Netmiko devices were sending no commands to the switch, since plug_bond_to_network is overridden in networking_generic_switch/devices/netmiko_devices/__init__.py and PLUG_BOND_TO_NETWORK is set to None. See NGS bug report.

stackhpc/13.1.0.26

Upgrade Notes

  • Updates Consul to 1.16.3 and Vault to 1.14.6.

Bug Fixes

  • Fixes the bulk API of CloudKitty so that it now supports the migration from Elasticsearch to OpenSearch.

  • Fixes an issue with the growroot playbook where disks such as ‘sdp’ would become ‘sd’ due to the removal of the trailing ‘p’ when dealing with nvme devices.
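
The growroot issue above is a classic string-handling pitfall. The playbook itself is Ansible, so the Python below is only a sketch of the failure mode and one possible guard; both function names and the regular expression are hypothetical.

```python
import re

# NVMe partitions look like nvme0n1p2: the "p" separates disk and partition.
NVME_PARTITION = re.compile(r"^(nvme\d+n\d+)p\d+$")


def parent_disk_naive(partition):
    """Buggy variant: always strips a trailing 'p' after the digits."""
    return partition.rstrip("0123456789").rstrip("p")


def parent_disk(partition):
    """Strip the 'p' separator only when the device is an NVMe partition."""
    match = NVME_PARTITION.match(partition)
    if match:
        return match.group(1)
    return partition.rstrip("0123456789")
```

With the naive variant, the partition sdp1 resolves to sd, while the guarded variant returns sdp and still maps nvme0n1p2 to nvme0n1.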

stackhpc/13.1.0.23

Bug Fixes

  • Restores valid value for the flavor_id label on openstack_nova_server_status Prometheus metrics.

stackhpc/13.1.0.22

New Features

  • Neutron containers are now built from our StackHPC fork.

stackhpc/13.1.0.20

New Features

  • The Cephadm pre and post commands now support default commands with the variables cephadm_commands_pre_default and cephadm_commands_post_default. As such, any extra commands should be added to the variables cephadm_commands_pre_extra and cephadm_commands_post_extra.

Bug Fixes

  • When using custom SCA policies for Wazuh, the agents are now correctly configured to allow commands to be executed from the manager.

  • Fixes an issue with Ansible Pulp modules depending on the pulp_glue Python library since the pulp.squeezer 0.0.14 release.

stackhpc/13.1.0.19

New Features

  • The Rocky Linux 9 image has been rebuilt to include previously missing base packages (e.g. microcode_ctl) by installing the ‘Minimal Install’ DNF group. cloud-init from CentOS 9 Stream has also been installed, with NetworkManager support.

Bug Fixes

  • Fixes an issue when live migrating instances to hosts with cgroups v2 enabled (Ubuntu Jammy and Rocky 9). See Nova bug report.

  • Fixes a race condition when launching multiple Ironic instances in parallel (as is commonly triggered when using Terraform/OpenTofu). See Nova bug report.

  • Fixes an issue with Kolla container image builds for Ubuntu where the release train package repositories could be behind the container image, leading to image build failures.

stackhpc/13.1.0.17

Bug Fixes

  • Rebuild and bump the Bifrost container for Xena to include a fix for the error "Error while running update_to_latest_versions: ‘BIOSSetting’ object has no attribute" during Ironic database migrations on upgrade.

  • Disabled custom APT configuration for non-overcloud hosts (Ubuntu Only). This resolves the issue of the seed hypervisor attempting to pull packages from the repository on the seed before it has been deployed.

stackhpc/13.1.0.14

New Features

  • This patch adds OpenStack Capacity metrics and exporters to StackHPC Kayobe Config. This includes a deployment playbook, Prometheus scrape jobs and HAProxy configurations to support this change.

  • Adds ethtool and pciutils to the overcloud host disk image.

  • Raises an alert when the count of RabbitMQ ready messages increases above a threshold.

  • Adapt threshold of RabbitMQ connection alert based on the size of the deployment to avoid spurious alerts.

  • Wazuh can now be deployed with additional custom SCA policies. Just add the policy file(s) to the directory {{ kayobe_env_config_path }}/wazuh/custom_sca_policies.

Upgrade Notes

  • Rebuilt all kolla and package repo tags to bring in kernel fixes and apply CentOS image build customisations that were previously being ignored.

  • To deploy the OpenStack Capacity Grafana dashboard, you must define OpenStack application credential variables: secrets_os_capacity_credential_id and secrets_os_capacity_credential_secret as laid out in the ‘Monitoring’ documentation.

    You must also enable the stackhpc_enable_os_capacity flag for OpenStack Capacity HAProxy and Prometheus configuration to be templated.

    You may also change the default authentication URL from the kolla_internal_fqdn and change the default OpenStack region from RegionOne with the variables: stackhpc_os_capacity_auth_url and stackhpc_os_capacity_openstack_region_name.

    To disable certificate verification for the OpenStack Capacity exporter, you can set stackhpc_os_capacity_openstack_verify to false.

stackhpc/13.1.0.11

Upgrade Notes

  • The reboot.yml custom Ansible playbook now defaults to rebooting only one host at a time. Existing behaviour can be retained by setting ANSIBLE_SERIAL=0.

Security Issues

  • The Rocky 8 minor version has been bumped to 8.8 and new snapshots have been created to include fixes for Zenbleed (CVE-2023-20593) and Downfall (CVE-2022-40982). It is recommended that you update your OS packages and reboot into the new kernel as soon as possible.

  • The snapshots for Rocky 9.2 have been refreshed to include fixes for Zenbleed (CVE-2023-20593) and Downfall (CVE-2022-40982). It is recommended that you update your OS packages and reboot into the new kernel as soon as possible.

stackhpc/13.1.0.9

Upgrade Notes

  • Enabled ML2/OVN by default. Checks preventing accidental migration from ML2/OVS were added in Kolla Ansible. If you are using a Neutron plugin other than ML2/OVN, set kolla_enable_ovn to false.

    OVN distributed FIP is disabled by default. To enable it, set neutron_ovn_distributed_fip to true in etc/kayobe/kolla/globals.yml.

stackhpc/13.1.0.8

New Features

  • Provide ELRepo 9, which in turn provides packages to support be2net and mpt3sas hardware. Configuration of ELRepo 9 is disabled by default and may be enabled by setting dnf_install_elrepo_9: true.

Upgrade Notes

  • Configure Nova to use more modern ‘q35’ libvirt machine type rather than ‘pc’ which is considered legacy.

stackhpc/13.1.0.5

New Features

  • Nvmemon now reports physical size of the disk.

Upgrade Notes

  • CentOS Stream 8 snapshots have been bumped and new container images are available. Make sure to sync these into your local pulp. The yum repositories must be reconfigured to exclude a buggy version of iptables. To do this use: kayobe overcloud service reconfigure -kt none -t dnf.

  • CentOS Extras has been replaced with CentOS Extras Common. You may need to use the --allowerasing option with DNF if you have packages installed from the old repo. This is only needed once; you can drop the option on the next package update.

Security Issues

  • Bumps CentOS Stream 8 snapshots to include fixes for Zenbleed (CVE-2023-20593) and Downfall (CVE-2022-40982). It is recommended that you update your OS packages and reboot into the new kernel as soon as possible.

  • Bumps Ubuntu repository snapshots and container images to bring in latest security patches. This includes the microcode to patch Downfall (CVE-2022-40982). Zenbleed (CVE-2023-20593) was patched in the previous snapshot bump. To apply the microcode updates, it is recommended to reboot each host after upgrading all of the packages.

Bug Fixes

  • Fixes an issue with local image builds where kolla_tag had not been set. The error had the signature:

  • Upstream package repository mirrors are now restored in Kolla container images. This makes it possible to install or update packages for debugging purposes.

stackhpc/13.1.0.4

Upgrade Notes

  • Instance labels in Prometheus now use inventory hostnames rather than IPs.

stackhpc/13.1.0.2

New Features

  • The playbook hotfix-containers.yml has been added. This allows arbitrary files to be copied into, and/or arbitrary commands to be executed within, overcloud containers.

  • Support for Ubuntu 22.04 Jammy Jellyfish repositories has been added to the Yoga Release.

  • Improvements to the ci-aio automated deployment script to allow the script to successfully run on LVM-based images.

  • Adds support for using Ceph HAProxy and Keepalived images stored in Pulp. This is enabled automatically if stackhpc_sync_ceph_images is set to true.

  • Adds support for synchronising HashiCorp Consul and Vault images to a local Pulp registry.

  • Prebuilt overcloud host images can now be pulled from Ark using the stackhpc_download_overcloud_host_images variable. The image is selected based on os_distribution and os_release.

Upgrade Notes

  • Bumped Focal package versions due to unmet dependencies.

Bug Fixes

  • Added the NetworkManager-config-server package to the Rocky Linux 9 deployment image. This prevents NetworkManager from automatically running DHCP on unconfigured ethernet devices and allows connections with static IP addresses to be brought up even on ethernet devices with no carrier.

  • Caps the number of Pulp API and content workers to 32 each to avoid errors on hosts with many CPUs.

  • Fixes Octavia health monitors not being created on cluster spawn.

  • Fixes documentation builds on Read the Docs.
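
The worker cap mentioned in the list above amounts to clamping a CPU-derived count. The following Python sketch shows that idea under the assumption that the uncapped count equals the number of CPU cores; the function and constant names are hypothetical, and the actual setting is applied through the Pulp deployment configuration rather than Python.

```python
import os

# Cap taken from the release note: at most 32 API and 32 content workers.
PULP_WORKER_CAP = 32


def pulp_worker_count(cpu_count=None, cap=PULP_WORKER_CAP):
    """Clamp a CPU-based worker count to the range [1, cap].

    Assumption: one worker per CPU core before the cap is applied.
    """
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    return max(1, min(cpu_count, cap))
```

On a 128-core host this yields 32 workers, while an 8-core host keeps 8.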

Other Notes

  • Reduced verbosity in etc/kayobe/pulp.yml

stackhpc/13.1.0.1

New Features

  • Add blazar project Kolla container images. Blazar is a resource reservation service for OpenStack. Blazar enables users to reserve a specific type/amount of resources for a specific time period and it leases these resources to users based on their reservations.

  • Adds cASO container images. cASO is an accounting reporter that supports Cloud Accounting Usage Records. For more information, see the upstream docs. Note that this container does not exist in upstream Kolla and is maintained downstream by StackHPC.

  • Adds code to the globals.yml file to add endpoints for the ceph_mgr_exporter. If ceph is configured correctly, managers will be under the mgrs inventory group. If this group is empty, then the variable will just be empty (the KA default). This also requires setting kolla_enable_prometheus_ceph_mgr_exporter to true.

  • Sets monitoring services to be enabled by default in the ci-multinode environment.

  • OpenSearch container images have been added.

  • Add the package repository configuration required for Rocky Linux 9 support.

    Add CI for Rocky 9 hosts.

  • Added support for Rocky Linux 9.2 repositories and made 9.2 the default version.

  • Adds support for using VMs as compute and controller nodes in the ci-multinode environment by dynamically setting the MTU of the networks in networks.yml and removing the static definition of the network interfaces for the compute and controller groups.

  • Add Wazuh deployment playbook.

  • Adds utility playbooks to build and rotate amphora images. For more details check out the Octavia section of the Operator Guide included in the documentation.

  • Brings in new neutron container images to add batching support to Networking Generic Switch. This is opt in via the ngs_batch_requests configuration option and only affects Ironic deployments that use Networking Generic Switch. See the following PR for more details.

  • Updates neutron containers to contain a version of networking-generic-switch with support for trunk ports when using DellOS 10 or Cisco switches. See this PR for more details.

  • Updates neutron containers to contain a version of networking-generic-switch with support for DellOS 10. See this PR for more details.

  • Added a script to the AIO environment that can be used to quickly deploy an AIO for testing.

  • Adds time information to tasks using the ansible.posix.profile_tasks callback.

  • Adds some basic tuning of Ansible, including use of 20 forks, enabling SSH pipelining, YAML-formatted output, and disabling fact variable injection.

  • The Magnum container now includes the CAPI driver.

  • Mariabackup is now enabled by default.

  • The flag om_enable_rabbitmq_high_availability is now set to true. Adds tags for new RabbitMQ containers to update to RabbitMQ version 3.9.22.

  • Adds an etcd Kolla container image. This can be used for OpenStack service coordination as a tooz backend, or for batched processing of switch configuration in Networking Generic Switch (this requires a downstream NGS patch).

  • Adds drive temperatures to the table on the hardware overview dashboard and a timeseries to show the temperature over time.

  • Adds picker to hardware overview dashboard to select a specific host to show drive information for.

  • Adds a new variable stackhpc_pulp_sync_for_local_container_build which, when set to true, configures the local Pulp server to sync all package repositories required for building kolla containers on a local kolla build host.

  • Enable TLS for the Seed Pulp service. Set pulp_enable_tls: true and provide paths to a TLS certificate and key using pulp_cert_path and pulp_key_path respectively.

  • Adds a standard LVM configuration that is compatible with the new overcloud host image.

  • Adds the Helm client to the Magnum container.

  • Adds support for Manila in the ci-multinode environment using the CephFS native backend. This is disabled by default, but can be enabled by setting the following variables in the Kayobe configuration: kolla_enable_manila: true and kolla_enable_manila_backend_cephfs_native: true.

  • Updated the documentation for the ci-multinode to include instructions on how to set up and test Magnum.

  • Added support for Wazuh in the ci-multinode environment.

  • Updates Prometheus Node exporter to version 1.5.0.

  • Adds NTP alerts to Prometheus Alertmanager.

  • Adds alerts for Octavia load balancers and amphorae. Alerts are triggered when load balancers enter the ERROR or DEGRADED states, or when amphorae enter the ERROR state.

  • Adds a new Grafana dashboard for Octavia. This dashboard is used to monitor the load balancers as well as the amphorae.

  • Adds a standard overcloud Diskimage Builder (DIB) host image configuration.

  • Re-enable Pulp Ubuntu repositories.

  • Package repositories and container images for CentOS Stream based deployments have been updated. Key packages to note are:

    • Kernel

      • version: 4.18.0

      • release: 448.el8

    • Libvirt

      • version: 8.0.0

      • release: 6.module_el8.7.0+1140+ff0772f9

    • OVS

      • version: 2.17.0

      • release: 71.el8s

    • OVN

      • version: 22.09.0

      • release: 11.el8s

  • Container images for Ubuntu based deployments have been updated. Key packages to note are:

    • Libvirt

      • version: 8.0.0

      • release: 1ubuntu7.4~cloud0

    • OVS

      • version: 2.17.3

      • release: 0ubuntu0.22.04.1~cloud0

    • OVN (unchanged since last container build)

      • version: 22.03.0

      • release: 0ubuntu1~cloud0

  • Sync Rocky Linux 8.7 RPM repositories to local Pulp servers.

  • Enables SMART monitoring. Manual action is required, please see the monitoring documentation for the procedure.

  • Split cephadm_commands into cephadm_commands_pre and cephadm_commands_post commands. This allows the user to run commands that must be run before the rest of the post-deployment configuration, as well as commands that rely on resources created by the post-deployment config.

  • Updates Grafana to version 9.4.7.

  • Upgrades Pulp from 3.21 to 3.22.

  • Disables Pulp analytics.

  • Sets the number of Pulp workers based on available CPU cores. This may improve performance when pulling container images to many hosts simultaneously.

  • Upgrades Pulp from 3.22 to 3.23.

  • Upgrades Pulp from 3.23 to 3.24.

  • Adds support for package repository snapshots via Pulp. A local Pulp server is deployed on the seed, which syncs package repositories and container images from the StackHPC Ark Pulp server. Control plane servers pull packages and container images from the local Pulp server.

  • The EPEL package repository is disabled by default. It may be enabled by setting dnf_enable_epel to true.

  • Uses StackHPC source code repositories for kolla, kolla-ansible, and bifrost.

  • Supports Kolla CentOS Stream 8 source container images.

  • Adds custom playbooks for compute host maintenance:

    • nova-compute-drain.yml

    • nova-compute-disable.yml

    • nova-compute-enable.yml

    • reboot.yml

  • Adds a custom playbook to reset the RabbitMQ cluster and restart OpenStack services that use it, rabbitmq-reset.yml.

  • Adds a custom playbook to configure swap, swap.yml.

  • Adds the Kayobe Automation Git repository as a submodule, and provides some basic configuration for it in an .automation.conf directory.

  • Adds support for deploying a Squid caching proxy as a custom container on the seed.

  • Enables Elasticsearch, Grafana, Kibana, Prometheus by default. Provides standard dashboards for Grafana and alerting rules for Prometheus.

Upgrade Notes

  • Bumps Octavia container versions.

  • Bumped Rocky 9 package versions due to a missing snapshot.

  • Updates container tags for Magnum CAPI changes.

  • Updates Ceph Pacific container image to v16.2.11.

  • Automatically install Quincy if the node is running Ubuntu 22.04, else install Pacific.

  • Increase stackhpc.cephadm collection to version 1.12.2.

  • Enables Docker live restore by default. This may be disabled by setting docker_daemon_live_restore to false in docker.yml.

  • The flag om_enable_rabbitmq_high_availability is now set to true. As this enables durable queues, RabbitMQ will need to be reset, and the services which use it restarted. Tags are added to update the RabbitMQ containers to version 3.9.22.

  • The overcloud host image build workflow now uploads the built image to SMS as well as ARK, allowing it to be tested both manually and through AIO CI jobs.

  • Updated OVN package version from 22.06 to 22.09.

  • openvswitch version has been updated to ~2.17.5 on all distributions (CentOS/Rocky9 are two patches ahead of 2.17.5). Images include fixes for CVE-2023-1668.

    Ubuntu repository versions for focal and ubuntu cloud archive have been updated to 20230515.

  • Kolla tag overrides have been refactored to allow kolla-ansible to resolve them individually by host. This means that mixed clouds can be deployed which allows for migration between distributions.

  • Don't pull Apt packages from Pulp for Ubuntu Jammy until Jammy packages are published.

  • Don't pull Ceph packages from the official Ceph repos for Ubuntu Jammy until Jammy packages are published.

  • Updates the smartmon-tools.yml playbook to ensure that cron is installed before attempting to configure crontab.

  • Updated Ubuntu package repository versions.

Bug Fixes

  • Fixed a syntax error in Prometheus SMART monitoring rules.

  • Fixes the hardware overview dashboard to use the correct metric for displaying drive temps. Now uses an or to display whichever metric is compatible with the drives in the system. The two metrics are temperature_case_raw_value and temperature_celsius_raw_value.

  • Fixes the issue with using SAML2 federation in Keystone against NetIQ IdP.

  • Fixes internet connectivity for VMs deployed in the ci-multinode environment.

  • Fixes creation of over 1TB memory VMs on AMD with IOMMU enabled on Rocky Linux 9.

  • Fixes the smartmon script to be case insensitive when checking for the initial SMART info. This ensures that the script works correctly on systems where the output of smartctl -i is not capitalised as previously expected by the script. Previously this led to badly formatted .prom files, which caused node_exporter to fail to scrape the file.

  • Fixes the InstanceDown alerting rule wait time to be consistent with the alert message. The alert message says “for 5 minutes” but the rule was set to wait for 1 minute.

  • Updates nova image to bring in a fix for parsing mdev uuids when using libvirt>=7.7. See bug for more details.

  • Add unit to LowMemory alert description.

  • Fixes CoreDNS for Magnum clusters crashing on startup.

  • Allows cinder-csi nodeplugin to start on the same Magnum cluster host as cinder-csi controllerplugin.

  • Corrects ClusterRole rules for Magnum cluster-autoscaler, and sets cluster-autoscaler pods to use hostNetwork.

  • Disables metadata proxy over IPv6 inside Neutron DHCP agent to work around bug 1953165.

  • Fix for nova resize API not parsing the new flavor on resize - bug 1805969.

  • Fix creation of VM instances with UEFI enabled and Secure Boot disabled.

  • Fixes synchronisation and DNF configuration of the Rocky Linux 9 CRB repository.

  • HAProxy alerting rules have been updated to use the server name that is down, rather than the name of the instance that reported the down server.

Other Notes

  • Adds deployment guide documentation for the new CAPI driver.

  • Changes the Grafana OpenStack dashboard to show HTTP status 300 as green instead of orange.

  • Adds a ci-aio environment for CI testing.

  • Adds a ci-builder environment for building Kolla container images in CI.