ROCm 7.9.0 Preview Release
ROCm Core SDK 7.9.0 introduces a technology preview release aimed at helping
developers explore the new ROCm build and release infrastructure system called
TheRock. See ROCm Core SDK and TheRock Build System for more information.
This release focuses on foundational improvements and streamlining the development experience.
Important
ROCm 7.9.0 introduces a versioning discontinuity following the previous 7.0 releases.
Versions 7.0 through 7.8 are reserved for production stream ROCm releases,
while versions 7.9 and later represent the technology preview release stream.
Both streams share a largely similar code base but differ in their build systems.
These differences include the CMake configuration, operating system package dependencies,
and integration of AMD GPU driver components.
Maintaining parallel release streams allows users ample time to evaluate and
adopt the new build system and dependency changes. The technology preview
stream is planned to continue through mid‑2026, after which it will replace the
current production stream.
Release highlights
This technology preview of the ROCm Core SDK with TheRock introduces several
key foundational changes:
- ManyLinux_2_28 compliance: Enables single builds to support multiple Linux distributions, improving portability and simplifying deployment.
- Architecture-specific Python packages: Redesigned to target individual GPU architectures, reducing disk usage and improving modularity.
- Slimmed-down SDK: Focuses on core GPU compute capabilities with a minimal set of runtime components, libraries, and tools.
In addition to these technical updates, this release also begins the transition
to a more open and predictable development process:
- Open release process: Transition to a fully open model with public release candidates, nightly builds, and transparent pull request workflows.
- Predictable release cadence: Major and minor versions will follow a fixed 6-week release cycle.
7.9.0 compatibility notice
In terms of package compatibility, ROCm 7.9.0 diverges from the existing ROCm
7.0 stream and upcoming stable releases in that stream:
- No upgrade path from existing production releases -- including ROCm 7.0 and earlier -- or from upcoming stable releases. See the explanatory note.
- Not intended for production workloads -- users running production environments should continue using the ROCm 7.0 stream. See the explanatory note.
- Not fully featured -- this release is a stepping stone toward fully open software development.
7.9.0 support
- Hardware support: Builds are limited to AMD Instinct MI350 Series GPUs, MI300 Series GPUs and APUs, Ryzen AI Max+ PRO 300 Series APUs, and Ryzen AI Max 300 Series APUs. See Supported hardware and operating systems.
- Packaging format: RPM and Debian packages are not available in this initial release. Instead, Python wheels and tarballs are provided. See the ROCm 7.9.0 installation instructions.
- Software components: Some components of the ROCm Core SDK are not yet available in this release. Additional components are planned to be introduced in future preview releases as part of the ROCm Core SDK. Components not included in the future Core SDK will either:
  - Be released as standalone project-specific packages, or
  - Be grouped into ROCm Expansion SDKs.
Looking ahead
Subsequent technology preview releases will follow a 6-week cadence, gradually
filling gaps and introducing new ROCm expansions. AMD will continue to maintain
traditional ROCm releases in parallel with the 7.9+ preview stream.
Supported hardware and operating systems
ROCm 7.9.0 supports the following AMD Instinct GPUs and Ryzen AI
APUs. Each supported device is listed with its corresponding GPU architecture,
LLVM target, and supported operating systems.
Note
If you're running ROCm on Linux, ensure your system is using a supported kernel version.
Future preview releases will expand operating system support coverage.
| Device | Architecture | LLVM target | Supported operating systems |
|---|---|---|---|
| Instinct MI355X, Instinct MI350X | CDNA4 | gfx950 | Ubuntu 24.04.3, Ubuntu 22.04.5, RHEL 10.0, RHEL 9.6 |
| Instinct MI325X, Instinct MI300X, Instinct MI300A | CDNA3 | gfx942 | Ubuntu 24.04.3, Ubuntu 22.04.5, RHEL 10.0, RHEL 9.6 |
| Ryzen AI Max+ PRO 395, Ryzen AI Max+ PRO 390, Ryzen AI Max+ PRO 385, Ryzen AI Max+ PRO 380 | RDNA3.5 | gfx1151 | Ubuntu 24.04.3, Windows 11 24H2 |
| Ryzen AI Max 395, Ryzen AI Max 390, Ryzen AI Max 385 | RDNA3.5 | gfx1151 | Ubuntu 24.04.3, Windows 11 24H2 |
Note
This release supports a limited number of GPUs and APUs.
Hardware support will be expanded with future releases -- following the six-week release cadence.
Supported kernel driver and firmware bundles
ROCm depends on a coordinated stack of compatible firmware, driver, and user
space components. Maintaining version alignment between these layers ensures correct GPU
operation and performance, especially for AMD data center products.
While AMD publishes drivers and ROCm user space components, your server or
infrastructure provider publishes the GPU and baseboard firmware by bundling
AMD firmware releases through Platform Level Data Model (PLDM) bundles --
which include the Integrated Firmware Image (IFWI).
Note
Supported Ryzen AI APUs require the inbox kernel driver included with Ubuntu 24.04.3.
GPU virtualization is not supported in ROCm 7.9.0.
| Device | Driver (Linux) | Driver (Windows) | Firmware (PLDM bundle) |
|---|---|---|---|
| Instinct MI355X | AMD GPU Driver (amdgpu) | Not applicable | 01.25.15.04, 01.25.13.09 |
| Instinct MI350X | AMD GPU Driver (amdgpu) | Not applicable | 01.25.15.04, 01.25.13.09 |
| Instinct MI325X | AMD GPU Driver (amdgpu) | Not applicable | 01.25.04.02, 01.25.03.03 |
| Instinct MI300X | AMD GPU Driver (amdgpu) | Not applicable | 01.25.03.12 |
| Instinct MI300A | AMD GPU Driver (amdgpu) | Not applicable | BKC 26, BKC 25 |
| Ryzen AI Max+ PRO 395 | Inbox kernel driver | 25.9.2 | Not applicable |
| Ryzen AI Max+ PRO 390 | Inbox kernel driver | 25.9.2 | Not applicable |
| Ryzen AI Max+ PRO 385 | Inbox kernel driver | 25.9.2 | Not applicable |
| Ryzen AI Max+ PRO 380 | Inbox kernel driver | 25.9.2 | Not applicable |
| Ryzen AI Max 395 | Inbox kernel driver | 25.9.2 | Not applicable |
| Ryzen AI Max 390 | Inbox kernel driver | 25.9.2 | Not applicable |
| Ryzen AI Max 385 | Inbox kernel driver | 25.9.2 | Not applicable |
Deep learning frameworks
ROCm 7.9.0 supports PyTorch 2.7.1 on Linux and PyTorch 2.9.0 on Windows.
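As a quick verification (assuming a PyTorch-for-ROCm wheel from this release is installed), a short sanity check along these lines confirms that PyTorch sees the AMD GPU; on ROCm builds, devices are exposed through the torch.cuda API:

```python
# Minimal sanity check, assuming a PyTorch-for-ROCm wheel is installed.
# ROCm builds of PyTorch expose AMD GPUs through the torch.cuda device API.
import torch

print(torch.__version__)            # for example, 2.7.1 on Linux
print(torch.version.hip)            # HIP version string on ROCm builds; None otherwise
print(torch.cuda.is_available())    # True when a supported AMD GPU is visible

if torch.cuda.is_available():
    x = torch.randn(1024, 1024, device="cuda")
    print(torch.mm(x, x).sum().item())       # runs the matmul on the AMD GPU
    print(torch.cuda.get_device_name(0))
```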
ROCm Core SDK components
The following table lists core components included in the ROCm 7.9.0 release.
Expect future releases in this stream to expand the list of components.
ROCm 7.0.2 Release
The release notes provide a summary of notable changes since the previous ROCm release.
Release highlights
The following are notable new features and improvements in ROCm 7.0.2. For changes to individual components, see
Detailed component changes.
Supported hardware, operating system, and virtualization changes
ROCm 7.0.2 adds support for the RDNA4 architecture-based AMD Radeon RX 9060. For more information about supported AMD hardware, see Supported GPUs (Linux).
ROCm 7.0.2 adds support for the following operating systems and kernel versions:
- Debian 13 (kernel: 6.12)
- Oracle Linux 10 (kernel: 6.12.0 [UEK])
- RHEL 10.0 (kernel: 6.12.0-55)
For more information about supported operating systems, see Supported operating systems and install instructions.
Virtualization support
Virtualization support remains unchanged in this release. For more information, see Virtualization Support.
User space, driver, and firmware dependent changes
The software for AMD Datacenter GPU products requires maintaining a hardware
and software stack with interdependencies between the GPU and baseboard
firmware, AMD GPU drivers, and the ROCm user space software.
| ROCm version | Device | GPU and baseboard firmware | AMD GPU Driver (amdgpu) | GIM driver (SR-IOV) |
|---|---|---|---|---|
| ROCm 7.0.2 | MI355X | 01.25.15.02 (or later), 01.25.13.09 | 30.10.2, 30.10.1, 30.10 | 8.4.1.K |
| | MI350X | 01.25.15.02 (or later), 01.25.13.09 | 30.10.2, 30.10.1, 30.10 | |
| | MI325X | 01.25.04.02 (or later), 01.25.03.03 | 30.10.2, 30.10.1, 30.10, 6.4.z where z (0-3), 6.3.y where y (1-3) | |
| | MI300X | 01.25.05.00 (or later)[1], 01.25.03.12 | 30.10.2, 30.10.1, 30.10, 6.4.z where z (0-3), 6.3.y where y (0-3), 6.2.x where x (1-4) | 8.4.1.K |
| | MI300A | BKC 26 (or later), BKC 25 | | Not Applicable |
| | MI250X | IFWI 47 (or later) | | |
| | MI250 | MU5 w/ IFWI 75 (or later) | | |
| | MI210 | MU5 w/ IFWI 75 (or later) | | 8.4.0.K |
| | MI100 | VBIOS D3430401-037 | | Not Applicable |
AMD Instinct MI300X GPU resiliency improvement
Multimedia Engine Reset is now supported in AMD GPU Driver (amdgpu) 30.10.2 for AMD Instinct MI300X GPUs. This finer-grain GPU resiliency feature allows recovery from faults related to VCN or JPEG without requiring a full GPU reset, thereby improving system stability and fault tolerance. Note that VCN queue reset functionality requires PLDM bundle 01.25.05.00 (or later) firmware.
New OS support in ROCm dependent on AMD GPU Driver
ROCm support for RHEL 10.0 and Oracle Linux 10 requires AMD GPU Driver 30.10.2 or later.
RAG AI support enabled for ROCm
In September 2025, Retrieval-Augmented Generation (RAG) was added to the ROCm platform. Use RAG to build and deploy end-to-end AI pipelines on AMD GPUs. It enhances the accuracy and reliability of a large language model (LLM) by exposing it to up-to-date, relevant information. When queried, RAG retrieves relevant data from its knowledge base and uses it in conjunction with the query to generate accurate and informed responses. This approach minimizes hallucinations (the creation of false information) while also enabling the model to access current information not present in its original training data. For more information, see the ROCm-RAG documentation.
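For orientation only, the following toy sketch shows the retrieve-then-generate flow described above. It is not the ROCm-RAG API; the embed() and generate() helpers are hypothetical stand-ins for an embedding model and an LLM served on AMD GPUs.

```python
# Illustrative only: a toy retrieval-augmented generation loop, not the ROCm-RAG API.
from collections import Counter
import math

documents = [
    "ROCm 7.0.2 adds support for Debian 13 and Oracle Linux 10.",
    "Multimedia Engine Reset is supported in AMD GPU Driver 30.10.2.",
]

def embed(text):
    # Hypothetical embedding: bag-of-words term frequencies stand in for a real model.
    return Counter(text.lower().split())

def similarity(a, b):
    common = set(a) & set(b)
    num = sum(a[t] * b[t] for t in common)
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def generate(prompt):
    # Hypothetical LLM call; in practice this would be served on AMD GPUs (for example, via vLLM).
    return f"[model response grounded in]: {prompt}"

query = "Which driver version adds Multimedia Engine Reset?"
q_vec = embed(query)
best = max(documents, key=lambda d: similarity(q_vec, embed(d)))   # retrieve the closest document
print(generate(f"Context: {best}\nQuestion: {query}"))             # generate with retrieved context
```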
gsplat support enabled for ROCm
Gaussian splatting (gsplat) is an open-source library for GPU-accelerated differentiable rasterization of 3D Gaussians with Python bindings. This ROCm-enabled release of gsplat is built on top of PyTorch for ROCm, enabling innovators in computer graphics, machine learning, and 3D vision to leverage GPU acceleration with AMD Instinct GPUs. With gsplat, you can build, research, and innovate with Gaussian splatting. To install gsplat on ROCm, see installation instructions.
Introducing ROCm Life Science (ROCm-LS) toolkit
The ROCm Life Science (ROCm-LS) toolkit is an open-source software collection for high-performance life science and healthcare applications built on the core ROCm platform. It helps you accelerate life science processing and analysis workloads on AMD GPUs. ROCm-LS is in an early access state. Running production workloads is not recommended. For more information, see the AMD ROCm-LS documentation.
ROCm-LS provides the following tools to build a complete workflow for life science acceleration on AMD GPUs:
- The hipCIM library provides powerful support for GPU-accelerated I/O operations, coupled with an array of computer vision and image processing primitives designed for N-dimensional image data in fields such as biomedical imaging. For more information, see the hipCIM documentation.
- MONAI for AMD ROCm, a ROCm-enabled version of MONAI, is built on top of PyTorch for AMD ROCm, helping healthcare and life science innovators to leverage GPU acceleration with AMD Instinct GPUs for high-performance inference and training of medical AI applications. For more information, see the MONAI for AMD ROCm documentation.
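As a rough illustration of the workflow (not taken from the ROCm-LS documentation), the following sketch assumes MONAI for AMD ROCm and PyTorch for ROCm are installed; the random volume and network configuration are placeholders.

```python
# A rough MONAI-style sketch; a random volume stands in for a loaded medical image.
# In practice, LoadImage / EnsureChannelFirst transforms would read NIfTI or DICOM data.
import torch
from monai.networks.nets import UNet
from monai.transforms import Compose, ScaleIntensity, EnsureType

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")  # HIP device on ROCm

preprocess = Compose([
    ScaleIntensity(),   # normalize intensities to [0, 1]
    EnsureType(),       # ensure a torch tensor comes out of the pipeline
])

model = UNet(spatial_dims=3, in_channels=1, out_channels=2,
             channels=(16, 32, 64), strides=(2, 2)).to(device).eval()

volume = torch.rand(1, 1, 64, 64, 64)   # stand-in for a loaded CT/MR volume (N, C, D, H, W)
with torch.no_grad():
    prediction = model(preprocess(volume).to(device))
print(prediction.shape)  # torch.Size([1, 2, 64, 64, 64])
```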
Deep learning and AI framework updates
ROCm provides a comprehensive ecosystem for deep learning development. For more information, see Deep learning frameworks for ROCm and the Compatibility matrix for the complete list of Deep learning and AI framework versions tested for compatibility with ROCm.
Updated framework support
ROCm 7.0.0 introduces several newly supported versions of Deep learning and AI frameworks:
PyTorch
ROCm 7.0.2 enables support for PyTorch 2.8.
New frameworks
AMD ROCm has officially added support for the following Deep learning and AI frameworks:
- FlashInfer is a library and kernel generator for Large Language Models (LLMs) that provides a high-perf...
ROCm 7.0.1 Release
ROCm 7.0.1 is a quality release that resolves the issue listed in the Release highlights.
Release highlights
The following issue has been resolved in the AMD GPU Driver (amdgpu) 30.10.1 to be used with ROCm 7.0.1.
Failure to declare out-of-band CPERs for bad memory page
The issue of failing to declare Out-Of-Band Common Platform Error Records (CPERs) when exceeding bad memory page threshold has been resolved. The fix applies to all AMD Instinct MI300 Series and MI350 Series GPUs.
User space, driver, and firmware dependent changes
The software for AMD Datacenter GPU products requires maintaining a hardware
and software stack with interdependencies between the GPU and baseboard
firmware, AMD GPU drivers, and the ROCm user space software.
| ROCm version | Device | GPU and baseboard firmware | AMD GPU Driver (amdgpu) | GIM driver (SR-IOV) |
|---|---|---|---|---|
| ROCm 7.0.1 | MI355X | 01.25.13.09 (or later), 01.25.11.02 | 30.10.1, 30.10 | 8.4.0.K |
| | MI350X | 01.25.13.09 (or later), 01.25.11.02 | 30.10.1, 30.10 | |
| | MI325X | 01.25.04.02 (or later), 01.25.03.03 | 30.10.1, 30.10, 6.4.z where z (0-3), 6.3.y where y (1-3) | |
| | MI300X | 01.25.03.12 (or later), 01.25.02.04 | 30.10.1, 30.10, 6.4.z where z (0-3), 6.3.y where y (0-3), 6.2.x where x (1-4) | 8.4.0.K |
| | MI300A | 26 (or later) | | Not Applicable |
| | MI250X | IFWI 47 (or later) | | |
| | MI250 | MU5 w/ IFWI 75 (or later) | | |
| | MI210 | MU5 w/ IFWI 75 (or later) | | 8.4.0.K |
| | MI100 | VBIOS D3430401-037 | | Not Applicable |
Note
ROCm 7.0.1 doesn't include any other significant changes or feature additions. For comprehensive changes, new features, and enhancements in ROCm 7.0, refer to the ROCm 7.0.0 release notes.
ROCm 7.0.0 Release
The release notes provide a summary of notable changes since the previous ROCm release.
Note
If you’re using AMD Radeon™ PRO or Radeon GPUs in a workstation setting with a display connected, see the Use ROCm on Radeon GPUs documentation to verify compatibility and system requirements.
Release highlights
The following are notable new features and improvements in ROCm 7.0.0. For changes to individual components, see
Detailed component changes.
Operating system, hardware, and virtualization support changes
ROCm 7.0.0 adds support for AMD Instinct MI355X and MI350X. For details, see the full list of Supported GPUs (Linux).
ROCm 7.0.0 adds support for the following operating systems and kernel versions:
- Ubuntu 24.04.3 (kernel: 6.8 [GA], 6.14 [HWE])
- Rocky Linux 9 (kernel: 5.14.0-570)
ROCm 7.0.0 marks the end of support (EoS) for Ubuntu 24.04.2 (kernel: 6.8 [GA], 6.11 [HWE]) and SLES 15 SP6.
For more information about supported operating systems, see Supported operating systems and install instructions.
See the Compatibility matrix for more information about operating system and hardware compatibility.
Virtualization support
ROCm 7.0.0 introduces support for KVM Passthrough for AMD Instinct MI350X and MI355X GPUs.
All KVM-based SR-IOV supported configurations require the GIM SR-IOV driver version 8.4.0.K. Refer to GIM Release note for more details. In addition, support for VMware ESXi 8 has been introduced for AMD Instinct MI300X GPUs. For more information, see Virtualization Support.
Deep learning and AI framework updates
ROCm provides a comprehensive ecosystem for deep learning development. For more information, see Deep learning frameworks for ROCm and the Compatibility matrix for the complete list of Deep learning and AI framework versions tested for compatibility with ROCm.
Updated framework support
ROCm 7.0.0 introduces several newly supported versions of Deep learning and AI frameworks:
PyTorch
ROCm 7.0.0 enables the following PyTorch features:
- Support for PyTorch 2.7.
- Integrated Fused Rope kernels in APEX.
- Compilation of Python C++ extensions using amdclang++.
- Support for channels-last NHWC format for convolutions via MIOpen.
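The channels-last support can be exercised with standard PyTorch APIs. The following minimal sketch (assuming a PyTorch-for-ROCm build) converts a convolution and its input to the NHWC memory format; on ROCm the convolution is dispatched through MIOpen:

```python
# Hedged illustration of the channels-last (NHWC) memory format with a convolution.
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

conv = nn.Conv2d(3, 16, kernel_size=3, padding=1).to(device)
conv = conv.to(memory_format=torch.channels_last)            # NHWC weight layout

x = torch.randn(8, 3, 224, 224, device=device)
x = x.to(memory_format=torch.channels_last)                   # NHWC activation layout

y = conv(x)
print(y.is_contiguous(memory_format=torch.channels_last))     # True
```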
JAX
ROCm 7.0.0 enables support for JAX 0.6.0.
Megatron-LM
Megatron-LM for ROCm now supports:
- Fused Gradient Accumulation via APEX.
- Fused Rope Kernel in APEX.
- Fused_bias_swiglu kernel.
TensorFlow
ROCm 7.0.0 enables support for TensorFlow 2.19.1.
ONNX Runtime
ROCm 7.0.0 enables support for ONNX Runtime 1.22.0.
vLLM
- Support for Open Compute Project (OCP) FP8 data type.
- FP4 precision for Llama 3.1 405B.
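A minimal sketch of requesting FP8 weight quantization through vLLM's standard options follows; the model name is a placeholder, and availability of the fp8 path depends on the installed vLLM build and GPU:

```python
# Hedged sketch: request OCP FP8 quantization via vLLM's quantization option.
# The model identifier is a placeholder; substitute a model you have access to.
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct", quantization="fp8")
params = SamplingParams(temperature=0.0, max_tokens=64)

outputs = llm.generate(["Summarize the OCP FP8 formats in one sentence."], params)
print(outputs[0].outputs[0].text)
```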
Triton
ROCm 7.0.0 enables support for Triton 3.3.0.
New frameworks
AMD ROCm has officially added support for the following Deep learning and AI frameworks:
- Ray is a unified framework for scaling AI and Python applications from your laptop to a full cluster, without changing your code. Ray consists of a core distributed runtime and a set of AI libraries for simplifying machine learning computations. It is currently supported on ROCm 6.4.1. For more information, see Ray compatibility. A minimal task example follows this list.
- llama.cpp is an open-source framework for Large Language Model (LLM) inference that runs on both central processing units (CPUs) and graphics processing units (GPUs). It is written in plain C/C++, providing a simple, dependency-free setup. It is currently supported on ROCm 6.4.0. For more information, see llama.cpp compatibility.
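The following minimal Ray core sketch is generic (not specific to ROCm); adding num_gpus=1 to ray.remote reserves an AMD GPU per task on a ROCm node:

```python
# A minimal Ray core sketch: parallel tasks scheduled by the Ray runtime.
# Add num_gpus=1 to ray.remote to reserve a GPU per task on a GPU-equipped node.
import ray

ray.init()  # start (or connect to) a local Ray runtime

@ray.remote
def square(x):
    return x * x

futures = [square.remote(i) for i in range(4)]
print(ray.get(futures))  # [0, 1, 4, 9]

ray.shutdown()
```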
AMD GPU Driver/ROCm packaging separation
The AMD GPU Driver (amdgpu) is now distributed separately from the ROCm software stack and is stored in its own location, /amdgpu/, in the package repository at repo.radeon.com. The first release is designated as AMD GPU Driver (amdgpu) version 30.10. See the User and kernel-space support matrix for more information.
AMD SMI continues to stay with the ROCm software stack under the ROCm organization repository.
Consolidation of ROCm library repositories
The following ROCm library repositories are migrating from multiple repositories under {fab}github ROCm to a single repository, {fab}github rocm-libraries, in the ROCm GitHub organization: hipBLAS, hipBLASLt, hipCUB, hipFFT, hipRAND, hipSPARSE, hipSPARSELt, MIOpen, rocBLAS, rocFFT, rocPRIM, rocRAND, rocSPARSE, rocThrust, and Tensile.
Use the new ROCm Libraries repository to access source code, clone projects, and contribute to the code base and documentation. The change helps to streamline development, CI, and integration. For more information about working with the ROCm Libraries repository, see Contributing to the ROCm Libraries in GitHub.
Other ROCm libraries, along with ROCm tools, are also in the process of migrating to {fab}github rocm-systems. For the latest status information, see the README file. The official completion of the migration will be communicated in a future ROCm release.
HIP API compatibility improvements
To improve code portability between AMD ROCm and other programming models, HIP API has been updated in ROCm 7.0.0 to simplify cross-platform programming. These changes are incompatible with prior ROCm releases and might require recompiling existing HIP applications for use with ROCm 7.0.0. For more information, see the HIP API 7.0.0 changes and the HIP changelog below.
HIP runtime updates
The HIP runtime now includes support for:
- Open Compute Project (OCP) MX floating-point FP4, FP6, and FP8 data types and APIs.
- Improved logging by adding more precise pointer information and launch arguments for better tracking and debugging in dispatch met...
ROCm 6.4.3 Release
The release notes provide a summary of notable changes since the previous ROCm release.
Release highlights
ROCm 6.4.3 is a quality release that resolves the following issues. For changes to individual components, see Detailed component changes.
AMDGPU driver updates
- Resolved an issue causing performance degradation in communication operations, caused by increased latency in certain RCCL applications. The fix prevents unnecessary queue eviction during the fork process.
- Fixed an issue in the AMDGPU driver’s scheduler constraints that could cause queue preemption to fail during workload execution.
ROCm SMI update
- Fixed the failure to load GPU data like System Clock (SCLK) by adjusting the logic for retrieving GPU board voltage.
ROCm documentation updates
ROCm documentation continues to be updated to provide clearer and more comprehensive guidance for a wider variety of user needs and use cases.
- Tutorials for AI developers have been expanded with the following five new tutorials:
- Inference tutorials
- GPU development and optimization tutorial: MLA decoding kernel of AITER library
For more information about the changes, see Changelog for the AI Developer Hub.
- ROCm provides a comprehensive ecosystem for deep learning development. For more details, see Deep learning frameworks for ROCm. AMD ROCm adds support for the following deep learning frameworks:
- Taichi is an open-source, imperative, and parallel programming language designed for high-performance numerical computation. Embedded in Python, it leverages just-in-time (JIT) compilation frameworks such as LLVM to accelerate compute-intensive Python code by compiling it to native GPU or CPU instructions. It is currently supported on ROCm 6.3.2. For more information, see Taichi compatibility.
- Megablocks is a light-weight library for mixture-of-experts (MoE) training. The core of the system is efficient "dropless-MoE" and standard MoE layers. Megablocks is integrated with Megatron-LM, where data and pipeline parallel training of MoEs is supported. It is currently supported on ROCm 6.3.0. For more information, see Megablocks compatibility.
- The Data types and precision support topic now includes new hardware and library support information.
Operating system and hardware support changes
Operating system and hardware support remain unchanged in this release.
See the Compatibility matrix for more information about operating system and hardware compatibility.
ROCm components
The following table lists the versions of ROCm components for ROCm 6.4.3.
Click {fab}github to go to the component's source code on GitHub.
ROCm 6.4.2 Release
The release notes provide a summary of notable changes since the previous ROCm release.
Release highlights
The following are notable new features and improvements in ROCm 6.4.2. For changes to individual components, see
Detailed component changes.
ROCm Compute Profiler enhancements
ROCm Compute Profiler includes the following changes:
- The --roofline-data-type option now supports FP8, FP16, BF16, FP32, FP64, I8, I32, and I64 data types. This is dependent on the GPU architecture. For more information, see Roofline options.
- ROCm Compute Profiler now uses AMD SMI instead of ROCm SMI. The AMD System Management Interface Library (AMD SMI) is a successor to ROCm SMI. It is a unified system management interface tool that provides a user-space interface for applications to monitor and control GPU applications and gives users the ability to query information about drivers and GPUs on the system. For more information, see https://github.com/ROCm/amdsmi and the AMD SMI documentation. A minimal query sketch follows this list.
- ROCm Compute Profiler has added 8-bit floating point (FP8) metrics support for AMD Instinct MI300 series accelerators. For more information, see System Speed-of-Light.
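The sketch below outlines a basic device query with the AMD SMI Python bindings; the function names follow the amdsmi package, but the exact fields returned vary by release, so treat this as an assumption-laden outline rather than a reference:

```python
# Hedged sketch: enumerate GPUs through the AMD SMI Python bindings.
import amdsmi

amdsmi.amdsmi_init()
try:
    for handle in amdsmi.amdsmi_get_processor_handles():
        # ASIC info typically includes the market name and device identifiers.
        print(amdsmi.amdsmi_get_gpu_asic_info(handle))
finally:
    amdsmi.amdsmi_shut_down()
```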
rocSOLVER enhancements
rocSOLVER has improved the performance of eigensolvers and singular value decomposition (SVD). For more information, see rocSOLVER documentation.
ROCm Offline Installer Creator updates
The ROCm Offline Installer Creator 6.4.2 includes the following features and improvements:
- Added support for Oracle Linux 8.10 and 9.6, and SLES 15 SP7.
- Additional package options for the Offline Installer Creator, including amd-smi, rocdecode, rocjpeg, and rdc.
- ROCm meta packages are now used for selecting ROCm components and use cases.
- Improved separation of kernel/driver and ROCm prerequisite packages to reduce the size of ROCm-only or driver-only offline installers.
In addition, the option to build an offline installer based on ROCm version 5.7.3 has been removed. To build an offline installer for ROCm 5.7.3, use the Offline Installer Creator from version 6.4.1 or earlier. See ROCm Offline Installer Creator for more information.
ROCm Runfile Installer updates
The ROCm Runfile Installer 6.4.2 adds support for Oracle Linux 8.10 and 9.6 (using the RHEL 8 or 9 .run files), Debian 12 (using the Ubuntu 22.04 .run file), and SLES 15 SP7. It also fixes permission settings issues during ROCm and AMDGPU driver installation. For more information, see ROCm Runfile Installer.
ROCm documentation updates
ROCm documentation continues to be updated to provide clearer and more comprehensive guidance for a wider variety of user needs and use cases.
- Tutorials for AI developers have been expanded with the following four new tutorials:
- Inference tutorial: AI agent with MCPs using vLLM and PydanticAI
- GPU development and optimization tutorials:
For more information about the changes, see Changelog for the AI Developer Hub.
- ROCm provides a comprehensive ecosystem for deep learning development. For more details, see Deep learning frameworks for ROCm. As of July 2025, AMD ROCm provides support for the following additional deep learning frameworks:
- Deep Graph Library is an easy-to-use, high-performance, and scalable Python package for deep learning on graphs. DGL is framework agnostic, meaning if a deep graph model is a component in an end-to-end application, the rest of the logic is implemented using PyTorch. It is currently supported on ROCm 6.4.0. For more information, see DGL compatibility.
- Stanford Megatron-LM is a large-scale language model training framework. It’s designed to train massive transformer-based language models efficiently by model and data parallelism. It is currently supported on ROCm 6.3.0. For more information, see Stanford Megatron-LM compatibility.
- Volcano Engine Reinforcement Learning for LLMs (verl) is a reinforcement learning framework designed for large language models (LLMs). verl offers a scalable, open-source fine-tuning solution optimized for AMD Instinct GPUs with full ROCm support. It is currently supported on ROCm 6.2.0. For more information, see verl compatibility.
- Documentation for the new ROCprof Compute Viewer was added in May 2025. This tool is used to visualize and analyze GPU thread trace data collected using rocprofv3. Note that ROCprof Compute Viewer is in an early access state. Running production workloads is not recommended.
- The AMDGPU installer documentation has been removed to encourage the use of the package manager for ROCm installation. While the package manager is the recommended method, you can still install ROCm using the AMDGPU installer by following the legacy process. Ensure you update the command with the intended ROCm version before running it. For more information, see Installation via native package manager.
Operating system and hardware support changes
ROCm 6.4.2 adds support for SLES 15 SP7. For more information, see SLES installation.
ROCm 6.4.2 marks the end of support (EoS) for RHEL 9.5.
ROCm 6.4.2 adds support for the RDNA3 architecture-based Radeon RX 7700 XT GPU. This GPU is supported on Ubuntu 24.04.2 and RHEL 9.6.
For details, see the full list of Supported GPUs (Linux).
See the Compatibility matrix for more information about operating system and hardware compatibility.
ROCm components
The following table lists the versions of ROCm components for ROCm 6.4.2, including any version
changes from 6.4.1 to 6.4.2. Click the component's updated version to go to a list of its changes.
Click {fab}github to go to the component's source code on GitHub.
ROCm 6.4.1 Release
The release notes provide a summary of notable changes since the previous ROCm release.
Release highlights
The following are notable new features and improvements in ROCm 6.4.1. For changes to individual components, see
Detailed component changes.
Addition of DPX partition mode under NPS2 memory mode
AMD Instinct MI300X now supports DPX partition mode under NPS2 memory mode. For more partitioning information, see the Deep dive into the MI300 compute and memory partition modes blog and AMD Instinct MI300X system optimization.
Introducing the ROCm Data Science toolkit
The ROCm Data Science toolkit (or ROCm-DS) is an open-source software collection for high-performance data science applications built on the core ROCm platform. You can leverage ROCm-DS to accelerate both new and existing data science workloads, allowing you to execute intensive applications with larger datasets at lightning speed. ROCm-DS is in an early access state. Running production workloads is not recommended. For more information, see AMD ROCm-DS Documentation.
ROCm Offline Installer Creator updates
The ROCm Offline Installer Creator 6.4.1 now allows you to use the SPACEBAR or ENTER keys for menu item selection in the GUI. It also adds support for Debian 12 and fixes an issue for “full” mode RHEL offline installer creation, where GDM packages were uninstalled during offline installation. See ROCm Offline Installer Creator for more information.
ROCm Runfile Installer updates
The ROCm Runfile Installer 6.4.1 adds the following improvements:
- Relaxed version checks for installation on different distributions. Provided the dependencies are not installed by the Runfile Installer, you can target installation for a different path from the host system running the installer. For example, the installer can run on a system using Ubuntu 22.04 and install to a partition/system that is using Ubuntu 24.04.
- Performance improvements for detecting a previous ROCm install.
- Removal of the extra opt directory created for the target during the ROCm installation. For example, installing to target=/home/amd now installs ROCm to /home/amd/rocm-6.4.1 and not /home/amd/opt/rocm-6.4.1. For installs using target=/, the installation will continue to use /opt/.
- The Runfile Installer can be used to uninstall any Runfile-based installation of the driver.
- In the CLI interface, the postrocm argument can now be run separately from the rocm argument. In cases where postrocm was missed from the initial ROCm install, postrocm can now be run on the same target folder. For example, if you installed ROCm 6.4.1 using install.run target=/myrocm rocm, you can run the post-installation separately using the command install.run target=/myrocm/rocm-6.4.1 postrocm.
For more information, see ROCm Runfile Installer.
ROCm documentation updates
ROCm documentation continues to be updated to provide clearer and more comprehensive guidance for a wider variety of user needs and use cases.
- Tutorials for AI developers have been expanded with five new tutorials. These tutorials are Jupyter notebook-based, easy-to-follow documents. They are ideal for AI developers who want to learn about specific topics, including inference, fine-tuning, and training. For more information about the changes, see Changelog for the AI Developer Hub.
- The Training a model with LLM Foundry performance testing guide has been added. This guide describes how to use the preconfigured ROCm/pytorch-training training environment and https://github.com/ROCm/MAD to test the training performance of the LLM Foundry framework on AMD Instinct MI325X and MI300X accelerators using the MPT-30B model.
- The Training a model with PyTorch performance testing guide has been updated to feature the latest ROCm/pytorch-training Docker image (a preconfigured training environment with ROCm and PyTorch). Support for Llama 3.3 70B has been added.
- The Training a model with JAX MaxText performance testing guide has been updated to feature the latest ROCm/jax-training Docker image (a preconfigured training environment with ROCm, JAX, and MaxText). Support for Llama 3.3 70B has been added.
- The vLLM inference performance testing guide has been updated to feature the latest ROCm/vLLM Docker image (a preconfigured environment for inference with ROCm and vLLM). Support for the QwQ-32B model has been added.
- The PyTorch inference performance testing guide has been added, featuring the ROCm/PyTorch Docker image (a preconfigured inference environment with ROCm and PyTorch) with initial support for the CLIP and Chai-1 models.
Operating system and hardware support changes
ROCm 6.4.1 introduces support for the RDNA4 architecture-based Radeon AI PRO R9700, Radeon RX 9070 XT, and Radeon RX 9060 XT GPUs for compute workloads. Currently, these GPUs are only supported on Ubuntu 24.04.2, Ubuntu 22.04.5, RHEL 9.5, and RHEL 9.4.
For details, see the full list of Supported GPUs (Linux).
Operating system support remains unchanged in this release.
See the Compatibility matrix for more information about operating system and hardware compatibility.
ROCm components
The following table lists the versions of ROCm components for ROCm 6.4.1, including any version
changes from 6.4.0 to 6.4.1. Click the component's updated version to go to a list of its changes.
Click {fab}github to go to the component's source code on GitHub.
ROCm 6.4.0 Release
The release notes provide a summary of notable changes since the previous ROCm release.
Release highlights
The following are notable new features and improvements in ROCm 6.4.0. For changes to individual components, see
Detailed component changes.
New kernel support added in Megatron-LM framework for ROCm
The Megatron-LM framework for ROCm is a specialized fork of the robust Megatron-LM, designed to enable efficient training of large-scale language models on AMD GPUs. The Megatron-LM fork adds support for the following fused kernels:
- Fused Attention (Fused QKV)
- Fused Layer Norm
- Fused ROPE
See Training a model with Megatron-LM for ROCm for more information.
CPX mode with NPS4 memory mode supported
On AMD Instinct™ MI300X systems, you can now use Core Partitioned X-celerator (CPX) mode in combination with the Non-Uniform Memory Access (NUMA) Per Socket (NPS4) memory mode. This partition mode configuration can be applied to a Single Root IO Virtualization (SR-IOV) host or a bare metal environment. This feature enables better performance with small language models (13B parameters or less) that can fit within one CPX GPU.
To learn how to switch to CPX and NPS4 modes, see Change GPU partition modes in the Instinct documentation.
To learn how CPX and NPS4 partition modes can benefit RCCL performance on MI300X systems, see RCCL usage tips.
Kernel-mode GPU Driver (KMD) and user space software compatibility improved
ROCm 6.4.0 has been tested to allow you to choose a combination of the AMD Kernel-mode GPU Driver (KMD) and ROCm user space software from ROCm releases up to a year apart (assuming hardware support is available in both). This compatibility has been tested in the backward direction for ROCm 6.4.0, and it will be tested in the forward direction for every new driver release for a year from the ROCm 6.4.0 release (for example, older user space with a newer KMD, and vice versa).
Separation of user space and driver space components documentation
As of ROCm 6.4.0, the driver space components documentation has moved from AMD ROCm documentation to its own documentation site, AMD Instinct Data Center GPU Driver. The goal is to make the software for AMD Instinct GPUs more modular. This makes it easier to understand the supported installation combinations of the Instinct driver and the multiple supported ROCm user space versions.
Information about the variant of the amdgpu driver built for Instinct GPUs is available on AMD Instinct Data Center GPU Driver. See ROCm/ROCK-Kernel-Driver GitHub repository for source code, which is planned to be renamed to instinct-driver in a future ROCm release. For ROCm 6.4.0, the versioning scheme for the Instinct driver is parallel to the ROCm versioning; that is, 6.4.0. In future ROCm releases, the Instinct driver version is planned to be separate from the ROCm versioning.
Separating the major software components improves the upgrade experience by:
- Allowing you to upgrade your Instinct driver independently of ROCm user space, or vice versa.
- Having bug fixes released independently in either the Instinct driver or ROCm user space.
PyTorch 2.6 and 2.5 support added
ROCm 6.4.0 adds support for PyTorch 2.6 and 2.5. See the Compatibility matrix for the complete list of PyTorch versions tested for compatibility with ROCm. See Installing deep learning frameworks for ROCm for more information about supported deep learning frameworks.
VP9 support added to rocDecode and rocPyDecode
VP9 support has been added to rocDecode and rocPyDecode, extending codec coverage to decoding of VP9-encoded content.
Bitstream reader support added to rocDecode
The new bitstream reader feature has been added to rocDecode. It contains built-in stream file parsers, including an elementary stream file parser and an IVF container file parser. It enables decoding without the requirement for FFmpeg demuxer. The reader can parse AVC, HEVC, and AV1 elementary stream files, and AV1 IVF container files. See Using the rocDecode bitstream reader APIs for more information.
DLPack support added to rocAL
rocAL now supports DLPack, allowing rocAL GPU tensors to be exchanged with PyTorch. This enables faster data processing by leveraging DLPack tensors and improves GPU-based workload performance. For more details, see the DLPack GitHub reference documentation.
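The hand-off follows the standard DLPack protocol. In the hedged sketch below, a plain PyTorch tensor stands in for the rocAL GPU tensor purely for illustration; a real rocAL output would be exported the same way:

```python
# Hedged sketch of the DLPack hand-off pattern: a producer exports a tensor as a
# DLPack capsule and PyTorch wraps it zero-copy.
import torch
from torch.utils import dlpack

producer_tensor = torch.arange(12, dtype=torch.float32).reshape(3, 4)  # stand-in for a rocAL tensor
capsule = dlpack.to_dlpack(producer_tensor)   # export via the DLPack protocol
consumer = dlpack.from_dlpack(capsule)        # zero-copy import on the PyTorch side

consumer[0, 0] = 42.0
print(producer_tensor[0, 0].item())  # 42.0 -- both names reference the same memory
```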
ROCm Compute Profiler updates
ROCm Compute Profiler now supports:
- ROCprofiler-SDK (rocprofv3)
- Experimental multi-node profiling support.
- Roofline plot for 64-bit floating point (FP64) and 32-bit floating point (FP32) data types.
ROCm Systems Profiler updates
ROCm Systems Profiler now supports:
- Network performance profiling for standard Network Interface Cards (NICs).
- OpenMP offload of kernel activity.
- Device tracing of OpenMP (C/C++).
- AMD Video Core Next (VCN) engine activity and video decode API tracing.
rocWMMA updates
rocWMMA library has been enhanced with:
- Infrastructure to support interleaved wave-tiles for better General Matrix Multiplication (GEMM) performance.
- Binary sizes can now be reduced on supported compilers by using the --offload-compress compiler flag.
- An emulation test suite has been added for reduced scope smoke tests.
hipTensor updates
hipTensor library has been enhanced with:
- New benchmarking and validation test suites were added for contractions, reductions, and permutations, which are driven with YAML configurations.
- Binary sizes can now be reduced on supported compilers by using the --offload-compress compiler flag.
- An emulation test suite has been added for reduced-scope smoke tests.
- Default strides are now calculated in column-major order (a short illustration follows this list).
- Permutation kernel selection has been optimized for improved performance.
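For clarity, the short illustration below works out the column-major default-stride convention for a small shape; it is a generic calculation, not hipTensor API code:

```python
# Column-major (Fortran-order) strides: the first dimension varies fastest, so for
# shape (2, 3, 4) the element strides are (1, 2, 6).
shape = (2, 3, 4)

strides = []
acc = 1
for extent in shape:
    strides.append(acc)
    acc *= extent

print(tuple(strides))  # (1, 2, 6)
```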
ROCm Data Center Tool (RDC) updates
- Additional new modules and metrics have been added to enhance the end-user experience by improving monitoring, management, and optimization of GPU resources, RDC components, communication, data transfer, and the overall system functionality, ensuring reduced downtime.
- Modules: RVS integration, Group policy management, Add version command, Multilevel Diagnostics Runs, Topology mapping, Conditions and Thresholds, Memory speed, Runtime health check.
- Metrics: Switches and Link Status, Memory bandwidth, Memory Usage, Utilization, MM Eng Enc/Dec throughput.
- Plugins for ROCprofiler-SDK (rocprofv3) have been upgraded.
ROCm Offline Installer Creator updates
The ROCm Offline Installer Creator 6.4.0 adds support for Oracle Linux 9 and uninstall support for RHEL, SLES, and Oracle Linux. See ROCm Offline Installer Creator for more information.
ROCm Runfile Installer updates
The ROCm Runfile Installer 6.4.0 adds improvements for dependency installation in an online-only environment and support for the following:
- Ubuntu 24.04, RHEL 8.10, 9.4, and 9.5, and SLES 15 SP6
- AMDGPU driver installation
- ROCm and AMDGPU driver uninstall
For more information, see ROCm Runfile Installer.
ROCm documentation updates
ROCm documentation continues to be updated to provide clearer and more comprehensive guidance for a wider variety of user needs and use cases.
- Tutorials for AI developers have been expanded with four new tutorials. These tutorials are Jupyter notebook-based, easy-to-follow documents. They are ideal for AI developers who want to learn about specific topics, including inference, fine-tuning, and training.
- The Training a model with PyTorch for ROCm performance testing guide has been updated to feature the latest [ROCm/pytorch-training](https://hub.docker.com/layers/rocm/pytorch-training/v25.4/images/sha256-fa98a9aa69968e654466c06f05aaa12730db79b48b113c1ab4f7a5...
ROCm 6.3.3 Release
The release notes provide a summary of notable changes since the previous ROCm release.
Release highlights
The following are notable new features and improvements in ROCm 6.3.3. For changes to individual components, see
Detailed component changes.
ROCm Offline Installer Creator updates
The ROCm Offline Installer Creator 6.3.3 adds a new Post-Install Options menu, which includes a new udev option for adding GPU resource access for all users. It also moves the user-specific GPU access option (for the video and render groups) from the Driver Options menu to the Post-Install Options menu. See the ROCm Offline Installer Creator documentation for more information.
ROCm documentation updates
ROCm documentation continues to be updated to provide clearer and more comprehensive guidance for a wider variety of user needs and use cases.
- Tutorials for AI developers have been added. These tutorials are Jupyter notebook-based, easy-to-follow documents. They are ideal for AI developers who want to learn about specific topics, including inference, fine-tuning, and training.
- The LLM inference performance validation guide for AMD Instinct MI300X now includes additional models for performance benchmarking. The accompanying ROCm vLLM Docker has been upgraded to ROCm 6.3.1.
- The HIP documentation has been updated with new resources for developers. To learn more about concurrency, parallelism, and stream management on devices and multiple GPUs, see Asynchronous concurrent execution.
- The following HIP documentation topics have been updated:
Operating system and hardware support changes
Operating system and hardware support remain unchanged in this release.
See the Compatibility matrix for more information about operating system and hardware compatibility.
ROCm components
The following table lists the versions of ROCm components for ROCm 6.3.3, including any version
changes from 6.3.2 to 6.3.3. Click the component's updated version to go to a list of its changes.
Click {fab}github to go to the component's source code on GitHub.
ROCm 6.3.2 Release
The release notes provide a summary of notable changes since the previous ROCm release.
Release highlights
The following are notable improvements in ROCm 6.3.2. For changes to individual components, see
Detailed component changes.
ROCm documentation updates
ROCm documentation continues to be updated to provide clearer and more comprehensive guidance for a wider variety of user needs and use cases.
- Documentation about ROCm compatibility with deep learning frameworks has been added. These topics outline ROCm-enabled features for each deep learning framework, key ROCm libraries that can influence the capabilities, validated Docker image tags, and features supported across the available ROCm and framework versions. For more information, see:
- The HIP C++ language extensions and Kernel language C++ support topics have been reorganized to make them easier to find and review. The topics have also been enhanced with new content.
Operating system and hardware support changes
ROCm 6.3.2 adds support for Azure Linux 3.0 (kernel: 6.6). Azure Linux is supported only on AMD Instinct accelerators. For more information, see Azure Linux installation.
See the Compatibility matrix for more information about operating system and hardware compatibility.
ROCm components
The following table lists the versions of ROCm components for ROCm 6.3.2, including any version
changes from 6.3.1 to 6.3.2. Click the component's updated version to go to a list of its changes.
Click {fab}github to go to the component's source code on GitHub.