The patch below does not apply to the 6.1-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to stable@vger.kernel.org.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x a3f3040657417aeadb9622c629d4a0c2693a0f93
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to 'stable@vger.kernel.org' --in-reply-to '2025063022-frail-ceremony-f06e@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a3f3040657417aeadb9622c629d4a0c2693a0f93 Mon Sep 17 00:00:00 2001
From: Avadhut Naik <avadhut.naik@amd.com>
Date: Thu, 29 May 2025 20:50:04 +0000
Subject: [PATCH] EDAC/amd64: Fix size calculation for Non-Power-of-Two DIMMs
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Each Chip-Select (CS) of a Unified Memory Controller (UMC) on AMD Zen-based SOCs has an Address Mask and a Secondary Address Mask register associated with it. The amd64_edac module logs DIMM sizes on a per-UMC per-CS granularity during init using these two registers.
Currently, the module primarily considers only the Address Mask register for computing DIMM sizes. The Secondary Address Mask register is only considered for odd CS. Additionally, if it has been considered, the Address Mask register is ignored altogether for that CS. For power-of-two DIMMs i.e. DIMMs whose total capacity is a power of two (32GB, 64GB, etc), this is not an issue since only the Address Mask register is used.
For non-power-of-two DIMMs i.e., DIMMs whose total capacity is not a power of two (48GB, 96GB, etc), however, the Secondary Address Mask register is used in conjunction with the Address Mask register. However, since the module only considers either of the two registers for a CS, the size computed by the module is incorrect. The Secondary Address Mask register is not considered for even CS, and the Address Mask register is not considered for odd CS.
Introduce a new helper function so that both Address Mask and Secondary Address Mask registers are considered, when valid, for computing DIMM sizes. Furthermore, also rename some variables for greater clarity.
Fixes: 81f5090db843 ("EDAC/amd64: Support asymmetric dual-rank DIMMs")
Closes: https://lore.kernel.org/dbec22b6-00f2-498b-b70d-ab6f8a5ec87e@natrix.lt
Reported-by: Žilvinas Žaltiena <zilvinas@natrix.lt>
Signed-off-by: Avadhut Naik <avadhut.naik@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Tested-by: Žilvinas Žaltiena <zilvinas@natrix.lt>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/20250529205013.403450-1-avadhut.naik@amd.com
diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index b681c0663203..07f1e9dc1ca7 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -1209,7 +1209,9 @@ static int umc_get_cs_mode(int dimm, u8 ctrl, struct amd64_pvt *pvt)
 	if (csrow_enabled(2 * dimm + 1, ctrl, pvt))
 		cs_mode |= CS_ODD_PRIMARY;
 
-	/* Asymmetric dual-rank DIMM support. */
+	if (csrow_sec_enabled(2 * dimm, ctrl, pvt))
+		cs_mode |= CS_EVEN_SECONDARY;
+
 	if (csrow_sec_enabled(2 * dimm + 1, ctrl, pvt))
 		cs_mode |= CS_ODD_SECONDARY;
 
@@ -1230,12 +1232,13 @@ static int umc_get_cs_mode(int dimm, u8 ctrl, struct amd64_pvt *pvt)
 	return cs_mode;
 }
 
-static int __addr_mask_to_cs_size(u32 addr_mask_orig, unsigned int cs_mode,
-				  int csrow_nr, int dimm)
+static int calculate_cs_size(u32 mask, unsigned int cs_mode)
 {
-	u32 msb, weight, num_zero_bits;
-	u32 addr_mask_deinterleaved;
-	int size = 0;
+	int msb, weight, num_zero_bits;
+	u32 deinterleaved_mask;
+
+	if (!mask)
+		return 0;
 
 	/*
 	 * The number of zero bits in the mask is equal to the number of bits
@@ -1248,19 +1251,30 @@ static int __addr_mask_to_cs_size(u32 addr_mask_orig, unsigned int cs_mode,
 	 * without swapping with the most significant bit. This can be handled
 	 * by keeping the MSB where it is and ignoring the single zero bit.
 	 */
-	msb = fls(addr_mask_orig) - 1;
-	weight = hweight_long(addr_mask_orig);
+	msb = fls(mask) - 1;
+	weight = hweight_long(mask);
 	num_zero_bits = msb - weight - !!(cs_mode & CS_3R_INTERLEAVE);
 
 	/* Take the number of zero bits off from the top of the mask. */
-	addr_mask_deinterleaved = GENMASK_ULL(msb - num_zero_bits, 1);
+	deinterleaved_mask = GENMASK(msb - num_zero_bits, 1);
+	edac_dbg(1, " Deinterleaved AddrMask: 0x%x\n", deinterleaved_mask);
+
+	return (deinterleaved_mask >> 2) + 1;
+}
+
+static int __addr_mask_to_cs_size(u32 addr_mask, u32 addr_mask_sec,
+				  unsigned int cs_mode, int csrow_nr, int dimm)
+{
+	int size;
 
 	edac_dbg(1, "CS%d DIMM%d AddrMasks:\n", csrow_nr, dimm);
-	edac_dbg(1, " Original AddrMask: 0x%x\n", addr_mask_orig);
-	edac_dbg(1, " Deinterleaved AddrMask: 0x%x\n", addr_mask_deinterleaved);
+	edac_dbg(1, " Primary AddrMask: 0x%x\n", addr_mask);
 
 	/* Register [31:1] = Address [39:9]. Size is in kBs here. */
-	size = (addr_mask_deinterleaved >> 2) + 1;
+	size = calculate_cs_size(addr_mask, cs_mode);
+
+	edac_dbg(1, " Secondary AddrMask: 0x%x\n", addr_mask_sec);
+	size += calculate_cs_size(addr_mask_sec, cs_mode);
 
 	/* Return size in MBs. */
 	return size >> 10;
@@ -1269,8 +1283,8 @@ static int __addr_mask_to_cs_size(u32 addr_mask_orig, unsigned int cs_mode,
 static int umc_addr_mask_to_cs_size(struct amd64_pvt *pvt, u8 umc,
 				    unsigned int cs_mode, int csrow_nr)
 {
+	u32 addr_mask = 0, addr_mask_sec = 0;
 	int cs_mask_nr = csrow_nr;
-	u32 addr_mask_orig;
 	int dimm, size = 0;
 
 	/* No Chip Selects are enabled. */
@@ -1308,13 +1322,13 @@ static int umc_addr_mask_to_cs_size(struct amd64_pvt *pvt, u8 umc,
 	if (!pvt->flags.zn_regs_v2)
 		cs_mask_nr >>= 1;
 
-	/* Asymmetric dual-rank DIMM support. */
-	if ((csrow_nr & 1) && (cs_mode & CS_ODD_SECONDARY))
-		addr_mask_orig = pvt->csels[umc].csmasks_sec[cs_mask_nr];
-	else
-		addr_mask_orig = pvt->csels[umc].csmasks[cs_mask_nr];
+	if (cs_mode & (CS_EVEN_PRIMARY | CS_ODD_PRIMARY))
+		addr_mask = pvt->csels[umc].csmasks[cs_mask_nr];
 
-	return __addr_mask_to_cs_size(addr_mask_orig, cs_mode, csrow_nr, dimm);
+	if (cs_mode & (CS_EVEN_SECONDARY | CS_ODD_SECONDARY))
+		addr_mask_sec = pvt->csels[umc].csmasks_sec[cs_mask_nr];
+
+	return __addr_mask_to_cs_size(addr_mask, addr_mask_sec, cs_mode, csrow_nr, dimm);
 }
 
 static void umc_debug_display_dimm_sizes(struct amd64_pvt *pvt, u8 ctrl)
@@ -3512,9 +3526,10 @@ static void gpu_get_err_info(struct mce *m, struct err_info *err)
 static int gpu_addr_mask_to_cs_size(struct amd64_pvt *pvt, u8 umc,
 				    unsigned int cs_mode, int csrow_nr)
 {
-	u32 addr_mask_orig = pvt->csels[umc].csmasks[csrow_nr];
+	u32 addr_mask = pvt->csels[umc].csmasks[csrow_nr];
+	u32 addr_mask_sec = pvt->csels[umc].csmasks_sec[csrow_nr];
 
-	return __addr_mask_to_cs_size(addr_mask_orig, cs_mode, csrow_nr, csrow_nr >> 1);
+	return __addr_mask_to_cs_size(addr_mask, addr_mask_sec, cs_mode, csrow_nr, csrow_nr >> 1);
 }
 
 static void gpu_debug_display_dimm_sizes(struct amd64_pvt *pvt, u8 ctrl)
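For readers who want to check the arithmetic outside the kernel, here is a minimal user-space sketch of the size calculation the patch introduces. It is not the driver code: fls32(), popcount32() and genmask32() are stand-ins written here for the kernel's fls(), hweight_long() and GENMASK(), the CS_3R_INTERLEAVE value is chosen only for illustration, and the mask values mentioned afterwards are hypothetical, picked so that they decode to 32 GB and 16 GB.

```c
#include <stdint.h>

/* Illustrative flag value only; the driver has its own definition. */
#define CS_3R_INTERLEAVE (1u << 4)

/* 1-based position of the highest set bit, 0 for x == 0 (stand-in for fls()). */
static int fls32(uint32_t x)
{
	int n = 0;

	while (x) {
		n++;
		x >>= 1;
	}
	return n;
}

/* Number of set bits (stand-in for hweight_long()). */
static int popcount32(uint32_t x)
{
	int n = 0;

	while (x) {
		n += (int)(x & 1u);
		x >>= 1;
	}
	return n;
}

/* Contiguous mask with bits [high:low] set; assumes 31 >= high >= low >= 0. */
static uint32_t genmask32(int high, int low)
{
	return (0xFFFFFFFFu >> (31 - high)) & (0xFFFFFFFFu << low);
}

/* Size in kB described by one address mask; an empty mask contributes 0. */
static int calculate_cs_size_kb(uint32_t mask, unsigned int cs_mode)
{
	int msb, weight, num_zero_bits;
	uint32_t deinterleaved_mask;

	if (!mask)
		return 0;

	msb = fls32(mask) - 1;
	weight = popcount32(mask);
	num_zero_bits = msb - weight - !!(cs_mode & CS_3R_INTERLEAVE);

	/* Drop the interleave-induced zero bits from the top of the mask. */
	deinterleaved_mask = genmask32(msb - num_zero_bits, 1);

	/* Register [31:1] = Address [39:9], so this is in kB. */
	return (int)(deinterleaved_mask >> 2) + 1;
}

/* Size in MB for one chip select: primary and secondary masks are summed. */
int cs_size_mb(uint32_t addr_mask, uint32_t addr_mask_sec, unsigned int cs_mode)
{
	int size = calculate_cs_size_kb(addr_mask, cs_mode);

	size += calculate_cs_size_kb(addr_mask_sec, cs_mode);
	return size >> 10;
}
```

With the hypothetical masks 0x07FFFFFE (primary) and 0x03FFFFFE (secondary) and no interleaving, cs_size_mb() gives 32768 MB for the primary alone, 16384 MB for the secondary alone, and 49152 MB for both together, which is the 48 GB case the commit message describes.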
Each Chip-Select (CS) of a Unified Memory Controller (UMC) on AMD Zen-based SOCs has an Address Mask and a Secondary Address Mask register associated with it. The amd64_edac module logs DIMM sizes on a per-UMC per-CS granularity during init using these two registers.
Currently, the module primarily considers only the Address Mask register for computing DIMM sizes. The Secondary Address Mask register is only considered for odd CS. Additionally, if it has been considered, the Address Mask register is ignored altogether for that CS. For power-of-two DIMMs i.e. DIMMs whose total capacity is a power of two (32GB, 64GB, etc), this is not an issue since only the Address Mask register is used.
For non-power-of-two DIMMs i.e., DIMMs whose total capacity is not a power of two (48GB, 96GB, etc), however, the Secondary Address Mask register is used in conjunction with the Address Mask register. However, since the module only considers either of the two registers for a CS, the size computed by the module is incorrect. The Secondary Address Mask register is not considered for even CS, and the Address Mask register is not considered for odd CS.
Introduce a new helper function so that both Address Mask and Secondary Address Mask registers are considered, when valid, for computing DIMM sizes. Furthermore, also rename some variables for greater clarity.
Fixes: 81f5090db843 ("EDAC/amd64: Support asymmetric dual-rank DIMMs")
Closes: https://lore.kernel.org/dbec22b6-00f2-498b-b70d-ab6f8a5ec87e@natrix.lt
Reported-by: Žilvinas Žaltiena <zilvinas@natrix.lt>
Signed-off-by: Avadhut Naik <avadhut.naik@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Tested-by: Žilvinas Žaltiena <zilvinas@natrix.lt>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/20250529205013.403450-1-avadhut.naik@amd.com
(cherry picked from commit a3f3040657417aeadb9622c629d4a0c2693a0f93)
Signed-off-by: Avadhut Naik <avadhut.naik@amd.com>
---
 drivers/edac/amd64_edac.c | 72 ++++++++++++++++++++++++---------------
 1 file changed, 44 insertions(+), 28 deletions(-)
diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index 2f854feeeb23..dd48e8a2d1cb 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -1371,7 +1371,9 @@ static int f17_get_cs_mode(int dimm, u8 ctrl, struct amd64_pvt *pvt)
 	if (csrow_enabled(2 * dimm + 1, ctrl, pvt))
 		cs_mode |= CS_ODD_PRIMARY;
 
-	/* Asymmetric dual-rank DIMM support. */
+	if (csrow_sec_enabled(2 * dimm, ctrl, pvt))
+		cs_mode |= CS_EVEN_SECONDARY;
+
 	if (csrow_sec_enabled(2 * dimm + 1, ctrl, pvt))
 		cs_mode |= CS_ODD_SECONDARY;
 
@@ -2191,11 +2193,41 @@ static int f16_dbam_to_chip_select(struct amd64_pvt *pvt, u8 dct,
 		return ddr3_cs_size(cs_mode, false);
 }
 
+static int calculate_cs_size(u32 mask, unsigned int cs_mode)
+{
+	int msb, weight, num_zero_bits;
+	u32 deinterleaved_mask;
+
+	if (!mask)
+		return 0;
+
+	/*
+	 * The number of zero bits in the mask is equal to the number of bits
+	 * in a full mask minus the number of bits in the current mask.
+	 *
+	 * The MSB is the number of bits in the full mask because BIT[0] is
+	 * always 0.
+	 *
+	 * In the special 3 Rank interleaving case, a single bit is flipped
+	 * without swapping with the most significant bit. This can be handled
+	 * by keeping the MSB where it is and ignoring the single zero bit.
+	 */
+
+	msb = fls(mask) - 1;
+	weight = hweight_long(mask);
+	num_zero_bits = msb - weight - !!(cs_mode & CS_3R_INTERLEAVE);
+
+	/* Take the number of zero bits off from the top of the mask. */
+	deinterleaved_mask = GENMASK_ULL(msb - num_zero_bits, 1);
+	edac_dbg(1, " Deinterleaved AddrMask: 0x%x\n", deinterleaved_mask);
+
+	return (deinterleaved_mask >> 2) + 1;
+}
+
 static int f17_addr_mask_to_cs_size(struct amd64_pvt *pvt, u8 umc,
 				    unsigned int cs_mode, int csrow_nr)
 {
-	u32 addr_mask_orig, addr_mask_deinterleaved;
-	u32 msb, weight, num_zero_bits;
+	u32 addr_mask = 0, addr_mask_sec = 0;
 	int cs_mask_nr = csrow_nr;
 	int dimm, size = 0;
 
@@ -2234,36 +2266,20 @@ static int f17_addr_mask_to_cs_size(struct amd64_pvt *pvt, u8 umc,
 	if (!fam_type->flags.zn_regs_v2)
 		cs_mask_nr >>= 1;
 
-	/* Asymmetric dual-rank DIMM support. */
-	if ((csrow_nr & 1) && (cs_mode & CS_ODD_SECONDARY))
-		addr_mask_orig = pvt->csels[umc].csmasks_sec[cs_mask_nr];
-	else
-		addr_mask_orig = pvt->csels[umc].csmasks[cs_mask_nr];
+	if (cs_mode & (CS_EVEN_PRIMARY | CS_ODD_PRIMARY))
+		addr_mask = pvt->csels[umc].csmasks[cs_mask_nr];
 
-	/*
-	 * The number of zero bits in the mask is equal to the number of bits
-	 * in a full mask minus the number of bits in the current mask.
-	 *
-	 * The MSB is the number of bits in the full mask because BIT[0] is
-	 * always 0.
-	 *
-	 * In the special 3 Rank interleaving case, a single bit is flipped
-	 * without swapping with the most significant bit. This can be handled
-	 * by keeping the MSB where it is and ignoring the single zero bit.
-	 */
-	msb = fls(addr_mask_orig) - 1;
-	weight = hweight_long(addr_mask_orig);
-	num_zero_bits = msb - weight - !!(cs_mode & CS_3R_INTERLEAVE);
-
-	/* Take the number of zero bits off from the top of the mask. */
-	addr_mask_deinterleaved = GENMASK_ULL(msb - num_zero_bits, 1);
+	if (cs_mode & (CS_EVEN_SECONDARY | CS_ODD_SECONDARY))
+		addr_mask_sec = pvt->csels[umc].csmasks_sec[cs_mask_nr];
 
 	edac_dbg(1, "CS%d DIMM%d AddrMasks:\n", csrow_nr, dimm);
-	edac_dbg(1, " Original AddrMask: 0x%x\n", addr_mask_orig);
-	edac_dbg(1, " Deinterleaved AddrMask: 0x%x\n", addr_mask_deinterleaved);
+	edac_dbg(1, " Primary AddrMask: 0x%x\n", addr_mask);
 
 	/* Register [31:1] = Address [39:9]. Size is in kBs here. */
-	size = (addr_mask_deinterleaved >> 2) + 1;
+	size = calculate_cs_size(addr_mask, cs_mode);
+
+	edac_dbg(1, " Secondary AddrMask: 0x%x\n", addr_mask_sec);
+	size += calculate_cs_size(addr_mask_sec, cs_mode);
 
 	/* Return size in MBs. */
 	return size >> 10;
[ Sasha's backport helper bot ]
Hi,
Summary of potential issues:
⚠️ Found matching upstream commit but patch is missing proper reference to it

Found matching upstream commit: a3f3040657417aeadb9622c629d4a0c2693a0f93

Status in newer kernel trees:
6.15.y | Present (different SHA1: 8971673d7c04)
6.12.y | Present (different SHA1: 302f2ef77d98)
6.6.y | Present (different SHA1: 653a158b2ec7)

Note: The patch differs from the upstream commit:
---
1:  a3f3040657417 < -:  ------------- EDAC/amd64: Fix size calculation for Non-Power-of-Two DIMMs
-:  ------------- > 1:  673c64c326b4b EDAC/amd64: Fix size calculation for Non-Power-of-Two DIMMs
---

Results of testing on various branches:

| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-6.1.y        | Success     | Success    |
On Tue, Jul 01, 2025 at 05:10:32PM +0000, Avadhut Naik wrote:
> (cherry picked from commit a3f3040657417aeadb9622c629d4a0c2693a0f93)
> Signed-off-by: Avadhut Naik <avadhut.naik@amd.com>
This was not a clean cherry-pick at all. Please document what you did differently from the original commit please.
thanks,
greg k-h
Hi,
On 7/2/2025 09:31, Greg KH wrote:
> This was not a clean cherry-pick at all. Please document what you did differently from the original commit please.
Yes, the cherry-pick was not clean, but the core logic of changes between the original commit and the cherry-picked commit remains the same.
The amd64_edac module has been reworked quite a lot in the last year or two. Support has also been introduced for new SOC families and models. This rework and support, predominantly undertaken through the below commits, is missing in 6.1 kernel.
9c42edd571aa EDAC/amd64: Add support for AMD heterogeneous Family 19h Model 30h-3Fh
ed623d55eef4 EDAC/amd64: Merge struct amd64_family_type into struct amd64_pvt
a2e59ab8e933 EDAC/amd64: Drop dbam_to_cs() for Family 17h and later
In this particular context, the original patch makes changes to umc_addr_mask_to_cs_size() and __addr_mask_to_cs_size() functions. These functions, however, are missing in 6.1. They were introduced in the module through commits a2e59ab8e933 and 9c42edd571aa. Instead, their functionality, in 6.1, has been squashed into a single function f17_addr_mask_to_cs_size(). Hence, the cherry-picked patch makes changes to f17_addr_mask_to_cs_size().
Additionally, gpu_addr_mask_to_cs_size() is missing in 6.1. It was introduced through commit 9c42edd571aa. Hence, the cherry-picked patch skips the changes made by the original patch to this function.
I also tested the cherry-picked patch on a Zen4 system with a 96GB (non-power-of-2) DIMM installed. Below is the snippet from dmesg:
Ubuntu24 default kernel:

[root avadnaik]# uname -r
6.8.0-62-generic
[root avadnaik]# dmesg | awk '/UMC7 chip selects:/ {print; getline; print; getline; print}'
[ 27.584535] EDAC MC: UMC7 chip selects:
[ 27.584537] EDAC amd64: MC: 0: 32768MB 1: 16384MB
[ 27.584539] EDAC amd64: MC: 2: 0MB 3: 0MB
[root avadnaik]#

6.1 kernel with cherry-picked commit incorporated:

[root avadnaik]# uname -r
6.1.142-edac-6.1-stable-24153-g431fa5011469
[root avadnaik]# dmesg | awk '/UMC7 chip selects:/ {print; getline; print; getline; print}'
[ 24.600370] EDAC MC: UMC7 chip selects:
[ 24.600371] EDAC amd64: MC: 0: 49152MB 1: 49152MB
[ 24.600373] EDAC amd64: MC: 2: 0MB 3: 0MB
[root avadnaik]#
Without the cherry-picked patch, the module outputs incorrect DIMM size information.
Please let me know if any further clarification is required from my end.
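The difference between the two dmesg snippets above can be reproduced with a small sketch of the mask selection before and after the fix. This is a simplified model rather than the driver code: it works directly on per-mask sizes in MB instead of raw register values, the flag values are chosen only for illustration, and it assumes (based on the unpatched output) that the primary mask decodes to 32768 MB and the secondary mask to 16384 MB.

```c
/* Flag values chosen for illustration; the driver defines its own. */
#define CS_EVEN_PRIMARY   (1u << 0)
#define CS_ODD_PRIMARY    (1u << 1)
#define CS_EVEN_SECONDARY (1u << 2)
#define CS_ODD_SECONDARY  (1u << 3)

/*
 * Selection before the fix: exactly one mask contributes. Odd chip
 * selects use only the secondary mask when CS_ODD_SECONDARY is set,
 * everything else uses only the primary mask.
 */
int old_cs_size_mb(int csrow_nr, unsigned int cs_mode,
		   int primary_mb, int secondary_mb)
{
	if ((csrow_nr & 1) && (cs_mode & CS_ODD_SECONDARY))
		return secondary_mb;

	return primary_mb;
}

/*
 * Selection after the fix: every mask that is valid for the DIMM
 * contributes, regardless of whether the chip select is even or odd.
 */
int new_cs_size_mb(unsigned int cs_mode, int primary_mb, int secondary_mb)
{
	int size = 0;

	if (cs_mode & (CS_EVEN_PRIMARY | CS_ODD_PRIMARY))
		size += primary_mb;

	if (cs_mode & (CS_EVEN_SECONDARY | CS_ODD_SECONDARY))
		size += secondary_mb;

	return size;
}
```

Under those assumptions, old_cs_size_mb() yields 32768 for chip select 0 and 16384 for chip select 1, while new_cs_size_mb() yields 49152 for both, matching the "0: 32768MB 1: 16384MB" and "0: 49152MB 1: 49152MB" lines respectively.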
On Wed, Jul 02, 2025 at 12:19:41PM -0500, Naik, Avadhut wrote:
> The amd64_edac module has been reworked quite a lot in the last year or two. Support has also been introduced for new SOC families and models. This rework and support, predominantly undertaken through the below commits, is missing in 6.1 kernel.
>
> 9c42edd571aa EDAC/amd64: Add support for AMD heterogeneous Family 19h Model 30h-3Fh
> ed623d55eef4 EDAC/amd64: Merge struct amd64_family_type into struct amd64_pvt
> a2e59ab8e933 EDAC/amd64: Drop dbam_to_cs() for Family 17h and later
Why not take these as prerequisite changes? Taking changes that are radically different from what is upstream is almost always wrong, it makes future backports impossible, and usually is buggy.
And if you do make radical changes, like you did here, you must document it in the patch notes itself, like others do. Don't attempt to pass it off as a "cherry-pick" when it was not.
thanks,
greg k-h
On 7/3/2025 00:28, Greg KH wrote:
> Why not take these as prerequisite changes? Taking changes that are radically different from what is upstream is almost always wrong, it makes future backports impossible, and usually is buggy.
Just to ensure that I have understood correctly, are you suggesting that we backport the above three commits to 6.1 too?
> And if you do make radical changes, like you did here, you must document it in the patch notes itself, like others do. Don't attempt to pass it off as a "cherry-pick" when it was not.

Apologies! I wasn't aware of this. I just noticed that conflicts encountered during backporting are documented in the commit messages themselves. Will do the same going forward!
On Mon, Jul 07, 2025 at 02:00:24AM -0500, Naik, Avadhut wrote:
> Just to ensure that I have understood correctly, are you suggesting that we backport the above three commits to 6.1 too?
Yes, why not?
On 7/7/2025 02:44, Greg KH wrote:
> > Just to ensure that I have understood correctly, are you suggesting that we backport the above three commits to 6.1 too?
>
> Yes, why not?
I just mentioned the above commits because I think they modify the code in question for this backport. But these commits have been merged in as part of larger patchsets (links below):
9c42edd571aa: https://lore.kernel.org/all/20230515113537.1052146-5-muralimk@amd.com/
ed623d55eef4: https://lore.kernel.org/all/20230127170419.1824692-11-yazen.ghannam@amd.com/
a2e59ab8e933: https://lore.kernel.org/all/20230127170419.1824692-9-yazen.ghannam@amd.com/
Backporting these commits might require us to backport these entire sets to 6.1. Wasn't completely sure if this is the road we want to take. Hence, asked the question in my earlier mail.
On Mon, Jul 07, 2025 at 12:57:39PM -0500, Naik, Avadhut wrote:
> I just mentioned the above commits because I think they modify the code in question for this backport. But these commits have been merged in as part of larger patchsets (links below):
>
> 9c42edd571aa: https://lore.kernel.org/all/20230515113537.1052146-5-muralimk@amd.com/
> ed623d55eef4: https://lore.kernel.org/all/20230127170419.1824692-11-yazen.ghannam@amd.com/
> a2e59ab8e933: https://lore.kernel.org/all/20230127170419.1824692-9-yazen.ghannam@amd.com/
>
> Backporting these commits might require us to backport these entire sets to 6.1. Wasn't completely sure if this is the road we want to take. Hence, asked the question in my earlier mail.
Seems like this is getting too complicated and I don't see the gain from it.
I'd say you leave 6.1 be for now unless someone comes with a persuasive reason to backport this patch to it.
IMNSVHO.
Thx.
linux-stable-mirror@lists.linaro.org