From: Jedrzej Jagielski jedrzej.jagielski@intel.com
[ Upstream commit 08a1af326a80b88324acd73877db81ae927b1219 ]
Currently, during locating the CIVD section, the ixgbe driver loops over the OROM area and at each iteration reads only OROM-datastruct-size amount of data. This results in many small reads and is inefficient.
Optimize this by reading the entire OROM bank into memory once before entering the loop. This significantly reduces the probing time.
Without this patch probing time may exceed over 25s, whereas with this patch applied average time of probe is not greater than 5s.
without the patch: [14:12:22] ixgbe: Copyright (c) 1999-2016 Intel Corporation. [14:12:25] ixgbe 0000:21:00.0: Multiqueue Enabled: Rx Queue count = 63, Tx Queue count = 63 XDP Queue count = 0 [14:12:25] ixgbe 0000:21:00.0: 63.012 Gb/s available PCIe bandwidth (16.0 GT/s PCIe x4 link) [14:12:26] ixgbe 0000:21:00.0: MAC: 7, PHY: 27, PBA No: N55484-001 [14:12:26] ixgbe 0000:21:00.0: 20:3a:43:09:3a:12 [14:12:26] ixgbe 0000:21:00.0: Intel(R) 10 Gigabit Network Connection [14:12:50] ixgbe 0000:21:00.0 ens2f0np0: renamed from eth0
with the patch: [14:18:18] ixgbe: Copyright (c) 1999-2016 Intel Corporation. [14:18:19] ixgbe 0000:21:00.0: Multiqueue Enabled: Rx Queue count = 63, Tx Queue count = 63 XDP Queue count = 0 [14:18:19] ixgbe 0000:21:00.0: 63.012 Gb/s available PCIe bandwidth (16.0 GT/s PCIe x4 link) [14:18:19] ixgbe 0000:21:00.0: MAC: 7, PHY: 27, PBA No: N55484-001 [14:18:19] ixgbe 0000:21:00.0: 20:3a:43:09:3a:12 [14:18:19] ixgbe 0000:21:00.0: Intel(R) 10 Gigabit Network Connection [14:18:22] ixgbe 0000:21:00.0 ens2f0np0: renamed from eth0
Reviewed-by: Aleksandr Loktionov aleksandr.loktionov@intel.com Reviewed-by: Jacob Keller jacob.e.keller@intel.com Reviewed-by: Simon Horman horms@kernel.org Reviewed-by: Paul Menzel pmenzel@molgen.mpg.de Signed-off-by: Jedrzej Jagielski jedrzej.jagielski@intel.com Tested-by: Rinitha S sx.rinitha@intel.com (A Contingent worker at Intel) Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org ---
LLM Generated explanations, may be completely bogus:
YES
Reasoning and impact - User-visible bug: The old implementation read only a small struct per 512-byte step across the whole OROM, causing thousands of NVM transactions during probe. The commit reduces probe time dramatically (25s → ~5s), which is a real user-facing issue (long boot delays, timeouts). This is a performance bug fix, not a feature. - Scope: The change is contained to the E610 flash/OROM probing path and limited to a single function in one file. No ABI, IO paths, or critical runtime datapaths are modified.
What changed in code - Batch read OROM once: - Allocates a buffer of the OROM bank size (`orom_size = hw->flash.banks.orom_size`) and reads it in a single flat-NVM pass, then scans in memory instead of doing many small reads: drivers/net/ethernet/intel/ixgbe/ixgbe_e610.c:3010, drivers/net/ethernet/intel/ixgbe/ixgbe_e610.c:3015, drivers/net/ethernet/intel/ixgbe/ixgbe_e610.c:3020. - The read goes through `ixgbe_read_flash_module()` which holds the NVM resource once and uses `ixgbe_read_flat_nvm()` that already chunks reads to 4KB sectors, so this is supported and efficient: drivers/net/ethernet/intel/ixgbe/ixgbe_e610.c:2788, drivers/net/ethernet/intel/ixgbe/ixgbe_e610.c:3533. - Search logic preserved, just done in-memory: - Scans 512-byte aligned offsets looking for “$CIV”, verifies a simple modulo-256 checksum over the CIVD struct, then copies the struct out: drivers/net/ethernet/intel/ixgbe/ixgbe_e610.c:3032, drivers/net/ethernet/intel/ixgbe/ixgbe_e610.c:3039, drivers/net/ethernet/intel/ixgbe/ixgbe_e610.c:3042. - The struct layout and size are defined here and verified with a `BUILD_BUG_ON` against 512 bytes: drivers/net/ethernet/intel/ixgbe/ixgbe_type_e610.h:929, drivers/net/ethernet/intel/ixgbe/ixgbe_e610.c:3036. - Error semantics clarified and unchanged in behavior for callers: - Now explicitly returns -ENOMEM (allocation), -EIO (flash read), -EDOM (checksum), -ENODATA (not found), 0 on success; matching the documented behavior and typical expectations of `ixgbe_get_orom_ver_info()` which simply returns on error: drivers/net/ethernet/intel/ixgbe/ixgbe_e610.c:2992, drivers/net/ethernet/intel/ixgbe/ixgbe_e610.c:3134. - The OROM size and offsets are sourced from Shadow RAM in 4KB units, already discovered via `ixgbe_determine_active_flash_banks()`: drivers/net/ethernet/intel/ixgbe/ixgbe_e610.c:2687, drivers/net/ethernet/intel/ixgbe/ixgbe_e610.c:2602.
Risk assessment - Memory allocation: `kzalloc(orom_size, GFP_KERNEL)` allocates the OROM bank (typically small/hundreds of KB). It’s probe-time, immediately freed, and far less likely to fail under fragmentation. Even if -ENOMEM happens, failure behavior mirrors other probe-time allocations and cleanly propagates (and the previous code would then spend tens of seconds doing many I/Os). - Locking/IO semantics: `ixgbe_read_flat_nvm()` already chunks to 4KB and is designed for larger flat reads. Holding the NVM resource once is safer and faster than many acquire/release cycles. - Callers: The function feeds OROM version parsing (`ixgbe_get_orom_ver_info`) used during `ixgbe_get_flash_data` at probe; reducing time here improves user-visible driver bring-up time without changing logic: drivers/net/ethernet/intel/ixgbe/ixgbe_e610.c:3345, drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:11666.
Why it fits stable - Fixes a significant user-impacting performance issue (probe delay ~25s). - Minimal, localized code change without architectural impact. - Maintains existing behavior and error handling expectations for callers. - Limited to E610 hardware path; low regression surface.
Conclusion - This is a well-scoped, low-risk performance bug fix that materially improves user experience during probe. It should be backported to stable.
drivers/net/ethernet/intel/ixgbe/ixgbe_e610.c | 59 +++++++++++++------ 1 file changed, 40 insertions(+), 19 deletions(-)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_e610.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_e610.c index bfeef5b0b99d8..e5f0399657097 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_e610.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_e610.c @@ -3008,50 +3008,71 @@ static int ixgbe_get_nvm_srev(struct ixgbe_hw *hw, * Searches through the Option ROM flash contents to locate the CIVD data for * the image. * - * Return: the exit code of the operation. + * Return: -ENOMEM when cannot allocate memory, -EDOM for checksum violation, + * -ENODATA when cannot find proper data, -EIO for faulty read or + * 0 on success. + * + * On success @civd stores collected data. */ static int ixgbe_get_orom_civd_data(struct ixgbe_hw *hw, enum ixgbe_bank_select bank, struct ixgbe_orom_civd_info *civd) { - struct ixgbe_orom_civd_info tmp; + u32 orom_size = hw->flash.banks.orom_size; + u8 *orom_data; u32 offset; int err;
+ orom_data = kzalloc(orom_size, GFP_KERNEL); + if (!orom_data) + return -ENOMEM; + + err = ixgbe_read_flash_module(hw, bank, + IXGBE_E610_SR_1ST_OROM_BANK_PTR, 0, + orom_data, orom_size); + if (err) { + err = -EIO; + goto cleanup; + } + /* The CIVD section is located in the Option ROM aligned to 512 bytes. * The first 4 bytes must contain the ASCII characters "$CIV". * A simple modulo 256 sum of all of the bytes of the structure must * equal 0. */ - for (offset = 0; (offset + SZ_512) <= hw->flash.banks.orom_size; - offset += SZ_512) { + for (offset = 0; offset + SZ_512 <= orom_size; offset += SZ_512) { + struct ixgbe_orom_civd_info *tmp; u8 sum = 0; u32 i;
- err = ixgbe_read_flash_module(hw, bank, - IXGBE_E610_SR_1ST_OROM_BANK_PTR, - offset, - (u8 *)&tmp, sizeof(tmp)); - if (err) - return err; + BUILD_BUG_ON(sizeof(*tmp) > SZ_512); + + tmp = (struct ixgbe_orom_civd_info *)&orom_data[offset];
/* Skip forward until we find a matching signature */ - if (memcmp(IXGBE_OROM_CIV_SIGNATURE, tmp.signature, - sizeof(tmp.signature))) + if (memcmp(IXGBE_OROM_CIV_SIGNATURE, tmp->signature, + sizeof(tmp->signature))) continue;
/* Verify that the simple checksum is zero */ - for (i = 0; i < sizeof(tmp); i++) - sum += ((u8 *)&tmp)[i]; + for (i = 0; i < sizeof(*tmp); i++) + sum += ((u8 *)tmp)[i]; + + if (sum) { + err = -EDOM; + goto cleanup; + }
- if (sum) - return -EDOM; + *civd = *tmp; + err = 0;
- *civd = tmp; - return 0; + goto cleanup; }
- return -ENODATA; + err = -ENODATA; +cleanup: + kfree(orom_data); + return err; }
/**