Currently, ghes_edac_register() is called via ghes_init() from acpi_init() at the subsys_initcall() level. However, edac_init() is also registered with subsys_initcall(), so the order in which the two run is ambiguous.
If ghes_edac_register() is called first, then 'mc0' ends up at /sys/devices/mc0/ instead of the expected /sys/devices/system/edac/mc/mc0.
So while everything seems ok other than the unexpected sysfs location, it seems like edac_init() should be called before any drivers start registering. So have edac_init() called earlier, via arch_initcall().
However, this moves edac_pci_clear_parity_errors() up as well. It seems this wants to be called after the PCI bus scan, so keep edac_pci_clear_parity_errors() at the subsys_initcall() level. That said, the PCI bus scan itself seems to happen at the subsys_initcall() level, so really the parity clearing should be moved later. But that can be left for a separate patch.

Fixes: dc4e8c07e9e2 ("ACPI: APEI: explicit init of HEST and GHES in apci_init()")
Signed-off-by: Jason Baron <jbaron@akamai.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: James Morse <james.morse@arm.com>
Cc: Robert Richter <rric@kernel.org>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Shuai Xue <xueshuai@linux.alibaba.com>
Cc: stable@vger.kernel.org
---
 drivers/edac/edac_module.c | 33 +++++++++++++++++++++++----------
 1 file changed, 23 insertions(+), 10 deletions(-)

diff --git a/drivers/edac/edac_module.c b/drivers/edac/edac_module.c
index 32a931d0cb71..407d4a5fce7a 100644
--- a/drivers/edac/edac_module.c
+++ b/drivers/edac/edac_module.c
@@ -109,15 +109,6 @@ static int __init edac_init(void)
 	if (err)
 		return err;
 
-	/*
-	 * Harvest and clear any boot/initialization PCI parity errors
-	 *
-	 * FIXME: This only clears errors logged by devices present at time of
-	 *	module initialization. We should also do an initial clear
-	 *	of each newly hotplugged device.
-	 */
-	edac_pci_clear_parity_errors();
-
 	err = edac_mc_sysfs_init();
 	if (err)
 		goto err_sysfs;
@@ -157,12 +148,34 @@ static void __exit edac_exit(void)
 	edac_subsys_exit();
 }
 
+static void __init edac_init_clear_parity_errors(void)
+{
+	/*
+	 * Harvest and clear any boot/initialization PCI parity errors
+	 *
+	 * FIXME: This only clears errors logged by devices present at time of
+	 *	module initialization. We should also do an initial clear
+	 *	of each newly hotplugged device.
+	 */
+	edac_pci_clear_parity_errors();
+
+	return 0;
+}
+
 /*
  * Inform the kernel of our entry and exit points
+ *
+ * ghes_edac_register() is call via acpi_init() -> ghes_init()
+ * at the subsys_initcall level so edac_init() must come first
  */
-subsys_initcall(edac_init);
+arch_initcall(edac_init);
 module_exit(edac_exit);
 
+/*
+ * Clear parity errors after PCI subsys is initialized
+ */
+subsys_initcall(edac_init_clear_parity_errors);
+
 MODULE_LICENSE("GPL");
 MODULE_AUTHOR("Doug Thompson www.softwarebitmaker.com, et al");
 MODULE_DESCRIPTION("Core library routines for EDAC reporting");

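The ordering argument in the commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions: the level numbers, the table layout, and run_initcalls() are hypothetical stand-ins, not the kernel's initcall machinery, and the path selection is reduced to a single flag. It shows that two entries at the same level run in table (link) order, while an entry at an earlier level always runs first.

```c
#include <string.h>

/* Hypothetical stand-ins for two initcall levels; not the kernel's definitions. */
enum level { LEVEL_ARCH = 3, LEVEL_SUBSYS = 4 };

struct initcall {
    enum level level;
    const char *name;
    int (*fn)(void);
};

static int edac_ready;      /* set once the simulated edac_init() has run */
static char mc0_path[64];   /* where the simulated mc0 device ended up */

/* Simulated core init: only records that the EDAC core is set up. */
static int edac_init(void)
{
    edac_ready = 1;
    return 0;
}

/* Simulated acpi_init() -> ghes_init() -> ghes_edac_register(). */
static int acpi_init(void)
{
    strcpy(mc0_path, edac_ready ? "/sys/devices/system/edac/mc/mc0"
                                : "/sys/devices/mc0");
    return 0;
}

/* Run all entries level by level; within one level, keep table order. */
static const char *run_initcalls(const struct initcall *calls, size_t n)
{
    edac_ready = 0;
    mc0_path[0] = '\0';
    for (int level = LEVEL_ARCH; level <= LEVEL_SUBSYS; level++)
        for (size_t i = 0; i < n; i++)
            if ((int)calls[i].level == level)
                calls[i].fn();
    return mc0_path;
}

/* Both at the subsys level with ACPI first in the table: the problematic case. */
static const struct initcall same_level[] = {
    { LEVEL_SUBSYS, "acpi_init", acpi_init },
    { LEVEL_SUBSYS, "edac_init", edac_init },
};

/* edac_init moved to the arch level: it runs first regardless of table order. */
static const struct initcall edac_earlier[] = {
    { LEVEL_SUBSYS, "acpi_init", acpi_init },
    { LEVEL_ARCH,   "edac_init", edac_init },
};
```

Calling run_initcalls(same_level, 2) yields the unexpected path in this model, and run_initcalls(edac_earlier, 2) yields the expected one.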
Hi Jason,
I love your patch! Yet something to improve:
[auto build test ERROR on ras/edac-for-next]
[also build test ERROR on linus/master v6.1-rc5 next-20221115]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Jason-Baron/EDAC-edac_module-...
base:   https://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git edac-for-next
patch link:    https://lore.kernel.org/r/20221116003729.194802-1-jbaron%40akamai.com
patch subject: [PATCH] EDAC/edac_module: order edac_init() before ghes_edac_register()
config: powerpc-allyesconfig
compiler: powerpc-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/a970ee7e983345d07bd1f3e455688e...
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Jason-Baron/EDAC-edac_module-order-edac_init-before-ghes_edac_register/20221116-084046
        git checkout a970ee7e983345d07bd1f3e455688ef753f32a45
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=powerpc SHELL=/bin/bash drivers/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
All errors (new ones prefixed by >>):
   drivers/edac/edac_module.c: In function 'edac_init_clear_parity_errors':
   drivers/edac/edac_module.c:162:16: error: 'return' with a value, in function returning void [-Werror=return-type]
     162 |         return 0;
         |                ^
   drivers/edac/edac_module.c:151:20: note: declared here
     151 | static void __init edac_init_clear_parity_errors(void)
         |                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   In file included from include/linux/printk.h:6,
                    from include/asm-generic/bug.h:22,
                    from arch/powerpc/include/asm/bug.h:158,
                    from include/linux/bug.h:5,
                    from arch/powerpc/include/asm/cmpxchg.h:8,
                    from arch/powerpc/include/asm/atomic.h:11,
                    from include/linux/atomic.h:7,
                    from include/linux/edac.h:15,
                    from drivers/edac/edac_module.c:13:
   drivers/edac/edac_module.c: At top level:
>> drivers/edac/edac_module.c:177:17: error: initialization of 'initcall_t' {aka 'int (*)(void)'} from incompatible pointer type 'void (*)(void)' [-Werror=incompatible-pointer-types]
     177 | subsys_initcall(edac_init_clear_parity_errors);
         |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/init.h:250:55: note: in definition of macro '____define_initcall'
     250 |                 __attribute__((__section__(__sec))) = fn;
         |                                                       ^~
   include/linux/init.h:260:9: note: in expansion of macro '__unique_initcall'
     260 |         __unique_initcall(fn, id, __sec, __initcall_id(fn))
         |         ^~~~~~~~~~~~~~~~~
   include/linux/init.h:262:35: note: in expansion of macro '___define_initcall'
     262 | #define __define_initcall(fn, id) ___define_initcall(fn, id, .initcall##id)
         |                                   ^~~~~~~~~~~~~~~~~~
   include/linux/init.h:286:41: note: in expansion of macro '__define_initcall'
     286 | #define subsys_initcall(fn)             __define_initcall(fn, 4)
         |                                         ^~~~~~~~~~~~~~~~~
   drivers/edac/edac_module.c:177:1: note: in expansion of macro 'subsys_initcall'
     177 | subsys_initcall(edac_init_clear_parity_errors);
         | ^~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors

vim +177 drivers/edac/edac_module.c
   173
   174	/*
   175	 * Clear parity errors after PCI subsys is initialized
   176	 */
 > 177	subsys_initcall(edac_init_clear_parity_errors);
   178

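Both errors in this report come from the new function being declared void while initcall_t is 'int (*)(void)', as the compiler notes. A minimal user-space sketch of a signature that would satisfy that type follows. The typedef mirrors the one quoted in the error message; the stub, the call counter, and the registered_initcall variable are hypothetical and exist only so the fragment compiles outside the kernel (the __init annotation is omitted for the same reason). This is not necessarily how the issue was resolved upstream.

```c
/* Same shape as the initcall_t quoted in the error message. */
typedef int (*initcall_t)(void);

static int parity_clear_calls;   /* hypothetical counter, for illustration only */

/* Stub standing in for the real edac_pci_clear_parity_errors(). */
static void edac_pci_clear_parity_errors(void)
{
    parity_clear_calls++;
}

/* Returns int, so 'return 0' is valid and the pointer type matches initcall_t. */
static int edac_init_clear_parity_errors(void)
{
    edac_pci_clear_parity_errors();
    return 0;
}

/* Roughly the assignment the built-in subsys_initcall() expansion performs. */
static initcall_t registered_initcall = edac_init_clear_parity_errors;
```

With the return type changed to int, the 'return 0;' statement and the function-pointer initialization both type-check.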
On Tue, Nov 15, 2022 at 07:37:29PM -0500, Jason Baron wrote:
> Currently, ghes_edac_register() is called via ghes_init() from acpi_init()
https://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git/log/?h=edac-ghes
On 11/16/22 06:14, Borislav Petkov wrote:
> On Tue, Nov 15, 2022 at 07:37:29PM -0500, Jason Baron wrote:
>> Currently, ghes_edac_register() is called via ghes_init() from acpi_init()
> https://urldefense.com/v3/__https://git.kernel.org/pub/scm/linux/kernel/git/...
Hi Boris,
Thanks, yes this looks like it will address the regression. Is this planned for 6.1?
Or 5.15 stable, which is where we hit this regression?
Thanks,
-Jason
Hi,
On Wed, Nov 16, 2022 at 09:32:41AM -0500, Jason Baron wrote:
> Thanks, yes this looks like it will address the regression. Is this planned for 6.1?
6.2.
> Or 5.15 stable, which is where we hit this regression?
No, I don't think it is stable material.
Thx.
On 11/16/22 13:37, Borislav Petkov wrote:
> Hi,
> On Wed, Nov 16, 2022 at 09:32:41AM -0500, Jason Baron wrote:
>> Thanks, yes this looks like it will address the regression. Is this planned for 6.1?
> 6.2.
>> Or 5.15 stable, which is where we hit this regression?
> No, I don't think it is stable material.
> Thx.
Ok, thanks. Is there any plan to address this in 5.15 stable/6.1?
Either with a revert, the fixup I proposed, or something else?
Thanks,
-Jason
Hi Jason,
I love your patch! Yet something to improve:
[auto build test ERROR on ras/edac-for-next]
[also build test ERROR on linus/master v6.1-rc5 next-20221115]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Jason-Baron/EDAC-edac_module-...
base:   https://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git edac-for-next
patch link:    https://lore.kernel.org/r/20221116003729.194802-1-jbaron%40akamai.com
patch subject: [PATCH] EDAC/edac_module: order edac_init() before ghes_edac_register()
config: powerpc-allmodconfig
compiler: powerpc-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/a970ee7e983345d07bd1f3e455688e...
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Jason-Baron/EDAC-edac_module-order-edac_init-before-ghes_edac_register/20221116-084046
        git checkout a970ee7e983345d07bd1f3e455688ef753f32a45
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=powerpc SHELL=/bin/bash drivers/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
All error/warnings (new ones prefixed by >>):
   drivers/edac/edac_module.c: In function 'edac_init_clear_parity_errors':
   drivers/edac/edac_module.c:162:16: error: 'return' with a value, in function returning void [-Werror=return-type]
     162 |         return 0;
         |                ^
   drivers/edac/edac_module.c:151:20: note: declared here
     151 | static void __init edac_init_clear_parity_errors(void)
         |                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   In file included from include/linux/device/driver.h:21,
                    from include/linux/device.h:32,
                    from include/linux/edac.h:16,
                    from drivers/edac/edac_module.c:13:
   drivers/edac/edac_module.c: At top level:
   include/linux/module.h:130:49: error: redefinition of '__inittest'
     130 |         static inline initcall_t __maybe_unused __inittest(void) \
         |                                                 ^~~~~~~~~~
   include/linux/module.h:116:41: note: in expansion of macro 'module_init'
     116 | #define subsys_initcall(fn)             module_init(fn)
         |                                         ^~~~~~~~~~~
   drivers/edac/edac_module.c:177:1: note: in expansion of macro 'subsys_initcall'
     177 | subsys_initcall(edac_init_clear_parity_errors);
         | ^~~~~~~~~~~~~~~
   include/linux/module.h:130:49: note: previous definition of '__inittest' with type 'int (*(void))(void)'
     130 |         static inline initcall_t __maybe_unused __inittest(void) \
         |                                                 ^~~~~~~~~~
   include/linux/module.h:115:41: note: in expansion of macro 'module_init'
     115 | #define arch_initcall(fn)               module_init(fn)
         |                                         ^~~~~~~~~~~
   drivers/edac/edac_module.c:171:1: note: in expansion of macro 'arch_initcall'
     171 | arch_initcall(edac_init);
         | ^~~~~~~~~~~~~
   drivers/edac/edac_module.c: In function '__inittest':
>> drivers/edac/edac_module.c:177:17: error: returning 'void (*)(void)' from a function with incompatible return type 'initcall_t' {aka 'int (*)(void)'} [-Werror=incompatible-pointer-types]
     177 | subsys_initcall(edac_init_clear_parity_errors);
         |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/module.h:131:18: note: in definition of macro 'module_init'
     131 |         { return initfn; } \
         |                  ^~~~~~
   drivers/edac/edac_module.c:177:1: note: in expansion of macro 'subsys_initcall'
     177 | subsys_initcall(edac_init_clear_parity_errors);
         | ^~~~~~~~~~~~~~~
   drivers/edac/edac_module.c: At top level:
   include/linux/module.h:132:13: error: redefinition of 'init_module'
     132 |         int init_module(void) __copy(initfn) \
         |             ^~~~~~~~~~~
   include/linux/module.h:116:41: note: in expansion of macro 'module_init'
     116 | #define subsys_initcall(fn)             module_init(fn)
         |                                         ^~~~~~~~~~~
   drivers/edac/edac_module.c:177:1: note: in expansion of macro 'subsys_initcall'
     177 | subsys_initcall(edac_init_clear_parity_errors);
         | ^~~~~~~~~~~~~~~
   include/linux/module.h:132:13: note: previous definition of 'init_module' with type 'int(void)'
     132 |         int init_module(void) __copy(initfn) \
         |             ^~~~~~~~~~~
   include/linux/module.h:115:41: note: in expansion of macro 'module_init'
     115 | #define arch_initcall(fn)               module_init(fn)
         |                                         ^~~~~~~~~~~
   drivers/edac/edac_module.c:171:1: note: in expansion of macro 'arch_initcall'
     171 | arch_initcall(edac_init);
         | ^~~~~~~~~~~~~
>> include/linux/module.h:132:13: warning: 'init_module' alias between functions of incompatible types 'int(void)' and 'void(void)' [-Wattribute-alias=]
     132 |         int init_module(void) __copy(initfn) \
         |             ^~~~~~~~~~~
   include/linux/module.h:116:41: note: in expansion of macro 'module_init'
     116 | #define subsys_initcall(fn)             module_init(fn)
         |                                         ^~~~~~~~~~~
   drivers/edac/edac_module.c:177:1: note: in expansion of macro 'subsys_initcall'
     177 | subsys_initcall(edac_init_clear_parity_errors);
         | ^~~~~~~~~~~~~~~
   drivers/edac/edac_module.c:151:20: note: aliased declaration here
     151 | static void __init edac_init_clear_parity_errors(void)
         |                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors

vim +177 drivers/edac/edac_module.c
   173
   174	/*
   175	 * Clear parity errors after PCI subsys is initialized
   176	 */
 > 177	subsys_initcall(edac_init_clear_parity_errors);
   178

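In this allmodconfig build, the notes show both arch_initcall() and subsys_initcall() expanding to module_init(), so a file built as a module can only carry one of them. One possible way around that, sketched here in user space, is to fold the parity clearing into the single init function when MODULE is defined. The entry_points table, run_entry_points(), the stub, and the two counters are hypothetical illustration devices, and this is an assumption about one workable shape, not the fix that was actually applied.

```c
#include <stddef.h>

typedef int (*initcall_t)(void);

static int core_ready;       /* hypothetical: set by the simulated edac_init() */
static int parity_cleared;   /* hypothetical: counts parity-clear calls */

/* Stub standing in for the real edac_pci_clear_parity_errors(). */
static void edac_pci_clear_parity_errors(void)
{
    parity_cleared++;
}

static int edac_init_clear_parity_errors(void)
{
    edac_pci_clear_parity_errors();
    return 0;
}

static int edac_init(void)
{
    core_ready = 1;
#ifdef MODULE
    /* A module has a single init entry point, so do the clearing here. */
    return edac_init_clear_parity_errors();
#else
    return 0;
#endif
}

#ifdef MODULE
/* Modular build: exactly one entry point. */
static const initcall_t entry_points[] = { edac_init };
#else
/* Built-in: two entry points, modelling the arch and subsys levels. */
static const initcall_t entry_points[] = { edac_init, edac_init_clear_parity_errors };
#endif

/* Run the entry points in order; stop at the first failure. */
static int run_entry_points(void)
{
    for (size_t i = 0; i < sizeof(entry_points) / sizeof(entry_points[0]); i++) {
        int err = entry_points[i]();
        if (err)
            return err;
    }
    return 0;
}
```

Either way the file is compiled, the parity clearing runs exactly once and after the core init in this model.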