The computer (amd64) fails to boot. Init was stuck synchronizing the time over the network. This began between 5.16.2 (good) and 5.16.3 (bad). This continues on 5.16.4 and 5.16.5. Git bisect revealed the following. In this case the nonfree firmware is not present on the system. Blacklisting the iwlwifi module works as a workaround for now.
6b5ad4bd0d78fef6bbe0ecdf96e09237c9c52cc1 is the first bad commit
commit 6b5ad4bd0d78fef6bbe0ecdf96e09237c9c52cc1
Author: Johannes Berg <johannes.berg@intel.com>
Date: Fri Dec 10 11:12:42 2021 +0200
iwlwifi: fix leaks/bad data after failed firmware load
[ Upstream commit ab07506b0454bea606095951e19e72c282bfbb42 ]
If firmware load fails after having loaded some parts of the firmware, e.g. the IML image, then this would leak. For the host command list we'd end up running into a WARN on the next attempt to load another firmware image.
Fix this by calling iwl_dealloc_ucode() on failures, and make that also clear the data so we start fresh on the next round.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20211210110539.1f742f0eb58a.I1315f22f6aa63...
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
drivers/net/wireless/intel/iwlwifi/iwl-drv.c | 8 ++++++++
1 file changed, 8 insertions(+)
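As a rough illustration of the cleanup pattern the commit message describes (free whatever was partially loaded when a load fails, then clear the bookkeeping so the next attempt starts fresh), here is a minimal user-space C sketch. It is not the iwlwifi driver code: `struct fw_image`, `fw_image_load()`, `fw_image_dealloc()` and the `fail_at` parameter are hypothetical names invented for this example, and `malloc`/`free` stand in for the kernel allocators.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a firmware image built from several sections. */
#define MAX_SECTIONS 4

struct fw_image {
    void *sec[MAX_SECTIONS];   /* separately allocated sections */
    size_t len[MAX_SECTIONS];  /* length of each section in bytes */
    int num_sec;               /* number of sections loaded so far */
};

/* Free every loaded section, then zero the struct so a retry starts fresh. */
static void fw_image_dealloc(struct fw_image *img)
{
    for (int i = 0; i < img->num_sec; i++)
        free(img->sec[i]);
    memset(img, 0, sizeof(*img));
}

/*
 * Load `want` sections into a zero-initialized image.
 * `fail_at` >= 0 simulates an allocation failure at that section index
 * (illustration only); pass -1 for no simulated failure.
 * Returns 0 on success, -1 on failure with `img` fully cleared.
 */
static int fw_image_load(struct fw_image *img, int want, int fail_at)
{
    assert(want >= 0 && want <= MAX_SECTIONS);

    for (int i = 0; i < want; i++) {
        void *p = (i == fail_at) ? NULL : malloc(16);

        if (!p) {
            /* Release the sections loaded so far and reset the state. */
            fw_image_dealloc(img);
            return -1;
        }
        img->sec[i] = p;
        img->len[i] = 16;
        img->num_sec = i + 1;
    }
    return 0;
}
```

In this sketch, calling `fw_image_load(&img, 3, 1)` on a zero-initialized `img` fails at the second section and leaves `img` fully zeroed, so a following `fw_image_load(&img, 3, -1)` can reuse it. Zeroing after freeing also makes a repeated `fw_image_dealloc()` call harmless here, because `num_sec` is already 0.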
On Thu, Feb 03, 2022 at 04:19:59PM -0800, Jason Self wrote:
The computer (amd64) fails to boot. Init was stuck synchronizing the time over the network. This began between 5.16.2 (good) and 5.16.3 (bad). This continues on 5.16.4 and 5.16.5. Git bisect revealed the following. In this case the nonfree firmware is not present on the system. Blacklisting the iwlwifi module works as a workaround for now.
6b5ad4bd0d78fef6bbe0ecdf96e09237c9c52cc1 is the first bad commit
commit 6b5ad4bd0d78fef6bbe0ecdf96e09237c9c52cc1
Author: Johannes Berg <johannes.berg@intel.com>
Date: Fri Dec 10 11:12:42 2021 +0200
iwlwifi: fix leaks/bad data after failed firmware load
[ Upstream commit ab07506b0454bea606095951e19e72c282bfbb42 ]
If firmware load fails after having loaded some parts of the firmware, e.g. the IML image, then this would leak. For the host command list we'd end up running into a WARN on the next attempt to load another firmware image.
Fix this by calling iwl_dealloc_ucode() on failures, and make that also clear the data so we start fresh on the next round.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20211210110539.1f742f0eb58a.I1315f22f6aa63...
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
drivers/net/wireless/intel/iwlwifi/iwl-drv.c | 8 ++++++++
1 file changed, 8 insertions(+)
Please cc: the authors of this commit and the upstream wireless developers so they can help you out here. I think the same issue shows up in 5.17-rc2, right?
thanks,
greg k-h
[TLDR: I'm adding this regression to regzbot, the Linux kernel regression tracking bot; most text you find below is compiled from a few template paragraphs some of you might have seen already.]
Hi, this is your Linux kernel regression tracker speaking.
Adding the regression mailing list to the list of recipients, as it should be in the loop for all regressions, as explained here: https://www.kernel.org/doc/html/latest/admin-guide/reporting-issues.html
On 04.02.22 01:19, Jason Self wrote:
The computer (amd64) fails to boot. Init was stuck synchronizing the time over the network. This began between 5.16.2 (good) and 5.16.3 (bad). This continues on 5.16.4 and 5.16.5. Git bisect revealed the following. In this case the nonfree firmware is not present on the system. Blacklisting the iwlwifi module works as a workaround for now.
6b5ad4bd0d78fef6bbe0ecdf96e09237c9c52cc1 is the first bad commit
commit 6b5ad4bd0d78fef6bbe0ecdf96e09237c9c52cc1
Author: Johannes Berg <johannes.berg@intel.com>
Date: Fri Dec 10 11:12:42 2021 +0200
To be sure this issue doesn't fall through the cracks unnoticed, I'm adding it to regzbot, my Linux kernel regression tracking bot:
#regzbot ^introduced 6b5ad4bd0d78fef6bbe0ecdf96e09237c9c52cc1
#regzbot title net: iwlwifi: system fails to boot since 5.16.3
#regzbot ignore-activity
Reminder: when fixing the issue, please add a 'Link:' tag with the URL to the report (the parent of this mail) using the kernel.org redirector, as explained in 'Documentation/process/submitting-patches.rst'. Regzbot then will automatically mark the regression as resolved once the fix lands in the appropriate tree. For more details about regzbot see footer.
Sending this to everyone that got the initial report, to make all aware of the tracking. I also hope that messages like this motivate people to directly get at least the regression mailing list and ideally even regzbot involved when dealing with regressions, as messages like this wouldn't be needed then.
Don't worry, I'll send further messages wrt this regression just to the lists (with a tag in the subject so people can filter them away), as long as they are intended just for regzbot. With a bit of luck no such messages will be needed anyway.
Ciao, Thorsten (wearing his 'Linux kernel regression tracker' hat)
P.S.: As a Linux kernel regression tracker I'm getting a lot of reports on my table. I can only look briefly into most of them. Unfortunately therefore I sometimes will get things wrong or miss something important. I hope that's not the case here; if you think it is, don't hesitate to tell me about it in a public reply, that's in everyone's interest.
BTW, I have no personal interest in this issue, which is tracked using regzbot, my Linux kernel regression tracking bot (https://linux-regtracking.leemhuis.info/regzbot/). I'm only posting this mail to get things rolling again and hence don't need to be CC'd on all further activities wrt this regression.
--- Additional information about regzbot:
If you want to know more about regzbot, check out its web interface, the getting started guide, and/or the reference documentation:
https://linux-regtracking.leemhuis.info/regzbot/
https://gitlab.com/knurd42/regzbot/-/blob/main/docs/getting_started.md
https://gitlab.com/knurd42/regzbot/-/blob/main/docs/reference.md
The last two documents will explain how you can interact with regzbot yourself if you want to.
Hint for reporters: when reporting a regression it's in your interest to tell #regzbot about it in the report, as that will ensure the regression gets on the radar of regzbot and the regression tracker. That's in your interest, as they will make sure the report won't fall through the cracks unnoticed.
Hint for developers: you normally don't need to care about regzbot once it's involved. Fix the issue as you normally would, just remember to include a 'Link:' tag to the report in the commit message, as explained in Documentation/process/submitting-patches.rst. That aspect was recently made more explicit in commit 1f57bd42b77c: https://git.kernel.org/linus/1f57bd42b77c
On 2022-02-04 01:19, Jason Self wrote:
The computer (amd64) fails to boot. Init was stuck synchronizing the time over the network. This began between 5.16.2 (good) and 5.16.3 (bad). This continues on 5.16.4 and 5.16.5. Git bisect revealed the following. In this case the nonfree firmware is not present on the system. Blacklisting the iwlwifi module works as a workaround for now.
I have several reports of Intel NUC 10th/11th gen not booting/crashing during boot after updating to 5.10.96 (from 5.10.91). At least one stack trace shows iwl_dealloc_ucode in the call path. The below commit is part of 5.10.96, so this regression seems to affect not only the 5.16 series.
Link: https://github.com/home-assistant/operating-system/issues/1739#issuecomment-...
-- Stefan
6b5ad4bd0d78fef6bbe0ecdf96e09237c9c52cc1 is the first bad commit
commit 6b5ad4bd0d78fef6bbe0ecdf96e09237c9c52cc1
Author: Johannes Berg <johannes.berg@intel.com>
Date: Fri Dec 10 11:12:42 2021 +0200
iwlwifi: fix leaks/bad data after failed firmware load
[ Upstream commit ab07506b0454bea606095951e19e72c282bfbb42 ]
If firmware load fails after having loaded some parts of the firmware, e.g. the IML image, then this would leak. For the host command list we'd end up running into a WARN on the next attempt to load another firmware image.
Fix this by calling iwl_dealloc_ucode() on failures, and make that also clear the data so we start fresh on the next round.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20211210110539.1f742f0eb58a.I1315f22f6aa63...
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
drivers/net/wireless/intel/iwlwifi/iwl-drv.c | 8 ++++++++
1 file changed, 8 insertions(+)
On Tue, 08 Feb 2022 09:50:59 +0100 Stefan Agner <stefan@agner.ch> wrote:
On 2022-02-04 01:19, Jason Self wrote: [...]
I have several reports of Intel NUC 10th/11th gen not booting/crashing during boot after updating to 5.10.96 (from 5.10.91). At least one stack trace shows iwl_dealloc_ucode in the call path. The below commit is part of 5.10.96, so this regression seems to affect not only the 5.16 series.
Link: https://github.com/home-assistant/operating-system/issues/1739#issuecomment-...
Yes, it does appear to affect multiple versions; at least 5.17-rc2, 5.16, 5.15, and as you say 5.10.
I can confirm that this patch addresses it on 5.16: https://lore.kernel.org/stable/YgJSEEmRDKKG+3lT@mail-itl/T/#t
It appears desirable to apply the patch to all of the stable versions that need it, after it's gone into Linus's tree to also address the matter with the upcoming 5.17 series.
On 08.02.22 19:05, Jason Self wrote:
On Tue, 08 Feb 2022 09:50:59 +0100 Stefan Agner <stefan@agner.ch> wrote:
On 2022-02-04 01:19, Jason Self wrote: [...]
I have several reports of Intel NUC 10th/11th gen not booting/crashing during boot after updating to 5.10.96 (from 5.10.91). At least one stack trace shows iwl_dealloc_ucode in the call path. The below commit is part of 5.10.96, so this regression seems to affect not only the 5.16 series.
Link: https://github.com/home-assistant/operating-system/issues/1739#issuecomment-...
Yes, it does appear to affect multiple versions; at least 5.17-rc2, 5.16, 5.15, and as you say 5.10.
I can confirm that this patch addresses it on 5.16: https://lore.kernel.org/stable/YgJSEEmRDKKG+3lT@mail-itl/T/#t
Thx for pointing to the thread!
#regzbot monitor: https://lore.kernel.org/stable/YgJSEEmRDKKG+3lT@mail-itl/
It appears desirable to apply the patch to all of the stable versions that need it, after it's gone into Linus's tree to also address the matter with the upcoming 5.17 series.
FWIW, the patch is marked for backporting already; it just needs to get merged to mainline first.
Ciao, Thorsten