The patch titled
Subject: ocfs2: fix NULL pointer dereference in ocfs2_journal_dirty()
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
ocfs2-fix-null-pointer-dereference-in-ocfs2_journal_dirty.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Joseph Qi <joseph.qi(a)linux.alibaba.com>
Subject: ocfs2: fix NULL pointer dereference in ocfs2_journal_dirty()
Date: Thu, 30 May 2024 19:06:29 +0800
bdev->bd_super has been removed and commit 8887b94d9322 change the usage
from bdev->bd_super to b_assoc_map->host->i_sb. This introduces the
following NULL pointer dereference in ocfs2_journal_dirty() since
b_assoc_map is still not initialized. This can be easily reproduced by
running xfstests generic/186, which simulate no more credits.
[ 134.351592] BUG: kernel NULL pointer dereference, address: 0000000000000000
...
[ 134.355341] RIP: 0010:ocfs2_journal_dirty+0x14f/0x160 [ocfs2]
...
[ 134.365071] Call Trace:
[ 134.365312] <TASK>
[ 134.365524] ? __die_body+0x1e/0x60
[ 134.365868] ? page_fault_oops+0x13d/0x4f0
[ 134.366265] ? __pfx_bit_wait_io+0x10/0x10
[ 134.366659] ? schedule+0x27/0xb0
[ 134.366981] ? exc_page_fault+0x6a/0x140
[ 134.367356] ? asm_exc_page_fault+0x26/0x30
[ 134.367762] ? ocfs2_journal_dirty+0x14f/0x160 [ocfs2]
[ 134.368305] ? ocfs2_journal_dirty+0x13d/0x160 [ocfs2]
[ 134.368837] ocfs2_create_new_meta_bhs.isra.51+0x139/0x2e0 [ocfs2]
[ 134.369454] ocfs2_grow_tree+0x688/0x8a0 [ocfs2]
[ 134.369927] ocfs2_split_and_insert.isra.67+0x35c/0x4a0 [ocfs2]
[ 134.370521] ocfs2_split_extent+0x314/0x4d0 [ocfs2]
[ 134.371019] ocfs2_change_extent_flag+0x174/0x410 [ocfs2]
[ 134.371566] ocfs2_add_refcount_flag+0x3fa/0x630 [ocfs2]
[ 134.372117] ocfs2_reflink_remap_extent+0x21b/0x4c0 [ocfs2]
[ 134.372994] ? inode_update_timestamps+0x4a/0x120
[ 134.373692] ? __pfx_ocfs2_journal_access_di+0x10/0x10 [ocfs2]
[ 134.374545] ? __pfx_ocfs2_journal_access_di+0x10/0x10 [ocfs2]
[ 134.375393] ocfs2_reflink_remap_blocks+0xe4/0x4e0 [ocfs2]
[ 134.376197] ocfs2_remap_file_range+0x1de/0x390 [ocfs2]
[ 134.376971] ? security_file_permission+0x29/0x50
[ 134.377644] vfs_clone_file_range+0xfe/0x320
[ 134.378268] ioctl_file_clone+0x45/0xa0
[ 134.378853] do_vfs_ioctl+0x457/0x990
[ 134.379422] __x64_sys_ioctl+0x6e/0xd0
[ 134.379987] do_syscall_64+0x5d/0x170
[ 134.380550] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 134.381231] RIP: 0033:0x7fa4926397cb
[ 134.381786] Code: 73 01 c3 48 8b 0d bd 56 38 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8d 56 38 00 f7 d8 64 89 01 48
[ 134.383930] RSP: 002b:00007ffc2b39f7b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 134.384854] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007fa4926397cb
[ 134.385734] RDX: 00007ffc2b39f7f0 RSI: 000000004020940d RDI: 0000000000000003
[ 134.386606] RBP: 0000000000000000 R08: 00111a82a4f015bb R09: 00007fa494221000
[ 134.387476] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[ 134.388342] R13: 0000000000f10000 R14: 0000558e844e2ac8 R15: 0000000000f10000
[ 134.389207] </TASK>
Fix it by only aborting transaction and journal in ocfs2_journal_dirty()
now, and leave ocfs2_abort() later when detecting an aborted handle,
e.g. start next transaction. Also log the handle details in this case.
Link: https://lkml.kernel.org/r/20240530110630.3933832-1-joseph.qi@linux.alibaba.…
Fixes: 8887b94d9322 ("ocfs2: stop using bdev->bd_super for journal error logging")
Signed-off-by: Joseph Qi <joseph.qi(a)linux.alibaba.com>
Cc: Mark Fasheh <mark(a)fasheh.com>
Cc: Joel Becker <jlbec(a)evilplan.org>
Cc: Junxiao Bi <junxiao.bi(a)oracle.com>
Cc: Changwei Ge <gechangwei(a)live.cn>
Cc: Gang He <ghe(a)suse.com>
Cc: Jun Piao <piaojun(a)huawei.com>
Cc: <stable(a)vger.kernel.org> [6.6+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/ocfs2/journal.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
--- a/fs/ocfs2/journal.c~ocfs2-fix-null-pointer-dereference-in-ocfs2_journal_dirty
+++ a/fs/ocfs2/journal.c
@@ -778,13 +778,15 @@ void ocfs2_journal_dirty(handle_t *handl
if (!is_handle_aborted(handle)) {
journal_t *journal = handle->h_transaction->t_journal;
- mlog(ML_ERROR, "jbd2_journal_dirty_metadata failed. "
- "Aborting transaction and journal.\n");
+ mlog(ML_ERROR, "jbd2_journal_dirty_metadata failed: "
+ "handle type %u started at line %u, credits %u/%u "
+ "errcode %d. Aborting transaction and journal.\n",
+ handle->h_type, handle->h_line_no,
+ handle->h_requested_credits,
+ jbd2_handle_buffer_credits(handle), status);
handle->h_err = status;
jbd2_journal_abort_handle(handle);
jbd2_journal_abort(journal, status);
- ocfs2_abort(bh->b_assoc_map->host->i_sb,
- "Journal already aborted.\n");
}
}
}
_
Patches currently in -mm which might be from joseph.qi(a)linux.alibaba.com are
ocfs2-fix-null-pointer-dereference-in-ocfs2_journal_dirty.patch
ocfs2-fix-null-pointer-dereference-in-ocfs2_abort_trigger.patch
Hello
There is a big performance bug (only affecting games performance) on Linux (not reproducible on Windows) with AM5 boards (at least for my MSI MAG b650 Tomahawk) if these BIOS settings are used:
CSM -> Enabled
Wi-Fi -> Disabled (or set to Bluetooth only)
This does not happen even on Win 7... (I've only tested DX12 games) and does not happen if UEFI mode is chosen. I've tested with many different BIOS versions all showing the same result.
I have tried troubleshooting with MSI with some benchmarks posted (https://forum-en.msi.com/index.php?threads/b650-tomahawk-bios-bug-disabling…) and after a week we realized this only happens on Linux (tested on Arch/Fedora/Ubuntu with 6.8 and 6.9 kernels for the first two).
Please investigate.
Thank you
The patch titled
Subject: nilfs2: fix potential kernel bug due to lack of writeback flag waiting
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
nilfs2-fix-potential-kernel-bug-due-to-lack-of-writeback-flag-waiting.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Subject: nilfs2: fix potential kernel bug due to lack of writeback flag waiting
Date: Thu, 30 May 2024 23:15:56 +0900
Destructive writes to a block device on which nilfs2 is mounted can cause
a kernel bug in the folio/page writeback start routine or writeback end
routine (__folio_start_writeback in the log below):
kernel BUG at mm/page-writeback.c:3070!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
...
RIP: 0010:__folio_start_writeback+0xbaa/0x10e0
Code: 25 ff 0f 00 00 0f 84 18 01 00 00 e8 40 ca c6 ff e9 17 f6 ff ff
e8 36 ca c6 ff 4c 89 f7 48 c7 c6 80 c0 12 84 e8 e7 b3 0f 00 90 <0f>
0b e8 1f ca c6 ff 4c 89 f7 48 c7 c6 a0 c6 12 84 e8 d0 b3 0f 00
...
Call Trace:
<TASK>
nilfs_segctor_do_construct+0x4654/0x69d0 [nilfs2]
nilfs_segctor_construct+0x181/0x6b0 [nilfs2]
nilfs_segctor_thread+0x548/0x11c0 [nilfs2]
kthread+0x2f0/0x390
ret_from_fork+0x4b/0x80
ret_from_fork_asm+0x1a/0x30
</TASK>
This is because when the log writer starts a writeback for segment summary
blocks or a super root block that use the backing device's page cache, it
does not wait for the ongoing folio/page writeback, resulting in an
inconsistent writeback state.
Fix this issue by waiting for ongoing writebacks when putting
folios/pages on the backing device into writeback state.
Link: https://lkml.kernel.org/r/20240530141556.4411-1-konishi.ryusuke@gmail.com
Fixes: 9ff05123e3bf ("nilfs2: segment constructor")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Tested-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/nilfs2/segment.c | 3 +++
1 file changed, 3 insertions(+)
--- a/fs/nilfs2/segment.c~nilfs2-fix-potential-kernel-bug-due-to-lack-of-writeback-flag-waiting
+++ a/fs/nilfs2/segment.c
@@ -1652,6 +1652,7 @@ static void nilfs_segctor_prepare_write(
if (bh->b_folio != bd_folio) {
if (bd_folio) {
folio_lock(bd_folio);
+ folio_wait_writeback(bd_folio);
folio_clear_dirty_for_io(bd_folio);
folio_start_writeback(bd_folio);
folio_unlock(bd_folio);
@@ -1665,6 +1666,7 @@ static void nilfs_segctor_prepare_write(
if (bh == segbuf->sb_super_root) {
if (bh->b_folio != bd_folio) {
folio_lock(bd_folio);
+ folio_wait_writeback(bd_folio);
folio_clear_dirty_for_io(bd_folio);
folio_start_writeback(bd_folio);
folio_unlock(bd_folio);
@@ -1681,6 +1683,7 @@ static void nilfs_segctor_prepare_write(
}
if (bd_folio) {
folio_lock(bd_folio);
+ folio_wait_writeback(bd_folio);
folio_clear_dirty_for_io(bd_folio);
folio_start_writeback(bd_folio);
folio_unlock(bd_folio);
_
Patches currently in -mm which might be from konishi.ryusuke(a)gmail.com are
nilfs2-fix-potential-kernel-bug-due-to-lack-of-writeback-flag-waiting.patch
On 5/24/2024 12:17 PM, David Howells wrote:
> Hi Christian,
>
> Can you pick this up, please?
>
> Thanks,
> David
> ---
> From: Marc Dionne<marc.dionne(a)auristor.com>
>
> afs: Don't cross .backup mountpoint from backup volume
>
> Don't cross a mountpoint that explicitly specifies a backup volume
> (target is <vol>.backup) when starting from a backup volume.
>
> It it not uncommon to mount a volume's backup directly in the volume
> itself. This can cause tools that are not paying attention to get
> into a loop mounting the volume onto itself as they attempt to
> traverse the tree, leading to a variety of problems.
>
> This doesn't prevent the general case of loops in a sequence of
> mountpoints, but addresses a common special case in the same way
> as other afs clients.
>
> Reported-by: Jan Henrik Sylvester<jan.henrik.sylvester(a)uni-hamburg.de>
> Link:http://lists.infradead.org/pipermail/linux-afs/2024-May/008454.html
> Reported-by: Markus Suvanto<markus.suvanto(a)gmail.com>
> Link:http://lists.infradead.org/pipermail/linux-afs/2024-February/008074.ht…
> Signed-off-by: Marc Dionne<marc.dionne(a)auristor.com>
> Signed-off-by: David Howells<dhowells(a)redhat.com>
> Reviewed-by: Jeffrey Altman<jaltman(a)auristor.com>
> cc:linux-afs@lists.infradead.org
> ---
> fs/afs/mntpt.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/fs/afs/mntpt.c b/fs/afs/mntpt.c
> index 97f50e9fd9eb..297487ee8323 100644
> --- a/fs/afs/mntpt.c
> +++ b/fs/afs/mntpt.c
> @@ -140,6 +140,11 @@ static int afs_mntpt_set_params(struct fs_context *fc, struct dentry *mntpt)
> put_page(page);
> if (ret < 0)
> return ret;
> +
> + /* Don't cross a backup volume mountpoint from a backup volume */
> + if (src_as->volume && src_as->volume->type == AFSVL_BACKVOL &&
> + ctx->type == AFSVL_BACKVOL)
> + return -ENODEV;
> }
>
> return 0;
Please add
cc: stable(a)vger.kernel.org
when it is applied to vfs-fixes.
Thank you.
Jeffrey Altman
On 5/24/2024 12:17 PM, David Howells wrote:
> Hi Christian,
>
> Can you pick this up, please?
>
> Thanks,
> David
> ---
> From: Marc Dionne<marc.dionne(a)auristor.com>
>
> afs: Don't cross .backup mountpoint from backup volume
>
> Don't cross a mountpoint that explicitly specifies a backup volume
> (target is <vol>.backup) when starting from a backup volume.
>
> It it not uncommon to mount a volume's backup directly in the volume
> itself. This can cause tools that are not paying attention to get
> into a loop mounting the volume onto itself as they attempt to
> traverse the tree, leading to a variety of problems.
>
> This doesn't prevent the general case of loops in a sequence of
> mountpoints, but addresses a common special case in the same way
> as other afs clients.
>
> Reported-by: Jan Henrik Sylvester<jan.henrik.sylvester(a)uni-hamburg.de>
> Link:http://lists.infradead.org/pipermail/linux-afs/2024-May/008454.html
> Reported-by: Markus Suvanto<markus.suvanto(a)gmail.com>
> Link:http://lists.infradead.org/pipermail/linux-afs/2024-February/008074.ht…
> Signed-off-by: Marc Dionne<marc.dionne(a)auristor.com>
> Signed-off-by: David Howells<dhowells(a)redhat.com>
> Reviewed-by: Jeffrey Altman<jaltman(a)auristor.com>
> cc:linux-afs@lists.infradead.org
> ---
> fs/afs/mntpt.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/fs/afs/mntpt.c b/fs/afs/mntpt.c
> index 97f50e9fd9eb..297487ee8323 100644
> --- a/fs/afs/mntpt.c
> +++ b/fs/afs/mntpt.c
> @@ -140,6 +140,11 @@ static int afs_mntpt_set_params(struct fs_context *fc, struct dentry *mntpt)
> put_page(page);
> if (ret < 0)
> return ret;
> +
> + /* Don't cross a backup volume mountpoint from a backup volume */
> + if (src_as->volume && src_as->volume->type == AFSVL_BACKVOL &&
> + ctx->type == AFSVL_BACKVOL)
> + return -ENODEV;
> }
>
> return 0;
Please add
cc: stable(a)vger.kernel.org
when it is applied to vfs-fixes.
Thank you.
Jeffrey Altman
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: 2a38e4ca302280fdcce370ba2bee79bac16c4587
Gitweb: https://git.kernel.org/tip/2a38e4ca302280fdcce370ba2bee79bac16c4587
Author: Dave Hansen <dave.hansen(a)linux.intel.com>
AuthorDate: Fri, 17 May 2024 13:05:34 -07:00
Committer: Dave Hansen <dave.hansen(a)linux.intel.com>
CommitterDate: Thu, 30 May 2024 08:29:45 -07:00
x86/cpu: Provide default cache line size if not enumerated
tl;dr: CPUs with CPUID.80000008H but without CPUID.01H:EDX[CLFSH]
will end up reporting cache_line_size()==0 and bad things happen.
Fill in a default on those to avoid the problem.
Long Story:
The kernel dies a horrible death if c->x86_cache_alignment (aka.
cache_line_size() is 0. Normally, this value is populated from
c->x86_clflush_size.
Right now the code is set up to get c->x86_clflush_size from two
places. First, modern CPUs get it from CPUID. Old CPUs that don't
have leaf 0x80000008 (or CPUID at all) just get some sane defaults
from the kernel in get_cpu_address_sizes().
The vast majority of CPUs that have leaf 0x80000008 also get
->x86_clflush_size from CPUID. But there are oddballs.
Intel Quark CPUs[1] and others[2] have leaf 0x80000008 but don't set
CPUID.01H:EDX[CLFSH], so they skip over filling in ->x86_clflush_size:
cpuid(0x00000001, &tfms, &misc, &junk, &cap0);
if (cap0 & (1<<19))
c->x86_clflush_size = ((misc >> 8) & 0xff) * 8;
So they: land in get_cpu_address_sizes() and see that CPUID has level
0x80000008 and jump into the side of the if() that does not fill in
c->x86_clflush_size. That assigns a 0 to c->x86_cache_alignment, and
hilarity ensues in code like:
buffer = kzalloc(ALIGN(sizeof(*buffer), cache_line_size()),
GFP_KERNEL);
To fix this, always provide a sane value for ->x86_clflush_size.
Big thanks to Andy Shevchenko for finding and reporting this and also
providing a first pass at a fix. But his fix was only partial and only
worked on the Quark CPUs. It would not, for instance, have worked on
the QEMU config.
1. https://raw.githubusercontent.com/InstLatx64/InstLatx64/master/GenuineIntel…
2. You can also get this behavior if you use "-cpu 486,+clzero"
in QEMU.
[ dhansen: remove 'vp_bits_from_cpuid' reference in changelog
because bpetkov brutally murdered it recently. ]
Fixes: fbf6449f84bf ("x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach")
Reported-by: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Tested-by: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
Tested-by: Jörn Heusipp <osmanx(a)heusipp.de>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/all/20240516173928.3960193-1-andriy.shevchenko@linu…
Link: https://lore.kernel.org/lkml/5e31cad3-ad4d-493e-ab07-724cfbfaba44@heusipp.d…
Link: https://lore.kernel.org/all/20240517200534.8EC5F33E%40davehans-spike.ostc.i…
---
arch/x86/kernel/cpu/common.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 2b170da..e31293c 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1075,6 +1075,10 @@ void get_cpu_address_sizes(struct cpuinfo_x86 *c)
c->x86_virt_bits = (eax >> 8) & 0xff;
c->x86_phys_bits = eax & 0xff;
+
+ /* Provide a sane default if not enumerated: */
+ if (!c->x86_clflush_size)
+ c->x86_clflush_size = 32;
}
c->x86_cache_bits = c->x86_phys_bits;
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: b9210e56d71d9deb1ad692e405f6b2394f7baa4d
Gitweb: https://git.kernel.org/tip/b9210e56d71d9deb1ad692e405f6b2394f7baa4d
Author: Dave Hansen <dave.hansen(a)linux.intel.com>
AuthorDate: Fri, 17 May 2024 13:05:34 -07:00
Committer: Dave Hansen <dave.hansen(a)linux.intel.com>
CommitterDate: Thu, 30 May 2024 07:14:27 -07:00
x86/cpu: Provide default cache line size if not enumerated
tl;dr: CPUs with CPUID.80000008H but without CPUID.01H:EDX[CLFSH]
will end up reporting cache_line_size()==0 and bad things happen.
Fill in a default on those to avoid the problem.
Long Story:
The kernel dies a horrible death if c->x86_cache_alignment (aka.
cache_line_size() is 0. Normally, this value is populated from
c->x86_clflush_size.
Right now the code is set up to get c->x86_clflush_size from two
places. First, modern CPUs get it from CPUID. Old CPUs that don't
have leaf 0x80000008 (or CPUID at all) just get some sane defaults
from the kernel in get_cpu_address_sizes().
The vast majority of CPUs that have leaf 0x80000008 also get
->x86_clflush_size from CPUID. But there are oddballs.
Intel Quark CPUs[1] and others[2] have leaf 0x80000008 but don't set
CPUID.01H:EDX[CLFSH], so they skip over filling in ->x86_clflush_size:
cpuid(0x00000001, &tfms, &misc, &junk, &cap0);
if (cap0 & (1<<19))
c->x86_clflush_size = ((misc >> 8) & 0xff) * 8;
So they: land in get_cpu_address_sizes() and see that CPUID has level
0x80000008 and jump into the side of the if() that does not fill in
c->x86_clflush_size. That assigns a 0 to c->x86_cache_alignment, and
hilarity ensues in code like:
buffer = kzalloc(ALIGN(sizeof(*buffer), cache_line_size()),
GFP_KERNEL);
To fix this, always provide a sane value for ->x86_clflush_size.
Big thanks to Andy Shevchenko for finding and reporting this and also
providing a first pass at a fix. But his fix was only partial and only
worked on the Quark CPUs. It would not, for instance, have worked on
the QEMU config.
1. https://raw.githubusercontent.com/InstLatx64/InstLatx64/master/GenuineIntel…
2. You can also get this behavior if you use "-cpu 486,+clzero"
in QEMU.
[ dhansen: remove 'vp_bits_from_cpuid' reference in changelog
because bpetkov brutally murdered it recently. ]
Fixes: fbf6449f84bf ("x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach")
Reported-by: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Tested-by: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
Tested-by: Jörn Heusipp <osmanx(a)heusipp.de>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/all/20240516173928.3960193-1-andriy.shevchenko@linu…
Link: https://lore.kernel.org/lkml/5e31cad3-ad4d-493e-ab07-724cfbfaba44@heusipp.d…
Link: https://lore.kernel.org/all/20240517200534.8EC5F33E%40davehans-spike.ostc.i…
---
arch/x86/kernel/cpu/common.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 2b170da..373b16b 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1070,6 +1070,10 @@ void get_cpu_address_sizes(struct cpuinfo_x86 *c)
cpu_has(c, X86_FEATURE_PSE36))
c->x86_phys_bits = 36;
}
+
+ /* Provide a sane default if not enumerated: */
+ if (!c->x86_clflush_size)
+ c->x86_clflush_size = 32;
} else {
cpuid(0x80000008, &eax, &ebx, &ecx, &edx);
From: Dave Hansen <dave.hansen(a)linux.intel.com>
tl;dr: CPUs with CPUID.80000008H but without CPUID.01H:EDX[CLFSH]
will end up reporting cache_line_size()==0 and bad things happen.
Fill in a default on those to avoid the problem.
Long Story:
The kernel dies a horrible death if c->x86_cache_alignment (aka.
cache_line_size() is 0. Normally, this value is populated from
c->x86_clflush_size.
Right now the code is set up to get c->x86_clflush_size from two
places. First, modern CPUs get it from CPUID. Old CPUs that don't
have leaf 0x80000008 (or CPUID at all) just get some sane defaults
from the kernel in get_cpu_address_sizes().
The vast majority of CPUs that have leaf 0x80000008 also get
->x86_clflush_size from CPUID. But there are oddballs.
Intel Quark CPUs[1] and others[2] have leaf 0x80000008 but don't set
CPUID.01H:EDX[CLFSH], so they skip over filling in ->x86_clflush_size:
cpuid(0x00000001, &tfms, &misc, &junk, &cap0);
if (cap0 & (1<<19))
c->x86_clflush_size = ((misc >> 8) & 0xff) * 8;
So they: land in get_cpu_address_sizes(), set vp_bits_from_cpuid=0 and
never fill in c->x86_clflush_size, assign c->x86_cache_alignment, and
hilarity ensues in code like:
buffer = kzalloc(ALIGN(sizeof(*buffer), cache_line_size()),
GFP_KERNEL);
To fix this, always provide a sane value for ->x86_clflush_size.
Big thanks to Andy Shevchenko for finding and reporting this and also
providing a first pass at a fix. But his fix was only partial and only
worked on the Quark CPUs. It would not, for instance, have worked on
the QEMU config.
1. https://raw.githubusercontent.com/InstLatx64/InstLatx64/master/GenuineIntel…
2. You can also get this behavior if you use "-cpu 486,+clzero"
in QEMU.
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Reported-by: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
Fixes: fbf6449f84bf ("x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach")
Link: https://lore.kernel.org/all/20240516173928.3960193-1-andriy.shevchenko@linu…
Cc: stable(a)vger.kernel.org
---
b/arch/x86/kernel/cpu/common.c | 4 ++++
1 file changed, 4 insertions(+)
diff -puN arch/x86/kernel/cpu/common.c~default-x86_clflush_size arch/x86/kernel/cpu/common.c
--- a/arch/x86/kernel/cpu/common.c~default-x86_clflush_size 2024-05-17 12:51:25.886169008 -0700
+++ b/arch/x86/kernel/cpu/common.c 2024-05-17 13:03:09.761999885 -0700
@@ -1064,6 +1064,10 @@ void get_cpu_address_sizes(struct cpuinf
c->x86_virt_bits = (eax >> 8) & 0xff;
c->x86_phys_bits = eax & 0xff;
+
+ /* Provide a sane default if not enumerated: */
+ if (!c->x86_clflush_size)
+ c->x86_clflush_size = 32;
} else {
if (IS_ENABLED(CONFIG_X86_64)) {
c->x86_clflush_size = 64;
_
From: Bill Wendling <morbo(a)google.com>
Work for __counted_by on generic pointers in structures (not just
flexible array members) has started landing in Clang 19 (current tip of
tree). During the development of this feature, a restriction was added
to __counted_by to prevent the flexible array member's element type from
including a flexible array member itself such as:
struct foo {
int count;
char buf[];
};
struct bar {
int count;
struct foo data[] __counted_by(count);
};
because the size of data cannot be calculated with the standard array
size formula:
sizeof(struct foo) * count
This restriction was downgraded to a warning but due to CONFIG_WERROR,
it can still break the build. The application of __counted_by on the
states member of 'struct _StateArray' triggers this restriction,
resulting in:
drivers/gpu/drm/radeon/pptable.h:442:5: error: 'counted_by' should not be applied to an array with element of unknown size because 'ATOM_PPLIB_STATE_V2' (aka 'struct _ATOM_PPLIB_STATE_V2') is a struct type with a flexible array member. This will be an error in a future compiler version [-Werror,-Wbounds-safety-counted-by-elt-type-unknown-size]
442 | ATOM_PPLIB_STATE_V2 states[] __counted_by(ucNumEntries);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
Remove this use of __counted_by to fix the warning/error. However,
rather than remove it altogether, leave it commented, as it may be
possible to support this in future compiler releases.
Cc: stable(a)vger.kernel.org
Closes: https://github.com/ClangBuiltLinux/linux/issues/2028
Fixes: efade6fe50e7 ("drm/radeon: silence UBSAN warning (v3)")
Signed-off-by: Bill Wendling <morbo(a)google.com>
Co-developed-by: Nathan Chancellor <nathan(a)kernel.org>
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
---
drivers/gpu/drm/radeon/pptable.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/radeon/pptable.h b/drivers/gpu/drm/radeon/pptable.h
index b7f22597ee95..969a8fb0ee9e 100644
--- a/drivers/gpu/drm/radeon/pptable.h
+++ b/drivers/gpu/drm/radeon/pptable.h
@@ -439,7 +439,7 @@ typedef struct _StateArray{
//how many states we have
UCHAR ucNumEntries;
- ATOM_PPLIB_STATE_V2 states[] __counted_by(ucNumEntries);
+ ATOM_PPLIB_STATE_V2 states[] /* __counted_by(ucNumEntries) */;
}StateArray;
---
base-commit: e64e8f7c178e5228e0b2dbb504b9dc75953a319f
change-id: 20240529-drop-counted-by-radeon-states-state-array-01936ded4c18
Best regards,
--
Nathan Chancellor <nathan(a)kernel.org>
Hi all,
This series fixed some issues on bootloader - kernel
interface.
The first two fixed booting with devicetree, the last two
enhanced kernel's tolerance on different bootloader implementation.
Please review.
Thanks
Signed-off-by: Jiaxun Yang <jiaxun.yang(a)flygoat.com>
---
Changes in v3:
- Polish all individual patches with comments received offline.
- Link to v2: https://lore.kernel.org/r/20240522-loongarch-booting-fixes-v2-0-727edb96e54…
Changes in v2:
- Enhance PATCH 3-4 based on off list discussions with Huacai & co.
- Link to v1: https://lore.kernel.org/r/20240521-loongarch-booting-fixes-v1-0-659c201c037…
---
Jiaxun Yang (4):
LoongArch: Fix built-in DTB detection
LoongArch: smp: Add all CPUs enabled by fdt to NUMA node 0
LoongArch: Fix entry point in image header
LoongArch: Override higher address bits in JUMP_VIRT_ADDR
arch/loongarch/include/asm/stackframe.h | 2 +-
arch/loongarch/kernel/head.S | 2 +-
arch/loongarch/kernel/setup.c | 6 ++++--
arch/loongarch/kernel/smp.c | 5 ++++-
arch/loongarch/kernel/vmlinux.lds.S | 2 ++
drivers/firmware/efi/libstub/loongarch.c | 2 +-
6 files changed, 13 insertions(+), 6 deletions(-)
---
base-commit: 124cfbcd6d185d4f50be02d5f5afe61578916773
change-id: 20240521-loongarch-booting-fixes-366e13e7ca55
Best regards,
--
Jiaxun Yang <jiaxun.yang(a)flygoat.com>