The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 5596d9e8b553dacb0ac34bcf873cbbfb16c3ba3e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024071559-reptilian-chaffing-a991@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
5596d9e8b553 ("mm/hugetlb: fix potential race in __update_and_free_hugetlb_folio()")
bd225530a4c7 ("mm/hugetlb_vmemmap: fix race with speculative PFN walkers")
51718e25c53f ("mm: convert arch_clear_hugepage_flags to take a folio")
831bc31a5e82 ("mm: hugetlb: improve the handling of hugetlb allocation failure for freed or in-use hugetlb")
ebc20dcac4ce ("mm: hugetlb_vmemmap: convert page to folio")
c5ad3233ead5 ("hugetlb_vmemmap: use folio argument for hugetlb_vmemmap_* functions")
c24f188b2289 ("hugetlb: batch TLB flushes when restoring vmemmap")
f13b83fdd996 ("hugetlb: batch TLB flushes when freeing vmemmap")
f4b7e3efaddb ("hugetlb: batch PMD split for bulk vmemmap dedup")
91f386bf0772 ("hugetlb: batch freeing of vmemmap pages")
cfb8c75099db ("hugetlb: perform vmemmap restoration on a list of pages")
79359d6d24df ("hugetlb: perform vmemmap optimization on a list of pages")
d67e32f26713 ("hugetlb: restructure pool allocations")
d2cf88c27f51 ("hugetlb: optimize update_and_free_pages_bulk to avoid lock cycles")
30a89adf872d ("hugetlb: check for hugetlb folio before vmemmap_restore")
d5b43e9683ec ("hugetlb: convert remove_pool_huge_page() to remove_pool_hugetlb_folio()")
04bbfd844b99 ("hugetlb: remove a few calls to page_folio()")
fde1c4ecf916 ("mm: hugetlb: skip initialization of gigantic tail struct pages if freed by HVO")
3ee0aa9f0675 ("mm: move some shrinker-related function declarations to mm/internal.h")
d8f5f7e445f0 ("hugetlb: set hugetlb page flag before optimizing vmemmap")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5596d9e8b553dacb0ac34bcf873cbbfb16c3ba3e Mon Sep 17 00:00:00 2001
From: Miaohe Lin <linmiaohe(a)huawei.com>
Date: Mon, 8 Jul 2024 10:51:27 +0800
Subject: [PATCH] mm/hugetlb: fix potential race in
__update_and_free_hugetlb_folio()
There is a potential race between __update_and_free_hugetlb_folio() and
try_memory_failure_hugetlb():
  CPU1                                    CPU2
  __update_and_free_hugetlb_folio         try_memory_failure_hugetlb
                                           folio_test_hugetlb
                                            -- It's still hugetlb folio.
   folio_clear_hugetlb_hwpoison
                                           spin_lock_irq(&hugetlb_lock);
                                            __get_huge_page_for_hwpoison
                                             folio_set_hugetlb_hwpoison
                                           spin_unlock_irq(&hugetlb_lock);
   spin_lock_irq(&hugetlb_lock);
   __folio_clear_hugetlb(folio);
    -- Hugetlb flag is cleared but too late.
   spin_unlock_irq(&hugetlb_lock);
When the above race occurs, the raw error page info is leaked. Even
worse, the raw error pages won't have the hwpoisoned flag set and can
reach the pcplists/buddy allocator. Fix this issue by deferring
folio_clear_hugetlb_hwpoison() until __folio_clear_hugetlb() is done, so
that all raw error pages end up with the hwpoisoned flag set.
Link: https://lkml.kernel.org/r/20240708025127.107713-1-linmiaohe@huawei.com
Fixes: 32c877191e02 ("hugetlb: do not clear hugetlb dtor until allocating vmemmap")
Signed-off-by: Miaohe Lin <linmiaohe(a)huawei.com>
Acked-by: Muchun Song <muchun.song(a)linux.dev>
Reviewed-by: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 2afb70171b76..fe44324d6383 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1725,13 +1725,6 @@ static void __update_and_free_hugetlb_folio(struct hstate *h,
return;
}
- /*
- * Move PageHWPoison flag from head page to the raw error pages,
- * which makes any healthy subpages reusable.
- */
- if (unlikely(folio_test_hwpoison(folio)))
- folio_clear_hugetlb_hwpoison(folio);
-
/*
* If vmemmap pages were allocated above, then we need to clear the
* hugetlb flag under the hugetlb lock.
@@ -1742,6 +1735,13 @@ static void __update_and_free_hugetlb_folio(struct hstate *h,
spin_unlock_irq(&hugetlb_lock);
}
+ /*
+ * Move PageHWPoison flag from head page to the raw error pages,
+ * which makes any healthy subpages reusable.
+ */
+ if (unlikely(folio_test_hwpoison(folio)))
+ folio_clear_hugetlb_hwpoison(folio);
+
folio_ref_unfreeze(folio, 1);
/*
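For readability, the resulting order of operations in
__update_and_free_hugetlb_folio() after this change can be sketched as
follows (condensed, not the literal kernel source; error paths and
unrelated details are omitted):

	static void __update_and_free_hugetlb_folio(struct hstate *h,
						    struct folio *folio)
	{
		...
		/* 1. Clear the hugetlb flag under hugetlb_lock. */
		spin_lock_irq(&hugetlb_lock);
		__folio_clear_hugetlb(folio);
		spin_unlock_irq(&hugetlb_lock);

		/*
		 * 2. Only now move PageHWPoison from the head page to the raw
		 *    error pages, so a racing try_memory_failure_hugetlb() can
		 *    no longer mark the folio hwpoisoned behind our back.
		 */
		if (unlikely(folio_test_hwpoison(folio)))
			folio_clear_hugetlb_hwpoison(folio);

		folio_ref_unfreeze(folio, 1);
		...
	}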
On Wed, Aug 07, 2024 at 06:00:11AM +0300, ahmed Ehab wrote:
> On Sat, Aug 3, 2024 at 3:51 AM Boqun Feng <boqun.feng(a)gmail.com> wrote:
>
> > On Mon, Jul 15, 2024 at 04:26:38PM +0300, botta633 wrote:
> > > From: Ahmed Ehab <bottaawesome633(a)gmail.com>
> > >
> > > Check that lockdep_map->name does not change when setting the subclass,
> > > so that the lock class and the subclass keep the same name.
> > >
> > > Reported-by: <syzbot+7f4a6f7f7051474e40ad(a)syzkaller.appspotmail.com>
> > > Fixes: de8f5e4f2dc1f ("lockdep: Introduce wait-type checks")
> > > Cc: <stable(a)vger.kernel.org>
> >
> > You seem to have missed my comment at v2:
> >
> > https://lore.kernel.org/lkml/ZpRKcHNZfsMuACRG@boqun-archlinux/
> >
> > , i.e. you don't need the Reported-by, Fixes and Cc tag for the patch
> > that adds a test case.
> >
> > > Signed-off-by: Ahmed Ehab <bottaawesome633(a)gmail.com>
> > > ---
> > > v3->v4:
> > > - Fixed subject line truncation.
> > >
> > > lib/locking-selftest.c | 21 +++++++++++++++++++++
> > > 1 file changed, 21 insertions(+)
> > >
> > > diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c
> > > index 6f6a5fc85b42..aeed613799ca 100644
> > > --- a/lib/locking-selftest.c
> > > +++ b/lib/locking-selftest.c
> > > @@ -2710,6 +2710,25 @@ static void local_lock_3B(void)
> > >
> > > }
> > >
> > > + /**
> >
> > ^ there is a trailing space here; next time you can detect this by using
> > checkpatch. Also, the "/**" style is specifically for function signature
> > comments; you could just use "/*" here.
> >
> > > + * after setting the subclass the lockdep_map.name changes
> > > + * if we initialize a new string literal for the subclass
> > > + * we will have a new name pointer
> > > + */
> > > +static void class_subclass_X1_name_test(void)
> > > +{
> > > + printk("--------------------------------------------------------------------------\n");
> > > + printk(" | class and subclass name test|\n");
> > > + printk(" ---------------------\n");
> > > + const char *name_before_setting_subclass = rwsem_X1.dep_map.name;
> > > + const char *name_after_setting_subclass;
> > > +
> > > + WARN_ON(!rwsem_X1.dep_map.name);
> > > + lockdep_set_subclass(&rwsem_X1, 1);
> > > + name_after_setting_subclass = rwsem_X1.dep_map.name;
> > > + WARN_ON(name_before_setting_subclass != name_after_setting_subclass);
> > > +}
> > > +
> > > static void local_lock_tests(void)
> > > {
> > > printk("--------------------------------------------------------------------------\n");
> > > @@ -2916,6 +2935,8 @@ void locking_selftest(void)
> > >
> > > local_lock_tests();
> > >
> > > + class_subclass_X1_name_test();
> > > +
> >
> > I got this in the serial log:
> >
> > [ 0.619454] --------------------------------------------------------------------------
> > [ 0.621463] | local_lock tests |
> > [ 0.622326] ---------------------
> > [ 0.623211] local_lock inversion 2: ok |
> > [ 0.624904] local_lock inversion 3A: ok |
> > [ 0.626740] local_lock inversion 3B: ok |
> > [ 0.628492] --------------------------------------------------------------------------
> > [ 0.630513] | class and subclass name test|
> > [ 0.631614] ---------------------
> > [ 0.632502] hardirq_unsafe_softirq_safe: ok |
> >
> > two problems here:
> >
> > 1) The "class and subclass name test" line interrupts the output of
> > testsuite "local_lock tests".
> >
> > 2) Instead of a WARN_ON(), could you look into using dotest() to
> > print "ok" if the test passes, which is consistent with other
> >
> tests.
> >
>
> I wrote it this way:
> static void lock_class_subclass_X1(void)
> {
> const char *name_before_setting_subclass = rwsem_X1.dep_map.name;
> const char *name_after_setting_subclass;
>
> lockdep_set_subclass(&rwsem_X1, 1);
> name_after_setting_subclass = rwsem_X1.dep_map.name;
> debug_locks = name_before_setting_subclass == name_after_setting_subclass;
I think you could use:
DEBUG_LOCKS_WARN_ON(name_before_setting_subclass != name_after_setting_subclass);
here.
Regards,
Boqun
> }
> ...
> static void class_subclass_X1_name_test(void)
> {
> printk("--------------------------------------------------------------------------\n");
> printk(" | class and subclass name test|\n");
> printk(" ---------------------\n");
>
> print_testname("lock class and subclass same name");
> dotest(lock_class_subclass_X1, SUCCESS, LOCKTYPE_RWSEM);
> pr_cont("\n");
> }
> However, assigning a value to debug_locks seems very uncommon. I tried to
> check other test cases; however, they seem to rely on the method they are
> testing. Do you have a suggestion for my scenario if I want to compare the
> names before and after setting the subclass?
> Or do you suggest that I follow a different approach other than comparing the
> names, such as checking debug_locks in lockdep_init_map_type() and returning
> when we have multiple instantiations for lock->name?
>
> >
> > Could you please fix all above problems and send another version of this
> > patch (no need to resend the first one)? Thanks!
> >
> > Regards,
> > Boqun
> >
> > > print_testname("hardirq_unsafe_softirq_safe");
> > > dotest(hardirq_deadlock_softirq_not_deadlock, FAILURE, LOCKTYPE_SPECIAL);
> > > pr_cont("\n");
> > > --
> > > 2.45.2
> > >
> >
>
> Regards,
> Ahmed
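For reference, a sketch only (not a submitted patch) of how the test might
look with both of Boqun's suggestions folded in: dotest() provides the usual
ok/FAILED reporting, and DEBUG_LOCKS_WARN_ON() from <linux/debug_locks.h> is
used instead of assigning to debug_locks directly. rwsem_X1, print_testname()
and dotest() are the helpers that already exist in lib/locking-selftest.c:

	static void lock_class_subclass_X1(void)
	{
		const char *name_before_setting_subclass = rwsem_X1.dep_map.name;

		lockdep_set_subclass(&rwsem_X1, 1);
		/* A name mismatch turns debug_locks off, which dotest() reports as a failure. */
		DEBUG_LOCKS_WARN_ON(name_before_setting_subclass != rwsem_X1.dep_map.name);
	}

	static void class_subclass_X1_name_test(void)
	{
		print_testname("lock class and subclass same name");
		dotest(lock_class_subclass_X1, SUCCESS, LOCKTYPE_RWSEM);
		pr_cont("\n");
	}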
From: Michal Kubiak <michal.kubiak(a)intel.com>
The initialization of vport interrupts consists of two functions:
1) idpf_vport_intr_init() where a generic configuration is done
2) idpf_vport_intr_req_irq() where the irq for each q_vector is
requested.
The first function used to create a base name for each interrupt with a
kasprintf() call. Unfortunately, although that call allocated memory for a
text buffer, that memory was never released.
Fix this by no longer creating the interrupt base name in 1). Instead,
always build the full interrupt name in function 2); there is no need to
create a base name separately, since function 2) is never called outside
of the idpf_vport_intr_init() context.
Fixes: d4d558718266 ("idpf: initialize interrupts and enable vport")
Cc: stable(a)vger.kernel.org # 6.7
Signed-off-by: Michal Kubiak <michal.kubiak(a)intel.com>
Reviewed-by: Pavan Kumar Linga <pavan.kumar.linga(a)intel.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin(a)intel.com>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Tested-by: Krishneil Singh <krishneil.k.singh(a)intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com>
---
drivers/net/ethernet/intel/idpf/idpf_txrx.c | 19 ++++++++-----------
1 file changed, 8 insertions(+), 11 deletions(-)
diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
index af2879f03b8d..a2f9f252694a 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_txrx.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
@@ -3780,13 +3780,15 @@ void idpf_vport_intr_update_itr_ena_irq(struct idpf_q_vector *q_vector)
/**
* idpf_vport_intr_req_irq - get MSI-X vectors from the OS for the vport
* @vport: main vport structure
- * @basename: name for the vector
*/
-static int idpf_vport_intr_req_irq(struct idpf_vport *vport, char *basename)
+static int idpf_vport_intr_req_irq(struct idpf_vport *vport)
{
struct idpf_adapter *adapter = vport->adapter;
+ const char *drv_name, *if_name, *vec_name;
int vector, err, irq_num, vidx;
- const char *vec_name;
+
+ drv_name = dev_driver_string(&adapter->pdev->dev);
+ if_name = netdev_name(vport->netdev);
for (vector = 0; vector < vport->num_q_vectors; vector++) {
struct idpf_q_vector *q_vector = &vport->q_vectors[vector];
@@ -3804,8 +3806,8 @@ static int idpf_vport_intr_req_irq(struct idpf_vport *vport, char *basename)
else
continue;
- name = kasprintf(GFP_KERNEL, "%s-%s-%d", basename, vec_name,
- vidx);
+ name = kasprintf(GFP_KERNEL, "%s-%s-%s-%d", drv_name, if_name,
+ vec_name, vidx);
err = request_irq(irq_num, idpf_vport_intr_clean_queues, 0,
name, q_vector);
@@ -4326,7 +4328,6 @@ int idpf_vport_intr_alloc(struct idpf_vport *vport)
*/
int idpf_vport_intr_init(struct idpf_vport *vport)
{
- char *int_name;
int err;
err = idpf_vport_intr_init_vec_idx(vport);
@@ -4340,11 +4341,7 @@ int idpf_vport_intr_init(struct idpf_vport *vport)
if (err)
goto unroll_vectors_alloc;
- int_name = kasprintf(GFP_KERNEL, "%s-%s",
- dev_driver_string(&vport->adapter->pdev->dev),
- vport->netdev->name);
-
- err = idpf_vport_intr_req_irq(vport, int_name);
+ err = idpf_vport_intr_req_irq(vport);
if (err)
goto unroll_vectors_alloc;
--
2.42.0
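In short, the leak and the fix can be sketched as follows (simplified;
taken from the hunks above, not the complete driver code):

	/* Before: idpf_vport_intr_init() allocated a base name that nothing freed. */
	int_name = kasprintf(GFP_KERNEL, "%s-%s",
			     dev_driver_string(&vport->adapter->pdev->dev),
			     vport->netdev->name);
	err = idpf_vport_intr_req_irq(vport, int_name);	/* int_name is never kfree()d */

	/* After: idpf_vport_intr_req_irq() builds the one string it actually needs. */
	name = kasprintf(GFP_KERNEL, "%s-%s-%s-%d", drv_name, if_name,
			 vec_name, vidx);
	err = request_irq(irq_num, idpf_vport_intr_clean_queues, 0,
			  name, q_vector);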
btf__type_by_id() can return NULL when the type id cannot be resolved, so
its return value must be checked before being dereferenced. Add a NULL
check to fix the issue.
Cc: stable(a)vger.kernel.org
Fixes: 430025e5dca5 ("libbpf: Add subskeleton scaffolding")
Signed-off-by: Ma Ke <make24(a)iscas.ac.cn>
---
tools/lib/bpf/libbpf.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index a3be6f8fac09..d1eb45d16054 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -13850,6 +13850,9 @@ int bpf_object__open_subskeleton(struct bpf_object_subskeleton *s)
var = btf_var_secinfos(map_type);
for (i = 0; i < len; i++, var++) {
var_type = btf__type_by_id(btf, var->type);
+ if (!var_type)
+ return libbpf_err(-ENOENT);
+
var_name = btf__name_by_offset(btf, var_type->name_off);
if (strcmp(var_name, var_skel->name) == 0) {
*var_skel->addr = map->mmaped + var->offset;
--
2.25.1
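For context, btf__type_by_id() returns NULL when the type id cannot be
resolved, which is why the unchecked dereference above could crash. A small,
purely illustrative caller-side sketch (the helper name is made up, the
libbpf calls are real):

	#include <stdio.h>
	#include <bpf/btf.h>

	static void dump_type_name(const struct btf *btf, __u32 id)
	{
		const struct btf_type *t = btf__type_by_id(btf, id);

		if (!t) {	/* e.g. id out of range or corrupted BTF */
			fprintf(stderr, "cannot resolve BTF id %u\n", id);
			return;
		}
		printf("%s\n", btf__name_by_offset(btf, t->name_off));
	}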
The patch titled
Subject: mm/memory-failure: use raw_spinlock_t in struct memory_failure_cpu
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-memory-failure-use-raw_spinlock_t-in-struct-memory_failure_cpu.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Waiman Long <longman(a)redhat.com>
Subject: mm/memory-failure: use raw_spinlock_t in struct memory_failure_cpu
Date: Tue, 6 Aug 2024 12:41:07 -0400
The memory_failure_cpu structure is a per-cpu structure. Access to its
content requires the use of get_cpu_var() to lock in the current CPU and
disable preemption. The use of a regular spinlock_t for locking purposes
is fine for a non-RT kernel.
Since the integration of RT spinlock support into the v5.15 kernel, a
spinlock_t in an RT kernel becomes a sleeping lock, and taking a sleeping
lock in a preemption-disabled context is illegal, resulting in the
following kind of warning.
[12135.732244] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
[12135.732248] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 270076, name: kworker/0:0
[12135.732252] preempt_count: 1, expected: 0
[12135.732255] RCU nest depth: 2, expected: 2
:
[12135.732420] Hardware name: Dell Inc. PowerEdge R640/0HG0J8, BIOS 2.10.2 02/24/2021
[12135.732423] Workqueue: kacpi_notify acpi_os_execute_deferred
[12135.732433] Call Trace:
[12135.732436] <TASK>
[12135.732450] dump_stack_lvl+0x57/0x81
[12135.732461] __might_resched.cold+0xf4/0x12f
[12135.732479] rt_spin_lock+0x4c/0x100
[12135.732491] memory_failure_queue+0x40/0xe0
[12135.732503] ghes_do_memory_failure+0x53/0x390
[12135.732516] ghes_do_proc.constprop.0+0x229/0x3e0
[12135.732575] ghes_proc+0xf9/0x1a0
[12135.732591] ghes_notify_hed+0x6a/0x150
[12135.732602] notifier_call_chain+0x43/0xb0
[12135.732626] blocking_notifier_call_chain+0x43/0x60
[12135.732637] acpi_ev_notify_dispatch+0x47/0x70
[12135.732648] acpi_os_execute_deferred+0x13/0x20
[12135.732654] process_one_work+0x41f/0x500
[12135.732695] worker_thread+0x192/0x360
[12135.732715] kthread+0x111/0x140
[12135.732733] ret_from_fork+0x29/0x50
[12135.732779] </TASK>
Fix it by using a raw_spinlock_t for locking instead. Also move the
pr_err() out of the lock critical section to avoid indeterminate latency
of this call.
Link: https://lkml.kernel.org/r/20240806164107.1044956-1-longman@redhat.com
Fixes: ea8f5fb8a71f ("HWPoison: add memory_failure_queue()")
Signed-off-by: Waiman Long <longman(a)redhat.com>
Cc: "Huang, Ying" <ying.huang(a)intel.com>
Cc: Juri Lelli <juri.lelli(a)redhat.com>
Cc: Len Brown <len.brown(a)intel.com>
Cc: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memory-failure.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)
--- a/mm/memory-failure.c~mm-memory-failure-use-raw_spinlock_t-in-struct-memory_failure_cpu
+++ a/mm/memory-failure.c
@@ -2417,7 +2417,7 @@ struct memory_failure_entry {
struct memory_failure_cpu {
DECLARE_KFIFO(fifo, struct memory_failure_entry,
MEMORY_FAILURE_FIFO_SIZE);
- spinlock_t lock;
+ raw_spinlock_t lock;
struct work_struct work;
};
@@ -2443,19 +2443,21 @@ void memory_failure_queue(unsigned long
{
struct memory_failure_cpu *mf_cpu;
unsigned long proc_flags;
+ bool buffer_overflow;
struct memory_failure_entry entry = {
.pfn = pfn,
.flags = flags,
};
mf_cpu = &get_cpu_var(memory_failure_cpu);
- spin_lock_irqsave(&mf_cpu->lock, proc_flags);
- if (kfifo_put(&mf_cpu->fifo, entry))
+ raw_spin_lock_irqsave(&mf_cpu->lock, proc_flags);
+ buffer_overflow = !kfifo_put(&mf_cpu->fifo, entry);
+ if (!buffer_overflow)
schedule_work_on(smp_processor_id(), &mf_cpu->work);
- else
+ raw_spin_unlock_irqrestore(&mf_cpu->lock, proc_flags);
+ if (buffer_overflow)
pr_err("buffer overflow when queuing memory failure at %#lx\n",
pfn);
- spin_unlock_irqrestore(&mf_cpu->lock, proc_flags);
put_cpu_var(memory_failure_cpu);
}
EXPORT_SYMBOL_GPL(memory_failure_queue);
@@ -2469,9 +2471,9 @@ static void memory_failure_work_func(str
mf_cpu = container_of(work, struct memory_failure_cpu, work);
for (;;) {
- spin_lock_irqsave(&mf_cpu->lock, proc_flags);
+ raw_spin_lock_irqsave(&mf_cpu->lock, proc_flags);
gotten = kfifo_get(&mf_cpu->fifo, &entry);
- spin_unlock_irqrestore(&mf_cpu->lock, proc_flags);
+ raw_spin_unlock_irqrestore(&mf_cpu->lock, proc_flags);
if (!gotten)
break;
if (entry.flags & MF_SOFT_OFFLINE)
@@ -2501,7 +2503,7 @@ static int __init memory_failure_init(vo
for_each_possible_cpu(cpu) {
mf_cpu = &per_cpu(memory_failure_cpu, cpu);
- spin_lock_init(&mf_cpu->lock);
+ raw_spin_lock_init(&mf_cpu->lock);
INIT_KFIFO(mf_cpu->fifo);
INIT_WORK(&mf_cpu->work, memory_failure_work_func);
}
_
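For readers unfamiliar with the PREEMPT_RT constraint, the queueing path
after this patch boils down to the following (annotated restatement of the
hunk above, not new code):

	mf_cpu = &get_cpu_var(memory_failure_cpu);	/* pins the CPU, disables preemption */
	raw_spin_lock_irqsave(&mf_cpu->lock, proc_flags); /* raw lock: still spins on PREEMPT_RT */
	buffer_overflow = !kfifo_put(&mf_cpu->fifo, entry);
	if (!buffer_overflow)
		schedule_work_on(smp_processor_id(), &mf_cpu->work);
	raw_spin_unlock_irqrestore(&mf_cpu->lock, proc_flags);
	if (buffer_overflow)	/* pr_err() is now outside the critical section */
		pr_err("buffer overflow when queuing memory failure at %#lx\n",
		       pfn);
	put_cpu_var(memory_failure_cpu);	/* re-enables preemption */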
Patches currently in -mm which might be from longman(a)redhat.com are
padata-fix-possible-divide-by-0-panic-in-padata_mt_helper.patch
mm-memory-failure-use-raw_spinlock_t-in-struct-memory_failure_cpu.patch
watchdog-handle-the-enodev-failure-case-of-lockup_detector_delay_init-separately.patch
The patch titled
Subject: padata: Fix possible divide-by-0 panic in padata_mt_helper()
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
padata-fix-possible-divide-by-0-panic-in-padata_mt_helper.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Waiman Long <longman(a)redhat.com>
Subject: padata: Fix possible divide-by-0 panic in padata_mt_helper()
Date: Tue, 6 Aug 2024 13:46:47 -0400
We are hit with a not easily reproducible divide-by-0 panic in padata.c at
bootup time.
[ 10.017908] Oops: divide error: 0000 [#1] PREEMPT SMP NOPTI
[ 10.017908] CPU: 26 PID: 2627 Comm: kworker/u1666:1 Not tainted 6.10.0-15.el10.x86_64 #1
[ 10.017908] Hardware name: Lenovo ThinkSystem SR950 [7X12CTO1WW]/[7X12CTO1WW], BIOS [PSE140J-2.30] 07/20/2021
[ 10.017908] Workqueue: events_unbound padata_mt_helper
[ 10.017908] RIP: 0010:padata_mt_helper+0x39/0xb0
:
[ 10.017963] Call Trace:
[ 10.017968] <TASK>
[ 10.018004] ? padata_mt_helper+0x39/0xb0
[ 10.018084] process_one_work+0x174/0x330
[ 10.018093] worker_thread+0x266/0x3a0
[ 10.018111] kthread+0xcf/0x100
[ 10.018124] ret_from_fork+0x31/0x50
[ 10.018138] ret_from_fork_asm+0x1a/0x30
[ 10.018147] </TASK>
Looking at the padata_mt_helper() function, the only way a divide-by-0
panic can happen is when ps->chunk_size is 0. Given the way chunk_size is
initialized in padata_do_multithreaded(), it can be 0 when the min_chunk
field of the passed-in padata_mt_job structure is 0.
Fix this divide-by-0 panic by making sure that chunk_size will be at least
1 no matter what the input parameters are.
Link: https://lkml.kernel.org/r/20240806174647.1050398-1-longman@redhat.com
Fixes: 004ed42638f4 ("padata: add basic support for multithreaded jobs")
Signed-off-by: Waiman Long <longman(a)redhat.com>
Cc: Daniel Jordan <daniel.m.jordan(a)oracle.com>
Cc: Steffen Klassert <steffen.klassert(a)secunet.com>
Cc: Waiman Long <longman(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/padata.c | 7 +++++++
1 file changed, 7 insertions(+)
--- a/kernel/padata.c~padata-fix-possible-divide-by-0-panic-in-padata_mt_helper
+++ a/kernel/padata.c
@@ -517,6 +517,13 @@ void __init padata_do_multithreaded(stru
ps.chunk_size = max(ps.chunk_size, job->min_chunk);
ps.chunk_size = roundup(ps.chunk_size, job->align);
+ /*
+ * chunk_size can be 0 if the caller sets min_chunk to 0. So force it
+ * to at least 1 to prevent divide-by-0 panic in padata_mt_helper().
+ */
+ if (!ps.chunk_size)
+ ps.chunk_size = 1U;
+
list_for_each_entry(pw, &works, pw_list)
if (job->numa_aware) {
int old_node = atomic_read(&last_used_nid);
_
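A worked example of how the zero slips through (input values assumed purely
for illustration):

	/*
	 * Suppose the caller passes a padata_mt_job with min_chunk == 0 and
	 * align == 1, and the size-based estimate of chunk_size already came
	 * out as 0:
	 */
	ps.chunk_size = max(ps.chunk_size, job->min_chunk);	/* max(0, 0)     == 0 */
	ps.chunk_size = roundup(ps.chunk_size, job->align);	/* roundup(0, 1) == 0 */

	/*
	 * padata_mt_helper() later divides by ps.chunk_size, hence the
	 * divide-by-0 and the clamp added above:
	 */
	if (!ps.chunk_size)
		ps.chunk_size = 1U;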
Patches currently in -mm which might be from longman(a)redhat.com are
padata-fix-possible-divide-by-0-panic-in-padata_mt_helper.patch
watchdog-handle-the-enodev-failure-case-of-lockup_detector_delay_init-separately.patch
The following changes since commit 6d834691da474ed1c648753d3d3a3ef8379fa1c1:
virtio_pci_modern: remove admin queue serialization lock (2024-07-17 05:43:21 -0400)
are available in the Git repository at:
https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus
for you to fetch changes up to 0823dc64586ba5ea13a7d200a5d33e4c5fa45950:
vhost-vdpa: switch to use vmf_insert_pfn() in the fault handler (2024-07-26 03:26:02 -0400)
----------------------------------------------------------------
virtio: bugfix
Fixes a single, long-standing issue with kick pass-through vdpa.
Signed-off-by: Michael S. Tsirkin <mst(a)redhat.com>
----------------------------------------------------------------
Jason Wang (1):
vhost-vdpa: switch to use vmf_insert_pfn() in the fault handler
drivers/vhost/vdpa.c | 8 +-------
1 file changed, 1 insertion(+), 7 deletions(-)