The patch titled
Subject: hugetlb: check for undefined shift on 32 bit architectures
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
hugetlb-check-for-undefined-shift-on-32-bit-architectures.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Mike Kravetz <mike.kravetz(a)oracle.com>
Subject: hugetlb: check for undefined shift on 32 bit architectures
Date: Wed, 15 Feb 2023 17:35:42 -0800
Users can specify the hugetlb page size in the mmap, shmget and
memfd_create system calls. This is done by using 6 bits within the flags
argument to encode the base-2 logarithm of the desired page size. The
routine hstate_sizelog() uses the log2 value to find the corresponding
hugetlb hstate structure. Converting the log2 value (page_size_log) to
potential hugetlb page size is the simple statement:
1UL << page_size_log
Because only 6 bits are used for page_size_log, the left shift can not be
greater than 63. This is fine on 64 bit architectures where a long is 64
bits. However, if a value greater than 31 is passed on a 32 bit
architecture (where long is 32 bits) the shift will result in undefined
behavior. This was generally not an issue as the result of the undefined
shift had to exactly match hugetlb page size to proceed.
Recent improvements in runtime checking have resulted in this undefined
behavior throwing errors such as reported below.
Fix by comparing page_size_log to BITS_PER_LONG before doing shift.
Link: https://lkml.kernel.org/r/20230216013542.138708-1-mike.kravetz@oracle.com
Link: https://lore.kernel.org/lkml/CA+G9fYuei_Tr-vN9GS7SfFyU1y9hNysnf=PB7kT0=yv4M…
Fixes: 42d7395feb56 ("mm: support more pagesizes for MAP_HUGETLB/SHM_HUGETLB")
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Reported-by: Naresh Kamboju <naresh.kamboju(a)linaro.org>
Reviewed-by: Jesper Juhl <jesperjuhl76(a)gmail.com>
Acked-by: Muchun Song <songmuchun(a)bytedance.com>
Tested-by: Linux Kernel Functional Testing <lkft(a)linaro.org>
Tested-by: Naresh Kamboju <naresh.kamboju(a)linaro.org>
Cc: Anders Roxell <anders.roxell(a)linaro.org>
Cc: Andi Kleen <ak(a)linux.intel.com>
Cc: Sasha Levin <sashal(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
--- a/include/linux/hugetlb.h~hugetlb-check-for-undefined-shift-on-32-bit-architectures
+++ a/include/linux/hugetlb.h
@@ -743,7 +743,10 @@ static inline struct hstate *hstate_size
if (!page_size_log)
return &default_hstate;
- return size_to_hstate(1UL << page_size_log);
+ if (page_size_log < BITS_PER_LONG)
+ return size_to_hstate(1UL << page_size_log);
+
+ return NULL;
}
static inline struct hstate *hstate_vma(struct vm_area_struct *vma)
_
Patches currently in -mm which might be from mike.kravetz(a)oracle.com are
hugetlb-check-for-undefined-shift-on-32-bit-architectures.patch
The patch titled
Subject: mm/migrate: fix wrongly apply write bit after mkdirty on sparc64
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-migrate-fix-wrongly-apply-write-bit-after-mkdirty-on-sparc64.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Peter Xu <peterx(a)redhat.com>
Subject: mm/migrate: fix wrongly apply write bit after mkdirty on sparc64
Date: Thu, 16 Feb 2023 10:30:59 -0500
Nick Bowler reported another sparc64 breakage after the young/dirty
persistent work for page migration (per "Link:" below). That's after a
similar report [2].
It turns out page migration was overlooked, and it wasn't failing before
because page migration was not enabled in the initial report test
environment.
David proposed another way [2] to fix this from sparc64 side, but that
patch didn't land somehow. Neither did I check whether there's any other
arch that has similar issues.
Let's fix it for now as simple as moving the write bit handling to be
after dirty, like what we did before.
Note: this is based on mm-unstable, because the breakage was since 6.1 and
we're at a very late stage of 6.2 (-rc8), so I assume for this specific
case we should target this at 6.3.
[1] https://lore.kernel.org/all/20221021160603.GA23307@u164.east.ru/
[2] https://lore.kernel.org/all/20221212130213.136267-1-david@redhat.com/
Link: https://lkml.kernel.org/r/20230216153059.256739-1-peterx@redhat.com
Fixes: 2e3468778dbe ("mm: remember young/dirty bit for page migrations")
Link: https://lore.kernel.org/all/CADyTPExpEqaJiMGoV+Z6xVgL50ZoMJg49B10LcZ=8eg19u…
Signed-off-by: Peter Xu <peterx(a)redhat.com>
Reported-by: Nick Bowler <nbowler(a)draconx.ca>
Acked-by: David Hildenbrand <david(a)redhat.com>
Tested-by: Nick Bowler <nbowler(a)draconx.ca>
Cc: <regressions(a)lists.linux.dev>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
--- a/mm/huge_memory.c~mm-migrate-fix-wrongly-apply-write-bit-after-mkdirty-on-sparc64
+++ a/mm/huge_memory.c
@@ -3272,8 +3272,6 @@ void remove_migration_pmd(struct page_vm
pmde = mk_huge_pmd(new, READ_ONCE(vma->vm_page_prot));
if (pmd_swp_soft_dirty(*pvmw->pmd))
pmde = pmd_mksoft_dirty(pmde);
- if (is_writable_migration_entry(entry))
- pmde = maybe_pmd_mkwrite(pmde, vma);
if (pmd_swp_uffd_wp(*pvmw->pmd))
pmde = pmd_wrprotect(pmd_mkuffd_wp(pmde));
if (!is_migration_entry_young(entry))
@@ -3281,6 +3279,10 @@ void remove_migration_pmd(struct page_vm
/* NOTE: this may contain setting soft-dirty on some archs */
if (PageDirty(new) && is_migration_entry_dirty(entry))
pmde = pmd_mkdirty(pmde);
+ if (is_writable_migration_entry(entry))
+ pmde = maybe_pmd_mkwrite(pmde, vma);
+ else
+ pmde = pmd_wrprotect(pmde);
if (PageAnon(new)) {
rmap_t rmap_flags = RMAP_COMPOUND;
--- a/mm/migrate.c~mm-migrate-fix-wrongly-apply-write-bit-after-mkdirty-on-sparc64
+++ a/mm/migrate.c
@@ -224,6 +224,8 @@ static bool remove_migration_pte(struct
pte = maybe_mkwrite(pte, vma);
else if (pte_swp_uffd_wp(*pvmw.pte))
pte = pte_mkuffd_wp(pte);
+ else
+ pte = pte_wrprotect(pte);
if (folio_test_anon(folio) && !is_readable_migration_entry(entry))
rmap_flags |= RMAP_EXCLUSIVE;
_
Patches currently in -mm which might be from peterx(a)redhat.com are
mm-migrate-fix-wrongly-apply-write-bit-after-mkdirty-on-sparc64.patch
mm-uffd-fix-comment-in-handling-pte-markers.patch
This patch introduce a new internal flag per lkb value to handle
internal flags which are handled not on wire. The current lkb internal
flags stored as lkb->lkb_flags are split in upper and lower bits, the
lower bits are used to share internal flags over wire for other cluster
wide lkb copies on other nodes.
In commit 61bed0baa4db ("fs: dlm: use a non-static queue for callbacks")
we introduced a new internal flag for pending callbacks for the dlm
callback queue. This flag is protected by the lkb->lkb_cb_lock lock.
This patch overlooked that on dlm receive path and the mentioned upper
and lower bits, that dlm will read the flags, mask it and write it
back. As example receive_flags() in fs/dlm/lock.c. This flag
manipulation is not done atomically and is not protected by
lkb->lkb_cb_lock. This has unknown side effects of the current callback
handling.
In future we should move to set/clear/test bit functionality and avoid
read, mask and writing back flag values. In later patches we will move
the upper parts to the new introduced internal lkb flags which are not
shared between other cluster nodes to the new non shared internal flag
field to avoid similar issues.
Cc: stable(a)vger.kernel.org
Fixes: 61bed0baa4db ("fs: dlm: use a non-static queue for callbacks")
Reported-by: Bob Peterson <rpeterso(a)redhat.com>
Signed-off-by: Alexander Aring <aahringo(a)redhat.com>
---
fs/dlm/ast.c | 9 ++++-----
fs/dlm/dlm_internal.h | 7 ++++++-
fs/dlm/user.c | 2 +-
3 files changed, 11 insertions(+), 7 deletions(-)
diff --git a/fs/dlm/ast.c b/fs/dlm/ast.c
index 26fef9945cc9..7daffdd99f99 100644
--- a/fs/dlm/ast.c
+++ b/fs/dlm/ast.c
@@ -45,7 +45,7 @@ void dlm_purge_lkb_callbacks(struct dlm_lkb *lkb)
kref_put(&cb->ref, dlm_release_callback);
}
- lkb->lkb_flags &= ~DLM_IFL_CB_PENDING;
+ clear_bit(DLM_IFLNS_CB_PENDING, &lkb->lkb_insflags);
/* invalidate */
dlm_callback_set_last_ptr(&lkb->lkb_last_cast, NULL);
@@ -103,10 +103,9 @@ int dlm_enqueue_lkb_callback(struct dlm_lkb *lkb, uint32_t flags, int mode,
cb->sb_status = status;
cb->sb_flags = (sbflags & 0x000000FF);
kref_init(&cb->ref);
- if (!(lkb->lkb_flags & DLM_IFL_CB_PENDING)) {
- lkb->lkb_flags |= DLM_IFL_CB_PENDING;
+ if (!test_and_set_bit(DLM_IFLNS_CB_PENDING, &lkb->lkb_insflags))
rv = DLM_ENQUEUE_CALLBACK_NEED_SCHED;
- }
+
list_add_tail(&cb->list, &lkb->lkb_callbacks);
if (flags & DLM_CB_CAST)
@@ -209,7 +208,7 @@ void dlm_callback_work(struct work_struct *work)
spin_lock(&lkb->lkb_cb_lock);
rv = dlm_dequeue_lkb_callback(lkb, &cb);
if (rv == DLM_DEQUEUE_CALLBACK_EMPTY) {
- lkb->lkb_flags &= ~DLM_IFL_CB_PENDING;
+ clear_bit(DLM_IFLNS_CB_PENDING, &lkb->lkb_insflags);
spin_unlock(&lkb->lkb_cb_lock);
break;
}
diff --git a/fs/dlm/dlm_internal.h b/fs/dlm/dlm_internal.h
index ab1a55337a6e..b967b4d7d55d 100644
--- a/fs/dlm/dlm_internal.h
+++ b/fs/dlm/dlm_internal.h
@@ -211,7 +211,11 @@ struct dlm_args {
#endif
#define DLM_IFL_DEADLOCK_CANCEL 0x01000000
#define DLM_IFL_STUB_MS 0x02000000 /* magic number for m_flags */
-#define DLM_IFL_CB_PENDING 0x04000000
+
+/* lkb_insflags */
+
+#define DLM_IFLNS_CB_PENDING 0
+
/* least significant 2 bytes are message changed, they are full transmitted
* but at receive side only the 2 bytes LSB will be set.
*
@@ -246,6 +250,7 @@ struct dlm_lkb {
uint32_t lkb_exflags; /* external flags from caller */
uint32_t lkb_sbflags; /* lksb flags */
uint32_t lkb_flags; /* internal flags */
+ unsigned long lkb_insflags; /* internal non shared flags */
uint32_t lkb_lvbseq; /* lvb sequence number */
int8_t lkb_status; /* granted, waiting, convert */
diff --git a/fs/dlm/user.c b/fs/dlm/user.c
index 35129505ddda..98488a1b702d 100644
--- a/fs/dlm/user.c
+++ b/fs/dlm/user.c
@@ -884,7 +884,7 @@ static ssize_t device_read(struct file *file, char __user *buf, size_t count,
goto try_another;
case DLM_DEQUEUE_CALLBACK_LAST:
list_del_init(&lkb->lkb_cb_list);
- lkb->lkb_flags &= ~DLM_IFL_CB_PENDING;
+ clear_bit(DLM_IFLNS_CB_PENDING, &lkb->lkb_insflags);
break;
case DLM_DEQUEUE_CALLBACK_SUCCESS:
break;
--
2.31.1
radix_tree_preload() warns on attempting to call it with an allocation
mask that doesn't allow blocking. While that warning could arguably
be removed, we need to handle radix insertion failures anyway as they
are more likely if we cannot block to get memory.
Remove legacy BUG_ON()'s and turn them into proper errors instead, one
for the allocation failure and one for finding a page that doesn't
match the correct index.
Cc: stable(a)vger.kernel.org # 5.10+
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
---
drivers/block/brd.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/block/brd.c b/drivers/block/brd.c
index 1ddada0cdaca..6019ef23344f 100644
--- a/drivers/block/brd.c
+++ b/drivers/block/brd.c
@@ -84,6 +84,7 @@ static int brd_insert_page(struct brd_device *brd, sector_t sector, gfp_t gfp)
{
pgoff_t idx;
struct page *page;
+ int ret = 0;
page = brd_lookup_page(brd, sector);
if (page)
@@ -93,7 +94,7 @@ static int brd_insert_page(struct brd_device *brd, sector_t sector, gfp_t gfp)
if (!page)
return -ENOMEM;
- if (radix_tree_preload(gfp)) {
+ if (gfpflags_allow_blocking(gfp) && radix_tree_preload(gfp)) {
__free_page(page);
return -ENOMEM;
}
@@ -104,8 +105,10 @@ static int brd_insert_page(struct brd_device *brd, sector_t sector, gfp_t gfp)
if (radix_tree_insert(&brd->brd_pages, idx, page)) {
__free_page(page);
page = radix_tree_lookup(&brd->brd_pages, idx);
- BUG_ON(!page);
- BUG_ON(page->index != idx);
+ if (!page)
+ ret = -ENOMEM;
+ else if (page->index != idx)
+ ret = -EIO;
} else {
brd->brd_nr_pages++;
}
--
2.39.1
From: Mario Limonciello <mario.limonciello(a)amd.com>
BugLink: https://bugs.launchpad.net/bugs/2007581
Commit 5467801f1fcb ("gpio: Restrict usage of GPIO chip irq members
before initialization") attempted to fix a race condition that lead to a
NULL pointer, but in the process caused a regression for _AEI/_EVT
declared GPIOs.
This manifests in messages showing deferred probing while trying to
allocate IRQs like so:
amd_gpio AMDI0030:00: Failed to translate GPIO pin 0x0000 to IRQ, err -517
amd_gpio AMDI0030:00: Failed to translate GPIO pin 0x002C to IRQ, err -517
amd_gpio AMDI0030:00: Failed to translate GPIO pin 0x003D to IRQ, err -517
[ .. more of the same .. ]
The code for walking _AEI doesn't handle deferred probing and so this
leads to non-functional GPIO interrupts.
Fix this issue by moving the call to `acpi_gpiochip_request_interrupts`
to occur after gc->irc.initialized is set.
Fixes: 5467801f1fcb ("gpio: Restrict usage of GPIO chip irq members before initialization")
Link: https://lore.kernel.org/linux-gpio/BL1PR12MB51577A77F000A008AA694675E2EF9@B…
Link: https://bugzilla.suse.com/show_bug.cgi?id=1198697
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215850
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1979
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1976
Reported-by: Mario Limonciello <mario.limonciello(a)amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello(a)amd.com>
Reviewed-by: Shreeya Patel <shreeya.patel(a)collabora.com>
Tested-By: Samuel Čavoj <samuel(a)cavoj.net>
Tested-By: lukeluk498(a)gmail.com Link:
Reviewed-by: Andy Shevchenko <andy.shevchenko(a)gmail.com>
Acked-by: Linus Walleij <linus.walleij(a)linaro.org>
Reviewed-and-tested-by: Takashi Iwai <tiwai(a)suse.de>
Cc: Shreeya Patel <shreeya.patel(a)collabora.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
(backported from commit 06fb4ecfeac7e00d6704fa5ed19299f2fefb3cc9)
Signed-off-by: Asmaa Mnebhi <asmaa(a)nvidia.com>
---
drivers/gpio/gpiolib.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
index e4d47e00c392..049cdfc975b3 100644
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -2329,8 +2329,6 @@ static int gpiochip_add_irqchip(struct gpio_chip *gpiochip,
gpiochip_set_irq_hooks(gpiochip);
- acpi_gpiochip_request_interrupts(gpiochip);
-
/*
* Using barrier() here to prevent compiler from reordering
* gc->irq.initialized before initialization of above
@@ -2340,6 +2338,8 @@ static int gpiochip_add_irqchip(struct gpio_chip *gpiochip,
gpiochip->irq.initialized = true;
+ acpi_gpiochip_request_interrupts(gpiochip);
+
return 0;
}
--
2.30.1
From: Shreeya Patel <shreeya.patel(a)collabora.com>
BugLink: https://bugs.launchpad.net/bugs/2007581
GPIO chip irq members are exposed before they could be completely
initialized and this leads to race conditions.
One such issue was observed for the gc->irq.domain variable which
was accessed through the I2C interface in gpiochip_to_irq() before
it could be initialized by gpiochip_add_irqchip(). This resulted in
Kernel NULL pointer dereference.
Following are the logs for reference :-
kernel: Call Trace:
kernel: gpiod_to_irq+0x53/0x70
kernel: acpi_dev_gpio_irq_get_by+0x113/0x1f0
kernel: i2c_acpi_get_irq+0xc0/0xd0
kernel: i2c_device_probe+0x28a/0x2a0
kernel: really_probe+0xf2/0x460
kernel: RIP: 0010:gpiochip_to_irq+0x47/0xc0
To avoid such scenarios, restrict usage of GPIO chip irq members before
they are completely initialized.
Signed-off-by: Shreeya Patel <shreeya.patel(a)collabora.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Andy Shevchenko <andy.shevchenko(a)gmail.com>
Reviewed-by: Linus Walleij <linus.walleij(a)linaro.org>
Signed-off-by: Bartosz Golaszewski <brgl(a)bgdev.pl>
(backported from commit 5467801f1fcbdc46bc7298a84dbf3ca1ff2a7320)
Signed-off-by: Asmaa Mnebhi <asmaa(a)nvidia.com>
---
drivers/gpio/gpiolib.c | 19 +++++++++++++++++++
include/linux/gpio/driver.h | 9 +++++++++
2 files changed, 28 insertions(+)
diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
index abdf448b11a3..e4d47e00c392 100644
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -2146,6 +2146,16 @@ static int gpiochip_to_irq(struct gpio_chip *chip, unsigned offset)
{
struct irq_domain *domain = chip->irq.domain;
+#ifdef CONFIG_GPIOLIB_IRQCHIP
+ /*
+ * Avoid race condition with other code, which tries to lookup
+ * an IRQ before the irqchip has been properly registered,
+ * i.e. while gpiochip is still being brought up.
+ */
+ if (!chip->irq.initialized)
+ return -EPROBE_DEFER;
+#endif
+
if (!gpiochip_irqchip_irq_valid(chip, offset))
return -ENXIO;
@@ -2321,6 +2331,15 @@ static int gpiochip_add_irqchip(struct gpio_chip *gpiochip,
acpi_gpiochip_request_interrupts(gpiochip);
+ /*
+ * Using barrier() here to prevent compiler from reordering
+ * gc->irq.initialized before initialization of above
+ * GPIO chip irq members.
+ */
+ barrier();
+
+ gpiochip->irq.initialized = true;
+
return 0;
}
diff --git a/include/linux/gpio/driver.h b/include/linux/gpio/driver.h
index 5dd9c982e2cb..15418caf76fc 100644
--- a/include/linux/gpio/driver.h
+++ b/include/linux/gpio/driver.h
@@ -201,6 +201,15 @@ struct gpio_irq_chip {
*/
bool threaded;
+ /**
+ * @initialized:
+ *
+ * Flag to track GPIO chip irq member's initialization.
+ * This flag will make sure GPIO chip irq members are not used
+ * before they are initialized.
+ */
+ bool initialized;
+
/**
* @init_hw: optional routine to initialize hardware before
* an IRQ chip will be added. This is quite useful when
--
2.30.1
Users can specify the hugetlb page size in the mmap, shmget and
memfd_create system calls. This is done by using 6 bits within the
flags argument to encode the base-2 logarithm of the desired page size.
The routine hstate_sizelog() uses the log2 value to find the
corresponding hugetlb hstate structure. Converting the log2 value
(page_size_log) to potential hugetlb page size is the simple statement:
1UL << page_size_log
Because only 6 bits are used for page_size_log, the left shift can not
be greater than 63. This is fine on 64 bit architectures where a long
is 64 bits. However, if a value greater than 31 is passed on a 32 bit
architecture (where long is 32 bits) the shift will result in undefined
behavior. This was generally not an issue as the result of the
undefined shift had to exactly match hugetlb page size to proceed.
Recent improvements in runtime checking have resulted in this undefined
behavior throwing errors such as reported below.
Fix by comparing page_size_log to BITS_PER_LONG before doing shift.
Reported-by: Naresh Kamboju <naresh.kamboju(a)linaro.org>
Link: https://lore.kernel.org/lkml/CA+G9fYuei_Tr-vN9GS7SfFyU1y9hNysnf=PB7kT0=yv4M…
Fixes: 42d7395feb56 ("mm: support more pagesizes for MAP_HUGETLB/SHM_HUGETLB")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
---
include/linux/hugetlb.h | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index df6dd624ccfe..8b45720f9475 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -781,7 +781,10 @@ static inline struct hstate *hstate_sizelog(int page_size_log)
if (!page_size_log)
return &default_hstate;
- return size_to_hstate(1UL << page_size_log);
+ if (page_size_log < BITS_PER_LONG)
+ return size_to_hstate(1UL << page_size_log);
+
+ return NULL;
}
static inline struct hstate *hstate_vma(struct vm_area_struct *vma)
--
2.39.1
By default, non-mq drivers do not support nowait. This causes io_uring
to use a slower path as the driver cannot be trust not to block. brd
can safely set the nowait flag, as worst case all it does is a NOIO
allocation.
For io_uring, this makes a substantial difference. Before:
submitter=0, tid=453, file=/dev/ram0, node=-1
polled=0, fixedbufs=1/0, register_files=1, buffered=0, QD=128
Engine=io_uring, sq_ring=128, cq_ring=128
IOPS=440.03K, BW=1718MiB/s, IOS/call=32/31
IOPS=428.96K, BW=1675MiB/s, IOS/call=32/32
IOPS=442.59K, BW=1728MiB/s, IOS/call=32/31
IOPS=419.65K, BW=1639MiB/s, IOS/call=32/32
IOPS=426.82K, BW=1667MiB/s, IOS/call=32/31
and after:
submitter=0, tid=354, file=/dev/ram0, node=-1
polled=0, fixedbufs=1/0, register_files=1, buffered=0, QD=128
Engine=io_uring, sq_ring=128, cq_ring=128
IOPS=3.37M, BW=13.15GiB/s, IOS/call=32/31
IOPS=3.45M, BW=13.46GiB/s, IOS/call=32/31
IOPS=3.43M, BW=13.42GiB/s, IOS/call=32/32
IOPS=3.43M, BW=13.39GiB/s, IOS/call=32/31
IOPS=3.43M, BW=13.38GiB/s, IOS/call=32/31
or about an 8x in difference. Now that brd is prepared to deal with
REQ_NOWAIT reads/writes, mark it as supporting that.
Cc: stable(a)vger.kernel.org # 5.10+
Link: https://lore.kernel.org/linux-block/20230203103005.31290-1-p.raghav@samsung…
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
---
drivers/block/brd.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/block/brd.c b/drivers/block/brd.c
index 6019ef23344f..522530a6ebca 100644
--- a/drivers/block/brd.c
+++ b/drivers/block/brd.c
@@ -418,6 +418,7 @@ static int brd_alloc(int i)
/* Tell the block layer that this is not a rotational device */
blk_queue_flag_set(QUEUE_FLAG_NONROT, disk->queue);
blk_queue_flag_clear(QUEUE_FLAG_ADD_RANDOM, disk->queue);
+ blk_queue_flag_set(QUEUE_FLAG_NOWAIT, disk->queue);
err = add_disk(disk);
if (err)
goto out_cleanup_disk;
--
2.39.1
by Linux kernel regression tracking (Thorsten Leemhuis)
[adding Chih-Kang Chang (author), Kalle (committer) and LKML to the list
of recipients]
[anyone who replies to this: feel free to remove stable(a)vger.kernel.org
from the recipients, this is a mainline regression]
[TLDR: I'm adding this report to the list of tracked Linux kernel
regressions; the text you find below is based on a few templates
paragraphs you might have encountered already in similar form.
See link in footer if these mails annoy you.]
On 09.02.23 20:59, Paul Gover wrote:
> Suspend/Resume was working OK on kernel 6.0.13, broken since 6.1.1
> (I've not tried kernels between those, except in the bisect below.)
> All subsequent 6,1 kernels exhibit the same behaviour.
>
> Suspend works OK, but on Resume, there's a flicker, and then it reboots.
> Sometimes the screen gets restored to its contents at the time of suspend. but
> less than a second later, it starts rebooting.
> To reproduce, simply boot, suspend, and resume.
>
> Git bisect blames RTW88
> commit 6bf3a083407b5d404d70efc3a5ac75b472e5efa9
TWIMC, that's "wifi: rtw88: add flag check before enter or leave IPS"
> I'll attach bisect log, dmesg and configs to the bug I've opened
> https://bugzilla.kernel.org/show_bug.cgi?id=217016
>
> dmesg from the following boot show a hardware error.
> It's not there when the system resumes or reboots with 6.0.13,
> and if I don't suspend & resume, there are no reported errors.
>
> The problem occurs under both Wayland and X11, and from the command line via
> echo mem>/sys/power.state
>
>
> Vanilla kernels, untainted, compiled with GCC; my system is Gentoo FWIW, but I
> do my own kernels direct from a git clone of stable.
>
> Couldn't find anything similar with Google or the mailing lists.
>
> **Hardware:**
>
> HP Laptop 15-bw0xx
> AMD A9-9420 RADEON R5, 5 COMPUTE CORES
> Stoney [Radeon R2/R3/R4/R5 Graphics]
> 4 GB memory
> RTL8723DE PCIe adapter
>
> **Kernel**
>
> Kernel command line:
> psmouse.synaptics_intertouch=1 pcie_aspm=force rdrand=force rootfstype=f2fs
> root=LABEL=gentoo
>
> CONFIG_RTW88=m
> CONFIG_RTW88_CORE=m
> CONFIG_RTW88_PCI=m
> CONFIG_RTW88_8723D=m
> # CONFIG_RTW88_8822BE is not set
> # CONFIG_RTW88_8822CE is not set
> CONFIG_RTW88_8723DE=m
> # CONFIG_RTW88_8821CE is not set
> # CONFIG_RTW88_DEBUG is not set
> # CONFIG_RTW88_DEBUGFS is not set
> # CONFIG_RTW89 is not set
Thanks for the report. To be sure the issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
tracking bot:
#regzbot ^introduced 6bf3a083407b
#regzbot title wifi: rtw88: resume broken (reboot)
#regzbot ignore-activity
This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply and tell me -- ideally
while also telling regzbot about it, as explained by the page listed in
the footer of this mail.
Developers: When fixing the issue, remember to add 'Link:' tags pointing
to the report (the parent of this mail). See page linked in footer for
details.
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.
Fixed optkprobe issues, mainly related to the x86 architecture.
Yang Jihong (2):
x86/kprobes: Fix __recover_optprobed_insn check optimizing logic
x86/kprobes: Fix arch_check_optimized_kprobe check within
optimized_kprobe range
arch/x86/kernel/kprobes/opt.c | 6 +++---
include/linux/kprobes.h | 2 ++
kernel/kprobes.c | 4 ++--
3 files changed, 7 insertions(+), 5 deletions(-)
--
Changes since v1:
- Remove patch1 since there is already a fix patch.
- Add "cc stable" and modify comment for patch2.
- Use "kprobe_disarmed" instead of "kprobe_disabled" for patch3.
- Add fix commmit and "cc stable" for patch3.
2.30.GIT