There was previously a theoretical window where swapoff() could run and
teardown a swap_info_struct while a call to free_swap_and_cache() was
running in another thread. This could cause, amongst other bad
possibilities, swap_page_trans_huge_swapped() (called by
free_swap_and_cache()) to access the freed memory for swap_map.
This is a theoretical problem and I haven't been able to provoke it from
a test case. But there has been agreement based on code review that this
is possible (see link below).
Fix it by using get_swap_device()/put_swap_device(), which will stall
swapoff(). There was an extra check in _swap_info_get() to confirm that
the swap entry was valid. This wasn't present in get_swap_device() so
I've added it. I couldn't find any existing get_swap_device() call sites
where this extra check would cause any false alarms.
Details of how to provoke one possible issue (thanks to David Hilenbrand
for deriving this):
--8<-----
__swap_entry_free() might be the last user and result in
"count == SWAP_HAS_CACHE".
swapoff->try_to_unuse() will stop as soon as soon as si->inuse_pages==0.
So the question is: could someone reclaim the folio and turn
si->inuse_pages==0, before we completed swap_page_trans_huge_swapped().
Imagine the following: 2 MiB folio in the swapcache. Only 2 subpages are
still references by swap entries.
Process 1 still references subpage 0 via swap entry.
Process 2 still references subpage 1 via swap entry.
Process 1 quits. Calls free_swap_and_cache().
-> count == SWAP_HAS_CACHE
[then, preempted in the hypervisor etc.]
Process 2 quits. Calls free_swap_and_cache().
-> count == SWAP_HAS_CACHE
Process 2 goes ahead, passes swap_page_trans_huge_swapped(), and calls
__try_to_reclaim_swap().
__try_to_reclaim_swap()->folio_free_swap()->delete_from_swap_cache()->
put_swap_folio()->free_swap_slot()->swapcache_free_entries()->
swap_entry_free()->swap_range_free()->
...
WRITE_ONCE(si->inuse_pages, si->inuse_pages - nr_entries);
What stops swapoff to succeed after process 2 reclaimed the swap cache
but before process1 finished its call to swap_page_trans_huge_swapped()?
--8<-----
Fixes: 7c00bafee87c ("mm/swap: free swap slots in batch")
Closes: https://lore.kernel.org/linux-mm/65a66eb9-41f8-4790-8db2-0c70ea15979f@redha…
Cc: stable(a)vger.kernel.org
Signed-off-by: Ryan Roberts <ryan.roberts(a)arm.com>
---
Applies on top of v6.8-rc6 and mm-unstable (b38c34939fe4).
Thanks,
Ryan
mm/swapfile.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 2b3a2d85e350..f580e6abc674 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1281,7 +1281,9 @@ struct swap_info_struct *get_swap_device(swp_entry_t entry)
smp_rmb();
offset = swp_offset(entry);
if (offset >= si->max)
- goto put_out;
+ goto bad_offset;
+ if (data_race(!si->swap_map[swp_offset(entry)]))
+ goto bad_free;
return si;
bad_nofile:
@@ -1289,9 +1291,14 @@ struct swap_info_struct *get_swap_device(swp_entry_t entry)
out:
return NULL;
put_out:
- pr_err("%s: %s%08lx\n", __func__, Bad_offset, entry.val);
percpu_ref_put(&si->users);
return NULL;
+bad_offset:
+ pr_err("%s: %s%08lx\n", __func__, Bad_offset, entry.val);
+ goto put_out;
+bad_free:
+ pr_err("%s: %s%08lx\n", __func__, Unused_offset, entry.val);
+ goto put_out;
}
static unsigned char __swap_entry_free(struct swap_info_struct *p,
@@ -1609,13 +1616,14 @@ int free_swap_and_cache(swp_entry_t entry)
if (non_swap_entry(entry))
return 1;
- p = _swap_info_get(entry);
+ p = get_swap_device(entry);
if (p) {
count = __swap_entry_free(p, entry);
if (count == SWAP_HAS_CACHE &&
!swap_page_trans_huge_swapped(p, entry))
__try_to_reclaim_swap(p, swp_offset(entry),
TTRS_UNMAPPED | TTRS_FULL);
+ put_swap_device(p);
}
return p != NULL;
}
--
2.25.1
Hi,
I have to admit that v3 was a lazy attempt. This one should be on
the right path.
this series does basically two things:
1. Disables automatic load balancing as adviced by the hardware
workaround.
2. Assigns all the CCS slices to one single user engine. The user
will then be able to query only one CCS engine
I'm using here the "Requires: " tag, but I'm not sure the commit
id will be valid, on the other hand, I don't know what commit id
I should use.
Thanks Tvrtko, Matt, John and Joonas for your reviews!
Andi
Changelog
=========
v3 -> v4
- Reword correctly the comment in the workaround
- Fix a buffer overflow (Thanks Joonas)
- Handle properly the fused engines when setting the CCS mode.
v2 -> v3
- Simplified the algorithm for creating the list of the exported
uabi engines. (Patch 1) (Thanks, Tvrtko)
- Consider the fused engines when creating the uabi engine list
(Patch 2) (Thanks, Matt)
- Patch 4 now uses a the refactoring from patch 1, in a cleaner
outcome.
v1 -> v2
- In Patch 1 use the correct workaround number (thanks Matt).
- In Patch 2 do not add the extra CCS engines to the exposed UABI
engine list and adapt the engine counting accordingly (thanks
Tvrtko).
- Reword the commit of Patch 2 (thanks John).
Andi Shyti (3):
drm/i915/gt: Disable HW load balancing for CCS
drm/i915/gt: Refactor uabi engine class/instance list creation
drm/i915/gt: Enable only one CCS for compute workload
drivers/gpu/drm/i915/gt/intel_engine_user.c | 40 ++++++++++++++-------
drivers/gpu/drm/i915/gt/intel_gt.c | 23 ++++++++++++
drivers/gpu/drm/i915/gt/intel_gt_regs.h | 6 ++++
drivers/gpu/drm/i915/gt/intel_workarounds.c | 5 +++
4 files changed, 62 insertions(+), 12 deletions(-)
--
2.43.0
Hi,
On Bugzilla, danilrybakov249(a)gmail.com reported stable-specific, ACPI error
regression that led into high CPU temperature [1]. He wrote:
> Overview:
>
> After updating from lts v6.6.14-2 to lts v6.6.17-1 noticed high CPU temperature and lag. After running htop noticed that journald was using 30-60% of CPU. Afterwards, tried switching to stable, or lts v6.6.18-1, but encountered the same issue.
>
> Running journalctl -f gives these lines over and over again:
>
> Feb 19 21:09:12 danirybe kernel: ACPI Error: Could not disable RealTimeClock events (20230628/evxfevnt-243)
> Feb 19 21:09:12 danirybe kernel: ACPI Error: No handler or method for GPE 08, disabling event (20230628/evgpe-839)
> Feb 19 21:09:12 danirybe kernel: ACPI Error: No handler or method for GPE 0A, disabling event (20230628/evgpe-839)
> Feb 19 21:09:12 danirybe kernel: ACPI Error: No handler or method for GPE 0B, disabling event (20230628/evgpe-839)
> Feb 19 21:09:12 danirybe kernel: ACPI Error: No installed handler for fixed event - PM_Timer (0), disabling (20230628/evevent-255)
> Feb 19 21:09:12 danirybe kernel: ACPI Error: No installed handler for fixed event - PowerButton (2), disabling (20230628/evevent-255)
> Feb 19 21:09:12 danirybe kernel: ACPI Error: No installed handler for fixed event - SleepButton (3), disabling (20230628/evevent-255)
>
> My system info:
>
> Laptop model: ASUS VivoBook D540NV-GQ065T
> OS: Arch Linux x86_64
> Kernel: 6.6.14-2-lts
> WM: sway
> CPU: Intel Pentium N420 (4) @ 2.500GHz
> GPU1: Intel Apollo Lake [HD Graphics 505]
> GPU2: NVIDIA GeForce 920MX
>
> I've pinned down the commit after which the problem occurs:
>
> 847e1eb30e269a094da046c08273abe3f3361cf2 is the first bad commit
> commit 847e1eb30e269a094da046c08273abe3f3361cf2
> Author: Shin'ichiro Kawasaki <shinichiro.kawasaki(a)wdc.com>
> Date: Mon Jan 8 15:20:58 2024 +0900
>
> platform/x86: p2sb: Allow p2sb_bar() calls during PCI device probe
>
> commit 5913320eb0b3ec88158cfcb0fa5e996bf4ef681b upstream.
>
> <snipped>...
See Bugzilla for the full thread.
Thanks.
[1]: https://bugzilla.kernel.org/show_bug.cgi?id=218531
--
An old man doll... just what I always wanted! - Clara
From: Vitor Soares <vitor.soares(a)toradex.com>
When the mcp251xfd_start_xmit() function fails, the driver stops
processing messages, and the interrupt routine does not return,
running indefinitely even after killing the running application.
Error messages:
[ 441.298819] mcp251xfd spi2.0 can0: ERROR in mcp251xfd_start_xmit: -16
[ 441.306498] mcp251xfd spi2.0 can0: Transmit Event FIFO buffer not empty. (seq=0x000017c7, tef_tail=0x000017cf, tef_head=0x000017d0, tx_head=0x000017d3).
... and repeat forever.
The issue can be triggered when multiple devices share the same
SPI interface. And there is concurrent access to the bus.
The problem occurs because tx_ring->head increments even if
mcp251xfd_start_xmit() fails. Consequently, the driver skips one
TX package while still expecting a response in
mcp251xfd_handle_tefif_one().
This patch resolves the issue by decreasing tx_ring->head if
mcp251xfd_start_xmit() fails. With the fix, if we attempt to trigger
the issue again, the driver prints an error and discard the message.
Fixes: 55e5b97f003e ("can: mcp25xxfd: add driver for Microchip MCP25xxFD SPI CAN")
Cc: stable(a)vger.kernel.org
Signed-off-by: Vitor Soares <vitor.soares(a)toradex.com>
---
drivers/net/can/spi/mcp251xfd/mcp251xfd-tx.c | 27 ++++++++++----------
1 file changed, 14 insertions(+), 13 deletions(-)
diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-tx.c b/drivers/net/can/spi/mcp251xfd/mcp251xfd-tx.c
index 160528d3cc26..a8eb941c1b95 100644
--- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-tx.c
+++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-tx.c
@@ -181,25 +181,26 @@ netdev_tx_t mcp251xfd_start_xmit(struct sk_buff *skb,
tx_obj = mcp251xfd_get_tx_obj_next(tx_ring);
mcp251xfd_tx_obj_from_skb(priv, tx_obj, skb, tx_ring->head);
- /* Stop queue if we occupy the complete TX FIFO */
tx_head = mcp251xfd_get_tx_head(tx_ring);
- tx_ring->head++;
- if (mcp251xfd_get_tx_free(tx_ring) == 0)
- netif_stop_queue(ndev);
-
frame_len = can_skb_get_frame_len(skb);
- err = can_put_echo_skb(skb, ndev, tx_head, frame_len);
- if (!err)
- netdev_sent_queue(priv->ndev, frame_len);
+ can_put_echo_skb(skb, ndev, tx_head, frame_len);
+
+ tx_ring->head++;
err = mcp251xfd_tx_obj_write(priv, tx_obj);
- if (err)
- goto out_err;
+ if (err) {
+ can_free_echo_skb(ndev, tx_head, NULL);
- return NETDEV_TX_OK;
+ tx_ring->head--;
+
+ netdev_err(priv->ndev, "ERROR in %s: %d\n", __func__, err);
+ } else {
+ /* Stop queue if we occupy the complete TX FIFO */
+ if (mcp251xfd_get_tx_free(tx_ring) == 0)
+ netif_stop_queue(ndev);
- out_err:
- netdev_err(priv->ndev, "ERROR in %s: %d\n", __func__, err);
+ netdev_sent_queue(priv->ndev, frame_len);
+ }
return NETDEV_TX_OK;
}
--
2.34.1
From: Muhammad Ahmed <ahmed.ahmed(a)amd.com>
[WHY]
Blackscreen hang @ PC EF000025 when trying to wake up from S0i3. DCN
gets powered off due to dc_power_down_on_boot() being called after
timeout.
[HOW]
Setting the power_down_on_boot function pointer to null since we don't
expect the function to be called for APU.
Cc: Mario Limonciello <mario.limonciello(a)amd.com>
Cc: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas(a)amd.com>
Acked-by: Alex Hung <alex.hung(a)amd.com>
Signed-off-by: Muhammad Ahmed <ahmed.ahmed(a)amd.com>
---
drivers/gpu/drm/amd/display/dc/hwss/dcn35/dcn35_init.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn35/dcn35_init.c b/drivers/gpu/drm/amd/display/dc/hwss/dcn35/dcn35_init.c
index dce620d359a6..d4e0abbef28e 100644
--- a/drivers/gpu/drm/amd/display/dc/hwss/dcn35/dcn35_init.c
+++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn35/dcn35_init.c
@@ -39,7 +39,7 @@
static const struct hw_sequencer_funcs dcn35_funcs = {
.program_gamut_remap = dcn30_program_gamut_remap,
.init_hw = dcn35_init_hw,
- .power_down_on_boot = dcn35_power_down_on_boot,
+ .power_down_on_boot = NULL,
.apply_ctx_to_hw = dce110_apply_ctx_to_hw,
.apply_ctx_for_surface = NULL,
.program_front_end_for_ctx = dcn20_program_front_end_for_ctx,
--
2.34.1
We need to avoid the first page if we don't read it entirely.
We need to avoid the last page if we don't read it entirely.
While rather simple, this logic has been failed in the previous
fix. This time I wrote about 30 unit tests locally to check each
possible condition, hopefully I covered them all.
Reported-by: Christophe Kerello <christophe.kerello(a)foss.st.com>
Closes: https://lore.kernel.org/linux-mtd/20240221175327.42f7076d@xps-13/T/#m399bac…
Suggested-by: Christophe Kerello <christophe.kerello(a)foss.st.com>
Fixes: 828f6df1bcba ("mtd: rawnand: Clarify conditions to enable continuous reads")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal(a)bootlin.com>
---
drivers/mtd/nand/raw/nand_base.c | 38 ++++++++++++++++++--------------
1 file changed, 22 insertions(+), 16 deletions(-)
diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
index 3b3ce2926f5d..bcfd99a1699f 100644
--- a/drivers/mtd/nand/raw/nand_base.c
+++ b/drivers/mtd/nand/raw/nand_base.c
@@ -3466,30 +3466,36 @@ static void rawnand_enable_cont_reads(struct nand_chip *chip, unsigned int page,
u32 readlen, int col)
{
struct mtd_info *mtd = nand_to_mtd(chip);
- unsigned int end_page, end_col;
+ unsigned int first_page, last_page;
chip->cont_read.ongoing = false;
if (!chip->controller->supported_op.cont_read)
return;
- end_page = DIV_ROUND_UP(col + readlen, mtd->writesize);
- end_col = (col + readlen) % mtd->writesize;
+ /*
+ * Don't bother making any calculations if the length is too small.
+ * Side effect: avoids possible integer underflows below.
+ */
+ if (readlen < (2 * mtd->writesize))
+ return;
+ /* Derive the page where continuous read should start (the first full page read) */
+ first_page = page;
if (col)
- page++;
-
- if (end_col && end_page)
- end_page--;
-
- if (page + 1 > end_page)
- return;
-
- chip->cont_read.first_page = page;
- chip->cont_read.last_page = end_page;
- chip->cont_read.ongoing = true;
-
- rawnand_cap_cont_reads(chip);
+ first_page++;
+
+ /* Derive the page where continuous read should stop (the last full page read) */
+ last_page = page + ((col + readlen) / mtd->writesize) - 1;
+
+ /* Configure and enable continuous read when suitable */
+ if (first_page < last_page) {
+ chip->cont_read.first_page = first_page;
+ chip->cont_read.last_page = last_page;
+ chip->cont_read.ongoing = true;
+ /* May reset the ongoing flag */
+ rawnand_cap_cont_reads(chip);
+ }
}
static void rawnand_cont_read_skip_first_page(struct nand_chip *chip, unsigned int page)
--
2.34.1
While crossing a LUN boundary, it is probably safer (and clearer) to
keep all members of the continuous read structure aligned, including the
pause page (which is the last page of the lun or the last page of the
continuous read). Once these members properly in sync, we can use the
rawnand_cap_cont_reads() helper everywhere to "prepare" the next
continuous read if there is one.
Fixes: bbcd80f53a5e ("mtd: rawnand: Prevent crossing LUN boundaries during sequential reads")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal(a)bootlin.com>
---
This is not 100% a fix but I believe it is worth backporting as there
may be corner cases which were not identified with the initial
implementation.
---
drivers/mtd/nand/raw/nand_base.c | 23 ++++++++++++++---------
1 file changed, 14 insertions(+), 9 deletions(-)
diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
index d6a27e08b112..4d5a663e4e05 100644
--- a/drivers/mtd/nand/raw/nand_base.c
+++ b/drivers/mtd/nand/raw/nand_base.c
@@ -1232,6 +1232,15 @@ static void rawnand_cap_cont_reads(struct nand_chip *chip)
chip->cont_read.pause_page = rawnand_last_page_of_lun(ppl, first_lun);
else
chip->cont_read.pause_page = chip->cont_read.last_page;
+
+ if (chip->cont_read.first_page == chip->cont_read.pause_page) {
+ chip->cont_read.first_page++;
+ chip->cont_read.pause_page = min(chip->cont_read.last_page,
+ rawnand_last_page_of_lun(ppl, first_lun + 1));
+ }
+
+ if (chip->cont_read.first_page >= chip->cont_read.last_page)
+ chip->cont_read.ongoing = false;
}
static int nand_lp_exec_cont_read_page_op(struct nand_chip *chip, unsigned int page,
@@ -1298,12 +1307,11 @@ static int nand_lp_exec_cont_read_page_op(struct nand_chip *chip, unsigned int p
if (!chip->cont_read.ongoing)
return 0;
- if (page == chip->cont_read.pause_page &&
- page != chip->cont_read.last_page) {
- chip->cont_read.first_page = chip->cont_read.pause_page + 1;
- rawnand_cap_cont_reads(chip);
- } else if (page == chip->cont_read.last_page) {
+ if (page == chip->cont_read.last_page) {
chip->cont_read.ongoing = false;
+ } else if (page == chip->cont_read.pause_page) {
+ chip->cont_read.first_page++;
+ rawnand_cap_cont_reads(chip);
}
return 0;
@@ -3510,10 +3518,7 @@ static void rawnand_cont_read_skip_first_page(struct nand_chip *chip, unsigned i
return;
chip->cont_read.first_page++;
- if (chip->cont_read.first_page == chip->cont_read.pause_page)
- chip->cont_read.first_page++;
- if (chip->cont_read.first_page >= chip->cont_read.last_page)
- chip->cont_read.ongoing = false;
+ rawnand_cap_cont_reads(chip);
}
/**
--
2.34.1