[BUG]
If submit_one_sector() failed inside extent_writepage_io() for sector
size < page size cases (e.g. 4K sector size and 64K page size), then
we can hit double ordered extent accounting error.
This should be very rare, as submit_one_sector() only fails when we
failed to grab the extent map, and such extent map should exist inside
the memory and have been pinned.
[CAUSE]
For example we have the following folio layout:
0 4K 32K 48K 60K 64K
|//| |//////| |///|
Where |///| is the dirty range we need to writeback. The 3 different
dirty ranges are submitted for regular COW.
Now we hit the following sequence:
- submit_one_sector() returned 0 for [0, 4K)
- submit_one_sector() returned 0 for [32K, 48K)
- submit_one_sector() returned error for [60K, 64K)
- btrfs_mark_ordered_io_finished() called for the whole folio
This will mark the following ranges as finished:
* [0, 4K)
* [32K, 48K)
Both ranges have their IO already submitted, this cleanup will
lead to double accounting.
* [60K, 64K)
That's the correct cleanup.
The only good news is, this error is only theoretical, as the target
extent map is always pinned, thus we should directly grab it from
memory, other than reading it from the disk.
[FIX]
Instead of calling btrfs_mark_ordered_io_finished() for the whole folio
range, which can touch ranges we should not touch, instead
move the error handling inside extent_writepage_io().
So that we can cleanup exact sectors that are ought to be submitted but
failed.
This provide much more accurate cleanup, avoiding the double accounting.
Cc: stable(a)vger.kernel.org # 5.15+
Signed-off-by: Qu Wenruo <wqu(a)suse.com>
---
fs/btrfs/extent_io.c | 32 +++++++++++++++++++-------------
1 file changed, 19 insertions(+), 13 deletions(-)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 417c710c55ca..b6a4f1765b4c 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -1418,6 +1418,7 @@ static noinline_for_stack int extent_writepage_io(struct btrfs_inode *inode,
struct btrfs_fs_info *fs_info = inode->root->fs_info;
unsigned long range_bitmap = 0;
bool submitted_io = false;
+ bool error = false;
const u64 folio_start = folio_pos(folio);
u64 cur;
int bit;
@@ -1460,11 +1461,21 @@ static noinline_for_stack int extent_writepage_io(struct btrfs_inode *inode,
break;
}
ret = submit_one_sector(inode, folio, cur, bio_ctrl, i_size);
- if (ret < 0)
- goto out;
+ if (unlikely(ret < 0)) {
+ submit_one_bio(bio_ctrl);
+ /*
+ * Failed to grab the extent map which should be very rare.
+ * Since there is no bio submitted to finish the ordered
+ * extent, we have to manually finish this sector.
+ */
+ btrfs_mark_ordered_io_finished(inode, folio, cur,
+ fs_info->sectorsize, false);
+ error = true;
+ continue;
+ }
submitted_io = true;
}
-out:
+
/*
* If we didn't submitted any sector (>= i_size), folio dirty get
* cleared but PAGECACHE_TAG_DIRTY is not cleared (only cleared
@@ -1472,8 +1483,11 @@ static noinline_for_stack int extent_writepage_io(struct btrfs_inode *inode,
*
* Here we set writeback and clear for the range. If the full folio
* is no longer dirty then we clear the PAGECACHE_TAG_DIRTY tag.
+ *
+ * If we hit any error, the corresponding sector will still be dirty
+ * thus no need to clear PAGECACHE_TAG_DIRTY.
*/
- if (!submitted_io) {
+ if (!submitted_io && !error) {
btrfs_folio_set_writeback(fs_info, folio, start, len);
btrfs_folio_clear_writeback(fs_info, folio, start, len);
}
@@ -1493,7 +1507,6 @@ static int extent_writepage(struct folio *folio, struct btrfs_bio_ctrl *bio_ctrl
{
struct inode *inode = folio->mapping->host;
struct btrfs_fs_info *fs_info = inode_to_fs_info(inode);
- const u64 page_start = folio_pos(folio);
int ret;
size_t pg_offset;
loff_t i_size = i_size_read(inode);
@@ -1536,10 +1549,6 @@ static int extent_writepage(struct folio *folio, struct btrfs_bio_ctrl *bio_ctrl
bio_ctrl->wbc->nr_to_write--;
- if (ret)
- btrfs_mark_ordered_io_finished(BTRFS_I(inode), folio,
- page_start, PAGE_SIZE, !ret);
-
done:
if (ret < 0)
mapping_set_error(folio->mapping, ret);
@@ -2319,11 +2328,8 @@ void extent_write_locked_range(struct inode *inode, const struct folio *locked_f
if (ret == 1)
goto next_page;
- if (ret) {
- btrfs_mark_ordered_io_finished(BTRFS_I(inode), folio,
- cur, cur_len, !ret);
+ if (ret)
mapping_set_error(mapping, ret);
- }
btrfs_folio_end_lock(fs_info, folio, cur, cur_len);
if (ret < 0)
found_error = true;
--
2.47.1
This partially reverts commit fe4f5d4b6616 ("drm/xe: Clean up VM / exec
queue file lock usage."). While it's desired to have the mutex to
protect only the reference to the exec queue, getting and dropping each
mutex and then later getting the GPU timestamp, doesn't produce a
correct result: it introduces multiple opportunities for the task to be
scheduled out and thus wrecking havoc the deltas reported to userspace.
Also, to better correlate the timestamp from the exec queues with the
GPU, disable preemption so they can be updated without allowing the task
to be scheduled out. We leave interrupts enabled as that shouldn't be
enough disturbance for the deltas to matter to userspace.
Test scenario:
* IGT'S `xe_drm_fdinfo --r --r utilization-single-full-load`
* Platform: LNL, where CI occasionally reports failures
* `stress -c $(nproc)` running in parallel to disturb the
system
This brings a first failure from "after ~150 executions" to "never
occurs after 1000 attempts".
Cc: stable(a)vger.kernel.org # v6.11+
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3512
Signed-off-by: Lucas De Marchi <lucas.demarchi(a)intel.com>
---
drivers/gpu/drm/xe/xe_drm_client.c | 9 +++------
1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_drm_client.c b/drivers/gpu/drm/xe/xe_drm_client.c
index 298a587da7f17..e307b4d6bab5a 100644
--- a/drivers/gpu/drm/xe/xe_drm_client.c
+++ b/drivers/gpu/drm/xe/xe_drm_client.c
@@ -338,15 +338,12 @@ static void show_run_ticks(struct drm_printer *p, struct drm_file *file)
/* Accumulate all the exec queues from this client */
mutex_lock(&xef->exec_queue.lock);
- xa_for_each(&xef->exec_queue.xa, i, q) {
- xe_exec_queue_get(q);
- mutex_unlock(&xef->exec_queue.lock);
+ preempt_disable();
+ xa_for_each(&xef->exec_queue.xa, i, q)
xe_exec_queue_update_run_ticks(q);
- mutex_lock(&xef->exec_queue.lock);
- xe_exec_queue_put(q);
- }
+ preempt_enable();
mutex_unlock(&xef->exec_queue.lock);
gpu_timestamp = xe_hw_engine_read_timestamp(hwe);
--
2.47.0
This series collects the various SME related fixes that were previously
posted separately. These should address all the issues I am aware of so
a patch which reenables the SME configuration option is also included.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Changes in v2:
- Pull simplification of the signal restore code after the SME
reenablement, it's not a fix but there's some code overlap.
- Comment updates.
- Link to v1: https://lore.kernel.org/r/20241203-arm64-sme-reenable-v1-0-d853479d1b77@ker…
---
Mark Brown (6):
arm64/sme: Flush foreign register state in do_sme_acc()
arm64/fp: Don't corrupt FPMR when streaming mode changes
arm64/ptrace: Zero FPMR on streaming mode entry/exit
arm64/signal: Avoid corruption of SME state when entering signal handler
arm64/sme: Reenable SME
arm64/signal: Consistently invalidate the in register FP state in restore
arch/arm64/Kconfig | 1 -
arch/arm64/include/asm/fpsimd.h | 1 +
arch/arm64/kernel/fpsimd.c | 57 ++++++++++++++++++++++----
arch/arm64/kernel/ptrace.c | 12 +++++-
arch/arm64/kernel/signal.c | 89 +++++++++++------------------------------
5 files changed, 84 insertions(+), 76 deletions(-)
---
base-commit: 40384c840ea1944d7c5a392e8975ed088ecf0b37
change-id: 20241202-arm64-sme-reenable-98e64c161a8e
Best regards,
--
Mark Brown <broonie(a)kernel.org>
During mass manufacturing, we noticed the mmc_rx_crc_error counter,
as reported by "ethtool -S eth0 | grep mmc_rx_crc_error", to increase
above zero during nuttcp speedtests. Most of the time, this did not
affect the achieved speed, but it prompted this investigation.
Cycling through the rx_delay range on six boards (see table below) of
various ages shows that there is a large good region from 0x12 to 0x35
where we see zero crc errors on all tested boards.
The old rx_delay value (0x10) seems to have always been on the edge for
the KSZ9031RNX that is usually placed on Puma.
Choose "rx_delay = 0x23" to put us smack in the middle of the good
region. This works fine as well with the KSZ9131RNX PHY that was used
for a small number of boards during the COVID chip shortages.
Board S/N PHY rx_delay good region
--------- --- --------------------
Puma TT0069903 KSZ9031RNX 0x11 0x35
Puma TT0157733 KSZ9031RNX 0x11 0x35
Puma TT0681551 KSZ9031RNX 0x12 0x37
Puma TT0681156 KSZ9031RNX 0x10 0x38
Puma 17496030079 KSZ9031RNX 0x10 0x37 (Puma v1.2 from 2017)
Puma TT0681720 KSZ9131RNX 0x02 0x39 (alternative PHY used in very few boards)
Intersection of good regions = 0x12 0x35
Middle of good region = 0x23
Fixes: 2c66fc34e945 ("arm64: dts: rockchip: add RK3399-Q7 (Puma) SoM")
Cc: stable(a)vger.kernel.org
Reviewed-by: Quentin Schulz <quentin.schulz(a)cherry.de>
Tested-by: Quentin Schulz <quentin.schulz(a)cherry.de> # Puma v2.1 and v2.3 with KSZ9031
Signed-off-by: Jakob Unterwurzacher <jakob.unterwurzacher(a)cherry.de>
---
v4: drop internal Relates-to tag, add Tested-by, rebase to Linus master, send with b4
v3: use rx_delay = 0x23 instead of 0x11, which was not enough.
v2: cc stable, add "Fixes:", add omitted "there" to commit msg,
v1: initial submission
---
arch/arm64/boot/dts/rockchip/rk3399-puma.dtsi | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/boot/dts/rockchip/rk3399-puma.dtsi b/arch/arm64/boot/dts/rockchip/rk3399-puma.dtsi
index d12e661dfd9917f820284477a215389c16205f46..995b30a7aae01a0326e9f80d6be930f227968539 100644
--- a/arch/arm64/boot/dts/rockchip/rk3399-puma.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3399-puma.dtsi
@@ -182,7 +182,7 @@ &gmac {
snps,reset-active-low;
snps,reset-delays-us = <0 10000 50000>;
tx_delay = <0x10>;
- rx_delay = <0x10>;
+ rx_delay = <0x23>;
status = "okay";
};
---
base-commit: f932fb9b40749d1c9a539d89bb3e288c077aafe5
change-id: 20241213-puma_rx_delay-fadbcdf09783
Best regards,
--
Jakob Unterwurzacher <jakob.unterwurzacher(a)cherry.de>
Hi ,
Hope this email finds you well
I am reaching out to let you know that we have a list of attendees in NRF 2025 Retail's Big Show
Attendees count: 20,000 Leads
Contact Information: Company Name, Web URL, Contact Name, Title, Direct Email, Phone Number, FAX Number, Mailing Address, Industry, Employee Size, Annual Sales.
Please let me know if you are interested in acquiring this list, I can share pricing information for you review
I would like to thank you for never keeping me waiting for your reply
Regards
Rachel Taylor
Demand Generation Manager
Apollo Leads Hub Inc.,
Please reply with REMOVE if you don't wish to receive further emails
A NULL sock pointer is passed into l2cap_sock_alloc() when it is called
from l2cap_sock_new_connection_cb() and the error handling paths should
also be aware of it.
Seemingly a more elegant solution would be to swap bt_sock_alloc() and
l2cap_chan_create() calls since they are not interdependent to that moment
but then l2cap_chan_create() adds the soon to be deallocated and still
dummy-initialized channel to the global list accessible by many L2CAP
paths. The channel would be removed from the list in short period of time
but be a bit more straight-forward here and just check for NULL instead of
changing the order of function calls.
Found by Linux Verification Center (linuxtesting.org) with SVACE static
analysis tool.
Fixes: 7c4f78cdb8e7 ("Bluetooth: L2CAP: do not leave dangling sk pointer on error in l2cap_sock_create()")
Cc: stable(a)vger.kernel.org
Signed-off-by: Fedor Pchelkin <pchelkin(a)ispras.ru>
---
net/bluetooth/l2cap_sock.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/bluetooth/l2cap_sock.c b/net/bluetooth/l2cap_sock.c
index 3d2553dcdb1b..49f97d4138ea 100644
--- a/net/bluetooth/l2cap_sock.c
+++ b/net/bluetooth/l2cap_sock.c
@@ -1888,7 +1888,8 @@ static struct sock *l2cap_sock_alloc(struct net *net, struct socket *sock,
chan = l2cap_chan_create();
if (!chan) {
sk_free(sk);
- sock->sk = NULL;
+ if (sock)
+ sock->sk = NULL;
return NULL;
}
--
2.39.5
The last two are perhaps not strictly fixes, as they essentially add
support for nested ATRs. The first one is a fix, thus stable is in cc.
Tomi
Signed-off-by: Tomi Valkeinen <tomi.valkeinen(a)ideasonboard.com>
---
Changes in v2:
- Use mutext_init_with_key()
- Rearrange the series so that the fix is the first patch
- Link to v1: https://lore.kernel.org/r/20241122-i2c-atr-fixes-v1-0-62c51ce790be@ideasonb…
---
Cosmin Tanislav (1):
i2c: atr: Allow unmapped addresses from nested ATRs
Tomi Valkeinen (2):
i2c: atr: Fix client detach
i2c: atr: Fix lockdep for nested ATRs
drivers/i2c/i2c-atr.c | 27 ++++++++++++++-------------
1 file changed, 14 insertions(+), 13 deletions(-)
---
base-commit: adc218676eef25575469234709c2d87185ca223a
change-id: 20241121-i2c-atr-fixes-6d52b9f5f0c1
Best regards,
--
Tomi Valkeinen <tomi.valkeinen(a)ideasonboard.com>