On Thu, May 22, 2025 at 05:30:09PM -0400, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
>
> x86/relocs: Handle R_X86_64_REX_GOTPCRELX relocations
>
> to the 6.14-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> x86-relocs-handle-r_x86_64_rex_gotpcrelx-relocations.patch
> and it can be found in the queue-6.14 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
>
>
>
> commit d8e603969259e50aa632d1a3fde8883f41e26150
> Author: Brian Gerst <brgerst(a)gmail.com>
> Date: Thu Jan 23 14:07:37 2025 -0500
>
> x86/relocs: Handle R_X86_64_REX_GOTPCRELX relocations
>
> [ Upstream commit cb7927fda002ca49ae62e2782c1692acc7b80c67 ]
>
> Clang may produce R_X86_64_REX_GOTPCRELX relocations when redefining the
> stack protector location. Treat them as another type of PC-relative
> relocation.
>
> Signed-off-by: Brian Gerst <brgerst(a)gmail.com>
> Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
> Reviewed-by: Ard Biesheuvel <ardb(a)kernel.org>
> Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
> Link: https://lore.kernel.org/r/20250123190747.745588-6-brgerst@gmail.com
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
> diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c
> index e937be979ec86..92a1e503305ef 100644
> --- a/arch/x86/tools/relocs.c
> +++ b/arch/x86/tools/relocs.c
> @@ -32,6 +32,11 @@ static struct relocs relocs32;
> static struct relocs relocs32neg;
> static struct relocs relocs64;
> # define FMT PRIu64
> +
> +#ifndef R_X86_64_REX_GOTPCRELX
> +# define R_X86_64_REX_GOTPCRELX 42
> +#endif
> +
> #else
> # define FMT PRIu32
> #endif
> @@ -227,6 +232,7 @@ static const char *rel_type(unsigned type)
> REL_TYPE(R_X86_64_PC16),
> REL_TYPE(R_X86_64_8),
> REL_TYPE(R_X86_64_PC8),
> + REL_TYPE(R_X86_64_REX_GOTPCRELX),
> #else
> REL_TYPE(R_386_NONE),
> REL_TYPE(R_386_32),
> @@ -861,6 +867,7 @@ static int do_reloc64(struct section *sec, Elf_Rel *rel, ElfW(Sym) *sym,
>
> case R_X86_64_PC32:
> case R_X86_64_PLT32:
> + case R_X86_64_REX_GOTPCRELX:
> /*
> * PC relative relocations don't need to be adjusted unless
> * referencing a percpu symbol.
Didn't Ard just say this has no purpose in stable?
https://lore.kernel.org/CAMj1kXGtasdqRPn8koNN095VEEU4K409QvieMdgGXNUK0kPgkw…
Cheers,
Nathan
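As an aside, relocation types like this are easy to inspect from userspace. Here is a minimal standalone sketch (my own illustration, assuming libelf is available; not part of the patch) that counts R_X86_64_REX_GOTPCRELX entries in an object file, reusing the same numeric fallback of 42 that the patch defines for old toolchains:

#include <err.h>
#include <fcntl.h>
#include <gelf.h>
#include <stdio.h>
#include <unistd.h>

#ifndef R_X86_64_REX_GOTPCRELX
# define R_X86_64_REX_GOTPCRELX 42
#endif

int main(int argc, char **argv)
{
	Elf_Scn *scn = NULL;
	unsigned long count = 0;

	if (argc != 2)
		errx(1, "usage: %s <object file>", argv[0]);
	if (elf_version(EV_CURRENT) == EV_NONE)
		errx(1, "libelf initialization failed");

	int fd = open(argv[1], O_RDONLY);
	if (fd < 0)
		err(1, "%s", argv[1]);

	Elf *e = elf_begin(fd, ELF_C_READ, NULL);
	if (!e)
		errx(1, "elf_begin: %s", elf_errmsg(-1));

	/* Walk every RELA section and tally the relocation type. */
	while ((scn = elf_nextscn(e, scn)) != NULL) {
		GElf_Shdr shdr;

		if (!gelf_getshdr(scn, &shdr) || shdr.sh_type != SHT_RELA)
			continue;

		Elf_Data *data = elf_getdata(scn, NULL);
		for (size_t i = 0; i < shdr.sh_size / shdr.sh_entsize; i++) {
			GElf_Rela rela;

			if (gelf_getrela(data, i, &rela) &&
			    GELF_R_TYPE(rela.r_info) == R_X86_64_REX_GOTPCRELX)
				count++;
		}
	}
	printf("R_X86_64_REX_GOTPCRELX relocations: %lu\n", count);
	elf_end(e);
	close(fd);
	return 0;
}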
A customer is reporting a really subtle issue where we get random DMAR
faults, hangs and other nasties for kernel migration jobs when stressing
things like s2idle/s3/s4. The explosions seem to happen somewhere after
resuming the system, with splats looking something like:
PM: suspend exit
rfkill: input handler disabled
xe 0000:00:02.0: [drm] GT0: Engine reset: engine_class=bcs, logical_mask: 0x2, guc_id=0
xe 0000:00:02.0: [drm] GT0: Timedout job: seqno=24496, lrc_seqno=24496, guc_id=0, flags=0x13 in no process [-1]
xe 0000:00:02.0: [drm] GT0: Kernel-submitted job timed out
The likely cause appears to be a race where suspend cancels the worker
that processes the free_job() calls, such that we can still have pending
jobs waiting to be freed after the cancel. Following from this, on resume
the pending_list will now contain at least one already-completed job, yet
we call drm_sched_resubmit_jobs(), which will then call run_job() on
everything still on the pending_list. But if the job was already
complete, then all the resources tied to it, like the bb itself, any
memory being accessed, the iommu mappings etc., might be long gone, since
those are usually tied to the fence signalling.
This scenario can be seen in ftrace when running a slightly modified
xe_pm (kernel was only modified to inject artificial latency into
free_job to make the race easier to hit):
xe_sched_job_run: dev=0000:00:02.0, fence=0xffff888276cc8540, seqno=0, lrc_seqno=0, gt=0, guc_id=0, batch_addr=0x000000146910 ...
xe_exec_queue_stop: dev=0000:00:02.0, 3:0x2, gt=0, width=1, guc_id=0, guc_state=0x0, flags=0x13
xe_exec_queue_stop: dev=0000:00:02.0, 3:0x2, gt=0, width=1, guc_id=1, guc_state=0x0, flags=0x4
xe_exec_queue_stop: dev=0000:00:02.0, 4:0x1, gt=1, width=1, guc_id=0, guc_state=0x0, flags=0x3
xe_exec_queue_stop: dev=0000:00:02.0, 1:0x1, gt=1, width=1, guc_id=1, guc_state=0x0, flags=0x3
xe_exec_queue_stop: dev=0000:00:02.0, 4:0x1, gt=1, width=1, guc_id=2, guc_state=0x0, flags=0x3
xe_exec_queue_resubmit: dev=0000:00:02.0, 3:0x2, gt=0, width=1, guc_id=0, guc_state=0x0, flags=0x13
xe_sched_job_run: dev=0000:00:02.0, fence=0xffff888276cc8540, seqno=0, lrc_seqno=0, gt=0, guc_id=0, batch_addr=0x000000146910 ...
.....
xe_exec_queue_memory_cat_error: dev=0000:00:02.0, 3:0x2, gt=0, width=1, guc_id=0, guc_state=0x3, flags=0x13
So run_job() is clearly triggered twice, even though the first run must
have already signalled completion during suspend. We can also see a CAT
error after the re-submit.
To prevent this, call xe_sched_stop() to forcefully remove anything on
the pending_list that has already signalled, before we re-submit.
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4856
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Auld <matthew.auld(a)intel.com>
Cc: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
Cc: Matthew Brost <matthew.brost(a)intel.com>
Cc: William Tseng <william.tseng(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v6.8+
---
drivers/gpu/drm/xe/xe_gpu_scheduler.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.h b/drivers/gpu/drm/xe/xe_gpu_scheduler.h
index c250ea773491..9315da58d02d 100644
--- a/drivers/gpu/drm/xe/xe_gpu_scheduler.h
+++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.h
@@ -51,6 +51,7 @@ static inline void xe_sched_tdr_queue_imm(struct xe_gpu_scheduler *sched)
static inline void xe_sched_resubmit_jobs(struct xe_gpu_scheduler *sched)
{
+ xe_sched_stop(sched);
drm_sched_resubmit_jobs(&sched->base);
}
--
2.49.0
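To make the intent of the added call concrete, here is a rough kernel-style sketch of the idea (illustrative stand-ins only; struct job and free_job() are not the real xe/drm structures): prune the pending list before resubmitting so run_job() is never replayed for completed work, which is what xe_sched_stop() accomplishes above.

#include <linux/dma-fence.h>
#include <linux/list.h>

struct job {
	struct list_head link;
	struct dma_fence *fence;
};

void free_job(struct job *job);	/* stand-in for the driver teardown */

static void prune_signalled_jobs(struct list_head *pending)
{
	struct job *job, *tmp;

	list_for_each_entry_safe(job, tmp, pending, link) {
		/*
		 * A signalled fence means the job's resources (the bb,
		 * any memory it accesses, the iommu mappings, ...) may
		 * already be released; replaying run_job() for it would
		 * touch freed state.
		 */
		if (dma_fence_is_signaled(job->fence)) {
			list_del(&job->link);
			free_job(job);
		}
	}
}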
If, during a mremap() operation for a hugetlb-backed memory mapping,
copy_vma() fails after the source vma has been duplicated and
opened (i.e. vma_link() fails), the error is handled by closing the new
vma. This updates the hugetlbfs reservation counter of the reservation
map, which at this point is referenced by both the source vma and the
new copy. As a result, once the new vma has been freed and copy_vma()
returns, the reservation counter for the source vma will be incorrect.
This patch addresses this corner case by clearing the hugetlb private
page reservation reference for the new vma and decrementing the
reference before closing the vma, so that vma_close() won't update the
reservation counter.
The issue was reported by a private syzbot instance, see the error
report log [1] and reproducer [2]. Possible duplicate of public syzbot
report [3].
Signed-off-by: Ricardo Cañuelo Navarro <rcn(a)igalia.com>
Cc: stable(a)vger.kernel.org # 6.12+
Link: https://people.igalia.com/rcn/kernel_logs/20250422__WARNING_in_page_counter… [1]
Link: https://people.igalia.com/rcn/kernel_logs/20250422__WARNING_in_page_counter… [2]
Link: https://lore.kernel.org/all/67000a50.050a0220.49194.048d.GAE@google.com/ [3]
---
mm/vma.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/mm/vma.c b/mm/vma.c
index 839d12f02c885d3338d8d233583eb302d82bb80b..9d9f699ace977c9c869e5da5f88f12be183adcfb 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -1834,6 +1834,8 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
return new_vma;
out_vma_link:
+ if (is_vm_hugetlb_page(new_vma))
+ clear_vma_resv_huge_pages(new_vma);
vma_close(new_vma);
if (new_vma->vm_file)
---
base-commit: 94305e83eccb3120c921cd3a015cd74731140bac
change-id: 20250523-warning_in_page_counter_cancel-e8c71a6b4c88
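To see the accounting problem in isolation, a toy userspace model may help (hypothetical names throughout, not the hugetlbfs API): the counter only stays correct for the original mapping if the failed copy drops its reference before its close path runs, which is what the clear_vma_resv_huge_pages() call above arranges.

#include <assert.h>
#include <stddef.h>

/*
 * Toy model: a reservation map shared by an original mapping and a
 * failed copy.
 */
struct resv_map { int refs; long reserved; };
struct toy_vma { struct resv_map *resv; };

static void toy_vma_close(struct toy_vma *vma)
{
	if (!vma->resv)
		return;
	/* Closing returns this vma's share of the reservation. */
	vma->resv->reserved -= 1;
	vma->resv->refs -= 1;
}

int main(void)
{
	struct resv_map map = { .refs = 1, .reserved = 1 };
	struct toy_vma orig = { .resv = &map };
	struct toy_vma copy = orig;	/* vma duplicated by copy_vma() */

	map.refs++;

	/*
	 * Error path: clear the copy's reference first (the analogue of
	 * clear_vma_resv_huge_pages() above), making its close a no-op.
	 * Without these two lines, toy_vma_close() would drop the
	 * reservation that still belongs to orig.
	 */
	copy.resv = NULL;
	map.refs--;
	toy_vma_close(&copy);

	assert(map.reserved == 1 && map.refs == 1);
	return 0;
}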
From: Xin Li <xin(a)zytor.com>
Clear the software event flag in the augmented SS to prevent infinite
SIGTRAP handler loop if TF is used without an external debugger.
Following is a typical single-stepping flow for a user process:
1) The user process is prepared for single-stepping by setting
RFLAGS.TF = 1.
2) When any instruction in user space completes, a #DB is triggered.
3) The kernel handles the #DB and returns to user space, invoking the
SIGTRAP handler with RFLAGS.TF = 0.
4) After the SIGTRAP handler finishes, the user process performs a
sigreturn syscall, restoring the original state, including
RFLAGS.TF = 1.
5) Goto step 2.
According to the FRED specification:
A) Bit 17 in the augmented SS is designated as the software event
flag, which is set to 1 for FRED event delivery of SYSCALL,
SYSENTER, or INT n.
B) If bit 17 of the augmented SS is 1 and ERETU would result in
RFLAGS.TF = 1, a single-step trap will be pending upon completion
of ERETU.
In step 4) above, the software event flag is set upon the sigreturn
syscall, and its corresponding ERETU would restore RFLAGS.TF = 1.
This combination causes a pending single-step trap upon completion of
ERETU. Therefore, another #DB is triggered before any user space
instruction is executed, which leads to an infinite loop in which the
SIGTRAP handler keeps being invoked on the same user space IP.
Suggested-by: H. Peter Anvin (Intel) <hpa(a)zytor.com>
Signed-off-by: Xin Li (Intel) <xin(a)zytor.com>
Cc: stable(a)vger.kernel.org
---
arch/x86/include/asm/sighandling.h | 20 ++++++++++++++++++++
arch/x86/kernel/signal_32.c | 4 ++++
arch/x86/kernel/signal_64.c | 4 ++++
3 files changed, 28 insertions(+)
diff --git a/arch/x86/include/asm/sighandling.h b/arch/x86/include/asm/sighandling.h
index e770c4fc47f4..ecb0411fe88c 100644
--- a/arch/x86/include/asm/sighandling.h
+++ b/arch/x86/include/asm/sighandling.h
@@ -24,4 +24,24 @@ int ia32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs);
int x64_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs);
int x32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs);
+/*
+ * To prevent infinite SIGTRAP handler loop if TF is used without an external
+ * debugger, clear the software event flag in the augmented SS, ensuring no
+ * single-step trap is pending upon ERETU completion.
+ *
+ * Note, this function should be called in sigreturn() before the original state
+ * is restored to make sure the TF is read from the entry frame.
+ */
+static __always_inline void prevent_single_step_upon_eretu(struct pt_regs *regs)
+{
+ /*
+ * If the trap flag (TF) is set, i.e., the sigreturn() SYSCALL instruction
+ * is being single-stepped, do not clear the software event flag in the
+ * augmented SS, thus a debugger won't skip over the following instruction.
+ */
+ if (IS_ENABLED(CONFIG_X86_FRED) && cpu_feature_enabled(X86_FEATURE_FRED) &&
+ !(regs->flags & X86_EFLAGS_TF))
+ regs->fred_ss.swevent = 0;
+}
+
#endif /* _ASM_X86_SIGHANDLING_H */
diff --git a/arch/x86/kernel/signal_32.c b/arch/x86/kernel/signal_32.c
index 98123ff10506..42bbc42bd350 100644
--- a/arch/x86/kernel/signal_32.c
+++ b/arch/x86/kernel/signal_32.c
@@ -152,6 +152,8 @@ SYSCALL32_DEFINE0(sigreturn)
struct sigframe_ia32 __user *frame = (struct sigframe_ia32 __user *)(regs->sp-8);
sigset_t set;
+ prevent_single_step_upon_eretu(regs);
+
if (!access_ok(frame, sizeof(*frame)))
goto badframe;
if (__get_user(set.sig[0], &frame->sc.oldmask)
@@ -175,6 +177,8 @@ SYSCALL32_DEFINE0(rt_sigreturn)
struct rt_sigframe_ia32 __user *frame;
sigset_t set;
+ prevent_single_step_upon_eretu(regs);
+
frame = (struct rt_sigframe_ia32 __user *)(regs->sp - 4);
if (!access_ok(frame, sizeof(*frame)))
diff --git a/arch/x86/kernel/signal_64.c b/arch/x86/kernel/signal_64.c
index ee9453891901..d483b585c6c6 100644
--- a/arch/x86/kernel/signal_64.c
+++ b/arch/x86/kernel/signal_64.c
@@ -250,6 +250,8 @@ SYSCALL_DEFINE0(rt_sigreturn)
sigset_t set;
unsigned long uc_flags;
+ prevent_single_step_upon_eretu(regs);
+
frame = (struct rt_sigframe __user *)(regs->sp - sizeof(long));
if (!access_ok(frame, sizeof(*frame)))
goto badframe;
@@ -366,6 +368,8 @@ COMPAT_SYSCALL_DEFINE0(x32_rt_sigreturn)
sigset_t set;
unsigned long uc_flags;
+ prevent_single_step_upon_eretu(regs);
+
frame = (struct rt_sigframe_x32 __user *)(regs->sp - 8);
if (!access_ok(frame, sizeof(*frame)))
base-commit: 6a7c3c2606105a41dde81002c0037420bc1ddf00
--
2.49.0
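To make the failure mode tangible, here is a small x86-64 userspace sketch of the flow above (my own illustration, not from the patch; on an affected FRED system without this fix the handler would keep re-firing at the same IP until the cap trips, while on a fixed or non-FRED system only a handful of traps are seen):

#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static volatile sig_atomic_t traps;

static void sigtrap_handler(int sig)
{
	(void)sig;
	if (++traps > 100) {	/* the buggy case re-traps here forever */
		static const char msg[] = "trap storm, bailing out\n";
		write(2, msg, sizeof(msg) - 1);
		_exit(1);
	}
}

int main(void)
{
	signal(SIGTRAP, sigtrap_handler);

	/*
	 * Set RFLAGS.TF (bit 8): each following user instruction raises
	 * #DB, the kernel delivers SIGTRAP, and sigreturn() restores
	 * TF = 1, i.e. steps 2) to 5) above.
	 */
	__asm__ volatile(
		"pushfq\n\t"
		"orq $0x100, (%rsp)\n\t"
		"popfq\n\t"
		"nop");

	/* Clear TF so the program can report and exit normally. */
	__asm__ volatile(
		"pushfq\n\t"
		"andq $~0x100, (%rsp)\n\t"
		"popfq");

	printf("handled %d single-step trap(s)\n", (int)traps);
	return 0;
}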
From: Xin Li <xin(a)zytor.com>
Clear the software event flag in the augmented SS to prevent infinite
SIGTRAP handler loop if TF is used without an external debugger.
Following is a typical single-stepping flow for a user process:
1) The user process is prepared for single-stepping by setting
RFLAGS.TF = 1.
2) When any instruction in user space completes, a #DB is triggered.
3) The kernel handles the #DB and returns to user space, invoking the
SIGTRAP handler with RFLAGS.TF = 0.
4) After the SIGTRAP handler finishes, the user process performs a
sigreturn syscall, restoring the original state, including
RFLAGS.TF = 1.
5) Goto step 2.
According to the FRED specification:
A) Bit 17 in the augmented SS is designated as the software event
flag, which is set to 1 for FRED event delivery of SYSCALL,
SYSENTER, or INT n.
B) If bit 17 of the augmented SS is 1 and ERETU would result in
RFLAGS.TF = 1, a single-step trap will be pending upon completion
of ERETU.
In step 4) above, the software event flag is set upon the sigreturn
syscall, and its corresponding ERETU would restore RFLAGS.TF = 1.
This combination causes a pending single-step trap upon completion of
ERETU. Therefore, another #DB is triggered before any user space
instruction is executed, which leads to an infinite loop in which the
SIGTRAP handler keeps being invoked on the same user space IP.
Suggested-by: H. Peter Anvin (Intel) <hpa(a)zytor.com>
Signed-off-by: Xin Li (Intel) <xin(a)zytor.com>
Cc: stable(a)vger.kernel.org
---
Change in v3:
*) Use "#ifdef CONFIG_X86_FRED" instead of IS_ENABLED(CONFIG_X86_FRED)
(Intel LKP).
Change in v2:
*) Remove the check cpu_feature_enabled(X86_FEATURE_FRED), because
regs->fred_ss.swevent will always be 0 otherwise (H. Peter Anvin).
---
arch/x86/include/asm/sighandling.h | 21 +++++++++++++++++++++
arch/x86/kernel/signal_32.c | 4 ++++
arch/x86/kernel/signal_64.c | 4 ++++
3 files changed, 29 insertions(+)
diff --git a/arch/x86/include/asm/sighandling.h b/arch/x86/include/asm/sighandling.h
index e770c4fc47f4..530eecc371fc 100644
--- a/arch/x86/include/asm/sighandling.h
+++ b/arch/x86/include/asm/sighandling.h
@@ -24,4 +24,25 @@ int ia32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs);
int x64_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs);
int x32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs);
+/*
+ * To prevent infinite SIGTRAP handler loop if TF is used without an external
+ * debugger, clear the software event flag in the augmented SS, ensuring no
+ * single-step trap is pending upon ERETU completion.
+ *
+ * Note, this function should be called in sigreturn() before the original state
+ * is restored to make sure the TF is read from the entry frame.
+ */
+static __always_inline void prevent_single_step_upon_eretu(struct pt_regs *regs)
+{
+ /*
+ * If the trap flag (TF) is set, i.e., the sigreturn() SYSCALL instruction
+ * is being single-stepped, do not clear the software event flag in the
+ * augmented SS, thus a debugger won't skip over the following instruction.
+ */
+#ifdef CONFIG_X86_FRED
+ if (!(regs->flags & X86_EFLAGS_TF))
+ regs->fred_ss.swevent = 0;
+#endif
+}
+
#endif /* _ASM_X86_SIGHANDLING_H */
diff --git a/arch/x86/kernel/signal_32.c b/arch/x86/kernel/signal_32.c
index 98123ff10506..42bbc42bd350 100644
--- a/arch/x86/kernel/signal_32.c
+++ b/arch/x86/kernel/signal_32.c
@@ -152,6 +152,8 @@ SYSCALL32_DEFINE0(sigreturn)
struct sigframe_ia32 __user *frame = (struct sigframe_ia32 __user *)(regs->sp-8);
sigset_t set;
+ prevent_single_step_upon_eretu(regs);
+
if (!access_ok(frame, sizeof(*frame)))
goto badframe;
if (__get_user(set.sig[0], &frame->sc.oldmask)
@@ -175,6 +177,8 @@ SYSCALL32_DEFINE0(rt_sigreturn)
struct rt_sigframe_ia32 __user *frame;
sigset_t set;
+ prevent_single_step_upon_eretu(regs);
+
frame = (struct rt_sigframe_ia32 __user *)(regs->sp - 4);
if (!access_ok(frame, sizeof(*frame)))
diff --git a/arch/x86/kernel/signal_64.c b/arch/x86/kernel/signal_64.c
index ee9453891901..d483b585c6c6 100644
--- a/arch/x86/kernel/signal_64.c
+++ b/arch/x86/kernel/signal_64.c
@@ -250,6 +250,8 @@ SYSCALL_DEFINE0(rt_sigreturn)
sigset_t set;
unsigned long uc_flags;
+ prevent_single_step_upon_eretu(regs);
+
frame = (struct rt_sigframe __user *)(regs->sp - sizeof(long));
if (!access_ok(frame, sizeof(*frame)))
goto badframe;
@@ -366,6 +368,8 @@ COMPAT_SYSCALL_DEFINE0(x32_rt_sigreturn)
sigset_t set;
unsigned long uc_flags;
+ prevent_single_step_upon_eretu(regs);
+
frame = (struct rt_sigframe_x32 __user *)(regs->sp - 8);
if (!access_ok(frame, sizeof(*frame)))
base-commit: 6a7c3c2606105a41dde81002c0037420bc1ddf00
--
2.49.0
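On the v3 changelog item itself, the distinction matters at compile time. The following fragment restates it (hedged: the exact LKP failure isn't quoted here, and the 32-bit angle is my assumption, since regs->fred_ss presumably does not exist on every configuration that includes this header):

/*
 * IS_ENABLED() only folds the condition to a constant 0; the branch
 * body is still parsed, so this would fail to build wherever the
 * fred_ss field is absent:
 *
 *	if (IS_ENABLED(CONFIG_X86_FRED) && !(regs->flags & X86_EFLAGS_TF))
 *		regs->fred_ss.swevent = 0;
 *
 * #ifdef removes the reference from the translation unit entirely:
 */
#ifdef CONFIG_X86_FRED
	if (!(regs->flags & X86_EFLAGS_TF))
		regs->fred_ss.swevent = 0;
#endif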
On 2025/5/23 06:35, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
>
> btrfs: allow buffered write to avoid full page read if it's block aligned
>
> to the 6.14-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> btrfs-allow-buffered-write-to-avoid-full-page-read-i.patch
> and it can be found in the queue-6.14 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
Please drop this patch from all stable branches.
Although this patch mentions a failure in fstests, it acts more like an
optimization for btrfs.
Furthermore, it relies on quite a few patches that may not be in stable
kernels. Without all of its dependencies, this can lead to data
corruption.
Please drop this one from all stable kernels.
Thanks,
Qu
>
>
>
> commit de0860d610aaaee77a8c5c713c41fea584ac83b3
> Author: Qu Wenruo <wqu(a)suse.com>
> Date: Wed Oct 30 17:04:02 2024 +1030
>
> btrfs: allow buffered write to avoid full page read if it's block aligned
>
> [ Upstream commit 0d31ca6584f21821c708752d379871b9fce2dc48 ]
>
> [BUG]
> Since the support of block size (sector size) < page size for btrfs,
> test case generic/563 fails with 4K block size and 64K page size:
>
> --- tests/generic/563.out 2024-04-25 18:13:45.178550333 +0930
> +++ /home/adam/xfstests-dev/results//generic/563.out.bad 2024-09-30 09:09:16.155312379 +0930
> @@ -3,7 +3,8 @@
> read is in range
> write is in range
> write -> read/write
> -read is in range
> +read has value of 8388608
> +read is NOT in range -33792 .. 33792
> write is in range
> ...
>
> [CAUSE]
> The test case creates an 8MiB file, then does buffered writes into the
> 8MiB range using a 4K block size, to overwrite the whole file.
>
> On 4K page sized systems, since the write range covers the full block and
> page, btrfs will not bother reading the page, just like what XFS and EXT4
> do.
>
> But on 64K page sized systems, although the 4K sized write is still block
> aligned, it's not page aligned anymore, thus btrfs will read the full
> page, which will be accounted by cgroup and fail the test.
>
> The test case itself expects that such a 4K block-aligned write should
> not trigger any read.
>
> Such expected behavior is an optimization to reduce folio reads when
> possible, and unfortunately btrfs does not implement this optimization.
>
> [FIX]
> To skip the full page read, we need to do the following modification:
>
> - Do not trigger full page read as long as the buffered write is block
> aligned
> This is pretty simple by modifying the check inside
> prepare_uptodate_page().
>
> - Skip already uptodate blocks during full page read
> Otherwise we can end up with the following data corruption:
>
> 0 32K 64K
> |///////| |
>
> Where the file range [0, 32K) is dirtied by buffered write, the
> remaining range [32K, 64K) is not.
>
> When reading the full page, since [0,32K) is only dirtied but not
> written back, there is no data extent map for it, but a hole covering
> [0, 64K).
>
> If we continue reading the full page range [0, 64K), the dirtied range
> will be filled with 0 (since there is only a hole covering the whole
> range).
> This causes the dirtied range to get lost.
>
> With this optimization, btrfs can pass generic/563 even if the page size
> is larger than fs block size.
>
> Reviewed-by: Filipe Manana <fdmanana(a)suse.com>
> Signed-off-by: Qu Wenruo <wqu(a)suse.com>
> Signed-off-by: David Sterba <dsterba(a)suse.com>
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index 06922529f19dc..13b5359ea1b77 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -974,6 +974,10 @@ static int btrfs_do_readpage(struct folio *folio, struct extent_map **em_cached,
> end_folio_read(folio, true, cur, iosize);
> break;
> }
> + if (btrfs_folio_test_uptodate(fs_info, folio, cur, blocksize)) {
> + end_folio_read(folio, true, cur, blocksize);
> + continue;
> + }
> em = get_extent_map(BTRFS_I(inode), folio, cur, end - cur + 1, em_cached);
> if (IS_ERR(em)) {
> end_folio_read(folio, false, cur, end + 1 - cur);
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index cd4e40a719186..61ad1a79e5698 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -804,14 +804,15 @@ static int prepare_uptodate_folio(struct inode *inode, struct folio *folio, u64
> {
> u64 clamp_start = max_t(u64, pos, folio_pos(folio));
> u64 clamp_end = min_t(u64, pos + len, folio_pos(folio) + folio_size(folio));
> + const u32 blocksize = inode_to_fs_info(inode)->sectorsize;
> int ret = 0;
>
> if (folio_test_uptodate(folio))
> return 0;
>
> if (!force_uptodate &&
> - IS_ALIGNED(clamp_start, PAGE_SIZE) &&
> - IS_ALIGNED(clamp_end, PAGE_SIZE))
> + IS_ALIGNED(clamp_start, blocksize) &&
> + IS_ALIGNED(clamp_end, blocksize))
> return 0;
>
> ret = btrfs_read_folio(NULL, folio);
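For reference, the essence of the two hunks can be condensed as follows (a sketch with hypothetical helpers like block_is_uptodate(); the dependency concern above is exactly that the read side must be in place before the write side becomes safe):

/* Write side: a block-aligned buffered write needs no folio read. */
if (!force_uptodate &&
    IS_ALIGNED(clamp_start, blocksize) && IS_ALIGNED(clamp_end, blocksize))
	return 0;

/* Read side: never re-read blocks that are already uptodate, e.g.
 * just dirtied by a buffered write; filling them from a hole-only
 * extent map would zero them and lose the dirty data. */
for (cur = start; cur < end; cur += blocksize) {
	if (block_is_uptodate(folio, cur, blocksize)) {
		mark_block_read_done(folio, cur, blocksize);
		continue;
	}
	submit_block_read(folio, cur, blocksize);
}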