The Elan eKTH5015M touch controller found on the Lenovo ThinkPad X13s
shares the VCC33 supply with other peripherals that may remain powered
during suspend (e.g. when enabled as wakeup sources).
The reset line is also wired so that it can be left deasserted when the
supply is off.
This is important as it avoids holding the controller in reset for
extended periods of time when it remains powered, which can lead to
increased power consumption, and also avoids leaking current through the
X13s reset circuitry during suspend (and after driver unbind).
Use the new 'no-reset-on-power-off' devicetree property to determine
when reset needs to be asserted on power down.
Notably this also avoids wasting power on machine variants without a
touchscreen for which the driver would otherwise exit probe with reset
asserted.
Fixes: bd3cba00dcc6 ("HID: i2c-hid: elan: Add support for Elan eKTH6915 i2c-hid touchscreens")
Cc: stable(a)vger.kernel.org # 6.0
Cc: Douglas Anderson <dianders(a)chromium.org>
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
---
drivers/hid/i2c-hid/i2c-hid-of-elan.c | 37 ++++++++++++++++++++-------
1 file changed, 28 insertions(+), 9 deletions(-)
diff --git a/drivers/hid/i2c-hid/i2c-hid-of-elan.c b/drivers/hid/i2c-hid/i2c-hid-of-elan.c
index 5b91fb106cfc..8a905027d5e9 100644
--- a/drivers/hid/i2c-hid/i2c-hid-of-elan.c
+++ b/drivers/hid/i2c-hid/i2c-hid-of-elan.c
@@ -31,6 +31,7 @@ struct i2c_hid_of_elan {
struct regulator *vcc33;
struct regulator *vccio;
struct gpio_desc *reset_gpio;
+ bool no_reset_on_power_off;
const struct elan_i2c_hid_chip_data *chip_data;
};
@@ -40,17 +41,17 @@ static int elan_i2c_hid_power_up(struct i2chid_ops *ops)
container_of(ops, struct i2c_hid_of_elan, ops);
int ret;
+ gpiod_set_value_cansleep(ihid_elan->reset_gpio, 1);
+
if (ihid_elan->vcc33) {
ret = regulator_enable(ihid_elan->vcc33);
if (ret)
- return ret;
+ goto err_deassert_reset;
}
ret = regulator_enable(ihid_elan->vccio);
- if (ret) {
- regulator_disable(ihid_elan->vcc33);
- return ret;
- }
+ if (ret)
+ goto err_disable_vcc33;
if (ihid_elan->chip_data->post_power_delay_ms)
msleep(ihid_elan->chip_data->post_power_delay_ms);
@@ -60,6 +61,15 @@ static int elan_i2c_hid_power_up(struct i2chid_ops *ops)
msleep(ihid_elan->chip_data->post_gpio_reset_on_delay_ms);
return 0;
+
+err_disable_vcc33:
+ if (ihid_elan->vcc33)
+ regulator_disable(ihid_elan->vcc33);
+err_deassert_reset:
+ if (ihid_elan->no_reset_on_power_off)
+ gpiod_set_value_cansleep(ihid_elan->reset_gpio, 0);
+
+ return ret;
}
static void elan_i2c_hid_power_down(struct i2chid_ops *ops)
@@ -67,7 +77,14 @@ static void elan_i2c_hid_power_down(struct i2chid_ops *ops)
struct i2c_hid_of_elan *ihid_elan =
container_of(ops, struct i2c_hid_of_elan, ops);
- gpiod_set_value_cansleep(ihid_elan->reset_gpio, 1);
+ /*
+ * Do not assert reset when the hardware allows for it to remain
+ * deasserted regardless of the state of the (shared) power supply to
+ * avoid wasting power when the supply is left on.
+ */
+ if (!ihid_elan->no_reset_on_power_off)
+ gpiod_set_value_cansleep(ihid_elan->reset_gpio, 1);
+
if (ihid_elan->chip_data->post_gpio_reset_off_delay_ms)
msleep(ihid_elan->chip_data->post_gpio_reset_off_delay_ms);
@@ -87,12 +104,14 @@ static int i2c_hid_of_elan_probe(struct i2c_client *client)
ihid_elan->ops.power_up = elan_i2c_hid_power_up;
ihid_elan->ops.power_down = elan_i2c_hid_power_down;
- /* Start out with reset asserted */
- ihid_elan->reset_gpio =
- devm_gpiod_get_optional(&client->dev, "reset", GPIOD_OUT_HIGH);
+ ihid_elan->reset_gpio = devm_gpiod_get_optional(&client->dev, "reset",
+ GPIOD_ASIS);
if (IS_ERR(ihid_elan->reset_gpio))
return PTR_ERR(ihid_elan->reset_gpio);
+ ihid_elan->no_reset_on_power_off = of_property_read_bool(client->dev.of_node,
+ "no-reset-on-power-off");
+
ihid_elan->vccio = devm_regulator_get(&client->dev, "vccio");
if (IS_ERR(ihid_elan->vccio))
return PTR_ERR(ihid_elan->vccio);
--
2.43.2
It turned out that KMSAN instruments READ_ONCE_NOCHECK(), resulting in
false positive reports, because __no_sanitize_or_inline enforced inlining.
Properly declare __no_sanitize_or_inline under __SANITIZE_MEMORY__,
so that it does not __always_inline the annotated function.
Reported-by: syzbot+355c5bb8c1445c871ee8(a)syzkaller.appspotmail.com
Link: https://lkml.kernel.org/r/000000000000826ac1061675b0e3@google.com
Fixes: 5de0ce85f5a4 ("kmsan: mark noinstr as __no_sanitize_memory")
Cc: stable(a)vger.kernel.org
Signed-off-by: Alexander Potapenko <glider(a)google.com>
Reviewed-by: Marco Elver <elver(a)google.com>
---
include/linux/compiler_types.h | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index 0caf354cb94b5..a6a28952836cb 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -278,6 +278,17 @@ struct ftrace_likely_data {
# define __no_kcsan
#endif
+#ifdef __SANITIZE_MEMORY__
+/*
+ * Similarly to KASAN and KCSAN, KMSAN loses function attributes of inlined
+ * functions, therefore disabling KMSAN checks also requires disabling inlining.
+ *
+ * __no_sanitize_or_inline effectively prevents KMSAN from reporting errors
+ * within the function and marks all its outputs as initialized.
+ */
+# define __no_sanitize_or_inline __no_kmsan_checks notrace __maybe_unused
+#endif
+
#ifndef __no_sanitize_or_inline
#define __no_sanitize_or_inline __always_inline
#endif
--
2.44.0.769.g3c40516874-goog
Correctly set the length of the drm_event to the size of the structure
that's actually used.
The length of the drm_event was set to the parent structure instead of
to the drm_vmw_event_fence which is supposed to be read. drm_read
uses the length parameter to copy the event to the user space thus
resuling in oob reads.
Signed-off-by: Zack Rusin <zack.rusin(a)broadcom.com>
Fixes: 8b7de6aa8468 ("vmwgfx: Rework fence event action")
Reported-by: zdi-disclosures(a)trendmicro.com # ZDI-CAN-23566
Cc: David Airlie <airlied(a)gmail.com>
CC: Daniel Vetter <daniel(a)ffwll.ch>
Cc: Zack Rusin <zack.rusin(a)broadcom.com>
Cc: Broadcom internal kernel review list <bcm-kernel-feedback-list(a)broadcom.com>
Cc: dri-devel(a)lists.freedesktop.org
Cc: linux-kernel(a)vger.kernel.org
Cc: <stable(a)vger.kernel.org> # v3.4+
---
drivers/gpu/drm/vmwgfx/vmwgfx_fence.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_fence.c b/drivers/gpu/drm/vmwgfx/vmwgfx_fence.c
index 2a0cda324703..5efc6a766f64 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_fence.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_fence.c
@@ -991,7 +991,7 @@ static int vmw_event_fence_action_create(struct drm_file *file_priv,
}
event->event.base.type = DRM_VMW_EVENT_FENCE_SIGNALED;
- event->event.base.length = sizeof(*event);
+ event->event.base.length = sizeof(event->event);
event->event.user_data = user_data;
ret = drm_event_reserve_init(dev, file_priv, &event->base, &event->event.base);
--
2.40.1
Legacy DU was broken by the referenced fixes commit because the placement
and the busy_placement no longer pointed to the same object. This was later
fixed indirectly by commit a78a8da51b36c7a0c0c16233f91d60aac03a5a49
("drm/ttm: replace busy placement with flags v6") in v6.9.
Fixes: 39985eea5a6d ("drm/vmwgfx: Abstract placement selection")
Signed-off-by: Ian Forbes <ian.forbes(a)broadcom.com>
Cc: <stable(a)vger.kernel.org> # v6.4+
---
drivers/gpu/drm/vmwgfx/vmwgfx_bo.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
index 2bfac3aad7b7..98e73eb0ccf1 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
@@ -204,6 +204,7 @@ int vmw_bo_pin_in_start_of_vram(struct vmw_private *dev_priv,
VMW_BO_DOMAIN_VRAM,
VMW_BO_DOMAIN_VRAM);
buf->places[0].lpfn = PFN_UP(bo->resource->size);
+ buf->busy_places[0].lpfn = PFN_UP(bo->resource->size);
ret = ttm_bo_validate(bo, &buf->placement, &ctx);
/* For some reason we didn't end up at the start of vram */
--
2.34.1
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 35e351780fa9d8240dd6f7e4f245f9ea37e96c19
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042319-freeing-degree-ca0d@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
35e351780fa9 ("fork: defer linking file vma until vma is fully initialized")
d24062914837 ("fork: use __mt_dup() to duplicate maple tree in dup_mmap()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 35e351780fa9d8240dd6f7e4f245f9ea37e96c19 Mon Sep 17 00:00:00 2001
From: Miaohe Lin <linmiaohe(a)huawei.com>
Date: Wed, 10 Apr 2024 17:14:41 +0800
Subject: [PATCH] fork: defer linking file vma until vma is fully initialized
Thorvald reported a WARNING [1]. And the root cause is below race:
CPU 1 CPU 2
fork hugetlbfs_fallocate
dup_mmap hugetlbfs_punch_hole
i_mmap_lock_write(mapping);
vma_interval_tree_insert_after -- Child vma is visible through i_mmap tree.
i_mmap_unlock_write(mapping);
hugetlb_dup_vma_private -- Clear vma_lock outside i_mmap_rwsem!
i_mmap_lock_write(mapping);
hugetlb_vmdelete_list
vma_interval_tree_foreach
hugetlb_vma_trylock_write -- Vma_lock is cleared.
tmp->vm_ops->open -- Alloc new vma_lock outside i_mmap_rwsem!
hugetlb_vma_unlock_write -- Vma_lock is assigned!!!
i_mmap_unlock_write(mapping);
hugetlb_dup_vma_private() and hugetlb_vm_op_open() are called outside
i_mmap_rwsem lock while vma lock can be used in the same time. Fix this
by deferring linking file vma until vma is fully initialized. Those vmas
should be initialized first before they can be used.
Link: https://lkml.kernel.org/r/20240410091441.3539905-1-linmiaohe@huawei.com
Fixes: 8d9bfb260814 ("hugetlb: add vma based lock for pmd sharing")
Signed-off-by: Miaohe Lin <linmiaohe(a)huawei.com>
Reported-by: Thorvald Natvig <thorvald(a)google.com>
Closes: https://lore.kernel.org/linux-mm/20240129161735.6gmjsswx62o4pbja@revolver/T/ [1]
Reviewed-by: Jane Chu <jane.chu(a)oracle.com>
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: Heiko Carstens <hca(a)linux.ibm.com>
Cc: Kent Overstreet <kent.overstreet(a)linux.dev>
Cc: Liam R. Howlett <Liam.Howlett(a)oracle.com>
Cc: Mateusz Guzik <mjguzik(a)gmail.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: Oleg Nesterov <oleg(a)redhat.com>
Cc: Peng Zhang <zhangpeng.00(a)bytedance.com>
Cc: Tycho Andersen <tandersen(a)netflix.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/kernel/fork.c b/kernel/fork.c
index 39a5046c2f0b..aebb3e6c96dc 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -714,6 +714,23 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
} else if (anon_vma_fork(tmp, mpnt))
goto fail_nomem_anon_vma_fork;
vm_flags_clear(tmp, VM_LOCKED_MASK);
+ /*
+ * Copy/update hugetlb private vma information.
+ */
+ if (is_vm_hugetlb_page(tmp))
+ hugetlb_dup_vma_private(tmp);
+
+ /*
+ * Link the vma into the MT. After using __mt_dup(), memory
+ * allocation is not necessary here, so it cannot fail.
+ */
+ vma_iter_bulk_store(&vmi, tmp);
+
+ mm->map_count++;
+
+ if (tmp->vm_ops && tmp->vm_ops->open)
+ tmp->vm_ops->open(tmp);
+
file = tmp->vm_file;
if (file) {
struct address_space *mapping = file->f_mapping;
@@ -730,25 +747,9 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
i_mmap_unlock_write(mapping);
}
- /*
- * Copy/update hugetlb private vma information.
- */
- if (is_vm_hugetlb_page(tmp))
- hugetlb_dup_vma_private(tmp);
-
- /*
- * Link the vma into the MT. After using __mt_dup(), memory
- * allocation is not necessary here, so it cannot fail.
- */
- vma_iter_bulk_store(&vmi, tmp);
-
- mm->map_count++;
if (!(tmp->vm_flags & VM_WIPEONFORK))
retval = copy_page_range(tmp, mpnt);
- if (tmp->vm_ops && tmp->vm_ops->open)
- tmp->vm_ops->open(tmp);
-
if (retval) {
mpnt = vma_next(&vmi);
goto loop_out;
The patch below does not apply to the 6.8-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.8.y
git checkout FETCH_HEAD
git cherry-pick -x ca7c52ac7ad384bcf299d89482c45fec7cd00da9
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042358-esteemed-fastball-c2d8@gregkh' --subject-prefix 'PATCH 6.8.y' HEAD^..
Possible dependencies:
ca7c52ac7ad3 ("drm/xe/vm: prevent UAF with asid based lookup")
0eb2a18a8fad ("drm/xe: Implement VM snapshot support for BO's and userptr")
be7d51c5b468 ("drm/xe: Add batch buffer addresses to devcoredump")
4376cee62092 ("drm/xe: Print more device information in devcoredump")
98fefec8c381 ("drm/xe: Change devcoredump functions parameters to xe_sched_job")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ca7c52ac7ad384bcf299d89482c45fec7cd00da9 Mon Sep 17 00:00:00 2001
From: Matthew Auld <matthew.auld(a)intel.com>
Date: Fri, 12 Apr 2024 12:31:45 +0100
Subject: [PATCH] drm/xe/vm: prevent UAF with asid based lookup
The asid is only erased from the xarray when the vm refcount reaches
zero, however this leads to potential UAF since the xe_vm_get() only
works on a vm with refcount != 0. Since the asid is allocated in the vm
create ioctl, rather erase it when closing the vm, prior to dropping the
potential last ref. This should also work when user closes driver fd
without explicit vm destroy.
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1594
Signed-off-by: Matthew Auld <matthew.auld(a)intel.com>
Cc: Matthew Brost <matthew.brost(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v6.8+
Reviewed-by: Matthew Brost <matthew.brost(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240412113144.259426-4-matth…
(cherry picked from commit 83967c57320d0d01ae512f10e79213f81e4bf594)
Signed-off-by: Lucas De Marchi <lucas.demarchi(a)intel.com>
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 62d1ef8867a8..3d4c8f342e21 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -1577,6 +1577,16 @@ void xe_vm_close_and_put(struct xe_vm *vm)
xe->usm.num_vm_in_fault_mode--;
else if (!(vm->flags & XE_VM_FLAG_MIGRATION))
xe->usm.num_vm_in_non_fault_mode--;
+
+ if (vm->usm.asid) {
+ void *lookup;
+
+ xe_assert(xe, xe->info.has_asid);
+ xe_assert(xe, !(vm->flags & XE_VM_FLAG_MIGRATION));
+
+ lookup = xa_erase(&xe->usm.asid_to_vm, vm->usm.asid);
+ xe_assert(xe, lookup == vm);
+ }
mutex_unlock(&xe->usm.lock);
for_each_tile(tile, xe, id)
@@ -1592,24 +1602,15 @@ static void vm_destroy_work_func(struct work_struct *w)
struct xe_device *xe = vm->xe;
struct xe_tile *tile;
u8 id;
- void *lookup;
/* xe_vm_close_and_put was not called? */
xe_assert(xe, !vm->size);
mutex_destroy(&vm->snap_mutex);
- if (!(vm->flags & XE_VM_FLAG_MIGRATION)) {
+ if (!(vm->flags & XE_VM_FLAG_MIGRATION))
xe_device_mem_access_put(xe);
- if (xe->info.has_asid && vm->usm.asid) {
- mutex_lock(&xe->usm.lock);
- lookup = xa_erase(&xe->usm.asid_to_vm, vm->usm.asid);
- xe_assert(xe, lookup == vm);
- mutex_unlock(&xe->usm.lock);
- }
- }
-
for_each_tile(tile, xe, id)
XE_WARN_ON(vm->pt_root[id]);
On boards where EC IRQ is not wake capable, EC does not trigger IRQ to
signal any non-wake events until EC receives host resume event.
Commit 47ea0ddb1f56 ("platform/chrome: cros_ec_lpc: Separate host
command and irq disable") separated enabling IRQ and sending resume
event host command into early_resume and resume_complete stages
respectively. This separation leads to host not handling certain events
posted during a small time window between early_resume and
resume_complete stages. This change moves handling all events that
happened during suspend after sending host resume event.
Fixes: 47ea0ddb1f56 ("platform/chrome: cros_ec_lpc: Separate host command and irq disable")
Cc: stable(a)vger.kernel.org
Cc: Lalith Rajendran <lalithkraj(a)chromium.org>
Cc: chrome-platform(a)lists.linux.dev
Signed-off-by: Karthikeyan Ramasubramanian <kramasub(a)chromium.org>
---
drivers/platform/chrome/cros_ec.c | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)
diff --git a/drivers/platform/chrome/cros_ec.c b/drivers/platform/chrome/cros_ec.c
index badc68bbae8cc..41714df053916 100644
--- a/drivers/platform/chrome/cros_ec.c
+++ b/drivers/platform/chrome/cros_ec.c
@@ -432,6 +432,12 @@ static void cros_ec_send_resume_event(struct cros_ec_device *ec_dev)
void cros_ec_resume_complete(struct cros_ec_device *ec_dev)
{
cros_ec_send_resume_event(ec_dev);
+ /*
+ * Let the mfd devices know about events that occur during
+ * suspend. This way the clients know what to do with them.
+ */
+ cros_ec_report_events_during_suspend(ec_dev);
+
}
EXPORT_SYMBOL(cros_ec_resume_complete);
@@ -442,12 +448,6 @@ static void cros_ec_enable_irq(struct cros_ec_device *ec_dev)
if (ec_dev->wake_enabled)
disable_irq_wake(ec_dev->irq);
-
- /*
- * Let the mfd devices know about events that occur during
- * suspend. This way the clients know what to do with them.
- */
- cros_ec_report_events_during_suspend(ec_dev);
}
/**
@@ -475,8 +475,9 @@ EXPORT_SYMBOL(cros_ec_resume_early);
*/
int cros_ec_resume(struct cros_ec_device *ec_dev)
{
- cros_ec_enable_irq(ec_dev);
- cros_ec_send_resume_event(ec_dev);
+ cros_ec_resume_early(ec_dev);
+ cros_ec_resume_complete(ec_dev);
+
return 0;
}
EXPORT_SYMBOL(cros_ec_resume);
--
2.44.0.769.g3c40516874-goog
The quilt patch titled
Subject: ocfs2: use coarse time for new created files
has been removed from the -mm tree. Its filename was
ocfs2-use-coarse-time-for-new-created-files.patch
This patch was dropped because it was merged into the mm-nonmm-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Su Yue <glass.su(a)suse.com>
Subject: ocfs2: use coarse time for new created files
Date: Mon, 8 Apr 2024 16:20:41 +0800
The default atime related mount option is '-o realtime' which means file
atime should be updated if atime <= ctime or atime <= mtime. atime should
be updated in the following scenario, but it is not:
==========================================================
$ rm /mnt/testfile;
$ echo test > /mnt/testfile
$ stat -c "%X %Y %Z" /mnt/testfile
1711881646 1711881646 1711881646
$ sleep 5
$ cat /mnt/testfile > /dev/null
$ stat -c "%X %Y %Z" /mnt/testfile
1711881646 1711881646 1711881646
==========================================================
And the reason the atime in the test is not updated is that ocfs2 calls
ktime_get_real_ts64() in __ocfs2_mknod_locked during file creation. Then
inode_set_ctime_current() is called in inode_set_ctime_current() calls
ktime_get_coarse_real_ts64() to get current time.
ktime_get_real_ts64() is more accurate than ktime_get_coarse_real_ts64().
In my test box, I saw ctime set by ktime_get_coarse_real_ts64() is less
than ktime_get_real_ts64() even ctime is set later. The ctime of the new
inode is smaller than atime.
The call trace is like:
ocfs2_create
ocfs2_mknod
__ocfs2_mknod_locked
....
ktime_get_real_ts64 <------- set atime,ctime,mtime, more accurate
ocfs2_populate_inode
...
ocfs2_init_acl
ocfs2_acl_set_mode
inode_set_ctime_current
current_time
ktime_get_coarse_real_ts64 <-------less accurate
ocfs2_file_read_iter
ocfs2_inode_lock_atime
ocfs2_should_update_atime
atime <= ctime ? <-------- false, ctime < atime due to accuracy
So here call ktime_get_coarse_real_ts64 to set inode time coarser while
creating new files. It may lower the accuracy of file times. But it's
not a big deal since we already use coarse time in other places like
ocfs2_update_inode_atime and inode_set_ctime_current.
Link: https://lkml.kernel.org/r/20240408082041.20925-5-glass.su@suse.com
Fixes: c62c38f6b91b ("ocfs2: replace CURRENT_TIME macro")
Signed-off-by: Su Yue <glass.su(a)suse.com>
Reviewed-by: Joseph Qi <joseph.qi(a)linux.alibaba.com>
Cc: Mark Fasheh <mark(a)fasheh.com>
Cc: Joel Becker <jlbec(a)evilplan.org>
Cc: Junxiao Bi <junxiao.bi(a)oracle.com>
Cc: Changwei Ge <gechangwei(a)live.cn>
Cc: Gang He <ghe(a)suse.com>
Cc: Jun Piao <piaojun(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/ocfs2/namei.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/ocfs2/namei.c~ocfs2-use-coarse-time-for-new-created-files
+++ a/fs/ocfs2/namei.c
@@ -566,7 +566,7 @@ static int __ocfs2_mknod_locked(struct i
fe->i_last_eb_blk = 0;
strcpy(fe->i_signature, OCFS2_INODE_SIGNATURE);
fe->i_flags |= cpu_to_le32(OCFS2_VALID_FL);
- ktime_get_real_ts64(&ts);
+ ktime_get_coarse_real_ts64(&ts);
fe->i_atime = fe->i_ctime = fe->i_mtime =
cpu_to_le64(ts.tv_sec);
fe->i_mtime_nsec = fe->i_ctime_nsec = fe->i_atime_nsec =
_
Patches currently in -mm which might be from glass.su(a)suse.com are
The quilt patch titled
Subject: ocfs2: update inode fsync transaction id in ocfs2_unlink and ocfs2_link
has been removed from the -mm tree. Its filename was
ocfs2-update-inode-fsync-transaction-id-in-ocfs2_unlink-and-ocfs2_link.patch
This patch was dropped because it was merged into the mm-nonmm-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Su Yue <glass.su(a)suse.com>
Subject: ocfs2: update inode fsync transaction id in ocfs2_unlink and ocfs2_link
Date: Mon, 8 Apr 2024 16:20:40 +0800
transaction id should be updated in ocfs2_unlink and ocfs2_link.
Otherwise, inode link will be wrong after journal replay even fsync was
called before power failure:
=======================================================================
$ touch testdir/bar
$ ln testdir/bar testdir/bar_link
$ fsync testdir/bar
$ stat -c %h $SCRATCH_MNT/testdir/bar
1
$ stat -c %h $SCRATCH_MNT/testdir/bar
1
=======================================================================
Link: https://lkml.kernel.org/r/20240408082041.20925-4-glass.su@suse.com
Fixes: ccd979bdbce9 ("[PATCH] OCFS2: The Second Oracle Cluster Filesystem")
Signed-off-by: Su Yue <glass.su(a)suse.com>
Reviewed-by: Joseph Qi <joseph.qi(a)linux.alibaba.com>
Cc: Changwei Ge <gechangwei(a)live.cn>
Cc: Gang He <ghe(a)suse.com>
Cc: Joel Becker <jlbec(a)evilplan.org>
Cc: Jun Piao <piaojun(a)huawei.com>
Cc: Junxiao Bi <junxiao.bi(a)oracle.com>
Cc: Mark Fasheh <mark(a)fasheh.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/ocfs2/namei.c | 2 ++
1 file changed, 2 insertions(+)
--- a/fs/ocfs2/namei.c~ocfs2-update-inode-fsync-transaction-id-in-ocfs2_unlink-and-ocfs2_link
+++ a/fs/ocfs2/namei.c
@@ -797,6 +797,7 @@ static int ocfs2_link(struct dentry *old
ocfs2_set_links_count(fe, inode->i_nlink);
fe->i_ctime = cpu_to_le64(inode_get_ctime_sec(inode));
fe->i_ctime_nsec = cpu_to_le32(inode_get_ctime_nsec(inode));
+ ocfs2_update_inode_fsync_trans(handle, inode, 0);
ocfs2_journal_dirty(handle, fe_bh);
err = ocfs2_add_entry(handle, dentry, inode,
@@ -993,6 +994,7 @@ static int ocfs2_unlink(struct inode *di
drop_nlink(inode);
drop_nlink(inode);
ocfs2_set_links_count(fe, inode->i_nlink);
+ ocfs2_update_inode_fsync_trans(handle, inode, 0);
ocfs2_journal_dirty(handle, fe_bh);
inode_set_mtime_to_ts(dir, inode_set_ctime_current(dir));
_
Patches currently in -mm which might be from glass.su(a)suse.com are
The quilt patch titled
Subject: ocfs2: fix races between hole punching and AIO+DIO
has been removed from the -mm tree. Its filename was
ocfs2-fix-races-between-hole-punching-and-aiodio.patch
This patch was dropped because it was merged into the mm-nonmm-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Su Yue <glass.su(a)suse.com>
Subject: ocfs2: fix races between hole punching and AIO+DIO
Date: Mon, 8 Apr 2024 16:20:39 +0800
After commit "ocfs2: return real error code in ocfs2_dio_wr_get_block",
fstests/generic/300 become from always failed to sometimes failed:
========================================================================
[ 473.293420 ] run fstests generic/300
[ 475.296983 ] JBD2: Ignoring recovery information on journal
[ 475.302473 ] ocfs2: Mounting device (253,1) on (node local, slot 0) with ordered data mode.
[ 494.290998 ] OCFS2: ERROR (device dm-1): ocfs2_change_extent_flag: Owner 5668 has an extent at cpos 78723 which can no longer be found
[ 494.291609 ] On-disk corruption discovered. Please run fsck.ocfs2 once the filesystem is unmounted.
[ 494.292018 ] OCFS2: File system is now read-only.
[ 494.292224 ] (kworker/19:11,2628,19):ocfs2_mark_extent_written:5272 ERROR: status = -30
[ 494.292602 ] (kworker/19:11,2628,19):ocfs2_dio_end_io_write:2374 ERROR: status = -3
fio: io_u error on file /mnt/scratch/racer: Read-only file system: write offset=460849152, buflen=131072
=========================================================================
In __blockdev_direct_IO, ocfs2_dio_wr_get_block is called to add unwritten
extents to a list. extents are also inserted into extent tree in
ocfs2_write_begin_nolock. Then another thread call fallocate to puch a
hole at one of the unwritten extent. The extent at cpos was removed by
ocfs2_remove_extent(). At end io worker thread, ocfs2_search_extent_list
found there is no such extent at the cpos.
T1 T2 T3
inode lock
...
insert extents
...
inode unlock
ocfs2_fallocate
__ocfs2_change_file_space
inode lock
lock ip_alloc_sem
ocfs2_remove_inode_range inode
ocfs2_remove_btree_range
ocfs2_remove_extent
^---remove the extent at cpos 78723
...
unlock ip_alloc_sem
inode unlock
ocfs2_dio_end_io
ocfs2_dio_end_io_write
lock ip_alloc_sem
ocfs2_mark_extent_written
ocfs2_change_extent_flag
ocfs2_search_extent_list
^---failed to find extent
...
unlock ip_alloc_sem
In most filesystems, fallocate is not compatible with racing with AIO+DIO,
so fix it by adding to wait for all dio before fallocate/punch_hole like
ext4.
Link: https://lkml.kernel.org/r/20240408082041.20925-3-glass.su@suse.com
Fixes: b25801038da5 ("ocfs2: Support xfs style space reservation ioctls")
Signed-off-by: Su Yue <glass.su(a)suse.com>
Reviewed-by: Joseph Qi <joseph.qi(a)linux.alibaba.com>
Cc: Changwei Ge <gechangwei(a)live.cn>
Cc: Gang He <ghe(a)suse.com>
Cc: Joel Becker <jlbec(a)evilplan.org>
Cc: Jun Piao <piaojun(a)huawei.com>
Cc: Junxiao Bi <junxiao.bi(a)oracle.com>
Cc: Mark Fasheh <mark(a)fasheh.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/ocfs2/file.c | 2 ++
1 file changed, 2 insertions(+)
--- a/fs/ocfs2/file.c~ocfs2-fix-races-between-hole-punching-and-aiodio
+++ a/fs/ocfs2/file.c
@@ -1936,6 +1936,8 @@ static int __ocfs2_change_file_space(str
inode_lock(inode);
+ /* Wait all existing dio workers, newcomers will block on i_rwsem */
+ inode_dio_wait(inode);
/*
* This prevents concurrent writes on other nodes
*/
_
Patches currently in -mm which might be from glass.su(a)suse.com are