Hi,
This series enables USB3 Type-C lane orientation detection and
configuration on platforms that support this (Google gs101), and it
also allows the DWC3 core to enter runtime suspend even when UDC is
active.
For lane orientation, this driver now optionally (based on DT)
subscribes to the TCPC's lane orientation notifier and remembers the
orientation to later be used during phy_init().
To enable DWC3 runtime suspend, the gadget needs to inform the core via
dwc3_gadget_interrupt() with event type == DWC3_DEVICE_EVENT_DISCONNECT
of a cable disconnect. For that to allow to happen, this driver
therefore needs to stop forcing the Vbus and bvalid signals to active
and instead change their state based on actual conditions. The same
TCPC notifier is used to detect this, and program the hardware
accordingly.
That signal state is based on advice given by Thinh in
https://lore.kernel.org/all/20240813230625.jgkatqstyhcmpezv@synopsys.com/
Both changes together now allow cable orientation detection to work, as
the DWC3 will now call phy_exit() on cable disconnect, and we can
reprogram the lane mux in phy_init().
On top of that, there are some small related cleanup patches.
Signed-off-by: André Draszik <andre.draszik(a)linaro.org>
---
Changes in v3:
- patches 1 & 2: update as per Rob's suggestions
- patch 7 & 8: drop init to -1 of phy_drd->orientation (Vinod)
- patch 7: avoid an #ifdef
- Link to v2: https://lore.kernel.org/r/20241203-gs101-phy-lanes-orientation-phy-v2-0-40d…
Changes in v2:
- squash patches #2 and #3 from v1 to actually disallow
orientation-switch on !gs101 (not just optional) (Conor)
- update bindings commit message to clarify that the intention for the
driver is to work with old and new DTS (Conor)
- add cc-stable and fixes tags to power gating patch (Krzysztof)
- fix an #include and typo (Peter)
- Link to v1: https://lore.kernel.org/r/20241127-gs101-phy-lanes-orientation-phy-v1-0-1b7…
---
André Draszik (8):
dt-bindings: phy: samsung,usb3-drd-phy: add blank lines between DT properties
dt-bindings: phy: samsung,usb3-drd-phy: gs101: require Type-C properties
phy: exynos5-usbdrd: convert to dev_err_probe
phy: exynos5-usbdrd: fix EDS distribution tuning (gs101)
phy: exynos5-usbdrd: gs101: ensure power is gated to SS phy in phy_exit()
phy: exynos5-usbdrd: gs101: configure SS lanes based on orientation
phy: exynos5-usbdrd: subscribe to orientation notifier if required
phy: exynos5-usbdrd: allow DWC3 runtime suspend with UDC bound (E850+)
.../bindings/phy/samsung,usb3-drd-phy.yaml | 21 +-
drivers/phy/samsung/Kconfig | 1 +
drivers/phy/samsung/phy-exynos5-usbdrd.c | 215 ++++++++++++++++-----
3 files changed, 190 insertions(+), 47 deletions(-)
---
base-commit: c245a7a79602ccbee780c004c1e4abcda66aec32
change-id: 20241127-gs101-phy-lanes-orientation-phy-29d20c6d84d2
Best regards,
--
André Draszik <andre.draszik(a)linaro.org>
This series fixes various small issues in the drivers, and adds a few
things (a couple of pixel formats and a debugging feature).
It also takes a few steps in adding more i2c read/write error handlings
to the drivers, but covers only the easy places.
Adding error handling to all reads/writes needs more thinking, perhaps
adding a "ret" parameter to the calls, similar to the cci_* functions,
or perhaps adding helpers for writing multiple registers from a given
table. Also, in some places rolling back from an error will require
work.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen(a)ideasonboard.com>
---
Changes in v3:
- Include bitfield.h for FIELD_PREP()
- Cc stable for relevant fixes
- Link to v2: https://lore.kernel.org/r/20241108-ub9xx-fixes-v2-0-c7db3b2ad89f@ideasonboa…
Changes in v2:
- Address comments from Andy
- Add two new patches:
- media: i2c: ds90ub960: Fix shadowing of local variables
- media: i2c: ds90ub960: Use HZ_PER_MHZ
- Link to v1: https://lore.kernel.org/r/20241004-ub9xx-fixes-v1-0-e30a4633c786@ideasonboa…
---
Tomi Valkeinen (15):
media: i2c: ds90ub9x3: Fix extra fwnode_handle_put()
media: i2c: ds90ub960: Fix UB9702 refclk register access
media: i2c: ds90ub960: Fix use of non-existing registers on UB9702
media: i2c: ds90ub960: Fix logging SP & EQ status only for UB9702
media: i2c: ds90ub960: Fix UB9702 VC map
media: i2c: ds90ub960: Use HZ_PER_MHZ
media: i2c: ds90ub960: Add support for I2C_RX_ID
media: i2c: ds90ub960: Add RGB24, RAW8 and RAW10 formats
media: i2c: ds90ub953: Clear CRC errors in ub953_log_status()
media: i2c: ds90ub960: Drop unused indirect block define
media: i2c: ds90ub960: Reduce sleep in ub960_rxport_wait_locks()
media: i2c: ds90ub960: Handle errors in ub960_log_status_ub960_sp_eq()
media: i2c: ds90ub913: Add error handling to ub913_hw_init()
media: i2c: ds90ub953: Add error handling for i2c reads/writes
media: i2c: ds90ub960: Fix shadowing of local variables
drivers/media/i2c/ds90ub913.c | 26 ++++--
drivers/media/i2c/ds90ub953.c | 56 +++++++++----
drivers/media/i2c/ds90ub960.c | 186 ++++++++++++++++++++++++++++--------------
3 files changed, 187 insertions(+), 81 deletions(-)
---
base-commit: adc218676eef25575469234709c2d87185ca223a
change-id: 20241004-ub9xx-fixes-bba80dc48627
Best regards,
--
Tomi Valkeinen <tomi.valkeinen(a)ideasonboard.com>
This patchset implements UVC v1.5 region of interest using V4L2
control API.
ROI control is consisted two uvc specific controls.
1. A rectangle control with a newly added type V4L2_CTRL_TYPE_RECT.
2. An auto control with type bitmask.
V4L2_CTRL_WHICH_MIN/MAX_VAL is added to support the rectangle control.
The corresponding v4l-utils series can be found at
https://patchwork.linuxtv.org/project/linux-media/list/?series=11069 .
Tested with v4l2-compliance, v4l2-ctl, calling ioctls on usb cameras and
VIVID with a newly added V4L2_CTRL_TYPE_RECT control.
This set includes also the patch:
media: uvcvideo: Fix event flags in uvc_ctrl_send_events
It is not technically part of this change, but we conflict with it.
I am continuing the work that Yunke did.
Changes in v15:
- Modify mapping set/get to support any size
- Remove v4l2_size field. It is not needed, we can use the v4l2_type to
infer it.
- Improve documentation.
- Lots of refactoring, now adding compound and roi are very small
patches.
- Remove rectangle clamping, not supported by some firmware.
- Remove init, we can add it later.
- Move uvc_cid to USER_BASE
- Link to v14: https://lore.kernel.org/linux-media/20231201071907.3080126-1-yunkec@google.…
Signed-off-by: Ricardo Ribalda <ribalda(a)chromium.org>
---
Hans Verkuil (1):
media: v4l2-ctrls: add support for V4L2_CTRL_WHICH_MIN/MAX_VAL
Ricardo Ribalda (12):
media: uvcvideo: Fix event flags in uvc_ctrl_send_events
media: uvcvideo: Handle uvc menu translation inside uvc_get_le_value
media: uvcvideo: Handle uvc menu translation inside uvc_set_le_value
media: uvcvideo: refactor uvc_ioctl_g_ext_ctrls
media: uvcvideo: uvc_ioctl_(g|s)_ext_ctrls: handle NoP case
media: uvcvideo: Support any size for mapping get/set
media: uvcvideo: Factor out clamping from uvc_ctrl_set
media: uvcvideo: Factor out query_boundaries from query_ctrl
media: uvcvideo: Use the camera to clamp compound controls
media: uvcvideo: let v4l2_query_v4l2_ctrl() work with v4l2_query_ext_ctrl
media: uvcvideo: Introduce uvc_mapping_v4l2_size
media: uvcvideo: Add sanity check to uvc_ioctl_xu_ctrl_map
Yunke Cao (6):
media: v4l2_ctrl: Add V4L2_CTRL_TYPE_RECT
media: vivid: Add a rectangle control
media: uvcvideo: add support for compound controls
media: uvcvideo: support V4L2_CTRL_WHICH_MIN/MAX_VAL
media: uvcvideo: implement UVC v1.5 ROI
media: uvcvideo: document UVC v1.5 ROI
.../userspace-api/media/drivers/uvcvideo.rst | 64 ++
.../userspace-api/media/v4l/vidioc-g-ext-ctrls.rst | 26 +-
.../userspace-api/media/v4l/vidioc-queryctrl.rst | 14 +
.../userspace-api/media/videodev2.h.rst.exceptions | 4 +
drivers/media/i2c/imx214.c | 4 +-
drivers/media/platform/qcom/venus/venc_ctrls.c | 9 +-
drivers/media/test-drivers/vivid/vivid-ctrls.c | 34 +
drivers/media/usb/uvc/uvc_ctrl.c | 805 ++++++++++++++++-----
drivers/media/usb/uvc/uvc_v4l2.c | 77 +-
drivers/media/usb/uvc/uvcvideo.h | 25 +-
drivers/media/v4l2-core/v4l2-ctrls-api.c | 54 +-
drivers/media/v4l2-core/v4l2-ctrls-core.c | 167 ++++-
drivers/media/v4l2-core/v4l2-ioctl.c | 4 +-
include/media/v4l2-ctrls.h | 38 +-
include/uapi/linux/usb/video.h | 1 +
include/uapi/linux/uvcvideo.h | 13 +
include/uapi/linux/v4l2-controls.h | 9 +
include/uapi/linux/videodev2.h | 5 +
18 files changed, 1062 insertions(+), 291 deletions(-)
---
base-commit: 5516200c466f92954551406ea641376963c43a92
change-id: 20241113-uvc-roi-66bd6cfa1e64
Best regards,
--
Ricardo Ribalda <ribalda(a)chromium.org>
This issue was found after attempting to make the same mistake for
a driver I maintain, which was fortunately spotted by Jonathan [1].
Keeping old sensor values if the channel configuration changes is known
and not considered an issue, which is also mentioned in [1], so it has
not been addressed by this series. That keeps most of the drivers out
of the way because they store the scan element in iio private data,
which is kzalloc() allocated.
This series only addresses cases where uninitialized i.e. unknown data
is pushed to the userspace, either due to holes in structs or
uninitialized struct members/array elements.
While analyzing involved functions, I found and fixed some triviality
(wrong function name) in the documentation of iio_dev_opaque.
Link: https://lore.kernel.org/linux-iio/20241123151634.303aa860@jic23-huawei/ [1]
Signed-off-by: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
---
Javier Carrasco (11):
iio: temperature: tmp006: fix information leak in triggered buffer
iio: adc: ti-ads1119: fix information leak in triggered buffer
iio: pressure: zpa2326: fix information leak in triggered buffer
iio: adc: rockchip_saradc: fix information leak in triggered buffer
iio: imu: kmx61: fix information leak in triggered buffer
iio: light: vcnl4035: fix information leak in triggered buffer
iio: light: bh1745: fix information leak in triggered buffer
iio: adc: ti-ads8688: fix information leak in triggered buffer
iio: dummy: iio_simply_dummy_buffer: fix information leak in triggered buffer
iio: light: as73211: fix information leak in triggered buffer
iio: core: fix doc reference to iio_push_to_buffers_with_ts_unaligned
drivers/iio/adc/rockchip_saradc.c | 2 ++
drivers/iio/adc/ti-ads1119.c | 2 ++
drivers/iio/adc/ti-ads8688.c | 2 +-
drivers/iio/dummy/iio_simple_dummy_buffer.c | 2 +-
drivers/iio/imu/kmx61.c | 2 +-
drivers/iio/light/as73211.c | 3 +++
drivers/iio/light/bh1745.c | 2 ++
drivers/iio/light/vcnl4035.c | 2 +-
drivers/iio/pressure/zpa2326.c | 2 ++
drivers/iio/temperature/tmp006.c | 2 ++
include/linux/iio/iio-opaque.h | 2 +-
11 files changed, 18 insertions(+), 5 deletions(-)
---
base-commit: ab376e4d674037f45d5758c1dc391bd4e11c5dc4
change-id: 20241123-iio_memset_scan_holes-a673833ef932
Best regards,
--
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
This issue was found after attempting to make the same mistake for
a driver I maintain, which was fortunately spotted by Jonathan [1].
Keeping old sensor values if the channel configuration changes is known
and not considered an issue, which is also mentioned in [1], so it has
not been addressed by this series. That keeps most of the drivers out
of the way because they store the scan element in iio private data,
which is kzalloc() allocated.
This series only addresses cases where uninitialized i.e. unknown data
is pushed to the userspace, either due to holes in structs or
uninitialized struct members/array elements.
While analyzing involved functions, I found and fixed some triviality
(wrong function name) in the documentation of iio_dev_opaque.
Link: https://lore.kernel.org/linux-iio/20241123151634.303aa860@jic23-huawei/ [1]
Signed-off-by: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
---
Changes in v2:
- as73211.c: shift channels if no temperature is available and
initialize chan[3] to zero.
- Link to v1: https://lore.kernel.org/r/20241125-iio_memset_scan_holes-v1-0-0cb6e98d895c@…
---
Javier Carrasco (2):
iio: temperature: tmp006: fix information leak in triggered buffer
iio: light: as73211: fix channel handling in only-color triggered buffer
drivers/iio/light/as73211.c | 17 +++++++++++++----
drivers/iio/temperature/tmp006.c | 2 ++
2 files changed, 15 insertions(+), 4 deletions(-)
---
base-commit: 1694dea95b02eff1a64c893ffee4626df533b2ab
change-id: 20241123-iio_memset_scan_holes-a673833ef932
Best regards,
--
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
From: Selvarasu Ganesan <selvarasu.g(a)samsung.com>
The current implementation sets the wMaxPacketSize of bulk in/out
endpoints to 1024 bytes at the end of the f_midi_bind function. However,
in cases where there is a failure in the first midi bind attempt,
consider rebinding. This scenario may encounter an f_midi_bind issue due
to the previous bind setting the bulk endpoint's wMaxPacketSize to 1024
bytes, which exceeds the ep->maxpacket_limit where configured TX/RX
FIFO's maxpacket size of 512 bytes for IN/OUT endpoints in support HS
speed only.
This commit addresses this issue by resetting the wMaxPacketSize before
endpoint claim
Fixes: 46decc82ffd5 ("usb: gadget: unconditionally allocate hs/ss descriptor in bind operation")
Cc: stable(a)vger.kernel.org
Signed-off-by: Selvarasu Ganesan <selvarasu.g(a)samsung.com>
---
drivers/usb/gadget/function/f_midi.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/usb/gadget/function/f_midi.c b/drivers/usb/gadget/function/f_midi.c
index 837fcdfa3840..5caa0e4eb07e 100644
--- a/drivers/usb/gadget/function/f_midi.c
+++ b/drivers/usb/gadget/function/f_midi.c
@@ -907,6 +907,15 @@ static int f_midi_bind(struct usb_configuration *c, struct usb_function *f)
status = -ENODEV;
+ /*
+ * Reset wMaxPacketSize with maximum packet size of FS bulk transfer before
+ * endpoint claim. This ensures that the wMaxPacketSize does not exceed the
+ * limit during bind retries where configured TX/RX FIFO's maxpacket size
+ * of 512 bytes for IN/OUT endpoints in support HS speed only.
+ */
+ bulk_in_desc.wMaxPacketSize = cpu_to_le16(64);
+ bulk_out_desc.wMaxPacketSize = cpu_to_le16(64);
+
/* allocate instance-specific endpoints */
midi->in_ep = usb_ep_autoconfig(cdev->gadget, &bulk_in_desc);
if (!midi->in_ep)
--
2.17.1
[BUG]
With CONFIG_DEBUG_VM set, test case generic/476 has some chance to crash
with the following VM_BUG_ON_FOLIO():
BTRFS error (device dm-3): cow_file_range failed, start 1146880 end 1253375 len 106496 ret -28
BTRFS error (device dm-3): run_delalloc_nocow failed, start 1146880 end 1253375 len 106496 ret -28
page: refcount:4 mapcount:0 mapping:00000000592787cc index:0x12 pfn:0x10664
aops:btrfs_aops [btrfs] ino:101 dentry name(?):"f1774"
flags: 0x2fffff80004028(uptodate|lru|private|node=0|zone=2|lastcpupid=0xfffff)
page dumped because: VM_BUG_ON_FOLIO(!folio_test_locked(folio))
------------[ cut here ]------------
kernel BUG at mm/page-writeback.c:2992!
Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
CPU: 2 UID: 0 PID: 3943513 Comm: kworker/u24:15 Tainted: G OE 6.12.0-rc7-custom+ #87
Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
Hardware name: QEMU KVM Virtual Machine, BIOS unknown 2/2/2022
Workqueue: events_unbound btrfs_async_reclaim_data_space [btrfs]
pc : folio_clear_dirty_for_io+0x128/0x258
lr : folio_clear_dirty_for_io+0x128/0x258
Call trace:
folio_clear_dirty_for_io+0x128/0x258
btrfs_folio_clamp_clear_dirty+0x80/0xd0 [btrfs]
__process_folios_contig+0x154/0x268 [btrfs]
extent_clear_unlock_delalloc+0x5c/0x80 [btrfs]
run_delalloc_nocow+0x5f8/0x760 [btrfs]
btrfs_run_delalloc_range+0xa8/0x220 [btrfs]
writepage_delalloc+0x230/0x4c8 [btrfs]
extent_writepage+0xb8/0x358 [btrfs]
extent_write_cache_pages+0x21c/0x4e8 [btrfs]
btrfs_writepages+0x94/0x150 [btrfs]
do_writepages+0x74/0x190
filemap_fdatawrite_wbc+0x88/0xc8
start_delalloc_inodes+0x178/0x3a8 [btrfs]
btrfs_start_delalloc_roots+0x174/0x280 [btrfs]
shrink_delalloc+0x114/0x280 [btrfs]
flush_space+0x250/0x2f8 [btrfs]
btrfs_async_reclaim_data_space+0x180/0x228 [btrfs]
process_one_work+0x164/0x408
worker_thread+0x25c/0x388
kthread+0x100/0x118
ret_from_fork+0x10/0x20
Code: 910a8021 a90363f7 a9046bf9 94012379 (d4210000)
---[ end trace 0000000000000000 ]---
[CAUSE]
The first two lines of extra debug messages show the problem is caused
by the error handling of run_delalloc_nocow().
E.g. we have the following dirtied range (4K blocksize 4K page size):
0 16K 32K
|//////////////////////////////////////|
| Pre-allocated |
And the range [0, 16K) has a preallocated extent.
- Enter run_delalloc_nocow() for range [0, 16K)
Which found range [0, 16K) is preallocated, can do the proper NOCOW
write.
- Enter fallback_to_fow() for range [16K, 32K)
Since the range [16K, 32K) is not backed by preallocated extent, we
have to go COW.
- cow_file_range() failed for range [16K, 32K)
So cow_file_range() will do the clean up by clearing folio dirty,
unlock the folios.
Now the folios in range [16K, 32K) is unlocked.
- Enter extent_clear_unlock_delalloc() from run_delalloc_nocow()
Which is called with PAGE_START_WRITEBACK to start page writeback.
But folios can only be marked writeback when it's properly locked,
thus this triggered the VM_BUG_ON_FOLIO().
Furthermore there is another hidden but common bug that
run_delalloc_nocow() is not clearing the folio dirty flags in its error
handling path.
This is the common bug shared between run_delalloc_nocow() and
cow_file_range().
[FIX]
- Clear folio dirty for range [@start, @cur_offset)
Introduce a helper, cleanup_dirty_folios(), which
will find and lock the folio in the range, clear the dirty flag and
start/end the writeback, with the extra handling for the
@locked_folio.
- Introduce a helper to record the last failed COW range end
This is to trace which range we should skip, to avoid double
unlocking.
- Skip the failed COW range for the error handling
Cc: stable(a)vger.kernel.org
Signed-off-by: Qu Wenruo <wqu(a)suse.com>
---
fs/btrfs/inode.c | 93 ++++++++++++++++++++++++++++++++++++++++++++----
1 file changed, 86 insertions(+), 7 deletions(-)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 57e9b7deee88..75b7956a7b4c 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1961,6 +1961,48 @@ static int can_nocow_file_extent(struct btrfs_path *path,
return ret < 0 ? ret : can_nocow;
}
+static void cleanup_dirty_folios(struct btrfs_inode *inode,
+ struct folio *locked_folio,
+ u64 start, u64 end, int error)
+{
+ struct btrfs_fs_info *fs_info = inode->root->fs_info;
+ struct address_space *mapping = inode->vfs_inode.i_mapping;
+ pgoff_t start_index = start >> PAGE_SHIFT;
+ pgoff_t end_index = end >> PAGE_SHIFT;
+ u32 len;
+
+ ASSERT(end + 1 - start < U32_MAX);
+ ASSERT(IS_ALIGNED(start, fs_info->sectorsize) &&
+ IS_ALIGNED(end + 1, fs_info->sectorsize));
+ len = end + 1 - start;
+
+ /*
+ * Handle the locked folio first.
+ * btrfs_folio_clamp_*() helpers can handle range out of the folio case.
+ */
+ btrfs_folio_clamp_clear_dirty(fs_info, locked_folio, start, len);
+ btrfs_folio_clamp_set_writeback(fs_info, locked_folio, start, len);
+ btrfs_folio_clamp_clear_writeback(fs_info, locked_folio, start, len);
+
+ for (pgoff_t index = start_index; index <= end_index; index++) {
+ struct folio *folio;
+
+ /* Already handled at the beginning. */
+ if (index == locked_folio->index)
+ continue;
+ folio = __filemap_get_folio(mapping, index, FGP_LOCK, GFP_NOFS);
+ /* Cache already dropped, no need to do any cleanup. */
+ if (IS_ERR(folio))
+ continue;
+ btrfs_folio_clamp_clear_dirty(fs_info, folio, start, len);
+ btrfs_folio_clamp_set_writeback(fs_info, folio, start, len);
+ btrfs_folio_clamp_clear_writeback(fs_info, folio, start, len);
+ folio_unlock(folio);
+ folio_put(folio);
+ }
+ mapping_set_error(mapping, error);
+}
+
/*
* when nowcow writeback call back. This checks for snapshots or COW copies
* of the extents that exist in the file, and COWs the file as required.
@@ -1976,6 +2018,11 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode,
struct btrfs_root *root = inode->root;
struct btrfs_path *path;
u64 cow_start = (u64)-1;
+ /*
+ * If not 0, represents the inclusive end of the last fallback_to_cow()
+ * range. Only for error handling.
+ */
+ u64 cow_end = 0;
u64 cur_offset = start;
int ret;
bool check_prev = true;
@@ -2136,6 +2183,7 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode,
found_key.offset - 1);
cow_start = (u64)-1;
if (ret) {
+ cow_end = found_key.offset - 1;
btrfs_dec_nocow_writers(nocow_bg);
goto error;
}
@@ -2209,11 +2257,12 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode,
cow_start = cur_offset;
if (cow_start != (u64)-1) {
- cur_offset = end;
ret = fallback_to_cow(inode, locked_folio, cow_start, end);
cow_start = (u64)-1;
- if (ret)
+ if (ret) {
+ cow_end = end;
goto error;
+ }
}
btrfs_free_path(path);
@@ -2221,12 +2270,42 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode,
error:
/*
- * If an error happened while a COW region is outstanding, cur_offset
- * needs to be reset to cow_start to ensure the COW region is unlocked
- * as well.
+ * There are several error cases:
+ *
+ * 1) Failed without falling back to COW
+ * start cur_start end
+ * |/////////////| |
+ *
+ * For range [start, cur_start) the folios are already unlocked (except
+ * @locked_folio), EXTENT_DELALLOC already removed.
+ * Only need to clear the dirty flag as they will never be submitted.
+ * Ordered extent and extent maps are handled by
+ * btrfs_mark_ordered_io_finished() inside run_delalloc_range().
+ *
+ * 2) Failed with error from fallback_to_cow()
+ * start cur_start cow_end end
+ * |/////////////|-----------| |
+ *
+ * For range [start, cur_start) it's the same as case 1).
+ * But for range [cur_start, cow_end), the folios have dirty flag
+ * cleared and unlocked, EXTENT_DEALLLOC cleared.
+ * There may or may not be any ordered extents/extent maps allocated.
+ *
+ * We should not call extent_clear_unlock_delalloc() on range [cur_start,
+ * cow_end), as the folios are already unlocked.
+ *
+ * So clear the folio dirty flags for [start, cur_offset) first.
*/
- if (cow_start != (u64)-1)
- cur_offset = cow_start;
+ if (cur_offset > start)
+ cleanup_dirty_folios(inode, locked_folio, start, cur_offset - 1, ret);
+
+ /*
+ * If an error happened while a COW region is outstanding, cur_offset
+ * needs to be reset to @cow_end + 1 to skip the COW range, as
+ * cow_file_range() will do the proper cleanup at error.
+ */
+ if (cow_end)
+ cur_offset = cow_end + 1;
/*
* We need to lock the extent here because we're clearing DELALLOC and
--
2.47.1
[BUG]
When testing with COW fixup marked as BUG_ON() (this is involved with the
new pin_user_pages*() change, which should not result new out-of-band
dirty pages), I hit a crash triggered by the BUG_ON() from hitting COW
fixup path.
This BUG_ON() happens just after a failed btrfs_run_delalloc_range():
BTRFS error (device dm-2): failed to run delalloc range, root 348 ino 405 folio 65536 submit_bitmap 6-15 start 90112 len 106496: -28
------------[ cut here ]------------
kernel BUG at fs/btrfs/extent_io.c:1444!
Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
CPU: 0 UID: 0 PID: 434621 Comm: kworker/u24:8 Tainted: G OE 6.12.0-rc7-custom+ #86
Hardware name: QEMU KVM Virtual Machine, BIOS unknown 2/2/2022
Workqueue: events_unbound btrfs_async_reclaim_data_space [btrfs]
pc : extent_writepage_io+0x2d4/0x308 [btrfs]
lr : extent_writepage_io+0x2d4/0x308 [btrfs]
Call trace:
extent_writepage_io+0x2d4/0x308 [btrfs]
extent_writepage+0x218/0x330 [btrfs]
extent_write_cache_pages+0x1d4/0x4b0 [btrfs]
btrfs_writepages+0x94/0x150 [btrfs]
do_writepages+0x74/0x190
filemap_fdatawrite_wbc+0x88/0xc8
start_delalloc_inodes+0x180/0x3b0 [btrfs]
btrfs_start_delalloc_roots+0x174/0x280 [btrfs]
shrink_delalloc+0x114/0x280 [btrfs]
flush_space+0x250/0x2f8 [btrfs]
btrfs_async_reclaim_data_space+0x180/0x228 [btrfs]
process_one_work+0x164/0x408
worker_thread+0x25c/0x388
kthread+0x100/0x118
ret_from_fork+0x10/0x20
Code: aa1403e1 9402f3ef aa1403e0 9402f36f (d4210000)
---[ end trace 0000000000000000 ]---
[CAUSE]
That failure is mostly from cow_file_range(), where we can hit -ENOSPC.
Although the -ENOSPC is already a bug related to our space reservation
code, let's just focus on the error handling.
For example, we have the following dirty range [0, 64K) of an inode,
with 4K sector size and 4K page size:
0 16K 32K 48K 64K
|///////////////////////////////////////|
|#######################################|
Where |///| means page are still dirty, and |###| means the extent io
tree has EXTENT_DELALLOC flag.
- Enter extent_writepage() for page 0
- Enter btrfs_run_delalloc_range() for range [0, 64K)
- Enter cow_file_range() for range [0, 64K)
- Function btrfs_reserve_extent() only reserved one 16K extent
So we created extent map and ordered extent for range [0, 16K)
0 16K 32K 48K 64K
|////////|//////////////////////////////|
|<- OE ->|##############################|
And range [0, 16K) has its delalloc flag cleared.
But since we haven't yet submit any bio, involved 4 pages are still
dirty.
- Function btrfs_reserve_extent() return with -ENOSPC
Now we have to run error cleanup, which will clear all
EXTENT_DELALLOC* flags and clear the dirty flags for the remaining
ranges:
0 16K 32K 48K 64K
|////////| |
| | |
Note that range [0, 16K) still has their pages dirty.
- Some time later, writeback are triggered again for the range [0, 16K)
since the page range still have dirty flags.
- btrfs_run_delalloc_range() will do nothing because there is no
EXTENT_DELALLOC flag.
- extent_writepage_io() find page 0 has no ordered flag
Which falls into the COW fixup path, triggering the BUG_ON().
Unfortunately this error handling bug dates back to the introduction of btrfs.
Thankfully with the abuse of cow fixup, at least it won't crash the
kernel.
[FIX]
Instead of immediately unlock the extent and folios, we keep the extent
and folios locked until either erroring out or the whole delalloc range
finished.
When the whole delalloc range finished without error, we just unlock the
whole range with PAGE_SET_ORDERED (and PAGE_UNLOCK for !keep_locked
cases), with EXTENT_DELALLOC and EXTENT_LOCKED cleared.
And those involved folios will be properly submitted, with their dirty
flags cleared during submission.
For the error path, it will be a little more complex:
- The range with ordered extent allocated (range (1))
We only clear the EXTENT_DELALLOC and EXTENT_LOCKED, as the remaining
flags are cleaned up by
btrfs_mark_ordered_io_finished()->btrfs_finish_one_ordered().
For folios we finish the IO (clear dirty, start writeback and
immediately finish the writeback) and unlock the folios.
- The range with reserved extent but no ordered extent (range(2))
- The range we never touched (range(3))
For both range (2) and range(3) the behavior is not changed.
Now even if cow_file_range() failed halfway with some successfully
reserved extents/ordered extents, we will keep all folios clean, so
there will be no future writeback triggered on them.
Cc: stable(a)vger.kernel.org
Signed-off-by: Qu Wenruo <wqu(a)suse.com>
---
fs/btrfs/inode.c | 65 ++++++++++++++++++++++++------------------------
1 file changed, 32 insertions(+), 33 deletions(-)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index aee172abad1c..57e9b7deee88 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1364,6 +1364,17 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
alloc_hint = btrfs_get_extent_allocation_hint(inode, start, num_bytes);
+ /*
+ * We're not doing compressed IO, don't unlock the first page
+ * (which the caller expects to stay locked), don't clear any
+ * dirty bits and don't set any writeback bits
+ *
+ * Do set the Ordered (Private2) bit so we know this page was
+ * properly setup for writepage.
+ */
+ page_ops = (keep_locked ? 0 : PAGE_UNLOCK);
+ page_ops |= PAGE_SET_ORDERED;
+
/*
* Relocation relies on the relocated extents to have exactly the same
* size as the original extents. Normally writeback for relocation data
@@ -1423,6 +1434,10 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
file_extent.offset = 0;
file_extent.compression = BTRFS_COMPRESS_NONE;
+ /*
+ * Locked range will be released either during error clean up or
+ * after the whole range is finished.
+ */
lock_extent(&inode->io_tree, start, start + cur_alloc_size - 1,
&cached);
@@ -1468,21 +1483,6 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
btrfs_dec_block_group_reservations(fs_info, ins.objectid);
- /*
- * We're not doing compressed IO, don't unlock the first page
- * (which the caller expects to stay locked), don't clear any
- * dirty bits and don't set any writeback bits
- *
- * Do set the Ordered flag so we know this page was
- * properly setup for writepage.
- */
- page_ops = (keep_locked ? 0 : PAGE_UNLOCK);
- page_ops |= PAGE_SET_ORDERED;
-
- extent_clear_unlock_delalloc(inode, start, start + cur_alloc_size - 1,
- locked_folio, &cached,
- EXTENT_LOCKED | EXTENT_DELALLOC,
- page_ops);
if (num_bytes < cur_alloc_size)
num_bytes = 0;
else
@@ -1499,6 +1499,9 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
if (ret)
goto out_unlock;
}
+ extent_clear_unlock_delalloc(inode, orig_start, end, locked_folio, &cached,
+ EXTENT_LOCKED | EXTENT_DELALLOC,
+ page_ops);
done:
if (done_offset)
*done_offset = end;
@@ -1519,35 +1522,31 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
* We process each region below.
*/
- clear_bits = EXTENT_LOCKED | EXTENT_DELALLOC | EXTENT_DELALLOC_NEW |
- EXTENT_DEFRAG | EXTENT_CLEAR_META_RESV;
- page_ops = PAGE_UNLOCK | PAGE_START_WRITEBACK | PAGE_END_WRITEBACK;
-
/*
* For the range (1). We have already instantiated the ordered extents
* for this region. They are cleaned up by
* btrfs_cleanup_ordered_extents() in e.g,
- * btrfs_run_delalloc_range(). EXTENT_LOCKED | EXTENT_DELALLOC are
- * already cleared in the above loop. And, EXTENT_DELALLOC_NEW |
- * EXTENT_DEFRAG | EXTENT_CLEAR_META_RESV are handled by the cleanup
- * function.
+ * btrfs_run_delalloc_range().
+ * EXTENT_DELALLOC_NEW | EXTENT_DEFRAG | EXTENT_CLEAR_META_RESV
+ * are also handled by the cleanup function.
*
- * However, in case of @keep_locked, we still need to unlock the pages
- * (except @locked_folio) to ensure all the pages are unlocked.
+ * So here we only clear EXTENT_LOCKED and EXTENT_DELALLOC flag,
+ * and finish the writeback of the involved folios, which will be
+ * never submitted.
*/
- if (keep_locked && orig_start < start) {
+ if (orig_start < start) {
+ clear_bits = EXTENT_LOCKED | EXTENT_DELALLOC;
+ page_ops = PAGE_UNLOCK | PAGE_START_WRITEBACK | PAGE_END_WRITEBACK;
+
if (!locked_folio)
mapping_set_error(inode->vfs_inode.i_mapping, ret);
extent_clear_unlock_delalloc(inode, orig_start, start - 1,
- locked_folio, NULL, 0, page_ops);
+ locked_folio, NULL, clear_bits, page_ops);
}
- /*
- * At this point we're unlocked, we want to make sure we're only
- * clearing these flags under the extent lock, so lock the rest of the
- * range and clear everything up.
- */
- lock_extent(&inode->io_tree, start, end, NULL);
+ clear_bits = EXTENT_LOCKED | EXTENT_DELALLOC | EXTENT_DELALLOC_NEW |
+ EXTENT_DEFRAG | EXTENT_CLEAR_META_RESV;
+ page_ops = PAGE_UNLOCK | PAGE_START_WRITEBACK | PAGE_END_WRITEBACK;
/*
* For the range (2). If we reserved an extent for our delalloc range
--
2.47.1