The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From cfb3c85a600c6aa25a2581b3c1c4db3460f14e46 Mon Sep 17 00:00:00 2001
From: Jeffle Xu <jefflexu(a)linux.alibaba.com>
Date: Fri, 22 May 2020 12:18:44 +0800
Subject: [PATCH] ext4: fix partial cluster initialization when splitting
extent
Fix the bug when calculating the physical block number of the first
block in the split extent.
This bug will cause xfstests shared/298 failure on ext4 with bigalloc
enabled occasionally. Ext4 error messages indicate that previously freed
blocks are being freed again, and the following fsck will fail due to
the inconsistency of block bitmap and bg descriptor.
The following is an example case:
1. First, Initialize a ext4 filesystem with cluster size '16K', block size
'4K', in which case, one cluster contains four blocks.
2. Create one file (e.g., xxx.img) on this ext4 filesystem. Now the extent
tree of this file is like:
...
36864:[0]4:220160
36868:[0]14332:145408
51200:[0]2:231424
...
3. Then execute PUNCH_HOLE fallocate on this file. The hole range is
like:
..
ext4_ext_remove_space: dev 254,16 ino 12 since 49506 end 49506 depth 1
ext4_ext_remove_space: dev 254,16 ino 12 since 49544 end 49546 depth 1
ext4_ext_remove_space: dev 254,16 ino 12 since 49605 end 49607 depth 1
...
4. Then the extent tree of this file after punching is like
...
49507:[0]37:158047
49547:[0]58:158087
...
5. Detailed procedure of punching hole [49544, 49546]
5.1. The block address space:
```
lblk ~49505 49506 49507~49543 49544~49546 49547~
---------+------+-------------+----------------+--------
extent | hole | extent | hole | extent
---------+------+-------------+----------------+--------
pblk ~158045 158046 158047~158083 158084~158086 158087~
```
5.2. The detailed layout of cluster 39521:
```
cluster 39521
<------------------------------->
hole extent
<----------------------><--------
lblk 49544 49545 49546 49547
+-------+-------+-------+-------+
| | | | |
+-------+-------+-------+-------+
pblk 158084 1580845 158086 158087
```
5.3. The ftrace output when punching hole [49544, 49546]:
- ext4_ext_remove_space (start 49544, end 49546)
- ext4_ext_rm_leaf (start 49544, end 49546, last_extent [49507(158047), 40], partial [pclu 39522 lblk 0 state 2])
- ext4_remove_blocks (extent [49507(158047), 40], from 49544 to 49546, partial [pclu 39522 lblk 0 state 2]
- ext4_free_blocks: (block 158084 count 4)
- ext4_mballoc_free (extent 1/6753/1)
5.4. Ext4 error message in dmesg:
EXT4-fs error (device vdb): mb_free_blocks:1457: group 1, block 158084:freeing already freed block (bit 6753); block bitmap corrupt.
EXT4-fs error (device vdb): ext4_mb_generate_buddy:747: group 1, block bitmap and bg descriptor inconsistent: 19550 vs 19551 free clusters
In this case, the whole cluster 39521 is freed mistakenly when freeing
pblock 158084~158086 (i.e., the first three blocks of this cluster),
although pblock 158087 (the last remaining block of this cluster) has
not been freed yet.
The root cause of this isuue is that, the pclu of the partial cluster is
calculated mistakenly in ext4_ext_remove_space(). The correct
partial_cluster.pclu (i.e., the cluster number of the first block in the
next extent, that is, lblock 49597 (pblock 158086)) should be 39521 rather
than 39522.
Fixes: f4226d9ea400 ("ext4: fix partial cluster initialization")
Signed-off-by: Jeffle Xu <jefflexu(a)linux.alibaba.com>
Reviewed-by: Eric Whitney <enwlinux(a)gmail.com>
Cc: stable(a)kernel.org # v3.19+
Link: https://lore.kernel.org/r/1590121124-37096-1-git-send-email-jefflexu@linux.…
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 7d088ff1e902..221f240eae60 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -2844,7 +2844,7 @@ int ext4_ext_remove_space(struct inode *inode, ext4_lblk_t start,
* in use to avoid freeing it when removing blocks.
*/
if (sbi->s_cluster_ratio > 1) {
- pblk = ext4_ext_pblock(ex) + end - ee_block + 2;
+ pblk = ext4_ext_pblock(ex) + end - ee_block + 1;
partial.pclu = EXT4_B2C(sbi, pblk);
partial.state = nofree;
}
The patch below does not apply to the 5.7-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 56952e91acc93ed624fe9da840900defb75f1323 Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe(a)kernel.dk>
Date: Wed, 17 Jun 2020 15:00:04 -0600
Subject: [PATCH] io_uring: reap poll completions while waiting for refs to
drop on exit
If we're doing polled IO and end up having requests being submitted
async, then completions can come in while we're waiting for refs to
drop. We need to reap these manually, as nobody else will be looking
for them.
Break the wait into 1/20th of a second time waits, and check for done
poll completions if we time out. Otherwise we can have done poll
completions sitting in ctx->poll_list, which needs us to reap them but
we're just waiting for them.
Cc: stable(a)vger.kernel.org
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 98c83fbf4f88..2038d52c5450 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -7363,7 +7363,17 @@ static void io_ring_exit_work(struct work_struct *work)
if (ctx->rings)
io_cqring_overflow_flush(ctx, true);
- wait_for_completion(&ctx->ref_comp);
+ /*
+ * If we're doing polled IO and end up having requests being
+ * submitted async (out-of-line), then completions can come in while
+ * we're waiting for refs to drop. We need to reap these manually,
+ * as nobody else will be looking for them.
+ */
+ while (!wait_for_completion_timeout(&ctx->ref_comp, HZ/20)) {
+ io_iopoll_reap_events(ctx);
+ if (ctx->rings)
+ io_cqring_overflow_flush(ctx, true);
+ }
io_ring_ctx_free(ctx);
}
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: e99ef5c64d5f - perf symbols: Fix kernel maps for kcore and eBPF
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://cki-artifacts.s3.us-east-2.amazonaws.com/index.html?prefix=dataware…
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 4 architectures:
aarch64:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
s390x:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ⚡⚡⚡ Storage blktests
Host 2:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
ppc64le:
Host 1:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 💥 jvm - DaCapo Benchmark Suite
🚧 ⚡⚡⚡ jvm - jcstress tests
🚧 ⚡⚡⚡ Memory function: kaslr
🚧 ⚡⚡⚡ Networking firewall: basic netfilter test
🚧 ⚡⚡⚡ audit: audit testsuite test
🚧 ⚡⚡⚡ trace: ftrace/tracer
Host 2:
✅ Boot test
🚧 ✅ kdump - sysrq-c
Host 3:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ⚡⚡⚡ Storage blktests
s390x:
Host 1:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ❌ Networking firewall: basic netfilter test
🚧 ❌ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 2:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ stress: stress-ng
🚧 ⚡⚡⚡ Storage blktests
Host 3:
✅ Boot test
🚧 ✅ kdump - sysrq-c
x86_64:
Host 1:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: sanity smoke test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 2:
✅ Boot test
🚧 ✅ kdump - sysrq-c
Host 3:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ CPU: Frequency Driver Test
🚧 ✅ CPU: Idle Test
🚧 ✅ IOMMU boot test
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ⚡⚡⚡ Storage blktests
Test sources: https://github.com/CKI-project/tests-beaker
💚 Pull requests are welcome for new tests or improvements to existing tests!
Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
The reverted commit illegitly uses tpm2-tools. External dependencies are
absolutely forbidden from these tests. There is also the problem that
clearing is not necessarily wanted behavior if the test/target computer is
not used only solely for testing.
Fixes: a9920d3bad40 ("tpm: selftest: cleanup after unseal with wrong auth/policy test")
Cc: Tadeusz Struk <tadeusz.struk(a)intel.com>
Cc: stable(a)vger.kernel.org
Cc: linux-integrity(a)vger.kernel.org
Cc: linux-kselftest(a)vger.kernel.org
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen(a)linux.intel.com>
---
tools/testing/selftests/tpm2/test_smoke.sh | 5 -----
1 file changed, 5 deletions(-)
diff --git a/tools/testing/selftests/tpm2/test_smoke.sh b/tools/testing/selftests/tpm2/test_smoke.sh
index 663062701d5a..79f8e9da5d21 100755
--- a/tools/testing/selftests/tpm2/test_smoke.sh
+++ b/tools/testing/selftests/tpm2/test_smoke.sh
@@ -8,8 +8,3 @@ ksft_skip=4
python -m unittest -v tpm2_tests.SmokeTest
python -m unittest -v tpm2_tests.AsyncTest
-
-CLEAR_CMD=$(which tpm2_clear)
-if [ -n $CLEAR_CMD ]; then
- tpm2_clear -T device
-fi
--
2.25.1
From: Frieder Schrempf <frieder.schrempf(a)kontron.de>
For reading and writing the bad block markers, spinand->oobbuf is
currently used as a buffer for the marker bytes. During the
underlying read and write operations to actually get/set the content
of the OOB area, the content of spinand->oobbuf is reused and changed
by accessing it through spinand->oobbuf and/or spinand->databuf.
This is a flaw in the original design of the SPI NAND core and at the
latest from 13c15e07eedf ("mtd: spinand: Handle the case where
PROGRAM LOAD does not reset the cache") on, it results in not having
the bad block marker written at all, as the spinand->oobbuf is
cleared to 0xff after setting the marker bytes to zero.
To fix it, we now just store the two bytes for the marker on the
stack and let the read/write operations copy it from/to the page
buffer later.
Fixes: 7529df465248 ("mtd: nand: Add core infrastructure to support SPI NANDs")
Cc: stable(a)vger.kernel.org
Signed-off-by: Frieder Schrempf <frieder.schrempf(a)kontron.de>
Reviewed-by: Boris Brezillon <boris.brezillon(a)collabora.com>
Signed-off-by: Miquel Raynal <miquel.raynal(a)bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20200218100432.32433-2-frieder.schrempf@k…
---
drivers/mtd/nand/spi/core.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/mtd/nand/spi/core.c b/drivers/mtd/nand/spi/core.c
index 93371fdde0..410ea2382d 100644
--- a/drivers/mtd/nand/spi/core.c
+++ b/drivers/mtd/nand/spi/core.c
@@ -655,16 +655,16 @@ static int spinand_mtd_write(struct mtd_info *mtd, loff_t to,
static bool spinand_isbad(struct nand_device *nand, const struct nand_pos *pos)
{
struct spinand_device *spinand = nand_to_spinand(nand);
+ u8 marker[2] = { };
struct nand_page_io_req req = {
.pos = *pos,
- .ooblen = 2,
+ .ooblen = sizeof(marker),
.ooboffs = 0,
- .oobbuf.in = spinand->oobbuf,
+ .oobbuf.in = marker,
.mode = MTD_OPS_RAW,
};
int ret;
- memset(spinand->oobbuf, 0, 2);
ret = spinand_select_target(spinand, pos->target);
if (ret)
return ret;
@@ -673,7 +673,7 @@ static bool spinand_isbad(struct nand_device *nand, const struct nand_pos *pos)
if (ret)
return ret;
- if (spinand->oobbuf[0] != 0xff || spinand->oobbuf[1] != 0xff)
+ if (marker[0] != 0xff || marker[1] != 0xff)
return true;
return false;
@@ -702,11 +702,12 @@ static int spinand_mtd_block_isbad(struct mtd_info *mtd, loff_t offs)
static int spinand_markbad(struct nand_device *nand, const struct nand_pos *pos)
{
struct spinand_device *spinand = nand_to_spinand(nand);
+ u8 marker[2] = { };
struct nand_page_io_req req = {
.pos = *pos,
.ooboffs = 0,
- .ooblen = 2,
- .oobbuf.out = spinand->oobbuf,
+ .ooblen = sizeof(marker),
+ .oobbuf.out = marker,
};
int ret;
@@ -723,7 +724,6 @@ static int spinand_markbad(struct nand_device *nand, const struct nand_pos *pos)
if (ret)
return ret;
- memset(spinand->oobbuf, 0, 2);
return spinand_write_page(spinand, &req);
}
--
2.27.0
This reverts commit 8cfaaa811894a3ae2d7360a15a6cfccff3ebc7db.
If device was unbound and bound, the polling interval would be set to 0.
This is both unexpected and messes up with other bq27xxx devices (if
more than one battery device is used).
This reset of polling interval was added in commit 8cfaaa811894
("bq27x00_battery: Fix OOPS caused by unregistring bq27x00 driver")
stating that power_supply_unregister() calls get_property(). However in
Linux kernel v3.1 and newer, such call trace does not exist.
Unregistering power supply does not call get_property() on unregistered
power supply.
Fixes: 8cfaaa811894 ("bq27x00_battery: Fix OOPS caused by unregistring bq27x00 driver")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Krzysztof Kozlowski <krzk(a)kernel.org>
---
I really could not identify the issue being fixed in offending commit
8cfaaa811894 ("bq27x00_battery: Fix OOPS caused by unregistring bq27x00
driver"), therefore maybe I missed here something important.
Please share your thoughts on this.
---
drivers/power/supply/bq27xxx_battery.c | 8 --------
1 file changed, 8 deletions(-)
diff --git a/drivers/power/supply/bq27xxx_battery.c b/drivers/power/supply/bq27xxx_battery.c
index 942c92127b6d..4c94ee72de95 100644
--- a/drivers/power/supply/bq27xxx_battery.c
+++ b/drivers/power/supply/bq27xxx_battery.c
@@ -1905,14 +1905,6 @@ EXPORT_SYMBOL_GPL(bq27xxx_battery_setup);
void bq27xxx_battery_teardown(struct bq27xxx_device_info *di)
{
- /*
- * power_supply_unregister call bq27xxx_battery_get_property which
- * call bq27xxx_battery_poll.
- * Make sure that bq27xxx_battery_poll will not call
- * schedule_delayed_work again after unregister (which cause OOPS).
- */
- poll_interval = 0;
-
cancel_delayed_work_sync(&di->work);
power_supply_unregister(di->bat);
--
2.17.1
From: Vladimir Oltean <vladimir.oltean(a)nxp.com>
This reverts commit b145710b69388aa4034d32b4a937f18f66b5538e.
The patch is not wrong, but the Fixes: tag is. It should have been:
Fixes: 060ad66f9795 ("dpaa_eth: change DMA device")
which means that it's fixing a commit which was introduced in:
git describe --tags 060ad66f97954
v5.4-rc3-783-g060ad66f9795
which then means it should have not been backported to linux-4.19.y,
where things _were_ working and now they're not.
Reported-by: Joakim Tjernlund <joakim.tjernlund(a)infinera.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean(a)nxp.com>
---
drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
index 6683409fbd4a..4b21ae27a9fd 100644
--- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
+++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
@@ -2796,7 +2796,7 @@ static int dpaa_eth_probe(struct platform_device *pdev)
}
/* Do this here, so we can be verbose early */
- SET_NETDEV_DEV(net_dev, dev->parent);
+ SET_NETDEV_DEV(net_dev, dev);
dev_set_drvdata(dev, net_dev);
priv = netdev_priv(net_dev);
--
2.25.1
If shared interrupt comes late, during probe error path or device remove
(could be triggered with CONFIG_DEBUG_SHIRQ), the interrupt handler
dspi_interrupt() will access registers with the clock being disabled.
This leads to external abort on non-linefetch on Toradex Colibri VF50
module (with Vybrid VF5xx):
$ echo 4002d000.spi > /sys/devices/platform/soc/40000000.bus/4002d000.spi/driver/unbind
Unhandled fault: external abort on non-linefetch (0x1008) at 0x8887f02c
Internal error: : 1008 [#1] ARM
Hardware name: Freescale Vybrid VF5xx/VF6xx (Device Tree)
Backtrace:
(regmap_mmio_read32le)
(regmap_mmio_read)
(_regmap_bus_reg_read)
(_regmap_read)
(regmap_read)
(dspi_interrupt)
(free_irq)
(devm_irq_release)
(release_nodes)
(devres_release_all)
(device_release_driver_internal)
The resource-managed framework should not be used for shared interrupt
handling, because the interrupt handler might be called after releasing
other resources and disabling clocks.
Similar bug could happen during suspend - the shared interrupt handler
could be invoked after suspending the device. Each device sharing this
interrupt line should disable the IRQ during suspend so handler will be
invoked only in following cases:
1. None suspended,
2. All devices resumed.
Fixes: 349ad66c0ab0 ("spi:Add Freescale DSPI driver for Vybrid VF610 platform")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Krzysztof Kozlowski <krzk(a)kernel.org>
---
Changes since v2:
1. Go back to v1 and use non-devm interface,
2. Fix also suspend/resume paths.
Changes since v1:
1. Disable the IRQ instead of using non-devm interface.
---
drivers/spi/spi-fsl-dspi.c | 17 +++++++++++++----
1 file changed, 13 insertions(+), 4 deletions(-)
diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c
index 58190c94561f..7ecc90ec8f2f 100644
--- a/drivers/spi/spi-fsl-dspi.c
+++ b/drivers/spi/spi-fsl-dspi.c
@@ -1109,6 +1109,8 @@ static int dspi_suspend(struct device *dev)
struct spi_controller *ctlr = dev_get_drvdata(dev);
struct fsl_dspi *dspi = spi_controller_get_devdata(ctlr);
+ if (dspi->irq > 0)
+ disable_irq(dspi->irq);
spi_controller_suspend(ctlr);
clk_disable_unprepare(dspi->clk);
@@ -1129,6 +1131,8 @@ static int dspi_resume(struct device *dev)
if (ret)
return ret;
spi_controller_resume(ctlr);
+ if (dspi->irq > 0)
+ enable_irq(dspi->irq);
return 0;
}
@@ -1385,8 +1389,8 @@ static int dspi_probe(struct platform_device *pdev)
goto poll_mode;
}
- ret = devm_request_irq(&pdev->dev, dspi->irq, dspi_interrupt,
- IRQF_SHARED, pdev->name, dspi);
+ ret = request_threaded_irq(dspi->irq, dspi_interrupt, NULL,
+ IRQF_SHARED, pdev->name, dspi);
if (ret < 0) {
dev_err(&pdev->dev, "Unable to attach DSPI interrupt\n");
goto out_clk_put;
@@ -1400,7 +1404,7 @@ static int dspi_probe(struct platform_device *pdev)
ret = dspi_request_dma(dspi, res->start);
if (ret < 0) {
dev_err(&pdev->dev, "can't get dma channels\n");
- goto out_clk_put;
+ goto out_free_irq;
}
}
@@ -1415,11 +1419,14 @@ static int dspi_probe(struct platform_device *pdev)
ret = spi_register_controller(ctlr);
if (ret != 0) {
dev_err(&pdev->dev, "Problem registering DSPI ctlr\n");
- goto out_clk_put;
+ goto out_free_irq;
}
return ret;
+out_free_irq:
+ if (dspi->irq > 0)
+ free_irq(dspi->irq, dspi);
out_clk_put:
clk_disable_unprepare(dspi->clk);
out_ctlr_put:
@@ -1435,6 +1442,8 @@ static int dspi_remove(struct platform_device *pdev)
/* Disconnect from the SPI framework */
dspi_release_dma(dspi);
+ if (dspi->irq > 0)
+ free_irq(dspi->irq, dspi);
clk_disable_unprepare(dspi->clk);
spi_unregister_controller(dspi->ctlr);
--
2.7.4