fxls8962af_fifo_flush() uses indio_dev->active_scan_mask (with iio_for_each_active_channel()) without making sure the indio_dev stays in buffer mode. There is a race if indio_dev exits buffer mode in the middle of the interrupt that flushes the fifo. Fix this by calling iio_device_claim_buffer_mode() to ensure indio_dev can't exit buffer mode during the flush.
Unable to handle kernel NULL pointer dereference at virtual address 00000000 when read [...] _find_first_bit_le from fxls8962af_fifo_flush+0x17c/0x290 fxls8962af_fifo_flush from fxls8962af_interrupt+0x80/0x178 fxls8962af_interrupt from irq_thread_fn+0x1c/0x7c irq_thread_fn from irq_thread+0x110/0x1f4 irq_thread from kthread+0xe0/0xfc kthread from ret_from_fork+0x14/0x2c
Fixes: 79e3a5bdd9ef ("iio: accel: fxls8962af: add hw buffered sampling") Cc: stable@vger.kernel.org Signed-off-by: Sean Nyekjaer sean@geanix.com --- drivers/iio/accel/fxls8962af-core.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/iio/accel/fxls8962af-core.c b/drivers/iio/accel/fxls8962af-core.c index 6d23da3e7aa22c61f2d9348bb91d70cc5719a732..7db83ebeea823173d79bf8ff484add16f575edfc 100644 --- a/drivers/iio/accel/fxls8962af-core.c +++ b/drivers/iio/accel/fxls8962af-core.c @@ -973,6 +973,9 @@ static int fxls8962af_fifo_flush(struct iio_dev *indio_dev) if (ret) return ret;
+ if (iio_device_claim_buffer_mode(indio_dev) < 0) + return 0; + /* Demux hw FIFO into kfifo. */ for (i = 0; i < count; i++) { int j, bit; @@ -989,6 +992,8 @@ static int fxls8962af_fifo_flush(struct iio_dev *indio_dev) tstamp += sample_period; }
+ iio_device_release_buffer_mode(indio_dev); + return count; }
--- base-commit: 5c3fcb36c92443a9a037683626a2e43d8825f783 change-id: 20250524-fxlsrace-f4d20e29fb29
Best regards,
On 5/24/25 5:34 AM, Sean Nyekjaer wrote:
fxls8962af_fifo_flush() uses indio_dev->active_scan_mask (with iio_for_each_active_channel()) without making sure the indio_dev stays in buffer mode. There is a race if indio_dev exits buffer mode in the middle of the interrupt that flushes the fifo. Fix this by calling iio_device_claim_buffer_mode() to ensure indio_dev can't exit buffer mode during the flush.
Unable to handle kernel NULL pointer dereference at virtual address 00000000 when read [...] _find_first_bit_le from fxls8962af_fifo_flush+0x17c/0x290 fxls8962af_fifo_flush from fxls8962af_interrupt+0x80/0x178 fxls8962af_interrupt from irq_thread_fn+0x1c/0x7c irq_thread_fn from irq_thread+0x110/0x1f4 irq_thread from kthread+0xe0/0xfc kthread from ret_from_fork+0x14/0x2c
Fixes: 79e3a5bdd9ef ("iio: accel: fxls8962af: add hw buffered sampling") Cc: stable@vger.kernel.org Signed-off-by: Sean Nyekjaer sean@geanix.com
drivers/iio/accel/fxls8962af-core.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/iio/accel/fxls8962af-core.c b/drivers/iio/accel/fxls8962af-core.c index 6d23da3e7aa22c61f2d9348bb91d70cc5719a732..7db83ebeea823173d79bf8ff484add16f575edfc 100644 --- a/drivers/iio/accel/fxls8962af-core.c +++ b/drivers/iio/accel/fxls8962af-core.c @@ -973,6 +973,9 @@ static int fxls8962af_fifo_flush(struct iio_dev *indio_dev) if (ret) return ret;
- if (iio_device_claim_buffer_mode(indio_dev) < 0)
return 0;
I see one other driver with a check like this, so I suppose there is some precedent, but I wonder if this could be fixed instead with:
--- diff --git a/drivers/iio/accel/fxls8962af-core.c b/drivers/iio/accel/fxls8962af-core.c index 9aa02f599ae9..2034f4407dd7 100644 --- a/drivers/iio/accel/fxls8962af-core.c +++ b/drivers/iio/accel/fxls8962af-core.c @@ -866,6 +866,8 @@ static int fxls8962af_buffer_predisable(struct iio_dev *indio_dev) if (ret) return ret;
+ synchronize_irq(data->irq); + ret = __fxls8962af_fifo_set_mode(data, false);
if (data->enable_event) ---
This should ensure that if the interrupt handler is running while trying to disable the buffer, that the interrupt handler completes before returning from fxls8962af_buffer_predisable(). And no more interrupts should occur because the interrupt was disabled before waiting for completion.
/* Demux hw FIFO into kfifo. */ for (i = 0; i < count; i++) { int j, bit; @@ -989,6 +992,8 @@ static int fxls8962af_fifo_flush(struct iio_dev *indio_dev) tstamp += sample_period; }
- iio_device_release_buffer_mode(indio_dev);
- return count;
}
base-commit: 5c3fcb36c92443a9a037683626a2e43d8825f783 change-id: 20250524-fxlsrace-f4d20e29fb29
Best regards,
In addition to the race you have identified, I wonder if there is also a race in fxls8962af_suspend(). This one wouldn't cause a crash, but would cause an immediate wake.
fxls8962af_suspend() calls enable_irq_wake(data->irq); before disabling the interrupt by calling fxls8962af_buffer_predisable(indio_dev);
It seems like the order should be reversed.
On Thu, May 29, 2025 at 7:02 PM David Lechner dlechner@baylibre.com wrote:
On 5/24/25 5:34 AM, Sean Nyekjaer wrote:
...
fxls8962af_suspend() calls enable_irq_wake(data->irq); before disabling the interrupt by calling fxls8962af_buffer_predisable(indio_dev);
It seems like the order should be reversed.
AFAIU the wake capability of IRQ line is orthogonal to the interrupt controller enabling (unmasking) / disabling (masking) the line itself. Or did you mean something else?
On 5/29/25 1:16 PM, Andy Shevchenko wrote:
On Thu, May 29, 2025 at 7:02 PM David Lechner dlechner@baylibre.com wrote:
On 5/24/25 5:34 AM, Sean Nyekjaer wrote:
...
fxls8962af_suspend() calls enable_irq_wake(data->irq); before disabling the interrupt by calling fxls8962af_buffer_predisable(indio_dev);
It seems like the order should be reversed.
AFAIU the wake capability of IRQ line is orthogonal to the interrupt controller enabling (unmasking) / disabling (masking) the line itself. Or did you mean something else?
I don't know enough about how suspend/wake stuff works to say for sure.
I just saw the comment:
/* * Disable buffer, as the buffer is so small the device will wake * almost immediately. */
so I assumed someone had observed something like this happening already. If an interrupt occurs between enable_irq_wake() and actually going into a low power mode, what effect does it have? I ask because I don't know.
On Thu, May 29, 2025 at 01:49:16PM -0500, David Lechner wrote:
On 5/29/25 1:16 PM, Andy Shevchenko wrote:
On Thu, May 29, 2025 at 7:02 PM David Lechner dlechner@baylibre.com wrote:
On 5/24/25 5:34 AM, Sean Nyekjaer wrote:
...
fxls8962af_suspend() calls enable_irq_wake(data->irq); before disabling the interrupt by calling fxls8962af_buffer_predisable(indio_dev);
It seems like the order should be reversed.
AFAIU the wake capability of IRQ line is orthogonal to the interrupt controller enabling (unmasking) / disabling (masking) the line itself. Or did you mean something else?
I don't know enough about how suspend/wake stuff works to say for sure.
I just saw the comment:
/* * Disable buffer, as the buffer is so small the device will wake * almost immediately. */
so I assumed someone had observed something like this happening already. If an interrupt occurs between enable_irq_wake() and actually going into a low power mode, what effect does it have? I ask because I don't know.
To be a "wake source" means to be capable of signaling to the system that wake is needed. If an event comes after enabling an IRQ line to be a wake source, that should wakeup the system (independently if that IRQ line is disabled or not on the IRQ controller side).
On 5/30/25 12:51 PM, Andy Shevchenko wrote:
On Thu, May 29, 2025 at 01:49:16PM -0500, David Lechner wrote:
On 5/29/25 1:16 PM, Andy Shevchenko wrote:
On Thu, May 29, 2025 at 7:02 PM David Lechner dlechner@baylibre.com wrote:
On 5/24/25 5:34 AM, Sean Nyekjaer wrote:
...
fxls8962af_suspend() calls enable_irq_wake(data->irq); before disabling the interrupt by calling fxls8962af_buffer_predisable(indio_dev);
It seems like the order should be reversed.
AFAIU the wake capability of IRQ line is orthogonal to the interrupt controller enabling (unmasking) / disabling (masking) the line itself. Or did you mean something else?
I don't know enough about how suspend/wake stuff works to say for sure.
I just saw the comment:
/* * Disable buffer, as the buffer is so small the device will wake * almost immediately. */
so I assumed someone had observed something like this happening already. If an interrupt occurs between enable_irq_wake() and actually going into a low power mode, what effect does it have? I ask because I don't know.
To be a "wake source" means to be capable of signaling to the system that wake is needed. If an event comes after enabling an IRQ line to be a wake source, that should wakeup the system (independently if that IRQ line is disabled or not on the IRQ controller side).
OK, more clear now. So I should have been more specific with my previous comment. When I said, "before disabling the interrupt", I didn't mean calling disable_irq(). I meant disabling the actual output pin on the accelerometer chip.
On Fri, May 30, 2025 at 8:57 PM David Lechner dlechner@baylibre.com wrote:
On 5/30/25 12:51 PM, Andy Shevchenko wrote:
On Thu, May 29, 2025 at 01:49:16PM -0500, David Lechner wrote:
On 5/29/25 1:16 PM, Andy Shevchenko wrote:
On Thu, May 29, 2025 at 7:02 PM David Lechner dlechner@baylibre.com wrote:
On 5/24/25 5:34 AM, Sean Nyekjaer wrote:
...
fxls8962af_suspend() calls enable_irq_wake(data->irq); before disabling the interrupt by calling fxls8962af_buffer_predisable(indio_dev);
It seems like the order should be reversed.
AFAIU the wake capability of IRQ line is orthogonal to the interrupt controller enabling (unmasking) / disabling (masking) the line itself. Or did you mean something else?
I don't know enough about how suspend/wake stuff works to say for sure.
I just saw the comment:
/* * Disable buffer, as the buffer is so small the device will wake * almost immediately. */
so I assumed someone had observed something like this happening already. If an interrupt occurs between enable_irq_wake() and actually going into a low power mode, what effect does it have? I ask because I don't know.
To be a "wake source" means to be capable of signaling to the system that wake is needed. If an event comes after enabling an IRQ line to be a wake source, that should wakeup the system (independently if that IRQ line is disabled or not on the IRQ controller side).
OK, more clear now.
FWIW, https://elixir.bootlin.com/linux/v6.15/source/kernel/irq/manage.c#L887
So I should have been more specific with my previous comment. When I said, "before disabling the interrupt", I didn't mean calling disable_irq(). I meant disabling the actual output pin on the accelerometer chip.
Ah, in that case it's of course a different case.
On Sat, 24 May 2025 12:34:09 +0200 Sean Nyekjaer sean@geanix.com wrote:
fxls8962af_fifo_flush() uses indio_dev->active_scan_mask (with iio_for_each_active_channel()) without making sure the indio_dev stays in buffer mode. There is a race if indio_dev exits buffer mode in the middle of the interrupt that flushes the fifo. Fix this by calling iio_device_claim_buffer_mode() to ensure indio_dev can't exit buffer mode during the flush.
Unable to handle kernel NULL pointer dereference at virtual address 00000000 when read [...] _find_first_bit_le from fxls8962af_fifo_flush+0x17c/0x290 fxls8962af_fifo_flush from fxls8962af_interrupt+0x80/0x178 fxls8962af_interrupt from irq_thread_fn+0x1c/0x7c irq_thread_fn from irq_thread+0x110/0x1f4 irq_thread from kthread+0xe0/0xfc kthread from ret_from_fork+0x14/0x2c
Fixes: 79e3a5bdd9ef ("iio: accel: fxls8962af: add hw buffered sampling") Cc: stable@vger.kernel.org Signed-off-by: Sean Nyekjaer sean@geanix.com
That's nasty and a case I'd never thought about. Most of the races around disabling end up with an extra sample or two which then gets dropped because there are no buffers enabled.
We need to consider the active scan mask as part of the buffer state. So effectively taking mlock if we enter this code will delay the state transition (and change of active_scan_mask until after this interrupt is done).
If David's synchronize_irq() is enough maybe that's a lighter weight path?
Jonathan
drivers/iio/accel/fxls8962af-core.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/iio/accel/fxls8962af-core.c b/drivers/iio/accel/fxls8962af-core.c index 6d23da3e7aa22c61f2d9348bb91d70cc5719a732..7db83ebeea823173d79bf8ff484add16f575edfc 100644 --- a/drivers/iio/accel/fxls8962af-core.c +++ b/drivers/iio/accel/fxls8962af-core.c @@ -973,6 +973,9 @@ static int fxls8962af_fifo_flush(struct iio_dev *indio_dev) if (ret) return ret;
- if (iio_device_claim_buffer_mode(indio_dev) < 0)
return 0;
- /* Demux hw FIFO into kfifo. */ for (i = 0; i < count; i++) { int j, bit;
@@ -989,6 +992,8 @@ static int fxls8962af_fifo_flush(struct iio_dev *indio_dev) tstamp += sample_period; }
- iio_device_release_buffer_mode(indio_dev);
- return count;
}
base-commit: 5c3fcb36c92443a9a037683626a2e43d8825f783 change-id: 20250524-fxlsrace-f4d20e29fb29
Best regards,
linux-stable-mirror@lists.linaro.org