When testing large folio support with XFS on our servers, we observed that
only a few large folios are mapped when reading large files via mmap.
After a thorough analysis, I identified it was caused by the
`/sys/block/*/queue/read_ahead_kb` setting. On our test servers, this
parameter is set to 128KB. After I tune it to 2MB, the large folio can
work as expected. However, I believe the large folio behavior should not
be dependent on the value of read_ahead_kb. It would be more robust if
the kernel can automatically adopt to it.
With /sys/block/*/queue/read_ahead_kb set to 128KB and performing a
sequential read on a 1GB file using MADV_HUGEPAGE, the differences in
/proc/meminfo are as follows:
- before this patch
FileHugePages: 18432 kB
FilePmdMapped: 4096 kB
- after this patch
FileHugePages: 1067008 kB
FilePmdMapped: 1048576 kB
This shows that after applying the patch, the entire 1GB file is mapped to
huge pages. The stable list is CCed, as without this patch, large folios
don't function optimally in the readahead path.
It's worth noting that if read_ahead_kb is set to a larger value that
isn't aligned with huge page sizes (e.g., 4MB + 128KB), it may still fail
to map to hugepages.
Link: https://lkml.kernel.org/r/20241108141710.9721-1-laoar.shao@gmail.com
Fixes: 4687fdbb805a ("mm/filemap: Support VM_HUGEPAGE for file mappings")
Signed-off-by: Yafang Shao <laoar.shao(a)gmail.com>
Tested-by: kernel test robot <oliver.sang(a)intel.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
---
mm/readahead.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
Changes:
v2->v3:
- Fix the softlockup reported by kernel test robot
https://lore.kernel.org/linux-fsdevel/202411292300.61edbd37-lkp@intel.com/
v1->v2: https://lore.kernel.org/linux-mm/20241108141710.9721-1-laoar.shao@gmail.com/
- Drop the alignment (Matthew)
- Improve commit log (Andrew)
RFC->v1: https://lore.kernel.org/linux-mm/20241106092114.8408-1-laoar.shao@gmail.com/
- Simplify the code as suggested by Matthew
RFC: https://lore.kernel.org/linux-mm/20241104143015.34684-1-laoar.shao@gmail.co…
diff --git a/mm/readahead.c b/mm/readahead.c
index 3dc6c7a128dd..1dc3cffd4843 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -642,7 +642,11 @@ void page_cache_async_ra(struct readahead_control *ractl,
1UL << order);
if (index == expected) {
ra->start += ra->size;
- ra->size = get_next_ra_size(ra, max_pages);
+ /*
+ * In the case of MADV_HUGEPAGE, the actual size might exceed
+ * the readahead window.
+ */
+ ra->size = max(ra->size, get_next_ra_size(ra, max_pages));
ra->async_size = ra->size;
goto readit;
}
--
2.43.5
From: Sungjong Seo <sj1557.seo(a)samsung.com>
commit 89fc548767a2155231128cb98726d6d2ea1256c9 upstream.
When accessing a file with more entries than ES_MAX_ENTRY_NUM, the bh-array
is allocated in __exfat_get_entry_set. The problem is that the bh-array is
allocated with GFP_KERNEL. It does not make sense. In the following cases,
a deadlock for sbi->s_lock between the two processes may occur.
CPU0 CPU1
---- ----
kswapd
balance_pgdat
lock(fs_reclaim)
exfat_iterate
lock(&sbi->s_lock)
exfat_readdir
exfat_get_uniname_from_ext_entry
exfat_get_dentry_set
__exfat_get_dentry_set
kmalloc_array
...
lock(fs_reclaim)
...
evict
exfat_evict_inode
lock(&sbi->s_lock)
To fix this, let's allocate bh-array with GFP_NOFS.
Fixes: a3ff29a95fde ("exfat: support dynamic allocate bh for exfat_entry_set_cache")
Cc: stable(a)vger.kernel.org # v6.2+
Reported-by: syzbot+412a392a2cd4a65e71db(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/lkml/000000000000fef47e0618c0327f@google.com
Signed-off-by: Sungjong Seo <sj1557.seo(a)samsung.com>
Signed-off-by: Namjae Jeon <linkinjeon(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
[Sherry: The problematic commit was backported to 5.15.y and 5.10.y,
thus backport this fix]
Signed-off-by: Sherry Yang <sherry.yang(a)oracle.com>
---
fs/exfat/dir.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/exfat/dir.c b/fs/exfat/dir.c
index be7570d01ae1..0a1b1de032ef 100644
--- a/fs/exfat/dir.c
+++ b/fs/exfat/dir.c
@@ -878,7 +878,7 @@ struct exfat_entry_set_cache *exfat_get_dentry_set(struct super_block *sb,
num_bh = EXFAT_B_TO_BLK_ROUND_UP(off + num_entries * DENTRY_SIZE, sb);
if (num_bh > ARRAY_SIZE(es->__bh)) {
- es->bh = kmalloc_array(num_bh, sizeof(*es->bh), GFP_KERNEL);
+ es->bh = kmalloc_array(num_bh, sizeof(*es->bh), GFP_NOFS);
if (!es->bh) {
brelse(bh);
kfree(es);
--
2.46.0
The channel index is off by one unit if AS73211_SCAN_MASK_ALL is not
set (optimized path for color channel readings), and it must be shifted
instead of leaving an empty channel for the temperature when it is off.
Once the channel index is fixed, the uninitialized channel must be set
to zero to avoid pushing uninitialized data.
Add available_scan_masks for all channels and only-color channels to let
the IIO core demux and repack the enabled channels.
Cc: stable(a)vger.kernel.org
Fixes: 403e5586b52e ("iio: light: as73211: New driver")
Tested-by: Christian Eggers <ceggers(a)arri.de>
Signed-off-by: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
---
This issue was found after attempting to make the same mistake for
a driver I maintain, which was fortunately spotted by Jonathan [1].
Keeping old sensor values if the channel configuration changes is known
and not considered an issue, which is also mentioned in [1], so it has
not been addressed by this series. That keeps most of the drivers out
of the way because they store the scan element in iio private data,
which is kzalloc() allocated.
This series only addresses cases where uninitialized i.e. unknown data
is pushed to the userspace, either due to holes in structs or
uninitialized struct members/array elements.
While analyzing involved functions, I found and fixed some triviality
(wrong function name) in the documentation of iio_dev_opaque.
Link: https://lore.kernel.org/linux-iio/20241123151634.303aa860@jic23-huawei/ [1]
---
Changes in v3:
- as73211.c: add available_scan_masks for all channels and only-color
channels to let the IIO core demux and repack the enabled channels.
- Link to v2: https://lore.kernel.org/r/20241204-iio_memset_scan_holes-v2-0-3f941592a76d@…
Changes in v2:
- as73211.c: shift channels if no temperature is available and
initialize chan[3] to zero.
- Link to v1: https://lore.kernel.org/r/20241125-iio_memset_scan_holes-v1-0-0cb6e98d895c@…
---
drivers/iio/light/as73211.c | 24 ++++++++++++++++++++----
1 file changed, 20 insertions(+), 4 deletions(-)
diff --git a/drivers/iio/light/as73211.c b/drivers/iio/light/as73211.c
index be0068081ebb..4be2e349a068 100644
--- a/drivers/iio/light/as73211.c
+++ b/drivers/iio/light/as73211.c
@@ -177,6 +177,12 @@ struct as73211_data {
BIT(AS73211_SCAN_INDEX_TEMP) | \
AS73211_SCAN_MASK_COLOR)
+static const unsigned long as73211_scan_masks[] = {
+ AS73211_SCAN_MASK_ALL,
+ AS73211_SCAN_MASK_COLOR,
+ 0,
+};
+
static const struct iio_chan_spec as73211_channels[] = {
{
.type = IIO_TEMP,
@@ -672,9 +678,12 @@ static irqreturn_t as73211_trigger_handler(int irq __always_unused, void *p)
/* AS73211 starts reading at address 2 */
ret = i2c_master_recv(data->client,
- (char *)&scan.chan[1], 3 * sizeof(scan.chan[1]));
+ (char *)&scan.chan[0], 3 * sizeof(scan.chan[0]));
if (ret < 0)
goto done;
+
+ /* Avoid pushing uninitialized data */
+ scan.chan[3] = 0;
}
if (data_result) {
@@ -682,9 +691,15 @@ static irqreturn_t as73211_trigger_handler(int irq __always_unused, void *p)
* Saturate all channels (in case of overflows). Temperature channel
* is not affected by overflows.
*/
- scan.chan[1] = cpu_to_le16(U16_MAX);
- scan.chan[2] = cpu_to_le16(U16_MAX);
- scan.chan[3] = cpu_to_le16(U16_MAX);
+ if (*indio_dev->active_scan_mask == AS73211_SCAN_MASK_ALL) {
+ scan.chan[1] = cpu_to_le16(U16_MAX);
+ scan.chan[2] = cpu_to_le16(U16_MAX);
+ scan.chan[3] = cpu_to_le16(U16_MAX);
+ } else {
+ scan.chan[0] = cpu_to_le16(U16_MAX);
+ scan.chan[1] = cpu_to_le16(U16_MAX);
+ scan.chan[2] = cpu_to_le16(U16_MAX);
+ }
}
iio_push_to_buffers_with_timestamp(indio_dev, &scan, iio_get_time_ns(indio_dev));
@@ -758,6 +773,7 @@ static int as73211_probe(struct i2c_client *client)
indio_dev->channels = data->spec_dev->channels;
indio_dev->num_channels = data->spec_dev->num_channels;
indio_dev->modes = INDIO_DIRECT_MODE;
+ indio_dev->available_scan_masks = as73211_scan_masks;
ret = i2c_smbus_read_byte_data(data->client, AS73211_REG_OSR);
if (ret < 0)
---
base-commit: 91e71d606356e50f238d7a87aacdee4abc427f07
change-id: 20241123-iio_memset_scan_holes-a673833ef932
Best regards,
--
Javier Carrasco <javier.carrasco.cruz(a)gmail.com>