This is the start of the stable review cycle for the 4.4.171 release. There are 51 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu Jan 17 15:48:21 UTC 2019. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.171-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 4.4.171-rc1
Vasily Averin vvs@virtuozzo.com sunrpc: use-after-free in svc_process_common()
Theodore Ts'o tytso@mit.edu ext4: fix a potential fiemap/page fault deadlock w/ inline_data
Eric Biggers ebiggers@google.com crypto: cts - fix crash on short inputs
Yi Zeng yizeng@asrmicro.com i2c: dev: prevent adapter retries and timeout being set as minus value
Hans de Goede hdegoede@redhat.com ACPI: power: Skip duplicate power resource references in _PRx
Ley Foon Tan lftan@altera.com PCI: altera: Move retrain from fixup to altera_pcie_host_init()
Ley Foon Tan lftan@altera.com PCI: altera: Rework config accessors for use without a struct pci_bus
Ley Foon Tan lftan@altera.com PCI: altera: Poll for link training status after retraining the link
Ley Foon Tan lftan@altera.com PCI: altera: Poll for link up status after retraining the link
Ley Foon Tan lftan@altera.com PCI: altera: Check link status before retrain link
Bjorn Helgaas bhelgaas@google.com PCI: altera: Reorder read/write functions
Ley Foon Tan lftan@altera.com PCI: altera: Fix altera_pcie_link_is_up()
Christoph Lameter cl@linux.com slab: alien caches must not be initialized if the allocation of the alien cache failed
Jack Stocker jackstocker.93@gmail.com USB: Add USB_QUIRK_DELAY_CTRL_MSG quirk for Corsair K70 RGB
Icenowy Zheng icenowy@aosc.io USB: storage: add quirk for SMI SM3350
Icenowy Zheng icenowy@aosc.io USB: storage: don't insert sane sense for SPC3+ when bad sense specified
Daniele Palmas dnlplm@gmail.com usb: cdc-acm: send ZLP for Telit 3G Intel based modems
Ross Lagerwall ross.lagerwall@citrix.com cifs: Fix potential OOB access of lock element array
Pavel Shilovsky pshilov@microsoft.com CIFS: Do not hide EINTR after sending network packets
Shaokun Zhang zhangshaokun@hisilicon.com btrfs: tree-checker: Fix misleading group system information
Qu Wenruo wqu@suse.com btrfs: tree-checker: Check level for leaves and nodes
Qu Wenruo wqu@suse.com btrfs: Verify that every chunk has corresponding block group at mount time
Qu Wenruo wqu@suse.com btrfs: Check that each block group has corresponding chunk at mount time
Gu Jinxiang gujx@cn.fujitsu.com btrfs: validate type when reading a chunk
Qu Wenruo wqu@suse.com btrfs: tree-checker: Detect invalid and empty essential trees
Qu Wenruo wqu@suse.com btrfs: tree-checker: Verify block_group_item
David Sterba dsterba@suse.com btrfs: tree-check: reduce stack consumption in check_dir_item
Arnd Bergmann arnd@arndb.de btrfs: tree-checker: use %zu format string for size_t
Qu Wenruo wqu@suse.com btrfs: tree-checker: Add checker for dir item
Qu Wenruo wqu@suse.com btrfs: tree-checker: Fix false panic for sanity test
Qu Wenruo quwenruo.btrfs@gmx.com btrfs: tree-checker: Enhance btrfs_check_node output
Qu Wenruo quwenruo.btrfs@gmx.com btrfs: Move leaf and node validation checker to tree-checker.c
Qu Wenruo quwenruo.btrfs@gmx.com btrfs: Add checker for EXTENT_CSUM
Qu Wenruo quwenruo.btrfs@gmx.com btrfs: Add sanity check for EXTENT_DATA when reading out leaf
Qu Wenruo quwenruo.btrfs@gmx.com btrfs: Check if item pointer overlaps with the item itself
Qu Wenruo quwenruo.btrfs@gmx.com btrfs: Refactor check_leaf function for later expansion
Jeff Mahoney jeffm@suse.com btrfs: struct-funcs, constify readers
Filipe Manana fdmanana@suse.com Btrfs: fix emptiness check for dirtied extent buffers at check_leaf()
Liu Bo bo.li.liu@oracle.com Btrfs: memset to avoid stale content in btree leaf
Liu Bo bo.li.liu@oracle.com Btrfs: kill BUG_ON in run_delayed_tree_ref
Liu Bo bo.li.liu@oracle.com Btrfs: improve check_node to avoid reading corrupted nodes
Liu Bo bo.li.liu@oracle.com Btrfs: memset to avoid stale content in btree node block
Liu Bo bo.li.liu@oracle.com Btrfs: fix BUG_ON in btrfs_mark_buffer_dirty
Liu Bo bo.li.liu@oracle.com Btrfs: check btree node's nritems
Liu Bo bo.li.liu@oracle.com Btrfs: detect corruption when non-root leaf has zero item
Josef Bacik jbacik@fb.com Btrfs: fix em leak in find_first_block_group
Liu Bo bo.li.liu@oracle.com Btrfs: check inconsistence between chunk and block group
Liu Bo bo.li.liu@oracle.com Btrfs: add validadtion checks for chunk loading
Qu Wenruo quwenruo@cn.fujitsu.com btrfs: Enhance chunk validation check
Jeff Mahoney jeffm@suse.com btrfs: cleanup, stop casting for extent_map->lookup everywhere
Kailang Yang kailang@realtek.com ALSA: hda/realtek - Disable headset Mic VREF for headset mode of ALC225
-------------
Diffstat:
Makefile | 4 +- crypto/cts.c | 8 +- drivers/acpi/power.c | 22 ++ drivers/i2c/i2c-dev.c | 6 + drivers/pci/host/pcie-altera.c | 201 +++++++++--- drivers/usb/class/cdc-acm.c | 7 + drivers/usb/core/quirks.c | 3 +- drivers/usb/storage/scsiglue.c | 8 +- drivers/usb/storage/unusual_devs.h | 12 + fs/btrfs/Makefile | 2 +- fs/btrfs/ctree.c | 14 - fs/btrfs/ctree.h | 141 ++++---- fs/btrfs/dev-replace.c | 2 +- fs/btrfs/disk-io.c | 80 +---- fs/btrfs/extent-tree.c | 112 ++++++- fs/btrfs/extent_io.c | 43 ++- fs/btrfs/extent_io.h | 19 +- fs/btrfs/extent_map.c | 2 +- fs/btrfs/extent_map.h | 10 +- fs/btrfs/scrub.c | 2 +- fs/btrfs/struct-funcs.c | 9 +- fs/btrfs/tree-checker.c | 649 +++++++++++++++++++++++++++++++++++++ fs/btrfs/tree-checker.h | 38 +++ fs/btrfs/volumes.c | 139 +++++++- fs/btrfs/volumes.h | 2 + fs/cifs/file.c | 8 +- fs/cifs/smb2file.c | 4 +- fs/cifs/transport.c | 2 +- fs/ext4/inline.c | 6 +- include/linux/sunrpc/svc.h | 5 +- mm/slab.c | 6 +- net/sunrpc/svc.c | 10 +- net/sunrpc/svc_xprt.c | 5 +- net/sunrpc/svcsock.c | 2 +- sound/pci/hda/patch_realtek.c | 16 +- 35 files changed, 1324 insertions(+), 275 deletions(-)
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kailang Yang kailang@realtek.com
commit d1dd42110d2727e81b9265841a62bc84c454c3a2 upstream.
Disable Headset Mic VREF for headset mode of ALC225. This will be controlled by coef bits of headset mode functions.
[ Fixed a compile warning and code simplification -- tiwai ]
Signed-off-by: Kailang Yang kailang@realtek.com Cc: stable@vger.kernel.org Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/pci/hda/patch_realtek.c | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -4792,6 +4792,13 @@ static void alc280_fixup_hp_9480m(struct } }
+static void alc_fixup_disable_mic_vref(struct hda_codec *codec, + const struct hda_fixup *fix, int action) +{ + if (action == HDA_FIXUP_ACT_PRE_PROBE) + snd_hda_codec_set_pin_target(codec, 0x19, PIN_VREFHIZ); +} + /* for hda_fixup_thinkpad_acpi() */ #include "thinkpad_helper.c"
@@ -4891,6 +4898,7 @@ enum { ALC293_FIXUP_LENOVO_SPK_NOISE, ALC233_FIXUP_LENOVO_LINE2_MIC_HOTKEY, ALC255_FIXUP_DELL_SPK_NOISE, + ALC225_FIXUP_DISABLE_MIC_VREF, ALC225_FIXUP_DELL1_MIC_NO_PRESENCE, ALC295_FIXUP_DISABLE_DAC3, ALC280_FIXUP_HP_HEADSET_MIC, @@ -5546,6 +5554,12 @@ static const struct hda_fixup alc269_fix .chained = true, .chain_id = ALC255_FIXUP_DELL1_MIC_NO_PRESENCE }, + [ALC225_FIXUP_DISABLE_MIC_VREF] = { + .type = HDA_FIXUP_FUNC, + .v.func = alc_fixup_disable_mic_vref, + .chained = true, + .chain_id = ALC269_FIXUP_DELL1_MIC_NO_PRESENCE + }, [ALC225_FIXUP_DELL1_MIC_NO_PRESENCE] = { .type = HDA_FIXUP_VERBS, .v.verbs = (const struct hda_verb[]) { @@ -5555,7 +5569,7 @@ static const struct hda_fixup alc269_fix {} }, .chained = true, - .chain_id = ALC269_FIXUP_DELL1_MIC_NO_PRESENCE + .chain_id = ALC225_FIXUP_DISABLE_MIC_VREF }, [ALC280_FIXUP_HP_HEADSET_MIC] = { .type = HDA_FIXUP_FUNC,
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jeff Mahoney jeffm@suse.com
commit 95617d69326ce386c95e33db7aeb832b45ee9f8f upstream.
Overloading extent_map->bdev to struct map_lookup * might have started out as a means to an end, but it's a pattern that's used all over the place now. Let's get rid of the casting and just add a union instead.
Signed-off-by: Jeff Mahoney jeffm@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/dev-replace.c | 2 +- fs/btrfs/extent-tree.c | 2 +- fs/btrfs/extent_map.c | 2 +- fs/btrfs/extent_map.h | 10 +++++++++- fs/btrfs/scrub.c | 2 +- fs/btrfs/volumes.c | 24 ++++++++++++------------ 6 files changed, 25 insertions(+), 17 deletions(-)
--- a/fs/btrfs/dev-replace.c +++ b/fs/btrfs/dev-replace.c @@ -620,7 +620,7 @@ static void btrfs_dev_replace_update_dev em = lookup_extent_mapping(em_tree, start, (u64)-1); if (!em) break; - map = (struct map_lookup *)em->bdev; + map = em->map_lookup; for (i = 0; i < map->num_stripes; i++) if (srcdev == map->stripes[i].dev) map->stripes[i].dev = tgtdev; --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -10388,7 +10388,7 @@ btrfs_start_trans_remove_block_group(str * more device items and remove one chunk item), but this is done at * btrfs_remove_chunk() through a call to check_system_chunk(). */ - map = (struct map_lookup *)em->bdev; + map = em->map_lookup; num_items = 3 + map->num_stripes; free_extent_map(em);
--- a/fs/btrfs/extent_map.c +++ b/fs/btrfs/extent_map.c @@ -76,7 +76,7 @@ void free_extent_map(struct extent_map * WARN_ON(extent_map_in_tree(em)); WARN_ON(!list_empty(&em->list)); if (test_bit(EXTENT_FLAG_FS_MAPPING, &em->flags)) - kfree(em->bdev); + kfree(em->map_lookup); kmem_cache_free(extent_map_cache, em); } } --- a/fs/btrfs/extent_map.h +++ b/fs/btrfs/extent_map.h @@ -32,7 +32,15 @@ struct extent_map { u64 block_len; u64 generation; unsigned long flags; - struct block_device *bdev; + union { + struct block_device *bdev; + + /* + * used for chunk mappings + * flags & EXTENT_FLAG_FS_MAPPING must be set + */ + struct map_lookup *map_lookup; + }; atomic_t refs; unsigned int compress_type; struct list_head list; --- a/fs/btrfs/scrub.c +++ b/fs/btrfs/scrub.c @@ -3460,7 +3460,7 @@ static noinline_for_stack int scrub_chun return ret; }
- map = (struct map_lookup *)em->bdev; + map = em->map_lookup; if (em->start != chunk_offset) goto out;
--- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -1184,7 +1184,7 @@ again: struct map_lookup *map; int i;
- map = (struct map_lookup *)em->bdev; + map = em->map_lookup; for (i = 0; i < map->num_stripes; i++) { u64 end;
@@ -2757,7 +2757,7 @@ int btrfs_remove_chunk(struct btrfs_tran free_extent_map(em); return -EINVAL; } - map = (struct map_lookup *)em->bdev; + map = em->map_lookup; lock_chunks(root->fs_info->chunk_root); check_system_chunk(trans, extent_root, map->type); unlock_chunks(root->fs_info->chunk_root); @@ -4731,7 +4731,7 @@ static int __btrfs_alloc_chunk(struct bt goto error; } set_bit(EXTENT_FLAG_FS_MAPPING, &em->flags); - em->bdev = (struct block_device *)map; + em->map_lookup = map; em->start = start; em->len = num_bytes; em->block_start = 0; @@ -4826,7 +4826,7 @@ int btrfs_finish_chunk_alloc(struct btrf return -EINVAL; }
- map = (struct map_lookup *)em->bdev; + map = em->map_lookup; item_size = btrfs_chunk_item_size(map->num_stripes); stripe_size = em->orig_block_len;
@@ -4968,7 +4968,7 @@ int btrfs_chunk_readonly(struct btrfs_ro if (!em) return 1;
- map = (struct map_lookup *)em->bdev; + map = em->map_lookup; for (i = 0; i < map->num_stripes; i++) { if (map->stripes[i].dev->missing) { miss_ndevs++; @@ -5048,7 +5048,7 @@ int btrfs_num_copies(struct btrfs_fs_inf return 1; }
- map = (struct map_lookup *)em->bdev; + map = em->map_lookup; if (map->type & (BTRFS_BLOCK_GROUP_DUP | BTRFS_BLOCK_GROUP_RAID1)) ret = map->num_stripes; else if (map->type & BTRFS_BLOCK_GROUP_RAID10) @@ -5091,7 +5091,7 @@ unsigned long btrfs_full_stripe_len(stru BUG_ON(!em);
BUG_ON(em->start > logical || em->start + em->len < logical); - map = (struct map_lookup *)em->bdev; + map = em->map_lookup; if (map->type & BTRFS_BLOCK_GROUP_RAID56_MASK) len = map->stripe_len * nr_data_stripes(map); free_extent_map(em); @@ -5112,7 +5112,7 @@ int btrfs_is_parity_mirror(struct btrfs_ BUG_ON(!em);
BUG_ON(em->start > logical || em->start + em->len < logical); - map = (struct map_lookup *)em->bdev; + map = em->map_lookup; if (map->type & BTRFS_BLOCK_GROUP_RAID56_MASK) ret = 1; free_extent_map(em); @@ -5271,7 +5271,7 @@ static int __btrfs_map_block(struct btrf return -EINVAL; }
- map = (struct map_lookup *)em->bdev; + map = em->map_lookup; offset = logical - em->start;
stripe_len = map->stripe_len; @@ -5813,7 +5813,7 @@ int btrfs_rmap_block(struct btrfs_mappin free_extent_map(em); return -EIO; } - map = (struct map_lookup *)em->bdev; + map = em->map_lookup;
length = em->len; rmap_len = map->stripe_len; @@ -6249,7 +6249,7 @@ static int read_one_chunk(struct btrfs_r }
set_bit(EXTENT_FLAG_FS_MAPPING, &em->flags); - em->bdev = (struct block_device *)map; + em->map_lookup = map; em->start = logical; em->len = length; em->orig_start = 0; @@ -6948,7 +6948,7 @@ void btrfs_update_commit_device_bytes_us /* In order to kick the device replace finish process */ lock_chunks(root); list_for_each_entry(em, &transaction->pending_chunks, list) { - map = (struct map_lookup *)em->bdev; + map = em->map_lookup;
for (i = 0; i < map->num_stripes; i++) { dev = map->stripes[i].dev;
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Qu Wenruo quwenruo@cn.fujitsu.com
commit f04b772bfc17f502703794f4d100d12155c1a1a9 upstream.
Enhance chunk validation: 1) Num_stripes We already have such check but it's only in super block sys chunk array. Now check all on-disk chunks.
2) Chunk logical It should be aligned to sector size. This behavior should be *DOUBLE CHECKED* for 64K sector size like PPC64 or AArch64. Maybe we can found some hidden bugs.
3) Chunk length Same as chunk logical, should be aligned to sector size.
4) Stripe length It should be power of 2.
5) Chunk type Any bit out of TYPE_MAS | PROFILE_MASK is invalid.
With all these much restrict rules, several fuzzed image reported in mail list should no longer cause kernel panic.
Reported-by: Vegard Nossum vegard.nossum@oracle.com Signed-off-by: Qu Wenruo quwenruo@cn.fujitsu.com Signed-off-by: Chris Mason clm@fb.com Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/volumes.c | 33 ++++++++++++++++++++++++++++++++- 1 file changed, 32 insertions(+), 1 deletion(-)
--- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -6217,6 +6217,7 @@ static int read_one_chunk(struct btrfs_r struct extent_map *em; u64 logical; u64 length; + u64 stripe_len; u64 devid; u8 uuid[BTRFS_UUID_SIZE]; int num_stripes; @@ -6225,6 +6226,37 @@ static int read_one_chunk(struct btrfs_r
logical = key->offset; length = btrfs_chunk_length(leaf, chunk); + stripe_len = btrfs_chunk_stripe_len(leaf, chunk); + num_stripes = btrfs_chunk_num_stripes(leaf, chunk); + /* Validation check */ + if (!num_stripes) { + btrfs_err(root->fs_info, "invalid chunk num_stripes: %u", + num_stripes); + return -EIO; + } + if (!IS_ALIGNED(logical, root->sectorsize)) { + btrfs_err(root->fs_info, + "invalid chunk logical %llu", logical); + return -EIO; + } + if (!length || !IS_ALIGNED(length, root->sectorsize)) { + btrfs_err(root->fs_info, + "invalid chunk length %llu", length); + return -EIO; + } + if (!is_power_of_2(stripe_len)) { + btrfs_err(root->fs_info, "invalid chunk stripe length: %llu", + stripe_len); + return -EIO; + } + if (~(BTRFS_BLOCK_GROUP_TYPE_MASK | BTRFS_BLOCK_GROUP_PROFILE_MASK) & + btrfs_chunk_type(leaf, chunk)) { + btrfs_err(root->fs_info, "unrecognized chunk type: %llu", + ~(BTRFS_BLOCK_GROUP_TYPE_MASK | + BTRFS_BLOCK_GROUP_PROFILE_MASK) & + btrfs_chunk_type(leaf, chunk)); + return -EIO; + }
read_lock(&map_tree->map_tree.lock); em = lookup_extent_mapping(&map_tree->map_tree, logical, 1); @@ -6241,7 +6273,6 @@ static int read_one_chunk(struct btrfs_r em = alloc_extent_map(); if (!em) return -ENOMEM; - num_stripes = btrfs_chunk_num_stripes(leaf, chunk); map = kmalloc(map_lookup_size(num_stripes), GFP_NOFS); if (!map) { free_extent_map(em);
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Liu Bo bo.li.liu@oracle.com
commit e06cd3dd7cea50e87663a88acdfdb7ac1c53a5ca upstream.
To prevent fuzzed filesystem images from panic the whole system, we need various validation checks to refuse to mount such an image if btrfs finds any invalid value during loading chunks, including both sys_array and regular chunks.
Note that these checks may not be sufficient to cover all corner cases, feel free to add more checks.
Reported-by: Vegard Nossum vegard.nossum@oracle.com Reported-by: Quentin Casasnovas quentin.casasnovas@oracle.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: Liu Bo bo.li.liu@oracle.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/volumes.c | 82 +++++++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 67 insertions(+), 15 deletions(-)
--- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -6208,27 +6208,23 @@ struct btrfs_device *btrfs_alloc_device( return dev; }
-static int read_one_chunk(struct btrfs_root *root, struct btrfs_key *key, - struct extent_buffer *leaf, - struct btrfs_chunk *chunk) +/* Return -EIO if any error, otherwise return 0. */ +static int btrfs_check_chunk_valid(struct btrfs_root *root, + struct extent_buffer *leaf, + struct btrfs_chunk *chunk, u64 logical) { - struct btrfs_mapping_tree *map_tree = &root->fs_info->mapping_tree; - struct map_lookup *map; - struct extent_map *em; - u64 logical; u64 length; u64 stripe_len; - u64 devid; - u8 uuid[BTRFS_UUID_SIZE]; - int num_stripes; - int ret; - int i; + u16 num_stripes; + u16 sub_stripes; + u64 type;
- logical = key->offset; length = btrfs_chunk_length(leaf, chunk); stripe_len = btrfs_chunk_stripe_len(leaf, chunk); num_stripes = btrfs_chunk_num_stripes(leaf, chunk); - /* Validation check */ + sub_stripes = btrfs_chunk_sub_stripes(leaf, chunk); + type = btrfs_chunk_type(leaf, chunk); + if (!num_stripes) { btrfs_err(root->fs_info, "invalid chunk num_stripes: %u", num_stripes); @@ -6239,6 +6235,11 @@ static int read_one_chunk(struct btrfs_r "invalid chunk logical %llu", logical); return -EIO; } + if (btrfs_chunk_sector_size(leaf, chunk) != root->sectorsize) { + btrfs_err(root->fs_info, "invalid chunk sectorsize %u", + btrfs_chunk_sector_size(leaf, chunk)); + return -EIO; + } if (!length || !IS_ALIGNED(length, root->sectorsize)) { btrfs_err(root->fs_info, "invalid chunk length %llu", length); @@ -6250,13 +6251,54 @@ static int read_one_chunk(struct btrfs_r return -EIO; } if (~(BTRFS_BLOCK_GROUP_TYPE_MASK | BTRFS_BLOCK_GROUP_PROFILE_MASK) & - btrfs_chunk_type(leaf, chunk)) { + type) { btrfs_err(root->fs_info, "unrecognized chunk type: %llu", ~(BTRFS_BLOCK_GROUP_TYPE_MASK | BTRFS_BLOCK_GROUP_PROFILE_MASK) & btrfs_chunk_type(leaf, chunk)); return -EIO; } + if ((type & BTRFS_BLOCK_GROUP_RAID10 && sub_stripes != 2) || + (type & BTRFS_BLOCK_GROUP_RAID1 && num_stripes < 1) || + (type & BTRFS_BLOCK_GROUP_RAID5 && num_stripes < 2) || + (type & BTRFS_BLOCK_GROUP_RAID6 && num_stripes < 3) || + (type & BTRFS_BLOCK_GROUP_DUP && num_stripes > 2) || + ((type & BTRFS_BLOCK_GROUP_PROFILE_MASK) == 0 && + num_stripes != 1)) { + btrfs_err(root->fs_info, + "invalid num_stripes:sub_stripes %u:%u for profile %llu", + num_stripes, sub_stripes, + type & BTRFS_BLOCK_GROUP_PROFILE_MASK); + return -EIO; + } + + return 0; +} + +static int read_one_chunk(struct btrfs_root *root, struct btrfs_key *key, + struct extent_buffer *leaf, + struct btrfs_chunk *chunk) +{ + struct btrfs_mapping_tree *map_tree = &root->fs_info->mapping_tree; + struct map_lookup *map; + struct extent_map *em; + u64 logical; + u64 length; + u64 stripe_len; + u64 devid; + u8 uuid[BTRFS_UUID_SIZE]; + int num_stripes; + int ret; + int i; + + logical = key->offset; + length = btrfs_chunk_length(leaf, chunk); + stripe_len = btrfs_chunk_stripe_len(leaf, chunk); + num_stripes = btrfs_chunk_num_stripes(leaf, chunk); + + ret = btrfs_check_chunk_valid(root, leaf, chunk, logical); + if (ret) + return ret;
read_lock(&map_tree->map_tree.lock); em = lookup_extent_mapping(&map_tree->map_tree, logical, 1); @@ -6504,6 +6546,7 @@ int btrfs_read_sys_array(struct btrfs_ro u32 array_size; u32 len = 0; u32 cur_offset; + u64 type; struct btrfs_key key;
ASSERT(BTRFS_SUPER_INFO_SIZE <= root->nodesize); @@ -6569,6 +6612,15 @@ int btrfs_read_sys_array(struct btrfs_ro ret = -EIO; break; } + + type = btrfs_chunk_type(sb, chunk); + if ((type & BTRFS_BLOCK_GROUP_SYSTEM) == 0) { + btrfs_err(root->fs_info, + "invalid chunk type %llu in sys_array at offset %u", + type, cur_offset); + ret = -EIO; + break; + }
len = btrfs_chunk_item_size(num_stripes); if (cur_offset + len > array_size)
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Liu Bo bo.li.liu@oracle.com
commit 6fb37b756acce6d6e045f79c3764206033f617b4 upstream.
With btrfs-corrupt-block, one can drop one chunk item and mounting will end up with a panic in btrfs_full_stripe_len().
This doesn't not remove the BUG_ON, but instead checks it a bit earlier when we find the block group item.
Signed-off-by: Liu Bo bo.li.liu@oracle.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/extent-tree.c | 17 ++++++++++++++++- 1 file changed, 16 insertions(+), 1 deletion(-)
--- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -9502,7 +9502,22 @@ static int find_first_block_group(struct
if (found_key.objectid >= key->objectid && found_key.type == BTRFS_BLOCK_GROUP_ITEM_KEY) { - ret = 0; + struct extent_map_tree *em_tree; + struct extent_map *em; + + em_tree = &root->fs_info->mapping_tree.map_tree; + read_lock(&em_tree->lock); + em = lookup_extent_mapping(em_tree, found_key.objectid, + found_key.offset); + read_unlock(&em_tree->lock); + if (!em) { + btrfs_err(root->fs_info, + "logical %llu len %llu found bg but no related chunk", + found_key.objectid, found_key.offset); + ret = -ENOENT; + } else { + ret = 0; + } goto out; } path->slots[0]++;
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Josef Bacik jbacik@fb.com
commit 187ee58c62c1d0d238d3dc4835869d33e1869906 upstream.
We need to call free_extent_map() on the em we look up.
Signed-off-by: Josef Bacik jbacik@fb.com Reviewed-by: Omar Sandoval osandov@fb.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Chris Mason clm@fb.com Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/extent-tree.c | 1 + 1 file changed, 1 insertion(+)
--- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -9518,6 +9518,7 @@ static int find_first_block_group(struct } else { ret = 0; } + free_extent_map(em); goto out; } path->slots[0]++;
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Liu Bo bo.li.liu@oracle.com
commit 1ba98d086fe3a14d6a31f2f66dbab70c45d00f63 upstream.
Right now we treat leaf which has zero item as a valid one because we could have an empty tree, that is, a root that is also a leaf without any item, however, in the same case but when the leaf is not a root, we can end up with hitting the BUG_ON(1) in btrfs_extend_item() called by setup_inline_extent_backref().
This makes us check the situation as a corruption if leaf is not its own root.
Signed-off-by: Liu Bo bo.li.liu@oracle.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Chris Mason clm@fb.com Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/disk-io.c | 23 ++++++++++++++++++++++- 1 file changed, 22 insertions(+), 1 deletion(-)
--- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -535,8 +535,29 @@ static noinline int check_leaf(struct bt u32 nritems = btrfs_header_nritems(leaf); int slot;
- if (nritems == 0) + if (nritems == 0) { + struct btrfs_root *check_root; + + key.objectid = btrfs_header_owner(leaf); + key.type = BTRFS_ROOT_ITEM_KEY; + key.offset = (u64)-1; + + check_root = btrfs_get_fs_root(root->fs_info, &key, false); + /* + * The only reason we also check NULL here is that during + * open_ctree() some roots has not yet been set up. + */ + if (!IS_ERR_OR_NULL(check_root)) { + /* if leaf is the root, then it's fine */ + if (leaf->start != + btrfs_root_bytenr(&check_root->root_item)) { + CORRUPT("non-root leaf's nritems is 0", + leaf, root, 0); + return -EIO; + } + } return 0; + }
/* Check the 0 item */ if (btrfs_item_offset_nr(leaf, 0) + btrfs_item_size_nr(leaf, 0) !=
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Liu Bo bo.li.liu@oracle.com
commit 053ab70f0604224c7893b43f9d9d5efa283580d6 upstream.
When btree node (level = 1) has nritems which equals to zero, we can end up with panic due to insert_ptr()'s
BUG_ON(slot > nritems);
where slot is 1 and nritems is 0, as copy_for_split() calls insert_ptr(.., path->slots[1] + 1, ...);
A invalid value results in the whole mess, this adds the check for btree's node nritems so that we stop reading block when when something is wrong.
Signed-off-by: Liu Bo bo.li.liu@oracle.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Chris Mason clm@fb.com Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/disk-io.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+)
--- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -609,6 +609,19 @@ static noinline int check_leaf(struct bt return 0; }
+static int check_node(struct btrfs_root *root, struct extent_buffer *node) +{ + unsigned long nr = btrfs_header_nritems(node); + + if (nr == 0 || nr > BTRFS_NODEPTRS_PER_BLOCK(root)) { + btrfs_crit(root->fs_info, + "corrupt node: block %llu root %llu nritems %lu", + node->start, root->objectid, nr); + return -EIO; + } + return 0; +} + static int btree_readpage_end_io_hook(struct btrfs_io_bio *io_bio, u64 phy_offset, struct page *page, u64 start, u64 end, int mirror) @@ -680,6 +693,9 @@ static int btree_readpage_end_io_hook(st ret = -EIO; }
+ if (found_level > 0 && check_node(root, eb)) + ret = -EIO; + if (!ret) set_extent_buffer_uptodate(eb); err:
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Liu Bo bo.li.liu@oracle.com
commit ef85b25e982b5bba1530b936e283ef129f02ab9d upstream.
This can only happen with CONFIG_BTRFS_FS_CHECK_INTEGRITY=y.
Commit 1ba98d0 ("Btrfs: detect corruption when non-root leaf has zero item") assumes that a leaf is its root when leaf->bytenr == btrfs_root_bytenr(root), however, we should not use btrfs_root_bytenr(root) since it's mainly got updated during committing transaction. So the check can fail when doing COW on this leaf while it is a root.
This changes to use "if (leaf == btrfs_root_node(root))" instead, just like how we check whether leaf is a root in __btrfs_cow_block().
Fixes: 1ba98d086fe3 (Btrfs: detect corruption when non-root leaf has zero item) Reported-by: Jeff Mahoney jeffm@suse.com Signed-off-by: Liu Bo bo.li.liu@oracle.com Reviewed-by: Filipe Manana fdmanana@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/disk-io.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)
--- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -548,13 +548,17 @@ static noinline int check_leaf(struct bt * open_ctree() some roots has not yet been set up. */ if (!IS_ERR_OR_NULL(check_root)) { + struct extent_buffer *eb; + + eb = btrfs_root_node(check_root); /* if leaf is the root, then it's fine */ - if (leaf->start != - btrfs_root_bytenr(&check_root->root_item)) { + if (leaf != eb) { CORRUPT("non-root leaf's nritems is 0", - leaf, root, 0); + leaf, check_root, 0); + free_extent_buffer(eb); return -EIO; } + free_extent_buffer(eb); } return 0; }
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Liu Bo bo.li.liu@oracle.com
commit 3eb548ee3a8042d95ad81be254e67a5222c24e03 upstream.
During updating btree, we could push items between sibling nodes/leaves, for leaves data sections starts reversely from the end of the block while for nodes we only have key pairs which are stored one by one from the start of the block.
So we could do try to push key pairs from one node to the next node right in the tree, and after that, we update the node's nritems to reflect the correct end while leaving the stale content in the node. One may intentionally corrupt the fs image and access the stale content by bumping the nritems and causes various crashes.
This takes the in-memory @nritems as the correct one and gets to memset the unused part of a btree node.
Signed-off-by: Liu Bo bo.li.liu@oracle.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/extent_io.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
--- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -3858,6 +3858,17 @@ static noinline_for_stack int write_one_ if (btrfs_header_owner(eb) == BTRFS_TREE_LOG_OBJECTID) bio_flags = EXTENT_BIO_TREE_LOG;
+ /* set btree node beyond nritems with 0 to avoid stale content */ + if (btrfs_header_level(eb) > 0) { + u32 nritems; + unsigned long end; + + nritems = btrfs_header_nritems(eb); + end = btrfs_node_key_ptr_offset(nritems); + + memset_extent_buffer(eb, 0, end, eb->len - end); + } + for (i = 0; i < num_pages; i++) { struct page *p = eb->pages[i];
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Liu Bo bo.li.liu@oracle.com
commit 6b722c1747d533ac6d4df110dc8233db46918b65 upstream.
We need to check items in a node to make sure that we're reading a valid one, otherwise we could get various crashes while processing delayed_refs.
Signed-off-by: Liu Bo bo.li.liu@oracle.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/disk-io.c | 32 ++++++++++++++++++++++++++++---- 1 file changed, 28 insertions(+), 4 deletions(-)
--- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -523,9 +523,10 @@ static int check_tree_block_fsid(struct }
#define CORRUPT(reason, eb, root, slot) \ - btrfs_crit(root->fs_info, "corrupt leaf, %s: block=%llu," \ - "root=%llu, slot=%d", reason, \ - btrfs_header_bytenr(eb), root->objectid, slot) + btrfs_crit(root->fs_info, "corrupt %s, %s: block=%llu," \ + " root=%llu, slot=%d", \ + btrfs_header_level(eb) == 0 ? "leaf" : "node",\ + reason, btrfs_header_bytenr(eb), root->objectid, slot)
static noinline int check_leaf(struct btrfs_root *root, struct extent_buffer *leaf) @@ -616,6 +617,10 @@ static noinline int check_leaf(struct bt static int check_node(struct btrfs_root *root, struct extent_buffer *node) { unsigned long nr = btrfs_header_nritems(node); + struct btrfs_key key, next_key; + int slot; + u64 bytenr; + int ret = 0;
if (nr == 0 || nr > BTRFS_NODEPTRS_PER_BLOCK(root)) { btrfs_crit(root->fs_info, @@ -623,7 +628,26 @@ static int check_node(struct btrfs_root node->start, root->objectid, nr); return -EIO; } - return 0; + + for (slot = 0; slot < nr - 1; slot++) { + bytenr = btrfs_node_blockptr(node, slot); + btrfs_node_key_to_cpu(node, &key, slot); + btrfs_node_key_to_cpu(node, &next_key, slot + 1); + + if (!bytenr) { + CORRUPT("invalid item slot", node, root, slot); + ret = -EIO; + goto out; + } + + if (btrfs_comp_cpu_keys(&key, &next_key) >= 0) { + CORRUPT("bad key order", node, root, slot); + ret = -EIO; + goto out; + } + } +out: + return ret; }
static int btree_readpage_end_io_hook(struct btrfs_io_bio *io_bio,
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Liu Bo bo.li.liu@oracle.com
commit 02794222c4132ac003e7281fb71f4ec1645ffc87 upstream.
In a corrupted btrfs image, we can come across this BUG_ON and get an unreponsive system, but if we return errors instead, its caller can handle everything gracefully by aborting the current transaction.
Signed-off-by: Liu Bo bo.li.liu@oracle.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/extent-tree.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
--- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -2342,7 +2342,13 @@ static int run_delayed_tree_ref(struct b ins.type = BTRFS_EXTENT_ITEM_KEY; }
- BUG_ON(node->ref_mod != 1); + if (node->ref_mod != 1) { + btrfs_err(root->fs_info, + "btree block(%llu) has %d references rather than 1: action %d ref_root %llu parent %llu", + node->bytenr, node->ref_mod, node->action, ref_root, + parent); + return -EIO; + } if (node->action == BTRFS_ADD_DELAYED_REF && insert_reserved) { BUG_ON(!extent_op || !extent_op->update_flags); ret = alloc_reserved_tree_block(trans, root,
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Liu Bo bo.li.liu@oracle.com
commit 851cd173f06045816528176001cf82948282029c upstream.
This is an additional patch to "Btrfs: memset to avoid stale content in btree node block".
This uses memset to initialize the unused space in a leaf to avoid potential stale content, which may be incurred by pushing items between sibling leaves.
Signed-off-by: Liu Bo bo.li.liu@oracle.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/ctree.c | 14 -------------- fs/btrfs/ctree.h | 15 +++++++++++++++ fs/btrfs/extent_io.c | 18 +++++++++++++----- 3 files changed, 28 insertions(+), 19 deletions(-)
--- a/fs/btrfs/ctree.c +++ b/fs/btrfs/ctree.c @@ -1726,20 +1726,6 @@ int btrfs_realloc_node(struct btrfs_tran return err; }
-/* - * The leaf data grows from end-to-front in the node. - * this returns the address of the start of the last item, - * which is the stop of the leaf data stack - */ -static inline unsigned int leaf_data_end(struct btrfs_root *root, - struct extent_buffer *leaf) -{ - u32 nr = btrfs_header_nritems(leaf); - if (nr == 0) - return BTRFS_LEAF_DATA_SIZE(root); - return btrfs_item_offset_nr(leaf, nr - 1); -} -
/* * search for key in the extent_buffer. The items start at offset p, --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3158,6 +3158,21 @@ static inline unsigned long btrfs_leaf_d return offsetof(struct btrfs_leaf, items); }
+/* + * The leaf data grows from end-to-front in the node. + * this returns the address of the start of the last item, + * which is the stop of the leaf data stack + */ +static inline unsigned int leaf_data_end(struct btrfs_root *root, + struct extent_buffer *leaf) +{ + u32 nr = btrfs_header_nritems(leaf); + + if (nr == 0) + return BTRFS_LEAF_DATA_SIZE(root); + return btrfs_item_offset_nr(leaf, nr - 1); +} + /* struct btrfs_file_extent_item */ BTRFS_SETGET_FUNCS(file_extent_type, struct btrfs_file_extent_item, type, 8); BTRFS_SETGET_STACK_FUNCS(stack_file_extent_disk_bytenr, --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -3847,8 +3847,10 @@ static noinline_for_stack int write_one_ struct block_device *bdev = fs_info->fs_devices->latest_bdev; struct extent_io_tree *tree = &BTRFS_I(fs_info->btree_inode)->io_tree; u64 offset = eb->start; + u32 nritems; unsigned long i, num_pages; unsigned long bio_flags = 0; + unsigned long start, end; int rw = (epd->sync_io ? WRITE_SYNC : WRITE) | REQ_META; int ret = 0;
@@ -3858,15 +3860,21 @@ static noinline_for_stack int write_one_ if (btrfs_header_owner(eb) == BTRFS_TREE_LOG_OBJECTID) bio_flags = EXTENT_BIO_TREE_LOG;
- /* set btree node beyond nritems with 0 to avoid stale content */ + /* set btree blocks beyond nritems with 0 to avoid stale content. */ + nritems = btrfs_header_nritems(eb); if (btrfs_header_level(eb) > 0) { - u32 nritems; - unsigned long end; - - nritems = btrfs_header_nritems(eb); end = btrfs_node_key_ptr_offset(nritems);
memset_extent_buffer(eb, 0, end, eb->len - end); + } else { + /* + * leaf: + * header 0 1 2 .. N ... data_N .. data_2 data_1 data_0 + */ + start = btrfs_item_nr_offset(nritems); + end = btrfs_leaf_data(eb) + + leaf_data_end(fs_info->tree_root, eb); + memset_extent_buffer(eb, 0, start, end - start); }
for (i = 0; i < num_pages; i++) {
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Filipe Manana fdmanana@suse.com
commit f177d73949bf758542ca15a1c1945bd2e802cc65 upstream.
We can not simply use the owner field from an extent buffer's header to get the id of the respective tree when the extent buffer is from a relocation tree. When we create the root for a relocation tree we leave (on purpose) the owner field with the same value as the subvolume's tree root (we do this at ctree.c:btrfs_copy_root()). So we must ignore extent buffers from relocation trees, which have the BTRFS_HEADER_FLAG_RELOC flag set, because otherwise we will always consider the extent buffer as not being the root of the tree (the root of original subvolume tree is always different from the root of the respective relocation tree).
This lead to assertion failures when running with the integrity checker enabled (CONFIG_BTRFS_FS_CHECK_INTEGRITY=y) such as the following:
[ 643.393409] BTRFS critical (device sdg): corrupt leaf, non-root leaf's nritems is 0: block=38506496, root=260, slot=0 [ 643.397609] BTRFS info (device sdg): leaf 38506496 total ptrs 0 free space 3995 [ 643.407075] assertion failed: 0, file: fs/btrfs/disk-io.c, line: 4078 [ 643.408425] ------------[ cut here ]------------ [ 643.409112] kernel BUG at fs/btrfs/ctree.h:3419! [ 643.409773] invalid opcode: 0000 [#1] PREEMPT SMP [ 643.410447] Modules linked in: dm_flakey dm_mod crc32c_generic btrfs xor raid6_pq ppdev psmouse acpi_cpufreq parport_pc evdev parport tpm_tis tpm_tis_core pcspkr serio_raw i2c_piix4 sg tpm i2c_core button processor loop autofs4 ext4 crc16 jbd2 mbcache sr_mod cdrom sd_mod ata_generic virtio_scsi ata_piix libata virtio_pci virtio_ring scsi_mod virtio e1000 floppy [ 643.414356] CPU: 11 PID: 32726 Comm: btrfs Not tainted 4.8.0-rc8-btrfs-next-35+ #1 [ 643.414356] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014 [ 643.414356] task: ffff880145e95b00 task.stack: ffff88014826c000 [ 643.414356] RIP: 0010:[<ffffffffa0352759>] [<ffffffffa0352759>] assfail.constprop.41+0x1c/0x1e [btrfs] [ 643.414356] RSP: 0018:ffff88014826fa28 EFLAGS: 00010292 [ 643.414356] RAX: 0000000000000039 RBX: ffff88014e2d7c38 RCX: 0000000000000001 [ 643.414356] RDX: ffff88023f4d2f58 RSI: ffffffff81806c63 RDI: 00000000ffffffff [ 643.414356] RBP: ffff88014826fa28 R08: 0000000000000001 R09: 0000000000000000 [ 643.414356] R10: ffff88014826f918 R11: ffffffff82f3c5ed R12: ffff880172910000 [ 643.414356] R13: ffff880233992230 R14: ffff8801a68a3310 R15: fffffffffffffff8 [ 643.414356] FS: 00007f9ca305e8c0(0000) GS:ffff88023f4c0000(0000) knlGS:0000000000000000 [ 643.414356] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 643.414356] CR2: 00007f9ca3071000 CR3: 000000015d01b000 CR4: 00000000000006e0 [ 643.414356] Stack: [ 643.414356] ffff88014826fa50 ffffffffa02d655a 000000000000000a ffff88014e2d7c38 [ 643.414356] 0000000000000000 ffff88014826faa8 ffffffffa02b72f3 ffff88014826fab8 [ 643.414356] 00ffffffa03228e4 0000000000000000 0000000000000000 ffff8801bbd4e000 [ 643.414356] Call Trace: [ 643.414356] [<ffffffffa02d655a>] btrfs_mark_buffer_dirty+0xdf/0xe5 [btrfs] [ 643.414356] [<ffffffffa02b72f3>] btrfs_copy_root+0x18a/0x1d1 [btrfs] [ 643.414356] [<ffffffffa0322921>] create_reloc_root+0x72/0x1ba [btrfs] [ 643.414356] [<ffffffffa03267c2>] btrfs_init_reloc_root+0x7b/0xa7 [btrfs] [ 643.414356] [<ffffffffa02d9e44>] record_root_in_trans+0xdf/0xed [btrfs] [ 643.414356] [<ffffffffa02db04e>] btrfs_record_root_in_trans+0x50/0x6a [btrfs] [ 643.414356] [<ffffffffa030ad2b>] create_subvol+0x472/0x773 [btrfs] [ 643.414356] [<ffffffffa030b406>] btrfs_mksubvol+0x3da/0x463 [btrfs] [ 643.414356] [<ffffffffa030b406>] ? btrfs_mksubvol+0x3da/0x463 [btrfs] [ 643.414356] [<ffffffff810781ac>] ? preempt_count_add+0x65/0x68 [ 643.414356] [<ffffffff811a6e97>] ? __mnt_want_write+0x62/0x77 [ 643.414356] [<ffffffffa030b55d>] btrfs_ioctl_snap_create_transid+0xce/0x187 [btrfs] [ 643.414356] [<ffffffffa030b67d>] btrfs_ioctl_snap_create+0x67/0x81 [btrfs] [ 643.414356] [<ffffffffa030ecfd>] btrfs_ioctl+0x508/0x20dd [btrfs] [ 643.414356] [<ffffffff81293e39>] ? __this_cpu_preempt_check+0x13/0x15 [ 643.414356] [<ffffffff81155eca>] ? handle_mm_fault+0x976/0x9ab [ 643.414356] [<ffffffff81091300>] ? arch_local_irq_save+0x9/0xc [ 643.414356] [<ffffffff8119a2b0>] vfs_ioctl+0x18/0x34 [ 643.414356] [<ffffffff8119a8e8>] do_vfs_ioctl+0x581/0x600 [ 643.414356] [<ffffffff814b9552>] ? entry_SYSCALL_64_fastpath+0x5/0xa8 [ 643.414356] [<ffffffff81093fe9>] ? trace_hardirqs_on_caller+0x17b/0x197 [ 643.414356] [<ffffffff8119a9be>] SyS_ioctl+0x57/0x79 [ 643.414356] [<ffffffff814b9565>] entry_SYSCALL_64_fastpath+0x18/0xa8 [ 643.414356] [<ffffffff81091b08>] ? trace_hardirqs_off_caller+0x3f/0xaa [ 643.414356] Code: 89 83 88 00 00 00 31 c0 5b 41 5c 41 5d 5d c3 55 89 f1 48 c7 c2 98 bc 35 a0 48 89 fe 48 c7 c7 05 be 35 a0 48 89 e5 e8 13 46 dd e0 <0f> 0b 55 89 f1 48 c7 c2 9f d3 35 a0 48 89 fe 48 c7 c7 7a d5 35 [ 643.414356] RIP [<ffffffffa0352759>] assfail.constprop.41+0x1c/0x1e [btrfs] [ 643.414356] RSP <ffff88014826fa28> [ 643.468267] ---[ end trace 6a1b3fb1a9d7d6e3 ]---
This can be easily reproduced by running xfstests with the integrity checker enabled.
Fixes: 1ba98d086fe3 (Btrfs: detect corruption when non-root leaf has zero item) Signed-off-by: Filipe Manana fdmanana@suse.com Reviewed-by: Liu Bo bo.li.liu@oracle.com Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/disk-io.c | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-)
--- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -536,7 +536,15 @@ static noinline int check_leaf(struct bt u32 nritems = btrfs_header_nritems(leaf); int slot;
- if (nritems == 0) { + /* + * Extent buffers from a relocation tree have a owner field that + * corresponds to the subvolume tree they are based on. So just from an + * extent buffer alone we can not find out what is the id of the + * corresponding subvolume tree, so we can not figure out if the extent + * buffer corresponds to the root of the relocation tree or not. So skip + * this check for relocation trees. + */ + if (nritems == 0 && !btrfs_header_flag(leaf, BTRFS_HEADER_FLAG_RELOC)) { struct btrfs_root *check_root;
key.objectid = btrfs_header_owner(leaf); @@ -564,6 +572,9 @@ static noinline int check_leaf(struct bt return 0; }
+ if (nritems == 0) + return 0; + /* Check the 0 item */ if (btrfs_item_offset_nr(leaf, 0) + btrfs_item_size_nr(leaf, 0) != BTRFS_LEAF_DATA_SIZE(root)) {
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jeff Mahoney jeffm@suse.com
commit 1cbb1f454e5321e47fc1e6b233066c7ccc979d15 upstream.
We have reader helpers for most of the on-disk structures that use an extent_buffer and pointer as offset into the buffer that are read-only. We should mark them as const and, in turn, allow consumers of these interfaces to mark the buffers const as well.
No impact on code, but serves as documentation that a buffer is intended not to be modified.
Signed-off-by: Jeff Mahoney jeffm@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/ctree.h | 128 ++++++++++++++++++++++++------------------------ fs/btrfs/extent_io.c | 24 ++++----- fs/btrfs/extent_io.h | 19 +++---- fs/btrfs/struct-funcs.c | 9 +-- 4 files changed, 91 insertions(+), 89 deletions(-)
--- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -2283,7 +2283,7 @@ do { #define BTRFS_INODE_ROOT_ITEM_INIT (1 << 31)
struct btrfs_map_token { - struct extent_buffer *eb; + const struct extent_buffer *eb; char *kaddr; unsigned long offset; }; @@ -2314,18 +2314,19 @@ static inline void btrfs_init_map_token sizeof(((type *)0)->member)))
#define DECLARE_BTRFS_SETGET_BITS(bits) \ -u##bits btrfs_get_token_##bits(struct extent_buffer *eb, void *ptr, \ - unsigned long off, \ - struct btrfs_map_token *token); \ -void btrfs_set_token_##bits(struct extent_buffer *eb, void *ptr, \ +u##bits btrfs_get_token_##bits(const struct extent_buffer *eb, \ + const void *ptr, unsigned long off, \ + struct btrfs_map_token *token); \ +void btrfs_set_token_##bits(struct extent_buffer *eb, const void *ptr, \ unsigned long off, u##bits val, \ struct btrfs_map_token *token); \ -static inline u##bits btrfs_get_##bits(struct extent_buffer *eb, void *ptr, \ +static inline u##bits btrfs_get_##bits(const struct extent_buffer *eb, \ + const void *ptr, \ unsigned long off) \ { \ return btrfs_get_token_##bits(eb, ptr, off, NULL); \ } \ -static inline void btrfs_set_##bits(struct extent_buffer *eb, void *ptr, \ +static inline void btrfs_set_##bits(struct extent_buffer *eb, void *ptr,\ unsigned long off, u##bits val) \ { \ btrfs_set_token_##bits(eb, ptr, off, val, NULL); \ @@ -2337,7 +2338,8 @@ DECLARE_BTRFS_SETGET_BITS(32) DECLARE_BTRFS_SETGET_BITS(64)
#define BTRFS_SETGET_FUNCS(name, type, member, bits) \ -static inline u##bits btrfs_##name(struct extent_buffer *eb, type *s) \ +static inline u##bits btrfs_##name(const struct extent_buffer *eb, \ + const type *s) \ { \ BUILD_BUG_ON(sizeof(u##bits) != sizeof(((type *)0))->member); \ return btrfs_get_##bits(eb, s, offsetof(type, member)); \ @@ -2348,7 +2350,8 @@ static inline void btrfs_set_##name(stru BUILD_BUG_ON(sizeof(u##bits) != sizeof(((type *)0))->member); \ btrfs_set_##bits(eb, s, offsetof(type, member), val); \ } \ -static inline u##bits btrfs_token_##name(struct extent_buffer *eb, type *s, \ +static inline u##bits btrfs_token_##name(const struct extent_buffer *eb,\ + const type *s, \ struct btrfs_map_token *token) \ { \ BUILD_BUG_ON(sizeof(u##bits) != sizeof(((type *)0))->member); \ @@ -2363,9 +2366,9 @@ static inline void btrfs_set_token_##nam }
#define BTRFS_SETGET_HEADER_FUNCS(name, type, member, bits) \ -static inline u##bits btrfs_##name(struct extent_buffer *eb) \ +static inline u##bits btrfs_##name(const struct extent_buffer *eb) \ { \ - type *p = page_address(eb->pages[0]); \ + const type *p = page_address(eb->pages[0]); \ u##bits res = le##bits##_to_cpu(p->member); \ return res; \ } \ @@ -2377,7 +2380,7 @@ static inline void btrfs_set_##name(stru }
#define BTRFS_SETGET_STACK_FUNCS(name, type, member, bits) \ -static inline u##bits btrfs_##name(type *s) \ +static inline u##bits btrfs_##name(const type *s) \ { \ return le##bits##_to_cpu(s->member); \ } \ @@ -2678,7 +2681,7 @@ static inline unsigned long btrfs_node_k sizeof(struct btrfs_key_ptr) * nr; }
-void btrfs_node_key(struct extent_buffer *eb, +void btrfs_node_key(const struct extent_buffer *eb, struct btrfs_disk_key *disk_key, int nr);
static inline void btrfs_set_node_key(struct extent_buffer *eb, @@ -2707,28 +2710,28 @@ static inline struct btrfs_item *btrfs_i return (struct btrfs_item *)btrfs_item_nr_offset(nr); }
-static inline u32 btrfs_item_end(struct extent_buffer *eb, +static inline u32 btrfs_item_end(const struct extent_buffer *eb, struct btrfs_item *item) { return btrfs_item_offset(eb, item) + btrfs_item_size(eb, item); }
-static inline u32 btrfs_item_end_nr(struct extent_buffer *eb, int nr) +static inline u32 btrfs_item_end_nr(const struct extent_buffer *eb, int nr) { return btrfs_item_end(eb, btrfs_item_nr(nr)); }
-static inline u32 btrfs_item_offset_nr(struct extent_buffer *eb, int nr) +static inline u32 btrfs_item_offset_nr(const struct extent_buffer *eb, int nr) { return btrfs_item_offset(eb, btrfs_item_nr(nr)); }
-static inline u32 btrfs_item_size_nr(struct extent_buffer *eb, int nr) +static inline u32 btrfs_item_size_nr(const struct extent_buffer *eb, int nr) { return btrfs_item_size(eb, btrfs_item_nr(nr)); }
-static inline void btrfs_item_key(struct extent_buffer *eb, +static inline void btrfs_item_key(const struct extent_buffer *eb, struct btrfs_disk_key *disk_key, int nr) { struct btrfs_item *item = btrfs_item_nr(nr); @@ -2764,8 +2767,8 @@ BTRFS_SETGET_STACK_FUNCS(stack_dir_name_ BTRFS_SETGET_STACK_FUNCS(stack_dir_transid, struct btrfs_dir_item, transid, 64);
-static inline void btrfs_dir_item_key(struct extent_buffer *eb, - struct btrfs_dir_item *item, +static inline void btrfs_dir_item_key(const struct extent_buffer *eb, + const struct btrfs_dir_item *item, struct btrfs_disk_key *key) { read_eb_member(eb, item, struct btrfs_dir_item, location, key); @@ -2773,7 +2776,7 @@ static inline void btrfs_dir_item_key(st
static inline void btrfs_set_dir_item_key(struct extent_buffer *eb, struct btrfs_dir_item *item, - struct btrfs_disk_key *key) + const struct btrfs_disk_key *key) { write_eb_member(eb, item, struct btrfs_dir_item, location, key); } @@ -2785,8 +2788,8 @@ BTRFS_SETGET_FUNCS(free_space_bitmaps, s BTRFS_SETGET_FUNCS(free_space_generation, struct btrfs_free_space_header, generation, 64);
-static inline void btrfs_free_space_key(struct extent_buffer *eb, - struct btrfs_free_space_header *h, +static inline void btrfs_free_space_key(const struct extent_buffer *eb, + const struct btrfs_free_space_header *h, struct btrfs_disk_key *key) { read_eb_member(eb, h, struct btrfs_free_space_header, location, key); @@ -2794,7 +2797,7 @@ static inline void btrfs_free_space_key(
static inline void btrfs_set_free_space_key(struct extent_buffer *eb, struct btrfs_free_space_header *h, - struct btrfs_disk_key *key) + const struct btrfs_disk_key *key) { write_eb_member(eb, h, struct btrfs_free_space_header, location, key); } @@ -2821,25 +2824,25 @@ static inline void btrfs_cpu_key_to_disk disk->objectid = cpu_to_le64(cpu->objectid); }
-static inline void btrfs_node_key_to_cpu(struct extent_buffer *eb, - struct btrfs_key *key, int nr) +static inline void btrfs_node_key_to_cpu(const struct extent_buffer *eb, + struct btrfs_key *key, int nr) { struct btrfs_disk_key disk_key; btrfs_node_key(eb, &disk_key, nr); btrfs_disk_key_to_cpu(key, &disk_key); }
-static inline void btrfs_item_key_to_cpu(struct extent_buffer *eb, - struct btrfs_key *key, int nr) +static inline void btrfs_item_key_to_cpu(const struct extent_buffer *eb, + struct btrfs_key *key, int nr) { struct btrfs_disk_key disk_key; btrfs_item_key(eb, &disk_key, nr); btrfs_disk_key_to_cpu(key, &disk_key); }
-static inline void btrfs_dir_item_key_to_cpu(struct extent_buffer *eb, - struct btrfs_dir_item *item, - struct btrfs_key *key) +static inline void btrfs_dir_item_key_to_cpu(const struct extent_buffer *eb, + const struct btrfs_dir_item *item, + struct btrfs_key *key) { struct btrfs_disk_key disk_key; btrfs_dir_item_key(eb, item, &disk_key); @@ -2872,7 +2875,7 @@ BTRFS_SETGET_STACK_FUNCS(stack_header_nr nritems, 32); BTRFS_SETGET_STACK_FUNCS(stack_header_bytenr, struct btrfs_header, bytenr, 64);
-static inline int btrfs_header_flag(struct extent_buffer *eb, u64 flag) +static inline int btrfs_header_flag(const struct extent_buffer *eb, u64 flag) { return (btrfs_header_flags(eb) & flag) == flag; } @@ -2891,7 +2894,7 @@ static inline int btrfs_clear_header_fla return (flags & flag) == flag; }
-static inline int btrfs_header_backref_rev(struct extent_buffer *eb) +static inline int btrfs_header_backref_rev(const struct extent_buffer *eb) { u64 flags = btrfs_header_flags(eb); return flags >> BTRFS_BACKREF_REV_SHIFT; @@ -2911,12 +2914,12 @@ static inline unsigned long btrfs_header return offsetof(struct btrfs_header, fsid); }
-static inline unsigned long btrfs_header_chunk_tree_uuid(struct extent_buffer *eb) +static inline unsigned long btrfs_header_chunk_tree_uuid(const struct extent_buffer *eb) { return offsetof(struct btrfs_header, chunk_tree_uuid); }
-static inline int btrfs_is_leaf(struct extent_buffer *eb) +static inline int btrfs_is_leaf(const struct extent_buffer *eb) { return btrfs_header_level(eb) == 0; } @@ -2950,12 +2953,12 @@ BTRFS_SETGET_STACK_FUNCS(root_stransid, BTRFS_SETGET_STACK_FUNCS(root_rtransid, struct btrfs_root_item, rtransid, 64);
-static inline bool btrfs_root_readonly(struct btrfs_root *root) +static inline bool btrfs_root_readonly(const struct btrfs_root *root) { return (root->root_item.flags & cpu_to_le64(BTRFS_ROOT_SUBVOL_RDONLY)) != 0; }
-static inline bool btrfs_root_dead(struct btrfs_root *root) +static inline bool btrfs_root_dead(const struct btrfs_root *root) { return (root->root_item.flags & cpu_to_le64(BTRFS_ROOT_SUBVOL_DEAD)) != 0; } @@ -3012,51 +3015,51 @@ BTRFS_SETGET_STACK_FUNCS(backup_num_devi /* struct btrfs_balance_item */ BTRFS_SETGET_FUNCS(balance_flags, struct btrfs_balance_item, flags, 64);
-static inline void btrfs_balance_data(struct extent_buffer *eb, - struct btrfs_balance_item *bi, +static inline void btrfs_balance_data(const struct extent_buffer *eb, + const struct btrfs_balance_item *bi, struct btrfs_disk_balance_args *ba) { read_eb_member(eb, bi, struct btrfs_balance_item, data, ba); }
static inline void btrfs_set_balance_data(struct extent_buffer *eb, - struct btrfs_balance_item *bi, - struct btrfs_disk_balance_args *ba) + struct btrfs_balance_item *bi, + const struct btrfs_disk_balance_args *ba) { write_eb_member(eb, bi, struct btrfs_balance_item, data, ba); }
-static inline void btrfs_balance_meta(struct extent_buffer *eb, - struct btrfs_balance_item *bi, +static inline void btrfs_balance_meta(const struct extent_buffer *eb, + const struct btrfs_balance_item *bi, struct btrfs_disk_balance_args *ba) { read_eb_member(eb, bi, struct btrfs_balance_item, meta, ba); }
static inline void btrfs_set_balance_meta(struct extent_buffer *eb, - struct btrfs_balance_item *bi, - struct btrfs_disk_balance_args *ba) + struct btrfs_balance_item *bi, + const struct btrfs_disk_balance_args *ba) { write_eb_member(eb, bi, struct btrfs_balance_item, meta, ba); }
-static inline void btrfs_balance_sys(struct extent_buffer *eb, - struct btrfs_balance_item *bi, +static inline void btrfs_balance_sys(const struct extent_buffer *eb, + const struct btrfs_balance_item *bi, struct btrfs_disk_balance_args *ba) { read_eb_member(eb, bi, struct btrfs_balance_item, sys, ba); }
static inline void btrfs_set_balance_sys(struct extent_buffer *eb, - struct btrfs_balance_item *bi, - struct btrfs_disk_balance_args *ba) + struct btrfs_balance_item *bi, + const struct btrfs_disk_balance_args *ba) { write_eb_member(eb, bi, struct btrfs_balance_item, sys, ba); }
static inline void btrfs_disk_balance_args_to_cpu(struct btrfs_balance_args *cpu, - struct btrfs_disk_balance_args *disk) + const struct btrfs_disk_balance_args *disk) { memset(cpu, 0, sizeof(*cpu));
@@ -3076,7 +3079,7 @@ btrfs_disk_balance_args_to_cpu(struct bt
static inline void btrfs_cpu_balance_args_to_disk(struct btrfs_disk_balance_args *disk, - struct btrfs_balance_args *cpu) + const struct btrfs_balance_args *cpu) { memset(disk, 0, sizeof(*disk));
@@ -3144,7 +3147,7 @@ BTRFS_SETGET_STACK_FUNCS(super_magic, st BTRFS_SETGET_STACK_FUNCS(super_uuid_tree_generation, struct btrfs_super_block, uuid_tree_generation, 64);
-static inline int btrfs_super_csum_size(struct btrfs_super_block *s) +static inline int btrfs_super_csum_size(const struct btrfs_super_block *s) { u16 t = btrfs_super_csum_type(s); /* @@ -3163,8 +3166,8 @@ static inline unsigned long btrfs_leaf_d * this returns the address of the start of the last item, * which is the stop of the leaf data stack */ -static inline unsigned int leaf_data_end(struct btrfs_root *root, - struct extent_buffer *leaf) +static inline unsigned int leaf_data_end(const struct btrfs_root *root, + const struct extent_buffer *leaf) { u32 nr = btrfs_header_nritems(leaf);
@@ -3189,7 +3192,7 @@ BTRFS_SETGET_STACK_FUNCS(stack_file_exte struct btrfs_file_extent_item, compression, 8);
static inline unsigned long -btrfs_file_extent_inline_start(struct btrfs_file_extent_item *e) +btrfs_file_extent_inline_start(const struct btrfs_file_extent_item *e) { return (unsigned long)e + BTRFS_FILE_EXTENT_INLINE_DATA_START; } @@ -3223,8 +3226,9 @@ BTRFS_SETGET_FUNCS(file_extent_other_enc * size of any extent headers. If a file is compressed on disk, this is * the compressed size */ -static inline u32 btrfs_file_extent_inline_item_len(struct extent_buffer *eb, - struct btrfs_item *e) +static inline u32 btrfs_file_extent_inline_item_len( + const struct extent_buffer *eb, + struct btrfs_item *e) { return btrfs_item_size(eb, e) - BTRFS_FILE_EXTENT_INLINE_DATA_START; } @@ -3232,9 +3236,9 @@ static inline u32 btrfs_file_extent_inli /* this returns the number of file bytes represented by the inline item. * If an item is compressed, this is the uncompressed size */ -static inline u32 btrfs_file_extent_inline_len(struct extent_buffer *eb, - int slot, - struct btrfs_file_extent_item *fi) +static inline u32 btrfs_file_extent_inline_len(const struct extent_buffer *eb, + int slot, + const struct btrfs_file_extent_item *fi) { struct btrfs_map_token token;
@@ -3256,8 +3260,8 @@ static inline u32 btrfs_file_extent_inli
/* btrfs_dev_stats_item */ -static inline u64 btrfs_dev_stats_value(struct extent_buffer *eb, - struct btrfs_dev_stats_item *ptr, +static inline u64 btrfs_dev_stats_value(const struct extent_buffer *eb, + const struct btrfs_dev_stats_item *ptr, int index) { u64 val; --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -5381,9 +5381,8 @@ unlock_exit: return ret; }
-void read_extent_buffer(struct extent_buffer *eb, void *dstv, - unsigned long start, - unsigned long len) +void read_extent_buffer(const struct extent_buffer *eb, void *dstv, + unsigned long start, unsigned long len) { size_t cur; size_t offset; @@ -5412,9 +5411,9 @@ void read_extent_buffer(struct extent_bu } }
-int read_extent_buffer_to_user(struct extent_buffer *eb, void __user *dstv, - unsigned long start, - unsigned long len) +int read_extent_buffer_to_user(const struct extent_buffer *eb, + void __user *dstv, + unsigned long start, unsigned long len) { size_t cur; size_t offset; @@ -5449,10 +5448,10 @@ int read_extent_buffer_to_user(struct ex return ret; }
-int map_private_extent_buffer(struct extent_buffer *eb, unsigned long start, - unsigned long min_len, char **map, - unsigned long *map_start, - unsigned long *map_len) +int map_private_extent_buffer(const struct extent_buffer *eb, + unsigned long start, unsigned long min_len, + char **map, unsigned long *map_start, + unsigned long *map_len) { size_t offset = start & (PAGE_CACHE_SIZE - 1); char *kaddr; @@ -5487,9 +5486,8 @@ int map_private_extent_buffer(struct ext return 0; }
-int memcmp_extent_buffer(struct extent_buffer *eb, const void *ptrv, - unsigned long start, - unsigned long len) +int memcmp_extent_buffer(const struct extent_buffer *eb, const void *ptrv, + unsigned long start, unsigned long len) { size_t cur; size_t offset; --- a/fs/btrfs/extent_io.h +++ b/fs/btrfs/extent_io.h @@ -308,14 +308,13 @@ static inline void extent_buffer_get(str atomic_inc(&eb->refs); }
-int memcmp_extent_buffer(struct extent_buffer *eb, const void *ptrv, - unsigned long start, - unsigned long len); -void read_extent_buffer(struct extent_buffer *eb, void *dst, +int memcmp_extent_buffer(const struct extent_buffer *eb, const void *ptrv, + unsigned long start, unsigned long len); +void read_extent_buffer(const struct extent_buffer *eb, void *dst, unsigned long start, unsigned long len); -int read_extent_buffer_to_user(struct extent_buffer *eb, void __user *dst, - unsigned long start, +int read_extent_buffer_to_user(const struct extent_buffer *eb, + void __user *dst, unsigned long start, unsigned long len); void write_extent_buffer(struct extent_buffer *eb, const void *src, unsigned long start, unsigned long len); @@ -334,10 +333,10 @@ int set_extent_buffer_uptodate(struct ex int clear_extent_buffer_uptodate(struct extent_buffer *eb); int extent_buffer_uptodate(struct extent_buffer *eb); int extent_buffer_under_io(struct extent_buffer *eb); -int map_private_extent_buffer(struct extent_buffer *eb, unsigned long offset, - unsigned long min_len, char **map, - unsigned long *map_start, - unsigned long *map_len); +int map_private_extent_buffer(const struct extent_buffer *eb, + unsigned long offset, unsigned long min_len, + char **map, unsigned long *map_start, + unsigned long *map_len); int extent_range_clear_dirty_for_io(struct inode *inode, u64 start, u64 end); int extent_range_redirty_for_io(struct inode *inode, u64 start, u64 end); int extent_clear_unlock_delalloc(struct inode *inode, u64 start, u64 end, --- a/fs/btrfs/struct-funcs.c +++ b/fs/btrfs/struct-funcs.c @@ -50,8 +50,8 @@ static inline void put_unaligned_le8(u8 */
#define DEFINE_BTRFS_SETGET_BITS(bits) \ -u##bits btrfs_get_token_##bits(struct extent_buffer *eb, void *ptr, \ - unsigned long off, \ +u##bits btrfs_get_token_##bits(const struct extent_buffer *eb, \ + const void *ptr, unsigned long off, \ struct btrfs_map_token *token) \ { \ unsigned long part_offset = (unsigned long)ptr; \ @@ -90,7 +90,8 @@ u##bits btrfs_get_token_##bits(struct ex return res; \ } \ void btrfs_set_token_##bits(struct extent_buffer *eb, \ - void *ptr, unsigned long off, u##bits val, \ + const void *ptr, unsigned long off, \ + u##bits val, \ struct btrfs_map_token *token) \ { \ unsigned long part_offset = (unsigned long)ptr; \ @@ -133,7 +134,7 @@ DEFINE_BTRFS_SETGET_BITS(16) DEFINE_BTRFS_SETGET_BITS(32) DEFINE_BTRFS_SETGET_BITS(64)
-void btrfs_node_key(struct extent_buffer *eb, +void btrfs_node_key(const struct extent_buffer *eb, struct btrfs_disk_key *disk_key, int nr) { unsigned long ptr = btrfs_node_key_ptr_offset(nr);
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Qu Wenruo quwenruo.btrfs@gmx.com
commit c3267bbaa9cae09b62960eafe33ad19196803285 upstream.
Current check_leaf() function does a good job checking key order and item offset/size.
However it only checks from slot 0 to the last but one slot, this is good but makes later expansion hard.
So this refactoring iterates from slot 0 to the last slot. For key comparison, it uses a key with all 0 as initial key, so all valid keys should be larger than that.
And for item size/offset checks, it compares current item end with previous item offset. For slot 0, use leaf end as a special case.
This makes later item/key offset checks and item size checks easier to be implemented.
Also, makes check_leaf() to return -EUCLEAN other than -EIO to indicate error.
Signed-off-by: Qu Wenruo quwenruo.btrfs@gmx.com Reviewed-by: Nikolay Borisov nborisov@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com [bwh: Backported to 4.4: - BTRFS_LEAF_DATA_SIZE() takes a root rather than an fs_info - Adjust context] Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/disk-io.c | 50 +++++++++++++++++++++++++++----------------------- 1 file changed, 27 insertions(+), 23 deletions(-)
--- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -531,8 +531,9 @@ static int check_tree_block_fsid(struct static noinline int check_leaf(struct btrfs_root *root, struct extent_buffer *leaf) { + /* No valid key type is 0, so all key should be larger than this key */ + struct btrfs_key prev_key = {0, 0, 0}; struct btrfs_key key; - struct btrfs_key leaf_key; u32 nritems = btrfs_header_nritems(leaf); int slot;
@@ -565,7 +566,7 @@ static noinline int check_leaf(struct bt CORRUPT("non-root leaf's nritems is 0", leaf, check_root, 0); free_extent_buffer(eb); - return -EIO; + return -EUCLEAN; } free_extent_buffer(eb); } @@ -575,28 +576,23 @@ static noinline int check_leaf(struct bt if (nritems == 0) return 0;
- /* Check the 0 item */ - if (btrfs_item_offset_nr(leaf, 0) + btrfs_item_size_nr(leaf, 0) != - BTRFS_LEAF_DATA_SIZE(root)) { - CORRUPT("invalid item offset size pair", leaf, root, 0); - return -EIO; - } - /* - * Check to make sure each items keys are in the correct order and their - * offsets make sense. We only have to loop through nritems-1 because - * we check the current slot against the next slot, which verifies the - * next slot's offset+size makes sense and that the current's slot - * offset is correct. + * Check the following things to make sure this is a good leaf, and + * leaf users won't need to bother with similar sanity checks: + * + * 1) key order + * 2) item offset and size + * No overlap, no hole, all inside the leaf. */ - for (slot = 0; slot < nritems - 1; slot++) { - btrfs_item_key_to_cpu(leaf, &leaf_key, slot); - btrfs_item_key_to_cpu(leaf, &key, slot + 1); + for (slot = 0; slot < nritems; slot++) { + u32 item_end_expected; + + btrfs_item_key_to_cpu(leaf, &key, slot);
/* Make sure the keys are in the right order */ - if (btrfs_comp_cpu_keys(&leaf_key, &key) >= 0) { + if (btrfs_comp_cpu_keys(&prev_key, &key) >= 0) { CORRUPT("bad key order", leaf, root, slot); - return -EIO; + return -EUCLEAN; }
/* @@ -604,10 +600,14 @@ static noinline int check_leaf(struct bt * item data starts at the end of the leaf and grows towards the * front. */ - if (btrfs_item_offset_nr(leaf, slot) != - btrfs_item_end_nr(leaf, slot + 1)) { + if (slot == 0) + item_end_expected = BTRFS_LEAF_DATA_SIZE(root); + else + item_end_expected = btrfs_item_offset_nr(leaf, + slot - 1); + if (btrfs_item_end_nr(leaf, slot) != item_end_expected) { CORRUPT("slot offset bad", leaf, root, slot); - return -EIO; + return -EUCLEAN; }
/* @@ -618,8 +618,12 @@ static noinline int check_leaf(struct bt if (btrfs_item_end_nr(leaf, slot) > BTRFS_LEAF_DATA_SIZE(root)) { CORRUPT("slot end outside of leaf", leaf, root, slot); - return -EIO; + return -EUCLEAN; } + + prev_key.objectid = key.objectid; + prev_key.type = key.type; + prev_key.offset = key.offset; }
return 0;
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Qu Wenruo quwenruo.btrfs@gmx.com
commit 7f43d4affb2a254d421ab20b0cf65ac2569909fb upstream.
Function check_leaf() checks if any item pointer points outside of the leaf, but it doesn't check if the pointer overlaps with the item itself.
Normally only the last item may be the victim, but adding such check is never a bad idea anyway.
Signed-off-by: Qu Wenruo quwenruo.btrfs@gmx.com Reviewed-by: Nikolay Borisov nborisov@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/disk-io.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -621,6 +621,13 @@ static noinline int check_leaf(struct bt return -EUCLEAN; }
+ /* Also check if the item pointer overlaps with btrfs item. */ + if (btrfs_item_nr_offset(slot) + sizeof(struct btrfs_item) > + btrfs_item_ptr_offset(leaf, slot)) { + CORRUPT("slot overlap with its data", leaf, root, slot); + return -EUCLEAN; + } + prev_key.objectid = key.objectid; prev_key.type = key.type; prev_key.offset = key.offset;
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Qu Wenruo quwenruo.btrfs@gmx.com
commit 40c3c40947324d9f40bf47830c92c59a9bbadf4a upstream.
Add extra checks for item with EXTENT_DATA type. This checks the following thing:
0) Key offset All key offsets must be aligned to sectorsize. Inline extent must have 0 for key offset.
1) Item size Uncompressed inline file extent size must match item size. (Compressed inline file extent has no information about its on-disk size.) Regular/preallocated file extent size must be a fixed value.
2) Every member of regular file extent item Including alignment for bytenr and offset, possible value for compression/encryption/type.
3) Type/compression/encode must be one of the valid values.
This should be the most comprehensive and strict check in the context of btrfs_item for EXTENT_DATA.
Signed-off-by: Qu Wenruo quwenruo.btrfs@gmx.com Reviewed-by: Nikolay Borisov nborisov@suse.com Reviewed-by: David Sterba dsterba@suse.com [ switch to BTRFS_FILE_EXTENT_TYPES, similar to what BTRFS_COMPRESS_TYPES does ] Signed-off-by: David Sterba dsterba@suse.com [bwh: Backported to 4.4: - Use root->sectorsize instead of root->fs_info->sectorsize - Adjust filename] Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/ctree.h | 1 fs/btrfs/disk-io.c | 103 +++++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 104 insertions(+)
--- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -897,6 +897,7 @@ struct btrfs_balance_item { #define BTRFS_FILE_EXTENT_INLINE 0 #define BTRFS_FILE_EXTENT_REG 1 #define BTRFS_FILE_EXTENT_PREALLOC 2 +#define BTRFS_FILE_EXTENT_TYPES 2
struct btrfs_file_extent_item { /* --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -528,6 +528,100 @@ static int check_tree_block_fsid(struct btrfs_header_level(eb) == 0 ? "leaf" : "node",\ reason, btrfs_header_bytenr(eb), root->objectid, slot)
+static int check_extent_data_item(struct btrfs_root *root, + struct extent_buffer *leaf, + struct btrfs_key *key, int slot) +{ + struct btrfs_file_extent_item *fi; + u32 sectorsize = root->sectorsize; + u32 item_size = btrfs_item_size_nr(leaf, slot); + + if (!IS_ALIGNED(key->offset, sectorsize)) { + CORRUPT("unaligned key offset for file extent", + leaf, root, slot); + return -EUCLEAN; + } + + fi = btrfs_item_ptr(leaf, slot, struct btrfs_file_extent_item); + + if (btrfs_file_extent_type(leaf, fi) > BTRFS_FILE_EXTENT_TYPES) { + CORRUPT("invalid file extent type", leaf, root, slot); + return -EUCLEAN; + } + + /* + * Support for new compression/encrption must introduce incompat flag, + * and must be caught in open_ctree(). + */ + if (btrfs_file_extent_compression(leaf, fi) > BTRFS_COMPRESS_TYPES) { + CORRUPT("invalid file extent compression", leaf, root, slot); + return -EUCLEAN; + } + if (btrfs_file_extent_encryption(leaf, fi)) { + CORRUPT("invalid file extent encryption", leaf, root, slot); + return -EUCLEAN; + } + if (btrfs_file_extent_type(leaf, fi) == BTRFS_FILE_EXTENT_INLINE) { + /* Inline extent must have 0 as key offset */ + if (key->offset) { + CORRUPT("inline extent has non-zero key offset", + leaf, root, slot); + return -EUCLEAN; + } + + /* Compressed inline extent has no on-disk size, skip it */ + if (btrfs_file_extent_compression(leaf, fi) != + BTRFS_COMPRESS_NONE) + return 0; + + /* Uncompressed inline extent size must match item size */ + if (item_size != BTRFS_FILE_EXTENT_INLINE_DATA_START + + btrfs_file_extent_ram_bytes(leaf, fi)) { + CORRUPT("plaintext inline extent has invalid size", + leaf, root, slot); + return -EUCLEAN; + } + return 0; + } + + /* Regular or preallocated extent has fixed item size */ + if (item_size != sizeof(*fi)) { + CORRUPT( + "regluar or preallocated extent data item size is invalid", + leaf, root, slot); + return -EUCLEAN; + } + if (!IS_ALIGNED(btrfs_file_extent_ram_bytes(leaf, fi), sectorsize) || + !IS_ALIGNED(btrfs_file_extent_disk_bytenr(leaf, fi), sectorsize) || + !IS_ALIGNED(btrfs_file_extent_disk_num_bytes(leaf, fi), sectorsize) || + !IS_ALIGNED(btrfs_file_extent_offset(leaf, fi), sectorsize) || + !IS_ALIGNED(btrfs_file_extent_num_bytes(leaf, fi), sectorsize)) { + CORRUPT( + "regular or preallocated extent data item has unaligned value", + leaf, root, slot); + return -EUCLEAN; + } + + return 0; +} + +/* + * Common point to switch the item-specific validation. + */ +static int check_leaf_item(struct btrfs_root *root, + struct extent_buffer *leaf, + struct btrfs_key *key, int slot) +{ + int ret = 0; + + switch (key->type) { + case BTRFS_EXTENT_DATA_KEY: + ret = check_extent_data_item(root, leaf, key, slot); + break; + } + return ret; +} + static noinline int check_leaf(struct btrfs_root *root, struct extent_buffer *leaf) { @@ -583,9 +677,13 @@ static noinline int check_leaf(struct bt * 1) key order * 2) item offset and size * No overlap, no hole, all inside the leaf. + * 3) item content + * If possible, do comprehensive sanity check. + * NOTE: All checks must only rely on the item data itself. */ for (slot = 0; slot < nritems; slot++) { u32 item_end_expected; + int ret;
btrfs_item_key_to_cpu(leaf, &key, slot);
@@ -628,6 +726,11 @@ static noinline int check_leaf(struct bt return -EUCLEAN; }
+ /* Check if the item size and content meet other criteria */ + ret = check_leaf_item(root, leaf, &key, slot); + if (ret < 0) + return ret; + prev_key.objectid = key.objectid; prev_key.type = key.type; prev_key.offset = key.offset;
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Qu Wenruo quwenruo.btrfs@gmx.com
commit 4b865cab96fe2a30ed512cf667b354bd291b3b0a upstream.
EXTENT_CSUM checker is a relatively easy one, only needs to check:
1) Objectid Fixed to BTRFS_EXTENT_CSUM_OBJECTID
2) Key offset alignment Must be aligned to sectorsize
3) Item size alignedment Must be aligned to csum size
Signed-off-by: Qu Wenruo quwenruo.btrfs@gmx.com Reviewed-by: Nikolay Borisov nborisov@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com [bwh: Backported to 4.4: Use root->sectorsize instead of root->fs_info->sectorsize] Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/disk-io.c | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+)
--- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -605,6 +605,27 @@ static int check_extent_data_item(struct return 0; }
+static int check_csum_item(struct btrfs_root *root, struct extent_buffer *leaf, + struct btrfs_key *key, int slot) +{ + u32 sectorsize = root->sectorsize; + u32 csumsize = btrfs_super_csum_size(root->fs_info->super_copy); + + if (key->objectid != BTRFS_EXTENT_CSUM_OBJECTID) { + CORRUPT("invalid objectid for csum item", leaf, root, slot); + return -EUCLEAN; + } + if (!IS_ALIGNED(key->offset, sectorsize)) { + CORRUPT("unaligned key offset for csum item", leaf, root, slot); + return -EUCLEAN; + } + if (!IS_ALIGNED(btrfs_item_size_nr(leaf, slot), csumsize)) { + CORRUPT("unaligned csum item size", leaf, root, slot); + return -EUCLEAN; + } + return 0; +} + /* * Common point to switch the item-specific validation. */ @@ -618,6 +639,9 @@ static int check_leaf_item(struct btrfs_ case BTRFS_EXTENT_DATA_KEY: ret = check_extent_data_item(root, leaf, key, slot); break; + case BTRFS_EXTENT_CSUM_KEY: + ret = check_csum_item(root, leaf, key, slot); + break; } return ret; }
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Qu Wenruo quwenruo.btrfs@gmx.com
commit 557ea5dd003d371536f6b4e8f7c8209a2b6fd4e3 upstream.
It's no doubt the comprehensive tree block checker will become larger, so moving them into their own files is quite reasonable.
Signed-off-by: Qu Wenruo quwenruo.btrfs@gmx.com [ wording adjustments ] Signed-off-by: David Sterba dsterba@suse.com [bwh: Backported to 4.4: - The moved code is slightly different - Adjust context] Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/Makefile | 2 fs/btrfs/disk-io.c | 284 -------------------------------------------- fs/btrfs/tree-checker.c | 309 ++++++++++++++++++++++++++++++++++++++++++++++++ fs/btrfs/tree-checker.h | 26 ++++ 4 files changed, 340 insertions(+), 281 deletions(-) create mode 100644 fs/btrfs/tree-checker.c create mode 100644 fs/btrfs/tree-checker.h
--- a/fs/btrfs/Makefile +++ b/fs/btrfs/Makefile @@ -9,7 +9,7 @@ btrfs-y += super.o ctree.o extent-tree.o export.o tree-log.o free-space-cache.o zlib.o lzo.o \ compression.o delayed-ref.o relocation.o delayed-inode.o scrub.o \ reada.o backref.o ulist.o qgroup.o send.o dev-replace.o raid56.o \ - uuid-tree.o props.o hash.o + uuid-tree.o props.o hash.o tree-checker.o
btrfs-$(CONFIG_BTRFS_FS_POSIX_ACL) += acl.o btrfs-$(CONFIG_BTRFS_FS_CHECK_INTEGRITY) += check-integrity.o --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -49,6 +49,7 @@ #include "raid56.h" #include "sysfs.h" #include "qgroup.h" +#include "tree-checker.h"
#ifdef CONFIG_X86 #include <asm/cpufeature.h> @@ -522,283 +523,6 @@ static int check_tree_block_fsid(struct return ret; }
-#define CORRUPT(reason, eb, root, slot) \ - btrfs_crit(root->fs_info, "corrupt %s, %s: block=%llu," \ - " root=%llu, slot=%d", \ - btrfs_header_level(eb) == 0 ? "leaf" : "node",\ - reason, btrfs_header_bytenr(eb), root->objectid, slot) - -static int check_extent_data_item(struct btrfs_root *root, - struct extent_buffer *leaf, - struct btrfs_key *key, int slot) -{ - struct btrfs_file_extent_item *fi; - u32 sectorsize = root->sectorsize; - u32 item_size = btrfs_item_size_nr(leaf, slot); - - if (!IS_ALIGNED(key->offset, sectorsize)) { - CORRUPT("unaligned key offset for file extent", - leaf, root, slot); - return -EUCLEAN; - } - - fi = btrfs_item_ptr(leaf, slot, struct btrfs_file_extent_item); - - if (btrfs_file_extent_type(leaf, fi) > BTRFS_FILE_EXTENT_TYPES) { - CORRUPT("invalid file extent type", leaf, root, slot); - return -EUCLEAN; - } - - /* - * Support for new compression/encrption must introduce incompat flag, - * and must be caught in open_ctree(). - */ - if (btrfs_file_extent_compression(leaf, fi) > BTRFS_COMPRESS_TYPES) { - CORRUPT("invalid file extent compression", leaf, root, slot); - return -EUCLEAN; - } - if (btrfs_file_extent_encryption(leaf, fi)) { - CORRUPT("invalid file extent encryption", leaf, root, slot); - return -EUCLEAN; - } - if (btrfs_file_extent_type(leaf, fi) == BTRFS_FILE_EXTENT_INLINE) { - /* Inline extent must have 0 as key offset */ - if (key->offset) { - CORRUPT("inline extent has non-zero key offset", - leaf, root, slot); - return -EUCLEAN; - } - - /* Compressed inline extent has no on-disk size, skip it */ - if (btrfs_file_extent_compression(leaf, fi) != - BTRFS_COMPRESS_NONE) - return 0; - - /* Uncompressed inline extent size must match item size */ - if (item_size != BTRFS_FILE_EXTENT_INLINE_DATA_START + - btrfs_file_extent_ram_bytes(leaf, fi)) { - CORRUPT("plaintext inline extent has invalid size", - leaf, root, slot); - return -EUCLEAN; - } - return 0; - } - - /* Regular or preallocated extent has fixed item size */ - if (item_size != sizeof(*fi)) { - CORRUPT( - "regluar or preallocated extent data item size is invalid", - leaf, root, slot); - return -EUCLEAN; - } - if (!IS_ALIGNED(btrfs_file_extent_ram_bytes(leaf, fi), sectorsize) || - !IS_ALIGNED(btrfs_file_extent_disk_bytenr(leaf, fi), sectorsize) || - !IS_ALIGNED(btrfs_file_extent_disk_num_bytes(leaf, fi), sectorsize) || - !IS_ALIGNED(btrfs_file_extent_offset(leaf, fi), sectorsize) || - !IS_ALIGNED(btrfs_file_extent_num_bytes(leaf, fi), sectorsize)) { - CORRUPT( - "regular or preallocated extent data item has unaligned value", - leaf, root, slot); - return -EUCLEAN; - } - - return 0; -} - -static int check_csum_item(struct btrfs_root *root, struct extent_buffer *leaf, - struct btrfs_key *key, int slot) -{ - u32 sectorsize = root->sectorsize; - u32 csumsize = btrfs_super_csum_size(root->fs_info->super_copy); - - if (key->objectid != BTRFS_EXTENT_CSUM_OBJECTID) { - CORRUPT("invalid objectid for csum item", leaf, root, slot); - return -EUCLEAN; - } - if (!IS_ALIGNED(key->offset, sectorsize)) { - CORRUPT("unaligned key offset for csum item", leaf, root, slot); - return -EUCLEAN; - } - if (!IS_ALIGNED(btrfs_item_size_nr(leaf, slot), csumsize)) { - CORRUPT("unaligned csum item size", leaf, root, slot); - return -EUCLEAN; - } - return 0; -} - -/* - * Common point to switch the item-specific validation. - */ -static int check_leaf_item(struct btrfs_root *root, - struct extent_buffer *leaf, - struct btrfs_key *key, int slot) -{ - int ret = 0; - - switch (key->type) { - case BTRFS_EXTENT_DATA_KEY: - ret = check_extent_data_item(root, leaf, key, slot); - break; - case BTRFS_EXTENT_CSUM_KEY: - ret = check_csum_item(root, leaf, key, slot); - break; - } - return ret; -} - -static noinline int check_leaf(struct btrfs_root *root, - struct extent_buffer *leaf) -{ - /* No valid key type is 0, so all key should be larger than this key */ - struct btrfs_key prev_key = {0, 0, 0}; - struct btrfs_key key; - u32 nritems = btrfs_header_nritems(leaf); - int slot; - - /* - * Extent buffers from a relocation tree have a owner field that - * corresponds to the subvolume tree they are based on. So just from an - * extent buffer alone we can not find out what is the id of the - * corresponding subvolume tree, so we can not figure out if the extent - * buffer corresponds to the root of the relocation tree or not. So skip - * this check for relocation trees. - */ - if (nritems == 0 && !btrfs_header_flag(leaf, BTRFS_HEADER_FLAG_RELOC)) { - struct btrfs_root *check_root; - - key.objectid = btrfs_header_owner(leaf); - key.type = BTRFS_ROOT_ITEM_KEY; - key.offset = (u64)-1; - - check_root = btrfs_get_fs_root(root->fs_info, &key, false); - /* - * The only reason we also check NULL here is that during - * open_ctree() some roots has not yet been set up. - */ - if (!IS_ERR_OR_NULL(check_root)) { - struct extent_buffer *eb; - - eb = btrfs_root_node(check_root); - /* if leaf is the root, then it's fine */ - if (leaf != eb) { - CORRUPT("non-root leaf's nritems is 0", - leaf, check_root, 0); - free_extent_buffer(eb); - return -EUCLEAN; - } - free_extent_buffer(eb); - } - return 0; - } - - if (nritems == 0) - return 0; - - /* - * Check the following things to make sure this is a good leaf, and - * leaf users won't need to bother with similar sanity checks: - * - * 1) key order - * 2) item offset and size - * No overlap, no hole, all inside the leaf. - * 3) item content - * If possible, do comprehensive sanity check. - * NOTE: All checks must only rely on the item data itself. - */ - for (slot = 0; slot < nritems; slot++) { - u32 item_end_expected; - int ret; - - btrfs_item_key_to_cpu(leaf, &key, slot); - - /* Make sure the keys are in the right order */ - if (btrfs_comp_cpu_keys(&prev_key, &key) >= 0) { - CORRUPT("bad key order", leaf, root, slot); - return -EUCLEAN; - } - - /* - * Make sure the offset and ends are right, remember that the - * item data starts at the end of the leaf and grows towards the - * front. - */ - if (slot == 0) - item_end_expected = BTRFS_LEAF_DATA_SIZE(root); - else - item_end_expected = btrfs_item_offset_nr(leaf, - slot - 1); - if (btrfs_item_end_nr(leaf, slot) != item_end_expected) { - CORRUPT("slot offset bad", leaf, root, slot); - return -EUCLEAN; - } - - /* - * Check to make sure that we don't point outside of the leaf, - * just incase all the items are consistent to eachother, but - * all point outside of the leaf. - */ - if (btrfs_item_end_nr(leaf, slot) > - BTRFS_LEAF_DATA_SIZE(root)) { - CORRUPT("slot end outside of leaf", leaf, root, slot); - return -EUCLEAN; - } - - /* Also check if the item pointer overlaps with btrfs item. */ - if (btrfs_item_nr_offset(slot) + sizeof(struct btrfs_item) > - btrfs_item_ptr_offset(leaf, slot)) { - CORRUPT("slot overlap with its data", leaf, root, slot); - return -EUCLEAN; - } - - /* Check if the item size and content meet other criteria */ - ret = check_leaf_item(root, leaf, &key, slot); - if (ret < 0) - return ret; - - prev_key.objectid = key.objectid; - prev_key.type = key.type; - prev_key.offset = key.offset; - } - - return 0; -} - -static int check_node(struct btrfs_root *root, struct extent_buffer *node) -{ - unsigned long nr = btrfs_header_nritems(node); - struct btrfs_key key, next_key; - int slot; - u64 bytenr; - int ret = 0; - - if (nr == 0 || nr > BTRFS_NODEPTRS_PER_BLOCK(root)) { - btrfs_crit(root->fs_info, - "corrupt node: block %llu root %llu nritems %lu", - node->start, root->objectid, nr); - return -EIO; - } - - for (slot = 0; slot < nr - 1; slot++) { - bytenr = btrfs_node_blockptr(node, slot); - btrfs_node_key_to_cpu(node, &key, slot); - btrfs_node_key_to_cpu(node, &next_key, slot + 1); - - if (!bytenr) { - CORRUPT("invalid item slot", node, root, slot); - ret = -EIO; - goto out; - } - - if (btrfs_comp_cpu_keys(&key, &next_key) >= 0) { - CORRUPT("bad key order", node, root, slot); - ret = -EIO; - goto out; - } - } -out: - return ret; -} - static int btree_readpage_end_io_hook(struct btrfs_io_bio *io_bio, u64 phy_offset, struct page *page, u64 start, u64 end, int mirror) @@ -865,12 +589,12 @@ static int btree_readpage_end_io_hook(st * that we don't try and read the other copies of this block, just * return -EIO. */ - if (found_level == 0 && check_leaf(root, eb)) { + if (found_level == 0 && btrfs_check_leaf(root, eb)) { set_bit(EXTENT_BUFFER_CORRUPT, &eb->bflags); ret = -EIO; }
- if (found_level > 0 && check_node(root, eb)) + if (found_level > 0 && btrfs_check_node(root, eb)) ret = -EIO;
if (!ret) @@ -4172,7 +3896,7 @@ void btrfs_mark_buffer_dirty(struct exte buf->len, root->fs_info->dirty_metadata_batch); #ifdef CONFIG_BTRFS_FS_CHECK_INTEGRITY - if (btrfs_header_level(buf) == 0 && check_leaf(root, buf)) { + if (btrfs_header_level(buf) == 0 && btrfs_check_leaf(root, buf)) { btrfs_print_leaf(root, buf); ASSERT(0); } --- /dev/null +++ b/fs/btrfs/tree-checker.c @@ -0,0 +1,309 @@ +/* + * Copyright (C) Qu Wenruo 2017. All rights reserved. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public + * License v2 as published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + * + * You should have received a copy of the GNU General Public + * License along with this program. + */ + +/* + * The module is used to catch unexpected/corrupted tree block data. + * Such behavior can be caused either by a fuzzed image or bugs. + * + * The objective is to do leaf/node validation checks when tree block is read + * from disk, and check *every* possible member, so other code won't + * need to checking them again. + * + * Due to the potential and unwanted damage, every checker needs to be + * carefully reviewed otherwise so it does not prevent mount of valid images. + */ + +#include "ctree.h" +#include "tree-checker.h" +#include "disk-io.h" +#include "compression.h" + +#define CORRUPT(reason, eb, root, slot) \ + btrfs_crit(root->fs_info, \ + "corrupt %s, %s: block=%llu, root=%llu, slot=%d", \ + btrfs_header_level(eb) == 0 ? "leaf" : "node", \ + reason, btrfs_header_bytenr(eb), root->objectid, slot) + +static int check_extent_data_item(struct btrfs_root *root, + struct extent_buffer *leaf, + struct btrfs_key *key, int slot) +{ + struct btrfs_file_extent_item *fi; + u32 sectorsize = root->sectorsize; + u32 item_size = btrfs_item_size_nr(leaf, slot); + + if (!IS_ALIGNED(key->offset, sectorsize)) { + CORRUPT("unaligned key offset for file extent", + leaf, root, slot); + return -EUCLEAN; + } + + fi = btrfs_item_ptr(leaf, slot, struct btrfs_file_extent_item); + + if (btrfs_file_extent_type(leaf, fi) > BTRFS_FILE_EXTENT_TYPES) { + CORRUPT("invalid file extent type", leaf, root, slot); + return -EUCLEAN; + } + + /* + * Support for new compression/encrption must introduce incompat flag, + * and must be caught in open_ctree(). + */ + if (btrfs_file_extent_compression(leaf, fi) > BTRFS_COMPRESS_TYPES) { + CORRUPT("invalid file extent compression", leaf, root, slot); + return -EUCLEAN; + } + if (btrfs_file_extent_encryption(leaf, fi)) { + CORRUPT("invalid file extent encryption", leaf, root, slot); + return -EUCLEAN; + } + if (btrfs_file_extent_type(leaf, fi) == BTRFS_FILE_EXTENT_INLINE) { + /* Inline extent must have 0 as key offset */ + if (key->offset) { + CORRUPT("inline extent has non-zero key offset", + leaf, root, slot); + return -EUCLEAN; + } + + /* Compressed inline extent has no on-disk size, skip it */ + if (btrfs_file_extent_compression(leaf, fi) != + BTRFS_COMPRESS_NONE) + return 0; + + /* Uncompressed inline extent size must match item size */ + if (item_size != BTRFS_FILE_EXTENT_INLINE_DATA_START + + btrfs_file_extent_ram_bytes(leaf, fi)) { + CORRUPT("plaintext inline extent has invalid size", + leaf, root, slot); + return -EUCLEAN; + } + return 0; + } + + /* Regular or preallocated extent has fixed item size */ + if (item_size != sizeof(*fi)) { + CORRUPT( + "regluar or preallocated extent data item size is invalid", + leaf, root, slot); + return -EUCLEAN; + } + if (!IS_ALIGNED(btrfs_file_extent_ram_bytes(leaf, fi), sectorsize) || + !IS_ALIGNED(btrfs_file_extent_disk_bytenr(leaf, fi), sectorsize) || + !IS_ALIGNED(btrfs_file_extent_disk_num_bytes(leaf, fi), sectorsize) || + !IS_ALIGNED(btrfs_file_extent_offset(leaf, fi), sectorsize) || + !IS_ALIGNED(btrfs_file_extent_num_bytes(leaf, fi), sectorsize)) { + CORRUPT( + "regular or preallocated extent data item has unaligned value", + leaf, root, slot); + return -EUCLEAN; + } + + return 0; +} + +static int check_csum_item(struct btrfs_root *root, struct extent_buffer *leaf, + struct btrfs_key *key, int slot) +{ + u32 sectorsize = root->sectorsize; + u32 csumsize = btrfs_super_csum_size(root->fs_info->super_copy); + + if (key->objectid != BTRFS_EXTENT_CSUM_OBJECTID) { + CORRUPT("invalid objectid for csum item", leaf, root, slot); + return -EUCLEAN; + } + if (!IS_ALIGNED(key->offset, sectorsize)) { + CORRUPT("unaligned key offset for csum item", leaf, root, slot); + return -EUCLEAN; + } + if (!IS_ALIGNED(btrfs_item_size_nr(leaf, slot), csumsize)) { + CORRUPT("unaligned csum item size", leaf, root, slot); + return -EUCLEAN; + } + return 0; +} + +/* + * Common point to switch the item-specific validation. + */ +static int check_leaf_item(struct btrfs_root *root, + struct extent_buffer *leaf, + struct btrfs_key *key, int slot) +{ + int ret = 0; + + switch (key->type) { + case BTRFS_EXTENT_DATA_KEY: + ret = check_extent_data_item(root, leaf, key, slot); + break; + case BTRFS_EXTENT_CSUM_KEY: + ret = check_csum_item(root, leaf, key, slot); + break; + } + return ret; +} + +int btrfs_check_leaf(struct btrfs_root *root, struct extent_buffer *leaf) +{ + struct btrfs_fs_info *fs_info = root->fs_info; + /* No valid key type is 0, so all key should be larger than this key */ + struct btrfs_key prev_key = {0, 0, 0}; + struct btrfs_key key; + u32 nritems = btrfs_header_nritems(leaf); + int slot; + + /* + * Extent buffers from a relocation tree have a owner field that + * corresponds to the subvolume tree they are based on. So just from an + * extent buffer alone we can not find out what is the id of the + * corresponding subvolume tree, so we can not figure out if the extent + * buffer corresponds to the root of the relocation tree or not. So + * skip this check for relocation trees. + */ + if (nritems == 0 && !btrfs_header_flag(leaf, BTRFS_HEADER_FLAG_RELOC)) { + struct btrfs_root *check_root; + + key.objectid = btrfs_header_owner(leaf); + key.type = BTRFS_ROOT_ITEM_KEY; + key.offset = (u64)-1; + + check_root = btrfs_get_fs_root(fs_info, &key, false); + /* + * The only reason we also check NULL here is that during + * open_ctree() some roots has not yet been set up. + */ + if (!IS_ERR_OR_NULL(check_root)) { + struct extent_buffer *eb; + + eb = btrfs_root_node(check_root); + /* if leaf is the root, then it's fine */ + if (leaf != eb) { + CORRUPT("non-root leaf's nritems is 0", + leaf, check_root, 0); + free_extent_buffer(eb); + return -EUCLEAN; + } + free_extent_buffer(eb); + } + return 0; + } + + if (nritems == 0) + return 0; + + /* + * Check the following things to make sure this is a good leaf, and + * leaf users won't need to bother with similar sanity checks: + * + * 1) key ordering + * 2) item offset and size + * No overlap, no hole, all inside the leaf. + * 3) item content + * If possible, do comprehensive sanity check. + * NOTE: All checks must only rely on the item data itself. + */ + for (slot = 0; slot < nritems; slot++) { + u32 item_end_expected; + int ret; + + btrfs_item_key_to_cpu(leaf, &key, slot); + + /* Make sure the keys are in the right order */ + if (btrfs_comp_cpu_keys(&prev_key, &key) >= 0) { + CORRUPT("bad key order", leaf, root, slot); + return -EUCLEAN; + } + + /* + * Make sure the offset and ends are right, remember that the + * item data starts at the end of the leaf and grows towards the + * front. + */ + if (slot == 0) + item_end_expected = BTRFS_LEAF_DATA_SIZE(root); + else + item_end_expected = btrfs_item_offset_nr(leaf, + slot - 1); + if (btrfs_item_end_nr(leaf, slot) != item_end_expected) { + CORRUPT("slot offset bad", leaf, root, slot); + return -EUCLEAN; + } + + /* + * Check to make sure that we don't point outside of the leaf, + * just in case all the items are consistent to each other, but + * all point outside of the leaf. + */ + if (btrfs_item_end_nr(leaf, slot) > + BTRFS_LEAF_DATA_SIZE(root)) { + CORRUPT("slot end outside of leaf", leaf, root, slot); + return -EUCLEAN; + } + + /* Also check if the item pointer overlaps with btrfs item. */ + if (btrfs_item_nr_offset(slot) + sizeof(struct btrfs_item) > + btrfs_item_ptr_offset(leaf, slot)) { + CORRUPT("slot overlap with its data", leaf, root, slot); + return -EUCLEAN; + } + + /* Check if the item size and content meet other criteria */ + ret = check_leaf_item(root, leaf, &key, slot); + if (ret < 0) + return ret; + + prev_key.objectid = key.objectid; + prev_key.type = key.type; + prev_key.offset = key.offset; + } + + return 0; +} + +int btrfs_check_node(struct btrfs_root *root, struct extent_buffer *node) +{ + unsigned long nr = btrfs_header_nritems(node); + struct btrfs_key key, next_key; + int slot; + u64 bytenr; + int ret = 0; + + if (nr == 0 || nr > BTRFS_NODEPTRS_PER_BLOCK(root)) { + btrfs_crit(root->fs_info, + "corrupt node: block %llu root %llu nritems %lu", + node->start, root->objectid, nr); + return -EIO; + } + + for (slot = 0; slot < nr - 1; slot++) { + bytenr = btrfs_node_blockptr(node, slot); + btrfs_node_key_to_cpu(node, &key, slot); + btrfs_node_key_to_cpu(node, &next_key, slot + 1); + + if (!bytenr) { + CORRUPT("invalid item slot", node, root, slot); + ret = -EIO; + goto out; + } + + if (btrfs_comp_cpu_keys(&key, &next_key) >= 0) { + CORRUPT("bad key order", node, root, slot); + ret = -EIO; + goto out; + } + } +out: + return ret; +} --- /dev/null +++ b/fs/btrfs/tree-checker.h @@ -0,0 +1,26 @@ +/* + * Copyright (C) Qu Wenruo 2017. All rights reserved. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public + * License v2 as published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + * + * You should have received a copy of the GNU General Public + * License along with this program. + */ + +#ifndef __BTRFS_TREE_CHECKER__ +#define __BTRFS_TREE_CHECKER__ + +#include "ctree.h" +#include "extent_io.h" + +int btrfs_check_leaf(struct btrfs_root *root, struct extent_buffer *leaf); +int btrfs_check_node(struct btrfs_root *root, struct extent_buffer *node); + +#endif
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Qu Wenruo quwenruo.btrfs@gmx.com
commit bba4f29896c986c4cec17bc0f19f2ce644fceae1 upstream.
Use inline function to replace macro since we don't need stringification. (Macro still exists until all callers get updated)
And add more info about the error, and replace EIO with EUCLEAN.
For nr_items error, report if it's too large or too small, and output the valid value range.
For node block pointer, added a new alignment checker.
For key order, also output the next key to make the problem more obvious.
Signed-off-by: Qu Wenruo quwenruo.btrfs@gmx.com [ wording adjustments, unindented long strings ] Signed-off-by: David Sterba dsterba@suse.com [bwh: Backported to 4.4: - Use root->sectorsize instead of root->fs_info->sectorsize - BTRFS_NODEPTRS_PER_BLOCK() takes a root instead of an fs_info, and yields a value of type size_t instead of unsigned int] Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/tree-checker.c | 68 +++++++++++++++++++++++++++++++++++++++++++----- 1 file changed, 61 insertions(+), 7 deletions(-)
--- a/fs/btrfs/tree-checker.c +++ b/fs/btrfs/tree-checker.c @@ -37,6 +37,46 @@ btrfs_header_level(eb) == 0 ? "leaf" : "node", \ reason, btrfs_header_bytenr(eb), root->objectid, slot)
+/* + * Error message should follow the following format: + * corrupt <type>: <identifier>, <reason>[, <bad_value>] + * + * @type: leaf or node + * @identifier: the necessary info to locate the leaf/node. + * It's recommened to decode key.objecitd/offset if it's + * meaningful. + * @reason: describe the error + * @bad_value: optional, it's recommened to output bad value and its + * expected value (range). + * + * Since comma is used to separate the components, only space is allowed + * inside each component. + */ + +/* + * Append generic "corrupt leaf/node root=%llu block=%llu slot=%d: " to @fmt. + * Allows callers to customize the output. + */ +__printf(4, 5) +static void generic_err(const struct btrfs_root *root, + const struct extent_buffer *eb, int slot, + const char *fmt, ...) +{ + struct va_format vaf; + va_list args; + + va_start(args, fmt); + + vaf.fmt = fmt; + vaf.va = &args; + + btrfs_crit(root->fs_info, + "corrupt %s: root=%llu block=%llu slot=%d, %pV", + btrfs_header_level(eb) == 0 ? "leaf" : "node", + root->objectid, btrfs_header_bytenr(eb), slot, &vaf); + va_end(args); +} + static int check_extent_data_item(struct btrfs_root *root, struct extent_buffer *leaf, struct btrfs_key *key, int slot) @@ -282,9 +322,11 @@ int btrfs_check_node(struct btrfs_root *
if (nr == 0 || nr > BTRFS_NODEPTRS_PER_BLOCK(root)) { btrfs_crit(root->fs_info, - "corrupt node: block %llu root %llu nritems %lu", - node->start, root->objectid, nr); - return -EIO; +"corrupt node: root=%llu block=%llu, nritems too %s, have %lu expect range [1,%zu]", + root->objectid, node->start, + nr == 0 ? "small" : "large", nr, + BTRFS_NODEPTRS_PER_BLOCK(root)); + return -EUCLEAN; }
for (slot = 0; slot < nr - 1; slot++) { @@ -293,14 +335,26 @@ int btrfs_check_node(struct btrfs_root * btrfs_node_key_to_cpu(node, &next_key, slot + 1);
if (!bytenr) { - CORRUPT("invalid item slot", node, root, slot); - ret = -EIO; + generic_err(root, node, slot, + "invalid NULL node pointer"); + ret = -EUCLEAN; + goto out; + } + if (!IS_ALIGNED(bytenr, root->sectorsize)) { + generic_err(root, node, slot, + "unaligned pointer, have %llu should be aligned to %u", + bytenr, root->sectorsize); + ret = -EUCLEAN; goto out; }
if (btrfs_comp_cpu_keys(&key, &next_key) >= 0) { - CORRUPT("bad key order", node, root, slot); - ret = -EIO; + generic_err(root, node, slot, + "bad key order, current (%llu %u %llu) next (%llu %u %llu)", + key.objectid, key.type, key.offset, + next_key.objectid, next_key.type, + next_key.offset); + ret = -EUCLEAN; goto out; } }
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Qu Wenruo wqu@suse.com
commit 69fc6cbbac542c349b3d350d10f6e394c253c81d upstream.
[BUG] If we run btrfs with CONFIG_BTRFS_FS_RUN_SANITY_TESTS=y, it will instantly cause kernel panic like:
------ ... assertion failed: 0, file: fs/btrfs/disk-io.c, line: 3853 ... Call Trace: btrfs_mark_buffer_dirty+0x187/0x1f0 [btrfs] setup_items_for_insert+0x385/0x650 [btrfs] __btrfs_drop_extents+0x129a/0x1870 [btrfs] ... -----
[Cause] Btrfs will call btrfs_check_leaf() in btrfs_mark_buffer_dirty() to check if the leaf is valid with CONFIG_BTRFS_FS_RUN_SANITY_TESTS=y.
However quite some btrfs_mark_buffer_dirty() callers(*) don't really initialize its item data but only initialize its item pointers, leaving item data uninitialized.
This makes tree-checker catch uninitialized data as error, causing such panic.
*: These callers include but not limited to setup_items_for_insert() btrfs_split_item() btrfs_expand_item()
[Fix] Add a new parameter @check_item_data to btrfs_check_leaf(). With @check_item_data set to false, item data check will be skipped and fallback to old btrfs_check_leaf() behavior.
So we can still get early warning if we screw up item pointers, and avoid false panic.
Cc: Filipe Manana fdmanana@gmail.com Reported-by: Lakshmipathi.G lakshmipathi.g@gmail.com Signed-off-by: Qu Wenruo wqu@suse.com Reviewed-by: Liu Bo bo.li.liu@oracle.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com [bwh: Backported to 4.4: adjust context] Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/disk-io.c | 10 ++++++++-- fs/btrfs/tree-checker.c | 27 ++++++++++++++++++++++----- fs/btrfs/tree-checker.h | 14 +++++++++++++- 3 files changed, 43 insertions(+), 8 deletions(-)
--- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -589,7 +589,7 @@ static int btree_readpage_end_io_hook(st * that we don't try and read the other copies of this block, just * return -EIO. */ - if (found_level == 0 && btrfs_check_leaf(root, eb)) { + if (found_level == 0 && btrfs_check_leaf_full(root, eb)) { set_bit(EXTENT_BUFFER_CORRUPT, &eb->bflags); ret = -EIO; } @@ -3896,7 +3896,13 @@ void btrfs_mark_buffer_dirty(struct exte buf->len, root->fs_info->dirty_metadata_batch); #ifdef CONFIG_BTRFS_FS_CHECK_INTEGRITY - if (btrfs_header_level(buf) == 0 && btrfs_check_leaf(root, buf)) { + /* + * Since btrfs_mark_buffer_dirty() can be called with item pointer set + * but item data not updated. + * So here we should only check item pointers, not item data. + */ + if (btrfs_header_level(buf) == 0 && + btrfs_check_leaf_relaxed(root, buf)) { btrfs_print_leaf(root, buf); ASSERT(0); } --- a/fs/btrfs/tree-checker.c +++ b/fs/btrfs/tree-checker.c @@ -195,7 +195,8 @@ static int check_leaf_item(struct btrfs_ return ret; }
-int btrfs_check_leaf(struct btrfs_root *root, struct extent_buffer *leaf) +static int check_leaf(struct btrfs_root *root, struct extent_buffer *leaf, + bool check_item_data) { struct btrfs_fs_info *fs_info = root->fs_info; /* No valid key type is 0, so all key should be larger than this key */ @@ -299,10 +300,15 @@ int btrfs_check_leaf(struct btrfs_root * return -EUCLEAN; }
- /* Check if the item size and content meet other criteria */ - ret = check_leaf_item(root, leaf, &key, slot); - if (ret < 0) - return ret; + if (check_item_data) { + /* + * Check if the item size and content meet other + * criteria + */ + ret = check_leaf_item(root, leaf, &key, slot); + if (ret < 0) + return ret; + }
prev_key.objectid = key.objectid; prev_key.type = key.type; @@ -312,6 +318,17 @@ int btrfs_check_leaf(struct btrfs_root * return 0; }
+int btrfs_check_leaf_full(struct btrfs_root *root, struct extent_buffer *leaf) +{ + return check_leaf(root, leaf, true); +} + +int btrfs_check_leaf_relaxed(struct btrfs_root *root, + struct extent_buffer *leaf) +{ + return check_leaf(root, leaf, false); +} + int btrfs_check_node(struct btrfs_root *root, struct extent_buffer *node) { unsigned long nr = btrfs_header_nritems(node); --- a/fs/btrfs/tree-checker.h +++ b/fs/btrfs/tree-checker.h @@ -20,7 +20,19 @@ #include "ctree.h" #include "extent_io.h"
-int btrfs_check_leaf(struct btrfs_root *root, struct extent_buffer *leaf); +/* + * Comprehensive leaf checker. + * Will check not only the item pointers, but also every possible member + * in item data. + */ +int btrfs_check_leaf_full(struct btrfs_root *root, struct extent_buffer *leaf); + +/* + * Less strict leaf checker. + * Will only check item pointers, not reading item data. + */ +int btrfs_check_leaf_relaxed(struct btrfs_root *root, + struct extent_buffer *leaf); int btrfs_check_node(struct btrfs_root *root, struct extent_buffer *node);
#endif
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Qu Wenruo wqu@suse.com
commit ad7b0368f33cffe67fecd302028915926e50ef7e upstream.
Add checker for dir item, for key types DIR_ITEM, DIR_INDEX and XATTR_ITEM.
This checker does comprehensive checks for:
1) dir_item header and its data size Against item boundary and maximum name/xattr length. This part is mostly the same as old verify_dir_item().
2) dir_type Against maximum file types, and against key type. Since XATTR key should only have FT_XATTR dir item, and normal dir item type should not have XATTR key.
The check between key->type and dir_type is newly introduced by this patch.
3) name hash For XATTR and DIR_ITEM key, key->offset is name hash (crc32c). Check the hash of the name against the key to ensure it's correct.
The name hash check is only found in btrfs-progs before this patch.
Signed-off-by: Qu Wenruo wqu@suse.com Reviewed-by: Nikolay Borisov nborisov@suse.com Reviewed-by: Su Yue suy.fnst@cn.fujitsu.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com [bwh: Backported to 4.4: BTRFS_MAX_XATTR_SIZE() takes a root instead of an fs_info, and yields a value of type size_t instead of unsigned int] Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/tree-checker.c | 141 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 141 insertions(+)
--- a/fs/btrfs/tree-checker.c +++ b/fs/btrfs/tree-checker.c @@ -30,6 +30,7 @@ #include "tree-checker.h" #include "disk-io.h" #include "compression.h" +#include "hash.h"
#define CORRUPT(reason, eb, root, slot) \ btrfs_crit(root->fs_info, \ @@ -176,6 +177,141 @@ static int check_csum_item(struct btrfs_ }
/* + * Customized reported for dir_item, only important new info is key->objectid, + * which represents inode number + */ +__printf(4, 5) +static void dir_item_err(const struct btrfs_root *root, + const struct extent_buffer *eb, int slot, + const char *fmt, ...) +{ + struct btrfs_key key; + struct va_format vaf; + va_list args; + + btrfs_item_key_to_cpu(eb, &key, slot); + va_start(args, fmt); + + vaf.fmt = fmt; + vaf.va = &args; + + btrfs_crit(root->fs_info, + "corrupt %s: root=%llu block=%llu slot=%d ino=%llu, %pV", + btrfs_header_level(eb) == 0 ? "leaf" : "node", root->objectid, + btrfs_header_bytenr(eb), slot, key.objectid, &vaf); + va_end(args); +} + +static int check_dir_item(struct btrfs_root *root, + struct extent_buffer *leaf, + struct btrfs_key *key, int slot) +{ + struct btrfs_dir_item *di; + u32 item_size = btrfs_item_size_nr(leaf, slot); + u32 cur = 0; + + di = btrfs_item_ptr(leaf, slot, struct btrfs_dir_item); + while (cur < item_size) { + char namebuf[max(BTRFS_NAME_LEN, XATTR_NAME_MAX)]; + u32 name_len; + u32 data_len; + u32 max_name_len; + u32 total_size; + u32 name_hash; + u8 dir_type; + + /* header itself should not cross item boundary */ + if (cur + sizeof(*di) > item_size) { + dir_item_err(root, leaf, slot, + "dir item header crosses item boundary, have %lu boundary %u", + cur + sizeof(*di), item_size); + return -EUCLEAN; + } + + /* dir type check */ + dir_type = btrfs_dir_type(leaf, di); + if (dir_type >= BTRFS_FT_MAX) { + dir_item_err(root, leaf, slot, + "invalid dir item type, have %u expect [0, %u)", + dir_type, BTRFS_FT_MAX); + return -EUCLEAN; + } + + if (key->type == BTRFS_XATTR_ITEM_KEY && + dir_type != BTRFS_FT_XATTR) { + dir_item_err(root, leaf, slot, + "invalid dir item type for XATTR key, have %u expect %u", + dir_type, BTRFS_FT_XATTR); + return -EUCLEAN; + } + if (dir_type == BTRFS_FT_XATTR && + key->type != BTRFS_XATTR_ITEM_KEY) { + dir_item_err(root, leaf, slot, + "xattr dir type found for non-XATTR key"); + return -EUCLEAN; + } + if (dir_type == BTRFS_FT_XATTR) + max_name_len = XATTR_NAME_MAX; + else + max_name_len = BTRFS_NAME_LEN; + + /* Name/data length check */ + name_len = btrfs_dir_name_len(leaf, di); + data_len = btrfs_dir_data_len(leaf, di); + if (name_len > max_name_len) { + dir_item_err(root, leaf, slot, + "dir item name len too long, have %u max %u", + name_len, max_name_len); + return -EUCLEAN; + } + if (name_len + data_len > BTRFS_MAX_XATTR_SIZE(root)) { + dir_item_err(root, leaf, slot, + "dir item name and data len too long, have %u max %zu", + name_len + data_len, + BTRFS_MAX_XATTR_SIZE(root)); + return -EUCLEAN; + } + + if (data_len && dir_type != BTRFS_FT_XATTR) { + dir_item_err(root, leaf, slot, + "dir item with invalid data len, have %u expect 0", + data_len); + return -EUCLEAN; + } + + total_size = sizeof(*di) + name_len + data_len; + + /* header and name/data should not cross item boundary */ + if (cur + total_size > item_size) { + dir_item_err(root, leaf, slot, + "dir item data crosses item boundary, have %u boundary %u", + cur + total_size, item_size); + return -EUCLEAN; + } + + /* + * Special check for XATTR/DIR_ITEM, as key->offset is name + * hash, should match its name + */ + if (key->type == BTRFS_DIR_ITEM_KEY || + key->type == BTRFS_XATTR_ITEM_KEY) { + read_extent_buffer(leaf, namebuf, + (unsigned long)(di + 1), name_len); + name_hash = btrfs_name_hash(namebuf, name_len); + if (key->offset != name_hash) { + dir_item_err(root, leaf, slot, + "name hash mismatch with key, have 0x%016x expect 0x%016llx", + name_hash, key->offset); + return -EUCLEAN; + } + } + cur += total_size; + di = (struct btrfs_dir_item *)((void *)di + total_size); + } + return 0; +} + +/* * Common point to switch the item-specific validation. */ static int check_leaf_item(struct btrfs_root *root, @@ -191,6 +327,11 @@ static int check_leaf_item(struct btrfs_ case BTRFS_EXTENT_CSUM_KEY: ret = check_csum_item(root, leaf, key, slot); break; + case BTRFS_DIR_ITEM_KEY: + case BTRFS_DIR_INDEX_KEY: + case BTRFS_XATTR_ITEM_KEY: + ret = check_dir_item(root, leaf, key, slot); + break; } return ret; }
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Arnd Bergmann arnd@arndb.de
commit 7cfad65297bfe0aa2996cd72d21c898aa84436d9 upstream.
The return value of sizeof() is of type size_t, so we must print it using the %z format modifier rather than %l to avoid this warning on some architectures:
fs/btrfs/tree-checker.c: In function 'check_dir_item': fs/btrfs/tree-checker.c:273:50: error: format '%lu' expects argument of type 'long unsigned int', but argument 5 has type 'u32' {aka 'unsigned int'} [-Werror=format=]
Fixes: 005887f2e3e0 ("btrfs: tree-checker: Add checker for dir item") Signed-off-by: Arnd Bergmann arnd@arndb.de Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/tree-checker.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/btrfs/tree-checker.c +++ b/fs/btrfs/tree-checker.c @@ -223,7 +223,7 @@ static int check_dir_item(struct btrfs_r /* header itself should not cross item boundary */ if (cur + sizeof(*di) > item_size) { dir_item_err(root, leaf, slot, - "dir item header crosses item boundary, have %lu boundary %u", + "dir item header crosses item boundary, have %zu boundary %u", cur + sizeof(*di), item_size); return -EUCLEAN; }
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Sterba dsterba@suse.com
commit e2683fc9d219430f5b78889b50cde7f40efeba7b upstream.
I've noticed that the updated item checker stack consumption increased dramatically in 542f5385e20cf97447 ("btrfs: tree-checker: Add checker for dir item")
tree-checker.c:check_leaf +552 (176 -> 728)
The array is 255 bytes long, dynamic allocation would slow down the sanity checks so it's more reasonable to keep it on-stack. Moving the variable to the scope of use reduces the stack usage again
tree-checker.c:check_leaf -264 (728 -> 464)
Reviewed-by: Josef Bacik jbacik@fb.com Reviewed-by: Qu Wenruo wqu@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/tree-checker.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/fs/btrfs/tree-checker.c +++ b/fs/btrfs/tree-checker.c @@ -212,7 +212,6 @@ static int check_dir_item(struct btrfs_r
di = btrfs_item_ptr(leaf, slot, struct btrfs_dir_item); while (cur < item_size) { - char namebuf[max(BTRFS_NAME_LEN, XATTR_NAME_MAX)]; u32 name_len; u32 data_len; u32 max_name_len; @@ -295,6 +294,8 @@ static int check_dir_item(struct btrfs_r */ if (key->type == BTRFS_DIR_ITEM_KEY || key->type == BTRFS_XATTR_ITEM_KEY) { + char namebuf[max(BTRFS_NAME_LEN, XATTR_NAME_MAX)]; + read_extent_buffer(leaf, namebuf, (unsigned long)(di + 1), name_len); name_hash = btrfs_name_hash(namebuf, name_len);
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Qu Wenruo wqu@suse.com
commit fce466eab7ac6baa9d2dcd88abcf945be3d4a089 upstream.
A crafted image with invalid block group items could make free space cache code to cause panic.
We could detect such invalid block group item by checking: 1) Item size Known fixed value. 2) Block group size (key.offset) We have an upper limit on block group item (10G) 3) Chunk objectid Known fixed value. 4) Type Only 4 valid type values, DATA, METADATA, SYSTEM and DATA|METADATA. No more than 1 bit set for profile type. 5) Used space No more than the block group size.
This should allow btrfs to detect and refuse to mount the crafted image.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=199849 Reported-by: Xu Wen wen.xu@gatech.edu Signed-off-by: Qu Wenruo wqu@suse.com Reviewed-by: Gu Jinxiang gujx@cn.fujitsu.com Reviewed-by: Nikolay Borisov nborisov@suse.com Tested-by: Gu Jinxiang gujx@cn.fujitsu.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com [bwh: Backported to 4.4: - In check_leaf_item(), pass root->fs_info to check_block_group_item() - Include <linux/sizes.h> (in ctree.h, to match upstream) - Adjust context] Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/ctree.h | 1 fs/btrfs/tree-checker.c | 100 ++++++++++++++++++++++++++++++++++++++++++++++++ fs/btrfs/volumes.c | 2 fs/btrfs/volumes.h | 2 4 files changed, 104 insertions(+), 1 deletion(-)
--- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -35,6 +35,7 @@ #include <linux/btrfs.h> #include <linux/workqueue.h> #include <linux/security.h> +#include <linux/sizes.h> #include "extent_io.h" #include "extent_map.h" #include "async-thread.h" --- a/fs/btrfs/tree-checker.c +++ b/fs/btrfs/tree-checker.c @@ -31,6 +31,7 @@ #include "disk-io.h" #include "compression.h" #include "hash.h" +#include "volumes.h"
#define CORRUPT(reason, eb, root, slot) \ btrfs_crit(root->fs_info, \ @@ -312,6 +313,102 @@ static int check_dir_item(struct btrfs_r return 0; }
+__printf(4, 5) +__cold +static void block_group_err(const struct btrfs_fs_info *fs_info, + const struct extent_buffer *eb, int slot, + const char *fmt, ...) +{ + struct btrfs_key key; + struct va_format vaf; + va_list args; + + btrfs_item_key_to_cpu(eb, &key, slot); + va_start(args, fmt); + + vaf.fmt = fmt; + vaf.va = &args; + + btrfs_crit(fs_info, + "corrupt %s: root=%llu block=%llu slot=%d bg_start=%llu bg_len=%llu, %pV", + btrfs_header_level(eb) == 0 ? "leaf" : "node", + btrfs_header_owner(eb), btrfs_header_bytenr(eb), slot, + key.objectid, key.offset, &vaf); + va_end(args); +} + +static int check_block_group_item(struct btrfs_fs_info *fs_info, + struct extent_buffer *leaf, + struct btrfs_key *key, int slot) +{ + struct btrfs_block_group_item bgi; + u32 item_size = btrfs_item_size_nr(leaf, slot); + u64 flags; + u64 type; + + /* + * Here we don't really care about alignment since extent allocator can + * handle it. We care more about the size, as if one block group is + * larger than maximum size, it's must be some obvious corruption. + */ + if (key->offset > BTRFS_MAX_DATA_CHUNK_SIZE || key->offset == 0) { + block_group_err(fs_info, leaf, slot, + "invalid block group size, have %llu expect (0, %llu]", + key->offset, BTRFS_MAX_DATA_CHUNK_SIZE); + return -EUCLEAN; + } + + if (item_size != sizeof(bgi)) { + block_group_err(fs_info, leaf, slot, + "invalid item size, have %u expect %zu", + item_size, sizeof(bgi)); + return -EUCLEAN; + } + + read_extent_buffer(leaf, &bgi, btrfs_item_ptr_offset(leaf, slot), + sizeof(bgi)); + if (btrfs_block_group_chunk_objectid(&bgi) != + BTRFS_FIRST_CHUNK_TREE_OBJECTID) { + block_group_err(fs_info, leaf, slot, + "invalid block group chunk objectid, have %llu expect %llu", + btrfs_block_group_chunk_objectid(&bgi), + BTRFS_FIRST_CHUNK_TREE_OBJECTID); + return -EUCLEAN; + } + + if (btrfs_block_group_used(&bgi) > key->offset) { + block_group_err(fs_info, leaf, slot, + "invalid block group used, have %llu expect [0, %llu)", + btrfs_block_group_used(&bgi), key->offset); + return -EUCLEAN; + } + + flags = btrfs_block_group_flags(&bgi); + if (hweight64(flags & BTRFS_BLOCK_GROUP_PROFILE_MASK) > 1) { + block_group_err(fs_info, leaf, slot, +"invalid profile flags, have 0x%llx (%lu bits set) expect no more than 1 bit set", + flags & BTRFS_BLOCK_GROUP_PROFILE_MASK, + hweight64(flags & BTRFS_BLOCK_GROUP_PROFILE_MASK)); + return -EUCLEAN; + } + + type = flags & BTRFS_BLOCK_GROUP_TYPE_MASK; + if (type != BTRFS_BLOCK_GROUP_DATA && + type != BTRFS_BLOCK_GROUP_METADATA && + type != BTRFS_BLOCK_GROUP_SYSTEM && + type != (BTRFS_BLOCK_GROUP_METADATA | + BTRFS_BLOCK_GROUP_DATA)) { + block_group_err(fs_info, leaf, slot, +"invalid type, have 0x%llx (%lu bits set) expect either 0x%llx, 0x%llx, 0x%llu or 0x%llx", + type, hweight64(type), + BTRFS_BLOCK_GROUP_DATA, BTRFS_BLOCK_GROUP_METADATA, + BTRFS_BLOCK_GROUP_SYSTEM, + BTRFS_BLOCK_GROUP_METADATA | BTRFS_BLOCK_GROUP_DATA); + return -EUCLEAN; + } + return 0; +} + /* * Common point to switch the item-specific validation. */ @@ -333,6 +430,9 @@ static int check_leaf_item(struct btrfs_ case BTRFS_XATTR_ITEM_KEY: ret = check_dir_item(root, leaf, key, slot); break; + case BTRFS_BLOCK_GROUP_ITEM_KEY: + ret = check_block_group_item(root->fs_info, leaf, key, slot); + break; } return ret; } --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -4540,7 +4540,7 @@ static int __btrfs_alloc_chunk(struct bt
if (type & BTRFS_BLOCK_GROUP_DATA) { max_stripe_size = 1024 * 1024 * 1024; - max_chunk_size = 10 * max_stripe_size; + max_chunk_size = BTRFS_MAX_DATA_CHUNK_SIZE; if (!devs_max) devs_max = BTRFS_MAX_DEVS(info->chunk_root); } else if (type & BTRFS_BLOCK_GROUP_METADATA) { --- a/fs/btrfs/volumes.h +++ b/fs/btrfs/volumes.h @@ -24,6 +24,8 @@ #include <linux/btrfs.h> #include "async-thread.h"
+#define BTRFS_MAX_DATA_CHUNK_SIZE (10ULL * SZ_1G) + extern struct mutex uuid_mutex;
#define BTRFS_STRIPE_LEN (64 * 1024)
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Qu Wenruo wqu@suse.com
commit ba480dd4db9f1798541eb2d1c423fc95feee8d36 upstream.
A crafted image has empty root tree block, which will later cause NULL pointer dereference.
The following trees should never be empty: 1) Tree root Must contain at least root items for extent tree, device tree and fs tree
2) Chunk tree Or we can't even bootstrap as it contains the mapping.
3) Fs tree At least inode item for top level inode (.).
4) Device tree Dev extents for chunks
5) Extent tree Must have corresponding extent for each chunk.
If any of them is empty, we are sure the fs is corrupted and no need to mount it.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=199847 Reported-by: Xu Wen wen.xu@gatech.edu Signed-off-by: Qu Wenruo wqu@suse.com Tested-by: Gu Jinxiang gujx@cn.fujitsu.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com [bwh: Backported to 4.4: Pass root instead of fs_info to generic_err()] Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/tree-checker.c | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-)
--- a/fs/btrfs/tree-checker.c +++ b/fs/btrfs/tree-checker.c @@ -456,9 +456,22 @@ static int check_leaf(struct btrfs_root * skip this check for relocation trees. */ if (nritems == 0 && !btrfs_header_flag(leaf, BTRFS_HEADER_FLAG_RELOC)) { + u64 owner = btrfs_header_owner(leaf); struct btrfs_root *check_root;
- key.objectid = btrfs_header_owner(leaf); + /* These trees must never be empty */ + if (owner == BTRFS_ROOT_TREE_OBJECTID || + owner == BTRFS_CHUNK_TREE_OBJECTID || + owner == BTRFS_EXTENT_TREE_OBJECTID || + owner == BTRFS_DEV_TREE_OBJECTID || + owner == BTRFS_FS_TREE_OBJECTID || + owner == BTRFS_DATA_RELOC_TREE_OBJECTID) { + generic_err(root, leaf, 0, + "invalid root, root %llu must never be empty", + owner); + return -EUCLEAN; + } + key.objectid = owner; key.type = BTRFS_ROOT_ITEM_KEY; key.offset = (u64)-1;
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gu Jinxiang gujx@cn.fujitsu.com
commit 315409b0098fb2651d86553f0436b70502b29bb2 upstream.
Reported in https://bugzilla.kernel.org/show_bug.cgi?id=199839, with an image that has an invalid chunk type but does not return an error.
Add chunk type check in btrfs_check_chunk_valid, to detect the wrong type combinations.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=199839 Reported-by: Xu Wen wen.xu@gatech.edu Reviewed-by: Qu Wenruo wqu@suse.com Signed-off-by: Gu Jinxiang gujx@cn.fujitsu.com Signed-off-by: David Sterba dsterba@suse.com [bwh: Backported to 4.4: Use root->fs_info instead of fs_info] Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/volumes.c | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+)
--- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -6218,6 +6218,8 @@ static int btrfs_check_chunk_valid(struc u16 num_stripes; u16 sub_stripes; u64 type; + u64 features; + bool mixed = false;
length = btrfs_chunk_length(leaf, chunk); stripe_len = btrfs_chunk_stripe_len(leaf, chunk); @@ -6258,6 +6260,32 @@ static int btrfs_check_chunk_valid(struc btrfs_chunk_type(leaf, chunk)); return -EIO; } + + if ((type & BTRFS_BLOCK_GROUP_TYPE_MASK) == 0) { + btrfs_err(root->fs_info, "missing chunk type flag: 0x%llx", type); + return -EIO; + } + + if ((type & BTRFS_BLOCK_GROUP_SYSTEM) && + (type & (BTRFS_BLOCK_GROUP_METADATA | BTRFS_BLOCK_GROUP_DATA))) { + btrfs_err(root->fs_info, + "system chunk with data or metadata type: 0x%llx", type); + return -EIO; + } + + features = btrfs_super_incompat_flags(root->fs_info->super_copy); + if (features & BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS) + mixed = true; + + if (!mixed) { + if ((type & BTRFS_BLOCK_GROUP_METADATA) && + (type & BTRFS_BLOCK_GROUP_DATA)) { + btrfs_err(root->fs_info, + "mixed chunk type in non-mixed mode: 0x%llx", type); + return -EIO; + } + } + if ((type & BTRFS_BLOCK_GROUP_RAID10 && sub_stripes != 2) || (type & BTRFS_BLOCK_GROUP_RAID1 && num_stripes < 1) || (type & BTRFS_BLOCK_GROUP_RAID5 && num_stripes < 2) ||
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Qu Wenruo wqu@suse.com
commit 514c7dca85a0bf40be984dab0b477403a6db901f upstream.
A crafted btrfs image with incorrect chunk<->block group mapping will trigger a lot of unexpected things as the mapping is essential.
Although the problem can be caught by block group item checker added in "btrfs: tree-checker: Verify block_group_item", it's still not sufficient. A sufficiently valid block group item can pass the check added by the mentioned patch but could fail to match the existing chunk.
This patch will add extra block group -> chunk mapping check, to ensure we have a completely matching (start, len, flags) chunk for each block group at mount time.
Here we reuse the original helper find_first_block_group(), which is already doing the basic bg -> chunk checks, adding further checks of the start/len and type flags.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=199837 Reported-by: Xu Wen wen.xu@gatech.edu Signed-off-by: Qu Wenruo wqu@suse.com Reviewed-by: Su Yue suy.fnst@cn.fujitsu.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com [bwh: Backported to 4.4: Use root->fs_info instead of fs_info] Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/extent-tree.c | 28 +++++++++++++++++++++++++++- 1 file changed, 27 insertions(+), 1 deletion(-)
--- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -9487,6 +9487,8 @@ static int find_first_block_group(struct int ret = 0; struct btrfs_key found_key; struct extent_buffer *leaf; + struct btrfs_block_group_item bg; + u64 flags; int slot;
ret = btrfs_search_slot(NULL, root, key, path, 0, 0); @@ -9521,8 +9523,32 @@ static int find_first_block_group(struct "logical %llu len %llu found bg but no related chunk", found_key.objectid, found_key.offset); ret = -ENOENT; + } else if (em->start != found_key.objectid || + em->len != found_key.offset) { + btrfs_err(root->fs_info, + "block group %llu len %llu mismatch with chunk %llu len %llu", + found_key.objectid, found_key.offset, + em->start, em->len); + ret = -EUCLEAN; } else { - ret = 0; + read_extent_buffer(leaf, &bg, + btrfs_item_ptr_offset(leaf, slot), + sizeof(bg)); + flags = btrfs_block_group_flags(&bg) & + BTRFS_BLOCK_GROUP_TYPE_MASK; + + if (flags != (em->map_lookup->type & + BTRFS_BLOCK_GROUP_TYPE_MASK)) { + btrfs_err(root->fs_info, +"block group %llu len %llu type flags 0x%llx mismatch with chunk type flags 0x%llx", + found_key.objectid, + found_key.offset, flags, + (BTRFS_BLOCK_GROUP_TYPE_MASK & + em->map_lookup->type)); + ret = -EUCLEAN; + } else { + ret = 0; + } } free_extent_map(em); goto out;
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Qu Wenruo wqu@suse.com
commit 7ef49515fa6727cb4b6f2f5b0ffbc5fc20a9f8c6 upstream.
If a crafted image has missing block group items, it could cause unexpected behavior and breaks the assumption of 1:1 chunk<->block group mapping.
Although we have the block group -> chunk mapping check, we still need chunk -> block group mapping check.
This patch will do extra check to ensure each chunk has its corresponding block group.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=199847 Reported-by: Xu Wen wen.xu@gatech.edu Signed-off-by: Qu Wenruo wqu@suse.com Reviewed-by: Gu Jinxiang gujx@cn.fujitsu.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com [bwh: Backported to 4.4: adjust context] Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/extent-tree.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 57 insertions(+), 1 deletion(-)
--- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -9765,6 +9765,62 @@ btrfs_create_block_group_cache(struct bt return cache; }
+ +/* + * Iterate all chunks and verify that each of them has the corresponding block + * group + */ +static int check_chunk_block_group_mappings(struct btrfs_fs_info *fs_info) +{ + struct btrfs_mapping_tree *map_tree = &fs_info->mapping_tree; + struct extent_map *em; + struct btrfs_block_group_cache *bg; + u64 start = 0; + int ret = 0; + + while (1) { + read_lock(&map_tree->map_tree.lock); + /* + * lookup_extent_mapping will return the first extent map + * intersecting the range, so setting @len to 1 is enough to + * get the first chunk. + */ + em = lookup_extent_mapping(&map_tree->map_tree, start, 1); + read_unlock(&map_tree->map_tree.lock); + if (!em) + break; + + bg = btrfs_lookup_block_group(fs_info, em->start); + if (!bg) { + btrfs_err(fs_info, + "chunk start=%llu len=%llu doesn't have corresponding block group", + em->start, em->len); + ret = -EUCLEAN; + free_extent_map(em); + break; + } + if (bg->key.objectid != em->start || + bg->key.offset != em->len || + (bg->flags & BTRFS_BLOCK_GROUP_TYPE_MASK) != + (em->map_lookup->type & BTRFS_BLOCK_GROUP_TYPE_MASK)) { + btrfs_err(fs_info, +"chunk start=%llu len=%llu flags=0x%llx doesn't match block group start=%llu len=%llu flags=0x%llx", + em->start, em->len, + em->map_lookup->type & BTRFS_BLOCK_GROUP_TYPE_MASK, + bg->key.objectid, bg->key.offset, + bg->flags & BTRFS_BLOCK_GROUP_TYPE_MASK); + ret = -EUCLEAN; + free_extent_map(em); + btrfs_put_block_group(bg); + break; + } + start = em->start + em->len; + free_extent_map(em); + btrfs_put_block_group(bg); + } + return ret; +} + int btrfs_read_block_groups(struct btrfs_root *root) { struct btrfs_path *path; @@ -9951,7 +10007,7 @@ int btrfs_read_block_groups(struct btrfs }
init_global_block_rsv(info); - ret = 0; + ret = check_chunk_block_group_mappings(info); error: btrfs_free_path(path); return ret;
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Qu Wenruo wqu@suse.com
commit f556faa46eb4e96d0d0772e74ecf66781e132f72 upstream.
Although we have tree level check at tree read runtime, it's completely based on its parent level. We still need to do accurate level check to avoid invalid tree blocks sneak into kernel space.
The check itself is simple, for leaf its level should always be 0. For nodes its level should be in range [1, BTRFS_MAX_LEVEL - 1].
Signed-off-by: Qu Wenruo wqu@suse.com Reviewed-by: Su Yue suy.fnst@cn.fujitsu.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com [bwh: Backported to 4.4: - Pass root instead of fs_info to generic_err() - Adjust context] Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/tree-checker.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
--- a/fs/btrfs/tree-checker.c +++ b/fs/btrfs/tree-checker.c @@ -447,6 +447,13 @@ static int check_leaf(struct btrfs_root u32 nritems = btrfs_header_nritems(leaf); int slot;
+ if (btrfs_header_level(leaf) != 0) { + generic_err(root, leaf, 0, + "invalid level for leaf, have %d expect 0", + btrfs_header_level(leaf)); + return -EUCLEAN; + } + /* * Extent buffers from a relocation tree have a owner field that * corresponds to the subvolume tree they are based on. So just from an @@ -589,9 +596,16 @@ int btrfs_check_node(struct btrfs_root * unsigned long nr = btrfs_header_nritems(node); struct btrfs_key key, next_key; int slot; + int level = btrfs_header_level(node); u64 bytenr; int ret = 0;
+ if (level <= 0 || level >= BTRFS_MAX_LEVEL) { + generic_err(root, node, 0, + "invalid level for node, have %d expect [1, %d]", + level, BTRFS_MAX_LEVEL - 1); + return -EUCLEAN; + } if (nr == 0 || nr > BTRFS_NODEPTRS_PER_BLOCK(root)) { btrfs_crit(root->fs_info, "corrupt node: root=%llu block=%llu, nritems too %s, have %lu expect range [1,%zu]",
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shaokun Zhang zhangshaokun@hisilicon.com
commit 761333f2f50ccc887aa9957ae829300262c0d15b upstream.
block_group_err shows the group system as a decimal value with a '0x' prefix, which is somewhat misleading.
Fix it to print hexadecimal, as was intended.
Fixes: fce466eab7ac6 ("btrfs: tree-checker: Verify block_group_item") Reviewed-by: Nikolay Borisov nborisov@suse.com Reviewed-by: Qu Wenruo wqu@suse.com Signed-off-by: Shaokun Zhang zhangshaokun@hisilicon.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/tree-checker.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/btrfs/tree-checker.c +++ b/fs/btrfs/tree-checker.c @@ -399,7 +399,7 @@ static int check_block_group_item(struct type != (BTRFS_BLOCK_GROUP_METADATA | BTRFS_BLOCK_GROUP_DATA)) { block_group_err(fs_info, leaf, slot, -"invalid type, have 0x%llx (%lu bits set) expect either 0x%llx, 0x%llx, 0x%llu or 0x%llx", +"invalid type, have 0x%llx (%lu bits set) expect either 0x%llx, 0x%llx, 0x%llx or 0x%llx", type, hweight64(type), BTRFS_BLOCK_GROUP_DATA, BTRFS_BLOCK_GROUP_METADATA, BTRFS_BLOCK_GROUP_SYSTEM,
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pavel Shilovsky pshilov@microsoft.com
commit ee13919c2e8d1f904e035ad4b4239029a8994131 upstream.
Currently we hide EINTR code returned from sock_sendmsg() and return 0 instead. This makes a caller think that we successfully completed the network operation which is not true. Fix this by properly returning EINTR to callers.
Cc: stable@vger.kernel.org Signed-off-by: Pavel Shilovsky pshilov@microsoft.com Reviewed-by: Jeff Layton jlayton@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/cifs/transport.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/cifs/transport.c +++ b/fs/cifs/transport.c @@ -360,7 +360,7 @@ uncork: if (rc < 0 && rc != -EINTR) cifs_dbg(VFS, "Error %d sending data on socket to server\n", rc); - else + else if (rc > 0) rc = 0;
return rc;
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ross Lagerwall ross.lagerwall@citrix.com
commit b9a74cde94957d82003fb9f7ab4777938ca851cd upstream.
If maxBuf is small but non-zero, it could result in a zero sized lock element array which we would then try and access OOB.
Signed-off-by: Ross Lagerwall ross.lagerwall@citrix.com Signed-off-by: Steve French stfrench@microsoft.com CC: Stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/cifs/file.c | 8 ++++---- fs/cifs/smb2file.c | 4 ++-- 2 files changed, 6 insertions(+), 6 deletions(-)
--- a/fs/cifs/file.c +++ b/fs/cifs/file.c @@ -1073,10 +1073,10 @@ cifs_push_mandatory_locks(struct cifsFil
/* * Accessing maxBuf is racy with cifs_reconnect - need to store value - * and check it for zero before using. + * and check it before using. */ max_buf = tcon->ses->server->maxBuf; - if (!max_buf) { + if (max_buf < (sizeof(struct smb_hdr) + sizeof(LOCKING_ANDX_RANGE))) { free_xid(xid); return -EINVAL; } @@ -1404,10 +1404,10 @@ cifs_unlock_range(struct cifsFileInfo *c
/* * Accessing maxBuf is racy with cifs_reconnect - need to store value - * and check it for zero before using. + * and check it before using. */ max_buf = tcon->ses->server->maxBuf; - if (!max_buf) + if (max_buf < (sizeof(struct smb_hdr) + sizeof(LOCKING_ANDX_RANGE))) return -EINVAL;
max_num = (max_buf - sizeof(struct smb_hdr)) / --- a/fs/cifs/smb2file.c +++ b/fs/cifs/smb2file.c @@ -123,10 +123,10 @@ smb2_unlock_range(struct cifsFileInfo *c
/* * Accessing maxBuf is racy with cifs_reconnect - need to store value - * and check it for zero before using. + * and check it before using. */ max_buf = tcon->ses->server->maxBuf; - if (!max_buf) + if (max_buf < sizeof(struct smb2_lock_element)) return -EINVAL;
max_num = max_buf / sizeof(struct smb2_lock_element);
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniele Palmas dnlplm@gmail.com
commit 34aabf918717dd14e05051896aaecd3b16b53d95 upstream.
Telit 3G Intel based modems require zero packet to be sent if out data size is equal to the endpoint max packet size.
Signed-off-by: Daniele Palmas dnlplm@gmail.com Cc: stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/class/cdc-acm.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/drivers/usb/class/cdc-acm.c +++ b/drivers/usb/class/cdc-acm.c @@ -1885,6 +1885,13 @@ static const struct usb_device_id acm_id .driver_info = IGNORE_DEVICE, },
+ { USB_DEVICE(0x1bc7, 0x0021), /* Telit 3G ACM only composition */ + .driver_info = SEND_ZERO_PACKET, + }, + { USB_DEVICE(0x1bc7, 0x0023), /* Telit 3G ACM + ECM composition */ + .driver_info = SEND_ZERO_PACKET, + }, + /* control interfaces without any protocol set */ { USB_INTERFACE_INFO(USB_CLASS_COMM, USB_CDC_SUBCLASS_ACM, USB_CDC_PROTO_NONE) },
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Icenowy Zheng icenowy@aosc.io
commit c5603d2fdb424849360fe7e3f8c1befc97571b8c upstream.
Currently the code will set US_FL_SANE_SENSE flag unconditionally if device claims SPC3+, however we should allow US_FL_BAD_SENSE flag to prevent this behavior, because SMI SM3350 UFS-USB bridge controller, which claims SPC4, will show strange behavior with 96-byte sense (put the chip into a wrong state that cannot read/write anything).
Check the presence of US_FL_BAD_SENSE when assuming US_FL_SANE_SENSE on SPC4+ devices.
Signed-off-by: Icenowy Zheng icenowy@aosc.io Cc: stable stable@vger.kernel.org Acked-by: Alan Stern stern@rowland.harvard.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/storage/scsiglue.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/drivers/usb/storage/scsiglue.c +++ b/drivers/usb/storage/scsiglue.c @@ -223,8 +223,12 @@ static int slave_configure(struct scsi_d if (!(us->fflags & US_FL_NEEDS_CAP16)) sdev->try_rc_10_first = 1;
- /* assume SPC3 or latter devices support sense size > 18 */ - if (sdev->scsi_level > SCSI_SPC_2) + /* + * assume SPC3 or latter devices support sense size > 18 + * unless US_FL_BAD_SENSE quirk is specified. + */ + if (sdev->scsi_level > SCSI_SPC_2 && + !(us->fflags & US_FL_BAD_SENSE)) us->fflags |= US_FL_SANE_SENSE;
/* USB-IDE bridges tend to report SK = 0x04 (Non-recoverable
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Icenowy Zheng icenowy@aosc.io
commit 0a99cc4b8ee83885ab9f097a3737d1ab28455ac0 upstream.
The SMI SM3350 USB-UFS bridge controller cannot handle long sense request correctly and will make the chip refuse to do read/write when requested long sense.
Add a bad sense quirk for it.
Signed-off-by: Icenowy Zheng icenowy@aosc.io Cc: stable stable@vger.kernel.org Acked-by: Alan Stern stern@rowland.harvard.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/storage/unusual_devs.h | 12 ++++++++++++ 1 file changed, 12 insertions(+)
--- a/drivers/usb/storage/unusual_devs.h +++ b/drivers/usb/storage/unusual_devs.h @@ -1393,6 +1393,18 @@ UNUSUAL_DEV( 0x0d49, 0x7310, 0x0000, 0x US_FL_SANE_SENSE),
/* + * Reported by Icenowy Zheng icenowy@aosc.io + * The SMI SM3350 USB-UFS bridge controller will enter a wrong state + * that do not process read/write command if a long sense is requested, + * so force to use 18-byte sense. + */ +UNUSUAL_DEV( 0x090c, 0x3350, 0x0000, 0xffff, + "SMI", + "SM3350 UFS-to-USB-Mass-Storage bridge", + USB_SC_DEVICE, USB_PR_DEVICE, NULL, + US_FL_BAD_SENSE ), + +/* * Pete Zaitcev zaitcev@yahoo.com, bz#164688. * The device blatantly ignores LUN and returns 1 in GetMaxLUN. */
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jack Stocker jackstocker.93@gmail.com
commit 3483254b89438e60f719937376c5e0ce2bc46761 upstream.
To match the Corsair Strafe RGB, the Corsair K70 RGB also requires USB_QUIRK_DELAY_CTRL_MSG to completely resolve boot connection issues discussed here: https://github.com/ckb-next/ckb-next/issues/42. Otherwise roughly 1 in 10 boots the keyboard will fail to be detected.
Patch that applied delay control quirk for Corsair Strafe RGB: cb88a0588717 ("usb: quirks: add control message delay for 1b1c:1b20")
Previous K70 RGB patch to add delay-init quirk: 7a1646d92257 ("Add delay-init quirk for Corsair K70 RGB keyboards")
Signed-off-by: Jack Stocker jackstocker.93@gmail.com Cc: stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/core/quirks.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/usb/core/quirks.c +++ b/drivers/usb/core/quirks.c @@ -240,7 +240,8 @@ static const struct usb_device_id usb_qu USB_QUIRK_LINEAR_UFRAME_INTR_BINTERVAL },
/* Corsair K70 RGB */ - { USB_DEVICE(0x1b1c, 0x1b13), .driver_info = USB_QUIRK_DELAY_INIT }, + { USB_DEVICE(0x1b1c, 0x1b13), .driver_info = USB_QUIRK_DELAY_INIT | + USB_QUIRK_DELAY_CTRL_MSG },
/* Corsair Strafe */ { USB_DEVICE(0x1b1c, 0x1b15), .driver_info = USB_QUIRK_DELAY_INIT |
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christoph Lameter cl@linux.com
commit 09c2e76ed734a1d36470d257a778aaba28e86531 upstream.
Callers of __alloc_alien() check for NULL. We must do the same check in __alloc_alien_cache to avoid NULL pointer dereferences on allocation failures.
Link: http://lkml.kernel.org/r/010001680f42f192-82b4e12e-1565-4ee0-ae1f-1e98974906... Fixes: 49dfc304ba241 ("slab: use the lock on alien_cache, instead of the lock on array_cache") Fixes: c8522a3a5832b ("Slab: introduce alloc_alien") Signed-off-by: Christoph Lameter cl@linux.com Reported-by: syzbot+d6ed4ec679652b4fd4e4@syzkaller.appspotmail.com Reviewed-by: Andrew Morton akpm@linux-foundation.org Cc: Pekka Enberg penberg@kernel.org Cc: David Rientjes rientjes@google.com Cc: Joonsoo Kim iamjoonsoo.kim@lge.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- mm/slab.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/mm/slab.c +++ b/mm/slab.c @@ -875,8 +875,10 @@ static struct alien_cache *__alloc_alien struct alien_cache *alc = NULL;
alc = kmalloc_node(memsize, gfp, node); - init_arraycache(&alc->ac, entries, batch); - spin_lock_init(&alc->lock); + if (alc) { + init_arraycache(&alc->ac, entries, batch); + spin_lock_init(&alc->lock); + } return alc; }
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ley Foon Tan lftan@altera.com
commit eff31f4002c4e25b9b8c39d0a3a551c6c64c77e8 upstream.
Originally altera_pcie_link_is_up() decided the link was up if any of the low four bits of the LTSSM register were set. But the link is only up if the LTSSM state is L0, so check for that exact value.
[bhelgaas: changelog] Signed-off-by: Ley Foon Tan lftan@altera.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Cc: Claudius Heine claudius.heine.ext@siemens.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/pci/host/pcie-altera.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/pci/host/pcie-altera.c +++ b/drivers/pci/host/pcie-altera.c @@ -40,6 +40,7 @@ #define P2A_INT_ENABLE 0x3070 #define P2A_INT_ENA_ALL 0xf #define RP_LTSSM 0x3c64 +#define RP_LTSSM_MASK 0x1f #define LTSSM_L0 0xf
/* TLP configuration type 0 and 1 */ @@ -140,7 +141,7 @@ static void tlp_write_tx(struct altera_p
static bool altera_pcie_link_is_up(struct altera_pcie *pcie) { - return !!(cra_readl(pcie, RP_LTSSM) & LTSSM_L0); + return !!((cra_readl(pcie, RP_LTSSM) & RP_LTSSM_MASK) == LTSSM_L0); }
static bool altera_pcie_valid_config(struct altera_pcie *pcie,
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bjorn Helgaas bhelgaas@google.com
commit f8be11ae3d2c9a1338da37ff91ff4c65922d21be upstream.
Move cra_writel(), cra_readl(), and altera_pcie_link_is_up() so a future patch can use them in altera_pcie_retrain(). No functional change intended.
Signed-off-by: Bjorn Helgaas bhelgaas@google.com Cc: Claudius Heine claudius.heine.ext@siemens.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/pci/host/pcie-altera.c | 32 ++++++++++++++++---------------- 1 file changed, 16 insertions(+), 16 deletions(-)
--- a/drivers/pci/host/pcie-altera.c +++ b/drivers/pci/host/pcie-altera.c @@ -81,6 +81,22 @@ struct tlp_rp_regpair_t { u32 reg1; };
+static inline void cra_writel(struct altera_pcie *pcie, const u32 value, + const u32 reg) +{ + writel_relaxed(value, pcie->cra_base + reg); +} + +static inline u32 cra_readl(struct altera_pcie *pcie, const u32 reg) +{ + return readl_relaxed(pcie->cra_base + reg); +} + +static bool altera_pcie_link_is_up(struct altera_pcie *pcie) +{ + return !!((cra_readl(pcie, RP_LTSSM) & RP_LTSSM_MASK) == LTSSM_L0); +} + static void altera_pcie_retrain(struct pci_dev *dev) { u16 linkcap, linkstat; @@ -120,17 +136,6 @@ static bool altera_pcie_hide_rc_bar(stru return false; }
-static inline void cra_writel(struct altera_pcie *pcie, const u32 value, - const u32 reg) -{ - writel_relaxed(value, pcie->cra_base + reg); -} - -static inline u32 cra_readl(struct altera_pcie *pcie, const u32 reg) -{ - return readl_relaxed(pcie->cra_base + reg); -} - static void tlp_write_tx(struct altera_pcie *pcie, struct tlp_rp_regpair_t *tlp_rp_regdata) { @@ -139,11 +144,6 @@ static void tlp_write_tx(struct altera_p cra_writel(pcie, tlp_rp_regdata->ctrl, RP_TX_CNTRL); }
-static bool altera_pcie_link_is_up(struct altera_pcie *pcie) -{ - return !!((cra_readl(pcie, RP_LTSSM) & RP_LTSSM_MASK) == LTSSM_L0); -} - static bool altera_pcie_valid_config(struct altera_pcie *pcie, struct pci_bus *bus, int dev) {
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ley Foon Tan lftan@altera.com
commit c622032ebc538cb3869c312ae3ad235a99da84b6 upstream.
Check the link status before retraining. If the link is not up, don't bother trying to retrain it.
[bhelgaas: split code move to separate patch, changelog] Signed-off-by: Ley Foon Tan lftan@altera.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Cc: Claudius Heine claudius.heine.ext@siemens.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/pci/host/pcie-altera.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/pci/host/pcie-altera.c +++ b/drivers/pci/host/pcie-altera.c @@ -100,6 +100,10 @@ static bool altera_pcie_link_is_up(struc static void altera_pcie_retrain(struct pci_dev *dev) { u16 linkcap, linkstat; + struct altera_pcie *pcie = dev->bus->sysdata; + + if (!altera_pcie_link_is_up(pcie)) + return;
/* * Set the retrain bit if the PCIe rootport support > 2.5GB/s, but
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ley Foon Tan lftan@altera.com
commit 3a928e98a833e1a470a60d2fedf3c55502185fb7 upstream.
Some PCIe devices take a long time to reach link up state after retrain. Poll for link up status after retraining the link. This is to make sure the link is up before we access configuration space.
[bhelgaas: changelog] Signed-off-by: Ley Foon Tan lftan@altera.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Cc: Claudius Heine claudius.heine.ext@siemens.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/pci/host/pcie-altera.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-)
--- a/drivers/pci/host/pcie-altera.c +++ b/drivers/pci/host/pcie-altera.c @@ -61,6 +61,8 @@ #define TLP_LOOP 500 #define RP_DEVFN 0
+#define LINK_UP_TIMEOUT 5000 + #define INTX_NUM 4
#define DWORD_MASK 3 @@ -101,6 +103,7 @@ static void altera_pcie_retrain(struct p { u16 linkcap, linkstat; struct altera_pcie *pcie = dev->bus->sysdata; + int timeout = 0;
if (!altera_pcie_link_is_up(pcie)) return; @@ -115,9 +118,16 @@ static void altera_pcie_retrain(struct p return;
pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &linkstat); - if ((linkstat & PCI_EXP_LNKSTA_CLS) == PCI_EXP_LNKSTA_CLS_2_5GB) + if ((linkstat & PCI_EXP_LNKSTA_CLS) == PCI_EXP_LNKSTA_CLS_2_5GB) { pcie_capability_set_word(dev, PCI_EXP_LNKCTL, PCI_EXP_LNKCTL_RL); + while (!altera_pcie_link_is_up(pcie)) { + timeout++; + if (timeout > LINK_UP_TIMEOUT) + break; + udelay(5); + } + } } DECLARE_PCI_FIXUP_EARLY(0x1172, PCI_ANY_ID, altera_pcie_retrain);
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ley Foon Tan lftan@altera.com
commit 411dc32d8810e0a204c799ce5c97cb56990de1cb upstream.
Poll for link training status is cleared before poll for link up status. This can help to get the reliable link up status, especially when PCIe is in Gen 3 speed.
Signed-off-by: Ley Foon Tan lftan@altera.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Cc: Claudius Heine claudius.heine.ext@siemens.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/pci/host/pcie-altera.c | 45 +++++++++++++++++++++++++++++++++-------- 1 file changed, 37 insertions(+), 8 deletions(-)
--- a/drivers/pci/host/pcie-altera.c +++ b/drivers/pci/host/pcie-altera.c @@ -61,7 +61,8 @@ #define TLP_LOOP 500 #define RP_DEVFN 0
-#define LINK_UP_TIMEOUT 5000 +#define LINK_UP_TIMEOUT HZ +#define LINK_RETRAIN_TIMEOUT HZ
#define INTX_NUM 4
@@ -99,11 +100,44 @@ static bool altera_pcie_link_is_up(struc return !!((cra_readl(pcie, RP_LTSSM) & RP_LTSSM_MASK) == LTSSM_L0); }
+static void altera_wait_link_retrain(struct pci_dev *dev) +{ + u16 reg16; + unsigned long start_jiffies; + struct altera_pcie *pcie = dev->bus->sysdata; + + /* Wait for link training end. */ + start_jiffies = jiffies; + for (;;) { + pcie_capability_read_word(dev, PCI_EXP_LNKSTA, ®16); + if (!(reg16 & PCI_EXP_LNKSTA_LT)) + break; + + if (time_after(jiffies, start_jiffies + LINK_RETRAIN_TIMEOUT)) { + dev_err(&pcie->pdev->dev, "link retrain timeout\n"); + break; + } + udelay(100); + } + + /* Wait for link is up */ + start_jiffies = jiffies; + for (;;) { + if (altera_pcie_link_is_up(pcie)) + break; + + if (time_after(jiffies, start_jiffies + LINK_UP_TIMEOUT)) { + dev_err(&pcie->pdev->dev, "link up timeout\n"); + break; + } + udelay(100); + } +} + static void altera_pcie_retrain(struct pci_dev *dev) { u16 linkcap, linkstat; struct altera_pcie *pcie = dev->bus->sysdata; - int timeout = 0;
if (!altera_pcie_link_is_up(pcie)) return; @@ -121,12 +155,7 @@ static void altera_pcie_retrain(struct p if ((linkstat & PCI_EXP_LNKSTA_CLS) == PCI_EXP_LNKSTA_CLS_2_5GB) { pcie_capability_set_word(dev, PCI_EXP_LNKCTL, PCI_EXP_LNKCTL_RL); - while (!altera_pcie_link_is_up(pcie)) { - timeout++; - if (timeout > LINK_UP_TIMEOUT) - break; - udelay(5); - } + altera_wait_link_retrain(dev); } } DECLARE_PCI_FIXUP_EARLY(0x1172, PCI_ANY_ID, altera_pcie_retrain);
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ley Foon Tan lftan@altera.com
commit 31fc0ad47e2e0b8417616aa0f1ddcc67edf1e109 upstream.
Rework configs accessors so a future patch can use them in _probe() with struct altera_pcie instead of struct pci_bus.
Signed-off-by: Ley Foon Tan lftan@altera.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Cc: Claudius Heine claudius.heine.ext@siemens.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/pci/host/pcie-altera.c | 64 ++++++++++++++++++++++++++--------------- 1 file changed, 41 insertions(+), 23 deletions(-)
--- a/drivers/pci/host/pcie-altera.c +++ b/drivers/pci/host/pcie-altera.c @@ -330,22 +330,14 @@ static int tlp_cfg_dword_write(struct al return PCIBIOS_SUCCESSFUL; }
-static int altera_pcie_cfg_read(struct pci_bus *bus, unsigned int devfn, - int where, int size, u32 *value) +static int _altera_pcie_cfg_read(struct altera_pcie *pcie, u8 busno, + unsigned int devfn, int where, int size, + u32 *value) { - struct altera_pcie *pcie = bus->sysdata; int ret; u32 data; u8 byte_en;
- if (altera_pcie_hide_rc_bar(bus, devfn, where)) - return PCIBIOS_BAD_REGISTER_NUMBER; - - if (!altera_pcie_valid_config(pcie, bus, PCI_SLOT(devfn))) { - *value = 0xffffffff; - return PCIBIOS_DEVICE_NOT_FOUND; - } - switch (size) { case 1: byte_en = 1 << (where & 3); @@ -358,7 +350,7 @@ static int altera_pcie_cfg_read(struct p break; }
- ret = tlp_cfg_dword_read(pcie, bus->number, devfn, + ret = tlp_cfg_dword_read(pcie, busno, devfn, (where & ~DWORD_MASK), byte_en, &data); if (ret != PCIBIOS_SUCCESSFUL) return ret; @@ -378,20 +370,14 @@ static int altera_pcie_cfg_read(struct p return PCIBIOS_SUCCESSFUL; }
-static int altera_pcie_cfg_write(struct pci_bus *bus, unsigned int devfn, - int where, int size, u32 value) +static int _altera_pcie_cfg_write(struct altera_pcie *pcie, u8 busno, + unsigned int devfn, int where, int size, + u32 value) { - struct altera_pcie *pcie = bus->sysdata; u32 data32; u32 shift = 8 * (where & 3); u8 byte_en;
- if (altera_pcie_hide_rc_bar(bus, devfn, where)) - return PCIBIOS_BAD_REGISTER_NUMBER; - - if (!altera_pcie_valid_config(pcie, bus, PCI_SLOT(devfn))) - return PCIBIOS_DEVICE_NOT_FOUND; - switch (size) { case 1: data32 = (value & 0xff) << shift; @@ -407,8 +393,40 @@ static int altera_pcie_cfg_write(struct break; }
- return tlp_cfg_dword_write(pcie, bus->number, devfn, - (where & ~DWORD_MASK), byte_en, data32); + return tlp_cfg_dword_write(pcie, busno, devfn, (where & ~DWORD_MASK), + byte_en, data32); +} + +static int altera_pcie_cfg_read(struct pci_bus *bus, unsigned int devfn, + int where, int size, u32 *value) +{ + struct altera_pcie *pcie = bus->sysdata; + + if (altera_pcie_hide_rc_bar(bus, devfn, where)) + return PCIBIOS_BAD_REGISTER_NUMBER; + + if (!altera_pcie_valid_config(pcie, bus, PCI_SLOT(devfn))) { + *value = 0xffffffff; + return PCIBIOS_DEVICE_NOT_FOUND; + } + + return _altera_pcie_cfg_read(pcie, bus->number, devfn, where, size, + value); +} + +static int altera_pcie_cfg_write(struct pci_bus *bus, unsigned int devfn, + int where, int size, u32 value) +{ + struct altera_pcie *pcie = bus->sysdata; + + if (altera_pcie_hide_rc_bar(bus, devfn, where)) + return PCIBIOS_BAD_REGISTER_NUMBER; + + if (!altera_pcie_valid_config(pcie, bus, PCI_SLOT(devfn))) + return PCIBIOS_DEVICE_NOT_FOUND; + + return _altera_pcie_cfg_write(pcie, bus->number, devfn, where, size, + value); }
static struct pci_ops altera_pcie_ops = {
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ley Foon Tan lftan@altera.com
commit ce4f1c7ad490aa7129bde5632d6e53943f8a866c upstream.
Previously we used a PCI early fixup to initiate a link retrain on Altera devices. But Altera PCIe IP can be configured as either a Root Port or an Endpoint, and they might have same vendor ID, so the fixup would be run for both.
We only want to initiate a link retrain for Altera Root Port devices, not for Endpoints, so move the link retrain functionality from the fixup to altera_pcie_host_init().
[bhelgaas: changelog] Signed-off-by: Ley Foon Tan lftan@altera.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Cc: Claudius Heine claudius.heine.ext@siemens.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/pci/host/pcie-altera.c | 151 ++++++++++++++++++++++++----------------- 1 file changed, 91 insertions(+), 60 deletions(-)
--- a/drivers/pci/host/pcie-altera.c +++ b/drivers/pci/host/pcie-altera.c @@ -43,6 +43,7 @@ #define RP_LTSSM_MASK 0x1f #define LTSSM_L0 0xf
+#define PCIE_CAP_OFFSET 0x80 /* TLP configuration type 0 and 1 */ #define TLP_FMTTYPE_CFGRD0 0x04 /* Configuration Read Type 0 */ #define TLP_FMTTYPE_CFGWR0 0x44 /* Configuration Write Type 0 */ @@ -100,66 +101,6 @@ static bool altera_pcie_link_is_up(struc return !!((cra_readl(pcie, RP_LTSSM) & RP_LTSSM_MASK) == LTSSM_L0); }
-static void altera_wait_link_retrain(struct pci_dev *dev) -{ - u16 reg16; - unsigned long start_jiffies; - struct altera_pcie *pcie = dev->bus->sysdata; - - /* Wait for link training end. */ - start_jiffies = jiffies; - for (;;) { - pcie_capability_read_word(dev, PCI_EXP_LNKSTA, ®16); - if (!(reg16 & PCI_EXP_LNKSTA_LT)) - break; - - if (time_after(jiffies, start_jiffies + LINK_RETRAIN_TIMEOUT)) { - dev_err(&pcie->pdev->dev, "link retrain timeout\n"); - break; - } - udelay(100); - } - - /* Wait for link is up */ - start_jiffies = jiffies; - for (;;) { - if (altera_pcie_link_is_up(pcie)) - break; - - if (time_after(jiffies, start_jiffies + LINK_UP_TIMEOUT)) { - dev_err(&pcie->pdev->dev, "link up timeout\n"); - break; - } - udelay(100); - } -} - -static void altera_pcie_retrain(struct pci_dev *dev) -{ - u16 linkcap, linkstat; - struct altera_pcie *pcie = dev->bus->sysdata; - - if (!altera_pcie_link_is_up(pcie)) - return; - - /* - * Set the retrain bit if the PCIe rootport support > 2.5GB/s, but - * current speed is 2.5 GB/s. - */ - pcie_capability_read_word(dev, PCI_EXP_LNKCAP, &linkcap); - - if ((linkcap & PCI_EXP_LNKCAP_SLS) <= PCI_EXP_LNKCAP_SLS_2_5GB) - return; - - pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &linkstat); - if ((linkstat & PCI_EXP_LNKSTA_CLS) == PCI_EXP_LNKSTA_CLS_2_5GB) { - pcie_capability_set_word(dev, PCI_EXP_LNKCTL, - PCI_EXP_LNKCTL_RL); - altera_wait_link_retrain(dev); - } -} -DECLARE_PCI_FIXUP_EARLY(0x1172, PCI_ANY_ID, altera_pcie_retrain); - /* * Altera PCIe port uses BAR0 of RC's configuration space as the translation * from PCI bus to native BUS. Entire DDR region is mapped into PCIe space @@ -434,6 +375,90 @@ static struct pci_ops altera_pcie_ops = .write = altera_pcie_cfg_write, };
+static int altera_read_cap_word(struct altera_pcie *pcie, u8 busno, + unsigned int devfn, int offset, u16 *value) +{ + u32 data; + int ret; + + ret = _altera_pcie_cfg_read(pcie, busno, devfn, + PCIE_CAP_OFFSET + offset, sizeof(*value), + &data); + *value = data; + return ret; +} + +static int altera_write_cap_word(struct altera_pcie *pcie, u8 busno, + unsigned int devfn, int offset, u16 value) +{ + return _altera_pcie_cfg_write(pcie, busno, devfn, + PCIE_CAP_OFFSET + offset, sizeof(value), + value); +} + +static void altera_wait_link_retrain(struct altera_pcie *pcie) +{ + u16 reg16; + unsigned long start_jiffies; + + /* Wait for link training end. */ + start_jiffies = jiffies; + for (;;) { + altera_read_cap_word(pcie, pcie->root_bus_nr, RP_DEVFN, + PCI_EXP_LNKSTA, ®16); + if (!(reg16 & PCI_EXP_LNKSTA_LT)) + break; + + if (time_after(jiffies, start_jiffies + LINK_RETRAIN_TIMEOUT)) { + dev_err(&pcie->pdev->dev, "link retrain timeout\n"); + break; + } + udelay(100); + } + + /* Wait for link is up */ + start_jiffies = jiffies; + for (;;) { + if (altera_pcie_link_is_up(pcie)) + break; + + if (time_after(jiffies, start_jiffies + LINK_UP_TIMEOUT)) { + dev_err(&pcie->pdev->dev, "link up timeout\n"); + break; + } + udelay(100); + } +} + +static void altera_pcie_retrain(struct altera_pcie *pcie) +{ + u16 linkcap, linkstat, linkctl; + + if (!altera_pcie_link_is_up(pcie)) + return; + + /* + * Set the retrain bit if the PCIe rootport support > 2.5GB/s, but + * current speed is 2.5 GB/s. + */ + altera_read_cap_word(pcie, pcie->root_bus_nr, RP_DEVFN, PCI_EXP_LNKCAP, + &linkcap); + if ((linkcap & PCI_EXP_LNKCAP_SLS) <= PCI_EXP_LNKCAP_SLS_2_5GB) + return; + + altera_read_cap_word(pcie, pcie->root_bus_nr, RP_DEVFN, PCI_EXP_LNKSTA, + &linkstat); + if ((linkstat & PCI_EXP_LNKSTA_CLS) == PCI_EXP_LNKSTA_CLS_2_5GB) { + altera_read_cap_word(pcie, pcie->root_bus_nr, RP_DEVFN, + PCI_EXP_LNKCTL, &linkctl); + linkctl |= PCI_EXP_LNKCTL_RL; + altera_write_cap_word(pcie, pcie->root_bus_nr, RP_DEVFN, + PCI_EXP_LNKCTL, linkctl); + + altera_wait_link_retrain(pcie); + } +} + static int altera_pcie_intx_map(struct irq_domain *domain, unsigned int irq, irq_hw_number_t hwirq) { @@ -568,6 +593,11 @@ static int altera_pcie_parse_dt(struct a return 0; }
+static void altera_pcie_host_init(struct altera_pcie *pcie) +{ + altera_pcie_retrain(pcie); +} + static int altera_pcie_probe(struct platform_device *pdev) { struct altera_pcie *pcie; @@ -605,6 +635,7 @@ static int altera_pcie_probe(struct plat cra_writel(pcie, P2A_INT_STS_ALL, P2A_INT_STATUS); /* enable all interrupts */ cra_writel(pcie, P2A_INT_ENA_ALL, P2A_INT_ENABLE); + altera_pcie_host_init(pcie);
bus = pci_scan_root_bus(&pdev->dev, pcie->root_bus_nr, &altera_pcie_ops, pcie, &pcie->resources);
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hans de Goede hdegoede@redhat.com
commit 7d7b467cb95bf29597b417d4990160d4ea6d69b9 upstream.
Some ACPI tables contain duplicate power resource references like this:
Name (_PR0, Package (0x04) // _PR0: Power Resources for D0 { P28P, P18P, P18P, CLK4 })
This causes a WARN_ON in sysfs_add_link_to_group() because we end up adding a link to the same acpi_device twice:
sysfs: cannot create duplicate filename '/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/808622C1:00/OVTI2680:00/power_resources_D0/LNXPOWER:0a' CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.12-301.fc29.x86_64 #1 Hardware name: Insyde CherryTrail/Type2 - Board Product Name, BIOS jumperx.T87.KFBNEEA02 04/13/2016 Call Trace: dump_stack+0x5c/0x80 sysfs_warn_dup.cold.3+0x17/0x2a sysfs_do_create_link_sd.isra.2+0xa9/0xb0 sysfs_add_link_to_group+0x30/0x50 acpi_power_expose_list+0x74/0xa0 acpi_power_add_remove_device+0x50/0xa0 acpi_add_single_object+0x26b/0x5f0 acpi_bus_check_add+0xc4/0x250 ...
To address this issue, make acpi_extract_power_resources() check for duplicates and simply skip them when found.
Cc: All applicable stable@vger.kernel.org Signed-off-by: Hans de Goede hdegoede@redhat.com [ rjw: Subject & changelog, comments ] Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/acpi/power.c | 22 ++++++++++++++++++++++ 1 file changed, 22 insertions(+)
--- a/drivers/acpi/power.c +++ b/drivers/acpi/power.c @@ -131,6 +131,23 @@ void acpi_power_resources_list_free(stru } }
+static bool acpi_power_resource_is_dup(union acpi_object *package, + unsigned int start, unsigned int i) +{ + acpi_handle rhandle, dup; + unsigned int j; + + /* The caller is expected to check the package element types */ + rhandle = package->package.elements[i].reference.handle; + for (j = start; j < i; j++) { + dup = package->package.elements[j].reference.handle; + if (dup == rhandle) + return true; + } + + return false; +} + int acpi_extract_power_resources(union acpi_object *package, unsigned int start, struct list_head *list) { @@ -150,6 +167,11 @@ int acpi_extract_power_resources(union a err = -ENODEV; break; } + + /* Some ACPI tables contain duplicate power resource references */ + if (acpi_power_resource_is_dup(package, start, i)) + continue; + err = acpi_add_power_resource(rhandle); if (err) break;
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yi Zeng yizeng@asrmicro.com
commit 6ebec961d59bccf65d08b13fc1ad4e6272a89338 upstream.
If adapter->retries is set to a minus value from user space via ioctl, it will make __i2c_transfer and __i2c_smbus_xfer skip the calling to adapter->algo->master_xfer and adapter->algo->smbus_xfer that is registered by the underlying bus drivers, and return value 0 to all the callers. The bus driver will never be accessed anymore by all users, besides, the users may still get successful return value without any error or information log print out.
If adapter->timeout is set to minus value from user space via ioctl, it will make the retrying loop in __i2c_transfer and __i2c_smbus_xfer always break after the the first try, due to the time_after always returns true.
Signed-off-by: Yi Zeng yizeng@asrmicro.com [wsa: minor grammar updates to commit message] Signed-off-by: Wolfram Sang wsa@the-dreams.de Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/i2c/i2c-dev.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/drivers/i2c/i2c-dev.c +++ b/drivers/i2c/i2c-dev.c @@ -459,9 +459,15 @@ static long i2cdev_ioctl(struct file *fi return i2cdev_ioctl_smbus(client, arg);
case I2C_RETRIES: + if (arg > INT_MAX) + return -EINVAL; + client->adapter->retries = arg; break; case I2C_TIMEOUT: + if (arg > INT_MAX) + return -EINVAL; + /* For historical reasons, user-space sets the timeout * value in units of 10 ms. */
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Biggers ebiggers@google.com
[It's a minimal fix for a bug that was fixed incidentally by a large refactoring in v4.8.]
In the CTS template, when the input length is <= one block cipher block (e.g. <= 16 bytes for AES) pass the correct length to the underlying CBC transform rather than one block. This matches the upstream behavior and makes the encryption/decryption operation correctly return -EINVAL when 1 <= nbytes < bsize or succeed when nbytes == 0, rather than crashing.
This was fixed upstream incidentally by a large refactoring, commit 0605c41cc53c ("crypto: cts - Convert to skcipher"). But syzkaller easily trips over this when running on older kernels, as it's easily reachable via AF_ALG. Therefore, this patch makes the minimal fix for older kernels.
Cc: linux-crypto@vger.kernel.org Fixes: 76cb9521795a ("[CRYPTO] cts: Add CTS mode required for Kerberos AES support") Signed-off-by: Eric Biggers ebiggers@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- crypto/cts.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/crypto/cts.c +++ b/crypto/cts.c @@ -137,8 +137,8 @@ static int crypto_cts_encrypt(struct blk lcldesc.info = desc->info; lcldesc.flags = desc->flags;
- if (tot_blocks == 1) { - err = crypto_blkcipher_encrypt_iv(&lcldesc, dst, src, bsize); + if (tot_blocks <= 1) { + err = crypto_blkcipher_encrypt_iv(&lcldesc, dst, src, nbytes); } else if (nbytes <= bsize * 2) { err = cts_cbc_encrypt(ctx, desc, dst, src, 0, nbytes); } else { @@ -232,8 +232,8 @@ static int crypto_cts_decrypt(struct blk lcldesc.info = desc->info; lcldesc.flags = desc->flags;
- if (tot_blocks == 1) { - err = crypto_blkcipher_decrypt_iv(&lcldesc, dst, src, bsize); + if (tot_blocks <= 1) { + err = crypto_blkcipher_decrypt_iv(&lcldesc, dst, src, nbytes); } else if (nbytes <= bsize * 2) { err = cts_cbc_decrypt(ctx, desc, dst, src, 0, nbytes); } else {
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Theodore Ts'o tytso@mit.edu
commit 2b08b1f12cd664dc7d5c84ead9ff25ae97ad5491 upstream.
The ext4_inline_data_fiemap() function calls fiemap_fill_next_extent() while still holding the xattr semaphore. This is not necessary and it triggers a circular lockdep warning. This is because fiemap_fill_next_extent() could trigger a page fault when it writes into page which triggers a page fault. If that page is mmaped from the inline file in question, this could very well result in a deadlock.
This problem can be reproduced using generic/519 with a file system configuration which has the inline_data feature enabled.
Signed-off-by: Theodore Ts'o tytso@mit.edu Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/ext4/inline.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/fs/ext4/inline.c +++ b/fs/ext4/inline.c @@ -1861,12 +1861,12 @@ int ext4_inline_data_fiemap(struct inode physical += (char *)ext4_raw_inode(&iloc) - iloc.bh->b_data; physical += offsetof(struct ext4_inode, i_block);
- if (physical) - error = fiemap_fill_next_extent(fieinfo, start, physical, - inline_len, flags); brelse(iloc.bh); out: up_read(&EXT4_I(inode)->xattr_sem); + if (physical) + error = fiemap_fill_next_extent(fieinfo, start, physical, + inline_len, flags); return (error < 0 ? error : 0); }
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vasily Averin vvs@virtuozzo.com
commit d4b09acf924b84bae77cad090a9d108e70b43643 upstream.
if node have NFSv41+ mounts inside several net namespaces it can lead to use-after-free in svc_process_common()
svc_process_common() /* Setup reply header */ rqstp->rq_xprt->xpt_ops->xpo_prep_reply_hdr(rqstp); <<< HERE
svc_process_common() can use incorrect rqstp->rq_xprt, its caller function bc_svc_process() takes it from serv->sv_bc_xprt. The problem is that serv is global structure but sv_bc_xprt is assigned per-netnamespace.
According to Trond, the whole "let's set up rqstp->rq_xprt for the back channel" is nothing but a giant hack in order to work around the fact that svc_process_common() uses it to find the xpt_ops, and perform a couple of (meaningless for the back channel) tests of xpt_flags.
All we really need in svc_process_common() is to be able to run rqstp->rq_xprt->xpt_ops->xpo_prep_reply_hdr()
Bruce J Fields points that this xpo_prep_reply_hdr() call is an awfully roundabout way just to do "svc_putnl(resv, 0);" in the tcp case.
This patch does not initialiuze rqstp->rq_xprt in bc_svc_process(), now it calls svc_process_common() with rqstp->rq_xprt = NULL.
To adjust reply header svc_process_common() just check rqstp->rq_prot and calls svc_tcp_prep_reply_hdr() for tcp case.
To handle rqstp->rq_xprt = NULL case in functions called from svc_process_common() patch intruduces net namespace pointer svc_rqst->rq_bc_net and adjust SVC_NET() definition. Some other function was also adopted to properly handle described case.
Signed-off-by: Vasily Averin vvs@virtuozzo.com Cc: stable@vger.kernel.org Fixes: 23c20ecd4475 ("NFS: callback up - users counting cleanup") Signed-off-by: J. Bruce Fields bfields@redhat.com v2: - added lost extern svc_tcp_prep_reply_hdr() - dropped trace_svc_process() changes - context fixes in svc_process_common() Signed-off-by: Vasily Averin vvs@virtuozzo.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/sunrpc/svc.h | 5 ++++- net/sunrpc/svc.c | 10 +++++++--- net/sunrpc/svc_xprt.c | 5 +++-- net/sunrpc/svcsock.c | 2 +- 4 files changed, 15 insertions(+), 7 deletions(-)
--- a/include/linux/sunrpc/svc.h +++ b/include/linux/sunrpc/svc.h @@ -290,9 +290,12 @@ struct svc_rqst { struct svc_cacherep * rq_cacherep; /* cache info */ struct task_struct *rq_task; /* service thread */ spinlock_t rq_lock; /* per-request lock */ + struct net *rq_bc_net; /* pointer to backchannel's + * net namespace + */ };
-#define SVC_NET(svc_rqst) (svc_rqst->rq_xprt->xpt_net) +#define SVC_NET(rqst) (rqst->rq_xprt ? rqst->rq_xprt->xpt_net : rqst->rq_bc_net)
/* * Rigorous type checking on sockaddr type conversions --- a/net/sunrpc/svc.c +++ b/net/sunrpc/svc.c @@ -1062,6 +1062,8 @@ void svc_printk(struct svc_rqst *rqstp, static __printf(2,3) void svc_printk(struct svc_rqst *rqstp, const char *fmt, ...) {} #endif
+extern void svc_tcp_prep_reply_hdr(struct svc_rqst *); + /* * Common routine for processing the RPC request. */ @@ -1091,7 +1093,8 @@ svc_process_common(struct svc_rqst *rqst clear_bit(RQ_DROPME, &rqstp->rq_flags);
/* Setup reply header */ - rqstp->rq_xprt->xpt_ops->xpo_prep_reply_hdr(rqstp); + if (rqstp->rq_prot == IPPROTO_TCP) + svc_tcp_prep_reply_hdr(rqstp);
svc_putu32(resv, rqstp->rq_xid);
@@ -1138,7 +1141,8 @@ svc_process_common(struct svc_rqst *rqst case SVC_DENIED: goto err_bad_auth; case SVC_CLOSE: - if (test_bit(XPT_TEMP, &rqstp->rq_xprt->xpt_flags)) + if (rqstp->rq_xprt && + test_bit(XPT_TEMP, &rqstp->rq_xprt->xpt_flags)) svc_close_xprt(rqstp->rq_xprt); case SVC_DROP: goto dropit; @@ -1360,10 +1364,10 @@ bc_svc_process(struct svc_serv *serv, st dprintk("svc: %s(%p)\n", __func__, req);
/* Build the svc_rqst used by the common processing routine */ - rqstp->rq_xprt = serv->sv_bc_xprt; rqstp->rq_xid = req->rq_xid; rqstp->rq_prot = req->rq_xprt->prot; rqstp->rq_server = serv; + rqstp->rq_bc_net = req->rq_xprt->xprt_net;
rqstp->rq_addrlen = sizeof(req->rq_xprt->addr); memcpy(&rqstp->rq_addr, &req->rq_xprt->addr, rqstp->rq_addrlen); --- a/net/sunrpc/svc_xprt.c +++ b/net/sunrpc/svc_xprt.c @@ -454,10 +454,11 @@ out: */ void svc_reserve(struct svc_rqst *rqstp, int space) { + struct svc_xprt *xprt = rqstp->rq_xprt; + space += rqstp->rq_res.head[0].iov_len;
- if (space < rqstp->rq_reserved) { - struct svc_xprt *xprt = rqstp->rq_xprt; + if (xprt && space < rqstp->rq_reserved) { atomic_sub((rqstp->rq_reserved - space), &xprt->xpt_reserved); rqstp->rq_reserved = space;
--- a/net/sunrpc/svcsock.c +++ b/net/sunrpc/svcsock.c @@ -1240,7 +1240,7 @@ static int svc_tcp_sendto(struct svc_rqs /* * Setup response header. TCP has a 4B record length field. */ -static void svc_tcp_prep_reply_hdr(struct svc_rqst *rqstp) +void svc_tcp_prep_reply_hdr(struct svc_rqst *rqstp) { struct kvec *resv = &rqstp->rq_res.head[0];
On 1/15/19 9:34 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.4.171 release. There are 51 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu Jan 17 15:48:21 UTC 2019. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.171-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
thanks, -- Shuah
On Tue, 15 Jan 2019 at 22:07, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 4.4.171 release. There are 51 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu Jan 17 15:48:21 UTC 2019. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.171-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Summary ------------------------------------------------------------------------
kernel: 4.4.171-rc1 git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git git branch: linux-4.4.y git commit: cc14a9e207362757ecd2254de04a816657b58904 git describe: v4.4.170-52-gcc14a9e20736 Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-4.4-oe/build/v4.4.170-52-...
No regressions (compared to build v4.4.170)
No fixes (compared to build v4.4.170)
Ran 17023 total tests in the following environments and test suites.
Environments -------------- - i386 - juno-r2 - arm64 - qemu_arm - qemu_i386 - qemu_x86_64 - x15 - arm - x86_64
Test Suites ----------- * boot * kselftest * libhugetlbfs * ltp-cap_bounds-tests * ltp-containers-tests * ltp-cpuhotplug-tests * ltp-cve-tests * ltp-fcntl-locktests-tests * ltp-filecaps-tests * ltp-fs-tests * ltp-fs_bind-tests * ltp-fs_perms_simple-tests * ltp-fsx-tests * ltp-hugetlb-tests * ltp-io-tests * ltp-ipc-tests * ltp-math-tests * ltp-nptl-tests * ltp-open-posix-tests * ltp-pty-tests * ltp-sched-tests * ltp-securebits-tests * ltp-syscalls-tests * ltp-timers-tests * spectre-meltdown-checker-test * install-android-platform-tools-r2600 * kselftest-vsyscall-mode-native * kselftest-vsyscall-mode-none
Summary ------------------------------------------------------------------------
kernel: 4.4.171-rc1 git repo: https://git.linaro.org/lkft/arm64-stable-rc.git git branch: 4.4.171-rc1-hikey-20190115-352 git commit: 82b93f4bdd350fa372bb7104304aeb5c977bb31f git describe: 4.4.171-rc1-hikey-20190115-352 Test details: https://qa-reports.linaro.org/lkft/linaro-hikey-stable-rc-4.4-oe/build/4.4.1...
No regressions (compared to build 4.4.170-rc2-hikey-20190111-350)
No fixes (compared to build 4.4.170-rc2-hikey-20190111-350)
Ran 2756 total tests in the following environments and test suites.
Environments -------------- - hi6220-hikey - arm64 - qemu_arm64
Test Suites ----------- * boot * install-android-platform-tools-r2600 * kselftest * libhugetlbfs * ltp-cap_bounds-tests * ltp-containers-tests * ltp-cpuhotplug-tests * ltp-cve-tests * ltp-fcntl-locktests-tests * ltp-filecaps-tests * ltp-fs-tests * ltp-fs_bind-tests * ltp-fs_perms_simple-tests * ltp-fsx-tests * ltp-hugetlb-tests * ltp-io-tests * ltp-ipc-tests * ltp-math-tests * ltp-nptl-tests * ltp-pty-tests * ltp-sched-tests * ltp-securebits-tests * ltp-syscalls-tests * ltp-timers-tests * spectre-meltdown-checker-test
On Tue, Jan 15, 2019 at 05:34:56PM +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.4.171 release. There are 51 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu Jan 17 15:48:21 UTC 2019. Anything received after that time might be too late.
Build results: total: 171 pass: 171 fail: 0 Qemu test results: total: 291 pass: 291 fail: 0
Guenter
linux-stable-mirror@lists.linaro.org