Hi Jan,
On Wed, Feb 05, 2020 at 06:30:25PM +0100, Jan Kara wrote:
DIR_INDEX has been introduced as a compat ext4 feature. That means that even kernels / tools that don't understand the feature may modify the filesystem. This works because for kernels not understanding indexed dir format, internal htree nodes appear just as empty directory entries. Index dir aware kernels then check the htree structure is still consistent before using the data. This all worked reasonably well until metadata checksums were introduced. The problem is that these effectively made DIR_INDEX only ro-compatible because internal htree nodes store checksums in a different place than normal directory blocks. Thus any modification ignorant to DIR_INDEX (or just clearing EXT4_INDEX_FL from the inode) will effectively cause checksum mismatch and trigger kernel errors. So we have to be more careful when dealing with indexed directories on filesystems with checksumming enabled.
- We just disallow loading and directory inodes with EXT4_INDEX_FL when
DIR_INDEX is not enabled. This is harsh but it should be very rare (it means someone disabled DIR_INDEX on existing filesystem and didn't run e2fsck), e2fsck can fix the problem, and we don't want to answer the difficult question: "Should we rather corrupt the directory more or should we ignore that DIR_INDEX feature is not set?"
- When we find out htree structure is corrupted (but the filesystem and
the directory should in support htrees), we continue just ignoring htree information for reading but we refuse to add new entries to the directory to avoid corrupting it more.
CC: stable@vger.kernel.org Fixes: dbe89444042a ("ext4: Calculate and verify checksums for htree nodes") Signed-off-by: Jan Kara jack@suse.cz
fs/ext4/dir.c | 14 ++++++++------ fs/ext4/ext4.h | 5 ++++- fs/ext4/inode.c | 13 +++++++++++++ fs/ext4/namei.c | 7 +++++++ 4 files changed, 32 insertions(+), 7 deletions(-)
diff --git a/fs/ext4/dir.c b/fs/ext4/dir.c index 9f00fc0bf21d..cb9ea593b544 100644 --- a/fs/ext4/dir.c +++ b/fs/ext4/dir.c @@ -129,12 +129,14 @@ static int ext4_readdir(struct file *file, struct dir_context *ctx) if (err != ERR_BAD_DX_DIR) { return err; }
/*
* We don't set the inode dirty flag since it's not
* critical that it get flushed back to the disk.
*/
ext4_clear_inode_flag(file_inode(file),
EXT4_INODE_INDEX);
/* Can we just clear INDEX flag to ignore htree information? */
if (!ext4_has_metadata_csum(sb)) {
/*
* We don't set the inode dirty flag since it's not
* critical that it gets flushed back to the disk.
*/
ext4_clear_inode_flag(inode, EXT4_INODE_INDEX);
}}
if (ext4_has_inline_data(inode)) { diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h index f8578caba40d..1fd6c1e2ce2a 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -2482,8 +2482,11 @@ void ext4_insert_dentry(struct inode *inode, struct ext4_filename *fname); static inline void ext4_update_dx_flag(struct inode *inode) {
- if (!ext4_has_feature_dir_index(inode->i_sb))
- if (!ext4_has_feature_dir_index(inode->i_sb)) {
/* ext4_iget() should have caught this... */
ext4_clear_inode_flag(inode, EXT4_INODE_INDEX);WARN_ON_ONCE(ext4_has_feature_metadata_csum(inode->i_sb));
- }
}
This new WARN_ON_ONCE() gets triggered by the following commands:
mkfs.ext4 -O ^dir_index /dev/vdc mount /dev/vdc /vdc mkdir /vdc/dir
WARNING: CPU: 1 PID: 305 at fs/ext4/ext4.h:2700 add_dirent_to_buf+0x1d0/0x1e0 fs/ext4/namei.c:2039 CPU: 1 PID: 305 Comm: mkdir Not tainted 5.10.0-rc4 #14 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ArchLinux 1.14.0-1 04/01/2014 RIP: 0010:ext4_update_dx_flag fs/ext4/ext4.h:2700 [inline] RIP: 0010:add_dirent_to_buf+0x1d0/0x1e0 fs/ext4/namei.c:2038 [...] Call Trace: ext4_add_entry+0x179/0x4d0 fs/ext4/namei.c:2248 ext4_mkdir+0x1c0/0x320 fs/ext4/namei.c:2814 vfs_mkdir+0xcc/0x130 fs/namei.c:3650 do_mkdirat+0x81/0x120 fs/namei.c:3673 __do_sys_mkdir fs/namei.c:3689 [inline]
What is intended here? metadata_csum && ^dir_index is a weird combination, but it's technically valid, right?
- Eric