UBIFS's recovery code strictly assumes that a deleted inode will never come back, therefore it removes all data which belongs to that inode as soon it faces an inode with link count 0 in the replay list. Before O_TMPFILE this assumption was perfectly fine. With O_TMPFILE it can lead to data loss upon a power-cut.
Consider a journal with entries like: 0: inode X (nlink = 0) /* O_TMPFILE was created */ 1: data for inode X /* Someone writes to the temp file */ 2: inode X (nlink = 0) /* inode was changed, xattr, chmod, … */ 3: inode X (nlink = 1) /* inode was re-linked via linkat() */
Upon replay of entry #2 UBIFS will drop all data that belongs to inode X, this will lead to an empty file after mounting.
As solution for this problem, scan the replay list for a re-link entry before dropping data.
Fixes: 474b93704f32 ("ubifs: Implement O_TMPFILE") Cc: stable@vger.kernel.org Reported-by: Russell Senior russell@personaltelco.net Reported-by: Rafał Miłecki zajec5@gmail.com Signed-off-by: Richard Weinberger richard@nod.at --- Russel, Rafał,
please give this patch another testing. I'll also run it on different test systems before merging.
Thanks, //richard --- fs/ubifs/replay.c | 33 +++++++++++++++++++++++++++++++++ 1 file changed, 33 insertions(+)
diff --git a/fs/ubifs/replay.c b/fs/ubifs/replay.c index 4844538eb926..65a780685b82 100644 --- a/fs/ubifs/replay.c +++ b/fs/ubifs/replay.c @@ -209,6 +209,34 @@ static int trun_remove_range(struct ubifs_info *c, struct replay_entry *r) return ubifs_tnc_remove_range(c, &min_key, &max_key); }
+/** + * inode_relinked - check whether inode in question will be re-linked. + * @c: UBIFS file-system description object + * @rino: replay entry to test + * + * O_TMPFILE files can be re-linked, this means link count goes from 0 to 1. + * This case needs special care, otherwise all references to the inode will + * be removed upon the first replay entry of an inode with link count 0 + * is found. + */ +static bool inode_relinked(struct ubifs_info *c, struct replay_entry *rino) +{ + struct replay_entry *r = rino; + + ubifs_assert(c, rino->deletion); + ubifs_assert(c, key_type(c, &rino->key) == UBIFS_INO_KEY); + + list_for_each_entry_from(r, &c->replay_list, list) { + if (key_inum(c, &r->key) == key_inum(c, &rino->key) && + r->deletion == 0) { + ubifs_assert(c, r->sqnum > rino->sqnum); + return true; + } + } + + return false; +} + /** * apply_replay_entry - apply a replay entry to the TNC. * @c: UBIFS file-system description object @@ -236,6 +264,11 @@ static int apply_replay_entry(struct ubifs_info *c, struct replay_entry *r) { ino_t inum = key_inum(c, &r->key);
+ if (inode_relinked(c, r)) { + err = 0; + break; + } + err = ubifs_tnc_remove_ino(c, inum); break; }
UBIFS's recovery code strictly assumes that a deleted inode will never come back, therefore it removes all data which belongs to that inode as soon it faces an inode with link count 0 in the replay list. Before O_TMPFILE this assumption was perfectly fine. With O_TMPFILE it can lead to data loss upon a power-cut.
Consider a journal with entries like: 0: inode X (nlink = 0) /* O_TMPFILE was created */ 1: data for inode X /* Someone writes to the temp file */ 2: inode X (nlink = 0) /* inode was changed, xattr, chmod, … */ 3: inode X (nlink = 1) /* inode was re-linked via linkat() */
Upon replay of entry #2 UBIFS will drop all data that belongs to inode X, this will lead to an empty file after mounting.
As solution for this problem, scan the replay list for a re-link entry before dropping data.
Fixes: 474b93704f32 ("ubifs: Implement O_TMPFILE") Cc: stable@vger.kernel.org Reported-by: Russell Senior russell@personaltelco.net Reported-by: Rafał Miłecki zajec5@gmail.com Signed-off-by: Richard Weinberger richard@nod.at Tested-by: Russell Senior russell@personaltelco.net --- Russel, Rafał,
please give this patch another testing. I'll also run it on different test systems before merging.
Thanks, //richard --- fs/ubifs/replay.c | 33 +++++++++++++++++++++++++++++++++ 1 file changed, 33 insertions(+)
diff --git a/fs/ubifs/replay.c b/fs/ubifs/replay.c index 4844538eb926..65a780685b82 100644 --- a/fs/ubifs/replay.c +++ b/fs/ubifs/replay.c @@ -209,6 +209,34 @@ static int trun_remove_range(struct ubifs_info *c, struct replay_entry *r) return ubifs_tnc_remove_range(c, &min_key, &max_key); }
+/** + * inode_relinked - check whether inode in question will be re-linked. + * @c: UBIFS file-system description object + * @rino: replay entry to test + * + * O_TMPFILE files can be re-linked, this means link count goes from 0 to 1. + * This case needs special care, otherwise all references to the inode will + * be removed upon the first replay entry of an inode with link count 0 + * is found. + */ +static bool inode_relinked(struct ubifs_info *c, struct replay_entry *rino) +{ + struct replay_entry *r = rino; + + ubifs_assert(c, rino->deletion); + ubifs_assert(c, key_type(c, &rino->key) == UBIFS_INO_KEY); + + list_for_each_entry_from(r, &c->replay_list, list) { + if (key_inum(c, &r->key) == key_inum(c, &rino->key) && + r->deletion == 0) { + ubifs_assert(c, r->sqnum > rino->sqnum); + return true; + } + } + + return false; +} + /** * apply_replay_entry - apply a replay entry to the TNC. * @c: UBIFS file-system description object @@ -236,6 +264,11 @@ static int apply_replay_entry(struct ubifs_info *c, struct replay_entry *r) { ino_t inum = key_inum(c, &r->key);
+ if (inode_relinked(c, r)) { + err = 0; + break; + } + err = ubifs_tnc_remove_ino(c, inum); break; }
On Sun, 28 Oct 2018 at 22:44, Richard Weinberger richard@nod.at wrote:
UBIFS's recovery code strictly assumes that a deleted inode will never come back, therefore it removes all data which belongs to that inode as soon it faces an inode with link count 0 in the replay list. Before O_TMPFILE this assumption was perfectly fine. With O_TMPFILE it can lead to data loss upon a power-cut.
Consider a journal with entries like: 0: inode X (nlink = 0) /* O_TMPFILE was created */ 1: data for inode X /* Someone writes to the temp file */ 2: inode X (nlink = 0) /* inode was changed, xattr, chmod, … */ 3: inode X (nlink = 1) /* inode was re-linked via linkat() */
Upon replay of entry #2 UBIFS will drop all data that belongs to inode X, this will lead to an empty file after mounting.
As solution for this problem, scan the replay list for a re-link entry before dropping data.
Fixes: 474b93704f32 ("ubifs: Implement O_TMPFILE") Cc: stable@vger.kernel.org Reported-by: Russell Senior russell@personaltelco.net Reported-by: Rafał Miłecki zajec5@gmail.com Signed-off-by: Richard Weinberger richard@nod.at
Thank you Richard!!!
Tested-by: Rafał Miłecki rafal@milecki.pl
Am Donnerstag, 1. November 2018, 09:55:53 CET schrieb Rafał Miłecki:
On Sun, 28 Oct 2018 at 22:44, Richard Weinberger richard@nod.at wrote:
UBIFS's recovery code strictly assumes that a deleted inode will never come back, therefore it removes all data which belongs to that inode as soon it faces an inode with link count 0 in the replay list. Before O_TMPFILE this assumption was perfectly fine. With O_TMPFILE it can lead to data loss upon a power-cut.
Consider a journal with entries like: 0: inode X (nlink = 0) /* O_TMPFILE was created */ 1: data for inode X /* Someone writes to the temp file */ 2: inode X (nlink = 0) /* inode was changed, xattr, chmod, … */ 3: inode X (nlink = 1) /* inode was re-linked via linkat() */
Upon replay of entry #2 UBIFS will drop all data that belongs to inode X, this will lead to an empty file after mounting.
As solution for this problem, scan the replay list for a re-link entry before dropping data.
Fixes: 474b93704f32 ("ubifs: Implement O_TMPFILE") Cc: stable@vger.kernel.org Reported-by: Russell Senior russell@personaltelco.net Reported-by: Rafał Miłecki zajec5@gmail.com Signed-off-by: Richard Weinberger richard@nod.at
Thank you Richard!!!
Tested-by: Rafał Miłecki rafal@milecki.pl
Thanks for testing and providing the reproducer! I'll send a v2 of the patch soon where I've optimized the list scanning more. In fact, the correct and fasted approach is walking the replay list backwards to find the final link state of an inode.
Thanks, //richard
linux-stable-mirror@lists.linaro.org