From: Matthew Wilcox <willy@infradead.org>
The errseq_t infrastructure assumes that errors which occurred before the file descriptor was opened are of no interest to the application. This turns out to be a regression for some applications, notably Postgres.
Before errseq_t, a writeback error would be reported exactly once (as long as the inode remained in memory), so Postgres could open a file, call fsync() and find out whether there had been a writeback error on that file from another process.
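For illustration only, the userspace pattern described above might look like the sketch below. The helper name check_writeback_error() is hypothetical and is not taken from Postgres; it assumes a POSIX system and simply opens a file, calls fsync() and hands back whatever error the kernel reports.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from Postgres or the kernel): open @path,
 * fsync() it, and return 0 on success or a negative errno value.
 */
static int check_writeback_error(const char *path)
{
	int fd, ret = 0;

	assert(path != NULL);

	fd = open(path, O_RDWR);
	if (fd < 0)
		return -errno;

	/* fsync() is where a pending writeback error would surface. */
	if (fsync(fd) < 0)
		ret = -errno;

	/* Report a close() failure only if fsync() itself succeeded. */
	if (close(fd) < 0 && ret == 0)
		ret = -errno;

	return ret;
}
```

Whether such a caller sees an error that was recorded before its open() is exactly what the sampling behaviour decides.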
This patch changes the errseq infrastructure to report errors to all file descriptors which are opened after the error occurred, but before it was reported to any file descriptor. This restores the user-visible behaviour.
[ jlayton: fix up conflicts in comments ]
Cc: stable@vger.kernel.org
Fixes: 5660e13d2fd6 ("fs: new infrastructure for writeback error handling and reporting")
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
(cherry picked from commit b4678df184b314a2bd47d2329feca2c2534aa12b)
---
 lib/errseq.c | 25 +++++++++++--------------
 1 file changed, 11 insertions(+), 14 deletions(-)
This is a backport to the v4.14 stable series. The only merge conflict was due to an earlier patch by Willy to flesh out the comments. There were no code changes necessary.
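To make the change in sampling semantics easier to follow, here is a single-threaded userspace model. It is a sketch under stated assumptions, not the kernel code: every model_* name and MODEL_* constant is invented for illustration, the bit layout (errno in the low 12 bits, then a SEEN flag, then a counter) is assumed, and the cmpxchg()/READ_ONCE() handling is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

typedef uint32_t errseq_t;

/* Assumed layout for this model: low 12 bits errno, bit 12 SEEN, rest counter. */
#define MODEL_ERRNO_MASK 0xfffu
#define MODEL_SEEN       (1u << 12)
#define MODEL_CTR_INC    (1u << 13)

/* Record an error; bump the counter only if the previous value was seen. */
static void model_set(errseq_t *eseq, int err)
{
	errseq_t old = *eseq;
	errseq_t new;

	assert(err > 0 && (errseq_t)err <= MODEL_ERRNO_MASK);
	new = (old & ~(MODEL_ERRNO_MASK | MODEL_SEEN)) | (errseq_t)err;
	if (old & MODEL_SEEN)
		new += MODEL_CTR_INC;
	*eseq = new;
}

/* Sampling before this patch: mark any existing error as seen. */
static errseq_t model_sample_old(errseq_t *eseq)
{
	errseq_t new = *eseq;

	if (new != 0) {
		new |= MODEL_SEEN;
		*eseq = new;
	}
	return new;
}

/* Sampling after this patch: an unseen error is left for the caller to find. */
static errseq_t model_sample_new(errseq_t *eseq)
{
	errseq_t old = *eseq;

	if (!(old & MODEL_SEEN))
		old = 0;
	return old;
}

/* Report an error if @eseq changed since @since, and mark it as seen. */
static int model_check_and_advance(errseq_t *eseq, errseq_t *since)
{
	errseq_t old = *eseq;

	if (old == *since)
		return 0;
	*eseq = old | MODEL_SEEN;
	*since = *eseq;
	return -(int)(*since & MODEL_ERRNO_MASK);
}

/* An error is recorded, then a file is "opened" (sampled) and checked once. */
static int model_first_check(int use_new_sampling)
{
	errseq_t eseq = 0, since;

	model_set(&eseq, EIO);
	since = use_new_sampling ? model_sample_new(&eseq)
				 : model_sample_old(&eseq);
	return model_check_and_advance(&eseq, &since);
}
```

In this model, model_first_check(0) returns 0 because the old sampling marks the error as seen at open time, while model_first_check(1) returns -EIO. A second sample taken after that first report finds the SEEN bit set and therefore reports nothing.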
diff --git a/lib/errseq.c b/lib/errseq.c
index 79cc66897db4..b6ed81ec788d 100644
--- a/lib/errseq.c
+++ b/lib/errseq.c
@@ -111,25 +111,22 @@ EXPORT_SYMBOL(errseq_set);
  * errseq_sample - grab current errseq_t value
  * @eseq: pointer to errseq_t to be sampled
  *
- * This function allows callers to sample an errseq_t value, marking it as
- * "seen" if required.
+ * This function allows callers to initialise their errseq_t variable.
+ * If the error has been "seen", new callers will not see an old error.
+ * If there is an unseen error in @eseq, the caller of this function will
+ * see it the next time it checks for an error.
+ *
+ * Context: Any context.
+ * Return: The current errseq value.
  */
 errseq_t errseq_sample(errseq_t *eseq)
 {
 	errseq_t old = READ_ONCE(*eseq);
-	errseq_t new = old;
 
-	/*
-	 * For the common case of no errors ever having been set, we can skip
-	 * marking the SEEN bit. Once an error has been set, the value will
-	 * never go back to zero.
-	 */
-	if (old != 0) {
-		new |= ERRSEQ_SEEN;
-		if (old != new)
-			cmpxchg(eseq, old, new);
-	}
-	return new;
+	/* If nobody has seen this error yet, then we can be the first. */
+	if (!(old & ERRSEQ_SEEN))
+		old = 0;
+	return old;
 }
 EXPORT_SYMBOL(errseq_sample);
 