Commit 9f356e5a4f12 ("jbd2: Account descriptor blocks into t_outstanding_credits") started to account descriptor blocks into transactions outstanding credits. However it didn't appropriately decrease the maximum amount of credits available to userspace. Thus if the filesystem requests a transaction smaller than j_max_transaction_buffers but large enough that when descriptor blocks are added the size exceeds j_max_transaction_buffers, we confuse add_transaction_credits() into thinking previous handles have grown the transaction too much and enter infinite journal commit loop in start_this_handle() -> add_transaction_credits() trying to create transaction with enough credits available.
Fix the problem by properly accounting for transaction space reserved for descriptor blocks when verifying requested transaction handle size.
CC: stable@vger.kernel.org Fixes: 9f356e5a4f12 ("jbd2: Account descriptor blocks into t_outstanding_credits") Reported-by: Alexander Coffin alex.coffin@maticrobots.com Link: https://lore.kernel.org/all/CA+hUFcuGs04JHZ_WzA1zGN57+ehL2qmHOt5a7RMpo+rv6Vy... Signed-off-by: Jan Kara jack@suse.cz --- fs/jbd2/transaction.c | 21 ++++++++++++++------- 1 file changed, 14 insertions(+), 7 deletions(-)
diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c index a095f1a3114b..66513c18ca29 100644 --- a/fs/jbd2/transaction.c +++ b/fs/jbd2/transaction.c @@ -191,6 +191,13 @@ static void sub_reserved_credits(journal_t *journal, int blocks) wake_up(&journal->j_wait_reserved); }
+/* Maximum number of blocks for user transaction payload */ +static int jbd2_max_user_trans_buffers(journal_t *journal) +{ + return journal->j_max_transaction_buffers - + journal->j_transaction_overhead_buffers; +} + /* * Wait until we can add credits for handle to the running transaction. Called * with j_state_lock held for reading. Returns 0 if handle joined the running @@ -240,12 +247,12 @@ __must_hold(&journal->j_state_lock) * big to fit this handle? Wait until reserved credits are freed. */ if (atomic_read(&journal->j_reserved_credits) + total > - journal->j_max_transaction_buffers) { + jbd2_max_user_trans_buffers(journal)) { read_unlock(&journal->j_state_lock); jbd2_might_wait_for_commit(journal); wait_event(journal->j_wait_reserved, atomic_read(&journal->j_reserved_credits) + total <= - journal->j_max_transaction_buffers); + jbd2_max_user_trans_buffers(journal)); __acquire(&journal->j_state_lock); /* fake out sparse */ return 1; } @@ -285,14 +292,14 @@ __must_hold(&journal->j_state_lock)
needed = atomic_add_return(rsv_blocks, &journal->j_reserved_credits); /* We allow at most half of a transaction to be reserved */ - if (needed > journal->j_max_transaction_buffers / 2) { + if (needed > jbd2_max_user_trans_buffers(journal) / 2) { sub_reserved_credits(journal, rsv_blocks); atomic_sub(total, &t->t_outstanding_credits); read_unlock(&journal->j_state_lock); jbd2_might_wait_for_commit(journal); wait_event(journal->j_wait_reserved, atomic_read(&journal->j_reserved_credits) + rsv_blocks - <= journal->j_max_transaction_buffers / 2); + <= jbd2_max_user_trans_buffers(journal) / 2); __acquire(&journal->j_state_lock); /* fake out sparse */ return 1; } @@ -322,12 +329,12 @@ static int start_this_handle(journal_t *journal, handle_t *handle, * size and limit the number of total credits to not exceed maximum * transaction size per operation. */ - if ((rsv_blocks > journal->j_max_transaction_buffers / 2) || - (rsv_blocks + blocks > journal->j_max_transaction_buffers)) { + if (rsv_blocks > jbd2_max_user_trans_buffers(journal) / 2 || + rsv_blocks + blocks > jbd2_max_user_trans_buffers(journal)) { printk(KERN_ERR "JBD2: %s wants too many credits " "credits:%d rsv_credits:%d max:%d\n", current->comm, blocks, rsv_blocks, - journal->j_max_transaction_buffers); + jbd2_max_user_trans_buffers(journal)); WARN_ON(1); return -ENOSPC; }
On 2024/6/25 1:01, Jan Kara wrote:
Commit 9f356e5a4f12 ("jbd2: Account descriptor blocks into t_outstanding_credits") started to account descriptor blocks into transactions outstanding credits. However it didn't appropriately decrease the maximum amount of credits available to userspace. Thus if the filesystem requests a transaction smaller than j_max_transaction_buffers but large enough that when descriptor blocks are added the size exceeds j_max_transaction_buffers, we confuse add_transaction_credits() into thinking previous handles have grown the transaction too much and enter infinite journal commit loop in start_this_handle() -> add_transaction_credits() trying to create transaction with enough credits available.
Hello Jan!
I understand that the incorrect max transaction limit in start_this_handle() could lead to infinite loop in start_this_handle()-> add_transaction_credits() with large enough userspace credits (from j_max_transaction_buffers - overheads to j_max_transaction_buffers), but I don't get how could it lead to ran out of space in the journal commit traction? IIUC, below codes in add_transaction_credits() could make sure that we have enough space when committing traction:
static int add_transaction_credits() { ... if (jbd2_log_space_left(journal) < journal->j_max_transaction_buffers) { ... return 1; ... } ... }
I can't open and download the image Alexander gave, so I can't get to the bottom of this issue, please let me know what happened with jbd2_journal_next_log_block().
Thanks, Yi.
Fix the problem by properly accounting for transaction space reserved for descriptor blocks when verifying requested transaction handle size.
CC: stable@vger.kernel.org Fixes: 9f356e5a4f12 ("jbd2: Account descriptor blocks into t_outstanding_credits") Reported-by: Alexander Coffin alex.coffin@maticrobots.com Link: https://lore.kernel.org/all/CA+hUFcuGs04JHZ_WzA1zGN57+ehL2qmHOt5a7RMpo+rv6Vy... Signed-off-by: Jan Kara jack@suse.cz
fs/jbd2/transaction.c | 21 ++++++++++++++------- 1 file changed, 14 insertions(+), 7 deletions(-)
diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c index a095f1a3114b..66513c18ca29 100644 --- a/fs/jbd2/transaction.c +++ b/fs/jbd2/transaction.c @@ -191,6 +191,13 @@ static void sub_reserved_credits(journal_t *journal, int blocks) wake_up(&journal->j_wait_reserved); } +/* Maximum number of blocks for user transaction payload */ +static int jbd2_max_user_trans_buffers(journal_t *journal) +{
- return journal->j_max_transaction_buffers -
journal->j_transaction_overhead_buffers;
+}
/*
- Wait until we can add credits for handle to the running transaction. Called
- with j_state_lock held for reading. Returns 0 if handle joined the running
@@ -240,12 +247,12 @@ __must_hold(&journal->j_state_lock) * big to fit this handle? Wait until reserved credits are freed. */ if (atomic_read(&journal->j_reserved_credits) + total >
journal->j_max_transaction_buffers) {
jbd2_max_user_trans_buffers(journal)) { read_unlock(&journal->j_state_lock); jbd2_might_wait_for_commit(journal); wait_event(journal->j_wait_reserved, atomic_read(&journal->j_reserved_credits) + total <=
journal->j_max_transaction_buffers);
}jbd2_max_user_trans_buffers(journal)); __acquire(&journal->j_state_lock); /* fake out sparse */ return 1;
@@ -285,14 +292,14 @@ __must_hold(&journal->j_state_lock) needed = atomic_add_return(rsv_blocks, &journal->j_reserved_credits); /* We allow at most half of a transaction to be reserved */
- if (needed > journal->j_max_transaction_buffers / 2) {
- if (needed > jbd2_max_user_trans_buffers(journal) / 2) { sub_reserved_credits(journal, rsv_blocks); atomic_sub(total, &t->t_outstanding_credits); read_unlock(&journal->j_state_lock); jbd2_might_wait_for_commit(journal); wait_event(journal->j_wait_reserved, atomic_read(&journal->j_reserved_credits) + rsv_blocks
<= journal->j_max_transaction_buffers / 2);
__acquire(&journal->j_state_lock); /* fake out sparse */ return 1; }<= jbd2_max_user_trans_buffers(journal) / 2);
@@ -322,12 +329,12 @@ static int start_this_handle(journal_t *journal, handle_t *handle, * size and limit the number of total credits to not exceed maximum * transaction size per operation. */
- if ((rsv_blocks > journal->j_max_transaction_buffers / 2) ||
(rsv_blocks + blocks > journal->j_max_transaction_buffers)) {
- if (rsv_blocks > jbd2_max_user_trans_buffers(journal) / 2 ||
printk(KERN_ERR "JBD2: %s wants too many credits " "credits:%d rsv_credits:%d max:%d\n", current->comm, blocks, rsv_blocks,rsv_blocks + blocks > jbd2_max_user_trans_buffers(journal)) {
journal->j_max_transaction_buffers);
WARN_ON(1); return -ENOSPC; }jbd2_max_user_trans_buffers(journal));
On Wed 26-06-24 15:38:42, Zhang Yi wrote:
On 2024/6/25 1:01, Jan Kara wrote:
Commit 9f356e5a4f12 ("jbd2: Account descriptor blocks into t_outstanding_credits") started to account descriptor blocks into transactions outstanding credits. However it didn't appropriately decrease the maximum amount of credits available to userspace. Thus if the filesystem requests a transaction smaller than j_max_transaction_buffers but large enough that when descriptor blocks are added the size exceeds j_max_transaction_buffers, we confuse add_transaction_credits() into thinking previous handles have grown the transaction too much and enter infinite journal commit loop in start_this_handle() -> add_transaction_credits() trying to create transaction with enough credits available.
I understand that the incorrect max transaction limit in start_this_handle() could lead to infinite loop in start_this_handle()-> add_transaction_credits() with large enough userspace credits (from j_max_transaction_buffers - overheads to j_max_transaction_buffers), but I don't get how could it lead to ran out of space in the journal commit traction? IIUC, below codes in add_transaction_credits() could make sure that we have enough space when committing traction:
static int add_transaction_credits() { ... if (jbd2_log_space_left(journal) < journal->j_max_transaction_buffers) { ... return 1; ... } ... }
I can't open and download the image Alexander gave, so I can't get to the bottom of this issue, please let me know what happened with jbd2_journal_next_log_block().
Sure. So what was exactly happening is a loop like this:
start_this_handle() blocks = 252 (handle->h_total_credits) - starts a new transaction - t_outstanding_credits set to 6 to account for the commit block and descriptor blocks add_transaction_credits(journal, 252, 0) needed = atomic_add_return(252, &t->t_outstanding_credits); if (needed > journal->j_max_transaction_buffers) { /* Yes, this is exceeded due to descriptor blocks being in * t_outstanding_credits */ ... wait_transaction_locked(journal); - this commits an empty transaction - contains only the commit block return 1 goto repeat
So we commit single block transactions in a loop until we exhaust all the journal space. The condition in add_transaction_credits() whether there's enough space in the journal is never reached so we don't ever push the journal tail to make space in the journal.
Honza
On 2024/6/26 19:22, Jan Kara wrote:
On Wed 26-06-24 15:38:42, Zhang Yi wrote:
On 2024/6/25 1:01, Jan Kara wrote:
Commit 9f356e5a4f12 ("jbd2: Account descriptor blocks into t_outstanding_credits") started to account descriptor blocks into transactions outstanding credits. However it didn't appropriately decrease the maximum amount of credits available to userspace. Thus if the filesystem requests a transaction smaller than j_max_transaction_buffers but large enough that when descriptor blocks are added the size exceeds j_max_transaction_buffers, we confuse add_transaction_credits() into thinking previous handles have grown the transaction too much and enter infinite journal commit loop in start_this_handle() -> add_transaction_credits() trying to create transaction with enough credits available.
I understand that the incorrect max transaction limit in start_this_handle() could lead to infinite loop in start_this_handle()-> add_transaction_credits() with large enough userspace credits (from j_max_transaction_buffers - overheads to j_max_transaction_buffers), but I don't get how could it lead to ran out of space in the journal commit traction? IIUC, below codes in add_transaction_credits() could make sure that we have enough space when committing traction:
static int add_transaction_credits() { ... if (jbd2_log_space_left(journal) < journal->j_max_transaction_buffers) { ... return 1; ... } ... }
I can't open and download the image Alexander gave, so I can't get to the bottom of this issue, please let me know what happened with jbd2_journal_next_log_block().
Sure. So what was exactly happening is a loop like this:
start_this_handle() blocks = 252 (handle->h_total_credits)
- starts a new transaction
- t_outstanding_credits set to 6 to account for the commit block and descriptor blocks
add_transaction_credits(journal, 252, 0) needed = atomic_add_return(252, &t->t_outstanding_credits); if (needed > journal->j_max_transaction_buffers) { /* Yes, this is exceeded due to descriptor blocks being in * t_outstanding_credits */ ... wait_transaction_locked(journal); - this commits an empty transaction - contains only the commit block return 1 goto repeat
So we commit single block transactions in a loop until we exhaust all the journal space. The condition in add_transaction_credits() whether there's enough space in the journal is never reached so we don't ever push the journal tail to make space in the journal.
mm-hm, ha, yeah, this will lead to submit an empty transaction in each loop, but I still have one question, although the journal tail can't be updated in add_transaction_credits(), I think the journal tail should be updated in jbd2_journal_commit_transaction()->jbd2_update_log_tail() since we don't add empty transactions to journal->j_checkpoint_transactions list, the loop in jbd2_journal_commit_transaction() should like this:
... jbd2_journal_commit_transaction() update_tail = jbd2_journal_get_log_tail() //journal->j_checkpoint_transactions should be NULL, tid is the //committing transaction's tid, which should be large than the //tail tid since the second loop, so this should be true after //the second loop if (freed < journal->j_max_transaction_buffers) update_tail = 0; //update_tail should be true after j_max_transaction_buffers loop
jbd2_update_log_tail() //j_free should be increased in each //j_max_transaction_buffers loop if (tid_gt(tid, journal->j_tail_sequence)) //it's true __jbd2_update_log_tail() journal->j_free += freed; //update log tail and increase j_free //j_max_transaction_buffers blocks ...
As I understand it, the journal space can't be exhausted, I don't know how it happened, am I missing something?
Thanks, Yi.
On Wed 26-06-24 21:24:13, Zhang Yi wrote:
On 2024/6/26 19:22, Jan Kara wrote:
On Wed 26-06-24 15:38:42, Zhang Yi wrote:
On 2024/6/25 1:01, Jan Kara wrote:
Commit 9f356e5a4f12 ("jbd2: Account descriptor blocks into t_outstanding_credits") started to account descriptor blocks into transactions outstanding credits. However it didn't appropriately decrease the maximum amount of credits available to userspace. Thus if the filesystem requests a transaction smaller than j_max_transaction_buffers but large enough that when descriptor blocks are added the size exceeds j_max_transaction_buffers, we confuse add_transaction_credits() into thinking previous handles have grown the transaction too much and enter infinite journal commit loop in start_this_handle() -> add_transaction_credits() trying to create transaction with enough credits available.
I understand that the incorrect max transaction limit in start_this_handle() could lead to infinite loop in start_this_handle()-> add_transaction_credits() with large enough userspace credits (from j_max_transaction_buffers - overheads to j_max_transaction_buffers), but I don't get how could it lead to ran out of space in the journal commit traction? IIUC, below codes in add_transaction_credits() could make sure that we have enough space when committing traction:
static int add_transaction_credits() { ... if (jbd2_log_space_left(journal) < journal->j_max_transaction_buffers) { ... return 1; ... } ... }
I can't open and download the image Alexander gave, so I can't get to the bottom of this issue, please let me know what happened with jbd2_journal_next_log_block().
Sure. So what was exactly happening is a loop like this:
start_this_handle() blocks = 252 (handle->h_total_credits)
- starts a new transaction
- t_outstanding_credits set to 6 to account for the commit block and descriptor blocks
add_transaction_credits(journal, 252, 0) needed = atomic_add_return(252, &t->t_outstanding_credits); if (needed > journal->j_max_transaction_buffers) { /* Yes, this is exceeded due to descriptor blocks being in * t_outstanding_credits */ ... wait_transaction_locked(journal); - this commits an empty transaction - contains only the commit block return 1 goto repeat
So we commit single block transactions in a loop until we exhaust all the journal space. The condition in add_transaction_credits() whether there's enough space in the journal is never reached so we don't ever push the journal tail to make space in the journal.
mm-hm, ha, yeah, this will lead to submit an empty transaction in each loop, but I still have one question, although the journal tail can't be updated in add_transaction_credits(), I think the journal tail should be updated in jbd2_journal_commit_transaction()->jbd2_update_log_tail() since we don't add empty transactions to journal->j_checkpoint_transactions list, the loop in jbd2_journal_commit_transaction() should like this:
... jbd2_journal_commit_transaction() update_tail = jbd2_journal_get_log_tail() //journal->j_checkpoint_transactions should be NULL, tid is the //committing transaction's tid, which should be large than the //tail tid since the second loop, so this should be true after //the second loop if (freed < journal->j_max_transaction_buffers) update_tail = 0; //update_tail should be true after j_max_transaction_buffers loop
jbd2_update_log_tail() //j_free should be increased in each //j_max_transaction_buffers loop if (tid_gt(tid, journal->j_tail_sequence)) //it's true __jbd2_update_log_tail() journal->j_free += freed; //update log tail and increase j_free //j_max_transaction_buffers blocks ...
As I understand it, the journal space can't be exhausted, I don't know how it happened, am I missing something?
Well, at the beginning of the journal there are a few non-empty transactions whose buffers didn't get checkpointed before we entered the commit loop. So we cannot just push the journal tail without doing a checkpoint and we never call into the checkpointing code in the commit loop...
Honza
On 2024/6/26 22:55, Jan Kara wrote:
On Wed 26-06-24 21:24:13, Zhang Yi wrote:
On 2024/6/26 19:22, Jan Kara wrote:
On Wed 26-06-24 15:38:42, Zhang Yi wrote:
On 2024/6/25 1:01, Jan Kara wrote:
Commit 9f356e5a4f12 ("jbd2: Account descriptor blocks into t_outstanding_credits") started to account descriptor blocks into transactions outstanding credits. However it didn't appropriately decrease the maximum amount of credits available to userspace. Thus if the filesystem requests a transaction smaller than j_max_transaction_buffers but large enough that when descriptor blocks are added the size exceeds j_max_transaction_buffers, we confuse add_transaction_credits() into thinking previous handles have grown the transaction too much and enter infinite journal commit loop in start_this_handle() -> add_transaction_credits() trying to create transaction with enough credits available.
I understand that the incorrect max transaction limit in start_this_handle() could lead to infinite loop in start_this_handle()-> add_transaction_credits() with large enough userspace credits (from j_max_transaction_buffers - overheads to j_max_transaction_buffers), but I don't get how could it lead to ran out of space in the journal commit traction? IIUC, below codes in add_transaction_credits() could make sure that we have enough space when committing traction:
static int add_transaction_credits() { ... if (jbd2_log_space_left(journal) < journal->j_max_transaction_buffers) { ... return 1; ... } ... }
I can't open and download the image Alexander gave, so I can't get to the bottom of this issue, please let me know what happened with jbd2_journal_next_log_block().
Sure. So what was exactly happening is a loop like this:
start_this_handle() blocks = 252 (handle->h_total_credits)
- starts a new transaction
- t_outstanding_credits set to 6 to account for the commit block and descriptor blocks
add_transaction_credits(journal, 252, 0) needed = atomic_add_return(252, &t->t_outstanding_credits); if (needed > journal->j_max_transaction_buffers) { /* Yes, this is exceeded due to descriptor blocks being in * t_outstanding_credits */ ... wait_transaction_locked(journal); - this commits an empty transaction - contains only the commit block return 1 goto repeat
So we commit single block transactions in a loop until we exhaust all the journal space. The condition in add_transaction_credits() whether there's enough space in the journal is never reached so we don't ever push the journal tail to make space in the journal.
mm-hm, ha, yeah, this will lead to submit an empty transaction in each loop, but I still have one question, although the journal tail can't be updated in add_transaction_credits(), I think the journal tail should be updated in jbd2_journal_commit_transaction()->jbd2_update_log_tail() since we don't add empty transactions to journal->j_checkpoint_transactions list, the loop in jbd2_journal_commit_transaction() should like this:
... jbd2_journal_commit_transaction() update_tail = jbd2_journal_get_log_tail() //journal->j_checkpoint_transactions should be NULL, tid is the //committing transaction's tid, which should be large than the //tail tid since the second loop, so this should be true after //the second loop if (freed < journal->j_max_transaction_buffers) update_tail = 0; //update_tail should be true after j_max_transaction_buffers loop
jbd2_update_log_tail() //j_free should be increased in each //j_max_transaction_buffers loop if (tid_gt(tid, journal->j_tail_sequence)) //it's true __jbd2_update_log_tail() journal->j_free += freed; //update log tail and increase j_free //j_max_transaction_buffers blocks ...
As I understand it, the journal space can't be exhausted, I don't know how it happened, am I missing something?
Well, at the beginning of the journal there are a few non-empty transactions whose buffers didn't get checkpointed before we entered the commit loop. So we cannot just push the journal tail without doing a checkpoint and we never call into the checkpointing code in the commit loop...
Oh, indeed, I see, thanks for explaining and this patch looks good to me.
Reviewed-by: Zhang Yi yi.zhang@huawei.com
Thanks, Yi.
linux-stable-mirror@lists.linaro.org