From: Jim Wylder jwylder@google.com
[ Upstream commit 3981514180c987a79ea98f0ae06a7cbf58a9ac0f ]
Currently, when regmap_raw_write() splits the data, it uses the max_raw_write value defined for the bus. For any bus that includes the target register address in the max_raw_write value, the chunked transmission will always exceed the maximum transmission length. To avoid this problem, subtract the length of the register and the padding from the maximum transmission.
Signed-off-by: Jim Wylder <jwylder@google.com Link: https://lore.kernel.org/r/20230517152444.3690870-2-jwylder@google.com Signed-off-by: Mark Brown <broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/base/regmap/regmap.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/base/regmap/regmap.c b/drivers/base/regmap/regmap.c index d2a54eb0efd9b..f9f38ab3f030e 100644 --- a/drivers/base/regmap/regmap.c +++ b/drivers/base/regmap/regmap.c @@ -2064,6 +2064,8 @@ int _regmap_raw_write(struct regmap *map, unsigned int reg, size_t val_count = val_len / val_bytes; size_t chunk_count, chunk_bytes; size_t chunk_regs = val_count; + size_t max_data = map->max_raw_write - map->format.reg_bytes - + map->format.pad_bytes; int ret, i;
if (!val_count) @@ -2071,8 +2073,8 @@ int _regmap_raw_write(struct regmap *map, unsigned int reg,
if (map->use_single_write) chunk_regs = 1; - else if (map->max_raw_write && val_len > map->max_raw_write) - chunk_regs = map->max_raw_write / val_bytes; + else if (map->max_raw_write && val_len > max_data) + chunk_regs = max_data / val_bytes;
chunk_count = val_count / chunk_regs; chunk_bytes = chunk_regs * val_bytes;
From: Joao Martins joao.m.martins@oracle.com
[ Upstream commit af47b0a24058e56e983881993752f88288ca6511 ]
GALog exists to propagate interrupts into all vCPUs in the system when interrupts are marked as non running (e.g. when vCPUs aren't running). A GALog overflow happens when there's in no space in the log to record the GATag of the interrupt. So when the GALOverflow condition happens, the GALog queue is processed and the GALog is restarted, as the IOMMU manual indicates in section "2.7.4 Guest Virtual APIC Log Restart Procedure":
| * Wait until MMIO Offset 2020h[GALogRun]=0b so that all request | entries are completed as circumstances allow. GALogRun must be 0b to | modify the guest virtual APIC log registers safely. | * Write MMIO Offset 0018h[GALogEn]=0b. | * As necessary, change the following values (e.g., to relocate or | resize the guest virtual APIC event log): | - the Guest Virtual APIC Log Base Address Register | [MMIO Offset 00E0h], | - the Guest Virtual APIC Log Head Pointer Register | [MMIO Offset 2040h][GALogHead], and | - the Guest Virtual APIC Log Tail Pointer Register | [MMIO Offset 2048h][GALogTail]. | * Write MMIO Offset 2020h[GALOverflow] = 1b to clear the bit (W1C). | * Write MMIO Offset 0018h[GALogEn] = 1b, and either set | MMIO Offset 0018h[GAIntEn] to enable the GA log interrupt or clear | the bit to disable it.
Failing to handle the GALog overflow means that none of the VFs (in any guest) will work with IOMMU AVIC forcing the user to power cycle the host. When handling the event it resumes the GALog without resizing much like how it is done in the event handler overflow. The [MMIO Offset 2020h][GALOverflow] bit might be set in status register without the [MMIO Offset 2020h][GAInt] bit, so when deciding to poll for GA events (to clear space in the galog), also check the overflow bit.
[suravee: Check for GAOverflow without GAInt, toggle CONTROL_GAINT_EN]
Co-developed-by: Suravee Suthikulpanit suravee.suthikulpanit@amd.com Signed-off-by: Suravee Suthikulpanit suravee.suthikulpanit@amd.com Signed-off-by: Joao Martins joao.m.martins@oracle.com Reviewed-by: Vasant Hegde vasant.hegde@amd.com Link: https://lore.kernel.org/r/20230419201154.83880-3-joao.m.martins@oracle.com Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iommu/amd/amd_iommu.h | 1 + drivers/iommu/amd/init.c | 24 ++++++++++++++++++++++++ drivers/iommu/amd/iommu.c | 9 ++++++++- 3 files changed, 33 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h index c160a332ce339..24c7e6c6c0de9 100644 --- a/drivers/iommu/amd/amd_iommu.h +++ b/drivers/iommu/amd/amd_iommu.h @@ -15,6 +15,7 @@ extern irqreturn_t amd_iommu_int_thread(int irq, void *data); extern irqreturn_t amd_iommu_int_handler(int irq, void *data); extern void amd_iommu_apply_erratum_63(struct amd_iommu *iommu, u16 devid); extern void amd_iommu_restart_event_logging(struct amd_iommu *iommu); +extern void amd_iommu_restart_ga_log(struct amd_iommu *iommu); extern int amd_iommu_init_devices(void); extern void amd_iommu_uninit_devices(void); extern void amd_iommu_init_notifier(void); diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index 19a46b9f73574..fd487c33b28aa 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -751,6 +751,30 @@ void amd_iommu_restart_event_logging(struct amd_iommu *iommu) iommu_feature_enable(iommu, CONTROL_EVT_LOG_EN); }
+/* + * This function restarts event logging in case the IOMMU experienced + * an GA log overflow. + */ +void amd_iommu_restart_ga_log(struct amd_iommu *iommu) +{ + u32 status; + + status = readl(iommu->mmio_base + MMIO_STATUS_OFFSET); + if (status & MMIO_STATUS_GALOG_RUN_MASK) + return; + + pr_info_ratelimited("IOMMU GA Log restarting\n"); + + iommu_feature_disable(iommu, CONTROL_GALOG_EN); + iommu_feature_disable(iommu, CONTROL_GAINT_EN); + + writel(MMIO_STATUS_GALOG_OVERFLOW_MASK, + iommu->mmio_base + MMIO_STATUS_OFFSET); + + iommu_feature_enable(iommu, CONTROL_GAINT_EN); + iommu_feature_enable(iommu, CONTROL_GALOG_EN); +} + /* * This function resets the command buffer if the IOMMU stopped fetching * commands from it. diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index 167da5b1a5e31..3f2355c377630 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -845,6 +845,7 @@ amd_iommu_set_pci_msi_domain(struct device *dev, struct amd_iommu *iommu) { } (MMIO_STATUS_EVT_OVERFLOW_INT_MASK | \ MMIO_STATUS_EVT_INT_MASK | \ MMIO_STATUS_PPR_INT_MASK | \ + MMIO_STATUS_GALOG_OVERFLOW_MASK | \ MMIO_STATUS_GALOG_INT_MASK)
irqreturn_t amd_iommu_int_thread(int irq, void *data) @@ -868,10 +869,16 @@ irqreturn_t amd_iommu_int_thread(int irq, void *data) }
#ifdef CONFIG_IRQ_REMAP - if (status & MMIO_STATUS_GALOG_INT_MASK) { + if (status & (MMIO_STATUS_GALOG_INT_MASK | + MMIO_STATUS_GALOG_OVERFLOW_MASK)) { pr_devel("Processing IOMMU GA Log\n"); iommu_poll_ga_log(iommu); } + + if (status & MMIO_STATUS_GALOG_OVERFLOW_MASK) { + pr_info_ratelimited("IOMMU GA Log overflow\n"); + amd_iommu_restart_ga_log(iommu); + } #endif
if (status & MMIO_STATUS_EVT_OVERFLOW_INT_MASK) {
From: Maurizio Lombardi mlombard@redhat.com
[ Upstream commit 13247018d68f21e7132924b9853f7e2c423588b6 ]
If the initiator suddenly stops sending data during a login while keeping the TCP connection open, the login_work won't be scheduled and will never release the login semaphore; concurrent login operations will therefore get stuck and fail.
The bug is due to the inability of the login timeout code to properly handle this particular case.
Fix the problem by replacing the old per-NP login timer with a new per-connection timer.
The timer is started when an initiator connects to the target; if it expires, it sends a SIGINT signal to the thread pointed at by the conn->login_kworker pointer.
conn->login_kworker is set by calling the iscsit_set_login_timer_kworker() helper, initially it will point to the np thread; When the login operation's control is in the process of being passed from the NP-thread to login_work, the conn->login_worker pointer is set to NULL. Finally, login_kworker will be changed to point to the worker thread executing the login_work job.
If conn->login_kworker is NULL when the timer expires, it means that the login operation hasn't been completed yet but login_work isn't running, in this case the timer will mark the login process as failed and will schedule login_work so the latter will be forced to free the resources it holds.
Signed-off-by: Maurizio Lombardi mlombard@redhat.com Link: https://lore.kernel.org/r/20230508162219.1731964-2-mlombard@redhat.com Reviewed-by: Mike Christie michael.christie@oracle.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/target/iscsi/iscsi_target.c | 2 - drivers/target/iscsi/iscsi_target_login.c | 63 ++------------------ drivers/target/iscsi/iscsi_target_nego.c | 70 +++++++++++++---------- drivers/target/iscsi/iscsi_target_util.c | 51 +++++++++++++++++ drivers/target/iscsi/iscsi_target_util.h | 4 ++ include/target/iscsi/iscsi_target_core.h | 6 +- 6 files changed, 103 insertions(+), 93 deletions(-)
diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c index 07e196b44b91d..c70ce1fa399f8 100644 --- a/drivers/target/iscsi/iscsi_target.c +++ b/drivers/target/iscsi/iscsi_target.c @@ -363,8 +363,6 @@ struct iscsi_np *iscsit_add_np( init_completion(&np->np_restart_comp); INIT_LIST_HEAD(&np->np_list);
- timer_setup(&np->np_login_timer, iscsi_handle_login_thread_timeout, 0); - ret = iscsi_target_setup_login_socket(np, sockaddr); if (ret != 0) { kfree(np); diff --git a/drivers/target/iscsi/iscsi_target_login.c b/drivers/target/iscsi/iscsi_target_login.c index 274bdd7845ca9..90b870f234f03 100644 --- a/drivers/target/iscsi/iscsi_target_login.c +++ b/drivers/target/iscsi/iscsi_target_login.c @@ -811,59 +811,6 @@ void iscsi_post_login_handler( iscsit_dec_conn_usage_count(conn); }
-void iscsi_handle_login_thread_timeout(struct timer_list *t) -{ - struct iscsi_np *np = from_timer(np, t, np_login_timer); - - spin_lock_bh(&np->np_thread_lock); - pr_err("iSCSI Login timeout on Network Portal %pISpc\n", - &np->np_sockaddr); - - if (np->np_login_timer_flags & ISCSI_TF_STOP) { - spin_unlock_bh(&np->np_thread_lock); - return; - } - - if (np->np_thread) - send_sig(SIGINT, np->np_thread, 1); - - np->np_login_timer_flags &= ~ISCSI_TF_RUNNING; - spin_unlock_bh(&np->np_thread_lock); -} - -static void iscsi_start_login_thread_timer(struct iscsi_np *np) -{ - /* - * This used the TA_LOGIN_TIMEOUT constant because at this - * point we do not have access to ISCSI_TPG_ATTRIB(tpg)->login_timeout - */ - spin_lock_bh(&np->np_thread_lock); - np->np_login_timer_flags &= ~ISCSI_TF_STOP; - np->np_login_timer_flags |= ISCSI_TF_RUNNING; - mod_timer(&np->np_login_timer, jiffies + TA_LOGIN_TIMEOUT * HZ); - - pr_debug("Added timeout timer to iSCSI login request for" - " %u seconds.\n", TA_LOGIN_TIMEOUT); - spin_unlock_bh(&np->np_thread_lock); -} - -static void iscsi_stop_login_thread_timer(struct iscsi_np *np) -{ - spin_lock_bh(&np->np_thread_lock); - if (!(np->np_login_timer_flags & ISCSI_TF_RUNNING)) { - spin_unlock_bh(&np->np_thread_lock); - return; - } - np->np_login_timer_flags |= ISCSI_TF_STOP; - spin_unlock_bh(&np->np_thread_lock); - - del_timer_sync(&np->np_login_timer); - - spin_lock_bh(&np->np_thread_lock); - np->np_login_timer_flags &= ~ISCSI_TF_RUNNING; - spin_unlock_bh(&np->np_thread_lock); -} - int iscsit_setup_np( struct iscsi_np *np, struct sockaddr_storage *sockaddr) @@ -1123,10 +1070,13 @@ static struct iscsit_conn *iscsit_alloc_conn(struct iscsi_np *np) spin_lock_init(&conn->nopin_timer_lock); spin_lock_init(&conn->response_queue_lock); spin_lock_init(&conn->state_lock); + spin_lock_init(&conn->login_worker_lock); + spin_lock_init(&conn->login_timer_lock);
timer_setup(&conn->nopin_response_timer, iscsit_handle_nopin_response_timeout, 0); timer_setup(&conn->nopin_timer, iscsit_handle_nopin_timeout, 0); + timer_setup(&conn->login_timer, iscsit_login_timeout, 0);
if (iscsit_conn_set_transport(conn, np->np_transport) < 0) goto free_conn; @@ -1304,7 +1254,7 @@ static int __iscsi_target_login_thread(struct iscsi_np *np) goto new_sess_out; }
- iscsi_start_login_thread_timer(np); + iscsit_start_login_timer(conn, current);
pr_debug("Moving to TARG_CONN_STATE_XPT_UP.\n"); conn->conn_state = TARG_CONN_STATE_XPT_UP; @@ -1417,8 +1367,6 @@ static int __iscsi_target_login_thread(struct iscsi_np *np) if (ret < 0) goto new_sess_out;
- iscsi_stop_login_thread_timer(np); - if (ret == 1) { tpg_np = conn->tpg_np;
@@ -1434,7 +1382,7 @@ static int __iscsi_target_login_thread(struct iscsi_np *np) new_sess_out: new_sess = true; old_sess_out: - iscsi_stop_login_thread_timer(np); + iscsit_stop_login_timer(conn); tpg_np = conn->tpg_np; iscsi_target_login_sess_out(conn, zero_tsih, new_sess); new_sess = false; @@ -1448,7 +1396,6 @@ static int __iscsi_target_login_thread(struct iscsi_np *np) return 1;
exit: - iscsi_stop_login_thread_timer(np); spin_lock_bh(&np->np_thread_lock); np->np_thread_state = ISCSI_NP_THREAD_EXIT; spin_unlock_bh(&np->np_thread_lock); diff --git a/drivers/target/iscsi/iscsi_target_nego.c b/drivers/target/iscsi/iscsi_target_nego.c index 24040c118e49d..e3a5644a70b36 100644 --- a/drivers/target/iscsi/iscsi_target_nego.c +++ b/drivers/target/iscsi/iscsi_target_nego.c @@ -535,25 +535,6 @@ static void iscsi_target_login_drop(struct iscsit_conn *conn, struct iscsi_login iscsi_target_login_sess_out(conn, zero_tsih, true); }
-struct conn_timeout { - struct timer_list timer; - struct iscsit_conn *conn; -}; - -static void iscsi_target_login_timeout(struct timer_list *t) -{ - struct conn_timeout *timeout = from_timer(timeout, t, timer); - struct iscsit_conn *conn = timeout->conn; - - pr_debug("Entering iscsi_target_login_timeout >>>>>>>>>>>>>>>>>>>\n"); - - if (conn->login_kworker) { - pr_debug("Sending SIGINT to conn->login_kworker %s/%d\n", - conn->login_kworker->comm, conn->login_kworker->pid); - send_sig(SIGINT, conn->login_kworker, 1); - } -} - static void iscsi_target_do_login_rx(struct work_struct *work) { struct iscsit_conn *conn = container_of(work, @@ -562,12 +543,15 @@ static void iscsi_target_do_login_rx(struct work_struct *work) struct iscsi_np *np = login->np; struct iscsi_portal_group *tpg = conn->tpg; struct iscsi_tpg_np *tpg_np = conn->tpg_np; - struct conn_timeout timeout; int rc, zero_tsih = login->zero_tsih; bool state;
pr_debug("entering iscsi_target_do_login_rx, conn: %p, %s:%d\n", conn, current->comm, current->pid); + + spin_lock(&conn->login_worker_lock); + set_bit(LOGIN_FLAGS_WORKER_RUNNING, &conn->login_flags); + spin_unlock(&conn->login_worker_lock); /* * If iscsi_target_do_login_rx() has been invoked by ->sk_data_ready() * before initial PDU processing in iscsi_target_start_negotiation() @@ -597,19 +581,16 @@ static void iscsi_target_do_login_rx(struct work_struct *work) goto err; }
- conn->login_kworker = current; allow_signal(SIGINT); - - timeout.conn = conn; - timer_setup_on_stack(&timeout.timer, iscsi_target_login_timeout, 0); - mod_timer(&timeout.timer, jiffies + TA_LOGIN_TIMEOUT * HZ); - pr_debug("Starting login timer for %s/%d\n", current->comm, current->pid); + rc = iscsit_set_login_timer_kworker(conn, current); + if (rc < 0) { + /* The login timer has already expired */ + pr_debug("iscsi_target_do_login_rx, login failed\n"); + goto err; + }
rc = conn->conn_transport->iscsit_get_login_rx(conn, login); - del_timer_sync(&timeout.timer); - destroy_timer_on_stack(&timeout.timer); flush_signals(current); - conn->login_kworker = NULL;
if (rc < 0) goto err; @@ -646,7 +627,17 @@ static void iscsi_target_do_login_rx(struct work_struct *work) if (iscsi_target_sk_check_and_clear(conn, LOGIN_FLAGS_WRITE_ACTIVE)) goto err; + + /* + * Set the login timer thread pointer to NULL to prevent the + * login process from getting stuck if the initiator + * stops sending data. + */ + rc = iscsit_set_login_timer_kworker(conn, NULL); + if (rc < 0) + goto err; } else if (rc == 1) { + iscsit_stop_login_timer(conn); cancel_delayed_work(&conn->login_work); iscsi_target_nego_release(conn); iscsi_post_login_handler(np, conn, zero_tsih); @@ -656,6 +647,7 @@ static void iscsi_target_do_login_rx(struct work_struct *work)
err: iscsi_target_restore_sock_callbacks(conn); + iscsit_stop_login_timer(conn); cancel_delayed_work(&conn->login_work); iscsi_target_login_drop(conn, login); iscsit_deaccess_np(np, tpg, tpg_np); @@ -1368,14 +1360,30 @@ int iscsi_target_start_negotiation( * and perform connection cleanup now. */ ret = iscsi_target_do_login(conn, login); - if (!ret && iscsi_target_sk_check_and_clear(conn, LOGIN_FLAGS_INITIAL_PDU)) - ret = -1; + if (!ret) { + spin_lock(&conn->login_worker_lock); + + if (iscsi_target_sk_check_and_clear(conn, LOGIN_FLAGS_INITIAL_PDU)) + ret = -1; + else if (!test_bit(LOGIN_FLAGS_WORKER_RUNNING, &conn->login_flags)) { + if (iscsit_set_login_timer_kworker(conn, NULL) < 0) { + /* + * The timeout has expired already. + * Schedule login_work to perform the cleanup. + */ + schedule_delayed_work(&conn->login_work, 0); + } + } + + spin_unlock(&conn->login_worker_lock); + }
if (ret < 0) { iscsi_target_restore_sock_callbacks(conn); iscsi_remove_failed_auth_entry(conn); } if (ret != 0) { + iscsit_stop_login_timer(conn); cancel_delayed_work_sync(&conn->login_work); iscsi_target_nego_release(conn); } diff --git a/drivers/target/iscsi/iscsi_target_util.c b/drivers/target/iscsi/iscsi_target_util.c index 26dc8ed3045b6..b14835fcb0334 100644 --- a/drivers/target/iscsi/iscsi_target_util.c +++ b/drivers/target/iscsi/iscsi_target_util.c @@ -1040,6 +1040,57 @@ void iscsit_stop_nopin_timer(struct iscsit_conn *conn) spin_unlock_bh(&conn->nopin_timer_lock); }
+void iscsit_login_timeout(struct timer_list *t) +{ + struct iscsit_conn *conn = from_timer(conn, t, login_timer); + struct iscsi_login *login = conn->login; + + pr_debug("Entering iscsi_target_login_timeout >>>>>>>>>>>>>>>>>>>\n"); + + spin_lock_bh(&conn->login_timer_lock); + login->login_failed = 1; + + if (conn->login_kworker) { + pr_debug("Sending SIGINT to conn->login_kworker %s/%d\n", + conn->login_kworker->comm, conn->login_kworker->pid); + send_sig(SIGINT, conn->login_kworker, 1); + } else { + schedule_delayed_work(&conn->login_work, 0); + } + spin_unlock_bh(&conn->login_timer_lock); +} + +void iscsit_start_login_timer(struct iscsit_conn *conn, struct task_struct *kthr) +{ + pr_debug("Login timer started\n"); + + conn->login_kworker = kthr; + mod_timer(&conn->login_timer, jiffies + TA_LOGIN_TIMEOUT * HZ); +} + +int iscsit_set_login_timer_kworker(struct iscsit_conn *conn, struct task_struct *kthr) +{ + struct iscsi_login *login = conn->login; + int ret = 0; + + spin_lock_bh(&conn->login_timer_lock); + if (login->login_failed) { + /* The timer has already expired */ + ret = -1; + } else { + conn->login_kworker = kthr; + } + spin_unlock_bh(&conn->login_timer_lock); + + return ret; +} + +void iscsit_stop_login_timer(struct iscsit_conn *conn) +{ + pr_debug("Login timer stopped\n"); + timer_delete_sync(&conn->login_timer); +} + int iscsit_send_tx_data( struct iscsit_cmd *cmd, struct iscsit_conn *conn, diff --git a/drivers/target/iscsi/iscsi_target_util.h b/drivers/target/iscsi/iscsi_target_util.h index 33ea799a08508..24b8e577575a6 100644 --- a/drivers/target/iscsi/iscsi_target_util.h +++ b/drivers/target/iscsi/iscsi_target_util.h @@ -56,6 +56,10 @@ extern void iscsit_handle_nopin_timeout(struct timer_list *t); extern void __iscsit_start_nopin_timer(struct iscsit_conn *); extern void iscsit_start_nopin_timer(struct iscsit_conn *); extern void iscsit_stop_nopin_timer(struct iscsit_conn *); +extern void iscsit_login_timeout(struct timer_list *t); +extern void iscsit_start_login_timer(struct iscsit_conn *, struct task_struct *kthr); +extern void iscsit_stop_login_timer(struct iscsit_conn *); +extern int iscsit_set_login_timer_kworker(struct iscsit_conn *, struct task_struct *kthr); extern int iscsit_send_tx_data(struct iscsit_cmd *, struct iscsit_conn *, int); extern int iscsit_fe_sendpage_sg(struct iscsit_cmd *, struct iscsit_conn *); extern int iscsit_tx_login_rsp(struct iscsit_conn *, u8, u8); diff --git a/include/target/iscsi/iscsi_target_core.h b/include/target/iscsi/iscsi_target_core.h index 229118156a1f6..42f4a4c0c1003 100644 --- a/include/target/iscsi/iscsi_target_core.h +++ b/include/target/iscsi/iscsi_target_core.h @@ -562,12 +562,14 @@ struct iscsit_conn { #define LOGIN_FLAGS_READ_ACTIVE 2 #define LOGIN_FLAGS_WRITE_ACTIVE 3 #define LOGIN_FLAGS_CLOSED 4 +#define LOGIN_FLAGS_WORKER_RUNNING 5 unsigned long login_flags; struct delayed_work login_work; struct iscsi_login *login; struct timer_list nopin_timer; struct timer_list nopin_response_timer; struct timer_list transport_timer; + struct timer_list login_timer; struct task_struct *login_kworker; /* Spinlock used for add/deleting cmd's from conn_cmd_list */ spinlock_t cmd_lock; @@ -576,6 +578,8 @@ struct iscsit_conn { spinlock_t nopin_timer_lock; spinlock_t response_queue_lock; spinlock_t state_lock; + spinlock_t login_timer_lock; + spinlock_t login_worker_lock; /* libcrypto RX and TX contexts for crc32c */ struct ahash_request *conn_rx_hash; struct ahash_request *conn_tx_hash; @@ -792,7 +796,6 @@ struct iscsi_np { enum np_thread_state_table np_thread_state; bool enabled; atomic_t np_reset_count; - enum iscsi_timer_flags_table np_login_timer_flags; u32 np_exports; enum np_flags_table np_flags; spinlock_t np_thread_lock; @@ -800,7 +803,6 @@ struct iscsi_np { struct socket *np_socket; struct sockaddr_storage np_sockaddr; struct task_struct *np_thread; - struct timer_list np_login_timer; void *np_context; struct iscsit_transport *np_transport; struct list_head np_list;
From: Maurizio Lombardi mlombard@redhat.com
[ Upstream commit 98a8c2bf938a5973716f280da618077a3d255976 ]
Signed-off-by: Maurizio Lombardi mlombard@redhat.com Link: https://lore.kernel.org/r/20230508162219.1731964-3-mlombard@redhat.com Reviewed-by: Mike Christie michael.christie@oracle.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/target/iscsi/iscsi_target_core.h | 1 - 1 file changed, 1 deletion(-)
diff --git a/include/target/iscsi/iscsi_target_core.h b/include/target/iscsi/iscsi_target_core.h index 42f4a4c0c1003..4c15420e8965d 100644 --- a/include/target/iscsi/iscsi_target_core.h +++ b/include/target/iscsi/iscsi_target_core.h @@ -568,7 +568,6 @@ struct iscsit_conn { struct iscsi_login *login; struct timer_list nopin_timer; struct timer_list nopin_response_timer; - struct timer_list transport_timer; struct timer_list login_timer; struct task_struct *login_kworker; /* Spinlock used for add/deleting cmd's from conn_cmd_list */
From: Maurizio Lombardi mlombard@redhat.com
[ Upstream commit 2a737d3b8c792400118d6cf94958f559de9c5e59 ]
The tpg->np_login_sem is a semaphore that is used to serialize the login process when multiple login threads run concurrently against the same target portal group.
The iscsi_target_locate_portal() function finds the tpg, calls iscsit_access_np() against the np_login_sem semaphore and saves the tpg pointer in conn->tpg;
If iscsi_target_locate_portal() fails, the caller will check for the conn->tpg pointer and, if it's not NULL, then it will assume that iscsi_target_locate_portal() called iscsit_access_np() on the semaphore.
Make sure that conn->tpg gets initialized only if iscsit_access_np() was successful, otherwise iscsit_deaccess_np() may end up being called against a semaphore we never took, allowing more than one thread to access the same tpg.
Signed-off-by: Maurizio Lombardi mlombard@redhat.com Link: https://lore.kernel.org/r/20230508162219.1731964-4-mlombard@redhat.com Reviewed-by: Mike Christie michael.christie@oracle.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/target/iscsi/iscsi_target_nego.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/target/iscsi/iscsi_target_nego.c b/drivers/target/iscsi/iscsi_target_nego.c index e3a5644a70b36..fa3fb5f4e6bc4 100644 --- a/drivers/target/iscsi/iscsi_target_nego.c +++ b/drivers/target/iscsi/iscsi_target_nego.c @@ -1122,6 +1122,7 @@ int iscsi_target_locate_portal( iscsi_target_set_sock_callbacks(conn);
login->np = np; + conn->tpg = NULL;
login_req = (struct iscsi_login_req *) login->req; payload_length = ntoh24(login_req->dlength); @@ -1189,7 +1190,6 @@ int iscsi_target_locate_portal( */ sessiontype = strncmp(s_buf, DISCOVERY, 9); if (!sessiontype) { - conn->tpg = iscsit_global->discovery_tpg; if (!login->leading_connection) goto get_target;
@@ -1206,9 +1206,11 @@ int iscsi_target_locate_portal( * Serialize access across the discovery struct iscsi_portal_group to * process login attempt. */ + conn->tpg = iscsit_global->discovery_tpg; if (iscsit_access_np(np, conn->tpg) < 0) { iscsit_tx_login_rsp(conn, ISCSI_STATUS_CLS_TARGET_ERR, ISCSI_LOGIN_STATUS_SVC_UNAVAILABLE); + conn->tpg = NULL; ret = -1; goto out; }
From: Sung-Chi Li lschyi@chromium.org
[ Upstream commit ed84c4517a5bc536e8572a01dfa11bc22a280d06 ]
Add 1 additional hammer-like device.
Signed-off-by: Sung-Chi Li lschyi@chromium.org Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hid/hid-google-hammer.c | 2 ++ drivers/hid/hid-ids.h | 1 + 2 files changed, 3 insertions(+)
diff --git a/drivers/hid/hid-google-hammer.c b/drivers/hid/hid-google-hammer.c index 7ae5f27df54dd..c6bdb9c4ef3e0 100644 --- a/drivers/hid/hid-google-hammer.c +++ b/drivers/hid/hid-google-hammer.c @@ -586,6 +586,8 @@ static const struct hid_device_id hammer_devices[] = { USB_VENDOR_ID_GOOGLE, USB_DEVICE_ID_GOOGLE_EEL) }, { HID_DEVICE(BUS_USB, HID_GROUP_GENERIC, USB_VENDOR_ID_GOOGLE, USB_DEVICE_ID_GOOGLE_HAMMER) }, + { HID_DEVICE(BUS_USB, HID_GROUP_GENERIC, + USB_VENDOR_ID_GOOGLE, USB_DEVICE_ID_GOOGLE_JEWEL) }, { HID_DEVICE(BUS_USB, HID_GROUP_GENERIC, USB_VENDOR_ID_GOOGLE, USB_DEVICE_ID_GOOGLE_MAGNEMITE) }, { HID_DEVICE(BUS_USB, HID_GROUP_GENERIC, diff --git a/drivers/hid/hid-ids.h b/drivers/hid/hid-ids.h index 8f3e0a5d5f834..2e23018c9f23d 100644 --- a/drivers/hid/hid-ids.h +++ b/drivers/hid/hid-ids.h @@ -529,6 +529,7 @@ #define USB_DEVICE_ID_GOOGLE_MOONBALL 0x5044 #define USB_DEVICE_ID_GOOGLE_DON 0x5050 #define USB_DEVICE_ID_GOOGLE_EEL 0x5057 +#define USB_DEVICE_ID_GOOGLE_JEWEL 0x5061
#define USB_VENDOR_ID_GOTOP 0x08f2 #define USB_DEVICE_ID_SUPER_Q2 0x007f
From: Denis Arefev arefev@swemel.ru
[ Upstream commit 16a9c24f24fbe4564284eb575b18cc20586b9270 ]
Added a variable check and transition in case of an error
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Signed-off-by: Denis Arefev arefev@swemel.ru Reviewed-by: Ping Cheng ping.cheng@wacom.com Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hid/wacom_sys.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/hid/wacom_sys.c b/drivers/hid/wacom_sys.c index fb538a6c4add8..aff4a21a46b6a 100644 --- a/drivers/hid/wacom_sys.c +++ b/drivers/hid/wacom_sys.c @@ -2417,8 +2417,13 @@ static int wacom_parse_and_register(struct wacom *wacom, bool wireless) goto fail_quirks; }
- if (features->device_type & WACOM_DEVICETYPE_WL_MONITOR) + if (features->device_type & WACOM_DEVICETYPE_WL_MONITOR) { error = hid_hw_open(hdev); + if (error) { + hid_err(hdev, "hw open failed\n"); + goto fail_quirks; + } + }
wacom_set_shared_values(wacom_wac); devres_close_group(&hdev->dev, wacom);
From: Marc Zyngier maz@kernel.org
[ Upstream commit 8d0f019e4c4f2ee2de81efd9bf1c27e9fb3c0460 ]
Add the missing Set/Way CMOs that apply to tagged memory.
Signed-off-by: Marc Zyngier maz@kernel.org Reviewed-by: Cornelia Huck cohuck@redhat.com Reviewed-by: Steven Price steven.price@arm.com Reviewed-by: Oliver Upton oliver.upton@linux.dev Link: https://lore.kernel.org/r/20230515204601.1270428-2-maz@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/include/asm/sysreg.h | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h index 9e3ecba3c4e67..d4a85c1db2e2a 100644 --- a/arch/arm64/include/asm/sysreg.h +++ b/arch/arm64/include/asm/sysreg.h @@ -115,8 +115,14 @@ #define SB_BARRIER_INSN __SYS_BARRIER_INSN(0, 7, 31)
#define SYS_DC_ISW sys_insn(1, 0, 7, 6, 2) +#define SYS_DC_IGSW sys_insn(1, 0, 7, 6, 4) +#define SYS_DC_IGDSW sys_insn(1, 0, 7, 6, 6) #define SYS_DC_CSW sys_insn(1, 0, 7, 10, 2) +#define SYS_DC_CGSW sys_insn(1, 0, 7, 10, 4) +#define SYS_DC_CGDSW sys_insn(1, 0, 7, 10, 6) #define SYS_DC_CISW sys_insn(1, 0, 7, 14, 2) +#define SYS_DC_CIGSW sys_insn(1, 0, 7, 14, 4) +#define SYS_DC_CIGDSW sys_insn(1, 0, 7, 14, 6)
/* * Automatically generated definitions for system registers, the
From: Steve French stfrench@microsoft.com
[ Upstream commit b535cc796a4b4942cd189652588e8d37c1f5925a ]
If plen is null when passed in, we only checked for null in one of the two places where it could be used. Although plen is always valid (not null) for current callers of the SMB2_change_notify function, this change makes it more consistent.
Reported-by: kernel test robot lkp@intel.com Reported-by: Dan Carpenter error27@gmail.com Closes: https://lore.kernel.org/all/202305251831.3V1gbbFs-lkp@intel.com/ Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cifs/smb2pdu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/cifs/smb2pdu.c b/fs/cifs/smb2pdu.c index 02228a590cd54..a2a95cd113df8 100644 --- a/fs/cifs/smb2pdu.c +++ b/fs/cifs/smb2pdu.c @@ -3777,7 +3777,7 @@ SMB2_change_notify(const unsigned int xid, struct cifs_tcon *tcon, if (*out_data == NULL) { rc = -ENOMEM; goto cnotify_exit; - } else + } else if (plen) *plen = le32_to_cpu(smb_rsp->OutputBufferLength); }
From: Hans Verkuil hverkuil-cisco@xs4all.nl
[ Upstream commit fe4526d99e2e06b08bb80316c3a596ea6a807b75 ]
Explicitly disable the CEC adapter in cec_devnode_unregister()
Usually this does not really do anything important, but for drivers that use the CEC pin framework this is needed to properly stop the hrtimer. Without this a crash would happen when such a driver is unloaded with rmmod.
Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/cec/core/cec-adap.c | 5 ++++- drivers/media/cec/core/cec-core.c | 2 ++ drivers/media/cec/core/cec-priv.h | 1 + 3 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/media/cec/core/cec-adap.c b/drivers/media/cec/core/cec-adap.c index 4f5ab3cae8a71..ac18707fddcd2 100644 --- a/drivers/media/cec/core/cec-adap.c +++ b/drivers/media/cec/core/cec-adap.c @@ -1582,7 +1582,7 @@ static void cec_claim_log_addrs(struct cec_adapter *adap, bool block) * * This function is called with adap->lock held. */ -static int cec_adap_enable(struct cec_adapter *adap) +int cec_adap_enable(struct cec_adapter *adap) { bool enable; int ret = 0; @@ -1592,6 +1592,9 @@ static int cec_adap_enable(struct cec_adapter *adap) if (adap->needs_hpd) enable = enable && adap->phys_addr != CEC_PHYS_ADDR_INVALID;
+ if (adap->devnode.unregistered) + enable = false; + if (enable == adap->is_enabled) return 0;
diff --git a/drivers/media/cec/core/cec-core.c b/drivers/media/cec/core/cec-core.c index af358e901b5f3..7e153c5cad04f 100644 --- a/drivers/media/cec/core/cec-core.c +++ b/drivers/media/cec/core/cec-core.c @@ -191,6 +191,8 @@ static void cec_devnode_unregister(struct cec_adapter *adap) mutex_lock(&adap->lock); __cec_s_phys_addr(adap, CEC_PHYS_ADDR_INVALID, false); __cec_s_log_addrs(adap, NULL, false); + // Disable the adapter (since adap->devnode.unregistered is true) + cec_adap_enable(adap); mutex_unlock(&adap->lock);
cdev_device_del(&devnode->cdev, &devnode->dev); diff --git a/drivers/media/cec/core/cec-priv.h b/drivers/media/cec/core/cec-priv.h index b78df931aa74b..ed1f8c67626bf 100644 --- a/drivers/media/cec/core/cec-priv.h +++ b/drivers/media/cec/core/cec-priv.h @@ -47,6 +47,7 @@ int cec_monitor_pin_cnt_inc(struct cec_adapter *adap); void cec_monitor_pin_cnt_dec(struct cec_adapter *adap); int cec_adap_status(struct seq_file *file, void *priv); int cec_thread_func(void *_adap); +int cec_adap_enable(struct cec_adapter *adap); void __cec_s_phys_addr(struct cec_adapter *adap, u16 phys_addr, bool block); int __cec_s_log_addrs(struct cec_adapter *adap, struct cec_log_addrs *log_addrs, bool block);
From: Hans Verkuil hverkuil-cisco@xs4all.nl
[ Upstream commit 73af6c7511038249cad3d5f3b44bf8d78ac0f499 ]
When a message was received the last_initiator is set to 0xff. This will force the signal free time for the next transmit to that for a new initiator. However, if a new transmit is already in progress, then don't set last_initiator, since that's the initiator of the current transmit. Overwriting this would cause the signal free time of a following transmit to be that of the new initiator instead of a next transmit.
Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/cec/core/cec-adap.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/media/cec/core/cec-adap.c b/drivers/media/cec/core/cec-adap.c index ac18707fddcd2..b1512f9c5895c 100644 --- a/drivers/media/cec/core/cec-adap.c +++ b/drivers/media/cec/core/cec-adap.c @@ -1090,7 +1090,8 @@ void cec_received_msg_ts(struct cec_adapter *adap, mutex_lock(&adap->lock); dprintk(2, "%s: %*ph\n", __func__, msg->len, msg->msg);
- adap->last_initiator = 0xff; + if (!adap->transmit_in_progress) + adap->last_initiator = 0xff;
/* Check if this message was for us (directed or broadcast). */ if (!cec_msg_is_broadcast(msg))
From: Osama Muhammad osmtendev@gmail.com
[ Upstream commit 9b9e46aa07273ceb96866b2e812b46f1ee0b8d2f ]
This patch fixes the error checking in nfcsim.c. The DebugFS kernel API is developed in a way that the caller can safely ignore the errors that occur during the creation of DebugFS nodes.
Signed-off-by: Osama Muhammad osmtendev@gmail.com Reviewed-by: Simon Horman simon.horman@corigine.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nfc/nfcsim.c | 4 ---- 1 file changed, 4 deletions(-)
diff --git a/drivers/nfc/nfcsim.c b/drivers/nfc/nfcsim.c index 85bf8d586c707..0f6befe8be1e2 100644 --- a/drivers/nfc/nfcsim.c +++ b/drivers/nfc/nfcsim.c @@ -336,10 +336,6 @@ static struct dentry *nfcsim_debugfs_root; static void nfcsim_debugfs_init(void) { nfcsim_debugfs_root = debugfs_create_dir("nfcsim", NULL); - - if (!nfcsim_debugfs_root) - pr_err("Could not create debugfs entry\n"); - }
static void nfcsim_debugfs_remove(void)
From: Shida Zhang zhangshida@kylinos.cn
[ Upstream commit 8fd9f4232d8152c650fd15127f533a0f6d0a4b2b ]
This fixes the following warning reported by gcc 10.2.1 under x86_64:
../fs/btrfs/tree-log.c: In function ‘btrfs_log_inode’: ../fs/btrfs/tree-log.c:6211:9: error: ‘last_range_start’ may be used uninitialized in this function [-Werror=maybe-uninitialized] 6211 | ret = insert_dir_log_key(trans, log, path, key.objectid, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 6212 | first_dir_index, last_dir_index); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../fs/btrfs/tree-log.c:6161:6: note: ‘last_range_start’ was declared here 6161 | u64 last_range_start; | ^~~~~~~~~~~~~~~~
This might be a false positive fixed in later compiler versions but we want to have it fixed.
Reported-by: k2ci kernel-bot@kylinos.cn Reviewed-by: Anand Jain anand.jain@oracle.com Signed-off-by: Shida Zhang zhangshida@kylinos.cn Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/tree-log.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c index 200cea6e49e51..b91fa398b8141 100644 --- a/fs/btrfs/tree-log.c +++ b/fs/btrfs/tree-log.c @@ -6163,7 +6163,7 @@ static int log_delayed_deletions_incremental(struct btrfs_trans_handle *trans, { struct btrfs_root *log = inode->root->log_root; const struct btrfs_delayed_item *curr; - u64 last_range_start; + u64 last_range_start = 0; u64 last_range_end = 0; struct btrfs_key key;
From: Dan Carpenter dan.carpenter@linaro.org
[ Upstream commit 016da9c65fec9f0e78c4909ed9a0f2d567af6775 ]
The "udc" pointer was never set in the probe() function so it will lead to a NULL dereference in udc_pci_remove() when we do:
usb_del_gadget_udc(&udc->gadget);
Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Link: https://lore.kernel.org/r/ZG+A/dNpFWAlCChk@kili Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/gadget/udc/amd5536udc_pci.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/usb/gadget/udc/amd5536udc_pci.c b/drivers/usb/gadget/udc/amd5536udc_pci.c index c80f9bd51b750..a36913ae31f9e 100644 --- a/drivers/usb/gadget/udc/amd5536udc_pci.c +++ b/drivers/usb/gadget/udc/amd5536udc_pci.c @@ -170,6 +170,9 @@ static int udc_pci_probe( retval = -ENODEV; goto err_probe; } + + udc = dev; + return 0;
err_probe:
From: "min15.li" min15.li@samsung.com
[ Upstream commit 31a5978243d24d77be4bacca56c78a0fbc43b00d ]
In the function nvme_passthru_end(), only the value of the command opcode is checked, without checking the command type (IO command or Admin command). When we send a Dataset Management command (The opcode of the Dataset Management command is the same as the Set Feature command), kernel thinks it is a set feature command, then sets the controller's keep alive interval, and calls nvme_keep_alive_work().
Signed-off-by: min15.li min15.li@samsung.com Reviewed-by: Kanchan Joshi joshi.k@samsung.com Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Keith Busch kbusch@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/core.c | 4 +++- drivers/nvme/host/ioctl.c | 2 +- drivers/nvme/host/nvme.h | 2 +- drivers/nvme/target/passthru.c | 2 +- 4 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index bdf1601219fc4..dba10d182c6b6 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -1115,7 +1115,7 @@ u32 nvme_passthru_start(struct nvme_ctrl *ctrl, struct nvme_ns *ns, u8 opcode) } EXPORT_SYMBOL_NS_GPL(nvme_passthru_start, NVME_TARGET_PASSTHRU);
-void nvme_passthru_end(struct nvme_ctrl *ctrl, u32 effects, +void nvme_passthru_end(struct nvme_ctrl *ctrl, struct nvme_ns *ns, u32 effects, struct nvme_command *cmd, int status) { if (effects & NVME_CMD_EFFECTS_CSE_MASK) { @@ -1132,6 +1132,8 @@ void nvme_passthru_end(struct nvme_ctrl *ctrl, u32 effects, nvme_queue_scan(ctrl); flush_work(&ctrl->scan_work); } + if (ns) + return;
switch (cmd->common.opcode) { case nvme_admin_set_features: diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c index d24ea2e051564..0264ec74a4361 100644 --- a/drivers/nvme/host/ioctl.c +++ b/drivers/nvme/host/ioctl.c @@ -254,7 +254,7 @@ static int nvme_submit_user_cmd(struct request_queue *q, blk_mq_free_request(req);
if (effects) - nvme_passthru_end(ctrl, effects, cmd, ret); + nvme_passthru_end(ctrl, ns, effects, cmd, ret);
return ret; } diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h index bf46f122e9e1e..fcf3af2d64509 100644 --- a/drivers/nvme/host/nvme.h +++ b/drivers/nvme/host/nvme.h @@ -1072,7 +1072,7 @@ u32 nvme_command_effects(struct nvme_ctrl *ctrl, struct nvme_ns *ns, u8 opcode); u32 nvme_passthru_start(struct nvme_ctrl *ctrl, struct nvme_ns *ns, u8 opcode); int nvme_execute_rq(struct request *rq, bool at_head); -void nvme_passthru_end(struct nvme_ctrl *ctrl, u32 effects, +void nvme_passthru_end(struct nvme_ctrl *ctrl, struct nvme_ns *ns, u32 effects, struct nvme_command *cmd, int status); struct nvme_ctrl *nvme_ctrl_from_file(struct file *file); struct nvme_ns *nvme_find_get_ns(struct nvme_ctrl *ctrl, unsigned nsid); diff --git a/drivers/nvme/target/passthru.c b/drivers/nvme/target/passthru.c index 511c980d538df..71a9c1cc57f59 100644 --- a/drivers/nvme/target/passthru.c +++ b/drivers/nvme/target/passthru.c @@ -243,7 +243,7 @@ static void nvmet_passthru_execute_cmd_work(struct work_struct *w) blk_mq_free_request(rq);
if (effects) - nvme_passthru_end(ctrl, effects, req->cmd, status); + nvme_passthru_end(ctrl, ns, effects, req->cmd, status); }
static enum rq_end_io_ret nvmet_passthru_req_done(struct request *rq,
From: Uday Shankar ushankar@purestorage.com
[ Upstream commit ea4d453b9ec9ea279c39744cd0ecb47ef48ede35 ]
With TBKAS on, the completion of one command can defer sending a keep alive for up to twice the delay between successive runs of nvme_keep_alive_work. The current delay of KATO / 2 thus makes it possible for one command to defer sending a keep alive for up to KATO, which can result in the controller detecting a KATO. The following trace demonstrates the issue, taking KATO = 8 for simplicity:
1. t = 0: run nvme_keep_alive_work, no keep-alive sent 2. t = ε: I/O completion seen, set comp_seen = true 3. t = 4: run nvme_keep_alive_work, see comp_seen == true, skip sending keep-alive, set comp_seen = false 4. t = 8: run nvme_keep_alive_work, see comp_seen == false, send a keep-alive command.
Here, there is a delay of 8 - ε between receiving a command completion and sending the next command. With ε small, the controller is likely to detect a keep alive timeout.
Fix this by running nvme_keep_alive_work with a delay of KATO / 4 whenever TBKAS is on. Going through the above trace now gives us a worst-case delay of 4 - ε, which is in line with the recommendation of sending a command every KATO / 2 in the NVMe specification.
Reported-by: Costa Sapuntzakis costa@purestorage.com Reported-by: Randy Jennings randyj@purestorage.com Signed-off-by: Uday Shankar ushankar@purestorage.com Reviewed-by: Hannes Reinecke hare@suse.de Reviewed-by: Sagi Grimberg sagi@grimberg.me Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Keith Busch kbusch@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/core.c | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index dba10d182c6b6..28b6fd52f88cb 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -1163,9 +1163,25 @@ EXPORT_SYMBOL_NS_GPL(nvme_passthru_end, NVME_TARGET_PASSTHRU); * The host should send Keep Alive commands at half of the Keep Alive Timeout * accounting for transport roundtrip times [..]. */ +static unsigned long nvme_keep_alive_work_period(struct nvme_ctrl *ctrl) +{ + unsigned long delay = ctrl->kato * HZ / 2; + + /* + * When using Traffic Based Keep Alive, we need to run + * nvme_keep_alive_work at twice the normal frequency, as one + * command completion can postpone sending a keep alive command + * by up to twice the delay between runs. + */ + if (ctrl->ctratt & NVME_CTRL_ATTR_TBKAS) + delay /= 2; + return delay; +} + static void nvme_queue_keep_alive_work(struct nvme_ctrl *ctrl) { - queue_delayed_work(nvme_wq, &ctrl->ka_work, ctrl->kato * HZ / 2); + queue_delayed_work(nvme_wq, &ctrl->ka_work, + nvme_keep_alive_work_period(ctrl)); }
static enum rq_end_io_ret nvme_keep_alive_end_io(struct request *rq,
From: Uday Shankar ushankar@purestorage.com
[ Upstream commit 774a9636514764ddc0d072ae0d1d1c01a47e6ddd ]
When a command completes, we set a flag which will skip sending a keep alive at the next run of nvme_keep_alive_work when TBKAS is on. However, if the command was submitted long ago, it's possible that the controller may have also restarted its keep alive timer (as a result of receiving the command) long ago. The following trace demonstrates the issue, assuming TBKAS is on and KATO = 8 for simplicity:
1. t = 0: submit I/O commands A, B, C, D, E 2. t = 0.5: commands A, B, C, D, E reach controller, restart its keep alive timer 3. t = 1: A completes 4. t = 2: run nvme_keep_alive_work, see recent completion, do nothing 5. t = 3: B completes 6. t = 4: run nvme_keep_alive_work, see recent completion, do nothing 7. t = 5: C completes 8. t = 6: run nvme_keep_alive_work, see recent completion, do nothing 9. t = 7: D completes 10. t = 8: run nvme_keep_alive_work, see recent completion, do nothing 11. t = 9: E completes
At this point, 8.5 seconds have passed without restarting the controller's keep alive timer, so the controller will detect a keep alive timeout.
Fix this by checking the IO start time when deciding to defer sending a keep alive command. Only set comp_seen if the command started after the most recent run of nvme_keep_alive_work. With this change, the completions of B, C, and D will not set comp_seen and the run of nvme_keep_alive_work at t = 4 will send a keep alive.
Reported-by: Costa Sapuntzakis costa@purestorage.com Reported-by: Randy Jennings randyj@purestorage.com Signed-off-by: Uday Shankar ushankar@purestorage.com Reviewed-by: Hannes Reinecke hare@suse.de Reviewed-by: Sagi Grimberg sagi@grimberg.me Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Keith Busch kbusch@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/core.c | 14 +++++++++++++- drivers/nvme/host/nvme.h | 1 + 2 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 28b6fd52f88cb..1f66ba634e9c3 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -397,7 +397,16 @@ void nvme_complete_rq(struct request *req) trace_nvme_complete_rq(req); nvme_cleanup_cmd(req);
- if (ctrl->kas) + /* + * Completions of long-running commands should not be able to + * defer sending of periodic keep alives, since the controller + * may have completed processing such commands a long time ago + * (arbitrarily close to command submission time). + * req->deadline - req->timeout is the command submission time + * in jiffies. + */ + if (ctrl->kas && + req->deadline - req->timeout >= ctrl->ka_last_check_time) ctrl->comp_seen = true;
switch (nvme_decide_disposition(req)) { @@ -1200,6 +1209,7 @@ static enum rq_end_io_ret nvme_keep_alive_end_io(struct request *rq, return RQ_END_IO_NONE; }
+ ctrl->ka_last_check_time = jiffies; ctrl->comp_seen = false; spin_lock_irqsave(&ctrl->lock, flags); if (ctrl->state == NVME_CTRL_LIVE || @@ -1218,6 +1228,8 @@ static void nvme_keep_alive_work(struct work_struct *work) bool comp_seen = ctrl->comp_seen; struct request *rq;
+ ctrl->ka_last_check_time = jiffies; + if ((ctrl->ctratt & NVME_CTRL_ATTR_TBKAS) && comp_seen) { dev_dbg(ctrl->device, "reschedule traffic based keep-alive timer\n"); diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h index fcf3af2d64509..3961b064499c0 100644 --- a/drivers/nvme/host/nvme.h +++ b/drivers/nvme/host/nvme.h @@ -323,6 +323,7 @@ struct nvme_ctrl { struct delayed_work ka_work; struct delayed_work failfast_work; struct nvme_command ka_cmd; + unsigned long ka_last_check_time; struct work_struct fw_act_work; unsigned long events;
From: Theodore Ts'o tytso@mit.edu
[ Upstream commit eb1f822c76beeaa76ab8b6737ab9dc9f9798408c ]
In commit a44be64bbecb ("ext4: don't clear SB_RDONLY when remounting r/w until quota is re-enabled") we defer clearing tyhe SB_RDONLY flag in struct super. However, we didn't defer when we checked sb_rdonly() to determine the lazy itable init thread should be enabled, with the next result that the lazy inode table initialization would not be properly started. This can cause generic/231 to fail in ext4's nojournal mode.
Fix this by moving when we decide to start or stop the lazy itable init thread to after we clear the SB_RDONLY flag when we are remounting the file system read/write.
Fixes a44be64bbecb ("ext4: don't clear SB_RDONLY when remounting r/w until...")
Signed-off-by: Theodore Ts'o tytso@mit.edu Link: https://lore.kernel.org/r/20230527035729.1001605-1-tytso@mit.edu Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ext4/super.c | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-)
diff --git a/fs/ext4/super.c b/fs/ext4/super.c index d34afa8e0c158..1f222c396932e 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -6554,18 +6554,6 @@ static int __ext4_remount(struct fs_context *fc, struct super_block *sb) } }
- /* - * Reinitialize lazy itable initialization thread based on - * current settings - */ - if (sb_rdonly(sb) || !test_opt(sb, INIT_INODE_TABLE)) - ext4_unregister_li_request(sb); - else { - ext4_group_t first_not_zeroed; - first_not_zeroed = ext4_has_uninit_itable(sb); - ext4_register_li_request(sb, first_not_zeroed); - } - /* * Handle creation of system zone data early because it can fail. * Releasing of existing data is done when we are sure remount will @@ -6603,6 +6591,18 @@ static int __ext4_remount(struct fs_context *fc, struct super_block *sb) if (enable_rw) sb->s_flags &= ~SB_RDONLY;
+ /* + * Reinitialize lazy itable initialization thread based on + * current settings + */ + if (sb_rdonly(sb) || !test_opt(sb, INIT_INODE_TABLE)) + ext4_unregister_li_request(sb); + else { + ext4_group_t first_not_zeroed; + first_not_zeroed = ext4_has_uninit_itable(sb); + ext4_register_li_request(sb, first_not_zeroed); + } + if (!ext4_has_feature_mmp(sb) || sb_rdonly(sb)) ext4_stop_mmpd(sbi);
From: Uday Shankar ushankar@purestorage.com
[ Upstream commit c7275ce6a5fd32ca9f5a6294ed89cf0523181af9 ]
Upon keep alive completion, nvme_keep_alive_work is scheduled with the same delay every time. If keep alive commands are completing slowly, this may cause a keep alive timeout. The following trace illustrates the issue, taking KATO = 8 and TBKAS off for simplicity:
1. t = 0: run nvme_keep_alive_work, send keep alive 2. t = ε: keep alive reaches controller, controller restarts its keep alive timer 3. t = 4: host receives keep alive completion, schedules nvme_keep_alive_work with delay 4 4. t = 8: run nvme_keep_alive_work, send keep alive
Here, a keep alive having RTT of 4 causes a delay of at least 8 - ε between the controller receiving successive keep alives. With ε small, the controller is likely to detect a keep alive timeout.
Fix this by calculating the RTT of the keep alive command, and adjusting the scheduling delay of the next keep alive work accordingly.
Reported-by: Costa Sapuntzakis costa@purestorage.com Reported-by: Randy Jennings randyj@purestorage.com Signed-off-by: Uday Shankar ushankar@purestorage.com Reviewed-by: Hannes Reinecke hare@suse.de Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Keith Busch kbusch@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/core.c | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 1f66ba634e9c3..d3c3dbed30b37 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -1199,6 +1199,20 @@ static enum rq_end_io_ret nvme_keep_alive_end_io(struct request *rq, struct nvme_ctrl *ctrl = rq->end_io_data; unsigned long flags; bool startka = false; + unsigned long rtt = jiffies - (rq->deadline - rq->timeout); + unsigned long delay = nvme_keep_alive_work_period(ctrl); + + /* + * Subtract off the keepalive RTT so nvme_keep_alive_work runs + * at the desired frequency. + */ + if (rtt <= delay) { + delay -= rtt; + } else { + dev_warn(ctrl->device, "long keepalive RTT (%u ms)\n", + jiffies_to_msecs(rtt)); + delay = 0; + }
blk_mq_free_request(rq);
@@ -1217,7 +1231,7 @@ static enum rq_end_io_ret nvme_keep_alive_end_io(struct request *rq, startka = true; spin_unlock_irqrestore(&ctrl->lock, flags); if (startka) - nvme_queue_keep_alive_work(ctrl); + queue_delayed_work(nvme_wq, &ctrl->ka_work, delay); return RQ_END_IO_NONE; }
linux-stable-mirror@lists.linaro.org