August 2023 - Linux-stable-mirror

by pcm01＠nutriplan.com.br

Olá meu querido, Eu quero que você leia esta carta com muito cuidado e devo me desculpar por trazer esta mensagem para o seu e-mail sem qualquer apresentação formal devido à urgência deste assunto. Fico feliz em conhecê-lo. Mas o Senhor Todo-Poderoso conhece você melhor e sabe por que me dirigiu a você neste momento. Para começar, vi seu e-mail e detalhes de contato em minha busca por um bom parente próximo online, como resultado de minha urgência em realizar meu último desejo antes de morrer tão cedo. Estou escrevendo este e-mail para você com lágrimas pesadas em meus olhos e grande tristeza em meu coração, meu nome é Sra. Mariya Singh, estou entrando em contato com você do meu país, Índia, quero lhe dizer isso porque não tenho outra opção do que dizer como fiquei emocionado em me abrir com você, sou casada com o Sr. Peter Singh, que trabalhou com a embaixada indiana em Madri, Espanha, por nove anos antes de morrer nos últimos dois anos devido a Covid19. Ficamos casados por onze anos sem filhos. Ele morreu após uma breve doença do vírus corona que durou apenas vinte dias. Quando o meu falecido marido era vivo, depositou a quantia de 10.850.000,00€ (dez milhões oitocentos e cinquenta mil euros) num banco aqui na Índia. Atualmente esse dinheiro ainda está no banco, ele disponibilizou esse dinheiro para exportação de ouro da mineração de Madri, Espanha, antes de sua morte repentina. Recentemente, meu médico me disse que eu não duraria mais 3 meses devido a problemas de câncer novamente. O que mais me incomoda é o meu AVC. Tendo conhecido minha condição, decidi entregar a você esse dinheiro para cuidar dos menos privilegiados, você utilizará esse dinheiro da maneira que vou instruir aqui. Quero que você leve 60% do dinheiro total para seu uso pessoal e pelo seu esforço e dedicação ao trabalho do projeto de caridade, enquanto 40% do dinheiro irá para o trabalho do projeto de caridade, pessoas carentes da rua e ajudando o orfanato também, cresci órfão e não tenho ninguém como membro da minha família. Certifique-se também de que a casa de Deus seja mantida com os fundos. Estou fazendo isso para que Deus perdoe meus pecados e aceite minha alma porque sofri muito com essa doença. Assim que receber sua resposta, darei a você o contato do banco aqui na Índia e também instruirei o gerente do banco a emitir uma carta de autorização que o proverá o atual beneficiário do dinheiro no banco que é se você me garante que agirá de acordo com o que declarei aqui. Na esperança de receber sua resposta, então irei ao banco para a transferência de dinheiro para sua conta bancária o mais rápido possível, ou seja, se você for sério o suficiente para lidar com o projeto e também dedicar seu esforço a ele, porque acredito que pode confiar em você honestamente. Com os melhores cumprimentos. Da Sra. Mariya Singh

2 years, 4 months

1
0
0 0

FAILED: patch "[PATCH] ibmvnic: Ensure login failure recovery is safe from other" failed to apply to 4.19-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.19-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y git checkout FETCH_HEAD git cherry-pick -x 6db541ae279bd4e76dbd939e5fbf298396166242 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023081217-unrest-trusting-66a9@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^.. Possible dependencies: 6db541ae279b ("ibmvnic: Ensure login failure recovery is safe from other resets") 23cc5f667453 ("ibmvnic: Do partial reset on login failure") 61772b0908c6 ("ibmvnic: don't release napi in __ibmvnic_open()") b6ee566cf394 ("ibmvnic: Update driver return codes") bbd809305bc7 ("ibmvnic: Reuse tx pools when possible") 489de956e7a2 ("ibmvnic: Reuse rx pools when possible") f8ac0bfa7d7a ("ibmvnic: Reuse LTB when possible") 129854f061d8 ("ibmvnic: Use bitmap for LTB map_ids") 0d1af4fa7124 ("ibmvnic: init_tx_pools move loop-invariant code") 8243c7ed6d08 ("ibmvnic: Use/rename local vars in init_tx_pools") 0df7b9ad8f84 ("ibmvnic: Use/rename local vars in init_rx_pools") 0f2bf3188c43 ("ibmvnic: Fix up some comments and messages") 38106b2c433e ("ibmvnic: Consolidate code in replenish_rx_pool()") f6ebca8efa52 ("ibmvnic: free tx_pool if tso_pool alloc fails") 552a33729f1a ("ibmvnic: set ltb->buff to NULL after freeing") 72368f8b2b9e ("ibmvnic: account for bufs already saved in indir_buf") 65d6470d139a ("ibmvnic: clean pending indirect buffs during reset") 0ec13aff058a ("Revert "ibmvnic: simplify reset_long_term_buff function"") d3a6abccbd27 ("ibmvnic: remove duplicate napi_schedule call in do_reset function") 0775ebc4cf85 ("ibmvnic: avoid calling napi_disable() twice") thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From 6db541ae279bd4e76dbd939e5fbf298396166242 Mon Sep 17 00:00:00 2001 From: Nick Child <nnac123(a)linux.ibm.com> Date: Wed, 9 Aug 2023 17:10:38 -0500 Subject: [PATCH] ibmvnic: Ensure login failure recovery is safe from other resets If a login request fails, the recovery process should be protected against parallel resets. It is a known issue that freeing and registering CRQ's in quick succession can result in a failover CRQ from the VIOS. Processing a failover during login recovery is dangerous for two reasons: 1. This will result in two parallel initialization processes, this can cause serious issues during login. 2. It is possible that the failover CRQ is received but never executed. We get notified of a pending failover through a transport event CRQ. The reset is not performed until a INIT CRQ request is received. Previously, if CRQ init fails during login recovery, then the ibmvnic irq is freed and the login process returned error. If failover_pending is true (a transport event was received), then the ibmvnic device would never be able to process the reset since it cannot receive the CRQ_INIT request due to the irq being freed. This leaved the device in a inoperable state. Therefore, the login failure recovery process must be hardened against these possible issues. Possible failovers (due to quick CRQ free and init) must be avoided and any issues during re-initialization should be dealt with instead of being propagated up the stack. This logic is similar to that of ibmvnic_probe(). Fixes: dff515a3e71d ("ibmvnic: Harden device login requests") Signed-off-by: Nick Child <nnac123(a)linux.ibm.com> Reviewed-by: Simon Horman <horms(a)kernel.org> Link: https://lore.kernel.org/r/20230809221038.51296-5-nnac123@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba(a)kernel.org> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c index e9619957d58a..df76cdaddcfb 100644 --- a/drivers/net/ethernet/ibm/ibmvnic.c +++ b/drivers/net/ethernet/ibm/ibmvnic.c @@ -116,6 +116,7 @@ static void ibmvnic_tx_scrq_clean_buffer(struct ibmvnic_adapter *adapter, static void free_long_term_buff(struct ibmvnic_adapter *adapter, struct ibmvnic_long_term_buff *ltb); static void ibmvnic_disable_irqs(struct ibmvnic_adapter *adapter); +static void flush_reset_queue(struct ibmvnic_adapter *adapter); struct ibmvnic_stat { char name[ETH_GSTRING_LEN]; @@ -1507,8 +1508,8 @@ static const char *adapter_state_to_string(enum vnic_state state) static int ibmvnic_login(struct net_device *netdev) { + unsigned long flags, timeout = msecs_to_jiffies(20000); struct ibmvnic_adapter *adapter = netdev_priv(netdev); - unsigned long timeout = msecs_to_jiffies(20000); int retry_count = 0; int retries = 10; bool retry; @@ -1573,6 +1574,7 @@ static int ibmvnic_login(struct net_device *netdev) "SCRQ irq initialization failed\n"); return rc; } + /* Default/timeout error handling, reset and start fresh */ } else if (adapter->init_done_rc) { netdev_warn(netdev, "Adapter login failed, init_done_rc = %d\n", adapter->init_done_rc); @@ -1588,29 +1590,53 @@ static int ibmvnic_login(struct net_device *netdev) "Freeing and re-registering CRQs before attempting to login again\n"); retry = true; adapter->init_done_rc = 0; - retry_count++; release_sub_crqs(adapter, true); - reinit_init_done(adapter); - release_crq_queue(adapter); - /* If we don't sleep here then we risk an unnecessary - * failover event from the VIOS. This is a known VIOS - * issue caused by a vnic device freeing and registering - * a CRQ too quickly. + /* Much of this is similar logic as ibmvnic_probe(), + * we are essentially re-initializing communication + * with the server. We really should not run any + * resets/failovers here because this is already a form + * of reset and we do not want parallel resets occurring */ - msleep(1500); - rc = init_crq_queue(adapter); - if (rc) { - netdev_err(netdev, "login recovery: init CRQ failed %d\n", - rc); - return -EIO; - } + do { + reinit_init_done(adapter); + /* Clear any failovers we got in the previous + * pass since we are re-initializing the CRQ + */ + adapter->failover_pending = false; + release_crq_queue(adapter); + /* If we don't sleep here then we risk an + * unnecessary failover event from the VIOS. + * This is a known VIOS issue caused by a vnic + * device freeing and registering a CRQ too + * quickly. + */ + msleep(1500); + /* Avoid any resets, since we are currently + * resetting. + */ + spin_lock_irqsave(&adapter->rwi_lock, flags); + flush_reset_queue(adapter); + spin_unlock_irqrestore(&adapter->rwi_lock, + flags); - rc = ibmvnic_reset_init(adapter, false); - if (rc) { - netdev_err(netdev, "login recovery: Reset init failed %d\n", - rc); - return -EIO; - } + rc = init_crq_queue(adapter); + if (rc) { + netdev_err(netdev, "login recovery: init CRQ failed %d\n", + rc); + return -EIO; + } + + rc = ibmvnic_reset_init(adapter, false); + if (rc) + netdev_err(netdev, "login recovery: Reset init failed %d\n", + rc); + /* IBMVNIC_CRQ_INIT will return EAGAIN if it + * fails, since ibmvnic_reset_init will free + * irq's in failure, we won't be able to receive + * new CRQs so we need to keep trying. probe() + * handles this similarly. + */ + } while (rc == -EAGAIN && retry_count++ < retries); } } while (retry);

2 years, 4 months

1
0
0 0

FAILED: patch "[PATCH] ibmvnic: Ensure login failure recovery is safe from other" failed to apply to 5.4-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 5.4-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y git checkout FETCH_HEAD git cherry-pick -x 6db541ae279bd4e76dbd939e5fbf298396166242 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023081212-elevation-cape-98ef@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^.. Possible dependencies: 6db541ae279b ("ibmvnic: Ensure login failure recovery is safe from other resets") 23cc5f667453 ("ibmvnic: Do partial reset on login failure") 61772b0908c6 ("ibmvnic: don't release napi in __ibmvnic_open()") b6ee566cf394 ("ibmvnic: Update driver return codes") bbd809305bc7 ("ibmvnic: Reuse tx pools when possible") 489de956e7a2 ("ibmvnic: Reuse rx pools when possible") f8ac0bfa7d7a ("ibmvnic: Reuse LTB when possible") 129854f061d8 ("ibmvnic: Use bitmap for LTB map_ids") 0d1af4fa7124 ("ibmvnic: init_tx_pools move loop-invariant code") 8243c7ed6d08 ("ibmvnic: Use/rename local vars in init_tx_pools") 0df7b9ad8f84 ("ibmvnic: Use/rename local vars in init_rx_pools") 0f2bf3188c43 ("ibmvnic: Fix up some comments and messages") 38106b2c433e ("ibmvnic: Consolidate code in replenish_rx_pool()") f6ebca8efa52 ("ibmvnic: free tx_pool if tso_pool alloc fails") 552a33729f1a ("ibmvnic: set ltb->buff to NULL after freeing") 72368f8b2b9e ("ibmvnic: account for bufs already saved in indir_buf") 65d6470d139a ("ibmvnic: clean pending indirect buffs during reset") 0ec13aff058a ("Revert "ibmvnic: simplify reset_long_term_buff function"") d3a6abccbd27 ("ibmvnic: remove duplicate napi_schedule call in do_reset function") 0775ebc4cf85 ("ibmvnic: avoid calling napi_disable() twice") thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From 6db541ae279bd4e76dbd939e5fbf298396166242 Mon Sep 17 00:00:00 2001 From: Nick Child <nnac123(a)linux.ibm.com> Date: Wed, 9 Aug 2023 17:10:38 -0500 Subject: [PATCH] ibmvnic: Ensure login failure recovery is safe from other resets If a login request fails, the recovery process should be protected against parallel resets. It is a known issue that freeing and registering CRQ's in quick succession can result in a failover CRQ from the VIOS. Processing a failover during login recovery is dangerous for two reasons: 1. This will result in two parallel initialization processes, this can cause serious issues during login. 2. It is possible that the failover CRQ is received but never executed. We get notified of a pending failover through a transport event CRQ. The reset is not performed until a INIT CRQ request is received. Previously, if CRQ init fails during login recovery, then the ibmvnic irq is freed and the login process returned error. If failover_pending is true (a transport event was received), then the ibmvnic device would never be able to process the reset since it cannot receive the CRQ_INIT request due to the irq being freed. This leaved the device in a inoperable state. Therefore, the login failure recovery process must be hardened against these possible issues. Possible failovers (due to quick CRQ free and init) must be avoided and any issues during re-initialization should be dealt with instead of being propagated up the stack. This logic is similar to that of ibmvnic_probe(). Fixes: dff515a3e71d ("ibmvnic: Harden device login requests") Signed-off-by: Nick Child <nnac123(a)linux.ibm.com> Reviewed-by: Simon Horman <horms(a)kernel.org> Link: https://lore.kernel.org/r/20230809221038.51296-5-nnac123@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba(a)kernel.org> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c index e9619957d58a..df76cdaddcfb 100644 --- a/drivers/net/ethernet/ibm/ibmvnic.c +++ b/drivers/net/ethernet/ibm/ibmvnic.c @@ -116,6 +116,7 @@ static void ibmvnic_tx_scrq_clean_buffer(struct ibmvnic_adapter *adapter, static void free_long_term_buff(struct ibmvnic_adapter *adapter, struct ibmvnic_long_term_buff *ltb); static void ibmvnic_disable_irqs(struct ibmvnic_adapter *adapter); +static void flush_reset_queue(struct ibmvnic_adapter *adapter); struct ibmvnic_stat { char name[ETH_GSTRING_LEN]; @@ -1507,8 +1508,8 @@ static const char *adapter_state_to_string(enum vnic_state state) static int ibmvnic_login(struct net_device *netdev) { + unsigned long flags, timeout = msecs_to_jiffies(20000); struct ibmvnic_adapter *adapter = netdev_priv(netdev); - unsigned long timeout = msecs_to_jiffies(20000); int retry_count = 0; int retries = 10; bool retry; @@ -1573,6 +1574,7 @@ static int ibmvnic_login(struct net_device *netdev) "SCRQ irq initialization failed\n"); return rc; } + /* Default/timeout error handling, reset and start fresh */ } else if (adapter->init_done_rc) { netdev_warn(netdev, "Adapter login failed, init_done_rc = %d\n", adapter->init_done_rc); @@ -1588,29 +1590,53 @@ static int ibmvnic_login(struct net_device *netdev) "Freeing and re-registering CRQs before attempting to login again\n"); retry = true; adapter->init_done_rc = 0; - retry_count++; release_sub_crqs(adapter, true); - reinit_init_done(adapter); - release_crq_queue(adapter); - /* If we don't sleep here then we risk an unnecessary - * failover event from the VIOS. This is a known VIOS - * issue caused by a vnic device freeing and registering - * a CRQ too quickly. + /* Much of this is similar logic as ibmvnic_probe(), + * we are essentially re-initializing communication + * with the server. We really should not run any + * resets/failovers here because this is already a form + * of reset and we do not want parallel resets occurring */ - msleep(1500); - rc = init_crq_queue(adapter); - if (rc) { - netdev_err(netdev, "login recovery: init CRQ failed %d\n", - rc); - return -EIO; - } + do { + reinit_init_done(adapter); + /* Clear any failovers we got in the previous + * pass since we are re-initializing the CRQ + */ + adapter->failover_pending = false; + release_crq_queue(adapter); + /* If we don't sleep here then we risk an + * unnecessary failover event from the VIOS. + * This is a known VIOS issue caused by a vnic + * device freeing and registering a CRQ too + * quickly. + */ + msleep(1500); + /* Avoid any resets, since we are currently + * resetting. + */ + spin_lock_irqsave(&adapter->rwi_lock, flags); + flush_reset_queue(adapter); + spin_unlock_irqrestore(&adapter->rwi_lock, + flags); - rc = ibmvnic_reset_init(adapter, false); - if (rc) { - netdev_err(netdev, "login recovery: Reset init failed %d\n", - rc); - return -EIO; - } + rc = init_crq_queue(adapter); + if (rc) { + netdev_err(netdev, "login recovery: init CRQ failed %d\n", + rc); + return -EIO; + } + + rc = ibmvnic_reset_init(adapter, false); + if (rc) + netdev_err(netdev, "login recovery: Reset init failed %d\n", + rc); + /* IBMVNIC_CRQ_INIT will return EAGAIN if it + * fails, since ibmvnic_reset_init will free + * irq's in failure, we won't be able to receive + * new CRQs so we need to keep trying. probe() + * handles this similarly. + */ + } while (rc == -EAGAIN && retry_count++ < retries); } } while (retry);

2 years, 4 months

1
0
0 0

FAILED: patch "[PATCH] ibmvnic: Ensure login failure recovery is safe from other" failed to apply to 5.10-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 5.10-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y git checkout FETCH_HEAD git cherry-pick -x 6db541ae279bd4e76dbd939e5fbf298396166242 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023081210-robbing-common-f106@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^.. Possible dependencies: 6db541ae279b ("ibmvnic: Ensure login failure recovery is safe from other resets") 23cc5f667453 ("ibmvnic: Do partial reset on login failure") 61772b0908c6 ("ibmvnic: don't release napi in __ibmvnic_open()") b6ee566cf394 ("ibmvnic: Update driver return codes") bbd809305bc7 ("ibmvnic: Reuse tx pools when possible") 489de956e7a2 ("ibmvnic: Reuse rx pools when possible") f8ac0bfa7d7a ("ibmvnic: Reuse LTB when possible") 129854f061d8 ("ibmvnic: Use bitmap for LTB map_ids") 0d1af4fa7124 ("ibmvnic: init_tx_pools move loop-invariant code") 8243c7ed6d08 ("ibmvnic: Use/rename local vars in init_tx_pools") 0df7b9ad8f84 ("ibmvnic: Use/rename local vars in init_rx_pools") 0f2bf3188c43 ("ibmvnic: Fix up some comments and messages") 38106b2c433e ("ibmvnic: Consolidate code in replenish_rx_pool()") f6ebca8efa52 ("ibmvnic: free tx_pool if tso_pool alloc fails") 552a33729f1a ("ibmvnic: set ltb->buff to NULL after freeing") 72368f8b2b9e ("ibmvnic: account for bufs already saved in indir_buf") 65d6470d139a ("ibmvnic: clean pending indirect buffs during reset") 0ec13aff058a ("Revert "ibmvnic: simplify reset_long_term_buff function"") d3a6abccbd27 ("ibmvnic: remove duplicate napi_schedule call in do_reset function") 0775ebc4cf85 ("ibmvnic: avoid calling napi_disable() twice") thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From 6db541ae279bd4e76dbd939e5fbf298396166242 Mon Sep 17 00:00:00 2001 From: Nick Child <nnac123(a)linux.ibm.com> Date: Wed, 9 Aug 2023 17:10:38 -0500 Subject: [PATCH] ibmvnic: Ensure login failure recovery is safe from other resets If a login request fails, the recovery process should be protected against parallel resets. It is a known issue that freeing and registering CRQ's in quick succession can result in a failover CRQ from the VIOS. Processing a failover during login recovery is dangerous for two reasons: 1. This will result in two parallel initialization processes, this can cause serious issues during login. 2. It is possible that the failover CRQ is received but never executed. We get notified of a pending failover through a transport event CRQ. The reset is not performed until a INIT CRQ request is received. Previously, if CRQ init fails during login recovery, then the ibmvnic irq is freed and the login process returned error. If failover_pending is true (a transport event was received), then the ibmvnic device would never be able to process the reset since it cannot receive the CRQ_INIT request due to the irq being freed. This leaved the device in a inoperable state. Therefore, the login failure recovery process must be hardened against these possible issues. Possible failovers (due to quick CRQ free and init) must be avoided and any issues during re-initialization should be dealt with instead of being propagated up the stack. This logic is similar to that of ibmvnic_probe(). Fixes: dff515a3e71d ("ibmvnic: Harden device login requests") Signed-off-by: Nick Child <nnac123(a)linux.ibm.com> Reviewed-by: Simon Horman <horms(a)kernel.org> Link: https://lore.kernel.org/r/20230809221038.51296-5-nnac123@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba(a)kernel.org> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c index e9619957d58a..df76cdaddcfb 100644 --- a/drivers/net/ethernet/ibm/ibmvnic.c +++ b/drivers/net/ethernet/ibm/ibmvnic.c @@ -116,6 +116,7 @@ static void ibmvnic_tx_scrq_clean_buffer(struct ibmvnic_adapter *adapter, static void free_long_term_buff(struct ibmvnic_adapter *adapter, struct ibmvnic_long_term_buff *ltb); static void ibmvnic_disable_irqs(struct ibmvnic_adapter *adapter); +static void flush_reset_queue(struct ibmvnic_adapter *adapter); struct ibmvnic_stat { char name[ETH_GSTRING_LEN]; @@ -1507,8 +1508,8 @@ static const char *adapter_state_to_string(enum vnic_state state) static int ibmvnic_login(struct net_device *netdev) { + unsigned long flags, timeout = msecs_to_jiffies(20000); struct ibmvnic_adapter *adapter = netdev_priv(netdev); - unsigned long timeout = msecs_to_jiffies(20000); int retry_count = 0; int retries = 10; bool retry; @@ -1573,6 +1574,7 @@ static int ibmvnic_login(struct net_device *netdev) "SCRQ irq initialization failed\n"); return rc; } + /* Default/timeout error handling, reset and start fresh */ } else if (adapter->init_done_rc) { netdev_warn(netdev, "Adapter login failed, init_done_rc = %d\n", adapter->init_done_rc); @@ -1588,29 +1590,53 @@ static int ibmvnic_login(struct net_device *netdev) "Freeing and re-registering CRQs before attempting to login again\n"); retry = true; adapter->init_done_rc = 0; - retry_count++; release_sub_crqs(adapter, true); - reinit_init_done(adapter); - release_crq_queue(adapter); - /* If we don't sleep here then we risk an unnecessary - * failover event from the VIOS. This is a known VIOS - * issue caused by a vnic device freeing and registering - * a CRQ too quickly. + /* Much of this is similar logic as ibmvnic_probe(), + * we are essentially re-initializing communication + * with the server. We really should not run any + * resets/failovers here because this is already a form + * of reset and we do not want parallel resets occurring */ - msleep(1500); - rc = init_crq_queue(adapter); - if (rc) { - netdev_err(netdev, "login recovery: init CRQ failed %d\n", - rc); - return -EIO; - } + do { + reinit_init_done(adapter); + /* Clear any failovers we got in the previous + * pass since we are re-initializing the CRQ + */ + adapter->failover_pending = false; + release_crq_queue(adapter); + /* If we don't sleep here then we risk an + * unnecessary failover event from the VIOS. + * This is a known VIOS issue caused by a vnic + * device freeing and registering a CRQ too + * quickly. + */ + msleep(1500); + /* Avoid any resets, since we are currently + * resetting. + */ + spin_lock_irqsave(&adapter->rwi_lock, flags); + flush_reset_queue(adapter); + spin_unlock_irqrestore(&adapter->rwi_lock, + flags); - rc = ibmvnic_reset_init(adapter, false); - if (rc) { - netdev_err(netdev, "login recovery: Reset init failed %d\n", - rc); - return -EIO; - } + rc = init_crq_queue(adapter); + if (rc) { + netdev_err(netdev, "login recovery: init CRQ failed %d\n", + rc); + return -EIO; + } + + rc = ibmvnic_reset_init(adapter, false); + if (rc) + netdev_err(netdev, "login recovery: Reset init failed %d\n", + rc); + /* IBMVNIC_CRQ_INIT will return EAGAIN if it + * fails, since ibmvnic_reset_init will free + * irq's in failure, we won't be able to receive + * new CRQs so we need to keep trying. probe() + * handles this similarly. + */ + } while (rc == -EAGAIN && retry_count++ < retries); } } while (retry);

2 years, 4 months

1
0
0 0

FAILED: patch "[PATCH] ibmvnic: Ensure login failure recovery is safe from other" failed to apply to 5.15-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 5.15-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y git checkout FETCH_HEAD git cherry-pick -x 6db541ae279bd4e76dbd939e5fbf298396166242 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023081208-trinity-upfront-c2e1@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^.. Possible dependencies: 6db541ae279b ("ibmvnic: Ensure login failure recovery is safe from other resets") 23cc5f667453 ("ibmvnic: Do partial reset on login failure") 61772b0908c6 ("ibmvnic: don't release napi in __ibmvnic_open()") b6ee566cf394 ("ibmvnic: Update driver return codes") bbd809305bc7 ("ibmvnic: Reuse tx pools when possible") 489de956e7a2 ("ibmvnic: Reuse rx pools when possible") f8ac0bfa7d7a ("ibmvnic: Reuse LTB when possible") 129854f061d8 ("ibmvnic: Use bitmap for LTB map_ids") 0d1af4fa7124 ("ibmvnic: init_tx_pools move loop-invariant code") 8243c7ed6d08 ("ibmvnic: Use/rename local vars in init_tx_pools") 0df7b9ad8f84 ("ibmvnic: Use/rename local vars in init_rx_pools") 0f2bf3188c43 ("ibmvnic: Fix up some comments and messages") 38106b2c433e ("ibmvnic: Consolidate code in replenish_rx_pool()") thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From 6db541ae279bd4e76dbd939e5fbf298396166242 Mon Sep 17 00:00:00 2001 From: Nick Child <nnac123(a)linux.ibm.com> Date: Wed, 9 Aug 2023 17:10:38 -0500 Subject: [PATCH] ibmvnic: Ensure login failure recovery is safe from other resets If a login request fails, the recovery process should be protected against parallel resets. It is a known issue that freeing and registering CRQ's in quick succession can result in a failover CRQ from the VIOS. Processing a failover during login recovery is dangerous for two reasons: 1. This will result in two parallel initialization processes, this can cause serious issues during login. 2. It is possible that the failover CRQ is received but never executed. We get notified of a pending failover through a transport event CRQ. The reset is not performed until a INIT CRQ request is received. Previously, if CRQ init fails during login recovery, then the ibmvnic irq is freed and the login process returned error. If failover_pending is true (a transport event was received), then the ibmvnic device would never be able to process the reset since it cannot receive the CRQ_INIT request due to the irq being freed. This leaved the device in a inoperable state. Therefore, the login failure recovery process must be hardened against these possible issues. Possible failovers (due to quick CRQ free and init) must be avoided and any issues during re-initialization should be dealt with instead of being propagated up the stack. This logic is similar to that of ibmvnic_probe(). Fixes: dff515a3e71d ("ibmvnic: Harden device login requests") Signed-off-by: Nick Child <nnac123(a)linux.ibm.com> Reviewed-by: Simon Horman <horms(a)kernel.org> Link: https://lore.kernel.org/r/20230809221038.51296-5-nnac123@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba(a)kernel.org> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c index e9619957d58a..df76cdaddcfb 100644 --- a/drivers/net/ethernet/ibm/ibmvnic.c +++ b/drivers/net/ethernet/ibm/ibmvnic.c @@ -116,6 +116,7 @@ static void ibmvnic_tx_scrq_clean_buffer(struct ibmvnic_adapter *adapter, static void free_long_term_buff(struct ibmvnic_adapter *adapter, struct ibmvnic_long_term_buff *ltb); static void ibmvnic_disable_irqs(struct ibmvnic_adapter *adapter); +static void flush_reset_queue(struct ibmvnic_adapter *adapter); struct ibmvnic_stat { char name[ETH_GSTRING_LEN]; @@ -1507,8 +1508,8 @@ static const char *adapter_state_to_string(enum vnic_state state) static int ibmvnic_login(struct net_device *netdev) { + unsigned long flags, timeout = msecs_to_jiffies(20000); struct ibmvnic_adapter *adapter = netdev_priv(netdev); - unsigned long timeout = msecs_to_jiffies(20000); int retry_count = 0; int retries = 10; bool retry; @@ -1573,6 +1574,7 @@ static int ibmvnic_login(struct net_device *netdev) "SCRQ irq initialization failed\n"); return rc; } + /* Default/timeout error handling, reset and start fresh */ } else if (adapter->init_done_rc) { netdev_warn(netdev, "Adapter login failed, init_done_rc = %d\n", adapter->init_done_rc); @@ -1588,29 +1590,53 @@ static int ibmvnic_login(struct net_device *netdev) "Freeing and re-registering CRQs before attempting to login again\n"); retry = true; adapter->init_done_rc = 0; - retry_count++; release_sub_crqs(adapter, true); - reinit_init_done(adapter); - release_crq_queue(adapter); - /* If we don't sleep here then we risk an unnecessary - * failover event from the VIOS. This is a known VIOS - * issue caused by a vnic device freeing and registering - * a CRQ too quickly. + /* Much of this is similar logic as ibmvnic_probe(), + * we are essentially re-initializing communication + * with the server. We really should not run any + * resets/failovers here because this is already a form + * of reset and we do not want parallel resets occurring */ - msleep(1500); - rc = init_crq_queue(adapter); - if (rc) { - netdev_err(netdev, "login recovery: init CRQ failed %d\n", - rc); - return -EIO; - } + do { + reinit_init_done(adapter); + /* Clear any failovers we got in the previous + * pass since we are re-initializing the CRQ + */ + adapter->failover_pending = false; + release_crq_queue(adapter); + /* If we don't sleep here then we risk an + * unnecessary failover event from the VIOS. + * This is a known VIOS issue caused by a vnic + * device freeing and registering a CRQ too + * quickly. + */ + msleep(1500); + /* Avoid any resets, since we are currently + * resetting. + */ + spin_lock_irqsave(&adapter->rwi_lock, flags); + flush_reset_queue(adapter); + spin_unlock_irqrestore(&adapter->rwi_lock, + flags); - rc = ibmvnic_reset_init(adapter, false); - if (rc) { - netdev_err(netdev, "login recovery: Reset init failed %d\n", - rc); - return -EIO; - } + rc = init_crq_queue(adapter); + if (rc) { + netdev_err(netdev, "login recovery: init CRQ failed %d\n", + rc); + return -EIO; + } + + rc = ibmvnic_reset_init(adapter, false); + if (rc) + netdev_err(netdev, "login recovery: Reset init failed %d\n", + rc); + /* IBMVNIC_CRQ_INIT will return EAGAIN if it + * fails, since ibmvnic_reset_init will free + * irq's in failure, we won't be able to receive + * new CRQs so we need to keep trying. probe() + * handles this similarly. + */ + } while (rc == -EAGAIN && retry_count++ < retries); } } while (retry);

2 years, 4 months

1
0
0 0

FAILED: patch "[PATCH] ibmvnic: Do partial reset on login failure" failed to apply to 4.19-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.19-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y git checkout FETCH_HEAD git cherry-pick -x 23cc5f667453ca7645a24c8d21bf84dbf61107b2 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023081252-sanction-juice-f652@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^.. Possible dependencies: 23cc5f667453 ("ibmvnic: Do partial reset on login failure") b6ee566cf394 ("ibmvnic: Update driver return codes") bbd809305bc7 ("ibmvnic: Reuse tx pools when possible") 489de956e7a2 ("ibmvnic: Reuse rx pools when possible") f8ac0bfa7d7a ("ibmvnic: Reuse LTB when possible") 129854f061d8 ("ibmvnic: Use bitmap for LTB map_ids") 0d1af4fa7124 ("ibmvnic: init_tx_pools move loop-invariant code") 8243c7ed6d08 ("ibmvnic: Use/rename local vars in init_tx_pools") 0df7b9ad8f84 ("ibmvnic: Use/rename local vars in init_rx_pools") 0f2bf3188c43 ("ibmvnic: Fix up some comments and messages") 38106b2c433e ("ibmvnic: Consolidate code in replenish_rx_pool()") f6ebca8efa52 ("ibmvnic: free tx_pool if tso_pool alloc fails") 552a33729f1a ("ibmvnic: set ltb->buff to NULL after freeing") 72368f8b2b9e ("ibmvnic: account for bufs already saved in indir_buf") 65d6470d139a ("ibmvnic: clean pending indirect buffs during reset") 0ec13aff058a ("Revert "ibmvnic: simplify reset_long_term_buff function"") d3a6abccbd27 ("ibmvnic: remove duplicate napi_schedule call in do_reset function") 8f1c0fd2c84c ("ibmvnic: fix a race between open and reset") 1c7d45e7b2c2 ("ibmvnic: simplify reset_long_term_buff function") 91dc5d2553fb ("ibmvnic: fix miscellaneous checks") thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From 23cc5f667453ca7645a24c8d21bf84dbf61107b2 Mon Sep 17 00:00:00 2001 From: Nick Child <nnac123(a)linux.ibm.com> Date: Wed, 9 Aug 2023 17:10:37 -0500 Subject: [PATCH] ibmvnic: Do partial reset on login failure Perform a partial reset before sending a login request if any of the following are true: 1. If a previous request times out. This can be dangerous because the VIOS could still receive the old login request at any point after the timeout. Therefore, it is best to re-register the CRQ's and sub-CRQ's before retrying. 2. If the previous request returns an error that is not described in PAPR. PAPR provides procedures if the login returns with partial success or aborted return codes (section L.5.1) but other values do not have a defined procedure. Previously, these conditions just returned error from the login function rather than trying to resolve the issue. This can cause further issues since most callers of the login function are not prepared to handle an error when logging in. This improper cleanup can lead to the device being permanently DOWN'd. For example, if the VIOS believes that the device is already logged in then it will return INVALID_STATE (-7). If we never re-register CRQ's then it will always think that the device is already logged in. This leaves the device inoperable. The partial reset involves freeing the sub-CRQs, freeing the CRQ then registering and initializing a new CRQ and sub-CRQs. This essentially restarts all communication with VIOS to allow for a fresh login attempt that will be unhindered by any previous failed attempts. Fixes: dff515a3e71d ("ibmvnic: Harden device login requests") Signed-off-by: Nick Child <nnac123(a)linux.ibm.com> Reviewed-by: Simon Horman <horms(a)kernel.org> Link: https://lore.kernel.org/r/20230809221038.51296-4-nnac123@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba(a)kernel.org> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c index 77b0a744fa88..e9619957d58a 100644 --- a/drivers/net/ethernet/ibm/ibmvnic.c +++ b/drivers/net/ethernet/ibm/ibmvnic.c @@ -97,6 +97,8 @@ static int pending_scrq(struct ibmvnic_adapter *, static union sub_crq *ibmvnic_next_scrq(struct ibmvnic_adapter *, struct ibmvnic_sub_crq_queue *); static int ibmvnic_poll(struct napi_struct *napi, int data); +static int reset_sub_crq_queues(struct ibmvnic_adapter *adapter); +static inline void reinit_init_done(struct ibmvnic_adapter *adapter); static void send_query_map(struct ibmvnic_adapter *adapter); static int send_request_map(struct ibmvnic_adapter *, dma_addr_t, u32, u8); static int send_request_unmap(struct ibmvnic_adapter *, u8); @@ -1527,11 +1529,9 @@ static int ibmvnic_login(struct net_device *netdev) if (!wait_for_completion_timeout(&adapter->init_done, timeout)) { - netdev_warn(netdev, "Login timed out, retrying...\n"); - retry = true; - adapter->init_done_rc = 0; - retry_count++; - continue; + netdev_warn(netdev, "Login timed out\n"); + adapter->login_pending = false; + goto partial_reset; } if (adapter->init_done_rc == ABORTED) { @@ -1576,7 +1576,41 @@ static int ibmvnic_login(struct net_device *netdev) } else if (adapter->init_done_rc) { netdev_warn(netdev, "Adapter login failed, init_done_rc = %d\n", adapter->init_done_rc); - return -EIO; + +partial_reset: + /* adapter login failed, so free any CRQs or sub-CRQs + * and register again before attempting to login again. + * If we don't do this then the VIOS may think that + * we are already logged in and reject any subsequent + * attempts + */ + netdev_warn(netdev, + "Freeing and re-registering CRQs before attempting to login again\n"); + retry = true; + adapter->init_done_rc = 0; + retry_count++; + release_sub_crqs(adapter, true); + reinit_init_done(adapter); + release_crq_queue(adapter); + /* If we don't sleep here then we risk an unnecessary + * failover event from the VIOS. This is a known VIOS + * issue caused by a vnic device freeing and registering + * a CRQ too quickly. + */ + msleep(1500); + rc = init_crq_queue(adapter); + if (rc) { + netdev_err(netdev, "login recovery: init CRQ failed %d\n", + rc); + return -EIO; + } + + rc = ibmvnic_reset_init(adapter, false); + if (rc) { + netdev_err(netdev, "login recovery: Reset init failed %d\n", + rc); + return -EIO; + } } } while (retry);

2 years, 4 months

1
0
0 0

FAILED: patch "[PATCH] ibmvnic: Do partial reset on login failure" failed to apply to 5.4-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 5.4-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y git checkout FETCH_HEAD git cherry-pick -x 23cc5f667453ca7645a24c8d21bf84dbf61107b2 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023081250-charging-banner-0ae7@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^.. Possible dependencies: 23cc5f667453 ("ibmvnic: Do partial reset on login failure") b6ee566cf394 ("ibmvnic: Update driver return codes") bbd809305bc7 ("ibmvnic: Reuse tx pools when possible") 489de956e7a2 ("ibmvnic: Reuse rx pools when possible") f8ac0bfa7d7a ("ibmvnic: Reuse LTB when possible") 129854f061d8 ("ibmvnic: Use bitmap for LTB map_ids") 0d1af4fa7124 ("ibmvnic: init_tx_pools move loop-invariant code") 8243c7ed6d08 ("ibmvnic: Use/rename local vars in init_tx_pools") 0df7b9ad8f84 ("ibmvnic: Use/rename local vars in init_rx_pools") 0f2bf3188c43 ("ibmvnic: Fix up some comments and messages") 38106b2c433e ("ibmvnic: Consolidate code in replenish_rx_pool()") f6ebca8efa52 ("ibmvnic: free tx_pool if tso_pool alloc fails") 552a33729f1a ("ibmvnic: set ltb->buff to NULL after freeing") 72368f8b2b9e ("ibmvnic: account for bufs already saved in indir_buf") 65d6470d139a ("ibmvnic: clean pending indirect buffs during reset") 0ec13aff058a ("Revert "ibmvnic: simplify reset_long_term_buff function"") d3a6abccbd27 ("ibmvnic: remove duplicate napi_schedule call in do_reset function") 8f1c0fd2c84c ("ibmvnic: fix a race between open and reset") 1c7d45e7b2c2 ("ibmvnic: simplify reset_long_term_buff function") 91dc5d2553fb ("ibmvnic: fix miscellaneous checks") thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From 23cc5f667453ca7645a24c8d21bf84dbf61107b2 Mon Sep 17 00:00:00 2001 From: Nick Child <nnac123(a)linux.ibm.com> Date: Wed, 9 Aug 2023 17:10:37 -0500 Subject: [PATCH] ibmvnic: Do partial reset on login failure Perform a partial reset before sending a login request if any of the following are true: 1. If a previous request times out. This can be dangerous because the VIOS could still receive the old login request at any point after the timeout. Therefore, it is best to re-register the CRQ's and sub-CRQ's before retrying. 2. If the previous request returns an error that is not described in PAPR. PAPR provides procedures if the login returns with partial success or aborted return codes (section L.5.1) but other values do not have a defined procedure. Previously, these conditions just returned error from the login function rather than trying to resolve the issue. This can cause further issues since most callers of the login function are not prepared to handle an error when logging in. This improper cleanup can lead to the device being permanently DOWN'd. For example, if the VIOS believes that the device is already logged in then it will return INVALID_STATE (-7). If we never re-register CRQ's then it will always think that the device is already logged in. This leaves the device inoperable. The partial reset involves freeing the sub-CRQs, freeing the CRQ then registering and initializing a new CRQ and sub-CRQs. This essentially restarts all communication with VIOS to allow for a fresh login attempt that will be unhindered by any previous failed attempts. Fixes: dff515a3e71d ("ibmvnic: Harden device login requests") Signed-off-by: Nick Child <nnac123(a)linux.ibm.com> Reviewed-by: Simon Horman <horms(a)kernel.org> Link: https://lore.kernel.org/r/20230809221038.51296-4-nnac123@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba(a)kernel.org> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c index 77b0a744fa88..e9619957d58a 100644 --- a/drivers/net/ethernet/ibm/ibmvnic.c +++ b/drivers/net/ethernet/ibm/ibmvnic.c @@ -97,6 +97,8 @@ static int pending_scrq(struct ibmvnic_adapter *, static union sub_crq *ibmvnic_next_scrq(struct ibmvnic_adapter *, struct ibmvnic_sub_crq_queue *); static int ibmvnic_poll(struct napi_struct *napi, int data); +static int reset_sub_crq_queues(struct ibmvnic_adapter *adapter); +static inline void reinit_init_done(struct ibmvnic_adapter *adapter); static void send_query_map(struct ibmvnic_adapter *adapter); static int send_request_map(struct ibmvnic_adapter *, dma_addr_t, u32, u8); static int send_request_unmap(struct ibmvnic_adapter *, u8); @@ -1527,11 +1529,9 @@ static int ibmvnic_login(struct net_device *netdev) if (!wait_for_completion_timeout(&adapter->init_done, timeout)) { - netdev_warn(netdev, "Login timed out, retrying...\n"); - retry = true; - adapter->init_done_rc = 0; - retry_count++; - continue; + netdev_warn(netdev, "Login timed out\n"); + adapter->login_pending = false; + goto partial_reset; } if (adapter->init_done_rc == ABORTED) { @@ -1576,7 +1576,41 @@ static int ibmvnic_login(struct net_device *netdev) } else if (adapter->init_done_rc) { netdev_warn(netdev, "Adapter login failed, init_done_rc = %d\n", adapter->init_done_rc); - return -EIO; + +partial_reset: + /* adapter login failed, so free any CRQs or sub-CRQs + * and register again before attempting to login again. + * If we don't do this then the VIOS may think that + * we are already logged in and reject any subsequent + * attempts + */ + netdev_warn(netdev, + "Freeing and re-registering CRQs before attempting to login again\n"); + retry = true; + adapter->init_done_rc = 0; + retry_count++; + release_sub_crqs(adapter, true); + reinit_init_done(adapter); + release_crq_queue(adapter); + /* If we don't sleep here then we risk an unnecessary + * failover event from the VIOS. This is a known VIOS + * issue caused by a vnic device freeing and registering + * a CRQ too quickly. + */ + msleep(1500); + rc = init_crq_queue(adapter); + if (rc) { + netdev_err(netdev, "login recovery: init CRQ failed %d\n", + rc); + return -EIO; + } + + rc = ibmvnic_reset_init(adapter, false); + if (rc) { + netdev_err(netdev, "login recovery: Reset init failed %d\n", + rc); + return -EIO; + } } } while (retry);

2 years, 4 months

1
0
0 0

FAILED: patch "[PATCH] ibmvnic: Do partial reset on login failure" failed to apply to 5.10-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 5.10-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y git checkout FETCH_HEAD git cherry-pick -x 23cc5f667453ca7645a24c8d21bf84dbf61107b2 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023081249-throat-simply-db92@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^.. Possible dependencies: 23cc5f667453 ("ibmvnic: Do partial reset on login failure") b6ee566cf394 ("ibmvnic: Update driver return codes") bbd809305bc7 ("ibmvnic: Reuse tx pools when possible") 489de956e7a2 ("ibmvnic: Reuse rx pools when possible") f8ac0bfa7d7a ("ibmvnic: Reuse LTB when possible") 129854f061d8 ("ibmvnic: Use bitmap for LTB map_ids") 0d1af4fa7124 ("ibmvnic: init_tx_pools move loop-invariant code") 8243c7ed6d08 ("ibmvnic: Use/rename local vars in init_tx_pools") 0df7b9ad8f84 ("ibmvnic: Use/rename local vars in init_rx_pools") 0f2bf3188c43 ("ibmvnic: Fix up some comments and messages") 38106b2c433e ("ibmvnic: Consolidate code in replenish_rx_pool()") f6ebca8efa52 ("ibmvnic: free tx_pool if tso_pool alloc fails") 552a33729f1a ("ibmvnic: set ltb->buff to NULL after freeing") 72368f8b2b9e ("ibmvnic: account for bufs already saved in indir_buf") 65d6470d139a ("ibmvnic: clean pending indirect buffs during reset") 0ec13aff058a ("Revert "ibmvnic: simplify reset_long_term_buff function"") d3a6abccbd27 ("ibmvnic: remove duplicate napi_schedule call in do_reset function") 8f1c0fd2c84c ("ibmvnic: fix a race between open and reset") 1c7d45e7b2c2 ("ibmvnic: simplify reset_long_term_buff function") 91dc5d2553fb ("ibmvnic: fix miscellaneous checks") thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From 23cc5f667453ca7645a24c8d21bf84dbf61107b2 Mon Sep 17 00:00:00 2001 From: Nick Child <nnac123(a)linux.ibm.com> Date: Wed, 9 Aug 2023 17:10:37 -0500 Subject: [PATCH] ibmvnic: Do partial reset on login failure Perform a partial reset before sending a login request if any of the following are true: 1. If a previous request times out. This can be dangerous because the VIOS could still receive the old login request at any point after the timeout. Therefore, it is best to re-register the CRQ's and sub-CRQ's before retrying. 2. If the previous request returns an error that is not described in PAPR. PAPR provides procedures if the login returns with partial success or aborted return codes (section L.5.1) but other values do not have a defined procedure. Previously, these conditions just returned error from the login function rather than trying to resolve the issue. This can cause further issues since most callers of the login function are not prepared to handle an error when logging in. This improper cleanup can lead to the device being permanently DOWN'd. For example, if the VIOS believes that the device is already logged in then it will return INVALID_STATE (-7). If we never re-register CRQ's then it will always think that the device is already logged in. This leaves the device inoperable. The partial reset involves freeing the sub-CRQs, freeing the CRQ then registering and initializing a new CRQ and sub-CRQs. This essentially restarts all communication with VIOS to allow for a fresh login attempt that will be unhindered by any previous failed attempts. Fixes: dff515a3e71d ("ibmvnic: Harden device login requests") Signed-off-by: Nick Child <nnac123(a)linux.ibm.com> Reviewed-by: Simon Horman <horms(a)kernel.org> Link: https://lore.kernel.org/r/20230809221038.51296-4-nnac123@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba(a)kernel.org> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c index 77b0a744fa88..e9619957d58a 100644 --- a/drivers/net/ethernet/ibm/ibmvnic.c +++ b/drivers/net/ethernet/ibm/ibmvnic.c @@ -97,6 +97,8 @@ static int pending_scrq(struct ibmvnic_adapter *, static union sub_crq *ibmvnic_next_scrq(struct ibmvnic_adapter *, struct ibmvnic_sub_crq_queue *); static int ibmvnic_poll(struct napi_struct *napi, int data); +static int reset_sub_crq_queues(struct ibmvnic_adapter *adapter); +static inline void reinit_init_done(struct ibmvnic_adapter *adapter); static void send_query_map(struct ibmvnic_adapter *adapter); static int send_request_map(struct ibmvnic_adapter *, dma_addr_t, u32, u8); static int send_request_unmap(struct ibmvnic_adapter *, u8); @@ -1527,11 +1529,9 @@ static int ibmvnic_login(struct net_device *netdev) if (!wait_for_completion_timeout(&adapter->init_done, timeout)) { - netdev_warn(netdev, "Login timed out, retrying...\n"); - retry = true; - adapter->init_done_rc = 0; - retry_count++; - continue; + netdev_warn(netdev, "Login timed out\n"); + adapter->login_pending = false; + goto partial_reset; } if (adapter->init_done_rc == ABORTED) { @@ -1576,7 +1576,41 @@ static int ibmvnic_login(struct net_device *netdev) } else if (adapter->init_done_rc) { netdev_warn(netdev, "Adapter login failed, init_done_rc = %d\n", adapter->init_done_rc); - return -EIO; + +partial_reset: + /* adapter login failed, so free any CRQs or sub-CRQs + * and register again before attempting to login again. + * If we don't do this then the VIOS may think that + * we are already logged in and reject any subsequent + * attempts + */ + netdev_warn(netdev, + "Freeing and re-registering CRQs before attempting to login again\n"); + retry = true; + adapter->init_done_rc = 0; + retry_count++; + release_sub_crqs(adapter, true); + reinit_init_done(adapter); + release_crq_queue(adapter); + /* If we don't sleep here then we risk an unnecessary + * failover event from the VIOS. This is a known VIOS + * issue caused by a vnic device freeing and registering + * a CRQ too quickly. + */ + msleep(1500); + rc = init_crq_queue(adapter); + if (rc) { + netdev_err(netdev, "login recovery: init CRQ failed %d\n", + rc); + return -EIO; + } + + rc = ibmvnic_reset_init(adapter, false); + if (rc) { + netdev_err(netdev, "login recovery: Reset init failed %d\n", + rc); + return -EIO; + } } } while (retry);

2 years, 4 months

1
0
0 0

FAILED: patch "[PATCH] ibmvnic: Do partial reset on login failure" failed to apply to 5.15-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 5.15-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y git checkout FETCH_HEAD git cherry-pick -x 23cc5f667453ca7645a24c8d21bf84dbf61107b2 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023081247-overplay-mullets-63db@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^.. Possible dependencies: 23cc5f667453 ("ibmvnic: Do partial reset on login failure") b6ee566cf394 ("ibmvnic: Update driver return codes") bbd809305bc7 ("ibmvnic: Reuse tx pools when possible") 489de956e7a2 ("ibmvnic: Reuse rx pools when possible") f8ac0bfa7d7a ("ibmvnic: Reuse LTB when possible") 129854f061d8 ("ibmvnic: Use bitmap for LTB map_ids") 0d1af4fa7124 ("ibmvnic: init_tx_pools move loop-invariant code") 8243c7ed6d08 ("ibmvnic: Use/rename local vars in init_tx_pools") 0df7b9ad8f84 ("ibmvnic: Use/rename local vars in init_rx_pools") 0f2bf3188c43 ("ibmvnic: Fix up some comments and messages") 38106b2c433e ("ibmvnic: Consolidate code in replenish_rx_pool()") thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From 23cc5f667453ca7645a24c8d21bf84dbf61107b2 Mon Sep 17 00:00:00 2001 From: Nick Child <nnac123(a)linux.ibm.com> Date: Wed, 9 Aug 2023 17:10:37 -0500 Subject: [PATCH] ibmvnic: Do partial reset on login failure Perform a partial reset before sending a login request if any of the following are true: 1. If a previous request times out. This can be dangerous because the VIOS could still receive the old login request at any point after the timeout. Therefore, it is best to re-register the CRQ's and sub-CRQ's before retrying. 2. If the previous request returns an error that is not described in PAPR. PAPR provides procedures if the login returns with partial success or aborted return codes (section L.5.1) but other values do not have a defined procedure. Previously, these conditions just returned error from the login function rather than trying to resolve the issue. This can cause further issues since most callers of the login function are not prepared to handle an error when logging in. This improper cleanup can lead to the device being permanently DOWN'd. For example, if the VIOS believes that the device is already logged in then it will return INVALID_STATE (-7). If we never re-register CRQ's then it will always think that the device is already logged in. This leaves the device inoperable. The partial reset involves freeing the sub-CRQs, freeing the CRQ then registering and initializing a new CRQ and sub-CRQs. This essentially restarts all communication with VIOS to allow for a fresh login attempt that will be unhindered by any previous failed attempts. Fixes: dff515a3e71d ("ibmvnic: Harden device login requests") Signed-off-by: Nick Child <nnac123(a)linux.ibm.com> Reviewed-by: Simon Horman <horms(a)kernel.org> Link: https://lore.kernel.org/r/20230809221038.51296-4-nnac123@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba(a)kernel.org> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c index 77b0a744fa88..e9619957d58a 100644 --- a/drivers/net/ethernet/ibm/ibmvnic.c +++ b/drivers/net/ethernet/ibm/ibmvnic.c @@ -97,6 +97,8 @@ static int pending_scrq(struct ibmvnic_adapter *, static union sub_crq *ibmvnic_next_scrq(struct ibmvnic_adapter *, struct ibmvnic_sub_crq_queue *); static int ibmvnic_poll(struct napi_struct *napi, int data); +static int reset_sub_crq_queues(struct ibmvnic_adapter *adapter); +static inline void reinit_init_done(struct ibmvnic_adapter *adapter); static void send_query_map(struct ibmvnic_adapter *adapter); static int send_request_map(struct ibmvnic_adapter *, dma_addr_t, u32, u8); static int send_request_unmap(struct ibmvnic_adapter *, u8); @@ -1527,11 +1529,9 @@ static int ibmvnic_login(struct net_device *netdev) if (!wait_for_completion_timeout(&adapter->init_done, timeout)) { - netdev_warn(netdev, "Login timed out, retrying...\n"); - retry = true; - adapter->init_done_rc = 0; - retry_count++; - continue; + netdev_warn(netdev, "Login timed out\n"); + adapter->login_pending = false; + goto partial_reset; } if (adapter->init_done_rc == ABORTED) { @@ -1576,7 +1576,41 @@ static int ibmvnic_login(struct net_device *netdev) } else if (adapter->init_done_rc) { netdev_warn(netdev, "Adapter login failed, init_done_rc = %d\n", adapter->init_done_rc); - return -EIO; + +partial_reset: + /* adapter login failed, so free any CRQs or sub-CRQs + * and register again before attempting to login again. + * If we don't do this then the VIOS may think that + * we are already logged in and reject any subsequent + * attempts + */ + netdev_warn(netdev, + "Freeing and re-registering CRQs before attempting to login again\n"); + retry = true; + adapter->init_done_rc = 0; + retry_count++; + release_sub_crqs(adapter, true); + reinit_init_done(adapter); + release_crq_queue(adapter); + /* If we don't sleep here then we risk an unnecessary + * failover event from the VIOS. This is a known VIOS + * issue caused by a vnic device freeing and registering + * a CRQ too quickly. + */ + msleep(1500); + rc = init_crq_queue(adapter); + if (rc) { + netdev_err(netdev, "login recovery: init CRQ failed %d\n", + rc); + return -EIO; + } + + rc = ibmvnic_reset_init(adapter, false); + if (rc) { + netdev_err(netdev, "login recovery: Reset init failed %d\n", + rc); + return -EIO; + } } } while (retry);

2 years, 4 months

1
0
0 0

FAILED: patch "[PATCH] ibmvnic: Enforce stronger sanity checks on login response" failed to apply to 4.19-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.19-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y git checkout FETCH_HEAD git cherry-pick -x db17ba719bceb52f0ae4ebca0e4c17d9a3bebf05 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023081211-outbreak-unwarlike-3ea6@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^.. Possible dependencies: db17ba719bce ("ibmvnic: Enforce stronger sanity checks on login response") 507ebe6444a4 ("ibmvnic: Fix use-after-free of VNIC login response buffer") f3ae59c0c015 ("ibmvnic: store RX and TX subCRQ handle array in ibmvnic_adapter struct") 7c940b1a5291 ("ibmvnic: Fix unchecked return codes of memory allocations") e56e2515669a ("ibmvnic: Add device identification to requested IRQs") 94b2bb28dbb4 ("net: ibm: fix return type of ndo_start_xmit function") thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From db17ba719bceb52f0ae4ebca0e4c17d9a3bebf05 Mon Sep 17 00:00:00 2001 From: Nick Child <nnac123(a)linux.ibm.com> Date: Wed, 9 Aug 2023 17:10:34 -0500 Subject: [PATCH] ibmvnic: Enforce stronger sanity checks on login response Ensure that all offsets in a login response buffer are within the size of the allocated response buffer. Any offsets or lengths that surpass the allocation are likely the result of an incomplete response buffer. In these cases, a full reset is necessary. When attempting to login, the ibmvnic device will allocate a response buffer and pass a reference to the VIOS. The VIOS will then send the ibmvnic device a LOGIN_RSP CRQ to signal that the buffer has been filled with data. If the ibmvnic device does not get a response in 20 seconds, the old buffer is freed and a new login request is sent. With 2 outstanding requests, any LOGIN_RSP CRQ's could be for the older login request. If this is the case then the login response buffer (which is for the newer login request) could be incomplete and contain invalid data. Therefore, we must enforce strict sanity checks on the response buffer values. Testing has shown that the `off_rxadd_buff_size` value is filled in last by the VIOS and will be the smoking gun for these circumstances. Until VIOS can implement a mechanism for tracking outstanding response buffers and a method for mapping a LOGIN_RSP CRQ to a particular login response buffer, the best ibmvnic can do in this situation is perform a full reset. Fixes: dff515a3e71d ("ibmvnic: Harden device login requests") Signed-off-by: Nick Child <nnac123(a)linux.ibm.com> Reviewed-by: Simon Horman <horms(a)kernel.org> Link: https://lore.kernel.org/r/20230809221038.51296-1-nnac123@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba(a)kernel.org> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c index 763d613adbcc..f4bb2c9ab9a4 100644 --- a/drivers/net/ethernet/ibm/ibmvnic.c +++ b/drivers/net/ethernet/ibm/ibmvnic.c @@ -5396,6 +5396,7 @@ static int handle_login_rsp(union ibmvnic_crq *login_rsp_crq, int num_tx_pools; int num_rx_pools; u64 *size_array; + u32 rsp_len; int i; /* CHECK: Test/set of login_pending does not need to be atomic @@ -5447,6 +5448,23 @@ static int handle_login_rsp(union ibmvnic_crq *login_rsp_crq, ibmvnic_reset(adapter, VNIC_RESET_FATAL); return -EIO; } + + rsp_len = be32_to_cpu(login_rsp->len); + if (be32_to_cpu(login->login_rsp_len) < rsp_len || + rsp_len <= be32_to_cpu(login_rsp->off_txsubm_subcrqs) || + rsp_len <= be32_to_cpu(login_rsp->off_rxadd_subcrqs) || + rsp_len <= be32_to_cpu(login_rsp->off_rxadd_buff_size) || + rsp_len <= be32_to_cpu(login_rsp->off_supp_tx_desc)) { + /* This can happen if a login request times out and there are + * 2 outstanding login requests sent, the LOGIN_RSP crq + * could have been for the older login request. So we are + * parsing the newer response buffer which may be incomplete + */ + dev_err(dev, "FATAL: Login rsp offsets/lengths invalid\n"); + ibmvnic_reset(adapter, VNIC_RESET_FATAL); + return -EIO; + } + size_array = (u64 *)((u8 *)(adapter->login_rsp_buf) + be32_to_cpu(adapter->login_rsp_buf->off_rxadd_buff_size)); /* variable buffer sizes are not supported, so just read the

2 years, 4 months

1
0
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-stable-mirror August 2023