March 2024 - Linux-stable-mirror

[PATCH v3 1/5] efi/libstub: Use correct event size when measuring data into the TPM

by Ard Biesheuvel

From: Ard Biesheuvel <ardb(a)kernel.org>

Our efi_tcg2_tagged_event is not defined in the EFI spec, but it is not
a local invention either: it was taken from the TCG PC Client spec,
where it is called TCG_PCClientTaggedEvent.

This spec also contains some guidance on how to populate it, which is
not being followed closely at the moment; the event size should cover
the TCG_PCClientTaggedEvent and its payload only, but it currently
covers the preceding efi_tcg2_event too, and this may result in trailing
garbage being measured into the TPM.

So rename the struct and document its provenance, and fix up the use so
only the tagged event data is represented in the size field.

Cc: <stable(a)vger.kernel.org>
Signed-off-by: Ard Biesheuvel <ardb(a)kernel.org>
---
 drivers/firmware/efi/libstub/efi-stub-helper.c | 20 +++++++++++---------
 drivers/firmware/efi/libstub/efistub.h         | 12 ++++++------
 2 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/drivers/firmware/efi/libstub/efi-stub-helper.c b/drivers/firmware/efi/libstub/efi-stub-helper.c
index bfa30625f5d0..16843ab9b64d 100644
--- a/drivers/firmware/efi/libstub/efi-stub-helper.c
+++ b/drivers/firmware/efi/libstub/efi-stub-helper.c
@@ -11,6 +11,7 @@
 
 #include <linux/efi.h>
 #include <linux/kernel.h>
+#include <linux/overflow.h>
 #include <asm/efi.h>
 #include <asm/setup.h>
 
@@ -219,23 +220,24 @@ static const struct {
 	},
 };
 
+struct efistub_measured_event {
+	efi_tcg2_event_t	event_data;
+	TCG_PCClientTaggedEvent	tagged_event;
+} __packed;
+
 static efi_status_t efi_measure_tagged_event(unsigned long load_addr,
 					     unsigned long load_size,
 					     enum efistub_event event)
 {
+	struct efistub_measured_event *evt;
+	int size = struct_size(&evt->tagged_event, tagged_event_data,
+			       events[event].event_data_len);
 	efi_guid_t tcg2_guid = EFI_TCG2_PROTOCOL_GUID;
 	efi_tcg2_protocol_t *tcg2 = NULL;
 	efi_status_t status;
 
 	efi_bs_call(locate_protocol, &tcg2_guid, NULL, (void **)&tcg2);
 	if (tcg2) {
-		struct efi_measured_event {
-			efi_tcg2_event_t	event_data;
-			efi_tcg2_tagged_event_t tagged_event;
-			u8			tagged_event_data[];
-		} *evt;
-		int size = sizeof(*evt) + events[event].event_data_len;
-
 		status = efi_bs_call(allocate_pool, EFI_LOADER_DATA, size,
 				     (void **)&evt);
 		if (status != EFI_SUCCESS)
@@ -249,12 +251,12 @@ static efi_status_t efi_measure_tagged_event(unsigned long load_addr,
 			.event_header.event_type	= EV_EVENT_TAG,
 		};
 
-		evt->tagged_event = (struct efi_tcg2_tagged_event){
+		evt->tagged_event = (TCG_PCClientTaggedEvent){
 			.tagged_event_id		= events[event].event_id,
 			.tagged_event_data_size		= events[event].event_data_len,
 		};
 
-		memcpy(evt->tagged_event_data, events[event].event_data,
+		memcpy(evt->tagged_event.tagged_event_data, events[event].event_data,
 		       events[event].event_data_len);
 
 		status = efi_call_proto(tcg2, hash_log_extend_event, 0,
diff --git a/drivers/firmware/efi/libstub/efistub.h b/drivers/firmware/efi/libstub/efistub.h
index c04b82ea40f2..043a3ff435f3 100644
--- a/drivers/firmware/efi/libstub/efistub.h
+++ b/drivers/firmware/efi/libstub/efistub.h
@@ -843,14 +843,14 @@ struct efi_tcg2_event {
 	/* u8[] event follows here */
 } __packed;
 
-struct efi_tcg2_tagged_event {
-	u32 tagged_event_id;
-	u32 tagged_event_data_size;
-	/* u8 tagged event data follows here */
-} __packed;
+/* from TCG PC Client Platform Firmware Profile Specification */
+typedef struct tdTCG_PCClientTaggedEvent {
+	u32 tagged_event_id;
+	u32 tagged_event_data_size;
+	u8 tagged_event_data[];
+} TCG_PCClientTaggedEvent;
 
 typedef struct efi_tcg2_event efi_tcg2_event_t;
-typedef struct efi_tcg2_tagged_event efi_tcg2_tagged_event_t;
 typedef union efi_tcg2_protocol efi_tcg2_protocol_t;
 
 union efi_tcg2_protocol {
-- 
2.44.0.278.ge034bb2e1d-goog
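
For readers following the size argument in the commit message, here is a minimal userspace sketch of the arithmetic. It is not the kernel code: `demo_outer_event` is a hypothetical stand-in for the preceding efi_tcg2_event (its layout is illustrative only), and the `demo_*` helpers are invented names. The kernel's struct_size() helper performs the same header-plus-payload computation and additionally saturates on overflow, which this sketch does not model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the outer event that precedes the tagged event. */
struct demo_outer_event {
	uint32_t event_size;
	uint32_t header_words[4];
};

/* Same members as TCG_PCClientTaggedEvent in the patch above. */
struct demo_tagged_event {
	uint32_t tagged_event_id;
	uint32_t tagged_event_data_size;
	uint8_t  tagged_event_data[];	/* flexible array member */
};

/* The payload starts right after the two 32-bit fields. */
static_assert(offsetof(struct demo_tagged_event, tagged_event_data) == 8,
	      "unexpected padding in demo_tagged_event");

/* Size of the tagged event and its payload only. */
static size_t demo_tagged_size(size_t payload_len)
{
	return sizeof(struct demo_tagged_event) + payload_len;
}

/* Size that also counts the outer event in front of the tagged event. */
static size_t demo_full_size(size_t payload_len)
{
	return sizeof(struct demo_outer_event) + demo_tagged_size(payload_len);
}
```

With a 4-byte payload, the two helpers differ by exactly the size of the outer event. Per the commit message, that difference is the amount by which the old size field overstated the tagged event.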


Re: [PATCH] cifs: Convert struct fealist away from 1-element array

by Vitaly Chikunov

Greg, Sasha,

Can you please backport this commit (below) to the stable 6.1.y tree?
Kees has confirmed that its absence could cause a kernel panic due to a
false-positive strncpy fortify check, and this has already happened for
some users.

Thanks,

On Fri, Feb 09, 2024 at 04:02:32PM -0800, Kees Cook wrote:
> On Sat, Feb 10, 2024 at 01:13:06AM +0300, Vitaly Chikunov wrote:
> >
> > On Tue, Feb 14, 2023 at 04:08:39PM -0800, Kees Cook wrote:
> > > The kernel is globally removing the ambiguous 0-length and 1-element
> > > arrays in favor of flexible arrays, so that we can gain both compile-time
> > > and run-time array bounds checking[1].
> > >
> > > While struct fealist is defined as a "fake" flexible array (via a
> > > 1-element array), it is only used for examination of the first array
> > > element. Walking the list is performed separately, so there is no reason
> > > to treat the "list" member of struct fealist as anything other than a
> > > single entry. Adjust the struct and code to match.
> > >
> > > Additionally, struct fea uses the "name" member either as a dynamic
> > > string, or is manually calculated from the start of the struct. Redefine
> > > the member as a flexible array.
> > >
> > > No machine code output differences are produced after these changes.
> > >
> > > [1] For lots of details, see both:
> > >     https://docs.kernel.org/process/deprecated.html#zero-length-and-one-element…
> > >     https://people.kernel.org/kees/bounded-flexible-arrays-in-c
> > >
> > > Cc: Steve French <sfrench(a)samba.org>
> > > Cc: Paulo Alcantara <pc(a)cjr.nz>
> > > Cc: Ronnie Sahlberg <lsahlber(a)redhat.com>
> > > Cc: Shyam Prasad N <sprasad(a)microsoft.com>
> > > Cc: Tom Talpey <tom(a)talpey.com>
> > > Cc: linux-cifs(a)vger.kernel.org
> > > Cc: samba-technical(a)lists.samba.org
> > > Signed-off-by: Kees Cook <keescook(a)chromium.org>
> > > ---
> > >  fs/cifs/cifspdu.h |  4 ++--
> > >  fs/cifs/cifssmb.c | 16 ++++++++--------
> > >  2 files changed, 10 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/fs/cifs/cifspdu.h b/fs/cifs/cifspdu.h
> > > index 623caece2b10..add73be4902c 100644
> > > --- a/fs/cifs/cifspdu.h
> > > +++ b/fs/cifs/cifspdu.h
> > > @@ -2583,7 +2583,7 @@ struct fea {
> > >  	unsigned char EA_flags;
> > >  	__u8 name_len;
> > >  	__le16 value_len;
> > > -	char name[1];
> > > +	char name[];
> > >  	/* optionally followed by value */
> > >  } __attribute__((packed));
> > >  /* flags for _FEA.fEA */
> > > @@ -2591,7 +2591,7 @@ struct fea {
> > >
> > >  struct fealist {
> > >  	__le32 list_len;
> > > -	struct fea list[1];
> > > +	struct fea list;
> > >  } __attribute__((packed));
> > >
> > >  /* used to hold an arbitrary blob of data */
> > > diff --git a/fs/cifs/cifssmb.c b/fs/cifs/cifssmb.c
> > > index 60dd4e37030a..7c587157d030 100644
> > > --- a/fs/cifs/cifssmb.c
> > > +++ b/fs/cifs/cifssmb.c
> > > @@ -5787,7 +5787,7 @@ CIFSSMBQAllEAs(const unsigned int xid, struct cifs_tcon *tcon,
> > >
> > >  	/* account for ea list len */
> > >  	list_len -= 4;
> > > -	temp_fea = ea_response_data->list;
> > > +	temp_fea = &ea_response_data->list;
> > >  	temp_ptr = (char *)temp_fea;
> > >  	while (list_len > 0) {
> > >  		unsigned int name_len;
> > > @@ -5902,7 +5902,7 @@ CIFSSMBSetEA(const unsigned int xid, struct cifs_tcon *tcon,
> > >  	else
> > >  		name_len = strnlen(ea_name, 255);
> > >
> > > -	count = sizeof(*parm_data) + ea_value_len + name_len;
> > > +	count = sizeof(*parm_data) + 1 + ea_value_len + name_len;
> > >  	pSMB->MaxParameterCount = cpu_to_le16(2);
> > >  	/* BB find max SMB PDU from sess */
> > >  	pSMB->MaxDataCount = cpu_to_le16(1000);
> > > @@ -5926,14 +5926,14 @@ CIFSSMBSetEA(const unsigned int xid, struct cifs_tcon *tcon,
> > >  	byte_count = 3 /* pad */ + params + count;
> > >  	pSMB->DataCount = cpu_to_le16(count);
> > >  	parm_data->list_len = cpu_to_le32(count);
> > > -	parm_data->list[0].EA_flags = 0;
> > > +	parm_data->list.EA_flags = 0;
> > >  	/* we checked above that name len is less than 255 */
> > > -	parm_data->list[0].name_len = (__u8)name_len;
> > > +	parm_data->list.name_len = (__u8)name_len;
> > >  	/* EA names are always ASCII */
> > >  	if (ea_name)
> > > -		strncpy(parm_data->list[0].name, ea_name, name_len);
> > > -	parm_data->list[0].name[name_len] = 0;
> > > -	parm_data->list[0].value_len = cpu_to_le16(ea_value_len);
> > > +		strncpy(parm_data->list.name, ea_name, name_len);
> >
> > Could not applying this patch cause a false-positive fortify_panic?
> > We got a bug report from a user of 6.1.73:
> >
> >   Jan 24 15:15:20 kalt2test.dpt.local kernel: detected buffer overflow in strncpy
>
> Yes, this seems likely. I would backport this change.
>
> >   Jan 24 15:15:20 kalt2test.dpt.local kernel: ------------[ cut here ]------------
> >   Jan 24 15:15:20 kalt2test.dpt.local kernel: kernel BUG at lib/string_helpers.c:1027!
> >   ...
> >   Jan 24 15:15:20 kalt2test.dpt.local kernel: Call Trace:
> >   Jan 24 15:15:20 kalt2test.dpt.local kernel:  CIFSSMBSetEA.cold+0xc/0x18 [cifs]
> >   Jan 24 15:15:20 kalt2test.dpt.local kernel:  cifs_xattr_set+0x596/0x690 [cifs]
> >   Jan 24 15:15:20 kalt2test.dpt.local kernel:  __vfs_removexattr+0x52/0x70
> >   Jan 24 15:15:20 kalt2test.dpt.local kernel:  __vfs_removexattr_locked+0xbc/0x150
> >   Jan 24 15:15:20 kalt2test.dpt.local kernel:  vfs_removexattr+0x56/0x100
> >   Jan 24 15:15:20 kalt2test.dpt.local kernel:  removexattr+0x58/0x90
> >   Jan 24 15:15:20 kalt2test.dpt.local kernel:  __ia32_sys_fremovexattr+0x80/0xa0
> >   Jan 24 15:15:20 kalt2test.dpt.local kernel:  int80_emulation+0xa9/0x110
> >   Jan 24 15:15:20 kalt2test.dpt.local kernel:  asm_int80_emulation+0x16/0x20
>
> This appears to be a compat call?
>
> -Kees
>
> >
> > I don't find this patch applied to stable/linux-6.1.y.
> >
> > Thanks,
> >
> > ps. (Unfortunately `CIFSSMBSetEA+0xc` address is not resolvable to the
> > actual line inside of CIFSSMBSetEA pointing just to the head of it.
> >
> >   (gdb) l *CIFSSMBSetEA+0xc
> >   0x6de3c is in CIFSSMBSetEA (fs/smb/client/cifssmb.c:5776).
> >   5771    int
> >   5772    CIFSSMBSetEA(const unsigned int xid, struct cifs_tcon *tcon,
> >   5773                 const char *fileName, const char *ea_name, const void *ea_value,
> >   5774                 const __u16 ea_value_len, const struct nls_table *nls_codepage,
> >   5775                 struct cifs_sb_info *cifs_sb)
> >   5776    {
> >   5777            struct smb_com_transaction2_spi_req *pSMB = NULL;
> >   5778            struct smb_com_transaction2_spi_rsp *pSMBr = NULL;
> >   5779            struct fealist *parm_data;
> >   5780            int name_len;
> >
> > But there is only one strncpy there.
> >
> > > +	parm_data->list.name[name_len] = '\0';
> > > +	parm_data->list.value_len = cpu_to_le16(ea_value_len);
> > >  	/* caller ensures that ea_value_len is less than 64K but
> > >  	   we need to ensure that it fits within the smb */
> > >
> > > @@ -5941,7 +5941,7 @@ CIFSSMBSetEA(const unsigned int xid, struct cifs_tcon *tcon,
> > >  	   negotiated SMB buffer size BB */
> > >  	/* if (ea_value_len > buffer_size - 512 (enough for header)) */
> > >  	if (ea_value_len)
> > > -		memcpy(parm_data->list[0].name+name_len+1,
> > > +		memcpy(parm_data->list.name + name_len + 1,
> > >  		       ea_value, ea_value_len);
> > >
> > >  	pSMB->TotalDataCount = pSMB->DataCount;
> > > --
> > > 2.34.1
> > >
>
> --
> Kees Cook
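
The "+ 1" added to the count calculation in the quoted patch can be checked with a small userspace sketch. It mirrors the two definitions of struct fea from the diff; the names `fea_old`, `fea_new`, and the `demo_count_*` helpers are invented for illustration, the 4 bytes stand for the `__le32 list_len` field of struct fealist, and `__attribute__((packed))` assumes GCC or Clang. A plausible reading of the thread, hedged accordingly, is that with `char name[1]` the fortify check sees a 1-byte destination, so any longer strncpy into it looks like an overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* struct fea before the patch: "fake" flexible array of one element. */
struct fea_old {
	unsigned char EA_flags;
	uint8_t  name_len;
	uint16_t value_len;
	char     name[1];
} __attribute__((packed));

/* struct fea after the patch: true flexible array member. */
struct fea_new {
	unsigned char EA_flags;
	uint8_t  name_len;
	uint16_t value_len;
	char     name[];
} __attribute__((packed));

static_assert(sizeof(struct fea_old) == 5, "old layout counts one name byte");
static_assert(sizeof(struct fea_new) == 4, "new layout counts no name bytes");

/* Old byte count: list_len + old struct + value + name. */
static size_t demo_count_old(size_t name_len, size_t value_len)
{
	return sizeof(uint32_t) + sizeof(struct fea_old) + value_len + name_len;
}

/* New byte count: the explicit "+ 1" replaces the byte the struct lost. */
static size_t demo_count_new(size_t name_len, size_t value_len)
{
	return sizeof(uint32_t) + sizeof(struct fea_new) + 1 + value_len + name_len;
}
```

Both helpers return the same value for any name and value length, which is consistent with the commit message's statement that no machine code differences are produced.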


[PATCH net v2] net: esp: fix bad handling of pages from page_pool

by Dragos Tatulea

When the skb is reorganized during esp_output (!esp->inline), the pages
coming from the original skb fragments are supposed to be released back
to the system through put_page. But if the skb fragment pages are
originating from a page_pool, calling put_page on them will trigger a
page_pool leak which will eventually result in a crash.

This leak can be easily observed when using CONFIG_DEBUG_VM and doing
ipsec + gre (non offloaded) forwarding:

  BUG: Bad page state in process ksoftirqd/16  pfn:1451b6
  page:00000000de2b8d32 refcount:0 mapcount:0 mapping:0000000000000000 index:0x1451b6000 pfn:0x1451b6
  flags: 0x200000000000000(node=0|zone=2)
  page_type: 0xffffffff()
  raw: 0200000000000000 dead000000000040 ffff88810d23c000 0000000000000000
  raw: 00000001451b6000 0000000000000001 00000000ffffffff 0000000000000000
  page dumped because: page_pool leak
  Modules linked in: ip_gre gre mlx5_ib mlx5_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat nf_nat xt_addrtype br_netfilter rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm ib_uverbs ib_core overlay zram zsmalloc fuse [last unloaded: mlx5_core]
  CPU: 16 PID: 96 Comm: ksoftirqd/16 Not tainted 6.8.0-rc4+ #22
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  Call Trace:
   <TASK>
   dump_stack_lvl+0x36/0x50
   bad_page+0x70/0xf0
   free_unref_page_prepare+0x27a/0x460
   free_unref_page+0x38/0x120
   esp_ssg_unref.isra.0+0x15f/0x200
   esp_output_tail+0x66d/0x780
   esp_xmit+0x2c5/0x360
   validate_xmit_xfrm+0x313/0x370
   ? validate_xmit_skb+0x1d/0x330
   validate_xmit_skb_list+0x4c/0x70
   sch_direct_xmit+0x23e/0x350
   __dev_queue_xmit+0x337/0xba0
   ? nf_hook_slow+0x3f/0xd0
   ip_finish_output2+0x25e/0x580
   iptunnel_xmit+0x19b/0x240
   ip_tunnel_xmit+0x5fb/0xb60
   ipgre_xmit+0x14d/0x280 [ip_gre]
   dev_hard_start_xmit+0xc3/0x1c0
   __dev_queue_xmit+0x208/0xba0
   ? nf_hook_slow+0x3f/0xd0
   ip_finish_output2+0x1ca/0x580
   ip_sublist_rcv_finish+0x32/0x40
   ip_sublist_rcv+0x1b2/0x1f0
   ? ip_rcv_finish_core.constprop.0+0x460/0x460
   ip_list_rcv+0x103/0x130
   __netif_receive_skb_list_core+0x181/0x1e0
   netif_receive_skb_list_internal+0x1b3/0x2c0
   napi_gro_receive+0xc8/0x200
   gro_cell_poll+0x52/0x90
   __napi_poll+0x25/0x1a0
   net_rx_action+0x28e/0x300
   __do_softirq+0xc3/0x276
   ? sort_range+0x20/0x20
   run_ksoftirqd+0x1e/0x30
   smpboot_thread_fn+0xa6/0x130
   kthread+0xcd/0x100
   ? kthread_complete_and_exit+0x20/0x20
   ret_from_fork+0x31/0x50
   ? kthread_complete_and_exit+0x20/0x20
   ret_from_fork_asm+0x11/0x20
   </TASK>

The suggested fix is to introduce a new wrapper (skb_page_unref) that
covers page refcounting for page_pool pages as well.

Cc: stable(a)vger.kernel.org
Fixes: 6a5bcd84e886 ("page_pool: Allow drivers to hint on SKB recycling")
Reported-and-tested-by: Anatoli N.Chechelnickiy <Anatoli.Chechelnickiy(a)m.interpipe.biz>
Reported-by: Ian Kumlien <ian.kumlien(a)gmail.com>
Link: https://lore.kernel.org/netdev/CAA85sZvvHtrpTQRqdaOx6gd55zPAVsqMYk_Lwh4Md5k…
Signed-off-by: Dragos Tatulea <dtatulea(a)nvidia.com>
Reviewed-by: Mina Almasry <almasrymina(a)google.com>
Reviewed-by: Jakub Kicinski <kuba(a)kernel.org>
---
Changes in v2:
- Fixes in tags.
---
 include/linux/skbuff.h | 10 ++++++++++
 net/ipv4/esp4.c        |  8 ++++----
 net/ipv6/esp6.c        |  8 ++++----
 3 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 696e7680656f..6126fc8e4a89 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -3452,6 +3452,16 @@ int skb_cow_data_for_xdp(struct page_pool *pool, struct sk_buff **pskb,
 			 struct bpf_prog *prog);
 bool napi_pp_put_page(struct page *page, bool napi_safe);
 
+static inline void
+skb_page_unref(const struct sk_buff *skb, struct page *page, bool napi_safe)
+{
+#ifdef CONFIG_PAGE_POOL
+	if (skb->pp_recycle && napi_pp_put_page(page, napi_safe))
+		return;
+#endif
+	put_page(page);
+}
+
 static inline void
 napi_frag_unref(skb_frag_t *frag, bool recycle, bool napi_safe)
 {
diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c
index 4dd9e5040672..d33d12421814 100644
--- a/net/ipv4/esp4.c
+++ b/net/ipv4/esp4.c
@@ -95,7 +95,7 @@ static inline struct scatterlist *esp_req_sg(struct crypto_aead *aead,
 			     __alignof__(struct scatterlist));
 }
 
-static void esp_ssg_unref(struct xfrm_state *x, void *tmp)
+static void esp_ssg_unref(struct xfrm_state *x, void *tmp, struct sk_buff *skb)
 {
 	struct crypto_aead *aead = x->data;
 	int extralen = 0;
@@ -114,7 +114,7 @@ static void esp_ssg_unref(struct xfrm_state *x, void *tmp)
 	 */
 	if (req->src != req->dst)
 		for (sg = sg_next(req->src); sg; sg = sg_next(sg))
-			put_page(sg_page(sg));
+			skb_page_unref(skb, sg_page(sg), false);
 }
 
 #ifdef CONFIG_INET_ESPINTCP
@@ -260,7 +260,7 @@ static void esp_output_done(void *data, int err)
 	}
 
 	tmp = ESP_SKB_CB(skb)->tmp;
-	esp_ssg_unref(x, tmp);
+	esp_ssg_unref(x, tmp, skb);
 	kfree(tmp);
 
 	if (xo && (xo->flags & XFRM_DEV_RESUME)) {
@@ -639,7 +639,7 @@ int esp_output_tail(struct xfrm_state *x, struct sk_buff *skb, struct esp_info *
 	}
 
 	if (sg != dsg)
-		esp_ssg_unref(x, tmp);
+		esp_ssg_unref(x, tmp, skb);
 
 	if (!err && x->encap && x->encap->encap_type == TCP_ENCAP_ESPINTCP)
 		err = esp_output_tail_tcp(x, skb);
diff --git a/net/ipv6/esp6.c b/net/ipv6/esp6.c
index 6e6efe026cdc..7371886d4f9f 100644
--- a/net/ipv6/esp6.c
+++ b/net/ipv6/esp6.c
@@ -112,7 +112,7 @@ static inline struct scatterlist *esp_req_sg(struct crypto_aead *aead,
 			     __alignof__(struct scatterlist));
 }
 
-static void esp_ssg_unref(struct xfrm_state *x, void *tmp)
+static void esp_ssg_unref(struct xfrm_state *x, void *tmp, struct sk_buff *skb)
 {
 	struct crypto_aead *aead = x->data;
 	int extralen = 0;
@@ -131,7 +131,7 @@ static void esp_ssg_unref(struct xfrm_state *x, void *tmp)
 	 */
 	if (req->src != req->dst)
 		for (sg = sg_next(req->src); sg; sg = sg_next(sg))
-			put_page(sg_page(sg));
+			skb_page_unref(skb, sg_page(sg), false);
 }
 
 #ifdef CONFIG_INET6_ESPINTCP
@@ -294,7 +294,7 @@ static void esp_output_done(void *data, int err)
 	}
 
 	tmp = ESP_SKB_CB(skb)->tmp;
-	esp_ssg_unref(x, tmp);
+	esp_ssg_unref(x, tmp, skb);
 	kfree(tmp);
 
 	esp_output_encap_csum(skb);
@@ -677,7 +677,7 @@ int esp6_output_tail(struct xfrm_state *x, struct sk_buff *skb, struct esp_info
 	}
 
 	if (sg != dsg)
-		esp_ssg_unref(x, tmp);
+		esp_ssg_unref(x, tmp, skb);
 
 	if (!err && x->encap && x->encap->encap_type == TCP_ENCAP_ESPINTCP)
 		err = esp_output_tail_tcp(x, skb);
-- 
2.42.0
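
The control flow of the new skb_page_unref() wrapper can be modelled in userspace as a rough sketch. None of the types below are the kernel's: `demo_page`, `demo_skb`, and all `demo_*` functions are hypothetical stand-ins, and the "pool" is reduced to a counter. The sketch only shows the decision the wrapper makes: try the page_pool return path first when the skb is marked for recycling, and fall back to the plain reference drop otherwise.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy page: a reference count plus a record of where it came from. */
struct demo_page {
	int  refcount;
	bool from_pool;		/* page was allocated from a (mock) page_pool */
	int  pool_returns;	/* times it was handed back to the pool */
};

/* Toy skb: only the recycling hint matters here. */
struct demo_skb {
	bool pp_recycle;
};

/* Mock of the pool return path: succeeds only for pool pages. */
static bool demo_pp_put_page(struct demo_page *page)
{
	if (!page->from_pool)
		return false;
	page->pool_returns++;
	return true;
}

/* Mock of the plain reference drop. */
static void demo_put_page(struct demo_page *page)
{
	assert(page->refcount > 0);
	page->refcount--;
}

/* Same shape as the wrapper in the patch: pool path first, then fallback. */
static void demo_skb_page_unref(const struct demo_skb *skb, struct demo_page *page)
{
	if (skb->pp_recycle && demo_pp_put_page(page))
		return;
	demo_put_page(page);
}
```

In this model, a pool page attached to a recycling skb is returned to the pool without touching the plain refcount. That is the case the original put_page() call handled incorrectly, according to the commit message.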


[PATCH v3 2/6] usb: dwc3: gadget: cancel requests instead of release after missed isoc

by Dan Vacura

From: Jeff Vanhoof <qjv001(a)motorola.com>

arm-smmu related crashes seen after a Missed ISOC interrupt when
no_interrupt=1 is used. This can happen if the hardware is still using
the data associated with a TRB after the usb_request's ->complete call
has been made. Instead of immediately releasing a request when a Missed
ISOC interrupt has occurred, this change will add logic to cancel the
request instead where it will eventually be released when the
END_TRANSFER command has completed. This logic is similar to some of the
cleanup done in dwc3_gadget_ep_dequeue.

Fixes: 6d8a019614f3 ("usb: dwc3: gadget: check for Missed Isoc from event status")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Jeff Vanhoof <qjv001(a)motorola.com>
Co-developed-by: Dan Vacura <w36195(a)motorola.com>
Signed-off-by: Dan Vacura <w36195(a)motorola.com>
---
V1 -> V3:
- no change, new patch in series

 drivers/usb/dwc3/core.h   |  1 +
 drivers/usb/dwc3/gadget.c | 38 ++++++++++++++++++++++++++------------
 2 files changed, 27 insertions(+), 12 deletions(-)

diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
index 8f9959ba9fd4..9b005d912241 100644
--- a/drivers/usb/dwc3/core.h
+++ b/drivers/usb/dwc3/core.h
@@ -943,6 +943,7 @@ struct dwc3_request {
 #define DWC3_REQUEST_STATUS_DEQUEUED		3
 #define DWC3_REQUEST_STATUS_STALLED		4
 #define DWC3_REQUEST_STATUS_COMPLETED		5
+#define DWC3_REQUEST_STATUS_MISSED_ISOC		6
 #define DWC3_REQUEST_STATUS_UNKNOWN		-1
 
 	u8			epnum;
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 079cd333632e..411532c5c378 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -2021,6 +2021,9 @@ static void dwc3_gadget_ep_cleanup_cancelled_requests(struct dwc3_ep *dep)
 		case DWC3_REQUEST_STATUS_STALLED:
 			dwc3_gadget_giveback(dep, req, -EPIPE);
 			break;
+		case DWC3_REQUEST_STATUS_MISSED_ISOC:
+			dwc3_gadget_giveback(dep, req, -EXDEV);
+			break;
 		default:
 			dev_err(dwc->dev, "request cancelled with wrong reason:%d\n", req->status);
 			dwc3_gadget_giveback(dep, req, -ECONNRESET);
@@ -3402,21 +3405,32 @@ static bool dwc3_gadget_endpoint_trbs_complete(struct dwc3_ep *dep,
 	struct dwc3		*dwc = dep->dwc;
 	bool			no_started_trb = true;
 
-	dwc3_gadget_ep_cleanup_completed_requests(dep, event, status);
+	if (status == -EXDEV) {
+		struct dwc3_request *tmp;
+		struct dwc3_request *req;
 
-	if (dep->flags & DWC3_EP_END_TRANSFER_PENDING)
-		goto out;
+		if (!(dep->flags & DWC3_EP_END_TRANSFER_PENDING))
+			dwc3_stop_active_transfer(dep, true, true);
 
-	if (!dep->endpoint.desc)
-		return no_started_trb;
+		list_for_each_entry_safe(req, tmp, &dep->started_list, list)
+			dwc3_gadget_move_cancelled_request(req,
+					DWC3_REQUEST_STATUS_MISSED_ISOC);
+	} else {
+		dwc3_gadget_ep_cleanup_completed_requests(dep, event, status);
 
-	if (usb_endpoint_xfer_isoc(dep->endpoint.desc) &&
-		list_empty(&dep->started_list) &&
-		(list_empty(&dep->pending_list) || status == -EXDEV))
-		dwc3_stop_active_transfer(dep, true, true);
-	else if (dwc3_gadget_ep_should_continue(dep))
-		if (__dwc3_gadget_kick_transfer(dep) == 0)
-			no_started_trb = false;
+		if (dep->flags & DWC3_EP_END_TRANSFER_PENDING)
+			goto out;
+
+		if (!dep->endpoint.desc)
+			return no_started_trb;
+
+		if (usb_endpoint_xfer_isoc(dep->endpoint.desc) &&
+			list_empty(&dep->started_list) && list_empty(&dep->pending_list))
+			dwc3_stop_active_transfer(dep, true, true);
+		else if (dwc3_gadget_ep_should_continue(dep))
+			if (__dwc3_gadget_kick_transfer(dep) == 0)
+				no_started_trb = false;
+	}
 
 out:
 	/*
-- 
2.34.1


[PATCH v3 0/3] Support intra-function call validation

by Rui Qi

Since kernel version 5.4.217 LTS, the kernel live patching feature has
been unavailable. When the sample live patch module is built and then
enabled, the following message is displayed:

  livepatch: klp_check_stack: kworker/u256:6:23490 has an unreliable stack

Reproduction steps:
1. git checkout v5.4.269 -b v5.4.269
2. make defconfig
3. Set CONFIG_LIVEPATCH=y, CONFIG_SAMPLE_LIVEPATCH=m
4. make -j bzImage
5. make samples/livepatch/livepatch-sample.ko
6. qemu-system-x86_64 -kernel arch/x86_64/boot/bzImage -nographic -append "console=ttyS0" -initrd initrd.img -m 1024M
7. insmod livepatch-sample.ko

The kernel live patch cannot complete successfully.

After some debugging, the immediate cause of the patch failure is an
error in stack checking. The logs are as follows:

[ 340.974853] livepatch: klp_check_stack: kworker/u256:0:23486 has an unreliable stack
[ 340.974858] livepatch: klp_check_stack: kworker/u256:1:23487 has an unreliable stack
[ 340.974863] livepatch: klp_check_stack: kworker/u256:2:23488 has an unreliable stack
[ 340.974868] livepatch: klp_check_stack: kworker/u256:5:23489 has an unreliable stack
[ 340.974872] livepatch: klp_check_stack: kworker/u256:6:23490 has an unreliable stack
......

BTW, if you use the v5.4.217 tag for testing, make sure to set
CONFIG_RETPOLINE=y and CONFIG_LIVEPATCH=y; the other steps are the same
as for v5.4.269.

After investigation, the problem is strongly related to commit
8afd1c7da2b0 ("x86/speculation: Change FILL_RETURN_BUFFER to work with
objtool"), which causes incorrect ORC entries to be generated. On
v5.4.217, reverting this commit makes kernel livepatch work normally.
It is a back-ported upstream patch with some code adjustments; in the
git log, the author also mentions that there is no intra-function call
validation support.

Based on commit 6e1f54a4985b63bc1b55a09e5e75a974c5d6719b (Linux 5.4.269),
this patchset adds stack validation support for intra-function calls,
allowing the kernel live patching feature to work correctly.

v3 - v2
 - fix the compile error in arch/x86/kvm/svm.c, the error message is
   ../arch/x86/include/asm/nospec-branch.h: 313: Error: no such instruction: 'unwind_hint_empty'

v2 - v1
 - add the tag "Cc: stable(a)vger.kernel.org" in the sign-off area for
   patch x86/speculation: Support intra-function call
 - add my own Signed-off to all patches

Alexandre Chartre (2):
  objtool: is_fentry_call() crashes if call has no destination
  objtool: Add support for intra-function calls

Rui Qi (1):
  x86/speculation: Support intra-function call validation

 arch/x86/include/asm/nospec-branch.h          |  7 ++
 arch/x86/include/asm/unwind_hints.h           |  2 +-
 include/linux/frame.h                         | 11 ++++
 .../Documentation/stack-validation.txt        |  8 +++
 tools/objtool/arch/x86/decode.c               |  6 ++
 tools/objtool/check.c                         | 64 +++++++++++++++++--
 6 files changed, 92 insertions(+), 6 deletions(-)

-- 
2.20.1


[PATCH v1] mm: swap: Fix race between free_swap_and_cache() and swapoff()

by Ryan Roberts

There was previously a theoretical window where swapoff() could run and
teardown a swap_info_struct while a call to free_swap_and_cache() was
running in another thread. This could cause, amongst other bad
possibilities, swap_page_trans_huge_swapped() (called by
free_swap_and_cache()) to access the freed memory for swap_map.

This is a theoretical problem and I haven't been able to provoke it from
a test case. But there has been agreement based on code review that this
is possible (see link below).

Fix it by using get_swap_device()/put_swap_device(), which will stall
swapoff(). There was an extra check in _swap_info_get() to confirm that
the swap entry was valid. This wasn't present in get_swap_device() so
I've added it. I couldn't find any existing get_swap_device() call sites
where this extra check would cause any false alarms.

Details of how to provoke one possible issue (thanks to David
Hildenbrand for deriving this):

--8<-----

__swap_entry_free() might be the last user and result in
"count == SWAP_HAS_CACHE".

swapoff->try_to_unuse() will stop as soon as si->inuse_pages==0.

So the question is: could someone reclaim the folio and turn
si->inuse_pages==0, before we completed swap_page_trans_huge_swapped().

Imagine the following: 2 MiB folio in the swapcache. Only 2 subpages are
still referenced by swap entries.

Process 1 still references subpage 0 via swap entry.
Process 2 still references subpage 1 via swap entry.

Process 1 quits. Calls free_swap_and_cache().
-> count == SWAP_HAS_CACHE
[then, preempted in the hypervisor etc.]

Process 2 quits. Calls free_swap_and_cache().
-> count == SWAP_HAS_CACHE

Process 2 goes ahead, passes swap_page_trans_huge_swapped(), and calls
__try_to_reclaim_swap().

__try_to_reclaim_swap()->folio_free_swap()->delete_from_swap_cache()->
put_swap_folio()->free_swap_slot()->swapcache_free_entries()->
swap_entry_free()->swap_range_free()->
...
WRITE_ONCE(si->inuse_pages, si->inuse_pages - nr_entries);

What stops swapoff from succeeding after process 2 reclaimed the swap
cache but before process 1 finished its call to
swap_page_trans_huge_swapped()?

--8<-----

Fixes: 7c00bafee87c ("mm/swap: free swap slots in batch")
Closes: https://lore.kernel.org/linux-mm/65a66eb9-41f8-4790-8db2-0c70ea15979f@redha…
Cc: stable(a)vger.kernel.org
Signed-off-by: Ryan Roberts <ryan.roberts(a)arm.com>
---

Applies on top of v6.8-rc6 and mm-unstable (b38c34939fe4).

Thanks,
Ryan

 mm/swapfile.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 2b3a2d85e350..f580e6abc674 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1281,7 +1281,9 @@ struct swap_info_struct *get_swap_device(swp_entry_t entry)
 	smp_rmb();
 	offset = swp_offset(entry);
 	if (offset >= si->max)
-		goto put_out;
+		goto bad_offset;
+	if (data_race(!si->swap_map[swp_offset(entry)]))
+		goto bad_free;
 
 	return si;
 bad_nofile:
@@ -1289,9 +1291,14 @@ struct swap_info_struct *get_swap_device(swp_entry_t entry)
 out:
 	return NULL;
 put_out:
-	pr_err("%s: %s%08lx\n", __func__, Bad_offset, entry.val);
 	percpu_ref_put(&si->users);
 	return NULL;
+bad_offset:
+	pr_err("%s: %s%08lx\n", __func__, Bad_offset, entry.val);
+	goto put_out;
+bad_free:
+	pr_err("%s: %s%08lx\n", __func__, Unused_offset, entry.val);
+	goto put_out;
 }
 
 static unsigned char __swap_entry_free(struct swap_info_struct *p,
@@ -1609,13 +1616,14 @@ int free_swap_and_cache(swp_entry_t entry)
 	if (non_swap_entry(entry))
 		return 1;
 
-	p = _swap_info_get(entry);
+	p = get_swap_device(entry);
 	if (p) {
 		count = __swap_entry_free(p, entry);
 		if (count == SWAP_HAS_CACHE &&
 		    !swap_page_trans_huge_swapped(p, entry))
 			__try_to_reclaim_swap(p, swp_offset(entry),
 					      TTRS_UNMAPPED | TTRS_FULL);
+		put_swap_device(p);
 	}
 	return p != NULL;
 }
-- 
2.25.1
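
The get/put guard that the patch relies on can be illustrated with a single-threaded userspace model. This is only a sketch under loud assumptions: `demo_swap_info` and every `demo_*` function are hypothetical, the plain integer `users` stands in for the kernel's percpu reference count, and no real concurrency or memory ordering is modelled. It shows the two validity checks (offset in range, map entry in use) and the invariant that teardown may proceed only while no reference is held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy swap device: a user count guards the swap_map against teardown. */
struct demo_swap_info {
	int                  users;	/* stand-in for the percpu ref */
	bool                 live;	/* false once swapoff has started */
	unsigned long        max;	/* number of entries in swap_map */
	const unsigned char *swap_map;	/* 0 means the entry is unused */
};

/* Take a reference if the device is live and the entry is valid. */
static struct demo_swap_info *
demo_get_swap_device(struct demo_swap_info *si, unsigned long offset)
{
	if (!si->live)
		return NULL;
	si->users++;
	/* Out-of-range or unused entries drop the reference again. */
	if (offset >= si->max || !si->swap_map[offset]) {
		si->users--;
		return NULL;
	}
	return si;
}

/* Release the reference taken by demo_get_swap_device(). */
static void demo_put_swap_device(struct demo_swap_info *si)
{
	assert(si->users > 0);
	si->users--;
}

/* Teardown may free swap_map only when nobody holds a reference. */
static bool demo_swapoff_may_proceed(const struct demo_swap_info *si)
{
	return si->users == 0;
}
```

In this model, a caller shaped like free_swap_and_cache() would hold the reference across its use of the map. That corresponds to the put_swap_device(p) call the patch adds at the end of the `if (p)` block.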


[PATCH v4 0/3] Disable automatic load CCS load balancing

by Andi Shyti

Hi,

I have to admit that v3 was a lazy attempt. This one should be on the
right path.

This series does basically two things:

1. Disables automatic load balancing as advised by the hardware
   workaround.

2. Assigns all the CCS slices to one single user engine. The user will
   then be able to query only one CCS engine.

I'm using here the "Requires: " tag, but I'm not sure the commit id will
be valid; on the other hand, I don't know what commit id I should use.

Thanks Tvrtko, Matt, John and Joonas for your reviews!

Andi

Changelog
=========
v3 -> v4
 - Reword correctly the comment in the workaround
 - Fix a buffer overflow (Thanks Joonas)
 - Handle properly the fused engines when setting the CCS mode.

v2 -> v3
 - Simplified the algorithm for creating the list of the exported uabi
   engines. (Patch 1) (Thanks, Tvrtko)
 - Consider the fused engines when creating the uabi engine list
   (Patch 2) (Thanks, Matt)
 - Patch 4 now uses the refactoring from patch 1, for a cleaner outcome.

v1 -> v2
 - In Patch 1 use the correct workaround number (thanks Matt).
 - In Patch 2 do not add the extra CCS engines to the exposed UABI
   engine list and adapt the engine counting accordingly (thanks Tvrtko).
 - Reword the commit of Patch 2 (thanks John).

Andi Shyti (3):
  drm/i915/gt: Disable HW load balancing for CCS
  drm/i915/gt: Refactor uabi engine class/instance list creation
  drm/i915/gt: Enable only one CCS for compute workload

 drivers/gpu/drm/i915/gt/intel_engine_user.c | 40 ++++++++++++++-------
 drivers/gpu/drm/i915/gt/intel_gt.c          | 23 ++++++++++++
 drivers/gpu/drm/i915/gt/intel_gt_regs.h     |  6 ++++
 drivers/gpu/drm/i915/gt/intel_workarounds.c |  5 +++
 4 files changed, 62 insertions(+), 12 deletions(-)

-- 
2.43.0


Fwd: Continuous ACPI errors resulting in high CPU usage by journald

by Bagas Sanjaya

Hi,

On Bugzilla, danilrybakov249(a)gmail.com reported a stable-specific ACPI
error regression that led to high CPU temperature [1]. He wrote:

> Overview:
>
> After updating from lts v6.6.14-2 to lts v6.6.17-1 noticed high CPU temperature and lag. After running htop noticed that journald was using 30-60% of CPU. Afterwards, tried switching to stable, or lts v6.6.18-1, but encountered the same issue.
>
> Running journalctl -f gives these lines over and over again:
>
> Feb 19 21:09:12 danirybe kernel: ACPI Error: Could not disable RealTimeClock events (20230628/evxfevnt-243)
> Feb 19 21:09:12 danirybe kernel: ACPI Error: No handler or method for GPE 08, disabling event (20230628/evgpe-839)
> Feb 19 21:09:12 danirybe kernel: ACPI Error: No handler or method for GPE 0A, disabling event (20230628/evgpe-839)
> Feb 19 21:09:12 danirybe kernel: ACPI Error: No handler or method for GPE 0B, disabling event (20230628/evgpe-839)
> Feb 19 21:09:12 danirybe kernel: ACPI Error: No installed handler for fixed event - PM_Timer (0), disabling (20230628/evevent-255)
> Feb 19 21:09:12 danirybe kernel: ACPI Error: No installed handler for fixed event - PowerButton (2), disabling (20230628/evevent-255)
> Feb 19 21:09:12 danirybe kernel: ACPI Error: No installed handler for fixed event - SleepButton (3), disabling (20230628/evevent-255)
>
> My system info:
>
> Laptop model: ASUS VivoBook D540NV-GQ065T
> OS: Arch Linux x86_64
> Kernel: 6.6.14-2-lts
> WM: sway
> CPU: Intel Pentium N420 (4) @ 2.500GHz
> GPU1: Intel Apollo Lake [HD Graphics 505]
> GPU2: NVIDIA GeForce 920MX
>
> I've pinned down the commit after which the problem occurs:
>
> 847e1eb30e269a094da046c08273abe3f3361cf2 is the first bad commit
> commit 847e1eb30e269a094da046c08273abe3f3361cf2
> Author: Shin'ichiro Kawasaki <shinichiro.kawasaki(a)wdc.com>
> Date:   Mon Jan 8 15:20:58 2024 +0900
>
>     platform/x86: p2sb: Allow p2sb_bar() calls during PCI device probe
>
>     commit 5913320eb0b3ec88158cfcb0fa5e996bf4ef681b upstream.
>
> <snipped>...

See Bugzilla for the full thread.

Thanks.

[1]: https://bugzilla.kernel.org/show_bug.cgi?id=218531

-- 
An old man doll... just what I always wanted! - Clara


[PATCH] can: mcp251xfd: fix infinite loop when xmit fails

by Vitor Soares

From: Vitor Soares <vitor.soares(a)toradex.com>

When the mcp251xfd_start_xmit() function fails, the driver stops
processing messages, and the interrupt routine does not return,
running indefinitely even after killing the running application.

Error messages:
[ 441.298819] mcp251xfd spi2.0 can0: ERROR in mcp251xfd_start_xmit: -16
[ 441.306498] mcp251xfd spi2.0 can0: Transmit Event FIFO buffer not empty. (seq=0x000017c7, tef_tail=0x000017cf, tef_head=0x000017d0, tx_head=0x000017d3).
... and repeat forever.

The issue can be triggered when multiple devices share the same SPI
interface and there is concurrent access to the bus.

The problem occurs because tx_ring->head increments even if
mcp251xfd_start_xmit() fails. Consequently, the driver skips one
TX package while still expecting a response in
mcp251xfd_handle_tefif_one().

This patch resolves the issue by decreasing tx_ring->head if
mcp251xfd_start_xmit() fails. With the fix, if we attempt to trigger
the issue again, the driver prints an error and discards the message.

Fixes: 55e5b97f003e ("can: mcp25xxfd: add driver for Microchip MCP25xxFD SPI CAN")
Cc: stable(a)vger.kernel.org
Signed-off-by: Vitor Soares <vitor.soares(a)toradex.com>
---
 drivers/net/can/spi/mcp251xfd/mcp251xfd-tx.c | 27 ++++++++++----------
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-tx.c b/drivers/net/can/spi/mcp251xfd/mcp251xfd-tx.c
index 160528d3cc26..a8eb941c1b95 100644
--- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-tx.c
+++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-tx.c
@@ -181,25 +181,26 @@ netdev_tx_t mcp251xfd_start_xmit(struct sk_buff *skb,
 	tx_obj = mcp251xfd_get_tx_obj_next(tx_ring);
 	mcp251xfd_tx_obj_from_skb(priv, tx_obj, skb, tx_ring->head);

-	/* Stop queue if we occupy the complete TX FIFO */
 	tx_head = mcp251xfd_get_tx_head(tx_ring);
-	tx_ring->head++;
-	if (mcp251xfd_get_tx_free(tx_ring) == 0)
-		netif_stop_queue(ndev);
-
 	frame_len = can_skb_get_frame_len(skb);
-	err = can_put_echo_skb(skb, ndev, tx_head, frame_len);
-	if (!err)
-		netdev_sent_queue(priv->ndev, frame_len);
+	can_put_echo_skb(skb, ndev, tx_head, frame_len);
+
+	tx_ring->head++;

 	err = mcp251xfd_tx_obj_write(priv, tx_obj);
-	if (err)
-		goto out_err;
+	if (err) {
+		can_free_echo_skb(ndev, tx_head, NULL);

-	return NETDEV_TX_OK;
+		tx_ring->head--;
+
+		netdev_err(priv->ndev, "ERROR in %s: %d\n", __func__, err);
+	} else {
+		/* Stop queue if we occupy the complete TX FIFO */
+		if (mcp251xfd_get_tx_free(tx_ring) == 0)
+			netif_stop_queue(ndev);

- out_err:
-	netdev_err(priv->ndev, "ERROR in %s: %d\n", __func__, err);
+		netdev_sent_queue(priv->ndev, frame_len);
+	}

 	return NETDEV_TX_OK;
 }
-- 
2.34.1
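The rollback described in this patch can be illustrated outside the driver with a small userspace model. This is only a sketch under stated assumptions: struct tx_ring, tx_ring_free(), tx_ring_xmit() and the hw_write_* stubs are hypothetical names invented for illustration, not mcp251xfd code, and the ring is reduced to two free-running counters. The value -16 mirrors the -EBUSY error shown in the log above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified TX ring with free-running head/tail counters. */
#define TX_RING_SIZE 4u

struct tx_ring {
	uint32_t head;	/* next slot the transmit path will fill */
	uint32_t tail;	/* next slot the completion handler expects */
};

/* Number of free slots, derived from the two counters. */
static uint32_t tx_ring_free(const struct tx_ring *ring)
{
	return TX_RING_SIZE - (ring->head - ring->tail);
}

/* Stand-in for the bus write: 0 on success, negative errno on failure. */
typedef int (*hw_write_fn)(uint32_t slot);

/*
 * Queue one frame. The head is advanced before the write and rolled
 * back if the write fails, so head never counts a frame that the
 * hardware did not receive.
 */
static int tx_ring_xmit(struct tx_ring *ring, hw_write_fn hw_write)
{
	uint32_t slot = ring->head % TX_RING_SIZE;
	int err;

	assert(ring->head - ring->tail <= TX_RING_SIZE);

	if (tx_ring_free(ring) == 0)
		return -1;	/* ring full: nothing is queued */

	ring->head++;

	err = hw_write(slot);
	if (err) {
		ring->head--;	/* undo: the frame is dropped, not pending */
		return err;
	}

	return 0;
}

/* Two hypothetical write stubs: one succeeds, one reports -EBUSY (-16). */
static int hw_write_ok(uint32_t slot)   { (void)slot; return 0; }
static int hw_write_busy(uint32_t slot) { (void)slot; return -16; }
```

Calling tx_ring_xmit() with hw_write_busy leaves head where it was, so a completion handler that compares its own counter against head would not wait for a frame that was never sent. Without the decrement, head would stay one ahead of what the hardware actually holds, which is the mismatch the commit message describes.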


[PATCH] tpm,tpm_tis: Avoid warning splat at shutdown

by Lino Sanfilippo

If interrupts are not activated the work struct 'free_irq_work' is not
initialized. This results in a warning splat at module shutdown.

Fix this by always initializing the work regardless of whether interrupts
are activated or not.

cc: stable(a)vger.kernel.org
Fixes: 481c2d14627d ("tpm,tpm_tis: Disable interrupts after 1000 unhandled IRQs")
Reported-by: Jarkko Sakkinen <jarkko(a)kernel.org>
Closes: https://lore.kernel.org/all/CX32RFOMJUQ0.3R4YCL9MDCB96@kernel.org/
Signed-off-by: Lino Sanfilippo <l.sanfilippo(a)kunbus.com>
---
 drivers/char/tpm/tpm_tis_core.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
index 1b350412d8a6..64c875657687 100644
--- a/drivers/char/tpm/tpm_tis_core.c
+++ b/drivers/char/tpm/tpm_tis_core.c
@@ -919,8 +919,6 @@ static int tpm_tis_probe_irq_single(struct tpm_chip *chip, u32 intmask,
 	int rc;
 	u32 int_status;

-	INIT_WORK(&priv->free_irq_work, tpm_tis_free_irq_func);
-
 	rc = devm_request_threaded_irq(chip->dev.parent, irq, NULL,
 				       tis_int_handler, IRQF_ONESHOT | flags,
 				       dev_name(&chip->dev), chip);
@@ -1132,6 +1130,7 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
 	priv->phy_ops = phy_ops;
 	priv->locality_count = 0;
 	mutex_init(&priv->locality_count_mutex);
+	INIT_WORK(&priv->free_irq_work, tpm_tis_free_irq_func);

 	dev_set_drvdata(&chip->dev, priv);


base-commit: 41bccc98fb7931d63d03f326a746ac4d429c1dd3
-- 
2.43.0
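The ordering problem fixed here can be modelled in plain userspace C. This is a rough sketch, not kernel code: struct work, work_init(), work_cancel(), struct dev_priv and the core_* functions are hypothetical stand-ins, and it assumes that teardown touches the work item whether or not interrupts were enabled, which is what the warning at shutdown suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel work item. */
struct work {
	void (*fn)(struct work *w);
	bool initialized;
};

static void work_init(struct work *w, void (*fn)(struct work *w))
{
	assert(fn != NULL);
	w->fn = fn;
	w->initialized = true;
}

/* Returns false where the kernel would warn about an uninitialized item. */
static bool work_cancel(const struct work *w)
{
	return w->initialized;
}

struct dev_priv {
	bool irq_enabled;
	struct work free_irq_work;
};

static void free_irq_func(struct work *w) { (void)w; }

/* Ordering before the fix: the work is set up only on the IRQ path. */
static void core_init_old(struct dev_priv *priv, bool use_irq)
{
	priv->irq_enabled = false;
	priv->free_irq_work = (struct work){ 0 };
	if (use_irq) {
		work_init(&priv->free_irq_work, free_irq_func);
		priv->irq_enabled = true;
	}
}

/* Ordering after the fix: the work is set up unconditionally. */
static void core_init_new(struct dev_priv *priv, bool use_irq)
{
	priv->free_irq_work = (struct work){ 0 };
	work_init(&priv->free_irq_work, free_irq_func);
	priv->irq_enabled = use_irq;
}

/* Teardown touches the work item regardless of irq_enabled (assumed). */
static bool core_remove(const struct dev_priv *priv)
{
	return work_cancel(&priv->free_irq_work);
}
```

With core_init_old() and use_irq false, core_remove() reports the uninitialized item; with core_init_new() it does not, whichever way use_irq is set. Moving the initialization into the always-executed init path means teardown needs no knowledge of whether the IRQ path ever ran.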
