I've run into some problems that appear to be caused by one or more recent patches to the wlcore wifi driver.
4.4.160 - commit 3fdd34643ffc378b5924941fad40352c04610294
4.9.131 - commit afeeecc764436f31d4447575bb9007732333818c
Earlier versions (4.9.130 and 4.4.159 - tested back to 4.4.49) do not exhibit this problem. It is still present in 4.9.141.
master as of 4.20.0-rc4 does not exhibit this problem.
Basically, during client association in AP mode (running hostapd), the handshake may or may not complete, following a noticeable delay. If it succeeds, the driver then consistently hits a warning (warn_slowpath_null) during disassociation. If it fails, the wifi client retries multiple times, sometimes failing repeatedly. I've had clients unable to connect for 3-5 minutes during testing, with the syslog filled with dozens of backtraces. Syslog details are below.
I'm working on an embedded device with a TI AM3352 ARM processor and a Murata wl1271 module in SDIO mode. We're running a fully patched Ubuntu 18.04 ARM build, with a kernel built from kernel.org's stable/linux repo https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-4.9.y&id=afeeecc764436f31d4447575bb9007732333818c. Relevant parts of the kernel config are included below.
The commit message states:
/I've only seen this few times with the runtime PM patches enabled so this one is probably not needed before that. This seems to work currently based on the current PM implementation timer. Let's apply this separately though in case others are hitting this issue./
We're not doing anything explicit with power management. The device is an IoT edge gateway with battery backup, normally running on wall power. The battery is currently used solely to shut down the system cleanly to avoid filesystem corruption.
The device tree is configured to keep power in suspend; but the device should never suspend, so in our case, there is no need to call wl1271_ps_elp_wakeup() or wl1271_ps_elp_sleep(), as occurs in the patch.
&mmc2 {
	status = "okay";
	pinctrl-names = "default";
	pinctrl-0 = <&wl1271_pins>;
	vmmc-supply = <&vwifi>;
	bus-width = <4>;
	ti,non-removable;
	/* am335x-evm.dts: ti,needs-special-hs-handling; - evm has wl18xx not wl12xx */
	cap-power-off-card;
	keep-power-in-suspend;

	#address-cells = <1>;
	#size-cells = <0>;
	wlcore: wlcore@2 {
		compatible = "ti,wl1271";
		reg = <2>;
		interrupt-parent = <&gpio1>;
		interrupts = <14 IRQ_TYPE_LEVEL_HIGH>; /* gpio1[14] */
		ref-clock-frequency = <38400000>;
	};
};
At this point, we're unable to ship a kernel version later than 4.9.130; so it's important to us to get this issue resolved.
The simplest thing for us would be if these changes could be reverted; but I'd be happy to debug or try some things out.
Thanks,
Dietmar May
Software Architect
Intellastar LLC
_Association_
Nov 16 15:25:52 ice hostapd: wlan0: STA 84:3a:4b:00:8d:04 IEEE 802.11: authenticated
Nov 16 15:25:52 ice hostapd: wlan0: STA 84:3a:4b:00:8d:04 IEEE 802.11: associated (aid 1)
Nov 16 15:25:52 ice hostapd: wlan0: STA 84:3a:4b:00:8d:04 RADIUS: starting accounting session 5BEEE158-00000000
Nov 16 15:25:52 ice hostapd: wlan0: STA 84:3a:4b:00:8d:04 WPA: pairwise key handshake completed (RSN)
_Disassociation_
Nov 16 15:26:05 ice kernel: ------------[ cut here ]------------
Nov 16 15:26:05 ice kernel: WARNING: CPU: 0 PID: 1067 at drivers/net/wireless/ti/wlcore/ps.c:91 wl12xx_op_sta_state+0x208/0x56c [wlcore]
Nov 16 15:26:05 ice kernel: Modules linked in: bridge stp llc cdc_ncm usbnet mii cdc_acm usb_serial_simple usbserial bnep hci_uart bluetooth xt_conntrack iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack arc4 wl12xx wlcore mac80211 cfg80211 musb_dsps musb_hdrc usbcore phy_am335x cppi41 phy_am335x_control phy_generic usb_common ti_am335x_adc kfifo_buf industrialio wlcore_sdio omap_rng rng_core musb_am335x rtc_omap omap_wdt ti_am335x_tscadc cpufreq_dt leds_gpio led_class thermal_sys hwmon autofs4
Nov 16 15:26:05 ice kernel: CPU: 0 PID: 1067 Comm: hostapd Not tainted 4.9.131-ice245 #1
Nov 16 15:26:05 ice kernel: Hardware name: Generic AM33XX (Flattened Device Tree)
Nov 16 15:26:05 ice kernel: [<c010d2e0>] (unwind_backtrace) from [<c010b50c>] (show_stack+0x10/0x14)
Nov 16 15:26:05 ice kernel: [<c010b50c>] (show_stack) from [<c012e3ac>] (__warn+0xd8/0x100)
Nov 16 15:26:05 ice kernel: [<c012e3ac>] (__warn) from [<c012e480>] (warn_slowpath_null+0x20/0x28)
Nov 16 15:26:05 ice kernel: [<c012e480>] (warn_slowpath_null) from [<bf2e521c>] (wl12xx_op_sta_state+0x208/0x56c [wlcore])
Nov 16 15:26:05 ice kernel: [<bf2e521c>] (wl12xx_op_sta_state [wlcore]) from [<bf20dbf8>] (drv_sta_state+0x84/0x6c8 [mac80211])
Nov 16 15:26:05 ice kernel: [<bf20dbf8>] (drv_sta_state [mac80211]) from [<bf215284>] (__sta_info_destroy_part2+0x160/0x1b4 [mac80211])
Nov 16 15:26:05 ice kernel: [<bf215284>] (__sta_info_destroy_part2 [mac80211]) from [<bf2152f8>] (__sta_info_destroy+0x20/0x28 [mac80211])
Nov 16 15:26:05 ice kernel: [<bf2152f8>] (__sta_info_destroy [mac80211]) from [<bf21537c>] (sta_info_destroy_addr_bss+0x30/0x4c [mac80211])
Nov 16 15:26:05 ice kernel: [<bf21537c>] (sta_info_destroy_addr_bss [mac80211]) from [<bf171c70>] (nl80211_del_station+0xe8/0x2b8 [cfg80211])
Nov 16 15:26:05 ice kernel: [<bf171c70>] (nl80211_del_station [cfg80211]) from [<c05535c4>] (genl_rcv_msg+0x308/0x3e4)
Nov 16 15:26:05 ice kernel: [<c05535c4>] (genl_rcv_msg) from [<c05527a0>] (netlink_rcv_skb+0xa4/0xe8)
Nov 16 15:26:05 ice kernel: [<c05527a0>] (netlink_rcv_skb) from [<c05532a8>] (genl_rcv+0x20/0x34)
Nov 16 15:26:05 ice kernel: [<c05532a8>] (genl_rcv) from [<c0552100>] (netlink_unicast+0x168/0x1f4)
Nov 16 15:26:05 ice kernel: [<c0552100>] (netlink_unicast) from [<c0552540>] (netlink_sendmsg+0x2e8/0x378)
Nov 16 15:26:05 ice kernel: [<c0552540>] (netlink_sendmsg) from [<c050396c>] (sock_sendmsg+0x14/0x24)
Nov 16 15:26:05 ice kernel: [<c050396c>] (sock_sendmsg) from [<c05041f8>] (___sys_sendmsg+0x1ec/0x200)
Nov 16 15:26:05 ice kernel: [<c05041f8>] (___sys_sendmsg) from [<c0504fa0>] (__sys_sendmsg+0x40/0x6c)
Nov 16 15:26:05 ice kernel: [<c0504fa0>] (__sys_sendmsg) from [<c0107560>] (ret_fast_syscall+0x0/0x1c)
Nov 16 15:26:05 ice kernel: ---[ end trace 44f73265865f31c4 ]---
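For anyone trying to check whether their own logs show the same warning, here is a minimal sketch (not part of the original testing) that counts kernel WARNING headers per source location in a syslog file. The regular expression is an assumption based on the header line format shown in the trace above, and the names count_warnings and report are hypothetical helpers.

```python
import re
from collections import Counter

# Matches a kernel WARNING header line such as:
#   "... kernel: WARNING: CPU: 0 PID: 1067 at drivers/.../ps.c:91 wl12xx_op_sta_state+0x208/0x56c [wlcore]"
# The pattern is an assumption derived from the trace format in this report.
WARN_RE = re.compile(
    r"kernel: WARNING: CPU: \d+ PID: \d+ at (?P<site>\S+:\d+) (?P<func>[^+\s]+)\+"
)


def count_warnings(lines):
    """Count WARNING headers per (source file:line, function) pair."""
    counts = Counter()
    for line in lines:
        match = WARN_RE.search(line)
        if match:
            counts[(match.group("site"), match.group("func"))] += 1
    return counts


def report(path):
    """Print one summary line per warning site found in the log at `path`."""
    with open(path, errors="replace") as log:
        for (site, func), n in count_warnings(log).most_common():
            print(f"{n:5d}  {site}  {func}")
```

Calling report() with the path to a syslog file (for example /var/log/syslog on Ubuntu) prints one line per warning site, which makes it easy to compare a log from 4.9.130 against one from 4.9.131.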
CONFIG_FW_LOADER=y
CONFIG_FIRMWARE_IN_KERNEL=y
CONFIG_EXTRA_FIRMWARE="am335x-pm-firmware.elf"
CONFIG_AM335X_PHY_USB=m
CONFIG_ARCH_MULTI_V6=y
CONFIG_ARCH_OMAP2PLUS=y
CONFIG_ARCH_OMAP2=y
CONFIG_ARCH_OMAP3=y
CONFIG_ARCH_OMAP=y
CONFIG_ARM_APPENDED_DTB=y
CONFIG_ARM_ATAG_DTB_COMPAT=y
CONFIG_ARM_CRYPTO=y
CONFIG_ARM_ERRATA_411920=y
CONFIG_ARM_ERRATA_430973=y
CONFIG_ARM_THUMBEE=y
CONFIG_CFG80211=m
CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y
CONFIG_CPUFREQ_DT=m
CONFIG_CPU_FREQ_GOV_CONSERVATIVE=y
CONFIG_CPU_FREQ_GOV_POWERSAVE=y
CONFIG_CPU_FREQ_GOV_USERSPACE=y
CONFIG_CPU_FREQ_STAT_DETAILS=y
CONFIG_CPU_FREQ=y
CONFIG_CPU_IDLE=y
CONFIG_CPUSETS=y
CONFIG_CPU_THERMAL=y
CONFIG_DMA_CMA=y
CONFIG_DMADEVICES=y
CONFIG_DMA_OMAP=y
CONFIG_MAC80211=m
CONFIG_MMC_OMAP_HS=y
CONFIG_MMC_OMAP=y
CONFIG_MMC=y
CONFIG_OMAP3_THERMAL=y
CONFIG_OMAP_IOMMU=y
CONFIG_OMAP_MUX_DEBUG=n
CONFIG_OMAP_OCP2SCP=y
CONFIG_OMAP_RESET_CLOCKS=y
CONFIG_OMAP_SSI=m
CONFIG_OMAP_USB2=m
CONFIG_OMAP_WATCHDOG=m
CONFIG_POWER_AVS_OMAP_CLASS3=y
CONFIG_POWER_AVS_OMAP=y
CONFIG_POWER_AVS=y
CONFIG_POWER_RESET=y
CONFIG_SLUB=y
CONFIG_SOC_AM33XX=y
CONFIG_SOC_TI=y
CONFIG_THERMAL_GOV_FAIR_SHARE=y
CONFIG_THERMAL_GOV_USER_SPACE=y
CONFIG_THERMAL=m
CONFIG_TI_AM335X_ADC=m
CONFIG_TI_CPSW=y
CONFIG_TI_CPTS=y
CONFIG_TI_DAVINCI_EMAC=y
CONFIG_TI_EDMA=y
CONFIG_TI_EMIF=m
CONFIG_TIMER_STATS=y
CONFIG_TI_PIPE3=y
CONFIG_TI_SOC_THERMAL=m
CONFIG_TI_THERMAL=y
CONFIG_WIRELESS=y
CONFIG_WL12XX=m
CONFIG_WL18XX=m
CONFIG_WLAN=y
CONFIG_WLCORE_SDIO=m
CONFIG_WLCORE_SPI=m
CONFIG_WL_TI=y
On Thu, Nov 29, 2018 at 05:56:31PM -0500, Dietmar May wrote:
The device tree is configured to keep power in suspend; but the device should never suspend, so in our case, there is no need to call wl1271_ps_elp_wakeup() or wl1271_ps_elp_sleep(), as occurs in the patch.
Given that this patch went in through AUTOSEL, I've queued up a revert of it (sorry for the trouble!).
I'll link this mail in the revert message. If anyone feels that this patch should be in any of the LTS trees then either reply to this thread or start a new one on stable@vger.kernel.org.
--
Thanks,
Sasha
Sasha,
I've verified that 4.9.143 no longer exhibits this problem.
The revert hasn't shown up in 4.4 yet; but I'll verify once merged there.
Thanks,
Dietmar

------------------------------------------------------------------------
On 12/2/18 10:08 AM, Sasha Levin wrote:
Given that this patch went in through AUTOSEL, I've queued up a revert of it (sorry for the trouble!).
I'll link this mail in the revert message. If anyone feels that this patch should be in any of the LTS trees then either reply to this thread or start a new one on stable@vger.kernel.org.
--
Thanks,
Sasha
linux-stable-mirror@lists.linaro.org