Hi Mika,
I'm following along with attempts to "fix" our user space to paper over this issue, and I think some of this conversation missed the mark. (Sorry for jumping in late.)
On Tue, Mar 10, 2020 at 04:49:13PM +0200, Mika Westerberg wrote:
On Tue, Mar 10, 2020 at 03:12:00PM +0100, Michał Stanek wrote:
On Mon, Feb 10, 2020 at 11:14 AM Mika Westerberg <mika.westerberg@linux.intel.com> wrote:
On Sat, Feb 08, 2020 at 07:43:24PM +0100, Michał Stanek wrote:
Hi Mika,
The previous patches from Dmitry handled IRQ numbering, here we have a similar issue with GPIO to pin translation - hardcoded values in FW which do not agree with the (non-consecutive) numbering in newer kernels.
Hmm, so instead of passing GpioIo/GpioInt resources to devices the firmware uses some hard-coded Linux GPIO numbering scheme? Would you be able to share the exact firmware description where this happens?
Actually it is a GPIO offset in ACPI tables for Braswell that was hardcoded in the old firmware to match the previous (consecutive) Linux GPIO numbering.
Can you share the ACPI tables and point me to the GPIO that is using a Linux number?
I think this is the one: https://chromium-review.googlesource.com/c/chromiumos/third_party/coreboot/%...
On Kefka the sysfs GPIO number for wpsw_cur was gpio392 before the translation change occurred in Linux.
But that table does not seem to have any GPIO numbers in it.
Actually, it's encoding pin numbers, not GPIO numbers. The 0x10016 (or now, 0x10013) is encoding a bank offset (0x10000) and pin number (0x16 or 0x13). The actual pin number is 0x16, I believe, but someone decided to subtract 3, because the Linux numbering used to be contiguous, skipping over the hole between 11 and 15.
So no, nobody was hard-coding gpiochip numbers -- we were hard-coding the contiguous pin number (relative to the bank). Now that commit 03c4749dd6c7ff94 ("gpio / ACPI: Drop unnecessary ACPI GPIO to Linux GPIO translation") made those non-contiguous, we're kinda screwed -- we have to guess (based on the kernel version number) whether pin numbers (within a single bank!) are contiguous or not.
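To make the encoding concrete, here is a minimal sketch in Python. All names are hypothetical, the 16-bit split is an assumption based on the 0x10000 bank offset above, and the hole at pins 12..14 is inferred from "the hole between 11 and 15" plus the subtraction of 3. It is not taken from the coreboot or kernel sources.

```python
# Illustrative only: layout and names are assumptions inferred from this thread.
PIN_MASK = 0xFFFF  # assumption: the low 16 bits carry the pin number

# Assumption: the bank has no pins 12..14, so the old contiguous numbering
# skipped three entries for every pin above the hole.
HOLE_START = 12
HOLE_END = 14  # inclusive


def decode(value):
    """Split an encoded ACPI value into (bank_offset, pin)."""
    return value & ~PIN_MASK, value & PIN_MASK


def contiguous_offset(pin):
    """Bank-relative offset under the old, gap-free numbering."""
    if HOLE_START <= pin <= HOLE_END:
        raise ValueError("pin %d falls inside the hole" % pin)
    hole_size = HOLE_END - HOLE_START + 1
    return pin - hole_size if pin > HOLE_END else pin


def sparse_offset(pin):
    """Bank-relative offset under the new numbering: the pin number itself."""
    return pin


if __name__ == "__main__":
    for encoded in (0x10016, 0x10013):
        bank, pin = decode(encoded)
        print(hex(encoded), "-> bank offset", hex(bank), "pin", pin)
    print("pin 22: contiguous", contiguous_offset(22), "sparse", sparse_offset(22))
```

Under these assumptions the same hardware pin 22 sits at offset 19 in the old scheme and offset 22 in the new one, which is exactly the ambiguity being discussed.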
This is something that should be fixed in userspace. Using global Linux GPIO or IRQ numbers is fragile and a source of issues like this.
To be clear, we're not hard-coding global <anything> numbers in user space.
in case of sysfs, you can find the base of the chip
We're doing that.
and then use relative numbering against it or switch
^^ This is the problem. The *bank-relative* numbers changed.
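For reference, a rough sketch of that kind of lookup against the legacy sysfs GPIO interface, in Python. The function names and the chip label are hypothetical; only the `gpiochipN/label` and `gpiochipN/base` attributes come from the sysfs ABI. This is not crossystem's actual code.

```python
import os


def find_chip_base(label, sysfs_root="/sys/class/gpio"):
    """Return the base GPIO number of the chip with the given label, or None."""
    for entry in sorted(os.listdir(sysfs_root)):
        if not entry.startswith("gpiochip"):
            continue
        chip_dir = os.path.join(sysfs_root, entry)
        try:
            with open(os.path.join(chip_dir, "label")) as f:
                chip_label = f.read().strip()
            with open(os.path.join(chip_dir, "base")) as f:
                base = int(f.read().strip())
        except (OSError, ValueError):
            continue  # skip entries that are not readable chip directories
        if chip_label == label:
            return base
    return None


def sysfs_gpio_number(label, offset, sysfs_root="/sys/class/gpio"):
    """Compute base + offset; nothing here depends on the gpiochipN name."""
    base = find_chip_base(label, sysfs_root)
    if base is None:
        raise LookupError("no GPIO chip labelled %r" % label)
    return base + offset
```

Even with the base discovered at runtime, the result is only right if `offset` matches the kernel's bank-relative numbering. Passing 19 versus 22 for the same physical pin yields two different sysfs GPIOs.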
In both cases the GPIO numbers are relative to the GPIO chip, so they work even if global Linux GPIO numbering changes.
I analyzed crossystem source code and it looks like it is doing exactly what you're saying without any hardcoded assumptions.
^^ Exactly.
With the newer kernel the gpiochip%d number is different so crossystem ends up reading the wrong pin.
Hmm, so gpiochipX is also not considered a stable number. It is based on ARCH_NR_GPIOS which may change. So if the userspace is relying on a certain GPIO chip always being gpiochip200, for example, then it is wrong.
If you just read the last sentence from Michal, you get the wrong picture. There's no hard-coding of gpiochipX numbers going on. We only had the pin offsets "hardcoded" (in ACPI), and the kernel driver unilaterally changed from a contiguous mapping to a non-contiguous mapping.
How do you recommend determining (both pre- and post-commit-03c4749dd6c7ff94) whether pin 22 is at offset 22, vs. offset 19?
Brian