Hi,
On 1/31/20 6:17 PM, Z R wrote:
> Hi Benjamin,
> in last log touchpad.log (omg, should be keyboard.log), I pressed fn-f3 multiple times. I got one two liner:
>
> # ReportID: 3 / Wireless Radio Button: 0 | #
> E: 000007.606583 2 03 00
>
> every time i release f3. Does not matter what is happening with fn-key. (could be released already or still pushed down). This log was collected with last patch from Hans applied.
>
> Sorry for the mess I caused :-) I still don't get how you guys manage to have your emails so well polished and readable.
I don't think you've, caused a mess. Thank you both for the bug report and for your
help in testing this.
One last test request, can you run evemu-record on a kernel with my latest simplified patch,
select the keyboard device and then press (and release) the "rfkill" key and see it generates
a RF_KILL key press + release evdev event?
What would also be helpful is the output of:
ls -l /sys/bus/hid/devices/0003*/driver
Regards,
Hans
> pá 31. 1. 2020 v 18:00 odesílatel Benjamin Tissoires <benjamin.tissoires(a)redhat.com <mailto:benjamin.tissoires@redhat.com>> napsal:
>
> On Fri, Jan 31, 2020 at 5:09 PM Z R <zdenda.rampas(a)gmail.com <mailto:zdenda.rampas@gmail.com>> wrote:
> >
> > I believe I pressed wifi button on both replays for keyboard.
>
> Yep, I can see that. Just to double check, on the last log, you
> pressed the wifi button twice?
>
> Anyway, thanks for all the logs, I should have enough to implement the
> regression tests.
>
> Cheers,
> Benjamin
>
> >
> > With latest patch from Hans on top of v5.5.0 touchpads "two finger scrolling" is working again. Attaching current hid-record for keyboard with Wifi button pressed. Events in log appeared after f3 button was "released".
> >
> > Thanks
> >
> > Zdeněk
> >
> > pá 31. 1. 2020 v 16:45 odesílatel Hans de Goede <hdegoede(a)redhat.com <mailto:hdegoede@redhat.com>> napsal:
> >>
> >> Hi,
> >>
> >> On 1/31/20 4:38 PM, Z R wrote:
> >> > Hi Benjamin,
> >> > hid-record for keyboard and touchpad. With Commit 8f18eca9ebc5 reverted and from unmodified kernel.
> >> >
> >> > I hope it is what you asked for :-)
> >> >
> >> > Currently waiting for reworked patch from Hans.
> >>
> >> I just send you the reworked patch.
> >>
> >> Does the recordning include pressing of the wlan on/off key (Fn + F3 I believe) ?
> >> That is the whole reason why the special hid-ite driver is necessary.
> >>
> >> Benjamin about the wlan on/off key. AFAICR on a press + release of the key a
> >> single hid input report for the generic-desktop Wireless Radio Controls group
> >> is send. This input-report only has the one button with usage code HID_GD_RFKILL_BTN
> >> in there and it is always 0. It is as if the input-report is only send on release
> >> and not on press. So the hid-ite code emulates a press + release whenever the
> >> input-report is send.
> >>
> >> IOW the receiving of the input report is (ab)used as indication of the button
> >> having been pressed.
> >>
> >> Regards,
> >>
> >> Hans
> >>
> >>
> >> >
> >> > Bye for now
> >> > Zdeněk
> >> >
> >> > pá 31. 1. 2020 v 15:12 odesílatel Benjamin Tissoires <benjamin.tissoires(a)redhat.com <mailto:benjamin.tissoires@redhat.com> <mailto:benjamin.tissoires@redhat.com <mailto:benjamin.tissoires@redhat.com>>> napsal:
> >> >
> >> > On Fri, Jan 31, 2020 at 3:04 PM Hans de Goede <hdegoede(a)redhat.com <mailto:hdegoede@redhat.com> <mailto:hdegoede@redhat.com <mailto:hdegoede@redhat.com>>> wrote:
> >> > >
> >> > > Hi,
> >> > >
> >> > > On 1/31/20 2:54 PM, Benjamin Tissoires wrote:
> >> > > > On Fri, Jan 31, 2020 at 2:41 PM Hans de Goede <hdegoede(a)redhat.com <mailto:hdegoede@redhat.com> <mailto:hdegoede@redhat.com <mailto:hdegoede@redhat.com>>> wrote:
> >> > > >>
> >> > > >> Hi,
> >> > > >>
> >> > > >> On 1/31/20 2:10 PM, Benjamin Tissoires wrote:
> >> > > >>> Hi Hans,
> >> > > >>>
> >> > > >>> On Fri, Jan 31, 2020 at 1:46 PM Hans de Goede <hdegoede(a)redhat.com <mailto:hdegoede@redhat.com> <mailto:hdegoede@redhat.com <mailto:hdegoede@redhat.com>>> wrote:
> >> > > >>>>
> >> > > >>>> Commit 8f18eca9ebc5 ("HID: ite: Add USB id match for Acer SW5-012 keyboard
> >> > > >>>> dock") added the USB id for the Acer SW5-012's keyboard dock to the
> >> > > >>>> hid-ite driver to fix the rfkill driver not working.
> >> > > >>>>
> >> > > >>>> Most keyboard docks with an ITE 8595 keyboard/touchpad controller have the
> >> > > >>>> "Wireless Radio Control" bits which need the special hid-ite driver on the
> >> > > >>>> second USB interface (the mouse interface) and their touchpad only supports
> >> > > >>>> mouse emulation, so using generic hid-input handling for anything but
> >> > > >>>> the "Wireless Radio Control" bits is fine. On these devices we simply bind
> >> > > >>>> to all USB interfaces.
> >> > > >>>>
> >> > > >>>> But unlike other ITE8595 using keyboard docks, the Acer Aspire Switch 10
> >> > > >>>> (SW5-012)'s touchpad not only does mouse emulation it also supports
> >> > > >>>> HID-multitouch and all the keys including the "Wireless Radio Control"
> >> > > >>>> bits have been moved to the first USB interface (the keyboard intf).
> >> > > >>>>
> >> > > >>>> So we need hid-ite to handle the first (keyboard) USB interface and have
> >> > > >>>> it NOT bind to the second (mouse) USB interface so that that can be
> >> > > >>>> handled by hid-multitouch.c and we get proper multi-touch support.
> >> > > >>>>
> >> > > >>>> This commit adds a match callback to hid-ite which makes it only
> >> > > >>>> match the first USB interface when running on the Acer SW5-012,
> >> > > >>>> fixing the regression to mouse-emulation mode introduced by adding the
> >> > > >>>> keyboard dock USB id.
> >> > > >>>>
> >> > > >>>> Note the match function only does the special only bind to the first
> >> > > >>>> USB interface on the Acer SW5-012, on other devices the hid-ite driver
> >> > > >>>> actually must bind to the second interface as that is where the
> >> > > >>>> "Wireless Radio Control" bits are.
> >> > > >>>
> >> > > >>> This is not a full review, but a couple of things that popped out
> >> > > >>> while scrolling through the patch.
> >> > > >>>
> >> > > >>>>
> >> > > >>>> Cc: stable(a)vger.kernel.org <mailto:stable@vger.kernel.org> <mailto:stable@vger.kernel.org <mailto:stable@vger.kernel.org>>
> >> > > >>>> Fixes: 8f18eca9ebc5 ("HID: ite: Add USB id match for Acer SW5-012 keyboard dock")
> >> > > >>>> Reported-by: Zdeněk Rampas <zdenda.rampas(a)gmail.com <mailto:zdenda.rampas@gmail.com> <mailto:zdenda.rampas@gmail.com <mailto:zdenda.rampas@gmail.com>>>
> >> > > >>>> Signed-off-by: Hans de Goede <hdegoede(a)redhat.com <mailto:hdegoede@redhat.com> <mailto:hdegoede@redhat.com <mailto:hdegoede@redhat.com>>>
> >> > > >>>> ---
> >> > > >>>> drivers/hid/hid-ite.c | 34 ++++++++++++++++++++++++++++++++++
> >> > > >>>> 1 file changed, 34 insertions(+)
> >> > > >>>>
> >> > > >>>> diff --git a/drivers/hid/hid-ite.c b/drivers/hid/hid-ite.c
> >> > > >>>> index c436e12feb23..69a4ddfd033d 100644
> >> > > >>>> --- a/drivers/hid/hid-ite.c
> >> > > >>>> +++ b/drivers/hid/hid-ite.c
> >> > > >>>> @@ -8,9 +8,12 @@
> >> > > >>>> #include <linux/input.h>
> >> > > >>>> #include <linux/hid.h>
> >> > > >>>> #include <linux/module.h>
> >> > > >>>> +#include <linux/usb.h>
> >> > > >>>>
> >> > > >>>> #include "hid-ids.h"
> >> > > >>>>
> >> > > >>>> +#define ITE8595_KBD_USB_INTF 0
> >> > > >>>> +
> >> > > >>>> static int ite_event(struct hid_device *hdev, struct hid_field *field,
> >> > > >>>> struct hid_usage *usage, __s32 value)
> >> > > >>>> {
> >> > > >>>> @@ -37,6 +40,36 @@ static int ite_event(struct hid_device *hdev, struct hid_field *field,
> >> > > >>>> return 0;
> >> > > >>>> }
> >> > > >>>>
> >> > > >>>> +static bool ite_match(struct hid_device *hdev, bool ignore_special_driver)
> >> > > >>>> +{
> >> > > >>>> + struct usb_interface *intf;
> >> > > >>>> +
> >> > > >>>> + if (ignore_special_driver)
> >> > > >>>> + return false;
> >> > > >>>> +
> >> > > >>>> + /*
> >> > > >>>> + * Most keyboard docks with an ITE 8595 keyboard/touchpad controller
> >> > > >>>> + * have the "Wireless Radio Control" bits which need this special
> >> > > >>>> + * driver on the second USB interface (the mouse interface). On
> >> > > >>>> + * these devices we simply bind to all USB interfaces.
> >> > > >>>> + *
> >> > > >>>> + * The Acer Aspire Switch 10 (SW5-012) is special, its touchpad
> >> > > >>>> + * not only does mouse emulation it also supports HID-multitouch
> >> > > >>>> + * and all the keys including the "Wireless Radio Control" bits
> >> > > >>>> + * have been moved to the first USB interface (the keyboard intf).
> >> > > >>>> + *
> >> > > >>>> + * We want the hid-multitouch driver to bind to the touchpad, so on
> >> > > >>>> + * the Acer SW5-012 we should only bind to the keyboard USB intf.
> >> > > >>>> + */
> >> > > >>>> + if (hdev->bus != BUS_USB || hdev->vendor != USB_VENDOR_ID_SYNAPTICS ||
> >> > > >>>> + hdev->product != USB_DEVICE_ID_SYNAPTICS_ACER_SWITCH5_012)
> >> > > >>>
> >> > > >>> Isn't there an existing matching function we can use here, instead of
> >> > > >>> checking each individual field?
> >> > > >>
> >> > > >> There is hid_match_one_id() but that is not exported (can be fixed) and it
> >> > > >> requires a struct hid_device_id, which either requires declaring an extra
> >> > > >> standalone struct hid_device_id for the SW5-012 kbd-dock, or hardcoding an
> >> > > >> index into the existing hid_device_id array for the driver (with the hardcoding
> >> > > >> being error prone, so not a good idea).
> >> > > >>
> >> > > >> Given the problems with using hid_match_one_id() I decided to just go with
> >> > > >> the above.
> >> > > >
> >> > > > right. An other solution would be to have a local macro/function that
> >> > > > does that. Because as soon as you start adding a quirk, an other comes
> >> > > > right after.
> >> > > >
> >> > > >>
> >> > > >> But see below.
> >> > > >>
> >> > > >>>
> >> > > >>>> + return true;
> >> > > >>>> +
> >> > > >>>> + intf = to_usb_interface(hdev->dev.parent);
> >> > > >>>
> >> > > >>> And this is oops-prone. You need:
> >> > > >>> - ensure hid_is_using_ll_driver(hdev, &usb_hid_driver) returns true.
> >> > > >>> - add a dependency on USBHID in the KConfig now that you are checking
> >> > > >>> on the USB transport layer.
> >> > > >>>
> >> > > >>> That being said, I would love instead:
> >> > > >>> - to have a non USB version of this match, where you decide which
> >> > > >>> component needs to be handled based on the report descriptor
> >> > > >>
> >> > > >> Actually your idea to use the desciptors is not bad, but since what
> >> > > >> we really want is to not bind to the interface which is marked for the
> >> > > >> hid-multitouch driver I just realized we can just check that.
> >> > > >>
> >> > > >> So how about:
> >> > > >>
> >> > > >> static bool ite_match(struct hid_device *hdev, bool ignore_special_driver)
> >> > > >> {
> >> > > >> if (ignore_special_driver)
> >> > > >> return false;
> >> > > >>
> >> > > >> /*
> >> > > >> * Some keyboard docks with an ITE 8595 keyboard/touchpad controller
> >> > > >> * support the HID multitouch protocol for the touchpad, in that
> >> > > >> * case the "Wireless Radio Control" bits which we care about are
> >> > > >> * on the other interface; and we should not bind to the multitouch
> >> > > >> * capable interface as that breaks multitouch support.
> >> > > >> */
> >> > > >> return hdev->group != HID_GROUP_MULTITOUCH_WIN_8;
> >> > > >> }
> >> > > >
> >> > > > Yep, I like that very much :)
> >> > >
> >> > > Actually if we want to check the group and there are only 2 interfaces we do
> >> > > not need to use the match callback at all, w e can simply match on the
> >> > > group of the interface which we do want:
> >> > >
> >> > > diff --git a/drivers/hid/hid-ite.c b/drivers/hid/hid-ite.c
> >> > > index db0f35be5a8b..21bd48f16033 100644
> >> > > --- a/drivers/hid/hid-ite.c
> >> > > +++ b/drivers/hid/hid-ite.c
> >> > > @@ -56,8 +56,9 @@ static const struct hid_device_id ite_devices[] = {
> >> > > { HID_USB_DEVICE(USB_VENDOR_ID_ITE, USB_DEVICE_ID_ITE8595) },
> >> > > { HID_USB_DEVICE(USB_VENDOR_ID_258A, USB_DEVICE_ID_258A_6A88) },
> >> > > /* ITE8595 USB kbd ctlr, with Synaptics touchpad connected to it. */
> >> > > - { HID_USB_DEVICE(USB_VENDOR_ID_SYNAPTICS,
> >> > > - USB_DEVICE_ID_SYNAPTICS_ACER_SWITCH5_012) },
> >> > > + { HID_DEVICE(BUS_USB, HID_GROUP_GENERIC,
> >> > > + USB_VENDOR_ID_SYNAPTICS,
> >> > > + USB_DEVICE_ID_SYNAPTICS_ACER_SWITCH5_012) },
> >> > > { }
> >> > > };
> >> > > MODULE_DEVICE_TABLE(hid, ite_devices);
> >> > >
> >> > > Much cleaner
> >> >
> >> > yep
> >> >
> >> > > (and now I don't need to write a test, which is always
> >> > > a good motivation to come up with a cleaner solution :)
> >> >
> >> > Hehe, too bad, you already picked up my curiosity on this one, and I
> >> > really would like to see the report descriptors and some events of the
> >> > keys that are fixed by hid-ite.c.
> >> > <with a low voice>This will be a hard requirement to accept this patch </joke>.
> >> >
> >> > More seriously, Zdeněk, can you run hid-recorder from
> >> > https://gitlab.freedesktop.org/libevdev/hid-tools/ and provide me the
> >> > report descriptor for all of your ITE HID devices? I'll add the
> >> > matching tests in hid-tools and be sure we do not regress in the
> >> > future.
> >> >
> >> > >
> >> > > Let me turn this into a proper patch and then I will send that to
> >> > > Zdeněk (off-list) for him to test (note don't worry if you do
> >> > > not have time to test this weekend, then I'll do it on Monday).
> >> > >
> >> > > Regards,
> >> > >
> >> > > Hans
> >> > >
> >> > > p.s.
> >> > >
> >> > > 1. My train is approaching Brussels (Fosdem) so my email response
> >> > > time will soon become irregular.
> >> >
> >> > How dare you? :)
> >> >
> >> > >
> >> > > 2. Benjamin will you be at Fosdem too ?
> >> > >
> >> >
> >> > Unfortunately no. Already got my quota of meeting people for this year
> >> > between Kernel Recipes in September, XDC in October and LCA last week.
> >> > So I need to keep in a quiet environment for a little bit :)
> >> >
> >> > Cheers,
> >> > Benjamin
> >> >
> >>
>
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: 6f1a4891a5928a5969c87fa5a584844c983ec823
Gitweb: https://git.kernel.org/tip/6f1a4891a5928a5969c87fa5a584844c983ec823
Author: Thomas Gleixner <tglx(a)linutronix.de>
AuthorDate: Fri, 31 Jan 2020 15:26:52 +01:00
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Sat, 01 Feb 2020 09:31:47 +01:00
x86/apic/msi: Plug non-maskable MSI affinity race
Evan tracked down a subtle race between the update of the MSI message and
the device raising an interrupt internally on PCI devices which do not
support MSI masking. The update of the MSI message is non-atomic and
consists of either 2 or 3 sequential 32bit wide writes to the PCI config
space.
- Write address low 32bits
- Write address high 32bits (If supported by device)
- Write data
When an interrupt is migrated then both address and data might change, so
the kernel attempts to mask the MSI interrupt first. But for MSI masking is
optional, so there exist devices which do not provide it. That means that
if the device raises an interrupt internally between the writes then a MSI
message is sent built from half updated state.
On x86 this can lead to spurious interrupts on the wrong interrupt
vector when the affinity setting changes both address and data. As a
consequence the device interrupt can be lost causing the device to
become stuck or malfunctioning.
Evan tried to handle that by disabling MSI accross an MSI message
update. That's not feasible because disabling MSI has issues on its own:
If MSI is disabled the PCI device is routing an interrupt to the legacy
INTx mechanism. The INTx delivery can be disabled, but the disablement is
not working on all devices.
Some devices lose interrupts when both MSI and INTx delivery are disabled.
Another way to solve this would be to enforce the allocation of the same
vector on all CPUs in the system for this kind of screwed devices. That
could be done, but it would bring back the vector space exhaustion problems
which got solved a few years ago.
Fortunately the high address (if supported by the device) is only relevant
when X2APIC is enabled which implies interrupt remapping. In the interrupt
remapping case the affinity setting is happening at the interrupt remapping
unit and the PCI MSI message is programmed only once when the PCI device is
initialized.
That makes it possible to solve it with a two step update:
1) Target the MSI msg to the new vector on the current target CPU
2) Target the MSI msg to the new vector on the new target CPU
In both cases writing the MSI message is only changing a single 32bit word
which prevents the issue of inconsistency.
After writing the final destination it is necessary to check whether the
device issued an interrupt while the intermediate state #1 (new vector,
current CPU) was in effect.
This is possible because the affinity change is always happening on the
current target CPU. The code runs with interrupts disabled, so the
interrupt can be detected by checking the IRR of the local APIC. If the
vector is pending in the IRR then the interrupt is retriggered on the new
target CPU by sending an IPI for the associated vector on the target CPU.
This can cause spurious interrupts on both the local and the new target
CPU.
1) If the new vector is not in use on the local CPU and the device
affected by the affinity change raised an interrupt during the
transitional state (step #1 above) then interrupt entry code will
ignore that spurious interrupt. The vector is marked so that the
'No irq handler for vector' warning is supressed once.
2) If the new vector is in use already on the local CPU then the IRR check
might see an pending interrupt from the device which is using this
vector. The IPI to the new target CPU will then invoke the handler of
the device, which got the affinity change, even if that device did not
issue an interrupt
3) If the new vector is in use already on the local CPU and the device
affected by the affinity change raised an interrupt during the
transitional state (step #1 above) then the handler of the device which
uses that vector on the local CPU will be invoked.
expose issues in device driver interrupt handlers which are not prepared to
handle a spurious interrupt correctly. This not a regression, it's just
exposing something which was already broken as spurious interrupts can
happen for a lot of reasons and all driver handlers need to be able to deal
with them.
Reported-by: Evan Green <evgreen(a)chromium.org>
Debugged-by: Evan Green <evgreen(a)chromium.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Tested-by: Evan Green <evgreen(a)chromium.org>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/87imkr4s7n.fsf@nanos.tec.linutronix.de
---
arch/x86/include/asm/apic.h | 8 ++-
arch/x86/kernel/apic/msi.c | 128 ++++++++++++++++++++++++++++++++++-
include/linux/irq.h | 18 +++++-
include/linux/irqdomain.h | 7 ++-
kernel/irq/debugfs.c | 1 +-
kernel/irq/msi.c | 5 +-
6 files changed, 163 insertions(+), 4 deletions(-)
diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index be0b9cf..19e94af 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -454,6 +454,14 @@ static inline void ack_APIC_irq(void)
apic_eoi();
}
+
+static inline bool lapic_vector_set_in_irr(unsigned int vector)
+{
+ u32 irr = apic_read(APIC_IRR + (vector / 32 * 0x10));
+
+ return !!(irr & (1U << (vector % 32)));
+}
+
static inline unsigned default_get_apic_id(unsigned long x)
{
unsigned int ver = GET_APIC_VERSION(apic_read(APIC_LVR));
diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c
index 7f75334..159bd0c 100644
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -23,10 +23,8 @@
static struct irq_domain *msi_default_domain;
-static void irq_msi_compose_msg(struct irq_data *data, struct msi_msg *msg)
+static void __irq_msi_compose_msg(struct irq_cfg *cfg, struct msi_msg *msg)
{
- struct irq_cfg *cfg = irqd_cfg(data);
-
msg->address_hi = MSI_ADDR_BASE_HI;
if (x2apic_enabled())
@@ -47,6 +45,127 @@ static void irq_msi_compose_msg(struct irq_data *data, struct msi_msg *msg)
MSI_DATA_VECTOR(cfg->vector);
}
+static void irq_msi_compose_msg(struct irq_data *data, struct msi_msg *msg)
+{
+ __irq_msi_compose_msg(irqd_cfg(data), msg);
+}
+
+static void irq_msi_update_msg(struct irq_data *irqd, struct irq_cfg *cfg)
+{
+ struct msi_msg msg[2] = { [1] = { }, };
+
+ __irq_msi_compose_msg(cfg, msg);
+ irq_data_get_irq_chip(irqd)->irq_write_msi_msg(irqd, msg);
+}
+
+static int
+msi_set_affinity(struct irq_data *irqd, const struct cpumask *mask, bool force)
+{
+ struct irq_cfg old_cfg, *cfg = irqd_cfg(irqd);
+ struct irq_data *parent = irqd->parent_data;
+ unsigned int cpu;
+ int ret;
+
+ /* Save the current configuration */
+ cpu = cpumask_first(irq_data_get_effective_affinity_mask(irqd));
+ old_cfg = *cfg;
+
+ /* Allocate a new target vector */
+ ret = parent->chip->irq_set_affinity(parent, mask, force);
+ if (ret < 0 || ret == IRQ_SET_MASK_OK_DONE)
+ return ret;
+
+ /*
+ * For non-maskable and non-remapped MSI interrupts the migration
+ * to a different destination CPU and a different vector has to be
+ * done careful to handle the possible stray interrupt which can be
+ * caused by the non-atomic update of the address/data pair.
+ *
+ * Direct update is possible when:
+ * - The MSI is maskable (remapped MSI does not use this code path)).
+ * The quirk bit is not set in this case.
+ * - The new vector is the same as the old vector
+ * - The old vector is MANAGED_IRQ_SHUTDOWN_VECTOR (interrupt starts up)
+ * - The new destination CPU is the same as the old destination CPU
+ */
+ if (!irqd_msi_nomask_quirk(irqd) ||
+ cfg->vector == old_cfg.vector ||
+ old_cfg.vector == MANAGED_IRQ_SHUTDOWN_VECTOR ||
+ cfg->dest_apicid == old_cfg.dest_apicid) {
+ irq_msi_update_msg(irqd, cfg);
+ return ret;
+ }
+
+ /*
+ * Paranoia: Validate that the interrupt target is the local
+ * CPU.
+ */
+ if (WARN_ON_ONCE(cpu != smp_processor_id())) {
+ irq_msi_update_msg(irqd, cfg);
+ return ret;
+ }
+
+ /*
+ * Redirect the interrupt to the new vector on the current CPU
+ * first. This might cause a spurious interrupt on this vector if
+ * the device raises an interrupt right between this update and the
+ * update to the final destination CPU.
+ *
+ * If the vector is in use then the installed device handler will
+ * denote it as spurious which is no harm as this is a rare event
+ * and interrupt handlers have to cope with spurious interrupts
+ * anyway. If the vector is unused, then it is marked so it won't
+ * trigger the 'No irq handler for vector' warning in do_IRQ().
+ *
+ * This requires to hold vector lock to prevent concurrent updates to
+ * the affected vector.
+ */
+ lock_vector_lock();
+
+ /*
+ * Mark the new target vector on the local CPU if it is currently
+ * unused. Reuse the VECTOR_RETRIGGERED state which is also used in
+ * the CPU hotplug path for a similar purpose. This cannot be
+ * undone here as the current CPU has interrupts disabled and
+ * cannot handle the interrupt before the whole set_affinity()
+ * section is done. In the CPU unplug case, the current CPU is
+ * about to vanish and will not handle any interrupts anymore. The
+ * vector is cleaned up when the CPU comes online again.
+ */
+ if (IS_ERR_OR_NULL(this_cpu_read(vector_irq[cfg->vector])))
+ this_cpu_write(vector_irq[cfg->vector], VECTOR_RETRIGGERED);
+
+ /* Redirect it to the new vector on the local CPU temporarily */
+ old_cfg.vector = cfg->vector;
+ irq_msi_update_msg(irqd, &old_cfg);
+
+ /* Now transition it to the target CPU */
+ irq_msi_update_msg(irqd, cfg);
+
+ /*
+ * All interrupts after this point are now targeted at the new
+ * vector/CPU.
+ *
+ * Drop vector lock before testing whether the temporary assignment
+ * to the local CPU was hit by an interrupt raised in the device,
+ * because the retrigger function acquires vector lock again.
+ */
+ unlock_vector_lock();
+
+ /*
+ * Check whether the transition raced with a device interrupt and
+ * is pending in the local APICs IRR. It is safe to do this outside
+ * of vector lock as the irq_desc::lock of this interrupt is still
+ * held and interrupts are disabled: The check is not accessing the
+ * underlying vector store. It's just checking the local APIC's
+ * IRR.
+ */
+ if (lapic_vector_set_in_irr(cfg->vector))
+ irq_data_get_irq_chip(irqd)->irq_retrigger(irqd);
+
+ return ret;
+}
+
/*
* IRQ Chip for MSI PCI/PCI-X/PCI-Express Devices,
* which implement the MSI or MSI-X Capability Structure.
@@ -58,6 +177,7 @@ static struct irq_chip pci_msi_controller = {
.irq_ack = irq_chip_ack_parent,
.irq_retrigger = irq_chip_retrigger_hierarchy,
.irq_compose_msi_msg = irq_msi_compose_msg,
+ .irq_set_affinity = msi_set_affinity,
.flags = IRQCHIP_SKIP_SET_WAKE,
};
@@ -146,6 +266,8 @@ void __init arch_init_msi_domain(struct irq_domain *parent)
}
if (!msi_default_domain)
pr_warn("failed to initialize irqdomain for MSI/MSI-x.\n");
+ else
+ msi_default_domain->flags |= IRQ_DOMAIN_MSI_NOMASK_QUIRK;
}
#ifdef CONFIG_IRQ_REMAP
diff --git a/include/linux/irq.h b/include/linux/irq.h
index 7853eb9..3ed5a05 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -209,6 +209,8 @@ struct irq_data {
* IRQD_SINGLE_TARGET - IRQ allows only a single affinity target
* IRQD_DEFAULT_TRIGGER_SET - Expected trigger already been set
* IRQD_CAN_RESERVE - Can use reservation mode
+ * IRQD_MSI_NOMASK_QUIRK - Non-maskable MSI quirk for affinity change
+ * required
*/
enum {
IRQD_TRIGGER_MASK = 0xf,
@@ -231,6 +233,7 @@ enum {
IRQD_SINGLE_TARGET = (1 << 24),
IRQD_DEFAULT_TRIGGER_SET = (1 << 25),
IRQD_CAN_RESERVE = (1 << 26),
+ IRQD_MSI_NOMASK_QUIRK = (1 << 27),
};
#define __irqd_to_state(d) ACCESS_PRIVATE((d)->common, state_use_accessors)
@@ -390,6 +393,21 @@ static inline bool irqd_can_reserve(struct irq_data *d)
return __irqd_to_state(d) & IRQD_CAN_RESERVE;
}
+static inline void irqd_set_msi_nomask_quirk(struct irq_data *d)
+{
+ __irqd_to_state(d) |= IRQD_MSI_NOMASK_QUIRK;
+}
+
+static inline void irqd_clr_msi_nomask_quirk(struct irq_data *d)
+{
+ __irqd_to_state(d) &= ~IRQD_MSI_NOMASK_QUIRK;
+}
+
+static inline bool irqd_msi_nomask_quirk(struct irq_data *d)
+{
+ return __irqd_to_state(d) & IRQD_MSI_NOMASK_QUIRK;
+}
+
#undef __irqd_to_state
static inline irq_hw_number_t irqd_to_hwirq(struct irq_data *d)
diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
index 3c340db..4da8df5 100644
--- a/include/linux/irqdomain.h
+++ b/include/linux/irqdomain.h
@@ -207,6 +207,13 @@ enum {
IRQ_DOMAIN_FLAG_MSI_REMAP = (1 << 5),
/*
+ * Quirk to handle MSI implementations which do not provide
+ * masking. Currently known to affect x86, but partially
+ * handled in core code.
+ */
+ IRQ_DOMAIN_MSI_NOMASK_QUIRK = (1 << 6),
+
+ /*
* Flags starting from IRQ_DOMAIN_FLAG_NONCORE are reserved
* for implementation specific purposes and ignored by the
* core code.
diff --git a/kernel/irq/debugfs.c b/kernel/irq/debugfs.c
index c1eccd4..a949bd3 100644
--- a/kernel/irq/debugfs.c
+++ b/kernel/irq/debugfs.c
@@ -114,6 +114,7 @@ static const struct irq_bit_descr irqdata_states[] = {
BIT_MASK_DESCR(IRQD_AFFINITY_MANAGED),
BIT_MASK_DESCR(IRQD_MANAGED_SHUTDOWN),
BIT_MASK_DESCR(IRQD_CAN_RESERVE),
+ BIT_MASK_DESCR(IRQD_MSI_NOMASK_QUIRK),
BIT_MASK_DESCR(IRQD_FORWARDED_TO_VCPU),
diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c
index ad26fbc..eb95f61 100644
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -453,8 +453,11 @@ int msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev,
continue;
irq_data = irq_domain_get_irq_data(domain, desc->irq);
- if (!can_reserve)
+ if (!can_reserve) {
irqd_clr_can_reserve(irq_data);
+ if (domain->flags & IRQ_DOMAIN_MSI_NOMASK_QUIRK)
+ irqd_set_msi_nomask_quirk(irq_data);
+ }
ret = irq_domain_activate_irq(irq_data, can_reserve);
if (ret)
goto cleanup;