On Fri, Jan 10, 2025 at 11:27:53AM -0800, Nicolin Chen wrote:
On Fri, Jan 10, 2025 at 01:48:42PM -0400, Jason Gunthorpe wrote:
On Tue, Jan 07, 2025 at 09:10:09AM -0800, Nicolin Chen wrote:
+static ssize_t iommufd_veventq_fops_read(struct iommufd_eventq *eventq,
+					  char __user *buf, size_t count,
+					  loff_t *ppos)
+{
+	size_t done = 0;
+	int rc = 0;
+
+	if (*ppos)
+		return -ESPIPE;
+
+	mutex_lock(&eventq->mutex);
+	while (!list_empty(&eventq->deliver) && count > done) {
+		struct iommufd_vevent *cur = list_first_entry(
+			&eventq->deliver, struct iommufd_vevent, node);
+
+		if (cur->data_len > count - done)
+			break;
+
+		if (copy_to_user(buf + done, cur->event_data, cur->data_len)) {
+			rc = -EFAULT;
+			break;
+		}
Now that I look at this more closely, the fault path this is copied from is not great.
This copy_to_user() can block while waiting on a page fault, possibly for a long time. While it is blocked the mutex is held, and we can't add more entries to the list.
That will cause the shared IRQ handler in the iommu driver to back up, which would cause a global DoS.
This probably wants to be organized to look more like:

	while ((itm = eventq_get_next_item(eventq))) {
		if (..) {
			eventq_restore_failed_item(eventq);
			return -1;
		}
	}
Where eventq_get_next_item() would just take a simple spinlock across the linked list manipulation.
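For illustration, the helpers could be shaped roughly like this (untested sketch; it assumes a new spinlock, say eventq->lock, guarding the deliver list, which the posted patch does not have, and it passes the failed event back explicitly rather than matching the pseudocode signature exactly):

	/*
	 * Sketch only: eventq->lock is a hypothetical spinlock protecting
	 * eventq->deliver. Only the list manipulation happens under it,
	 * so copy_to_user() runs without any lock held and a page fault
	 * can no longer stall the report()/IRQ side.
	 */
	static struct iommufd_vevent *
	eventq_get_next_item(struct iommufd_eventq *eventq)
	{
		struct iommufd_vevent *cur = NULL;
		unsigned long flags;

		/* irqsave since report() may queue events from IRQ context */
		spin_lock_irqsave(&eventq->lock, flags);
		if (!list_empty(&eventq->deliver)) {
			cur = list_first_entry(&eventq->deliver,
					       struct iommufd_vevent, node);
			list_del(&cur->node);
		}
		spin_unlock_irqrestore(&eventq->lock, flags);
		return cur;
	}

	static void eventq_restore_failed_item(struct iommufd_eventq *eventq,
					       struct iommufd_vevent *cur)
	{
		unsigned long flags;

		/* Put the unconsumed event back at the head of the list */
		spin_lock_irqsave(&eventq->lock, flags);
		list_add(&cur->node, &eventq->deliver);
		spin_unlock_irqrestore(&eventq->lock, flags);
	}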
Would it be simpler to just limit it to one node per read(), i.e. no "while (!list_empty)" and no blocking?
The report() adds one node at a time, and wakes up the poll() each time it adds a node. Then user space could read one event at a time too?
That doesn't really help; the issue is that it holds the lock over the copy_to_user(), which it does because it doesn't want to pull the item off the list and then have to handle the failure and put it back.
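To illustrate that shape, the read loop from the patch could pull each event off under the spinlock and only put it back on failure (a sketch, reusing the hypothetical helpers above; event lifetime/freeing is left out):

	while ((cur = eventq_get_next_item(eventq))) {
		if (cur->data_len > count - done) {
			/* Caller's buffer is full; keep the event queued */
			eventq_restore_failed_item(eventq, cur);
			break;
		}
		if (copy_to_user(buf + done, cur->event_data,
				 cur->data_len)) {
			/* Fault: return the unconsumed event to the list */
			eventq_restore_failed_item(eventq, cur);
			rc = -EFAULT;
			break;
		}
		done += cur->data_len;
		/* Freeing or refcounting of cur is out of scope here */
	}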
Jason