With deferred IO enabled, a page fault happens when data is written to the framebuffer device. Then driver determines which page is being updated by calculating the offset of the written virtual address within the virtual memory area, and uses this offset to get the updated page within the internal buffer. This page is later copied to hardware (thus the name "deferred IO").
This calculation is only correct if the virtual memory area is mapped to the beginning of the internal buffer. Otherwise this is wrong. For example, if users do: mmap(ptr, 4096, PROT_WRITE, MAP_FIXED | MAP_SHARED, fd, 0xff000);
Then the virtual memory area will mapped at offset 0xff000 within the internal buffer. This offset 0xff000 is not accounted for, and wrong page is updated. This will lead to wrong pixels being updated on the device.
However, it gets worse: if users do 2 mmap to the same virtual address, for example:
int fd = open("/dev/fb0", O_RDWR, 0); char *ptr = (char *) 0x20000000ul; mmap(ptr, 4096, PROT_WRITE, MAP_FIXED | MAP_SHARED, fd, 0xff000); *ptr = 0; // write #1 mmap(ptr, 4096, PROT_WRITE, MAP_FIXED | MAP_SHARED, fd, 0); *ptr = 0; // write #2
In this case, both write #1 and write #2 apply to the same virtual address (0x20000000ul), and the driver mistakenly thinks the same page is being written to. When the second write happens, the driver thinks this is the same page as the last time, and reuse the page from write #1. The driver then lock_page() an incorrect page, and returns VM_FAULT_LOCKED with the correct page unlocked. It is unclear what will happen with memory management subsystem after that, but likely something terrible.
Fix this by taking the mapping offset into account.
Reported-and-tested-by: Harshit Mogalapalli harshit.m.mogalapalli@oracle.com Closes: https://lore.kernel.org/linux-fbdev/271372d6-e665-4e7f-b088-dee5f4ab341a@ora... Fixes: 56c134f7f1b5 ("fbdev: Track deferred-I/O pages in pageref struct") Cc: stable@vger.kernel.org Signed-off-by: Nam Cao namcao@linutronix.de --- drivers/video/fbdev/core/fb_defio.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/video/fbdev/core/fb_defio.c b/drivers/video/fbdev/core/fb_defio.c index dae96c9f61cf..d5d6cd9e8b29 100644 --- a/drivers/video/fbdev/core/fb_defio.c +++ b/drivers/video/fbdev/core/fb_defio.c @@ -196,7 +196,8 @@ static vm_fault_t fb_deferred_io_track_page(struct fb_info *info, unsigned long */ static vm_fault_t fb_deferred_io_page_mkwrite(struct fb_info *info, struct vm_fault *vmf) { - unsigned long offset = vmf->address - vmf->vma->vm_start; + unsigned long offset = vmf->address - vmf->vma->vm_start + + (vmf->vma->vm_pgoff << PAGE_SHIFT); struct page *page = vmf->page;
file_update_time(vmf->vma->vm_file);
Hi,
thanks for following through with the bug and sending the patch
Am 19.04.24 um 21:00 schrieb Nam Cao:
With deferred IO enabled, a page fault happens when data is written to the framebuffer device. Then driver determines which page is being updated by calculating the offset of the written virtual address within the virtual memory area, and uses this offset to get the updated page within the internal buffer. This page is later copied to hardware (thus the name "deferred IO").
This calculation is only correct if the virtual memory area is mapped to the beginning of the internal buffer. Otherwise this is wrong. For example, if users do: mmap(ptr, 4096, PROT_WRITE, MAP_FIXED | MAP_SHARED, fd, 0xff000);
Then the virtual memory area will mapped at offset 0xff000 within the internal buffer. This offset 0xff000 is not accounted for, and wrong page is updated. This will lead to wrong pixels being updated on the device.
However, it gets worse: if users do 2 mmap to the same virtual address, for example:
int fd = open("/dev/fb0", O_RDWR, 0); char *ptr = (char *) 0x20000000ul; mmap(ptr, 4096, PROT_WRITE, MAP_FIXED | MAP_SHARED, fd, 0xff000); *ptr = 0; // write #1 mmap(ptr, 4096, PROT_WRITE, MAP_FIXED | MAP_SHARED, fd, 0); *ptr = 0; // write #2
In this case, both write #1 and write #2 apply to the same virtual address (0x20000000ul), and the driver mistakenly thinks the same page is being written to. When the second write happens, the driver thinks this is the same page as the last time, and reuse the page from write #1. The driver then lock_page() an incorrect page, and returns VM_FAULT_LOCKED with the correct page unlocked. It is unclear what will happen with memory management subsystem after that, but likely something terrible.
Please tone down the drama. :)
Fix this by taking the mapping offset into account.
Reported-and-tested-by: Harshit Mogalapalli harshit.m.mogalapalli@oracle.com Closes: https://lore.kernel.org/linux-fbdev/271372d6-e665-4e7f-b088-dee5f4ab341a@ora... Fixes: 56c134f7f1b5 ("fbdev: Track deferred-I/O pages in pageref struct") Cc: stable@vger.kernel.org Signed-off-by: Nam Cao namcao@linutronix.de
drivers/video/fbdev/core/fb_defio.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/video/fbdev/core/fb_defio.c b/drivers/video/fbdev/core/fb_defio.c index dae96c9f61cf..d5d6cd9e8b29 100644 --- a/drivers/video/fbdev/core/fb_defio.c +++ b/drivers/video/fbdev/core/fb_defio.c @@ -196,7 +196,8 @@ static vm_fault_t fb_deferred_io_track_page(struct fb_info *info, unsigned long */ static vm_fault_t fb_deferred_io_page_mkwrite(struct fb_info *info, struct vm_fault *vmf) {
- unsigned long offset = vmf->address - vmf->vma->vm_start;
- unsigned long offset = vmf->address - vmf->vma->vm_start
+ (vmf->vma->vm_pgoff << PAGE_SHIFT);
The page-fault handler at [1] use vm_fault.pgoff to retrieve the page structure. Can we do the same here and avoid that computation?
Best regards Thomas
[1] https://elixir.bootlin.com/linux/v6.8/source/drivers/video/fbdev/core/fb_def...
struct page *page = vmf->page; file_update_time(vmf->vma->vm_file);
On Tue, Apr 23, 2024 at 10:37:41AM +0200, Thomas Zimmermann wrote:
Hi,
thanks for following through with the bug and sending the patch
Am 19.04.24 um 21:00 schrieb Nam Cao:
With deferred IO enabled, a page fault happens when data is written to the framebuffer device. Then driver determines which page is being updated by calculating the offset of the written virtual address within the virtual memory area, and uses this offset to get the updated page within the internal buffer. This page is later copied to hardware (thus the name "deferred IO").
This calculation is only correct if the virtual memory area is mapped to the beginning of the internal buffer. Otherwise this is wrong. For example, if users do: mmap(ptr, 4096, PROT_WRITE, MAP_FIXED | MAP_SHARED, fd, 0xff000);
Then the virtual memory area will mapped at offset 0xff000 within the internal buffer. This offset 0xff000 is not accounted for, and wrong page is updated. This will lead to wrong pixels being updated on the device.
However, it gets worse: if users do 2 mmap to the same virtual address, for example:
int fd = open("/dev/fb0", O_RDWR, 0); char *ptr = (char *) 0x20000000ul; mmap(ptr, 4096, PROT_WRITE, MAP_FIXED | MAP_SHARED, fd, 0xff000); *ptr = 0; // write #1 mmap(ptr, 4096, PROT_WRITE, MAP_FIXED | MAP_SHARED, fd, 0); *ptr = 0; // write #2
In this case, both write #1 and write #2 apply to the same virtual address (0x20000000ul), and the driver mistakenly thinks the same page is being written to. When the second write happens, the driver thinks this is the same page as the last time, and reuse the page from write #1. The driver then lock_page() an incorrect page, and returns VM_FAULT_LOCKED with the correct page unlocked. It is unclear what will happen with memory management subsystem after that, but likely something terrible.
Please tone down the drama. :)
Sorry, that wasn't intentional. Writing is hard :(
Let me just cut this out, this info is not really needed to justify the changes.
Fix this by taking the mapping offset into account.
Reported-and-tested-by: Harshit Mogalapalli harshit.m.mogalapalli@oracle.com Closes: https://lore.kernel.org/linux-fbdev/271372d6-e665-4e7f-b088-dee5f4ab341a@ora... Fixes: 56c134f7f1b5 ("fbdev: Track deferred-I/O pages in pageref struct") Cc: stable@vger.kernel.org Signed-off-by: Nam Cao namcao@linutronix.de
drivers/video/fbdev/core/fb_defio.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/video/fbdev/core/fb_defio.c b/drivers/video/fbdev/core/fb_defio.c index dae96c9f61cf..d5d6cd9e8b29 100644 --- a/drivers/video/fbdev/core/fb_defio.c +++ b/drivers/video/fbdev/core/fb_defio.c @@ -196,7 +196,8 @@ static vm_fault_t fb_deferred_io_track_page(struct fb_info *info, unsigned long */ static vm_fault_t fb_deferred_io_page_mkwrite(struct fb_info *info, struct vm_fault *vmf) {
- unsigned long offset = vmf->address - vmf->vma->vm_start;
- unsigned long offset = vmf->address - vmf->vma->vm_start
+ (vmf->vma->vm_pgoff << PAGE_SHIFT);
The page-fault handler at [1] use vm_fault.pgoff to retrieve the page structure. Can we do the same here and avoid that computation?
Yes, thanks for the suggestion.
It will change things a bit: offset will not be the exact value anymore, but will be rounded down to multiple of PAGE_SIZE. But that doesn't matter, because it will only be used to calculate the page offset later on.
We can clean this up and rename this "offset" to "pg_offset". But that's for another day.
Best regards, Nam
Hi
Am 23.04.24 um 11:55 schrieb Nam Cao: [...]
Fix this by taking the mapping offset into account.
Reported-and-tested-by: Harshit Mogalapalli harshit.m.mogalapalli@oracle.com Closes: https://lore.kernel.org/linux-fbdev/271372d6-e665-4e7f-b088-dee5f4ab341a@ora... Fixes: 56c134f7f1b5 ("fbdev: Track deferred-I/O pages in pageref struct") Cc: stable@vger.kernel.org Signed-off-by: Nam Cao namcao@linutronix.de
drivers/video/fbdev/core/fb_defio.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/video/fbdev/core/fb_defio.c b/drivers/video/fbdev/core/fb_defio.c index dae96c9f61cf..d5d6cd9e8b29 100644 --- a/drivers/video/fbdev/core/fb_defio.c +++ b/drivers/video/fbdev/core/fb_defio.c @@ -196,7 +196,8 @@ static vm_fault_t fb_deferred_io_track_page(struct fb_info *info, unsigned long */ static vm_fault_t fb_deferred_io_page_mkwrite(struct fb_info *info, struct vm_fault *vmf) {
- unsigned long offset = vmf->address - vmf->vma->vm_start;
- unsigned long offset = vmf->address - vmf->vma->vm_start
+ (vmf->vma->vm_pgoff << PAGE_SHIFT);
The page-fault handler at [1] use vm_fault.pgoff to retrieve the page structure. Can we do the same here and avoid that computation?
Yes, thanks for the suggestion.
It will change things a bit: offset will not be the exact value anymore, but will be rounded down to multiple of PAGE_SIZE. But that doesn't matter, because it will only be used to calculate the page offset later on.
We can clean this up and rename this "offset" to "pg_offset". But that's for another day.
But can't we use struct vm_fault.pgoff directly? The page-fault handler has used it since forever. The look-up code for the pageref should probably do the same, because there's a 1:1 connection between the page and the pageref. The pageref structure only exists because we cannot store its data in struct page directly.
AFAICT pgoff is exactly the value want to compute. See [1] and the calculation at [2].
[1] https://elixir.bootlin.com/linux/v6.8/source/mm/memory.c#L5222 [2] https://elixir.bootlin.com/linux/v6.8/source/include/linux/pagemap.h#L957
Best regards Thomas
Best regards, Nam
On Tue, Apr 23, 2024 at 01:27:09PM +0200, Thomas Zimmermann wrote:
Am 23.04.24 um 11:55 schrieb Nam Cao:
...
The page-fault handler at [1] use vm_fault.pgoff to retrieve the page structure. Can we do the same here and avoid that computation?
Yes, thanks for the suggestion.
It will change things a bit: offset will not be the exact value anymore, but will be rounded down to multiple of PAGE_SIZE. But that doesn't matter, because it will only be used to calculate the page offset later on.
We can clean this up and rename this "offset" to "pg_offset". But that's for another day.
But can't we use struct vm_fault.pgoff directly? The page-fault handler has used it since forever. The look-up code for the pageref should probably do the same, because there's a 1:1 connection between the page and the pageref. The pageref structure only exists because we cannot store its data in struct page directly.
AFAICT pgoff is exactly the value want to compute. See [1] and the calculation at [2].
Completely agree. I just wanted to point out that the value inside the variable "offset" won't be the same as before, but it's still correct.
I just tested the patch on real hardware, will send it shortly.
Best regards, Nam
linux-stable-mirror@lists.linaro.org