The series
https://patchwork.linuxtv.org/project/linux-media/list/?series=12485
was merged except for one patch, in which Laurent found a bug in the index used.
When I tried the fixed patch, I found an integer overflow in the
timestamp calculations. This bug can be triggered with slow frame rates.
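
As a rough illustration of this class of bug (not the driver's actual code),
the sketch below scales a tick delta to nanoseconds. The helper names and the
nanosecond scaling are assumptions chosen for the example. The point is only
that a 32-bit product wraps once the interval between samples grows, which is
what slow frame rates produce, while widening one operand to 64 bits before
multiplying avoids the wrap.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helpers, not taken from uvc_video.c.
 * Both convert a delta in clock ticks to nanoseconds.
 */

/* 32-bit arithmetic: delta_ticks * 1e9 wraps modulo 2^32 for large deltas. */
static uint32_t ticks_to_ns_32(uint32_t delta_ticks, uint32_t ticks_per_sec)
{
	assert(ticks_per_sec != 0);
	return (uint32_t)(delta_ticks * UINT32_C(1000000000)) / ticks_per_sec;
}

/* Widen before multiplying so the intermediate product fits in 64 bits. */
static uint64_t ticks_to_ns_64(uint32_t delta_ticks, uint32_t ticks_per_sec)
{
	assert(ticks_per_sec != 0);
	return (uint64_t)delta_ticks * UINT64_C(1000000000) / ticks_per_sec;
}
```

For small deltas both helpers agree; for a delta of one full second of ticks
only the 64-bit version returns 1000000000.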
Signed-off-by: Ricardo Ribalda <ribalda(a)chromium.org>
---
Ricardo Ribalda (2):
      media: uvcvideo: Fix hw timestamp handling for slow FPS
      media: uvcvideo: Fix integer overflow calculating timestamp

 drivers/media/usb/uvc/uvc_video.c | 33 ++++++++++++++++++++++++++++-----
 drivers/media/usb/uvc/uvcvideo.h  |  1 +
 2 files changed, 29 insertions(+), 5 deletions(-)
---
base-commit: ef1e48f725d30cb18d3f2d40c48f50f483080cf7
change-id: 20240610-hwtimestamp-followup-2489c5668b21
Best regards,
--
Ricardo Ribalda <ribalda(a)chromium.org>
On 23.05.24 at 15:13, Barry Kauler wrote:
> On Wed, May 22, 2024 at 12:58 AM Armin Wolf <W_Armin(a)gmx.de> wrote:
>> On 20.05.24 at 18:22, Alex Deucher wrote:
>>
>>> On Sat, May 18, 2024 at 8:17 PM Armin Wolf <W_Armin(a)gmx.de> wrote:
>>>> On 17.05.24 at 03:30, Barry Kauler wrote:
>>>>
>>>>> Armin, Yifan, Prike,
>>>>> I will top-post, so you don't have to scroll down.
>>>>> After identifying the commit that causes black screen with my gpu, I
>>>>> posted the result to you guys, on May 9.
>>>>> It is now May 17 and no reply.
>>>>> OK, I have now created a patch that reverts Yifan's commit, compiled
>>>>> 5.15.158, and my gpu now works.
>>>>> Note, the radeon module is not loaded, so it is not a factor.
>>>>> I'm not a kernel developer. I have identified the culprit and it is up
>>>>> to you guys to fix it, Yifan especially, as you are the person who has
>>>>> created the regression.
>>>>> I will attach my patch.
>>>>> Regards,
>>>>> Barry Kauler
>>>> Hi,
>>>>
>>>> sorry for not responding to your findings. I normally do not work with GPU drivers,
>>>> so I hoped one of the amdgpu developers would handle this.
>>>>
>>>> I CCed dri-devel(a)lists.freedesktop.org and amd-gfx(a)lists.freedesktop.org so that other
>>>> amdgpu developers hear about this issue.
>>>>
>>>> Thank you for your persistence in finding the offending commit.
>>> Likely this patch should not have been ported to 5.15 in the first
>>> place. The IOMMU requirements have been dropped from the driver for
>>> the last few kernel versions so it is no longer relevant on newer
>>> kernels.
>>>
>>> Alex
>> Barry, can you verify that the latest upstream kernel works on your device?
>> If yes, then the commit itself is ok and only the backport was wrong.
>>
>> Thanks,
>> Armin Wolf
> Armin,
> The unmodified 6.8.1 kernel works ok.
> I presume that patch was applied long before 6.8.1 got released and
> only got backported to 5.15.x recently.
>
> Regards,
> Barry
>
Great to hear, that means we only have to revert commit 56b522f46681 ("drm/amdgpu: init iommu after amdkfd device init")
from the 5.15.y series.
I CCed the stable mailing list so that they can revert the offending commit.
Thanks,
Armin Wolf
>>>> Armin Wolf
>>>>
>>>>> On Thu, May 9, 2024 at 4:08 PM Barry Kauler <bkauler(a)gmail.com> wrote:
>>>>>> On Fri, May 3, 2024 at 9:03 PM Armin Wolf <W_Armin(a)gmx.de> wrote:
>>>>>>>> ...
>>>>>>>> # lspci | grep VGA
>>>>>>>> 05:00.0 VGA compatible controller: Advanced Micro Devices, Inc.
>>>>>>>> [AMD/ATI] Picasso/Raven 2 [Radeon Vega Series / Radeon Vega Mobile
>>>>>>>> Series] (rev c2)
>>>>>>>> 05:00.7 Non-VGA unclassified device: Advanced Micro Devices, Inc.
>>>>>>>> [AMD] Raven/Raven2/Renoir Non-Sensor Fusion Hub KMDF driver
>>>>>>>>
>>>>>>>> # lspci -n -k
>>>>>>>> ...
>>>>>>>> 05:00.0 0300: 1002:15d8 (rev c2)
>>>>>>>> Subsystem: 1025:1456
>>>>>>>> Kernel driver in use: amdgpu
>>>>>>>> Kernel modules: amdgpu
>>>>>>>> ...
>>>>>>> thanks for informing us of this regression. Since there are four commits affecting
>>>>>>> amdgpu in 5.15.150, I suggest that you use "git bisect" to find the faulty commit,
>>>>>>> see https://docs.kernel.org/admin-guide/bug-bisect.html for details.
>>>>>>>
>>>>>>> I think you can speed up the bisecting process by limiting yourself to the AMD DRM
>>>>>>> driver directory with "git bisect start -- drivers/gpu/drm/amd", take a look at the
>>>>>>> man page of "git bisect" for details.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Armin Wolf
>>>>>> Armin,
>>>>>> Thanks for the advice. I am unfamiliar with git on the command line;
>>>>>> I have previously only used the SmartGit GUI.
>>>>>> EasyOS requires aufs patch, and for a few days tried to figure out how
>>>>>> to use that with git bisect, then gave up. Changed to testing with my
>>>>>> "QV" distro, which is more conventional, doesn't need any kernel
>>>>>> patches. Managed to get it down to one commit. Here are the steps I
>>>>>> followed:
>>>>>>
>>>>>> # git clone git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
>>>>>> # cd linux-stable
>>>>>> # git tag -l | grep '5\.15\.150'
>>>>>> v5.15.150
>>>>>> # git checkout -b my5.15.150 v5.15.150
>>>>>> Updating files: 100% (65776/65776), done.
>>>>>> Switched to a new branch 'my5.15.150'
>>>>>>
>>>>>> Copied in my .config then...
>>>>>>
>>>>>> # make menuconfig
>>>>>> # git bisect start -- drivers/gpu/drm/amd
>>>>>> # git bisect bad
>>>>>> # git bisect good v5.15.149
>>>>>> Bisecting: 1 revision left to test after this (roughly 1 step)
>>>>>> [b9a61ee2bb2704e42516e3da962f99dfa98f3b20] drm/amdgpu: reset gpu for
>>>>>> s3 suspend abort case
>>>>>> # make
>>>>>> # rm -rf /boot2
>>>>>> # mkdir -p /boot2/lib/modules
>>>>>> # make INSTALL_MOD_STRIP=1 INSTALL_MOD_PATH=/boot2 modules_install
>>>>>> # cp arch/x86/boot/bzImage /boot2/vmlinuz
>>>>>> # sync
>>>>>> ...QV on Acer laptop, with amdgpu, works!
>>>>>> # git bisect good
>>>>>> Bisecting: 0 revisions left to test after this (roughly 0 steps)
>>>>>> [56b522f4668167096a50c39446d6263c96219f5f] drm/amdgpu: init iommu
>>>>>> after amdkfd device init
>>>>>> # make
>>>>>> # mkdir -p /boot2/lib/modules
>>>>>> # make INSTALL_MOD_STRIP=1 INSTALL_MOD_PATH=/boot2 modules_install
>>>>>> # cp arch/x86/boot/bzImage /boot2/vmlinuz
>>>>>> # sync
>>>>>> ...QV on Acer laptop, black screen!
>>>>>>
>>>>>> # git bisect bad
>>>>>> 56b522f4668167096a50c39446d6263c96219f5f is the first bad commit
>>>>>> commit 56b522f4668167096a50c39446d6263c96219f5f
>>>>>> Author: Yifan Zhang <yifan1.zhang(a)amd.com>
>>>>>> Date: Tue Sep 28 15:42:35 2021 +0800
>>>>>>
>>>>>> drm/amdgpu: init iommu after amdkfd device init
>>>>>>
>>>>>> [ Upstream commit 286826d7d976e7646b09149d9bc2899d74ff962b ]
>>>>>>
>>>>>> This patch is to fix clinfo failure in Raven/Picasso:
>>>>>>
>>>>>> Number of platforms: 1
>>>>>> Platform Profile: FULL_PROFILE
>>>>>> Platform Version: OpenCL 2.2 AMD-APP (3364.0)
>>>>>> Platform Name: AMD Accelerated Parallel Processing
>>>>>> Platform Vendor: Advanced Micro Devices, Inc.
>>>>>> Platform Extensions: cl_khr_icd cl_amd_event_callback
>>>>>>
>>>>>> Platform Name: AMD Accelerated Parallel Processing Number of devices: 0
>>>>>>
>>>>>> Signed-off-by: Yifan Zhang <yifan1.zhang(a)amd.com>
>>>>>> Reviewed-by: James Zhu <James.Zhu(a)amd.com>
>>>>>> Tested-by: James Zhu <James.Zhu(a)amd.com>
>>>>>> Acked-by: Felix Kuehling <Felix.Kuehling(a)amd.com>
>>>>>> Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
>>>>>> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>>>>>>
>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 ++++----
>>>>>> 1 file changed, 4 insertions(+), 4 deletions(-)
>>>>>>
>>>>>> Anything else I should do, to identify what in this commit is the
>>>>>> likely culprit?
>>>>>> Regards,
>>>>>> Barry Kauler
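
For readers unfamiliar with bisection: the search Barry ran above is, in
essence, a binary search for the first bad revision between a known-good and
a known-bad one. The sketch below assumes a linear history and a hypothetical
predicate callback standing in for "build, boot, and check the screen"; it is
not how git itself is implemented.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: returns nonzero if revision i shows the bug. */
typedef int (*is_bad_fn)(size_t i, void *ctx);

/*
 * Revisions are indexed 0..n-1, oldest first. Revision 0 is known good
 * and revision n-1 is known bad. Returns the index of the first bad one.
 */
static size_t first_bad(size_t n, is_bad_fn is_bad, void *ctx)
{
	size_t good = 0, bad = n - 1;

	assert(n >= 2);
	while (bad - good > 1) {
		size_t mid = good + (bad - good) / 2;

		if (is_bad(mid, ctx))
			bad = mid;	/* culprit is at mid or earlier */
		else
			good = mid;	/* culprit is after mid */
	}
	return bad;
}

/* Example predicate: every revision at or after *ctx is bad. */
static int bad_from(size_t i, void *ctx)
{
	return i >= *(const size_t *)ctx;
}
```

Each test halves the remaining range, which is why only two builds were
needed to isolate the commit among the amdgpu changes in 5.15.150.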
The patch below does not apply to the 6.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.9.y
git checkout FETCH_HEAD
git cherry-pick -x 2a38e4ca302280fdcce370ba2bee79bac16c4587
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024060638-unenvied-immovably-70e4@gregkh' --subject-prefix 'PATCH 6.9.y' HEAD^..
Possible dependencies:
2a38e4ca3022 ("x86/cpu: Provide default cache line size if not enumerated")
95bfb35269b2 ("x86/cpu: Get rid of an unnecessary local variable in get_cpu_address_sizes()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2a38e4ca302280fdcce370ba2bee79bac16c4587 Mon Sep 17 00:00:00 2001
From: Dave Hansen <dave.hansen(a)linux.intel.com>
Date: Fri, 17 May 2024 13:05:34 -0700
Subject: [PATCH] x86/cpu: Provide default cache line size if not enumerated
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
tl;dr: CPUs with CPUID.80000008H but without CPUID.01H:EDX[CLFSH]
will end up reporting cache_line_size()==0 and bad things happen.
Fill in a default on those to avoid the problem.
Long Story:
The kernel dies a horrible death if c->x86_cache_alignment (aka.
cache_line_size()) is 0. Normally, this value is populated from
c->x86_clflush_size.
Right now the code is set up to get c->x86_clflush_size from two
places. First, modern CPUs get it from CPUID. Old CPUs that don't
have leaf 0x80000008 (or CPUID at all) just get some sane defaults
from the kernel in get_cpu_address_sizes().
The vast majority of CPUs that have leaf 0x80000008 also get
->x86_clflush_size from CPUID. But there are oddballs.
Intel Quark CPUs[1] and others[2] have leaf 0x80000008 but don't set
CPUID.01H:EDX[CLFSH], so they skip over filling in ->x86_clflush_size:
	cpuid(0x00000001, &tfms, &misc, &junk, &cap0);
	if (cap0 & (1<<19))
		c->x86_clflush_size = ((misc >> 8) & 0xff) * 8;
So they land in get_cpu_address_sizes(), see that CPUID has level
0x80000008, and jump into the side of the if() that does not fill in
c->x86_clflush_size. That assigns a 0 to c->x86_cache_alignment, and
hilarity ensues in code like:
	buffer = kzalloc(ALIGN(sizeof(*buffer), cache_line_size()),
			 GFP_KERNEL);
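
To see why a zero cache line size is harmful in a call like this, the
following userspace sketch re-creates the usual power-of-two round-up idiom.
It assumes the kernel's ALIGN() uses the same mask arithmetic,
(x + (a - 1)) & ~(a - 1); align_up() is a name made up for the example.
With a == 0 the mask becomes all ones and every size rounds to 0, so the
allocation would be requested with a size of zero bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Round x up to a multiple of a, where a is expected to be a power of two. */
static size_t align_up(size_t x, size_t a)
{
	size_t mask = a - 1;	/* a == 0 wraps to all ones */

	return (x + mask) & ~mask;
}

/* Usage: a sane alignment rounds up, a zero alignment collapses to 0. */
static void align_up_demo(void)
{
	assert(align_up(100, 64) == 128);
	assert(align_up(100, 0) == 0);
}
```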
To fix this, always provide a sane value for ->x86_clflush_size.
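
A minimal userspace sketch of that logic, taking raw CPUID register values
as parameters so it runs anywhere. The function names are invented for the
example; the CLFSH bit (EDX bit 19), the EBX[15:8] field scaled by 8, and
the fallback of 32 are the values that appear in this message and its diff.

```c
#include <assert.h>
#include <stdint.h>

#define CLFSH_BIT (UINT32_C(1) << 19)	/* CPUID.01H:EDX[CLFSH] */

/* Returns the CLFLUSH line size in bytes, or 0 if it is not enumerated. */
static unsigned int clflush_size_from_cpuid(uint32_t ebx, uint32_t edx)
{
	if (edx & CLFSH_BIT)
		return ((ebx >> 8) & 0xff) * 8;
	return 0;
}

/* Apply the fallback so that callers never see a size of 0. */
static unsigned int clflush_size_or_default(uint32_t ebx, uint32_t edx)
{
	unsigned int size = clflush_size_from_cpuid(ebx, edx);

	if (!size)
		size = 32;	/* same default as in the diff */
	assert(size != 0);
	return size;
}
```

With EBX = 0x00000800 and CLFSH set, both helpers report 64 bytes; with
CLFSH clear, the first reports 0 and the second falls back to 32.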
Big thanks to Andy Shevchenko for finding and reporting this and also
providing a first pass at a fix. But his fix was only partial and only
worked on the Quark CPUs. It would not, for instance, have worked on
the QEMU config.
1. https://raw.githubusercontent.com/InstLatx64/InstLatx64/master/GenuineIntel…
2. You can also get this behavior if you use "-cpu 486,+clzero"
in QEMU.
[ dhansen: remove 'vp_bits_from_cpuid' reference in changelog
because bpetkov brutally murdered it recently. ]
Fixes: fbf6449f84bf ("x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach")
Reported-by: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Tested-by: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
Tested-by: Jörn Heusipp <osmanx(a)heusipp.de>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/all/20240516173928.3960193-1-andriy.shevchenko@linu…
Link: https://lore.kernel.org/lkml/5e31cad3-ad4d-493e-ab07-724cfbfaba44@heusipp.d…
Link: https://lore.kernel.org/all/20240517200534.8EC5F33E%40davehans-spike.ostc.i…
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 2b170da84f97..e31293c9609f 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1075,6 +1075,10 @@ void get_cpu_address_sizes(struct cpuinfo_x86 *c)
 
 		c->x86_virt_bits = (eax >> 8) & 0xff;
 		c->x86_phys_bits = eax & 0xff;
+
+		/* Provide a sane default if not enumerated: */
+		if (!c->x86_clflush_size)
+			c->x86_clflush_size = 32;
 	}
 
 	c->x86_cache_bits = c->x86_phys_bits;