Hi,
With SRC in the firmware processing pipeline the FE and BE rate can be different, the sample counters on the two side of the DSP counts in different rate domain and they will drift apart. The counters should be moved to the same rate domain to be usable for delay calculation.
The ChainDMA offset value was incorrect since the host buffer size and the trigger to start the chain is misunderstood initially.
Finally: we can have a situation when the host and link DMA channel in HDA is not using matching channel ids. We need to look up the link channel explicitly to make sure that we read the LLP from the correct link.
Regards, Peter --- Kai Vehmanen (3): ASoC: SOF: ipc4-pcm: fix delay calculation when DSP resamples ASoC: SOF: ipc4-pcm: fix start offset calculation for chain DMA ASoC: SOF: ipc4-pcm: do not report invalid delay values
Peter Ujfalusi (2): ASoC: SOF: sof-audio: add dev_dbg_ratelimited wrapper ASoC: SOF: Intel: Read the LLP via the associated Link DMA channel
sound/soc/sof/intel/hda-stream.c | 29 ++++++++- sound/soc/sof/ipc4-pcm.c | 104 ++++++++++++++++++++++++------- sound/soc/sof/ipc4-topology.c | 1 - sound/soc/sof/ipc4-topology.h | 2 + sound/soc/sof/sof-audio.h | 5 ++ 5 files changed, 114 insertions(+), 27 deletions(-)
From: Kai Vehmanen kai.vehmanen@linux.intel.com
When the sampling rates going in (host) and out (dai) from the DSP are different, the IPC4 delay reporting does not work correctly. Add support for this case by scaling the all raw position values to a common timebase before calculating real-time delay for the PCM.
Cc: stable@vger.kernel.org Fixes: 0ea06680dfcb ("ASoC: SOF: ipc4-pcm: Correct the delay calculation") Signed-off-by: Kai Vehmanen kai.vehmanen@linux.intel.com Reviewed-by: Péter Ujfalusi peter.ujfalusi@linux.intel.com Reviewed-by: Bard Liao yung-chuan.liao@linux.intel.com Signed-off-by: Peter Ujfalusi peter.ujfalusi@linux.intel.com --- sound/soc/sof/ipc4-pcm.c | 83 ++++++++++++++++++++++++++++++---------- 1 file changed, 62 insertions(+), 21 deletions(-)
diff --git a/sound/soc/sof/ipc4-pcm.c b/sound/soc/sof/ipc4-pcm.c index 24f82a6f3610..075388d5cb3c 100644 --- a/sound/soc/sof/ipc4-pcm.c +++ b/sound/soc/sof/ipc4-pcm.c @@ -19,12 +19,14 @@ * struct sof_ipc4_timestamp_info - IPC4 timestamp info * @host_copier: the host copier of the pcm stream * @dai_copier: the dai copier of the pcm stream - * @stream_start_offset: reported by fw in memory window (converted to frames) - * @stream_end_offset: reported by fw in memory window (converted to frames) + * @stream_start_offset: reported by fw in memory window (converted to + * frames at host_copier sampling rate) + * @stream_end_offset: reported by fw in memory window (converted to + * frames at host_copier sampling rate) * @llp_offset: llp offset in memory window - * @boundary: wrap boundary should be used for the LLP frame counter * @delay: Calculated and stored in pointer callback. The stored value is - * returned in the delay callback. + * returned in the delay callback. Expressed in frames at host copier + * sampling rate. */ struct sof_ipc4_timestamp_info { struct sof_ipc4_copier *host_copier; @@ -33,7 +35,6 @@ struct sof_ipc4_timestamp_info { u64 stream_end_offset; u32 llp_offset;
- u64 boundary; snd_pcm_sframes_t delay; };
@@ -48,6 +49,16 @@ struct sof_ipc4_pcm_stream_priv { bool chain_dma_allocated; };
+/* + * Modulus to use to compare host and link position counters. The sampling + * rates may be different, so the raw hardware counters will wrap + * around at different times. To calculate differences, use + * DELAY_BOUNDARY as a common modulus. This value must be smaller than + * the wrap-around point of any hardware counter, and larger than any + * valid delay measurement. + */ +#define DELAY_BOUNDARY U32_MAX + static inline struct sof_ipc4_timestamp_info * sof_ipc4_sps_to_time_info(struct snd_sof_pcm_stream *sps) { @@ -1049,6 +1060,35 @@ static int sof_ipc4_pcm_hw_params(struct snd_soc_component *component, return 0; }
+static u64 sof_ipc4_frames_dai_to_host(struct sof_ipc4_timestamp_info *time_info, u64 value) +{ + u64 dai_rate, host_rate; + + if (!time_info->dai_copier || !time_info->host_copier) + return value; + + /* + * copiers do not change sampling rate, so we can use the + * out_format independently of stream direction + */ + dai_rate = time_info->dai_copier->data.out_format.sampling_frequency; + host_rate = time_info->host_copier->data.out_format.sampling_frequency; + + if (!dai_rate || !host_rate || dai_rate == host_rate) + return value; + + /* take care not to overflow u64, rates can be up to 768000 */ + if (value > U32_MAX) { + value = div64_u64(value, dai_rate); + value *= host_rate; + } else { + value *= host_rate; + value = div64_u64(value, dai_rate); + } + + return value; +} + static int sof_ipc4_get_stream_start_offset(struct snd_sof_dev *sdev, struct snd_pcm_substream *substream, struct snd_sof_pcm_stream *sps, @@ -1099,14 +1139,13 @@ static int sof_ipc4_get_stream_start_offset(struct snd_sof_dev *sdev, time_info->stream_end_offset = ppl_reg.stream_end_offset; do_div(time_info->stream_end_offset, dai_sample_size);
+ /* convert to host frame time */ + time_info->stream_start_offset = + sof_ipc4_frames_dai_to_host(time_info, time_info->stream_start_offset); + time_info->stream_end_offset = + sof_ipc4_frames_dai_to_host(time_info, time_info->stream_end_offset); + out: - /* - * Calculate the wrap boundary need to be used for delay calculation - * The host counter is in bytes, it will wrap earlier than the frames - * based link counter. - */ - time_info->boundary = div64_u64(~((u64)0), - frames_to_bytes(substream->runtime, 1)); /* Initialize the delay value to 0 (no delay) */ time_info->delay = 0;
@@ -1149,6 +1188,8 @@ static int sof_ipc4_pcm_pointer(struct snd_soc_component *component,
/* For delay calculation we need the host counter */ host_cnt = snd_sof_pcm_get_host_byte_counter(sdev, component, substream); + + /* Store the original value to host_ptr */ host_ptr = host_cnt;
/* convert the host_cnt to frames */ @@ -1167,6 +1208,8 @@ static int sof_ipc4_pcm_pointer(struct snd_soc_component *component, sof_mailbox_read(sdev, time_info->llp_offset, &llp, sizeof(llp)); dai_cnt = ((u64)llp.reading.llp_u << 32) | llp.reading.llp_l; } + + dai_cnt = sof_ipc4_frames_dai_to_host(time_info, dai_cnt); dai_cnt += time_info->stream_end_offset;
/* In two cases dai dma counter is not accurate @@ -1200,8 +1243,9 @@ static int sof_ipc4_pcm_pointer(struct snd_soc_component *component, dai_cnt -= time_info->stream_start_offset; }
- /* Wrap the dai counter at the boundary where the host counter wraps */ - div64_u64_rem(dai_cnt, time_info->boundary, &dai_cnt); + /* Convert to a common base before comparisons */ + dai_cnt &= DELAY_BOUNDARY; + host_cnt &= DELAY_BOUNDARY;
if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK) { head_cnt = host_cnt; @@ -1211,14 +1255,11 @@ static int sof_ipc4_pcm_pointer(struct snd_soc_component *component, tail_cnt = host_cnt; }
- if (head_cnt < tail_cnt) { - time_info->delay = time_info->boundary - tail_cnt + head_cnt; - goto out; - } - - time_info->delay = head_cnt - tail_cnt; + if (unlikely(head_cnt < tail_cnt)) + time_info->delay = DELAY_BOUNDARY - tail_cnt + head_cnt; + else + time_info->delay = head_cnt - tail_cnt;
-out: /* * Convert the host byte counter to PCM pointer which wraps in buffer * and it is in frames
From: Kai Vehmanen kai.vehmanen@linux.intel.com
Assumption that chain DMA module starts the link DMA when 1ms of data is available from host is not correct. Instead the firmware chain DMA module fills the link DMA with initial buffer of zeroes and the host and link DMAs are started at the same time.
This results in a small error in delay calculation. This can become a more severe problem if host DMA has delays that exceed 1ms. This results in negative delay to be calculated and bogus values reported to applications. This can confuse some applications like alsa_conformance_test.
Fix the issue by correctly calculating the firmware chain DMA preamble size and initializing the start offset to this value.
Cc: stable@vger.kernel.org Fixes: a1d203d390e0 ("ASoC: SOF: ipc4-pcm: Enable delay reporting for ChainDMA streams") Signed-off-by: Kai Vehmanen kai.vehmanen@linux.intel.com Reviewed-by: Péter Ujfalusi peter.ujfalusi@linux.intel.com Reviewed-by: Bard Liao yung-chuan.liao@linux.intel.com Signed-off-by: Peter Ujfalusi peter.ujfalusi@linux.intel.com --- sound/soc/sof/ipc4-pcm.c | 14 ++++++++++---- sound/soc/sof/ipc4-topology.c | 1 - sound/soc/sof/ipc4-topology.h | 2 ++ 3 files changed, 12 insertions(+), 5 deletions(-)
diff --git a/sound/soc/sof/ipc4-pcm.c b/sound/soc/sof/ipc4-pcm.c index 075388d5cb3c..9542c428daa4 100644 --- a/sound/soc/sof/ipc4-pcm.c +++ b/sound/soc/sof/ipc4-pcm.c @@ -1108,7 +1108,7 @@ static int sof_ipc4_get_stream_start_offset(struct snd_sof_dev *sdev, return -EINVAL; } else if (host_copier->data.gtw_cfg.node_id == SOF_IPC4_CHAIN_DMA_NODE_ID) { /* - * While the firmware does not supports time_info reporting for + * While the firmware does not support time_info reporting for * streams using ChainDMA, it is granted that ChainDMA can only * be used on Host+Link pairs where the link position is * accessible from the host side. @@ -1116,10 +1116,16 @@ static int sof_ipc4_get_stream_start_offset(struct snd_sof_dev *sdev, * Enable delay calculation in case of ChainDMA via host * accessible registers. * - * The ChainDMA uses 2x 1ms ping-pong buffer, dai side starts - * when 1ms data is available + * The ChainDMA prefills the link DMA with a preamble + * of zero samples. Set the stream start offset based + * on size of the preamble (driver provided fifo size + * multiplied by 2.5). We add 1ms of margin as the FW + * will align the buffer size to DMA hardware + * alignment that is not known to host. */ - time_info->stream_start_offset = substream->runtime->rate / MSEC_PER_SEC; + int pre_ms = SOF_IPC4_CHAIN_DMA_BUF_SIZE_MS * 5 / 2 + 1; + + time_info->stream_start_offset = pre_ms * substream->runtime->rate / MSEC_PER_SEC; goto out; }
diff --git a/sound/soc/sof/ipc4-topology.c b/sound/soc/sof/ipc4-topology.c index b6a732d0adb4..36568160f163 100644 --- a/sound/soc/sof/ipc4-topology.c +++ b/sound/soc/sof/ipc4-topology.c @@ -33,7 +33,6 @@ MODULE_PARM_DESC(ipc4_ignore_cpc,
#define SOF_IPC4_GAIN_PARAM_ID 0 #define SOF_IPC4_TPLG_ABI_SIZE 6 -#define SOF_IPC4_CHAIN_DMA_BUF_SIZE_MS 2
static DEFINE_IDA(alh_group_ida); static DEFINE_IDA(pipeline_ida); diff --git a/sound/soc/sof/ipc4-topology.h b/sound/soc/sof/ipc4-topology.h index dfa1a6c2ffa8..6b29692dff16 100644 --- a/sound/soc/sof/ipc4-topology.h +++ b/sound/soc/sof/ipc4-topology.h @@ -263,6 +263,8 @@ struct sof_ipc4_dma_stream_ch_map { #define SOF_IPC4_DMA_METHOD_HDA 1 #define SOF_IPC4_DMA_METHOD_GPDMA 2 /* defined for consistency but not used */
+#define SOF_IPC4_CHAIN_DMA_BUF_SIZE_MS 2 + /** * struct sof_ipc4_dma_config: DMA configuration * @dma_method: HDAudio or GPDMA
Add dev_dbg_ratelimited() wrapper for snd_sof_pcm specific debug prints that needs rate limited.
Signed-off-by: Peter Ujfalusi peter.ujfalusi@linux.intel.com Reviewed-by: Kai Vehmanen kai.vehmanen@linux.intel.com Reviewed-by: Bard Liao yung-chuan.liao@linux.intel.com --- sound/soc/sof/sof-audio.h | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/sound/soc/sof/sof-audio.h b/sound/soc/sof/sof-audio.h index db6973c8eac3..a8b93a2eec9c 100644 --- a/sound/soc/sof/sof-audio.h +++ b/sound/soc/sof/sof-audio.h @@ -629,6 +629,11 @@ void snd_sof_pcm_init_elapsed_work(struct work_struct *work); (__spcm)->pcm.pcm_id, (__spcm)->pcm.pcm_name, __dir, \ ##__VA_ARGS__)
+#define spcm_dbg_ratelimited(__spcm, __dir, __fmt, ...) \ + dev_dbg_ratelimited((__spcm)->scomp->dev, "pcm%u (%s), dir %d: " __fmt, \ + (__spcm)->pcm.pcm_id, (__spcm)->pcm.pcm_name, __dir, \ + ##__VA_ARGS__) + #define spcm_err(__spcm, __dir, __fmt, ...) \ dev_err((__spcm)->scomp->dev, "%s: pcm%u (%s), dir %d: " __fmt, \ __func__, (__spcm)->pcm.pcm_id, (__spcm)->pcm.pcm_name, __dir, \
From: Kai Vehmanen kai.vehmanen@linux.intel.com
Add a sanity check for the calculated delay value before reporting it to the application. If the value is clearly invalid, emit a rate limited warning to kernel log and return a zero delay. This can occur e.g if the host or link DMA hits a buffer over/underrun condition.
Signed-off-by: Kai Vehmanen kai.vehmanen@linux.intel.com Reviewed-by: Péter Ujfalusi peter.ujfalusi@linux.intel.com Reviewed-by: Bard Liao yung-chuan.liao@linux.intel.com Signed-off-by: Peter Ujfalusi peter.ujfalusi@linux.intel.com --- sound/soc/sof/ipc4-pcm.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/sound/soc/sof/ipc4-pcm.c b/sound/soc/sof/ipc4-pcm.c index 9542c428daa4..6d81969e181c 100644 --- a/sound/soc/sof/ipc4-pcm.c +++ b/sound/soc/sof/ipc4-pcm.c @@ -59,6 +59,8 @@ struct sof_ipc4_pcm_stream_priv { */ #define DELAY_BOUNDARY U32_MAX
+#define DELAY_MAX (DELAY_BOUNDARY >> 1) + static inline struct sof_ipc4_timestamp_info * sof_ipc4_sps_to_time_info(struct snd_sof_pcm_stream *sps) { @@ -1266,6 +1268,13 @@ static int sof_ipc4_pcm_pointer(struct snd_soc_component *component, else time_info->delay = head_cnt - tail_cnt;
+ if (time_info->delay > DELAY_MAX) { + spcm_dbg_ratelimited(spcm, substream->stream, + "inaccurate delay, host %llu dai_cnt %llu", + host_cnt, dai_cnt); + time_info->delay = 0; + } + /* * Convert the host byte counter to PCM pointer which wraps in buffer * and it is in frames
It is allowed to mix Link and Host DMA channels in a way that their index is different. In this case we would read the LLP from a channel which is not used or used for other operation.
Such case can be reproduced on cAVS2.5 or ACE1 platforms with soundwire configuration: playback to SDW would take Host channel 0 (stream_tag 1) and no Link DMA used Second playback to HDMI (HDA) would use Host channel 1 (stream_tag 2) and Link channel 0 (stream_tag 1).
In this case reading the LLP from channel 2 is incorrect as that is not the Link channel used for the HDMI playback.
To correct this, we should look up the BE and get the channel used on the Link side.
Fixes: 67b182bea08a ("ASoC: SOF: Intel: hda: Implement get_stream_position (Linear Link Position)") Signed-off-by: Peter Ujfalusi peter.ujfalusi@linux.intel.com Reviewed-by: Kai Vehmanen kai.vehmanen@linux.intel.com Reviewed-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com --- sound/soc/sof/intel/hda-stream.c | 29 +++++++++++++++++++++++++++-- 1 file changed, 27 insertions(+), 2 deletions(-)
diff --git a/sound/soc/sof/intel/hda-stream.c b/sound/soc/sof/intel/hda-stream.c index a34f472ef175..9c3b3a9aaf83 100644 --- a/sound/soc/sof/intel/hda-stream.c +++ b/sound/soc/sof/intel/hda-stream.c @@ -1129,10 +1129,35 @@ u64 hda_dsp_get_stream_llp(struct snd_sof_dev *sdev, struct snd_soc_component *component, struct snd_pcm_substream *substream) { - struct hdac_stream *hstream = substream->runtime->private_data; - struct hdac_ext_stream *hext_stream = stream_to_hdac_ext_stream(hstream); + struct snd_soc_pcm_runtime *rtd = snd_soc_substream_to_rtd(substream); + struct snd_soc_pcm_runtime *be_rtd = NULL; + struct hdac_ext_stream *hext_stream; + struct snd_soc_dai *cpu_dai; + struct snd_soc_dpcm *dpcm; u32 llp_l, llp_u;
+ /* + * The LLP needs to be read from the Link DMA used for this FE as it is + * allowed to use any combination of Link and Host channels + */ + for_each_dpcm_be(rtd, substream->stream, dpcm) { + if (dpcm->fe != rtd) + continue; + + be_rtd = dpcm->be; + } + + if (!be_rtd) + return 0; + + cpu_dai = snd_soc_rtd_to_cpu(be_rtd, 0); + if (!cpu_dai) + return 0; + + hext_stream = snd_soc_dai_get_dma_data(cpu_dai, substream); + if (!hext_stream) + return 0; + /* * The pplc_addr have been calculated during probe in * hda_dsp_stream_init():
On Thu, 02 Oct 2025 10:47:14 +0300, Peter Ujfalusi wrote:
With SRC in the firmware processing pipeline the FE and BE rate can be different, the sample counters on the two side of the DSP counts in different rate domain and they will drift apart. The counters should be moved to the same rate domain to be usable for delay calculation.
The ChainDMA offset value was incorrect since the host buffer size and the trigger to start the chain is misunderstood initially.
[...]
Applied to
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next
Thanks!
[1/5] ASoC: SOF: ipc4-pcm: fix delay calculation when DSP resamples commit: bcd1383516bb5a6f72b2d1e7f7ad42c4a14837d1 [2/5] ASoC: SOF: ipc4-pcm: fix start offset calculation for chain DMA commit: bace10b59624e6bd8d68bc9304357f292f1b3dcf
All being well this means that it will be integrated into the linux-next tree (usually sometime in the next 24 hours) and sent to Linus during the next merge window (or sooner if it is a bug fix), however if problems are discovered then the patch may be dropped or reverted.
You may get further e-mails resulting from automated or manual testing and review of the tree, please engage with people reporting problems and send followup patches addressing any issues that are reported if needed.
If any updates are required or you are submitting further changes they should be sent as incremental updates against current git, existing patches will not be replaced.
Please add any relevant lists and maintainers to the CCs when replying to this mail.
Thanks, Mark
On Thu, 02 Oct 2025 10:47:14 +0300, Peter Ujfalusi wrote:
With SRC in the firmware processing pipeline the FE and BE rate can be different, the sample counters on the two side of the DSP counts in different rate domain and they will drift apart. The counters should be moved to the same rate domain to be usable for delay calculation.
The ChainDMA offset value was incorrect since the host buffer size and the trigger to start the chain is misunderstood initially.
[...]
Applied to
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next
Thanks!
[3/5] ASoC: SOF: sof-audio: add dev_dbg_ratelimited wrapper commit: 18dbff48a1ea58100f9fa6886cfef286a96a5fb0 [4/5] ASoC: SOF: ipc4-pcm: do not report invalid delay values commit: a4b8152c09a832b089864e5e209a479bb0fb5cc9 [5/5] ASoC: SOF: Intel: Read the LLP via the associated Link DMA channel commit: aaab61de1f1e44a2ab527e935474e2e03a0f6b08
All being well this means that it will be integrated into the linux-next tree (usually sometime in the next 24 hours) and sent to Linus during the next merge window (or sooner if it is a bug fix), however if problems are discovered then the patch may be dropped or reverted.
You may get further e-mails resulting from automated or manual testing and review of the tree, please engage with people reporting problems and send followup patches addressing any issues that are reported if needed.
If any updates are required or you are submitting further changes they should be sent as incremental updates against current git, existing patches will not be replaced.
Please add any relevant lists and maintainers to the CCs when replying to this mail.
Thanks, Mark
linux-stable-mirror@lists.linaro.org