Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rockchip/64: fix pl330 cyclic dma transfers #7695

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

paolosabatino
Copy link
Contributor

@paolosabatino paolosabatino commented Jan 12, 2025

Description

PL330 DMA controller driver in mainline kernel is suboptimal; according to this patch, the driver can be modified to make it more efficient and stable.

This inefficiency causes ALL rockchip SoCs I tested (rk3228, rk3308, rk3328 and rk3399) to produce pops and clicks when playing through I²S (ie: analog or HDMI) or SPDIF outputs even with modest load on memory subsystem, but the DMA controller also serves UART, SPI bus, CAN bus, PWMs, and some other device... so possibly the DMA issues of the mainline driver can affect several other areas, sound is just the most evident one.

In my case, I can steadily reproduce the problem on all the mentioned SoCs this way:

  • play a track via aplay, ogg123 or whatever. The issue appears more prominent if at high samplerates (>= 96khz)
  • at the same time run mbw, tying the process to a core which is not the same core handling DMA controller interrupts

An example:

ogg123 -d alsa -o dev:hw:1 "fr08 soundtrack.ogg"
taskset -c 3 mbw -t2 -n 256 32M  # use "-c 5" to tie to little core on RK3399

This PR takes the patch above and applies to kernel 6.12 (current) and 6.13 (edge) for both rockchip and rockchip64 families.

After the patch, the issue is much more reduced, but not totally cancelled yet, because if you add a third iperf3 process to the mix, the clicks and pops still appears (at least on rk3308):

ogg123 -d alsa -o dev:hw:1 "fr08 soundtrack.ogg"
iperf3 -c <your_iperf_server_ip> -t 120 -P 4
taskset -c 3 mbw -t2 -n 256 32M  # use "-c 5" to tie to little core on RK3399

Nonetheless, this is a first step that will address the common scenario with "regular" workloads and is probably a good idea to bring in this fix until I progress in completely fixing the issue. As a note, the rockchip vendor 4.4 kernel on rk3308 does not show any issue.

I don't know if rk356x and rk3588 also are affected, but since they use pl330 DMA controllers as well, I guess the issues can appear there too. Other possibly affected SoC may be some Samsung Exynos SoCs.

SPDIF maxburst parameter is also increased from 4 to 8 words, as it is in the vendor kernel.

GitHub issue reference:
Jira reference number AR-2584

How Has This Been Tested?

  • current 6.12 kernel built and packaged for both rockchip and rockchip64
  • edge 6.13 kernel built and packaged for both rockchip and rockchip64
  • tested current 6.12 kernel on rk322x, rk3308, rk3328 and rk3399: kernel boots on all SoCs and a run of ogg123 + mbw has been done to verify there were no artifacts in analog and SPDIF audio

Checklist:

Please delete options that are not relevant.

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • My changes generate no new warnings

@paolosabatino paolosabatino requested a review from a team January 12, 2025 17:40
@github-actions github-actions bot added size/large PR with 250 lines or more Needs review Seeking for review Hardware Hardware related like kernel, U-Boot, ... Patches Patches related to kernel, U-Boot, ... labels Jan 12, 2025
@paolosabatino paolosabatino marked this pull request as draft January 15, 2025 19:53
@paolosabatino
Copy link
Contributor Author

I move this to draft, there seems more involved into and I want to dig deeper...

@paolosabatino
Copy link
Contributor Author

paolosabatino commented Jan 21, 2025

Ok some updates about this.

I can confirm the pl330 driver in mainline kernel handles the cyclic/periodic dma transfers in a non optimal way; from what I could understand, the periodic transfers of buffers can be handled by the DMA controller itself if instructed to do so, but currently they are managed when IRQ triggers by the kernel. The kernel has to handle the IRQ and "rearm" the DMA controller to repeat the transfer and, under heavy load, this takes a bit of time that causes the audio to click, pop and buzz. Also it is less efficient because it makes the IRQ handling by the kernel longer.
The optimized approach is to tell the DMA controller to rearm by itself when there are circular buffers with periodic transfers, and just notify the kernel when a buffer has been transferred.

Latest rockchip vendor kernel (hosted on Radxa github), contains some enhancements and features (I see scatter/gather and interleaved transfers) to pl330 driver, since the DMA controller itself is actually a microcontroller on its own; more than that, the driver works perfectly as a drop-in replacement in the mainline kernel.

The iperf3 issue seems to be another problem: when there are heavy memory transfers (such it is a benchmark with mbw), the stmmac driver (the ethernet driver for practically all rockchip devices) handles the IRQ in a way it causes the ksoftirqd kernel thread to hog one cpu core. This is not an issue per-se, but slows down the system and it is a sign of some contention while handling IRQs, hence the click&pops reapper. Most affected SoCs are rk3308 and rk322x, but does not seem to happen on "bigger" SoCs like rk3328 and rk3399, where ksoftirqd thread is always to near-zero percent of cpu usage.

@paolosabatino paolosabatino force-pushed the rockchip-fix-sound-click branch 4 times, most recently from 49714eb to 40214ec Compare January 26, 2025 13:53
@paolosabatino paolosabatino force-pushed the rockchip-fix-sound-click branch from 40214ec to 1807020 Compare January 26, 2025 15:51
@paolosabatino
Copy link
Contributor Author

paolosabatino commented Jan 26, 2025

Update: after some checks and online research, I found that several patches were already supplied to the kernel mailing lists and have never been accepted or are waiting in line.

I provide a set of 5 patches:

  • general-pl330-01-fix-periodic-transfers.patch - fixes the initial issue (reference)
  • general-pl330-02-add-support-for-interleaved-transfers.patch - as per name, add support for hardware assisted interleaved and scatter/gather transfers (reference)
  • general-pl330-03-fix-data-race-on-tx-transfers.patch - should fix some statistics (reference)
  • general-pl330-04-bigger-mcode-buffer.patch - allocate few bytes more to accomodate bigger microcode for scatter/gather (reference)
  • general-pl330-05-fix-unbalanced-power-down.patch - fixes Runtime pm usage count underflow message on heavy load, with cracks and pops (reference)

Tests have been made on all the board mentioned in the main comment so far; all the boards boots fine and dmesg are clean. To give an idea of the problem and the solution, these two are the recordings from the analog output of an Orange PI 4 LTS boards.

This is the recording with the patch applied and kernel 6.12:
pl330-with-patch.ogg.gz

And this is the recording with vanilla pl330 driver and kernel 6.6 (fresh official debian minimal image):
pl330-no-patch.ogg.gz

Tests were run with ogg123 and mbw running together, as already mentioned in the main comment.

I remove the draft status for the Pull Request, this applies to rockchip and rockchip64 families and for both current and edge kernels. Current 6.12 kernel was tested, edge wasn't.

@paolosabatino paolosabatino marked this pull request as ready for review January 26, 2025 18:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Hardware Hardware related like kernel, U-Boot, ... Needs review Seeking for review Patches Patches related to kernel, U-Boot, ... size/large PR with 250 lines or more
Development

Successfully merging this pull request may close these issues.

1 participant