
MINOR: KafkaProducerTest: fix resource leakage and replace explicit invocation of close() with try-with-resources #18678

Open · wants to merge 3 commits into base: trunk
Conversation

@pramithas (Contributor) commented Jan 23, 2025

In a few tests, producer.close() was never called, so I wrapped producer creation in try-with-resources.

I also replaced the remaining explicit invocations of close() with try-with-resources wherever resources were being closed that way.

All tests pass after the changes.
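The shape of the change can be sketched with a self-contained stand-in for the producer. FakeProducer below is hypothetical, used only so the example runs without kafka-clients on the classpath; the real KafkaProducer implements java.io.Closeable, so try-with-resources applies to it in the same way:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for KafkaProducer, which implements java.io.Closeable.
class FakeProducer implements AutoCloseable {
    final AtomicBoolean closed = new AtomicBoolean(false);

    void send(String record) {
        // Pretend to hand the record off to a network thread.
    }

    @Override
    public void close() {
        closed.set(true);
    }
}

public class TryWithResourcesSketch {
    public static void main(String[] args) {
        FakeProducer observed;
        // With an explicit close() call, any assertion failure between send()
        // and close() leaks the producer and its network thread.
        // try-with-resources closes the producer on every exit path.
        try (FakeProducer producer = new FakeProducer()) {
            observed = producer;
            producer.send("value");
        }
        System.out.println(observed.closed.get()); // prints "true"
    }
}
```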

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@github-actions bot added the triage, PRs from the community, producer, tests, and clients labels on Jan 23, 2025
@clolov (Contributor) commented Jan 24, 2025

Heya, thank you for this contribution! A question from my side: how did you detect this resource leak? I vaguely remember that other parts of Kafka use a testing-framework feature to confirm that no more threads are left hanging around than expected, which helped us catch cases similar to this one, so I am wondering why it wasn't caught here.

@ijuma requested a review from @divijvaidya on January 24, 2025
@divijvaidya (Member)

Great question @clolov! In other test suites we use TestUtils.assertNoNonDaemonThreads. @gharris1727 has done some excellent work on client leaks in #14783, which hasn't been merged yet, and we have a pending JIRA to catch thread leaks across all tests: https://issues.apache.org/jira/browse/KAFKA-16072

For this specific test, @pramithas, if we add the following lines, we are able to detect leaks in two tests:

    @AfterEach
    public void detectLeaks() {
        // Assert that no non-daemon thread started by this test class is still alive.
        List<Thread> nonDaemonThreads = Thread.getAllStackTraces().keySet().stream()
                .filter(t -> !t.isDaemon() && t.isAlive() && t.getName().startsWith(this.getClass().getName()))
                .collect(Collectors.toList());
        assertEquals(0, nonDaemonThreads.size());
    }

Let's start by adding this leak detection here in the short term, and then perhaps, @pramithas, if you are interested, you may want to pick up https://issues.apache.org/jira/browse/KAFKA-16072?

@divijvaidya added the ci-approved label and removed the triage and PRs from the community labels on Jan 24, 2025
@pramithas (Contributor, Author) commented Jan 25, 2025

@clolov After reviewing the test cases, I noticed that we had missed closing resources in a few of them, so it was a manual observation in this case.

@divijvaidya Thanks for the great insights on how we're detecting leaks in Kafka already. I'll implement your suggestions shortly and then move on to KAFKA-16072.

@pramithas (Contributor, Author) commented Jan 25, 2025

@divijvaidya Please review now. The Kafka producer's network thread is given the specific prefix kafka-producer-network-thread and is explicitly marked as a daemon thread, so I needed to adjust the assertion accordingly.

Also, I have moved the assertion logic to TestUtils for reusability.

Note: if a test leaks a resource, all subsequent tests will fail as well.
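The daemon-thread adjustment described above can be illustrated with a small stdlib-only sketch. The thread here only simulates a leaked producer network thread; the prefix matches KafkaProducer's actual thread naming, but the helper itself is hypothetical:

```java
import java.util.List;
import java.util.stream.Collectors;

public class ProducerThreadLeakSketch {
    // KafkaProducer names its I/O thread with this prefix and marks it as a
    // daemon thread, so a filter on !t.isDaemon() would never see it.
    static final String PREFIX = "kafka-producer-network-thread";

    // Hypothetical helper: find live threads by name prefix instead of
    // by daemon status.
    static List<Thread> leakedProducerThreads() {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> t.isAlive() && t.getName().startsWith(PREFIX))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulate a leaked producer network thread.
        Thread leaked = new Thread(() -> {
            try {
                Thread.sleep(10_000);
            } catch (InterruptedException e) {
                // Simulated shutdown: exit on interrupt.
            }
        }, PREFIX + " | test-client");
        leaked.setDaemon(true); // daemon, like the real network thread
        leaked.start();

        System.out.println(leakedProducerThreads().isEmpty()); // prints "false"

        leaked.interrupt();
        leaked.join();
        System.out.println(leakedProducerThreads().isEmpty()); // prints "true"
    }
}
```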

@gharris1727 (Contributor)

I'd like to weakly push back on adding new stateless assertions that threads aren't leaked.
See my comment here on why that strategy is flawed: #15101 (review)

I would recommend merging just the leak fixes, and not the AfterEach assertion or TestUtils method.

@pramithas (Contributor, Author)

Thank you, @gharris1727, for your thoughtful feedback. You raise a valid concern about cascading test failures with this approach.

However, if we ensure all existing tests are fixed, any new test that introduces a resource leak would be caught during local development before reaching CI.

Given this safeguard, would cascading failures still be a significant issue?
