Lots of 'timeout of 48000ms exceeded' during the day, but the site is still up? #275

Open
Oaktribe opened this issue Aug 26, 2021 · 25 comments
Labels
bug Something isn't working

Comments

@Oaktribe

Is it a duplicate question?
I do not think so.

Describe the bug
During the day, numerous alerts are reported for a monitor timing out with the message 'timeout of 48000ms exceeded'. The monitor then works again, and later another timeout is reported.

I set up a custom .js script that checks the site the same way Kuma does, running outside of Docker, and it reports nothing during the day.
Could Docker Desktop on Windows 10 with WSL2 be producing false timeouts?
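
A minimal sketch of what such an out-of-Docker probe might look like, for anyone who wants to run the same comparison. This is not the script from the report: the `probe` function, its `fetchImpl` parameter and the result shape are hypothetical, and it assumes Node.js 18+ (global `fetch` and `AbortSignal.timeout`). Only the 48000 ms limit comes from the error message above.

```javascript
// Hypothetical standalone probe, not the reporter's actual script.
// Assumes Node.js 18+ for the global fetch and AbortSignal.timeout.
const TIMEOUT_MS = 48000; // same limit as in 'timeout of 48000ms exceeded'

// fetchImpl is injectable so the function can be exercised without a network.
async function probe(url, timeoutMs = TIMEOUT_MS, fetchImpl = fetch) {
  const started = Date.now();
  try {
    const res = await fetchImpl(url, { signal: AbortSignal.timeout(timeoutMs) });
    return { ok: res.ok, status: res.status, ms: Date.now() - started };
  } catch (err) {
    // Covers timeouts as well as DNS and connection errors.
    return { ok: false, error: err.name, ms: Date.now() - started };
  }
}

// Example use (URL is a placeholder): log one result per minute.
// setInterval(() => probe("https://example.com/").then(console.log), 60_000);
```

Running this on the host next to the Docker-based Kuma instance, with the same URL and interval, makes it easier to tell whether a timeout comes from the site or from the container's network path.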

To Reproduce
Add a monitor for the site and wait. That's how I do it.

Expected behavior
It should not report a timeout, since the site does respond.

Info

  • Uptime Kuma Version: 1.3.2
  • Using Docker?: Yes
  • OS: Windows
  • Browser: Firefox

Screenshots
1

Error Log
Nothing is logged to the docker log window.

Oaktribe added the bug label Aug 26, 2021
@louislam
Owner

Honestly, I think Docker Desktop for Windows is buggy sometimes. It is worth trying without Docker.

git clone https://github.com/louislam/uptime-kuma.git
cd uptime-kuma
npm run setup

npm run start-server

@Oaktribe
Author

You were correct. I have let it run normally (not with Docker) for about 16 hours now, and no errors have been reported.
Before there would always be some timeout errors.

So Docker for Windows + WSL2 was the culprit here.

@Oaktribe
Author

Oaktribe commented Aug 29, 2021

Did some Googling and found this.
microsoft/Windows-Containers#145

So I ran that command on my Windows host machine, restarted, and fired up Docker again. So far it seems to have helped: before, there would be at least 10 timeouts a day, and it has now been running for 48 hours without any.

-- EDIT
Spoke too soon, timeouts are back. Oh well.

@CallMeTerdFerguson

CallMeTerdFerguson commented Aug 31, 2021

I'm experiencing the same issue, but mine is running in Docker on Ubuntu 20.04.3. Like OP, the response times that Kuma reports aren't consistent with the actual response times of my services. I get an alternating graph of timeouts and outlandish response times like 30-40s, when the actual service(s) respond in <3s consistently outside of Kuma. Should I open a new issue, or does this one need to be reopened, since it's not specific to Docker on Windows? @louislam preferences?

@louislam
Owner

@Oaktribe @agrider Because I cannot reproduce the problem, it is really hard to address.

However, one of our contributors discovered that it could be an Alpine Docker problem.
#294 (comment)

I will build a Debian/Ubuntu Docker image later for you to test.

@agrider
If possible, try to run it without Docker:
https://github.com/louislam/uptime-kuma/wiki/%F0%9F%94%A7-How-to-Install#-without-docker-recommended-for-x86x64-only

@CallMeTerdFerguson

@louislam I don't install basically anything bare metal on my server if I can help it, so I can't help with the non-docker option, but I did clone down the repo to my Gitlab instance and rebuilt the image on Debian Bullseye instead to see if it's an Alpine issue. I'm going to leave it up and running a day or two and see what the result is.

@louislam
Owner

@louislam I don't install basically anything bare metal on my server if I can help it, so I can't help with the non-docker option, but I did clone down the repo to my Gitlab instance and rebuilt the image on Debian Bullseye instead to see if it's an Alpine issue. I'm going to leave it up and running a day or two and see what the result is.

Thank you so much!

@CallMeTerdFerguson

OK, so I don't need to wait a day. I can pretty conclusively say that the issue is not Alpine related, as my times were still atrocious. That doesn't surprise me, as I use numerous other Alpine-based images with no latency issues.

I did resolve my issue, though. When I stood up my instance in my compose file, I forgot to apply my default configs anchor, one of which sets DNS directly on the container to point to my local unbound instance. As soon as I set DNS to go directly to the DNS server instead of routing through the Docker gateway for DNS resolution (which still ultimately landed at unbound), my times dropped to the expected values of <100ms, under both the image I made and the latest Alpine-based image.

My guess is that if you have a large number of containers like I do (>80), or a high check rate set, you can overwhelm the DNS resolver in the Docker gateway. Long story short, it's not necessarily a conclusive fix, but I'd try setting the DNS on your Kuma container to your preferred upstream server directly instead of letting the Docker gateway manage it.
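
One way to check whether name resolution is the slow part is to time the lookup on its own from inside the container. The sketch below is an illustration only: `timeLookup` and its `lookupImpl` parameter are hypothetical names. It relies on `dns.promises.lookup`, the `getaddrinfo`-based call that Node's `http` module also uses by default.

```javascript
// Sketch: measure how long a hostname lookup takes, separately from the
// HTTP request. timeLookup is a hypothetical helper, not part of Uptime Kuma.
const dns = require("node:dns").promises;

// lookupImpl is injectable so the timing logic can be tested offline.
async function timeLookup(hostname, lookupImpl = (h) => dns.lookup(h)) {
  const started = process.hrtime.bigint();
  let address = null;
  let error = null;
  try {
    ({ address } = await lookupImpl(hostname));
  } catch (err) {
    error = err.code || err.message;
  }
  // Convert nanoseconds to milliseconds.
  const ms = Number(process.hrtime.bigint() - started) / 1e6;
  return { hostname, address, error, ms };
}

// Example use (hostname is a placeholder):
// timeLookup("example.com").then(console.log);
```

If the lookup time shrinks after pointing the container at the DNS server directly, that supports the resolver explanation; if not, the delay is elsewhere.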

That said, is uptime-kuma set to cache DNS results and respect the TTLs of the DNS records it requests? I have the vast majority of my local DNS entries set to an enormously large TTL, so even with DNS going through the Docker gateway and a high check rate, there should have been a relatively small number of DNS queries. That doesn't look like it was the case: based on the traffic I was seeing, it seemed like Kuma was making a DNS request per check.
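
For illustration, a TTL-respecting cache could look roughly like the sketch below. This is not Uptime Kuma's code: `createDnsCache`, `resolveImpl` and `now` are hypothetical names. It uses `dns.promises.resolve4(hostname, { ttl: true })`, which returns `{ address, ttl }` records. Note that `resolve4` queries the configured DNS server directly and does not consult `/etc/hosts`.

```javascript
// Sketch of a DNS cache that honours record TTLs (hypothetical helper).
const dns = require("node:dns").promises;

function createDnsCache(
  resolveImpl = (h) => dns.resolve4(h, { ttl: true }), // -> [{ address, ttl }]
  now = Date.now // injectable clock, for testing
) {
  const cache = new Map(); // hostname -> { addresses, expiresAt }

  return async function resolve(hostname) {
    const hit = cache.get(hostname);
    if (hit && hit.expiresAt > now()) {
      return hit.addresses; // still fresh: no DNS query is sent
    }
    const records = await resolveImpl(hostname);
    // Expire the entry when the shortest-lived record expires.
    const minTtlSeconds = Math.min(...records.map((r) => r.ttl));
    const entry = {
      addresses: records.map((r) => r.address),
      expiresAt: now() + minTtlSeconds * 1000,
    };
    cache.set(hostname, entry);
    return entry.addresses;
  };
}
```

With a cache like this, a record with a very large TTL would cause one query at startup and then none until it expires, rather than one query per check.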

@louislam
Owner

louislam commented Sep 1, 2021

I was seeing it seemed like Kuma was making a DNS request per check.

Thank you for your finding. It seems worth implementing a DNS cache in Uptime Kuma. I always thought DNS caching was managed by the OS.

@CallMeTerdFerguson

It usually is, though Alpine is a pretty sparse distro, so it may not have a caching resolver configured/enabled by default. I'm not familiar enough with it, and I haven't encountered DNS issues like this with it before.

It also might, and my conjecture about the root cause could be totally wrong, in which case I'm unsure why switching to a direct DNS config resolved the issue.

@yasharne

Same problem on 1.10.2; switching to the Debian-based image fixed it.

@csakaszamok

v1.21.3
Unfortunately, this false positive error still occurs every day.

@UtechtDustin

UtechtDustin commented Jun 14, 2023

We have the same issue (multiple times per day) with Kuma version 1.21.3, installed without Docker.
So the Docker image can't be the issue. It's running on our infrastructure server, which also hosts a DNS server.

@CommanderStorm
Collaborator

@csakaszamok @UtechtDustin @yasharne
Have you activated the DNS cache?
[image]

@csakaszamok

It was disabled.
I've enabled it now, thanks for the tip.

@UtechtDustin

It was disabled, let's see if that fixes it.

@UtechtDustin

@CommanderStorm it seems that option didn't fix the issue.
Last night we got a lot of "spam" messages (32 messages within a few minutes) with the same error, 'timeout of 48000ms exceeded'.
I'm not sure why DNS would cause a timeout of 48 seconds.

@solracsf

solracsf commented Aug 8, 2023

#3472 could help here.
To be shipped in 1.23.0.

@Aj7Ay

Aj7Ay commented Apr 22, 2024

@csakaszamok @UtechtDustin @yasharne Have you activated the DNS cache? [image]

It's deprecated. Is a new option being added?

@CommanderStorm
Collaborator

What are you asking?

I think you are asking what we replaced this with.
We have replaced this with the Name Service Cache Daemon (nscd) if you are using the Docker container. For a native installation, you will have to install it yourself.

@UPSOKen

UPSOKen commented Jul 10, 2024

I am still experiencing this issue on Ubuntu docker. The majority of my sites with a specific hosting company do this ALL day, even though the sites are all working fine.

@af7567

This comment was marked as resolved.

@CommanderStorm

This comment was marked as resolved.

@GordonHannan

I am still experiencing this issue on Ubuntu docker. The majority of my sites with a specific hosting company do this ALL day, even though the sites are all working fine.

You're not alone, I'm seeing it too. The name service caching daemon is enabled, and I've tried both the public release and the beta. I'm using Ubuntu with Docker (behind a Cloudflare tunnel, though). Not sure whether that is causing issues.

CommanderStorm reopened this Nov 7, 2024
@puregraphx

Could it help to set Retries to 1 or higher? I had one site on a Hetzner server that also hosts other WP sites. This one site got occasional 48s timeouts, but it was the only site in Kuma without Retries.
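
To show why a retry setting can hide isolated timeouts, here is a small sketch of the general idea: a monitor is only reported as down after more than `maxRetries` consecutive failed checks. The `createRetryTracker` helper and the "UP" / "PENDING" / "DOWN" labels are hypothetical and are not taken from Uptime Kuma's source.

```javascript
// Sketch of retry-before-alert logic (hypothetical, not Uptime Kuma's code).
// A failed check only counts as DOWN once more than maxRetries consecutive
// checks have failed; any successful check resets the counter.
function createRetryTracker(maxRetries) {
  let consecutiveFailures = 0;

  return function record(checkPassed) {
    if (checkPassed) {
      consecutiveFailures = 0;
      return "UP";
    }
    consecutiveFailures += 1;
    return consecutiveFailures > maxRetries ? "DOWN" : "PENDING";
  };
}

// With maxRetries = 1, a single timeout between successful checks stays
// "PENDING" and would not trigger a notification under this scheme.
const record = createRetryTracker(1);
console.log(record(false)); // PENDING
console.log(record(true));  // UP
```

This only masks sporadic false timeouts; it does not explain why they happen.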
