Lots of 'timeout of 48000ms exceeded' during the day, but the site is still up? #275
Comments
Honestly, I think Docker Desktop for Windows is buggy sometimes. It's worth trying without Docker.
You were correct. I have let it run normally (not with Docker) for about 16 hours now, and no errors have been reported. So Docker for Windows + WSL2 was the culprit here.
Did some Googling and found this. So I ran that command on my Windows host machine, restarted, and fired up Docker again. So far it seems to have helped; before, there would be at least 10 timeouts a day. EDIT: It's been running for 48 hours now without any timeouts.
I'm experiencing the same issue, but mine is running in Docker on Ubuntu 20.04.3. Like OP, the response times Kuma is reporting aren't consistent with the actual response times of my services. I get an alternating graph of timeouts and outlandish response times like 30-40s, when the actual service(s) respond in <3s consistently outside of Kuma. Wondering if I should open a new issue or if this one needs to be re-opened, since it's not specific to Docker on Windows. @louislam, any preference?
@Oaktribe @agrider Because I cannot reproduce the problem, it is really hard to address. However, one of our contributors discovered that it could be an Alpine Docker problem. I will build a Debian/Ubuntu Docker image later for you guys to test. @agrider
@louislam I don't install basically anything bare metal on my server if I can help it, so I can't help with the non-Docker option, but I did clone the repo down to my GitLab instance and rebuilt the image on Debian Bullseye instead to see if it's an Alpine issue (a sketch of the rebuild is below). I'm going to leave it up and running for a day or two and see what the result is.
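For reference, such a rebase can be as simple as swapping the base image. A minimal sketch, assuming the repository as build context and the usual Node server entry point (this is not the project's official Dockerfile):

```dockerfile
# Hypothetical Debian-based rebuild to rule out Alpine/musl DNS behaviour.
FROM node:16-bullseye-slim
WORKDIR /app
COPY . .
# Install dependencies and build the frontend (devDependencies are needed for the build).
RUN npm ci && npm run build
EXPOSE 3001
CMD ["node", "server/server.js"]
```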
Thank you so much!
Ok, so I don't need to wait a day; I can pretty conclusively say that the issue is not Alpine-related, as my times were still atrocious. That doesn't surprise me, since I use numerous other Alpine-based images with no latency issues.

I did resolve my issue, though. When I stood up my instance in my compose file, I forgot to apply my default configs anchor, one of which sets DNS directly on the container to point at my local Unbound instance. As soon as I set DNS to go directly to the DNS server instead of routing through the Docker gateway for DNS resolution (which still ultimately landed at Unbound), my times dropped to the expected values of <100ms, under both the image I made and the latest Alpine-based image. My guess is that if you have a large number of containers like I do (>80), or a high check rate set, you can overwhelm the DNS resolver in the Docker gateway. Long story short, it's not necessarily a conclusive fix, but I'd try setting the DNS on your Kuma container to your preferred upstream server directly instead of letting the Docker gateway manage it (see the compose sketch below).

That said, is uptime-kuma set to cache DNS results and respect the TTLs of the DNS records it requests? I have the vast majority of my local DNS entries set to an enormously large TTL, so even with DNS going through the Docker gateway and a high check rate, there should have been a relatively small number of DNS queries. It looks like that wasn't the case; based on the traffic I was seeing, it seemed like Kuma was making a DNS request per check.
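A minimal docker-compose sketch of that workaround, assuming a local resolver at 192.168.1.10 (a placeholder address); the `dns:` key makes the container query that server directly instead of Docker's embedded resolver at 127.0.0.11:

```yaml
services:
  uptime-kuma:
    image: louislam/uptime-kuma:1
    ports:
      - "3001:3001"
    volumes:
      - uptime-kuma-data:/app/data
    # Bypass the Docker gateway's DNS proxy; query the LAN resolver directly.
    dns:
      - 192.168.1.10

volumes:
  uptime-kuma-data:
```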
Thank you for your finding; it seems it is worth implementing a DNS cache in Uptime Kuma. I always thought the DNS cache was managed by the OS.
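One way such a cache could be wired up in Node.js is with the cacheable-lookup package, which caches lookups in-process and respects record TTLs. A minimal sketch, assuming the CommonJS build of cacheable-lookup (v5); this is not Uptime Kuma's actual implementation:

```js
// Sketch only: install an in-process, TTL-respecting DNS cache on the
// default HTTP/HTTPS agents, so repeated checks reuse cached lookups.
const http = require('http');
const https = require('https');
const CacheableLookup = require('cacheable-lookup');

const cacheable = new CacheableLookup();
cacheable.install(http.globalAgent);
cacheable.install(https.globalAgent);
```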
It usually is, though Alpine is a pretty sparse distro, so it may not have a caching resolver configured/enabled by default? I'm really not familiar enough, and haven't encountered DNS issues like this with it before. It also might have one, and my conjecture about the root cause could be totally wrong, in which case I'm unsure why switching to direct DNS config resolved the issue.
Same problem on v1.21.3.
We have the same issue (multiple times per day) with Kuma version 1.21.3; Kuma is installed without Docker.
@csakaszamok @UtechtDustin @yasharne |
It was disabled. |
It was disabled; let's see if that fixes it.
@CommanderStorm it seems that option didn't fix the issue.
#3472 could help here. |
It's deprecated. Is there a new option to replace it?
What are you asking? I think you're asking what we replaced this with.
I am still experiencing this issue with Docker on Ubuntu. The majority of my sites with a specific hosting company do this ALL day, even though the sites are all working fine.
You're not alone; I'm seeing it too. The name service caching daemon is enabled, and I've tried both the public release and the beta. Using Ubuntu with Docker (through a Cloudflare tunnel, though). Not sure if that is causing issues or not?
Could it help if you set Retries to 1 or higher? I had one site on a Hetzner server that also hosts other WP sites. This one site got occasional 48s timeouts, but it was the only site in Kuma without Retries.
Is it a duplicate question?
Do not think so.
Describe the bug
During the day, numerous alerts of a monitor timing out with the message 'timeout of 48000ms exceeded' are reported. Then it works again, and then another timeout message is reported.
I set up a custom .js script outside of Docker that checks the site the same way Kuma does, and no timeouts are reported there during the day (a sketch of such a script is below).
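A minimal sketch of such a standalone checker, assuming an axios GET with the same 48-second timeout the error message reports; the URL is a placeholder, and this is not the OP's actual script:

```js
// Hypothetical standalone checker: probes the site once a minute with the
// same 48 s timeout Kuma reports, logging status and elapsed time.
const axios = require('axios');

const URL = 'https://example.com'; // placeholder for the monitored site
const TIMEOUT_MS = 48000;

async function check() {
    const start = Date.now();
    try {
        const res = await axios.get(URL, { timeout: TIMEOUT_MS });
        console.log(`${new Date().toISOString()} ${res.status} in ${Date.now() - start} ms`);
    } catch (err) {
        console.error(`${new Date().toISOString()} FAILED after ${Date.now() - start} ms: ${err.message}`);
    }
}

check();
setInterval(check, 60 * 1000);
```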
Could it be some issue with Docker Desktop on Windows 10 with WSL2 producing false timeouts?
To Reproduce
Add a monitor for the site, and wait. That's how I do it.
Expected behavior
Not to throw a timeout since the site does respond.
Info
Screenshots
Error Log
Nothing is logged to the docker log window.