SQLExecuteQueryOperator not timing out within expected timeframe #45930
Comments
Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise a PR to address this issue, please do so; there is no need to wait for approval.
cc @dabla, maybe you will have time to look into this? This seems to be inconsistent behavior in SQLExecuteQueryOperator.
I've encountered the same behaviour: the connection seems to block the thread, so the SQLExecuteQueryOperator task only stops once the connection returns. At least, that's what I experienced. To avoid the issue when using the JdbcHook, I specified a connection timeout on the JDBC connection. That way, when the task has to time out, the connection times out as well (even though the query continues to run on the database) and no longer blocks the thread of the Airflow worker. This was a solution specific to JDBC connections, though. The code is below:
So maybe the hook should also specify some kind of timeout on the connection being used. Or, if you want a more fine-grained approach per operator, you could specify the connection timeout through the hook_params of the SQLExecuteQueryOperator. Still, this all depends on whether the underlying connection supports it, of course.
Should we apply this logic in the underlying get-hook function that we are using? It is not generic to all connections, but if we can provide a workaround for the connections that support it, I think we should.
We could, but that might not work in all cases, and it would work around the real issue rather than fix it.
I am not sure that explains it. Airflow runs a process (which happens to hold a database connection), and Airflow should be able to kill that process. The fact that the issue happens for SQL but not for Python is really odd. From the description, it sounds like something is blocking the scheduler from killing the task (because it tried to kill the task only after it had finished). I think the issue is with the scheduler itself.
The question, at least for me, is: how do you debug something like this?
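One possible way to start debugging, sketched here with the standard library only: Python's faulthandler module can dump the traceback of every thread after a delay, and it does so from a watchdog thread, so it still reports when the main thread is stuck inside a blocking call. The names blocking_call and capture_stuck_traceback are hypothetical, and time.sleep is only a stand-in for a driver call that does not return until the query finishes. This is not Airflow code, just an illustration of the technique.

```python
import faulthandler
import tempfile
import time


def blocking_call(seconds):
    # Hypothetical stand-in for a database driver call that blocks
    # until the query has finished.
    time.sleep(seconds)


def capture_stuck_traceback(block_s, dump_after_s):
    """Run blocking_call and return the traceback dumped while it was stuck."""
    with tempfile.TemporaryFile(mode="w+") as log:
        # Ask faulthandler to write all thread tracebacks to `log`
        # once `dump_after_s` seconds have passed.
        faulthandler.dump_traceback_later(dump_after_s, file=log)
        try:
            blocking_call(block_s)
        finally:
            # Disarm the watchdog so it cannot fire after we are done.
            faulthandler.cancel_dump_traceback_later()
        log.seek(0)
        return log.read()


if __name__ == "__main__":
    print(capture_stuck_traceback(block_s=0.5, dump_after_s=0.1))
```

In a real task, the same call could be armed at the start of the task callable with a delay slightly longer than the execution timeout, so the dump shows which frame the process is sitting in when the timeout should already have fired.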
Apache Airflow Provider(s)
common-sql
Versions of Apache Airflow Providers
apache-airflow-providers-amazon==9.0.0
apache-airflow-providers-common-sql==1.19.0
apache-airflow-providers-mysql==5.7.3
Apache Airflow version
2.10.3
Operating System
Amazon Linux 2023
Deployment
Amazon (AWS) MWAA
Deployment details
The issue happens on a production Amazon (AWS) MWAA environment, and is also reproducible locally using the MWAA local runner (deployed on a Mac).
What happened
The SQLExecuteQueryOperator fails with a timeout, but only after the query has completed.
This is the workflow currently happening:
T0: Start -> T1: Run query -> T2: Expected timeout failure -> T3: Query finishes -> T4: Operator fails on timeout
What you think should happen instead
Once the execution timeout is reached, the SQLExecuteQueryOperator should kill the query and then fail.
This is the expected workflow:
T0: Start -> T1: Run query -> T2: Expected timeout, operator kills query and fails
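To make the expected workflow concrete, here is a minimal sketch of "time out, kill the query, then fail". It uses the standard-library sqlite3 module as a stand-in database, not MySQL and not Airflow, because sqlite3 exposes Connection.interrupt(), which may be called from another thread to abort a running query. The helper run_with_timeout and the SLOW_SQL statement are hypothetical names introduced for illustration only.

```python
import sqlite3
import threading

# A deliberately slow query: a recursive CTE that counts to one billion.
SLOW_SQL = """
WITH RECURSIVE counter(x) AS (
    SELECT 1 UNION ALL SELECT x + 1 FROM counter WHERE x < 1000000000
)
SELECT count(*) FROM counter
"""


def run_with_timeout(conn, sql, timeout_s):
    """Run sql on conn; cancel it and raise TimeoutError after timeout_s."""
    timed_out = threading.Event()

    def _cancel():
        # Runs in the timer thread: record the timeout, then abort the query.
        timed_out.set()
        conn.interrupt()

    timer = threading.Timer(timeout_s, _cancel)
    timer.start()
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.OperationalError as exc:
        # sqlite3 reports an interrupted query as OperationalError.
        if timed_out.is_set():
            raise TimeoutError(
                f"query exceeded {timeout_s}s and was cancelled"
            ) from exc
        raise
    finally:
        timer.cancel()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:", check_same_thread=False)
    try:
        run_with_timeout(connection, SLOW_SQL, timeout_s=0.3)
    except TimeoutError as err:
        print(f"failed as expected: {err}")
```

The point of the design is that the cancellation comes from a second thread rather than from the thread that is waiting on the query, so the failure arrives at T2 instead of after the query finishes. Whether a given driver offers an equivalent cancel call is an assumption that would need checking per database.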
How to reproduce
The issue is reproducible both on a production Amazon (AWS) MWAA environment, and locally using the MWAA local runner (deployed on a Mac).
The DAG below contains two tasks: a SQLExecuteQueryOperator and a PythonOperator.
The PythonOperator fails exactly after one minute (as expected), whereas the SQLExecuteQueryOperator takes a few minutes, until the query completes (not as expected).
Anything else
No response
Are you willing to submit PR?
Code of Conduct