
[Bug] ASYNC execution mode "No such file or directory" while reading generated SQL from target folder #1585

Open

pankajastro opened this issue Mar 4, 2025 · 1 comment · May be fixed by #1588
Labels: area:execution (Related to the execution environment/mode, like Docker, Kubernetes, Local, VirtualEnv, etc) · bug (Something isn't working) · triage-needed (Items need to be reviewed / assigned to milestone)

Comments

pankajastro (Contributor) commented Mar 4, 2025

Astronomer Cosmos Version

1.9.0

dbt-core version

1.9

Versions of dbt adapters

No response

LoadMode

AUTOMATIC

ExecutionMode

LOCAL

InvocationMode

DBT_RUNNER

Airflow version

2.10

Operating System

mac

If you think it's a UI issue, what browsers are you seeing the problem on?

No response

Deployment

Official Apache Airflow Helm Chart

Deployment details

No response

What happened?

In async execution mode, if the dbt project contains a package dependency, the dbt run command generates the SQL files in the respective package's folder inside the target folder. Example:

(venv) pankaj@Pankajs-MacBook-Pro jaffle_shop % tree
.
├── LICENSE
├── README.md
├── dbt_package
│   └── dbt_penta
│       ├── README.md
│       ├── analyses
│       ├── dbt_project.yml
│       ├── macros
│       ├── models
│       │   └── example
│       │       ├── my_first_dbt_model.sql
│       │       ├── my_second_dbt_model.sql
│       │       └── schema.yml
│       ├── seeds
│       ├── snapshots
│       └── tests
├── dbt_packages
│   └── dbt_penta
│       ├── README.md
│       ├── analyses
│       ├── dbt_project.yml
│       ├── macros
│       ├── models
│       │   └── example
│       │       ├── my_first_dbt_model.sql
│       │       ├── my_second_dbt_model.sql
│       │       └── schema.yml
│       ├── seeds
│       ├── snapshots
│       └── tests
├── dbt_project.yml
├── logs
│   └── dbt.log
├── macros
├── models
│   ├── customers
│   │   └── customers.sql
│   ├── docs.md
│   ├── orders.sql
│   ├── overview.md
│   ├── schema.yml
│   └── staging
│       ├── schema.yml
│       ├── stg_customers.sql
│       ├── stg_orders.sql
│       └── stg_payments.sql
├── package-lock.yml
├── packages.yml
├── profiles.yml
├── seeds
│   ├── raw_customers.csv
│   ├── raw_orders.csv
│   └── raw_payments.csv
└── target
    ├── compiled
    │   ├── dbt_penta
    │   │   └── models
    │   │       └── example
    │   │           ├── my_first_dbt_model.sql
    │   │           └── my_second_dbt_model.sql
    │   └── jaffle_shop
    │       └── models
    │           ├── customers
    │           │   └── customers.sql
    │           ├── orders.sql
    │           └── staging
    │               ├── stg_customers.sql
    │               ├── stg_orders.sql
    │               └── stg_payments.sql
    ├── graph.gpickle
    ├── graph_summary.json
    ├── manifest.json
    ├── partial_parse.msgpack
    ├── run
    │   ├── dbt_penta
    │   │   └── models
    │   │       └── example
    │   │           ├── my_first_dbt_model.sql
    │   │           └── my_second_dbt_model.sql
    │   └── jaffle_shop
    │       ├── models
    │       │   ├── customers
    │       │   │   └── customers.sql
    │       │   ├── orders.sql
    │       │   └── staging
    │       │       ├── stg_customers.sql
    │       │       ├── stg_orders.sql
    │       │       └── stg_payments.sql
    │       └── seeds
    │           ├── raw_customers.csv
    │           ├── raw_orders.csv
    │           └── raw_payments.csv
    ├── run_results.json
    └── semantic_manifest.json

In the folder structure above, the run SQL for the dbt_penta package models is generated under target/run/dbt_penta/, but Cosmos tries to read it from target/run/jaffle_shop/ (the root project's folder), which raises the FileNotFoundError below.
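
A minimal sketch of the expected lookup — the function name and arguments are illustrative, not Cosmos's actual internals: the run SQL path should be keyed on the package that owns the model, instead of always using the root project's name.

from pathlib import Path


# Illustrative sketch only, not the Cosmos implementation: dbt writes run
# artifacts to target/run/<package_name>/..., so the lookup must use the
# owning package's name rather than assuming the root project's name.
def read_run_sql(tmp_project_dir: str, package_name: str, model_rel_path: str) -> str:
    run_sql_path = Path(tmp_project_dir) / "target" / "run" / package_name / model_rel_path
    with run_sql_path.open("r") as sql_file:
        return sql_file.read()


# e.g. read_run_sql("/tmp/tmp6e63sgdd", "dbt_penta", "models/example/my_first_dbt_model.sql")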

Relevant log output

Error Log

[2025-03-04, 07:47:15 UTC] {local.py:656} INFO - Assigning inlets/outlets with DatasetAlias
[2025-03-04, 07:47:15 UTC] {taskinstance.py:3313} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/airflow/models/taskinstance.py", line 768, in _execute_task
    result = _execute_callable(context=context, **execute_callable_kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/airflow/models/taskinstance.py", line 734, in _execute_callable
    return ExecutionCallableRunner(
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/airflow/utils/operator_helpers.py", line 252, in run
    return self.func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/airflow/models/baseoperator.py", line 424, in wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/astro/.local/lib/python3.12/site-packages/cosmos/operators/_asynchronous/bigquery.py", line 143, in execute
    self.build_and_run_cmd(context=context, run_as_async=True, async_context=self.async_context)
  File "/home/astro/.local/lib/python3.12/site-packages/cosmos/operators/local.py", line 708, in build_and_run_cmd
    result = self.run_command(
             ^^^^^^^^^^^^^^^^^
  File "/home/astro/.local/lib/python3.12/site-packages/cosmos/operators/local.py", line 559, in run_command
    self._handle_async_execution(tmp_project_dir, context, async_context)
  File "/home/astro/.local/lib/python3.12/site-packages/cosmos/operators/local.py", line 493, in _handle_async_execution
    sql = self._read_run_sql_from_target_dir(tmp_project_dir, async_context)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/astro/.local/lib/python3.12/site-packages/cosmos/operators/local.py", line 411, in _read_run_sql_from_target_dir
    with run_sql_path.open("r") as sql_file:
         ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/pathlib.py", line 1013, in open
    return io.open(self, mode, buffering, encoding, errors, newline)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp6e63sgdd/target/run/jaffle_shop/models/example/my_first_dbt_model.sql'
[2025-03-04, 07:47:15 UTC] {taskinstance.py:1226} INFO - Marking task as FAILED. dag_id=simple_dag_async, task_id=my_first_dbt_mode

How to reproduce

from datetime import datetime

from include.constants import jaffle_shop_path

from cosmos import DbtDag, ExecutionConfig, ExecutionMode, ProfileConfig, ProjectConfig, RenderConfig
from cosmos.profiles import GoogleCloudServiceAccountDictProfileMapping

profile_config = ProfileConfig(
    profile_name="airflow_db",
    target_name="dev",
    profile_mapping=GoogleCloudServiceAccountDictProfileMapping(
        conn_id="gcp_gs_conn", profile_args={"dataset": "release_18", "project": "astronomer-dag-authoring"}
    ),
)

simple_dag_async = DbtDag(
    project_config=ProjectConfig(jaffle_shop_path),
    profile_config=profile_config,
    execution_config=ExecutionConfig(execution_mode=ExecutionMode.AIRFLOW_ASYNC),
    render_config=RenderConfig(),
    # normal DAG parameters
    schedule_interval=None,
    start_date=datetime(2023, 1, 1),
    catchup=False,
    dag_id="simple_dag_async",
    tags=["simple"],
    operator_args={"location": "us", "install_deps": True},
)
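
For completeness, a hypothetical include/constants.py matching the import above — the path is an assumption about the repository layout, not taken from the original report:

# Hypothetical include/constants.py; adjust the path to your Airflow project layout.
from pathlib import Path

jaffle_shop_path = Path(__file__).parent.parent / "dags" / "dbt" / "jaffle_shop"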

Anything else :)?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Contact Details

No response

pankajastro added the bug and triage-needed labels Mar 4, 2025

dosubot bot commented Mar 4, 2025

To resolve the "No such file or directory" error in ASYNC execution mode with Astronomer Cosmos version 1.9.0 and dbt-core version 1.9, ensure that you have configured the remote_target_path and remote_target_path_conn_id settings correctly. These settings are necessary for the SetupAsyncOperator to generate SQL files and upload them to a remote location.

Here's what you need to do:

  1. Set the remote_target_path to a valid remote storage path supported by Airflow's Object Store feature (e.g., s3://your_s3_bucket/target_dir/). Cosmos supports copying files only from the compiled directory within the target folder when the execution mode is set to ExecutionMode.AIRFLOW_ASYNC.

  2. Set the remote_target_path_conn_id to the appropriate Airflow connection ID for accessing the remote storage.

Ensure these configurations are correctly set in your Airflow environment variables or configuration files:

export AIRFLOW__COSMOS__REMOTE_TARGET_PATH='s3://your_s3_bucket/target_dir/'
export AIRFLOW__COSMOS__REMOTE_TARGET_PATH_CONN_ID='your_connection_id'
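
Equivalently, these settings can live under the [cosmos] section of airflow.cfg, since AIRFLOW__COSMOS__* environment variables map to that section.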

Additionally, ensure that the async_py_requirements parameter within the ExecutionConfig of your DbtDag specifies the necessary dbt adapter. This is crucial for the SetupAsyncOperator to function correctly, as it must install the required dbt adapter in the virtual environment created during execution [1][2].
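
As a sketch of that last point — the adapter name and version pin below are illustrative assumptions and should match your warehouse and dbt-core version:

from cosmos import ExecutionConfig, ExecutionMode

# Illustrative only: supply the dbt adapter the async setup task must install
# in its virtual environment (here assumed to be BigQuery on dbt-core 1.9).
execution_config = ExecutionConfig(
    execution_mode=ExecutionMode.AIRFLOW_ASYNC,
    async_py_requirements=["dbt-bigquery==1.9.0"],
)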


dosubot added the area:execution label Mar 4, 2025
pankajastro self-assigned this Mar 4, 2025