【Question title】: Airflow 2 conn URI, extra field JSON for AWS in Secrets Manager
【Posted】: 2026-02-18 00:10:02
【Question】:

I have set up AWS Secrets Manager as my secrets backend in Airflow 2. I defined an aws_default connection in Secrets Manager (as plaintext):
aws:///extra?region_name=us-east-1&session_kwargs={"profile_name": "my_profile"}

When I call a hook with it (AwsGlueCrawlerHook(aws_conn_id='aws_default')), I get the following error:

Traceback (most recent call last):
  File "/home/airflow/venv/lib64/python3.7/site-packages/airflow/task/task_runner/standard_task_runner.py", line 85, in _start_by_fork
    args.func(args, dag=self.dag)
  File "/home/airflow/venv/lib64/python3.7/site-packages/airflow/cli/cli_parser.py", line 48, in command
    return func(*args, **kwargs)
  File "/home/airflow/venv/lib64/python3.7/site-packages/airflow/utils/cli.py", line 92, in wrapper
    return f(*args, **kwargs)
  File "/home/airflow/venv/lib64/python3.7/site-packages/airflow/cli/commands/task_command.py", line 292, in task_run
    _run_task_by_selected_method(args, dag, ti)
  File "/home/airflow/venv/lib64/python3.7/site-packages/airflow/cli/commands/task_command.py", line 107, in _run_task_by_selected_method
    _run_raw_task(args, ti)
  File "/home/airflow/venv/lib64/python3.7/site-packages/airflow/cli/commands/task_command.py", line 184, in _run_raw_task
    error_file=args.error_file,
  File "/home/airflow/venv/lib64/python3.7/site-packages/airflow/utils/session.py", line 70, in wrapper
    return func(*args, session=session, **kwargs)
  File "/home/airflow/venv/lib64/python3.7/site-packages/airflow/models/taskinstance.py", line 1332, in _run_raw_task
    self._execute_task_with_callbacks(context)
  File "/home/airflow/venv/lib64/python3.7/site-packages/airflow/models/taskinstance.py", line 1458, in _execute_task_with_callbacks
    result = self._execute_task(context, self.task)
  File "/home/airflow/venv/lib64/python3.7/site-packages/airflow/models/taskinstance.py", line 1514, in _execute_task
    result = execute_callable(context=context)
  File "/home/airflow/airflow/dags/reboots/operators/start_glue_crawler_operator.py", line 29, in execute
    AwsGlueCrawlerHook(aws_conn_id=self.aws_conn_id).start_crawler(crawler_name=self.crawler_name)
  File "/home/airflow/venv/lib64/python3.7/site-packages/airflow/providers/amazon/aws/hooks/glue_crawler.py", line 120, in start_crawler
    crawler = self.glue_client.start_crawler(Name=crawler_name)
  File "/home/airflow/venv/lib64/python3.7/site-packages/cached_property.py", line 36, in __get__
    value = obj.__dict__[self.func.__name__] = self.func(obj)
  File "/home/airflow/venv/lib64/python3.7/site-packages/airflow/providers/amazon/aws/hooks/glue_crawler.py", line 48, in glue_client
    return self.get_conn()
  File "/home/airflow/venv/lib64/python3.7/site-packages/airflow/providers/amazon/aws/hooks/base_aws.py", line 494, in get_conn
    return self.conn
  File "/home/airflow/venv/lib64/python3.7/site-packages/cached_property.py", line 36, in __get__
    value = obj.__dict__[self.func.__name__] = self.func(obj)
  File "/home/airflow/venv/lib64/python3.7/site-packages/airflow/providers/amazon/aws/hooks/base_aws.py", line 476, in conn
    return self.get_client_type(self.client_type, region_name=self.region_name)
  File "/home/airflow/venv/lib64/python3.7/site-packages/airflow/providers/amazon/aws/hooks/base_aws.py", line 442, in get_client_type
    session, endpoint_url = self._get_credentials(region_name)
  File "/home/airflow/venv/lib64/python3.7/site-packages/airflow/providers/amazon/aws/hooks/base_aws.py", line 418, in _get_credentials
    conn=connection_object, region_name=region_name, config=self.config
  File "/home/airflow/venv/lib64/python3.7/site-packages/airflow/providers/amazon/aws/hooks/base_aws.py", line 74, in create_session
    self.basic_session = self._create_basic_session(session_kwargs=session_kwargs)
  File "/home/airflow/venv/lib64/python3.7/site-packages/airflow/providers/amazon/aws/hooks/base_aws.py", line 100, in _create_basic_session
    **session_kwargs,
TypeError: type object argument after ** must be a mapping, not str

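I can't figure out how to form my Airflow connection URI so that session_kwargs ends up loaded correctly (that is, as a dict rather than a string). I tried escaping the quotes and so on, but I just can't work it out. If I use only the following, there is no error: aws:///extra?region_name=us-east-1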

So I know the problem is in the way I write the session_kwargs parameter. I know I could set full_url_mode to false in backend_kwargs, but at this point I'm really curious how to write the conn URI.
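
The TypeError above is consistent with the query string being parsed into flat string values. Here is a minimal standard-library sketch of that effect, assuming the extras are derived from the URI query roughly the way `parse_qsl` would do it. This is an assumption about the mechanism, not Airflow's actual code, and `build_session` is a hypothetical stand-in for the session constructor:

```python
import json
from urllib.parse import parse_qsl, urlsplit

# The connection URI from the question, unchanged.
uri = 'aws:///extra?region_name=us-east-1&session_kwargs={"profile_name": "my_profile"}'

# Parsing the query string yields plain strings for every value.
extras = dict(parse_qsl(urlsplit(uri).query))


def build_session(**kwargs):
    """Hypothetical stand-in for a function that is called with **session_kwargs."""
    return kwargs


error = None
try:
    # session_kwargs is still a JSON *string* here, so ** unpacking fails.
    build_session(**extras["session_kwargs"])
except TypeError as exc:
    error = exc

# Decoding the JSON text explicitly is what turns it back into a mapping.
decoded = json.loads(extras["session_kwargs"])
print(type(extras["session_kwargs"]).__name__, type(decoded).__name__)
```

Under that assumption, any nested value written directly into the query string stays a string unless something decodes it as JSON afterwards.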

【Comments】:

    Tags: python amazon-web-services airflow hook


    【Answer 1】:

    Well, I found the answer here: https://airflow.apache.org/docs/apache-airflow/stable/howto/connection.html

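    import json

    from airflow.models.connection import Connection

    # Nested extras are defined as a regular Python dict.
    extra = {
        "config_kwargs": {
            "proxies": {
                "http": "http://user:pwd@proxy.net:3128",
                "https": "http://user:pwd@proxy.net:3128"
            }
        }
    }

    # The extra field is passed as a JSON string.
    c = Connection(
        conn_id="some_conn",
        conn_type="aws",
        description="",
        host="",
        login="",
        password="",
        extra=json.dumps(extra),
    )
    # Print the URI form of the connection, with the extras URL-encoded.
    print(c.get_uri())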
    

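    The extra needs to be URL-encoded, but not all of it. In any case, if you have the extras as JSON, Airflow gives you the tools to generate the URI for you.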
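
To see what the URL encoding involves without an Airflow installation, here is a minimal standard-library sketch that packs the question's extras, as one JSON document, into a single percent-encoded query parameter and reads them back. The helper names `encode_extra` and `decode_extra` are hypothetical, and the parameter name `__extra__` is an assumption about how recent Airflow versions serialize nested extras. The exact format depends on the Airflow version, so prefer the output of `get_uri()` over a hand-built string:

```python
import json
from urllib.parse import parse_qsl, urlencode, urlsplit

# The extras the question wants, with session_kwargs as a real nested dict.
extra = {
    "region_name": "us-east-1",
    "session_kwargs": {"profile_name": "my_profile"},
}


def encode_extra(extra, key="__extra__"):
    """Hypothetical helper: put the whole extra dict, as JSON, into one query parameter."""
    return "aws://?" + urlencode({key: json.dumps(extra)})


def decode_extra(uri, key="__extra__"):
    """Hypothetical helper: reverse encode_extra by decoding the query, then the JSON."""
    query = dict(parse_qsl(urlsplit(uri).query))
    return json.loads(query[key])


uri = encode_extra(extra)
round_tripped = decode_extra(uri)

print(uri)
print(round_tripped)
```

The point of the design is that braces, quotes and colons never appear raw in the URI, and the nesting survives because the JSON is decoded as a whole rather than one query parameter at a time.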

    【Comments】: