【Posted】:2021-09-06 20:19:21
【Problem Description】:
I am trying to run Docker containers with Airflow and MinIO, and to connect an Airflow task to a bucket defined in MinIO. I am using recent versions: Airflow 2.1.3 and the latest MinIO image.
How do I get the access key and secret key for the connection in MinIO? And how do I define the connection in Airflow?
I have tried several approaches and settings, but I keep getting: botocore.exceptions.ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden
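For what it's worth, the failing HeadObject call can be reproduced outside Airflow with plain boto3. A minimal sketch, assuming the endpoint and root credentials from the docker-compose file further down:

import boto3

# Talk to MinIO directly; the root user/password double as access/secret key.
s3 = boto3.client(
    "s3",
    endpoint_url="http://localhost:9000",  # MinIO API port published by compose
    aws_access_key_id="user",              # MINIO_ROOT_USER
    aws_secret_access_key="password",      # MINIO_ROOT_PASSWORD
)

# The same call the S3KeySensor makes under the hood; raises ClientError on 403/404.
print(s3.head_object(Bucket="airflow-data", Key="test"))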
I defined the connection through the UI as:
conn type: s3
host: locals3 (name of the service in docker-compose)
login: user (also minio_root_user)
password: password (also minio_root_password)
port: 9000
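One possible catch: with the Amazon provider, the S3 hook does not appear to read the Host field of the connection form; the MinIO endpoint has to go into the connection's Extra JSON instead (under the key "host" in the provider versions contemporary with Airflow 2.1.x; newer providers use "endpoint_url"). A sketch of defining such a connection programmatically; the conn_id "locals3" is just my label:

import json
from airflow.models import Connection
from airflow.settings import Session

conn = Connection(
    conn_id="locals3",
    conn_type="aws",
    login="user",         # MINIO_ROOT_USER, used as the access key
    password="password",  # MINIO_ROOT_PASSWORD, used as the secret key
    # The endpoint lives in Extra, not in the Host field.
    extra=json.dumps({"host": "http://locals3:9000"}),
)

session = Session()
session.add(conn)
session.commit()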
Here is the task I am using to test the connection (taken from another Stack Overflow question):
from airflow.providers.amazon.aws.sensors.s3_key import S3KeySensor

sensor = S3KeySensor(
    task_id='check_s3_for_file_in_s3',
    bucket_key='test',
    bucket_name='airflow-data',
    # aws_conn_id="aws_default",
    timeout=18 * 60 * 60,
    poke_interval=120,
    dag=dag,
)
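To check the credentials through Airflow's own connection machinery, independently of the sensor, something like this can be run from a PythonOperator or with airflow tasks test (a sketch; the aws_conn_id must match whatever the connection above is called):

from airflow.providers.amazon.aws.hooks.s3 import S3Hook

def check_bucket():
    hook = S3Hook(aws_conn_id="locals3")  # hypothetical conn_id from above
    # True only if the endpoint and credentials are accepted by MinIO.
    print(hook.check_for_bucket("airflow-data"))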
Thank you.
Edit: docker-compose file:
version: '3.8'

# ====================================== AIRFLOW ENVIRONMENT VARIABLES =======================================
x-environment: &airflow_environment
  - AIRFLOW__API__AUTH_BACKEND=airflow.api.auth.backend.basic_auth
  - AIRFLOW__CORE__EXECUTOR=LocalExecutor
  - AIRFLOW__CORE__LOAD_DEFAULT_CONNECTIONS=False
  - AIRFLOW__CORE__LOAD_EXAMPLES=False
  - AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql://airflow:airflow@postgres:5432/airflow
  - AIRFLOW__CORE__STORE_DAG_CODE=True
  - AIRFLOW__CORE__STORE_SERIALIZED_DAGS=True
  - AIRFLOW__WEBSERVER__EXPOSE_CONFIG=True

x-airflow-image: &airflow_image apache/airflow:2.1.3-python3.8
# ====================================== /AIRFLOW ENVIRONMENT VARIABLES =======================================

services:
  postgres:
    image: postgres:13-alpine
    healthcheck:
      test: [ "CMD", "pg_isready", "-U", "airflow" ]
      interval: 5s
      retries: 5
    environment:
      - POSTGRES_USER=airflow
      - POSTGRES_PASSWORD=airflow
      - POSTGRES_DB=airflow
    ports:
      - "5432:5432"

  init:
    image: *airflow_image
    depends_on:
      - postgres
    environment: *airflow_environment
    entrypoint: /bin/bash
    command: -c 'airflow db init && airflow users create --username user --password password --firstname Marin --lastname Marin --role Admin --email admin@example.org'

  webserver:
    image: *airflow_image
    restart: always
    depends_on:
      - postgres
    ports:
      - "8080:8080"
    volumes:
      - logs:/opt/airflow/logs
    environment: *airflow_environment
    command: webserver

  scheduler:
    build:
      context: docker
      args:
        AIRFLOW_BASE_IMAGE: *airflow_image
    # image: *airflow_image
    restart: always
    depends_on:
      - postgres
    volumes:
      - logs:/opt/airflow/logs
      - ./dags:/opt/airflow/dags
    environment: *airflow_environment
    command: scheduler

  locals3:
    image: minio/minio
    ports:
      - "9000:9000"
      - "9001:9001"
    environment:
      - MINIO_ROOT_USER=user
      - MINIO_ROOT_PASSWORD=password
    command: "server --console-address :9001 /data"
    volumes:
      - "locals3-data:/data"
    healthcheck:
      test: [ "CMD", "curl", "-f", "http://localhost:9000/minio/health/live" ]
      interval: 30s
      timeout: 20s
      retries: 3

  locals3_init:
    image: minio/mc
    depends_on:
      - locals3
    entrypoint: >
      /bin/sh -c "
      while ! /usr/bin/mc config host add locals3 http://locals3:9000 user password; do echo 'MinIO not up and running yet...' && sleep 1; done;
      echo 'Added mc host config.';
      /usr/bin/mc mb locals3/airflow-data;
      exit 0;
      "

volumes:
  logs:
  locals3-data:
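As an alternative to the UI, the connection could also be injected into every Airflow container through the shared environment block above. A sketch, assuming Airflow's AIRFLOW_CONN_<ID> URI convention and the extras key discussed earlier (worth verifying against the installed provider version; the endpoint is URL-encoded into the query string):

x-environment: &airflow_environment
  # ... existing entries ...
  # Hypothetical connection id "locals3"; the query parameter lands in Extra.
  - AIRFLOW_CONN_LOCALS3=aws://user:password@/?host=http%3A%2F%2Flocals3%3A9000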
【Discussion】:
- Any clues in the MinIO logs?
- What exactly is the requirement here?
- What does your docker-compose file look like? Did you override the aws_default connection in the Airflow Connections UI?
- @SergiyKolesnikov Yes, I did override the default aws_default connection.
- @PrakashS I just want to test how to connect Airflow 2 to a MinIO S3 bucket.