This post records the steps I took to deploy Airflow on EKS.
Airflow
Airflow is a platform for authoring, scheduling, and monitoring workflows written as Python code. With Airflow you can automate ETL jobs, build sophisticated pipelines, and integrate with a variety of managed services.
Prerequisites
Creating the StorageClass
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: airflow-sc
parameters:
  basePath: /airflow
  directoryPerms: "700"
  fileSystemId: <EFS ID>
  provisioningMode: efs-ap
  uid: "1001"
  gid: "1002"
provisioner: efs.csi.aws.com
reclaimPolicy: Retain
volumeBindingMode: Immediate
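Before installing the chart, it can help to confirm that the StorageClass actually provisions a volume on EFS. The claim below is a minimal sketch for that check; the claim name `airflow-sc-test` is hypothetical and not part of the original setup, and it assumes the `ns-airflow` namespace already exists and the EFS CSI driver is installed in the cluster.

```yaml
# Hypothetical test claim, used only to verify dynamic provisioning via airflow-sc.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: airflow-sc-test          # illustrative name, not from the original post
  namespace: ns-airflow
spec:
  accessModes:
    - ReadWriteMany              # EFS volumes can be mounted read-write by many pods
  storageClassName: airflow-sc
  resources:
    requests:
      storage: 1Gi               # required by the API; EFS does not enforce this size
```

Because the StorageClass uses `volumeBindingMode: Immediate`, the claim should reach the `Bound` state without any pod mounting it. Delete the claim once the check is done; with `reclaimPolicy: Retain`, the backing volume is kept.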
Creating the ServiceAccount
apiVersion: v1
automountServiceAccountToken: true
kind: ServiceAccount
metadata:
  annotations:
    meta.helm.sh/release-name: bitnami-airflow
    meta.helm.sh/release-namespace: ns-airflow
  labels:
    app.kubernetes.io/instance: bitnami-airflow
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: airflow
    helm.sh/chart: airflow-14.1.1
    helm.toolkit.fluxcd.io/name: bitnami-airflow
    helm.toolkit.fluxcd.io/namespace: ns-airflow
  name: airflow
  namespace: ns-airflow
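The Helm values later in this post turn on remote logging to S3, so the Airflow pods need AWS credentials from somewhere. The post does not say how those are supplied. One common option on EKS is IAM Roles for Service Accounts (IRSA), sketched below as an assumption rather than the setup actually used here. The role ARN is a placeholder, and the sketch presumes an IAM role with access to the log bucket already exists and trusts the cluster's OIDC provider.

```yaml
# Sketch only: the same ServiceAccount with an IRSA annotation added.
# The Helm annotations and labels from the manifest above are omitted for brevity.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: airflow
  namespace: ns-airflow
  annotations:
    # Assumption: replace with the ARN of a role that can write to the log bucket.
    eks.amazonaws.com/role-arn: arn:aws:iam::<AWS account ID>:role/<IAM role name>
```

With this annotation in place, pods running under the ServiceAccount receive temporary credentials for the role, so no static access keys need to be stored in the chart values.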
Creating the Helm Repository
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: HelmRepository
metadata:
  name: bitnami-airflow
  namespace: ns-airflow
spec:
  interval: 1h0m0s
  url: https://charts.bitnami.com/bitnami
Deploying the Ingress Controller with Helm
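replicaCount: 1
service:
  type: NodePort
  nodePorts:
    http: 30080
nodeSelector:
  nodegroup: <target node group>
defaultBackend:
  nodeSelector:
    nodegroup: <target node group>
Save the values above as airflow-ingress.yaml, register the Bitnami repository if it is not already added, and install the chart:
helm repo add bitnami https://charts.bitnami.com/bitnami
helm install -f airflow-ingress.yaml bitnami-ingress bitnami/nginx-ingress-controller --version 11.6.12 -n ns-airflow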
Creating the Helm Release
global:
  storageClass: airflow-sc
web:
  nodeSelector:
    nodegroup: <target node group>
  extraEnvVars:
    - name: AIRFLOW__WEBSERVER__EXPOSE_CONFIG
      value: 'True'
scheduler:
  nodeSelector:
    nodegroup: <target node group>
  extraEnvVars:
    - name: AIRFLOW__LOGGING__REMOTE_LOGGING
      value: 'True'
    - name: AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER
      value: "<S3 URL>"
ingress:
  enabled: true
  ingressClassName: nginx
  hostname: <Domain URL>
  path: /
rbac:
  create: true
  createSCCRoleBinding: false
allowPodLaunching: true
serviceAccount:
  create: true
  name: <name of the ServiceAccount created above>
executor: CeleryKubernetesExecutor
auth:
  username: admin
  password: admin
postgresql:
  auth:
    enablePostgresUser: true
    username: bn_airflow
    password: "bn_airflow_admin"
    database: bitnami_airflow
git:
  dags:
    enabled: true
    repositories:
      - repository: "<git project URL>"
        branch: "main"
        name: "<git project name>"
        path: "dags"
  sync:
    interval: 60
    resources:
      limits:
        cpu: 1000m
        memory: 14Gi
      requests:
        cpu: 600m
        memory: 1Gi
dags:
  extraEnvVars:
    - name: AIRFLOW__LOGGING__REMOTE_LOGGING
      value: 'True'
    - name: AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER
      value: "<S3 URL>"
worker:
  nodeSelector:
    nodegroup: <target node group>
  autoscaling:
    enabled: true
    minReplicas: 1
    maxReplicas: 2
    targetCPU: 80
    targetMemory: 80
  resources:
    limits:
      cpu: 1000m
      memory: 15Gi
    requests:
      cpu: 500m
      memory: 1Gi
  extraEnvVars:
    - name: AIRFLOW__LOGGING__REMOTE_LOGGING
      value: 'True'
    - name: AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER
      value: "<S3 URL>"
Save the values above as airflow-helm-release.yaml and install the chart:
helm install -f airflow-helm-release.yaml bitnami-airflow bitnami/airflow --version 14.1.1 -n ns-airflow
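The ServiceAccount labels and the HelmRepository object earlier in this post suggest the release may also be managed by Flux. As a rough sketch of what the declarative equivalent of the `helm install` command could look like, the manifest below defines a Flux `HelmRelease` that points at the `bitnami-airflow` HelmRepository. The API version `helm.toolkit.fluxcd.io/v2beta1` is an assumption chosen to match the `v1beta2` source API used above; newer Flux versions serve this resource under a different version. Only a couple of values are repeated here for illustration.

```yaml
# Sketch only: a Flux HelmRelease roughly equivalent to the helm install command.
apiVersion: helm.toolkit.fluxcd.io/v2beta1   # assumption: matches the Flux version in use
kind: HelmRelease
metadata:
  name: bitnami-airflow
  namespace: ns-airflow
spec:
  interval: 1h0m0s
  chart:
    spec:
      chart: airflow
      version: "14.1.1"
      sourceRef:
        kind: HelmRepository
        name: bitnami-airflow
        namespace: ns-airflow
  values:
    # Place the full contents of airflow-helm-release.yaml under this key.
    global:
      storageClass: airflow-sc
    executor: CeleryKubernetesExecutor
```

Managing the release this way keeps the chart version and values in Git, and Flux reconciles the cluster against them on every interval instead of relying on a manual `helm install` or `helm upgrade`.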