-
Notifications
You must be signed in to change notification settings - Fork 16.6k
Customize statsd dagprocessor labels#55832
Conversation
TL;DR: Add support in Helm chart for custom labels in redis, statsd and dagProcessor objects
Description
I would like to be able to customize the labels defined for Airflow Kubernetes resources so I can comply with my company guidelines and be able to track all objects in the same way.
This PR adds the ability to specify custom labels for all Airflow objects and pods defined in the Helm chart. Labels can be set globally through .Values.labels and component-specifically through . These labels are merged, with component-specific labels taking precedence.
Changes
- Added labels property to redis, statsd and dagProcessor components.
- Updated respective templates to get custom labels and merge them with the global labels.
- Updated values.schema.json to include schema definitions for label fields
- Added documentation in
chart/docs/customizing-labels-for-pods.rstexplaining the labeling system
Example Usage
labels:
environment: production
# Component-specific labels
scheduler:
labels:
role: scheduler
workers:
labels:
role: worker
webserver:
labels:
role: ui
Testing
I tested my changes by building the helm chart locally and templating our current values.yaml on top of the local chart archive.
# Package chart in original repo
helm package chart/
cp airflow-1.19.0-dev.tgz ./path/to/my/repo/chart/charts/
# Render full configuration with our own instance of the chart
helm template ./chart --name-template airflow --namespace airflow --values ./path/to/values.yaml --debug > test-template.yaml
Test Case 1 - Using global labels and component labels without overlap
Example config in values.yaml:
# Global labels
labels:
service: airflow
# Component specific labels
dagProcessor:
enabled: true
labels:
miro_function: etl
component_name: dag-processor
Generated Dag Processor objects:
apiVersion: apps/v1
kind: Deployment
metadata:
name: airflow-dag-processor
labels:
tier: airflow
component: dag-processor
release: airflow
chart: "airflow-1.19.0-dev"
heritage: Helm
service: airflow
spec:
# ....
template:
metadata:
labels:
tier: airflow
component: dag-processor
release: airflow
component_name: dag-processor
miro_function: etl
service: airflow
# ...
---
apiVersion: v1
kind: ServiceAccount
automountServiceAccountToken: true
metadata:
name: "airflow-dag-processor"
labels:
tier: airflow
component: dag-processor
release: airflow
chart: "airflow-1.19.0-dev"
heritage: Helm
component_name: dag-processor
miro_function: etl
service: airflow
# ....
Test Case 2 - Override global labels with component labels
Example config in values.yaml:
# Global labels
labels:
service: airflow
# Component specific labels
dagProcessor:
enabled: true
labels:
service: airflow-dag-processor
miro_function: etl
component_name: dag-processor
Generated Dag Processor objects:
apiVersion: apps/v1
kind: Deployment
metadata:
name: airflow-dag-processor
labels:
tier: airflow
component: dag-processor
release: airflow
chart: "airflow-1.19.0-dev"
heritage: Helm
service: airflow
spec:
# ....
template:
metadata:
labels:
tier: airflow
component: dag-processor
release: airflow
component_name: dag-processor
miro_function: etl
service: airflow-dag-processor
# ...
---
apiVersion: v1
kind: ServiceAccount
automountServiceAccountToken: true
metadata:
name: "airflow-dag-processor"
labels:
tier: airflow
component: dag-processor
release: airflow
chart: "airflow-1.19.0-dev"
heritage: Helm
component_name: dag-processor
miro_function: etl
service: airflow-dag-processor
# ....
Test Case 3 - Use only component labels
Example config in values.yaml:
# Component specific labels
dagProcessor:
enabled: true
labels:
service: apache-airflow
miro_function: etl
component_name: dag-processor
Generated Dag Processor objects:
apiVersion: apps/v1
kind: Deployment
metadata:
name: airflow-dag-processor
labels:
tier: airflow
component: dag-processor
release: airflow
chart: "airflow-1.19.0-dev"
heritage: Helm
spec:
# ....
template:
metadata:
labels:
tier: airflow
component: dag-processor
release: airflow
component_name: dag-processor
miro_function: etl
service: airflow-dag-processor
# ...
---
apiVersion: v1
kind: ServiceAccount
automountServiceAccountToken: true
metadata:
name: "airflow-dag-processor"
labels:
tier: airflow
component: dag-processor
release: airflow
chart: "airflow-1.19.0-dev"
heritage: Helm
component_name: dag-processor
miro_function: etl
service: airflow-dag-processor
# ....
Checklist
- Description above provides context of the change
- Added schema definitions for new configuration options
- Documented new values in docs/
- Used
mustMergeto properly handle label merging - No breaking changes introduced
- Tested configuration changes
Additional Notes
Deployments are currently only labeled with global labels and I don't know if this is done for a specific reason. If possible, I would also like to implement custom labels for Deployments.
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in airflow-core/newsfragments.
|
Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contributors' Guide (https://github.com/apache/airflow/blob/main/contributing-docs/README.rst)
|
1e144fd to
8e6f559
Compare
3a5a297 to
3c2d421
Compare
|
Looks good! |
a001ba7 to
31fc098
Compare
Miretpl
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice change!
Could you add the test cases which would valided the changed behaviour of the labels?
8cfc187 to
e58c261
Compare
Thanks for reviewing! I've added 3 test cases in the PR description |
|
@Miretpl btw, I still think that the whole label configuration is a bit inconsistent. For example, at the moment we can customize labels for Pods created under a Deployment (here), but we cannot customize the Deployment labels themselves (here). I didn't add such changes in this PR to keep consistency with the other resources, but I will open a proposal for that separately. |
e58c261 to
78e97cb
Compare
Miretpl
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @dstandish @jedcunningham @hussein-awala @romsharon98, could you take a look at it?
|
@goncalo-m-c yeah, there are a lot of like unfinished things in the chart or inconsistencies. I agree that the helm chart needs a little more love from the whole community. |
78e97cb to
f8a9462
Compare
jscheffl
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR as well as also the documentation added.
Just a small nit - can you add a small test case as well to ensure the function is not degrading with some changes applied?
1423795 to
314903c
Compare
Thanks for reviewing @jscheffl, can you clarify your request? I've just added new unit tests to reflect the changes I am proposing. |
7c72b90 to
ecb44e9
Compare
My bad, I did not enable those. However, I cannot get all the checks to work properly both on my local environment and on a clean Codespace. What I was able to do was to run only the hooks that were failing, which are passing now, and then commit the changes. # prek end-of-file-fixer ruff-format update-breeze-cmd-output --directory helm-tests
Changes are in a separate commit |
|
Oh, still a minor fix is needed as the CLI output of the breeze tooling needs a bit of updates. Can you run |
Updated |
jscheffl
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, thanks!
|
Awesome work, congrats on your first merged pull request! You are invited to check our Issue Tracker for additional contributions. |
Co-authored-by: Goncalo
* docs(helm): Write helm labels customization doc
Co-authored-by: Goncalo
* feat(helm): Clarify dagProcessor values.yaml comments
Co-authored-by: Przemyslaw Mirowski
* docs(helm): Clarify component-specific labels override behavior
* feat(helm): Remove parenthesis from label checks
* tests(helm): Add label tests for redis, statsd and dagprocessor
* tests(helm): Fix tests formatting
* fix(docs): Update breeze output docs
---------
Co-authored-by: Goncalo Costa
Co-authored-by: Goncalo
Co-authored-by: Przemyslaw Mirowski