@exolightor exolightor commented Feb 11, 2026

Closes: #61771

Removed access token from cookie for auth code flow in Keycloak provider.

When an access token contains many realm roles, the access token and refresh token together exceed the browser cookie limit of 4 KB. Passing only the refresh token in the cookie is sufficient, because refresh_user() can use the refresh token to retrieve a new access token.
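To give a rough sense of how realm roles inflate the token, here is a minimal sketch that builds an unsigned, JWT-shaped string and measures it. The role names, the 256-byte placeholder signature, and the reduced claim set are assumptions for illustration; a real Keycloak access token carries more claims and will be larger.

```python
import base64
import json

# Common per-cookie browser limit; treated here as an assumption.
COOKIE_LIMIT_BYTES = 4096


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, the way JWT segments are encoded."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def fake_jwt(payload: dict) -> str:
    """Build a JWT-shaped string with a placeholder (not valid) signature."""
    header = {"alg": "RS256", "typ": "JWT"}
    segments = [
        b64url(json.dumps(header, separators=(",", ":")).encode()),
        b64url(json.dumps(payload, separators=(",", ":")).encode()),
        b64url(b"\x00" * 256),  # stand-in for an RS256-sized signature
    ]
    return ".".join(segments)


def access_token_size(num_roles: int) -> int:
    """Size in bytes of a hypothetical access token carrying num_roles realm roles."""
    roles = [f"team-{i:03d}-airflow-dag-editor" for i in range(num_roles)]
    payload = {"sub": "user-id", "realm_access": {"roles": roles}}
    return len(fake_jwt(payload))


for n in (0, 15, 60, 150):
    size = access_token_size(n)
    print(f"{n} roles: {size} bytes, over limit: {size > COOKIE_LIMIT_BYTES}")
```

The point is only that token size grows linearly with the number and length of role names, so the threshold at which login breaks depends on the realm configuration.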

boring-cyborg bot commented Feb 11, 2026

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contributors' Guide (https://github.com/apache/airflow/blob/main/contributing-docs/README.rst)
Here are some useful points:

  • Pay attention to the quality of your code (ruff, mypy and type annotations). Our prek-hooks will help you with that.
  • In case of a new feature add useful documentation (in docstrings or in docs/ directory). Adding a new operator? Check this short guide. Consider adding an example DAG that shows how users should use it.
  • Consider using Breeze environment for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
  • Always keep your Pull Requests rebased, otherwise your build might fail due to changes not related to your commits.
    Apache Airflow is a community-driven project and together we are making it better 🚀.
    In case of doubts contact the developers at:
    Mailing List: [email protected]
    Slack: https://s.apache.org/airflow-slack

Contributor

@SameerMesiah97 SameerMesiah97 left a comment

Looks good. This will result in an extra refresh request to Keycloak but the impact is minimal.

Contributor

@SameerMesiah97 SameerMesiah97 left a comment

I just saw the failing test. Looks like it needs to be updated to accommodate access_token being empty. My approval is conditional on all CI checks passing.

@exolightor exolightor force-pushed the bugfix/keycloak-provider-token-cookie-size-limit branch from c9a9599 to c6ba658 Compare February 12, 2026 09:37
@exolightor
Author

Thanks for the review @SameerMesiah97, I adjusted the test accordingly. CI Workflow needs to be approved again and it looks like it needs review from either @vincbeck or @bugraoz93.

Contributor

@vincbeck vincbeck left a comment

I really do not understand this change, and I do not think it solves anything. When the user logs in, you no longer save the access token (which is very much needed to call the Keycloak API). Then, in the refresh token flow, you refresh the token and it is saved there. So in the end the token is still saved in the cookie, but now it takes an extra step to generate it.

The issue you are describing (the refresh token and access token in the same cookie exceeding the 4 KB limit) is real, and I had it on my list, but I do not think you are fixing it properly. I think the fix should be to store the access token and the refresh token in separate cookies. Happy to provide more details if needed.
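As a rough illustration of the separate-cookies idea, the sketch below puts each token in its own cookie and checks each one against the per-cookie limit. The cookie names, attributes, and the 4096-byte figure are assumptions for the example, not the provider's actual configuration.

```python
from http.cookies import SimpleCookie

# Common per-cookie browser limit; treated here as an assumption.
COOKIE_LIMIT_BYTES = 4096


def build_token_cookies(access_token: str, refresh_token: str) -> SimpleCookie:
    """Place each token in its own cookie (hypothetical cookie names)."""
    cookies = SimpleCookie()
    cookies["access_token"] = access_token
    cookies["refresh_token"] = refresh_token
    for morsel in cookies.values():
        morsel["httponly"] = True
        morsel["secure"] = True
        morsel["path"] = "/"
    return cookies


def oversized(cookies: SimpleCookie) -> list[str]:
    """Names of cookies whose serialized 'name=value; attributes' form exceeds the limit."""
    return [
        name
        for name, morsel in cookies.items()
        if len(morsel.OutputString()) > COOKIE_LIMIT_BYTES
    ]


# Two tokens that would not fit together, but each fits on its own.
print(oversized(build_token_cookies("a" * 3000, "r" * 1500)))

# An access token that is too large by itself is still flagged.
print(oversized(build_token_cookies("a" * 5000, "r" * 100)))
```

Splitting helps when the combined size is the problem; it does not help once a single token alone is larger than the limit.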

@exolightor
Author

Hello @vincbeck, thanks for looking at the change. With my changes, the access token is not saved in the cookie even after the refresh token flow. After the refresh token flow in KeycloakAuthManager.refresh_user(), the access token is only stored as an attribute of the KeycloakAuthManagerUser class. After login succeeds, the KeycloakAuthManager.is_authorized() method does not look into the cookie but only at the deserialized attribute KeycloakAuthManagerUser.access_token. So the access token is not stored in the cookie at any time.

Splitting the access token and refresh token across two cookies does not solve the problem in the long run, as the access token alone can easily exceed 4 KB. With 15+ realm roles it should already be big enough. Other open source applications (e.g. Superset) solve this by storing the access token server-side instead of in the cookie. This is why I changed the code to store the access token only server-side, i.e., only in the KeycloakAuthManagerUser class, and to store only the refresh token in the cookie. The refresh token stays small since roles are not stored in it.

I have tested it locally and deployed it with Helm on OpenShift, and issue #61771 is only resolved with my change. Otherwise the login does not work. In our Keycloak realm, access tokens can include 15+ realm roles.
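For comparison, a genuinely server-side approach would keep both tokens in backend storage and put only an opaque session id in the cookie. The sketch below is purely illustrative: the class names and the in-memory dict are hypothetical, and nothing here is taken from the Keycloak provider or from Superset.

```python
import secrets
from dataclasses import dataclass
from typing import Optional


@dataclass
class SessionTokens:
    access_token: str
    refresh_token: str


class InMemoryTokenStore:
    """Hypothetical server-side store; a real deployment would need shared storage."""

    def __init__(self) -> None:
        self._sessions: dict[str, SessionTokens] = {}

    def save(self, tokens: SessionTokens) -> str:
        """Store the tokens and return an opaque id that is safe to put in a cookie."""
        session_id = secrets.token_urlsafe(32)
        self._sessions[session_id] = tokens
        return session_id

    def load(self, session_id: str) -> Optional[SessionTokens]:
        """Look up tokens for a session id; None if the session is unknown."""
        return self._sessions.get(session_id)


store = InMemoryTokenStore()
cookie_value = store.save(SessionTokens(access_token="a" * 6000, refresh_token="r" * 800))
print(len(cookie_value))  # the cookie size no longer depends on the token size
```

The trade-off is that the backend becomes stateful: every API server instance needs access to the same store, and sessions need expiry and cleanup.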

@vincbeck
Contributor

But in the end this is saved in a cookie. It has to be saved somewhere, and we do not store anything on the backend. If you look at the code, the generated token saved in KeycloakAuthManagerUser is serialized and then saved in the cookie.

KeycloakAuthManagerUser is an object that is serialized and saved in a cookie when logging in (or when the token is refreshed). When the backend receives a request, it fetches the serialized object from the token, deserializes it, and uses it to retrieve the access token.
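The round trip described above can be sketched as follows. SketchUser and its field names are hypothetical stand-ins for KeycloakAuthManagerUser, and plain base64-encoded JSON stands in for whatever signed format the backend actually uses; the only point is that any attribute on the serialized object ends up inside the cookie value.

```python
import base64
import json
from dataclasses import asdict, dataclass


@dataclass
class SketchUser:
    """Hypothetical stand-in for KeycloakAuthManagerUser; field names are assumptions."""

    user_id: str
    name: str
    access_token: str
    refresh_token: str


def serialize_to_cookie(user: SketchUser) -> str:
    """Serialize every attribute of the user object into a cookie-safe string."""
    raw = json.dumps(asdict(user), separators=(",", ":")).encode()
    return base64.urlsafe_b64encode(raw).decode("ascii")


def deserialize_from_cookie(value: str) -> SketchUser:
    """Rebuild the user object from the cookie value on an incoming request."""
    return SketchUser(**json.loads(base64.urlsafe_b64decode(value)))


user = SketchUser("42", "alice", access_token="a" * 3000, refresh_token="r" * 800)
cookie_value = serialize_to_cookie(user)

# The access token travels inside the cookie, so the cookie is at least as large.
print(len(cookie_value) > len(user.access_token))
print(deserialize_from_cookie(cookie_value) == user)
```

Under this model, an attribute set on the object is "stored in the class" only for the lifetime of one request; persistence between requests comes from the cookie.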

@exolightor
Author

OK, thanks for the pointer. I do find it quite odd, then, that in my case with my changes the access token fits into the cookie after the refresh token flow but not after the initial login callback. I will investigate this further if I get the chance.

I forgot to copy an additional KeycloakAuthManager.refresh_user() call to the PR, which was needed to solve issue #61771 on my end; I included it in my most recent commit. There I assumed that the access token would be stored in the Python class, but if we excluded the access token from the cookie entirely, the refresh token flow would be triggered on every request, which would be unnecessary overhead.
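The overhead concern can be made concrete with a small simulation. Everything here is hypothetical (the fake identity provider, the dict-based cookie, the handle_request function); it only counts how often a refresh would be needed when the access token is or is not carried between requests.

```python
from dataclasses import dataclass


@dataclass
class FakeIdentityProvider:
    """Hypothetical stand-in for Keycloak that counts refresh requests."""

    refresh_calls: int = 0

    def refresh(self, refresh_token: str) -> str:
        self.refresh_calls += 1
        return f"access-token-{self.refresh_calls}"


def handle_request(cookie: dict, idp: FakeIdentityProvider, persist_access_token: bool) -> dict:
    """Process one request and return the cookie to send back to the browser."""
    if not cookie.get("access_token"):
        # No access token available on this request, so one must be fetched.
        cookie = {**cookie, "access_token": idp.refresh(cookie["refresh_token"])}
    # Authorization checks would use cookie["access_token"] here.
    if persist_access_token:
        return cookie
    # Drop the access token before writing the cookie back.
    return {"refresh_token": cookie["refresh_token"]}


for persist in (True, False):
    idp = FakeIdentityProvider()
    cookie = {"refresh_token": "r"}
    for _ in range(5):
        cookie = handle_request(cookie, idp, persist_access_token=persist)
    print(f"persist={persist}: {idp.refresh_calls} refresh call(s) over 5 requests")
```

This ignores token expiry, which would add occasional refreshes in the persisting case as well.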

Development

Successfully merging this pull request may close these issues.

Keycloak provider login fails if access token is too large because of cookie size limit

3 participants