Choose Authorization Engine
- 
Status: draft 
- 
Deciders: - 
Florian Waibel 
- 
Lars Francke 
- 
Lukas Menzel 
- 
Bernd Fondermann 
- 
Oliver Hessel 
- 
Sönke Liebau 
 
- 
- 
Date: n/a 
Context and Problem Statement
We need some form of authorization engine both for the products that are deployed via our stack as well as for our internal apis. This engine should have the ability to express universal access controls, as it will need to be adapted to many different end products:
- 
Stackable 
- 
Hadoop 
- 
Kafka 
- 
Airflow 
- 
Elasticsearch 
- 
… 
Depending on which option is chosen, there is a second, implicit, decision that is taken as part of this record: whether or not to include an identity provider. Keycloak and Ranger both offer user management on top of authoriztion, whereas Open Policy Agent is purely an authorization engine.
I’m not sure if we need to split this decision out into a separate ADR, but I suspect that it may make sense. If Open Policy Agent is chosen as part of this ADR, at some point we need to decide whether we also need an identity provider and if so, which one we should pick.
Decision Drivers <!-- optional -→
- 
Availability of plugins for initial components or expected effort for implementation 
- 
Flexibility of rule engine 
Decision Outcome
Pros and Cons of the Options
Ranger
Ranger is the de facto default authorization tool in the big data ecosystem. It offers existing integrations with a variety of tools and is used by the Cloudera offer as central access management component.
- 
Good, because most necessary integrations already exist 
- 
Good, existing know how applies 
- 
Good, because it offers id provider functionality 
- 
Bad, because adding new tools is complex 
- 
Bad, because objects to authorize on need to be defined in code (see Open Policy Agent for comparison) 
- 
Bad, because user synchronization mechanisms are fairly limited 
Open Policy Agent
Open Policy Agent is a universal authorization engine that has become popular in the Kubernetes (but not exclusively) environment lately. OPA defines ACLs in an abstract language called Rego which allows keeping authorization logic in the ACL definition, instead of source code. This gives a much higher degree of abstraction and thus flexibility than having to hard-code the logic for every application.
- 
Good, because relatively small effort to implement new tools 
- 
Good, because very flexible system to define ACLs 
- 
Bad, because no real HA concept 
- 
Bad, because only one authorizer (Kafka) already implemented 
- 
Bad, because would require additional identity provider 
Keycloak
Keycloak is based on a Wildfly application server and probably the most fully featured alternative of the ones discussed. It allows integration with LDAP and AD, offers authorization, a clustered mode for high availability and much more.
- 
Good, because gives a high degree of flexibility in adapting customers id solutions 
- 
Good, because well established and widely used (GAIA-X, SCS) 
- 
Bad, because no existing authorization plugins 
- 
Bad, because objects to authorize on need to be defined in code (see Open Policy Agent for comparison)