Class of Policy
Each instance of Policy contains one atomic permission. What counts as atomic is defined by AKASA's engineering needs, and the AWS resources have to be named so that they accommodate it.
For example, we separate clients' data using S3 prefixes, like s3://ops-data/{client}. The authorization to read from s3://ops-data/tcm/* is then one atomic permission, to be used by an IAM user for client tcm. The authorization to read from s3://ops-data/* is another atomic permission, intended for developers. (AKASA developers work across all clients instead of being assigned to specific clients.)
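To make the naming convention concrete, here is a minimal sketch of how the resources for a per-client permission and for the developer-wide permission could be derived from the s3://ops-data/{client} layout. The helper name s3_read_resources is hypothetical and not part of the AKASA code.

```python
from typing import List, Optional


def s3_read_resources(bucket: str, client: Optional[str] = None) -> List[str]:
    """Return the object ARNs covered by one atomic read permission.

    With a client, only that client's prefix is covered; without one,
    every object in the bucket is covered (the developer case).
    """
    prefix = f"{client}/*" if client else "*"
    return [f"arn:aws:s3:::{bucket}/{prefix}"]


print(s3_read_resources("ops-data", "tcm"))  # one client only
print(s3_read_resources("ops-data"))         # all clients
```

Because the client name is part of the key prefix, narrowing or widening a permission is only a matter of changing the resource string.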
    from typing import List, Optional

    # Two spaces per level, so that Action/Resource line up under the Effect key.
    INDENT = '  '
    # A Statement entry sits five levels deep in the CloudFormation template.
    DEFAULT_INDENT = 5


    class Condition():
        def __init__(
            self,
            name,
            operator: str,
            condition_key: str,
            values: List[str],
        ):
            self.name = name
            self.operator = operator
            self.condition_key = condition_key
            self.values = values

        def to_yaml(self, **kargs) -> str:
            """Convert the condition to a yaml string"""
            base_indent = (DEFAULT_INDENT + 2) * INDENT
            name_str = f"\n{base_indent}# {self.name}"
            operator_str = f"\n{base_indent}{self.operator}:"
            condition_str = f"\n{base_indent}{INDENT}{self.condition_key}:"
            values_str = ''
            for value in self.values:
                values_str += f"\n{base_indent}{2*INDENT}- \"{value}\""
            yaml = name_str + operator_str + condition_str + values_str
            for k, v in kargs.items():
                # str.replace returns a new string, so keep the result.
                yaml = self._template_replace(yaml, k, v)
            return yaml

        def __lt__(self, other):
            # Order by name so that sorted() gives a stable output.
            return self.name < other.name

        def _template_replace(self, template, key, value):
            # "{{{}}}".format("x") yields "{x}", the placeholder syntax.
            return template.replace("{{{}}}".format(str(key)), value)


    class Policy():
        def __init__(
            self,
            name: str,
            effect: str,
            actions: List[str],
            resources: List[str],
            conditions: Optional[List[Condition]] = None,
        ):
            """Initialize the Policy object"""
            # A bare string would be iterated character by character.
            if isinstance(actions, str):
                raise ValueError("Actions must be a list, got %r." % actions)
            self.name = name
            self.effect = effect
            self.actions = actions
            self.resources = resources
            self.conditions = conditions

        def to_yaml(self, **kargs) -> str:
            """Convert the policy to a yaml string"""
            base_indent = DEFAULT_INDENT * INDENT
            name_str = f"\n{base_indent}# {self.name}"
            effect_str = f"\n{base_indent}- Effect: \"{self.effect}\""
            action_str = ''
            for action in self.actions:
                action_str += f"\n{base_indent}{2*INDENT}- \"{action}\""
            action_str = f"\n{base_indent}{INDENT}Action:{action_str}"
            resource_str = ''
            for resource in self.resources:
                resource_str += f"\n{base_indent}{2*INDENT}- \"{resource}\""
            resource_str = f"\n{base_indent}{INDENT}Resource:{resource_str}"
            condition_str = ''
            if self.conditions is not None:
                condition_str += f"\n{base_indent}{INDENT}Condition:"
                for condition in self.conditions:
                    condition_str += condition.to_yaml()
            yaml = name_str + effect_str + action_str + resource_str + condition_str
            # Placeholders inside the conditions are replaced here as well.
            for k, v in kargs.items():
                yaml = self._template_replace(yaml, k, v)
            return yaml

        def __lt__(self, other):
            # Order by name so that PolicyGroup can sort policies.
            return self.name < other.name

        def _template_replace(self, template, key, value):
            return template.replace("{{{}}}".format(str(key)), value)
An atomic permission needs a name, an effect, a list of actions, a list of resources, and an optional list of conditions. See below for an example Policy named S3AllowListOnBucketMldata that allows s3:ListBucket on the bucket ops-data.
Note that s3:ListBucket grants permission to list some or all of the objects in a bucket, so its resource must be a bucket rather than an object path. To allow listing only a subset of the objects, add a condition. See the example S3AllowListOnBucketMldataPrefixDevserver, which allows s3:ListBucket only for prefixes matching the pattern devserver/* (StringLike matches with wildcards, not regular expressions).
    POLICY_S3_LIST_MLDATA_BUCKET = Policy(
        'S3AllowListOnBucketMldata',
        'Allow',
        ['s3:ListBucket'],
        [
            'arn:aws:s3:::ops-data',
        ],
    )

    POLICY_S3_LIST_MLDATA_BUCKET_PREFIX_DEVSERVER = Policy(
        'S3AllowListOnBucketMldataPrefixDevserver',
        'Allow',
        ['s3:ListBucket'],
        [
            'arn:aws:s3:::ops-data',
        ],
        conditions=[
            Condition(
                name="RestrictListBucketToPrefix",
                operator="StringLike",
                condition_key="s3:prefix",
                values=["devserver/*"],
            )
        ],
    )
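To show which prefixes a StringLike condition on s3:prefix lets through, the sketch below approximates the wildcard matching locally. It is an illustration only: the function iam_string_like is hypothetical, it ignores IAM policy variables, and the real evaluation happens inside AWS.

```python
import re


def iam_string_like(pattern: str, value: str) -> bool:
    """Approximate IAM StringLike matching.

    '*' matches any run of characters (including none), '?' matches
    exactly one character, everything else is literal and case-sensitive.
    """
    regex = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None


print(iam_string_like("devserver/*", "devserver/logs/run1.txt"))  # True
print(iam_string_like("devserver/*", "streaming/run1.txt"))       # False
```

A list request for the prefix devserver/logs/ would therefore satisfy the condition, while a request for streaming/ would not.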
As shown in the architecture, the middle and leaf nodes are PolicyGroups. Each leaf node is mapped to an engineer. The code for this mapping lives in the Generator.
As shown in the code below, a PolicyGroup contains a list of Policies and a list of PolicyGroups. The flatten() function collects all the Policies recursively through a depth-first search and removes duplicates. The to_yaml() function sorts the flattened Policies by name and renders them as YAML. The generated YAML files are managed in git, and the sorting is mainly there to keep file differences small and readable in GitHub pull requests.
    class PolicyGroup():
        def __init__(self, policies=None, policy_groups=None):
            """Initialize the object with policies and policy_groups.

            Note that PolicyGroup has a nested definition, i.e. a policyGroup
            can contain other policyGroups. But there is no need to do cycle
            detection: the policyGroups are immutable, so a parent can never
            refer to its child (the child does not exist yet when the parent
            is initialized), and thus a cycle cannot exist."""
            self.policies = policies
            self.policy_groups = policy_groups
            self.policies_dedupped = None

        def flatten(self):
            # Return the cached result if this group was flattened before.
            if self.policies_dedupped is not None:
                return self.policies_dedupped
            self.policies_dedupped = set()
            if self.policies is not None:
                for policy in self.policies:
                    self.policies_dedupped.add(policy)
            if self.policy_groups is not None:
                # Depth-first: each child group flattens itself recursively.
                for policy_group in self.policy_groups:
                    self.policies_dedupped.update(policy_group.flatten())
            return self.policies_dedupped

        def to_yaml(self, **kargs) -> str:
            """Convert the policyGroup to a yaml string"""
            flattened = self.flatten()
            yaml_str = ''
            # Sorting relies on Policy.__lt__, which compares names.
            for policy in sorted(flattened):
                yaml_str += policy.to_yaml(**kargs)
            return yaml_str
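The effect of flattening and sorting is easiest to see on a small example. The sketch below uses simplified stand-in classes (SimplePolicy and SimpleGroup are hypothetical and carry only a name) rather than the classes above, so that it runs on its own.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True, order=True)
class SimplePolicy:
    """Stand-in for Policy: hashable and ordered by name."""
    name: str


@dataclass(frozen=True)
class SimpleGroup:
    """Stand-in for PolicyGroup: direct policies plus nested groups."""
    policies: Tuple[SimplePolicy, ...] = ()
    groups: Tuple["SimpleGroup", ...] = ()

    def flatten(self) -> FrozenSet[SimplePolicy]:
        # Depth-first: take own policies, then recurse into each child.
        found = set(self.policies)
        for group in self.groups:
            found |= group.flatten()
        return frozenset(found)


list_bucket = SimplePolicy("S3AllowListOnBucketMldata")
read_streaming = SimplePolicy("S3AllowReadOnBucketMldataPrefixStreaming")

team = SimpleGroup(policies=(read_streaming, list_bucket))
# The personal group repeats list_bucket; the set removes the duplicate.
personal = SimpleGroup(policies=(list_bucket,), groups=(team,))

names = [policy.name for policy in sorted(personal.flatten())]
print(names)
```

Even though list_bucket is reachable twice, it is rendered once, and the name ordering is the same on every run, which keeps the generated files stable.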
Generator
For each engineer, we use CloudFormation to generate an IAM instance profile, which is attached to the engineer’s EC2 devserver. The CloudFormation manifest is generated by filling a template. The template includes a few things: a few IAM policies, an IAM role that is associated with these IAM policies, and an IAM instance profile that “contains” the IAM role.
In theory, each PolicyGroup produces a single IAM policy. So why does a CloudFormation stack contain several? The reason is that AWS IAM has a hard limit of 6,144 characters for each IAM policy (see Ref), and the IAM policy generated for a PolicyGroup easily exceeds that limit, so we have to split it. For now we use a simple heuristic: all S3-related permissions go to the first IAM policy, all SecretsManager-related permissions go to the second, and everything else goes to the third. We can always split further if needed.
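The splitting heuristic can be sketched as a function that looks at the service prefix of each statement's actions. This is a minimal sketch, not the generator's actual code: the function names are hypothetical, the labels follow the policy names used in this article, and sending mixed-service statements to the catch-all policy is an assumption.

```python
from typing import Dict, List


def bucket_for(actions: List[str]) -> str:
    """Pick the IAM policy a statement goes into, based on its actions."""
    services = {action.split(":", 1)[0].lower() for action in actions}
    if services == {"s3"}:
        return "S3"
    if services == {"secretsmanager"}:
        return "SecretManager"
    # Assumption: statements that mix services, or use any other
    # service, fall through to the catch-all policy.
    return "Misc"


def split_statements(statements: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Group statement names by target policy, sorted for stable output."""
    buckets: Dict[str, List[str]] = {"S3": [], "SecretManager": [], "Misc": []}
    for name, actions in statements.items():
        buckets[bucket_for(actions)].append(name)
    return {label: sorted(names) for label, names in buckets.items()}


# Statement names other than the first are made up for illustration.
example = {
    "S3AllowListOnBucketMldata": ["s3:ListBucket"],
    "SecretsAllowReadExample": ["secretsmanager:GetSecretValue"],
    "EcrAllowPullExample": ["ecr:BatchGetImage", "ecr:GetDownloadUrlForLayer"],
}
print(split_statements(example))
```

If one of the three policies grows past the limit again, the same function can be extended with another service prefix.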
Template for instance profile
The CloudFormation template for IAM instance profile is shown below. {Policies} is a combination of multiple IAM policies, which are generated by the IAM policy template below. Using engineer Jane Doe as an example, RoleName will be PersonalRoleJaneDoe, InstanceProfileName will be PersonalInstanceProfileJaneDoe, and the three PolicyNames will be PersonalPolicyS3JaneDoe, PersonalPolicySecretManagerJaneDoe and PersonalPolicyMiscJaneDoe.
The code used to render the CloudFormation manifest from the template is in GitHub.
    AWSTemplateFormatVersion: {TemplateFormatVersion}
    Description: {ProfileDescription}
    Resources:
      {Policies}
      {RoleName}:
        Type: AWS::IAM::Role
        Properties:
          AssumeRolePolicyDocument:
            Version: {AssumeRolePolicyDocumentVersion}
            Statement:
              - Effect: Allow
                Principal:
                  Service:
                    - ec2.amazonaws.com
                Action:
                  - sts:AssumeRole
              - Effect: Allow
                Principal:
                  AWS:
                    - arn:aws:iam::025412125743:user/ServiceUserProdKrun
                Action:
                  - sts:AssumeRole
          Path: "/"
          ManagedPolicyArns:
            - arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy
            {PolicyReferences}
          RoleName: {RoleName}
      {InstanceProfileName}:
        Type: AWS::IAM::InstanceProfile
        Properties:
          InstanceProfileName: {InstanceProfileName}
          Path: "/"
          Roles:
            - !Ref {RoleName}

IAM policy template:

      {PolicyName}:
        Type: AWS::IAM::ManagedPolicy
        Properties:
          Description: {PolicyDescription}
          ManagedPolicyName: {PolicyName}
          Path: /
          PolicyDocument:
            Version: {PolicyDocumentVersion}
            Statement: {Effects}
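As a rough illustration of the rendering step, the sketch below derives the per-engineer names described above and fills {Key} placeholders by plain string replacement, the same mechanism as _template_replace. The helper names (personal_names, policy_names, fill) are hypothetical, and the actual rendering code may differ.

```python
from typing import Dict, List


def _suffix(engineer: str) -> str:
    """Turn 'Jane Doe' into 'JaneDoe'."""
    return "".join(part[:1].upper() + part[1:] for part in engineer.split())


def personal_names(engineer: str) -> Dict[str, str]:
    """Values for the role and instance profile placeholders."""
    suffix = _suffix(engineer)
    return {
        "RoleName": f"PersonalRole{suffix}",
        "InstanceProfileName": f"PersonalInstanceProfile{suffix}",
    }


def policy_names(engineer: str) -> List[str]:
    """Names of the three IAM policies generated for one engineer."""
    suffix = _suffix(engineer)
    return [f"PersonalPolicy{kind}{suffix}" for kind in ("S3", "SecretManager", "Misc")]


def fill(template: str, values: Dict[str, str]) -> str:
    """Replace every {Key} placeholder with its value."""
    for key, value in values.items():
        template = template.replace("{" + key + "}", value)
    return template


snippet = "RoleName: {RoleName}\nInstanceProfileName: {InstanceProfileName}"
print(fill(snippet, personal_names("Jane Doe")))
print(policy_names("Jane Doe"))
```

Plain replacement leaves unknown placeholders untouched, so a template can be filled in several passes.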
Note that in addition to personal roles, the framework discussed above can be applied to other kinds of IAM needs, as long as appropriate templates are defined.
For example, to use the framework to manage an IAM user, the template is defined as below. We won’t go into the details of this template in this article.
    AWSTemplateFormatVersion: {TemplateFormatVersion}
    Description: {StackDescription}
    Resources:
      {PolicyName}:
        Type: AWS::IAM::ManagedPolicy
        Properties:
          Description: {PolicyDescription}
          ManagedPolicyName: {PolicyName}
          Path: "/"
          PolicyDocument:
            Version: {PolicyDocumentVersion}
            Statement: {Effects}
      {GroupName}:
        Type: AWS::IAM::Group
        Properties:
          GroupName: {GroupName}
          ManagedPolicyArns:
            - !Ref {PolicyName}
      {UserName}:
        Type: AWS::IAM::User
        Properties:
          Groups:
            - !Ref {GroupName}
          UserName: {UserName}
Procedure of Changes
Using the IAM-as-code framework discussed above, the procedure for making IAM changes is as follows.
- An engineer makes code changes in the Python scripts
- The engineer runs the CloudFormation generator, which modifies or adds files in the repo
- The engineer submits a GitHub PR with the file changes
- The DevOps team reviews the PR and leaves comments, which are then fixed
- The engineer (or DevOps) merges the PR
- The DevOps team deploys the IAM changes (only DevOps team has permissions to do so)
Example: Add read permission to an S3 bucket (and prefix) to a team
The tech lead of the team creates an S3-read Policy, see below. It allows listing all objects in the bucket and reading the objects under the prefix streaming/*.
The tech lead then adds the Policy to the team's PolicyGroup.
Since each ML team member’s PolicyGroup either contains POLICY_GROUP_TEAM_ML or is POLICY_GROUP_TEAM_ML itself, the change applies to all ML team members.
    POLICY_S3_READ_MLDATA_BUCKET_PREFIX_STREAMING = Policy(
        'S3AllowReadOnBucketMldataPrefixStreaming',
        'Allow',
        [
            's3:GetObject',
            's3:GetObjectAcl',
            's3:GetObjectVersion',
            's3:ListBucket',
            's3:ListObjectVersions',
            's3:GetBucketLocation',
        ],
        [
            'arn:aws:s3:::ml-data',
            'arn:aws:s3:::ml-data/streaming/*',
        ],
    )

    POLICY_GROUP_TEAM_ML = PolicyGroup(policies=[
        ## Existing policies ...
        POLICY_S3_READ_MLDATA_BUCKET_PREFIX_STREAMING,
    ])
Example: The tech lead needs a special permission
As above, the tech lead (say, John Doe) creates a Policy for the special permission and then adds it to a personal PolicyGroup that also includes the team's PolicyGroup, see below. If the generator previously used the mapping JohnDoe: POLICY_GROUP_TEAM_ML, s/he now changes it to JohnDoe: POLICY_GROUP_JOHN_DOE. Nothing else has to change, which shows how flexible the system is.
    POLICY_ML_SPECIAL = ...

    POLICY_GROUP_JOHN_DOE = PolicyGroup(
        policies=[
            POLICY_ML_SPECIAL,
        ],
        # The keyword must match the PolicyGroup constructor argument.
        policy_groups=[
            POLICY_GROUP_TEAM_ML,
        ],
    )
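The engineer-to-group mapping mentioned above could look roughly like the sketch below. It is only an illustration: each group is reduced to the set of policy names it flattens to, and the names ENGINEER_TO_GROUP and MlSpecial are hypothetical.

```python
from typing import Dict, FrozenSet

# Stand-ins: each group is just the set of policy names it flattens to.
POLICY_GROUP_TEAM_ML: FrozenSet[str] = frozenset(
    {"S3AllowReadOnBucketMldataPrefixStreaming"}
)
# The personal group is the team group plus one extra policy.
POLICY_GROUP_JOHN_DOE: FrozenSet[str] = POLICY_GROUP_TEAM_ML | {"MlSpecial"}

# Hypothetical generator mapping: one entry per engineer.
ENGINEER_TO_GROUP: Dict[str, FrozenSet[str]] = {
    "JaneDoe": POLICY_GROUP_TEAM_ML,
    "JohnDoe": POLICY_GROUP_JOHN_DOE,  # was POLICY_GROUP_TEAM_ML before
}

for engineer in sorted(ENGINEER_TO_GROUP):
    print(engineer, sorted(ENGINEER_TO_GROUP[engineer]))
```

Only John Doe's entry changes; every other engineer keeps pointing at the shared team group.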
Additional Discussion
Since we use Kubernetes (AWS managed EKS), developers need to be authenticated when operating the Kubernetes cluster. We use a ConfigMap to map the developers’ IAM roles to a ClusterRole (called eng-clusterrole), which has the proper authorizations to operate the Kubernetes cluster. If you are not familiar with ClusterRole in Kubernetes, refer to this link.
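On EKS, role mappings of this kind are commonly kept in the mapRoles field of the aws-auth ConfigMap, where each entry names a role ARN, a username, and the Kubernetes groups that are then bound to a ClusterRole. The sketch below generates such entries for the personal roles. Whether AKASA uses aws-auth is an assumption, and the account ID and group name are placeholders.

```python
from typing import List


def map_roles_entries(account_id: str, engineers: List[str], group: str) -> str:
    """Render mapRoles entries for each engineer's personal IAM role."""
    lines: List[str] = []
    for engineer in sorted(engineers):
        lines += [
            f"- rolearn: arn:aws:iam::{account_id}:role/PersonalRole{engineer}",
            f"  username: {engineer}",
            "  groups:",
            f"    - {group}",
        ]
    return "\n".join(lines)


# Placeholder account ID and a hypothetical group bound to eng-clusterrole.
print(map_roles_entries("111122223333", ["JaneDoe"], "eng-group"))
```

Generating these entries from the same engineer list as the IAM roles keeps the two mappings from drifting apart.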
Restricting the developers’ access to the AWS console raises a problem: how can developers get CodeBuild artifacts and logs? (Context: at AKASA we use AWS CodeBuild to build Docker images and other artifacts.) Our solution is a tool called cbgm, a script that wraps AWS CLI commands to help developers fetch CodeBuild logs and artifacts.
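The contents of cbgm are not shown here, so the following is only a guess at the kind of AWS CLI commands such a script would wrap. The function names and all example values are hypothetical, and the commands are built as argument lists without being executed.

```python
import shlex
from typing import List


def build_info_cmd(build_id: str) -> List[str]:
    """Describe a build, including its log group/stream and artifact location."""
    return ["aws", "codebuild", "batch-get-builds", "--ids", build_id]


def build_logs_cmd(log_group: str, log_stream: str) -> List[str]:
    """Fetch the build's log events from CloudWatch Logs."""
    return [
        "aws", "logs", "get-log-events",
        "--log-group-name", log_group,
        "--log-stream-name", log_stream,
    ]


def artifact_copy_cmd(artifact_s3_uri: str, destination: str) -> List[str]:
    """Download a build artifact from S3."""
    return ["aws", "s3", "cp", artifact_s3_uri, destination]


# Example values are placeholders, not real builds or buckets.
for cmd in (
    build_info_cmd("example-project:1234"),
    build_logs_cmd("/aws/codebuild/example-project", "1234"),
    artifact_copy_cmd("s3://example-artifacts/build.zip", "."),
):
    print(shlex.join(cmd))
```

A wrapper like this only needs CodeBuild, CloudWatch Logs, and S3 read permissions on the developer's role, so no console access is required.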
Summary
It was a long journey, but after much collaboration and trial and error, we found a system that works for our team and allows us to build world-class automation for healthcare operations. To recap:
- Each engineer has his/her own IAM role, thus all logged activities have a unique owner and can be easily tracked.
- The multi-inheritance structure of PolicyGroups eliminates duplicated permission definitions.
- The DevOps team keeps centralized control, which enforces the principle of least privilege.
- Changing permissions is a systematic, fast, and secure process.
If you’re interested in joining our talented team and want to help us drive down the cost of healthcare in the U.S., we’re always looking for talented engineers. Be sure to check out our open positions today. We can’t wait for you to be a part of our growing team.