# Authentication Architecture
This guide explains how authentication and authorization work in SYNDI, including the JWT token flow, Cognito integration, and RBAC implementation.
## Overview
SYNDI uses **AWS Cognito** for authentication with **JWT tokens** for API authorization. The system implements **role-based access control (RBAC)** using Cognito groups with fine-grained permissions.
### Key Components
- **AWS Cognito User Pools** - User authentication and management
- **JWT Tokens** - Stateless authorization (Access and ID tokens)
- **Cognito Groups** - Role assignment (ADMINS, LAB_MANAGERS, RESEARCHERS, CLINICIANS)
- **Permission System** - Wildcard-based permissions (`submit:SOP*`, `view:*`, etc.)
- **Backend Validation** - AuthValidator in `backend/rawscribe/utils/auth.py`
- **Frontend Enforcement** - UX-level access control
## Authentication Flow
### Complete Login Flow
```
1. User enters credentials in frontend
↓
2. Frontend sends to Cognito
POST https://cognito-idp.{region}.amazonaws.com/
Body: {username, password, clientId}
↓
3. Cognito validates credentials
↓
4. Cognito returns JWT tokens
Response: {AccessToken, IdToken, RefreshToken}
↓
5. Frontend stores tokens (localStorage or sessionStorage)
↓
6. Frontend sends requests with Authorization header
Authorization: Bearer {AccessToken}
↓
7. API Gateway validates token (optional Cognito authorizer)
↓
8. Lambda receives request with token
↓
9. AuthValidator validates token and extracts user info
↓
10. Lambda processes request with user context
↓
11. Response returned to frontend
```
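Steps 1 to 4 can be sketched as the raw HTTP request a client would send to Cognito. This is a minimal illustration, assuming the `USER_PASSWORD_AUTH` flow that the app client enables; `build_initiate_auth_request` is a hypothetical helper, not part of SYNDI, and the frontend may well use an SDK instead of building the request by hand.

```python
import json


def build_initiate_auth_request(region: str, client_id: str,
                                username: str, password: str) -> dict:
    """Build (but do not send) a Cognito InitiateAuth request.

    Hypothetical helper for illustration only.
    """
    return {
        "url": f"https://cognito-idp.{region}.amazonaws.com/",
        "headers": {
            # Cognito's JSON API selects the operation via X-Amz-Target
            "Content-Type": "application/x-amz-json-1.1",
            "X-Amz-Target": "AWSCognitoIdentityProviderService.InitiateAuth",
        },
        "body": json.dumps({
            "AuthFlow": "USER_PASSWORD_AUTH",
            "ClientId": client_id,
            "AuthParameters": {"USERNAME": username, "PASSWORD": password},
        }),
    }
```

On success, the tokens from step 4 arrive under the `AuthenticationResult` key of the JSON response.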
### Token Lifecycle
**Token Issuance:**
```
User Login → Cognito → JWT Tokens (signed with Cognito private key)
```
**Token Validation:**
```
Request → Lambda → AuthValidator → Fetch Cognito public keys (JWKS)
→ Verify signature
→ Check expiration
→ Extract user claims
→ Grant/deny access
```
**Token Expiration:**
- **Access Token**: 1 hour
- **ID Token**: 1 hour
- **Refresh Token**: 30 days
**Token Refresh:**
```
Frontend detects token near expiry
↓
Send refresh token to Cognito
↓
Receive new Access and ID tokens
↓
Continue using API
```
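The "near expiry" check in the refresh flow can be sketched by reading the `exp` claim from the token payload. The helper names and the 5-minute margin below are assumptions for illustration, not SYNDI's actual values. The payload is decoded without signature verification, which is acceptable only for scheduling a refresh and never for authorization.

```python
import base64
import json
import time
from typing import Optional


def read_unverified_claims(token: str) -> dict:
    """Decode the JWT payload WITHOUT verifying the signature."""
    payload_b64 = token.split(".")[1]
    # JWTs use unpadded base64url, so restore the padding before decoding
    padding = "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64 + padding))


def is_near_expiry(token: str, skew_seconds: int = 300,
                   now: Optional[float] = None) -> bool:
    """Return True when the token expires within `skew_seconds`."""
    claims = read_unverified_claims(token)
    current = time.time() if now is None else now
    return claims["exp"] - current <= skew_seconds
```

A client would call `is_near_expiry(access_token)` before each request and trigger the refresh flow when it returns `True`.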
## JWT Token Structure
### Access Token
Used for API authorization:
**Claims:**
```json
{
  "sub": "uuid-1234-5678",                 // User ID (UUID)
  "cognito:groups": ["RESEARCHERS"],       // User's groups
  "token_use": "access",                   // Token type
  "iss": "https://cognito-idp.us-east-1.amazonaws.com/{pool-id}",
  "client_id": "abc123def456",
  "username": "uuid_with_underscores",     // Hyphens replaced
  "exp": 1706654321,                       // Expiration timestamp
  "iat": 1706650721                        // Issued at timestamp
}
```
**Usage:** Send in Authorization header for API requests
**Email derivation:** The access token carries no `email` claim, so for UUID usernames the email is derived as `{username}@cognito.local`
### ID Token
Used for user identity information:
**Claims:**
```json
{
  "sub": "uuid-1234-5678",                 // User ID
  "cognito:groups": ["RESEARCHERS"],       // User's groups
  "email": "researcher1@myorg.com",        // User's email
  "name": "Jane Researcher",               // User's name
  "cognito:username": "researcher1",       // Username (email prefix)
  "token_use": "id",                       // Token type
  "iss": "https://cognito-idp.us-east-1.amazonaws.com/{pool-id}",
  "exp": 1706654321,
  "iat": 1706650721
}
```
**Usage:** Get user profile information, display name in UI
## JWT Validation Process
### Backend Implementation
Located in: `backend/rawscribe/utils/auth.py`
**Validation steps:**
```python
# 1. Extract token from Authorization header
token = request.headers.get('Authorization', '').replace('Bearer ', '')
# 2. Decode token (verify signature)
decoded = jwt.decode(
    token,
    cognito_public_key,  # Fetched from JWKS endpoint
    algorithms=['RS256'],
    options={"verify_signature": True}
)
# 3. Validate claims
#    - Check expiration (exp)
#    - Verify issuer matches User Pool
#    - Verify token type (access or id)
#    - Check audience (`aud` on ID tokens, `client_id` on access tokens)
# 4. Extract user information
user_id = decoded.get('sub')
username = decoded.get('username') or decoded.get('cognito:username')
groups = decoded.get('cognito:groups', [])
# Email is the username itself when it is an address, otherwise derived
email = username if '@' in username else f"{username}@cognito.local"
# 5. Map groups to permissions
permissions = self._map_cognito_permissions(groups)
# 6. Create user context
user = {
    'id': user_id,
    'username': username,
    'email': email,
    'groups': groups,
    'permissions': permissions
}
```
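Step 3 is only listed as comments above. A minimal sketch of those checks follows, assuming the claims have already been decoded and signature-verified; `ClaimError` and `validate_claims` are hypothetical names, not taken from `auth.py`. It relies on the fact that Cognito access tokens carry `client_id` while ID tokens carry `aud`.

```python
import time
from typing import Optional


class ClaimError(Exception):
    """Raised when a decoded token fails a claim check (hypothetical type)."""


def validate_claims(claims: dict, issuer: str, client_id: str,
                    now: Optional[float] = None) -> None:
    """Check exp, iss, token_use and the client binding of decoded claims."""
    current = time.time() if now is None else now
    if claims.get("exp", 0) <= current:
        raise ClaimError("token expired")
    if claims.get("iss") != issuer:
        raise ClaimError("unexpected issuer")
    token_use = claims.get("token_use")
    if token_use == "access":
        # Access tokens name the app client in `client_id`
        if claims.get("client_id") != client_id:
            raise ClaimError("unexpected client_id")
    elif token_use == "id":
        # ID tokens name the app client in `aud`
        if claims.get("aud") != client_id:
            raise ClaimError("unexpected audience")
    else:
        raise ClaimError("unexpected token_use")
```

Libraries such as PyJWT or python-jose can perform the `exp` and `iss` checks during decoding; the sketch spells them out to make each step visible.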
### Environment Variables vs Config Files
**Priority order:**
```python
# 1. Check environment variables (from CloudFormation)
cognito_region = os.environ.get('COGNITO_REGION')
cognito_pool_id = os.environ.get('COGNITO_USER_POOL_ID')
cognito_client_id = os.environ.get('COGNITO_CLIENT_ID')
# 2. Fall back to config file
if not cognito_pool_id:
    cognito_pool_id = config.get('lambda', {}).get('auth', {}).get('cognito', {}).get('userPoolId')
```
**Why environment variables first:**
- Set by CloudFormation (always correct for deployment)
- No config file loading failures
- Faster access
- Automatic updates on redeployment
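The same priority order can be captured in one reusable lookup. This is a sketch only; `resolve_setting` is a hypothetical helper and the example values are made up.

```python
import os
from typing import Any, List, Optional


def resolve_setting(env_name: str, config: dict,
                    path: List[str]) -> Optional[Any]:
    """Return the environment variable if set, else the nested config value."""
    value = os.environ.get(env_name)
    if value:
        return value
    node: Any = config
    for key in path:
        # Stop early when an intermediate level is missing
        if not isinstance(node, dict):
            return None
        node = node.get(key)
    return node


config = {"lambda": {"auth": {"cognito": {"userPoolId": "us-east-1_FROMFILE"}}}}
pool_id = resolve_setting(
    "COGNITO_USER_POOL_ID", config, ["lambda", "auth", "cognito", "userPoolId"]
)
```

An empty environment variable is treated as unset, which matches the `or` fallback used in `AuthValidator.__init__`.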
## RBAC Implementation
### Group to Permission Mapping
Located in: `backend/rawscribe/utils/auth.py:_map_cognito_permissions()`
```python
def _map_cognito_permissions(self, groups: List[str]) -> List[str]:
    """Map Cognito groups to SYNDI permissions"""
    permission_mapping = {
        'ADMINS': ['*'],
        'LAB_MANAGERS': ['submit:*', 'view:*', 'approve:*', 'export:*'],
        'RESEARCHERS': ['submit:SOP*', 'view:own', 'view:group', 'draft:*'],
        'CLINICIANS': ['submit:clinical*', 'view:own']
    }
    permissions = []
    for group in groups:
        permissions.extend(permission_mapping.get(group, ['view:own']))
    return list(set(permissions))
```
**Note:** Legacy mapping also supports lowercase group names (admin, researcher, viewer) for backward compatibility.
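A worked example may help show how membership in several groups combines. The standalone function below mirrors the mapping above but is a sketch rather than the code in `auth.py`; it sorts the result so the output is deterministic, and it omits the legacy lowercase names.

```python
from typing import Dict, List

PERMISSION_MAPPING: Dict[str, List[str]] = {
    "ADMINS": ["*"],
    "LAB_MANAGERS": ["submit:*", "view:*", "approve:*", "export:*"],
    "RESEARCHERS": ["submit:SOP*", "view:own", "view:group", "draft:*"],
    "CLINICIANS": ["submit:clinical*", "view:own"],
}


def map_groups(groups: List[str]) -> List[str]:
    """Union the permissions of every group; unknown groups get view:own."""
    permissions = set()
    for group in groups:
        permissions.update(PERMISSION_MAPPING.get(group, ["view:own"]))
    return sorted(permissions)


print(map_groups(["RESEARCHERS", "CLINICIANS"]))
# ['draft:*', 'submit:SOP*', 'submit:clinical*', 'view:group', 'view:own']
print(map_groups(["UNKNOWN_GROUP"]))  # ['view:own']
print(map_groups([]))                 # []
```

Note the last case: a user who belongs to no group receives no permissions at all, because the `view:own` default applies per unrecognized group, not per user.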
### Permission Format
Permissions follow the pattern `{action}:{resource}`.
**Examples:**
- `*` - All permissions (ADMINS only)
- `submit:SOP*` - Submit any SOP
- `submit:clinical*` - Submit clinical forms only
- `view:own` - View own submissions
- `view:group` - View team submissions
- `view:*` - View all submissions
- `draft:*` - Full draft management
- `approve:*` - Approve submissions
- `export:*` - Export data
### Permission Checking
```python
def has_permission(user: dict, required_permission: str) -> bool:
    """Check if user has required permission"""
    user_permissions = user.get('permissions', [])
    # Admin wildcard
    if '*' in user_permissions:
        return True
    # Exact match
    if required_permission in user_permissions:
        return True
    # Wildcard match (e.g., submit:* matches submit:SOP123)
    for perm in user_permissions:
        if perm.endswith('*'):
            prefix = perm[:-1]
            if required_permission.startswith(prefix):
                return True
    return False
```
## Cognito Integration
### User Pool Configuration
Created by CloudFormation when `ENABLE_AUTH=true`:
```yaml
CognitoUserPool:
  Type: AWS::Cognito::UserPool
  Properties:
    UserPoolName: !Sub 'rawscribe-${Environment}-${Organization}-userpool'
    UsernameAttributes: [email]
    AutoVerifiedAttributes: [email]
    Policies:
      PasswordPolicy:
        MinimumLength: 8
        RequireUppercase: true
        RequireLowercase: true
        RequireNumbers: true
        RequireSymbols: true
    Schema:
      - Name: email
        Required: true
        Mutable: false
      - Name: name
        Required: false
        Mutable: true
```
### App Client Configuration
```yaml
CognitoUserPoolClient:
  Type: AWS::Cognito::UserPoolClient
  Properties:
    ClientName: !Sub 'rawscribe-${Environment}-${Organization}-client'
    UserPoolId: !Ref CognitoUserPool
    GenerateSecret: false
    ExplicitAuthFlows:
      - ALLOW_USER_PASSWORD_AUTH
      - ALLOW_REFRESH_TOKEN_AUTH
      - ALLOW_ADMIN_USER_PASSWORD_AUTH
    PreventUserExistenceErrors: ENABLED
```
**Auth flows enabled:**
- `ADMIN_USER_PASSWORD_AUTH` - For server-side sign-in through the Cognito admin API (used during backend user creation)
- `USER_PASSWORD_AUTH` - For frontend login
- `REFRESH_TOKEN_AUTH` - For token refresh
### Cognito Groups
Four groups created automatically:
```yaml
CognitoAdminGroup:
  GroupName: ADMINS
  Precedence: 1
CognitoLabManagerGroup:
  GroupName: LAB_MANAGERS
  Precedence: 2
CognitoResearcherGroup:
  GroupName: RESEARCHERS
  Precedence: 3
CognitoClinicianGroup:
  GroupName: CLINICIANS
  Precedence: 4
```
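**Precedence:** Lower number = higher priority. When a user belongs to several groups, Cognito uses precedence to choose which group's IAM role goes into the `cognito:preferred_role` claim; the `cognito:groups` claim still lists every group.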
## API Gateway Authorization
### Cognito Authorizer
Configured in `template.yaml`:
```yaml
ApiGateway:
  Type: AWS::Serverless::Api
  Properties:
    Auth:
      Authorizers:
        CognitoAuthorizer:
          UserPoolArn: !If
            - CreateUserPool
            - !GetAtt CognitoUserPool.Arn
            - !Sub 'arn:aws:cognito-idp:${AWS::Region}:${AWS::AccountId}:userpool/${CognitoUserPoolId}'
```
### Endpoint Protection
**Protected endpoints:**
```yaml
ApiProxy:
  Path: /api/{proxy+}
  Method: ANY
  Auth:
    Authorizer: CognitoAuthorizer
ApiV1Proxy:
  Path: /api/v1/{proxy+}
  Method: ANY
  Auth:
    Authorizer: CognitoAuthorizer
```
**Unprotected endpoints:**
```yaml
RootGet:
  Path: /
  Method: GET
  # No Auth - public health check
HealthGet:
  Path: /health
  Method: GET
  # No Auth - public health check
```
**Endpoint protection levels:**
- ✅ `/` and `/health` - Public (no auth required)
- 🔒 `/api/*` - Requires valid JWT token
- 🔐 Specific endpoints - RBAC enforced by Lambda code
## Security Model
### Multi-Layer Security
**Layer 1: API Gateway**
- Cognito authorizer validates JWT signature
- Checks token not expired
- Verifies token from correct User Pool
**Layer 2: Lambda Backend**
- Re-validates JWT (defense in depth)
- Extracts user information
- Maps groups to permissions
- Checks endpoint-specific permissions
**Layer 3: Frontend**
- UX-level enforcement
- Hides unauthorized features
- Client-side validation only (not trusted)
### Token Security
**What's in environment variables:**
- `COGNITO_USER_POOL_ID` - Public identifier (e.g., `us-east-1_ABC123`)
- `COGNITO_CLIENT_ID` - Public client ID (e.g., `abc123def456`)
- `COGNITO_REGION` - AWS region (e.g., `us-east-1`)
**These are NOT secrets** - They're configuration pointers telling Lambda which User Pool to validate against.
**Actual security:**
- JWT tokens signed by Cognito's private key
- Validation uses Cognito's public keys (fetched via HTTPS from JWKS endpoint)
- Signature proves token issued by correct Cognito User Pool
- Cannot forge tokens without Cognito's private key
### Cross-Organization Security
**Isolation mechanism:**
- Org1 Lambda has `COGNITO_USER_POOL_ID` = org1's pool
- Org2 Lambda has `COGNITO_USER_POOL_ID` = org2's pool
- Token from org1 won't validate in org2's Lambda
- Complete user and data isolation
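The isolation argument can be illustrated with the issuer claim, which embeds the User Pool ID. This is a sketch with made-up pool IDs and hypothetical helper names. In practice signature verification against each pool's own JWKS already rejects foreign tokens, and the issuer check is a complementary claim check.

```python
def expected_issuer(region: str, user_pool_id: str) -> str:
    """Issuer URL that Cognito places in the `iss` claim for a pool."""
    return f"https://cognito-idp.{region}.amazonaws.com/{user_pool_id}"


def issued_by_pool(claims: dict, region: str, user_pool_id: str) -> bool:
    """True only when the token was issued by the given User Pool."""
    return claims.get("iss") == expected_issuer(region, user_pool_id)


# Hypothetical pool IDs for two organizations
ORG1_POOL = "us-east-1_ORG1POOL"
ORG2_POOL = "us-east-1_ORG2POOL"

org1_claims = {"iss": expected_issuer("us-east-1", ORG1_POOL), "sub": "uuid-1234-5678"}

print(issued_by_pool(org1_claims, "us-east-1", ORG1_POOL))  # True
print(issued_by_pool(org1_claims, "us-east-1", ORG2_POOL))  # False
```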
## Username Handling
### Username Format Requirements
**Valid formats:**
- Email addresses: `user@myorg.com`
- No hyphens allowed (filesystem delimiter conflict)
**Username transformations:**
```python
# backend/rawscribe/utils/auth.py line 258
# UUID usernames have hyphens replaced with underscores
username = username.replace('-', '_')
# Email derivation for UUID usernames
if '@' not in username:
    email = f"{username}@cognito.local"
else:
    email = username
```
**Why no hyphens:**
- Filesystem path delimiters use hyphens
- Prevents path traversal issues
- Ensures consistent username format
### Username Types
**Email-based usernames:**
- Username: `researcher1@myorg.com`
- Email: `researcher1@myorg.com`
- Display name: From `name` attribute
**UUID-based usernames:**
- Username: `uuid_with_underscores` (hyphens replaced)
- Email: `uuid_with_underscores@cognito.local`
- Display name: From `name` attribute if set
## Token Validation Implementation
### AuthValidator Class
Located in: `backend/rawscribe/utils/auth.py`
**Initialization:**
```python
class AuthValidator:
    def __init__(self, config: dict):
        # Get Cognito configuration
        self.cognito_region = os.environ.get('COGNITO_REGION') or \
            config.get('lambda', {}).get('auth', {}).get('cognito', {}).get('region')
        self.cognito_user_pool_id = os.environ.get('COGNITO_USER_POOL_ID') or \
            config.get('lambda', {}).get('auth', {}).get('cognito', {}).get('userPoolId')
        self.cognito_client_id = os.environ.get('COGNITO_CLIENT_ID') or \
            config.get('lambda', {}).get('auth', {}).get('cognito', {}).get('clientId')
        # Fetch Cognito public keys for JWT verification
        self.cognito_keys = self._fetch_cognito_public_keys()
```
**Token validation:**
```python
async def validate_token(self, token: str) -> dict:
    """Validate JWT token and return user info"""
    try:
        # Pick the cached public key that matches the token's `kid` header
        kid = jwt.get_unverified_header(token).get('kid')
        public_key = self.cognito_keys.get(kid)
        if public_key is None:
            raise AuthenticationError("Unknown signing key")
        # Decode and verify JWT
        decoded = jwt.decode(
            token,
            public_key,
            algorithms=['RS256'],
            # ID tokens carry `aud`; access tokens carry `client_id` instead
            audience=self.cognito_client_id,
            issuer=f"https://cognito-idp.{self.cognito_region}.amazonaws.com/{self.cognito_user_pool_id}"
        )
        # Extract user information
        user_id = decoded.get('sub')
        username = decoded.get('username') or decoded.get('cognito:username')
        groups = decoded.get('cognito:groups', [])
        # Map groups to permissions
        permissions = self._map_cognito_permissions(groups)
        # Handle username format
        username = username.replace('-', '_')
        if '@' not in username:
            email = f"{username}@cognito.local"
        else:
            email = username
        return {
            'id': user_id,
            'username': username,
            'email': email,
            'groups': groups,
            'permissions': permissions,
            'isAdmin': '*' in permissions
        }
    except jwt.ExpiredSignatureError:
        raise AuthenticationError("Token expired")
    except jwt.InvalidTokenError as e:
        raise AuthenticationError(f"Invalid token: {str(e)}")
```
### Cognito Public Keys (JWKS)
**JWKS Endpoint:**
```
https://cognito-idp.{region}.amazonaws.com/{pool-id}/.well-known/jwks.json
```
**Fetching keys:**
```python
def _fetch_cognito_public_keys(self):
    """Fetch Cognito public keys for JWT verification"""
    jwks_url = f"https://cognito-idp.{self.cognito_region}.amazonaws.com/" \
               f"{self.cognito_user_pool_id}/.well-known/jwks.json"
    # Fail fast on network problems or a non-200 response
    response = requests.get(jwks_url, timeout=5)
    response.raise_for_status()
    jwks = response.json()
    # Convert JWKS to public key objects, indexed by key ID
    keys = {}
    for key in jwks['keys']:
        keys[key['kid']] = jwk.construct(key)
    return keys
```
**Key rotation:**
- Cognito automatically rotates keys
- JWKS fetched on Lambda cold start
- Cached during Lambda warm state
- Validates against current and previous keys
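One way to tolerate a rotation that happens while a Lambda instance is warm is to refetch the JWKS when a token names an unknown `kid`. The sketch below illustrates that idea under stated assumptions; `JwksCache` is a hypothetical class, the fetch function is injected so the example runs without network access, and this is not necessarily how `AuthValidator` behaves.

```python
from typing import Callable, Dict, Optional


class JwksCache:
    """Cache JWKS entries by `kid`, reloading once on a cache miss."""

    def __init__(self, fetch_jwks: Callable[[], dict]):
        # fetch_jwks returns the parsed JWKS document: {"keys": [...]}
        self._fetch_jwks = fetch_jwks
        self._keys: Dict[str, dict] = {}

    def _reload(self) -> None:
        jwks = self._fetch_jwks()
        self._keys = {key["kid"]: key for key in jwks.get("keys", [])}

    def get(self, kid: str) -> Optional[dict]:
        """Return the key for `kid`, or None if Cognito does not list it."""
        if kid not in self._keys:
            self._reload()
        return self._keys.get(kid)


# Usage with a stand-in fetcher instead of an HTTPS call
cache = JwksCache(lambda: {"keys": [{"kid": "key-1", "kty": "RSA"}]})
print(cache.get("key-1"))    # {'kid': 'key-1', 'kty': 'RSA'}
print(cache.get("missing"))  # None
```

A production version would likely rate-limit reloads, since every token with an unknown `kid` triggers a fetch here.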
## Permission System
### Permission Schema
Format: `{action}:{resource}`
**Actions:**
- `submit` - Create new submissions
- `view` - Read submissions
- `draft` - Manage drafts
- `approve` - Approve submissions
- `export` - Export data
- `admin` - Administrative actions
- `*` - All actions (wildcard)
**Resources:**
- `SOP*` - All SOPs
- `clinical*` - Clinical forms
- `own` - User's own data
- `group` - Team/group data
- `*` - All resources (wildcard)
### Wildcard Support
**Full wildcard** (`*`):
- Grants all permissions
- ADMINS only
- Matches any permission check
**Resource wildcard for `submit`** (`submit:*`):
- Grants the `submit` action on every resource
- Matches `submit:SOP123`, `submit:clinical456`, etc.
**Resource wildcard for `view`** (`view:*`):
- Grants the `view` action on every resource
- Matches `view:own`, `view:group`, `view:all`
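The matching rules can be checked with a few concrete cases. The function below repeats the `has_permission` logic shown earlier so the example is self-contained; the user dictionaries are illustrative and use the group mappings from this guide.

```python
def has_permission(user: dict, required_permission: str) -> bool:
    """Full wildcard, exact match, then trailing-* prefix match."""
    user_permissions = user.get("permissions", [])
    if "*" in user_permissions:
        return True
    if required_permission in user_permissions:
        return True
    for perm in user_permissions:
        if perm.endswith("*") and required_permission.startswith(perm[:-1]):
            return True
    return False


admin = {"permissions": ["*"]}
manager = {"permissions": ["submit:*", "view:*", "approve:*", "export:*"]}
researcher = {"permissions": ["submit:SOP*", "view:own", "view:group", "draft:*"]}
clinician = {"permissions": ["submit:clinical*", "view:own"]}

print(has_permission(admin, "export:anything"))          # True  (full wildcard)
print(has_permission(manager, "view:group"))             # True  (view:* prefix)
print(has_permission(researcher, "submit:SOP123"))       # True  (submit:SOP* prefix)
print(has_permission(researcher, "submit:clinical456"))  # False
print(has_permission(researcher, "view:all"))            # False (only own and group)
print(has_permission(clinician, "submit:SOP*"))          # False
```

The last line shows what happens when the required permission itself ends in `*`: it is compared as a literal string, so only an exact match or a broader prefix such as `submit:*` satisfies it.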
### Permission Checking in Routes
```python
from fastapi import Depends, HTTPException
from .utils.auth import get_current_user

@router.post("/api/v1/eln/submit")
async def submit_eln(user: dict = Depends(get_current_user)):
    """Submit ELN - requires submit:SOP* permission"""
    if not has_permission(user, 'submit:SOP*'):
        raise HTTPException(status_code=403, detail="Insufficient permissions")
    # Process submission
    ...
```
## Frontend Authentication
### Token Storage
**Development:**
```typescript
// Store in localStorage for persistence across tabs
localStorage.setItem('syndi_access_token', accessToken);
localStorage.setItem('syndi_id_token', idToken);
localStorage.setItem('syndi_refresh_token', refreshToken);
```
**Production:**
```typescript
// Consider sessionStorage: it is scoped to one tab and cleared when the tab closes,
// although scripts running on the page can still read it
sessionStorage.setItem('syndi_access_token', accessToken);
```
### Auth Context
Frontend provides authentication context:
```typescript
// frontend/src/shared/lib/auth.tsx
const AuthContext = React.createContext({
  user: null,
  isAuthenticated: false,
  login: async (username: string, password: string) => { /* ... */ },
  logout: () => { /* ... */ },
  refreshToken: async () => { /* ... */ }
});
```
### Protected Routes
```typescript
function ProtectedRoute({ children, requiredPermission }) {
  const { user, isAuthenticated } = useAuth();
  if (!isAuthenticated) {
    return null; // e.g. redirect to the login page
  }
  if (requiredPermission && !hasPermission(user, requiredPermission)) {
    return null; // e.g. render an "access denied" view
  }
  return children;
}
```
## Security Considerations
### Token Security
1. **HTTPS Only** - Never send tokens over HTTP
2. **Secure Storage** - Use httpOnly cookies in production (TBD)
3. **Token Expiration** - Access and ID tokens expire after 1 hour; refresh tokens after 30 days
4. **Refresh Tokens** - Stored securely, used to get new tokens
5. **Logout** - Clear all tokens from storage
### Cognito Security
1. **Password Policy** - Strong passwords enforced
2. **MFA Support** - Can enable multi-factor authentication
3. **Account Recovery** - Email-based password reset
4. **Audit Logging** - CloudTrail logs all Cognito operations
5. **User Pool Isolation** - Each org has separate pool
### API Security
1. **JWT Validation** - Every request validated
2. **Permission Checks** - Endpoint-level authorization
3. **CORS** - Configured per organization
4. **Rate Limiting** - API Gateway throttling
5. **Encryption in Transit** - HTTPS required
## Advantages of Cognito RBAC
1. **Centralized Management** - Single identity provider
2. **Scalability** - Handles thousands of users
3. **MFA Support** - Built-in multi-factor authentication
4. **Federated Identity** - Can integrate with corporate SSO
5. **Audit Trails** - CloudTrail logging for compliance
6. **Compliance** - SOC, PCI DSS, HIPAA eligible
7. **No Infrastructure** - Fully managed service
8. **Token Standards** - Industry-standard JWT/OAuth
## Testing Authentication
See [Testing Authentication](../authentication/testing-auth.md) for complete testing guide.
**Quick test:**
```bash
# Test locally
make test-jwt-local ENV=stage ORG=myorg
# Test on AWS
make test-jwt-aws ENV=stage ORG=myorg
```
## Related Documentation
- [Authentication Provider Pattern](auth-provider-architecture.md) - Technical deep-dive into provider abstraction
- [RBAC System](../authentication/rbac.md) - Detailed RBAC documentation
- [User Management](../authentication/user-management.md) - Managing users
- [Testing Authentication](../authentication/testing-auth.md) - Auth testing
- [Configuration System](configuration-system.md) - Cognito configuration