# Configuration System Architecture This document explains SYNDI's configuration architecture, including how configuration files are structured, merged, and deployed across different environments and organizations. ## Overview SYNDI uses a **three-tier configuration system** designed to separate infrastructure values from application behavior while supporting multiple organizations: 1. **CloudFormation Outputs** → Lambda Environment Variables (infrastructure values) 2. **Base JSON Configs** → Application behavior settings 3. **Org-specific Override Configs** → Organization customizations This architecture follows these principles: - **Infrastructure values** (Cognito IDs, bucket names, API endpoints) → CloudFormation outputs → Lambda environment variables - **Application behavior** (file limits, retry logic, UI preferences) → JSON config files - **Deployment parameters** (ENV, ORG, ENABLE_AUTH, CREATE_BUCKETS) → Makefile command line - **Zero redundancy** - Each value defined exactly once - **Pure functional deployment** - `f(ENV, ORG, parameters) → Infrastructure` ## Configuration File Structure ### Directory Layout ``` infra/.config/ # Central config management (NOT in git) ├── lambda/ │ ├── dev.json # Base Lambda config for dev │ ├── dev-{org}.json # Org-specific Lambda overrides for dev │ ├── test.json # Base Lambda config for test │ ├── stage.json # Base Lambda config for stage │ ├── stage-{org}.json # Org-specific Lambda overrides for stage │ ├── prod.json # Base Lambda config for prod │ └── prod-{org}.json # Org-specific Lambda overrides for prod ├── webapp/ │ ├── dev.json # Base webapp config for dev │ ├── dev-{org}.json # Org-specific webapp overrides for dev │ ├── test.json # Base webapp config for test │ ├── stage.json # Base webapp config for stage │ ├── stage-{org}.json # Org-specific webapp overrides for stage │ ├── prod.json # Base webapp config for prod │ └── prod-{org}.json # Org-specific webapp overrides for prod └── cloudformation/ # CloudFormation parameter files ├── dev.json ├── stage.json └── prod.json ``` ### What Goes Where #### ✅ JSON Config Files (Application Behavior) Store these in `infra/.config/lambda/{env}.json` or `infra/.config/webapp/{env}.json`: - File upload limits (`max_file_size_mb`) - Retry policies (`max_retries`, `backoff_multiplier`) - Email settings (`from_email`, `support_email`) - UI preferences (`theme`, `showStatus`) - Feature flags (`autosave.enabled`) - CORS allowed origins - Service account configurations #### ❌ NOT in JSON Config Files (Infrastructure Values) These come from CloudFormation outputs or environment variables: - S3 bucket names - Cognito User Pool IDs - Cognito Client IDs - API Gateway endpoints - Lambda function names - Any CloudFormation outputs ## Configuration Merge Process ### 1. Base + Org-specific Merge The `config-merger.py` script performs deep merging: ```bash # Merge process (automatic via Makefile) python infra/scripts/config-merger.py \ "infra/.config/lambda/stage.json" \ # Base config "infra/.config/lambda/stage-uga.json" \ # Org-specific overrides "backend/rawscribe/.config/config.json" # Output (merged) ``` **Deep Merge Strategy:** - Org-specific values **override** base values - Nested objects are merged recursively - Arrays are replaced (not merged) - Null values in org config remove base values **Example:** Base config (`stage.json`): ```json { "lambda": { "auth": { "provider": "cognito", "required": true }, "file_uploads": { "max_file_size_mb": 25 } } } ``` Org-specific config (`stage-uga.json`): ```json { "lambda": { "file_uploads": { "max_file_size_mb": 50 }, "email_settings": { "from_email": "noreply@uga.edu" } } } ``` Merged result: ```json { "lambda": { "auth": { "provider": "cognito", "required": true }, "file_uploads": { "max_file_size_mb": 50 }, "email_settings": { "from_email": "noreply@uga.edu" } } } ``` ### 2. CloudFormation Output Sync After deployment, `sync-configs-from-cloudformation.py` updates org-specific configs with infrastructure values: ```bash # Sync configs from deployed stack make sync-configs ENV=stage ORG=uga ``` This script: 1. Queries CloudFormation stack outputs 2. Extracts API endpoint, Cognito User Pool ID, and Client ID 3. Deep-merges these values into `infra/.config/webapp/stage-uga.json` 4. Preserves all existing custom fields 5. Updates only infrastructure-related values **What gets synced:** - `webapp.apiEndpoint` ← CloudFormation `ApiEndpoint` output - `webapp.auth.cognito.userPoolId` ← CloudFormation `CognitoUserPoolId` output - `webapp.auth.cognito.clientId` ← CloudFormation `CognitoClientId` output - `webapp.api.proxyTarget` ← CloudFormation `ApiEndpoint` output ### 3. Runtime Configuration Loading #### Lambda Backend The Lambda function receives configuration from two sources: **Environment Variables (from CloudFormation):** ```yaml Environment: Variables: ENV: stage ORG: uga CONFIG_S3_BUCKET: rawscribe-lambda-stage-uga-123456789 CONFIG_S3_KEY: config.json COGNITO_REGION: us-east-1 COGNITO_USER_POOL_ID: us-east-1_ABC123 COGNITO_CLIENT_ID: abc123def456 FORMS_BUCKET: rawscribe-forms-stage-uga-123456789 ELN_BUCKET: rawscribe-eln-stage-uga-123456789 DRAFTS_BUCKET: rawscribe-eln-drafts-stage-uga-123456789 ``` **Config File (uploaded to S3):** The merged JSON config is uploaded to S3 during deployment: ```bash # Automatic via rs-deploy-only aws s3 cp infra/.config/lambda/stage-uga.json \ s3://rawscribe-lambda-stage-uga-123456789/config.json ``` **Auth Provider System:** The backend uses pluggable authentication providers with environment-aware configuration: ```python # backend/rawscribe/utils/auth_providers/factory.py provider = AuthProviderFactory.create(config) # Returns CognitoProvider or JWTProvider # backend/rawscribe/utils/auth_providers/cognito_provider.py # Environment variables take precedence (CloudFormation truth) def get_user_pool_id(self) -> str: return os.environ.get('COGNITO_USER_POOL_ID') or \ self._cognito_config.get('userPoolId') ``` **Runtime Config Endpoint:** Clients can query `/api/config/runtime` to get actual deployed configuration: ```bash curl https://api.example.com/api/config/runtime ``` Returns: ```json { "auth": { "provider": "cognito", "config": { "userPoolId": "us-east-1_ABC123", "clientId": "abc123def456", "region": "us-east-1", "source": "environment" } } } ``` The `source` field indicates whether config came from environment variables (CloudFormation) or config file (baked into Lambda). #### Frontend Webapp Configuration is built into the static assets: ```bash # Build process merges configs python infra/scripts/config-merger.py \ "infra/.config/webapp/stage.json" \ "infra/.config/webapp/stage-uga.json" \ "frontend/public/config.json" # Frontend build includes config.json npm run build # config.json → frontend/dist/config.json ``` Frontend loads config at runtime: ```typescript // Fetch config based on environment const config = await fetch('/config.json').then(r => r.json()); ``` ## Configuration Workflow ### Initial Deployment (New Organization) ```bash # 1. Deploy infrastructure with bucket creation ENABLE_AUTH=true CREATE_BUCKETS=true \ ORG=neworg ENV=stage make rs-deploy # 2. Sync configs from CloudFormation outputs make sync-configs ENV=stage ORG=neworg # 3. Review and customize org-specific config vi infra/.config/webapp/stage-neworg.json vi infra/.config/lambda/stage-neworg.json # 4. Redeploy with updated configs ORG=neworg ENV=stage make rs-deploy-only ``` ### Updating Application Settings ```bash # 1. Edit base or org-specific config vi infra/.config/lambda/stage-uga.json # 2. Deploy configuration changes ORG=uga ENV=stage make rs-deploy-only ``` ### Updating Infrastructure When CloudFormation outputs change (e.g., new Cognito pool): ```bash # 1. Deploy infrastructure changes ORG=uga ENV=stage make rs-deploy # 2. Sync updated values to configs make sync-configs ENV=stage ORG=uga # 3. Verify changes git diff infra/.config/webapp/stage-uga.json ``` ## Configuration Examples ### Minimal Lambda Config (Base) `infra/.config/lambda/stage.json`: ```json { "lambda": { "auth": { "provider": "cognito", "required": true, "cognito": { "region": "us-east-1" } }, "file_uploads": { "max_file_size_mb": 25, "allowed_extensions": [".pdf", ".doc", ".txt", ".xls", ".xlsx"], "temp_storage_retention_days": 7 }, "retry": { "max_retries": 3, "backoff_multiplier": 2 }, "cors": { "allowedOrigins": [ "http://localhost:3000", "http://localhost:5173" ] } } } ``` ### Organization-Specific Lambda Override `infra/.config/lambda/stage-uga.json`: ```json { "lambda": { "email_settings": { "from_email": "noreply@uga.edu", "support_email": "support@uga.edu" }, "file_uploads": { "max_file_size_mb": 50 }, "cors": { "allowedOrigins": [ "https://syndi.uga.edu", "http://localhost:3000" ] } } } ``` ### Minimal Webapp Config (Base) `infra/.config/webapp/stage.json`: ```json { "webapp": { "apiEndpoint": "TO_BE_FILLED_BY_SYNC_CONFIGS", "auth": { "required": true, "provider": "cognito", "cognito": { "region": "us-east-1", "userPoolId": "TO_BE_FILLED_BY_SYNC_CONFIGS", "clientId": "TO_BE_FILLED_BY_SYNC_CONFIGS" }, "session": { "timeout": 3600000, "refreshBuffer": 300000 } } } } ``` ### Organization-Specific Webapp Override `infra/.config/webapp/stage-uga.json`: ```json { "webapp": { "branding": { "title": "SYNDI - University of Georgia", "org_name": "UGA Research Labs" }, "ui": { "theme": "light", "logo": "/assets/uga-logo.png" } } } ``` Note: Infrastructure values (apiEndpoint, userPoolId, clientId) are automatically filled by `sync-configs`. ## Configuration Precedence Configuration values are resolved in this order (highest to lowest priority): 1. **Lambda Environment Variables** (from CloudFormation) 2. **Org-specific JSON Config** (`{env}-{org}.json`) 3. **Base JSON Config** (`{env}.json`) 4. **Application Defaults** (hardcoded in code) Example resolution for `COGNITO_USER_POOL_ID`: ``` 1. Check Lambda env var: COGNITO_USER_POOL_ID 2. Check org config: lambda.auth.cognito.userPoolId (from stage-uga.json) 3. Check base config: lambda.auth.cognito.userPoolId (from stage.json) 4. Use default: None (authentication disabled) ``` ## Troubleshooting ### Config Merge Issues **Problem:** Org-specific config not taking effect **Solution:** ```bash # Verify merge is happening make config ENV=stage ORG=uga # Check merged output cat backend/rawscribe/.config/config.json | jq '.lambda.file_uploads' ``` ### CloudFormation Sync Issues **Problem:** `sync-configs` not updating values **Solution:** ```bash # Verify stack exists and has outputs aws cloudformation describe-stacks \ --stack-name rawscribe-stage-uga \ --query 'Stacks[0].Outputs' # Run sync with debug output python -u infra/scripts/sync-configs-from-cloudformation.py \ --env stage --org uga ``` ### Missing Configuration **Problem:** Lambda can't find config values **Solution:** ```bash # Check environment variables aws lambda get-function-configuration \ --function-name rawscribe-stage-uga-backend \ --query 'Environment.Variables' # Check S3 config upload aws s3 cp s3://rawscribe-lambda-stage-uga-123456789/config.json - | jq ``` ## Best Practices 1. **Use Base Configs for Shared Settings**: Put common settings in base configs (`stage.json`) 2. **Use Org Configs for Customizations**: Put organization-specific values in org configs (`stage-uga.json`) 3. **Never Hardcode Infrastructure Values**: Always use CloudFormation outputs or environment variables 4. **Run sync-configs After Deployment**: Always sync configs after deploying infrastructure changes 5. **Version Control Base Configs**: Base configs can be committed (no sensitive data) 6. **Keep Org Configs Private**: Org-specific configs may contain sensitive information 7. **Validate JSON**: Use `jq` to validate JSON before deployment 8. **Document Custom Fields**: Add comments (via commit messages) explaining custom config fields ## Related Documentation - [Deployment Guide](../deployment/makefile-deployment.md) - How to deploy with configs - [Config Examples](../configuration/config-examples.md) - More configuration examples - [Sync Configs Guide](../configuration/sync-configs.md) - Detailed sync-configs usage - [Makefile Reference](../reference/makefile-commands.md) - All config-related commands