AI Agent Context for CLAIRE MVP Implementation

Project Overview

CLAIRE is a TypeScript Electronic Lab Notebook (ELN) data collection system that renders schema-driven SOPs as tabbed forms with regulatory-compliant immutable storage. This document provides context for AI agents implementing the system.

Core Nomenclature (CRITICAL)

SOPTemplate: The generic schema that defines what SOPs are in general (meta-schema)
SOP: A specific Standard Operating Procedure schema that complies with the SOPTemplate
SOPMetadata: Additional data attached to an SOP (filename, author, etc.)
ELN: The actual filled-out data collected using an SOP

Flow

General data flow:

SOPTemplate (meta-schema) → SOP (specific procedure) → ELN (filled data)

In practice, the SOPTemplate serves dual purposes:

SAM (Authoring): SOPTemplateSchema → [Create New SOP] → sopTest1.yaml
CLAIRE (Validation): SOPTemplateSchema → [Validate SOP] → ✅ sopTest1.yaml conforms

Current Project State

Base Project: Working SAM application, a template-driven SOP authoring app
Goal: Extend the SAM app with CLAIRE client/server webapp MVP functionality to operationalize the authored SOPs, and deploy to AWS with user authentication (and, in most cases, authorization)
Implementation Status: Third refactoring of CLAIRE, this time starting fresh with modular prompts
DO NOT REGRESS FRONTEND FUNCTIONALITY: /frontend/src/sam/ works perfectly and depends on /frontend/src/shared/ and /frontend/.config/. Do not regress any of them.
Do not edit /frontend/build/ targets directly. These files are generated.
/Makefile for creating targets
/docs/source for markdowns

Directory Structure

<root>/
├── frontend/
│   ├── src/
│   │   ├── shared/           # Existing shared utilities
│   │   │   ├── schema/SOPTemplateSchema.yaml
│   │   │   │                 # Template for authoring SOPs (source)  
│   │   │   ├── lib/          # config-loader.ts, auth.tsx (basic versions)
│   │   │   ├── hooks/        # useSimpleAutosave.ts
│   │   │   ├── components/   # Reusable UI components
│   │   │   ├── types/        # Reusable types (config.ts)
│   │   │   └── views/        # Webapp-wide views (LoginPage.tsx)
│   │   ├── sam/              # SAM-specific code: SOP editor (adds no SOP schema template properties)
│   │   └── claire/           # CLAIRE-specific code: ELN data collector
│   ├── tests/                # unit and e2e tests
│   ├── build/
│   │   └── SOPTemplateSchema.ts  # Existing Zod schema for enforcing SOP template rules on new SOPs
│   └── public/
│       └── config.json       # env-specific target, copied from webapp/{env}.json
├── backend/
│   ├── rawscribe/            # Python FastAPI backend
│   │   ├── .config
│   │   │   └── config.json       # env-specific target, copied from lambda/{env}.json
│   │   ├── main.py               # FastAPI app with ELN API
│   │   ├── routes/           # API routes 
│   │   └── utils/            # Storage, auth, and RBAC utilities
│   └── tests/                # Backend tests (unit, integration)
├── infra/
│   ├── .config/              # Potentially sensitive configs (gitignored)
│   │   ├── stack/            # CloudFormation parameters
│   │   ├── webapp/           # Frontend service configs → webapp bucket
│   │   ├── lambda/           # Backend service configs → lambda bucket
│   │   ├── forms/            # Forms service configs → forms bucket
│   │   └── eln-drafts/       # Draft service configs → eln-drafts bucket
│   ├── example-.config/      # Example templates (committed, same structure)
│   ├── cloudformation/       # CloudFormation templates
│   └── scripts/              # Deployment scripts (deploy-configs.sh, etc.)
└── .local/s3/                # Local development S3 simulation (gitignored)
    ├── webapp/               # Local prod-simulated webapp bucket
    │   └── webapp/           # build targets created with `make deploy-frontend`
    │       ├── serve.py      # Copied from infra/, simulates CloudFront
    │       ├── assets/       # Copied from frontend/dist/
    │       ├── config.json   
    │       └── index.html
    ├── lambda/               # Local prod-simulated lambda artifacts bucket
    │   ├── function.zip      # Simulated lambda artifact, from `make deploy-backend`
    │   └── build_mock/       # For simulating lambda function (unzip'd, served with uvicorn)
    ├── forms/sops/           # Local SOP files for live dev
    ├── eln/                  # ELN bucket simulation
    │   └── submissions/      # Final ELN storage
    └── eln-drafts/           # Draft bucket simulation
        └── drafts/           # Draft storage

Existing Resources

Key Files to Reference

SOP Schema: frontend/build/SOPTemplateSchema.ts - Zod validation schema, built with make schemas
Sample SOP: .local/s3/forms/sops/sopTest1.yaml - Test data
Current Config: frontend/src/shared/lib/config-loader.ts
Auth Context: frontend/src/shared/lib/auth.tsx (supports 3-role RBAC: Admin, Researcher, Viewer)
Autosave Hook: frontend/src/shared/hooks/useAutosave.ts - Existing autosave logic

Implementation

Configuration System: Simple bucket-based deployment from infra/.config/ to service buckets
Tests: All tests passing; failing tests are skipped
Source:
- infra/example-.config/{service}/{env}.json - examples of configs
- infra/.config/{service}/{env}.json - sensitive configs (gitignored), can mirror examples, pre-prod
Staging:
- Hot reload servers for rapid development served from:
  - frontend (webapp): frontend/src/main.tsx
  - backend (lambda function): backend/rawscribe/main.py
- Locally staged servers for emulating production, served from:
  - webapp bucket: .local/s3/webapp/webapp/index.html
  - lambda function bucket: .local/s3/lambda/build_mock/server.py
Config Targets:
- webapp/frontend:
  - dev: frontend/public/config.json
  - locally staged: .local/s3/webapp/webapp/public/config.json
  - prod: s3://<webapp-bucket-name>/webapp/public/config.json
- lambda/backend:
  - dev: backend/rawscribe/.config/config.json
  - locally staged: .local/s3/lambda/function.zip -> .local/s3/lambda/build_mock/rawscribe/.config/config.json
  - prod: s3://<lambda-bucket-name>/rawscribe/.config/config.json
Deployment:
- Make rules:
  - make setup-local: RUN ONCE. Creates local configs and deploys to hot reload servers
  - make config ENV="dev" deploys to hot reload servers for dev
  - make start-dev ENV="{env}" starts the hot reload servers for dev
  - make mirror ENV="{env}" builds and deploys .local/s3 for prod emulation
  - make stop-all stops all local servers, dev and staged
- Retrieving config:
  - Frontend: fetch('/config.json') from webapp bucket (public config only)
  - Backend: load from config_loader.load_config() with graceful failure (no fallbacks)
  - Security: Public/private config split - sensitive data only in private backend configs
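
The frontend retrieval step could look roughly like the sketch below. It assumes the frontend also fails loudly rather than falling back to defaults, which the source states only for the backend. The names loadPublicConfig, parsePublicConfig, Fetcher, and MinimalResponse are hypothetical and are not taken from config-loader.ts; the fetcher is injected so the function can be exercised without a browser.

```typescript
// Hypothetical shape: the real public config keys are not listed in this document.
type PublicConfig = Record<string, unknown>;

// Minimal subset of the Fetch API Response that this sketch relies on.
interface MinimalResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}
type Fetcher = (url: string) => Promise<MinimalResponse>;

// Reject anything that is not a plain JSON object; no default values are substituted.
function parsePublicConfig(raw: unknown): PublicConfig {
  if (typeof raw !== "object" || raw === null || Array.isArray(raw)) {
    throw new Error("config.json must contain a JSON object");
  }
  return raw as PublicConfig;
}

// Fetch /config.json and fail with a descriptive error when it is missing or malformed.
async function loadPublicConfig(fetcher: Fetcher): Promise<PublicConfig> {
  const response = await fetcher("/config.json");
  if (!response.ok) {
    throw new Error(`config.json request failed with status ${response.status}`);
  }
  return parsePublicConfig(await response.json());
}
```

In a browser, the call would be loadPublicConfig((url) => fetch(url)), since the standard Response object satisfies MinimalResponse.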

Development Environment Setup

# run all tests
make test-all

# start hot reload servers
make start-dev

# stage locally to emulate production
make mirror

Testing Structure

Locally Staged Data and Apps: .local/s3/ - S3 simulation for live, manual testing
Test Fixtures: fixtures/s3-simulation/ - S3 simulation for Playwright testing (same structure)

Key Dependencies

Frontend: React, TypeScript, React Hook Form, Zod, RJSF, Lucide React
Backend: FastAPI, Pydantic, boto3, PyJWT
Development: Node.js 18+, Python 3.9+. We avoid Docker by not using LocalStack, which simplifies the dev tech stack, setup, and maintenance while minimizing compute requirements and vendor (AWS) dependence.

Important Context for AI Agents

Schema Independence

Critical: Never embed assumptions about form field names, schemas or schema patterns
Use: Only explicit schema declarations (format, type, ui_config, validation)
Avoid: Name pattern matching (e.g., checking if field contains ‘date’ or splitting id on ‘_’)
Strict Schemas: The SOPTemplateSchema.yaml uses additionalProperties: false to enforce a strict structure. The Zod schema generator translates this to .strict(), preventing any properties not explicitly defined in the schema. This is critical for data integrity.
@type: Provides context to SOP developers (SAM). Avoid it in SOP presentation (CLAIRE) to maintain schema independence.
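
As a minimal sketch of the difference between explicit declarations and name pattern matching, the hypothetical helper below picks an input widget from the declared type and format only. The FieldSchema shape, the Widget names, and both function names are assumptions made for illustration; they are not the project's actual types.

```typescript
// Hypothetical, simplified field shape; real SOP fields carry more properties.
interface FieldSchema {
  id: string;
  type?: string;
  format?: string;
}

type Widget = "date" | "number" | "text" | "file" | "none";

// Schema-driven: decide from explicit declarations only.
function widgetFor(field: FieldSchema): Widget {
  if (field.type === undefined) return "none"; // no type: not rendered as an input
  if (field.type === "file") return "file";
  if (field.type === "number" || field.type === "integer") return "number";
  if (field.type === "string" && field.format === "date") return "date";
  return "text";
}

// Antipattern, shown only for contrast: guessing the widget from the id.
function widgetForByName(field: FieldSchema): Widget {
  return field.id.toLowerCase().includes("date") ? "date" : "text";
}

// "update_note" contains the substring "date" but is a plain string field.
const updateNote: FieldSchema = { id: "update_note", type: "string" };
const collectedOn: FieldSchema = { id: "collected_on", type: "string", format: "date" };
```

The name-based version misclassifies updateNote as a date input, while the declaration-based version treats it as text and still recognizes collectedOn through its format.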

Schema Structure Assumptions that are Allowed:

Rendering:
- taskgroups: SOP schemas have a taskgroups array property, rendered as cards. Each array item is a hierarchy of schema objects:
  - children/parents (optional): if present, the property defines the schema object hierarchy.
  - Immediate children of taskgroups render as tabs
  - Descendants of immediate children of taskgroups render as nested cards, recursively:
    - [ cards ] (taskgroups) → [ tabs ] → [ nested cards ] (recursive)
- Schema object properties:
  - id for rendering, import, export identification;
  - ui_config, name, title, description (optional) for rendering
  - type, validation (optional) for rendering as RJSF inputs. Only render an input if the type property is present
  - ordinal (optional): an integer, (0,n), that dictates where to render an object relative to its siblings. Render the ordinal value itself unless the object is an item in the taskgroups array or an immediate child (tab). For example, a step in a protocol or a reagent to source/prepare should display its ordinal. For taskgroups items and their immediate children, the ordinal only sets the position of the card/tab: if the ordinal is ‘1’ for a card or tab, just make sure it shows up first.
  - annotation: rendered as a string input on any SOP object that has a ‘type’. It lets SOP executors add annotations to any schema object that an SOP author creates, giving the SOPs more flexibility. The SOP author should be able to disable annotations per object to enforce stricter SOPs for operational QC; that functionality is TBD.
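
The rendering rules above can be sketched as a small tree walk. This is an illustration under stated assumptions, not the CLAIRE renderer: the SchemaNode and LayoutEntry shapes and the layout and byOrdinal functions are hypothetical, and the sketch does not filter out configuration-only children.

```typescript
// Hypothetical, simplified node shape using only the allowed structural properties.
interface SchemaNode {
  id: string;
  type?: string;
  ordinal?: number;
  children?: SchemaNode[];
}

type Role = "card" | "tab" | "nested-card";

interface LayoutEntry {
  id: string;
  role: Role;
  depth: number;
  isInput: boolean; // true only when the type property is present
  showOrdinal: boolean; // ordinal value is displayed only below tab level
}

// Sort siblings by ordinal; nodes without one keep their relative order at the end.
function byOrdinal(nodes: SchemaNode[]): SchemaNode[] {
  const last = Number.MAX_SAFE_INTEGER;
  return [...nodes].sort((a, b) => (a.ordinal ?? last) - (b.ordinal ?? last));
}

// Depth 0 = taskgroups items (cards), depth 1 = tabs, deeper = nested cards.
function layout(taskgroups: SchemaNode[]): LayoutEntry[] {
  const out: LayoutEntry[] = [];
  const visit = (node: SchemaNode, depth: number): void => {
    const role: Role = depth === 0 ? "card" : depth === 1 ? "tab" : "nested-card";
    out.push({
      id: node.id,
      role,
      depth,
      isInput: node.type !== undefined,
      showOrdinal: depth >= 2 && node.ordinal !== undefined,
    });
    for (const child of byOrdinal(node.children ?? [])) visit(child, depth + 1);
  };
  for (const group of byOrdinal(taskgroups)) visit(group, 0);
  return out;
}

const exampleGroups: SchemaNode[] = [
  { id: "g2", ordinal: 2, children: [] },
  {
    id: "g1",
    ordinal: 1,
    children: [
      { id: "t2", ordinal: 2 },
      { id: "t1", ordinal: 1, children: [{ id: "s1", ordinal: 1, type: "string" }] },
    ],
  },
];
```

For exampleGroups, the walk visits g1, t1, s1, t2, g2 in that order; only s1 is flagged as an input and only s1 displays its ordinal.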
Schema Type Detection:
- SAM Usage: Use detectSchemaType() from frontend/src/shared/lib/schema-registry.ts to identify schema element types for rendering context in SOP authoring interface. This provides schema names to SOP builders for reference during development.
- CLAIRE Usage: Use detectSchemaType() for schema element identification in ELN components.
- Antipattern: Using detectSchemaType() for structural logic creates assumptions about schema structure that violate schema independence principles. Use schema-driven approaches through the schema registry and property analysis instead.
- JSON-LD Type Annotation: All schema objects include an @type property following JSON-LD standards for reliable type detection:
  - Runtime objects created via createDefaultObject() use a non-enumerable @type property to avoid validation issues
  - The schema generator automatically adds '@type': z.string().optional() to all object schemas with additionalProperties: false
  - This ensures compatibility with both JSON-LD conventions and Zod’s strict validation.
  - The @type is intended for rendering hints during SOP construction: DO NOT use it to create links between application logic and the SOP template data model.
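
To illustrate what a non-enumerable @type means in practice, the sketch below uses the standard Object.defineProperty call. The withHiddenType helper is a hypothetical stand-in for createDefaultObject(), whose real implementation is not shown in this document.

```typescript
// Hypothetical stand-in for createDefaultObject(); the real helper may differ.
function withHiddenType<T extends object>(obj: T, typeName: string): T {
  Object.defineProperty(obj, "@type", {
    value: typeName,
    enumerable: false, // skipped by Object.keys, object spread, and JSON.stringify
    writable: false,
    configurable: false,
  });
  return obj;
}

const hiddenField = withHiddenType({ id: "f1", name: "Project ID" }, "Field");

// Enumeration and serialization do not see the property...
const visibleKeys = Object.keys(hiddenField);
const serialized = JSON.stringify(hiddenField);

// ...but a direct lookup still returns it.
const hiddenType = (hiddenField as Record<string, unknown>)["@type"];
```

Because the property is absent from the enumerated keys and from the serialized JSON, code that iterates over own enumerable keys should not encounter it, while a direct lookup of "@type" still works for rendering hints.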
Filename Generation: Special schema objects, placed in children arrays, specify which fields are used in ELN filename generation.
- To identify these components:
  - Look for objects with a filename_component: true property
  - Confirm the object has no type property (do not render it)
- Properties: filename_component: true (marks as filename component) and order: number (specifies sequence)
- Example: A Field schema element with PatientId as its id has a child ELNFilenameComponent with filename_component: true, order: 2. The PatientId field’s value then becomes the 2nd component in the ELN filename: <prefix>-<component1>-<PatientID-value>-<component...>-<component-n>-<suffix>.json. Remember: do not hardcode ELNFilenameComponent or any other schema name; it breaks schema independence.
- Design Rationale: These filename_component objects are not rendered as form inputs; they are used to send configuration information to the backend
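
The filename rule above can be sketched as follows. This is illustrative only: the document says filename generation is handled by the backend, so the TypeScript here just demonstrates the property-based detection and ordering. FilenameNode, FilenamePart, collectFilenameParts, and buildElnFilename are hypothetical names, and the prefix and suffix values are placeholders.

```typescript
// Hypothetical, simplified node shape; detection uses properties, never schema names.
interface FilenameNode {
  id: string;
  type?: string;
  filename_component?: boolean;
  order?: number;
  children?: FilenameNode[];
}

interface FilenamePart {
  fieldId: string;
  order: number;
}

// Walk the tree; a child flagged filename_component (and without type) marks its parent.
function collectFilenameParts(nodes: FilenameNode[]): FilenamePart[] {
  const parts: FilenamePart[] = [];
  const visit = (node: FilenameNode): void => {
    for (const child of node.children ?? []) {
      if (child.filename_component === true && child.type === undefined) {
        parts.push({ fieldId: node.id, order: child.order ?? Number.MAX_SAFE_INTEGER });
      }
      visit(child);
    }
  };
  nodes.forEach(visit);
  return parts.sort((a, b) => a.order - b.order);
}

// Join as <prefix>-<component1>-...-<component-n>-<suffix>.json.
// A field with no entered value contributes an empty component in this sketch.
function buildElnFilename(
  parts: FilenamePart[],
  values: Record<string, string | undefined>,
  prefix: string,
  suffix: string
): string {
  const components = parts.map((p) => values[p.fieldId] ?? "");
  return [prefix, ...components, suffix].join("-") + ".json";
}

const exampleFields: FilenameNode[] = [
  { id: "patient_id", type: "string", children: [{ id: "c1", filename_component: true, order: 2 }] },
  { id: "project_id", type: "string", children: [{ id: "c2", filename_component: true, order: 1 }] },
];
```

With exampleFields and the values { project_id: "P42", patient_id: "PT7" }, calling buildElnFilename(collectFilenameParts(exampleFields), values, "eln", "v1") places the project value first and the patient value second, following the order properties rather than the position of the fields in the schema.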

Schema Design: Children vs Properties for Non-Renderable Objects

Design Principle

Configuration objects, which should not be rendered as user inputs, belong in the children arrays of schema objects rather than as direct properties.

Why This Matters

CLAIRE (user-facing forms) only renders schema elements with type properties as form inputs
SAM (designer tools) needs to detect and manage these configuration objects
Backend processes these configurations for filename generation, export logic, etc.

Example: Field Schema

Here, Field is a terminal schema object with configuration objects ELNFilenameComponent and ExportConfiguration. DO NOT HARDCODE ANY OF THESE OBJECT NAMES; it will break schema independence.

Field:
  type: string          # ✅ Rendered as input string by CLAIRE
  name: "Project ID"    # ✅ Used as label
  children:             # ✅ Configuration objects (no `type` property, not rendered)
    - $ref: '#/definitions/ELNFilenameComponent'  # For filename generation
    - $ref: '#/definitions/ExportConfiguration'    # For export settings

Benefits

CLAIRE: Only renders user inputs, ignores configuration children
SAM: Can detect configuration schemas via detectSchemaType() for designer tools
Backend: Can be sent data from configuration children to execute business logic
Clear separation: User data vs. configuration metadata

Alternative (Avoid)

Field:
  type: string
  export: {...}         # ❌ Would be rendered as form input
  eln_filename: {...}   # ❌ Would be rendered as form input

This design ensures configuration objects serve their intended purpose without interfering with user-facing form rendering.

Schema-Driven UI

ui_config Property: Schema elements can have a ui_config property that dictates their appearance and behavior (e.g., icons, card styles, collapsibility).
UIConfigProvider: The application uses a UIConfigProvider (frontend/src/shared/lib/ui-config-provider.tsx) to process these properties. This centralizes UI logic and ensures consistent rendering based on the schema.
Principle: Instead of hardcoding UI decisions in components, use the useUIConfig hook to interpret the schema’s ui_config and apply the correct styles and behaviors.
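
The principle can be sketched without React as a pure resolver that merges an element's ui_config over defaults. The real UIConfigProvider and useUIConfig hook are not shown in this document, so everything here is an assumption: the UIConfig keys (icon, cardStyle, collapsible), the default values, and the resolveUIConfig name are hypothetical.

```typescript
// Hypothetical ui_config keys, loosely based on the examples named in the text
// (icons, card styles, collapsibility).
interface UIConfig {
  icon?: string;
  cardStyle?: string;
  collapsible?: boolean;
}

interface ElementWithUI {
  id: string;
  ui_config?: UIConfig;
}

// Assumed defaults, applied when the schema does not specify a value.
const DEFAULT_UI: Required<UIConfig> = {
  icon: "file",
  cardStyle: "plain",
  collapsible: false,
};

// The schema's ui_config wins over defaults; components read the resolved result
// instead of hardcoding their own decisions.
function resolveUIConfig(element: ElementWithUI): Required<UIConfig> {
  return { ...DEFAULT_UI, ...element.ui_config };
}

const plainCard = resolveUIConfig({ id: "a" });
const foldingCard = resolveUIConfig({ id: "b", ui_config: { collapsible: true } });
```

A component would then branch on foldingCard.collapsible or plainCard.icon rather than on the element's id or name, which keeps appearance decisions in the schema.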

File Naming Conventions

Python: snake_case (config_loader.py, auth_provider.py)
TypeScript: kebab-case (config-loader.ts, auth-provider.ts)
React Components: PascalCase (ELNCreator.tsx, TabRenderer.tsx)

Test Structure

Frontend: tests/ directory at frontend root with unit/ for unit tests and e2e/ for end-to-end integration tests.
Backend: tests/ directory at backend root with unit/, integration/

Performance Targets

SOP Load Time: <2 seconds
Form Render Time: <1 second
Field Response Time: <100ms
Memory Usage: <100MB for form data
Draft Save Time: <500ms

Debug Panel Requirements

Debug Panel: Must render both SOP metadata AND ELN form data, including user-entered form data

Batch I/O

In addition to manually inputting & submitting data, users can input data via:

authorized RESTful commands
uploads
integrated, API-enabled instruments (TBD)
integrated, API-enabled sample registries (TBD)

Traditionally, data flows to the ELN only, but hooks can be developed to detect an ELN submission and trigger sample registry submissions and instrument programs based on the ELN contents.

Sample Registry Records

Sample registry integration into input fields is TBD

Instrument Readouts

Instrument integration into input fields is TBD

File Uploads

CLAIRE supports file attachments for ELN fields with type: "file". The system uses a two-stage process:

Stage 1: Temporary Upload (Draft)

Files uploaded via /api/v1/files/upload are stored in draft storage
Filename format: {user_id}-{field_id}-{file_upload_uuid}-{original_filename}.{ext}
Location: .local/s3/eln-drafts/drafts/{sop_id}/attachments/
Example: dev_user-field_123-a1b2c3d4-protocol.pdf

Stage 2: ELN Attachment (Final)

When ELN is submitted, files move from draft to final storage via /api/v1/files/attach-to-eln
Filename preserved: Same exact filename as draft (keeps file_upload_uuid for audit trail)
Location: .local/s3/eln/submissions/{sop_id}/attachments/
Operation: Simple copy/move with no renaming

Key Implementation Points

File IDs are generated by FilenameGenerator.generate_temp_file_id() (8-char, no dashes)
Both S3 and Local storage backends supported
Schema-driven configuration via file_config in SOP field definitions
Files are SOP-scoped for proper organization and access control
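
The two-stage naming scheme can be sketched as below. The real logic lives in the backend; this TypeScript is purely illustrative, and DraftUpload, draftAttachmentName, draftAttachmentPath, and finalAttachmentPath are hypothetical names. The upload UUID is passed in rather than generated, since the behavior of generate_temp_file_id() is not shown here.

```typescript
// Hypothetical input shape for one uploaded file.
interface DraftUpload {
  userId: string;
  fieldId: string;
  fileUploadUuid: string; // 8-char ID supplied by the caller
  originalFilename: string; // includes the extension, e.g. "protocol.pdf"
}

// {user_id}-{field_id}-{file_upload_uuid}-{original_filename}.{ext}
function draftAttachmentName(upload: DraftUpload): string {
  return `${upload.userId}-${upload.fieldId}-${upload.fileUploadUuid}-${upload.originalFilename}`;
}

// Stage 1 location, relative to a storage root such as ".local/s3".
function draftAttachmentPath(root: string, sopId: string, name: string): string {
  return `${root}/eln-drafts/drafts/${sopId}/attachments/${name}`;
}

// Stage 2 location: same filename, different bucket prefix (no renaming).
function finalAttachmentPath(root: string, sopId: string, name: string): string {
  return `${root}/eln/submissions/${sopId}/attachments/${name}`;
}

const exampleName = draftAttachmentName({
  userId: "dev_user",
  fieldId: "field_123",
  fileUploadUuid: "a1b2c3d4",
  originalFilename: "protocol.pdf",
});
```

Here exampleName reproduces the documented example, dev_user-field_123-a1b2c3d4-protocol.pdf, and the same string is passed unchanged to both path helpers so the upload UUID survives into final storage.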

How to Use This Context

Read this file first before starting any prompt
Reference existing files mentioned in the “Key Files to Reference” section
Follow the directory structure exactly as specified

Memory Preferences

Maintain schema independence - never embed field name assumptions
User prefers a simpler functional approach over OOP unless OOP is well justified
Auth, autosave, and schema logic go in lib/ not utils/
View components go in views/ not pages/
User prefers correct, concise, clear, and comprehensive responses
User reads through documents thoroughly until no more edits needed
Use frontend/src/shared/lib/logger instead of console.log

This context file should be referenced by AI agents before implementing any prompt to ensure consistency and awareness of existing resources.