System Design: Medical Imaging Storage (DICOM)

System design for a cloud-native medical imaging storage and retrieval platform supporting DICOM, PACS integration, AI-powered diagnostics pipeline, and multi-petabyte image archives with HIPAA-compliant access controls.

Requirements

Functional Requirements:

  • Ingest, store, and retrieve medical images in DICOM format from imaging modalities (CT, MRI, X-ray, Ultrasound, PET)
  • Implement DICOMweb (WADO-RS, STOW-RS, QIDO-RS) and legacy DICOM C-STORE/C-FIND/C-MOVE protocols for interoperability
  • Provide a zero-footprint web-based DICOM viewer with windowing, measurement tools, and multi-planar reconstruction
  • Support AI inference pipeline for automated detection (nodule detection, fracture identification, mammography screening)
  • Lifecycle management: hot storage for recent studies, warm for 1-5 years, cold archive for regulatory retention (7+ years)
  • Study sharing via secure links for patient access and inter-facility referrals

Non-Functional Requirements:

  • Store 5 petabytes of existing imaging data, growing by roughly 3.65PB/year at the projected ingest rate (see Scale Estimation)
  • Ingest 50,000 DICOM studies/day with individual studies ranging from 10MB (X-ray) to 5GB (CT/MRI volumetric)
  • First-image display time under 2 seconds for studies in hot storage
  • HIPAA compliance with encryption at rest (AES-256) and in transit (TLS 1.3), full audit trail of image access
  • 99.95% availability — imaging unavailability delays diagnosis and treatment

Scale Estimation

  • Ingest volume: 50,000 studies/day at an average study size of 200MB (weighted average across modalities) = 10TB/day, or 3.65PB/year
  • Instance counts: each study averages 200 DICOM instances (individual images/slices) = 10M new instances/day, or 3.65B instances/year
  • Metadata: ~2KB of DICOM tags per instance (patient demographics, study description, modality, acquisition parameters) = 20GB/day
  • Retrieval: radiologists open 500,000 studies/day for reading, with each viewing session loading 50-200 instances progressively
  • Peak read load: 5,000 concurrent radiologist sessions during business hours, each streaming images at 50 Mbps = 250 Gbps aggregate read bandwidth
  • AI throughput: 20,000 studies/day (the subset routed for AI analysis) with an average inference time of 30 seconds per study
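
These figures can be sanity-checked with a few lines of arithmetic; the constants below are exactly the ones quoted in this section.

```python
# Back-of-envelope check of the scale figures above.
STUDIES_PER_DAY = 50_000
AVG_STUDY_MB = 200                      # weighted average across modalities
INSTANCES_PER_STUDY = 200
METADATA_PER_INSTANCE_KB = 2
CONCURRENT_SESSIONS = 5_000
SESSION_MBPS = 50

daily_ingest_tb = STUDIES_PER_DAY * AVG_STUDY_MB / 1_000_000   # MB -> TB
yearly_ingest_pb = daily_ingest_tb * 365 / 1_000               # TB -> PB
instances_per_day = STUDIES_PER_DAY * INSTANCES_PER_STUDY
metadata_gb_per_day = instances_per_day * METADATA_PER_INSTANCE_KB / 1_000_000
peak_read_gbps = CONCURRENT_SESSIONS * SESSION_MBPS / 1_000

print(f"Daily ingest:    {daily_ingest_tb:.0f} TB/day")       # 10 TB/day
print(f"Annual ingest:   {yearly_ingest_pb:.2f} PB/year")     # 3.65 PB/year
print(f"New instances:   {instances_per_day / 1e6:.0f}M/day") # 10M/day
print(f"Metadata volume: {metadata_gb_per_day:.0f} GB/day")   # 20 GB/day
print(f"Peak read bw:    {peak_read_gbps:.0f} Gbps")          # 250 Gbps
```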

High-Level Architecture

The medical imaging platform follows a three-tier architecture: Ingestion Tier, Storage Tier, and Access Tier, connected by an event-driven backbone (Kafka). The Ingestion Tier receives DICOM data from imaging modalities via a DICOM Gateway that speaks both legacy DICOM (C-STORE SCP on port 104) and DICOMweb (STOW-RS over HTTPS). The gateway validates DICOM conformance, de-identifies images for research workflows (stripping PHI from DICOM tags per HIPAA Safe Harbor), and publishes StudyReceived events to Kafka.

The Storage Tier implements a tiered architecture using object storage. Hot storage (S3 Standard or MinIO) holds studies from the last 30 days for fast retrieval. Warm storage (S3 Infrequent Access) holds 1-5 year old studies. Cold archive (S3 Glacier Deep Archive) stores studies beyond 5 years at minimal cost. A Lifecycle Manager moves studies between tiers based on last-access time and study age. DICOM metadata is indexed in PostgreSQL for structured queries (QIDO-RS) and Elasticsearch for full-text search across clinical descriptions. Pixel data is stored as DICOM Part 10 files in object storage with the SOP Instance UID as the object key.
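
As a concrete sketch of the age-based half of the Lifecycle Manager, the boto3 call below installs S3 lifecycle transitions matching the 30-day and 5-year boundaries. The bucket name and prefix are assumptions; note that native S3 lifecycle rules key only off object age, so last-access-based demotion would still need a custom job (or S3 Intelligent-Tiering).

```python
# Sketch: age-based tiering rules for the DICOM archive bucket.
# Bucket name and key prefix are illustrative assumptions.
import boto3

s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="dicom-archive",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "tier-studies-by-age",
                "Status": "Enabled",
                "Filter": {"Prefix": "studies/"},
                "Transitions": [
                    # Hot -> warm after 30 days.
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    # Warm -> cold archive after 5 years.
                    {"Days": 365 * 5, "StorageClass": "DEEP_ARCHIVE"},
                ],
            }
        ]
    },
)
```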

The Access Tier provides multiple interfaces: DICOMweb APIs for programmatic access, a zero-footprint HTML5 DICOM viewer (based on Cornerstone.js/OHIF) for web-based reading, and DICOM C-MOVE for legacy PACS integration. The AI Pipeline Tier runs as an asynchronous consumer: it subscribes to StudyReceived events, routes studies to appropriate AI models based on modality and body part, runs inference on GPU instances, and publishes results as DICOM Structured Reports (SR) and DICOM Secondary Capture images with annotations.

Core Components

DICOM Gateway & Ingestion Pipeline

The DICOM Gateway is the entry point for all imaging data. It runs dcm4chee-arc (open-source DICOM archive) as the protocol handler, supporting C-STORE SCP for modality push and STOW-RS for web-based uploads. Upon receiving a DICOM instance, the gateway: (1) validates DICOM conformance (checks required tags, transfer syntax support), (2) extracts metadata from DICOM headers and writes it to PostgreSQL and Elasticsearch, (3) optionally applies de-identification rules (replacing patient name and MRN with pseudonyms for research data), (4) stores the DICOM file to hot-tier object storage with server-side AES-256 encryption, (5) publishes an InstanceReceived event to Kafka. Study-level aggregation is handled by a Flink job that groups instances into studies using the Study Instance UID and publishes a StudyComplete event when no new instances arrive for a configurable timeout (typically 30 minutes for CT/MRI studies that arrive slice-by-slice).
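
A minimal sketch of that per-instance flow, assuming pydicom and boto3. The required-tag list, bucket name, and pseudonymization scheme are illustrative, and the real gateway (dcm4chee-arc) implements this natively rather than in application code.

```python
# Sketch of the gateway's per-instance handling (steps 1-5 above).
import hashlib

import boto3
from pydicom import dcmread

REQUIRED_TAGS = ["SOPInstanceUID", "SeriesInstanceUID", "StudyInstanceUID", "Modality"]
s3 = boto3.client("s3")

def ingest_instance(path: str, deidentify: bool = False) -> dict:
    ds = dcmread(path)

    # (1) Conformance check: required identifiers must be present.
    missing = [tag for tag in REQUIRED_TAGS if tag not in ds]
    if missing:
        raise ValueError(f"Non-conformant DICOM, missing tags: {missing}")

    # (2) Metadata extraction for PostgreSQL / Elasticsearch indexing.
    meta = {
        "sop_instance_uid": str(ds.SOPInstanceUID),
        "series_instance_uid": str(ds.SeriesInstanceUID),
        "study_instance_uid": str(ds.StudyInstanceUID),
        "modality": str(ds.Modality),
    }

    # (3) Optional Safe Harbor de-identification for research routing.
    if deidentify:
        ds.PatientName = "ANON"
        ds.PatientID = hashlib.sha256(str(ds.PatientID).encode()).hexdigest()[:16]

    # (4) Store the Part 10 file with server-side encryption,
    #     keyed by SOP Instance UID.
    with open(path, "rb") as f:
        s3.put_object(
            Bucket="dicom-hot-tier",
            Key=meta["sop_instance_uid"],
            Body=f.read(),
            ServerSideEncryption="AES256",
        )

    # (5) The InstanceReceived event body a Kafka producer would publish.
    return {"event": "InstanceReceived", **meta}
```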

Progressive Image Streaming & Viewer

The web-based DICOM viewer must display the first image within 2 seconds, even for large volumetric studies. This is achieved through a progressive loading strategy: the WADO-RS endpoint serves DICOM instances in JPEG 2000 or HTJ2K (High-Throughput JPEG 2000) transfer syntax, which supports progressive resolution decoding — the viewer displays a low-resolution preview in under 500ms while higher-resolution data continues streaming. For CT/MRI series with hundreds of slices, the viewer uses a predictive prefetch algorithm: when a radiologist scrolls through slices, the system predicts the scroll direction and prefetches the next 20 slices in that direction from the CDN. A WebAssembly-based DICOM decoder (Cornerstone.js with WASM codecs) runs in the browser, eliminating server-side rendering. Multi-planar reconstruction (MPR) for volumetric data is computed client-side using WebGL shaders for real-time sagittal, coronal, and oblique views.
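
The prefetch heuristic is simple enough to show directly. In the browser this logic would live in the viewer's JavaScript/WASM code, but the sketch below expresses the same algorithm in Python; the window size matches the 20-slice figure above.

```python
# Hypothetical direction-predicting prefetcher: given the two most recent
# slice indices, return the next slices to request in the scroll direction.
# A real viewer would issue async WADO-RS frame requests for these indices.
PREFETCH_WINDOW = 20

def slices_to_prefetch(prev_index: int, curr_index: int, num_slices: int) -> list[int]:
    direction = 1 if curr_index >= prev_index else -1
    window = range(
        curr_index + direction,
        curr_index + direction * (PREFETCH_WINDOW + 1),
        direction,
    )
    # Clip to the valid slice range of the series.
    return [i for i in window if 0 <= i < num_slices]

# Scrolling forward from slice 41 to 42 in a 300-slice series
# prefetches slices 43 through 62:
assert slices_to_prefetch(41, 42, 300) == list(range(43, 63))
```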

AI Inference Pipeline

The AI pipeline is a modular, event-driven system that routes studies to appropriate ML models based on DICOM metadata. A Router Service consumes StudyComplete events from Kafka and evaluates routing rules: chest X-rays (modality=CR, body part=CHEST) are sent to a lung nodule detection model; mammography studies (modality=MG) to a breast cancer screening model; head CTs to an intracranial hemorrhage detection model. Each model runs on GPU instances (NVIDIA A100) behind a model serving framework (NVIDIA Triton Inference Server) with dynamic batching for throughput. Model inputs are preprocessed by a Preprocessing Service that normalizes pixel values, resamples voxel spacing, and applies windowing. Inference results are written back to the PACS as DICOM Structured Reports (containing findings, confidence scores, and bounding box coordinates) and DICOM Secondary Capture images (with visual overlays). The results are presented to the radiologist inline in the viewer alongside the original study.
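
A sketch of the Router Service's rule evaluation, assuming StudyComplete events carry modality and body-part fields; the rule table and model names mirror the examples above but are otherwise illustrative.

```python
# Illustrative routing rules: (modality, body_part) -> model endpoint.
# A body_part of None matches any body part for that modality.
ROUTING_RULES = [
    (("CR", "CHEST"), "lung-nodule-detection"),
    (("MG", None), "breast-screening"),
    (("CT", "HEAD"), "ich-detection"),
]

def route_study(event: dict) -> list[str]:
    """Return the model endpoints a StudyComplete event should be sent to."""
    modality = event.get("modality")
    body_part = event.get("body_part")
    models = []
    for (rule_modality, rule_body_part), model in ROUTING_RULES:
        if rule_modality == modality and rule_body_part in (None, body_part):
            models.append(model)
    return models

# A chest X-ray routes to the lung nodule model:
assert route_study({"modality": "CR", "body_part": "CHEST"}) == ["lung-nodule-detection"]
```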

Database Design

The metadata store uses PostgreSQL with a schema closely aligned to the DICOM Information Model hierarchy. PHI fields are encrypted at the column level using pgcrypto, with keys managed by AWS KMS.

  • patients: patient_id (DICOM Patient ID), patient_name_encrypted, date_of_birth_encrypted, gender
  • studies: study_instance_uid (PK), patient_id (FK), study_date, study_description, accession_number, referring_physician, modalities_in_study, number_of_series, number_of_instances, study_size_bytes, storage_tier
  • series: series_instance_uid (PK), study_instance_uid (FK), modality, series_description, body_part, number_of_instances
  • instances: sop_instance_uid (PK), series_instance_uid (FK), instance_number, rows, columns, bits_allocated, transfer_syntax, object_storage_key, content_hash_sha256

Elasticsearch indexes a flattened view of study metadata for complex search queries: full-text search on study descriptions and clinical indications, filtered by date ranges, modalities, and referring physicians. AI findings land in a dedicated table: ai_results (result_id, study_instance_uid, model_id, model_version, findings JSONB, confidence_scores, structured_report_uid, processing_time_ms, created_at).
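
An illustrative Elasticsearch query against that flattened index; the index name and field names are assumptions about the mapping rather than a fixed schema.

```python
# Full-text search on study descriptions, filtered by modality and date range.
from elasticsearch import Elasticsearch

es = Elasticsearch("https://search.internal:9200")  # placeholder endpoint

resp = es.search(
    index="studies",
    query={
        "bool": {
            "must": [
                {"match": {"study_description": "pulmonary embolism"}},
            ],
            "filter": [
                {"term": {"modality": "CT"}},
                {"range": {"study_date": {"gte": "2024-01-01", "lte": "2024-12-31"}}},
            ],
        }
    },
    size=25,
)
for hit in resp["hits"]["hits"]:
    print(hit["_source"]["study_instance_uid"], hit["_score"])
```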

API Design

  • POST /dicomweb/studies (STOW-RS) — Store DICOM instances; accepts multipart/related with DICOM Part 10 payloads; returns stored instance references
  • GET /dicomweb/studies?PatientID={id}&StudyDate=20240101-20241231&ModalitiesInStudy=CT (QIDO-RS) — Query for studies matching DICOM attributes; returns JSON array of matching study metadata
  • GET /dicomweb/studies/{studyUID}/series/{seriesUID}/instances/{instanceUID}/frames/1 (WADO-RS) — Retrieve pixel data for a specific DICOM instance/frame; supports content negotiation for transfer syntax (JPEG 2000, HTJ2K)
  • POST /v1/ai/analyze — Submit a study for AI analysis; body contains study_instance_uid and requested models; returns job_id for async tracking
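
A client-side example of the QIDO-RS query endpoint above using requests; the host and bearer token are placeholders. Note that DICOM JSON keys results by tag number, so the Study Instance UID appears as 0020000D.

```python
# Query for this patient's 2024 CT studies via QIDO-RS.
import requests

resp = requests.get(
    "https://imaging.example.com/dicomweb/studies",  # placeholder host
    params={
        "PatientID": "PAT123",
        "StudyDate": "20240101-20241231",
        "ModalitiesInStudy": "CT",
    },
    headers={
        "Accept": "application/dicom+json",
        "Authorization": "Bearer <token>",  # placeholder credential
    },
    timeout=30,
)
resp.raise_for_status()
for study in resp.json():
    # Each match is a DICOM JSON object keyed by tag number.
    print(study["0020000D"]["Value"][0])  # Study Instance UID
```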

Scaling & Bottlenecks

The primary bottleneck is image retrieval bandwidth during peak radiology reading hours. With 5,000 concurrent viewing sessions at 50 Mbps each, the storage tier must sustain 250 Gbps of read throughput. This is addressed by a multi-layer caching strategy: (1) a CloudFront CDN caches recently accessed DICOM instances at edge locations with a 24-hour TTL — the cache hit rate is around 40%, since radiologists often re-open recent studies; (2) a Redis-based metadata cache stores study/series/instance hierarchies, eliminating PostgreSQL lookups for viewer navigation; (3) SSD-backed hot storage (S3 Express One Zone or local NVMe) for today's studies provides sub-10ms first-byte latency.
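
Layer 2 is a standard cache-aside pattern; a minimal sketch with redis-py follows, where the key layout and one-hour TTL are illustrative choices rather than a prescribed design.

```python
# Cache-aside lookup of a study's series/instance hierarchy.
import json

import redis

r = redis.Redis(host="metadata-cache.internal", port=6379)  # placeholder host
TTL_SECONDS = 3600

def get_study_hierarchy(study_uid: str, load_from_postgres) -> dict:
    key = f"study:{study_uid}:hierarchy"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)            # cache hit: skip PostgreSQL
    hierarchy = load_from_postgres(study_uid)  # cache miss: fall through
    r.setex(key, TTL_SECONDS, json.dumps(hierarchy))
    return hierarchy
```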

The ingestion pipeline must handle bursty traffic — imaging modalities often queue studies overnight and flush them during early morning. The Kafka-based architecture absorbs bursts: the DICOM Gateway writes to Kafka at wire speed, and downstream consumers process at their own pace. Object storage write throughput is virtually unlimited with S3's partitioned namespace. The AI pipeline is scaled by autoscaling GPU instances based on the Kafka consumer lag metric — if the queue depth exceeds 1,000 studies, additional GPU instances are provisioned within 5 minutes.
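
A hypothetical version of that scaling rule: with 30-second inference, one GPU worker clears roughly 10 studies per 5-minute provisioning interval, so the desired worker count follows directly from the consumer lag.

```python
# Hypothetical lag-based scaling rule for the AI pipeline's GPU workers.
LAG_THRESHOLD = 1_000          # queued studies before scaling out
STUDIES_PER_WORKER_5MIN = 10   # 30s inference => ~10 studies/worker/5min

def desired_gpu_workers(consumer_lag: int, current_workers: int,
                        min_workers: int = 2, max_workers: int = 50) -> int:
    if consumer_lag <= LAG_THRESHOLD:
        return max(min_workers, current_workers)  # backlog is acceptable
    # Workers needed to clear the excess lag within one 5-minute interval.
    excess = consumer_lag - LAG_THRESHOLD
    extra = -(-excess // STUDIES_PER_WORKER_5MIN)  # ceiling division
    return min(max_workers, current_workers + extra)

# 3,000 studies of lag with 4 workers scales out, capped at max_workers:
assert desired_gpu_workers(3_000, 4) == 50
```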

Key Trade-offs

  • Object storage over block storage for DICOM files: Object storage provides unlimited capacity and tiered lifecycle management at low cost, but has higher first-byte latency (50-100ms) compared to block storage (1-5ms) — mitigated by CDN caching and SSD-backed hot storage for recent studies
  • HTJ2K progressive streaming over uncompressed DICOM transfer: HTJ2K enables fast first-image display via progressive resolution but requires client-side WASM decoders and introduces a quality/latency trade-off — full diagnostic quality loads within 5 seconds even for large instances
  • Asynchronous AI pipeline over synchronous inference: Async processing via Kafka decouples AI latency from the radiology reading workflow and enables batch GPU utilization, but AI results may not be available when the radiologist first opens the study — mitigated by prioritizing stat/urgent studies and targeting 5-minute turnaround
  • DICOM Part 10 file storage over pixel-only storage: Storing complete DICOM objects preserves all metadata and provenance for regulatory compliance, but increases storage by 5-10% over stripped pixel data — essential for legal defensibility and interoperability
