Secondary Use FHIR Server Implementation Guide
0.1.0 - ci-build

Synthetic Dataset

Dataset Types

The server generates the following named synthetic datasets:

Dataset  | Type                         | Default size | Notes
hospital | hospital-focused IBD fixture | 240 patients | richer procedure and hospital-care pattern
gp       | huisarts / GP view           | 270 patients | primary-care view with overlap against hospital
default  | alias                        | 240 patients | resolves to hospital

Use --patient-count=<n> to override the configured size for a named dataset.
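
The alias and size rules above can be sketched as a small lookup. This is an illustration only: the function and constant names are hypothetical and not taken from the server's code. Only the dataset names, default sizes, and the default-to-hospital alias come from the table.

```python
# Hypothetical names; the sizes and the alias mirror the table above.
DEFAULT_SIZES = {"hospital": 240, "gp": 270}
ALIASES = {"default": "hospital"}


def resolve_dataset(name, patient_count=None):
    """Return (canonical_name, size) for a requested dataset name."""
    canonical = ALIASES.get(name, name)
    if canonical not in DEFAULT_SIZES:
        raise ValueError(f"unknown dataset: {name!r}")
    if patient_count is None:
        # No override given: fall back to the configured default size.
        return canonical, DEFAULT_SIZES[canonical]
    if patient_count < 1:
        raise ValueError("patient count must be positive")
    return canonical, patient_count
```

Under these assumptions, resolve_dataset("default") yields the hospital dataset at its default size, and passing a patient count plays the role of --patient-count=<n>.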

Persistence

Generated data is persisted below the configured data directory:

data/generated/instance-seed.txt
data/generated/hospital.json
data/generated/gp.json
data/generated/hospital-1000-patients.json

The seed file is created once per data directory. Within one directory, hospital and GP datasets share the seed so their synthetic BSN overlap is deterministic.

A new data directory gets its own seed file and therefore yields a different synthetic site.
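
A minimal sketch of the create-once seed behavior and the file layout follows. The helper names, the seed format (a random hex string), and the rule that a patient-count override adds a "-<n>-patients" suffix are assumptions inferred from the example paths above, not confirmed server behavior.

```python
import secrets
from pathlib import Path


def load_or_create_seed(data_dir):
    """Return the instance seed, creating the seed file on first use."""
    generated = Path(data_dir) / "generated"
    generated.mkdir(parents=True, exist_ok=True)
    seed_file = generated / "instance-seed.txt"
    if not seed_file.exists():
        # Assumed seed format: 16 random bytes written as hex.
        seed_file.write_text(secrets.token_hex(16), encoding="utf-8")
    return seed_file.read_text(encoding="utf-8").strip()


def dataset_path(data_dir, dataset, patient_count=None):
    """Build the output path; the size suffix is an assumed naming rule."""
    stem = dataset if patient_count is None else f"{dataset}-{patient_count}-patients"
    return Path(data_dir) / "generated" / f"{stem}.json"
```

Because the seed is read back from disk on every later call, hospital and GP generation within the same directory see the same value.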

Scenario Model

Every generated patient is based on a deterministic patient scenario.

A scenario includes:

  • IBD type
  • disease severity
  • flare state
  • comorbidity burden
  • medication plan
  • care setting
  • procedure pattern

The generator then emits a patient-centered record set. The data is fake but deliberately less uniform than a hand-written fixture.
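
One way to make a scenario deterministic is to derive a per-person random generator from the instance seed and a person index. The sketch below assumes exactly that. The Scenario fields follow the list above, but the class, the function, the value sets, and the probabilities are all illustrative and not the generator's real ones.

```python
import hashlib
import random
from dataclasses import dataclass

# Illustrative value sets (assumptions), loosely based on the code tables in this guide.
IBD_TYPES = ("Crohn disease", "Ulcerative colitis", "Inflammatory bowel disease unclassified")
SEVERITIES = ("mild", "moderate", "severe")
MEDICATIONS = ("Mesalazine", "Azathioprine", "Infliximab", "Adalimumab", "Ustekinumab")


@dataclass(frozen=True)
class Scenario:
    ibd_type: str
    severity: str
    in_flare: bool
    comorbidity_count: int
    medication: str
    care_setting: str
    procedure_count: int


def scenario_for(seed, person_index, care_setting):
    """Derive one person's scenario from the seed and index alone."""
    # Hashing seed and index gives each person an independent RNG, so the
    # result does not depend on generation order or on the dataset size.
    digest = hashlib.sha256(f"{seed}:{person_index}".encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return Scenario(
        ibd_type=rng.choice(IBD_TYPES),
        severity=rng.choice(SEVERITIES),
        in_flare=rng.random() < 0.25,  # assumed flare probability
        comorbidity_count=rng.randint(0, 3),
        medication=rng.choice(MEDICATIONS),
        care_setting=care_setting,
        # Hospital patients always get procedures; GP patients only sometimes.
        procedure_count=rng.randint(1, 3) if care_setting == "hospital" else rng.randint(0, 1),
    )
```

Calling scenario_for twice with the same seed, index, and care setting returns equal scenarios, which is the property the generator relies on.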

Resource Set

The generated record set includes:

  • Patient with fake BSN, name, gender, birth date, telecom, address, GP, and managing organization
  • one IBD Condition
  • optional related and unrelated Condition records
  • Procedure records for hospital patients and a subset of GP patients
  • functional status Observation
  • fecal calprotectin Observation
  • laboratory Observation records for CRP, hemoglobin, ferritin, and vitamin D
  • AllergyIntolerance for a subset of patients
  • MedicationRequest as medication agreement
  • MedicationStatement as medication use
  • one dataset-level Provenance record carrying export metadata

Main Clinical Codes

Diagnosis examples:

Code      | Display
34000006  | Crohn disease
64766004  | Ulcerative colitis
235744008 | Inflammatory bowel disease unclassified

Medication examples:

Code      | Display
387506007 | Mesalazine
116602009 | Azathioprine
386872004 | Infliximab
417982006 | Adalimumab
372826007 | Ustekinumab

Procedure examples:

Code      | Display
73761001  | Colonoscopy
274025005 | Ileocolonoscopy
86174004  | Biopsy of colon

The fixture uses SNOMED CT for these clinical concepts. The fake BSN identifier system is urn:oid:2.16.840.1.113883.2.4.6.3.
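
To show how these systems appear in a resource, the sketch below builds a minimal Patient and IBD Condition as plain dictionaries. The function name, the resource ids, and the BSN value are placeholders. http://snomed.info/sct is the standard FHIR system URI for SNOMED CT. The actual generator emits many more elements than shown here.

```python
SNOMED_SYSTEM = "http://snomed.info/sct"
BSN_SYSTEM = "urn:oid:2.16.840.1.113883.2.4.6.3"


def patient_with_condition(patient_id, bsn, code, display):
    """Build a minimal Patient and a Condition that references it."""
    patient = {
        "resourceType": "Patient",
        "id": patient_id,
        "identifier": [{"system": BSN_SYSTEM, "value": bsn}],
    }
    condition = {
        "resourceType": "Condition",
        "id": f"{patient_id}-ibd",  # placeholder id scheme
        "subject": {"reference": f"Patient/{patient_id}"},
        "code": {
            "coding": [{"system": SNOMED_SYSTEM, "code": code, "display": display}]
        },
    }
    return patient, condition


# Placeholder values: the BSN below is an arbitrary fake string.
patient, condition = patient_with_condition(
    "example-1", "999999990", "64766004", "Ulcerative colitis"
)
```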

Data-Quality Issues

Data-quality issues are injected deterministically and at a low rate, so that adapter and linkage behavior is easier to test:

  • missing BSN
  • minor name typo
  • stale address
  • missing telecom
  • birth date off by one day
  • duplicate local patient row for the same person

These are practical fixture assumptions, not national statistics.
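
Low-rate but reproducible injection can be sketched by hashing the seed, the patient index, and an issue name into a threshold test. Everything named here is hypothetical, including the 3% rate, which is an assumed value because this guide does not state the real rates. Only three of the listed issues are shown.

```python
import hashlib
from datetime import date, timedelta

ISSUE_RATE = 0.03  # assumed rate; the real rates are not documented here


def has_issue(seed, patient_index, issue, rate=ISSUE_RATE):
    """Decide reproducibly whether one patient gets one issue."""
    digest = hashlib.sha256(f"{seed}:{patient_index}:{issue}".encode("utf-8")).digest()
    # Integer comparison avoids float rounding at the edges of the range.
    return int.from_bytes(digest[:8], "big") < int(rate * 2**64)


def apply_issues(seed, patient_index, patient, rate=ISSUE_RATE):
    """Return a copy of a Patient dict with the selected issues applied."""
    result = dict(patient)
    if has_issue(seed, patient_index, "missing-bsn", rate):
        result.pop("identifier", None)
    if has_issue(seed, patient_index, "missing-telecom", rate):
        result.pop("telecom", None)
    if "birthDate" in result and has_issue(seed, patient_index, "birthdate-off-by-one", rate):
        shifted = date.fromisoformat(result["birthDate"]) + timedelta(days=1)
        result["birthDate"] = shifted.isoformat()
    return result
```

Because the decision depends only on the seed, index, and issue name, regenerating a dataset in the same data directory reproduces the same defects on the same patients.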