Designing an environment
Walk through a real PK/PD study schema in mice.
Designing the schema for a study is the highest-leverage hour you'll spend in Dalea. A schema that captures the right entities and relationships will let you ask arbitrary analytical questions for the next decade. A schema that doesn't will leave you exporting CSVs and joining them in pandas forever.
This page walks through the schema for a realistic mouse PK/PD study end to end.
The study
A 24-mouse single-dose PK study of a small-molecule kinase inhibitor (test article DLA-7) in C57BL/6 females. Three dose groups (3, 10, 30 mg/kg PO) plus vehicle. Plasma collected at 15 min, 1 h, 4 h, 24 h. Analyte is parent compound by LC-MS/MS.
The schema
Four entity tables (test articles, study groups, animals, plasma samples) and one result table (PK results).
Step-by-step
- Create the environment
Workspace → Data → New environment
Name it "In-vivo PK", pick an icon, and add an audit reason like "Initial schema for kinase-inhibitor PK studies." Audit reasons are mandatory in regulated tiers and recommended everywhere.
- Add the test articles table
The most upstream entity. Columns:
  - article_id (text, primary key, generated by naming scheme TA-{N})
  - name (text)
  - modality (enum: small-molecule, mAb, ASO, peptide, mRNA…)
  - lot (text)
- Add the study groups table
Bridges test article to dose level.
  - group_id (text, scheme GRP-{N})
  - name (text, e.g. "Vehicle", "DLA-7 3 mg/kg", …)
  - dose_mg_per_kg (number)
  - route (enum: PO, IV, IP, SC)
  - test_article (reference → test articles)
- Add the animals table
The actual subjects.
  - animal_id (text, scheme ANM-{N:000} → ANM-001, ANM-002…)
  - sex (enum: M, F)
  - strain (enum: C57BL/6, BALB/c, NSG…)
  - baseline_weight_g (number, validation: 15–35)
  - study_group (reference → study groups)
Note the validation: a weight outside 15–35 g is flagged during entry, because for an adult mouse it is almost certainly a typo.
- Add the plasma samples table
One row per timepoint per animal.
  - sample_id (text, scheme SMP-{N:0000})
  - animal (reference → animals)
  - timepoint_h (number, allowed values: 0.25, 1, 4, 24)
  - collected_at (datetime)
- Add the PK results result table
The shape is different: a result table splits into dimensions and measurements.
  - Dimensions: animal (ref), timepoint_h (number)
  - Measurements: concentration_ug_ml (number), auc_0_24 (number), cmax (number), tmax (number)
Dimensions are what you'll group/filter by in queries. Measurements are the values you'll aggregate.
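To see how the five tables hang together, it can help to sketch them outside Dalea. The Python below is only an illustration, not Dalea's API: the class names, the weight_flagged helper, the lot value and the collection time are all hypothetical, and enums are reduced to plain strings. It mirrors the columns listed above, the 15–35 g weight check, and the allowed timepoint values.

```python
from dataclasses import dataclass
from datetime import datetime

ALLOWED_TIMEPOINTS_H = (0.25, 1, 4, 24)  # 15 min, 1 h, 4 h, 24 h
WEIGHT_RANGE_G = (15, 35)                # validation range on baseline_weight_g


@dataclass(frozen=True)
class TestArticle:
    article_id: str   # scheme TA-{N}
    name: str
    modality: str     # enum in the schema; a plain string in this sketch
    lot: str


@dataclass(frozen=True)
class StudyGroup:
    group_id: str     # scheme GRP-{N}
    name: str
    dose_mg_per_kg: float
    route: str        # PO, IV, IP, SC
    test_article: TestArticle   # reference -> test articles


@dataclass(frozen=True)
class Animal:
    animal_id: str    # scheme ANM-{N:000}
    sex: str
    strain: str
    baseline_weight_g: float
    study_group: StudyGroup     # reference -> study groups

    def weight_flagged(self) -> bool:
        """True when the weight falls outside the validation range."""
        low, high = WEIGHT_RANGE_G
        return not (low <= self.baseline_weight_g <= high)


@dataclass(frozen=True)
class PlasmaSample:
    sample_id: str    # scheme SMP-{N:0000}
    animal: Animal    # reference -> animals
    timepoint_h: float
    collected_at: datetime

    def __post_init__(self):
        # Assumption: a disallowed value is rejected outright in this sketch.
        if self.timepoint_h not in ALLOWED_TIMEPOINTS_H:
            raise ValueError(f"timepoint_h {self.timepoint_h} is not allowed")


@dataclass(frozen=True)
class PKResult:
    # Dimensions: what queries group and filter by.
    animal: Animal
    timepoint_h: float
    # Measurements: the values queries aggregate.
    concentration_ug_ml: float
    auc_0_24: float
    cmax: float
    tmax: float


# Placeholder records for illustration only.
article = TestArticle("TA-1", "DLA-7", "small-molecule", "LOT-0001")
group = StudyGroup("GRP-1", "DLA-7 3 mg/kg", 3.0, "PO", article)
mouse = Animal("ANM-001", "F", "C57BL/6", 21.4, group)
typo = Animal("ANM-002", "F", "C57BL/6", 214.0, group)
sample = PlasmaSample("SMP-0001", mouse, 0.25, datetime(2024, 1, 15, 9, 15))

print(mouse.weight_flagged(), typo.weight_flagged())
```

Following the references from a sample (sample.animal.study_group.test_article) reaches the test article without any joins, which is the property the reference columns are meant to provide.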
Why a result table — couldn't this be one big entity table?
It could. But result tables get two things for free:
- Batched recording. All four timepoints from one animal-day fit in one result batch with a single timestamp, operator and audit reason.
- Analytical query mode. You can ask "mean concentration at 4 h grouped by dose level" without writing SQL. The dimension/measurement split is what makes that possible.
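For intuition, the question "mean concentration at 4 h grouped by dose level" amounts to a filter on one dimension, a group-by on another, and an aggregate over a measurement. The sketch below does this in plain Python. It does not show how Dalea runs the query; the mean_by helper is hypothetical and the concentration values are made-up placeholders, not study data.

```python
from collections import defaultdict
from statistics import mean

# Placeholder rows: dimensions (dose level, timepoint) plus one measurement.
rows = [
    {"dose_mg_per_kg": 3, "timepoint_h": 4, "concentration_ug_ml": 0.5},
    {"dose_mg_per_kg": 3, "timepoint_h": 4, "concentration_ug_ml": 1.5},
    {"dose_mg_per_kg": 10, "timepoint_h": 4, "concentration_ug_ml": 2.0},
    {"dose_mg_per_kg": 10, "timepoint_h": 4, "concentration_ug_ml": 4.0},
    {"dose_mg_per_kg": 30, "timepoint_h": 4, "concentration_ug_ml": 8.0},
    {"dose_mg_per_kg": 30, "timepoint_h": 4, "concentration_ug_ml": 12.0},
    {"dose_mg_per_kg": 30, "timepoint_h": 1, "concentration_ug_ml": 20.0},
]


def mean_by(rows, group_key, measure, **filters):
    """Filter rows on dimension values, group by one dimension, average a measurement."""
    groups = defaultdict(list)
    for row in rows:
        if all(row[key] == value for key, value in filters.items()):
            groups[row[group_key]].append(row[measure])
    return {key: mean(values) for key, values in sorted(groups.items())}


result = mean_by(rows, "dose_mg_per_kg", "concentration_ug_ml", timepoint_h=4)
print(result)  # {3: 1.0, 10: 3.0, 30: 10.0}
```

The 1 h row is excluded by the filter, and each remaining group is averaged independently. Because the schema already declares which columns are dimensions and which are measurements, a query like this needs no hand-written join or SQL.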
Naming schemes recap
| Table | Scheme | Generates |
|---|---|---|
| Test articles | TA-{N} | TA-1, TA-2, … |
| Study groups | GRP-{N} | GRP-1, GRP-2, … |
| Animals | ANM-{N:000} | ANM-001, ANM-002, … |
| Samples | SMP-{N:0000} | SMP-0001, SMP-0002, … |
Schemes can be more elaborate ({YYYY}-{study}-{N}) when traceability matters more
than brevity. They're configurable per table; the counter resets per workspace by
default but can be made global.
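As a rough illustration of how the {N} placeholder in the table above expands, the sketch below reproduces the "Generates" column. It is an assumption about the behaviour, not Dalea's implementation: expand_scheme is a hypothetical helper, it treats the zeros after the colon as a minimum width, and it leaves other tokens such as {YYYY} or {study} untouched.

```python
import re

# Matches {N} or {N:000...}; the run of zeros sets the zero-padded width.
_COUNTER_TOKEN = re.compile(r"\{N(?::(0+))?\}")


def expand_scheme(scheme: str, n: int) -> str:
    """Replace the counter token in a naming scheme with the value n."""
    def substitute(match):
        zeros = match.group(1)
        width = len(zeros) if zeros else 0
        return str(n).zfill(width)

    return _COUNTER_TOKEN.sub(substitute, scheme)


print(expand_scheme("TA-{N}", 1))        # TA-1
print(expand_scheme("GRP-{N}", 2))       # GRP-2
print(expand_scheme("ANM-{N:000}", 2))   # ANM-002
print(expand_scheme("SMP-{N:0000}", 1))  # SMP-0001
```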