Designing an environment

Walk through a real PK/PD study schema in mice.

Designing the schema for a study is the highest-leverage hour you'll spend in Dalea. A schema that captures the right entities and relationships will let you ask arbitrary analytical questions for the next decade. A schema that doesn't will leave you exporting CSVs and joining them by hand in pandas forever.

This page walks through the schema for a realistic mouse PK/PD study end to end.

The study

A 24-mouse single-dose PK study of a small-molecule kinase inhibitor (test article DLA-7) in C57BL/6 females. Three dose groups (3, 10, 30 mg/kg PO) plus vehicle. Plasma collected at 15 min, 1 h, 4 h, 24 h. Analyte is parent compound by LC-MS/MS.

The schema

Four entity tables and one result table:

Animals (entity table)
  • animal_id (PK)
  • sex (enum)
  • strain (enum)
  • baseline_weight_g (number)
  • study_group (→ groups)

Study groups (entity table)
  • group_id (PK)
  • name
  • dose_mg_per_kg (number)
  • route (enum)
  • test_article (→ articles)

Test articles (entity table)
  • article_id (PK)
  • name
  • modality (enum)
  • lot

Plasma samples (entity table)
  • sample_id (PK)
  • animal (→ animals)
  • timepoint_h (number)
  • collected_at (date)

PK results (result table)
  • Dimensions: animal (→ animals), timepoint_h (number)
  • Measurements: concentration_ug_ml (number), auc_0_24 (number)

Columns marked → are references to another table. The result table splits explicitly into dimensions (the axes you query by) and measurements (the numbers you record).

Step-by-step

  1. Create the environment
    Workspace → Data → New environment

    Name it In-vivo PK, pick an icon, and add an audit reason such as "Initial schema for kinase-inhibitor PK studies." Audit reasons are mandatory in regulated tiers and recommended everywhere.

  2. Add the test articles table

    The most upstream entity. Columns:

    • article_id (text, primary key, generated by naming scheme TA-{N})
    • name (text)
    • modality (enum: small-molecule, mAb, ASO, peptide, mRNA…)
    • lot (text)
  3. Add the study groups table

    Bridges test article to dose level.

    • group_id (text, scheme GRP-{N})
    • name (text — "Vehicle", "DLA-7 3 mg/kg", …)
    • dose_mg_per_kg (number)
    • route (enum: PO, IV, IP, SC)
    • test_article (reference → test articles)
  4. Add the animals table

    The actual subjects.

    • animal_id (text, scheme ANM-{N:000} → ANM-001, ANM-002…)
    • sex (enum: M, F)
    • strain (enum: C57BL/6, BALB/c, NSG…)
    • baseline_weight_g (number, validation: 15–35)
    • study_group (reference → study groups)

    Note the validation: a weight outside 15–35 g is flagged during entry, because for an adult mouse such a value is almost certainly a typo.

  5. Add the plasma samples table

    One row per timepoint per animal.

    • sample_id (text, scheme SMP-{N:0000})
    • animal (reference → animals)
    • timepoint_h (number, allowed values: 0.25, 1, 4, 24)
    • collected_at (datetime)
  6. Add the PK results result table

    The shape is different: a result table splits into dimensions and measurements.

    • Dimensions: animal (ref), timepoint_h (number)
    • Measurements: concentration_ug_ml (number), auc_0_24 (number), cmax (number), tmax (number)

    Dimensions are what you'll group/filter by in queries. Measurements are the values you'll aggregate.
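To make the shape of the five tables concrete, here is a minimal sketch of the same schema as plain Python dataclasses. It is not Dalea's API: the class names, the `weight_is_flagged` and `timepoint_is_allowed` helpers, and the idea of modelling references as nested objects are all assumptions made for illustration. Only the column names, the 15–35 g range, and the allowed timepoints come from the steps above.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Constraints taken from steps 4 and 5 above.
WEIGHT_RANGE_G = (15, 35)
ALLOWED_TIMEPOINTS_H = (0.25, 1, 4, 24)


@dataclass(frozen=True)
class Article:  # "Test articles" entity table
    article_id: str
    name: str
    modality: str
    lot: str


@dataclass(frozen=True)
class StudyGroup:  # "Study groups" entity table
    group_id: str
    name: str
    dose_mg_per_kg: float
    route: str
    test_article: Article  # reference -> test articles


@dataclass(frozen=True)
class Animal:  # "Animals" entity table
    animal_id: str
    sex: str
    strain: str
    baseline_weight_g: float
    study_group: StudyGroup  # reference -> study groups


@dataclass(frozen=True)
class PlasmaSample:  # "Plasma samples" entity table
    sample_id: str
    animal: Animal  # reference -> animals
    timepoint_h: float
    collected_at: datetime


@dataclass(frozen=True)
class PKResult:  # "PK results" result table
    # Dimensions: the axes you group and filter by.
    animal: Animal
    timepoint_h: float
    # Measurements: the values you aggregate.
    concentration_ug_ml: float
    auc_0_24: Optional[float] = None
    cmax: Optional[float] = None
    tmax: Optional[float] = None


def weight_is_flagged(weight_g: float) -> bool:
    """Return True when a baseline weight falls outside the 15-35 g range."""
    low, high = WEIGHT_RANGE_G
    return not (low <= weight_g <= high)


def timepoint_is_allowed(timepoint_h: float) -> bool:
    """Return True when the timepoint is one of the study's sampling times."""
    return timepoint_h in ALLOWED_TIMEPOINTS_H
```

Following references is then ordinary attribute access, for example `animal.study_group.test_article.name`, which mirrors the chain animals → study groups → test articles in the schema.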

Why a result table — couldn't this be one big entity table?

It could. But result tables get two things for free:

  • Batched recording. All four timepoints from one animal-day fit in one result batch with a single timestamp, operator and audit reason.
  • Analytical query mode. You can ask "mean concentration at 4 h grouped by dose level" without writing SQL. The dimension/measurement split is what makes that possible.
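For readers who want to see what that example query computes, the sketch below spells it out in plain Python. It does not use Dalea's query mode: the `mean_concentration_by_dose` function and the row layout are hypothetical, the dose level is assumed to be already resolved through the animal → study group reference, and the concentrations are invented placeholder values rather than study data.

```python
from collections import defaultdict
from statistics import mean

# One dict per result row. animal and timepoint_h are the dimensions;
# concentration_ug_ml is the measurement. All values are placeholders.
rows = [
    {"animal": "ANM-001", "dose_mg_per_kg": 3, "timepoint_h": 4, "concentration_ug_ml": 0.5},
    {"animal": "ANM-002", "dose_mg_per_kg": 3, "timepoint_h": 4, "concentration_ug_ml": 1.5},
    {"animal": "ANM-003", "dose_mg_per_kg": 10, "timepoint_h": 4, "concentration_ug_ml": 2.0},
    {"animal": "ANM-004", "dose_mg_per_kg": 10, "timepoint_h": 4, "concentration_ug_ml": 4.0},
    {"animal": "ANM-001", "dose_mg_per_kg": 3, "timepoint_h": 1, "concentration_ug_ml": 8.0},
]


def mean_concentration_by_dose(rows, timepoint_h):
    """Filter on one dimension (timepoint), group by dose, average the measurement."""
    by_dose = defaultdict(list)
    for row in rows:
        if row["timepoint_h"] == timepoint_h:
            by_dose[row["dose_mg_per_kg"]].append(row["concentration_ug_ml"])
    return {dose: mean(values) for dose, values in sorted(by_dose.items())}


print(mean_concentration_by_dose(rows, timepoint_h=4))
```

The filter and the grouping key are both dimensions (or attributes reached through a dimension's reference), and the averaged value is a measurement. That is the split the result table makes explicit.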

Naming schemes recap

Table           Scheme          Generates
Test articles   TA-{N}          TA-1, TA-2, …
Study groups    GRP-{N}         GRP-1, GRP-2, …
Animals         ANM-{N:000}     ANM-001, ANM-002, …
Samples         SMP-{N:0000}    SMP-0001, SMP-0002, …

Schemes can be more elaborate (for example {YYYY}-{study}-{N}) when traceability matters more than brevity. They're configurable per table. By default each workspace keeps its own counter, but the counter can be made global.
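The padding behaviour of the counter token can be sketched in a few lines of Python. This is an illustrative reimplementation, not Dalea's own code: `expand_scheme` is a hypothetical helper, and it assumes that the zeros after the colon set the minimum zero-padded width, which is what the recap above shows. Other tokens such as {YYYY} or {study} are left untouched because their expansion rules are not described here.

```python
import re

# Matches {N} or {N:000...}; the run of zeros, if present, is captured.
_COUNTER_TOKEN = re.compile(r"\{N(?::(0+))?\}")


def expand_scheme(scheme: str, n: int) -> str:
    """Replace the counter token in a naming scheme with the number n."""

    def replace(match: re.Match) -> str:
        zeros = match.group(1)
        width = len(zeros) if zeros else 0
        # zfill pads on the left with zeros up to the requested width.
        return str(n).zfill(width)

    return _COUNTER_TOKEN.sub(replace, scheme)


for scheme in ("TA-{N}", "GRP-{N}", "ANM-{N:000}", "SMP-{N:0000}"):
    print(scheme, "->", expand_scheme(scheme, 1), expand_scheme(scheme, 2))
```

Under this assumption a counter that outgrows its padding simply gets longer, so `expand_scheme("ANM-{N:000}", 1000)` yields `ANM-1000`. Whether Dalea behaves the same way is not stated on this page.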

What's next