Potency Assays as Stability-Indicating Methods under ICH Q5C: Validation Nuances and Reviewer-Ready Practices

Designing Potency Assays that Truly Indicate Stability under ICH Q5C: Validation Depth, Statistical Discipline, and Defensible Use in Shelf-Life Decisions

Regulatory Frame & Why This Matters

Within the biologics paradigm, ICH Q5C requires that the claimed shelf life and storage statements be supported by data demonstrating preservation of clinically relevant function and structure across the labeled period. In plain terms, the analytical suite must do two things at once: (i) provide orthogonal structural coverage for aggregation, fragmentation, charge and chemical modifications, and particles; and (ii) quantify biological activity with a potency assay that is sufficiently fit-for-purpose to detect stability-relevant loss. A potency method that is insensitive to common degradation routes is not stability-indicating; conversely, a hypersensitive but poorly reproducible assay can generate noise that obscures true product drift. Regulators in the US/UK/EU therefore scrutinize how sponsors justify that their chosen potency readout (cell-based bioassay, receptor/ligand binding, enzymatic activity, neutralization titer, or composite) maps to the product’s mode of action, behaves robustly in the final matrix, and retains discriminatory power after storage, shipping, reconstitution, or dilution. They also look for statistical discipline derived from ICH Q1A(R2)/Q1E (for time-trend modeling at labeled storage) and ICH Q2 (for method validation constructs), adapted to the idiosyncrasies of bioassays (relative potency, non-linear dose–response, parallelism).

Because potency is often expiry-governing for biologics, weaknesses here propagate directly to shelf-life claims, labeling (e.g., in-use hold times), comparability, and post-approval change control. This section frames the central decisions: selecting an assay architecture tied to mechanism; defining what makes it stability-indicating; validating around its biological and statistical realities; and using it correctly in expiry models where one-sided 95% confidence bounds on fitted means at the labeled condition govern shelf life, while prediction intervals stay reserved for OOT policing. The aim is a potency system that is not merely “validated” in the abstract but demonstrably capable of detecting the kinds of potency erosion likely to occur during storage, transport, and preparation, so that shelf-life conclusions are both scientifically true and readily verifiable by FDA/EMA/MHRA reviewers. Throughout, we align our language with how professionals search and cross-reference content in internal SOPs and dossiers (e.g., ICH Q5C, protein stability assay, pharmaceutical stability testing, drug stability testing, and real-time stability testing) to keep advice operational, not theoretical.

Study Design & Acceptance Logic

Design begins with a mode-of-action map that translates clinical mechanism into an assayable signal. If therapeutic effect depends on receptor activation/inhibition, a cell-based potency assay is first-line, with a binding surrogate only if correlation is demonstrated across stress states; if enzymatic replacement governs, a substrate-turnover method may be primary, with a cell-based readout as an orthogonal check. Having fixed the biological readout, articulate a potency governance hierarchy in the protocol: “Bioassay governs expiry; binding is supportive,” or, if justified, “Binding governs with bioassay corroboration,” and explain why. Acceptance logic must be explicit and level-specific: at each stability pull under labeled storage, compute relative potency with appropriate models (e.g., parallel-line or four-parameter logistic (4PL) fits), confirm assay validity (slope/shape similarity, parallelism tests), and trend the potency estimate over time. Shelf life is then governed by a one-sided 95% confidence bound on the fitted mean potency at the proposed dating period; if lots/presentations are pooled, declare and test time×batch/presentation interactions. Prediction intervals and OOT tests are reserved for signal policing, not dating. For multi-attribute products (e.g., mAbs engaging multiple effector functions), define whether a composite potency is used or whether the most mechanism-critical or most drift-sensitive assay governs; justify either choice with pharmacology. In multi-region programs, harmonize acceptance phrasing so that identical mathematics appear across sequences, minimizing divergent queries. Finally, bind potency acceptance to label-relevant claims: if in-use stability is proposed, declare that both potency and structure must remain within limits over the hold; if reconstitution is required, specify that drug product and reconstituted solution are separately governed. 
The design should show restraint (diagnostic accelerated legs, conservative governance when parallelism is marginal) and completeness (pre-declared triggers to increase sampling or split models when assumptions fail). Reviewers react favorably when acceptance is a chain of “if→then” statements they can verify from tables, rather than narrative optimism.
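
As an illustration of the expiry rule described above, the following is a minimal sketch under stated assumptions: a single lot, a straight-line trend, and a lower acceptance limit. The data, the 93% limit, and the function names (fit_line, lower_bound_on_mean, supported_shelf_life) are hypothetical and chosen only so that the bound crosses the limit inside the study window; the t-quantiles are standard one-sided 95% table values.

```python
import math

# One-sided 95% Student-t quantiles by degrees of freedom (standard table values).
T95_ONE_SIDED = {1: 6.314, 2: 2.920, 3: 2.353, 4: 2.132, 5: 2.015, 6: 1.943,
                 7: 1.895, 8: 1.860, 9: 1.833, 10: 1.812, 11: 1.796, 12: 1.782}

def fit_line(times, values):
    """Ordinary least-squares fit of values on times; returns fit summaries."""
    n = len(times)
    t_bar = sum(times) / n
    y_bar = sum(values) / n
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, values))
    slope = sxy / sxx
    intercept = y_bar - slope * t_bar
    sse = sum((y - (intercept + slope * t)) ** 2 for t, y in zip(times, values))
    df = n - 2
    return {"n": n, "t_bar": t_bar, "sxx": sxx, "slope": slope,
            "intercept": intercept, "s": math.sqrt(sse / df), "df": df}

def lower_bound_on_mean(fit, t):
    """One-sided 95% lower confidence bound on the fitted mean at time t."""
    mean = fit["intercept"] + fit["slope"] * t
    se = fit["s"] * math.sqrt(1 / fit["n"] + (t - fit["t_bar"]) ** 2 / fit["sxx"])
    return mean - T95_ONE_SIDED[fit["df"]] * se

def supported_shelf_life(fit, lower_limit, max_months):
    """Latest whole month at which the bound still meets the limit (None if never)."""
    supported = None
    for month in range(max_months + 1):
        if lower_bound_on_mean(fit, month) < lower_limit:
            break  # stop at the first failure: conservative by construction
        supported = month
    return supported

# Hypothetical single-lot data: months on stability vs relative potency (% of reference).
months = [0, 3, 6, 9, 12, 18, 24]
potency = [100.5, 99.0, 98.8, 97.1, 96.9, 94.8, 93.6]
fit = fit_line(months, potency)
shelf_life = supported_shelf_life(fit, lower_limit=93.0, max_months=24)
print(f"slope = {fit['slope']:.3f} %/month, supported dating = {shelf_life} months")
```

The sketch uses the confidence bound on the fitted mean, not a prediction interval, and it does not address pooling, extrapolation limits, or non-linear trends, all of which a real protocol would need to declare.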

Conditions, Chambers & Execution (ICH Zone-Aware)

Execution fidelity determines whether potency results are attributable to product behavior rather than laboratory choreography. At labeled storage (refrigerated or frozen), ensure chamber qualification (uniformity, recovery, excursion logging) and specify sample handling (orientation for syringes/cartridges to control interfacial exposure, inversion cadence for suspensions, controlled thaw for frozen presentations) because these factors can alter biological readouts independent of chemical change. Align climatic choices with the dossier’s regional scope: if long-term uses 5 °C for a narrow market or 2–8 °C for global reach, keep the potency modeling anchored there; use intermediate or accelerated only to illuminate mechanism or support excursion adjudication. For photolability risks, Q1B exposures should be performed on the marketed configuration, but interpret potency changes under light through mechanism (e.g., oxidation at functional residues) and keep expiry grounded in labeled storage unless validated assumptions are met. Execution SOPs should standardize critical pre-analytical variables that affect potency: thaw/refreeze prohibitions; hold times before assay; aliquoting tools/materials (adsorption to plastics can “lose” active); and shear/light exposure during sample prep. For reconstituted/diluted products, simulate clinical practice (diluent, IV bag, tubing) and control temperature and light during holds; then state in the protocol that in-use claims are governed by paired potency and structural metrics (e.g., SEC-HMW, particles). Record measured environmental parameters, not just setpoints, and cross-reference them in the potency dataset so any deviations are transparent. Finally, ensure sample placement and rotation in chambers preclude positional bias across pulls; reviewers often request proof that edge/corner loads did not experience different thermal histories.
By making chamber execution and sample handling auditable and reproducible, you de-risk the interpretation of potency trends and avoid common follow-ups that slow reviews.
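
One way to make measured chamber conditions auditable is to summarize logged readings into discrete excursions that can be cross-referenced against pull dates. The sketch below is illustrative only: the 2–8 °C window, the 15-minute logging interval, the readings, and the function name summarize_excursions are assumptions, not a prescribed format.

```python
from datetime import datetime, timedelta

def summarize_excursions(readings, low=2.0, high=8.0):
    """Group consecutive out-of-range readings into excursions.

    readings: list of (timestamp, measured_temp_c) tuples sorted by time.
    Returns a list of dicts with start, end, and the most extreme temperature.
    """
    midpoint = (low + high) / 2
    excursions = []
    current = None
    for stamp, temp in readings:
        if temp < low or temp > high:
            if current is None:
                current = {"start": stamp, "end": stamp, "extreme_c": temp}
            else:
                current["end"] = stamp
                # Keep the reading furthest from the middle of the allowed window.
                if abs(temp - midpoint) > abs(current["extreme_c"] - midpoint):
                    current["extreme_c"] = temp
        elif current is not None:
            excursions.append(current)  # excursion ended with an in-range reading
            current = None
    if current is not None:
        excursions.append(current)  # log ended while still out of range
    return excursions

# Hypothetical chamber log: measured values (not setpoints) every 15 minutes.
start = datetime(2024, 1, 1, 0, 0)
temps = [5.1, 5.3, 8.4, 9.2, 8.1, 6.0, 5.2, 1.8, 4.9]
log = [(start + timedelta(minutes=15 * i), t) for i, t in enumerate(temps)]

for exc in summarize_excursions(log):
    print(exc["start"], exc["end"], exc["extreme_c"])
```

Whether an excursion matters for a given pull is a scientific judgment; the sketch only ensures that the raw record is summarized consistently before that judgment is made.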

Analytics & Stability-Indicating Methods

To be stability-indicating, a potency assay must detect functionally relevant loss caused by the storage-relevant degradation pathways of the product. Establish this by challenging the method with orthogonally characterized stressed samples representing plausible mechanisms: thermal, oxidative, deamidation, clipping, interfacial agitation, freeze–thaw. Demonstrate that potency drops when structural analytics indicate mechanism-linked change (e.g., aggregation or site-specific oxidation at functional residues) and that potency remains stable when changes are cosmetic or non-functional. For a cell-based method, qualify sensitivity to changes in receptor density/affinity and downstream signaling; show that matrix components (excipients, surfactant) and device contacts (e.g., silicone oil) do not create assay artifacts. For binding surrogates, supply correlation to bioassay across mechanisms and stress severities; correlation at release is insufficient to claim stability-indicating behavior. Pre-establish and lock processing pipelines: fixed plate layout rules, control placement, curve-fitting model (usually 4PL with constrained asymptotes), weighting strategy, and validity criteria (AICC/BIC thresholds, residual diagnostics, Hill slope plausibility). Confirm linearity in the relative potency domain by dilutional linearity and bracketing of test samples with reference ranges. Define and verify robustness parameters: incubation times/temperatures, cell passage windows, detection reagent lots, instrument settings. For products with multiple mechanisms (e.g., ADCC/CDC in addition to binding), explain which mechanism governs clinical effect at the labeled dose and under what circumstances a secondary potency assay becomes threshold-governing. 
Finally, integrate potency with the rest of the stability panel in a way that reflects real decision-making: show how potency, SEC-HMW, particles, charge variants, and peptide mapping converge or diverge on the same samples; where they diverge, present a mechanistic rationale (e.g., slight acidic variant shift without potency impact). This alignment converts “validated assay” into “stability-indicating system” and is the heart of reviewer confidence.
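
To make the relative potency construct concrete, the sketch below shows one common parameterization of the 4PL curve and the reading of relative potency as a horizontal shift between parallel curves. It assumes the reference and test curves share asymptotes and Hill slope (the constrained, parallel case); the parameter values and function names are hypothetical, and no curve fitting is performed here.

```python
import math

def four_pl(x, lower, upper, ec50, hill):
    """Four-parameter logistic response at concentration x (x > 0)."""
    return lower + (upper - lower) / (1.0 + (ec50 / x) ** hill)

def relative_potency(ec50_ref, ec50_test):
    """Relative potency of test vs reference when the curves are parallel.

    A test sample that needs a higher concentration to reach the half-maximal
    response (larger EC50) is less potent, so the ratio falls below 1.
    """
    return ec50_ref / ec50_test

# Hypothetical fitted parameters with shared asymptotes and Hill slope.
LOWER, UPPER, HILL = 0.05, 2.20, 1.1
EC50_REF, EC50_TEST = 10.0, 12.5

rho = relative_potency(EC50_REF, EC50_TEST)
print(f"relative potency = {rho:.2f}")

# Under parallelism, the test curve is the reference curve shifted along the
# concentration axis: test(x) equals reference(rho * x) at every concentration.
for conc in (1.0, 10.0, 100.0):
    test_response = four_pl(conc, LOWER, UPPER, EC50_TEST, HILL)
    shifted_reference = four_pl(rho * conc, LOWER, UPPER, EC50_REF, HILL)
    assert math.isclose(test_response, shifted_reference, rel_tol=1e-9)
```

If the asymptotes or Hill slopes differ materially between reference and test, this horizontal-shift reading no longer holds, which is why parallelism is treated as a validity condition rather than a reportable result.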

Risk, Trending, OOT/OOS & Defensibility

Potency data are variable by nature; defensibility comes from pre-declared rules that separate signal from noise. Encode out-of-trend (OOT) policing using prediction intervals from your time-trend model at labeled storage or appropriate non-parametric trend tests; keep these constructs out of expiry computation. In every potency run, document validity gates before looking at sample outcomes: reference curve asymptotes and slope within historical ranges; goodness-of-fit metrics acceptably low; parallelism tests (for parallel-line or 4PL ratio models) passed. If a run fails, stop; do not “salvage” by post-hoc curve manipulation. Define how many independent runs are averaged for each time point and how outliers are handled (pre-declared robust estimators beat discretionary deletion). When a potency OOT occurs, investigate in layers: (1) analytical—confirm system suitability, curve performance, control recoveries, plate effects; (2) pre-analytical—sample thawing, handling, timing; (3) product—contemporaneous structure data (SEC-HMW, particles, charge variants) consistent with functional decline. If analytics and handling are clean but potency decline lacks structural corroboration, temporarily increase potency sampling density, assess method precision on the affected matrix, and consider tightening validity gates; if functional decline matches structural drift (e.g., site-specific oxidation), update expiry modeling and, if margins compress, shorten dating rather than over-interpreting noise. For OOS, follow classic confirmatory testing and root-cause analysis; if confirmed and mechanism-linked, compute expiry conservatively (earliest element governs when pooling is marginal). Document slope changes and decisions transparently; regulators reward plans that choose conservatism when ambiguity persists. 
Above all, keep model constructs distinct: one-sided 95% confidence bounds at labeled storage govern shelf life; prediction bands govern OOT policing; accelerated legs remain diagnostic unless validated; and earliest expiry governs when poolability is unproven. This separation—spelled out in captions and text—preempts many common deficiency letters.
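
The distinction between constructs can be illustrated with a sketch of OOT policing, which asks whether a new observation is plausible given the historical trend and therefore uses a prediction interval rather than a confidence bound on the mean. The data, the two-sided 95% level, and the function names below are assumptions for illustration; the t-quantiles are standard two-sided 95% table values.

```python
import math

# Two-sided 95% Student-t quantiles (0.975 point) by degrees of freedom.
T975 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
        6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}

def fit_trend(times, values):
    """Least-squares line through historical stability results."""
    n = len(times)
    t_bar = sum(times) / n
    y_bar = sum(values) / n
    sxx = sum((t - t_bar) ** 2 for t in times)
    slope = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, values)) / sxx
    intercept = y_bar - slope * t_bar
    sse = sum((y - (intercept + slope * t)) ** 2 for t, y in zip(times, values))
    return {"n": n, "t_bar": t_bar, "sxx": sxx, "slope": slope,
            "intercept": intercept, "s": math.sqrt(sse / (n - 2)), "df": n - 2}

def prediction_interval(fit, t):
    """Two-sided 95% prediction interval for a single new result at time t."""
    center = fit["intercept"] + fit["slope"] * t
    # The leading "1 +" term is what separates a prediction interval
    # from a confidence interval on the fitted mean.
    se = fit["s"] * math.sqrt(1 + 1 / fit["n"] + (t - fit["t_bar"]) ** 2 / fit["sxx"])
    half_width = T975[fit["df"]] * se
    return center - half_width, center + half_width

def is_oot(fit, t, observed):
    """Flag a new result that falls outside the prediction interval."""
    low, high = prediction_interval(fit, t)
    return not (low <= observed <= high)

# Hypothetical history through 18 months, then a new 24-month pull.
history_months = [0, 3, 6, 9, 12, 18]
history_potency = [100.5, 99.0, 98.8, 97.1, 96.9, 94.8]
trend = fit_trend(history_months, history_potency)

low, high = prediction_interval(trend, 24)
print(f"24-month prediction interval: {low:.2f} to {high:.2f}")
print("93.6 OOT?", is_oot(trend, 24, 93.6))
print("90.0 OOT?", is_oot(trend, 24, 90.0))
```

An OOT flag from a rule like this triggers the layered investigation described above; it does not by itself change the expiry computation.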

Packaging/CCIT & Label Impact (When Applicable)

Container-closure and presentation can influence potency readouts by altering exposure to interfaces, oxygen, light, or leachables. For prefilled syringes or cartridges, quantify silicone droplets and assess their impact on assay performance (adsorption of protein to plastics, interference with detection). If potency declines are observed in device presentations but not in vials under identical storage, explore mechanisms (interfacial denaturation, agitation during transport) and add appropriate orthogonal structure metrics (LO/FI particles, SEC-HMW) to attribute cause. For lyophilized products, ensure reconstitution protocols used in potency testing mirror clinical practice; variations in diluent, mixing force, and hold time can create transient potency artifacts unrelated to storage drift. Where photostability is relevant (clear devices or windows), perform marketed-configuration Q1B exposures; if light causes potency-relevant changes (e.g., tryptophan oxidation at functional epitopes), tie protection claims directly to potency and structural evidence and reflect the minimal effective protection in label text (“protect from light,” “keep in carton”). Container-closure integrity (CCI) should be demonstrated for the presentation at issue; if ingress (oxygen/humidity) could influence potency via oxidation or hydrolysis, present sensitivity data and link to observed trends. Label implications must be truth-minimal: do not add prohibitions or protections not supported by data, and do not omit those that are clearly warranted. In-use claims (post-reconstitution or dilution hold times) must be supported by paired potency and structural metrics over realistic conditions (light, temperature, IV sets), with acceptance criteria prespecified; reviewers will not accept potency-only claims if particles or aggregation increase beyond action bands. 
By explicitly connecting packaging science and CCI to potency outcomes and label wording, you convert potential sources of reviewer concern into precise, verifiable statements.

Operational Framework & Templates

High-maturity teams encode potency governance into procedural standards that read the same way across products. A robust protocol template should include: (1) mode-of-action mapping and potency governance hierarchy; (2) assay architecture (cell-based, binding, enzymatic) with justification; (3) validation plan tailored to bioassays (parallelism/linearity in the relative domain, dilutional linearity, intermediate precision, robustness windows, matrix applicability, stability-indicating challenges); (4) statistical plan for dose–response fitting (model family, weighting, validity checks) and for time-trend modeling at labeled storage (pooling criteria, one-sided 95% confidence bounds for expiry, prediction-interval OOT policing); (5) triggers for increased sampling, model splitting, or governance shifts when assumptions fail; (6) cross-references to structural analytics and how divergent signals are adjudicated; and (7) an evidence-to-label crosswalk. A matching report template should open with a decision synopsis (expiry, storage/in-use statements), followed by recomputable artifacts: Run Validity Table (curve parameters, goodness-of-fit, parallelism), Relative Potency Summary (per run, per time point, per lot), Expiry Computation Table (fitted mean at proposed dating, SE, one-sided t-quantile, bound vs limit), Pooling Diagnostics (time×batch/presentation interactions), and a Completeness Ledger (planned vs executed pulls; missed-pull dispositions). Figures must keep constructs separate: (a) confidence-bound expiry plots at labeled storage; (b) separate OOT policing plots with prediction bands; (c) mechanism panels that overlay potency with SEC-HMW/particles/charge variants. Keep conventional leaf titles in CTD (e.g., “Potency—bioassay method and validation,” “Potency—stability trends and expiry computation”) so assessors land on answers quickly. 
These templates make potency governance auditable and reduce inter-product variability, which reviewers notice and reward with shorter assessment cycles.
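
For the Pooling Diagnostics artifact, a time×batch interaction can be examined by comparing a model with a separate slope per lot against a model with one common slope and separate intercepts. The sketch below computes only the F statistic for that comparison; the lot data and function names are hypothetical, and the critical value is left to the pre-declared significance level (ICH Q1E describes a 0.25 level for poolability tests).

```python
def centered_sums(times, values):
    """Within-lot centered sums of squares and cross-products."""
    n = len(times)
    t_bar = sum(times) / n
    y_bar = sum(values) / n
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, values))
    syy = sum((y - y_bar) ** 2 for y in values)
    return sxx, sxy, syy

def slope_interaction_f(lots):
    """F statistic for equality of slopes across lots.

    lots: dict mapping lot id -> (times, values).
    Returns (f_statistic, df_numerator, df_denominator).
    """
    sums = [centered_sums(t, y) for t, y in lots.values()]
    k = len(sums)
    n_total = sum(len(t) for t, _ in lots.values())

    # Full model: each lot has its own slope and intercept.
    sse_full = sum(syy - sxy ** 2 / sxx for sxx, sxy, syy in sums)

    # Reduced model: common slope, separate intercepts.
    sxx_all = sum(s[0] for s in sums)
    sxy_all = sum(s[1] for s in sums)
    syy_all = sum(s[2] for s in sums)
    sse_reduced = syy_all - sxy_all ** 2 / sxx_all

    df_num = k - 1
    df_den = n_total - 2 * k
    f_stat = ((sse_reduced - sse_full) / df_num) / (sse_full / df_den)
    return f_stat, df_num, df_den

# Hypothetical three-lot data set (months, relative potency %).
pulls = [0, 6, 12, 18]
lots = {
    "A": (pulls, [100.0, 97.2, 93.8, 91.0]),
    "B": (pulls, [99.4, 96.9, 93.5, 90.1]),
    "C": (pulls, [100.8, 96.0, 91.9, 87.2]),
}
f_stat, df_num, df_den = slope_interaction_f(lots)
print(f"F = {f_stat:.2f} on ({df_num}, {df_den}) degrees of freedom")
```

If the statistic exceeds the critical value at the declared level, slopes are not pooled and, consistent with the governance described above, the earliest expiry among the separately modeled lots governs.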

Common Pitfalls, Reviewer Pushbacks & Model Answers

Patterns recur in deficiency letters.
(1) Surrogate overreach. Sponsors claim binding governs potency without proving stability-indicating behavior across stress states. Model answer: “Binding correlates to cell-based activity (R≥0.95) under thermal/oxidative/aggregation stress; potency is governed by bioassay; binding monitors fine changes during in-use; expiry is set from bioassay confidence bounds at labeled storage.”
(2) Construct confusion. Prediction intervals are used on expiry plots or accelerated legs are used to justify dating. Answer: “Expiry is determined from one-sided 95% confidence bounds at labeled storage; prediction intervals police OOT only; accelerated data are diagnostic unless validated.”
(3) Unstable curve fitting. Runs are accepted with poor asymptote/slope behavior, hidden via manual weighting or curation. Answer: “Run validity gates are pre-declared (asymptotes/slope ranges, residuals, AIC/BIC); failed runs are rejected and repeated; plate effects monitored.”
(4) Parallelism ignored. Relative potency is computed without demonstrating parallel slopes or acceptable Hill slopes between reference and test. Answer: “Parallelism/Hill-slope tests are executed each run; non-parallel runs are invalid; if persistent, model split and earliest expiry governs.”
(5) Matrix inapplicability. The assay is validated in the release matrix but not in the final presentation/dilution. Answer: “Matrix applicability (excipients, device contact) is demonstrated; silicone quantitation/FI provide attribution in syringe systems.”
(6) Narrative acceptance. Acceptance criteria are implicit or move during review. Answer: “Acceptance logic is pre-declared; expiry tables are recomputable; any governance shift is tied to triggers.”
(7) Over-reliance on a single mechanism. Only one functional pathway is assayed when clinical action is multi-mechanistic. Answer: “Primary mechanism governs; secondary function trended; governance shifts if secondary becomes limiting.”
Proactively building these answers into protocol and report language, using the reviewer’s vocabulary, preempts cycles of clarification and narrows discussion to genuine scientific uncertainties.
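
Pre-declared run validity gates of the kind named in the model answers can be encoded so that a run is accepted or rejected before any sample result is examined. The following is a minimal sketch: every limit, field name, and the RunSummary structure are hypothetical, and the Hill-slope ratio is used here only as a simplified screen, not as a formal parallelism test.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RunSummary:
    ref_lower: float   # fitted lower asymptote of the reference curve
    ref_upper: float   # fitted upper asymptote of the reference curve
    ref_hill: float    # Hill slope of the reference curve
    test_hill: float   # Hill slope of the test curve
    rmse: float        # root-mean-square residual of the fit

# Hypothetical gate limits; in practice these come from historical
# reference-curve performance and are fixed in the protocol.
RANGE_LIMITS = {
    "ref_lower": (0.02, 0.15),
    "ref_upper": (1.80, 2.60),
    "ref_hill": (0.80, 1.60),
    "hill_ratio": (0.80, 1.25),
}
RMSE_MAX = 0.08

def failed_gates(run):
    """Return the names of all gates the run fails (empty list = valid run)."""
    observed = {
        "ref_lower": run.ref_lower,
        "ref_upper": run.ref_upper,
        "ref_hill": run.ref_hill,
        "hill_ratio": run.test_hill / run.ref_hill,
    }
    failures = [name for name, value in observed.items()
                if not (RANGE_LIMITS[name][0] <= value <= RANGE_LIMITS[name][1])]
    if run.rmse > RMSE_MAX:
        failures.append("rmse")
    return failures

good_run = RunSummary(ref_lower=0.06, ref_upper=2.20, ref_hill=1.10,
                      test_hill=1.05, rmse=0.04)
bad_run = RunSummary(ref_lower=0.06, ref_upper=2.20, ref_hill=1.10,
                     test_hill=0.70, rmse=0.04)

print("good run failures:", failed_gates(good_run))
print("bad run failures:", failed_gates(bad_run))
```

Returning the full list of failed gates, rather than stopping at the first, keeps the Run Validity Table complete for every run, including rejected ones.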

Lifecycle, Post-Approval Changes & Multi-Region Alignment

Potency governance does not end at approval. As real-time data accrue, refresh expiry computations and pooling diagnostics, and lead with a “delta banner” (“+12-month data; bound margin +0.3% potency; expiry unchanged”). Tie change control to triggers that invalidate assumptions: changes in cell line or detection reagents; shifts in reference standard or control curve behavior; manufacturing or formulation modifications that alter matrix or presentation; device or packaging changes that influence interfacial exposure; and laboratory platform updates (reader, software) that can bias curve fits. For each trigger, run micro-studies sized to risk (e.g., cross-over validation with old/new cells/reagents; bridging of curve-fit software; potency stability check after siliconization change), and, if bias is detected, split models and let earliest bound govern until convergence is re-established. In global programs, harmonize scientific cores (tables, figure numbering, captions) across FDA/EMA/MHRA sequences; adapt only administrative wrappers. If regional norms differ (e.g., style of parallelism evidence), include the stricter artifact globally to avoid divergence. For post-approval extensions (new strengths, presentations), declare whether the existing potency governance carries over unchanged or whether a new assay/validation is required; where proportional formulations and common mechanisms allow, justify read-across explicitly. Finally, maintain an assay lifecycle file capturing cell history, reference standard timeline, drift in curve parameters, and control-chart limits; reviewers often ask for this during inspections and queries. The objective is simple: keep potency as a living, auditable truth that remains aligned with product, presentation, and platform realities, so that shelf-life claims, in-use statements, and label qualifiers continue to be conservative, correct, and quickly verifiable across regions.
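
Control-chart limits for the assay lifecycle file can be derived from historical reference-curve parameters. The sketch below assumes that reference EC50 values are approximately log-normal and uses conventional three-sigma limits on the log scale; the history values and function names are hypothetical.

```python
import math
import statistics

def control_limits(history, k=3.0):
    """Lower limit, center line, and upper limit from historical values.

    Limits are computed on the natural-log scale and transformed back, so
    they are asymmetric around the geometric mean on the original scale.
    """
    logs = [math.log(value) for value in history]
    center = statistics.mean(logs)
    spread = statistics.stdev(logs)  # sample standard deviation
    return (math.exp(center - k * spread),
            math.exp(center),
            math.exp(center + k * spread))

def out_of_control(value, limits):
    """True when a new value falls outside the lower/upper control limits."""
    lower, _, upper = limits
    return not (lower <= value <= upper)

# Hypothetical reference-standard EC50 values (ng/mL) from past valid runs.
ec50_history = [10.2, 9.8, 10.5, 9.9, 10.1, 10.4, 9.7, 10.0]
limits = control_limits(ec50_history)

print("limits:", tuple(round(x, 2) for x in limits))
print("10.3 out of control?", out_of_control(10.3, limits))
print("13.0 out of control?", out_of_control(13.0, limits))
```

A value outside the limits after a reagent, cell, or software change is the kind of signal that would prompt the bridging micro-studies described above before the new platform contributes to expiry models.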