
Algorithmic approach

A More Performant Approach than USP Recommendations

The recommendations of the USP (United States Pharmacopeia), notably its provisions on beyond-use dates (BUDs) for compounded preparations, rely on conservative, empirical rules designed to ensure safety when stability data are limited. These standardized durations (e.g., 14 days for refrigerated aqueous oral preparations) do not account for real-world variability stemming from formulation, packaging, or storage conditions.

Smart Formulation adopts a scientific approach that predicts the beyond-use date of compounded preparations based on:

  • Molecular descriptors of the active substance (from resources such as ChEMBL),
  • Formulation characteristics (type and proportion of excipients),
  • Environmental parameters (temperature, humidity, light exposure),
  • Primary packaging (glass, plastic, barrier materials, closure systems),
  • and published stability data from the literature and validated databases (Stabilis).

Leveraging multivariate regression and explainable supervised learning models developed in KNIME, Smart Formulation estimates the BUD for a specific preparation and reports a quantified confidence interval, instead of applying one-size-fits-all values.
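
To illustrate what reporting an estimate with an interval can look like, here is a minimal Python sketch. It assumes the model is an ensemble whose members each return a BUD in days, and it summarizes their spread with nearest-rank trimming. The function name `summarize_bud`, the `coverage` parameter, and the sample numbers are hypothetical, and the spread of ensemble members is only a rough stand-in for a formal confidence interval; the source does not state how Smart Formulation computes its interval.

```python
import statistics

def summarize_bud(member_predictions, coverage=0.90):
    """Summarize per-member BUD predictions (days) as an estimate plus a spread.

    Assumption: the spread of ensemble members is used as a rough interval;
    this is not a formal statistical confidence interval.
    """
    if len(member_predictions) < 2:
        raise ValueError("need at least two member predictions")
    if not 0.0 < coverage < 1.0:
        raise ValueError("coverage must be between 0 and 1")
    ordered = sorted(member_predictions)
    n = len(ordered)
    # Number of members trimmed from each tail (nearest rank).
    trim = round((1.0 - coverage) / 2.0 * (n - 1))
    return {
        "estimate_days": statistics.mean(ordered),
        "lower_days": ordered[trim],
        "upper_days": ordered[n - 1 - trim],
    }

# Made-up member outputs for one hypothetical preparation (days).
members = [20, 22, 25, 27, 30, 31, 33, 35, 38, 40, 41]
summary = summarize_bud(members, coverage=0.80)
print(summary)
```

A fixed rule would return the same number for every preparation; a summary like this one varies with the preparation and shows how far the individual predictions disagree.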

In practice, this yields preparation-specific, scientifically justified beyond-use dates that reflect the actual product context more closely than standardized USP BUDs, while still meeting pharmaceutical safety requirements.

Model performance was evaluated on several hundred compounded preparations and checked against stability durations reported in the literature and in curated datasets (Stabilis, ChEMBL), showing a significant improvement in predictive accuracy over empirical rules.

The Science Behind Smart Formulation

The algorithmic approach in Smart Formulation relies on predictive models built in KNIME that integrate formulation, environmental, and packaging variables to estimate the Beyond-Use Date (BUD) of pharmaceutical preparations. These models, based on multivariate regression and explainable supervised learning techniques, are experimentally validated and documented in the scientific literature.
Figure: Machine learning workflow (KNIME) used in Smart Formulation to predict the beyond-use date (BUD) of active pharmaceutical ingredients (APIs) in oral solid dosage forms. The approach relies on a Tree Ensemble Regression algorithm that captures complex non-linear relationships between molecular properties, formulation parameters, and environmental conditions. The dataset includes:

  • API descriptors (18), e.g., molecular weight (MW), logP, rotatable bonds (RB), polar surface area (PSA), hydrogen-bond donors (HBD) and acceptors (HBA);
  • Formulation descriptors (4): encoded excipient compositions (e.g., lactose, silica, cellulose, mannitol, sucrose, HPMC) and API% content;
  • Conditioning & storage descriptors (5): packaging type (glass/plastic/paper), storage temperature, storage class.

The model predicts BUD (days) and highlights an inverse correlation between logP and BUD. Adapted from Smart Formulation: AI-Driven Web Platform for Optimization and Stability Prediction of Compounded Pharmaceuticals Using KNIME, Grigoryan A, et al. Pharmaceuticals (Basel). 2025 Aug 21; 18(8):1240.
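
The tree-ensemble idea described in the caption can be sketched in plain Python. This is a toy stand-in, not the KNIME workflow: it uses bagged depth-1 regression trees (stumps) where a real tree ensemble grows deeper trees, it encodes only a handful of the 27 descriptors, and every record and BUD value below is invented so that BUD falls as logP rises. All names (`encode`, `fit_stump`, `fit_ensemble`, `predict`, `TOY`) are hypothetical.

```python
import random
import statistics

PACKAGING = ("glass", "plastic", "paper")

def encode(record):
    """Flatten one preparation record into a numeric feature vector."""
    one_hot = [1.0 if record["packaging"] == p else 0.0 for p in PACKAGING]
    return [record["logp"], record["mw"], record["api_pct"], record["temp_c"]] + one_hot

def fit_stump(X, y):
    """Fit a depth-1 regression tree: the single split minimizing squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [t for row, t in zip(X, y) if row[j] <= thr]
            right = [t for row, t in zip(X, y) if row[j] > thr]
            ml, mr = statistics.mean(left), statistics.mean(right)
            sse = sum((t - ml) ** 2 for t in left) + sum((t - mr) ** 2 for t in right)
            if best is None or sse < best[0]:
                best = (sse, j, thr, ml, mr)
    if best is None:  # every row identical: predict the mean everywhere
        m = statistics.mean(y)
        return (0, float("inf"), m, m)
    return best[1:]

def predict_stump(stump, x):
    j, thr, left_mean, right_mean = stump
    return left_mean if x[j] <= thr else right_mean

def fit_ensemble(X, y, n_trees=25, seed=0):
    """Bagging: fit each stump on a bootstrap resample of the training rows."""
    rng = random.Random(seed)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]
        trees.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return trees

def predict(trees, x):
    """Average the member predictions to get the ensemble BUD estimate (days)."""
    return statistics.mean(predict_stump(t, x) for t in trees)

def rec(logp, mw, packaging):
    return {"logp": logp, "mw": mw, "api_pct": 10.0, "temp_c": 25.0, "packaging": packaging}

# Invented training data: (record, BUD in days).
TOY = [
    (rec(0.5, 300.0, "glass"), 180), (rec(1.0, 150.0, "plastic"), 170),
    (rec(1.5, 420.0, "paper"), 160), (rec(2.0, 210.0, "glass"), 150),
    (rec(3.0, 350.0, "plastic"), 60), (rec(3.5, 190.0, "paper"), 50),
    (rec(4.0, 400.0, "glass"), 40), (rec(4.5, 260.0, "plastic"), 30),
]
X = [encode(r) for r, _ in TOY]
y = [bud for _, bud in TOY]
trees = fit_ensemble(X, y)

low_logp = predict(trees, encode(rec(0.5, 250.0, "glass")))
high_logp = predict(trees, encode(rec(4.5, 250.0, "glass")))
print(low_logp, high_logp)
```

Because the toy targets were built to decrease with logP, the ensemble returns a longer BUD for the low-logP query than for the high-logP one. This only mirrors the trend reported in the caption and is not evidence for it.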

Scientific Foundations

Smart Formulation models stability by integrating excipient interactions, environmental conditions (temperature, humidity), and packaging parameters. This enables rational formulation development and anticipation of degradation risks, grounded in validated experimental data.

Key Scientific Publication

Smart Formulation: AI-Driven Web Platform for Optimization and Stability Prediction of Compounded Pharmaceuticals Using KNIME

Grigoryan A, Helfrich S, Lequeux V, Lapras B, Marchand C, Merienne C, Bruno F, Mazet R, Pirot F.
Pharmaceuticals (Basel). 2025 Aug 21; 18(8):1240.
DOI: 10.3390/ph18081240  |  PMID: 40872628  |  PMCID: PMC12389346

This publication presents Smart Formulation, a KNIME-based web tool leveraging AI and predictive modeling to optimize formulation and predict stability. The study demonstrates how integrated models relate excipient properties, environmental conditions, and formulation parameters to anticipated degradation kinetics.

Approach & Validation

  • Design of experiments and machine learning implemented in KNIME,
  • Cross-validation across multiple dosage forms (oral solids/liquids, parenterals, topicals),
  • Systematic comparison of predictions with experimental stability data.
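
As a rough illustration of the validation steps listed above, the following Python sketch runs k-fold cross-validation and compares the mean absolute error (MAE) of a fitted model against a fixed default duration. The data are invented, the model is a one-variable least-squares line on storage temperature (far simpler than the models described here), and the 14-day default is used only as a placeholder rule. All function names are hypothetical.

```python
import statistics

def k_fold_indices(n, k):
    """Yield (train, test) index lists for k contiguous folds."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size

def mae(y_true, y_pred):
    return statistics.mean(abs(t - p) for t, p in zip(y_true, y_pred))

def cross_validated_mae(X, y, fit, predict, k=4):
    """Average the held-out MAE over k folds."""
    errors = []
    for train, test in k_fold_indices(len(X), k):
        model = fit([X[i] for i in train], [y[i] for i in train])
        preds = [predict(model, X[i]) for i in test]
        errors.append(mae([y[i] for i in test], preds))
    return statistics.mean(errors)

def fit_line(X, y):
    """Least-squares line on the first feature (storage temperature)."""
    xs = [row[0] for row in X]
    mx, my = statistics.mean(xs), statistics.mean(y)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (t - my) for x, t in zip(xs, y)) / sxx if sxx else 0.0
    return slope, my - slope * mx

def predict_line(model, x):
    slope, intercept = model
    return slope * x[0] + intercept

def fit_fixed_rule(X, y):
    return 14.0  # placeholder default; ignores the training data entirely

def predict_fixed_rule(model, x):
    return model

# Invented data: storage temperature (deg C) and observed stability (days).
X = [[4.0], [25.0], [30.0], [40.0], [4.0], [25.0], [30.0], [40.0]]
y = [90, 52, 38, 22, 94, 48, 42, 18]

model_mae = cross_validated_mae(X, y, fit_line, predict_line)
rule_mae = cross_validated_mae(X, y, fit_fixed_rule, predict_fixed_rule)
print(model_mae, rule_mae)
```

Scoring both approaches on the same held-out folds is what makes the comparison fair: the fitted model never sees the preparations it is scored on.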

These foundations underpin the Smart Formulation modules, enabling practical, reproducible, and rapid application in pharmaceutical development.

Note: The interactive modules are research prototypes and are provided for demonstration and educational purposes only.