Biosimilars: Lessons Learned From Regulatory Approvals
By Sarfaraz K. Niazi, Ph.D., and Sunitha Lokesh
The EU has approved biosimilars of 16 biological molecules, and the U.S. has licensed biosimilars of the same molecules (some through the 505(b)(2) pathway), except for follitropin alfa; combined, these approvals amount to 114 biosimilars. This article, based on the EMA’s European public assessment reports (EPARs)1 and the FDA’s BLA (biologics license application) reviews,2 identifies many inconsistencies that require revision of regulatory guidance to assure faster approval of biosimilars.
All submissions provide data to support biosimilarity based on four stepwise approaches: analytical similarity (recently changed to analytical evaluation by the FDA3), non-clinical pharmacology, clinical pharmacology, and comparative safety and efficacy testing, where required. There was a large difference between the regulatory submissions for the same molecule and many extraneous studies pointed out by the regulatory agencies. While future developers of biosimilars can learn much from these regulatory filings, they are advised not to follow them blindly.
In this article, we unpack the large differences among biosimilar candidates we uncovered by analyzing the FDA and EMA biosimilar approval documentation.
Analytical assessment, the core of development, relies on methodologies that have become more sensitive, revealing more differences in the biosimilar candidates. As a result, developers resort to multiple orthogonal studies, as in the case of MVASI (bevacizumab), where more than 90 different tests were conducted to justify the variations in the post-translational modifications. It’s important to note, however, that the FDA has taken a definitive step by replacing the term “analytical similarity” with “analytical evaluation” to signal to developers that differences are acceptable as long as they do not produce any clinically meaningful impact.4
For example, the concern over the lower percentage of afucosylated species in Remsima (infliximab)5 was resolved by showing that its binding to the FcγRIIIa receptor was not significantly different under physiological conditions (e.g., with the addition of serum, or using peripheral blood mononuclear cells as effector cells).6 Despite these analytical differences, the PK and clinical efficacy studies showed no differences.
Similarly, the EU and U.S. reference products of rituximab differed in their quality attributes (e.g., charge variants, glycan structures, antibody-dependent cellular cytotoxicity [ADCC]), but were ultimately permitted as appropriate references based on their safety and efficacy history.
There is a large misunderstanding by developers about animal toxicology and pharmacology studies. Currently, manufacturers treat biosimilars as new biologics; they’re often conducting toxicity evaluation in multiple species, testing for responses that the animal species do not give, such as immunogenicity. The studies for biosimilars are intended to compare, not characterize, a toxic response, and testing them in species that are not capable of showing a toxic response is irrelevant, as the EMA and the FDA have repeatedly pointed out.
For example, MVASI (bevacizumab) reported five non-clinical studies, including studies in mice and rats, to prove that the high mannose variants in the biosimilar candidate are not meaningful. In our view, however, all of these studies were irrelevant since rodents do not have receptors to bind monoclonal antibodies. The same can be said about testing trastuzumab in mice (Trazimera), where trastuzumab does not identify the neu receptor. Another trastuzumab biosimilar, Herzuma, also conducted studies in rats. None of these studies were considered by the FDA in its evaluation of the products.
Infliximab biosimilars (Renflexis and Inflectra) also conducted studies in mice and rats. In one case, the FDA did not require additional studies since infliximab had a long history of proven safety in clinical use.
Etanercept biosimilars (Eticovo7 and Erelzi8) conducted studies in mice, rats, and monkeys, but only the studies in monkeys were relevant, as the FDA determined that rodents do not have the receptors to respond to etanercept.
The epoetin biosimilar Retacrit reported 15 animal toxicology studies, most of which were considered irrelevant, except the study in beagle dogs and one study in rats, as the developer presented support for the high antidrug antibodies found. It was later discovered that the differences in the ADA were due to differences in the formulation and the route of administration.
The non-clinical development for the two pegfilgrastim biosimilars (Udenyca and Fulphila) is also noteworthy in this regard. Udenyca reported a toxicity study in cynomolgus monkeys, while Fulphila was studied in rats. The diversity of the testing model can likely be attributed to the parallel development of these products, in which the developers were not aware of the studies conducted by their competitors.
The product-specific guidelines of the EMA show that for the following molecules, pharmacokinetic (PK) studies in healthy volunteers are acceptable: teriparatide, low-molecular-weight heparin, insulin, interferon-β, pegfilgrastim, somatropin, follitropin-α, epoetin, etanercept, trastuzumab, bevacizumab, adalimumab, and infliximab. Only for rituximab does the EMA recommend a PK study in one therapeutic area plus an efficacy/safety trial (plus PK data) in the other therapeutic area.
PK studies must meet the equivalence criteria, and a failure to do so cannot be overcome by successful clinical efficacy testing. This requirement is based on the assertion that PK studies are more sensitive in detecting potential product-related differences.
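As a rough illustration of what meeting the equivalence criteria involves, the sketch below computes an approximate 90 percent confidence interval for the geometric mean ratio of an exposure metric (such as AUC) between a biosimilar candidate and its reference in a parallel-group study, then checks it against acceptance limits. The 80 to 125 percent limits, the normal approximation, the function names, and the data are illustrative assumptions, not values taken from any EPAR or BLA review; an actual analysis would follow the statistical model prespecified in the protocol.

```python
import math
from statistics import NormalDist, mean, stdev


def gmr_confidence_interval(test, reference, level=0.90):
    """Approximate CI for the geometric mean ratio (test / reference).

    Assumes a parallel-group design and a normal approximation on the
    log scale; a real analysis would typically use a t-distribution or
    an ANOVA model, so treat this only as a sketch.
    """
    log_test = [math.log(x) for x in test]
    log_ref = [math.log(x) for x in reference]
    diff = mean(log_test) - mean(log_ref)
    se = math.sqrt(
        stdev(log_test) ** 2 / len(log_test)
        + stdev(log_ref) ** 2 / len(log_ref)
    )
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return math.exp(diff - z * se), math.exp(diff + z * se)


def within_limits(ci, lower=0.80, upper=1.25):
    """True when the whole interval lies inside the acceptance limits.

    The 0.80 to 1.25 defaults are the conventional bioequivalence
    limits, assumed here for illustration.
    """
    ci_lower, ci_upper = ci
    return lower <= ci_lower and ci_upper <= upper


# Hypothetical AUC values, for illustration only.
reference_auc = [100, 110, 90, 105, 95, 102, 98, 108]
candidate_auc = [104, 99, 93, 111, 97, 106, 101, 95]

ci = gmr_confidence_interval(candidate_auc, reference_auc)
print(f"90% CI for GMR: {ci[0]:.3f} to {ci[1]:.3f}")
print("within limits:", within_limits(ci))
```

Because the whole interval, not just the point estimate, has to sit inside the limits, high variability or a small sample can fail a study even when the products are truly similar.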
There were several failed PK studies reported in the EPARs: Cyltezo (adalimumab), Hyrimoz (adalimumab), Ziextenzo (pegfilgrastim), Terrosa (teriparatide), Grastofil (filgrastim), and Efgratin (pegfilgrastim). For adalimumab, two studies failed initially,9 and the failure was attributed to differences in glycan structures that are known to affect PK, but the high mannose differences were too small to support that argument; a difference of at least 20 percent is needed to affect receptor-mediated elimination.10 The differences in the buffer system could not explain the failure, either. PK similarity was demonstrated in a second study, with a larger sample size correcting for many variables.
In the case of pegfilgrastim, in particular, the high PK variability11 is due12 to a disproportional dose-response relationship, wherein healthy subjects, given a tenfold increase in dose, showed an approximately 75-fold increase in exposure.13 This points to the importance of accuracy in dosing.
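The arithmetic behind this sensitivity can be sketched with a simple power model, in which exposure is proportional to dose raised to an exponent. The power model is an assumption made here for illustration (pegfilgrastim disposition is more complex than a single exponent), and the 5 percent dosing error is a hypothetical figure.

```python
import math


def power_model_exponent(dose_fold, exposure_fold):
    """Exponent b in exposure ~ dose**b implied by one pair of fold changes."""
    return math.log(exposure_fold) / math.log(dose_fold)


def exposure_fold_change(dose_fold, exponent):
    """Fold change in exposure for a given fold change in dose."""
    return dose_fold ** exponent


# Tenfold dose increase, approximately 75-fold exposure increase (from the text).
beta = power_model_exponent(10, 75)

# Hypothetical 5 percent overdose relative to the nominal dose.
error = exposure_fold_change(1.05, beta) - 1

print(f"implied exponent: {beta:.3f}")
print(f"exposure shift for a 5% dosing error: {error:.1%}")
```

Under these assumptions the exponent is close to 1.9, so a dosing error is nearly doubled in the exposure it produces, which eats into whatever room the equivalence limits leave.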
The example above represents a pegfilgrastim-specific issue in PK studies; more generally, we noted that high intersubject variability and non-linear disposition (e.g., due to the long half-life of pegfilgrastim) were common elements responsible for the failure of PK studies.
Comparative Clinical Safety And Efficacy
“Additional clinical studies” are intended to remove any “residual uncertainty” remaining as a result of any marginal difference in analytical similarity, non-clinical pharmacology, and clinical pharmacology. A clinical efficacy study does not resolve “residual uncertainty” resulting from differences in quality attributes; on the contrary, failed efficacy studies can be overruled, as in the case of trastuzumab biosimilars. In these examples, the confidence interval for the primary endpoint (pCR) exceeded the upper bound of the predefined equivalence margins; it confirmed non-inferiority but did not exclude the possibility of superior efficacy.14 These products in the end were declared biosimilar, despite differences in the ADCC binding. Several studies have pointed out that differences in antidrug antibodies do not necessarily affect clinical efficacy measures, as seen with infliximab (ACR20 response)15 and etanercept.16 A significant observation in the studies intended to demonstrate equivalence or non-inferiority is the choice of clinical endpoints. Hard clinical outcomes, such as overall survival, are often insensitive and influenced by disease- and patient-related factors. The FDA encourages developers to suggest novel validated clinical markers as a better choice than recording the clinical response in patients.17 The FDA and EMA accept many clinical endpoints in place of hard clinical outcome monitoring.
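The distinction between confirming non-inferiority and failing to exclude superiority can be made concrete by comparing a confidence interval for the treatment difference with a symmetric equivalence margin. The sketch below is a generic illustration; the function name, the margin, and the intervals are hypothetical and are not the values reported in the trastuzumab EPARs.

```python
def classify_equivalence(ci_lower, ci_upper, margin):
    """Classify a CI for (test - reference) against margins of -margin and +margin.

    All inputs are on the same scale, e.g. a difference in response rate.
    """
    non_inferior = ci_lower > -margin   # inferiority excluded
    non_superior = ci_upper < margin    # superiority excluded
    if non_inferior and non_superior:
        return "equivalent"
    if non_inferior:
        return "non-inferior, superiority not excluded"
    if non_superior:
        return "non-superior, inferiority not excluded"
    return "inconclusive"


# Hypothetical margin of 13 percentage points on a response-rate difference.
MARGIN = 0.13

print(classify_equivalence(-0.05, 0.08, MARGIN))
print(classify_equivalence(-0.02, 0.17, MARGIN))
```

The second call mirrors the pattern described for the trastuzumab biosimilars: the lower bound clears the margin, but the upper bound crosses it, so formal equivalence is not shown even though the candidate is clearly not worse.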
In the presence of suitable pharmacodynamic (PD) endpoints and a clear understanding of the mechanism of action, a PK/PD study may be sufficient for marketing approval.18 However, for complex, multifunctional biologicals, comparative efficacy and safety clinical trials in patients are still viewed as a necessary component of biosimilar development.19 The unresolvable complexity of interactions resulting from the size of the molecule, diverse moieties with different functions (e.g., Fab/Fc-parts), multiple mechanisms of action, the impact of glycosylation pattern and potential for immunogenicity, and potentially life-threatening adverse effects (AEs) drive the need for testing in patients.
An examination of the EU guidance and EPARs shows that several biosimilars did not need clinical efficacy testing: teriparatide; low-molecular-weight heparin; insulins; filgrastim; pegfilgrastim; interferon-β (efficacy and safety in patients with MS using MRI-related efficacy endpoint); somatropin (efficacy/safety trial in children with growth hormone deficiency using height velocity as an efficacy endpoint); follitropin-α (efficacy/safety trial in patients undergoing superovulation for ART using “number of oocytes retrieved” as an efficacy endpoint); epoetin (efficacy/safety trial in patients with renal anemia using hemoglobin as an efficacy endpoint); etanercept (efficacy/safety trial in patients with RA using ACR20 or DAS28 as a primary efficacy endpoint); rituximab (PK study in one therapeutic area plus efficacy/safety trial [plus PK data] in the other therapeutic area); and infliximab (efficacy/safety trial in patients in a therapeutic indication approved for the reference medicine, which is sensitive to detect potential differences). The FDA does not provide any such recommendation.
However, recent statements by the FDA point to a major change in considering alternatives to clinical efficacy testing.20
For one, the agency’s interchangeability guidance clarifies that, “Although assessments of efficacy endpoints can be supportive, at therapeutic doses, many clinical efficacy endpoints would generally be less sensitive to detect changes in exposure and/or activity that may arise as a result of alternating or switching.”
The FDA’s Biosimilars Action Plan21 states: “The FDA’s goals in this area also include the development and validation of pharmacodynamic biomarkers tailored to biosimilar development and in silico modeling and simulation to evaluate pharmacokinetic and pharmacodynamic response versus clinical response relationships using existing clinical data. The development and validation of these tools, alongside others, can allow development programs to be more efficient and can reduce the size of clinical studies. These smaller clinical studies, in turn, can enable more biosimilars to reach the market in a much more cost-effective and timely manner.”
Finally, Dr. Janet Woodcock, the FDA’s director of the Center for Drug Evaluation and Research (CDER), recently stated,22 “I believe the clinical trial system is broken.”
We believe that biosimilars are ready to be approved without extensive (and, in some cases, any) clinical efficacy testing. But it will be up to developers to challenge the agencies, which are now open to these suggestions. Clinical efficacy testing constitutes the majority of the cost and time for biosimilar approval; a shift in this strategy will prove pivotal in making biosimilars affordable.
The EPARs, the BLA reviews, and our first-hand experience of negotiating the scope of the biosimilars development plan with global agencies have led us to offer manufacturers the following takeaways and suggestions:
- The EMA and FDA continue revising their pivotal guidance, resulting in a significant change in the study requirements to demonstrate biosimilarity. Both agencies also encourage and accept novel testing methods. Biosimilar developers are strongly advised not to follow the approval paths of similar products; instead, develop a plan that proposes fewer studies, along with securing concurrence of the agencies before beginning the development.
- Structural and functional differences are acceptable if they are not clinically relevant, such as the ADCC profiles. A significant post-translational modification (PTM) difference should be justified based on conclusive function studies, not through parallel orthogonal testing; creative testing like testing under modified specific physiologic conditions may resolve uncertainties. Orthogonal testing is not always the best choice.
- Some products no longer need animal toxicology testing; manufacturers could request a waiver for all products, except in the case of monoclonal antibodies (mAbs), where a PK study in four to six monkeys will suffice. Do not extrapolate animal dosing from human dosing; use only dosing in linear dose-response on the higher side, and do not study the immune response in animal models. It’s not advised to test mAbs in rodents, despite its allowance by India’s regulatory agency, the CDSCO.23 (In our research, we determined that the CDSCO is the only agency to have such clauses.)
- Pharmacokinetic studies are more sensitive in pointing to clinically meaningful differences; agencies may be willing to accept differences in comparative efficacy studies if the pharmacokinetic profiles are similar. In silico pharmacokinetic studies should be offered to secure a waiver of comparative clinical efficacy testing. Be creative with inclusion criteria — know that it is a comparative study, not a Phase 1 study. Novel study protocols can combine pharmacokinetic, pharmacodynamic, and immunogenicity profiling in a single study.
- Immunogenicity testing using in situ models can add to the robustness of the data collected in healthy subjects and patients. Testing for immunogenicity in naïve subjects for highly immunogenic products can be an ethical consideration, but we cannot determine immunogenicity by conducting testing in animal species against the advice of the regulatory agencies. (As noted above, the Indian regulatory authorities currently approve of such testing.)
- Many product categories do not require comparative efficacy testing (insulin, low-molecular-mass heparins, ophthalmic drugs, and pegfilgrastim), where physicochemical, functional, PK, and PD comparisons provide pivotal evidence of similarity. Exceptions to this recommendation include complex, multifunctional biologicals, where comparative efficacy and safety clinical trials in patients are inevitable as a residual uncertainty always remains. Efficacy study waivers are available where a suitable PD parameter is available, such as in the case of many cytokines. Use clinical endpoints rather than hard efficacy endpoints. We suggest manufacturers offer the argument that M2 is always arbitrary, and surpassing M1 should be excluded in biosimilarity determination in the protocol. (M1 is the entire effect of the active control assumed to be present in a non-inferiority study and M2 is the largest clinically acceptable difference [degree of inferiority] of the test drug compared to the active control.)
We also believe there are several areas in which the FDA can provide additional clarification or guidance. These include:
- Explaining the rationale behind switching from the use of “analytical similarity” to “analytical evaluation.”
- Listing the structural differences that are less relevant to clinical efficacy.
- Educating that toxicology studies are not appropriate in animal species that do not have the appropriate receptors.
- Explaining how the FDA would determine if there is any “residual uncertainty.”
- Providing details of in silico PK modeling to obviate clinical efficacy testing.
- Explaining the rationale of selecting an M2 value on a product-specific basis, given that biologic products often have non-linear efficacy response.
- Creating development templates based on the category of products.
- Encouraging developers to propose novel in vitro, in situ, and in vivo models for immunogenicity testing.
- EMA, EPARs: https://www.ema.europa.eu/en/search/search/field_ema_web_categories%253Aname_field/Human/ema_group_types/ema_medicine/search_api_aggregation_ema_medicine_types/field_ema_med_biosimilar?search_api_views_fulltext=biosimilar
- Weise M, Kurki P, Wolff-Holz E, Bielsky M, Schneider C. Biosimilars: the science of extrapolation. Blood. 2014;124:3191–6
- EPAR Hyrimoz EMA/CHMP/404076/2018. (2018) and EPAR Cyltezo EMA/CHMP/750187/2017. (2017). http://www.ema.europa.eu.
- Goetze AM, Diana Y, Zhang Z, Shah B, Lee E, Bondarenko PV, et al. High-mannose glycans on the Fc region of therapeutic IgG antibodies increase serum clearance in humans. Glycobiology. 2011;21(7):949–59.
- Guideline on similar biological medicinal products containing recombinant granulocyte-colony stimulating factor (rG-CSF), EMEA/CHMP/BMWP/31329/2005 Rev 1. (2018). http://www.ema.europa.eu.
- EPAR Ziextenzo (pegfilgrastim), EMA/47326/2017; Withdrawal of the marketing authorisation application. (2017) and EPAR Efgratin. EMA/18691/2017; Withdrawal assessment report. (2016). http://www.ema.europa.eu.
- Roskos L, Lum F, Lockbaum F, Schwab G, Yang B. Pharmacokinetic/pharmacodynamic modeling of pegfilgrastim in healthy subjects. Clin Pharmacol. 2006;46:747–57.
- EPAR Kanjinti EMA/CHMP/261937/2018 and EPAR Ontruzant EMA/CHMP/9855/2018. (2017). http://www.ema.europa.eu
- EPAR Flixabi: procedural steps taken and scientific information after the authorization. http://www.ema.europa.eu.
- EPAR Benepali EMA/CHMP/819219/2015. (2015). http://www.ema.europa.eu.
- EPAR Filgrastim Hexal EMEA/CHMP/651324/2008. (2009). http://www.ema.europa.eu; EPAR Terrosa EMA/84371/2017. (2016). http://www.ema.europa.eu.
- Guideline on similar biological medicinal products containing biotechnology-derived proteins as active substance: nonclinical and clinical issues EMEA/CHMP/BMWP/42832/2005 Rev. 1. (2014). http://www.ema.europa.eu.
About The Authors:
Sarfaraz K. Niazi, Ph.D., is an adjunct professor at the University of Illinois and founder and executive chairman at Pharmaceutical Scientist, LLC. He has developed many biosimilars and advises regulatory agencies, including the FDA, and biosimilar developers on designing cost-effective strategies and faster regulatory approvals worldwide. You can reach him at email@example.com.
Sunitha Lokesh is a biotechnology engineer and research scientist with extensive experience in developing the most complex biosimilars and taking them to regulatory filings. She works with Niazi in creating cGMP-compliant development and manufacturing platforms. You can reach her at firstname.lastname@example.org.