Guest Column | April 6, 2026

From Principles To Practice: Building Quality Into Generative AI-Assisted Pharma Operations

By Ricardo Torres-Rivera, PMP, CEO and president, Xevalics Consulting, LLC


In Part 1 of this series, we established the case for why quality must be built into generative AI-assisted development, not inspected in after the fact. Drawing on the foundational work of Deming, Crosby, and Juran, and grounded in quality by design (QbD), computer systems validation (CSV), computer software assurance (CSA), and the pharmaceutical industry’s hard-won quality frameworks, Part 1 argued that the dominant vibe coding workflow represents a regression to the inspection-dependent thinking the quality management discipline spent decades dismantling. It cataloged the new risk dimensions generative AI introduces: prompts as uncontrolled executable logic, non-deterministic outputs, model drift, and context injection, risks that traditional validation practices were never designed to address. This article picks up where that argument left off: what regulators are already asking, what quality must mean in this new context, and how organizations can begin building the governance infrastructure the moment demands.

The Question Regulators Are Already Asking

Consider a scenario that is no longer hypothetical. A pharmaceutical company uses a generative AI tool to support a regulatory submission, perhaps to analyze clinical pharmacology data, summarize adverse event trends, generate a statistical report, or review batch production records for anomalies. The output looks reasonable. It is incorporated into the regulatory package or used to support a batch disposition decision.

Now a regulatory agency reviewer or inspector asks: How was this analysis performed? What prompt was used? What model version generated the output? What parameters were set? What data context was provided? Can you reproduce this result?

If the organization cannot answer these questions (and, today, most cannot), the analysis has no regulatory standing. Not because generative AI is inherently untrustworthy, but because the organization failed to build quality into the process that produced the result.

This raises a question the industry has not yet formally addressed: do we need prompting standards?

If prompts now function as executable logic determining how regulated data is processed, analyzed, and interpreted in GMP, GLP, and GCP environments, then prompts are regulatory artifacts. They require the same governance we apply to any controlled component in a validated system: version control, change management, verification testing, and documentation. The fact that a prompt is written in natural language rather than Python or SQL does not make it less consequential. If anything, the inherent ambiguity of natural language makes governance more critical, not less.

The FDA’s January 2025 draft guidance on the use of artificial intelligence to support regulatory decision-making for drug and biological products already signals this direction. Credibility, transparency, and reproducibility are central expectations. The pharmaceutical industry should not wait for enforcement actions to begin building the quality infrastructure that these expectations require.1

Reclaiming The Definition Of Quality

Part of the problem is that “quality” has become a vague word in too many organizations, synonymous with “good enough” or “it passed testing.” In the context of generative AI-built tools proliferating across pharmaceutical operations, this vagueness is dangerous.

Quality must be redefined, or rather, reclaimed, in precise, operational terms that every quality professional, every laboratory scientist, and every manufacturing leader can apply. Drawing from Crosby, Juran, and ISO 9000, a working definition for our industry is:2,3,4

Quality is conformance to fit-for-use requirements, demonstrated by objective evidence, scaled to risk, with data integrity treated as part of the requirement set.

Every element of this definition matters in a pharmaceutical context. Conformance provides discipline; it grounds quality in verifiable standards rather than subjective impressions. Fit-for-use ensures requirements are meaningful, aligned with intended use, real-world operational conditions, and patient safety. Objective evidence makes the claim defensible under inspection. Scaled to risk prevents both over-documentation of trivial functions and under-control of critical ones, the balance that CSA was designed to achieve.5 And data integrity as a requirement recognizes that in pharmaceutical systems, a record that cannot be trusted is a quality failure regardless of whether the software functions correctly, a principle reinforced by 21 CFR Part 11, ALCOA+ expectations, and FDA’s data integrity guidance.6,7

Applied to generative AI-built tools now appearing in pharmaceutical operations, this definition sets a clear standard. Before the first prompt is written, there must be requirements. Those requirements must be fit for use in the intended good practice (GxP) context. The outputs must be traceable back to those requirements. And the integrity of the data, including the prompt, the model configuration, and the context, must be controlled as rigorously as any other element in a validated system.

Without this, what organizations have is not quality. It is a prototype running in a GMP environment.

Do It Right The First Time

Crosby’s admonition to do it right the first time is not a slogan. It is an economic and regulatory argument. In pharmaceutical manufacturing, the price of nonconformance is measured in rejected batches, 483 observations, warning letters, consent decrees, delayed product launches, and, at the most serious level, compromised patient safety.2 The cost of building quality in from the start has always been lower than the cost of fixing what went wrong.

In the context of generative AI-built tools and vibe coding, “doing it right the first time” means establishing quality infrastructure before the coding begins, not retrofitting governance after an AI-generated tool is already embedded in a GMP workflow and an auditor is asking questions no one prepared to answer.

This is the essence of QbD: understand the variables, define the requirements, control the process, and produce evidence of conformance before the product reaches the point where failure has consequences. QbD taught us this lesson for drug manufacturing.8 The same principle applies to the software tools we now use to support manufacturing, quality, and regulatory decisions.

Leaders who are championing vibe coding within their organizations carry a responsibility here. The message to teams cannot simply be “go build.” It must be “go build with quality built in.” Otherwise, the speed that makes generative AI attractive becomes the speed at which ungoverned, unvalidated tools proliferate across quality systems, laboratory operations, and manufacturing processes. Technical debt that once accumulated over months now accumulates in days. And in our industry, technical debt is not merely an engineering inconvenience; it is a compliance liability with direct implications for product quality and patient safety.

Translating Principles Into Practice

The preceding sections establish why quality must be built into generative AI-assisted development, not inspected in after the fact. But principles without actionable guidance leave organizations in the same position: knowing what to do in theory, but uncertain where to begin in practice. The following considerations are designed to translate the quality principles discussed above into concrete starting points for pharmaceutical organizations navigating AI adoption in regulated environments.

1. Establish Prompting Standards

If prompts function as executable logic, governing how regulated data is processed, analyzed, and interpreted, then prompts require the same governance discipline that organizations apply to code in traditional software development life cycle (SDLC) environments. Just as coding standards define how software must be written, reviewed, and documented before deployment into production, prompting standards should define how prompts must be authored, versioned, reviewed, and maintained when they generate or influence regulated content.

At a minimum, prompting standards should address: version control of prompts used in GxP workflows, documentation of model identity and configuration parameters, traceability between prompts and the outputs they generate, and clear criteria for when a prompt revision constitutes a change requiring formal change control. The goal is not to bureaucratize every interaction with an AI tool but to ensure that prompts generating regulated content are governed as the controlled inputs they functionally are.
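To make these minimum elements concrete, the sketch below shows one way a governed prompt record might be modeled. The field names, identifiers, and the change-control rule are illustrative assumptions, not a prescribed schema; the point is that prompt text, model identity, and sampling parameters are treated as controlled inputs whose modification triggers formal change control.

```python
from dataclasses import dataclass
import hashlib

@dataclass(frozen=True)
class PromptRecord:
    """Illustrative governed prompt treated as a controlled input to a GxP workflow."""
    prompt_id: str       # hypothetical identifier, e.g. "QC-SUMM-001"
    version: str         # incremented under change control
    text: str            # the exact prompt text submitted to the model
    model_id: str        # model identity recorded for traceability
    model_version: str   # pinned model version used at execution time
    temperature: float   # sampling parameter recorded for reproducibility
    approved_by: str     # reviewer who released this prompt version
    approved_on: str     # ISO-8601 approval date

    def checksum(self) -> str:
        """Content hash used to tie each output back to the exact prompt text."""
        return hashlib.sha256(self.text.encode("utf-8")).hexdigest()

def requires_change_control(old: PromptRecord, new: PromptRecord) -> bool:
    """Assumed rule: any change to prompt text, model identity, or sampling
    parameters is a change to executable logic and needs formal change control;
    metadata-only edits (e.g., a new approver) do not."""
    return (old.text != new.text
            or old.model_id != new.model_id
            or old.model_version != new.model_version
            or old.temperature != new.temperature)
```

A frozen dataclass is used deliberately: a released prompt version is immutable, and any revision produces a new record with a new version under change control.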

2. Require Secondary Human Review for High-Risk, Critical Actions

Not every AI-assisted output carries the same risk. A generative AI tool summarizing meeting notes operates in a fundamentally different risk tier than one analyzing clinical pharmacology data or supporting a batch disposition decision. Organizations should establish risk-based tiering that scales human oversight to the consequence of the AI-assisted action.

For high-risk applications, those where AI outputs influence patient safety, regulatory submissions, or GMP/GLP/GCP decision-making, a secondary human review should be mandatory. This is consistent with the human-in-the-loop (HITL) principle that FDA’s framework increasingly expects: the human is responsible for the decision, not the tool. The AI output is decision support, not decision authority. HITL review must be auditable, with documented evidence including reviewer identification, timestamp, review criteria applied, and, critically, override rationale when the human judgment differs from the AI recommendation. Without this evidence trail, the organization cannot demonstrate that a qualified human actually evaluated the output before it entered a regulated record or informed a regulated decision.6
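The evidence trail described above can be enforced programmatically. The following is a minimal sketch, under assumed field names, of a review record that refuses to accept incomplete HITL evidence, in particular a missing override rationale when the human decision departs from the AI recommendation.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class HITLReview:
    """Illustrative evidence record for a secondary human review of an AI output."""
    reviewer_id: str                     # identity of the qualified reviewer
    reviewed_at: str                     # ISO-8601 timestamp of the review
    criteria_applied: List[str] = field(default_factory=list)
    ai_recommendation: str = ""          # what the tool proposed
    human_decision: str = ""             # what the accountable human decided
    override_rationale: Optional[str] = None  # required when decisions differ

def validate_review(review: HITLReview) -> bool:
    """Reject review records lacking the evidence an inspector would need."""
    if not review.reviewer_id or not review.reviewed_at:
        raise ValueError("Reviewer identity and timestamp are mandatory.")
    if not review.criteria_applied:
        raise ValueError("Review criteria must be documented.")
    if (review.human_decision != review.ai_recommendation
            and not review.override_rationale):
        raise ValueError("Override rationale is required when the human "
                         "decision differs from the AI recommendation.")
    return True
```

Validation at record-creation time, rather than at audit time, is the "build quality in" principle applied to the review process itself.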

3. Extend AI Governance Policy to Cover Prompting and Generative AI Use

Many pharmaceutical organizations are developing AI governance policies, but few have extended those policies to address the specific challenges of generative AI and prompting in regulated environments. A comprehensive AI governance policy should define which generative AI tools are authorized for use in GxP contexts, who is permitted to use AI-generated outputs in regulated workflows, and what documentation is required before AI-assisted content enters a controlled record.

The policy should also establish that the human is the author of record for any regulated content, regardless of how that content was drafted. AI assistance does not transfer authorship or accountability. Verification is required before any AI-generated content enters a controlled record. These are not aspirational principles; they are the operational controls that give an AI governance policy regulatory teeth.

4. Make the Prompting Process and Its Output Self-Explanatory

When a regulatory inspector reviews a record that was generated or influenced by AI, the process that produced that record should be self-explanatory to the extent possible. This means the prompt itself, the model configuration, the input data context, and the resulting output should together tell a coherent, traceable story that a reviewer can follow without requiring the original author to narrate it.

This is directly analogous to what regulatory inspectors already expect of traditional validated systems: the documentation should stand on its own. The same standard must now extend to AI-assisted processes. If an inspector cannot reconstruct why an AI tool produced a particular result, what was asked, under what conditions, with what data, then the record lacks the transparency that regulatory credibility requires. Provenance must be readable by humans, not just machines.
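One way to make the record self-explanatory is to store a provenance manifest alongside every AI-generated output. The structure and field names below are assumptions for illustration; what matters is that the prompt, model identity, parameters, and input data references travel with the output as one human-readable record.

```python
import json
from datetime import datetime, timezone

def provenance_manifest(prompt_text, model_id, model_version,
                        parameters, input_refs, output_summary):
    """Bundle the questions an inspector will ask (what prompt, which model,
    what parameters, what data) into one record stored with the output."""
    return {
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "prompt": prompt_text,
        "model": {"id": model_id, "version": model_version},
        "parameters": parameters,
        "input_data_references": input_refs,
        "output_summary": output_summary,
    }

# Hypothetical usage; identifiers below are invented for illustration.
record = provenance_manifest(
    prompt_text="Summarize adverse event trends for Q1 by severity.",
    model_id="example-model",
    model_version="2026-01-15",
    parameters={"temperature": 0.0},
    input_refs=["AE-dataset-Q1-2026"],
    output_summary="Trend summary table, four severity categories.",
)
print(json.dumps(record, indent=2))  # human-readable rendering for review
```

Rendering the manifest as indented JSON keeps the provenance readable by humans, not just machines, which is the standard the section above sets.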

5. Build Credibility and Transparency into Both the Process and the Output

FDA’s January 2025 draft guidance on AI in regulatory decision-making places credibility and transparency at the center of its expectations. Organizations should not treat these as abstract ideals. They are operational requirements that must be demonstrated through evidence.1

Credibility means the organization can show that the AI system is fit for its intended use through validation or assurance controls, performance monitoring, and evidence that the outputs are reliable in the specific context of use. Transparency means the organization can show how the AI-assisted process works: what inputs drive the outputs, what parameters and model versions are in play, and what human oversight is applied.

The distinction between validation (system-level assurance over time) and credibility (decision-level trust for a specific context) is one that FDA’s framework introduces and that our industry must internalize: a validated system does not automatically produce credible outputs. Context of use matters. Risk-based confidence scaling applies: the higher the consequence and the greater the AI influence on the decision, the stronger the credibility evidence and assurance controls must be.
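Risk-based confidence scaling can be expressed as a simple matrix over the two dimensions the FDA framework emphasizes: how much the model influences the decision, and how consequential the decision is. The three-level scoring below is an illustrative assumption, not FDA's actual table.

```python
def required_assurance(model_influence: str, decision_consequence: str) -> str:
    """Map the two risk dimensions to an assurance tier (illustrative scoring)."""
    levels = {"low": 0, "medium": 1, "high": 2}
    score = levels[model_influence] + levels[decision_consequence]
    if score >= 3:
        return "high"    # e.g., full credibility assessment, monitoring, mandatory HITL
    if score == 2:
        return "medium"  # e.g., targeted verification plus periodic review
    return "low"         # e.g., least-burdensome unscripted testing
```

The exact tier boundaries and the controls attached to each tier would be defined in the organization's own AI governance policy; the sketch only shows that the mapping can be made explicit and auditable rather than left to case-by-case judgment.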

Charting New Waters

These are new waters to chart. The convergence of generative AI, code democratization, and pharmaceutical regulatory expectations creates a landscape that no existing framework fully addresses. Classical software development life cycle practices provide the governance foundation. Quality pioneers from Deming to Juran provide the principles. QbD, CSV, and CSA provide the validation discipline. But the specific challenges of AI-assisted development, prompt governance, model reproducibility, non-deterministic behavior, and context injection require our industry to extend its thinking into territory that has not yet been fully mapped.

The questions that pharmaceutical leaders should be asking today are not comfortable ones:

  • If a prompt determines how GMP data is analyzed or how a batch release decision is supported, is that prompt being governed as the regulatory artifact it functionally is?
  • If an AI model version changes and the output of a quality or laboratory tool changes with it, where is the change control?
  • When we encourage our teams to build tools through vibe coding, are we also ensuring they understand what “fit for use” means in a GxP environment?
  • Are we building quality into AI-assisted processes, or are we simply inspecting outputs and hoping for the best?
  • Would Deming recognize what we are doing as “building quality in,” or would he see the same inspection-dependent thinking he spent his career fighting?

The quality pioneers solved this kind of problem before. They taught us that quality is not a phase, not a department, and not a final check. It is a discipline that must be present from the first design decision to the last record produced. QbD proved it could work in drug manufacturing.8 The same intellectual rigor must now be extended to the software tools we build to support our regulated operations.

Generative AI changes the tools. It does not change the quality principles.

The pharmaceutical organizations that recognize this early and build quality into AI-assisted development before regulators require it will lead the next generation of innovation in our industry. The ones that do not will discover the lesson through 483 observations, failed submissions, and compromised data that could have been prevented.

Generative AI can write the code, but it cannot generate accountability. It cannot generate quality. That responsibility remains, as it always has, with us.


References

  1. FDA — Considerations for the Use of Artificial Intelligence to Support Regulatory Decision-Making for Drug and Biological Products (Draft Guidance, January 2025). https://www.fda.gov/regulatory-information/search-fda-guidance-documents/considerations-use-artificial-intelligence-support-regulatory-decision-making-drug-and-biological
  2. Crosby, Philip B. Quality Is Free: The Art of Making Quality Certain. McGraw-Hill, 1979.
  3. Juran, Joseph M., and A. Blanton Godfrey, editors. Juran’s Quality Handbook. 5th ed., McGraw-Hill, 1999.
  4. International Organization for Standardization. ISO 9000:2015: Quality Management Systems — Fundamentals and Vocabulary. 4th ed., 2015. https://www.iso.org/standard/45481.html
  5. FDA — Computer Software Assurance for Production and Quality System Software (Final Guidance, September 24, 2025). https://www.fda.gov/regulatory-information/search-fda-guidance-documents/computer-software-assurance-production-and-quality-system-software
  6. FDA — Data Integrity and Compliance With Drug CGMP: Questions and Answers (Final Guidance, December 13, 2018). https://www.fda.gov/regulatory-information/search-fda-guidance-documents/data-integrity-and-compliance-drug-cgmp-questions-and-answers
  7. eCFR — 21 CFR Part 11: Electronic Records; Electronic Signatures. https://www.ecfr.gov/current/title-21/chapter-I/subchapter-A/part-11
  8. ICH Q8(R2) — Pharmaceutical Development. International Council for Harmonisation, 2009.

Transparency Statement

This article was developed with the assistance of AI tools for research and drafting. The author has reviewed, edited, and takes full responsibility for all content.

Regulatory and Compliance Disclaimer

This article is provided for informational and educational purposes only and does not constitute regulatory guidance, legal advice, or an official interpretation of applicable laws, regulations, or guidance documents. Organizations remain solely responsible for determining regulatory applicability and compliance.

About the Author:

Ricardo Torres-Rivera, PMP, is the CEO and president of Xevalics Consulting, LLC, a Minneapolis-based firm specializing in computer systems validation (CSV), computer software assurance (CSA), data integrity, GLP compliance, and project management for regulated life sciences organizations. He chairs the SQA CVIC Steering Committee and the CVIC AI Subcommittee and is a recognized speaker and instructor in the quality and compliance community.