Showcase:
Gene Therapy Manufacturing
A Software/Cloud Application for NGS-based Product Testing
When a product of pharmaceutical research and subsequent manufacturing is represented by a DNA molecule, as is the case with gene therapy, the ultimate option for testing the integrity and purity of the product is next-generation (long-read) sequencing. This process must be aided by adequate software solutions.
Ribocon has developed a software application specifically for this task: within a few minutes, output data from the sequencing platform are analysed in a fully integrated process, resulting in a report which provides the answers you need.
The application is tailored to support your routine, but can of course also be configured to adapt the processing parameters to your specific data and requirements.
Input Data and General Workflow
The application is made for DNA sequence data in BAM format and its workflow includes the steps listed below. The term "Target" describes the product (e.g. a plasmid sequence carrying a specific insert) for which the analysis (test) is done.
- Computation of initial (raw) statistics.
- Statistical read reduction (sampling) to start with the optimum amount of data.
- Mapping of sampled reads against the Target reference and selected Contamination references.
- Initial filtering and computation of global statistics (incl. mapping/Contamination stats).
- Creation of a multiple sequence alignment (MSA) out of the Target-mapped reads after further quality filtering.
- Computation of positional frequencies for the MSA and creation of a single Consensus sequence.
- Computation of the overall identity between the Consensus and the Target reference.
- Creation and provision of final output files incl. detailed statistics (Reporting).
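The positional-frequency, Consensus, and identity steps listed above can be illustrated with a minimal Python sketch. It assumes the MSA is available as equal-length strings with "-" as the gap symbol and uses a plain majority vote per column; the function names are hypothetical, and the application's actual consensus rules (e.g. its configurable thresholds) are not reproduced here.

```python
from collections import Counter

GAP = "-"  # assumed gap symbol in the alignment rows


def positional_frequencies(msa_rows):
    """Return one Counter of symbol counts per alignment column."""
    if not msa_rows:
        return []
    width = len(msa_rows[0])
    if any(len(row) != width for row in msa_rows):
        raise ValueError("all MSA rows must have the same length")
    return [Counter(column) for column in zip(*msa_rows)]


def consensus(msa_rows):
    """Majority-vote consensus; columns won by the gap symbol are dropped."""
    winners = (
        counts.most_common(1)[0][0]
        for counts in positional_frequencies(msa_rows)
    )
    return "".join(symbol for symbol in winners if symbol != GAP)


def identity(aligned_a, aligned_b):
    """Fraction of identical columns between two aligned, equal-length sequences."""
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(aligned_a, aligned_b))
    return matches / len(aligned_a)
```

With three toy reads such as `["ACGT-A", "ACGTTA", "ACCT-A"]`, `consensus` keeps the majority base in each column and discards the column in which most reads carry a gap; `identity` then compares that result with an aligned reference of the same length.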
A set of 75k sequence reads with an average length of 10k bases (750M bases total), for example, is typically processed on a common business laptop within 5 minutes.
The entire process is completely automated; only the input, reference, and config data must be defined.
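The read reduction (sampling) step of the workflow could, for instance, be realised with reservoir sampling, which draws a uniform random subset in a single pass without knowing the total number of reads in advance. The Python sketch below is an illustration under that assumption only; the function name, the `seed` parameter, and the choice of algorithm are hypothetical and not taken from the application.

```python
import random


def sample_reads(reads, k, seed=0):
    """Reservoir sampling (Algorithm R): uniform sample of up to k items.

    `reads` may be any iterable, e.g. a stream of read records.
    A fixed seed makes the selection reproducible.
    """
    rng = random.Random(seed)
    reservoir = []
    for index, read in enumerate(reads):
        if index < k:
            # Fill the reservoir with the first k reads.
            reservoir.append(read)
        else:
            # Replace an existing entry with probability k / (index + 1).
            slot = rng.randint(0, index)  # both bounds inclusive
            if slot < k:
                reservoir[slot] = read
    return reservoir
```

If fewer than `k` reads are available, all of them are returned unchanged, so small data sets pass through the step without loss.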
Key Features
- Can currently be used "manually" as a common Linux command-line tool or embedded into an AWS environment to operate as a fully automated cloud service with an Amazon S3 data exchange interface.
- Target identity verification, contamination check, and detailed quality/data statistics.
- Results archive, with PDF reports and various supplemental output files (for more details, see Right Sidebar).
- Centralized configuration of analysis parameters via an application.properties file.
- Integrity control of input data and GxP-compliant data audit trail (for each read, its full analysis history is recorded).
The Target Identity Verification section of the application report: If a 100% target identity cannot be confirmed and the number of mismatches is limited, detailed background information is given for each mismatch position, as in the example shown. This helps to decide whether a real divergence is observed at a particular position or whether it is just caused by sequencing noise (often extremely hard to eliminate within homopolymeric regions of a target sequence). In the example shown, an additional C occurred at the listed position in 17.85% of the sequenced molecules. Because of the configured Variability Threshold of 85%, the additional base (not present in the target) was accepted for creation of the Consensus sequence. However, the regional information presented clearly suggests an artifact caused by the homopolymeric character of the target region.
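Whether a mismatch position lies within a homopolymeric stretch of the Target can be checked with a few lines of Python. This is a minimal sketch for illustration: the function names and the `min_length` cut-off of 4 bases are assumptions, not parameters of the application.

```python
def homopolymer_run(sequence, position):
    """Return (start, end, base) of the run covering a 0-based position.

    `end` is exclusive, so the run length is end - start.
    """
    if not 0 <= position < len(sequence):
        raise IndexError("position outside sequence")
    base = sequence[position]
    start = position
    while start > 0 and sequence[start - 1] == base:
        start -= 1
    end = position + 1
    while end < len(sequence) and sequence[end] == base:
        end += 1
    return start, end, base


def in_homopolymer(sequence, position, min_length=4):
    """True if the position sits in a run of at least min_length equal bases.

    min_length=4 is an arbitrary illustrative cut-off.
    """
    start, end, _ = homopolymer_run(sequence, position)
    return end - start >= min_length
```

For the toy sequence `"ATGCCCCCGTA"`, position 5 falls into the run of five C bases, so an extra C reported there would be a candidate for a sequencing artifact rather than a real divergence.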
Compliance Status
We are operating the application as a cloud service for GMP-compliant data analysis in product testing for one of the biggest pharmaceutical companies in the world. To reach the validation status requested by the regulatory agencies, the service has been subjected to Computerized System Validation (CSV), following GAMP 5 standards. Our activities are based on a quality management system in accordance with ISO 9001, which we operate at the corporate level, as well as fifteen years of experience in CSV for GMP-compliant software/service development and operation.
This means that even if you intend to use the Linux command-line tool only for research and development purposes, for example, you will still benefit from its GMP quality grade.
