Showcase:
Microbial ID in QC Laboratories

Our GxP-compliant dblast System —
Software as a Service for rRNA Gene-based Microbial Identification

dblast is our Software as a Service (SaaS) for the GxP-compliant identification of Bacteria and Fungi based on ribosomal RNA gene sequences.

Query sequences are automatically transferred from your Sanger sequencing platform (one pair of forward and reverse reads per specimen) to the secure dblast service and analyzed immediately. After logging in to the web interface of the service, users can view, annotate, and comment on the results of a project (N specimens). This initial annotation must then be reviewed by a user in a different role (the service enforces the dual control principle). The entire workflow is supported by electronic signatures that users must provide to complete their role-specific steps. Once a project has received final approval, a PDF/JSON report with specimen annotations, data audit trail, signature information, and service identification information is automatically downloaded to the customer environment for supplier-independent archiving.
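The dual control principle described above can be illustrated with a minimal sketch. The class name, status values, and method names are hypothetical and chosen for illustration only; they do not describe the actual dblast implementation.

```python
from typing import List, Tuple


class Project:
    """Tracks the signature state of one project (hypothetical model)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.status = "analyzed"  # assumed initial state after automatic analysis
        self.signatures: List[Tuple[str, str]] = []  # (role, user)

    def sign_annotation(self, user: str) -> None:
        """First signature: the annotating user completes the annotation."""
        if self.status != "analyzed":
            raise RuntimeError("annotation has already been signed")
        self.signatures.append(("annotator", user))
        self.status = "annotated"

    def sign_review(self, user: str) -> None:
        """Second signature: must come from a different person."""
        if self.status != "annotated":
            raise RuntimeError("project is not awaiting review")
        annotator = self.signatures[0][1]
        if user == annotator:
            raise PermissionError("reviewer must differ from annotator")
        self.signatures.append(("reviewer", user))
        self.status = "approved"
```

In this sketch a project only reaches the "approved" state after two signatures from two different users, which is the essence of dual control.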

Key Features

  • Microbial identification based on rRNA gene sequences, with the highest achievable accuracy and extensive user assistance.
  • Nearly 20k reference species for Bacteria and selected Fungi.
  • Up-to-date nomenclature and regular addition of new species.
  • Unique strand orientation detection for incoming read pairs and other advanced quality control/filtering mechanisms.
  • Consensus sequence creation by merging forward and reverse read pairs, followed by consensus identification.
  • Electronic workflow for aided result evaluation, subsequent review of this process, and associated signatures.
  • Customer admin function for user and role management in your team.
  • Full 21 CFR Part 11 compliance; years of use within GMP-regulated environments.

Specimen Information View: Provides basic information for a selected specimen within a multi-specimen project, including details on the forward/reverse read evaluation and consensus creation performed by the service.

The eSignature Interface: The screenshot shows the project status after user annotation and review by a second person. The reviewer is about to complete the acceptance process by signing the report.

The dblast Reference Database and Reporting Approach

The dblast reference database consists of two subsets of ribosomal RNA gene sequences, which are searched and reported in parallel for each consensus sequence generated by the service:

  • All validly described bacterial species (full length 16S rRNA gene).
  • Selected fungal species of taxa relevant for industrial quality control (D2 region of the 28S rRNA gene).

The dblast reference database follows the principle of providing a single reference sequence for each validly described bacterial species, which is only accepted if it originates from one of its public type strains and, if possible, has been taken from a corresponding genome assembly.
We extract the relevant "raw data" from the public (sequence) repositories and operate an internal workflow for data integration and intensive quality control/filtering.

This concept offers three outstanding advantages:

  1. Absence of redundancy (per species) allows for unambiguous identification and easy results assessment.
  2. Only type strains guarantee that reference sequences carry correctly assigned species names.
  3. Public repositories ensure scientifically comprehensive and up-to-date data.
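The one-reference-per-species principle can be sketched as a small selection routine. This is an illustration under stated assumptions, not the internal dblast workflow: the `Candidate` record, its field names, and the tie-breaking rule (keep the first qualifying candidate unless a later one comes from a genome assembly) are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Candidate:
    species: str
    accession: str               # placeholder identifier, not a real accession
    from_type_strain: bool
    from_genome_assembly: bool


def pick_references(candidates: Iterable[Candidate]) -> Dict[str, Candidate]:
    """Keep exactly one type-strain sequence per species,
    preferring sequences taken from a genome assembly."""
    chosen: Dict[str, Candidate] = {}
    for cand in candidates:
        if not cand.from_type_strain:
            continue  # non-type-strain sequences are never accepted
        current = chosen.get(cand.species)
        if current is None:
            chosen[cand.species] = cand
        elif cand.from_genome_assembly and not current.from_genome_assembly:
            chosen[cand.species] = cand  # upgrade to the assembly-derived sequence
    return chosen
```

A species for which no type-strain sequence is available simply does not appear in the result, which mirrors the acceptance rule stated above.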

The preparation and qualification of the dblast reference database is subject to a quality management process that complies with industrial GxP requirements. In conclusion, the dblast service is directly in line with our mission to make the status quo of microbial systematics available for industrial applications.

Reporting: For maximum clarity and easy assessment, the dblast service does not simply display its results as a list of the best N hits, but organizes the hit list as described below.
In this context, the service follows the concept of a threshold of 99% sequence identity for reliable rRNA gene-based microbial species identification, combined with a cut-off value of 97% for genus assignment.

  • If the best match has 100% identity: All other 100% matches (if any) are listed, followed by the first match <100%.
  • If the best match has ≥99% to <100% identity: All matches within this range are shown, followed by the first match <99%.
  • If the best match has ≥97% to <99% identity: All matches within this range are shown, followed by the first match <97%.
  • If the best match has <97% identity: The best 20 matches are shown.
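The four reporting rules above can be expressed as a short function. This is a minimal sketch: the function name, the `(name, identity)` tuple layout, and the assumption that identity values arrive as percentages are illustrative and not taken from the service itself.

```python
from typing import List, Tuple

Hit = Tuple[str, float]  # (reference species name, percent identity)


def organize_hits(hits: List[Hit], max_low: int = 20) -> List[Hit]:
    """Select the hits to report according to the identity band of the best match."""
    ranked = sorted(hits, key=lambda h: h[1], reverse=True)
    if not ranked:
        return []
    best = ranked[0][1]
    if best < 97.0:
        # Below the genus cut-off: show the best 20 matches.
        return ranked[:max_low]
    # Lower bound of the band that contains the best match.
    if best >= 100.0:
        lower = 100.0
    elif best >= 99.0:
        lower = 99.0
    else:
        lower = 97.0
    in_band = [h for h in ranked if h[1] >= lower]
    below = [h for h in ranked if h[1] < lower]
    # All matches within the band, plus the first match below it (if any).
    return in_band + below[:1]


example = [
    ("Bacillus cereus", 100.0),
    ("Bacillus thuringiensis", 100.0),
    ("Bacillus mycoides", 99.8),
    ("Bacillus subtilis", 95.0),
]
report = organize_hits(example)
```

For the example list, `report` contains the two 100% matches followed by the single 99.8% match; the 95.0% hit is not reported.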

Search Results View for a selected Specimen: There are matches within the Bacteria subset and the results are organized as described above. Details are opened for the best hit. The "Final Result" field must be filled in by the user (supported by the service to minimize effort). In this case, it is a clear identification at the species level ("Bacillus cereus") due to the 100% sequence identity between the generated consensus and the dblast reference sequence.

What is the Meaning of 'dblast'?

The name dblast stands for 'diagnostic blast' and briefly describes (i) how the service works from a technical perspective and (ii) what you can expect in terms of the quality of its results.

When you read 'blast', you probably think of the web service that allows you to quickly search and compare your own sequences online against the GenBank databases. More specifically, however, BLAST (Basic Local Alignment Search Tool) is a generic toolbox for local installation based on a specific algorithm. The original BLAST manuscript is one of the most-cited papers in scientific literature, and for more than two decades the toolbox has been considered an industry-standard workhorse. We therefore decided to apply selected BLAST components in the dblast core module for robust initial sequence alignment and search. However, the dblast service does not provide a standard BLAST output to you. Parameters have been optimized for searching rRNA gene reference databases, and an advanced post-processing routine filters the output for potentially invalid hits (e.g., based on partial alignments) and also categorizes the results further (see the reporting details in the section above).
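One way to picture the filtering of partial alignments is a query coverage check. The sketch below is an assumption for illustration: the `Alignment` record, the coverage criterion, and the 0.95 threshold are hypothetical, since the actual dblast post-processing rules are not described here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Alignment:
    reference: str
    query_start: int  # 1-based, inclusive position on the query
    query_end: int    # 1-based, inclusive position on the query
    identity: float   # percent identity over the aligned region


def drop_partial(alignments: List[Alignment], query_length: int,
                 min_coverage: float = 0.95) -> List[Alignment]:
    """Discard hits whose alignment covers too little of the query.

    A local alignment can show 100% identity over a short stretch only,
    so identity alone is not sufficient to accept a hit.
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    kept = []
    for aln in alignments:
        covered = aln.query_end - aln.query_start + 1
        if covered / query_length >= min_coverage:
            kept.append(aln)
    return kept
```

In this sketch, a hit with 100% identity over only 200 of 1000 query positions would be removed, while a near-full-length hit with slightly lower identity would be kept.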

In combination with the high-quality reference database (see section above), the dblast service offers extremely reliable species identification compared to the non-optimized web service offered for the public repositories. For this reason, we decided on the extension 'diagnostic' to express the potential of the service.

Target Environments

The service was originally developed for laboratories that rely on extremely reliable identification results (species level) based on the analysis of ribosomal RNA gene sequences and that operate in regulated environments such as quality control laboratories in the pharmaceutical industry. However, the service can of course be utilized in any environment that wishes to benefit from the quality level offered.

Compliance Status

To reach the validation status required by regulatory agencies, the service has been subjected to Computerized System Validation (CSV) following GAMP 5 standards. Our activities are based on a quality management system in accordance with ISO 9001, which we operate at the corporate level, as well as fifteen years of experience in CSV for GMP-compliant software/service development and operation.