The Genetic Link

Welcome to The Genetic Link, a blog providing new insights into DNA and RNA sample collection by DNA Genotek. DNA Genotek is a subsidiary of OraSure Technologies, Inc.

Article by: Mike Tayeb

The impact of bacterial DNA in saliva on whole genome sequencing


Mike Tayeb is the manager of technical support at DNA Genotek. 

Several months ago, our Vice President of Research and Development, Rafal Iwasiow, wrote a blog post discussing the performance of DNA extracted from Oragene/saliva samples in next-generation sequencing applications. In that post, he mentioned validations that demonstrate the suitability of saliva for sequencing that targets particular regions of the genome, such as the whole exome or the major histocompatibility complex (MHC). He also discussed a study conducted by Dr. Cory McLean of 23andMe on the utility of Oragene/saliva for whole genome sequencing (WGS).

As you may already know, the majority of DNA in saliva comes from white blood cells; however, human saliva also contains bacteria. When DNA is extracted from saliva, some of that bacterial DNA is recovered along with the human DNA. Compared to other oral sampling methods, such as buccal swabs or mouthwash, saliva collected with Oragene contains substantially less bacterial DNA, with a mean bacterial DNA content of approximately 11%. What I’d like to focus on here is a question that many researchers have about using saliva for WGS: what is the impact of bacterial DNA in saliva when sequencing the entire genome?

Recently, we conducted a preliminary study to help answer this question. We collected saliva samples using Oragene from a large group of volunteers and extracted the DNA using prepIT•L2P, an ethanol precipitation-based procedure. Three of the individuals also provided blood samples to serve as controls, and DNA was isolated from these using a Qiagen QIAamp DNA Blood Mini Kit. All the DNA samples were quantified using PicoGreen, and the bacterial DNA content was determined using qPCR. From the large set of saliva samples collected, we selected 21 samples that spanned a wide range of bacterial DNA content, from approximately 5% to 40%. (It is important to note that samples with bacterial content as high as 40% are rare; however, we wanted to include some of these “worst case scenario” samples in this study. Please see this white paper for a more complete discussion about bacterial DNA in saliva samples.)

Sequencing libraries were prepared using Illumina TruSeq DNA kits and barcoded paired-end adapters. The libraries were sequenced on the Illumina HiSeq2000, multiplexed to 8 samples per lane on the flow cell. Alignment of the generated sequences with the reference was performed using Illumina’s CASAVA software.

Not surprisingly, we found a direct correlation (R² = 0.8) between the amount of bacterial DNA in the samples and the proportion of reads that did not map to the human reference genome (see the chart below). For the blood samples, which were confirmed via qPCR to contain virtually no bacterial DNA, an average of 4% of reads did not align to the human reference, and this value was used as a background correction for all samples. For the Oragene/saliva samples, an average of 5.3% of the total reads were unmapped after background correction. In the “worst case scenario” sample (bacterial content of 40.3%), 9.9% of the reads did not map to the reference. Interestingly, for all samples, the proportion of unmapped reads is lower than the percentage of bacterial DNA estimated by qPCR.

[Chart: WGS saliva chart]
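For readers who want to run the same kind of check on their own samples, here is a minimal sketch of the two calculations mentioned above: subtracting the blood-derived background from each sample's unmapped-read percentage, and computing R² for a simple linear fit. The post does not describe exactly how the correction was calculated, so plain subtraction is an assumption here. The function names are hypothetical, and the sample values are placeholders for illustration, not data from this study.

```python
from statistics import mean


def background_corrected(unmapped_pct, background_pct):
    """Subtract the control-derived background from an unmapped-read percentage.

    Assumption: the correction is a plain subtraction.
    """
    return unmapped_pct - background_pct


def r_squared(xs, ys):
    """Coefficient of determination for a least-squares line through (xs, ys)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)


# Average unmapped percentage in the blood controls, as reported in the post.
BACKGROUND_PCT = 4.0

# Placeholder values for illustration only; NOT data from the study.
bacterial_pct = [5.0, 10.0, 20.0, 40.0]      # bacterial DNA content by qPCR
raw_unmapped_pct = [6.0, 7.0, 10.0, 14.0]    # unmapped reads before correction

corrected = [background_corrected(u, BACKGROUND_PCT) for u in raw_unmapped_pct]
print("Corrected unmapped %:", corrected)
print("R^2:", round(r_squared(bacterial_pct, corrected), 3))
```

Because the background is a constant, subtracting it shifts every point by the same amount and does not change R²; it only changes where the fitted line crosses the axis.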

These results indicate that although there is some reduction in the proportion of reads that map to the human reference, that decrease is small, even in samples containing an unusually high amount of bacterial DNA.

We are currently conducting further whole-genome sequencing validation studies of Oragene/saliva samples. These include a more thorough investigation of the relative performance of blood- and Oragene/saliva-derived DNA, using paired samples from the same donor, and of the effect on variant calling. We expect to have results later this year. If you have any unanswered questions about Oragene/saliva samples and next-generation sequencing (either whole genome or targeted), please email us at 
