At the recent American Society of Human Genetics (ASHG) meeting in San Francisco, I attended numerous presentations describing the successful use of saliva for next-generation sequencing, both exome/targeted and whole genome. For example, in Dr. Kitzman’s presentation entitled “Non-invasive whole-genome sequencing (WGS) of a human fetus”, saliva was used to sequence the father’s genome. The data presented at ASHG were in line with all the validation work we have done and continue to do at DNA Genotek. We have previously reported exome sequencing results comparing paired blood and Oragene/saliva samples. These reports clearly demonstrate that saliva is an excellent source of genomic DNA for exome sequencing. Similarly, we used technology from RainDance Technologies to target the MHC region, which was then sequenced, allowing us to make accurate HLA calls using saliva as the sample type. Complete Genomics has also demonstrated that saliva is a good source of DNA for WGS and that the bacterial content is not an issue when interpreting the data. These studies build on data we previously generated using microarrays to demonstrate the utility of saliva as a reliable source of DNA.
But this year at ASHG, one poster in particular caught my attention. Presented by Dr. Cory McLean of 23andMe, it described WGS of 50 saliva samples. These samples were collected using Oragene and archived at room temperature for up to 26 months before purification. Dr. McLean’s work focused on the LRRK2 G2019S mutation in a Parkinson’s disease cohort, and I encourage you to have a look at his poster.
What I’d like to focus on here is the technical information from that poster. The DNA extracted from these archived Oragene/saliva samples was sequenced on the Illumina GAIIx to a median depth of 44.9-fold coverage, covering 97.8–98.2% of the genome. After identifying the variants in these samples, Dr. McLean compared the results to data previously obtained from the same cohort using a genotyping array and observed 99.91–99.97% concordance, indicating that Oragene/saliva samples provide consistent results across different technology platforms. The question that many researchers continue to ask is: what impact does bacterial content from saliva have on WGS? We’ve clearly demonstrated that bacterial content is a non-issue when performing exome or targeted sequencing. Complete Genomics demonstrated that bacterial content has no impact on performance when doing WGS but, as expected, a certain percentage of reads don’t map back to the human reference. Dr. McLean tells me that the length of time samples were stored at room temperature does not correlate with the percentage of aligned reads (r² = 0.0013). This can be attributed to the bacteriostatic properties of the Oragene chemistry. DNA Genotek recently completed a study of 24 samples examining the impact of bacterial content on WGS. These data are still being analyzed and will be presented in more detail at a later date but, in brief, when sequencing saliva samples on the Illumina HiSeq we observe on average < 10% of reads not aligning to the human reference. In comparison, when sequencing blood, on average 4% of reads do not align to the human reference.
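To put those unaligned-read percentages in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes, as a simplification that none of the studies above state, that unaligned reads reduce human coverage in direct proportion. The function names and the 40x target are illustrative choices of mine, and the saliva figure uses 10% as an upper bound for "< 10%".

```python
# Back-of-the-envelope sketch: how unaligned reads affect human coverage.
# Assumption (not from the studies): coverage scales linearly with the
# fraction of reads that align to the human reference.

SALIVA_UNALIGNED = 0.10  # upper bound for the "< 10%" average quoted above
BLOOD_UNALIGNED = 0.04   # average quoted above for blood


def aligned_depth(raw_depth, unaligned_fraction):
    """Depth left on the human reference after discarding unaligned reads."""
    return raw_depth * (1.0 - unaligned_fraction)


def raw_depth_needed(target_depth, unaligned_fraction):
    """Raw depth required so that the aligned depth reaches the target."""
    return target_depth / (1.0 - unaligned_fraction)


if __name__ == "__main__":
    target = 40.0  # illustrative target depth of aligned coverage
    for label, fraction in (("saliva", SALIVA_UNALIGNED), ("blood", BLOOD_UNALIGNED)):
        needed = raw_depth_needed(target, fraction)
        print(f"{label}: about {needed:.1f}x raw depth for {target:.0f}x aligned")
```

Under these assumptions, reaching 40x of aligned coverage would take roughly 44.4x of raw sequence at 10% unaligned reads and roughly 41.7x at 4%.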
So what impact does all of this have on your overall project? Less than you may think, once you take into consideration sample acquisition and purification, library prep, sequencing, and analysis. Here is how I see the costs breaking down per sample. When recruiting patients for a study, you might spend $50–100 for collection, shipping, and purification ($50 for saliva and $100 for blood, due to the increased shipping costs and the incentive required for an invasive sample), and don’t forget that compliance rates are dramatically higher with saliva than with blood (Rylander-Rudqvist T et al. Quality and quantity of saliva DNA obtained from the self-administrated oragene method--a pilot study on the cohort of Swedish men. Cancer Epidemiol Biomarkers Prev. 15(9):1742-1745 (2006).). Next, you will probably use a service provider to sequence your sample to a depth of coverage > 40x at a cost of $5000. That leaves analysis, the cost of which can vary greatly, but I’d estimate $2000–$10,000. You will quickly realize that although on average < 10% of the sequencing reads don’t map back to the human reference, your service provider has still exceeded the > 40x specification of their service, and the quality of sequence is indistinguishable between saliva and blood samples. In fact, because saliva is more economical to use (easier to recruit, cheaper to collect and ship) and the impact on sequencing is so low, your study will benefit overall from using Oragene/saliva samples.
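The per-sample figures above can be added up in a few lines of Python. This is only a sketch built on my rough estimates from this post; the names are illustrative, and the numbers will differ for your own project.

```python
# Rough per-sample estimates in USD, taken from the figures quoted in this post.
COLLECTION = {"saliva": 50, "blood": 100}  # collection, shipping, purification
SEQUENCING = 5000                          # service provider, > 40x coverage
ANALYSIS_RANGE = (2000, 10000)             # varies greatly between projects


def total_cost_range(sample_type):
    """Return the (low, high) estimated total cost per sample in USD."""
    analysis_low, analysis_high = ANALYSIS_RANGE
    base = COLLECTION[sample_type] + SEQUENCING
    return base + analysis_low, base + analysis_high


if __name__ == "__main__":
    for sample_type in COLLECTION:
        low, high = total_cost_range(sample_type)
        # Collection's share is largest when analysis is at its cheapest.
        share = COLLECTION[sample_type] / low
        print(f"{sample_type}: ${low:,} to ${high:,} per sample "
              f"(collection is at most {share:.1%} of the total)")
```

With these estimates, a saliva sample comes to $7,050 to $15,050 and a blood sample to $7,100 to $15,100. Collection is a small part of the per-sample total, and saliva still comes out $50 cheaper per sample before accounting for the higher compliance rates.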
Thinking about trying non-invasive saliva samples for WGS or exome sequencing? We can provide evaluation kits for you.