2015-01-27
I’m sure you are familiar with the expression “Garbage in, Garbage out”. Popular in the field of computer science, it implies that incorrect or low quality input will result in correspondingly poor output. The mere existence of a result is no guarantee of its quality or accuracy, just as cooking with the wrong ingredients will not produce the dish you intended.
Nowhere is the notion of “Garbage in, Garbage out” more relevant than in the field of genetics. Innovations in sequencing technology have enabled the field to generate an enormous amount of data for each individual or species being genotyped. While much focus is placed on generating this data and on effectively storing, analysing and sharing the information produced, very little attention is given to ensuring that the underlying sample is of the highest quality and that it has not changed between collection and analysis. We should ask: what is the point of analysing data that may be flawed at its root? What is the impact of basing findings on potentially inaccurate data?
Ultimately, genetic findings can only be as good as the biological samples used. Yet by the time samples are analysed, they are often no longer truly representative of their state at the time of collection. This can be caused by errors in the collection protocol, or by changes to the sample between acquisition and analysis brought on by time, temperature fluctuations or processing, and can result in false or inaccurate data. Simply put, you must get your research “right from the start”, because better samples lead to better results.
In an episode of the Mendelspod Life Sciences podcast, biospecimen specialist Dr. Carolyn Compton observed that although the lack of reliable samples is a major issue in biological research, many researchers remain essentially unaware of the problem. Dr. Compton theorized that this lack of awareness could stem from many researchers working so far downstream that they do not consider possible upstream changes to the biospecimen. There is also a common, mistaken belief that a sample that produces results may be presumed to be correct, despite the potential for changes to occur between collection and analysis. This assumption is a startling example of Dr. Compton’s observation that “the technological capacity exists to produce low quality data from low quality analytes with unprecedented efficiency”.
While most researchers appreciate the technological progress in downstream sample analysis, improvements in sample collection and storage are not as widely known or sought after. Even among those who acknowledge that samples may undergo changes between collection and analysis, the traditional response has been to accept this as an unfortunate, inevitable fact and to either ignore these changes or attempt to account for them. Instead, we should ask whether there is a better way and seek a method that optimizes each sample, preventing these changes from occurring and removing them from the equation entirely.
The issue of reliability from sample collection to results is prevalent throughout the biological sciences. A survey conducted by the National Cancer Institute’s Biorepositories and Biospecimen Research Branch in Rockville, MD, found that 70% of scientists polled rated it “somewhat difficult” to “very difficult” to obtain the quantity and/or quality of biospecimens they needed.
The same survey found that 60% of respondents “occasionally” to “always” question their data due to the quality of their biospecimens, and over 80% at least “sometimes” limit the scope of their work due to the quality and availability of biospecimens. These statistics demonstrate the reality that many researchers cannot carry out their ideal projects or be confident in their outcomes due to the calibre of available biological samples.
As a company, our focus is to ensure the highest quality sample for downstream analysis. In the coming months, we will be presenting a series of blog posts discussing sample-related challenges and how DNA Genotek provides a seamless path to collecting and maintaining high quality samples.
To begin this series, let’s discuss the impact of the sample on discoveries within human genetics, animal genetics, microbiome analysis and infectious disease.
Genetics: Access and Scale
The ready acquisition of statistically relevant, reliable samples for analysis is the foremost challenge in the fields of human and animal genetics. The samples collected must contain DNA of sufficient quantity and quality to support complex downstream applications such as microarrays and next generation sequencing.
Obtaining the requisite number of samples can also be a considerable hurdle for a study. A research project may require thousands of samples to achieve statistical significance, and it can be difficult to recruit donors if they are required to travel to a clinic and have blood drawn when there is no immediate benefit to them. This is especially true when a study involves a specific, targeted demographic. Any steps taken to increase compliance will benefit the study in terms of both the rate at which the necessary number of samples can be collected and the accuracy of the data produced.
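To give a rough sense of scale, here is a minimal, illustrative sample-size calculation in Python. It is not from the original post; the allele frequencies, power target and significance threshold are hypothetical assumptions chosen only to show why a genetic association study can require thousands of participants:

    # Hypothetical power calculation for a case-control genetic association study.
    # All numbers below are illustrative assumptions, not values from the post.
    from statsmodels.stats.power import NormalIndPower
    from statsmodels.stats.proportion import proportion_effectsize

    # Assume a risk-allele frequency of 25% in cases vs. 20% in controls
    effect_size = proportion_effectsize(0.25, 0.20)

    # Participants needed per group for 80% power at a genome-wide
    # significance threshold of 5e-8 (two-sided test)
    n_per_group = NormalIndPower().solve_power(
        effect_size=effect_size, power=0.80, alpha=5e-8, alternative="two-sided"
    )
    print(f"Roughly {n_per_group:.0f} participants needed per group")

Under these assumptions the calculation lands in the low thousands per group, which is why anything that improves donor recruitment and compliance has such a direct impact on a study’s feasibility.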
Gathering samples can be further complicated by the need for an expert to be on hand to collect the sample from willing donors. Studies have demonstrated that offering participants an easy-to-use, non-invasive method for self-collection anywhere and on their own schedule can dramatically improve compliance rates. Not requiring an expert to be present during collection offers the flexibility to obtain samples outside of a clinic, even in remote areas via the mail. This allows you to design and conduct your ideal study instead of having to adapt your study to a limited number of available, high quality samples.
Microbiome: “Snapshot” the Sample
Recent research has shown that the gut microbiome may influence obesity, asthma, diabetes, cancer, autoimmune disorders and heart disease, while potentially impacting drug response, sleep patterns, mood, anxiety and other behaviours. Since the gut microbiome is made up of millions of combinations of microbial cells that can influence many aspects of human health, it is absolutely necessary that a microbiome sample be a truly accurate representation (“snapshot”) of the donor’s in vivo state at the time of collection.
The microbiome profile is highly susceptible to changes if it is handled improperly or exposed to unfavourable environmental conditions during the collection and shipping process. These changes to the microbiome profile could lead to missed or inaccurate discoveries. The only way to be completely assured that your data is accurate is to have the microbiome sample stabilized and protected from the moment of collection.
Since the microbiome is a relatively new area of research, its foundational knowledge is still being developed. If incorrect data is produced and used as a baseline in the early stages of microbiome discovery, the future of the science could be built on false assumptions. The risk of establishing an inaccurate baseline makes it all the more important that experts in the field focus on obtaining a true “snapshot” of the microbiome in order to achieve correct and reproducible results.
Infectious Disease: Optimizing Samples for Accurate Results
Advances in infectious disease diagnostics and treatment are driven by research in several areas, including pathogen-host interaction, biomarkers, pathogen epidemiology, public health and clinical research. No matter the area of infectious disease research involved, sample quality is fundamental to successful discoveries.
The research methods underpinning new discoveries must be as technically accurate and sensitive as possible to provide a clear and reproducible view of the biological targets and pathways being observed. Research outcomes can be impacted by sample quality and target stability, by the abundance of available target and the ability to access it, as well as by the sensitivity of methods used to detect and quantify the target.
Samples can be impacted by harsh environmental conditions, a disrupted cold chain and the time delays involved in shipping and storage. These conditions are to be expected when collecting samples in high-burden, low-resource settings, where much infectious disease research takes place. Low quality samples can produce false negatives, leading to inaccurate findings and conclusions.
Immediate stabilization of samples and targets at the point of collection ensures the highest quality starting material for critical research. Stabilization methods that fully preserve a sample for days at ambient temperature are optimal for studies that collect samples outside of the hospital or lab and require transport to a centralized location for analysis. Reliable ambient temperature stabilization enables researchers to collect, transport and store more high quality samples and eliminates the need for a cold chain. The resulting efficiency, both inside and outside the lab, decreases the cost and complexity of sample management, supports process batching and increases safety for medical and laboratory personnel.
In summary
Scientific discovery is only valuable if the data used to produce new knowledge is accurate. Technology has afforded us the ability to carry out research more quickly and on a larger scale than ever before—unfortunately, in the words of Dr. Carolyn Compton, it has also allowed us to “get the wrong answers with unprecedented speed.” It is absolutely vital that scientific discovery be accurate and reproducible in order to avoid false conclusions.
Collecting and processing reliable, trustworthy samples is the most important challenge facing genetic discovery. In future blog posts we will discuss how DNA Genotek is committed to overcoming this obstacle, along with the many other challenges the scientific community faces around sample collection, stabilization, transportation and preparation, to ensure your results are truly correct and reproducible, “right from the start”.