by Julie MacKinnon, Microbiome Services Manager
High throughput sequencing of the 16S ribosomal RNA gene has become increasingly popular in recent years. One of the most high-profile applications of this technology is in analysing and understanding the human gut microbiome, and this is a service that we provide at NCIMB to support our customers’ research and development work. Analysing stool samples may not sound like the most pleasant occupation, but the research being undertaken in this area, for example into the relationship between the microbiome, nutrition and human health, is so fascinating that it has become a surprisingly attractive area of work!
However, while many people now think of the term “microbiome” as being synonymous with the human gut microbiome, the definition is in fact much broader than that, and really applies to any population of microorganisms inhabiting a specific environment.
Consequently, in addition to sequencing the human gut microbiome, we have received samples for high throughput 16S sequencing from a highly diverse selection of industries and environments, including soils, oil and gas production facilities and raw materials for food production. This reflects the fact that in many different industry sectors, people are realising that an understanding of microscopic communities can yield information that can be used to inform decision making – relating to, for example, corrosion prevention, food quality or soil health.
When we receive enquiries, people often ask if we undertake “16S sequencing”, but several other phrases are sometimes used to describe this kind of analysis. For example, some of the most used terms are 16S metagenomics, next generation sequencing and amplicon sequencing. Not all the customers we deal with who are seeking this kind of analysis are themselves microbiologists or molecular biologists, and the terminology can quickly become confusing – so this post aims to demystify some of the most commonly used terms!
The term “16S sequencing” simply refers to DNA sequencing of the 16S ribosomal RNA gene. This gene is approximately 1500 base pairs long. Ribosomes exist within all cells, and their function is to translate the instructions encoded within DNA to assemble proteins. In bacteria, ribosomes are composed of a large subunit and a small subunit. 16S ribosomal RNA is part of the small subunit, and the gene that encodes 16S ribosomal RNA – 16S rDNA – is often sequenced to identify bacteria. At NCIMB we undertake 16S rDNA sequencing by two different methods for different purposes: Sanger sequencing and high throughput/ next generation sequencing.
Sanger sequencing, also known as the “chain termination method”, was developed by Frederick Sanger in 1977. At NCIMB, we use this sequencing method to identify individual isolates of bacteria, by sequencing either the first 500 base pairs of the 16S ribosomal RNA gene, or the full gene. This analysis is often requested by pharmaceutical manufacturers carrying out environmental monitoring of their production facilities, or food and drink manufacturers who have isolated a contaminant from a production line.
High throughput sequencing/ next generation sequencing
High throughput sequencing, also known as “next generation sequencing”, is a more recently developed approach than Sanger sequencing that offers a much higher throughput, and it is the technique that has revolutionised microbiome research. This higher throughput, which is achieved through massively parallel sequencing technology, has been applied to sequencing the whole genomes of individual organisms, as well as to understanding the make-up of microbial communities.
Metagenomics, 16S metagenomics and metataxonomics
Strictly speaking, “metagenomics” is the analysis of the whole genomes of all organisms within a sample, and this can be undertaken using high throughput sequencing. However, the taxonomic make-up of bacteria within microbiome samples can be studied by sequencing just one section of the genome: part of 16S rDNA, the same gene that is sequenced for identification of bacterial isolates. Sometimes this is referred to as 16S metagenomics, or metataxonomics. Although the target is the 16S gene, the sections of the gene that are typically sequenced for this purpose – the “amplicons” – are a little different from those used for identification of isolates.
An amplicon is a section of DNA or RNA that is amplified – most commonly by polymerase chain reaction (PCR) – an analytical technique that the general population became very familiar with during the COVID pandemic! Before sequencing can be undertaken, the genetic material must be amplified, to create a sufficient quantity to work with. PCR is used to target and amplify the section of interest. When we undertake high throughput 16S sequencing to reveal the taxonomic make-up of microbiome samples, we can look at two different regions of the 16S gene, or “amplicons”:
- V1/V2 (variable regions 1 and 2)
- V3/V4 (variable regions 3 and 4).
The 16S ribosomal RNA gene includes nine variable regions interspersed between conserved regions. For comparison, the first 500 base pairs sequenced for the purpose of isolate identification include V1, V2 and part of V3.
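To make the idea of an amplicon a little more concrete, here is a minimal Python sketch of "in silico PCR": it finds the stretch of a template sequence bounded by a forward and a reverse primer. The template and both primers are invented for illustration and are not real 16S sequences or the primers used in any particular laboratory. The function names are also hypothetical. Real primers sit in the conserved regions either side of the variable regions, often contain degenerate bases and tolerate some mismatches, none of which is modelled here.

```python
# Toy in-silico PCR: find the stretch of a template bounded by two primers.
# ASSUMPTION: all sequences below are invented; they are not real 16S primers.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def extract_amplicon(template: str, forward: str, reverse: str):
    """Return the amplicon (primers included), or None if a primer has no exact match.

    The reverse primer anneals to the opposite strand, so on the template
    strand we search for its reverse complement, downstream of the forward primer.
    """
    start = template.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


# Hypothetical template: flank + forward site + target region + reverse site + flank
template = "TTGAC" + "CAGTCGG" + "ATTACGCATGCAAGGCTTAA" + "CGTACCGG" + "TTAGC"
forward_primer = "CAGTCGG"
reverse_primer = "CCGGTACG"  # written 5'->3' on the opposite strand

amplicon = extract_amplicon(template, forward_primer, reverse_primer)
print(amplicon)
```

Only the region between (and including) the two primer sites is returned, which mirrors the way PCR amplifies the section of interest rather than the whole gene.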
What the results look like
As you might expect, microbiome sequencing produces large amounts of data. However, we present the results of high throughput 16S sequencing in three different easy-to-understand formats: tables, bar plots and sunburst plots. The results show the complexity of the make-up of the samples at genus level, and the tables give the relative abundance of each genus within an individual sample and the combined abundance across a group of samples.
| Genus | Sample 1 abundance | Sample 1 relative abundance | Sample 2 abundance | Sample 2 relative abundance |
| Clostridium sensu stricto 1 | | | | |
Of course, some environmental samples may include genera for which no sequence data yet exists, but the tabulated data also presents the kingdom, phylum, class, order and family, depending on the published data available.
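As a rough illustration of how the figures in such a table relate to each other, the Python sketch below turns per-genus read counts into relative abundances for each sample and for the samples combined. It assumes that "abundance" means a read count and that "relative abundance" is a genus's share of the sample's total, expressed as a percentage. The counts, the additional genus names and the function names are all made up for the example and are not real results.

```python
# ASSUMPTION: hypothetical read counts per genus; the numbers are invented.
counts = {
    "Sample 1": {"Bacteroides": 600, "Clostridium sensu stricto 1": 150, "Faecalibacterium": 250},
    "Sample 2": {"Bacteroides": 300, "Clostridium sensu stricto 1": 500, "Faecalibacterium": 200},
}


def relative_abundance(sample_counts):
    """Express each genus count as a percentage of the sample's total count."""
    total = sum(sample_counts.values())
    if total == 0:
        return {genus: 0.0 for genus in sample_counts}
    return {genus: 100.0 * n / total for genus, n in sample_counts.items()}


def combined_counts(all_counts):
    """Add up the counts for each genus across every sample in the group."""
    combined = {}
    for sample_counts in all_counts.values():
        for genus, n in sample_counts.items():
            combined[genus] = combined.get(genus, 0) + n
    return combined


per_sample = {name: relative_abundance(c) for name, c in counts.items()}
group = relative_abundance(combined_counts(counts))

for name, abundances in per_sample.items():
    for genus, pct in abundances.items():
        print(f"{name}: {genus} = {pct:.1f}%")
for genus, pct in group.items():
    print(f"Combined: {genus} = {pct:.1f}%")
```

Because each percentage is relative to that sample's own total, samples with very different sequencing depths can still be compared side by side.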
The bar plots allow for an easy visual comparison of the difference in genera between samples – for example, between samples taken from different locations, or changes that have occurred over time.
Interactive sunburst plots can also be provided, and this format is useful for visualising the hierarchical relationships between organisms present in the sample analysed.
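The hierarchy behind a sunburst plot can be sketched in a few lines of Python: each genus count is added to every rank above it, so the inner rings (kingdom, phylum and so on) are simply the totals of the outer ones. The lineages below are illustrative and follow one commonly used naming scheme, so the names may differ between reference databases and versions. The counts are invented, and the function name and tree layout are assumptions rather than a description of how any specific plotting tool works.

```python
# ASSUMPTION: illustrative lineages (kingdom -> genus) with invented read counts.
records = [
    (("Bacteria", "Firmicutes", "Clostridia", "Clostridiales",
      "Clostridiaceae", "Clostridium sensu stricto 1"), 150),
    (("Bacteria", "Firmicutes", "Clostridia", "Clostridiales",
      "Ruminococcaceae", "Faecalibacterium"), 250),
    (("Bacteria", "Bacteroidetes", "Bacteroidia", "Bacteroidales",
      "Bacteroidaceae", "Bacteroides"), 600),
]


def build_tree(lineage_counts):
    """Build a nested tree in which every node holds the total count beneath it."""
    root = {"name": "root", "count": 0, "children": {}}
    for lineage, count in lineage_counts:
        node = root
        node["count"] += count
        for taxon in lineage:
            # Create the child node the first time this taxon is seen.
            node = node["children"].setdefault(
                taxon, {"name": taxon, "count": 0, "children": {}}
            )
            node["count"] += count
    return root


def print_tree(node, depth=0):
    """Print the tree with indentation showing the taxonomic rank."""
    print("  " * depth + f"{node['name']}: {node['count']}")
    for child in node["children"].values():
        print_tree(child, depth + 1)


tree = build_tree(records)
print_tree(tree)
```

Each ring of a sunburst plot corresponds to one level of this tree, with segment sizes proportional to the counts stored at that level.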