Next generation sequencing for analysing bacterial communities – an introduction

Next generation sequencing has transformed understanding of bacterial communities, and in particular the gut microbiome. Consequently, it is becoming increasingly popular in a variety of industry sectors as people begin to understand how they can use the data generated. However, it can be difficult to get to grips with the terminology and processes involved in next generation sequencing, so in this blog NCIMB’s microbiome services manager Julie MacKinnon takes a closer look at the process for analysis of mixed communities of bacteria. 

The application of next generation sequencing techniques to studying the gut microbiome has arguably resulted in a new era of understanding of human health and nutrition. It can also be an extremely useful technique in environmental microbiology. It’s a fascinating and exciting area of work to be involved in, and at NCIMB we have used the method for applications as diverse as analysis of the human gut microbiome, and assessing the risk of microbiologically influenced corrosion in marine structures.

The process of next generation sequencing and data analysis is quite complex, but it can be broken down into five basic steps, illustrated in figure 1 below.

Steps involved in 16S – next generation sequencing

The steps are: extraction of DNA from the sample, amplification of a section of a gene found in all bacteria, sequencing of that gene, use of a database to determine which species are present, and calculation of the relative proportions of each species.  

  1. DNA extraction: In order to sequence the DNA present in a sample it is necessary to first get it out of the bacterial cells. This process involves breaking open, or “lysing” the cells, and separating the DNA from the other cellular components.  
  1. 16S gene amplified by PCR: There are different approaches to sequencing the extracted DNA, but one of the most commonly used is to sequence sections of the 16S gene. This gene is common to all bacteria, as it is essential for cell function, but its exact sequence varies between different species of bacteria. These variations have been used to build the databases that are used to identify bacteria following sequencing. PCR (polymerase chain reaction) is essentially a method for making lots of copies of the DNA that was extracted in step one, in order to provide a sufficient quantity to work with. Specifically, it generates copies of the sections of the 16S gene targeted for identification. The 16S ribosomal gene includes nine variable regions interspersed between conserved regions. When we undertake high throughput 16S sequencing to reveal the taxonomic make-up of microbiome samples, we can look at two different regions of the 16S gene, or “amplicons”: V1/V2 (variable regions 1 and 2) and V3/V4 (variable regions 3 and 4).
Simplified representation of the 16S ribosomal gene: this diagram shows nine variable regions labelled V1-V9 interspersed between conserved regions shown in grey.
  1. PCR products are sequenced: The game-changing aspect of next generation sequencing, compared to previously available sequencing methods, is that sequencing is carried out in a massively parallel way. In other words, rather than sequencing one section at a time, millions of fragments of DNA can be sequenced simultaneously.
  1. Look up sequences against a database to assign species: The DNA sequences or “reads” generated in step three undergo some quality checking and filtering prior to running them against a database to determine the species present within the initial sample.
  1. Calculate the relative proportions of the species present: This is a semi-quantitative method. In other words, while the process cannot determine the number of cells of each species present, it can be used to determine their relative abundance, and results can be presented as bar plots indicating the most and least prevalent species in the original sample. This makes it easy to compare the prevalence of different groups of bacteria in different samples.
Example of a bar plot. The relative abundance of each genus is shown as a percentage of the total operational taxonomic units present in the sample.
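
For readers who like to see the arithmetic, the last two steps can be sketched in a few lines of Python. This is a minimal illustration rather than a description of our actual pipeline: the reads, the genus labels attached to them, the quality strings and the threshold of 20 are all made up for the example, the function names are hypothetical, and the quality strings are assumed to use the common Phred+33 encoding. In real work, the genus label for each read comes from comparing its sequence against a reference database with dedicated software.

```python
from collections import Counter


def mean_phred(quality_string, offset=33):
    """Mean Phred quality score of a read (Phred+33 encoding assumed)."""
    return sum(ord(ch) - offset for ch in quality_string) / len(quality_string)


def relative_abundance(assignments):
    """Percentage of reads assigned to each taxon."""
    counts = Counter(assignments)
    total = sum(counts.values())
    return {taxon: 100.0 * n / total for taxon, n in counts.items()}


# Hypothetical reads: (genus a database lookup would assign, quality string).
reads = [
    ("Bacteroides", "IIIIIIII"),
    ("Bacteroides", "IIIIHHHH"),
    ("Bacteroides", "IIIIIIII"),
    ("Faecalibacterium", "IIIIIIII"),
    ("Faecalibacterium", "!!!!!!!!"),  # very low quality, removed by the filter
    ("Bifidobacterium", "HHHHHHHH"),
]

MIN_MEAN_QUALITY = 20  # illustrative threshold, not a recommendation

# Step four (in part): keep only reads that pass the quality filter.
kept = [genus for genus, quality in reads if mean_phred(quality) >= MIN_MEAN_QUALITY]

# Step five: express each genus as a percentage of the reads that were kept.
abundance = relative_abundance(kept)

for genus, pct in sorted(abundance.items(), key=lambda item: item[1], reverse=True):
    print(f"{genus:<18}{pct:5.1f}%")
```

The percentages describe each genus's share of the reads that passed filtering, not the number of cells in the sample, which is why the method is described as semi-quantitative.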

For more information about any of our microbiome services, visit our sequencing web page or use the form below to get in touch!

Julie MacKinnon, Microbiome Services Manager
