Genomics: Best combinations between wet and dry biology

NGS and Genomics

Advances in next-generation sequencing (NGS) have made bacterial genomics a fast-growing field. NGS makes it possible to explore the genetic diversity of isolates from a bacterial population, to identify functions characteristic of pan-genomes, and to perform microbial source tracking. Two platforms are most widely used today: the Ion Torrent PGM from Life Technologies and the MiSeq from Illumina. Both require an amplification step that brings the target DNA up to a quantity the instrument can detect. They generate short reads of 100 to 400 nucleotides and are compatible with paired-end protocols. The short reads are then assembled to reconstruct the complete genome of the organism.

Bioinformatics

Several methods exist to perform this assembly. As when solving a puzzle, algorithms can rely on a model (a reference genome) to rebuild the genome. In this resequencing case, the bioinformatics strategy is called mapping and requires few computational resources. Otherwise, if the target organism has not been sequenced before, de novo assembly pipelines for short reads are used. This step remains complicated even as reads get longer, and its difficulty depends on the nature of the target genome (repeat regions, DNA accessibility, etc.). Various assemblers are available, such as Velvet, Ray, MIRA, SOAPdenovo and ABySS.
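To make the mapping idea concrete, here is a minimal sketch in Python. It assumes error-free reads and exact matching, which real mappers do not require; the function name `map_reads` and the toy sequences are illustrative only and do not come from any of the tools named above.

```python
def map_reads(reference: str, reads: list[str]) -> list[list[int]]:
    """For each read, list every 0-based position where it matches the reference exactly.

    Toy illustration only: real mappers use indexed references and tolerate
    mismatches and indels.
    """
    all_hits = []
    for read in reads:
        hits = []
        start = reference.find(read)
        while start != -1:
            hits.append(start)
            # Continue searching one base further to catch overlapping matches.
            start = reference.find(read, start + 1)
        all_hits.append(hits)
    return all_hits


# Hypothetical 14-base reference containing the 4-base repeat "ACGT" twice.
reference = "TTACGTGGACGTCC"
reads = ["TTAC", "GTGG", "ACGT", "AAAA"]

for read, hits in zip(reads, map_reads(reference, reads)):
    print(read, hits)
```

In this toy case, "TTAC" and "GTGG" each land at a single position, "ACGT" lands at two positions because it falls inside the repeat, and "AAAA" does not map at all. The ambiguous placement of "ACGT" is a small-scale version of the repeat problem mentioned above.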

Combination of sequencing and bioinformatics strategies

Perhaps surprisingly, the efficiency of de novo assemblers depends on both the type of sequencer and the coverage depth. Too low a depth leaves some genomic regions without any reads, while too high a depth increases the risk of artefacts and lowers the quality of the assembly. For bacterial genomes, the Ray assembler has been reported to be more effective on MiSeq data at 100-fold coverage, and MIRA on Ion Torrent data at 25-fold coverage.
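As a rough guide, mean coverage depth can be estimated as (number of reads × read length) / genome size. The sketch below applies that relation to the two coverage targets mentioned above. The 5 Mb genome size and the read lengths of 250 and 200 nucleotides are assumptions chosen for illustration, not values from the cited studies, and the function names are hypothetical.

```python
def mean_depth(n_reads: int, read_len: int, genome_size: int) -> float:
    """Expected mean coverage depth: total sequenced bases divided by genome size."""
    return n_reads * read_len / genome_size


def reads_for_depth(target_depth: int, read_len: int, genome_size: int) -> int:
    """Smallest number of reads whose total length reaches the target mean depth."""
    total_bases = target_depth * genome_size
    # Integer ceiling division avoids floating-point rounding.
    return -(-total_bases // read_len)


GENOME_SIZE = 5_000_000  # assumed bacterial genome size (5 Mb)

# Assumed read lengths within the 100 to 400 nucleotide range of these platforms.
print("MiSeq, 100-fold:", reads_for_depth(100, 250, GENOME_SIZE), "reads")
print("Ion Torrent, 25-fold:", reads_for_depth(25, 200, GENOME_SIZE), "reads")
```

Under these assumptions, 100-fold coverage needs about 2,000,000 reads of 250 nucleotides, and 25-fold coverage needs about 625,000 reads of 200 nucleotides. These are mean values; actual depth varies along the genome.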

For more information: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3625192/ and https://bmcgenomics.biomedcentral.com/articles/10.1186/1471-2164-14-675.

Third-generation sequencers

NGS is limited by two factors: the short length of its reads and the amplification steps, which introduce errors. Third-generation sequencers can sequence a single DNA molecule in its native form. They also produce longer reads, from 4 to 20 kb. Assembly is therefore easier and yields better contigs and scaffolds.
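One reason longer reads ease assembly is that a read longer than a repeat can be placed unambiguously, because it also covers the unique sequence on either side. The following toy sketch illustrates this under strong assumptions: error-free reads, a hypothetical 14-base sequence, and read lengths of 4 and 6 bases standing in for short and long reads. The function name is illustrative.

```python
from collections import Counter


def ambiguous_read_fraction(genome: str, read_len: int) -> float:
    """Fraction of all possible error-free reads of a given length
    whose sequence occurs more than once in the genome."""
    reads = [genome[i:i + read_len] for i in range(len(genome) - read_len + 1)]
    counts = Counter(reads)
    ambiguous = sum(1 for read in reads if counts[read] > 1)
    return ambiguous / len(reads)


# Hypothetical toy genome: the 4-base repeat "ACGT" appears twice.
genome = "TTACGTGGACGTCC"

print(ambiguous_read_fraction(genome, 4))  # reads no longer than the repeat
print(ambiguous_read_fraction(genome, 6))  # reads longer than the repeat
```

With 4-base reads, 2 of the 11 possible reads fall entirely within the repeat and cannot be placed uniquely. With 6-base reads, every read extends into unique flanking sequence and the ambiguous fraction drops to zero.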