The pomegranate (Punica granatum L.) draft de novo genome assembly explores genetic divergence between soft- and hard-seeded cultivars

In this study, researchers generated a high-quality, highly contiguous, chromosome-scale de novo genome assembly of the soft-seeded pomegranate cultivar ‘Tunisia’ using PacBio long-read sequencing and high-throughput chromosome conformation capture (Hi-C). They also re-sequenced 26 pomegranate varieties with varying seed hardness. Comparative genomic analyses revealed many genetic differences between soft- and hard-seeded pomegranate varieties. Characterizing the phenotypic variation between soft- and hard-seeded pomegranate varieties is crucial for marker-assisted selection breeding of new soft-seeded cultivars.

The genome of soft-seeded ‘Tunisia’ was sequenced on the Pacific Biosciences (PacBio) Sequel platform, yielding 20.94 Gb of raw long-read data (~62x coverage). De novo assembly of the high-quality PacBio reads (~55x coverage) was performed with the Canu pipeline using default parameters, and the resulting draft assembly was polished with Illumina short reads using Pilon. Hi-C libraries were prepared from ‘Tunisia’ roots, stems, leaves, flowers, fruit peels, and seeds.
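The coverage figures above follow from a simple ratio: total sequenced bases divided by genome size. The sketch below is only an illustration of that arithmetic. It assumes the reported 20.94 figure is in gigabases, uses 1 Gb = 1000 Mb, and takes the 320.31 Mb assembly size from the summary table as a stand-in for genome size; the function and variable names are hypothetical.

```python
def sequencing_depth(total_bases_gb: float, genome_size_mb: float) -> float:
    """Average depth = total sequenced bases / genome size."""
    if genome_size_mb <= 0:
        raise ValueError("genome_size_mb must be positive")
    return total_bases_gb * 1000 / genome_size_mb


def implied_genome_size_mb(total_bases_gb: float, depth: float) -> float:
    """Genome size (Mb) that would give the stated depth for a given yield."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    return total_bases_gb * 1000 / depth


RAW_YIELD_GB = 20.94       # raw PacBio yield from the text, read as gigabases (assumption)
ASSEMBLY_SIZE_MB = 320.31  # total assembly size from the summary table

# Depth if the assembly size is used as the genome size (assumption).
depth_vs_assembly = sequencing_depth(RAW_YIELD_GB, ASSEMBLY_SIZE_MB)

# Genome size that would correspond to the ~62x stated in the text.
size_at_62x = implied_genome_size_mb(RAW_YIELD_GB, 62)

print(f"Depth against the assembly size: {depth_vs_assembly:.1f}x")
print(f"Genome size implied by 62x: {size_at_62x:.1f} Mb")
```

Dividing by the assembled size gives roughly 65x, so the ~62x quoted in the text presumably refers to a slightly larger genome size estimate (about 338 Mb); the text does not state which value was used.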

The Canu assembler was used to generate a high-quality de novo genome assembly from the PacBio long reads. The contig N50 of the new ‘Tunisia’ reference genome is 67- and 46-fold longer than those of the recently published ‘Dabenzi’ and ‘Taishanhong’ genomes, respectively.

Table: Summary of the soft-seeded ‘Tunisia’ de novo genome assembly

Total size | No. of contigs | Longest contig | Contig N50 | No. of scaffolds | Longest scaffold | Scaffold N50
320.31 Mb | 661 | 14.77 Mb | 94.49 Mb | 473 | 55.56 Mb | 55.56 Mb
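For readers unfamiliar with the N50 statistic reported for contigs and scaffolds, the following minimal sketch shows the standard definition: sort the sequences from longest to shortest and report the length at which the running total first reaches half of the assembly size. The toy lengths are invented for illustration and are not taken from the ‘Tunisia’ assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together cover at least half of the total assembly size.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example (made-up lengths): total = 30, half = 15.
# Running totals are 10, then 18, so the N50 is 8.
toy_contigs = [10, 8, 6, 4, 2]
print(n50(toy_contigs))
```

By construction, the N50 can never exceed the length of the longest sequence in the set, which makes it a quick sanity check when reading assembly summaries.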

A total of 33,594 high-confidence protein-coding gene models were predicted in the ‘Tunisia’ genome, with an average coding-sequence length of 2,229 bp and an average exon length of 263 bp. In addition to protein-coding genes, 52 miRNAs, 1,468 rRNAs, 440 tRNAs, and 1,388 pseudogenes were identified in the ‘Tunisia’ pomegranate genome.

Comparative genomic analyses revealed many genetic differences between soft- and hard-seeded pomegranate varieties. Selective sweep analysis between the hard- and soft-seeded populations identified a set of selected loci containing SUC8-like, SUC6, FoxO, and MAPK. An exceptionally large selected region (26.2 Mb) was identified on chromosome 1. This high-quality PacBio de novo assembly is more complete than the other available pomegranate genome assemblies. There were also indications that genomic variation and genes under selection may have contributed to the genetic divergence between soft- and hard-seeded pomegranate varieties. The PacBio long reads resolved complex regions in this high-quality de novo assembly of the soft-seeded ‘Tunisia’ genome.
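The text does not say which statistic was used for the selective sweep analysis. As one common way such scans are set up, the sketch below computes a windowed F_ST between two populations from per-site allele frequencies, using a simple Nei-style estimate, (H_T - H_S) / H_T, taken as a ratio of sums across the sites in each window with both populations weighted equally. The function name, input format, window size, and frequencies are all hypothetical and are not the authors' pipeline or data.

```python
from collections import defaultdict


def window_fst(sites, window_size):
    """Windowed F_ST for two populations.

    sites: iterable of (position, freq_pop1, freq_pop2) for biallelic SNPs,
           where each freq is the frequency of the same allele in that population.
    Returns {window_start: F_ST}, with F_ST = sum(H_T - H_S) / sum(H_T) per window.
    """
    numerator = defaultdict(float)
    denominator = defaultdict(float)
    for pos, p1, p2 in sites:
        p_mean = (p1 + p2) / 2
        h_total = 2 * p_mean * (1 - p_mean)                     # pooled heterozygosity
        h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population
        start = (pos // window_size) * window_size
        numerator[start] += h_total - h_within
        denominator[start] += h_total
    # Skip windows with no polymorphism in the pooled sample.
    return {s: numerator[s] / denominator[s]
            for s in sorted(denominator) if denominator[s] > 0}


# Made-up frequencies (soft-seeded vs hard-seeded populations) for illustration.
toy_sites = [(100, 1.0, 0.0), (900, 0.9, 0.1), (1500, 0.5, 0.5)]
toy_fst = window_fst(toy_sites, window_size=1000)
print(toy_fst)
```

Windows whose F_ST lies in the upper tail of the genome-wide distribution would be the candidates for selected regions in a scan of this kind.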


Xiang Luo, Haoxian Li, Zhikun Wu, Wen Yao, Peng Zhao, Da Cao, Haiyan Yu, Kaidi Li, Krishna Poudel, Diguang Zhao, Fuhong Zhang, Xiaocong Xia, Lina Chen, Qi Wang, Dan Jing, Shangyin Cao. The pomegranate (Punica granatum L.) draft genome dissects genetic divergence between soft‐ and hard‐seeded cultivars.
