With the availability of cheap long-read sequencing and efficient assembly software, novel genome assemblies are released frequently. It is not rare for a single project to produce multiple references of the same or closely related species. Because these assemblies are produced independently, raw assembly files contain chromosomes or scaffolds in arbitrary order and orientation, with arbitrary sequence names. Scientists usually want all assembly files to be organized in the same way, and therefore have to rename, orient and order the sequences of each file so that they match a reference. The GenomOrder pipeline has been developed to automate these tasks. It takes a reference assembly and reorganizes and renames the sequences found in the other assembly files according to that reference. If the reference is assembled into chromosomes and some of the other assemblies are in scaffolds, it can organize the scaffolds into chromosomes. After producing several novel related assemblies, scientists are also interested in comparing them visually. GenomOrder therefore produces per-chromosome D-Genies archive files for a given list of chromosome names. These files can be uploaded to http://dgenies.toulouse.inra.fr, which renders the dot plots and makes it possible to visually compare all corresponding chromosomes of the produced assemblies.
In short, GenomOrder automatically renames, orders, orients and scaffolds assemblies according to a reference, and produces dot plots for a given list of chromosomes.
# Statement of need
GenomOrder is a Nextflow pipeline with multiple features designed to facilitate work on genomic sequence alignments. First, it can reorder and rename the scaffolds of multiple assemblies in accordance with a given reference. It can also scaffold several contiguous assemblies in accordance with a given reference. In addition, the pipeline can quickly produce sequence alignments and the resulting D-Genies backup file, allowing rapid visual comparison of the chromosomes of two given assemblies. These files can be uploaded to and visualized with the online tool D-Genies: http://dgenies.toulouse.inra.fr/
Finally, as a new feature, the pipeline supports multiple pairwise alignments for a list of given scaffolds and produces a D-Genies backup file that allows all these alignments to be visualised in the same dot plot.
GenomOrder is a Nextflow pipeline with two main features: assembly reorganisation and assembly comparison. The first consists of reordering, reorienting and renaming the scaffolds of multiple assemblies in accordance with a given reference. It includes the possibility of scaffolding assemblies into chromosomes following a given reference. The second feature quickly produces chromosome alignments and the corresponding D-Genies backup file, allowing rapid visual comparison of chromosomes for two to five given assemblies and a list of chromosomes. These files can be uploaded to and visualized with the online tool D-Genies: http://dgenies.toulouse.inra.fr/
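
To make the reorganisation step concrete, the following Python sketch shows the idea of renaming, reorienting and reordering sequences against a reference. It is an illustration only and not GenomOrder's actual implementation: the function names, the `best_hits` mapping (assumed to come from a prior alignment step) and the assumption that each sequence corresponds to exactly one reference chromosome are all hypothetical simplifications.

```python
from typing import Dict, List, Tuple

# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def reorganise(
    assembly: Dict[str, str],
    best_hits: Dict[str, Tuple[str, str]],
    reference_order: List[str],
) -> List[Tuple[str, str]]:
    """Rename, orient and order sequences according to a reference.

    assembly:        original sequence name -> sequence
    best_hits:       original sequence name -> (reference name, strand),
                     with strand "+" or "-" (hypothetical alignment summary)
    reference_order: reference sequence names in the desired output order
    """
    records: List[Tuple[str, str]] = []
    for ref_name in reference_order:
        for name, seq in assembly.items():
            hit = best_hits.get(name)
            if hit is None or hit[0] != ref_name:
                continue
            # Flip sequences that align to the reverse strand of the reference.
            oriented = reverse_complement(seq) if hit[1] == "-" else seq
            records.append((ref_name, oriented))
    # Sequences without a reference match keep their name and go last.
    records.extend(
        (name, seq) for name, seq in assembly.items() if name not in best_hits
    )
    return records


if __name__ == "__main__":
    assembly = {"scaffold_7": "AACG", "scaffold_2": "TTTGC", "scaffold_9": "GGN"}
    best_hits = {"scaffold_7": ("chr2", "-"), "scaffold_2": ("chr1", "+")}
    for name, seq in reorganise(assembly, best_hits, ["chr1", "chr2"]):
        print(f">{name}\n{seq}")
```

In this sketch, sequences with no match in the reference are appended unchanged at the end. A real pipeline would also have to decide how to join several scaffolds that map to the same chromosome, which is left out here.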