Investigate hifiasm-meta assemblies
In some cases, hifiasm-meta produces many small, redundant contigs. This seems to happen when the sequencing depth is high. hifiasm-meta has options to apply read selection based on the frequency of the k-mers found in the reads.
Read selection:
--force-preovec
enable and force read selection.
--lowq-10
lower 10% runtime kmer frequency threshold. [50]
--lowq-5
lower 5% runtime kmer frequency threshold. [50]
--lowq-3
lower 3% runtime kmer frequency threshold. [10]
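To make the idea of k-mer-frequency-based read selection concrete, here is a toy Python sketch. It is not hifiasm-meta's implementation: the function names, the small k, and the rule "drop a read when its lower-quantile k-mer count is above the threshold" are assumptions chosen for illustration only.

```python
from collections import Counter


def kmers(seq, k):
    """Return all overlapping k-mers of a sequence."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def lower_quantile(values, q):
    """Return the value at quantile q (0..1) of a non-empty list."""
    ordered = sorted(values)
    return ordered[int(q * (len(ordered) - 1))]


def select_reads(reads, k=5, q=0.10, threshold=50):
    """Keep reads whose lower-quantile k-mer count is <= threshold.

    Assumption: a read whose rarest k-mers are still very frequent comes
    from a deeply covered genome and is treated as redundant.
    """
    # Count every k-mer across the whole read set.
    counts = Counter(km for read in reads for km in kmers(read, k))
    kept = []
    for read in reads:
        freqs = [counts[km] for km in kmers(read, k)]
        # Reads shorter than k have no k-mers; keep them untouched.
        if not freqs or lower_quantile(freqs, q) <= threshold:
            kept.append(read)
    return kept


# Five copies of one read (high depth) plus one read seen only once.
toy_reads = ["ACGTACGTAC"] * 5 + ["TTGCATGCAA"]
selected = select_reads(toy_reads, k=5, q=0.10, threshold=3)
print(selected)
```

This toy version discards every read above the threshold, which is more aggressive than one would want on real data; it only shows how a lower threshold leads to fewer reads being kept.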
However, in the case of the ZymoBIOMICS mock community (8 bacteria + 2 yeasts in the same abundance), the filtering thresholds are not reached and all reads are kept for the assembly. Log obtained with:
hifiasm_meta -o asm -t 32 --lowq-10 --force-preovec mockBact_dnaminiprep.fastq.gz
[prof::yak_count] step 1 total 3.16 s, step2 133.78 s, step3 29.07 s.
[debug::ha_pt_gen] tot_cnt is 116962304, pt->tot_pos is 116962304
[M::ha_pt_gen::672.670*7.67] ==> indexed 116962304 positions
[prof::hamt_pre_ovec_v2] start ~ done ha_idx: 252.51 s
[M::hamt_pre_ovec_v2] 28269 reads with more than desired targets(150). (Total reads: 890071)
[prof::hamt_pre_ovec_v2] ha_idx ~ done estimation: 94.98 s
[M::hamt_pre_ovec_v2] keeping all reads.
[M::hamt_pre_ovec_v2] finished read selection, took 347.49s.
[M::hamt_assemble] read selection decided to keep all reads.
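The log above reports how many reads exceed the desired number of targets. A small Python sketch to extract these numbers and compute the fraction is shown here; the regular expression is written against the single log line above and is an assumption about its format, not a documented hifiasm-meta output specification.

```python
import re

# Log line copied verbatim from the run above.
LOG_LINE = ("[M::hamt_pre_ovec_v2] 28269 reads with more than desired "
            "targets(150). (Total reads: 890071)")

# Assumed pattern, derived only from the line above.
PATTERN = re.compile(
    r"(\d+) reads with more than desired targets\((\d+)\)\. "
    r"\(Total reads: (\d+)\)"
)


def parse_read_selection(line):
    """Return read-selection counts from a log line, or None if no match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    over, target, total = map(int, match.groups())
    return {
        "over_target": over,
        "target": target,
        "total": total,
        "fraction": over / total,
    }


stats = parse_read_selection(LOG_LINE)
print(f"{stats['over_target']} / {stats['total']} reads "
      f"({stats['fraction']:.2%}) exceed {stats['target']} targets")
```

For this run, only about 3.18% of the reads exceed the target count, which is consistent with the read selection step deciding to keep all reads.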
A solution would be to identify these highly redundant contigs and filter them out of the assembly, or to apply read selection before the assembly.
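As an illustration of the first option, the following Python sketch flags contigs whose k-mers are almost entirely contained in longer contigs. This is a minimal sketch under stated assumptions, not a tested pipeline: the function names, the containment cutoff of 0.9, and the greedy longest-first strategy are all hypothetical choices, and a real assembly would need a scalable tool rather than in-memory k-mer sets.

```python
# Translation table used to build reverse complements.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical_kmers(seq, k):
    """Return the set of canonical k-mers (min of k-mer and its reverse complement)."""
    result = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        rc = kmer.translate(COMPLEMENT)[::-1]
        result.add(min(kmer, rc))
    return result


def filter_redundant_contigs(contigs, k=21, max_containment=0.9):
    """Return names of contigs to keep, processing longest contigs first.

    contigs: dict mapping contig name -> sequence (hypothetical input format).
    A contig is dropped when at least max_containment of its canonical
    k-mers were already seen in longer, kept contigs.
    """
    kept = []
    seen = set()
    by_length = sorted(contigs.items(), key=lambda kv: len(kv[1]), reverse=True)
    for name, seq in by_length:
        ks = canonical_kmers(seq, k)
        if ks and len(ks & seen) / len(ks) >= max_containment:
            continue  # redundant with longer contigs
        kept.append(name)
        seen |= ks
    return kept


toy_contigs = {
    "long": "ACGTTGCAAGGCTTAACCGGTATCGA",
    "dup": "TGCAAGGCTTAACC",       # substring of "long"
    "dup_rc": "GGTTAAGCCTTGCA",    # reverse complement of "dup"
    "uniq": "GGGGGCCCCCAAAAATTTTT",
}
kept_names = filter_redundant_contigs(toy_contigs, k=5)
print(kept_names)  # ['long', 'uniq']
```

Processing the longest contigs first means a small contig is only compared against sequence that has already been retained, so the small redundant contigs are the ones removed.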