Idea: using single-copy genes (SCGs) to estimate species richness
Starting from Quantifications_and_functional_annotations.tsv, we could count the number of different genes that correspond to each SCG.
Since each genome is expected to carry one copy of each SCG, the true richness probably lies between the minimum and maximum of these per-SCG counts.
The set of SCGs could be the one from the checkM2db or the one from Creevey et al., which seems to be of good quality (https://doi.org/10.1038/s41598-022-18762-z).
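
A minimal sketch of the counting step described above, under stated assumptions: the column names `gene_id` and `annotation` are hypothetical (the actual header of Quantifications_and_functional_annotations.tsv is not given here), the SCG identifiers `SCG_A` and `SCG_B` are placeholders, and the function name `scg_richness_bounds` is made up for illustration. It counts distinct genes per SCG and reports the minimum and maximum as rough bounds on richness.

```python
import csv
import io
from collections import defaultdict


def scg_richness_bounds(tsv_handle, scg_ids,
                        gene_col="gene_id", annot_col="annotation"):
    """Count distinct genes per SCG and return (counts, min, max).

    gene_col and annot_col are assumed column names, not the real header.
    scg_ids must be a non-empty collection of SCG identifiers.
    """
    genes_per_scg = defaultdict(set)
    reader = csv.DictReader(tsv_handle, delimiter="\t")
    for row in reader:
        annot = row[annot_col]
        if annot in scg_ids:
            # A set ignores repeated rows for the same gene.
            genes_per_scg[annot].add(row[gene_col])
    # SCGs never seen in the table get a count of 0.
    counts = {scg: len(genes_per_scg[scg]) for scg in sorted(scg_ids)}
    return counts, min(counts.values()), max(counts.values())


# Toy table standing in for the real TSV (made-up data, not real results).
toy_tsv = (
    "gene_id\tannotation\n"
    "g1\tSCG_A\n"
    "g2\tSCG_A\n"
    "g3\tSCG_A\n"
    "g1\tSCG_A\n"   # duplicate row, counted once
    "g4\tSCG_B\n"
    "g5\tSCG_B\n"
    "g6\tother\n"
)

counts, lower, upper = scg_richness_bounds(io.StringIO(toy_tsv),
                                           {"SCG_A", "SCG_B"})
print(counts, lower, upper)
```

On the real file, `io.StringIO(toy_tsv)` would be replaced by an open file handle, and `scg_ids` by whichever SCG set is chosen. Note that an SCG absent from the table yields a count of 0 and therefore a lower bound of 0, so it may be worth filtering such SCGs out first.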