Commit 76b89fd5 authored by Jean Mainguy

Add documentation for HiFi QC and assembly steps #181

parent 1ac2a7c2
......@@ -238,13 +238,16 @@ No parameter available for this substep.
**WARNING 2:** you need to use one of `--kaiju_db_dir`, `--kaiju_db` or `--skip_kaiju`. Otherwise, an error message is raised.
**Note:** For HiFi reads, adapters and low-quality reads are not filtered. Host reads are removed with minimap2, which does not require index files, so the `--host_index` parameter is not necessary.
#### **`S02_ASSEMBLY` step:**
**WARNING 3:** `S02_ASSEMBLY` step depends on `S01_CLEAN_QC` step. You need to use the mandatory files of these two steps to run `S02_ASSEMBLY`. See [II. Input files](https://forgemia.inra.fr/genotoul-bioinfo/metagwgs/-/blob/master/docs/usage.md#ii-input-files) and **WARNINGS 1 and 2**.
* `--assembly ["metaspades" or "megahit"]`: specifies the assembly tool. Default: `metaspades`.
**WARNING 4:** the user can choose between `metaspades` or `megahit` for `--assembly` parameter. The choice can be based on CPUs and memory availability: `metaspades` needs more CPUs and memory than `megahit` but our tests showed that assembly metrics are better for `metaspades` than `megahit`.
**WARNING 4:** For short reads, the user can choose between `metaspades` and `megahit` for the `--assembly` parameter. The choice can be based on CPU and memory availability: `metaspades` needs more CPUs and memory than `megahit`, but our tests showed better assembly metrics for `metaspades` than for `megahit`. For PacBio HiFi reads, the user can choose between `hifiasm-meta` and `metaflye`.
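The rule in WARNING 4 can be sketched as a small shell check. This is only an illustration: `check_assembler` is a hypothetical helper that is not part of metagWGS, and the read-type labels `short` and `hifi` are assumptions made for this sketch, not pipeline parameter values.

```bash
# Hypothetical helper, not part of metagWGS: checks that an assembler name
# is one of the tools listed in WARNING 4 for the given read type.
# The labels "short" and "hifi" are assumptions made for this sketch.
check_assembler() {
    local read_type="$1" assembler="$2"
    case "${read_type}:${assembler}" in
        short:metaspades|short:megahit) return 0 ;;
        hifi:hifiasm-meta|hifi:metaflye) return 0 ;;
        *)
            echo "Unsupported combination: ${read_type} reads with ${assembler}" >&2
            return 1
            ;;
    esac
}

# Example: validate the choice before passing it to --assembly.
if check_assembler short megahit; then
    echo "--assembly megahit"
fi
```

A check like this only guards against mismatched tool names; it says nothing about whether the cluster has enough CPUs or memory for the chosen assembler.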
**Note:** you may need to adjust the memory and CPU settings of the Nextflow process, especially if you are using `metaspades`. If this is the case, create a `nextflow.config` file in your working directory and modify these parameters (be aware that the memory must be in GB), for example:
```bash
......@@ -254,6 +257,8 @@ No parameter available for this substep.
}
```
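The configuration block above is truncated in this diff view. As a rough sketch of what such a file might look like, the commands below write a `nextflow.config` that overrides resources for a single process. The process name `ASSEMBLY_PROCESS_NAME` and the values `16` and `200 GB` are placeholders chosen for illustration, not the pipeline's actual process name or defaults; check the pipeline's own configuration for the real name.

```bash
# Write a nextflow.config in the current working directory.
# Caution: this overwrites any existing nextflow.config in that directory.
cat > nextflow.config <<'EOF'
process {
    // ASSEMBLY_PROCESS_NAME is a placeholder: replace it with the name of
    // the assembly process as defined in the pipeline.
    withName: ASSEMBLY_PROCESS_NAME {
        cpus = 16            // illustrative value
        memory = '200 GB'    // memory expressed in GB, illustrative value
    }
}
EOF
```

Nextflow reads a `nextflow.config` file from the launch directory, so settings placed there apply to the next run started from that directory.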
**Note:** It is also possible to assemble HiFi reads with HiCanu. To use HiCanu, generate the assembly first and then launch the rest of the pipeline.
#### **`S03_FILTERING` step:**
**WARNING 5:** `S03_FILTERING` step depends on `S01_CLEAN_QC` and `S02_ASSEMBLY` steps. You need to use the mandatory files of these three steps to run `S03_FILTERING`. See [II. Input files](https://forgemia.inra.fr/genotoul-bioinfo/metagwgs/-/blob/master/docs/usage.md#ii-input-files) and **WARNINGS 1, 2, 3 and 4**.
......@@ -270,7 +275,7 @@ No parameters.
**WARNING 7:** if you haven't previously run `S03_FILTERING`, the computation time of `S04_STRUCTURAL_ANNOT` can be long. Some cluster queues impose a maximum computation time, so you need to choose a queue suited to your data.
> For example, if you are on [genologin cluster](http://bioinfo.genotoul.fr/) and you haven't done the `S03_FILTERING` step, you can write a `nextflow.config` file in your working directory containing these lines:
> ```bash
> withName: prokka {
> withName: PROKKA {
> queue = 'unlimitq'
> }
> ```
......