# Warnings / Issues
> **`/!\`:** Act with care; this workflow uses significant memory if you increase the values in `.masterconfig`. We recommend keeping the default settings and running a test first.
> **`/!\`:** For now, do not run multiple `split` jobs at once.

> **`/!\`:** The Transduplication and Reciprocal Translocation sections in the `visor_sv_type.yaml` config file are placeholders; do not use them yet.

# How to Use
## A. Running on the CBIB
### 1. Set up
Clone the Git repository and switch to the `dev_lpiat` branch:
```bash
git clone https://forgemia.inra.fr/pangepop/MSpangepop.git
cd MSpangepop
git checkout dev_lpiat
```

### 2. Add your files
- Add a `.fasta.gz` file; an example can be found in the repository.
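
Before launching anything, it can help to confirm that the input really is a gzip-compressed FASTA. The sketch below is not part of the repository; `check_fasta_gz` is a hypothetical helper that only tests gzip integrity and that the first decompressed character is a `>` header marker.

```bash
# Hypothetical helper (not shipped with the workflow): sanity-check a .fasta.gz input.
check_fasta_gz() {
    local f="$1"
    # gzip -t verifies the compressed stream without writing any output
    if ! gzip -t "$f" 2>/dev/null; then
        echo "ERROR: $f is not a valid gzip file" >&2
        return 1
    fi
    # A FASTA file must begin with a '>' header line
    if [ "$(gzip -dc "$f" | head -c 1)" != ">" ]; then
        echo "ERROR: $f does not start with a FASTA header ('>')" >&2
        return 1
    fi
    echo "OK: $f"
}
```

Usage: `check_fasta_gz my_genome.fasta.gz` (the file name is a placeholder).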
### 3. Configure the pipeline
- Edit the `.masterconfig` file in the `.config/` directory with your sample information. 
- Edit the `visor_sv_type.yaml` file with the mutations you want.
- Edit line 17 of `job.sh` and line 13 of `./config/snakemake_profile/clusterconfig.yaml` with your email.

### 4. Run the workflow
The workflow has two parts: `split` and `simulate`. Always run `split` first; once it is done (it is really quick), run `simulate`. Start with a dry run:

`sbatch job.sh [split or simulate] dry`

If no warnings are displayed, run:

`sbatch job.sh [split or simulate]`

> **Nb 1:** To create a visual representation of the workflow, use `dag` instead of `dry`, then open the generated `.dot` file with a [viewer](https://dreampuf.github.io/GraphvizOnline/) that supports the format.
> **Nb 2:** The first execution of the workflow will be slow, since the container images need to be pulled.
> **Nb 3:** The workflow is in two parts because we want to execute the simulations chromosome by chromosome. Snakemake cannot retrieve the number of chromosomes in one go and needs to index and split first.
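
To see how many chromosome-by-chromosome simulations to expect, you can count the records in the input yourself. This is a minimal sketch, independent of the workflow's own indexing; `count_sequences` is a hypothetical helper, and it assumes each chromosome is one FASTA record.

```bash
# Hypothetical helper: count FASTA records (lines starting with '>') in a gzipped file.
count_sequences() {
    # gzip -dc streams the decompressed content; grep -c counts header lines
    gzip -dc "$1" | grep -c '^>'
}
```

Usage: `count_sequences my_genome.fasta.gz` prints the number of sequences (the file name is a placeholder).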

> **Nb 4:** Since the CBIB does not support `python:3.9.7`, we cannot use the cookiecutter config; use `cbib_job.sh` to run the workflow.
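
If you would rather not wait for `split` by hand, SLURM job dependencies can chain the two parts. The sketch below is an assumption-laden example and not something the repository provides: `run_split_then_simulate` is a hypothetical wrapper, and it assumes the `job.sh` job only exits once the whole `split` part has finished. Swap in `cbib_job.sh` if that is the script you use.

```bash
# SBATCH can be overridden (e.g. for testing); it defaults to the real sbatch.
SBATCH="${SBATCH:-sbatch}"

# Hypothetical wrapper: submit split, then simulate only if split succeeded.
run_split_then_simulate() {
    local split_id
    # --parsable makes sbatch print just the job ID (optionally "id;cluster")
    split_id="$($SBATCH --parsable job.sh split)" || return 1
    split_id="${split_id%%;*}"   # drop the cluster name if present
    # afterok: the simulate job starts only if the split job exits with status 0
    $SBATCH --dependency=afterok:"$split_id" job.sh simulate
}
```

Running a `dry` pass for each part beforehand is still worthwhile, since the chained submission skips that check.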

## B. Running locally
- Ensure `snakemake` and `singularity` are installed on your machine, then run the workflow:
```bash
./local_run [split or simulate] dry
```
If the workflow cannot download images from the container registry, install `Docker`, log in with your credentials, and rerun the workflow:
```bash
docker login -u "<your_username>" -p "<your_token>" "registry.forgemia.inra.fr"
```

# Workflow
![Dag of the workflow](workflow/dag.svg)
# More information

Variant generation is inspired by [VISOR](https://github.com/davidebolo1993/VISOR).

You can extract a VCF from the graph using the `vg deconstruct` command. It is not implemented in the pipeline.
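
As a rough sketch of that manual step, the helper below wraps `vg deconstruct` with its `-P` option, which selects reference paths by name prefix. The function name, file names, and prefix are placeholders, not values taken from the pipeline. `VG` can be overridden if `vg` is only available through a container.

```bash
# Hypothetical helper: write a VCF from a graph using `vg deconstruct`.
# Override VG to change how vg is invoked (assumption: plain `vg` is on PATH otherwise).
deconstruct_to_vcf() {
    local graph="$1" ref_prefix="$2" out_vcf="$3"
    # -P: use all paths whose name starts with this prefix as references
    ${VG:-vg} deconstruct -P "$ref_prefix" "$graph" > "$out_vcf"
}
```

Usage: `deconstruct_to_vcf graph.gfa chr variants.vcf`, where all three arguments are placeholders to adapt to your graph.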

You can use the script `workflow/scripts/split_path.sh` to cut the final FASTA into chromosome-level FASTA files.
```bash
./split_fasta.sh input.fasta /path/to/output_directory
```
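
For readers who want to see what such a split involves, here is a minimal `awk`-based sketch. It is not the repository's script; `split_fasta_by_record` is a hypothetical function that writes one file per FASTA record, named after the first word of the header, and it assumes headers contain no `/` characters.

```bash
# Hypothetical sketch: write each FASTA record to <out_dir>/<record_name>.fasta
split_fasta_by_record() {
    local in_fasta="$1" out_dir="$2"
    mkdir -p "$out_dir"
    awk -v dir="$out_dir" '
        /^>/ {
            # New record: close the previous file and derive the next file name
            if (out != "") close(out)
            name = substr($1, 2)          # header without the leading ">"
            out = dir "/" name ".fasta"
        }
        # Write header and sequence lines to the current record file
        out != "" { print > out }
    ' "$in_fasta"
}
```

Usage: `split_fasta_by_record input.fasta /path/to/output_directory`, using the same two arguments as the command above.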
# Dependencies
TODO
- pandas, msprime, argparse, os, multiprocessing, yaml, Bio.Seq
- singularity, snakemake
- vg:1.60.0, bcftools:1.12, bgzip:latest, tabix:1.7