# WGS pipeline for Paired-end Illumina sequencing
## From raw fastq data to vcf file.
## Pipeline features:
### 1) read preprocessing,
### 2) read QC,
### 3) mapping with bwa,
### 4) mapping QC,
### 5) SNP calling using freebayes
## Snakemake features: fastq from csv file, config, modules, SLURM

### Workflow steps are described in dag_rules.pdf
### Snakemake rules are based on [Snakemake modules](https://forgemia.inra.fr/gafl/snakemake_modules)


### Files description:
### 1) Snakefile
    - Snakefile.smk  self-contained snakefile (all rules are included)
    - Snakefile_modules.smk uses external rules (include directive)

### 2) Configuration file in YAML format: paths, Singularity image paths, parameters, ...
    - config.yaml

### 3) An sbatch file to run the pipeline (to be edited)
    - run_snakemake_pipeline.slurm

### 4) SLURM directives (number of cores, memory, ...) in JSON format; adjust if needed
    - cluster.json
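
To illustrate how per-rule SLURM settings might be resolved from such a file, here is a minimal sketch. It assumes `cluster.json` follows Snakemake's legacy cluster-config layout (a `__default__` entry plus optional per-rule overrides). The rule name `bwa_mem` and the keys `cpus` and `mem` are hypothetical and may not match the file shipped with this pipeline.

```python
import json

# Hypothetical cluster.json content; rule names, keys and values are
# illustrative assumptions, not taken from the pipeline's real file.
CLUSTER_JSON = """
{
  "__default__": {"cpus": 1, "mem": "4G"},
  "bwa_mem": {"cpus": 8, "mem": "32G"}
}
"""


def resources_for(rule, cluster):
    """Return the settings for a rule, falling back to __default__."""
    merged = dict(cluster.get("__default__", {}))
    merged.update(cluster.get(rule, {}))  # rule-specific values win
    return merged


cluster = json.loads(CLUSTER_JSON)
print(resources_for("bwa_mem", cluster))  # rule with its own entry
print(resources_for("fastqc", cluster))   # rule without an entry uses defaults
```

Under this layout, only rules that need more cores or memory than the default require their own entry.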

### 5) Samples file in csv format
    Must contain at least 2 columns for SE reads and 3 for PE reads (tab-separated)
    SampleName  fq1     fq2
    SampleName: your sample ID
    fq1: fastq file for a given sample (read 1 for paired-end reads)
    fq2: read 2 fastq file for paired-end reads

    - samples.csv
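
As a rough check before launching the pipeline, the samples file can be validated with a few lines of Python. This is a minimal sketch that assumes the file has a header row with the column names listed above; the function name `read_samples` and the example fastq file names are hypothetical and are not part of the pipeline.

```python
import csv
import io


def read_samples(handle):
    """Parse a tab-separated samples table into {SampleName: [fastq paths]}."""
    reader = csv.DictReader(handle, delimiter="\t")
    missing = {"SampleName", "fq1"} - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing column(s): {sorted(missing)}")
    samples = {}
    for row in reader:
        name = (row["SampleName"] or "").strip()
        fq1 = (row["fq1"] or "").strip()
        if not name or not fq1:
            raise ValueError(f"incomplete row: {row}")
        if name in samples:
            raise ValueError(f"duplicate sample ID: {name}")
        fastqs = [fq1]
        fq2 = (row.get("fq2") or "").strip()
        if fq2:  # fq2 is only present for paired-end samples
            fastqs.append(fq2)
        samples[name] = fastqs
    return samples


# Hypothetical example content: one paired-end and one single-end sample.
example = (
    "SampleName\tfq1\tfq2\n"
    "S1\tS1_R1.fastq.gz\tS1_R2.fastq.gz\n"
    "S2\tS2_R1.fastq.gz\t\n"
)
samples = read_samples(io.StringIO(example))
print(samples)
```

To check a real file, pass an open file handle instead, for example `read_samples(open("samples.csv", newline=""))`.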


## RUN:

### 1) Optionally, if using external rules/modules, get all modules from the git repository:
`git clone https://forgemia.inra.fr/gafl/snakemake_modules.git smkmodules`

### 2) edit the config.yaml

### 3) set your samples in samples.csv

### 4) adjust the run_snakemake_pipeline.slurm file

### 5) run the pipeline:
`sbatch run_snakemake_pipeline.slurm`


### [Dicoexpress gitlab MIA](https://forgemia.inra.fr/GNet/dicoexpress)


#### Documentation is still being written