... | ... | @@ -57,9 +57,7 @@ At least one assembly, one library and one database are required. |
|
|
# General options
|
|
|
|
|
|
## --library [mandatory]
|
|
|
You can provide fastq files, which will be copied to the data directory and made available on the download page. If no fastq file is provided, you must use the nb-sequence attribute to populate the database so that the histograms shown in the user interface can be computed.
|
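When the fastq files are not handed to the pipeline, the read count for nb-sequence has to come from somewhere. The sketch below is one way to obtain it, assuming standard four-line FASTQ records (no wrapped sequence lines). The function name `count_fastq_reads` is hypothetical and not part of the pipeline.

```bash
# Count the reads in a FASTQ file, plain or gzip-compressed.
# Assumes each record spans exactly four lines.
count_fastq_reads() {
  case "$1" in
    *.gz) n_lines=$(gzip -dc "$1" | wc -l) ;;
    *)    n_lines=$(wc -l < "$1") ;;
  esac
  echo $(( n_lines / 4 ))
}

# Usage (path taken from the example in this page):
#   count_fastq_reads workflows/rnaseqdenovo/data/brain_400.1.fastq.gz
```

The resulting number would then be passed as `nb-sequence=<count>`, assuming that attribute follows the same key=value form as the others. Whether a paired-end library expects the count per file or per pair is not stated here.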
|
|
|
|
|
List of available attributes (* marks a mandatory attribute):
|
|
|
* library-name* : [string] the internal library name, must be unique
|
... | ... | @@ -68,11 +66,11 @@ List of available attribute (with * mandatory attribute): |
|
|
* tissue : [string] tissue
|
|
|
* dev_stage : [string] developmental stage
|
|
|
* type* : [string] library type, available options:
  * se : single end
  * pe : paired end
  * ose : oriented single end
  * ope : oriented paired end
  * mp : mate pair
|
|
|
* insert-size : [int] insert size, which you can provide for a paired-end library
|
|
|
* remark : [string] any comment
|
|
|
* sequencer : [string] sequencer type
|
... | ... | @@ -83,15 +81,14 @@ List of available attribute (with * mandatory attribute): |
|
|
* files* : [string] fastq file path (for paired reads, space-separated file names)
|
|
|
|
|
|
If you have several libraries, repeat the --library option once per library.
|
|
|
|
|
|
Example :
|
|
|
```bash
--library library-name=brain_400 sample-name=Brain replicat=1 tissue=Brain type=pe insert-size=400 remark="100bp to 400bp insert" \
sequencer=HiSeq2000 files=workflows/rnaseqdenovo/data/brain_400.1.fastq.gz,workflows/rnaseqdenovo/data/brain_400.2.fastq.gz
```
|
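Since the --library option is repeated once per library, generating the options in a loop can help when there are many libraries. This is a minimal sketch, not part of the pipeline: the helper `library_option`, the library names, and the `<name>.1.fastq.gz` / `<name>.2.fastq.gz` naming scheme are assumptions, and the attribute set simply mirrors the example above.

```bash
data_dir="workflows/rnaseqdenovo/data"

# Print one --library option for a paired-end library whose two files are
# named <name>.1.fastq.gz and <name>.2.fastq.gz (a naming assumption).
library_option() {
  name="$1"
  printf '%s' "--library library-name=${name} sample-name=${name} replicat=1 type=pe "
  printf '%s\n' "files=${data_dir}/${name}.1.fastq.gz,${data_dir}/${name}.2.fastq.gz"
}

# One line per library (hypothetical library names).
for lib in brain_400 liver_400; do
  library_option "$lib"
done
```

The printed lines can be appended to the command that loads the project. Attributes with spaces in their values, such as remark, would need quoting as in the example above.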
|
|
## ASSEMBLY section

### --assembly [mandatory]
|
|
|
The assembly option is mandatory. The possible attributes are :
|
|
|
* file* : FASTA file, may be gzipped.
|
|
|
* software-name* : [string] assembly software name
|
... | ... | @@ -100,19 +97,21 @@ The assembly option is mandatory. The possible attributes are : |
|
|
* comments : [string] any comments on this analysis
|
|
|
|
|
|
Example :
|
|
|
```bash
--assembly file=workflows/rnaseqdenovo/data/contigs.fasta software-name=oases software-parameters="" software-version="0.2.06" comments="Transcript assembly"
```
|
|
|
|
|
|
|
|
|
### --rename

Flag to set if you want to rename your contigs with the gene name of the best annotation.
|
|
|
|
|
|
### --prefix

Prefix to apply to every contig when renaming.

```bash
--rename --prefix "GG_"
```
|
|
|
## ANNOTATION section

### --assembly-annot-db [mandatory]

You can provide as many databases as you want for annotation with NCBI BLAST+. Each database must have been indexed with makeblastdb.
|
|
|
* file* : Database file (with index in the same directory).
|
|
|
* type* : [string] kind of data : [genome|nucleic|protein|transcript|unknown]
|
... | ... | @@ -122,39 +121,37 @@ You can provide much as you want databases for annotation with ncbi blast+. The |
|
|
* evalue : [float] The maximum e-value for the alignments.
* software : [string] The NCBI BLAST+ program used for the alignment [blastx|blastp...]. The path to the executable is retrieved from PATH or from application.properties.
|
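Because each annotation database must be indexed beforehand, a short sketch of that step may help. It assumes a hypothetical protein FASTA file named `uniprot_sprot.fasta`. `-in` and `-dbtype` are standard makeblastdb options. The e-value shown is only an illustrative choice, and attributes not listed in this excerpt may also be required.

```bash
db_fasta="uniprot_sprot.fasta"   # hypothetical protein FASTA file

# Indexing command: writes the BLAST index files next to the FASTA file.
index_cmd="makeblastdb -in ${db_fasta} -dbtype prot"
echo "$index_cmd"

# Matching pipeline option, built only from attributes listed above
# (evalue=1e-5 is an illustrative value, not a recommended default).
annot_opt="--assembly-annot-db file=${db_fasta} type=protein evalue=1e-5 software=blastx"
echo "$annot_opt"
```

For a nucleotide database, `-dbtype nucl` would be used instead, with a nucleotide BLAST program.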
|
|
|
|
|
### --min-identity

Filters alignments on the minimum fraction of identity [0.00-1.00].

### --min-coverage

Filters alignments on the minimum fraction of query coverage [0.00-1.00].
|
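Both thresholds are fractions, not percentages, so a quick sanity check before launching a long run can catch a value such as 80 given instead of 0.80. The helper below is a hypothetical sketch and not part of the pipeline.

```bash
# Succeed only if the argument is a plain decimal number in [0, 1].
is_fraction() {
  awk -v v="$1" 'BEGIN { exit !(v ~ /^[0-9]*\.?[0-9]+$/ && v + 0 >= 0 && v + 0 <= 1) }'
}

for value in 0.80 80; do
  if is_fraction "$value"; then
    echo "$value: valid fraction"
  else
    echo "$value: rejected, expected a value in [0.00-1.00]"
  fi
done
```

A validated value would then be passed as, for instance, `--min-identity 0.80 --min-coverage 0.50`, assuming these options take their value as the next argument in the same way as --prefix.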
|
|
|
|
|
### --go

A GO (Gene Ontology) file associates GO names, evidence, and so on with each contig.
|
|
|
|
|
|
Example :
|
|
|
```bash
--go go.txt
```

More info about [[ go file ]]

### --skip-rm

This option skips the RepeatMasker annotation.
|
|
|
|
|
|
### --skip-rnammer

Skip the RNAmmer step (RNAmmer is used to add rRNA predictions).
|
|
|
|
|
|
### --skip-trna

Skip the tRNA prediction step.
|
|
|
|
|
|
|
|
|
## IPRscan section

### --skip-iprscan

This option skips the iprscan annotation. Iprscan takes a long time to run but provides good annotation of protein domains, ORFs, and GO terms.
|
|
|
### --max-orf-nb

[int] The maximum number of ORFs per contig to report in the annotation.
|
|
|
|
|
|
|
|
|
|
|
|
## VARIANT section

### --variant

If you perform your own variant detection, you can provide the resulting file; otherwise the pipeline detects variants with GATK3.

This file describes the variations found in each contig: SNPs, insertions, or deletions. The expected file format is VCF (Variant Call Format).
|
... | ... | @@ -168,13 +165,14 @@ Here is the list of attributes for this options : |
|
|
* comments : [string] comments on analysis
|
|
|
|
|
|
Example :
|
|
|
```bash
--variant file=variant.vcf software-name=GATK software-parameters="realignement/recalibration/glm BOTH" software-version="v2.4-9-g532efad"
```

### --two-steps-calling
|
|
|
SNP calling is performed in two steps. The first step (recalibration, calling, filtering) applies hard filters. The second step (recalibration, calling, filtering) applies standard filters and uses the variants detected in the first step as a database of known polymorphic sites.
|
|
|
|
|
|
### --variant-annot-db

You can use a well-known species to annotate your SNPs by similarity. The species must come from Ensembl.
|
|
|
* species : the species name for the species used as reference.
|
|
|
* fasta : the protein sequences for the species used as reference.
|
... | ... | @@ -182,57 +180,55 @@ You can use a well known species to annotate your SNP by similarities. You must |
|
|
* vcf [optional] : known variants for the species used as reference.
|
|
|
|
|
|
Example :
|
|
|
```bash
--variant-annot-db species="Danio rerio" fasta=Danio_rerio.Zv9.pep.all.fa gtf=Danio_rerio.Zv9.77.gtf vcf=Danio_rerio.vcf
```
|
|
|
|
|
|
# Monitoring workflow
|
|
|
|
|
|
To get information about all workflows:

```bash
python ./bin/ngspipelines_cli.py status
```

To get information about a running workflow:

```bash
python ./bin/ngspipelines_cli.py status --workflow-id XX
```

To get information about workflow errors:

```bash
python ./bin/ngspipelines_cli.py status --workflow-id XX --errors
```
|
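The three status invocations differ only in their trailing options, so a small wrapper can assemble them. This is a hypothetical convenience sketch: the function `status_cmd` and the workflow id 12 are assumptions, and only the options shown above are used.

```bash
# Print the status command for all workflows, for one workflow,
# or for the errors of one workflow.
status_cmd() {
  cmd="python ./bin/ngspipelines_cli.py status"
  if [ -n "${1:-}" ]; then
    cmd="$cmd --workflow-id $1"
    # --errors is only added together with a workflow id, as in the examples above.
    if [ "${2:-}" = "errors" ]; then
      cmd="$cmd --errors"
    fi
  fi
  printf '%s\n' "$cmd"
}

status_cmd            # all workflows
status_cmd 12         # one workflow (12 is a placeholder id)
status_cmd 12 errors  # errors of that workflow
```

The function only prints the command, so it can be reviewed before being run.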
|
|
|
|
|
|
|
|
# Delete a project

The deleteproject sub-command removes a project from an instance.
|
|
|
|
|
|
Example :
|
|
|
```bash
python ./bin/ngspipelines_cli.py deleteproject --project-name MyProject
```

# Delete an instance

This sub-command deletes the instance repository but NOT the projects inside the instance.
|
|
|
|
|
|
Example :
|
|
|
```bash
python ./bin/ngspipelines_cli.py deleteinstance --project-name myinstance
```

# Launch web server

Once you have loaded the data into your project, you can give access to the user interface by launching the instance with the runinstance option. This starts the corresponding web server.
|
|
|
|
|
|
Example :
|
|
|
```bash
python ./bin/ngspipelines_cli.py runinstance --instance-name myinstance
```
|
|
|
To stop the web server, use:

```bash
python ./bin/ngspipelines_cli.py runinstance --instance-name myinstance --command stop
```

# Web-server connection
|
|
|
Once the web server is started, you can access it through its URL. The URL must include the port, separated from the host name by ':'.
|
|
|
|
|
|
Example :
|
|
|
```bash
http://ngspipelines.toulouse.inra.fr:9000/
```