For debugging or development only, you can launch D-Genies through Flask in webserver mode:
dgenies run -m webserver
Optional parameters:
* `-d`: run in debug mode
* `-o <IP>`: specify the host on which to run the application (default: 127.0.0.1; set 0.0.0.0 to allow remote access)
* `-p <port>`: run on the specified port (default: 5000)
* `--no-crons`: don't run the crons automatically
* `--no-browser`: don't start the browser automatically (always true if the *-d* option is given)
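For example, the options above can be combined to serve the application on all interfaces, on port 8080, without opening a browser:

    dgenies run -m webserver -o 0.0.0.0 -p 8080 --no-browser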
Running with a cluster
----------------------
If you want to run jobs on a cluster, some configuration is required. Only the SLURM and SGE schedulers are supported, and please note that only the SLURM scheduler has been fully tested.
Note: submitting jobs on a cluster is only available for webserver mode.
Jobs are submitted through the DRMAA library, so you need this library for your scheduler. Please see the [configuration below](#cluster) to define the path to it.
Also, the scripts for preparing data must be moved to a location accessible by all nodes of your cluster. You must put them in a single folder and set the full path to the `all_prepare.py` script (full path on the cluster nodes) in the configuration file ([see below](#cluster)).
With `<dir>`: the folder in which to save the scripts (must be accessible by the cluster nodes).
Configuration
-------------
Changing the default configuration is not required for standalone mode, but you may want to customize some parts of the program.
Configuration is stored in the `/etc/dgenies/application.properties` file (Linux). The file is divided into 8 sections, described below.
To change this file, please copy it to `application.properties.local` (at the same location) so that your changes are not erased on upgrades.
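For example, on Linux (root privileges may be required to write into `/etc/dgenies`):

    cp /etc/dgenies/application.properties /etc/dgenies/application.properties.local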
### Global
The main parameters are stored in this section:
* `config_dir`: where the configuration file will be stored.
* `upload_folder`: where uploaded files will be stored.
* `data_folder`: where data files will be stored (PAF files and other files used for the dotplot).
* `threads_local`: number of threads to use for local jobs.
* `web_url`: public URL of your website.
* `max_upload_size`: max size allowed for the query and target files (uncompressed size; -1 for no limit).
* `max_upload_size_ava`: max size allowed for the target file in all-vs-all mode (only a target given; uncompressed size; -1 for no limit).
* `max_upload_file_size`: max size of the uploaded file (actual file size, compressed or not; -1 for no limit).
For webserver mode only (ignored in standalone mode):
* `batch_system_type`: local to run all jobs locally; sge or slurm to use a cluster scheduler.
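As an illustration, here is a minimal sketch of this section; the `[global]` section name and all values are assumptions, only the keys are the documented ones:

    [global]
    config_dir = ~/.dgenies
    upload_folder = /tmp/dgenies
    data_folder = ~/.dgenies/data
    threads_local = 4
    web_url = http://dgenies.example.org
    # -1 disables a size limit (see the descriptions above)
    max_upload_size = -1
    max_upload_size_ava = -1
    max_upload_file_size = -1
    # webserver mode only: local, sge or slurm
    batch_system_type = local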
### Softwares
Paths to the external software should be set here:
* `minimap2`: path to the minimap2 software (keep as is to use the built-in one)
And if you use a cluster:
* `minimap2_cluster`: path to minimap2 on the cluster
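A hypothetical excerpt, with illustrative paths (the `[softwares]` section name is assumed from the heading):

    [softwares]
    # keep the default value here to use the built-in binary
    minimap2 = /usr/local/bin/minimap2
    # only needed on a cluster: path as seen from the nodes
    minimap2_cluster = /shared/bin/minimap2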
### Debug
Some parameters for debugging:
* `enable`: True to enable debug mode
* `log_dir`: folder in which to store the debug logs
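For instance (section name and values are illustrative):

    [debug]
    enable = False
    log_dir = ~/.dgenies/debug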
### Cluster
This section concerns only the webserver mode with *batch_system_type* not set to *local*.
* `drmaa_lib_path`: absolute path to the DRMAA library. Required, as we use the DRMAA library to submit jobs to the cluster.
* `native_specs`: how to set memory, time and number of CPUs on the cluster (should be kept as default).
By default, small jobs are still launched locally even if *batch_system_type* is not set to *local*, as long as not too many of these jobs are already running or waiting. These limits can be customized:
* `max_run_local`: max number of jobs running locally (if this number is reached, subsequent jobs are submitted to the cluster regardless of their size). Set it to 0 to run all jobs on the cluster.
* `max_wait_local`: max number of jobs waiting for a local run (if this number is reached, subsequent jobs are submitted to the cluster regardless of their size). Set it to 0 to run all jobs on the cluster.
You can also customize the size above which jobs are submitted to the cluster. If either of these limits is reached, the job is submitted to the cluster:
* `min_query_size`: minimum size for the query (uncompressed).
* `min_size_target`: minimum size for the target (uncompressed).
Other parameters:
* `prepare_script`: absolute path to the `all_prepare.py` script obtained in the section [above](#running-with-a-cluster).
* `python3_exec`: path to the python3 executable on the cluster.
* `memory`: max memory to reserve on the cluster.
* `memory_ava`: max memory to reserve on the cluster in all-vs-all mode (should be higher than `memory`).
* `threads`: number of threads for launching jobs on the cluster (must be a divisor of the memory).
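Putting these keys together, a hypothetical `[cluster]` excerpt might look like this (all paths and values are illustrative, not defaults):

    [cluster]
    drmaa_lib_path = /usr/lib/slurm-drmaa/lib/libdrmaa.so
    # keep the shipped default for native_specs
    max_run_local = 10
    max_wait_local = 5
    # jobs at or above either size (uncompressed) go to the cluster
    min_query_size = 500000000
    min_size_target = 700000000
    prepare_script = /shared/dgenies/bin/all_prepare.py
    python3_exec = /usr/bin/python3
    # memory_ava should be higher than memory; threads must divide memory
    memory = 32
    memory_ava = 64
    threads = 8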
### Database
This section concerns only the webserver mode.
In webserver mode, we use a database to store jobs.
* `type`: sqlite or mysql. We recommend mysql for better performance.
* `url`: path to the sqlite file, or URL of the mysql server (localhost if the mysql server is on the same machine).
If type is mysql, some other parameters must be filled in:
* `port`: port used to connect to mysql (3306 by default).
* `db`: name of the database.
* `user`: username used to connect to the database.
* `password`: the associated password.
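For example, a mysql setup could be sketched as follows (the `[database]` section name and all values are illustrative):

    [database]
    type = mysql
    url = localhost
    # the following keys are only needed with mysql
    port = 3306
    db = dgenies
    user = dgenies
    password = change_me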
### Mail
This section concerns only the webserver mode.
At the end of a job, an email is sent to the user to notify them that the job has finished.
* `status`: email address used to send status emails.
* `reply`: email address to use as reply-to.
* `org`: name of the organisation that sends the emails.
* `send_mail_status`: True to send status emails, False otherwise. Should be True in production.
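An illustrative sketch (section name and addresses are placeholders):

    [mail]
    status = status@dgenies.example.org
    reply = support@dgenies.example.org
    org = Example Org
    send_mail_status = True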
### Cron
This section concerns only the webserver mode.
We use crons to launch the local scheduler that starts jobs, and some scripts that clean up old jobs.
* `clean_time`: time at which the clean job is launched (example: 1h00)
* `clean_freq`: frequency of the clean job execution
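For instance (values are illustrative; check your installation for the expected frequency unit):

    [cron]
    clean_time = 1h00
    clean_freq = 1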
### Jobs
This section concerns only the webserver mode.
Several parameters for jobs:
* `run_local`: max number of concurrent jobs launched locally.
* `data_prepare`: max number of data-preparation jobs launched locally.
* `max_concurrent_dl`: max number of concurrent file uploads allowed.
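An illustrative sketch (section name and values are assumptions):

    [jobs]
    run_local = 1
    data_prepare = 2
    max_concurrent_dl = 5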
Maintenance
-----------
The `dgenies` command can be used to perform some maintenance tasks.
**Clear all jobs:**
dgenies clear -j [--max-age <age>]
`--max-age` (opt): set the max age of jobs to delete (default: 0, for all)
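For example, to delete jobs older than a given age (the unit is assumed to be days here):

    dgenies clear -j --max-age 7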
**Clear all logs:**
dgenies clear -l
**Clear crons (webserver mode):**
dgenies clear -c
Gallery
-------
To add a job to the gallery, copy the illustrating picture file into the *gallery* folder inside the data folder (*~/.dgenies/data/gallery* by default; create it if it does not exist). Then use the *dgenies* command: