Commit 1f0058f7 authored by Mouhamadou Ba
Update README.md

parent 68956b82
# epmc-crawler
Access full-text articles from Europe PMC.
Get full-text XML articles from DOIs, PMIDs, and/or PMCIDs.
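For orientation, here is a minimal sketch of what fetching one full-text XML article from Europe PMC can look like. It assumes the public Europe PMC REST endpoint `fullTextXML`. The helper name `epmc_fulltext_url` and the example identifier are illustrative only and are not part of this repository. The crawler's own workflow may proceed differently.

```shell
# Sketch only: epmc_fulltext_url is a hypothetical helper, not part of epmc-crawler.

# Build the Europe PMC REST URL that serves the full-text XML of one PMCID.
epmc_fulltext_url() {
  printf 'https://www.ebi.ac.uk/europepmc/webservices/rest/%s/fullTextXML\n' "$1"
}

# Print the URL for an example identifier.
epmc_fulltext_url PMC3257301

# To download the article (needs network access), uncomment:
# curl -fsSL "$(epmc_fulltext_url PMC3257301)" -o PMC3257301.xml
```

Full-text XML is only served for articles that Europe PMC holds in full text, so a download may fail for some identifiers.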
## install
```
git clone https://forgemia.inra.fr/mandiayba/epmc-crawler.git
cd epmc-crawler
conda env create -f softwares/envs/snakemake-5.13.0-env.yaml
```
## usage (on migale)
* corpus from dois
```
conda activate snakemake-5.13.0-env
snakemake --nolock --verbose --printshellcmds --use-singularity --use-conda --reason --latency-wait 60 --jobs 2 --snakefile get_corpus_from_doiss.snakefile all --cluster "qsub -v PYTHONPATH='' -l mem_free=4G -V -cwd -e log/ -o log/ -q short.q -pe thread 2" --config DOIS_FILE=data/pmids.txt CORPUS_FOLDER=data/corpus
```
* corpus from pmids
```
conda activate snakemake-5.13.0-env
snakemake --nolock --verbose --printshellcmds --use-singularity --use-conda --reason --latency-wait 60 --jobs 2 --snakefile get_corpus_from_pmids.snakefile all --cluster "qsub -v PYTHONPATH='' -l mem_free=4G -V -cwd -e log/ -o log/ -q short.q -pe thread 2" --config PMID_FILE=data/pmids.txt CORPUS_FOLDER=data/corpus
```
* corpus from pmcids
```
conda activate snakemake-5.13.0-env
snakemake --nolock --verbose --printshellcmds --use-singularity --use-conda --reason --latency-wait 60 --jobs 2 --snakefile get_corpus_from_doiss.snakefile all --cluster "qsub -v PYTHONPATH='' -l mem_free=4G -V -cwd -e log/ -o log/ -q short.q -pe thread 2" --config DOIS_FILE=data/pmcids.txt CORPUS_FOLDER=data/corpus
```
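The invocations above differ only in the snakefile, the config key, and the identifier file. The sketch below prints the command for a given combination so it can be checked before submission. `build_crawl_cmd` is a hypothetical helper, not part of this repository. It passes both `KEY=VALUE` pairs to a single `--config` option, which is the form Snakemake documents.

```shell
# Sketch only: build_crawl_cmd is a hypothetical helper, not part of epmc-crawler.
# Arguments: snakefile, config key for the ID file, ID file, optional corpus folder.
# It only prints the command; nothing is executed or submitted.
build_crawl_cmd() {
  snakefile="$1"; id_key="$2"; id_file="$3"; corpus="${4:-data/corpus}"
  # Common Snakemake options, copied from the usage examples.
  printf '%s ' snakemake --nolock --verbose --printshellcmds \
    --use-singularity --use-conda --reason --latency-wait 60 --jobs 2 \
    --snakefile "$snakefile" all
  # Cluster submission command, kept quoted as one argument.
  printf '%s ' --cluster "\"qsub -v PYTHONPATH='' -l mem_free=4G -V -cwd -e log/ -o log/ -q short.q -pe thread 2\""
  # Both config values follow one --config option.
  printf '%s\n' "--config $id_key=$id_file CORPUS_FOLDER=$corpus"
}

# Print the command for the PMID workflow.
build_crawl_cmd get_corpus_from_pmids.snakefile PMID_FILE data/pmids.txt
```

After reviewing the printed line, it can be pasted into a shell in which the conda environment is active.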