>> This is a work in progress. Contact us if you have questions.

# About

This project is designed to extract entities (i.e., `taxa`, `phenotypes`, `habitats`, `disease names`, `hosts`, `pathogen`, `vector`, `dates` and `geographic names`) from textual data for the purpose of scientific watch.

The project contains a workflow based on the [AlvisNLP](https://github.com/Bibliome/alvisnlp) framework and uses the OntoBiotope ontology and the NCBI taxonomy.


## Usage
The workflow runs from the command line (e.g., `GNU bash, version 4.4.x`) with `singularity version 3.4.x`
installed on your computer ([how to install Singularity?](https://sylabs.io/guides/3.4/user-guide/quick_start.html#quick-installation-steps)).
It is compatible with [`AlvisNLP version 0.7.1`](https://github.com/Bibliome/alvisnlp/tree/0.7.1), provided as a [Singularity](https://sylabs.io/) image.
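
The prerequisites can be checked before going further. The script below is a minimal sketch and not part of the project: the function names `version_ge`, `total_mem_gib` and `check_prerequisites` are hypothetical, the version comparison assumes GNU `sort -V`, and the memory figure is only available on Linux, where `/proc/meminfo` exists. It also reports total memory because the test corpus has a stated RAM requirement.

```shell
#!/bin/sh
# Pre-flight check: an illustrative sketch, not part of the project.

# Succeeds when dotted version $1 is greater than or equal to $2.
# Relies on `sort -V` (GNU coreutils).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Prints total memory in whole GiB, read from a /proc/meminfo-style file.
total_mem_gib() {
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 / 1024 }' "${1:-/proc/meminfo}"
}

check_prerequisites() {
    if command -v singularity >/dev/null 2>&1; then
        # Assumption: the version is the last word of the output,
        # e.g. "singularity version 3.4.2".
        found="$(singularity --version | awk '{ print $NF }')"
        if version_ge "$found" "3.4"; then
            echo "singularity $found: OK"
        else
            echo "singularity $found is older than 3.4"
        fi
    else
        echo "singularity not found in PATH"
    fi
    if [ -r /proc/meminfo ]; then
        echo "total memory: $(total_mem_gib) GiB"
    fi
}

check_prerequisites
```

The memory value is rounded down and excludes memory reserved by the kernel, so a machine sold with 16 GB typically reports 15 here.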

Run the following steps to test the workflow. A test corpus is provided in `corpus/pesv/Xylella-test/txt/`; 16 GB of RAM is required to process it.


1. Clone the project.

```
git clone https://forgemia.inra.fr/mandiayba/pesv-tm.git
cd pesv-tm
```

2. Pull the Singularity image of AlvisNLP.
> A `login` and `password` are required to pull the AlvisNLP Singularity image from forgemia. Please contact the maintainer if you do not have permission.

```
cd pesv-tm/
singularity pull alvisnlp.sif oras:registry.forgemia.inra.fr/bibliome/pesv-tm/alvisnlp:0.7.1
```

3. Run the workflow.
> Execute the workflow on the test corpus `corpus/pesv/Xylella-test/txt/`. Results are stored in `corpus/pesv/Xylella-test/`.

```
cd pesv-tm/

./alvisnlp.sif -J-Xmx16G -verbose -cleanTmp \
-alias input corpus/pesv/Xylella-test/txt/ \
-outputDir corpus/pesv/Xylella-test/ \
-entity ontobiotope resources/BioNLP-OST+EnovFood \
-feat inhibit-syntax inhibit-syntax \
plans/PESV_workflow.plan
```

4. View the results in `corpus/Xylella/visualisation_html`.

* You can browse the results with the `-browser` option: run the following command, check the logs, and go to [http://localhost:8878](http://localhost:8878).

```
cd pesv-tm/

./alvisnlp.sif -J-Xmx16G -verbose -cleanTmp \
-browser \
-alias input corpus/pesv/Xylella-test/txt/ \
-outputDir corpus/pesv/Xylella-test/ \
-entity ontobiotope resources/BioNLP-OST+EnovFood \
-feat inhibit-syntax inhibit-syntax \
plans/PESV_workflow.plan
```
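
The run command above can be wrapped in a small function to try the workflow on another directory of text files. This is only a sketch under assumptions: the function `run_workflow` and the variables `IMAGE`, `JAVA_HEAP` and `DRY_RUN` are hypothetical names, and it assumes that the same plan, resources and options are suitable for other corpora, which is not confirmed here. Only the options already shown in the commands above are used.

```shell
#!/bin/sh
# Illustrative wrapper, not part of the project.

# Hypothetical defaults; override them through the environment.
IMAGE="${IMAGE:-./alvisnlp.sif}"   # image name used in the run commands above
JAVA_HEAP="${JAVA_HEAP:-16G}"      # Java heap size passed through -J-Xmx

# Usage: run_workflow INPUT_DIR OUTPUT_DIR
# Must be called from the root of the cloned project, because the plan
# and resource paths are relative. With DRY_RUN=1 the command is only printed.
run_workflow() {
    wf_input="$1"
    wf_output="$2"
    # Rebuild the positional parameters as the full command line.
    set -- "$IMAGE" "-J-Xmx${JAVA_HEAP}" -verbose -cleanTmp \
        -alias input "$wf_input" \
        -outputDir "$wf_output" \
        -entity ontobiotope resources/BioNLP-OST+EnovFood \
        -feat inhibit-syntax inhibit-syntax \
        plans/PESV_workflow.plan
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}
```

For example, `DRY_RUN=1 run_workflow my-corpus/txt/ my-corpus/out/` prints the command without executing it, which makes it easy to review the paths before starting a long run. The directory names in this example are placeholders.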

## Maintainer
Mouhamadou Ba: mouhamadou.ba@inrae.fr