# Rare project - Data discovery

## Setup

### Backend

The project uses Spring (5.x) for the backend,
with Spring Boot.

You need to install:

- a recent JDK 8

Then at the root of the application, run `./gradlew build` to download the dependencies.
Then run `./gradlew bootRun` to start the app.

### Frontend

The project uses Angular (6.x) for the frontend,
with the Angular CLI.

You need to install:

- a recent Node.js (8.11+)
- Yarn as a package manager (see the [installation instructions](https://yarnpkg.com/en/docs/install))

Then in the `frontend` directory, run `yarn` to download the dependencies.
Then run `yarn start` to start the app, using the proxy conf to reroute calls to `/api` to the backend.

The application will be available on http://localhost:4200

## Build

To build the app, just run:

    ./gradlew assemble

This builds a standalone jar at `backend/build/libs/rare.jar`, which you can run with:

    java -jar backend/build/libs/rare.jar

The full app then runs on http://localhost:8080


## CI

The `.gitlab-ci.yml` file describes how GitLab runs the CI jobs.

It uses a base Docker image named `ninjasquad/docker-rare`,
available on [Docker Hub](https://hub.docker.com/r/ninjasquad/docker-rare/)
and [GitHub](https://github.com/Ninja-Squad/docker-rare).
The image is based on `openjdk:8` and adds a Chrome binary to let us run the frontend tests
(with a headless Chrome in `--no-sandbox` mode).

On CI, we install `node` and `yarn` in `/tmp` (this is not the case for local builds)
to avoid symbolic link issues on Docker.

You can approximate what runs on CI by executing:

    docker run --rm -v "$PWD":/home/rare -w /home/rare ninjasquad/docker-rare ./gradlew build

## Harvest

Harvesting (i.e. importing genetic resources stored in JSON files into ElasticSearch) consists of
placing the JSON files into a directory where the server can find them.

By default, this directory is `/tmp/rare/resources`. It is externalized into the Spring Boot property
`rare.resource-dir`, so it can easily be changed by overriding the value of this property (using an
environment variable, for example).
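
As an illustration of the environment-variable route, here is a minimal sketch of the naming rule described in Spring Boot's relaxed-binding documentation (dots become underscores, dashes are removed, the result is upper-cased). The helper name `spring_env_var_name` is hypothetical, and whether this project binds the property in a way that follows this rule exactly is an assumption worth checking against the Spring Boot version in use.

```python
def spring_env_var_name(property_name: str) -> str:
    """Derive an environment variable name from a Spring Boot property name.

    Assumed rule (from Spring Boot's relaxed-binding documentation):
    replace dots with underscores, remove dashes, convert to upper case.
    """
    return property_name.replace(".", "_").replace("-", "").upper()


if __name__ == "__main__":
    # Prints the variable name to export before starting the server.
    print(spring_env_var_name("rare.resource-dir"))
```

Under that assumption, the server could be started with the variable set to the desired directory. Passing `--rare.resource-dir=<directory>` as a command-line argument to the jar is another standard Spring Boot way to override a property.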

The files must have the extension `.json`, and must be stored in that directory (not in a sub-directory).
Once the files are ready and the server is started, the harvest is triggered by sending a POST request
to the endpoint `/api/harvests`, without any request body.
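
Before triggering a harvest, it can be useful to check which files satisfy the two rules above. The following is a rough sketch, not part of the project: the function name `harvestable_files` is hypothetical, and it assumes the server matches the `.json` extension exactly (case-sensitively), which the README does not state.

```python
from pathlib import Path
from typing import List


def harvestable_files(resource_dir: str = "/tmp/rare/resources") -> List[str]:
    """Return the sorted names of `.json` files directly inside resource_dir.

    Files in sub-directories are deliberately ignored, since the harvest
    only looks at the directory itself.
    """
    directory = Path(resource_dir)
    return sorted(
        entry.name
        for entry in directory.iterdir()  # iterdir() does not recurse
        if entry.is_file() and entry.suffix == ".json"
    )


if __name__ == "__main__":
    for name in harvestable_files():
        print(name)
```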

Example with the `http` command ([HTTPie](https://httpie.org/)):

    http POST http://localhost:8080/api/harvests
    
Example with the `curl` command:

    curl -i -X POST http://localhost:8080/api/harvests
    
The harvest job runs asynchronously: the server responds immediately, and the `Location` header
of the response contains the URL from which the result of the job can be retrieved. For example:

    HTTP/1.1 201 
    Content-Length: 0
    Date: Tue, 24 Jul 2018 12:58:04 GMT
    Location: http://localhost:8080/api/harvests/abb5784d-3006-48fb-b5db-d3ff9583e8b9
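
The same request can be scripted. Below is a minimal sketch using only the Python standard library; the function names `trigger_harvest` and `job_url_from_headers` are hypothetical, and the default base URL assumes the server runs locally on port 8080 as in the examples above.

```python
import urllib.request


def job_url_from_headers(headers) -> str:
    """Extract the job URL from the `Location` header of the POST response."""
    location = headers.get("Location")
    if location is None:
        raise RuntimeError("no Location header in the harvest response")
    return location


def trigger_harvest(base_url: str = "http://localhost:8080") -> str:
    """Send a POST with an empty body to /api/harvests and return the job URL."""
    request = urllib.request.Request(
        f"{base_url}/api/harvests", data=b"", method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return job_url_from_headers(response.headers)


if __name__ == "__main__":
    print(trigger_harvest())
```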
    
To get the result of the job, you can then send a GET request to the returned URL:

    http GET http://localhost:8080/api/harvests/abb5784d-3006-48fb-b5db-d3ff9583e8b9

or

    curl http://localhost:8080/api/harvests/abb5784d-3006-48fb-b5db-d3ff9583e8b9
    
`http` has the advantage of nicely formatting the returned JSON.

The response is a detailed report containing the start instant and the list of files
that have been processed, each with the number of successfully imported resources and the errors
that occurred, if any.

The job is complete only once the `endInstant` property of the returned JSON is non-null.
Example of a completed report:

```
{
    "endInstant": "2018-07-24T12:56:28.077Z",
    "files": [
        {
            "errorCount": 0,
            "errors": [],
            "fileName": "rare_pilier_microbial.json",
            "successCount": 10
        },
        {
            "errorCount": 2,
            "errors": [
                {
                    "column": 4,
                    "error": "Error while parsing object: com.fasterxml.jackson.databind.exc.MismatchedInputException: Cannot deserialize instance of `java.lang.String` out of START_ARRAY token\n at [Source: UNKNOWN; line: -1, column: -1] (through reference chain: fr.inra.urgi.rare.domain.GeneticResource[\"name\"])",
                    "index": 4790,
                    "line": 105594
                },
                {
                    "column": 4,
                    "error": "Error while parsing object: com.fasterxml.jackson.databind.exc.MismatchedInputException: Cannot deserialize instance of `java.lang.String` out of START_ARRAY token\n at [Source: UNKNOWN; line: -1, column: -1] (through reference chain: fr.inra.urgi.rare.domain.GeneticResource[\"countryOfCollect\"])",
                    "index": 5905,
                    "line": 130127
                }
            ],
            "fileName": "rare_pilier_plant.json",
            "successCount": 14522
        }
    ],
    "globalErrors": [],
    "id": "55e70557-79e8-4e40-a44b-2ef4b3df076a",
    "startInstant": "2018-07-24T12:56:27.322Z"
}
```
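
Since the job is asynchronous, a client typically polls the job URL until `endInstant` is set and then inspects the counters. The sketch below shows one way to do that with the Python standard library. The function names and the 2-second polling interval are illustrative assumptions; only the JSON field names come from the report format shown above.

```python
import json
import time
import urllib.request


def is_complete(report: dict) -> bool:
    """A job is complete only once `endInstant` is non-null."""
    return report.get("endInstant") is not None


def summarize(report: dict) -> dict:
    """Total the per-file counters of a harvest report."""
    files = report.get("files", [])
    return {
        "complete": is_complete(report),
        "imported": sum(f["successCount"] for f in files),
        "errors": sum(f["errorCount"] for f in files),
        "global_errors": len(report.get("globalErrors", [])),
    }


def wait_for_harvest(job_url: str, interval_seconds: float = 2.0) -> dict:
    """Poll job_url with GET requests until the report is complete."""
    while True:
        with urllib.request.urlopen(job_url) as response:
            report = json.load(response)
        if is_complete(report):
            return report
        time.sleep(interval_seconds)
```

Applied to the sample report above, `summarize` gives 14532 imported resources (10 + 14522) and 2 errors.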