diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 4211bdf23d6dd49b07af370637412618b27ecdbb..d7aa977221fd77f7b2740083ca26d0df394693c4 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -68,9 +68,11 @@
 * `detect-secrets scan --update .secrets.baseline` to update the secret baseline, then
 * `detect-secrets audit .secrets.baseline` to tag it as a false positive if relevant.
-### Test data
-`docker compose up` will start an Elasticsearch instance
-`./scripts/harvest.sh -jsonDir data/test/ -es_host localhost -env dev -v` will load the test data with the necessary indices and mappings.
+### Run backend tests
+After loading test data into your local Elasticsearch instance, run the backend tests with:
+
+`./gradlew test jacocoTestReport -s sonarqube`
+
 ## Testing recommendations
 Behaviour driven development (upon [TDD](https://dannorth.net/2012/05/31/bdd-is-like-tdd-if/)) is recommended for all new developments.
diff --git a/HOW-TO-LOAD-DATA.md b/HOW-TO-LOAD-DATA.md
new file mode 100644
index 0000000000000000000000000000000000000000..85d1b0e22a5d9c23b369470e95623ef02b012a1f
--- /dev/null
+++ b/HOW-TO-LOAD-DATA.md
@@ -0,0 +1,143 @@
+# TL;DR
+For general data loading commands, see [Data Harvesting and Indexing](#data-harvesting-and-indexing).
+For loading test data, see [Test data](#test-data).
+
+
+## Data Harvesting and Indexing
+
+Before anything else, make sure the data is available locally before running any indexing script.
+
+### Indexing commands without Docker
+
+Start an Elasticsearch instance using:
+```sh
+docker compose up
+```
+Then, to index the data in the specified directory (`/path/to/local/data/`) into the Elasticsearch instance running on localhost, run the following command:
+
+```sh
+./scripts/harvest.sh -jsonDir /path/to/local/data/ -es_host localhost -env dev -v
+```
+See `./scripts/harvest.sh --help` for more information.
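Before running `harvest.sh`, it can help to check that the Elasticsearch instance started by `docker compose up` is accepting connections and that the JSON directory is not empty. A sketch with hypothetical helper names (`wait_for_port` and `check_json_dir` are not part of the repository's scripts):

```sh
#!/usr/bin/env bash
# Hypothetical preflight checks before ./scripts/harvest.sh:
# 1. Elasticsearch answers on its port, 2. the -jsonDir actually holds data.

# wait_for_port HOST PORT [RETRIES] — poll a TCP port once per second (bash /dev/tcp).
wait_for_port() {
  local host=$1 port=$2 retries=${3:-30} i
  for ((i = 0; i < retries; i++)); do
    # the subshell exits 0 only if the TCP connection succeeds
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
      return 0
    fi
    sleep 1
  done
  echo "error: $host:$port not reachable after $retries attempt(s)" >&2
  return 1
}

# check_json_dir DIR — fail early on a missing or empty -jsonDir.
check_json_dir() {
  local dir=$1
  [ -d "$dir" ] || { echo "error: '$dir' is not a directory" >&2; return 1; }
  ls "$dir"/*.json >/dev/null 2>&1 \
    || { echo "error: no .json files in '$dir'" >&2; return 1; }
}

# Usage:
#   wait_for_port localhost 9200 && check_json_dir /path/to/local/data \
#     && ./scripts/harvest.sh -jsonDir /path/to/local/data/ -es_host localhost -env dev -v
```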
+
+### Indexing commands with Docker per Operating System
+
+The FAIDARE Docker image uses Alpine Linux as its base, which can lead to compatibility issues on certain systems, such as macOS with ARM processors (Apple Silicon). Below are the specific instructions for running the indexing command on Linux, macOS, and Windows:
+- Finding the container for `--network`
+  To determine the name of the container to use with the `--network=container:<container_name>` option, run the following command:
+```sh
+docker ps
+```
+This lists all running containers. Look for the container name in the NAMES column. For example, if the container name is `elasticsearch-faidare`, you would use it as follows:
+```sh
+--network=container:elasticsearch-faidare
+```
+
+1. Linux:
+   Run the command as is:
+```sh
+docker run -t --volume /path/to/local/data:/opt/data/ --network=container:elasticsearch-faidare registry.forgemia.inra.fr/urgi-is/docker-rare/faidare-loader:latest -jsonDir /opt/data/
+```
+2. macOS:
+   On Apple Silicon (ARM64), ensure Rosetta is enabled for Docker Desktop and specify the platform explicitly:
+```sh
+docker run --platform linux/amd64 -t --volume /path/to/local/data:/opt/data/ --network=container:elasticsearch-faidare registry.forgemia.inra.fr/urgi-is/docker-rare/faidare-loader:latest -jsonDir /opt/data/
+```
+For Intel-based Macs, no additional flags are needed:
+```sh
+docker run -t --volume /path/to/local/data:/opt/data/ --network=container:elasticsearch-faidare registry.forgemia.inra.fr/urgi-is/docker-rare/faidare-loader:latest -jsonDir /opt/data/
+```
+3. Windows:
+   Adapt the volume path to the Windows format (e.g. `C:/path/to/local/data`):
+```sh
+docker run -t --volume C:/path/to/local/data:/opt/data/ --network=container:elasticsearch-faidare registry.forgemia.inra.fr/urgi-is/docker-rare/faidare-loader:latest -jsonDir /opt/data/
+```
+
+For more help, add the `--help` parameter to the command.
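The per-OS differences above boil down to one extra flag on Apple Silicon. If you script the indexing, that flag can be computed instead of hard-coded; a sketch (the `docker_platform_flag` helper is hypothetical, not part of the repository):

```sh
# Print the extra flag needed for the amd64-only loader image on the current
# machine; empty on amd64 hosts (Linux, Intel Macs, Windows/WSL).
docker_platform_flag() {
  local os=${1:-$(uname -s)} arch=${2:-$(uname -m)}
  case "$os/$arch" in
    Darwin/arm64|Darwin/aarch64) echo "--platform linux/amd64" ;;
    *) ;;  # no extra flag needed
  esac
}

# Usage:
#   docker run $(docker_platform_flag) -t --volume /path/to/local/data:/opt/data/ \
#     --network=container:elasticsearch-faidare \
#     registry.forgemia.inra.fr/urgi-is/docker-rare/faidare-loader:latest -jsonDir /opt/data/
```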
+
+If you depend on committed changes to the indexing scripts under a specific branch (the Docker image should have been built automatically by the CI), you need to change the tag of the Docker image according to the branch name (i.e. for branch `epic/merge-faidare-dd`, use tag `epic-merge-faidare-dd`; see the `CI_COMMIT_REF_SLUG` [GitLab predefined variable](https://docs.gitlab.com/ee/ci/variables/predefined_variables.html#predefined-variables-reference)), as follows:
+
+```sh
+docker run -t --volume /path/to/local/data:/opt/data/ --network=container:elasticsearch-faidare registry.forgemia.inra.fr/urgi-is/docker-rare/faidare-loader:epic-merge-faidare-dd -jsonDir /opt/data/
+```
+
+### Docker container maintenance
+
+The [Data Harvesting and Indexing](#data-harvesting-and-indexing) section above expects an available Docker image on the ForgeMIA Docker registry. The GitLab CI rebuilds it when needed, but you can update or push such an image using the following commands:
+
+```sh
+# build the image
+docker build -t registry.forgemia.inra.fr/urgi-is/docker-rare/faidare-loader:latest .
+
+# log in before pushing the image
+TOKEN= # your PAT from ForgeMIA
+echo "$TOKEN" | docker login registry.forgemia.inra.fr/urgi-is/docker-rare -u <your ForgeMIA username> --password-stdin
+
+# push the built image
+docker push registry.forgemia.inra.fr/urgi-is/docker-rare/faidare-loader:latest
+```
+
+That should ease the indexing of data without having to craft a dedicated environment.
+
+
+
+## Test data
+`docker compose up` will start an Elasticsearch instance.
+
+To load the test data with the necessary indices and mappings, you can use one of the following methods:
+1. Using the Docker image (recommended for simplicity and consistency).
+2. Running the script locally on your machine.
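Whichever method you pick, branch-specific loader images are tagged with GitLab's `CI_COMMIT_REF_SLUG`, as described in the indexing section above. When unsure which tag a branch produces, the slug can be approximated locally (our convenience sketch, not GitLab's exact implementation):

```sh
# Approximate GitLab's CI_COMMIT_REF_SLUG: lowercased, everything outside
# [a-z0-9] replaced by '-', truncated to 63 characters, leading/trailing
# '-' stripped.
ref_slug() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]/-/g' \
    | cut -c1-63 \
    | sed -e 's/^-*//' -e 's/-*$//'
}

# e.g. ref_slug 'epic/merge-faidare-dd' prints 'epic-merge-faidare-dd'
```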
+
+### Option 1: Using the Docker Image
+For detailed instructions on using the FAIDARE Docker image, refer to the relevant section of the [README.md](https://forgemia.inra.fr/urgi-is/faidare/-/blob/fix/NewReadMeHowToDevelopOnFaidare/README.md#data-harvesting-and-indexing).
+```sh
+docker run -t --volume ./data/test:/opt/data/ --network=container:elasticsearch-faidare registry.forgemia.inra.fr/urgi-is/docker-rare/faidare-loader:latest -jsonDir /opt/data/
+```
+*NB*: adapt the Docker command to your [operating system](#indexing-commands-with-docker-per-operating-system).
+
+*NB2*: ensure you have an up-to-date access token for the container registry. If you do not, you can generate one from the [ForgeMIA](https://forgemia.inra.fr/urgi-is/docker-rare/-/settings/access_tokens) website or contact us.
+
+For instance, for macOS ARM on the `fix/NewReadMeHowToDevelopOnFaidare` branch, the command would be:
+```sh
+docker compose up
+
+docker run --platform linux/amd64 -t --volume ./data/test:/opt/data/ \
+--network=container:elasticsearch-faidare registry.forgemia.inra.fr/urgi-is/docker-rare/faidare-loader:fix-newreadmehowtodeveloponfaidare \
+-jsonDir /opt/data/
+```
+
+### Option 2: Running the Script Locally
+If you prefer, you can run the `harvest.sh` script directly on your machine. However, please ensure the following dependencies are installed:
+- jq (v1.6+): https://github.com/stedolan/jq/releases/tag/jq-1.6
+- GNU parallel: https://www.gnu.org/software/parallel/
+- gzip: http://www.gzip.org/
+
+**Instructions by Operating System**
+
+1. Linux: Run the script as follows:
+
+`./scripts/harvest.sh -jsonDir data/test/ -es_host localhost -env dev -v`
+
+2. macOS: Ensure GNU utilities like `readlink` are available. If needed, install them using Homebrew:
+
+`brew install coreutils parallel jq gzip`
+
+Then run:
+
+`./scripts/harvest.sh -jsonDir data/test/ -es_host localhost -env dev -v`
+
+3. 
Windows: You can run the script using a Bash environment such as Git Bash, WSL, or Cygwin. Ensure all required dependencies are installed within that environment:
+
+`./scripts/harvest.sh -jsonDir data/test/ -es_host localhost -env dev -v`
+
+Note for macOS and Windows users: compatibility issues might arise due to system differences. If you encounter any, it is recommended to use the Docker-based method.
diff --git a/README.md b/README.md
index f49182dcc156aff02c7d0fd22c3bef276dbf6570..148fe0e3eefb552a611eff58bb0456e3097f5a3b 100644
--- a/README.md
+++ b/README.md
@@ -1,70 +1,81 @@
 # FAIDARE: FAIR Data-finder for Agronomic Research
-This application provides web services (based on the BrAPI standard) and a web interface with easy to use filters to facilitate the access to plant datasets from a federation of sources.
+FAIDARE is an application that provides web services (based on the BrAPI standard) and a user-friendly web interface to access plant datasets from a federation of sources.
 [[_TOC_]]
-## How to contribute
+## Overview
+- Purpose: Facilitate access to federated plant datasets for agronomic research.
+- Key Features:
+
+  - BrAPI-compliant web services.
+  - Intuitive filters for dataset exploration.
+  - Support for Elasticsearch and Kibana.
+
+## How to Contribute
 Look at the [contribution guide](CONTRIBUTING.md).
-## Install development environment
+## Data loading
+
+To load data into the FAIDARE Elasticsearch instance, see [HOW-TO-LOAD-DATA.md](HOW-TO-LOAD-DATA.md).
-- Install `node` and `yarn`
+## Setting Up the Development Environment
-Installation via `nvm` is recommended for easier control of installed version:
-https://github.com/creationix/nvm
+### Prerequisites
+1. Node.js and Yarn
+Install Node.js (v16.14.0 recommended) and Yarn. Using nvm is advised for easier control of the installed version: https://github.com/creationix/nvm.
 ```sh
 nvm install 16.14.0
 nvm use v16.14.0
 ```
+2. Java JDK 17
+Install the latest JDK 17 version for your operating system.
-- Install JS dependencies
+3. Docker
+Required to run Elasticsearch and Kibana locally. Ensure Docker and Docker Compose are installed.
+### Installation Steps
+1. Install JavaScript Dependencies
+Navigate to the web directory and install dependencies:
 ```sh
 cd web
 yarn
 ```
-- Install latest Java JDK8
-
-See latest instructions for your operating system.
-
-- (Optional) Install `docker`
-
-If you want to run an Elasticsearch and Kibana instance on your machine.
-You can use your favorite package manager for that
-
-
-## Run backend development server
-
-First make sure you have access to an Elasticsearch HTTP API server on `http://127.0.0.1:9200` (either via ssh tunneling or by running a local server).
-
-If you want to run an Elasticsearch server on your development machine you can use the `docker`/`docker-compose` configuration like so:
-
+2. Start Elasticsearch and Kibana
+Launch using Docker Compose:
 ```sh
 docker compose up
 ```
-> This will launch an Elasticsearch server (with port forwarding `9200`) and a Kibana server (with port forwarding `5601`)
+- Elasticsearch available at http://127.0.0.1:9200
+
+- Kibana available at http://127.0.0.1:5601
+
+  Note: Prepare your Elasticsearch indices before proceeding.
-> **Warning**: This repository does not automatically index data into Elasticsearch, you need to prepare your indices beforehand.
+## Running the Backend Server
+### Basic API Server
-If you just need access to API, you can run:
+Run the backend server with:
 ```sh
 ./gradlew bootRun
 ```
-If you are developing and need to work on the `web` assets (scripts, styles, etc),
-you'll need to run the application with the `dev` profile:
+### Development Server
+
+If you are working on frontend assets, start the backend with the dev profile:
 ```sh
 ./gradlew bootRun --args='--spring.profiles.active=dev'
 ```
-Otherwise, for the complete server (backend APIs + web interface), you can run:
+### Complete Backend + Web Interface
+
+For the full application:
 ```sh
 ./gradlew assemble && java -jar backend/build/libs/faidare.jar
@@ -92,45 +103,6 @@
 otherwise the changes won't be shown in the browser.
 
 `yarn watch:prod` is also available to use production settings, while `yarn build`
 and `yarn build:prod` do the same but without watching the changes.
 
-## Harvest
-
-Before all, take care to get data locally before running any indexing script.
-
-### TL;DR
-
-Data indexing to your local Elasticsearch is done using the following command (take care to change the path to local data). Note that your local Elasticsearch instance should be already runing using `docker-compose up`:
-
-```sh
-docker run -t --volume /path/to/local/data:/opt/data/ --network=container:elasticsearch-faidare registry.forgemia.inra.fr/urgi-is/docker-rare/faidare-loader:latest -jsonDir /opt/data/ --help
-```
-
-Remove the `--help` parameter to run the loading with default params.
-
-If you depend on committed changes in indexing scripts under a specific branch (the docker image should have been automatically created by the CI), you need to change the tag of the docker image according to the branch name (ie. 
for branch `epic/merge-faidare-dd`, use tag `epic-merge-faidare-dd`, see `CI_COMMIT_REF_SLUG` [Gitlab predefined variable](https://docs.gitlab.com/ee/ci/variables/predefined_variables.html#predefined-variables-reference)), as following:
-
-```sh
-docker run -t --volume /path/to/local/data:/opt/data/ --network=container:elasticsearch-faidare registry.forgemia.inra.fr/urgi-is/docker-rare/faidare-loader:epic-merge-faidare-dd` -jsonDir /opt/data/ --help
-```
-
-### Portability
-
-#### Docker
-
-[TL;DR](#TLDR) section above expects to have an available docker image on the forgemia docker registry. The Gitlab CI rebuil it when needed, but you can update or push such an image using the following commands:
-
-```sh
-# build the image
-docker build -t registry.forgemia.inra.fr/urgi-is/docker-rare/faidare-loader:latest .
-
-# Login before pushing the image
-docker login registry.forgemia.inra.fr/urgi-is/docker-rare -u <your ForgeMIA username>
-
-# push the built image
-docker push registry.forgemia.inra.fr/urgi-is/docker-rare/faidare-loader:latest
-```
-
-That should ease the indexing of data without having to craft a dedicated environment.
-
 ## GitLab CI
 
 When creating merge requests on the ForgeMIA GitLab, the GitLab CI will
@@ -138,19 +110,25 @@
 automatically run the tests of the project (no need to do anything).
 
 If you want to run the GitLab CI locally, you have to follow this steps:
 
-1. [Install gitlab-runner](https://docs.gitlab.com/runner/install/)
-2. Run the following command (with the correct GnpIS security token):
+### Important:
+The `gitlab-runner exec` command was deprecated in GitLab Runner 15.7 (December 2022) as part of the breaking changes announced for GitLab Runner 16.0, and it has since been fully removed: it is no longer available in GitLab Runner 17.0 (released May 2024).
-
-```sh
-gitlab-runner exec docker test
-```
+For more information, see the official [deprecation notice](https://docs.gitlab.com/ee/update/deprecations.html#the-gitlab-runner-exec-command-is-deprecated).
+
+### Alternatives to gitlab-runner exec:
+1. Use the Validate option on GitLab
+GitLab provides an integrated Pipeline Editor that allows you to validate and simulate the execution of your .gitlab-ci.yml file before actually running the pipeline.
+To use this feature, go to CI/CD > Pipelines > Editor in your GitLab project. There, you can paste your .gitlab-ci.yml file and click the Validate button to check its syntax and simulate its execution.
+
+2. Emulators
+Although `gitlab-runner exec` is no longer available, third-party tools and emulators can help simulate GitLab CI pipelines locally. These tools may be useful for testing and troubleshooting, though they may not replicate the GitLab CI environment exactly.
 
 ## Spring Cloud config
 
 On bootstrap, the application will try to connect to a remote Spring Cloud config server to fetch its configuration.
-The details of this remote server are filled in the `bootstrap.yml` file.
+The details of this remote server are filled in the `bootstrap.yml` file. (TODO: this file is not found)
 
 By default, it tries to connect to the remote server on http://localhost:8888 but it can of course be changed, or even configured via the `SPRING_CONFIG_URI` environment variable.
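The precedence for that one setting can be sketched in shell terms (the `effective_config_uri` helper is hypothetical; the real resolution is done by Spring Cloud itself):

```sh
# The SPRING_CONFIG_URI environment variable wins; otherwise the default
# config server address (http://localhost:8888) applies.
effective_config_uri() {
  echo "${SPRING_CONFIG_URI:-http://localhost:8888}"
}

# e.g. SPRING_CONFIG_URI=http://config.example.org:8888 ./gradlew bootRun
```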