[feature] add support for user-provided experiments
Ideally, the user should be able to add a few (between 1 and 10?) of their own processed experiment files, to be clustered with the dataset and added to the heatmap.
The input file format will depend on the type of dataset selected.
- for RNA-seq data, a gene expression matrix whose first column contains the gene names and whose subsequent columns contain the expression data. Probably a TSV file, with gene names as Ensembl IDs? Or rather one file per experiment?
- for ChIP-seq data, a BED file of peaks for each experiment
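To make the RNA-seq option concrete, here is a minimal sketch of how a single-matrix TSV upload could be parsed. It assumes one header row, gene identifiers in the first column and one column per experiment. The function name `read_expression_matrix`, the column names, the gene IDs and the hard 1 to 10 limit are all placeholders for illustration, not decisions.

```python
import csv
import io


def read_expression_matrix(handle):
    """Parse a TSV expression matrix into {experiment: {gene_id: value}}."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    experiments = header[1:]  # first column holds the gene identifiers
    # Assumed limit, taken from the "between 1 and 10?" suggestion.
    if not 1 <= len(experiments) <= 10:
        raise ValueError("expected between 1 and 10 experiment columns")
    matrix = {name: {} for name in experiments}
    for row in reader:
        if not row:
            continue  # skip blank lines
        gene_id, values = row[0], row[1:]
        if len(values) != len(experiments):
            raise ValueError(f"wrong number of columns for gene {gene_id}")
        for name, value in zip(experiments, values):
            matrix[name][gene_id] = float(value)
    return matrix


# Made-up example content; the identifiers are placeholders.
example = io.StringIO(
    "gene_id\tuser_liver_1\tuser_liver_2\n"
    "ENSBTAG00000000005\t12.5\t0\n"
    "ENSBTAG00000000008\t3\t7.25\n"
)
user_matrix = read_expression_matrix(example)
```

With the one-file-per-experiment option, the same parser would simply be called once per file, each file having a single expression column.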
Steps to do for RNA-seq:
- optionally scale user data (e.g. log10(x+1) or asinh()?)
- compute new correlation values between the user-provided experiment(s) and all the selected FAANG experiments
- compute the new clustering
- display the heatmap
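The first two steps above could look roughly like the sketch below. It assumes Pearson correlation over the genes shared by both experiments, reference values that are already on the same scale as the transformed user data, and a correlation matrix stored as a dict keyed by experiment pairs. All names (`log_scale`, `pearson`, `extend_correlations`, the experiment and gene labels) are hypothetical. The point it illustrates is that only the new rows and columns need computing; the existing FAANG-vs-FAANG values are reused.

```python
import math


def log_scale(values):
    """Step 1 (optional): apply log10(x + 1) to every expression value."""
    return {gene: math.log10(x + 1.0) for gene, x in values.items()}


def pearson(a, b):
    """Pearson correlation over the genes present in both experiments."""
    genes = sorted(set(a) & set(b))
    if len(genes) < 2:
        return float("nan")
    xs = [a[g] for g in genes]
    ys = [b[g] for g in genes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for constant profiles
    return cov / math.sqrt(var_x * var_y)


def extend_correlations(corr, reference, user):
    """Step 2: add rows and columns for the user experiments.

    corr maps (name_a, name_b) -> r for the reference experiments and is
    copied, not modified.
    """
    extended = dict(corr)
    known = dict(reference)
    for name, values in user.items():
        for other, other_values in known.items():
            r = pearson(values, other_values)
            extended[(name, other)] = r
            extended[(other, name)] = r
        extended[(name, name)] = 1.0
        known[name] = values  # later user experiments are compared to it too
    return extended


# Toy data, made up for the example.
reference = {
    "faang_a": {"g1": 0.0, "g2": 1.0, "g3": 2.0},
    "faang_b": {"g1": 2.0, "g2": 1.0, "g3": 0.0},
}
corr = {
    ("faang_a", "faang_a"): 1.0, ("faang_b", "faang_b"): 1.0,
    ("faang_a", "faang_b"): -1.0, ("faang_b", "faang_a"): -1.0,
}
user = {"user_1": log_scale({"g1": 0.0, "g2": 9.0, "g3": 99.0})}
extended = extend_correlations(corr, reference, user)
```

The clustering would then be rerun on the extended matrix before the heatmap is redrawn; that part is not shown here.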
For step 2, there are two approaches:
- compute the new correlation once for the entire dataset
- compute the correlations each time, but only for the selected experiments in the dataset
Approach 1 is slower initially, but afterwards the user can filter the heatmap without any new correlation computation. Approach 2 can be faster initially (say, if the user is only looking at liver samples) but will be slower each time the user changes the filters.
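To illustrate the trade-off, here is a small sketch that counts how many correlations each approach computes. Everything in it is hypothetical: the class and function names, the experiment labels, and `counting_corr`, which is a stand-in that only counts calls and returns a dummy value.

```python
class UserCorrelations:
    """Approach 1: correlate each user experiment with every reference
    experiment once, then answer filter changes by lookup."""

    def __init__(self, reference, user, corr_fn):
        self.table = {
            (u, r): corr_fn(user[u], reference[r])
            for u in user
            for r in reference
        }

    def for_selection(self, selected):
        # No correlation is computed here, only a lookup in the stored table.
        return {pair: r for pair, r in self.table.items() if pair[1] in selected}


def correlate_selection(reference, user, selected, corr_fn):
    """Approach 2: recompute for the currently selected experiments only."""
    return {(u, r): corr_fn(user[u], reference[r]) for u in user for r in selected}


calls = {"n": 0}


def counting_corr(a, b):
    """Stand-in for the real correlation; it only counts how often it runs."""
    calls["n"] += 1
    return 0.0


# Made-up experiment labels with empty placeholder profiles.
reference = {"liver_1": [], "liver_2": [], "lung_1": [], "lung_2": []}
user = {"user_1": []}
filters = [["liver_1", "liver_2"], ["lung_1", "lung_2"], ["liver_1", "lung_1"]]

cache = UserCorrelations(reference, user, counting_corr)
for selected in filters:
    cache.for_selection(selected)
approach_1_calls = calls["n"]

calls["n"] = 0
for selected in filters:
    correlate_selection(reference, user, selected, counting_corr)
approach_2_calls = calls["n"]

print(approach_1_calls, approach_2_calls)
```

In this toy setup approach 1 computes 4 correlations in total, while approach 2 computes 2 per filter change (6 after three changes); with a single filter change approach 2 would still be cheaper.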
I vote for approach 1, but you can change my mind.