Projects
Sequence utilities
-
fqconcat: concatenates paired-end reads from FASTQ files into a single sequence.
-
rmdup: removes duplicate records from FASTA/Q files, comparing them by sequence.
-
seqbyacc-rs: can apply multiple filters to the sequences of a FASTA file based on their accessions.
-
seqCollapse: collapses identical sequences from FASTA/Q files into a single sequence and reports read counts.
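To illustrate the idea behind seqCollapse, here is a minimal Python sketch of collapsing identical sequences while counting reads. It is not the tool's actual code: the function name `collapse_sequences`, the `(identifier, sequence)` input format, and the case-insensitive comparison are all assumptions made for the example.

```python
from collections import Counter


def collapse_sequences(records):
    """Collapse identical sequences and count the reads carrying each one.

    `records` is an iterable of (identifier, sequence) pairs.
    Sequences are compared case-insensitively (an assumption of this sketch).
    """
    counts = Counter(seq.upper() for _, seq in records)
    # most_common() orders by decreasing count; ties keep first-seen order.
    return counts.most_common()


# Hypothetical reads, already parsed from a FASTA/Q file.
reads = [("r1", "ACGT"), ("r2", "TTGA"), ("r3", "acgt"), ("r4", "ACGT")]

for sequence, n_reads in collapse_sequences(reads):
    print(sequence, n_reads)
```

A `Counter` keeps one entry per distinct sequence, so memory grows with the number of unique sequences rather than with the number of reads.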
VHH annotation from NGS
-
ngs2VHH: a Snakemake pipeline that annotates VHH sequences from NGS data, from read trimming to tracking the evolution of sequences across multiple banks.
-
ngs2pep-rs: deduces protein sequences from an alignment between DNA sequences (typically NGS reads) and a reference protein sequence.
-
seq2kmermat: encodes the sequences from a FASTA file (AA or NT) into a presence-absence k-mer matrix. This matrix can be used for UMAP projection.
-
seq_annembed: performs sequence embedding using annembed (Zhao et al., 2024, "Approximate Nearest Neighbor Graph Provides Fast and Efficient Embedding with Applications in Large-scale Biological Data", doi:10.1101/2024.01.28.577627). annembed is a variation on UMAP (McInnes et al., 2018, "UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction", doi:10.48550/arXiv.1802.03426).
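As a rough illustration of the presence-absence k-mer encoding that seq2kmermat describes, the Python sketch below builds one row per sequence and one column per possible k-mer. The function name `kmer_presence_matrix`, its parameters, and the choice to enumerate every k-mer of the alphabet are assumptions for the example, not the tool's actual interface.

```python
from itertools import product


def kmer_presence_matrix(sequences, k, alphabet="ACGT"):
    """Return (kmers, matrix) where matrix[i][j] is 1 if kmers[j]
    occurs at least once in sequences[i], else 0.

    Columns cover every possible k-mer over `alphabet` (an assumption
    of this sketch); k-mers with other characters are ignored.
    """
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    column = {kmer: j for j, kmer in enumerate(kmers)}
    matrix = []
    for seq in sequences:
        row = [0] * len(kmers)
        # Slide a window of width k along the sequence.
        for start in range(len(seq) - k + 1):
            j = column.get(seq[start:start + k])
            if j is not None:
                row[j] = 1
        matrix.append(row)
    return kmers, matrix


# Hypothetical nucleotide sequences, already read from a FASTA file.
kmers, matrix = kmer_presence_matrix(["ACGT", "AAAA"], k=2)
for row in matrix:
    print(row)
```

For amino-acid input, the same sketch applies with a 20-letter alphabet. The number of columns grows as the alphabet size raised to the power k, so large k values would call for a sparse representation.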
Epigenetics
-
chipseq-nf: a Nextflow pipeline dedicated to analyzing ChIP-seq data using either epic2 (broad peaks) or PePr (narrow peaks).
-
BSSeq.nf: a Nextflow pipeline dedicated to analyzing WGBS data using Bismark.
Video and DL
cf projetred