Hi,
I have a Minigraph-Cactus pangenome graph and would like to use mpmap to align RNA-seq data.
According to the transcriptomics analysis wiki, I should construct a spliced pangenome graph using:
vg rna -p --threads <threads> --transcripts annotation.[gtf|gff3] --use-hap-ref --gbz-format graph.gbz > spliced_graph.pg
However, for the input graph.gbz file, I am wondering which graph I should use for RNA-seq alignment:
- full.gbz: the full graph without clipping and filtering
- clip.gbz: clipped graph
- filter.gbz: clipped and filtered by haplotype frequency
- sample.gbz: haplotype-sampled graph, built using WGS data from the same sample as the RNA-seq data
When using giraffe to align WGS data, I learnt that performance ranks sample > filter > clip when there are many haplotypes. Is it similar for aligning RNA-seq data? Can I just use clip.gbz (or even full.gbz)?
Many thanks,
Han
Thank you so much! I will use the clipped graph.