Physical phasing of polyploid SNPs from RNA-seq data
3.5 years ago
jfaberha ▴ 50

Hi, I'm working with transcriptomic data from a diploid species with an incomplete reference genome (probably about 80% of the genome is represented in the reference). I aligned paired-end reads with STAR, then generated counts with StringTie. After running DE analyses, most of the results make sense, although we found a surprisingly high number of DE genes between genetic lineages.

After visualizing SNPs/indels from the alignment files in IGV, it looks like a number of genes have alignments showing more than two haplotypes at a single locus. This suggests misalignment of transcripts from lineage-specific CNVs, or from paralogs/pseudogenes that are missing from the reference genome.

Now I would like to quantify these misalignments more systematically. My thought was to extract physical phasing information per sample and check whether more than two overlapping haplotypes can be detected at each locus; such loci would then be flagged as potential misalignments. So far, I've tried the GATK HaplotypeCaller pipeline for RNA-seq data (barcwiki.wi.mit.edu/wiki/SOP/CallingVariantsRNAseq). The problem I'm running into is that phasing information isn't written to the VCF output when ploidy is set higher than two, and running HaplotypeCaller with samples set as diploid doesn't tell us whether there are extra, unexpected alleles.

Can anyone suggest another tool that can output phasing information for more than two haplotypes, or an entirely different strategy for quantifying possible misalignments in RNA-seq data? Thanks.

GATK phasing transcriptome haplotype polyploid • 851 views
2.5 years ago

Late, but I think this is one of the few tools that can do polyploid phasing:

https://github.com/bluenote-1577/flopp

Another is perhaps nPhase:

https://github.com/OmarOakheart/nPhase
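
As a different strategy, the check described in the question (more than two haplotypes supported by reads at one locus) could be approximated without a full phasing tool by counting the distinct allele combinations that individual reads or read pairs carry across pairs of nearby SNPs. Below is a rough sketch of that idea, not a tested pipeline. It assumes the alleles each read shows at known SNP positions have already been extracted from the BAM (for example with pysam, not shown here); the function names, the input layout, and the `min_reads` threshold are all made up for illustration.

```python
from collections import Counter
from itertools import combinations


def count_haplotypes(read_alleles, min_reads=2):
    """Count read-supported two-SNP haplotypes.

    read_alleles: list of dicts, one per read (or read pair), mapping
    SNP position -> observed base. This input layout is an assumption.
    Returns {(pos_a, pos_b): {(base_a, base_b): n_reads}} keeping only
    haplotypes seen in at least `min_reads` reads (crude error filter).
    """
    pair_counts = {}
    for alleles in read_alleles:
        # Every pair of SNPs covered by the same read gives one observation.
        for pos_a, pos_b in combinations(sorted(alleles), 2):
            hap = (alleles[pos_a], alleles[pos_b])
            pair_counts.setdefault((pos_a, pos_b), Counter())[hap] += 1

    supported = {}
    for pair, counts in pair_counts.items():
        kept = {hap: n for hap, n in counts.items() if n >= min_reads}
        if kept:
            supported[pair] = kept
    return supported


def flag_excess_haplotypes(read_alleles, ploidy=2, min_reads=2):
    """Return SNP pairs with more supported haplotypes than `ploidy`."""
    supported = count_haplotypes(read_alleles, min_reads)
    return sorted(pair for pair, haps in supported.items() if len(haps) > ploidy)


# Toy data: three well-supported haplotypes over SNPs 100/150,
# but only two over SNPs 300/320.
reads = [
    {100: "A", 150: "C"}, {100: "A", 150: "C"},
    {100: "G", 150: "T"}, {100: "G", 150: "T"},
    {100: "A", 150: "T"}, {100: "A", 150: "T"},
    {100: "G", 150: "C"},  # single read, dropped by min_reads=2
    {300: "A", 320: "G"}, {300: "A", 320: "G"},
    {300: "C", 320: "T"}, {300: "C", 320: "T"},
]
flagged = flag_excess_haplotypes(reads, ploidy=2, min_reads=2)
print(flagged)
```

On the toy data only the 100/150 pair is flagged. With real RNA-seq data the `min_reads` cutoff would need tuning against coverage, since sequencing errors and RNA editing can both produce extra low-frequency combinations.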
