Question

Are there any software packages available to calculate ploidy from RNA-Seq data?

8.8 years ago

JacobS ▴ 980

I realize that calling copy number variation from RNA-Seq data is a poor idea, since expression differences between samples will confound the copy number signal, but what about general inferences of ploidy?

I have large RNA-Seq data sets for >50 samples and want to determine which are aneuploid and which are not. Does anyone know of a way to do this?

RNA-Seq CNV ploidy • 2.9k views

updated 18 months ago by Ram 43k • written 8.8 years ago by JacobS ▴ 980

Not sure, but if you have a BAM file and samtools, you can find the depth of coverage using samtools depth:

$ samtools depth -r 1:100-200 in.bam

It will print the per-base depth of coverage, which can be normalized to the average coverage of the sample. On a diploid background, a normalized value of about 1.5 means one copy of the region is duplicated (three copies instead of two), while about 0.5 is a heterozygous deletion.
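As a rough sketch of that normalization, assuming the `samtools depth` output (tab-separated chromosome, position, depth) has been captured as text lines and the sample-wide average coverage has been worked out separately. The function name `normalized_depth` and the numbers are made up for illustration:

```python
def normalized_depth(depth_lines, average_coverage):
    """Mean per-base depth of a region divided by the sample's average coverage.

    depth_lines: iterable of `samtools depth` output lines (chrom, pos, depth).
    average_coverage: sample-wide mean depth (assumed to be known already).
    """
    if average_coverage <= 0:
        raise ValueError("average_coverage must be positive")
    depths = []
    for line in depth_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        _chrom, _pos, depth = line.split("\t")[:3]
        depths.append(int(depth))
    if not depths:
        raise ValueError("no depth records for this region")
    return (sum(depths) / len(depths)) / average_coverage


# Hypothetical region at ~45x in a sample that averages 30x
example = ["1\t100\t44", "1\t101\t45", "1\t102\t46"]
ratio = normalized_depth(example, average_coverage=30.0)
print(round(ratio, 2))  # 1.5
```

Keep in mind that `samtools depth` leaves out zero-coverage positions unless you pass `-a`, so the mean above only covers positions that had at least one read.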

You can perform the same task with

$ samtools view -c in.bam 1:100-200

and normalize with (read count / region size) * average read length / average coverage, where the first part approximates the depth over the region.
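The same arithmetic written out, as a minimal sketch. The function name `normalized_from_counts` and the example numbers are hypothetical; the read count would come from `samtools view -c`, and the average read length and average coverage are assumed to be known for the sample:

```python
def normalized_from_counts(read_count, region_size, avg_read_length, average_coverage):
    """(read count / region size) * average read length / average coverage."""
    if region_size <= 0 or average_coverage <= 0:
        raise ValueError("region_size and average_coverage must be positive")
    # Approximate depth over the region: total aligned bases / region length
    region_depth = (read_count / region_size) * avg_read_length
    return region_depth / average_coverage


# Hypothetical: 4,500 reads of 100 bp in a 10 kb region, sample average 30x
value = normalized_from_counts(4500, 10_000, 100, 30.0)
print(round(value, 2))  # 1.5
```

Reads that only partly overlap the region are counted in full, so this estimate is probably more trustworthy for regions much longer than a read than for a 100 bp window.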

I'm not sure how to do this systematically for RNA-Seq, but you can probe around your BAM files and see whether there is an amplification.

8.8 years ago by QVINTVS_FABIVS_MAXIMVS ★ 2.5k

RNA-Seq read coverage varies with gene expression, so coverage for any given gene differs a lot between samples. Your coverage method won't work for ploidy; you must be thinking of genome sequencing.

I think we'd have to use SNP allele frequencies and look for non-binary SNP sites, i.e. heterozygous sites where the two alleles are not seen at roughly 50/50.
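One way to read the SNP idea, as a sketch only: take read counts for the two alleles at heterozygous sites on a chromosome and look at the minor-allele fraction. With two copies you would expect values near 0.5, while three copies would pull them toward 1/3. The function names, the `min_depth=20` cutoff, and the counts below are all assumptions for illustration, and getting the per-allele counts (for example from a variant caller) is outside the sketch:

```python
from statistics import median


def minor_allele_fractions(sites, min_depth=20):
    """Minor-allele fraction at each usable site.

    sites: iterable of (ref_count, alt_count) pairs at candidate heterozygous SNPs.
    min_depth: arbitrary cutoff (an assumption) to drop poorly covered sites.
    """
    fractions = []
    for ref, alt in sites:
        total = ref + alt
        if total < min_depth or min(ref, alt) == 0:
            continue  # too shallow, or not heterozygous in these reads
        fractions.append(min(ref, alt) / total)
    return fractions


def median_minor_fraction(sites, min_depth=20):
    """Median minor-allele fraction across sites, or None if nothing passes."""
    fractions = minor_allele_fractions(sites, min_depth)
    return median(fractions) if fractions else None


# Made-up counts: one chromosome looks balanced, the other looks like 2:1
balanced = [(25, 25), (30, 28), (22, 26), (3, 2)]
skewed = [(40, 20), (42, 21), (20, 41), (50, 0)]
print(median_minor_fraction(balanced))
print(median_minor_fraction(skewed))
```

Allele-specific expression can skew individual genes in RNA-Seq as well, so a summary across many sites on a chromosome is likely to be more informative than any single SNP.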

updated 18 months ago by Ram 43k • written 8.8 years ago by karl.stamm 4.1k