Software for inversions in short read WGS
23 months ago

Hi,

I apologize in advance for the vague title. I did some WGS on a sample from an animal with an interesting disease. The disease is X-linked and the sample was taken from a male. A couple of published papers on this disease found a large (~3 kb) inversion in an intron. I was looking at the raw .bam file, and coverage gets pretty low, next to none, in the same region described in those papers. So I am wondering if it is just a coincidence, or maybe the software I used has trouble mapping that inversion since it's a male and all the reads would be inverted due to hemizygosity.

I used bwa for aligning the reads, picard to sort/dedup, and samtools for variant calling.

I saw that pindel is a potential piece of software I could use, but it takes bam files as input, and I can already tell that mine has poor coverage in the area of interest. I think I have to use something other than bwa?

I could be completely wrong here; maybe bwa does a good job of mapping reads like this and there is just poor coverage in this region. I just think it is interesting that coverage is poor exactly here. This region doesn't seem to contain repeated sequences either, so I doubt it is difficult to sequence.

Let me know what you think!

I apologize in advance for the vague title.

I have changed your title to make it more specific.

So I am wondering if it is just a coincidence

If you want to know if the low coverage is related to the inversion, then compare the coverage to other (control) genomes from males?
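One way to make that comparison concrete is to express the depth in the region as a fraction of the depth in its flanks, and then compute the same ratio for any control male you can get hold of. The sketch below is only an illustration: it assumes per-base depth in the three-column, tab-separated layout that `samtools depth` writes (chromosome, 1-based position, depth), and the function names and the 1 kb flank size are made up for this example. Positions missing from the input are treated as zero depth, since `samtools depth` omits them unless run with `-a`.

```python
from statistics import median


def parse_depth(text):
    """Parse tab-separated (chrom, 1-based pos, depth) lines into {pos: depth}."""
    depths = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        _chrom, pos, depth = line.split("\t")[:3]
        depths[int(pos)] = int(depth)
    return depths


def region_depth_ratio(depths, start, end, flank=1000):
    """Median depth in [start, end] divided by the median depth of both flanks.

    Positions absent from `depths` count as zero. Returns None when the
    flanks themselves have a median depth of zero (no usable baseline).
    """
    inside = [depths.get(p, 0) for p in range(start, end + 1)]
    flanks = [depths.get(p, 0) for p in range(start - flank, start)]
    flanks += [depths.get(p, 0) for p in range(end + 1, end + 1 + flank)]
    baseline = median(flanks)
    if baseline == 0:
        return None
    return median(inside) / baseline
```

A ratio near 1 means the region is covered like its surroundings; a ratio near 0 in the affected sample but near 1 in a control male would suggest the drop is specific to this sample rather than a generally hard-to-sequence region.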

I used has trouble mapping that inversion since it's a male and all the reads would be inverted due to hemizygosity.

I'm not sure if I follow here...

I didn't do WGS on any controls. This sample is from a horse, and we don't do many horse-related projects, so I didn't want to spend the money to sequence a control. I figured that if I could find something, I could do Sanger sequencing to confirm it.

This was my thought process, and it could be incorrect: since it is a male sample there is obviously just one X, so all the reads in this region would be inverted. I am wondering whether, since these reads are inverted, they wouldn't map very well to the genome? I feel like that doesn't make sense though.

all the reads in this region would be inverted.

Why would that be the case?

Yeah... the more I think about it, the more I think there is just poor coverage here :(
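For what it's worth, reads that lie entirely inside an inversion still align, just to the opposite strand, so an inversion on its own should not empty the region. The evidence for an inversion sits at the breakpoints: read pairs where both mates map to the same strand, and reads that are split across the junction. A minimal sketch for counting those signals from SAM text (for example the output of `samtools view` on the region) is below. It assumes the aligner writes an `SA:Z:` tag for split alignments, as bwa mem does; the function names and the counter keys are invented for this example.

```python
def pair_orientation(flag):
    """Classify a read pair from the SAM FLAG of one mate.

    Returns 'same_strand', 'opposite_strand', or None when the record is
    unpaired, has an unmapped mate, or is a secondary/supplementary line.
    """
    PAIRED, UNMAPPED, MATE_UNMAPPED = 0x1, 0x4, 0x8
    REVERSE, MATE_REVERSE = 0x10, 0x20
    SECONDARY, SUPPLEMENTARY = 0x100, 0x800
    if not flag & PAIRED:
        return None
    if flag & (UNMAPPED | MATE_UNMAPPED | SECONDARY | SUPPLEMENTARY):
        return None
    same = bool(flag & REVERSE) == bool(flag & MATE_REVERSE)
    return "same_strand" if same else "opposite_strand"


def inversion_evidence(sam_text):
    """Count primary records that carry inversion-like signals."""
    counts = {"primary_reads": 0, "same_strand_reads": 0, "split_reads": 0}
    for line in sam_text.splitlines():
        if not line or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        flag, rnext = int(fields[1]), fields[6]
        if flag & (0x100 | 0x800):
            continue  # count each read once, via its primary record
        counts["primary_reads"] += 1
        # Both mates on the same strand of the same chromosome
        if pair_orientation(flag) == "same_strand" and rnext == "=":
            counts["same_strand_reads"] += 1
        # SA tag marks a read whose alignment was split into several pieces
        if any(tag.startswith("SA:Z:") for tag in fields[11:]):
            counts["split_reads"] += 1
    return counts
```

A cluster of same-strand pairs or split reads at the edges of the low-coverage window would point towards a rearrangement; a window that is simply empty, with normal-looking pairs on either side, points towards plain poor coverage or a deletion.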

23 months ago

Testing for the presence of a known inversion is relatively easy: make a PCR amplicon which will only yield a product if the inversion is present.
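The primer logic can be checked in silico before ordering anything. In the sketch below, one primer sits in the flank outside the inversion and the other is taken from the same strand inside the inverted segment. On the reference allele both primers point the same way and give no product; on the inverted allele the inner primer is flipped and now faces the outer one. This is only a toy model under stated assumptions: exact primer matches, a single orientation (forward primer on the top strand), known breakpoints given as a half-open interval, and made-up function names. It does not replace proper primer design.

```python
def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def invert_segment(ref, start, end):
    """Return `ref` with the half-open interval [start, end) inverted."""
    return ref[:start] + revcomp(ref[start:end]) + ref[end:]


def amplicon_length(template, fwd_primer, rev_primer):
    """Product length if the two primers face each other on `template`.

    fwd_primer must match the top strand; rev_primer must match the bottom
    strand downstream of it (its reverse complement is found on the top
    strand). Returns None when no such arrangement exists.
    """
    f = template.find(fwd_primer)
    if f == -1:
        return None
    r = template.find(revcomp(rev_primer), f + len(fwd_primer))
    if r == -1:
        return None
    return r + len(rev_primer) - f
```

With `outer` taken from the left flank and `inner` from inside the segment, `amplicon_length(ref, outer, inner)` should be `None` while `amplicon_length(invert_segment(ref, start, end), outer, inner)` returns a length, which is the "product only if the inversion is present" behaviour described above.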