Question: Variant calling within and between 2 nematode strains
2.6 years ago by
standonn20 wrote:

Dear all,

I have the genome sequence of a nematode species I'm working with. This genome was assembled using reads (3 paired-end libraries and 2 mate-pair libraries) from one particular strain (let's call it Strain A).

Now I have one paired-end library for another strain (Strain B).

I would like to call SNPs and InDels within and between strains. I am unsure about how to do this.

I thought about using one of the following pipelines (or a similar one): or

Basically, I would align my reads against the genome, mark the duplicates, run the UnifiedGenotyper of GATK and filter the variants.

Now some questions and concerns:

  • I have a lot more read files for Strain A than for Strain B (only one library). Should I use only one library of Strain A? Does the insert size matter?
  • The pipeline I described above will let me get variants within each strain but not between strains. Do you know of any way I could do this analysis?

Many thanks for your suggestions and insights! Sophie

2.6 years ago by
Medhat8.2k wrote:

You can follow the GATK Best Practices, which let you detect SNPs between samples, but with HaplotypeCaller as the caller. See the GATK article "Why do joint calling rather than single-sample calling?"
Or use FreeBayes.
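
To make the joint-calling suggestion concrete, here is a minimal sketch that only builds the GATK4 command lines (it does not run them): one HaplotypeCaller run per strain in GVCF mode, then CombineGVCFs and GenotypeGVCFs. The function name, the BAM names and the output file names are hypothetical. It assumes each BAM is already aligned, sorted and duplicate-marked, and that its read groups carry a distinct SM tag per strain, so several Strain A libraries can be merged under one sample name.

```python
from typing import Dict, List


def joint_calling_commands(reference: str, bams: Dict[str, str]) -> List[List[str]]:
    """Build GATK4 commands for per-sample GVCF calling plus joint genotyping.

    `bams` maps a sample name (e.g. "strainA") to its BAM file.
    Output file names are illustrative, not a GATK convention.
    """
    commands: List[List[str]] = []
    gvcfs: List[str] = []

    # Step 1: call each strain separately in GVCF mode.
    for sample, bam in bams.items():
        gvcf = f"{sample}.g.vcf.gz"
        gvcfs.append(gvcf)
        commands.append([
            "gatk", "HaplotypeCaller",
            "-R", reference, "-I", bam, "-O", gvcf,
            "-ERC", "GVCF",
        ])

    # Step 2: merge the per-sample GVCFs into one multi-sample GVCF.
    combine = ["gatk", "CombineGVCFs", "-R", reference]
    for gvcf in gvcfs:
        combine += ["-V", gvcf]
    combine += ["-O", "combined.g.vcf.gz"]
    commands.append(combine)

    # Step 3: joint genotyping gives one VCF with a genotype column per strain.
    commands.append([
        "gatk", "GenotypeGVCFs",
        "-R", reference, "-V", "combined.g.vcf.gz", "-O", "joint.vcf.gz",
    ])
    return commands


if __name__ == "__main__":
    for cmd in joint_calling_commands(
        "genome.fa", {"strainA": "strainA.bam", "strainB": "strainB.bam"}
    ):
        print(" ".join(cmd))
```

Each command list can be passed to `subprocess.run(cmd, check=True)` once the paths are real. The resulting joint VCF is what allows a per-site comparison of the two strains.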

Thanks a lot for your answer! The links you sent really do help!

I also encourage reading about FreeBayes. Compared with GATK, it is:

  • much faster
  • somewhat better in detecting concordant SNPs and indels
  • worse with False Positives and False Negatives
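
Whichever caller you pick, a multi-sample VCF with one genotype column per strain can be split into "between" and "within" sites. Below is a rough sketch, assuming the GT subfield has already been extracted for each strain (e.g. `0/0`, `0/1`, `1|1`). The function names and category labels are made up for illustration, and treating any heterozygous call as "within-strain" variation is an assumption you may want to revisit for your species.

```python
from typing import Optional, Set


def parse_gt(gt: str) -> Optional[Set[int]]:
    """Return the set of allele indices in a VCF GT string, or None if any is missing."""
    alleles = gt.replace("|", "/").split("/")
    if "." in alleles:
        return None
    return {int(a) for a in alleles}


def classify_site(gt_a: str, gt_b: str) -> str:
    """Classify one site from the genotypes of Strain A and Strain B."""
    a, b = parse_gt(gt_a), parse_gt(gt_b)
    if a is None or b is None:
        return "missing"          # no call in at least one strain
    if len(a) > 1 or len(b) > 1:
        return "within"           # heterozygous call in at least one strain
    if a != b:
        return "between"          # both homozygous, for different alleles
    if a == {0}:
        return "invariant"        # both homozygous reference
    return "shared_nonref"        # both homozygous for the same non-reference allele


if __name__ == "__main__":
    for pair in [("0/0", "1/1"), ("0/1", "0/0"), ("1/1", "1|1"), ("./.", "0/0")]:
        print(pair, classify_site(*pair))
```

Since the reference was assembled from Strain A, sites in the "shared_nonref" class may be worth a closer look, as they could reflect assembly or mapping issues.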

Good luck :)

Please accept the answer so that other readers of the forum will benefit.

