Extract mitochondrial DNA from paired-end RAD-seq data: bwa and stacks
Entering edit mode
5.7 years ago
CaffeSospeso ▴ 50


I'm using single digested and paired-end RAD-seq data. I would like to extract mitochondrial DNA data. I would expect to have some sequences derived from mtDNA, however I cannot be sure about that because the species on which I'm working has no sequenced mitogenome.

I aligned cleaned paired-end reads on the mitogenome of a "relatively" closely related species.

bwa mem -t $threads -M $mitogenome $input1 $input2 | samtools sort -@ $threads -m 7G -o ${output}.bam

The output bam files contain reads.

Then, I used the gstacks tool included in stacks v2.0. The gstacks tool builds loci from the single and/or paired-end reads before calling SNPs. The results of such analyses are the following:

Reading BAM headers...
Processing all loci...

Read 79645748 BAM records:
  kept 7279 primary alignments (0.0%), of which 7279 reverse reads
  skipped 0 primary alignments with insufficient mapping qualities (0.0%)
  skipped 8224 excessively soft-clipped primary alignments (0.0%)
  skipped 79630245 unmapped reads (100.0%)

  Per-sample stats (details in 'gstacks.log.distribs'):
    read 4685044.0 records/sample (344176-6873442)
    kept 0.0%-0.0% of these

Built 0 loci comprising 0 forward reads and 0 matching paired-end reads; mean insert length was 0.0 (sd: nan).

Would you be able to explain me what is it happening? If bwa mem detected matching reads, why gstacks does not?

1) problems related to the fact that the reference genome is not a closely related species? 2) may be I should perform some alignement which takes into account indels? or ....

stacks bwa RAD-SEQ • 1.6k views

Login before adding your answer.

Traffic: 1458 users visited in the last hour
Help About
Access RSS

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6