HTseq-count finding single-end alignment after samtools markdup
1
0
Entering edit mode
2.0 years ago
Cory ▴ 10

I found that one of my samples had many more reads than the others (~113 million vs 30 million in the flagstat output for the BAM files). I ran samtools markdup after sorting the .bam file by name, tagging with fixmate, then sorting by position for markdup. Below is the markdup stats output, showing that I ended up with about 33 million reads in the markdup.bam file after duplicate removal.

COMMAND:

samtools markdup -r -s Gerson-54_paired_pec.positionsort.bam Gerson-54_paired_pec.markdup.bam
READ: 113282791
WRITTEN: 33036290
EXCLUDED: 0
EXAMINED: 113282791
PAIRED: 106910760
SINGLE: 6372031
DUPLICATE PAIR: 74415464
DUPLICATE SINGLE: 5831037
DUPLICATE PAIR OPTICAL: 0
DUPLICATE SINGLE OPTICAL: 0
DUPLICATE NON PRIMARY: 0
DUPLICATE NON PRIMARY OPTICAL: 0
DUPLICATE PRIMARY TOTAL: 80246501
DUPLICATE TOTAL: 80246501
ESTIMATED_LIBRARY_SIZE: 16975904

I then sorted this markdup.bam file by name and tried to run HTseq (htseq-count -f bam -r name --stranded=yes), but I got the following message:

Error occured when processing BAM input (record #6 in file Gerson-54_paired_pec.markdup.sorted.bam): Sequence of paired-end alignments expected, but got single-end alignment.

I see that in the markdup stats that there are singles, but I'm not sure how to deal with these as they appear to be interfering with the HTseq-count. Am I doing something wrong? Is there a way to eliminate the singe-end alignments, or is there a better way to deal with the issue?

Thanks for the help!

samtools rna-seq htseq • 598 views
ADD COMMENT
1
Entering edit mode
2.0 years ago

try filtering the BAM file to retain paired alignments only

samtools view -b -F 1 input.bam > output.bam

where

samtools flags 1
0x1 1   PAIRED
ADD COMMENT
0
Entering edit mode

Thank you! Seems like a simple answer, but it means a lot for this novice!

ADD REPLY

Login before adding your answer.

Traffic: 2530 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6