Picard MarkDuplicates output
7.9 years ago
VasGene ▴ 20

Dear,

I ran Picard MarkDuplicates on exome-seq data and got the following output:

LIBRARY Unknown Library
UNPAIRED_READS_EXAMINED 16796213
READ_PAIRS_EXAMINED 7941846
UNMAPPED_READS  73335
UNPAIRED_READ_DUPLICATES    8044138
READ_PAIR_DUPLICATES    1283816
READ_PAIR_OPTICAL_DUPLICATES    7383
PERCENT_DUPLICATION 0.324719
ESTIMATED_LIBRARY_SIZE  21938234

Are the numbers for UNPAIRED_READS_EXAMINED (=16,796,213) and READ_PAIRS_EXAMINED (=7,941,846) as expected, or do they indicate a problem?

Thanks

next-gen picard markduplicates • 7.0k views
7.9 years ago
Amitm ★ 2.2k

hi, Yes, that indicates something unexpected. For successfully aligned paired-end data, the unpaired reads should be only a small fraction of the read pairs examined. In your case it is the reverse: there are more than twice as many unpaired reads as read pairs. Also note that the duplication value is about 32.5%. What did the FastQC report look like? Here is an example MarkDuplicates output:

UNPAIRED_READS_EXAMINED 53150
READ_PAIRS_EXAMINED 95066758
UNMAPPED_READS  191324
UNPAIRED_READ_DUPLICATES    20376
READ_PAIR_DUPLICATES    4303567
READ_PAIR_OPTICAL_DUPLICATES    1049365
PERCENT_DUPLICATION 0.045363
ESTIMATED_LIBRARY_SIZE  1326608424

This WES sample had ~95M read pairs and was aligned with bwa mem. The duplication here is about 4.5%, but I observe anything up to 10-12% duplication for 60-100x depth WES samples.
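
To make the comparison concrete, here is a minimal Python sketch that recomputes the duplication rate and the share of unpaired reads from the two sets of metrics above. It assumes PERCENT_DUPLICATION is (unpaired duplicates + 2 x pair duplicates) / (unpaired reads + 2 x read pairs); that formula reproduces both reported values here, but treat it as my reading of the metrics rather than an authoritative definition. The function name `duplication_summary` is just an illustrative helper, not part of Picard.

```python
def duplication_summary(unpaired_examined, pairs_examined,
                        unpaired_dups, pair_dups):
    """Recompute summary ratios from MarkDuplicates counts.

    Assumption: each read pair contributes two reads to both the
    numerator and the denominator.
    """
    total_reads = unpaired_examined + 2 * pairs_examined
    dup_reads = unpaired_dups + 2 * pair_dups
    return {
        # Fraction of examined reads flagged as duplicates
        "percent_duplication": dup_reads / total_reads,
        # Fraction of examined reads whose mate was not aligned
        "unpaired_fraction": unpaired_examined / total_reads,
    }


# Counts copied from the question and from the example in this answer
question = duplication_summary(16796213, 7941846, 8044138, 1283816)
example = duplication_summary(53150, 95066758, 20376, 4303567)

for name, stats in (("question", question), ("example", example)):
    print(name,
          f"dup={stats['percent_duplication']:.6f}",
          f"unpaired={stats['unpaired_fraction']:.6f}")
```

With the counts from the question, roughly half of all examined reads are unpaired, whereas in the example it is well under 0.1%. That gap is the thing to investigate, more than the duplication rate itself.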


Thanks very much for your answer. It is very informative. In general, what fraction of the reads is expected to be unpaired in WES data?


hi, Since these stats reflect what Picard sees in the aligned BAM file, unpaired reads are, of all the reads that aligned, those for which only one mate of the pair aligned. If you run samtools flagstat on your BAM, these mapped reads without a mapped mate are reported on the singletons line, with a percentage. I have mostly seen this to be < 1%.
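
If you want to pull that number out programmatically, a rough Python sketch is below. It assumes the default text output of samtools flagstat, where each line looks like `<QC-passed> + <QC-failed> <label>`, and it divides singletons by the "paired in sequencing" count. The helper name `singleton_fraction` and the sample text are made up for illustration; the numbers are not from a real BAM.

```python
import re

# Hypothetical flagstat excerpt, for illustration only
SAMPLE_FLAGSTAT = """\
1000 + 0 in total (QC-passed reads + QC-failed reads)
990 + 0 mapped (99.00% : N/A)
1000 + 0 paired in sequencing
985 + 0 with itself and mate mapped
5 + 0 singletons (0.50% : N/A)
"""


def singleton_fraction(flagstat_text):
    """Return singletons / reads paired in sequencing (QC-passed counts)."""
    counts = {}
    for line in flagstat_text.splitlines():
        match = re.match(r"^(\d+) \+ (\d+) (.*)$", line)
        if not match:
            continue  # skip anything that is not a count line
        passed, label = int(match.group(1)), match.group(3)
        if label.startswith("singletons"):
            counts["singletons"] = passed
        elif label.startswith("paired in sequencing"):
            counts["paired"] = passed
    return counts["singletons"] / counts["paired"]


print(f"singleton fraction: {singleton_fraction(SAMPLE_FLAGSTAT):.4f}")
```

To use it on a real file, save the flagstat output to a text file and pass its contents to the function.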
