Question: Understanding Picard's HsMetrics
3.6 years ago by finswimmer (14k) wrote:


I have some problems understanding the output of Picard's CollectHsMetrics. Because I don't have a bait file, I used my BED file with the target regions to produce the interval list.

java -jar picard.jar BedToIntervalList I=target.bed SD=genome.dict O=target.interval_list

I now use this target interval list for both the BAIT parameter and the TARGET parameter:

java -jar picard.jar CollectHsMetrics I=Sample.bam O=hsmetrix.txt BI=target.interval_list TI=target.interval_list

As I understand it, the results for ON_BAIT_BASES and ON_TARGET_BASES should be the same because the interval list is the same (and so should all other values that are reported separately for bait and target). But in my case I get this result:

ON_BAIT_BASES: 1012933970

ON_TARGET_BASES: 678789480

Please help me understand where this difference comes from.

Thanks a lot.

fin swimmer

sam bam picardtools • 2.6k views
3.6 years ago by Pierre Lindenbaum (131k), France/Nantes/Institut du Thorax - INSERM UMR1087, wrote:

In my understanding the 'bait' is the read and the 'target' is the reference. If you have insertions or clipped bases in your reads, there will be a higher number of bases for the reads.

Looking at the code in Picard:

                int onBaitBases = 0;

                if (!probes.isEmpty()) {
                    for (final Interval bait : probes) {
                        // Walk every aligned block of the read.
                        for (final AlignmentBlock block : record.getAlignmentBlocks()) {
                            final int end = CoordMath.getEnd(block.getReferenceStart(), block.getLength());

                            // Count each reference position of the block that lies inside the bait.
                            for (int pos = block.getReferenceStart(); pos <= end; ++pos) {
                                if (pos >= bait.getStart() && pos <= bait.getEnd()) ++onBaitBases;
                            }
                        }
                    }
                }
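
To see which bases of a single read a loop like the one quoted above would count, here is a small self-contained sketch. It is not Picard's code: `BaitOverlapSketch`, `Block`, `alignmentBlocks` and `basesOnInterval` are hypothetical names made up for illustration. It also assumes that alignment blocks are built only from the M, = and X CIGAR operations (which, as far as I know, is how htsjdk's `getAlignmentBlocks()` behaves), so soft clips and insertions produce no block.

```java
import java.util.ArrayList;
import java.util.List;

public class BaitOverlapSketch {

    /** Hypothetical stand-in for an alignment block: 1-based reference start plus length. */
    public static final class Block {
        final int referenceStart;
        final int length;

        Block(int referenceStart, int length) {
            this.referenceStart = referenceStart;
            this.length = length;
        }
    }

    /** Builds blocks from a CIGAR string; ASSUMPTION: only M, = and X create a block. */
    public static List<Block> alignmentBlocks(int alignmentStart, String cigar) {
        List<Block> blocks = new ArrayList<>();
        int refPos = alignmentStart;
        int num = 0;
        for (char c : cigar.toCharArray()) {
            if (Character.isDigit(c)) {
                num = num * 10 + (c - '0');
                continue;
            }
            switch (c) {
                case 'M': case '=': case 'X':
                    blocks.add(new Block(refPos, num)); // aligned bases
                    refPos += num;
                    break;
                case 'D': case 'N':
                    refPos += num; // consumes reference only, no read bases
                    break;
                default:
                    break; // I, S, H, P consume no reference positions
            }
            num = 0;
        }
        return blocks;
    }

    /** Counts block positions inside the closed interval [start, end]. */
    public static int basesOnInterval(List<Block> blocks, int start, int end) {
        int count = 0;
        for (Block b : blocks) {
            int blockEnd = b.referenceStart + b.length - 1;
            int overlap = Math.min(blockEnd, end) - Math.max(b.referenceStart, start) + 1;
            if (overlap > 0) count += overlap;
        }
        return count;
    }

    public static void main(String[] args) {
        List<Block> blocks = alignmentBlocks(100, "5S20M3I10M2D15M");
        System.out.println(basesOnInterval(blocks, 110, 140));
    }
}
```

For a read starting at position 100 with CIGAR 5S20M3I10M2D15M and an interval 110-140, this sketch counts 29 of the read's 53 bases. Under the stated assumption, the 5 clipped and 3 inserted bases are not counted.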

Hello Pierre, that makes it a little bit clearer.

So ON_TARGET here means only those bases of a read that are really mapped to a target position, while ON_BAIT also counts bases that are not strictly mapped to a reference position but are "somewhere between" (insertions) or are clipped?

Is the ratio of my two values a common one?

fin swimmer

Reply written 3.6 years ago by finswimmer (14k)
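
For reference, the ratio asked about can be computed directly from the two numbers in the question. This is only the arithmetic (the class and method names are made up for illustration) and it says nothing about whether the value is typical.

```java
public class OnTargetRatio {

    /** Returns onTargetBases / onBaitBases as a fraction. */
    public static double ratio(long onTargetBases, long onBaitBases) {
        if (onBaitBases <= 0) {
            throw new IllegalArgumentException("onBaitBases must be positive");
        }
        return (double) onTargetBases / onBaitBases;
    }

    public static void main(String[] args) {
        // Values copied from the question above.
        double r = ratio(678_789_480L, 1_012_933_970L);
        System.out.printf("ON_TARGET_BASES / ON_BAIT_BASES = %.3f%n", r);
    }
}
```

With the values from the question this comes out at about 0.67, i.e. roughly two thirds of the bases counted on bait were also counted on target.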