Question: Understanding Picard's HsMetrics
finswimmer (Germany) wrote 21 months ago:

Hello,

I have some trouble understanding the output of Picard's CollectHsMetrics. Because I don't have a bait file, I used my BED file with the target regions to produce the interval list.

java -jar picard.jar BedToIntervalList I=target.bed SD=genome.dict O=target.interval_list
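
For readers unfamiliar with the two coordinate systems involved: BED uses 0-based, half-open coordinates, while a Picard interval list uses 1-based, closed coordinates. The sketch below shows only that coordinate shift for a single record. It is not Picard's implementation; the class name, the method name, and the fallback interval name are assumptions made for illustration, and the SAM-style header that a real interval list carries is left out.

```java
/** Illustrative only: converts one BED record to one interval_list body line. */
final class BedToIntervalSketch {

    /** BED is 0-based and half-open; interval_list is 1-based and closed. */
    static String toIntervalListLine(String bedLine) {
        String[] f = bedLine.trim().split("\t");
        String contig = f[0];
        long start = Long.parseLong(f[1]) + 1; // shift the 0-based start to 1-based
        long end = Long.parseLong(f[2]);       // exclusive 0-based end == inclusive 1-based end
        // Fallback name is an assumption of this sketch, not Picard's naming rule.
        String name = f.length > 3 ? f[3] : contig + ":" + start + "-" + end;
        String strand = f.length > 5 && f[5].equals("-") ? "-" : "+";
        return String.join("\t", contig, Long.toString(start), Long.toString(end), strand, name);
    }

    public static void main(String[] args) {
        // BED record covering 0-based positions 99..199, i.e. 1-based 100..200.
        System.out.println(toIntervalListLine("chr1\t99\t200\texon1"));
    }
}
```

The interval length is unchanged by the conversion: `200 - 99` in BED terms equals `200 - 100 + 1` in interval-list terms.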

I now use this target interval list for both the BAIT parameter and the TARGET parameter:

java -jar picard.jar CollectHsMetrics I=Sample.bam O=hsmetrix.txt BI=target.interval_list TI=target.interval_list

As I understand it, ON_BAIT_BASES and ON_TARGET_BASES should be identical because the interval list is the same (and the same should hold for every other pair of values that distinguishes between bait and target). But in my case I get this result:

ON_BAIT_BASES: 1012933970

ON_TARGET_BASES: 678789480
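
For anyone who wants to compare these two fields programmatically, here is a minimal sketch. It assumes the usual Picard metrics layout: a line starting with `## METRICS CLASS`, followed by a tab-separated header row and then a value row. The class and method names are hypothetical, and the three-line input is a trimmed-down stand-in that contains only the two values quoted above, not a real output file.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: reads the first data row of a Picard-style metrics table. */
final class HsMetricsReaderSketch {

    /** Returns column name -> value for the row that follows the header row. */
    static Map<String, String> firstMetricsRow(List<String> lines) {
        for (int i = 0; i + 2 < lines.size(); i++) {
            if (lines.get(i).startsWith("## METRICS CLASS")) {
                String[] names = lines.get(i + 1).split("\t", -1);
                String[] values = lines.get(i + 2).split("\t", -1);
                Map<String, String> row = new HashMap<>();
                for (int c = 0; c < names.length && c < values.length; c++) {
                    row.put(names[c], values[c]);
                }
                return row;
            }
        }
        throw new IllegalArgumentException("no metrics table found");
    }

    public static void main(String[] args) {
        // Stand-in for the file contents; only the two columns from the question.
        List<String> lines = Arrays.asList(
                "## METRICS CLASS\tpicard.analysis.directed.HsMetrics",
                "ON_BAIT_BASES\tON_TARGET_BASES",
                "1012933970\t678789480");
        Map<String, String> row = firstMetricsRow(lines);
        long onBait = Long.parseLong(row.get("ON_BAIT_BASES"));
        long onTarget = Long.parseLong(row.get("ON_TARGET_BASES"));
        System.out.println("difference = " + (onBait - onTarget));
        System.out.println("on-target / on-bait = " + (double) onTarget / onBait);
    }
}
```

With the two values above, the difference is 334144490 bases and the on-target count is roughly 67% of the on-bait count.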

Please help me understand where this difference comes from.

Thanks a lot.

fin swimmer

Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote 21 months ago:

As I understand it, the 'bait' is the read and the 'target' is the reference. If your reads contain insertions or clipped bases, the base count for the reads will be higher.

Looking at the code in Picard:


    // Count the aligned bases of this read that fall inside a bait interval.
    int onBaitBases = 0;

    if (!probes.isEmpty()) {
        for (final Interval bait : probes) {
            // Each alignment block is a gap-free stretch of the read aligned to the reference.
            for (final AlignmentBlock block : record.getAlignmentBlocks()) {
                final int end = CoordMath.getEnd(block.getReferenceStart(), block.getLength());

                for (int pos = block.getReferenceStart(); pos <= end; ++pos) {
                    if (pos >= bait.getStart() && pos <= bait.getEnd()) ++onBaitBases;
                }
            }
        }
    }
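
To make the counting in that excerpt easier to try out, here is a self-contained sketch of the same idea without any htsjdk types. Everything in it is an assumption made for illustration: baits are plain `{start, end}` pairs (1-based, inclusive), alignment blocks are `{referenceStart, length}` pairs, all on one contig, and the per-position loop is replaced by interval-overlap arithmetic, which gives the same count.

```java
/** Illustrative only: counts aligned reference positions that fall inside baits. */
final class BaitOverlapSketch {

    /**
     * @param baits  rows of {start, end}, 1-based inclusive (hypothetical stand-in for Interval)
     * @param blocks rows of {referenceStart, length} (hypothetical stand-in for AlignmentBlock)
     */
    static int countOnBaitBases(int[][] baits, int[][] blocks) {
        int onBaitBases = 0;
        for (int[] bait : baits) {
            for (int[] block : blocks) {
                int blockEnd = block[0] + block[1] - 1; // last reference position of the block
                // Overlap of two closed intervals; empty when 'to' is smaller than 'from'.
                int from = Math.max(block[0], bait[0]);
                int to = Math.min(blockEnd, bait[1]);
                if (to >= from) {
                    onBaitBases += to - from + 1;
                }
            }
        }
        return onBaitBases;
    }

    public static void main(String[] args) {
        int[][] baits = {{100, 199}};
        // One read split into two aligned blocks: 90..119 and 125..174.
        int[][] blocks = {{90, 30}, {125, 50}};
        System.out.println(countOnBaitBases(baits, blocks)); // prints 70 (20 + 50)
    }
}
```

Because the outer loop runs once per bait, a position covered by two overlapping baits would be counted twice in this sketch.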

Hello Pierre, that makes it a little bit clearer.

So ON_TARGET here means only those bases of a read that are really mapped to a target position, while ON_BAIT also counts bases that are not strictly mapped to a reference position but lie "somewhere in between" (insertions) or are clipped?

Is the ratio between my two values a typical one?

fin swimmer

Reply written 21 months ago by finswimmer