Picard API: Modify sam header
1
1
Entering edit mode
9.9 years ago

Hello,

I'm using the Picard Java API and I need to modify a SAMFileHeader object to contain only the sequences given in a SAMSequenceDictionary. I thought this could be done by setting the sequence dictionary using setSequenceDictionary(myDict) method, but it doesn't seem to be the case. What am I doing wrong?

This is an example:

## This is the header I want to modify:
System.out.println(
    newHeader.getTextHeader()
);

@HD    VN:1.4    SO:unsorted
@SQ    SN:CP001509.3_CT    LN:4558953
@SQ    SN:CP001509.3_GA    LN:4558953
@PG    ID:bowtie2    PN:bowtie2    VN:2.0.6

## ...and this is its sequence dict:
System.out.println(
    newHeader.getSequenceDictionary().getSequences()
);

[SAMSequenceRecord(name=CP001509.3_CT,length=4558953,dict_index=0,assembly=null), SAMSequenceRecord(name=CP001509.3_GA,length=4558953,dict_index=1,assembly=null)]

## This is the new sequence dict I want to put in the header:
System.out.println(
    newSeqDict.getSequences()
);
[SAMSequenceRecord(name=CP001509.3,length=4558953,dict_index=0,assembly=null)]

## So replace the sequence dictionary:
newHeader.setSequenceDictionary(newSeqDict);

## The sequence dict has been replaced...
System.out.println(
    newHeader.getSequenceDictionary().getSequences()
);

[SAMSequenceRecord(name=CP001509.3,length=4558953,dict_index=0,assembly=null)] 

## ...but not the actual sequences!?

System.out.println(
    newHeader.getTextHeader()
);

@HD    VN:1.4    SO:unsorted
@SQ    SN:CP001509.3_CT    LN:4558953
@SQ    SN:CP001509.3_GA    LN:4558953
@PG    ID:bowtie2    PN:bowtie2    VN:2.0.6

Hope this makes sense... Thanks!

Dario

EDIT:

The problem I had was that records written to the file with the new header came all out with * as reference (i.e. unmapped).

I was writing out records with (simplified, where sw is the output bam with the new header):

for (SAMRecord rec : sam){
    rec.setReferenceName(newRefName);
    sw.addAlignment(rec);
}

So I thought the header in sw was broken, hence my question above.

Turns out the header is actually fine (?). What I needed to do is:

for (SAMRecord rec : sam){
    rec.setHeader(newHeader);        // <-- Need this line!!
    rec.setReferenceName(newRefName);
    sw.addAlignment(rec);
}
Picard java • 3.2k views
ADD COMMENT
1
Entering edit mode
9.9 years ago

I'm afraid the class SAMFileHeader is just a POJO: changing one property (eg. setting the dict) will not affect the other properties. I would remove the @SEQ line from the text, add the lines of the dictionary and recreate a SAMFileHeader using SAMTextHeaderCodec https://github.com/samtools/htsjdk/blob/master/src/java/htsjdk/samtools/SAMTextHeaderCodec.java

ADD COMMENT
0
Entering edit mode

Thanks for taking time to reply Pierre! See my edit to the question if you are interested in what was happening.

(I'm not java expert, but I feel the Picard API is sometimes a bit too complicated?)

ADD REPLY

Login before adding your answer.

Traffic: 2371 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6