Question: A Error About Gatk Samfilereader Malformed
1
gravatar for huboqiang
7.0 years ago by
huboqiang10
huboqiang10 wrote:

I met some questions when using GATK -UnifiedGenoTyper to call SNPs. The code and Error message like this:

 java -jar /WPS/BP/huboqiang/software/GenomeAnalysisTK-1.6-13-g91f02df/GenomeAnalysisTK.jar -T UnifiedGenotyper -I s_6_HW.sorted.bam -R ../hg19.2.fa -o my.vcf --output_mode EMIT_ALL_SITES

##### ERROR MESSAGE: SAM/BAM file SAMFileReader{/WPS/BP/huboqiang/data/BamResult/headerRG.sorted.bam} is malformed: Read HWUSI-EAS535_0025:6:10:13898:5547#0 is either missing the read group or its read group is not defined in the BAM header, both of which are required by the GATK.  Please use http://www.broadinstitute.org/gsa/wiki/index.php/ReplaceReadGroups to fix this problem

However, this website has been closed so I do not know how to deal with that problem. The Reads mentioned above is:

HWUSI-EAS535_0025:6:1:15164:14102#0     99      chr1    12096   0       76M     =       12243   223     TAGAGTGGGATGGGCCATTGTTCATCTTCTGGCCCCTGTTGTCTGCATGTAACTTAATACCACAACCAGGCATAGG    hhhfhhhhhhfhhghffhhgfhhhhhghhehhhhhghhhhgghhghhhhgahghhhhhhfhhghghhhhfehhQd]    XT:A:R  NM:i:0  SM:i:0  AM:i:0  X0:i:7  X1:i:1  XM:i:0  XO:i:0  XG:i:0 MD:Z:76
HWUSI-EAS535_0025:6:1:15164:14102#0     147     chr1    12243   0       76M     =       12096   -223    GCTCATCTCCTTGGCTGTGATACGTGGCCGGCCCTCGCTCCAGCAGCTGGACCCCTACCTGCCGTCTGCTGCCATC    Qa]aaW^X^\[Rb[_W]dcddb]deaadgggaf_a]da_Wffcc`fdfdf]dfdffefffcdcgfggagggegggg    XT:A:R  NM:i:0  SM:i:0  AM:i:0  X0:i:3  X1:i:5  XM:i:0  XO:i:0  XG:i:0 MD:Z:76

And this is a part of the head of the bam file:

@HD     VN:1.0
@SQ     SN:chr1 LN:249250621
@RG     ID:s_6.bam2     PL:illumina     PU:HWUSI-EAS535_0025    SM:s_6.bam
@PG     ID:bwa  PN:bwa  VN:0.6.1-r104

As it mentioned "Read HWUSI-EAS535_0025:6:10:13898:5547#0 is either missing the read group or its read group is not defined in the BAM header, both of which are required by the GATK.", I do not know what's wrong with the read group @RG or with the BAM header.

Thanks.

gatk bam • 3.2k views
ADD COMMENTlink modified 5.3 years ago by Biostar ♦♦ 20 • written 7.0 years ago by huboqiang10
3
gravatar for Arun
7.0 years ago by
Arun2.3k
Germany
Arun2.3k wrote:

Your header seems to be right with @RG. However, I don't see the corresponding ID on the read.

For example: This is a header from a bam file I created.

@HD VN:1.0  GO:none SO:coordinate
@SQ SN:C1   LN:304671
@SQ SN:C2   LN:196989
@SQ SN:C3   LN:459830
@SQ SN:C4   LN:18056
@SQ SN:C5   LN:65502
@RG ID:1    PL:illumina  PU:TopHat  LB:S_6_1    SM:U_xy

And my reads are like this:

HWI-EAS431_0038:2:30:10474:19256#0  16  C1  6999    255 71M87N7M    *   0   0   CACCAAGAATCTAATCAAGTGCGCGATTAAGCATACGGCTATGCATCTGGTCATTGTTGATTCAGTCATCACTTTAAG  ECEDDF?GGFDGGDGGFGFCGEGBGFGFFGGGGDGGGGEGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG  MD:Z:78 RG:Z:1  XG:i:0  NH:i:1  NM:i:0  XM:i:0  XO:i:0  AS:i:-1 XS:A:-

Note the RG:Z:1 column, which you don't have. I think this is the problem. Do you use picard tools to add read groups? Or you just add them to your bam header?

ADD COMMENTlink written 7.0 years ago by Arun2.3k
1

You could use Picard AddOrReplaceReadGroups as follows:

java -Xmx4g -jar AddOrReplaceReadGroups.jar -I s_6_HW.sorted.bam -O s_6_HW.sorted.rg.bam RGID=s_6 RGPL=illumina RGPU=HWUSI-EAS535_0025 RGSM=s_6
ADD REPLYlink written 7.0 years ago by Matt Shirley9.0k

I'm sorry Matt I had a hard time logging in this website these days. Thanks, it really works!

ADD REPLYlink written 6.9 years ago by huboqiang10
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1304 users visited in the last hour