Cannot call haplotypes on bam
0
0
Entering edit mode
8 weeks ago
dec986 ▴ 340

I have used STAR to align some whole exome sequencing files:

STAR --readFilesCommand zcat --runThreadN 4 --genomeDir /usr/local/share/hg38 --outFileNamePrefix HGFLY --readFilesIn AGKD01_CKDN220004842-1A_HGFLYDSX3_L1_1.fq.gz AGKD01_CKDN220004842-1A_HGFLYDSX3_L1_2.fq.gz --outSAMtype BAM SortedByCoordinate

but when I run haplotypeCaller in the next step:

~/gatk-4.2.6.0/gatk --java-options "-Xmx4g" HaplotypeCaller --input HGFLYAligned.sortedByCoord.out.bam --reference /usr/local/share/hg38/hg38.fa -O HGFLY.g.vcf.gz -ERC GVCF

I get an error:

A USER ERROR has occurred: Argument emit-ref-confidence has a bad value: Can only be used in single sample mode currently. Use the --sample-name argument to run on a single sample out of a multi-sample BAM file.

but when I try to view the samples samtools view -H HGFLYAligned.sortedByCoord.out.bam | grep RG I get none.

I have found similar errors reported, https://gatk.broadinstitute.org/hc/en-us/community/posts/360075892511-emit-ref-confidence-error but I don't see a solution to my problem there.

It seems that my alignment produced no sample names, and I don't see how STAR can output sample names.

A sample view of my bam file:

A00564:479:HGFLYDSX3:1:2420:11686:10708 163 chr1    10001   255 6S108M36S   =   10191   
170948  TAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCC
TAACCCTAACCCTAACCCGAAACCCAAAACCCAAAACCCGAAACCCAAAACCCA  FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFF
FFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFF:FFFFFFFFFFFFFFFFF,F::FFFF:F:F,FFFFF,,FF,:,,,FFF:F,,F::,:F,F,F,:F,  
NH:i:1  HI:i:1  AS:i:200    nM:i:10
A00564:479:HGFLYDSX3:1:2573:2889:29669  163 chr1    10001   255 1S113M36S   =   10005   
104 CTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACC
CTAACCCTAACCCAAACCCTAACCCTAACCCTAAACCTAACCCTCACCCTAACC  FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFF
FFFFFF::FFFFFFFFFFF,FFFFFFF,FFF,FFFF,,::FFF,FFFFF,,FFFF,FFFFF,:FFF,,F:FF,:F,FFF,,F,FF,FFFFF,,,F,F,,,,F  
NH:i:1  HI:i:1  AS:i:207    nM:i:2
A00564:479:HGFLYDSX3:1:2140:5656:8844   163 chr1    10004   3   106M44S =   10004   110 
CCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACC
CAGCGCCAGAGCCTAGGCGGGGGCCTACGGCTGGGCCGGAGGGTGG  FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
FFFFFFFFFFFFFFFF:FFFFFFFF:F::FF,FF,FFFF:,,F,::,,,,,,,,:,,,,:,,,,,,,,,,,,,:,,:,,,,,,,,,,,,,,:,,  NH:i:2  
HI:i:1  AS:i:206    nM:i:4

How can I call haplotypes/variants?

gatk STAR • 590 views
ADD COMMENT
0
Entering edit mode

Please post also a brief subset of reads. Did you try to add a readgroup using AddOrReplaceReadGroups ?

ADD REPLY
0
Entering edit mode

@Shred I've placed a small view of the aligned bam file. AddOrReplaceReadGroups didn't change anything, unfortunately

ADD REPLY
0
Entering edit mode

What do you mean by

AddOrReplaceReadGroups didn't change anything

You're in the situation where your BAM file has no RG information, so the caller doesn't know how your reads are actually stored. An alternative could also be samtools addreplacerg as stated here.

Closing, why are you using STAR, designed for RNAseq, to align exome sequencing data?

ADD REPLY
0
Entering edit mode

Is STAR inappropriate for Whole Exome Sequencing?

I'll check out addreplacerg

ADD REPLY
0
Entering edit mode

Sure. Use bwa instead

ADD REPLY

Login before adding your answer.

Traffic: 1158 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6