bcftools mpileup sample labels, seg fault
2
0
Entering edit mode
4 months ago
Gregor Rot ▴ 490

Running this kind of command gives seg fault:

bcftools mpileup -s sample1,sample2 -f genome.fasta sample1.bam sample2.bam

However this works (without -s):

bcftools mpileup -f genome.fasta sample1.bam sample2.bam

bcftools 1.7
Using htslib 1.7-2

Any ideas why this could be?

Thanks,
Gregor

bcftools mpileup • 551 views
ADD COMMENT
1
Entering edit mode
4 months ago
  1. bcftools 1.7 was released 3 years ago. You should upgrade it.
  2. did you set the read-groups ? what is the output of

    samtools view -H sample1.bam | grep '@RG'
    
ADD COMMENT
0
Entering edit mode

Thank you Pierre, silly me, updated to the latest version now.

There is no more seg fault, however I get this now:

[mpileup] failed to find a file header with usable read groups

And like you wrote, there is no flag @RG in the bam files. Can you help me out, what is the easies way to set this in the bam files? I suppose I can set each bam file with it's unique RG? Thanks

ADD REPLY
1
Entering edit mode
ADD REPLY
0
Entering edit mode

Thanks for all the help, however I still get the same error after adding RG to bam files.

bcftools mpileup -s sample1,sample2 -f genome.fasta sample1.bam sample2.bam

If I do samtools view -H sample1.bam | grep "@RG" I get: @RG ID:sample1, same for sample2. So my bam files have RG.

Any ideas on why this still doesn't work? If I ommit the -s sample1,sample2 everything works.

bcftools 1.12
Using htslib 1.12

Thanks, Gregor

ADD REPLY
1
Entering edit mode

I get: @RG ID:sample1,

and you should have

@RG ID:whatever_for_sample1 SN:sample1
ADD REPLY
0
Entering edit mode

Thanks, and haha, this will take another over-night job to add this to all the bam files I have :-)

ADD REPLY
0
Entering edit mode

Can I ask, does bctools consider the @RG SN flag from bam files if present? If yes, does specifying the -s sample1,sample2 override the info from the bam files? But then again, why it doesn't work with the -s if bam files don't have the RG field. A little bit confusing for me, Thanks

ADD REPLY
0
Entering edit mode

the sample name is defined in RG/SN not RG/id, see the sam specification.

ADD REPLY
0
Entering edit mode

Thanks Pierre yes I understand that, however does the bcftools -s set the sample names for VCF output or just check if they match the ones defined in bam RG:SN? I just find it confusing, intuitively -s would probably set them, and if not, why doesn't bcftools just read them from the bam files? And why with -s the bam files would also need an RG:SN if -s would set them. Confusing to me.

ADD REPLY
1
Entering edit mode

the option will only use the samples defined with -s. You could have only one bam with 1000 samples. -s "S1,S2" would use only the samples S1 and S2.

ADD REPLY
0
Entering edit mode
4 months ago
Gregor Rot ▴ 490

Me again, so now I understood that bcftools mpileup -s sample1,sample2 will only consider sample1 and sample2 from the marked alignments (read in from bam files). Just as an exercise, I added sample1 and sample2 to the @RG SN of 2 bam files:

samtools view -H sample1.bam | grep "@RG"
# @RG     ID:sample1    SN:sample1

samtools view -H sample2.bam | grep "@RG"
# @RG     ID:sample2    SN:sample2

However, if now I do:

bcftools mpileup -s sample1,sample2 -f genome.fasta sample1.bam sample2.bam

I still get this error:

[mpileup] failed to find a file header with usable read groups

Please advise, thanks, Gregor

ADD COMMENT
0
Entering edit mode

works on my machine:

bcftools mpileup  -s S1,S2 --fasta-ref ~/src/jvarkit/src/test/resources/rotavirus_rf.fa ~/src/jvarkit/src/test/resources/S1.bam ~/src/jvarkit/src/test/resources/S2.bam | grep CHROM

CHROM POS ID REF ALT QUAL FILTER INFO FORMAT S1 S2

$ for F in ~/src/jvarkit/src/test/resources/S1.bam ~/src/jvarkit/src/test/resources/S2.bam ; do echo $F && samtools view -H $F | grep '^@RG' ; done /home/lindenbaum-p/src/jvarkit/src/test/resources/S1.bam @RG ID:S1 SM:S1 LB:L1 CN:Nantes /home/lindenbaum-p/src/jvarkit/src/test/resources/S2.bam @RG ID:S2 SM:S2 LB:L2 CN:Nantes ```

Ah, my bad, I wrote SN, but it's SM; How did you add the read group ? by hand ?

ADD REPLY
0
Entering edit mode

Aaaa, it's that yes, SM, I added the RG flag with this:

samtools addreplacerg -r ID:sample1 -r SN:sample1 sample1.bam -o sample1_RG.bam

No problem, will change SN to SM and it should work. Thanks Pierre!

ADD REPLY

Login before adding your answer.

Traffic: 1081 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6