bcftools multiple samples
0
0
Entering edit mode
4.5 years ago
evelyn ▴ 230

Hello,

I have downsampled sorted bam files using:

samtools view -h -s 0.5 E.sorted.bam > E.5x.sorted.sam

And again compressed the files:

samtools -bS E.5x.sorted.sam > E.5x.sorted.bam
samtools index E.5x.sorted.bam

Then I want to do variant calling using bcftools with multiple samples altogether in the list named bams.list with E.5x.sorted.bam files:

bcftools mpileup -Ov -f ref.fa -b bams.list | bcftools call -mv -o example.vcf

But it stops abruptly without giving an error. But it runs using the same command when I use the E.5x.sorted.sam files. I am not sure why it is not working with E.5x.sorted.bam. Thank you for your help!

snp • 3.3k views
ADD COMMENT
0
Entering edit mode

What is the terminal message that comes up when it stops? Have you check RAM consumption at that point?

ADD REPLY
0
Entering edit mode

There is no error showing. I just got this:

Note: none of --samples-file, --ploidy or --ploidy-file given, assuming all sites are diploid
[mpileup] 25 samples in 25 input files

This is the same message that I get for successful runs.

ADD REPLY
0
Entering edit mode

I see, but is there no further message even from your operating system? When you say that it just 'stops', what actually happens? - it just returns to the command line? Does it produce the VCF?

What is the content of bams.list?

But it stops abruptly without giving an error. But it runs using the same command when I use the E.5x.sorted.sam files. I am not sure why it is not working with E.5x.sorted.bam.

This sentence is somewhat contradictory, because you imply that it both does and does not work with E.5x.sorted.bam?

ADD REPLY
0
Entering edit mode

When I submit the script, it stops running after almost 10 seconds and I can see this output message that I shared earlier and it also produces the only header of the vcf file.

bams.list is a list of sorted bam files.

Yes, that is what I am trying to understand. It works when I use the SAM files obtained after downsampling sorted bam files but does not work when I use the sorted bam files by compressing the SAM files again for downsampled files.

ADD REPLY
0
Entering edit mode

I see. From what you're telling me, the command is actually running successfully but just not outputting anything. Are you downsampling your data too much such that there are very low numbers of reads remaining? For example, what is the depth of coverage profile for the problematic BAM? - you can use bedtools coverage to quickly check, I think

ADD REPLY
0
Entering edit mode

I am downsampling the files to an average of ~8x. How does it works for the SAM but not for the sorted Bam both for downsampled files.

ADD REPLY
0
Entering edit mode

Have you checked the contents of both?

ADD REPLY
0
Entering edit mode

I have checked the files that I compressed again using samtools view -bS E.5x.sorted.sam > E.5x.sorted.bam and have found that these output sorted bam files are all of size 71. So something is definitely wrong in compressing the files again.

ADD REPLY
0
Entering edit mode

Oh, should that not be:

samtools view -b E.5x.sorted.sam > E.5x.sorted.bam
ADD REPLY
0
Entering edit mode

I have updated my previous comment as I used

samtools view -bS E.5x.sorted.sam > E.5x.sorted.bam`

I can try using what you have suggested and will let you know. Thank you!

ADD REPLY
0
Entering edit mode

Ah, I see... it should, therefore, be okay. What are the contents of these SAM files? - do they contain reads?

ADD REPLY
0
Entering edit mode

Yes, these SAM files are almost 10 GB in size and contain reads. So I am trying to call variants for all files together for one chromosome at a time by specifying bcftools mpileup -r In order to use -r, index files of the respective input files are needed. When I try indexing these SAM files, it fails. So I wanted to convert these SAM files to SORTED BAM and then index these to use for variant calling for all files at the same time for one chromosome.

ADD REPLY
0
Entering edit mode

I imagine, therefore, that the issue is that you do not have enough RAM to sort the SAM files. How much RAM is available on your current machine?

ADD REPLY
0
Entering edit mode

I have 8gb on mac that I am using. However, I use the cluster for such jobs.

ADD REPLY
0
Entering edit mode

It is difficult for me to understand what is happening from here (remotely). Just some final pointers:

  • check the contents of all of your files, and at every step (including input and output)
  • check the default parameters of all of your commands
  • ensure that 'downsampling' to 5x is doing what you think that it is doing
ADD REPLY

Login before adding your answer.

Traffic: 2569 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6