Question

Issue with Samtools variant calling

0

Entering edit mode

5.8 years ago

williamsbrian5064 ▴ 510

Hi,

I had a similar question about this last week but wasn't super clear about what I was asking. I have some old bam files that I don't have the original fastq files for. I tried using picard Samtofastq but I am having an issues when I realign the reads. I have been using picard to revert the old bam files back to fastq files, bwa to align the reads, and picard to sort and deduplicate. I used samtools flagstat to check out what was going on with my new bam files and this is what I got

473763117 + 0 in total (QC-passed reads + QC-failed reads)
1618429 + 0 secondary
0 + 0 supplementary
24690025 + 0 duplicates
473247331 + 0 mapped (99.89% : N/A)
472144688 + 0 paired in sequencing
236072344 + 0 read1
236072344 + 0 read2
466910892 + 0 properly paired (98.89% : N/A)
471430496 + 0 with itself and mate mapped
198406 + 0 singletons (0.04% : N/A)
3805718 + 0 with mate mapped to a different chr
2458472 + 0 with mate mapped to a different chr (mapQ>=5)

I have checked out my files and it doesn't look like it is an issue with bwa or picard sort/deduplicate. The only thing I can think of is that there is some issue with picard samtofastq. So the issue is that samtools mpileup isn't calling any variants for those particular samples. All the variants are "no calls" or ./. and I am not sure why this is happening? I know it's not an issue with samtools mpileup because it works fine with all my other samples, it is just those old files that are giving me an issue. I know it is just some minor issue but I am just not sure where to look.

This is the command I use

java -jar picard.jar SamToFastq I=C01767.bam FASTQ=C01767.R1.fastq SECOND_END_FASTQ=C01767.R2.fastq

Am I doing something wrong here?

I know GATK has a way to do this, but their method only give you one files here kind of want both of the fastq files

genome Assembly alignment • 1.4k views

ADD COMMENT • link updated 5.7 years ago by Biostar 20 • written 5.8 years ago by williamsbrian5064 ▴ 510

2

Entering edit mode

Keep in mind that BWA mem expects fastq files to be randomly ordered rather than by by chromosome (I mean reverting a sorted BAM file to fastq of course). That is due to how it calculates the insert sizes. To do that, you can simply sort the BAM by name and pipe that stream into your tool of choice. Sorting by name is equivalent to the order as it came from the sequencer.

ADD REPLY • link 5.8 years ago by ATpoint 81k

0

Entering edit mode

These files were different from my other posts. These were not split up by chromosome. I have to revert the bam files back to fastq files because there was an updated genome that came out and I want to align it to the updated genome.

ADD REPLY • link 5.8 years ago by williamsbrian5064 ▴ 510

1

Entering edit mode

Context: Issue with reverting bam file back to fastq files

ADD REPLY • link 5.8 years ago by Ram 43k

1

Entering edit mode

In your previous post you were suggested to try samtools fastqand to check the sorting by readname of your bam file. How did those options go?

ADD REPLY • link 5.8 years ago by WouterDeCoster 47k

0

Entering edit mode

Ahhh I tried using samtools bam2fq but was disappointed with the lack of options. I didn't realize samtools fastq had so many more options. I will give that a try. How do you check the sorting by readname of a bam file?

ADD REPLY • link 5.8 years ago by williamsbrian5064 ▴ 510

0

Entering edit mode

Those are redundant options. bam2fq is supposed to be deprecated in future. I am not sure what version of samtools you are using but I get identical help if I use either of those options with samtools v.1.8.

ADD REPLY • link 5.8 years ago by GenoMax 141k

0

Entering edit mode

I am using version 1.8

ADD REPLY • link 5.8 years ago by williamsbrian5064 ▴ 510

0

Entering edit mode

Then this does not seem correct.

I didn't realize samtools fastq had so many more options.

ADD REPLY • link 5.8 years ago by GenoMax 141k