Hi,
I had a similar question about this last week but wasn't super clear about what I was asking. I have some old bam files that I don't have the original fastq files for. I tried using picard Samtofastq but I am having an issues when I realign the reads. I have been using picard to revert the old bam files back to fastq files, bwa to align the reads, and picard to sort and deduplicate. I used samtools flagstat to check out what was going on with my new bam files and this is what I got
473763117 + 0 in total (QC-passed reads + QC-failed reads)
1618429 + 0 secondary
0 + 0 supplementary
24690025 + 0 duplicates
473247331 + 0 mapped (99.89% : N/A)
472144688 + 0 paired in sequencing
236072344 + 0 read1
236072344 + 0 read2
466910892 + 0 properly paired (98.89% : N/A)
471430496 + 0 with itself and mate mapped
198406 + 0 singletons (0.04% : N/A)
3805718 + 0 with mate mapped to a different chr
2458472 + 0 with mate mapped to a different chr (mapQ>=5)
I have checked out my files and it doesn't look like it is an issue with bwa or picard sort/deduplicate. The only thing I can think of is that there is some issue with picard samtofastq. So the issue is that samtools mpileup
isn't calling any variants for those particular samples. All the variants are "no calls" or ./. and I am not sure why this is happening? I know it's not an issue with samtools mpileup
because it works fine with all my other samples, it is just those old files that are giving me an issue. I know it is just some minor issue but I am just not sure where to look.
This is the command I use
java -jar picard.jar SamToFastq I=C01767.bam FASTQ=C01767.R1.fastq SECOND_END_FASTQ=C01767.R2.fastq
Am I doing something wrong here?
I know GATK has a way to do this, but their method only give you one files here kind of want both of the fastq files
Keep in mind that BWA mem expects fastq files to be randomly ordered rather than by by chromosome (I mean reverting a sorted BAM file to fastq of course). That is due to how it calculates the insert sizes. To do that, you can simply sort the BAM by name and pipe that stream into your tool of choice. Sorting by name is equivalent to the order as it came from the sequencer.
These files were different from my other posts. These were not split up by chromosome. I have to revert the bam files back to fastq files because there was an updated genome that came out and I want to align it to the updated genome.
Context: Issue with reverting bam file back to fastq files
In your previous post you were suggested to try
samtools fastq
and to check the sorting by readname of your bam file. How did those options go?Ahhh I tried using
samtools bam2fq
but was disappointed with the lack of options. I didn't realizesamtools fastq
had so many more options. I will give that a try. How do you check the sorting by readname of a bam file?Those are redundant options.
bam2fq
is supposed to be deprecated in future. I am not sure what version ofsamtools
you are using but I get identical help if I use either of those options withsamtools v.1.8
.I am using version 1.8
Then this does not seem correct.