Question: How can I restore paired-end FASTQ files from a sorted BAM file so that the FASTQ files remain sorted?
lghust2011 wrote (2.2 years ago):

I have two paired-end FASTQ files named fq1.fastq and fq2.fastq. I use BWA MEM to map these two files in paired-end mode, like this:

bwa mem reference.fasta fq1.fastq fq2.fastq > result.sam

Then I convert this SAM file to BAM format and sort it by coordinates with samtools:

samtools view -b result.sam > result.bam && samtools sort -o result_sort.bam result.bam

Now I want to restore these two FASTQ files from result_sort.bam. I know that I can use Picard SamToFastq to do this, but the reads in the resulting FASTQ files do not stay in the same order as in result_sort.bam. So, is there any other way to restore the FASTQ files with the reads still sorted? I want to do this because, when I get fq1.fastq and fq2.fastq from result_sort.bam and map them again with BWA, I would get a SAM file in which many reads are already sorted. Any reply will be much appreciated!

sequence alignment • 1.3k views
Written 2.2 years ago by lghust2011 • modified 2.2 years ago by Carlo Yague

I want to do this, because when I get fq1.fastq and fq2.fastq from result_sort.bam and use BWA to map them again, I can get a sam file in which many reads are sorted

samtools sort also accepts a SAM file as input and can write SAM output. But why do you need a SAM file instead of BAM?
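
As a rough sketch of that suggestion, assuming a samtools 1.x release (where `sort` detects the input format and `-O` selects the output format), a SAM file can be coordinate-sorted straight to SAM. The function name `sort_sam_by_coordinate` and the output file name are only illustrative, not part of samtools.

```shell
# Hypothetical helper (name is illustrative, not part of samtools).
# Coordinate-sorts a SAM file and writes the result as SAM again.
sort_sam_by_coordinate() {
    in_sam=$1
    out_sam=$2
    # Stop early if the input is missing or samtools is not installed.
    [ -r "$in_sam" ] || { echo "cannot read $in_sam" >&2; return 1; }
    command -v samtools >/dev/null 2>&1 || { echo "samtools not found" >&2; return 1; }
    # -O sam keeps the output as SAM text; -o names the output file.
    samtools sort -O sam -o "$out_sam" "$in_sam"
}

# Example call with the file name from the question:
# sort_sam_by_coordinate result.sam result_sort.sam
```

If the intermediate file is not needed, piping the aligner output into the sorter, as in `bwa mem reference.fasta fq1.fastq fq2.fastq | samtools sort -o result_sort.bam -`, is a common way to get sorted output in one step.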

fin swimmer

Reply written 2.2 years ago by finswimmer

Either sam or bam format is OK for me

Reply written 2.2 years ago by lghust2011

May I ask why you need this?

Reply written 2.2 years ago by Gabriel R.

I want to get the sorted fastq files, so I can get the sorted sam file directly when I run BWA MEM with sorted fastq files. But it seems very difficult!

Reply written 2.2 years ago by lghust2011

OK, but what about reads that get assigned to random positions because of poor mapping quality? They will not end up in the same position. And to reiterate my initial question: may I ask why you need this?

Reply written 2.2 years ago by Gabriel R.
Carlo Yague (Canada) wrote (2.2 years ago):

You can't do that!

The bwa manual states:

bwa mem [...] db.prefix reads.fq [mates.fq]
[...] If mates.fq file is absent and option -p is not set, this command regards input reads are single-end. If mates.fq is present, this command assumes the i-th read in reads.fq and the i-th read in mates.fq constitute a read pair.

This means that the read pairs must appear in the same order in both FASTQ files given to bwa mem (for example, sorted by name), not sorted by coordinates.
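
To make the "i-th read pairs with i-th read" rule concrete, here is a minimal sketch that counts positions where the two files disagree. It uses only awk; the function name `count_mismatched_pairs` is made up for illustration, and it assumes plain 4-line FASTQ records whose names match once a trailing /1 or /2 and any comment are stripped.

```shell
# Hypothetical helper: prints how many record positions hold reads whose
# names differ between the two FASTQ files (0 means every i-th pair matches).
# Assumes plain 4-line FASTQ records, and read names that are identical
# once a trailing /1 or /2 and any comment after whitespace are removed.
# Extra records at the end of the second file are not checked.
count_mismatched_pairs() {
    awk -v mates="$2" '
        # Reduce a header line such as "@read7/1 comment" to "read7".
        function clean(s) {
            s = substr(s, 2)
            sub(/[ \t].*$/, "", s)
            sub(/\/[12]$/, "", s)
            return s
        }
        {
            # Advance the mates file in step with the reads file.
            if ((getline mate < mates) <= 0) mate = ""
        }
        NR % 4 == 1 && clean($0) != clean(mate) { n++ }
        END { print n + 0 }
    ' "$1"
}

# Example call with the file names from the question:
# count_mismatched_pairs fq1.fastq fq2.fastq
```

A non-zero count on FASTQ files extracted in coordinate order would indicate that mates no longer line up, which is the situation bwa mem cannot handle in paired-end mode.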


EDIT: To clarify, the OP's idea of keeping the order of reads extracted from a coordinate-sorted BAM file could work, but only with single-end reads. I guess that his goal is to avoid sorting the BAM file after re-mapping and so save some time (why he would want to re-map is another question). However, as stated above, aligners such as bwa-mem require the read pairs to correspond in the fastq.1 and fastq.2 files, which makes coordinate sorting inappropriate.
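
If re-mapping is still wanted, one option that respects this requirement is to group mates together before extracting them. The following is a minimal sketch, assuming a samtools 1.x release that provides `collate` and `fastq`; the function name `restore_paired_fastq` and the output file names are hypothetical.

```shell
# Hypothetical helper (name and output file names are illustrative).
# Groups mates together, then splits them into two FASTQ files whose
# i-th records belong to the same pair. The output order follows
# samtools collate, not the coordinates and not the original FASTQ order.
restore_paired_fastq() {
    bam=$1
    out1=$2
    out2=$3
    # Stop early if the input is missing or samtools is not installed.
    [ -r "$bam" ] || { echo "cannot read $bam" >&2; return 1; }
    command -v samtools >/dev/null 2>&1 || { echo "samtools not found" >&2; return 1; }
    # collate: -u uncompressed output, -O write to stdout.
    # fastq: -1/-2 receive first/second mates, -0 and -s discard reads
    # that are not flagged as first or second mate or that have lost
    # their mate, -n leaves read names without a /1 or /2 suffix.
    samtools collate -u -O "$bam" |
        samtools fastq -1 "$out1" -2 "$out2" -0 /dev/null -s /dev/null -n -
}

# Example call with the file name from the question:
# restore_paired_fastq result_sort.bam restored_1.fastq restored_2.fastq
```

The resulting files are suitable as paired-end input, but the alignment produced from them would still need a coordinate sort afterwards.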

Written and modified 2.2 years ago by Carlo Yague

Thanks for your answer!

Reply written 2.2 years ago by lghust2011