Question: How can I restore paired-end fastq files from a sorted bam file with fastq remain sorted?
0
gravatar for lghust2011
16 months ago by
lghust201190
lghust201190 wrote:

I have two paired-end FASQ files named fq1.fastq and fq2.fastq. Then I use BWA MEM to map these two files with the paired-end mode, like this:

bwa mem reference.fasta fq1.fastq fq2.fastq > result.sam

Then I transfer this sam file to bam format and sort it by coordinates with samtools:

samtools view result.sam > result.bam && samtools sort result.bam -o result_sort.bam

Now I want to restore these two fastq files from result_sort.bam. I know that I can use picard SamToFastq to do this. But reads in fastq files not remain sorted like the result_sort.bam. So, is there any other way that I can restore fastq files with reads sorted? I want to do this, because when I get fq1.fastq and fq2.fastq from result_sort.bam and use BWA to map then I again, I can get a sam file within that many reads are sorted. Any reply will be much appreciated!

sequence alignment • 972 views
ADD COMMENTlink modified 16 months ago by Carlo Yague4.4k • written 16 months ago by lghust201190

I want to do this, because when I get fq1.fastq and fq2.fastq from result_sort.bam and use BWA to map then I again, I can get a sam file within that many reads are sorted

samtools sort accept also a sam file as input and can write to sam file. But why do you need a sam file instead of bam?

fin swimmer

ADD REPLYlink written 16 months ago by finswimmer11k

Either sam or bam format is OK for me

ADD REPLYlink written 16 months ago by lghust201190

May I ask why you need this?

ADD REPLYlink written 16 months ago by Gabriel R.2.6k

I want to get the sorted fastq files, so I can get the sorted sam file directly when I run BWA MEM with sorted fastq files. But it seems very difficult!

ADD REPLYlink written 16 months ago by lghust201190

Ok, what about reads getting assigned to random positions due to poor mapping quality, they will not end up in the same position? And to re-iterate my initial question, may I ask why you need this?

ADD REPLYlink written 16 months ago by Gabriel R.2.6k
2
gravatar for Carlo Yague
16 months ago by
Carlo Yague4.4k
Belgium
Carlo Yague4.4k wrote:

You can't do that !

In bwa manual, it is stated that :

bwa mem [...] db.prefix reads.fq [mates.fq]
[...] If mates.fq file is absent and option -p is not set, this command regards input reads are single-end. If mates.fq is present, this command assumes the i-th read in reads.fq and the i-th read in mates.fq constitute a read pair.

Meaning that your read pairs must be sorted by name in the fastq files inputed in bwa mem, not by coordinates.


EDIT : To clarify, the OP's idea of keeping the order of reads extracted from a coordinate-sorted bam file could work, but only with single-end reads. I guess that is goal is to avoid sorting the bam file after re-mapping (why would he want to re-map is another question) and saving some time. However, as stated above, aligners such as bwa-mem require the read pairs to correspond in the fastq.1 and fastq.2 files, making the coordinate-sorting inappropriate.

ADD COMMENTlink modified 16 months ago • written 16 months ago by Carlo Yague4.4k

Thanks for your answer!

ADD REPLYlink written 16 months ago by lghust201190
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 695 users visited in the last hour