Question: merging fastq, sam or bam?
13 months ago by
ceboral0 wrote:

Hi all! I have some RNA-seq (single-read) datasets split across two different SRA accessions, one with ~30 million reads and the other with ~15 million reads. I have read that I could merge the FASTQ, SAM, or BAM files, and I would like to know whether the choice makes any difference to the quality of the final dataset. Thanks!!

sam bam fastq • 626 views
modified 13 months ago by ATpoint15k • written 13 months ago by ceboral0

There should not be any difference, as long as you process the files identically before merging the BAM files.

written 13 months ago by genomax65k
13 months ago by
ATpoint15k wrote:

I recommend quality-trimming and aligning them independently, with the aligner piped directly into SAMtools sort (which avoids writing unnecessary SAM files). Then check the alignment rate for every file and keep only those you are comfortable with. I have seen technical replicates (the same library sequenced on multiple lanes over several years as part of a large published study) with strikingly different quality: the first replicate had an alignment rate of about 95%, and the last about 40%, with a lot of junk reads (maybe the sample degraded over time in the freezer, I don't know). In any case, do not merge too early, as you may lose the ability to discard bad samples if necessary. Do not assume that published data are always of good quality; there are a lot of junk datasets in the SRA.
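As a rough sketch of that workflow (not a tested pipeline), the Python helpers below build the shell commands and parse `samtools flagstat` output to decide which BAM files to merge. The aligner (HISAT2 here), the file names, and the 80% cutoff are assumptions chosen for illustration; the thread names neither an aligner nor a threshold, so substitute whatever you actually use.

```python
import re
import shlex


def align_and_sort_cmd(fastq, index, out_bam, threads=4):
    """Build a pipeline that streams aligner output straight into
    `samtools sort`, so no intermediate SAM file is written.
    HISAT2 is an assumed aligner; any aligner that writes SAM to stdout works."""
    return (
        f"hisat2 -p {threads} -x {shlex.quote(index)} -U {shlex.quote(fastq)} "
        f"| samtools sort -@ {threads} -o {shlex.quote(out_bam)} -"
    )


def alignment_rate(flagstat_text):
    """Return mapped/total as a fraction, parsed from `samtools flagstat` text.
    Counts are taken from the QC-passed column of the 'in total' and 'mapped' lines."""
    total = int(re.search(r"^(\d+) \+ \d+ in total", flagstat_text, re.M).group(1))
    mapped = int(re.search(r"^(\d+) \+ \d+ mapped \(", flagstat_text, re.M).group(1))
    return mapped / total if total else 0.0


def merge_cmd(rates, out_bam, min_rate=0.8):
    """Build a `samtools merge` command for the BAMs whose alignment rate
    passes min_rate (a hypothetical cutoff). Returns None if nothing passes."""
    keep = [bam for bam, rate in sorted(rates.items()) if rate >= min_rate]
    if not keep:
        return None
    return "samtools merge " + " ".join(shlex.quote(p) for p in [out_bam] + keep)


if __name__ == "__main__":
    # Hypothetical file names; run the printed commands in a shell.
    print(align_and_sort_cmd("run1.fastq.gz", "genome_index", "run1.bam"))
    print(align_and_sort_cmd("run2.fastq.gz", "genome_index", "run2.bam"))
    # After running `samtools flagstat runN.bam`, feed its text to alignment_rate()
    # and pass the resulting {bam: rate} dict to merge_cmd().
```

Keeping the per-run BAM files separate until after the alignment-rate check is what preserves the option of dropping a bad run before merging.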

modified 13 months ago • written 13 months ago by ATpoint15k