Question: Is a BAM file merged after mapping the same as a BAM file mapped from merged FASTQ files?
2.1 years ago by woongjaej (10) wrote:

Hi, guys

I'm processing NGS data and have a question.

I need each of my samples to have more than 100,000,000 reads, so when the first round of processing is done, I check whether the BAM files meet that target. When a BAM file has fewer than 100,000,000 reads, I sequence that library again to get the reads it still needs.

Here are my questions:

  1. Assuming the library, sample, sequencing machine and everything else are exactly the same, is a BAM file merged after mapping the same as a BAM file made by merging the FASTQ files first and then mapping?

  2. If they are the same, can I merge the BAM files using samtools or sambamba?

Thank you very much.


Tags: sequencing, bam, merge
2.1 years ago by Devon Ryan (94k), Freiburg, Germany, wrote:

Yes, they'll be essentially the same. There's always a bit of randomness with aligners, so you might find things like a different primary alignment for some multimappers. But everything high quality should be the same. Yes, you can then merge the BAM files with sambamba or samtools. If you're doing variant calling, be sure to assign appropriate read groups to each run.
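As a rough illustration of the read-group and merge steps mentioned above, here is a minimal Python sketch that only builds the command lines for `samtools addreplacerg` and `samtools merge`. The file names, read group IDs, sample and library names are hypothetical placeholders, and the sketch assumes the per-run BAM files are already coordinate-sorted, which `samtools merge` expects by default.

```python
def read_group_cmd(in_bam, out_bam, rg_id, sample, library, platform="ILLUMINA"):
    """Build a `samtools addreplacerg` command that tags the reads of one run.

    Each -r option carries one @RG field; samtools joins repeated -r
    options with tabs to form the read group header line.
    """
    cmd = ["samtools", "addreplacerg"]
    for field in (f"ID:{rg_id}", f"SM:{sample}", f"LB:{library}", f"PL:{platform}"):
        cmd += ["-r", field]
    return cmd + ["-o", out_bam, in_bam]


def merge_cmd(out_bam, in_bams, threads=1):
    """Build a `samtools merge` command: output file first, then the inputs."""
    return ["samtools", "merge", "-@", str(threads), out_bam, *in_bams]


if __name__ == "__main__":
    # Hypothetical example: two sequencing runs of the same library.
    print(read_group_cmd("run1.bam", "run1.rg.bam", "run1", "sampleA", "lib1"))
    print(read_group_cmd("run2.bam", "run2.rg.bam", "run2", "sampleA", "lib1"))
    print(merge_cmd("merged.bam", ["run1.rg.bam", "run2.rg.bam"], threads=4))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` once samtools is installed. Giving each run its own `ID` while sharing `SM` and `LB` keeps the runs distinguishable after merging.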


I guess this does not hold true in an RNA-seq setting, since some aligners have a threshold for junction detection. If you map split files separately, you may miss junctions that fall below that threshold in each file. The merged BAM is then still missing these junctions, whereas mapping the complete read set finds them and stores them in the BAM file.

For DNAseq, I agree.

written 2.1 years ago by michael.ante (3.6k)

If you're doing something with 2-pass then yes, you could theoretically miss something. Given the numbers getting tossed around by OP I suspect that's not the case.

written 2.1 years ago by Devon Ryan (94k)

Thank you for the replies guys.

So you mean I can either merge the FASTQ files first and then map them, or map the additional FASTQ file first and then merge the resulting BAM file with the existing BAM file, right?

written 2.1 years ago by woongjaej (10)

Right. Some things, like looking for novel splice junctions, work better if you align everything in a single go (so merge the fastq files). For most other things it doesn't much matter if you merge fastq or BAM files, you get more or less the same result either way.
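For the "merge the fastq files" route, a minimal Python sketch is shown below. It relies on the fact that gzip files can be concatenated byte for byte and still decompress as one stream, so no recompression is needed. The function name and file names are hypothetical, and for paired-end data the R1 files and the R2 files would need to be concatenated separately and in the same run order.

```python
import shutil
from pathlib import Path


def concat_fastq_gz(inputs, output):
    """Concatenate gzipped FASTQ files byte for byte into one .gz file.

    A gzip file may contain several compressed members back to back,
    so the result decompresses to the inputs' contents in order.
    """
    with open(output, "wb") as out:
        for path in inputs:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)
    return Path(output)


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python concat.py merged.fastq.gz run1.fastq.gz run2.fastq.gz
    if len(sys.argv) >= 3:
        concat_fastq_gz(sys.argv[2:], sys.argv[1])
```

The merged file can then be aligned in a single run, which is the case where junction discovery sees all reads at once.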

written 2.1 years ago by Devon Ryan (94k)

Thank you Ryan.

P.S.: I'm learning a lot from your replies to other questions!

written 2.1 years ago by woongjaej (10)