Question: Is merging BAM files after mapping the same as merging FASTQ files and then mapping?
12 months ago by
woongjaej wrote:

Hi, guys

I'm processing NGS data and have a question.

I need each dataset to have over 100,000,000 reads, so once the first round of processing is done, I check whether each BAM file reaches that number. When a BAM file has fewer than 100,000,000 reads, I sequence that library again to get the reads it still needs.

Here are my questions:

  1. Assuming the library, sample, sequencing machine and everything else are exactly the same, is a BAM file made by mapping each FASTQ file and then merging the results the same as a BAM file made by merging the FASTQ files first and then mapping?

  2. If they are the same, can I merge the BAM files using samtools or sambamba?

Thank you very much.


12 months ago by
Devon Ryan (Freiburg, Germany) wrote:

Yes, they'll be essentially the same. There's always a bit of randomness with aligners, so you might find things like a different primary alignment for some multimappers. But everything high quality should be the same. Yes, you can then merge the BAM files with sambamba or samtools. If you're doing variant calling, be sure to assign appropriate read groups to each run.
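
As a rough sketch of the "assign read groups, then merge" step, the Python below only builds the command lines rather than running them. The run IDs, sample and library names, and file names are made up for illustration, and it assumes samtools is on your PATH and that the input BAM files are already coordinate-sorted.

```python
import shlex


def read_group_line(run_id, sample, library, platform="ILLUMINA"):
    """Build an @RG header line: ID differs per run, SM and LB stay the same."""
    fields = ["@RG", f"ID:{run_id}", f"SM:{sample}", f"LB:{library}", f"PL:{platform}"]
    # Join with a literal backslash-t, the form commonly passed to samtools.
    return "\\t".join(fields)


def tag_command(in_bam, out_bam, rg_line):
    """Command that assigns one read group to the reads of a single run's BAM."""
    return ["samtools", "addreplacerg", "-r", rg_line, "-o", out_bam, in_bam]


def merge_command(out_bam, in_bams, threads=4):
    """Command that merges several sorted BAM files into one."""
    return ["samtools", "merge", "-@", str(threads), out_bam, *in_bams]


if __name__ == "__main__":
    # Hypothetical names: two sequencing runs of the same library.
    runs = {"run1": "run1.bam", "run2": "run2.bam"}
    tagged = []
    for run_id, bam in runs.items():
        out = f"{run_id}.rg.bam"
        rg = read_group_line(run_id, "sampleA", "lib1")
        print(shlex.join(tag_command(bam, out, rg)))
        tagged.append(out)
    print(shlex.join(merge_command("merged.bam", tagged)))
```

To actually run a command, pass the list to `subprocess.run(cmd, check=True)`. With sambamba, the merge step has the same shape: `sambamba merge -t 4 merged.bam run1.rg.bam run2.rg.bam`.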


I guess this does not hold true in an RNA-seq setting, since some aligners have a threshold for junction detection. If you map split files separately, you may miss junctions that fall below the threshold in each file. The merged BAM will still be missing these junctions, whereas mapping the complete set of reads finds them and stores them in the BAM file.

For DNAseq, I agree.
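
A toy illustration of the threshold effect described above. The threshold of 3 supporting reads and the junction names are invented for the example and are not taken from any real aligner; real junction detection is more involved than simple counting.

```python
from collections import Counter

MIN_SUPPORT = 3  # hypothetical threshold, purely for illustration


def detected_junctions(junction_hits, min_support=MIN_SUPPORT):
    """Return the junctions supported by at least min_support spliced reads."""
    counts = Counter(junction_hits)
    return {junction for junction, n in counts.items() if n >= min_support}


# Each entry stands for one spliced read, labelled by the junction it spans.
run1 = ["chr1:100-500"] * 2 + ["chr2:40-900"] * 5
run2 = ["chr1:100-500"] * 2 + ["chr2:40-900"] * 4

# Mapping each run on its own, then merging the results.
separate = detected_junctions(run1) | detected_junctions(run2)
# Mapping all reads in one go.
together = detected_junctions(run1 + run2)

print(sorted(separate))  # ['chr2:40-900']
print(sorted(together))  # ['chr1:100-500', 'chr2:40-900']
```

The junction with two reads per run never reaches the threshold in either run alone, so merging the per-run results cannot bring it back.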

written 12 months ago by michael.ante

If you're doing something with 2-pass then yes, you could theoretically miss something. Given the numbers getting tossed around by OP I suspect that's not the case.

written 12 months ago by Devon Ryan

Thank you for the replies, guys.

So you mean I can either merge the FASTQ files first and then map them, or map the additional FASTQ file first and then merge its BAM file with the existing BAM file, right?

written 12 months ago by woongjaej

Right. Some things, like looking for novel splice junctions, work better if you align everything in a single go (so merge the fastq files). For most other things it doesn't much matter if you merge fastq or BAM files, you get more or less the same result either way.
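
If you go the "merge the fastq files" route, one possible sketch in Python is below. It assumes gzipped FASTQ files with 4 lines per record, and relies on the fact that gzip files concatenated byte-for-byte still form a valid gzip file. The function names are made up for this example. For paired-end data, concatenate the R1 files and the R2 files in the same run order so the mates stay in sync.

```python
import gzip
import shutil


def concatenate_files(inputs, output):
    """Append the raw bytes of each input file to output, in the given order."""
    with open(output, "wb") as out:
        for path in inputs:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)


def count_fastq_reads(path):
    """Count records in a gzipped FASTQ file, assuming 4 lines per record."""
    with gzip.open(path, "rt") as handle:
        return sum(1 for _ in handle) // 4
```

Calling `count_fastq_reads` on the concatenated file is a quick way to see whether the combined runs reach the read count you are aiming for before you spend time on mapping.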

written 12 months ago by Devon Ryan

Thank you Ryan.

P.S.: I'm learning a lot from your replies to other questions!

written 12 months ago by woongjaej