Question: Need to parse pooled bam file based on sequence
gravatar for djvanderweele
5.1 years ago by
djvanderweele0 wrote:

I have a bam file (pool.bam) that represents PCR amplicons of many samples, pooled into one sequencing lane.  For some regions, the same amplicon was generated and sequenced for more than one sample.  There is a 3 base index at the beginning of the PCR primer that is used to tell the samples apart.  To look at the specific position of interest in each sample, I'd like to parse out the samples in the bam file, and use mpileup to evaluate the SNV that was amplified.  I tried converting to sam file (samtools view -o pool.sam pool.bam) and using grep for the individual primers used for each sample (more | grep TCACAAGAAACCCTGCT > grep.sam), which appeared to work well.  But if I try to convert to bam and index it before using mpileup, I get an error: missing SAM header; parse error at line 1; truncated file.

What is a better way to do this?

sequencing dnaseq • 1.6k views
ADD COMMENTlink modified 5.1 years ago by DG7.2k • written 5.1 years ago by djvanderweele0
gravatar for DG
5.1 years ago by
DG7.2k wrote:

If you do samtools view -H pool.sam > grep.sam followed by your grep but using an append: more | grep TCACAAGAAACCCTGCT >> grep.sam You should have the complete sam header at the start of your file. That will solve the particular problem you are having.

That said you could potentially get some unexpected results filtered into your data by just using grep like that, and depending on your experimental design may be missing data as well. You are only looking at the primer from one direction (and thats ok, depending on your experiment you may only expect it that way), but you could also get some off-target matches. Although that would also indicate some bad primer choices for your amplicons as well.

ADD COMMENTlink modified 13 months ago by _r_am32k • written 5.1 years ago by DG7.2k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1912 users visited in the last hour