Bam Files And Reads Quality Check
11.8 years ago
madkitty ▴ 690

I want to remove certain reads from my BAM file, and the output looks odd.

I tried to filter BAM files with samtools view and awk, but the output file is three times the size of the input. When I then reuse this output file, I get an error saying that the header is missing. For example, when I run the following:

samtools view 10iPS-1.sorted.bam |
awk '
BEGIN {
dict[65]
dict[177]
}
$2 in dict' > 1IPS-BAM1-RQ.bam

I don't think awk is the best way to select only certain reads from a BAM file. If you have a better method, please let me know!

11.8 years ago
Rok ▴ 190

This depends on how you want to do the filtering. If you have a list of reads you want to include or exclude, you can use FilterSamReads from Picard tools. It also works if you want to include or exclude aligned reads.

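The size of the output is tripled because samtools view prints uncompressed SAM text by default, so you are storing the data as SAM even though the file is named .bam. To compress the output back into BAM format, and to carry the header through the filter, you should use: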

samtools view -h 10iPS-1.sorted.bam |
awk '
  BEGIN {
    dict[65]
    dict[177]
  }
  # keep header lines (-h prints them) and reads whose FLAG is 65 or 177
  /^@/ || ($2 in dict)' |
samtools view -bS - > 10iPS-1.filtered.bam
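
To see the filtering logic without needing samtools or a real BAM file, here is a minimal sketch that runs the same kind of awk filter on a tiny SAM text. The reads, the reference name chr1, and the function names make_demo_sam and filter_flags are all made up for illustration; only the awk condition mirrors the command above.

```shell
# Print a tiny, made-up SAM file: two header lines and three reads.
make_demo_sam() {
  printf '@HD\tVN:1.6\tSO:coordinate\n'
  printf '@SQ\tSN:chr1\tLN:1000\n'
  printf 'r1\t65\tchr1\t10\t60\t4M\t=\t50\t0\tACGT\tIIII\n'
  printf 'r2\t99\tchr1\t20\t60\t4M\t=\t60\t44\tACGT\tIIII\n'
  printf 'r3\t177\tchr1\t30\t60\t4M\t=\t70\t0\tACGT\tIIII\n'
}

# Keep header lines (they start with "@") plus reads whose
# FLAG column (field 2) is exactly 65 or exactly 177.
filter_flags() {
  awk -F'\t' 'BEGIN { keep[65]; keep[177] } /^@/ || ($2 in keep)'
}

# The header and reads r1 and r3 should survive; r2 (FLAG 99) should not.
make_demo_sam | filter_flags
```

Without the /^@/ part, the header lines would be dropped by the filter, which is one way to end up with the "header is missing" complaint from the question.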

Yeah, this is the answer. I bet you will find that your 1IPS-BAM1-RQ.bam is human readable, which means it's an uncompressed .sam file.

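Note that samtools view might refuse to compress it back without headers. samtools view leaves the header out unless you pass -h, and an awk filter on the flag column will likely strip header lines anyway. So you may have to keep them or add them back before using samtools view to compress back to .bam.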

11.8 years ago

Use samtools view with the -f or -F options:

 -f INT   required flag, 0 for unset [0]
 -F INT   filtering flag, 0 for unset [0]

see also : http://picard.sourceforge.net/explain-flags.html
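
One thing worth knowing before swapping the awk test for -f/-F: these options compare individual bits, not whole values. -f keeps a read only if every bit in INT is set in its FLAG, and -F drops a read if any bit in INT is set. The sketch below imitates that rule with plain shell arithmetic so it can be tried without samtools; the helper name passes_flag_filter is made up for illustration and is not a samtools command.

```shell
# Succeeds (exit status 0) if FLAG would pass `-f REQUIRED -F EXCLUDED`:
# all REQUIRED bits must be set and no EXCLUDED bit may be set.
passes_flag_filter() {
  flag=$1
  required=$2
  excluded=$3
  [ $(( flag & required )) -eq "$required" ] &&
    [ $(( flag & excluded )) -eq 0 ]
}

# 65 = 1 (paired) + 64 (first in pair)
passes_flag_filter 65 65 0 && echo "65 passes -f 65"

# 97 = 1 + 32 + 64 also has both bits of 65 set, so -f 65 keeps it too
passes_flag_filter 97 65 0 && echo "97 passes -f 65"

# 177 = 1 + 16 + 32 + 128 lacks bit 64, so -f 65 drops it
passes_flag_filter 177 65 0 || echo "177 fails -f 65"

# Excluding every other bit (4095 - 65 = 4030) leaves only FLAG == 65
passes_flag_filter 97 65 4030 || echo "97 fails -f 65 -F 4030"
```

So something like samtools view -b -f 65 in.bam keeps more than the reads whose flag equals 65 exactly. If you really need "exactly 65 or exactly 177", the awk test on column 2 does that directly, while with -f/-F you would probably have to run the filter once per value and combine the results.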


Hey, thanks for your answer. When I use the field separator like this:

     awk -F'\t' 'BEGIN ..  ?

The output is the same: the file is still three times the size. I don't think BAM files are meant to be manipulated with awk; there should be a way to do this with Picard tools or something similar.
