Question: How to remove bam files that don't contain matching reads
18 months ago by
samlambrechts299150 wrote:

Dear Biostars,

I have 45 directories, each containing 2500 bam and 2500 bam.bai files. Each bam file represents alignment results from aligning (shotgun metagenomic) sequences to a reference fasta file. Many of the bam files are empty and only contain the header and no matching/aligned reads. Is there a way to remove these bam files that don't contain matching reads?

Cheers,

Sam

Tags: samtools, bam, bowtie2
18 months ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum130k wrote:

Check with:

 find /path/to/dir -type f -name "*.bam" | while IFS= read -r F; do samtools view "${F}" | grep -v -E '^@' -m1 > /dev/null || echo "${F}"; done

This prints every BAM file that has no alignment records. Once the list looks right, replace echo with rm.
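For anyone who wants the same check in a reusable form, here is a minimal sketch. It rests on a few assumptions: samtools is on the PATH, "matching reads" means mapped reads (hence -F 4, which skips unmapped records), and each index sits next to its BAM file as file.bam.bai, as described in the question. The function names count_mapped_reads and prune_empty_bams are made up for illustration. Unlike the one-liner, it skips files that samtools cannot read instead of treating them as empty.

```shell
# Print the number of mapped reads in one BAM file.
# -c counts records instead of printing them; -F 4 excludes unmapped reads.
count_mapped_reads() {
    samtools view -c -F 4 "$1"
}

# List every BAM file under directory $1 that has zero mapped reads.
# With "delete" as the second argument, also remove the file and its index.
prune_empty_bams() {
    find "$1" -type f -name '*.bam' | while IFS= read -r bam; do
        # A samtools failure (e.g. truncated file) is reported, not deleted.
        if ! n=$(count_mapped_reads "$bam"); then
            echo "skipping unreadable file: $bam" >&2
            continue
        fi
        if [ "$n" -eq 0 ]; then
            echo "$bam"
            if [ "${2:-list}" = delete ]; then
                rm -f -- "$bam" "$bam.bai"
            fi
        fi
    done
}
```

Run prune_empty_bams /path/to/dir first to review the list, then prune_empty_bams /path/to/dir delete once it looks right. Drop -F 4 if files holding only unmapped reads should be kept as well.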


Works perfectly! Thank you!

written 18 months ago by samlambrechts299150

May I suggest using samtools view -H ${F}, so that only the header is extracted and inspected?

written 18 months ago by Antonio R. Franco4.5k

No, because there is always a header, even in a file with no reads.

written 18 months ago by Pierre Lindenbaum130k