Question: How to remove bam files that don't contain matching reads
6 months ago by
samlambrechts299130 wrote:

Dear Biostars,

I have 45 directories, each containing 2500 bam and 2500 bam.bai files. Each bam file represents alignment results from aligning (shotgun metagenomic) sequences to a reference fasta file. Many of the bam files are empty and only contain the header and no matching/aligned reads. Is there a way to remove these bam files that don't contain matching reads?

Cheers,

Sam

samtools bam bowtie2 • 230 views
ADD COMMENTlink modified 6 months ago by Pierre Lindenbaum122k • written 6 months ago by samlambrechts299130
6 months ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum122k wrote:

Check with:

 find /path/to/dir -type f -name "*.bam" | while IFS= read -r F; do samtools view "${F}" | grep -v -E '^@' -m1 > /dev/null || echo "${F}"; done

This only prints the BAM files that contain no reads. Once the list looks right, replace echo with rm.
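
For anyone who prefers a reusable version, here is a minimal sketch of the same idea as two shell functions. The function names `bam_is_empty` and `prune_empty_bams` and the `--delete` switch are my own inventions for illustration, not part of samtools. It swaps the grep test for `samtools view -c`, which prints the number of records. That reads each file to the end, so it is slower than stopping at the first read, but if samtools fails on a file the output is not "0" and the file is left alone. It also removes the matching .bam.bai index, assuming the index sits next to the BAM as described in the question.

```shell
# Sketch only: assumes samtools is on PATH. Function names are hypothetical.

# Succeeds only when samtools reports exactly 0 records in the BAM.
# If samtools fails (e.g. truncated file), its output is empty, so the
# comparison fails and the file is NOT treated as empty.
bam_is_empty() {
    [ "$(samtools view -c "$1" </dev/null 2>/dev/null)" = "0" ]
}

# Print every empty BAM under directory $1.
# With "--delete" as $2, also remove the BAM and its .bai index.
# Note: file names containing newlines are not handled.
prune_empty_bams() {
    find "$1" -type f -name '*.bam' | while IFS= read -r bam; do
        if bam_is_empty "$bam"; then
            printf '%s\n' "$bam"
            if [ "${2:-}" = "--delete" ]; then
                rm -f -- "$bam" "$bam.bai"
            fi
        fi
    done
}

# Usage (dry run first, then delete):
#   prune_empty_bams /path/to/dir
#   prune_empty_bams /path/to/dir --delete
```

If the BAM files also hold unaligned reads and you want to treat those files as empty too, adding `-F 4` to the `samtools view -c` call should count only mapped reads. I have not tested that against this data set.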

ADD COMMENTlink modified 6 months ago • written 6 months ago by Pierre Lindenbaum122k

Works perfectly! Thank you!

written 6 months ago by samlambrechts299130

May I suggest using samtools view -H ${F}, so that only the header is extracted and inspected?

written 6 months ago by Antonio R. Franco4.1k

No, because there is always a header, even when the file contains no reads.
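
To see this for yourself, here is a small sketch that prints both numbers for one file. The function name `bam_counts` is made up for illustration. It uses `samtools view -H` for the header and `samtools view -c` for the record count. A header-only BAM should report a non-zero header line count and zero records.

```shell
# Sketch only: assumes samtools is on PATH. bam_counts is a hypothetical helper.

# Print the number of header lines and the number of records in one BAM.
bam_counts() {
    header_lines=$(samtools view -H "$1" | wc -l | tr -d ' ')
    records=$(samtools view -c "$1")
    printf '%s\theader_lines=%s\trecords=%s\n' "$1" "$header_lines" "$records"
}

# Usage:
#   bam_counts /path/to/file.bam
```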

written 6 months ago by Pierre Lindenbaum122k