Question: How to remove bam files that don't contain matching reads
7 weeks ago by
samlambrechts299 (130) wrote:

Dear Biostars,

I have 45 directories, each containing 2500 bam and 2500 bam.bai files. Each bam file represents alignment results from aligning (shotgun metagenomic) sequences to a reference fasta file. Many of the bam files are empty and only contain the header and no matching/aligned reads. Is there a way to remove these bam files that don't contain matching reads?

Cheers,

Sam

samtools bam bowtie2 • 134 views
7 weeks ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum (119k) wrote:

First check which files would be affected:

 find /path/to/dir -type f -name "*.bam" | while IFS= read -r F; do samtools view "$F" | grep -q -v '^@' || echo "$F"; done

This prints every bam file that has no alignment records. Once the list looks right, replace echo with rm.
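
A variation on the same idea, offered only as a sketch under stated assumptions: it counts records with `samtools view -c` and adds `-F 4` so that unmapped reads, if the aligner wrote them into the file, are not counted as matches. The function names `count_mapped_reads` and `list_empty_bams` are hypothetical, and the index is assumed to be named `<file>.bam.bai` as described in the question.

```shell
# Count the mapped reads in one BAM file.
# -c prints only the number of records; -F 4 skips records flagged as unmapped.
count_mapped_reads() {
    samtools view -c -F 4 "$1"
}

# Print every *.bam under the given directory whose mapped-read count is 0.
# Files that samtools cannot read are skipped, not listed.
list_empty_bams() {
    find "$1" -type f -name '*.bam' | while IFS= read -r bam; do
        n=$(count_mapped_reads "$bam") || continue
        if [ "$n" -eq 0 ]; then
            printf '%s\n' "$bam"
        fi
    done
}

# Dry run, to review the list before deleting anything:
#   list_empty_bams /path/to/dir
# Delete each listed file together with its index (assumed name: <file>.bam.bai):
#   list_empty_bams /path/to/dir | while IFS= read -r f; do rm -f -- "$f" "$f.bai"; done
```

Counting is slower than stopping at the first record, because every file is read to the end, but it makes the "no mapped reads" condition explicit. File names containing newlines are not handled.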


Works perfectly! Thank you!

Reply written 6 weeks ago by samlambrechts299 (130)

May I suggest using samtools view -H ${F}, so that only the header is extracted and inspected?

Reply written 7 weeks ago by Antonio R. Franco (4.0k)

No, because there is always a header, so inspecting it cannot distinguish an empty bam file from one that contains reads.

Reply written 7 weeks ago by Pierre Lindenbaum (119k)