Question: How to remove bam files that don't contain matching reads
12 months ago, samlambrechts299 wrote:

Dear Biostars,

I have 45 directories, each containing 2500 bam and 2500 bam.bai files. Each bam file represents alignment results from aligning (shotgun metagenomic) sequences to a reference fasta file. Many of the bam files are empty and only contain the header and no matching/aligned reads. Is there a way to remove these bam files that don't contain matching reads?

Cheers,

Sam

Tags: samtools, bam, bowtie2
12 months ago, Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote:

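First check which files would be affected. This prints every BAM file for which samtools view outputs no records: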

 find /path/to/dir -type f -name "*.bam" | while IFS= read -r F; do samtools view "${F}" | grep -v -E -m1 '^@' > /dev/null || echo "${F}"; done

Once the list looks right, replace echo with rm.
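
A slightly more defensive variant of the same idea, as a sketch rather than a tested pipeline: it counts records with samtools view -c instead of piping through grep, skips files that samtools cannot read, and removes the matching .bam.bai index together with the BAM. The function names bam_is_empty and prune_empty_bams and the "delete" argument are made up for this example. It assumes file names contain no newlines.

```shell
#!/bin/sh
# Sketch: list (or delete) BAM files that contain no records.
# bam_is_empty and prune_empty_bams are hypothetical helper names.

# Succeeds only when the BAM has zero records. `samtools view -c` prints
# the record count; the header is not counted. To ignore unmapped reads
# as well, one could use `samtools view -c -F 4` instead.
bam_is_empty() {
    n=$(samtools view -c "$1") || return 2   # unreadable file: do not treat as empty
    [ "$n" -eq 0 ]
}

# prune_empty_bams DIR [delete]
# Prints every empty BAM under DIR. When the second argument is "delete",
# it also removes the file and its .bam.bai index.
prune_empty_bams() {
    dir=$1
    mode=${2:-list}
    find "$dir" -type f -name '*.bam' |
    while IFS= read -r bam; do
        if bam_is_empty "$bam"; then
            echo "$bam"
            if [ "$mode" = delete ]; then
                rm -f -- "$bam" "$bam.bai"
            fi
        fi
    done
}

# Dry run first, then delete:
#   prune_empty_bams /path/to/dir
#   prune_empty_bams /path/to/dir delete
```

Running it without the second argument gives the same kind of dry-run listing as the echo version above, so the list can be inspected before anything is removed.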


Works perfectly! Thank you!

Reply written 12 months ago by samlambrechts299

May I suggest using samtools view -H ${F}, so that only the header is extracted and inspected?

Reply written 12 months ago by Antonio R. Franco

No, because there is always a header, even in a BAM file with no reads, so inspecting the header cannot tell the empty files apart.

Reply written 12 months ago by Pierre Lindenbaum