Question: Extract Alignment By Read Id From A Sam File
1
6.5 years ago by
Belgium, Brussels
Nicolas Rosewick8.3k wrote:

Hi,

Is there a fast way to extract alignments from a SAM file by read ID (about 100 read IDs on average)? If the read IDs are in a file (one per line), I could do:

cat in.sam | grep -f idFile.txt > out.sam

but with a big SAM file (~40 GB) this takes a long time. Is there a faster method to extract these alignments?

Thanks,

N.

read id sam • 7.9k views
modified 6.5 years ago by Pierre Lindenbaum123k • written 6.5 years ago by Nicolas Rosewick8.3k
2

Well, not really a duplicate: that question was about BAM, not SAM.

written 6.5 years ago by Pierre Lindenbaum123k

duplicate of

Extracting subsets of reads from a BAM file

written 6.5 years ago by Pierre Lindenbaum123k
9
6.5 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum123k wrote:

faster ?

 LC_ALL=C grep -w -F -f idFile.txt  < in.sam > subset.sam
written 6.5 years ago by Pierre Lindenbaum123k
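
A note on the grep approach: -F treats each ID as a fixed string and -w requires a whole-word match, but the match can still occur anywhere on the line, and punctuation such as ":" or "/" counts as a word boundary. If that is a concern, a stricter variant is to compare only the first column (QNAME). Below is a minimal sketch using awk, assuming the ID file has one ID per line with no trailing whitespace; the function name extract_by_qname is made up for illustration and is not from this thread.

```shell
# Hypothetical helper: print alignments whose QNAME (column 1) is listed
# in the ID file (one ID per line).
# Usage: extract_by_qname idFile.txt in.sam > subset.sam
extract_by_qname() {
    ids=$1
    sam=$2
    # While reading the first file (NR == FNR), store each ID as an array key.
    # While reading the second file, print lines whose first tab-separated
    # field is one of the stored IDs. Header lines are not printed, because
    # their first field (@HD, @SQ, ...) is never a stored read ID.
    awk -F '\t' 'NR == FNR { wanted[$1]; next } ($1 in wanted)' "$ids" "$sam"
}
```

With about 100 IDs the lookup table is tiny, so the run time is dominated by reading the SAM file once.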
1

+1 for C locale.

written 6.5 years ago by Aaronquinlan11k

amazingly simple!!!! thanks so much

written 5.8 years ago by rob234king570

Thanks so much for this! Any suggestion on how to also keep the SAM header in the output subset.sam file?

written 15 months ago by rasha0

Capture the header (the lines starting with @, i.e. matching ^@) and add it to the new subset.sam file.

written 15 months ago by genomax72k
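
The header suggestion can be sketched as follows: first copy the header lines, then append the alignments selected with the same grep options as in the answer. This is only one possible way to do it, assuming the header sits at the top of the SAM file as usual; the function name subset_sam_with_header is hypothetical.

```shell
# Hypothetical helper: write the SAM header followed by the alignments
# whose read ID appears in the ID file (one ID per line).
subset_sam_with_header() {
    ids=$1
    sam=$2
    # Print the leading header lines (starting with '@') and stop at the
    # first alignment line, so the large file is not scanned twice in full.
    awk '/^@/ { print; next } { exit }' "$sam"
    # Drop header lines, then keep alignments matching an ID as a fixed,
    # whole-word string (same options as in the answer).
    LC_ALL=C grep -v '^@' "$sam" | LC_ALL=C grep -w -F -f "$ids"
}
```

Usage would look like: subset_sam_with_header idFile.txt in.sam > subset.sam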