Question: Extract Alignment By Read Id From A Sam File
6.0 years ago by
Belgium, Brussels
Nicolas Rosewick (7.4k) wrote:

Hi,

Is there a fast way to extract alignments from a SAM file using read IDs (about 100 read IDs on average)? If the read IDs are in a file (one per line), I could do:

cat in.sam | grep -f idFile.txt > out.sam

but with a big SAM file (~40 GB) it takes a lot of time. Is there a faster method to extract these alignments?

Thanks,

N.

modified 6.0 years ago by Pierre Lindenbaum (118k) • written 6.0 years ago by Nicolas Rosewick (7.4k)

Well, not really a duplicate. That question was about BAM, not SAM.

written 6.0 years ago by Pierre Lindenbaum (118k)

Duplicate of: Extracting subsets of reads from a BAM file

written 6.0 years ago by Pierre Lindenbaum (118k)
6.0 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum (118k) wrote:

Faster? Match the IDs as fixed strings (-F) and whole words (-w), under the C locale:

 LC_ALL=C grep -w -F -f idFile.txt  < in.sam > subset.sam
written 6.0 years ago by Pierre Lindenbaum (118k)
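
One caveat with the grep above: -w -F matches an ID anywhere on the line, not only in the read-name column, so an ID that also appears in another field (an optional tag, for instance) would pull in an unrelated alignment. If that matters, a possible alternative (not from the answer, just a sketch) is to compare only the first tab-separated field with awk. The demo files and the output name subset_exact.sam are made up for illustration, and I have not benchmarked this against grep on a 40 GB file.

```shell
# Hypothetical demo data so the sketch runs on its own; replace with your real files.
printf '@HD\tVN:1.6\tSO:unsorted\n' > in.sam
printf 'read1\t0\tchr1\t10\t60\t4M\t*\t0\t0\tACGT\tIIII\n' >> in.sam
# read2 carries a made-up tag that mentions read1; grep -w -F would match this line too.
printf 'read2\t0\tchr1\t20\t60\t4M\t*\t0\t0\tACGT\tIIII\tXX:Z:read1\n' >> in.sam
printf 'read3\t16\tchr1\t30\t60\t4M\t*\t0\t0\tACGT\tIIII\n' >> in.sam
printf 'read1\nread3\n' > idFile.txt

# While reading the first file (NR == FNR), store each ID as an array key.
# While reading the second file, print lines whose first field (QNAME) is a stored ID.
awk -F '\t' 'NR == FNR { ids[$1]; next } $1 in ids' idFile.txt in.sam > subset_exact.sam
```

Header lines are dropped here as well, since their first field starts with @ and is never in the ID list.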

+1 for C locale.

written 6.0 years ago by Aaronquinlan (10k)

Amazingly simple! Thanks so much.

written 5.2 years ago by rob234king (570)

Thanks so much for this! Any suggestion on how to also keep the SAM header in the output subset.sam file?

written 8 months ago by rasha (0)

Capture the header (the lines starting with @, i.e. matching the pattern ^@) and write it to the new subset.sam file ahead of the extracted alignments.

written 8 months ago by genomax (64k)
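
For anyone landing here later, the header suggestion could look roughly like the sketch below. It assumes the header lines all sit at the top of the file, so awk can stop at the first alignment instead of scanning the whole SAM. The demo files are invented so the snippet runs on its own; only the last command is the part you would reuse.

```shell
# Hypothetical demo data so the sketch runs on its own; replace with your real files.
printf '@HD\tVN:1.6\tSO:unsorted\n@SQ\tSN:chr1\tLN:1000\n' > in.sam
printf 'read1\t0\tchr1\t10\t60\t4M\t*\t0\t0\tACGT\tIIII\n' >> in.sam
printf 'read2\t0\tchr1\t20\t60\t4M\t*\t0\t0\tACGT\tIIII\n' >> in.sam
printf 'read3\t16\tchr1\t30\t60\t4M\t*\t0\t0\tACGT\tIIII\n' >> in.sam
printf 'read1\nread3\n' > idFile.txt

{
  # Print header lines and quit at the first line that does not start with '@'.
  awk '!/^@/ { exit } { print }' in.sam
  # Then append the alignments whose line contains one of the IDs as a whole word.
  LC_ALL=C grep -w -F -f idFile.txt in.sam
} > subset.sam
```

If a read ID happened to appear as a word inside a header line (in a @PG command line, say), that header line would be written twice; filtering the second step through grep -v '^@' first would avoid that at the cost of an extra pipe.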