I'm not sure I fully understand what you're doing, but maybe this would work?
First, convert the BAM file to BED with the bam2bed
utility:
$ bam2bed < reads.bam > reads.bed
This puts the sequence information into the twelfth column of the BED file, e.g.:
chr1 0 36 B7_591:4:96:693:509 73 + 99 36M * 0 0 CACTAGTGGCTCATTGTAAATGTGTGGTTTAACTCG <<<<<<<<<<<<<<<;<<<<<<<<<5<<<<<;:<;7 MF:i:18 Aq:i:73 NM:i:0 UQ:i:0 H0:i:1 H1:i:0
chr1 2 37 EAS54_65:7:152:368:113 73 + 99 35M * 0 0 CTAGTGGCTCATTGTAAATGTGTGGTTTAACTCGT <<<<<<<<<<0<<<<655<<7<<<:9<<3/:<6): MF:i:18 Aq:i:66 NM:i:0 UQ:i:0 H0:i:1 H1:i:0
chr1 4 39 EAS51_64:8:5:734:57 137 + 99 35M * 0 0 AGTGGCTCATTGTAAATGTGTGGTTTAACTCGTCC <<<<<<<<<<<7;71<<;<;;<7;<<3;);3*8/5 MF:i:18 Aq:i:66 NM:i:0 UQ:i:0 H0:i:1 H1:i:0
chr1 5 41 B7_591:1:289:587:906 137 + 63 36M * 0 0 GTGGCTCATTGTAATTTTTTGTTTTAACTCTTCTCT (-&----,----)-)-),'--)---',+-,),''*, MF:i:130 Aq:i:63 NM:i:5 UQ:i:38 H0:i:0 H1:i:0
...
You could grep
on this with your "needle" string:
$ bam2bed < reads.bam \
| grep -i GCTTCT - \
> filteredReads.bed
You could use awk
to reorder column fields, e.g.:
$ bam2bed < reads.bam \
| grep -i GCTTCT - \
| awk '{print $1"\t"$2"\t"$3"\t"$6}' \
> filteredReads.txt
This last file is no longer a true BED file at this point (as you're rearranging columns out of UCSC order) but that's up to you and your workflow.
The resulting files filteredReads.bed
or .txt
contain reads with tags that contain the sequence GCTTCT
somewhere in them. The BED coordinates are 0-indexed.
If this is not what you're after, please clarify your question.
Cross-post via: http://seqanswers.com/forums/showthread.php?p=109361
Hi Alex, thank you very much for your help! I just want to extract the read count from the BAM file for the position where has the tag GCTTCT or some other tags.
Thanks Alex for your nice tools bedops. could you please help to see another error when running "bam2bed < reads.bam > reads.bed":
* glibc detected sort-bed: corrupted double-linked list: 0x0000000011982df0 **
Many thanks!
It would be easier to troubleshoot if you post a message on our forum or mailing list: http://groups.google.com/group/bedops-discuss -- please try to include a description of your platform and what installer you used, as well as other relevant details. Thanks!
Sorry, it seems that I can't post there.
the platform is ubuntu 12.04 on a x86_64 machine;
for the installation, i use the "bedops_linux_x86_64-v2.2.0.tar.bz2". I also have tried the following, but it doesn't work.
svn checkout http://bedops.googlecode.com/svn/trunk/ bedops-read-only
cd bedops-read-only
make
make install
Thanks!