Hello all,
I have a sam file that I have mapped using Bowtie2, using the following parameters:
bowtie2 -q -p 8 -N 1 -X 1000
I want to get the uniquely mapped reads, as well as the best alignment in the case where there are multiple alignments.
To get the uniquely mapped reads I have done this, and it has worked well:
grep -E "@|NM:" input.sam | grep -v "XS:" > unique.sam
However, for the best alignment in the case of multimappers, when I want anything where AS > XS I have had no luck.
For example, of the four reads below, I would want to keep all reads except for the 3rd, in which the XS > AS.
A00129:1205:HLKCMDSX3:1:1101:3341:1235 99 chr16 74463701 42 150M = 74463850 299 TGTTTCCTCTCATTTTTAACCATTTATTTTGGTCCATTTTGGCTGGAAAAATTGACCAGTCCAGGTCAATTTGTGATGCTGAAATACTTTGCATGTAGGGCATTTCTGGAGGATGGTTGATCTTCTAGGGGTCATAAGTCAGATACTGGG FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFF,FFFFFFFFF:FFF:FFFFFF:FFF:FFFFFFFFFFFFFFFFFFFFFF AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:150 YS:i:-3 YT:Z:CP
A00129:1205:HLKCMDSX3:1:1101:3341:1235 147 chr16 74463850 42 150M = 74463701 -299 GAAAAAACAATACAAAACAAACAAAAACAAAACACAACAAGCAAAAACTGTTTTCCCTTTTCCCAGCCAGTATAAAAGCCAGTTAAGAACTTGACTAAGGGTAGAACTTTAGACCAGTTTTGAACTTTTTTTTTATTAGATATTTTCTTT FFFF:FFFF:FFF,F:FFFFFFFFF:FFFF:FFF,,FF:F:FFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF AS:i:-3 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:34A115 YS:i:0 YT:Z:CP
A00129:1205:HLKCMDSX3:1:1101:4734:1235 99 chr2 98497124 2 150M = 98497395 421 TGGCAAGAAAACTGAAAATCATGGAAAATGAGAAACATCCACTTGACGACTTGAAAAATGACGAAATCACTAAAAAACCTAAAAAATGAGAAATGCACACTGAAGGACCTGGAATATGGCGAGAAAACTGAAAATCACGGAAAATGAGAA FFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFF,FFFFFFFFFFFFFFFFFFFFFFFFF AS:i:-30 XS:i:-36 XN:i:0 XM:i:6 XO:i:0 XG:i:0 NM:i:6 MD:Z:1T37T38G1G7T37C23 YS:i:-34 YT:Z:CP
A00129:1205:HLKCMDSX3:1:1101:19578:1078 83 chr2 98497405 6 150M = 98496949 -606 AGACACATCCACTTGACGACTTGAAAAATGACGAAATCACTAAAAAACGTGAAAAATGAGAAATGCACACTGAAGGACCTGGAATATGGCGAGAAAACTGAAAATCACGGAAAATGAGAAATACACACTTTAGGACGTTAAATATGGCNA FFFFF:FFFFFFFFFF,FF:FFFFFFFFFFF,FF::FFFFFFFF,F:FFFFFFFF,FFFFFF::FFFFFFFFFFFFFFF:FFFF,FFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFF:FFFF:FFFFFFFFFF:FFFFF:FFFFFF#F AS:i:-16 XS:i:-21 XN:i:0 XM:i:4 XO:i:0 XG:i:0 NM:i:4 MD:Z:3A13T120G9G1 YS:i:-36 YT:Z:CP
I'm at a loss for how to do this in bash, so if someone would be happy to point me in the right direction, it would be really appreciated!
Many thanks.
Wonderful Pierre, it worked perfectly to get the best alignment! I merged this with the uniquely mapping reads to get the file I needed. Thank you very much.