Question: Mapping Inverse Reads On The Same Strand
0
gravatar for Vikas Bansal
7.0 years ago by
Vikas Bansal2.3k
Berlin, Germany
Vikas Bansal2.3k wrote:

Hello everyone,

I think, I am missing something about how mapping tools work. I am mapping reads using BWA.

My genome file (reference file in fasta) -

>chr1
AACCCCCCCCTCCCCCCGCTTCTGGCCACAGCACTTAAACACATCTCTGC
CAAACCCCAAAAACAAAGAACCCTAACACCAGCCTAACCAGATTTCAAAT
TTTATCTTTAGGCGGTATGCACTTTTAACAAAAAANNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
GCCCATCCTACCCAGCACACACACACCGCTGCTAACCCCATACCCCGAAC
CAACCAAACCCCAAAGACACCCCCCACAGTTTATGTAGCTTACCTCNNNN

>chrMm
GATCACAGGTCTATCACCCTATTAACCACTCACGGGAGCTCTCCATGCAT
TTGGTATTTTCGTCTGGGGGGTGTGCACGCGATAGCATTGCGAGACGCTG
GAGCCGGAGCACCCTATGTCGCAGTATCTGTCTTTGATTCCTGCCTCATT
CTATTATTTATCGCACCTACGTTCAATATTACAGGCGAACATACCTACTA
AAGTGTGTTAATTAATTAATGCTTGTAGGACATAATAATAACAATTGAAT
GTCTGCACAGCCGCTTTCCACACAGACATCATAACAAAANAATTTCCACC
My reads are in fasta format -
>Read#BetweenN's
TTTTAACAAAAAAGCCCATCCTACCCAGC
>VIK
CAACCAAACCCCAAAGACACCCCCCACAG
>VIK#Inverse
GACACCCCCCACAGAAACCCCAAACCAAC
>VIK#COMPLIMENTARY
GTTGGTTTGGGGTTTCTGTGGGGGGTGTC
>VIK#Inverse_of_COMPLIMENTARY
CTGTGGGGGGTGTCTTTGGGGTTTGGTTG

Commands for BWA -

bwa index -a is genome
bwa aln  genome reads > out.sai
bwa samse genome out.sai reads > out.sam

The output from BWA is (sam file) -

@SQ                           SN:chr1   LN:300
@SQ                           SN:chrMm  LN:300
Read#BetweenN's               4         *       0    0   *    *  0  0  TTTTAACAAAAAAGCCCATCCTACCCAGC  *
VIK                           0         chr1    251  37  29M  *  0  0  CAACCAAACCCCAAAGACACCCCCCACAG  *  XT:A:U  NM:i:0  X0:i:1  X1:i:0  XM:i:0  XO:i:0  XG:i:0  MD:Z:29
VIK#Inverse                   4         *       0    0   *    *  0  0  GACACCCCCCACAGAAACCCCAAACCAAC  *
VIK#COMPLIMENTARY             4         *       0    0   *    *  0  0  GTTGGTTTGGGGTTTCTGTGGGGGGTGTC  *
VIK#Inverse_of_COMPLIMENTARY  16        chr1    251  37  29M  *  0  0  CAACCAAACCCCAAAGACACCCCCCACAG  *  XT:A:U  NM:i:0  X0:i:1  X1:i:0  XM:i:0  XO:i:0  XG:i:0  MD:Z:29

So my first read did not mapped because half of it lies on left side of N's and half on right side of N's on chr1 (so no problem with this read). Second read mapped to forward strand (good).Fifth read (inverse of complimentary of second read) mapped to same position but on reverse strand (very good).

Question- Third read(VIK#Inverse) is inverse of second read. So if you will read the reference genome from right hand side at last line of chr1, it is the same sequence but it did not get mapped. So, is there any option with which I can map my read like this? If not, then why it is not a good idea to map reads like this? (same goes for the fourth read but according to reverse strand)

ADD COMMENTlink modified 7.0 years ago by lh331k • written 7.0 years ago by Vikas Bansal2.3k
1

You ask why it's not a good idea to map such reads -- it's because they're just not the same "thing". Can I put the shoe on the other foot and ask why you think it is a good idea for the VIK#Inverse read to be mapped here? What would it mean to consider, say, "GATACA" to be the same sequence as "ACATAG"?

ADD REPLYlink written 7.0 years ago by Steve Lianoglou5.0k

I am not saying to map these reads always, I asked if there is any option with which I can map reads like this. In some cases it will be useful - Consider I am looking for SNP's (not looking for inversion) and there is inversion and SNP (within the read) at this particular position, at least if I will map this read at that position, I can find SNP rather than just throwing it away as unmapped read.

ADD REPLYlink written 7.0 years ago by Vikas Bansal2.3k
0
gravatar for lh3
7.0 years ago by
lh331k
United States
lh331k wrote:

How to present read sequences is mandated by the SAM spec. There is no way to output otherwise. As to why, because for most applications, working with sequences on the same strand as the reference is easier. Surely having reads on their original strands is sometimes more convenient, but this happens less frequently. When SAM was in design, a colleague of mine argued reads should be put on their original strand, but he later told me that the SAM way is convenient for most of his works.

ADD COMMENTlink written 7.0 years ago by lh331k

Thanks for your reply. But I was asking about the third and fourth reads which were not mapped. I will edit my question little bit.

ADD REPLYlink written 7.0 years ago by Vikas Bansal2.3k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 854 users visited in the last hour