Hello!
I'm attempting to map strand-specific paired-end reads to the Arabidopsis genome (TAIR10) using TopHat v2.0.12. When I upload these data to JBrowse there are reads mapping to both the plus strand and minus strand, so it looks like I'm losing strand specific information.
I've tried solving the problem, and it looks like the issue occurs during the actual tophat mapping step. The command that I use is:
tophat --library-type fr-secondstrand --read-mismatches 2 --read-edit-dist 2 --max-multihits 10 --b2-very-sensitive --transcriptome-max-hits 10 --no-coverage-search --output-dir topOut_wholeRoot_polyA_rep1_trimmed -G TAIR10_GFF3_genes.gtf -p 2 bowtie2_TAIR10/TAIR10 wholeRoot_polyA_rep1_trimmed_1.fastq wholeRoot_polyA_rep1_trimmed_2.fastq
I ran bamToBed to take the output bam file and convert it to a bed file so I could see what was going on, and it looks like the read pairs are mapping to opposite strands. For example the bam files reads:
HWI-D00550:272:C75M0ANXX:6:1101:10732:9653 99 Chr1 34896 50 125M = 34898 127 AGCGCATACAGGAAGAAGTCCATGAGAAGCCCACCAAGCAGTTGCAGCAGCTACTGTAGCAGCGGCAATGGCAGTTATACTTGGAGGAGAAGAGCTCATTGGGGTTGATGAATCACCAGAATTCC BCCCCGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGGGGEEEGGGGFGGGGGGGGGGGGGGGGGCGGGGGGGGCFEGCGGGGGGGGGGGEGGEGDGGGGGGGGCAGFCGGGGGGGGGGGGGGGG AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:125 YT:Z:UU XS:A:- NH:i:1 HWI-D00550:272:C75M0ANXX:6:1101:10732:9653 147 Chr1 34898 50 125M = 34896 -127 CGCATACAGGAAGAAGTCCATGAGAAGCCCACCAAGCAGTTGCAGCAGCTACTGTAGCAGCGGCAATGGCAGTTATACTTGGAGGAGAAGAGCTCATTGGGGTTGATGAATCACCAGAATTCCCG BGGGGGEGGFGGGGGGGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGGGGGGBCCCB AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:125 YT:Z:UU XS:A:- NH:i:1
While the bamToBed output reads:
Chr1 34895 35020 HWI-D00550:272:C75M0ANXX:6:1101:10732:9653/1 50 + Chr1 34897 35022 HWI-D00550:272:C75M0ANXX:6:1101:10732:9653/2 50 -
My understanding is that for all of my downstream analyses these paired reads need to be on the same strand. Is there a problem with my TopHat command? Does anyone have any recommendations to correct this post-mapping?
Thank you for the help!