Question: In RNA-seq data, how to find whether the mRNA is on the forward strand or reverse strand?
1
gravatar for yanweng
3.3 years ago by
yanweng30
yanweng30 wrote:
  1. For strand-specific RNA-seq (by UNG treatment), whether read 1 is reverse complementary to mRNA, and read 2 is the same to mRNA?

  2. For strand-specifc RNA-seq data, what is the best way to separate reads, based on which strand the mRNA is from? In another word, how to group all reads that map to genes on plus strand?

I have made a gtf file containing only genes on plus strand (genes_on_plus_chr.gtf), and used "bedtools intersect -a Aligned.sortedByCoord.out.bam -b genes_on_plus_chr.gtf" trying to pool out all shared region. But I found this is not good enough, because I can still find some reads with flag number 147 and 99 (ideally would be only contain 163 and 83). Since read 1 should be reverse complementary to mRNA while read2 is the same to mRNA for stranded RNA-seq, can I pool out all the reads with flag number either 163 or 83, and these should correspond to mRNA from gene on plus strand?

  1. How I can do the separation on unstranded RNA-seq data, since I don't know the relationship between read1/read2 to the mRNA?
rna-seq alignment next-gen genome • 1.7k views
ADD COMMENTlink modified 3.3 years ago by Devon Ryan91k • written 3.3 years ago by yanweng30
1

Right, so what is your actual biological goal? I imagine you're trying to get counts or something like that. In this case use featureCounts or htseq-count and call it done, there is rarely a reason to use anything else.

For unstranded data you can't ever discern the strand of the original fragment.

ADD REPLYlink written 3.3 years ago by Devon Ryan91k

I am not interested in differential expression level. My goal is to identify post-transcriptional RNA modification (A to I). To achieve that, I need to first separate the reads based on where them mapped to. For reads mapped to gene on plus strand, I will look for A to G; for gene on reverse strand, I will look for T to C

ADD REPLYlink written 3.3 years ago by yanweng30
2
gravatar for Devon Ryan
3.3 years ago by
Devon Ryan91k
Freiburg, Germany
Devon Ryan91k wrote:

Ah, you want to look at RNA editing. Details like that tend to be useful when asking for help. Here is a short bit of (untested) python using pysam demonstrating how to do this.

#!/usr/bin/env python
import pysam
infile = pysam.AlignmentFile("input.bam", "r")
forward = pysam.AlignmentFile("forward.bam", "w", template=infile)
reverse = pysam.AlignmentFile("reverse.bam", "w", template=infile)

for read in infile.fetch():
    if read.is_paired:
        if read.is_read1 and read.is_reverse:
            forward.write(read)
        elif read.is_read1:
            reverse.write(read)
        elif read.is_read2 and read.is_reverse:
            reverse.write(read)
        else:
            forward.write(read)
    else:
        if read.is_reverse:
            forward.write(read)
        else:
            reverse.write(read)
forward.close()
reverse.close()
infile.close()

Afterward, do some variant calling on each and filter accordingly.

ADD COMMENTlink modified 3.3 years ago • written 3.3 years ago by Devon Ryan91k

Do the reads need to be flipped prior to variant calling? I'd assume any variant caller should identify reads mapping to the negative strand from a BAM file and account for that in the SNP calls.

ADD REPLYlink modified 3.3 years ago • written 3.3 years ago by Vivek2.3k
1

The reads don't need to be flipped, but variant callers are otherwise not strand aware.

ADD REPLYlink written 3.3 years ago by Devon Ryan91k

Thanks Devon! I tried the script and it worked. The only thing is that the new bam files don't have header, which will cause problem for further process (like samtools view). I tried "samtools reheader forward.bam another.sam", but it gives me messy error.. Any suggestions?

ADD REPLYlink written 3.3 years ago by yanweng30

Ah, I bet the output was SAM rather than BAM. Try changing the mode from w to wb.

ADD REPLYlink written 3.3 years ago by Devon Ryan91k

Thanks Devon! I tried the script and it worked. The only thing is that the new bam files don't have header, which will cause problem for further process (like samtools view). I tried "samtools reheader forward.bam another.sam", but it gives me messy error.. Any suggestions?

ADD REPLYlink modified 3.3 years ago • written 3.3 years ago by yanweng30
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 790 users visited in the last hour