Question: Stranded RNA-seq data visualization in UCSC
1
gravatar for marina.v.yurieva
6 months ago by
Farmington, CT
marina.v.yurieva470 wrote:

I have stranded RNA-seq data which doesn't look stranded when I visualize bedGraphs in Genome Browser:

enter image description here

I tried different aligners (star, bowtie, bwa), different ways to make bedGraph files, nothing works. But when I run salmon on these samples, it detects ISR (paired-end stranded library). I'm very confused why it doesn't look stranded in the browser but I know is stranded. Any help will be really appreciated! I've been trying to solve it on my own for a while...

ADD COMMENTlink modified 6 months ago by Rob3.1k • written 6 months ago by marina.v.yurieva470

I tried different aligners (star, bowtie, bwa), different ways to make bedGraph files, nothing works.

This is quiet uninformative. If "nothing works", then please provide the codes you used. How did you create the bedGraphs (exact command please)?

ADD REPLYlink written 6 months ago by ATpoint13k

You should check a known housekeeping gene like ACTB or GAPDH before you check some obscure transcript that may not be real.

ADD REPLYlink written 6 months ago by igor7.3k

Yeah, I looked at everything, just gave the first example. Here is Gapdh:

enter image description here

ADD REPLYlink written 6 months ago by marina.v.yurieva470

And how are you generating these bedGraphs?

ADD REPLYlink written 6 months ago by igor7.3k
3
gravatar for Rob
6 months ago by
Rob3.1k
United States
Rob3.1k wrote:

Hi,

The reason for this is that whether or not to consider a fragment for quantification in salmon is considered after the mapping is written. That is, the reads are mapped to the transcriptome, without regard for the inferred or provided library type, and then these mappings are written to file if requested by the user. However, when salmon considers the probability of these mappings arising from different transcripts, it will discard those that are not compatible with the library type. So, my guess is that you're seeing in this output multi-mapping reads where the sequence matches, and hence there is a possible allocation of this read to the transcript, but for which these mappings are later discarded during quantification. You could, for example, filter the mapping file based on the tags of the alignment records, to only keep those reads that align according to the ISR library type.

ADD COMMENTlink written 6 months ago by Rob3.1k

Thank you! So should I remove the multi-mappers or filter on more specific tags?

ADD REPLYlink written 6 months ago by marina.v.yurieva470
2

Well, you should definitely filter on more specific tags to get rid of the alignments that don't map according to the library type, since salmon will discard them for quantification purposes. It's not easy to discard the multimappers. The fact that a multi-mapping reads maps to a particular locus doesn't mean that it was assigned there for the purposes of quantification. To determine that, you'd have to weight each multi-mapping read according to the abundances estimated by salmon. However, simply getting rid of the reads that align different from the library type should remove some of the ambiguity about how the pile-ups should look.

ADD REPLYlink written 6 months ago by Rob3.1k
1

It's working! Thank you SO much!!!

I used bamCoverage from deeptools:

bamCoverage -b sample.bam -of bedgraph --filterRNAstrand  forward --samFlagExclude 16 --effectiveGenomeSize 2913022398 --normalizeUsing RPKM -o sample_plus.bedGraph

I'm not sure if it's enough to just use --filterRNAstrand so I used it together with --samFlagExclude but it's definitely working!

ADD REPLYlink written 6 months ago by marina.v.yurieva470
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 2225 users visited in the last hour