Question

Salmon / IGV

0

5.8 years ago

pablo ▴ 300

Hi guys,

I have a question about my Salmon results.

I have 6 samples (3 conditions with 2 replicates each) and therefore 6 quant.sf files, one per sample.

I want to compare my 3 conditions: A vs B, A vs C, B vs C.

But for that, I need to know more details about splice junctions and transposable elements (TEs).

I tried to import my (pseudo)bam files into IGV to try to understand how mapped reads deal with the reference transcriptome but I couldn't do it. As for TEs, I don't know how I could get information about them. I have read papers but I'm still confused.

Any suggestion?

Best, Vincent

salmon igv • 2.6k views

updated 5.8 years ago by h.mon 35k • written 5.8 years ago by pablo ▴ 300

2

If you are mapping with Salmon you are mapping to the transcriptome, and you are already taking into account splice junctions. If you want more details about splice junctions, you have to map against the genome - two good programs for this are STAR and HISAT2.

About TEs, see this pipeline: https://github.com/hyunhwaj/SalmonTE

"I tried to import my (pseudo)bam files into IGV to try to understand how mapped reads deal with the reference transcriptome but I couldn't do it."

You have to explain in more detail what you did and report error messages, if any, otherwise it will be difficult to help you.

5.8 years ago by h.mon 35k

1

Thanks for answering,

What I mean is, for example: one gene can have many isoform transcripts (A, B, C, ...). So how can a read be specific to isoform A and not to isoform B? I can't understand how Salmon tells them apart.

Thanks for SalmonTE, I'm going to look at that.

About IGV: it requires a reference genome and BAM files from the alignment. In my case, I have a reference transcriptome and BAM files from Salmon (in fact, I read that they are not real BAM files but pseudo-BAM files).

So I imported my reference transcriptome into IGV: it seems to recognize the reference, because there are no error messages and I can see the nucleotide sequence of the transcriptome. But when I import one of my (pseudo)BAM files from Salmon, IGV doesn't match the BAM file to the transcriptome. There's no error message, but the software struggles a lot.

I guess IGV does that because it is not suited to a transcriptome, only to a genome.

5.8 years ago by pablo ▴ 300

score 0 · Answer 1 · 2018-08-02

Isoforms of a gene usually have some unique regions and some regions shared with other isoforms. Based on how many reads map to the unique regions of each isoform, Salmon uses an expectation-maximization algorithm to optimally apportion the reads from shared regions between isoforms. This question has been addressed before, e.g. see Rob's answer to "Big differences between mappings computed by Salmon and quantification".
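To make the idea concrete, here is a toy sketch of expectation-maximization apportioning shared reads between two isoforms. It is not Salmon's actual implementation: it ignores effective transcript lengths, fragment-level probabilities and bias models, and the function name `em_apportion` and the read counts are made up for illustration.

```python
def em_apportion(eq_classes, n_iter=200):
    """Toy EM: split reads between the transcripts they are compatible with.

    eq_classes maps a tuple of transcript names (the set of transcripts a
    group of reads is compatible with) to the number of reads in that group.
    Simplifying assumption: all transcripts have the same effective length.
    """
    transcripts = sorted({t for cls in eq_classes for t in cls})
    total = sum(eq_classes.values())
    # Start from equal relative abundances.
    theta = {t: 1.0 / len(transcripts) for t in transcripts}
    assigned = {t: 0.0 for t in transcripts}
    for _ in range(n_iter):
        # E-step: share each group's reads in proportion to current abundances.
        assigned = {t: 0.0 for t in transcripts}
        for cls, count in eq_classes.items():
            denom = sum(theta[t] for t in cls)
            for t in cls:
                assigned[t] += count * theta[t] / denom
        # M-step: re-estimate abundances from the apportioned counts.
        theta = {t: assigned[t] / total for t in transcripts}
    return assigned


# Hypothetical data: 30 reads unique to isoform A, 10 unique to B, 60 shared.
toy_counts = em_apportion({("A",): 30, ("B",): 10, ("A", "B"): 60})
print({t: round(c, 2) for t, c in toy_counts.items()})
```

With these made-up numbers the unique reads favour A over B by 3 to 1, so the 60 shared reads end up split in roughly the same ratio, giving about 75 reads to A and 25 to B.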

Did you position-sort and index the BAM file? Did you select one particular transcript for viewing? For BAM visualization, IGV will only show mapped reads after zooming in to small regions. Another problem may be that your transcriptome has hundreds of thousands of transcripts, and this may overload IGV. Did you check IGV memory usage after loading the BAM?
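In case it helps, position-sorting and indexing could be scripted roughly like this. It is only a sketch: it assumes `samtools` is installed and on your PATH, and the helper names and any file names you pass in are placeholders, not something Salmon produces under those names.

```python
import shutil
import subprocess


def sort_and_index_commands(mappings, sorted_bam):
    """Build the two samtools commands as argument lists (nothing is run here)."""
    return [
        # Coordinate-sort the mappings and write them as BAM.
        ["samtools", "sort", "-o", str(sorted_bam), str(mappings)],
        # Create the .bai index that IGV needs next to the sorted BAM.
        ["samtools", "index", str(sorted_bam)],
    ]


def sort_and_index(mappings, sorted_bam):
    """Run the commands, stopping at the first failure."""
    if shutil.which("samtools") is None:
        raise RuntimeError("samtools not found on PATH")
    for cmd in sort_and_index_commands(mappings, sorted_bam):
        subprocess.run(cmd, check=True)
```

You would call, for example, `sort_and_index("sample1_mappings.sam", "sample1.sorted.bam")` (hypothetical file names), then load the transcriptome FASTA as the "genome" in IGV, open the sorted BAM, and type a single transcript name in the locus box.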