Calculate Percent Spliced In (PSI)
1
4
Entering edit mode
8.3 years ago
Vikas Bansal ★ 2.4k

Dear all,

I was wondering if there is any tool to calculate PSI from the BAM files. Basically what I want is-

I have gtf file with all the exons and I want to calculate for each exon (lets take exon filled with grey for example) how many reads mapped to junction of that exon (a and b in figure) and how many reads skip that exon (c in figure). So if we have, let's say, 10k exons in the gtf, the output should contain 10k rows with a,b and c as columns.

I used MATS and bam2ssj but did not get what I was expecting. I have gtf file with approx. 300k exons but MATS gives the output only for 3k exons. bam2ssj is calculating the values for introns rather than exons.

Any suggestions?

RNA-Seq Splicing • 18k views
0
Entering edit mode
0
Entering edit mode

Two general kinds of analyses are possible:

1. Estimate expression level of exons ("exon-centric" analysis) or,
2. Estimate expression level of whole transcripts ("isoform-centric" analysis)

I think it include the reads mapped on exons but I am interested only in junction reads.

0
Entering edit mode

in the documentation there is also a paragraph about psi estimation.

the definition given in their paper is: 'percentage spliced in' (PSI or Ψ)11 denotes the fraction of mRNAs that represent the inclusion isoform. Reads aligning to the alternative exon or to its junctions with adjacent constitutive exons provide support for the inclusion isoform, whereas reads aligning to the junction between the adjacent constitutive exons support the exclusion isoform; the relative read density of these two sets forms the standard estimate of Ψ.

the paper is here

0
Entering edit mode

Hi Vikas,

we have just written a small protocol to do this:

http://onlinelibrary.wiley.com/doi/10.1002/0471142905.hg1116s87/abstract

6
Entering edit mode
8.3 years ago

I just tried to write something for this: https://github.com/lindenb/jvarkit/wiki/Biostar103303

I don't have any data to test though.

0
Entering edit mode

Thanks a lot for the reply. Its amazing that you included it in jvarkit so quickly. I am trying to install jvarkit but it is not working. Below are my commands -

1
Entering edit mode

use jdk7 please, not 8 (java 8 is too "new" for the libraries included in htsjdk)

0
Entering edit mode

Thanks. Now build is fine. But when I do ant biostar103303 or ant Biostar103303 or ant Biostar103303.java, it says

Buildfile: /project/specht/tools/jvarkit/htsjdk/build.xml

BUILD FAILED
Target "biostar103303" does not exist in the project "htsjdk".

Total time: 0 seconds

0
Entering edit mode

you forgot to cd ... Please run ant biostar103303 in varkit not in varkit/htsjdk

0
Entering edit mode

Finally, I got the output file. But the columns - "exon.count_prev_and_next", "exon.count_prev_and_curr" "exon.count_curr_and_next" are all 0. I think it is not able to calculate junction reads.

0
Entering edit mode

do you have a bam (a subset) and a link to gtf so I can test my program ?

0
Entering edit mode

Here is the link for the BAM (subset) file - https://www.dropbox.com/s/k2ibkw81t4hexq8/onlymyh7.bam

Please let me know if you need any other file.

0
Entering edit mode

thanks, I'll play with it next week.

0
Entering edit mode

I've updated my sources.It seems to work now. Re-download everything to use git to update the sources. Please use https://github.com/lindenb/jvarkit/issues?state=open to report others bug.

0
Entering edit mode

Thanks. I think I found a bug and I will post it on github. Just one thing- Did you publish jvarkit? I found it quite fast. Although I never use java, I found it easy to use.

0
Entering edit mode

Will you explain what the columns represent in the results.tsv file?