Question

Calculate Percent Spliced In (PSI)

4

Entering edit mode

9.9 years ago

Vikas Bansal ★ 2.4k

Dear all,

I was wondering if there is any tool to calculate PSI from the BAM files. Basically what I want is-

I have gtf file with all the exons and I want to calculate for each exon (lets take exon filled with grey for example) how many reads mapped to junction of that exon (a and b in figure) and how many reads skip that exon (c in figure). So if we have, let's say, 10k exons in the gtf, the output should contain 10k rows with a,b and c as columns.

I used MATS and bam2ssj but did not get what I was expecting. I have gtf file with approx. 300k exons but MATS gives the output only for 3k exons. bam2ssj is calculating the values for introns rather than exons.

Any suggestions?

Thanks in advance.

RNA-Seq Splicing • 19k views

ADD COMMENT • link updated 2.2 years ago by dnaJNKY • 0 • written 9.9 years ago by Vikas Bansal ★ 2.4k

0

Entering edit mode

have a look at MISO: http://genes.mit.edu/burgelab/miso/docs/

ADD REPLY • link updated 4.3 years ago by Ram 43k • written 9.9 years ago by Martombo ★ 3.1k

0

Entering edit mode

Thanks for the reply. I read about MISO but it says

Two general kinds of analyses are possible:

Estimate expression level of exons ("exon-centric" analysis) or,
Estimate expression level of whole transcripts ("isoform-centric" analysis)

I think it include the reads mapped on exons but I am interested only in junction reads.

ADD REPLY • link updated 4.3 years ago by Ram 43k • written 9.9 years ago by Vikas Bansal ★ 2.4k

0

Entering edit mode

in the documentation there is also a paragraph about psi estimation.

the definition given in their paper is: 'percentage spliced in' (PSI or Ψ)11 denotes the fraction of mRNAs that represent the inclusion isoform. Reads aligning to the alternative exon or to its junctions with adjacent constitutive exons provide support for the inclusion isoform, whereas reads aligning to the junction between the adjacent constitutive exons support the exclusion isoform; the relative read density of these two sets forms the standard estimate of Ψ.

the paper is here

ADD REPLY • link updated 4.3 years ago by Ram 43k • written 9.9 years ago by Martombo ★ 3.1k

0

Entering edit mode

Hi Vikas,

we have just written a small protocol to do this:

http://onlinelibrary.wiley.com/doi/10.1002/0471142905.hg1116s87/abstract

ADD REPLY • link 8.5 years ago by derseb • 0

Ram · Accepted Answer · 2014-06-12

6

Entering edit mode

9.9 years ago

Pierre Lindenbaum 161k

I just tried to write something for this: https://github.com/lindenb/jvarkit/wiki/Biostar103303

I don't have any data to test though.

ADD COMMENT • link 9.9 years ago by Pierre Lindenbaum 161k

0

Entering edit mode

Thanks a lot for the reply. Its amazing that you included it in jvarkit so quickly. I am trying to install jvarkit but it is not working. Below are my commands -

ADD REPLY • link updated 4.3 years ago by Ram 43k • written 9.9 years ago by Vikas Bansal ★ 2.4k

1

Entering edit mode

use jdk7 please, not 8 (java 8 is too "new" for the libraries included in htsjdk)

ADD REPLY • link 9.9 years ago by Pierre Lindenbaum 161k

0

Entering edit mode

Thanks. Now build is fine. But when I do ant biostar103303 or ant Biostar103303 or ant Biostar103303.java, it says

Buildfile: /project/specht/tools/jvarkit/htsjdk/build.xml

BUILD FAILED
Target "biostar103303" does not exist in the project "htsjdk".

Total time: 0 seconds

ADD REPLY • link updated 4.3 years ago by Ram 43k • written 9.9 years ago by Vikas Bansal ★ 2.4k

0

Entering edit mode

you forgot to cd ... Please run ant biostar103303 in varkit not in varkit/htsjdk

ADD REPLY • link updated 4.3 years ago by Ram 43k • written 9.9 years ago by Pierre Lindenbaum 161k

0

Entering edit mode

Finally, I got the output file. But the columns - "exon.count_prev_and_next", "exon.count_prev_and_curr" "exon.count_curr_and_next" are all 0. I think it is not able to calculate junction reads.

ADD REPLY • link 9.9 years ago by Vikas Bansal ★ 2.4k

0

Entering edit mode

do you have a bam (a subset) and a link to gtf so I can test my program ?

ADD REPLY • link 9.9 years ago by Pierre Lindenbaum 161k

0

Entering edit mode

Here is the link for the BAM (subset) file - https://www.dropbox.com/s/k2ibkw81t4hexq8/onlymyh7.bam

and gtf I downloaded from here - http://genome.ucsc.edu/cgi-bin/hgTables?hgsid=378435437_BVHytx2vRGH3XVmRsp39JG5acTeD&clade=mammal&org=Mouse&db=mm9&hgta_group=genes&hgta_track=refGene&hgta_table=0&hgta_regionType=genome&position=chr12%3A57795963-57815592&hgta_outputType=primaryTable&hgta_outFileName=

Please let me know if you need any other file.

ADD REPLY • link updated 4.3 years ago by Ram 43k • written 9.9 years ago by Vikas Bansal ★ 2.4k

0

Entering edit mode

thanks, I'll play with it next week.

ADD REPLY • link 9.9 years ago by Pierre Lindenbaum 161k

0

Entering edit mode

I've updated my sources.It seems to work now. Re-download everything to use git to update the sources. Please use https://github.com/lindenb/jvarkit/issues?state=open to report others bug.

ADD REPLY • link updated 4.3 years ago by Ram 43k • written 9.9 years ago by Pierre Lindenbaum 161k

0

Entering edit mode

Thanks. I think I found a bug and I will post it on github. Just one thing- Did you publish jvarkit? I found it quite fast. Although I never use java, I found it easy to use.