Analysis of rMATS output from RNA-Seq data
3.0 years ago
yangshuaihz ▴ 10

I have used rMATS to generate output files for the various types of alternative splicing. What should I do next? I want to count the splicing events of each type in each file, but I don't know what software to use.

RNA-Seq • 1.9k views

Why do you want to count the number of splicing events in each file?

I was just curious. I have solved it now by using the "wc -l" command on Linux. But now I have another question: what does the last column, IncLevelDifference, mean? Can it be used as a basis for screening differentially expressed genes? Thanks a lot.

3.0 years ago

If you want to compare conditions and look for differences in splicing, you should focus on the differential splice sites (as identified by rMATS). The vast majority of splice sites will not change, so counting the total number of sites is not a particularly sensitive approach.

If you are interested in a genome-wide summary of changes between conditions, we have recently developed a statistical approach for this. You can get a quick idea of what it can do in this part of the IsoformSwitchAnalyzeR vignette, and you can read the article here. If you have any questions about it, please don't hesitate to ask.

Thank you very much! I have only just started in bioinformatics, and our teacher asked me to work on alternative splicing, so I am trying to learn the subject. If you can provide more help, I will be very grateful! For example, could you recommend some websites on alternative splicing and its analysis workflow?

I honestly think one of the better overviews is a post I wrote here on Biostars recently, which introduces the different types of splicing analysis one can do and mentions the appropriate tools. You can find it here. From there you can choose the type of analysis that is of interest to your project and read the corresponding articles. Choosing the type that is relevant for your project is the essential step.

For a general introduction to splicing I would suggest wiki. There are also many good reviews about splicing, but because splicing affects almost all of biology they tend to be field-specific, so you will have to find the ones of interest yourself. (Pro tip: after you have performed a search on PubMed, you can click the "review" button on the left side to show only reviews. With the advanced search you can also select specific organisms, etc.) Another post that might be relevant for your question is this.

Hi, I am very interested in your newly developed software "IsoformSwitchAnalyzeR", but I have a question. I have two cultivars, "A" and "B". Both were treated and sampled on day 2 and day 6, with day 0 used as the control, and there are three replicates for each condition. So I sequenced these cultivars on day 0, day 2 and day 6, and I have a total of 18 fastq files. I ran Cufflinks and Cuffdiff on these data. The cuffdiff command was roughly as follows: cuffdiff -p 16 --library-type fr-firststrand -b ./*.fa -L A0d,B0d,A2d,B2d,A6d,B6d -u ./merged.gtf <*.bam>. Can the files I got, such as "genes.fpkm_tracking" and "gene_exp.diff", be imported directly into R following your instructions? Or can this R package only handle comparisons separately, such as A0 vs A2 or A0 vs A6? Thanks in advance!

Yes, it can. IsoformSwitchAnalyzeR has a specific function for importing Cufflinks data. It is called importCufflinksFiles() and you can read more here.

3.0 years ago
darbinator ▴ 290

If you just want to count the number of AS events in each sample, you have to run each sample against itself. For example, b1.txt and b2.txt should both contain: Control_1.bam

The PSI difference (IncLevelDifference) will obviously be 0, but by counting the number of rows (and thus events) in each of the five result files, one per event type (A3SS, A5SS, RI, MXE, SE), you get the number of splicing events of each type for the sample in question.
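
A minimal sketch of the row counting described above, written in Python instead of "wc -l". It assumes the five result files sit in one directory and are named like SE.MATS.JC.txt (the naming used by recent rMATS versions; older versions differ, so adjust the suffix), and that each file starts with a single header line, which is why one line is subtracted from the raw count. The function names are only illustrative.

```python
from pathlib import Path

# The five event types reported by rMATS.
EVENT_TYPES = ("SE", "RI", "MXE", "A3SS", "A5SS")


def count_events(path):
    """Count data rows in one result file, not counting the header line."""
    with open(path, encoding="utf-8") as handle:
        rows = sum(1 for line in handle if line.strip())
    return max(rows - 1, 0)


def count_all_events(result_dir, suffix=".MATS.JC.txt"):
    """Return {event_type: number_of_events} for the files present in result_dir.

    The suffix is an assumption about the file naming; change it if your
    rMATS version writes different file names.
    """
    counts = {}
    for event_type in EVENT_TYPES:
        path = Path(result_dir) / f"{event_type}{suffix}"
        if path.exists():
            counts[event_type] = count_events(path)
    return counts


if __name__ == "__main__":
    import sys

    directory = sys.argv[1] if len(sys.argv) > 1 else "."
    for event_type, n in count_all_events(directory).items():
        print(f"{event_type}\t{n}")
```

Note that a plain "wc -l" on the same files gives a number one higher than this, because it also counts the header line.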

Thanks a lot! Your answer helped me solve the problem. But now I have another question: what does the last column, IncLevelDifference, mean? Can it be used as a basis for screening differentially expressed genes? Thanks in advance!

For each of the two groups, rMATS calculates the IncLevel, which is approximately the number of reads that support the inclusion of an event divided by the total number of reads that support either its inclusion or its exclusion. The IncLevelDifference is then simply the difference between the two IncLevel values, calculated as:

IncLevelDifference = IncLevel1 - IncLevel2
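
To make the relation between read counts and these values concrete, here is a small sketch. It reflects my understanding of the rMATS output (inclusion and skipping junction counts per replicate, each normalised by the effective length of its isoform form, and group values averaged over replicates before subtracting). It is not code taken from rMATS, and the function names are made up for illustration.

```python
def inclusion_level(inc_reads, skip_reads, inc_len, skip_len):
    """Length-normalised fraction of reads supporting inclusion (between 0 and 1).

    inc_len and skip_len stand for the effective lengths of the inclusion and
    skipping forms. Returns None when there are no reads at all.
    """
    inc = inc_reads / inc_len
    skip = skip_reads / skip_len
    total = inc + skip
    if total == 0:
        return None
    return inc / total


def mean_defined(values):
    """Average the values that are not None; return None if none are defined."""
    defined = [v for v in values if v is not None]
    if not defined:
        return None
    return sum(defined) / len(defined)


def inc_level_difference(levels_1, levels_2):
    """Mean inclusion level of group 1 minus mean inclusion level of group 2."""
    mean_1 = mean_defined(levels_1)
    mean_2 = mean_defined(levels_2)
    if mean_1 is None or mean_2 is None:
        return None
    return mean_1 - mean_2
```

For example, inclusion_level(30, 10, 2, 1) compares 15 normalised inclusion reads with 10 normalised skipping reads, so the inclusion level is 15 / 25 = 0.6.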

This number varies between -1 and 1, and the further it is from 0, the more the inclusion or exclusion of the event differs between the two conditions. So you have to choose an arbitrary threshold to decide whether a gene is differentially spliced or not. I chose 0.2 based on some publications, but I get many genes with this threshold and I am afraid of having many false positives.

(If anyone has feedback on which threshold to choose, I am interested.)
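
A minimal sketch of applying such a threshold to one rMATS result file. It assumes a tab-separated file with a header containing an IncLevelDifference column. The optional max_fdr filter on the FDR column is my own addition, not something proposed in this thread, and the 0.2 default is only the arbitrary cutoff mentioned above. The function name is hypothetical.

```python
import csv
import math


def filter_events(path, min_abs_diff=0.2, max_fdr=None):
    """Return rows (as dicts) whose |IncLevelDifference| is at least min_abs_diff.

    If max_fdr is given, rows must also have FDR <= max_fdr (an extra,
    optional filter that is an assumption of this sketch).
    Rows with missing or non-numeric values such as "NA" are skipped.
    """
    kept = []
    with open(path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            try:
                diff = float(row["IncLevelDifference"])
            except (ValueError, TypeError):
                continue
            if math.isnan(diff) or abs(diff) < min_abs_diff:
                continue
            if max_fdr is not None:
                try:
                    fdr = float(row["FDR"])
                except (ValueError, TypeError):
                    continue
                if math.isnan(fdr) or fdr > max_fdr:
                    continue
            kept.append(row)
    return kept
```

Calling filter_events("SE.MATS.JC.txt") keeps events with an absolute difference of at least 0.2, and trying a few values of min_abs_diff shows how strongly the number of retained events depends on the cutoff.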

Thank you very much, I will consult the literature myself to study this knowledge.