Question: Creation of Transcript expression file
2.0 years ago, mail2steff (Potsdam, Germany) wrote:

Dear team

I am analysing alternative splicing (AS) events in Arabidopsis thaliana using SUPPA. I predicted the AS events with the generateEvents option. The next step, calculating PSI, requires a transcript expression file, but I do not know where to get this file for my sample. Can anyone help me with this issue? Thank you in advance.

The SUPPA documentation gives the following explanation:

The transcript expression file is a tab separated file where each line provides the estimated abundance of each transcript (ideally in TPM units). This file might contain multiple columns with the expression values in different samples. The expression file must have a header with the naming of the different expression fields, i.e., the sample name of each expression value.

An example of a transcript expression file for one single sample:

sample1
transcript1 <expression>
transcript2 <expression>
transcript3 <expression>

A transcript expression file with multiple samples:

sample1 sample2 sample3 sample4
transcript1 <expression>    <expression>    <expression>    <expression>
transcript2 <expression>    <expression>    <expression>    <expression>
transcript3 <expression>    <expression>    <expression>    <expression>
modified 2.0 years ago by Satyajeet Khare • written 2.0 years ago by mail2steff
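The layout quoted from the SUPPA documentation can be checked mechanically before running the PSI step. The following is a minimal sketch, assuming the only rules are the ones quoted above: the file is tab-separated, the header holds sample names only, and each data row holds a transcript ID followed by one numeric value per sample. `check_expression_file` is a hypothetical helper, not part of SUPPA.

```python
import io


def check_expression_file(handle):
    """Return (sample_names, n_transcripts) or raise ValueError if malformed."""
    header = handle.readline().rstrip("\n")
    if not header:
        raise ValueError("missing header line")
    samples = header.split("\t")
    n_rows = 0
    for line_no, line in enumerate(handle, start=2):
        line = line.rstrip("\n")
        if not line:
            continue  # tolerate blank lines
        fields = line.split("\t")
        # Each row: one transcript ID plus one expression value per sample.
        if len(fields) != len(samples) + 1:
            raise ValueError(
                "line %d: expected %d fields, got %d"
                % (line_no, len(samples) + 1, len(fields))
            )
        for value in fields[1:]:
            float(value)  # raises ValueError if the value is not numeric
        n_rows += 1
    return samples, n_rows


# Illustrative two-sample file; the IDs and values are made up.
example = "sample1\tsample2\ntranscript1\t10.5\t0.0\ntranscript2\t3.2\t7.1\n"
print(check_expression_file(io.StringIO(example)))
```

Note that the header has one field fewer than the data rows, because the transcript ID column carries no label.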

What does PSI stand for?

written 2.0 years ago by Macspider

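PSI stands for percent (or proportion) spliced in, the relative abundance of the transcripts that include a given splicing event. SUPPA also reports the magnitude of splicing change between conditions as ΔPSI.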

written 2.0 years ago by mail2steff

If you don't have RNAseq reads to map, you don't have an expression profile in TPM. You might get it from microarrays. How is your experiment set up?

written 2.0 years ago by Macspider

I have my files in BAM format. I am looking for AS events in different organs of A. thaliana, and I am using SUPPA for this analysis.

written 2.0 years ago by mail2steff

featureCounts, htseq, cufflinks just to name a few!

written 2.0 years ago by Macspider

The output of featureCounts is:

    # Program:featureCounts v1.5.3; Command:"./featureCounts" "-a" "Arabidopsis_thaliana.TAIR10.36.gtf" "-o" "counts.txt" "accepted_hits_Bur-0.bam" 
Geneid  Chr Start   End Strand  Length  accepted_hits_Bur-0.bam
AT1G01010   1;1;1;1;1;1 3631;3996;4486;4706;5174;5439   3913;4276;4605;5095;5326;5899   +;+;+;+;+;+ 1688    1061
AT1G01020   1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1;1 6788;6788;6788;6788;6788;6788;7157;7157;7157;7157;7157;7157;7384;7384;7384;7384;7564;7564;7564;7564;7564;7564;7762;7762;7762;7762;7762;7942;7942;7942;7942;7942;8236;8236;8236;8236;8236;8236;8417;8417;8417;8417;8571;8571;8571;8594;8594;8594 7069;7069;7069;7069;7069;7069;7232;7232;7232;7450;7232;7450;7450;7450;7450;7450;7649;7649;7649;7649;7649;7649;7835;7835;7835;7835;7835;7987;7987;7987;7987;7987;8325;8464;8325;8325;8464;8325;8464;8464;8464;8464;9130;8737;9130;9130;9130;8737 -;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;- 1571    235

But how can I get a transcript expression file from this?

modified 2.0 years ago • written 2.0 years ago by mail2steff
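The table pasted above has a comment line, a header, and then one row per feature with the length in the sixth column and the counts from the seventh column onwards. A minimal sketch of reading it, assuming the real file is tab-separated as featureCounts writes it (`read_featurecounts` is a hypothetical helper name):

```python
import csv
import io


def read_featurecounts(handle):
    """Parse a featureCounts table into (sample_names, rows).

    rows is a list of (feature_id, length, [count per sample]).
    """
    # Skip the leading "# Program:..." comment line(s).
    lines = (line for line in handle if not line.startswith("#"))
    reader = csv.reader(lines, delimiter="\t")
    header = next(reader)
    samples = header[6:]  # columns after Geneid, Chr, Start, End, Strand, Length
    rows = []
    for fields in reader:
        if not fields:
            continue
        counts = [float(c) for c in fields[6:]]
        rows.append((fields[0], int(fields[5]), counts))
    return samples, rows


# First data row of the output above, with the comment line shortened.
demo = (
    "# Program:featureCounts v1.5.3\n"
    "Geneid\tChr\tStart\tEnd\tStrand\tLength\taccepted_hits_Bur-0.bam\n"
    "AT1G01010\t1;1;1;1;1;1\t3631;3996;4486;4706;5174;5439\t"
    "3913;4276;4605;5095;5326;5899\t+;+;+;+;+;+\t1688\t1061\n"
)
print(read_featurecounts(io.StringIO(demo)))
```

The rows in the pasted output are per gene (AT1G01010, AT1G01020), whereas SUPPA expects one row per transcript. With featureCounts this would mean summarizing on the `transcript_id` attribute (the `-g` option) instead of the default `gene_id`; how reads shared between isoforms are then counted is a separate question that this sketch does not address.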

I am not here to suggest you commands to copy-paste in your terminal: there are manuals, literature, file formats and specifications that you have to read to understand what is needed for you.

Quoting you:

An example of a transcript expression file for one single sample:

sample1
transcript1 <expression>
transcript2 <expression>
transcript3 <expression>

From the output you pasted here you have all you need. Plus, I am pretty sure that there is a function in featureCounts to convert to expression in TPM or FPKM (better the first).

written 2.0 years ago by Macspider

Hi mail2steff,

I am doing a similar kind of analysis in rice but get an error after running the following command:

    python suppa.py generateEvents -i ../../../Splicing/Alternate_Acceptor_and_Donor/all.gff3 -o all.events -e SE SS MX RI FL -f ioe

The error is:

    Traceback (most recent call last):
      File "suppa.py", line 14, in <module>
        import significanceCalculator as diffSplice
      File "/Backup/Splicing/Suppa/SUPPA-master/significanceCalculator.py", line 15, in <module>
        from lib.diff_tools import multiple_conditions_analysis
      File "/Backup/Splicing/Suppa/SUPPA-master/lib/diff_tools.py", line 30
        print(prefix, " ", "%d / %d. " % (i+1, lst_len), "%.2f%% completed." % ((i/lst_len)*100), end="\r", flush=True)
        ^
    SyntaxError: invalid syntax

Can you please help me with the same?

written 8 months ago by rachitasrivastava
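The line that fails in the traceback calls `print(..., end="\r", flush=True)`. That form is valid only in Python 3; under Python 2, `print` is a statement and the `end=` keyword is a syntax error. One plausible cause, then, is that `python` on that machine resolves to a Python 2 interpreter. Whether that is what happened on the poster's system is an assumption. A quick check that runs under either version:

```python
import sys


def is_python3(version_info=sys.version_info):
    """True if the given version tuple belongs to Python 3 or later."""
    return version_info[0] >= 3


if is_python3():
    print("Python 3 interpreter: print(..., end=...) is supported")
else:
    print("Running under Python %d.%d; try `python3 suppa.py ...`" % sys.version_info[:2])
```

If the check reports Python 2, invoking the script with a Python 3 interpreter (for example `python3 suppa.py generateEvents ...`) would be the first thing to try.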
2.0 years ago, Satyajeet Khare (Pune, India) wrote:

Assuming that you have either alignment output (SAM/BAM) or .fastq files, you can generate a count matrix for transcript expression using prepDE.py or featureCounts. From the raw counts you can then calculate either normalized counts or TPMs. featureCounts in R also returns a length vector, which you can use to calculate TPM. Just make sure that you use transcript, not gene, as the feature.

modified 2.0 years ago • written 2.0 years ago by Satyajeet Khare
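The count-plus-length route to TPM can be sketched as follows: divide each count by the feature length to get reads per base, then rescale so the values sum to one million. This is a minimal sketch under stated assumptions. It uses the full annotated length rather than an effective length, the function names are hypothetical, and the transcript IDs are placeholders. The counts and lengths are the two gene-level rows from the featureCounts output earlier in the thread, reused only to show the arithmetic.

```python
def counts_to_tpm(counts, lengths):
    """Convert read counts to TPM given feature lengths in bases."""
    if len(counts) != len(lengths):
        raise ValueError("counts and lengths must have the same size")
    # Reads per base for each feature.
    rates = [c / l for c, l in zip(counts, lengths)]
    total = sum(rates)
    if total == 0:
        return [0.0] * len(rates)
    # Rescale so that the values sum to one million.
    return [r / total * 1e6 for r in rates]


def format_expression_table(sample, ids, values):
    """Single-sample table: header is the sample name, rows are ID<TAB>value."""
    lines = [sample]
    lines += ["%s\t%.4f" % (i, v) for i, v in zip(ids, values)]
    return "\n".join(lines) + "\n"


ids = ["transcript1", "transcript2"]  # placeholder IDs
counts = [1061, 235]                  # last column of the pasted output
lengths = [1688, 1571]                # Length column of the pasted output
tpm = counts_to_tpm(counts, lengths)
print(format_expression_table("sample1", ids, tpm), end="")
```

For a real analysis the input rows would need to be per transcript, and all transcripts in the sample must be included before rescaling, since TPM is defined relative to the whole set.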

Thank you. I'll try that.

written 2.0 years ago by mail2steff