stringtie step, I estimated transcript abundances and created table counts for
In a folder "ballgown", I have all the sample subdirectories. Each sample subdirectory has the following files:
e2t.ctab e_data.ctab i2t.ctab i_data.ctab TCGA-XX-XXX-XX_GRCh38.gtf t_data.ctab
To get the FPKM matrix, I did the following:
library(ballgown)
my.data <- ballgown(dataDir = "ballgown", samplePattern = "TCGA", meas="all")
Sun Aug 25 14:12:19 2018
Sun Aug 25 14:12:19 2018: Reading linking tables
Sun Aug 25 14:12:29 2018: Reading intron data files
Sun Aug 25 15:00:38 2018: Merging intron data
Killed !
I tried a couple of times but got the same error each time. What could be wrong here? Did I do something wrong?
Hi, See Adrian's answer below. You don't specifically need to run ballgown if you are just interested in FPKM values.
Btw, if you have a large number of samples, ballgown might have been killed for running out of memory. Recently I attempted a ballgown run on >80 samples on a desktop (8 GB RAM), where it died, but it ran as expected on a server.
Yes, I understand that. I have almost 1200 samples. I didn't use the -A argument with stringtie. I have the ballgown outputs. Should I run stringtie again with -A?
That would mean re-running StringTie for all your 1200 samples. It would probably be easier to run ballgown on a system with more RAM available. If not, and time notwithstanding, you could re-run StringTie with
There could be other solutions depending on what your question is. Are you looking for transcript isoform level quantification specifically? If not, then you could generate count (gene level) data from the BAMs using the HTSeq-count or featureCounts tools. These would demand far fewer compute resources.
There is another way of checking isoform level quantification without using StringTie (or any other transcriptome assembly tool). You could use tools like kallisto.
Or can I use t_data.ctab, which also has the transcripts and FPKM? I can cut the two columns for each sample and then join them all into a single file?
As mentioned here: FPKM matrix (check the last comment)
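For anyone landing here later, the idea of pulling the transcript and FPKM columns out of each sample's t_data.ctab and joining them can be sketched in base R roughly as below. This is only a minimal sketch, not ballgown's own method. It assumes every t_data.ctab is a tab-delimited file with a header containing t_name and FPKM columns, and that all samples were quantified against the same annotation so transcript names match across samples. The function name build_fpkm_matrix and the output file name are made up for illustration.

```r
# Hypothetical helper (not part of ballgown): build a transcript x sample
# FPKM matrix from the t_data.ctab file in each sample subdirectory.
build_fpkm_matrix <- function(data_dir = "ballgown", sample_pattern = "TCGA") {
  # One subdirectory per sample, filtered by the sample name pattern
  sample_dirs <- list.dirs(data_dir, recursive = FALSE)
  sample_dirs <- sample_dirs[grepl(sample_pattern, basename(sample_dirs))]
  stopifnot(length(sample_dirs) > 0)

  # Keep only FPKM values, named by transcript, for each sample
  per_sample <- lapply(sample_dirs, function(d) {
    tab <- read.delim(file.path(d, "t_data.ctab"), stringsAsFactors = FALSE)
    setNames(tab$FPKM, tab$t_name)
  })
  names(per_sample) <- basename(sample_dirs)

  # Rows are the union of transcript names, so values are matched by name
  # rather than by row order; transcripts missing from a sample stay NA
  transcripts <- sort(unique(unlist(lapply(per_sample, names))))
  fpkm <- matrix(NA_real_,
                 nrow = length(transcripts), ncol = length(per_sample),
                 dimnames = list(transcripts, names(per_sample)))
  for (s in names(per_sample)) {
    x <- per_sample[[s]]
    fpkm[names(x), s] <- x
  }
  fpkm
}

# Example usage (paths and file name are assumptions):
# fpkm <- build_fpkm_matrix("ballgown", "TCGA")
# write.table(fpkm, "fpkm_matrix.tsv", sep = "\t", quote = FALSE, col.names = NA)
```

Since only one small table per sample is read at a time and the intron and exon tables are never touched, this should need much less memory than loading everything with meas="all", although the final matrix still has to fit in RAM.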
Brilliant! That should work. I was somehow under the impression that they are binary files.
Thank you, will follow that.