3.7 years ago
Morris_Chair ▴ 330
Hello everyone, I'm analyzing data with the new Tuxedo pipeline (HISAT, StringTie, and Ballgown), but I can't get Ballgown to work:
pf_rna<-ballgown(dataDir="ballgown/", samplePattern = sample, pData=pheno_data)
I get this error
Sat Jul 6 00:55:39 2019
Sat Jul 6 00:55:39 2019: Reading linking tables
Sat Jul 6 00:55:40 2019: Reading intron data files
Sat Jul 6 00:55:41 2019: Merging intron data
Sat Jul 6 00:55:43 2019: Reading exon data files
Sat Jul 6 00:55:45 2019: Merging exon data
Sat Jul 6 00:55:47 2019: Reading transcript data files
Sat Jul 6 00:55:48 2019: Merging transcript data
successfully rearranged!
Wrapping up the results
Sat Jul 6 00:55:50 2019
Warning message:
In ballgown(dataDir = "ballgown/", samplePattern = sample, pData = pheno_data) :
Rows of pData did not seem to be in the same order as the columns of the expression data. Attempting to rearrange pData...
The names in pData are in the same order as the names of the folders that contain the .ctab files, but after several attempts to fix this I'm exhausted and I need help. Do you have any suggestions? I'd appreciate it.
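One way to test the claim that pData and the folders are in the same order is to compare them position by position. The sketch below is a minimal check in Python. It assumes that the first column of the phenotype table holds the sample IDs and that the sample folders are taken in sorted name order; the sort order is an assumption, not something confirmed in this thread. The function name `compare_order` and the sample names are hypothetical.

```python
def compare_order(pheno_ids, folder_names):
    """Compare phenotype IDs with sample folder names, position by position.

    Assumption: the folders are read in sorted name order.
    Returns (same_order, same_set, mismatches).
    """
    folders = sorted(folder_names)
    ids = list(pheno_ids)
    # Same names overall, regardless of order?
    same_set = sorted(ids) == folders
    # Positions where the phenotype row and the folder disagree.
    mismatches = [
        (i, pid, folder)
        for i, (pid, folder) in enumerate(zip(ids, folders))
        if pid != folder
    ]
    same_order = same_set and not mismatches
    return same_order, same_set, mismatches


# Hypothetical example: "sample_10" sorts before "sample_2" as a string.
pheno_ids = ["sample_1", "sample_2", "sample_10"]
folder_names = ["sample_1", "sample_10", "sample_2"]
same_order, same_set, mismatches = compare_order(pheno_ids, folder_names)
print(same_order, same_set)
for position, pid, folder in mismatches:
    print(position, pid, folder)
```

If the names match as a set but not by position, reordering the phenotype rows to follow the folder order is the usual fix.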
OK, I'll answer my own question :) I was able to run Ballgown a different way (pData was causing the problem). In my opinion there must be a bug in this tool that prevents the analysis from working with the script above. After a lot of searching, studying, and discouragement, here is the solution:
Read the design_matrix file
Full path to the sample directories
Load ballgown data structure
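As a rough illustration of the first two steps, here is a minimal Python sketch that reads a design matrix and builds one full path per sample, in the same row order as the table. The column name `ids`, the CSV layout, and the directory name `ballgown` are assumptions for illustration, not the poster's actual files. For the third step, ballgown() accepts a `samples` argument (a vector of paths to the sample folders) as an alternative to `dataDir` plus `samplePattern`, so paths built in table order could be passed there together with the same table as pData.

```python
import csv
import io
import os


def sample_paths(design_csv, data_dir, id_column="ids"):
    """Return (sample_id, path) pairs in the row order of the design matrix."""
    reader = csv.DictReader(design_csv)
    return [
        (row[id_column], os.path.join(data_dir, row[id_column]))
        for row in reader
    ]


# Hypothetical design matrix; a real one would be opened with open(...).
design = io.StringIO(
    "ids,condition\n"
    "sample_1,control\n"
    "sample_2,control\n"
    "sample_10,treated\n"
)
pairs = sample_paths(design, "ballgown")
for sample_id, path in pairs:
    print(sample_id, path)
```

Because the paths are derived from the table itself, the sample order and the phenotype rows cannot drift apart.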
All the best
You could make your life a lot easier if you simply used salmon-tximport and then any of the common downstream tools such as DESeq2. Documentation is outstandingly comprehensive and you do not have to mess around with this odd
I followed your advice in the past and my pipeline works perfectly fine, and yes, you are definitely right: it's a lot easier to use Salmon and DESeq2.
I'm doing this for three reasons: I want to try different tools for differential expression analysis (I'm aware that Ballgown is not the best), I want to be able to detect the splice variants of each gene, and I want to become familiar with these command lines and this pipeline because they might be useful for my next aim, a meta-analysis.
Thank you :)
Hi Morris, I am using the HISAT2 - StringTie - Ballgown pipeline. Could you please comment on the following error message? Thanks.
Hi Asad, I'm sorry, but in the end I used a different pipeline for my analysis, so I don't know what to say about this error. I hope someone else can help you.
Thanks Morris. Now, I am using DESeq2 instead of Ballgown. My RNA-Seq workflow is HISAT2 - StringTie -DESeq2. It seems to be working fine.
To help others running into the same issue, here I present what I am doing:
I have estimated transcript abundance with the following call.
Then I ran the prepDE.py script ('python prepDE.py') within the 'ballgown' directory mentioned above. It generated gene and transcript count matrices in CSV format.
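If the sample folders are not laid out the way prepDE.py expects, an explicit sample list can help. The sketch below builds lines of the form "sample ID, then path to that sample's GTF", one per sample. The directory layout, the GTF naming scheme, and the sample names are all assumptions for illustration.

```python
import os


def sample_list_lines(data_dir, sample_ids, gtf_name="{sample}.gtf"):
    """Build 'sample_id path/to/gtf' lines, one per sample.

    Assumption: each sample has its own sub-directory under data_dir and
    the GTF inside it is named after the sample.
    """
    lines = []
    for sample in sample_ids:
        gtf = os.path.join(data_dir, sample, gtf_name.format(sample=sample))
        lines.append(f"{sample} {gtf}")
    return lines


# Hypothetical sample names.
lines = sample_list_lines("ballgown", ["sample_1", "sample_2"])
print("\n".join(lines))
```

The resulting text could be saved to a file and given to prepDE.py through its `-i` option, which, as far as I know, accepts either a folder of sample sub-directories or a list like this one; `python prepDE.py -h` shows what your version supports.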
I am analysing these csv files using the DESeq2 package.
Hi Asad Prodhan,
So can we use DESeq2 directly, without running Ballgown?