Question: ballgown and pData
11 months ago, Morris_Chair wrote:

Hello everyone, I'm analyzing data using the new Tuxedo suite (HISAT, StringTie, and Ballgown), but I have a problem getting Ballgown to work:

pf_rna<-ballgown(dataDir="ballgown/", samplePattern = sample, pData=pheno_data)

I get this warning:

Sat Jul  6 00:55:39 2019
Sat Jul  6 00:55:39 2019: Reading linking tables
Sat Jul  6 00:55:40 2019: Reading intron data files
Sat Jul  6 00:55:41 2019: Merging intron data
Sat Jul  6 00:55:43 2019: Reading exon data files
Sat Jul  6 00:55:45 2019: Merging exon data
Sat Jul  6 00:55:47 2019: Reading transcript data files
Sat Jul  6 00:55:48 2019: Merging transcript data
successfully rearranged!
Wrapping up the results
Sat Jul  6 00:55:50 2019
Warning message:
In ballgown(dataDir = "ballgown/", samplePattern = sample, pData = pheno_data) :

Rows of pData did not seem to be in the same order as the columns of the expression data. Attempting to rearrange pData...

The names in pData are in the same order as the names of the folders containing the .ctab files, but after several attempts to fix this I'm exhausted and need help. Do you have any suggestions? I'd appreciate it.
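One way to test the claim that pData and the folders are in the same order is to compare the first column of pData with a plain directory listing. The sketch below uses only base R and made-up sample names in a temporary directory. It assumes that Ballgown discovers the sample folders through an alphabetical listing of dataDir, which is an assumption about its internals and not something confirmed in this thread. Note how "sample_10" sorts before "sample_2", which is a common source of this kind of mismatch.

```r
# Hypothetical layout for illustration only: <data_dir>/<sample>/ would hold the .ctab files
data_dir <- file.path(tempdir(), "ballgown_demo")
sample_names <- c("sample_1", "sample_2", "sample_10")
for (s in sample_names) {
  dir.create(file.path(data_dir, s), recursive = TRUE, showWarnings = FALSE)
}

# Phenotype table written in "natural" order, as a person would type it
pheno_data <- data.frame(
  ids = sample_names,
  group = c("ctrl", "ctrl", "treated"),
  stringsAsFactors = FALSE
)

# Folder names as an alphabetical directory listing returns them
folders <- list.files(data_dir, pattern = "sample")

# Does the first column of pData match the listing, element by element?
in_same_order <- identical(as.character(pheno_data[, 1]), folders)
print(in_same_order)

# Reorder the rows of pData so that they follow the folder listing
pheno_sorted <- pheno_data[match(folders, pheno_data[, 1]), , drop = FALSE]
print(pheno_sorted)
```

If the comparison fails, passing the reordered table as pData should avoid the rearrangement step that triggers the warning.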

thank you


OK, I will answer my own question :) I was able to run Ballgown in a different way (pData caused some problems). In my opinion there must be a bug in this tool that prevents the analysis with the call above. After a lot of searching, studying and discouragement, here is the solution:

library(ballgown)

# Read the design matrix file
pheno_data <- read.table(file = "phonotype.txt", header = TRUE, sep = "\t")

# Full paths to the sample directories
# (file.path avoids the double slash that paste("ballgown/", ..., sep = '/') produces)
sample_full_path <- file.path("ballgown", pheno_data[, 1])

# Load the ballgown data structure
bg <- ballgown(samples = sample_full_path, pData = pheno_data)
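A plausible reason this workaround behaves better is that the sample paths are built from pData itself, so both are in the same order by construction. The sketch below makes that check explicit and verifies that the folders exist before loading anything. The sample names and the "ballgown" directory are placeholders, and the ballgown() call only runs if the package is installed and the folders are present.

```r
# Placeholder phenotype table; in the thread this comes from read.table()
pheno_data <- data.frame(
  ids = c("sample_B", "sample_A"),
  group = c("treated", "ctrl"),
  stringsAsFactors = FALSE
)

# Build one path per row of pData, in the same row order
sample_full_path <- file.path("ballgown", pheno_data$ids)

# By construction, the folder names match the first column of pData
order_matches <- identical(basename(sample_full_path), pheno_data$ids)
print(order_matches)

# Report folders that are missing instead of failing inside the loader
missing_dirs <- sample_full_path[!dir.exists(sample_full_path)]
if (length(missing_dirs) > 0) {
  message("Missing sample folders: ", paste(missing_dirs, collapse = ", "))
} else if (requireNamespace("ballgown", quietly = TRUE)) {
  bg <- ballgown::ballgown(samples = sample_full_path, pData = pheno_data)
}
```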

All the best

written 11 months ago by Morris_Chair

You could make your life a lot easier if you simply used salmon-tximport and then any of the common downstream tools such as edgeR or DESeq2. Documentation is outstandingly comprehensive and you do not have to mess around with this odd ballgown tool.

written 11 months ago by ATpoint

Hello ATpoint,

I followed your advice in the past and I have a pipeline that works perfectly fine, and yes, you are definitely right: it's a lot easier to use Salmon and DESeq2.

I'm doing this for three reasons: I want to try different tools for differential expression analysis (I'm aware that Ballgown is not the best), I want to be able to detect the splice variants of each gene, and lastly I want to become familiar with these command lines and pipelines, because they might be useful for my next aim, a meta-analysis.

Thank you :)

written 11 months ago by Morris_Chair