Question: ballgown has "." as Gene Names
10 weeks ago, Fawzi Yassine wrote:

Hi, I'm using stringtie and ballgown (in R) for the standard RNA-seq data analysis.

Both

texpr(ballgown_obj1, 'all')$gene_name

and

ballgown::geneNames(ballgown_obj1)

return all the gene names as “.”

How can I get my gene names?

Thanks,

Tags: rna-seq, ballgown, assembly
8 weeks ago, Fawzi Yassine wrote:

I solved the problem: the annotation file did not have a gene_name column.
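For anyone hitting the same symptom, a quick way to test an annotation is to look for the gene_name attribute in its transcript records. The following is a minimal base-R sketch, assuming (as the fix in this thread suggests) that the names come from the gene_name attribute of the GTF passed to stringtie -G. The two toy records, the GENE_A name, and the helper has_gene_name() are hypothetical and only for illustration.

```r
# Toy GTF records (hypothetical); in practice use readLines("path/to/annotation.gtf").
gtf_lines <- c(
  'chr1\tStringTie\ttranscript\t100\t900\t.\t+\t.\tgene_id "MSTRG.5"; transcript_id "MSTRG.5.1";',
  'chr1\tStringTie\ttranscript\t100\t950\t.\t+\t.\tgene_id "MSTRG.5"; transcript_id "MSTRG.5.2"; gene_name "GENE_A";'
)

# Hypothetical helper: one TRUE/FALSE per transcript record, TRUE when the
# attribute column (column 9) contains a gene_name entry.
has_gene_name <- function(lines) {
  lines  <- lines[nzchar(lines) & !grepl("^#", lines)]  # drop blank and comment lines
  fields <- strsplit(lines, "\t", fixed = TRUE)         # GTF has 9 tab-separated columns
  is_tx  <- vapply(fields, function(f) identical(f[3], "transcript"), logical(1))
  attrs  <- vapply(fields[is_tx], function(f) f[9], character(1))
  grepl('gene_name "', attrs, fixed = TRUE)
}

flags <- has_gene_name(gtf_lines)
cat(sum(flags), "of", length(flags), "transcript records carry gene_name\n")
```

If no transcript record carries gene_name, the annotation itself is the likely reason for the "." values, rather than anything in the ballgown code.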


Hi Fawzi,

It’s perfectly OK to answer your own questions, but please make the answers as thorough as possible for people who may come across this issue in the future.

Reply written 8 weeks ago by jrj.healey
10 weeks ago, aditi.qamra wrote:

Did you do a de novo assembly? If so, you will first have to add the gene names using getGenes() and a GTF file. Also, check the output of indexes(ballgown_obj1)$t2g.


Hi Aditi, it is not a de novo assembly. The output of indexes(ballgown_obj1)$t2g is:

   t_id    g_id
4     4 MSTRG.5
7     7 MSTRG.5
9     9 MSTRG.5
10   10 MSTRG.5
16   16 MSTRG.2
17   17 MSTRG.2
Reply written 10 weeks ago by Fawzi Yassine (modified 8 weeks ago by genomax)
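Because the t2g index above shows several transcripts sharing one MSTRG gene ID, it can help to summarise which gene IDs have any real name attached at all. The sketch below does this in base R on a hypothetical table shaped like the output of texpr(ballgown_obj1, 'all'); the gene_name values (including GENE_A) are made up for illustration, and "." is treated as a missing name.

```r
# Hypothetical transcript table; t_id and gene_id mirror the t2g output above,
# the gene_name column is invented for this example.
tx <- data.frame(
  t_id      = c(4, 7, 9, 10, 16, 17),
  gene_id   = c("MSTRG.5", "MSTRG.5", "MSTRG.5", "MSTRG.5", "MSTRG.2", "MSTRG.2"),
  gene_name = c(".", "GENE_A", "GENE_A", ".", ".", "."),
  stringsAsFactors = FALSE
)

# For each gene ID, collect the distinct names that are not "."
# and return NA when none of its transcripts has a name.
names_per_gene <- vapply(
  split(tx$gene_name, tx$gene_id),
  function(n) {
    n <- unique(n[n != "."])
    if (length(n) == 0) NA_character_ else paste(n, collapse = ",")
  },
  character(1)
)

print(names_per_gene)
```

If every entry comes back as NA on real data, no transcript has a name, which points at the annotation rather than at individual unannotated loci.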

Can you post your code here?

Reply written 9 weeks ago by aditi.qamra

Aditi, I am sorry for the late reply; I was fixing my PC. Below is the code for stringtie and ballgown.

Here is the code for stringtie:

stringtie -e -B -p 8 -G ./stringtie_merged.gtf -o ${BALLGOWNDIR}/SRR${A}/SRR${A}.gtf ${HISAT2DIR}/SRR${A}.bam

Here is the code for ballgown:

library(ballgown)
library(genefilter)  # provides rowVars(), used in the variance filter below

# Read phenotype sample data
pheno_data = read.csv("data/frda_phenodata.csv", header = TRUE, colClasses = rep("character", 4))
pheno_data = pheno_data[order(pheno_data$ids), ]


# Read in expression data
ballgown_obj = ballgown(dataDir = "data/ballgown", samplePattern = "SRR", pData = pheno_data)

# Filtering out transcripts expressed at low levels before differential expression analysis reduces the severity of the multiple-testing correction and may improve the power of detection.
# texpr() returns FPKM by default, so this keeps only rows whose FPKM values sum to at least 10 across samples (not raw read counts).
ballgown_obj1 = subset(ballgown_obj, "rowSums(texpr(ballgown_obj)) >= 10", genomesubset=TRUE)

# Filter low-variance transcripts: keep only transcripts whose FPKM variance across the samples is greater than one.
ballgown_obj2 = subset(ballgown_obj, "rowVars(texpr(ballgown_obj)) > 1", genomesubset=TRUE)

#DE by transcript
differ_transcripts1 = stattest(ballgown_obj1, feature = "transcript", covariate = "genotype", getFC = TRUE,  meas = "FPKM")

#DE by gene
differ_genes1 = stattest(ballgown_obj1, feature = "gene", covariate = "genotype", getFC = TRUE, meas = "FPKM")
Reply written 9 weeks ago by Fawzi Yassine (modified 8 weeks ago by genomax)
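The two filters in the code above are not interchangeable, which a toy example makes easy to see. This is a base-R sketch on a hypothetical FPKM matrix (the values and the tx1 to tx3 names are invented); apply(..., 1, var) stands in for rowVars() so the block runs without genefilter.

```r
# Hypothetical FPKM matrix: rows are transcripts, columns are samples.
fpkm <- matrix(
  c(0, 0,  1,  0,   # tx1: low total, low variance
    5, 5,  5,  5,   # tx2: total of 20, zero variance
    2, 9, 14, 30),  # tx3: high total, high variance
  nrow = 3, byrow = TRUE,
  dimnames = list(c("tx1", "tx2", "tx3"), paste0("sample", 1:4))
)

# Same condition as the first subset() call: summed FPKM of at least 10.
keep_sum <- rowSums(fpkm) >= 10

# Same condition as the second subset() call: variance across samples above 1.
keep_var <- apply(fpkm, 1, var) > 1

print(rownames(fpkm)[keep_sum])
print(rownames(fpkm)[keep_var])
```

A transcript with constant expression, like tx2 here, passes the sum filter but not the variance filter. Note that both stattest() calls in the thread use ballgown_obj1, the sum-filtered object; ballgown_obj2 is created but never used.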
