I am using HISAT2, StringTie and Ballgown for differential expression analysis of wheat RNA-seq data.
According to this protocol, in the step "Estimate transcript abundances and create table counts for Ballgown", we use the following commands, and GTF files are produced as output:
$ stringtie -e -B -p 8 -G stringtie_merged.gtf -o ballgown/ERR188044/ERR188044_chrX.gtf ERR188044_chrX.bam
$ stringtie -e -B -p 8 -G stringtie_merged.gtf -o ballgown/ERR188104/ERR188104_chrX.gtf ERR188104_chrX.bam
$ stringtie -e -B -p 8 -G stringtie_merged.gtf -o ballgown/ERR188234/ERR188234_chrX.gtf ERR188234_chrX.bam
I am confused because in the very next step, after loading the relevant R packages, we have to load phenotype data (a CSV file), like this:
> pheno_data = read.csv("file.csv")
The protocol describes this step as follows:
Load the phenotype data for the samples. An example file called geuvadis_phenodata.csv is included with the data files for this protocol (ChrX_data). In general, you will have to create this file yourself. It contains information about your RNA sequencing samples, formatted as illustrated in this csv (comma-separated values) file. Each sample should be described on one row of the file and each column should contain one variable. To read this file into R we use the command read.csv.
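To make my question concrete, here is a minimal sketch of what I understand such a file to look like if written by hand. The file name `phenodata.csv`, the sample IDs and the `condition` column are made-up placeholders, not taken from the protocol. As far as I understand, the first column should hold the sample IDs matching the per-sample directory names under ballgown/, but that is my assumption.

```shell
# Write a small phenotype table by hand: one row per sample, one column per variable.
# NOTE: the sample IDs and the "condition" column are hypothetical placeholders;
# replace them with your own sample names and experimental variables.
cat > phenodata.csv <<'EOF'
ids,condition
sample01,control
sample02,control
sample03,treated
sample04,treated
EOF

# Print the file to confirm it has a header line plus one row per sample
cat phenodata.csv
```

A file like this could then be read with read.csv("phenodata.csv"), as in the command above.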
Can someone please explain how the StringTie GTF output is used in the pipeline to create the table counts for Ballgown, and how this CSV file is created for Ballgown?