I am working with RNA-seq data and trying to load the StringTie count output from "prepDE.py" for all 9 of my samples into DESeq2, so that I can perform differential expression analysis across my three conditions. Here is how my data is set up:
cell line 1: sample1 (control), sample2 (knockdown), sample3 (overexpression)
cell line 2: sample4 (control), sample5 (knockdown), sample6 (overexpression)
cell line 3: sample7 (control), sample8 (knockdown), sample9 (overexpression)
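To make the design explicit, the layout above can be written out as a sample table with one row per sample. This is only a minimal sketch: the column names (`sample`, `cell_line`, `condition`) and the `cell_line_N` labels are my own assumptions, not something produced by StringTie or prepDE.py.

```python
import csv
import io

# Conditions repeat in the same order within each cell line.
conditions = ["control", "knockdown", "overexpression"]

# Build one row per sample: sample1..sample9, three samples per cell line.
sample_table = []
for i in range(9):
    sample_table.append({
        "sample": f"sample{i + 1}",
        "cell_line": f"cell_line_{i // 3 + 1}",  # hypothetical label
        "condition": conditions[i % 3],
    })

# Serialize the table as CSV text (written to memory here, not to disk).
buffer = io.StringIO()
writer = csv.DictWriter(
    buffer,
    fieldnames=["sample", "cell_line", "condition"],
    lineterminator="\n",
)
writer.writeheader()
writer.writerows(sample_table)
sample_sheet_csv = buffer.getvalue()

print(sample_sheet_csv)
```

A table of this shape (samples as rows, in the same order as the count matrix columns) is the kind of column metadata that DESeq2 expects alongside the counts.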
I have a "transcript_count_matrix.csv" file generated by prepDE.py and a merged_transcripts.gtf file from stringtie --merge for all 9 samples, with FPKM values and Ensembl IDs.
I also have the output for each sample from stringtie -e -B:
sample1.gtf e2t.ctab e_data.ctab i2t.ctab i_data.ctab t_data.ctab
I would like to know how I can perform differential expression analysis on this StringTie output with DESeq2. I would like to compare all 3 control samples against all 3 knockdown and all 3 overexpression samples, and have the results in a format that I can use as a .gct input file for Gene Set Enrichment Analysis.
I would like something similar to how cuffdiff works, which outputs fpkm_tracking files with gene symbols and FPKM values.
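To show the target format I mean, here is a minimal sketch of turning a CSV expression matrix into GCT (version 1.2) text: a `#1.2` line, a line with the number of rows and samples, a header starting with `Name` and `Description`, then one tab-separated row per feature. The function name `counts_csv_to_gct`, the `na` description placeholder, and the tiny example matrix are my own assumptions. I am also assuming the first CSV column holds the feature ID, as in the prepDE.py output. Whether raw counts or normalized values should go into the file is a separate question that this sketch does not address.

```python
import csv
import io


def counts_csv_to_gct(csv_text, description="na"):
    """Convert CSV matrix text (first column = feature ID) to GCT v1.2 text."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    header, data = rows[0], rows[1:]
    samples = header[1:]  # every column after the ID column is a sample

    lines = [
        "#1.2",                                   # GCT version line
        f"{len(data)}\t{len(samples)}",           # number of rows, number of samples
        "\t".join(["Name", "Description"] + samples),
    ]
    for row in data:
        # Name = feature ID, Description = placeholder, then the values.
        lines.append("\t".join([row[0], description] + row[1:]))
    return "\n".join(lines) + "\n"


# Made-up two-sample matrix, only to illustrate the layout.
example_csv = (
    "transcript_id,sample1,sample2\n"
    "MSTRG.1.1,10,20\n"
    "MSTRG.2.1,0,5\n"
)
gct_text = counts_csv_to_gct(example_csv)
print(gct_text)
```

The `Description` column could carry gene symbols instead of a placeholder if a mapping from IDs to symbols is available.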
Any suggestions on how to proceed and any help would be greatly appreciated!!
Thanks so much,