R ballgown - Obtain number of transcripts per gene per sample?
3 months ago
Sam • 0

It seems like this should be feasible, but I'm not well versed in R and have only just begun dabbling with ballgown.

Currently, my only thought is to pull data out of ballgown and create intermediate files containing two columns:

  • a column of genes
  • a column of associated transcript IDs

seqnames     id
NC_007175.2  1
NC_007175.2  2
NC_007175.2  3
NC_007175.2  4
NC_007175.2  5
NC_035780.1  6

and then do something more "basic" in Bash, such as awk '{print $1}' transcript-to-genes.txt | sort | uniq -c.
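
For reference, here is a minimal sketch of that counting step applied to the example table above. The function name count_per_group is made up for illustration, and the table is fed in through a here-document instead of transcript-to-genes.txt. Two things differ from the one-liner: the header row is skipped (NR > 1), and a final awk swaps the uniq -c output into "name, count" order.

```shell
# Count how many rows share each value in column 1, skipping the header row.
# Reads a whitespace-delimited table on stdin; prints "<name><TAB><count>".
# (count_per_group is a hypothetical helper name, not part of any tool.)
count_per_group() {
  awk 'NR > 1 {print $1}' | sort | uniq -c | awk '{print $2 "\t" $1}'
}

# The example table from this post, passed in directly instead of from a file.
count_per_group <<'EOF'
seqnames     id
NC_007175.2  1
NC_007175.2  2
NC_007175.2  3
NC_007175.2  4
NC_007175.2  5
NC_035780.1  6
EOF
```

On this table it should report 5 IDs for NC_007175.2 and 1 for NC_035780.1.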

I'd prefer a full R solution in order to keep things tidy (i.e. not have intermediate files, not switch between languages).

Any suggestions/help would be much appreciated.

EDITED: Made Bash code accurate.

ballgown R • 156 views

Hi, I suppose you used StringTie to assemble the transcripts, correct? If so, you can find the transcripts directly in the .gtf file created for each sample, with the associated gene listed alongside each transcript. It's already a "column of genes and a column of transcripts" file.
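
A rough sketch of how that could look, assuming the usual GTF layout: tab-separated fields, the feature type in column 3, and gene_id / transcript_id attributes in column 9. The function name transcripts_per_gene and the three sample records are invented for illustration and are not real StringTie output.

```shell
# Print "<gene_id><TAB><number of transcripts>" for a GTF read on stdin.
# Only rows whose feature type (column 3) is "transcript" are counted,
# so exon rows do not inflate the totals.
# (transcripts_per_gene is a hypothetical helper name.)
transcripts_per_gene() {
  awk -F'\t' '
    $3 == "transcript" && match($9, /gene_id "[^"]+"/) {
      # Skip the 9 characters of the prefix gene_id " and drop the closing quote.
      gene = substr($9, RSTART + 9, RLENGTH - 10)
      count[gene]++
    }
    END { for (g in count) print g "\t" count[g] }
  ' | sort
}

# Invented GTF-style records: two transcripts and one exon for STRG.1,
# one transcript for STRG.2. printf reuses the format for every 9 arguments.
printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
  chr1 StringTie transcript 100 900 1000 + . 'gene_id "STRG.1"; transcript_id "STRG.1.1";' \
  chr1 StringTie exon 100 400 1000 + . 'gene_id "STRG.1"; transcript_id "STRG.1.1"; exon_number "1";' \
  chr1 StringTie transcript 150 900 1000 + . 'gene_id "STRG.1"; transcript_id "STRG.1.2";' \
  chr1 StringTie transcript 2000 2500 1000 - . 'gene_id "STRG.2"; transcript_id "STRG.2.1";' \
  | transcripts_per_gene
```

Running the function once per sample's .gtf would give the per-sample counts; it does not address the request to stay entirely within R.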
