Question

suggestions to compare gene counts

6.3 years ago

b10hazard ▴ 30

I'm new to RNA-seq and have been experimenting with the STAR aligner. I changed some of STAR's command-line options and am trying to determine their impact on my gene counts (counted with the featureCounts command-line tool). My question is: what is the best way to compare these data sets?

For example, in one trial I ran STAR with default settings, and then I reran it with alignIntronMin set to 100 (the default is 21). So now I have two count text files:

counts_default.txt

counts_alignIntMin100.txt

I came up with a metric: I calculate the fold difference in counts for each gene between the two sets, then take the median fold difference of the 500 most changed genes. This could be called something like the Median-500-Most-Fold-Different-Genes metric. In this case the value is really low (0.0000001), probably because my command-line change had little effect.
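A metric along these lines could be sketched in Python as below. This is only an illustration, not the poster's actual code: the function names `read_counts` and `median_top_abs_log2fc` are hypothetical, and it swaps the raw fold difference for an absolute log2 fold change with a pseudocount of 1 (an assumption, added so that zero counts do not cause a division by zero). It also assumes the usual featureCounts layout: a leading `#` comment line, a header row, the gene ID in the first column, and the count in the last column.

```python
import csv
import math
import statistics


def read_counts(path):
    """Read a featureCounts table into {gene_id: count}.

    Assumes '#' comment lines, one header row, gene ID in the first
    column and the count in the last column (single-sample output).
    """
    counts = {}
    with open(path, newline="") as handle:
        data_lines = (line for line in handle if not line.startswith("#"))
        rows = csv.reader(data_lines, delimiter="\t")
        next(rows)  # skip the header row (Geneid, Chr, ..., sample)
        for row in rows:
            # float() also accepts fractional counts, if those were requested
            counts[row[0]] = float(row[-1])
    return counts


def median_top_abs_log2fc(a, b, top_n=500, pseudocount=1.0):
    """Median absolute log2 fold change over the top_n most changed genes.

    Only genes present in both tables are compared. The pseudocount is an
    assumption of this sketch, not part of the original metric.
    """
    shared = a.keys() & b.keys()
    changes = [
        abs(math.log2((b[gene] + pseudocount) / (a[gene] + pseudocount)))
        for gene in shared
    ]
    changes.sort(reverse=True)  # most changed genes first
    return statistics.median(changes[:top_n])
```

With the two files from the question, this would be called as `median_top_abs_log2fc(read_counts("counts_default.txt"), read_counts("counts_alignIntMin100.txt"))`. A value near zero means that even the most affected genes barely moved.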

However, is there a better way to assess these changes? Am I on the right track? Keep in mind I only have one RNA-seq dataset. I can't do comparisons beyond how my command line changes alter the gene counts. Any suggestions would be welcome. Thanks!

RNA-Seq STAR-aligner featureCounts • 1.9k views

updated 6.3 years ago by Charles Plessy ★ 2.9k • written 6.3 years ago by b10hazard ▴ 30

score 0 · Answer 1 · 2018-02-06

6.3 years ago

Charles Plessy ★ 2.9k

The field of differential expression analysis is quite mature now, so I recommend first looking at existing solutions, which are based on one or two decades of experience. If you do not know where to start and you are comfortable with R and Bioconductor, I would suggest having a look at their RNA-seq workflow.

As I mentioned in my post, I only have one file, so I cannot look at differential gene expression.

Reply • 6.3 years ago by b10hazard ▴ 30

You have only one file, but from it you generated multiple expression tables. Have you tried comparing those expression tables using software for differential gene expression?
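Tools of that kind generally expect a single count matrix with one row per gene and one column per run, so a first step could be to merge the per-run tables. The sketch below is a minimal illustration under stated assumptions: each table is already loaded as a `{gene_id: count}` dictionary, and the names `merge_count_tables` and `differing_genes` are hypothetical. It only builds the matrix and lists genes whose counts differ between runs. It is not a differential expression test.

```python
def merge_count_tables(tables):
    """Merge {run_label: {gene_id: count}} into a list of rows.

    The first row is the header. Genes missing from a run are given a
    count of 0, which is an assumption of this sketch.
    """
    labels = list(tables)
    genes = sorted(set().union(*(table.keys() for table in tables.values())))
    rows = [["gene_id"] + labels]
    for gene in genes:
        rows.append([gene] + [tables[label].get(gene, 0) for label in labels])
    return rows


def differing_genes(rows):
    """Return the IDs of genes whose counts are not identical across runs."""
    return [row[0] for row in rows[1:] if len(set(row[1:])) > 1]
```

The merged rows can be written out with `csv.writer` as a tab-separated matrix. The list from `differing_genes` gives a quick view of how many genes a parameter change touched at all.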

Reply • 6.3 years ago by Charles Plessy ★ 2.9k