Question

Coefficient of variation

0

6.6 years ago

nicoles ▴ 10

I am a newbie and come from a wet-lab background. Recently, we started doing RNA-Seq. Originally, our bioinformatics core was going to handle the analysis, but then that person went on sabbatical, so I started using Galaxy to analyze our data. My PI has set parameters (based on the literature) that must be met before proceeding with GO terms. One of the conditions is to include only genes with a CV less than or equal to 0.5. Can I do this in Galaxy? If not, could someone please tell me how I could do so manually?

I went through TopHat, Cufflinks, Cuffcompare and Cuffdiff based on a colleague's recommendation. I also have a separate workflow of htseq-count followed by DESeq2.

Any help will be greatly appreciated.

Thanks!

RNA-Seq Galaxy Coefficient of variation. • 8.4k views

ADD COMMENT • link updated 6.6 years ago by Renesh ★ 2.2k • written 6.6 years ago by nicoles ▴ 10

score 3 · Answer 1 · 2017-09-15

3

6.6 years ago

Renesh ★ 2.2k

CV calculations are useful if you want to select stably and consistently expressed genes from your RNA-seq datasets. The calculation is straightforward and involves only the standard deviation and the mean of each gene: CV = SD / mean. The CV tells you how variable a gene's expression is relative to its mean. Your PI is asking you to include only the genes that are stably expressed across replicates/experiments, i.e. those with a low CV (≤ 0.5).

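I am not sure whether Galaxy can do basic statistical calculations on tabular data. To calculate the CV, you can use a database such as PostgreSQL (psql) or a spreadsheet such as Excel. You can calculate the CV from the htseq-count data (ideally after normalizing for library size, since raw counts also vary with sequencing depth) and then proceed to the DESeq package. Most gene expression packages estimate a dispersion, which is closely related to the CV.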
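
For anyone who would rather do this in R than in a spreadsheet, here is a minimal sketch of the filter described above. It assumes the normalized expression values sit in a numeric matrix with genes in rows and samples in columns; the matrix `expr`, the gene names and all values are made up for illustration and are not from the original post.

```r
# Hypothetical normalized expression matrix: genes in rows, samples in columns.
expr <- matrix(
  c(100, 110,  90, 105,   # geneA: stable
     10,  50,   5,  80,   # geneB: variable
    200, 210, 190, 205,   # geneC: stable
      0,   0,   0,   0),  # geneD: not expressed (mean of 0)
  nrow = 4, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC", "geneD"),
                  paste0("sample", 1:4))
)

gene_mean <- rowMeans(expr)
gene_sd   <- apply(expr, 1, sd)

# CV = SD / mean; genes with a mean of 0 give NaN and are dropped below.
cv <- gene_sd / gene_mean

# Keep genes with a defined CV of at most 0.5.
keep   <- !is.na(cv) & cv <= 0.5
stable <- expr[keep, , drop = FALSE]
```

With these made-up numbers, `stable` keeps geneA and geneC. Whether the CV should be computed across all samples or within each condition is a design decision the thread does not settle, so treat the single-matrix layout here as an assumption.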

ADD COMMENT • link 6.6 years ago by Renesh ★ 2.2k

0

Thank you. I'll calculate it with the htseq-count data. Is it also acceptable to calculate the SD and mean from the Cufflinks FPKM values? I ask for my own understanding and so that I can explain it to my PI.

ADD REPLY • link 6.6 years ago by nicoles ▴ 10

0

Yes, you can also calculate CV from FPKM. FPKM is also a normalized count.

ADD REPLY • link 6.6 years ago by Renesh ★ 2.2k

0

I want to extract unstable/inconsistently expressed genes from gene expression data, and I calculated the CV as follows:

SD <- apply(eset_HTA20,1, sd)
CV <- base::sqrt(exp(SD^2)-1)

but I got this unusual result in terms of CV value range:

> summary(CV)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
 0.04753  0.12946  0.16494  0.20181  0.22925 15.00777

I thought the CV should not be more than 1; please correct me if I am wrong. Also, how can I retain the genes that show a high amount of variation in expression level? Any ideas?
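
On the question above: the CV is not capped at 1. Since CV = SD / mean, any gene whose SD is larger than its mean has a CV above 1. The formula `sqrt(exp(SD^2) - 1)` is the CV of a log-normal variable and expects `SD` to be the standard deviation of the natural-log values. A rough sketch follows, assuming (this is not stated in the post) that the expression set is on a log2 scale, in which case the SD would first be multiplied by `log(2)`. The matrix `log2_expr`, the cutoff and all values are hypothetical.

```r
# Hypothetical log2-scale expression matrix (genes x samples); values are made up.
log2_expr <- rbind(
  stable   = c(8.0, 8.1, 7.9, 8.0, 8.2, 7.8),
  variable = c(4.0, 12.0, 5.0, 11.0, 3.0, 13.0)
)

# SD of the log2 values, converted to the natural-log scale that the
# log-normal formula CV = sqrt(exp(sigma^2) - 1) expects.
sd_log2 <- apply(log2_expr, 1, sd)
sd_ln   <- sd_log2 * log(2)
cv_lognormal <- sqrt(exp(sd_ln^2) - 1)

# Direct CV on the linear scale for comparison: SD / mean of 2^x.
linear_expr <- 2^log2_expr
cv_direct   <- apply(linear_expr, 1, sd) / rowMeans(linear_expr)

# Retain the highly variable genes, here those above an illustrative cutoff.
cv_cutoff <- 1
high_var  <- log2_expr[cv_lognormal > cv_cutoff, , drop = FALSE]
```

In this toy example the `variable` gene has a CV above 1 by both calculations, although the two estimates differ a lot because the log-normal formula grows very quickly with the SD. Ranking genes by CV (or by variance) and keeping the top fraction is a common alternative to a fixed cutoff.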

ADD REPLY • link 4.8 years ago by Jurat Shahidin ▴ 100

score 0 · Answer 2 · 2017-09-15

0

6.6 years ago

nicoles ▴ 10

Thank you for replying, Kevin. I am trying to learn bioinformatics for myself and our lab. It is definitely an essential skill to have. Once I have obtained the raw counts for my RNA-Seq samples from Kallisto, can I then determine differentially expressed genes with DESeq2? Could I use DESeq2 through Galaxy after I obtain the counts from Kallisto? Thanks!

ADD COMMENT • link 6.6 years ago by nicoles ▴ 10

0

I hope that a tool like Galaxy accepts Kallisto-derived counts, or at least a custom matrix of counts. However, if the HTSeq option is already built into Galaxy, then you should stick to HTSeq. As far as I recall, you will then have to align the reads to produce a BAM file, from which HTSeq counts the reads per gene (Kallisto and other modern tools don't require a BAM alignment).

There is a great thread here for RNA-seq and Galaxy, which you may have already seen: https://galaxyproject.org/tutorials/rb_rnaseq/
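
As an illustration of the "custom matrix of counts" idea, here is a simplified base-R sketch that sums Kallisto's per-transcript estimated counts to gene level and rounds them to integers. The two small data frames stand in for Kallisto `abundance.tsv` files (which have the columns `target_id`, `length`, `eff_length`, `est_counts` and `tpm`); the transcript IDs, gene names, mapping and numbers are all hypothetical. In practice the Bioconductor package tximport is designed for this step, so treat the code below as a sketch of the idea and not as a replacement for it.

```r
# Hypothetical stand-ins for two Kallisto abundance.tsv files
# (only the two columns used here are included).
sample1 <- data.frame(target_id  = c("tx1", "tx2", "tx3"),
                      est_counts = c(10.4, 5.3, 20.0))
sample2 <- data.frame(target_id  = c("tx1", "tx2", "tx3"),
                      est_counts = c(12.1, 0.0, 18.6))

# Hypothetical transcript-to-gene mapping.
tx2gene <- c(tx1 = "geneA", tx2 = "geneA", tx3 = "geneB")

# Sum the estimated counts per gene for one sample.
gene_counts <- function(abundance) {
  tapply(abundance$est_counts, tx2gene[abundance$target_id], sum)
}

# Gene-by-sample matrix, rounded because DESeq2 expects integer counts.
counts <- round(cbind(sample1 = gene_counts(sample1),
                      sample2 = gene_counts(sample2)))
```

The resulting `counts` matrix has one row per gene and one column per sample, which is the general shape a count-based tool such as DESeq2 takes as input.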

ADD REPLY • link 6.6 years ago by Kevin Blighe 87k

1

Yes, I did need the BAM files for htseq-count. As there will be more RNA-seq data coming, I would like to learn quicker methods of quantification. In the near future I'll find out whether Galaxy accepts the Kallisto counts. The tutorial has helped greatly.

ADD REPLY • link 6.6 years ago by nicoles ▴ 10