How to generate FPKM values for each replicate?
6.6 years ago
Karma ▴ 310

Given two conditions (control and treated) with 4 replicates each, how can I get gene-level FPKM values for each replicate?

Control Replicates: C1, C2, C3, C4

Treated Replicates: T1, T2, T3, T4

For example, for the gene TP53 I need the following:

TP53: FPKM(C1), FPKM(C2), FPKM(C3), FPKM(C4), FPKM(T1), FPKM(T2), FPKM(T3), FPKM(T4)

I tried the following command:

cuffdiff -o diff_out -b genome.fa -p 8 -L CR,TR -u merged_asm/merged.gtf CR_R1_thout/accepted_hits.bam,CR_R2_thout/accepted_hits.bam,CR_R3_thout/accepted_hits.bam,CR_R4_thout/accepted_hits.bam TR_R1_thout/accepted_hits.bam,TR_R2_thout/accepted_hits.bam,TR_R3_thout/accepted_hits.bam,TR_R4_thout/accepted_hits.bam

But I got combined FPKM values for the two conditions, not for the individual replicates.

How can I get FPKM values for each replicate?

RNA-Seq NGS cufflinks Tophat cuffdiff • 3.1k views
6.6 years ago
Sparrow_kop ▴ 260

Hi, please note that cuffdiff is used to identify differentially expressed genes based on the individual sample FPKM values and the group information. So if you want the FPKM values for an individual sample, simply use cufflinks instead. For example:

cufflinks -g yourGTF -u -o output_name individual.bam
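
To run that command once per replicate, the eight invocations could be generated with something like the sketch below. It reuses only the flags from the command above and the BAM paths from the question; the output directory naming (`<sample>_clout`) and the function name are made up for illustration.

```python
import shlex

def cufflinks_commands(bam_by_sample, gtf):
    """Build one cufflinks command (as an argv list) per replicate BAM."""
    commands = {}
    for sample, bam in bam_by_sample.items():
        # "<sample>_clout" is a hypothetical output directory name.
        commands[sample] = ["cufflinks", "-g", gtf, "-u", "-o", f"{sample}_clout", bam]
    return commands

# BAM paths follow the layout used in the question's cuffdiff command.
bams = {f"CR_R{i}": f"CR_R{i}_thout/accepted_hits.bam" for i in range(1, 5)}
bams.update({f"TR_R{i}": f"TR_R{i}_thout/accepted_hits.bam" for i in range(1, 5)})

for sample, argv in cufflinks_commands(bams, "merged_asm/merged.gtf").items():
    # Print the commands for review; pass argv to subprocess.run to execute them.
    print(shlex.join(argv))
```

Printing the commands first makes it easy to check the paths before anything is actually run.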

Though cuffdiff is for differential expression, it also generates FPKM/RPKM values.


The data I am using are paired-end, so it will generate FPKM values.


So, if I use this, I am going to get FPKM/RPKM values for each replicate. If I create a matrix of FPKM values from the replicates and calculate the fold change, it should be equal to the fold change generated by this command:

cuffdiff -o diff_out -b genome.fa -p 8 -L CR,TR -u merged_asm/merged.gtf CR_R1_thout/accepted_hits.bam,CR_R2_thout/accepted_hits.bam,CR_R3_thout/accepted_hits.bam,CR_R4_thout/accepted_hits.bam TR_R1_thout/accepted_hits.bam,TR_R2_thout/accepted_hits.bam,TR_R3_thout/accepted_hits.bam,TR_R4_thout/accepted_hits.bam

Right?


Yes, I think so. Meanwhile, once you have the individual sample expression values, you could apply another method or statistical model to identify DE genes. You should also refer to the cufflinks package manual.
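
As a rough illustration of the fold change calculation from a per-replicate matrix, here is a minimal sketch that takes the log2 ratio of the replicate means. The FPKM numbers are invented, the function name is hypothetical, and the result may not match cuffdiff exactly, since cuffdiff applies its own normalization across samples.

```python
import math
from statistics import mean

def log2_fold_change(control_fpkm, treated_fpkm, pseudocount=0.0):
    """Naive log2(treated mean / control mean) from per-replicate FPKM values."""
    control = mean(control_fpkm) + pseudocount
    treated = mean(treated_fpkm) + pseudocount
    # With pseudocount=0 this raises an error if either mean is zero;
    # pass a small pseudocount to avoid that for lowly expressed genes.
    return math.log2(treated / control)

# Made-up FPKM values for one gene in C1..C4 and T1..T4.
control = [10.0, 12.0, 9.0, 11.0]
treated = [20.0, 22.0, 19.0, 23.0]
print(log2_fold_change(control, treated))
```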

6.6 years ago

If you inspect all the cuffdiff output files, you will find files with FPKM values for each replicate.

From cuffdiff website:

Cuffdiff calculates the expression and fragment count for each transcript, primary transcript, and gene in each replicate. The results are output in per-replicate tracking files in the format described here.

isoforms.read_group_tracking
genes.read_group_tracking
cds.read_group_tracking
tss_groups.read_group_tracking
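
To turn one of these files into the gene-by-replicate layout asked for in the question, something like the following sketch could work. It assumes the file is tab-separated with `tracking_id`, `condition`, `replicate`, and `FPKM` columns, which is how I understand the read group tracking format; check the header of your own file. The example rows are invented.

```python
import csv
import io
from collections import defaultdict

def fpkm_matrix(handle):
    """Pivot a *.read_group_tracking table into {id: {"condition_replicate": FPKM}}."""
    matrix = defaultdict(dict)
    for row in csv.DictReader(handle, delimiter="\t"):
        column = f"{row['condition']}_{row['replicate']}"
        matrix[row["tracking_id"]][column] = float(row["FPKM"])
    return dict(matrix)

# Invented rows in the assumed column layout, standing in for a real file.
example = (
    "tracking_id\tcondition\treplicate\traw_frags\tinternal_scaled_frags\t"
    "external_scaled_frags\tFPKM\teffective_length\tstatus\n"
    "TP53\tCR\t0\t100\t98.5\t98.5\t12.5\t-\tOK\n"
    "TP53\tCR\t1\t110\t105.0\t105.0\t13.0\t-\tOK\n"
    "TP53\tTR\t0\t210\t200.0\t200.0\t25.5\t-\tOK\n"
    "TP53\tTR\t1\t190\t195.0\t195.0\t24.0\t-\tOK\n"
)

matrix = fpkm_matrix(io.StringIO(example))
for gene, values in matrix.items():
    print(gene, values)
```

For real data, replace the `io.StringIO(example)` handle with `open("diff_out/genes.read_group_tracking")`, where `diff_out` is the `-o` directory from the cuffdiff command in the question.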

Otherwise,

Quantify the genes using featureCounts, feed the count matrix to edgeR, and use its rpkm() function. The featureCounts output contains the gene length in one of its columns, which edgeR needs.
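
For reference, the arithmetic behind that kind of calculation can be sketched as follows. This is the plain RPKM/FPKM formula (count × 10^9 / (gene length × total counts)) for a single sample, not a reimplementation of edgeR's rpkm(), which may use normalized library sizes. The counts and lengths are made up, and the function name is hypothetical.

```python
def fpkm_from_counts(counts, lengths_bp):
    """Per-gene FPKM for one sample from raw counts and gene lengths in bp."""
    # Library size is taken as the total of the assigned counts in this sample.
    total = sum(counts.values())
    return {
        gene: count * 1e9 / (lengths_bp[gene] * total)
        for gene, count in counts.items()
    }

# Made-up counts for one replicate and made-up gene lengths.
counts_c1 = {"geneA": 500, "geneB": 1500}
lengths = {"geneA": 1000, "geneB": 3000}
print(fpkm_from_counts(counts_c1, lengths))
```

If the counts are fragments (read pairs) rather than single reads, the same formula gives FPKM instead of RPKM.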
