Calculating FPKM for large number of samples mapped to co-assembly
4.2 years ago
arla_21 • 0

Hi

I am very new to this kind of analysis, so please be kind, and I apologise if I have left out any information.

I am interested in calculating the abundance of carbohydrate-active enzyme sequences in each of my samples, using a co-assembly as the reference.

I have co-assembled my samples with MEGAHIT and mapped the reads of each sample back to the co-assembly with Bowtie. I have also used dbCAN to annotate the co-assembly against the carbohydrate-active enzyme database. I then used htseq-count to count the number of reads mapped to each gene in each sample. I therefore have the counts for each sample, but I am unsure how to normalise them. I also have a GTF file with all the gene calls for the co-assembly, which looks like:

argelvor_000000000001   PROKKA  CDS 2   304 .   +   .   gene_id 1_1
argelvor_000000000002   PROKKA  CDS 1   168 .   -   .   gene_id 2_1
argelvor_000000000003   PROKKA  CDS 1   384 .   +   .   gene_id 3_1
argelvor_000000000004   PROKKA  CDS 1   321 .   +   .   gene_id 4_1
argelvor_000000000005   PROKKA  CDS 30  530 .   -   .   gene_id 5_1
argelvor_000000000006   PROKKA  CDS 1   96  .   +   .   gene_id 6_1
argelvor_000000000007   PROKKA  CDS 1   558 .   +   .   gene_id 7_1
argelvor_000000000008   PROKKA  CDS 2   484 .   -   .   gene_id 8_1
argelvor_000000000009   PROKKA  CDS 2   142 .   +   .   gene_id 9_1
argelvor_000000000009   PROKKA  CDS 191 343 .   +   .   gene_id 9_2

I also have a standard count matrix in which gene IDs are rows and samples are columns. Is it possible to calculate FPKM from this information, ideally in an automated way? I am most comfortable in R but would welcome any suggestions.
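
To make the question concrete, here is a minimal sketch of the calculation I have in mind, written in Python only for illustration. It assumes the usual definition, FPKM = count × 10^9 / (gene length in bp × total counted fragments in the sample), takes gene length as end - start + 1 from the GTF, and uses the column sum of the count matrix as the library size (the total number of mapped reads per sample may be a better denominator). The function names `gene_lengths_from_gtf` and `fpkm` and the example counts are hypothetical; only the three GTF lines come from my file.

```python
def gene_lengths_from_gtf(gtf_text):
    """Return {gene_id: length in bp}, summing CDS features that share a gene_id."""
    lengths = {}
    for line in gtf_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        # Columns: seqname, source, feature, start, end, score, strand, frame, attributes
        fields = line.split(None, 8)
        if len(fields) < 9 or fields[2] != "CDS":
            continue
        start, end = int(fields[3]), int(fields[4])
        attrs = fields[8].split()
        # Accept both `gene_id 1_1` and the quoted form `gene_id "1_1";`
        gene_id = attrs[attrs.index("gene_id") + 1].strip('";')
        lengths[gene_id] = lengths.get(gene_id, 0) + (end - start + 1)
    return lengths


def fpkm(counts, lengths):
    """counts: {gene_id: [count per sample]}. Returns FPKM in the same layout.

    Rows starting with "__" (htseq-count summary rows such as __no_feature)
    are dropped. Library size is the per-sample sum of the remaining rows,
    which is an assumption, not the only valid choice.
    """
    genes = {g: row for g, row in counts.items() if not g.startswith("__")}
    n_samples = len(next(iter(genes.values())))
    totals = [sum(row[j] for row in genes.values()) for j in range(n_samples)]
    result = {}
    for gene, row in genes.items():
        result[gene] = [
            c * 1e9 / (lengths[gene] * totals[j]) if totals[j] else 0.0
            for j, c in enumerate(row)
        ]
    return result


# First three gene calls from the GTF above; the counts are made up.
gtf_text = """\
argelvor_000000000001 PROKKA CDS 2 304 . + . gene_id 1_1
argelvor_000000000002 PROKKA CDS 1 168 . - . gene_id 2_1
argelvor_000000000003 PROKKA CDS 1 384 . + . gene_id 3_1
"""
counts = {
    "1_1": [303, 0],
    "2_1": [168, 10],
    "3_1": [529, 0],
    "__no_feature": [50, 5],
}

lengths = gene_lengths_from_gtf(gtf_text)
fpkm_values = fpkm(counts, lengths)
for gene, values in fpkm_values.items():
    print(gene, values)
```

If this is the right idea, the same arithmetic should carry over directly to an R matrix with a vector of gene lengths, which is what I would prefer to use in practice.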

Once I have the FPKM values, I can then use the gene IDs to map them to the dbCAN output.

Thanks

metagenomics Assembly next-gen FPKM sequencing