I would like to obtain TPM values for the GSE102073 study. The raw data I downloaded from GEO are featureCounts output files.
First part of the file:
# Program:featureCounts v1.4.3-p1; Command:"/data/NYGC/Software/Subread/subread-1.4.3-p1-Linux-x86_64/bin/featureCounts" "-s" "2" "-a" "/data/NYGC/Resources/ENCODE/Gencode/gencode.v18.annotation.gtf" "-o" "/data/analysis/LevineD/Project_LEV_01204_RNA_2014-01-30/Sample_JB4853/featureCounts/Sample_JB4853_counts.txt" "/data/analysis/LevineD/Project_LEV_01204_RNA_2014-01-30/Sample_JB4853/STAR_alignment/Sample_JB4853_Aligned.out.WithReadGroup.sorted.bam"
Geneid Chr Start End Strand Length /data/analysis/LevineD/Project_LEV_01204_RNA_2014-01-30/Sample_JB4853/STAR_alignment/Sample_JB4853_Aligned.out.WithReadGroup.sorted.bam
ENSG00000223972.4 chr1;chr1;chr1;chr1 11869;12595;12975;13221 12227;12721;13052;14412 +;+;+;+ 1756 0
ENSG00000227232.4 chr1;chr1;chr1;chr1;chr1;chr1;chr1;chr1;chr1;chr1;chr1;chr1;chr1 14363;14970;15796;16607;16854;17233;17498;17602;17915;18268;24734;29321;29534 14829;15038;15947;16765;1705
How can I convert this into TPM values?
I tried the method from this post, but it requires a counts file, which I don't have access to. I also looked at this post, but I am confused about how to use tximport to get the TPM values, and about the input variables featureLength and meanFragmentLength.
Thank you.
This file is your counts file, isn't it?
featureCounts file
Yes, I think the featureCounts file is your counts file.
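For reference, TPM can be computed directly from a table like the one quoted in the question: divide each gene's count by its length, then rescale so that the values for the sample sum to one million. Below is a minimal Python sketch, assuming the file is tab-delimited (as featureCounts normally writes it) and that the last column holds the counts for a single sample. The function name featurecounts_to_tpm and the file name in the usage note are hypothetical. It uses the Length column as the gene length rather than an effective length corrected for mean fragment length, so the values may differ somewhat from what a tximport-style calculation would give.

```python
import csv


def featurecounts_to_tpm(handle):
    """Return {gene_id: TPM} from a single-sample featureCounts table.

    Assumes tab-delimited input with a 'Length' column and the read
    counts in the last column (an assumption about the file layout).
    """
    # Skip the leading '# Program:featureCounts ...' comment line(s).
    data_lines = (line for line in handle if not line.startswith("#"))
    rows = csv.reader(data_lines, delimiter="\t")

    header = next(rows)
    length_col = header.index("Length")
    count_col = len(header) - 1  # last column = counts for the BAM file

    # Reads per base for each gene: count / gene length.
    rates = {}
    for row in rows:
        if not row:
            continue
        gene_id = row[0]
        length = float(row[length_col])
        count = float(row[count_col])
        rates[gene_id] = count / length

    total = sum(rates.values())
    if total == 0:
        raise ValueError("all counts are zero; TPM is undefined")

    # Scale so that the TPM values of the sample sum to one million.
    return {gene_id: rate / total * 1e6 for gene_id, rate in rates.items()}
```

Usage would look something like `with open("Sample_counts.txt") as fh: tpm = featurecounts_to_tpm(fh)`, where the file name is a placeholder for one of the downloaded per-sample files. Each sample in the study would be processed separately this way.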