I have bulk RNA-seq data (12 samples, paired-end sequenced) and my pipeline was:
- I performed quality control with FastQC,
- Trimmed reads with Trimmomatic
- Aligned reads to the reference genome with STAR
- Used Samtools to sort and index the BAM files
- Calculated counts with featureCounts
Now, I want to convert the counts into TPM as follows:
counts_to_tpm <- function(counts, featureLength, meanFragmentLength)
counts: my merged file with the hit counts from all samples
featureLength: a numeric vector with the feature lengths, which are present in my BAM file
meanFragmentLength: the mean fragment lengths
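For context, here is a minimal sketch of how I understand such a function to work. It is only an illustration under my own assumptions, not the actual implementation: the name `counts_to_tpm_sketch` is hypothetical, the effective length is assumed to be `featureLength - meanFragmentLength + 1`, features with an effective length of 1 or less are assumed to be set to `NA`, and a single `meanFragmentLength` value is assumed to be reused for all samples.

```r
# Hypothetical sketch (not the original counts_to_tpm): counts -> TPM
# using effective feature lengths.
counts_to_tpm_sketch <- function(counts, featureLength, meanFragmentLength) {
  counts <- as.matrix(counts)
  stopifnot(nrow(counts) == length(featureLength))

  # Assumption: a single value is recycled so every sample (column) has one.
  meanFragmentLength <- rep_len(meanFragmentLength, ncol(counts))

  tpm <- matrix(NA_real_, nrow(counts), ncol(counts),
                dimnames = dimnames(counts))

  for (i in seq_len(ncol(counts))) {
    # Effective length: feature length corrected by the mean fragment length.
    effLen <- featureLength - meanFragmentLength[i] + 1

    # Assumption: features too short for the mean fragment stay NA.
    ok <- effLen > 1

    # Reads per effective base, then scaled so each sample sums to one million.
    rate <- rep(NA_real_, length(effLen))
    rate[ok] <- counts[ok, i] / effLen[ok]
    tpm[, i] <- rate / sum(rate, na.rm = TRUE) * 1e6
  }
  tpm
}

# Toy example with made-up numbers (3 features, 2 samples).
counts <- matrix(c(100, 200, 300, 50, 400, 150), nrow = 3,
                 dimnames = list(c("g1", "g2", "g3"), c("s1", "s2")))
featureLength <- c(1000, 2000, 3000)
meanFragmentLength <- c(200, 220)

tpm <- counts_to_tpm_sketch(counts, featureLength, meanFragmentLength)
colSums(tpm)  # each sample should sum to about one million
```

In this sketch the fragment length only enters through the effective length, which is why I need to know which value to supply for it.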
Is it correct that meanFragmentLength can be calculated with CollectRnaSeqMetrics (Picard) or with picardmetrics? And do I have to run it for every sample in my dataset, or, given that they were sequenced together, is one sample enough? I guess the mean length for a given gene should be the same regardless of the sample, or am I wrong and did not understand the role of Picard?
I did try to run it for one sample, but I am confused about which value from the output I have to use as meanFragmentLength in the code above to get the TPM. I got a txt file which looks like this:
Apologies if this is again a stupid question!
Thank you for the help!