I have pseudo-aligned RNA-seq reads with Kallisto, where each sample was sequenced across two flow cells.
I pseudo-aligned the fastq files for each flow cell separately, meaning for each sample I have two abundance TSV files that look like this:
    target_id          length  eff_length  est_counts  tpm
    ENST00000434970.2  9       7           0           0
    ENST00000448914.1  13      11          0           0
    ENST00000415118.1  8       6           1           0
    ENST00000631435.1  12      10          0           0
    ENST00000632684.1  12      10          0           0
I would like to merge each sample's two TSV files into one and aggregate the transcript-level counts to gene-level counts with tximport in R.
I'm unsure about the best way to approach this. The length column is the same in both TSV files, and the est_counts just need to be added together, but I'm not sure about eff_length and tpm. Does tximport require this information?
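To make the question concrete, here is a rough sketch of the merge I have in mind, written in plain Python for illustration. Summing est_counts is the part I'm fairly sure about. The handling of eff_length (a count-weighted mean of the two runs) is purely my assumption and not something Kallisto or tximport documents, and tpm is then recomputed from the merged values using the usual definition (est_counts / eff_length, rescaled to sum to one million). The function names `parse_abundance` and `merge_abundance` are made up for this example.

```python
import csv
import io


def parse_abundance(text):
    """Parse the text of a Kallisto-style abundance TSV into a dict keyed by target_id."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {row["target_id"]: row for row in reader}


def merge_abundance(run_a, run_b):
    """Merge two per-flow-cell abundance tables for the same sample.

    ASSUMPTION: eff_length is combined as a count-weighted mean of the two
    runs (plain mean when both counts are zero). This is a guess, not a
    documented Kallisto/tximport procedure.
    """
    merged = {}
    for tid, a in run_a.items():
        b = run_b[tid]  # both runs used the same index, so the IDs should match
        counts_a, counts_b = float(a["est_counts"]), float(b["est_counts"])
        eff_a, eff_b = float(a["eff_length"]), float(b["eff_length"])
        total = counts_a + counts_b
        if total > 0:
            eff = (counts_a * eff_a + counts_b * eff_b) / total
        else:
            eff = (eff_a + eff_b) / 2
        merged[tid] = {"length": a["length"], "eff_length": eff, "est_counts": total}

    # Recompute TPM: per-transcript rate = est_counts / eff_length,
    # then rescale so the rates sum to 1e6.
    denom = sum(m["est_counts"] / m["eff_length"] for m in merged.values())
    for m in merged.values():
        rate = m["est_counts"] / m["eff_length"]
        m["tpm"] = rate / denom * 1e6 if denom > 0 else 0.0
    return merged
```

With two files per sample, this would be called as `merge_abundance(parse_abundance(text_a), parse_abundance(text_b))`, where `text_a` and `text_b` are the contents of the two abundance files. Whether this is a sound way to combine eff_length and tpm is exactly what I'm unsure about.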
I would appreciate some advice.