Consistency in the GTF files used for the index building of STAR alignment and RSEM
16 months ago

Hello, I want to use RSEM to get the transcript counts from STAR alignment results. I have built the STAR indices with the GTF from UCSC and got the BAM files after alignment. I wonder if I could still use these BAM files if I want to use the Gencode GTF in rsem-prepare-reference.

Furthermore, when are the GTF files used in STAR index building and further feature quantification (like RSEM and featureCounts) interchangeable?

Thank you very much!

RSEM STAR alignment rna-seq gtf • 827 views
16 months ago
Ram 43k

I don't think you can use different GTF files with STAR --runMode genomeGenerate and RSEM rsem-calculate-expression. When run with its --star option, rsem-prepare-reference internally calls STAR --runMode genomeGenerate with the --sjdbGTFfile parameter set to your GTF input. It also creates a *.transcripts.fa file whose sequence names are the transcript identifiers from that GTF file. Downstream alignments are reported against this transcriptome, so your transcriptome BAM will have these transcripts as its contigs. That makes a BAM produced with one annotation useless for the other unless you extract the reads and realign them to the other transcriptome. It'd be easier to just run rsem-calculate-expression against the new transcriptome.

Furthermore, when are the GTF files used in STAR index building and further feature quantification (like RSEM and featureCounts) interchangeable?

What do you mean by "when"? I don't see the time component to this question. Even if you're asking for specific conditions, the question does not make sense. Pipelines need to consistently use the same set of reference files such as FASTA and GTF.
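To see how far apart two annotations are, one rough check is to compare the transcript identifiers they define. The sketch below is only an illustration: the helper names (transcript_ids, compare_annotations) and the toy records are made up for this example, and it assumes standard nine-column, tab-separated GTF lines with a transcript_id "..." attribute.

```python
import re

# Matches the transcript_id attribute in the ninth (attributes) column of a GTF line.
_TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def transcript_ids(gtf_lines):
    """Collect the transcript_id values found in an iterable of GTF lines."""
    ids = set()
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        match = _TRANSCRIPT_ID.search(fields[8])
        if match:
            ids.add(match.group(1))
    return ids


def compare_annotations(gtf_a_lines, gtf_b_lines):
    """Return the transcript IDs shared by, and unique to, each annotation."""
    a = transcript_ids(gtf_a_lines)
    b = transcript_ids(gtf_b_lines)
    return {"shared": a & b, "only_a": a - b, "only_b": b - a}


# Toy records with invented IDs, standing in for a UCSC-style and a Gencode-style GTF.
ucsc_like = [
    'chr1\tucsc\texon\t100\t200\t.\t+\t.\tgene_id "NM_000001"; transcript_id "NM_000001";\n',
]
gencode_like = [
    'chr1\tHAVANA\texon\t100\t200\t.\t+\t.\t'
    'gene_id "ENSG00000000001.1"; transcript_id "ENST00000000001.1";\n',
]

result = compare_annotations(ucsc_like, gencode_like)
print(len(result["shared"]), len(result["only_a"]), len(result["only_b"]))
```

With real files you would pass open file handles instead of the toy lists, e.g. compare_annotations(open("a.gtf"), open("b.gtf")). If "shared" comes back empty or small, a transcriptome BAM built against one annotation will not line up with an RSEM reference built from the other.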
