Question: How to create a combined GTF from the GRCh38 GTF and ERCC92.gtf
3.9 years ago by
adam0 wrote:


I am trying to incorporate ERCC spike-ins into my RNA-seq experiments. Everything I have read says to just append the ERCC GTF (ERCC92.gtf) to my other GTF (GRC38.83.gtf). I tried that, but I am now getting an error from featureCounts saying that my new GTF is not usable.

When I compare the ERCC92.gtf with the GRC38 format, they are totally different, so it is not surprising that simply appending doesn't work.

Has anyone been able to get these two types of GTFs to work together?


rna-seq • 1.9k views
ADD COMMENTlink modified 3.9 years ago by neelablore0 • written 3.9 years ago by adam0
3.9 years ago by
neelablore0 wrote:

ADD COMMENTlink written 3.9 years ago by neelablore0

Hi adam, are you able to post the first couple of lines of your GRC38.83.gtf file? Are you working on a Mac? There could be an issue with the line-ending characters.

I have been able to merge the two quite easily using cat on the command line.
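For anyone who wants to rule out the line-ending issue while merging, here is a minimal Python sketch of the same concatenation. It is not what I ran (I just used cat); the function name merge_gtf and the output file name are made up for illustration. It reads each file with universal newlines, so Windows or old Mac line endings are rewritten as "\n", and it makes sure the last line of one file does not run into the first line of the next.

```python
def merge_gtf(inputs, output):
    """Concatenate GTF files into one, normalising line endings to "\\n".

    `inputs` is a list of paths, written to `output` in the given order.
    """
    # newline="\n" forces Unix line endings on write, whatever the platform.
    with open(output, "w", encoding="utf-8", newline="\n") as out:
        for path in inputs:
            # Default text mode translates "\r\n" and "\r" to "\n" on read.
            with open(path, "r", encoding="utf-8") as fh:
                for line in fh:
                    line = line.rstrip("\n")
                    if not line.strip():
                        continue  # skip blank lines
                    # Always terminate the line, even if the input file
                    # lacked a trailing newline.
                    out.write(line + "\n")


# Example call (file names taken from this thread; the output name is hypothetical):
# merge_gtf(["GRC38.83.gtf", "ERCC92.gtf"], "GRC38.83_ERCC92.gtf")
```

The order of the inputs does not matter for the format itself; putting the Ensembl file first just keeps its header comment lines at the top.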

ADD REPLYlink written 3.9 years ago by ncl.wong0

Thank you for your suggestion. I have pasted the first lines of output from OS X terminal below. There are "#" symbols before the "!" of the first few lines, but they were doing weird things to the formatting here, so I removed them in this post. I also added new lines after each entry here, to help with the formatting.

Also, before receiving your answer, I looked more closely at the files, and what I thought were differences between the two turn out not to be differences after all. I now believe them to be compatible. Alignment with STAR works great when I use indices created from the GRCh38.83.gtf file alone. However, if I run it with indices created from the concatenated GRC and ERCC files, STAR looks like it is running, but the Aligned.out.sam file reaches 3.6 KB very quickly and then stays that size for at least an hour. I killed STAR at that point, so I am not sure what would happen if I let it continue to run.

!genome-build GRCh38.p5

!genome-version GRCh38

!genome-date 2013-12

!genome-build-accession NCBI:GCA_000001405.20

!genebuild-last-updated 2015-10

1 havana gene 11869 14409 . + . gene_id "ENSG00000223972"; gene_version "5"; gene_name "DDX11L1"; gene_source "havana"; gene_biotype "transcribed_unprocessed_pseudogene"; havana_gene "OTTHUMG00000000961"; havana_gene_version "2";

1 havana transcript 11869 14409 . + . gene_id "ENSG00000223972"; gene_version "5"; transcript_id "ENST00000456328"; transcript_version "2"; gene_name "DDX11L1"; gene_source "havana"; gene_biotype "transcribed_unprocessed_pseudogene"; havana_gene "OTTHUMG00000000961"; havana_gene_version "2"; transcript_name "DDX11L1-002"; transcript_source "havana"; transcript_biotype "processed_transcript"; havana_transcript "OTTHUMT00000362751"; havana_transcript_version "1"; tag "basic"; transcript_support_level "1";
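A rough way to compare the two files beyond eyeballing the first lines is sketched below in Python. It assumes the standard GTF layout of nine tab-separated fields per feature line, with "#" marking comment lines; the function name summarize_gtf is hypothetical. It reports which sequence names each file uses in column 1 and which line numbers do not have nine fields.

```python
from collections import Counter


def summarize_gtf(path):
    """Return (Counter of sequence names, list of malformed line numbers)."""
    seqnames = Counter()
    bad_lines = []
    with open(path, "r", encoding="utf-8") as fh:
        for number, line in enumerate(fh, start=1):
            line = line.rstrip("\n")
            # Skip blank lines and header/comment lines such as "#!genome-build".
            if not line or line.startswith("#"):
                continue
            fields = line.split("\t")
            if len(fields) != 9:
                bad_lines.append(number)  # not a well-formed GTF feature line
                continue
            seqnames[fields[0]] += 1
    return seqnames, bad_lines


# Example call on the concatenated file (hypothetical file name):
# names, bad = summarize_gtf("GRC38.83_ERCC92.gtf")
# print(sorted(names), bad[:10])
```

If the merged file comes back with no malformed lines, the next thing to look at would presumably be whether every name in column 1 also exists as a sequence in the genome FASTA used to build the index, but that is a guess on my part rather than a confirmed cause of the stalled alignment.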

ADD REPLYlink modified 3.9 years ago • written 3.9 years ago by adam0