A puzzling warning when running Cufflinks
9.3 years ago
Ding Wenchao ▴ 10

Hi, all

I'm using Cufflinks to estimate expression at the isoform level.

The command is:

cufflinks -p 8 -o out -g Homo_sapiens.GRCh38.78.gtf -b Homo_sapiens.GRCh38.dna.primary_assembly.fa -u  Aligned.sortedByCoord.out.bam

The GTF file and the genome FASTA were downloaded from the Ensembl FTP site.

I get this warning for every line that represents a gene:

Warning: could not parse ID or Parent from GFF line

I'm wondering whether something is wrong with my GTF file, or whether the cause is something else?

Every answer will be appreciated!

Thanks in advance!

RNA-Seq • 3.8k views

The "gene" lines in annotation files seem to cause problems with Cufflinks. You can try removing them:

Remove "gene" entries (this is what I use)
https://groups.google.com/d/msg/tuxedo-tools-users/FTKA4qozJIc/p47AwnCXxvwJ

Remove everything except "exon" entries:
https://groups.google.com/d/msg/tuxedo-tools-users/V9eI65dVzyU/loGGxs3ev1oJ
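As a rough illustration of the first option, here is a minimal Python sketch that drops every line whose feature column (the third tab-separated field) is "gene" and keeps comments, transcripts, exons and everything else. It is not the script from the linked threads; the function names and the output file name in the usage note are made up for this example.

```python
def drop_gene_lines(gtf_lines):
    """Yield GTF lines, skipping those whose feature type (column 3) is 'gene'."""
    for line in gtf_lines:
        # Header/comment lines (e.g. "#!genome-build ...") are passed through.
        if line.startswith("#"):
            yield line
            continue
        fields = line.split("\t")
        # Column 3 of a GTF record is the feature type.
        if len(fields) >= 3 and fields[2] == "gene":
            continue
        yield line


def filter_gtf(in_path, out_path):
    """Write a copy of in_path to out_path without 'gene' records."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(drop_gene_lines(src))
```

Usage would be something like filter_gtf("Homo_sapiens.GRCh38.78.gtf", "Homo_sapiens.GRCh38.78.no_gene.gtf"), then passing the new file to -g. The second file name is hypothetical.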

9.3 years ago
Ram 43k

You might want to check the chromosome naming in the GTF and primary assembly files. The GTF uses 1, 2, ... naming, and I'm not sure those names match the FASTA sequence IDs.
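One way to run that check is sketched below in Python, assuming the sequence name is the first tab-separated column of each GTF record and the first whitespace-delimited token after ">" in each FASTA header. The function names are invented for this example.

```python
def gtf_seqnames(lines):
    """Collect the sequence names (column 1) used in GTF records."""
    return {
        line.split("\t", 1)[0]
        for line in lines
        if line.strip() and not line.startswith("#")
    }


def fasta_seqnames(lines):
    """Collect sequence IDs: the first token after '>' on each header line."""
    return {
        line[1:].split()[0]
        for line in lines
        if line.startswith(">") and len(line.strip()) > 1
    }


def names_missing_from_fasta(gtf_lines, fasta_lines):
    """Return GTF sequence names that have no matching FASTA record."""
    return gtf_seqnames(gtf_lines) - fasta_seqnames(fasta_lines)
```

Both helpers accept any iterable of lines, so open file handles can be passed directly. An empty result means every sequence name in the GTF also appears in the FASTA.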


The chromosome names in both files are consistent. The warning occurred in the initial step, "Loading reference annotation", and only for the gene lines, not the transcript or exon lines.

Thank you all the same.

9.3 years ago

Consider iGenomes: http://support.illumina.com/sequencing/sequencing_software/igenome.html

They provide Ensembl annotation prepared for the Tuxedo packages.
