Question: Cuffdiff terminated after GffObj::getSpliced() error
nalandaatmi (United States) wrote, 4.6 years ago:

Dear All,

I am currently analyzing mouse RNA-Seq samples. I used TopHat 2.1.1 and Cufflinks 2.2.1, and I downloaded the latest mm10 build (genome and GTF) from iGenomes. I am running into problems at the following steps.

Cuffmerge step:

I noticed the following error messages in the cuffmerge log file, although the merged file (merged.gtf) is still generated in the output directory.

"Error (GFaSeqGet): end coordinate (117274415) cannot be larger than sequence length 115169878
Error (GFaSeqGet): end coordinate (117981028) cannot be larger than sequence length 115169878

...

Error (GFaSeqGet): end coordinate (85529519) cannot be larger than sequence length 59373566"

Cuffdiff step:

All the output files in the cuffdiff directory are empty. When I checked the cuffdiff log file, I noticed the following error messages.

"Error (GFaSeqGet): end coordinate (117135884) cannot be larger than sequence length 115169878

…..

Error (GFaSeqGet): end coordinate (61176309) cannot be larger than sequence length 59128983
Error (GFaSeqGet): end coordinate (61228418) cannot be larger than sequence length 59128983

This contig will not be bias corrected.
Warning: couldn't find fasta record for 'chrUn_JH584304'!
This contig will not be bias corrected.
GffObj::getSpliced() error: improper genomic coordinate 3078823 on chrX for TCONS_00034613
"

Tags: cuffmerge, cuffdiff, rnaseq
Devon Ryan (Freiburg, Germany) wrote, 4.6 years ago:

The error message is surprisingly informative :)

Somehow the merged GTF file is invalid or at least inconsistent with what you're feeding into cuffdiff. Figure out which chromosome has a length of 115169878 (either look in a BAM header or the .fai file made by "samtools faidx") and use awk to confirm that there are entries in the merged GTF file that are beyond that end position. You might then check in the GTF files made by cufflinks to see if that occurs there as well. I should note that I suspect you used multiple fasta files, where the chromosome lengths differ between them.
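
The check described above could be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not a tested recipe for this dataset: `gtf_vs_fai` is a made-up helper name, and the file names in the usage comments are assumed from the commands posted in this thread. It relies on the `.fai` layout (sequence name in column 1, length in column 2) and the GTF layout (sequence name in column 1, end coordinate in column 5).

```shell
# Sketch: report GTF records that do not fit the reference described by a .fai index.
# gtf_vs_fai is a hypothetical helper name, not part of samtools or Cufflinks.
gtf_vs_fai() {
    fai="$1"   # index written by `samtools faidx`: column 1 = name, column 2 = length
    gtf="$2"   # GTF to check: column 1 = seqname, column 5 = end coordinate
    awk -F '\t' '
        FNR == NR    { len[$1] = $2; next }               # first file: remember each length
        /^#/         { next }                             # skip GTF comment lines
        !($1 in len) { print "UNKNOWN_SEQ\t" $0; next }   # seqname absent from the FASTA
        $5 + 0 > len[$1] + 0 { print "PAST_END\t" $0 }    # end lies beyond the sequence
    ' "$fai" "$gtf"
}

# Assumed usage (file names taken from the thread, adjust to your layout):
#   samtools faidx mousegenome.fa      # writes mousegenome.fa.fai
#   gtf_vs_fai mousegenome.fa.fai cuffmerge_out/merged.gtf | cut -f 1,2 | sort | uniq -c
```

Running the same function on each Cufflinks transcripts.gtf and on the reference GTF should show where the out-of-range records first appear. `UNKNOWN_SEQ` lines correspond to the "couldn't find fasta record" warnings.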


Dear Devon, thanks for getting back to me. I really appreciate it.

Sure, I will check the files as you suggested. I used the following commands in my analysis:

Tophat command:

$ tophat -p 12 -G mousegenes.gtf --library-type fr-unstranded -o tophat_out mousegenome R1.fastq R2.fastq

Cufflinks command:

$ cufflinks -p 12 -G mousegenes.gtf --library-type fr-unstranded -o cufflinks_out tophat_out/accepted_hits.bam

Cuffmerge command:

$ cuffmerge -p 12 -g mousegenes.gtf -o cuffmerge_out -s mousegenome.fa assemblylist.txt
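
One way to test the suspicion that the index given to tophat and the FASTA given to cuffmerge describe different sequences is to compare the lengths in the BAM header with the `.fai` index. The following is a rough sketch under assumptions: `sq_vs_fai` is a hypothetical helper name, and it expects SAM header text on standard input, as printed by `samtools view -H`.

```shell
# Sketch: compare @SQ lengths from a SAM/BAM header (stdin) with a .fai index.
# sq_vs_fai is a hypothetical helper name, not a samtools command.
# Output columns: sequence name, length in the header, length in the .fai (NA if absent).
sq_vs_fai() {
    fai="$1"
    awk -F '\t' '
        FNR == NR { len[$1] = $2; next }          # first input: the .fai file
        $1 == "@SQ" {
            name = ""; ln = ""
            for (i = 2; i <= NF; i++) {           # pick out the SN: and LN: fields
                if ($i ~ /^SN:/) name = substr($i, 4)
                if ($i ~ /^LN:/) ln = substr($i, 4)
            }
            if (!(name in len))             print name "\t" ln "\tNA"
            else if (len[name] + 0 != ln + 0) print name "\t" ln "\t" len[name]
        }
    ' "$fai" -
}

# Assumed usage (paths taken from the commands above):
#   samtools view -H tophat_out/accepted_hits.bam | sq_vs_fai mousegenome.fa.fai
```

No output would mean every sequence in the header has the same length in the FASTA. Any line printed points to a chromosome that differs between the two references.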
Reply written 4.6 years ago by nalandaatmi (modified 7 months ago by RamRS)

Dear Devon,

When I used the Ensembl mouse GTF file, I didn't encounter this error. But with the UCSC mouse GTF file, I hit the same kind of error in a different mouse project.

A colleague on my team did the RNA-Seq analysis using the UCSC mouse GTF file and generated some output files. His gene_exp.diff output has 1400 significant genes, but when I redid the analysis using the Ensembl mouse GTF file, I got only 600. I can't check with him now because he has moved elsewhere. Is the difference in gene counts due to the Ensembl GTF file?

Reply written 4.6 years ago by nalandaatmi (modified 7 months ago by RamRS)

Possibly. It's impossible to say without knowing exactly what was done before.
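
Before comparing the two runs, it may help to confirm that both counts were obtained the same way. A minimal sketch, assuming each gene_exp.diff is tab-delimited with a header row containing a column named `significant` that holds `yes` or `no`. `count_significant` is a hypothetical helper name, and the directory names in the usage comments are placeholders.

```shell
# Sketch: count rows flagged "yes" in the "significant" column of a tab-delimited table.
# count_significant is a hypothetical helper; it locates the column by its header name.
count_significant() {
    awk -F '\t' '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "significant") col = i; next }
        col && $col == "yes" { n++ }      # only count when the column was found
        END { print n + 0 }               # prints 0 if nothing matched
    ' "$1"
}

# Assumed usage (placeholder directory names):
#   count_significant ucsc_run/gene_exp.diff
#   count_significant ensembl_run/gene_exp.diff
```

This only checks the counting step. It says nothing about why the two annotations give different results.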

Reply written 4.6 years ago by Devon Ryan