Cuffmerge running error
7.9 years ago
hana ▴ 190

Hi all

I want to use cuffmerge to merge assemblies made with cufflinks (with Ensembl GRCh38 references). I am running the command:

cuffmerge -o /home/ra/cuffmerge -g /home/ra/Ensemble_GRCH38/Homo_sapiens.GRCh38.77.gtf /home/ra/cuffmerge/assemblies.txt

and I get the following output:

Error: duplicate GFF ID 'ENST00000361547' encountered!
 [FAILED]
Error: could not execute gtf_to_sam

I tried:

sudo chmod +xr /usr/bin/gtf_to_sam

and I still get the same error.

I downloaded the reference FASTA and GTF files from the Ensembl database.

How can I fix this error?

Thank you

RNA-Seq • 7.3k views

You need to check the SAM header and whether the SAM file is sorted.

If the header is missing, or if it contains no @SQ records, the same error will be shown.

The alignments must be sorted lexicographically by chromosome name and then by position.

hth
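
The two checks described above (at least one @SQ record in the header, and alignments sorted lexicographically by chromosome name and then by position) can be sketched in Python. This is only a minimal illustration working on plain SAM text; the function name `check_sam_text` is hypothetical, and skipping unmapped reads (reference name `*`) is an assumption made here, not something stated in the thread.

```python
def check_sam_text(sam_text):
    """Return a list of problems found in SAM-formatted text (empty if none)."""
    header_lines = []
    sort_keys = []
    for line in sam_text.splitlines():
        if not line.strip():
            continue
        if line.startswith("@"):
            header_lines.append(line)
            continue
        fields = line.split("\t")
        # Assumption: skip malformed lines and unmapped reads (RNAME is "*").
        if len(fields) < 4 or fields[2] == "*":
            continue
        # Key: reference name compared as a string, position as an integer.
        sort_keys.append((fields[2], int(fields[3])))

    problems = []
    if not any(h.startswith("@SQ") for h in header_lines):
        problems.append("no @SQ records in header")
    if sort_keys != sorted(sort_keys):
        problems.append("alignments are not sorted by reference name and position")
    return problems
```

For a BAM file, the text could come from `samtools view -h`, which prints the header followed by the alignments; an empty list returned here means neither problem was detected.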


My alignment result is in BAM format. Should I convert it to SAM and sort it?


Yes, check it by viewing it with samtools.


Hi

This is the header of my BAM file:

@SQ    SN:chr1    LN:248956422
@SQ    SN:chr10    LN:133797422
@SQ    SN:chr11    LN:135086622
@SQ    SN:chr12    LN:133275309
@SQ    SN:chr13    LN:114364328
@SQ    SN:chr14    LN:107043718
@SQ    SN:chr15    LN:101991189
@SQ    SN:chr16    LN:90338345
@SQ    SN:chr17    LN:83257441
@SQ    SN:chr18    LN:80373285
@SQ    SN:chr19    LN:58617616
@SQ    SN:chr2    LN:242193529
@SQ    SN:chr20    LN:64444167
@SQ    SN:chr21    LN:46709983
@SQ    SN:chr22    LN:50818468
@SQ    SN:chr3    LN:198295559
@SQ    SN:chr4    LN:190214555
@SQ    SN:chr5    LN:181538259
@SQ    SN:chr6    LN:170805979
@SQ    SN:chr7    LN:159345973
@SQ    SN:chr8    LN:145138636
@SQ    SN:chr9    LN:138394717
@SQ    SN:chrM    LN:16569
@SQ    SN:chrX    LN:156040895
@SQ    SN:chrY    LN:57227415

I still get the same error.

Would you please let me know what I can do to solve it?

Thank you


Do you still have two errors?

  1. Duplicated ID

    Check your reference GTF and the text file that lists your assembled GTFs: are the GTF files listed one per line?

    Did you use the same GTF for cufflinks?

  2. gtf_to_sam

    See if it is in your PATH; if not, add it: PATH=$PATH:/PATH/TO/gtf_to_sam (where /PATH/TO/gtf_to_sam is the directory containing the executable)

    Try running gtf_to_sam separately on transcripts.gtf, just to check whether there is an issue with gtf_to_sam itself.
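
Both checks in the list above can be scripted. The following is a minimal Python sketch, not part of cufflinks: `check_assembly_list` and `find_tool` are hypothetical helper names, and the only assumption about the list file is the one stated above (one GTF path per line).

```python
import os
import shutil


def check_assembly_list(list_path):
    """Return (paths, missing): the paths listed one per line in the
    assembly list file, and the subset that does not exist on disk."""
    with open(list_path) as handle:
        paths = [line.strip() for line in handle if line.strip()]
    missing = [p for p in paths if not os.path.isfile(p)]
    return paths, missing


def find_tool(name="gtf_to_sam"):
    """Return the full path of an executable found on PATH, or None."""
    return shutil.which(name)
```

If `find_tool()` returns `None`, the executable is not reachable through PATH; if `missing` is non-empty, the list file points at GTF files that are not where it says they are.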


gtf_to_sam is in my PATH. I ran it separately on transcripts.gtf and got the same error, as shown below:

gtf_to_sam -r /home/ra/GRCH38/GRCh38.fa transcripts.gtf out.sam
[11:24:48] Loading reference annotation.
Error: duplicate GFF ID 'ENST00000361547.4' encountered!
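
To see which transcript IDs might trigger this message, the GTF can be scanned directly. The thread does not say exactly what gtf_to_sam counts as a duplicate, so the sketch below rests on an assumption: an ID is flagged if it appears on more than one `transcript` feature line, or on more than one (chromosome, strand) pair. The function name `find_duplicate_transcripts` is hypothetical.

```python
import re
from collections import defaultdict

# Matches the transcript_id attribute in column 9 of a GTF line.
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def find_duplicate_transcripts(gtf_lines):
    """Return a sorted list of transcript IDs that look duplicated."""
    transcript_rows = defaultdict(int)  # ID -> count of 'transcript' lines
    locations = defaultdict(set)        # ID -> {(seqname, strand), ...}
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue
        match = TRANSCRIPT_ID.search(fields[8])
        if match is None:
            continue
        tid = match.group(1)
        locations[tid].add((fields[0], fields[6]))
        if fields[2] == "transcript":
            transcript_rows[tid] += 1
    return sorted(
        tid for tid in locations
        if transcript_rows[tid] > 1 or len(locations[tid]) > 1
    )
```

Passing an open file handle works, for example `find_duplicate_transcripts(open("transcripts.gtf"))`. Multiple exon lines sharing one transcript ID are normal in a GTF and are not flagged.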

Dear Hana,

I understand it is very annoying that gtf_to_sam runs but there is a problem with transcripts.gtf.

I think this is a bug in the latest versions of cufflinks (I guess older versions do not have this issue but the newer ones do).

From my point of view there are two possible remedies:

  1. Switch to an older version of cufflinks, and write to Cole Trapnell about this.

or

  2. Edit the file to remove/reduce the duplicates. There could be scientific consequences when doing this, so consider carefully.

Dear Manu

Thank you so much for your reply. I have sent an email to Cole Trapnell about this issue.

Would you please tell me how I can remove duplicates from transcripts.gtf file?

thanks in advance


A simple R script might work in this case (but please double-check with someone that this approach is okay; maybe someone in this forum will chime in if there is an alternative).

For example:

# id.col is the column number that contains the Ensembl transcript ID
new.gtf <- old.gtf[!duplicated(old.gtf[, id.col]), ]
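
A more conservative variant of the same idea, sketched here in Python rather than R, drops only records that are exact repeats of an earlier line. This is a different and narrower technique than deduplicating on the transcript ID column: it leaves the normal case of several exon lines sharing one transcript ID untouched, but it also does nothing about two different records that reuse the same ID. The function name `drop_repeated_lines` is hypothetical, and the caution above about scientific consequences still applies.

```python
def drop_repeated_lines(lines):
    """Return the lines in their original order, keeping only the first
    copy of each record. Comment lines and blank lines are always kept."""
    seen = set()
    kept = []
    for line in lines:
        key = line.rstrip("\n")
        is_record = bool(key.strip()) and not key.startswith("#")
        if is_record:
            if key in seen:
                continue  # exact repeat of an earlier record: skip it
            seen.add(key)
        kept.append(line)
    return kept
```

Comparing the number of lines before and after gives a quick sense of how many exact repeats the file contained.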

Thank you. I will try it.


I have the same problem here! The version of cufflinks is 2.2.1...

When I ran gtf_to_sam -r /share2/hlibyar/tophat/Am_genome.fa 101_S1_L001_R1_001/101_S1_L001_R1_001.gtf out.sam, it worked

The bam file is sorted.

pls help!!

[hlibyar@login02 cufflinks]$ cuffmerge -s /share2/hlibyar/tophat/Am_genome.fa assemblies.txt
[Wed Nov  4 13:40:09 2015] Beginning transcriptome assembly merge
-------------------------------------------
[Wed Nov  4 13:40:09 2015] Preparing output location ./merged_asm/
Warning: no reference GTF provided!
[Wed Nov  4 13:41:35 2015] Converting GTF files to SAM
[13:41:35] Loading reference annotation.
GFF Error: duplicate/invalid 'transcript' feature ID=GB42164-RA
        [FAILED]
Error: could not execute gtf_to_sam

Did you combine several assembled GTF files into one before running cuffmerge? I had the same problem before because I used cat to combine two GTF files into one.
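
Instead of concatenating the GTF files, each assembly can be listed on its own line in the text file that is passed to cuffmerge, as in the original post's assemblies.txt. A minimal Python sketch follows; the helper name `write_assembly_list` and the example paths are hypothetical.

```python
def write_assembly_list(gtf_paths, list_path):
    """Write one GTF path per line to list_path, without merging the files."""
    with open(list_path, "w") as handle:
        for path in gtf_paths:
            handle.write(path + "\n")


# Hypothetical example paths, one cufflinks output directory per sample:
# write_assembly_list(
#     ["sample1/transcripts.gtf", "sample2/transcripts.gtf"],
#     "assemblies.txt",
# )
```

Each GTF stays a separate file this way, so transcript IDs that repeat across samples never end up inside a single input file.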


Hi Hana,

I am running cuffmerge using cufflinks v2.1.1 and I ran into the exact same problem you mentioned above. I was wondering if you ever figured it out? Also, which version of cufflinks were you using (I understand it has been a while since you made the post)?

Thank you,

RQF


Hi, Hana,

How did you solve your problem? I got the exact same one. I really need your suggestions.

Thanks!

Best,
Ellie


Hello Hana and Ellie,

I also got the exact same problem while running Cuffmerge. I previously didn't encounter this problem, but this time I turned on the RABT option in Cufflinks. Could it be RABT causing the problem for us? Did you turn that option on? Thanks.

Jamie


Hi all,

I have the exact same problem running cuffmerge v2.1.1 and also v2.2.1. I tried running gtf_to_sam separately and it didn't work either. I keep getting the same error:

Error: duplicate GFF ID 'transcript:ENST00000466300' encountered!

Did anyone find a solution for this?

Thanks,
Olivia


I have the same errors and need help. This is my input and output:

cuffmerge -p 8 -g '/home/Mus_musculus.GRCm38.85_Modified.gff3' '/home/Cufflinks_Output/6_months/assembly_GTF_list.txt' 

[Sun Sep 11 20:02:16 2016] Beginning transcriptome assembly merge
-------------------------------------------

[Sun Sep 11 20:02:16 2016] Preparing output location ./merged_asm/
[Sun Sep 11 20:02:29 2016] Converting GTF files to SAM
[20:02:29] Loading reference annotation.
GFF Error: duplicate/invalid 'transcript' feature ID=transcript:ENSMUST00000045689
    [FAILED]
Error: could not execute gtf_to_sam

If anyone knows how to fix this, can you please reply?


I have the same problem. Has anyone solved it?
