Error: duplicate GFF ID 'ENSMUST00000105372' encountered! [FAILED] Error: could not execute cuffcompare
7.5 years ago
jolin0701-dy ▴ 100

I just got an error from cuffmerge:

$ ~/programs/cufflinks-2.1.1.OSX_x86_64/cuffmerge -g ~/GRCm38_86/mouse.gtf -s ~/GRCm38_86/mouse.fa assemblies.txt

[Mon Oct 10 19:28:48 2016] Beginning transcriptome assembly merge
-------------------------------------------

[Mon Oct 10 19:28:48 2016] Preparing output location ./merged_asm/
[Mon Oct 10 19:28:58 2016] Converting GTF files to SAM
[19:28:58] Loading reference annotation.
[19:28:59] Loading reference annotation.
[Mon Oct 10 19:29:02 2016] Quantitating transcripts
Warning: Your version of Cufflinks is not up-to-date. It is recommended that you upgrade to Cufflinks v2.2.1 to benefit from the most recent features and bug fixes (http://cufflinks.cbcb.umd.edu).
Command line:
cufflinks -o ./merged_asm/ -F 0.05 -g ~/GRCm38_86/mouse.gtf -q --overhang-tolerance 200 --library-type=transfrags -A 0.0 --min-frags-per-transfrag 0 --no-5-extend -p 1 ./merged_asm/tmp/mergeSam_tmp.2.PBd67n 
[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
File ./merged_asm/tmp/mergeSam_tmp.2.PBd67n doesn't appear to be a valid BAM file, trying SAM...
[19:29:02] Loading reference annotation.
[19:29:18] Inspecting reads and determining fragment length distribution.
Processed 39590 loci.                       
> Map Properties:
>   Normalized Map Mass: 105710.00
>   Raw Map Mass: 105710.00
>   Fragment Length Distribution: Truncated Gaussian (default)
>                 Default Mean: 200
>              Default Std Dev: 80
[19:29:20] Assembling transcripts and estimating abundances.

8:119910359-124345724   Warning: Skipping large bundle.
Processed 39589 loci.                       
[Mon Oct 10 19:44:26 2016] Comparing against reference file ~/GRCm38_86/mouse.gtf
Warning: Your version of Cufflinks is not up-to-date. It is recommended that you upgrade to Cufflinks v2.2.1 to benefit from the most recent features and bug fixes (http://cufflinks.cbcb.umd.edu).
Error: duplicate GFF ID 'ENSMUST00000105372' encountered!
    [FAILED]
Error: could not execute cuffcompare

Any suggestions? Thanks so much~~

rna-seq • 3.4k views

I would suggest using the latest version of Cufflinks, although that might not resolve your issue. Many researchers have run into the same problem. It will work if you delete the duplicate entries from mouse.gtf. A similar issue was resolved by dhir_kumar on the SEQanswers forum (http://seqanswers.com/forums/showthread.php?t=22692) using:

`awk '!/Selenocysteine/' Homo_sapiens.GRCh38.76.gtf >Homo_sapiens.GRCh38.76.gtf_seleno_filtered`

Your GFF ID is also related to "Selenocysteine". I looked at the Mus_musculus.GRCm38.84.gtf annotation file and found 62 entries related to Selenocysteine, so the solution above will probably work for you too.
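If it helps, here is a rough sketch of the same idea wrapped in two small shell functions, so you can count the Selenocysteine lines before removing them. The function names and the output file name are made up for illustration, and the filter uses the same pattern as the awk one-liner above (it drops every line that mentions Selenocysteine), so check the count first to make sure you are not removing more than you expect.

```shell
# Hypothetical helper: print how many lines in a GTF mention "Selenocysteine".
count_selenocysteine() {
    awk '/Selenocysteine/ { n++ } END { print n + 0 }' "$1"
}

# Hypothetical helper: write a copy of the GTF ($1) to a new file ($2),
# leaving out every line that mentions "Selenocysteine".
# Same pattern as the one-liner quoted from the SEQanswers thread.
filter_selenocysteine() {
    awk '!/Selenocysteine/' "$1" > "$2"
}
```

You could then run something like `count_selenocysteine ~/GRCm38_86/mouse.gtf`, followed by `filter_selenocysteine ~/GRCm38_86/mouse.gtf ~/GRCm38_86/mouse.seleno_filtered.gtf` (the output name is just an example), and pass the filtered file to cuffmerge with `-g`.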

7.5 years ago
jolin0701-dy ▴ 100

I just fixed it by upgrading to Cufflinks 2.2.1.
