Question: Error: duplicate GFF ID 'ENSMUST00000105372' encountered! [FAILED] Error: could not execute cuffcompare
jolin0701-dy wrote (19 months ago):

I just got this error from cuffmerge:

$ ~/programs/cufflinks-2.1.1.OSX_x86_64/cuffmerge -g ~/GRCm38_86/mouse.gtf -s ~/GRCm38_86/mouse.fa assemblies.txt

[Mon Oct 10 19:28:48 2016] Beginning transcriptome assembly merge
-------------------------------------------

[Mon Oct 10 19:28:48 2016] Preparing output location ./merged_asm/
[Mon Oct 10 19:28:58 2016] Converting GTF files to SAM
[19:28:58] Loading reference annotation.
[19:28:59] Loading reference annotation.
[Mon Oct 10 19:29:02 2016] Quantitating transcripts
Warning: Your version of Cufflinks is not up-to-date. It is recommended that you upgrade to Cufflinks v2.2.1 to benefit from the most recent features and bug fixes (http://cufflinks.cbcb.umd.edu).
Command line:
cufflinks -o ./merged_asm/ -F 0.05 -g ~/GRCm38_86/mouse.gtf -q --overhang-tolerance 200 --library-type=transfrags -A 0.0 --min-frags-per-transfrag 0 --no-5-extend -p 1 ./merged_asm/tmp/mergeSam_tmp.2.PBd67n 
[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
File ./merged_asm/tmp/mergeSam_tmp.2.PBd67n doesn't appear to be a valid BAM file, trying SAM...
[19:29:02] Loading reference annotation.
[19:29:18] Inspecting reads and determining fragment length distribution.
Processed 39590 loci.                       
> Map Properties:
>   Normalized Map Mass: 105710.00
>   Raw Map Mass: 105710.00
>   Fragment Length Distribution: Truncated Gaussian (default)
>                 Default Mean: 200
>              Default Std Dev: 80
[19:29:20] Assembling transcripts and estimating abundances.

8:119910359-124345724   Warning: Skipping large bundle.
Processed 39589 loci.                       
[Mon Oct 10 19:44:26 2016] Comparing against reference file ~/GRCm38_86/mouse.gtf
Warning: Your version of Cufflinks is not up-to-date. It is recommended that you upgrade to Cufflinks v2.2.1 to benefit from the most recent features and bug fixes (http://cufflinks.cbcb.umd.edu).
Error: duplicate GFF ID 'ENSMUST00000105372' encountered!
    [FAILED]
Error: could not execute cuffcompare

Any suggestions? Thanks so much~~


I would suggest using the latest version of Cufflinks, although it might not resolve your issue. Many researchers have run into the same problem. It will work if you delete the duplicate entries from mouse.gtf. A similar issue was resolved by dhir_kumar on the SEQanswers forum (http://seqanswers.com/forums/showthread.php?t=22692) using:

`awk '!/Selenocysteine/' Homo_sapiens.GRCh38.76.gtf >Homo_sapiens.GRCh38.76.gtf_seleno_filtered`

Your GFF ID is also related to "Selenocysteine". I looked at the Mus_musculus.GRCm38.84.gtf annotation file and found 62 entries related to Selenocysteine, so the solution above will probably work for you too once the file names are changed to your mouse.gtf.
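A minimal sketch of how the same filter could be applied to a mouse annotation, run here on a tiny made-up GTF so that it is self-contained. The toy records, the IDs `G1`/`T1`/`G2`/`T2`, and the output file name are illustrative assumptions, not taken from the real Ensembl file; with real data, `gtf` would point at `~/GRCm38_86/mouse.gtf`. It counts the Selenocysteine feature lines, lists the transcripts they belong to, and then drops every line mentioning Selenocysteine, as in the awk command above.

```shell
# Toy stand-in for ~/GRCm38_86/mouse.gtf (all records below are made up).
workdir=$(mktemp -d)
gtf="$workdir/mouse.gtf"
filtered="$workdir/mouse.seleno_filtered.gtf"   # hypothetical output name

{
  printf '1\thavana\ttranscript\t100\t900\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n'
  printf '1\thavana\texon\t100\t900\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n'
  printf '1\thavana\tSelenocysteine\t400\t402\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n'
  printf '1\thavana\ttranscript\t2000\t2500\t.\t-\t.\tgene_id "G2"; transcript_id "T2";\n'
  printf '1\thavana\texon\t2000\t2500\t.\t-\t.\tgene_id "G2"; transcript_id "T2";\n'
} > "$gtf"

# Count lines whose feature column (field 3) is Selenocysteine.
n_seleno=$(awk -F'\t' '$3 == "Selenocysteine" { n++ } END { print n + 0 }' "$gtf")
echo "Selenocysteine lines: $n_seleno"

# List the transcript IDs those lines belong to.
seleno_ids=$(awk -F'\t' '$3 == "Selenocysteine"' "$gtf" \
  | grep -o 'transcript_id "[^"]*"' | sort -u)
echo "$seleno_ids"

# Same filter as the awk command above: drop every line containing "Selenocysteine".
awk '!/Selenocysteine/' "$gtf" > "$filtered"
n_kept=$(awk 'END { print NR }' "$filtered")
echo "Lines kept: $n_kept"
```

The filtered file would then be passed to cuffmerge with `-g` in place of the original annotation.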

Comment written 19 months ago by Persistent LABS
jolin0701-dy wrote (19 months ago):

I just fixed it by upgrading to Cufflinks 2.2.1.
