7.5 years ago
BDK_compbio ▴ 90

After running Cufflinks on two RNA-Seq samples, I ran cuffmerge with the following command:

cuffmerge -p 8 -g <directory>/<gff file> -s <directory>/<reference fasta file> <directory>/assemblies.txt


Here assemblies.txt lists the transcripts.gtf files.

But I am getting the following error:

[Sun Jul 13 14:44:46 2014] Beginning transcriptome assembly merge
-------------------------------------------

[Sun Jul 13 14:44:46 2014] Preparing output location ./merged_asm/
[Sun Jul 13 14:46:04 2014] Converting GTF files to SAM
[Sun Jul 13 14:52:46 2014] Quantitating transcripts
Warning: Your version of Cufflinks is not up-to-date. It is recommended that you upgrade to Cufflinks v2.2.1 to benefit from the most recent features and bug fixes (http://cufflinks.cbcb.umd.edu).
Command line:
cufflinks -o ./merged_asm/ -F 0.05 -g <directory>/<gff file> -q --overhang-tolerance 200 --library-type=transfrags -A 0.0 --min-frags-per-transfrag 0 --no-5-extend -p 8 ./merged_asm/tmp/mergeSam_file3DKFYb
File ./merged_asm/tmp/mergeSam_file3DKFYb doesn't appear to be a valid BAM file, trying SAM...
Error parsing strand (?) from GFF line:
IWGSC_CSS_1DS_scaff_731014      .       repeat_region   1       174     .       ?    .Name=trf;class=trf;repeat_consensus=ATTGGTATAGAACGCATGAAGAAACTCCATACAGATGGATCTTTAGACTCACTCAATTATGAAAAAATTGAGACATGCAAACCATGTCT;type=Tandem repeats
[FAILED]

7.5 years ago
komal.rathi ★ 3.9k

You are using the wrong input file: you downloaded the GFF3 file instead of the GTF file. The correct file is: ftp://ftp.ensemblgenomes.org/pub/plants/release-22/gtf/triticum_aestivum/Triticum_aestivum.IWGSP1.22.gtf.gz

This is the GFF3 file:

##gff-version 3
IWGSC_CSS_5BS_scaff_1034127    .    repeat_region    61    200    .    +    .    Name=gnl|TREP|TREP3026;class=Unknown;repeat_consensus=N;type=Unknown
IWGSC_CSS_3AL_scaff_747250    .    repeat_region    2    200    .    +    .    Name=gnl|TREP|TREP765;class=Unknown;repeat_consensus=N;type=Unknown
IWGSC_CSS_3B_scaff_7107049    .    repeat_region    1    200    .    +    .    Name=gnl|TREP|TREP232;class=Unknown;repeat_consensus=N;type=Unknown


This is the GTF file:

IWGSC_CSS_6DL_scaff_127793    protein_coding    exon    526    645    .    +    .     gene_id "Traes_6DL_7FFFE462C"; transcript_id "Traes_6DL_7FFFE462C.2"; exon_number "1"; seqedit "false";
IWGSC_CSS_6DL_scaff_127793    protein_coding    CDS    574    645    .    +    0     gene_id "Traes_6DL_7FFFE462C"; transcript_id "Traes_6DL_7FFFE462C.2"; exon_number "1"; protein_id "Traes_6DL_7FFFE462C.2";
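
The difference between the two files can be checked before running cuffmerge. Below is a minimal Python sketch, not part of Cufflinks, that guesses whether an annotation is GFF3 or GTF from its attribute column and lists feature lines whose strand is anything other than `+`, `-` or `.`. The GFF3 specification allows `?` as a strand, which is the value the error above complains about. The function names `guess_format` and `find_bad_strands` are made up for this illustration, and the format guess is only a heuristic.

```python
VALID_STRANDS = {"+", "-", "."}


def guess_format(lines):
    """Heuristically guess 'gff3' or 'gtf' from header and attribute style."""
    for line in lines:
        if line.startswith("##gff-version 3"):
            return "gff3"
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not enough columns to inspect the attributes
        attributes = fields[8]
        # GTF attributes look like: gene_id "X"; transcript_id "Y";
        if 'gene_id "' in attributes or 'transcript_id "' in attributes:
            return "gtf"
        # GFF3 attributes look like: Name=X;class=Y
        if "=" in attributes:
            return "gff3"
    return "unknown"


def find_bad_strands(lines):
    """Return (line_number, strand) for strands other than '+', '-' or '.'."""
    bad = []
    for number, line in enumerate(lines, start=1):
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 7:
            continue  # strand column is missing on this line
        if fields[6] not in VALID_STRANDS:
            bad.append((number, fields[6]))
    return bad
```

Both functions accept any iterable of lines, so an open file handle can be passed directly, for example `find_bad_strands(open("annotation.gff3"))` with a hypothetical file name.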


Thanks a lot. Yes, I have already started running the script with the GTF file.


sbdk82, whenever you figure out the solution to your question before others do, please post it here as an answer, or accept an answer that someone else has posted, so that people can focus on other 'open' questions. Thanks!


Yes, I was about to do that, but the script was still running and I was not sure whether using the GTF file had solved the issue.


Hi,

I am new to RNA-seq. I am trying to run cuffmerge, but it is giving me an error. Could you please help me?

cuffmerge -g hg19.gtf -s hg19.fa -p 8 assemblies.txt

[Sun Jan 25 22:51:41 2015] Beginning transcriptome assembly merge
-------------------------------------------

[Sun Jan 25 22:51:41 2015] Preparing output location ./merged_asm/
[Sun Jan 25 22:51:48 2015] Converting GTF files to SAM
[Sun Jan 25 22:51:55 2015] Quantitating transcripts
Warning: Your version of Cufflinks is not up-to-date. It is recommended that you upgrade to Cufflinks v2.2.1 to benefit from the most recent features and bug fixes (http://cufflinks.cbcb.umd.edu).
Command line:
cufflinks -o ./merged_asm/ -F 0.05 -g hg19.gtf -q --overhang-tolerance 200 --library-type=transfrags -A 0.0 --min-frags-per-transfrag 0 --no-5-extend -p 8 ./merged_asm/tmp/mergeSam_fileZZUtV7
File ./merged_asm/tmp/mergeSam_fileZZUtV7 doesn't appear to be a valid BAM file, trying SAM...
[FAILED]


It seems there is an error in the GTF file. Please check whether it is the correct one.


Hi sbdk82 and komal.rathi,

I checked the GTF file too. I also switched to a different GTF file and tried again, but it still gives the same error.


Does assemblies.txt contain all the transcripts.gtf files? Did you run TopHat and Cufflinks before running cuffmerge? Try sorting all the SAM/BAM files before running Cufflinks.
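
The first question can be answered mechanically. Here is a minimal Python sketch, under the assumption that assemblies.txt holds one GTF path per line, that reports listed files that are missing or empty. The function name `check_assembly_list` is hypothetical, and relative paths are resolved against the current working directory, so the script should be run from the directory where cuffmerge is run.

```python
import os


def check_assembly_list(list_path):
    """Return (path, problem) pairs for unusable entries in the list file."""
    problems = []
    with open(list_path) as handle:
        for raw in handle:
            entry = raw.strip()
            if not entry:
                continue  # ignore blank lines in the list
            if not os.path.isfile(entry):
                problems.append((entry, "file not found"))
            elif os.path.getsize(entry) == 0:
                problems.append((entry, "file is empty"))
    return problems


if __name__ == "__main__":
    # "assemblies.txt" is the list file name used in this thread.
    if os.path.isfile("assemblies.txt"):
        for path, problem in check_assembly_list("assemblies.txt"):
            print(f"{path}: {problem}")
```

An empty result only means the listed files exist and are non-empty; it does not validate their contents.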

Yes, I got the output from TopHat and Cufflinks, and the assemblies.txt file has all the transcripts.gtf files. When I run cuffmerge, the following error occurs.

My error:
File ./merged_asm/tmp/mergeSam_fileMgq1Io doesn't appear to be a valid BAM file, trying SAM...
[FAILED]
Error: could not execute cufflinks

nikhilvgbt, sorry for the delayed response. Did you get things worked out?


Yes, it worked for me. The BAM file problem occurred simply because the chromosomes were not in a consistent order between the reference genome and the GTF file. Thank you, komal.rathi.

And komal.rathi, how did you solve this error? Posting it would give other people an answer too.
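
For anyone who wants to check the chromosome mismatch described above, here is a minimal Python sketch that compares sequence names between a reference FASTA and a GTF. It only reports differences; that a different order makes cuffmerge fail is what was observed in this thread, not something the script verifies. The function names are hypothetical.

```python
def fasta_names(lines):
    """Sequence names from '>' headers, in order of appearance."""
    names = []
    for line in lines:
        if line.startswith(">"):
            header = line[1:].split()
            if header:
                names.append(header[0])  # name ends at the first whitespace
    return names


def gtf_names(lines):
    """Sequence names from column 1, in order of first appearance."""
    ordered, known = [], set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        name = line.split("\t", 1)[0]
        if name not in known:
            known.add(name)
            ordered.append(name)
    return ordered


def compare_names(fasta, gtf):
    """Report GTF names absent from the FASTA and whether shared names agree in order."""
    fasta_set, gtf_set = set(fasta), set(gtf)
    shared_in_gtf_order = [name for name in gtf if name in fasta_set]
    shared_in_fasta_order = [name for name in fasta if name in gtf_set]
    return {
        "missing_from_fasta": [name for name in gtf if name not in fasta_set],
        "same_order": shared_in_gtf_order == shared_in_fasta_order,
    }
```

With the files from the command above this would be called as `compare_names(fasta_names(open("hg19.fa")), gtf_names(open("hg19.gtf")))`.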

7.5 years ago
Josh Herr 5.7k

You didn't give us much information to help you, but it looks like it might be a formatting error. Make sure your file has no misaligned columns or extra line endings.
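
One way to look for such formatting problems is a small script. The Python sketch below assumes the standard nine tab-separated columns of GFF/GTF and flags feature lines with a different column count, lines ending in a carriage return, and blank lines. The function name `find_format_problems` is made up for this example, and treating blank lines as a problem is a cautious assumption rather than a documented Cufflinks requirement.

```python
def find_format_problems(lines, expected_columns=9):
    """Return (line_number, description) for suspicious lines in a GFF/GTF."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if line.endswith("\r\n") or line.endswith("\r"):
            problems.append((number, "carriage return line ending"))
        content = line.rstrip("\r\n")
        if not content.strip():
            problems.append((number, "blank line"))
            continue
        if content.startswith("#"):
            continue  # comments and directives are not feature lines
        found = len(content.split("\t"))
        if found != expected_columns:
            problems.append(
                (number, f"expected {expected_columns} tab-separated columns, found {found}")
            )
    return problems
```

When reading from disk, open the file with `open(path, newline="")`; otherwise Python translates Windows line endings while reading and the carriage returns go unnoticed.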


Hi Josh,

I used the following gff file and reference file.