Hello, I am trying to extract coding sequences from a GFF file and the genome FASTA file using gffread. It works well with "gffread my.gff -g genome.assembly.fasta -x cds.fa". But after I add the "-J" parameter, it reports the error "Error (GFaSeqGet): end coordinate (76134) cannot be larger than sequence length 76132". I manually checked the annotation of the contig with length 76132 and found no feature with the end coordinate 76134. The maximum end coordinate is 76132.
About the "-J" parameter:
-J discard any mRNAs that either lack initial START codon or the terminal STOP codon, or have an in-frame stop codon (only print mRNAs with a full, valid CDS)
Has anybody met the same problem?
Thanks
Hi. I have the same issue. Did you find any solution?
Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized.
I had the same problem... For me the "easiest" way is to run gffread without the -J parameter, then use a custom script to check that each sequence starts with "ATG" and ends with a stop codon. Additionally, you have to look for stop codons inside the sequence.
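In case it helps, here is a minimal sketch of such a custom script in plain Python (no Biopython needed). It is not what gffread does internally, just an approximation of the checks described above. Assumptions on my side: the standard genetic code (stop codons TAA, TAG, TGA), "ATG" as the only accepted start codon, and an extra requirement that the CDS length is a multiple of 3. The function names are made up for this example.

```python
import sys

# Assumption: standard genetic code; adjust for other translation tables.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def is_complete_cds(seq):
    """True if seq starts with ATG, ends with a stop codon,
    and has no in-frame stop codon in between."""
    seq = seq.upper()
    # Assumption: a complete CDS is a whole number of codons (start + stop at least).
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    if codons[0] != "ATG" or codons[-1] not in STOP_CODONS:
        return False
    # Look for stop codons inside the sequence (excluding the terminal one).
    return not any(codon in STOP_CODONS for codon in codons[1:-1])


def filter_complete(lines):
    """Return the (header, sequence) records that pass is_complete_cds."""
    return [(h, s) for h, s in read_fasta(lines) if is_complete_cds(s)]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python filter_cds.py cds.fa > cds.complete.fa
    with open(sys.argv[1]) as handle:
        for header, seq in filter_complete(handle):
            print(f">{header}\n{seq}")
```

Run it on the cds.fa produced by "gffread my.gff -g genome.assembly.fasta -x cds.fa" (the script name filter_cds.py is just a placeholder). Sequences are written on a single line each, and anything with ambiguous bases in the first or last codon is discarded.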