biolab, 10.4 years ago
    Hi everyone,
I have a draft genome FASTA file and a GFF annotation file. The GFF file looks like this:
9311_chr12      GLEAN   mRNA    17901210        17902763        0.90124 +       .       ID=9311_GLEAN_10008559;
9311_chr12      GLEAN   CDS     17901210        17901318        .       +       0       Parent=9311_GLEAN_10008559;
9311_chr12      GLEAN   CDS     17901418        17901486        .       +       2       Parent=9311_GLEAN_10008559;
9311_chr12      GLEAN   CDS     17901566        17901672        .       +       2       Parent=9311_GLEAN_10008559;
9311_chr12      GLEAN   CDS     17901722        17901755        .       +       0       Parent=9311_GLEAN_10008559;
9311_chr12      GLEAN   CDS     17902585        17902763        .       +       2       Parent=9311_GLEAN_10008559;
9311_chr04      GLEAN   mRNA    22207209        22208012        0.999282        -       .       ID=9311_GLEAN_10029041;
9311_chr04      GLEAN   CDS     22207209        22208012        .       -       0       Parent=9311_GLEAN_10029041;
My goal is to get the coding sequence of each gene (without UTRs). I can filter the GFF file to keep only the CDS features, but how do I do the next step, that is, extract the CDS sequences themselves? Thank you very much!
Sorry for re-posting. I have found the solution in the Biostars thread "Extract Cds Fastas From A Gff Annotation + Reference Sequence".
Thanks for your attention.
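
For anyone who lands here later, the general idea can be sketched in plain Python. This is only a minimal sketch and not necessarily the method from the linked thread. The function names `read_fasta` and `extract_cds` are made up for illustration. It assumes the GFF columns are tab- or whitespace-separated, that every CDS row points to its transcript with `Parent=` as in the example above, that coordinates are 1-based and inclusive, and that all CDS rows of one transcript sit on the same sequence and strand. The phase column is ignored.

```python
from collections import defaultdict

# Translation table used to complement minus-strand sequences.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def read_fasta(lines):
    """Return {sequence_id: sequence} from an iterable of FASTA lines."""
    seqs, name, chunks = {}, None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks)
            # Keep only the first word of the header as the ID.
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


def extract_cds(gff_lines, genome):
    """Return {parent_id: spliced CDS sequence} for all CDS rows."""
    parts = defaultdict(list)
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.strip().split(None, 8)
        if len(cols) < 9 or cols[2] != "CDS":
            continue
        # Column 9 holds key=value pairs separated by ';'.
        attrs = dict(kv.strip().split("=", 1)
                     for kv in cols[8].split(";") if "=" in kv)
        parent = attrs.get("Parent")
        if parent is None:
            continue
        parts[parent].append((int(cols[3]), int(cols[4]), cols[0], cols[6]))

    cds = {}
    for parent, segments in parts.items():
        segments.sort()  # order the CDS pieces by genomic start
        chrom, strand = segments[0][2], segments[0][3]
        # GFF is 1-based inclusive, Python slices are 0-based half-open.
        seq = "".join(genome[chrom][start - 1:end]
                      for start, end, _, _ in segments)
        if strand == "-":
            # Reverse-complement transcripts on the minus strand.
            seq = seq.translate(_COMPLEMENT)[::-1]
        cds[parent] = seq
    return cds
```

With real files you would pass open file handles, for example `genome = read_fasta(open("genome.fa"))` and then `extract_cds(open("annotation.gff"), genome)`, where both file names are placeholders. The result maps each `Parent` ID to its concatenated CDS, which can then be written out in FASTA format.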