I have an annotation file in GFF3 format, but I no longer have the amino acid and CDS sequences. Is there a tool that can regenerate those sequence files from a genome in FASTA format and a GFF3 file?
Thank you in advance
For example, the Arabidopsis group does it: ftp://ftp.arabidopsis.org/home/tair/Genes/TAIR10_genome_release/TAIR10_gff3/TAIR10_GFF3_genes.gff . Can AGAT do anything close to it?
That shouldn't be the only result. That one is the 10000th gene, but if you grep for "AT1G00000000001" you will find the first one. I just checked and it works for me. The only problem is that the IDs are not assigned in the same order in which the results are printed at the end, so the first line of the output does not necessarily carry the first number. I will open an issue in the repo to improve that for the next release.
I found it in my last chromosome:
NbV1Ch19 AUGUSTUS gene 97401 99254 0.03 - . ID=AT1G00000000001
NbV1Ch19 AUGUSTUS mRNA 97401 99254 0.03 - . ID=AT1M00000000001;Parent=AT1G00000000001
NbV1Ch19 AUGUSTUS exon 97401 99007 . - . ID=AT1E00000000001;Parent=AT1M00000000001
NbV1Ch19 AUGUSTUS exon 99101 99254 . - . ID=AT1E00000000002;Parent=AT1M00000000001
NbV1Ch19 AUGUSTUS CDS 98823 99007 0.36 - 2 ID=AT1C00000000001;Parent=AT1M00000000001
NbV1Ch19 AUGUSTUS CDS 99101 99230 0.68 - 0 ID=AT1C00000000002;Parent=AT1M00000000001
NbV1Ch19 AUGUSTUS five_prime_utr 99231 99254 0.25 - . ID=AT1F00000000001;Parent=AT1M00000000001
NbV1Ch19 AUGUSTUS intron 99008 99100 0.69 - . ID=AT1I00000000001;Parent=AT1M00000000001
NbV1Ch19 AUGUSTUS start_codon 99228 99230 . - 0 ID=AT1S00000000001;Parent=AT1M00000000001
NbV1Ch19 AUGUSTUS stop_codon 98823 98825 . - 0 ID=AT1ST00000000001;Parent=AT1M00000000001
NbV1Ch19 AUGUSTUS three_prime_utr 97401 98822 0.05 - . ID=AT1T00000000001;Parent=AT1M00000000001
Why is there such a big difference in ID numbers between a gene and its sub-features?
NbV1Ch01 AUGUSTUS gene 97932 99714 0.06 - . ID=AT1G00000068467
NbV1Ch01 AUGUSTUS mRNA 97932 99714 0.06 - . ID=AT1M00000076570;Parent=AT1G00000068467
NbV1Ch01 AUGUSTUS exon 97932 98571 . - . ID=AT1E00000339808;Parent=AT1M00000076570
NbV1Ch01 AUGUSTUS exon 98679 98844 . - . ID=AT1E00000339809;Parent=AT1M00000076570
NbV1Ch01 AUGUSTUS exon 99134 99325 . - . ID=AT1E00000339810;Parent=AT1M00000076570
NbV1Ch01 AUGUSTUS exon 99417 99714 . - . ID=AT1E00000339811;Parent=AT1M00000076570
NbV1Ch01 AUGUSTUS CDS 98177 98571 1 - 2 ID=AT1C00000294005;Parent=AT1M00000076570
NbV1Ch01 AUGUSTUS CDS 98679 98844 1 - 0 ID=AT1C00000294006;Parent=AT1M00000076570
NbV1Ch01 AUGUSTUS CDS 99134 99325 1 - 0 ID=AT1C00000294007;Parent=AT1M00000076570
NbV1Ch01 AUGUSTUS CDS 99417 99668 0.65 - 0 ID=AT1C00000294008;Parent=AT1M00000076570
NbV1Ch01 AUGUSTUS five_prime_utr 99669 99714 0.14 - . ID=AT1F00000101217;Parent=AT1M00000076570
NbV1Ch01 AUGUSTUS intron 98572 98678 1 - . ID=AT1I00000123933;Parent=AT1M00000076570
NbV1Ch01 AUGUSTUS intron 98845 99133 1 - . ID=AT1I00000123934;Parent=AT1M00000076570
NbV1Ch01 AUGUSTUS intron 99326 99416 1 - . ID=AT1I00000123935;Parent=AT1M00000076570
NbV1Ch01 AUGUSTUS start_codon 99666 99668 . - 0 ID=AT1S00000057436;Parent=AT1M00000076570
NbV1Ch01 AUGUSTUS stop_codon 98177 98179 . - 0 ID=AT1ST00000057445;Parent=AT1M00000076570
NbV1Ch01 AUGUSTUS three_prime_utr 97932 98176 0.44 - . ID=AT1T00000096168;Parent=AT1M00000076570
Because each feature type (3rd column) is numbered independently. Here is an example:
NbV1Ch01 AUGUSTUS gene 97932 99714 0.06 - . ID=gene1
NbV1Ch01 AUGUSTUS mRNA 97932 99714 0.06 - . ID=mRNA1
NbV1Ch01 AUGUSTUS exon 97932 98571 . - . ID=exon1
NbV1Ch01 AUGUSTUS exon 98679 98844 . - . ID=exon2
NbV1Ch01 AUGUSTUS exon 99134 99325 . - . ID=exon3
NbV1Ch01 AUGUSTUS exon 99417 99714 . - . ID=exon4
NbV1Ch01 AUGUSTUS CDS 98177 98571 1 - 2 ID=cds1
NbV1Ch01 AUGUSTUS CDS 98679 98844 1 - 0 ID=cds2
NbV1Ch01 AUGUSTUS CDS 99134 99325 1 - 0 ID=cds3
NbV1Ch01 AUGUSTUS CDS 99417 99668 0.65 - 0 ID=cds4
NbV1Ch01 AUGUSTUS mRNA 97935 99711 0.06 - . ID=mRNA2
NbV1Ch01 AUGUSTUS exon 97935 98571 . - . ID=exon5
NbV1Ch01 AUGUSTUS exon 98679 98844 . - . ID=exon6
NbV1Ch01 AUGUSTUS exon 99134 99325 . - . ID=exon7
NbV1Ch01 AUGUSTUS exon 99417 99711 . - . ID=exon8
NbV1Ch01 AUGUSTUS CDS 98177 98571 1 - 2 ID=cds5
NbV1Ch01 AUGUSTUS CDS 98679 98844 1 - 0 ID=cds6
NbV1Ch01 AUGUSTUS CDS 99134 99325 1 - 0 ID=cds7
NbV1Ch01 AUGUSTUS gene 109665 112554 0.04 - . ID=gene2
NbV1Ch01 AUGUSTUS mRNA 109665 112554 0.04 - . ID=mRNA3
NbV1Ch01 AUGUSTUS exon 109665 110489 . - . ID=exon9
NbV1Ch01 AUGUSTUS exon 110608 111042 . - . ID=exon10
NbV1Ch01 AUGUSTUS exon 111592 111844 . - . ID=exon11
NbV1Ch01 AUGUSTUS exon 112128 112554 . - . ID=exon12
NbV1Ch01 AUGUSTUS CDS 109839 110489 0.69 - 0 ID=cds8
NbV1Ch01 AUGUSTUS CDS 110608 111042 0.21 - 0 ID=cds9
NbV1Ch01 AUGUSTUS CDS 111592 111844 0.23 - 1 ID=cds10
NbV1Ch01 AUGUSTUS CDS 112128 112450 0.95 - 0 ID=cds11
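To make the numbering scheme above concrete, here is a minimal Python sketch of per-feature-type counters. It is only an illustration of the idea, not AGAT's actual code: the function name `number_by_feature_type` is hypothetical, and the ID prefix is simply taken verbatim from the 3rd column (so a CDS gets `CDS1` rather than `cds1`).

```python
from collections import defaultdict


def number_by_feature_type(gff_lines):
    """Yield (feature_type, new_id) pairs, numbering each feature type independently.

    Hypothetical helper for illustration only; not part of AGAT.
    """
    counters = defaultdict(int)  # one independent counter per feature type
    for line in gff_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/directive lines
        columns = line.split(None, 8)  # accept tabs or spaces between columns
        if len(columns) < 3:
            continue  # not a feature line
        feature_type = columns[2]  # 3rd column of the GFF3 record
        counters[feature_type] += 1
        yield feature_type, f"{feature_type}{counters[feature_type]}"


example = [
    "NbV1Ch01 AUGUSTUS gene 97932 99714 0.06 - . ID=a",
    "NbV1Ch01 AUGUSTUS mRNA 97932 99714 0.06 - . ID=b",
    "NbV1Ch01 AUGUSTUS exon 97932 98571 . - . ID=c",
    "NbV1Ch01 AUGUSTUS exon 98679 98844 . - . ID=d",
    "NbV1Ch01 AUGUSTUS CDS 98177 98571 1 - 2 ID=e",
    "NbV1Ch01 AUGUSTUS mRNA 97935 99711 0.06 - . ID=f",
    "NbV1Ch01 AUGUSTUS exon 97935 98571 . - . ID=g",
    "NbV1Ch01 AUGUSTUS gene 109665 112554 0.04 - . ID=h",
]

for feature_type, new_id in number_by_feature_type(example):
    print(feature_type, new_id)
```

Because the exon counter keeps running across genes while the gene counter advances only once per gene, the exon and CDS numbers quickly run far ahead of the gene number, which is the gap seen in the real output.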
That sounds coherent. Noted; I will modify the code for the next release. Could you check whether this is what you expect? https://github.com/NBISweden/AGAT/issues/16
Could anyone please revert the question and title to the previous version?
Hi Ric
We were able to trace back the original post title, but we don't have the ability to recover the original post content. Perhaps you are best placed to re-create it?