27 days ago
Eduardo Oñate
Hi everyone!
I'm trying to create a list of each gene_id and its corresponding protein_id from a .gtf file, and I'd like to extract this information with a simple one-line command. I tried this command:
grep -o 'gene_id "[^"]*"\|protein_id "[^"]*"' file.gtf | paste - - > lista_genes_proteins.txt
But the pairs that paste produces in the list are not correct. Does anyone know how I can do this? Can anyone help me, please? Thanks.
You can use the AGAT toolkit to do this properly: https://agat.readthedocs.io/en/latest/tools/agat_sp_extract_attributes.html
Thanks! I resolved it using a different command line and Sublime Text to edit the text.
I used this command line:
grep -w 'CDS' GCF_003086295.2_arahy.Tifrunner.gnm1.KYV3_genomic.gtf | cut -f9 > columna_9.txt
and then used Sublime Text to edit the file columna_9.txt. It works!
Thanks for taking the time to answer.
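For anyone who would rather script the manual editing step, here is a minimal sketch. The original grep -o ... | paste - - approach goes wrong because grep -o prints every match on its own line, and records such as gene or exon rows typically carry a gene_id but no protein_id, so paste ends up pairing lines from different records. The sketch below instead reads only CDS rows and pulls both attributes from the same line. It assumes column 9 uses the usual GTF form (key "value";). The function name extract_gene_protein and the demo records are hypothetical, made up for illustration only.

```shell
#!/bin/sh
# Print unique "gene_id<TAB>protein_id" pairs taken from the CDS rows of a GTF file.
# Assumption: attributes in column 9 are written as: key "value"; key "value"; ...
extract_gene_protein() {
    awk -F'\t' '
        # Return the quoted value of attribute "key" in string s, or "" if absent.
        function attr(s, key,    re, v) {
            re = "(^|[; ])" key " \"[^\"]*\""
            if (!match(s, re)) return ""
            v = substr(s, RSTART, RLENGTH)
            sub(/^[^"]*"/, "", v)   # drop everything up to the opening quote
            sub(/"$/, "", v)        # drop the closing quote
            return v
        }
        $3 == "CDS" {
            gene = attr($9, "gene_id")
            prot = attr($9, "protein_id")
            # Keep only rows that have both values, and print each pair once.
            if (gene != "" && prot != "" && !seen[gene SUBSEP prot]++)
                print gene "\t" prot
        }
    ' "$1"
}

# Demo on a tiny, made-up GTF (hypothetical records, not from the real annotation).
demo=$(mktemp)
printf 'chr1\tsrc\tgene\t1\t100\t.\t+\t.\tgene_id "G1";\n' > "$demo"
printf 'chr1\tsrc\tCDS\t1\t50\t.\t+\t0\tgene_id "G1"; transcript_id "T1"; protein_id "P1";\n' >> "$demo"
printf 'chr1\tsrc\tCDS\t60\t100\t.\t+\t0\tgene_id "G1"; transcript_id "T1"; protein_id "P1";\n' >> "$demo"
printf 'chr1\tsrc\tCDS\t200\t300\t.\t+\t0\tgene_id "G2"; transcript_id "T2"; protein_id "P2";\n' >> "$demo"
extract_gene_protein "$demo"
rm -f "$demo"
```

On a real file the call would be along the lines of extract_gene_protein file.gtf > lista_genes_proteins.txt. Matching on column 3 rather than grepping for the word CDS anywhere in the line avoids picking up rows that only mention CDS inside an attribute.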