Extract sequences from a FASTA file using headers
7 weeks ago
Princy ▴ 40

Hello, I have a list of headers and need to extract the corresponding sequences from a FASTA file. How can I do this? Kindly let me know.

The header file looks like this:

>TRINITY_DN74691_c0_g1_i1
>TRINITY_DN74659_c0_g1_i1
>TRINITY_DN74698_c0_g1_i1


The FASTA file looks like this:

>TRINITY_DN74697_c0_g1_i1 len=243 path=[221:0-242] [-1, 221, -2]
GTATGTCCCACCAGACACAGCAGGGCTGGCAGGCCGAGTTTGAGTTTGGAATATATCTG

@princy: You have asked many questions on Biostars over the last few months but do not appear to have accepted any of the answers. Accepting answers (you can accept more than one) using the green checkmark is the appropriate way to provide closure to threads.

7 weeks ago
GenoMax 119k
7 weeks ago

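seqkit can do this. Note that the subcommand that filters records by ID is grep rather than seq: it accepts a file of IDs via -f (--pattern-file), and the leading > has to be stripped from each line of your header file first.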

Alternatively, if you have a BLAST database built from that FASTA file, you can also retrieve the sequences with blastdbcmd.
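
If neither tool is at hand, the same lookup can be sketched in plain Python. This is only a minimal illustration, not how seqkit or blastdbcmd work internally. It assumes the ID is the first whitespace-delimited token of each FASTA header (as in the Trinity example in the question) and that the header list may or may not carry the leading >. The function names read_ids and extract_records, and the command-line file arguments, are hypothetical.

```python
import sys


def read_ids(lines):
    """Collect IDs from header lines: drop a leading '>' and keep the first token."""
    ids = set()
    for line in lines:
        fields = line.strip().lstrip(">").split()
        if fields:  # skip blank lines
            ids.add(fields[0])
    return ids


def extract_records(fasta_lines, wanted_ids):
    """Yield (header, sequence) for each record whose ID is in wanted_ids."""
    header, seq_parts, keep = None, [], False
    for line in fasta_lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new record starts: emit the previous one if it was wanted.
            if keep:
                yield header, "".join(seq_parts)
            header = line[1:]
            fields = header.split()
            keep = bool(fields) and fields[0] in wanted_ids
            seq_parts = []
        elif keep:
            # Sequence may be wrapped over several lines; join them later.
            seq_parts.append(line.strip())
    if keep:
        yield header, "".join(seq_parts)


# Hypothetical usage: python extract.py headers.txt transcripts.fasta > subset.fasta
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as id_file:
        wanted = read_ids(id_file)
    with open(sys.argv[2]) as fasta_file:
        for header, seq in extract_records(fasta_file, wanted):
            print(f">{header}\n{seq}")
```

The IDs are held in a set so each header is checked in constant time, and the FASTA file is streamed line by line rather than loaded whole, which matters for large Trinity assemblies. Matching is exact on the ID, so a header in the list that is not in the FASTA file is silently skipped.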