Question: Matching gene id and extracting fasta sequences using shell script
Rashedul Islam (Canada) wrote, 4.3 years ago:

Dear All,

I have a list of gene IDs, e.g.:

gene01

gene22

gene27 and so on.

I need to extract these gene names, together with their FASTA sequences, from an assembly file. The assembly file looks like this:

>gene01

ATAGCGATCCCCCTTTTTCCTT

>gene02

ATACCCCCGCGAT

>gene03

ATACCCAAAAAAACCGCGAT and so on.

Can anyone help me write a shell script that searches the assembly file for the gene names in my list and outputs each matching name with its associated DNA sequence? Example output for gene01:

>gene01

ATAGCGATCCCCCTTTTTCCTT

shell fasta • 2.4k views

I'm not sure if this website is like Stack Overflow, but you should really post what you tried instead of just asking people to do your work for you. It looks like you did not try anything at all, and while many people on this site don't have much programming experience, this is a relatively simple task that you could find a solution to on Google or code yourself in a few minutes.

written 4.3 years ago by steven70

Thank you for your reply. Your answer helped. I found this link: http://unix.stackexchange.com/questions/156783/getting-matched-fasta-file.

written 4.3 years ago by Rashedul Islam
Matt Shirley (Cambridge, MA) wrote, 4.3 years ago:

See: A: How to get length of selected contigs within a fasta file generated by transcrip
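
For the task described in the question, here is a minimal sketch of one possible approach using awk. It is not taken from the linked posts. It assumes the ID list has one gene ID per line, that each FASTA header starts with `>` immediately followed by the ID, and that the ID must match exactly (so `gene01` does not match `gene011`). The function name `extract_fasta` and the file names in the usage line are hypothetical.

```shell
#!/bin/sh
# extract_fasta IDS_FILE FASTA_FILE   (hypothetical helper name)
# Prints every FASTA record whose header ID is listed in IDS_FILE.
extract_fasta() {
    awk '
        # First file (the ID list): remember each non-empty ID.
        NR == FNR { if ($1 != "") wanted[$1] = 1; next }
        # Header line: the first word after ">" is the record ID.
        /^>/ { id = substr($1, 2); keep = (id in wanted) }
        # Print header and sequence lines of wanted records; skip blank lines.
        keep && NF
    ' "$1" "$2"
}

# Usage (file names are placeholders):
#   extract_fasta ids.txt assembly.fasta > selected.fasta
```

Because the `keep` flag stays set until the next header, this also works when a sequence is wrapped over several lines, which a simple `grep -A 1` would miss. Records are printed in the order they appear in the assembly file, not in the order of the ID list.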
