Question: Help extracting multiple sequences from a FASTA file using a list of IDs (four counting per line)
11 months ago, ahmed_bio82 wrote:

I would like to extract multiple sequences from a FASTA file using a list of counting IDs (four counting per line). I found several scripts that extract sequences from a FASTA file based on a list of counting IDs, but they all expect one counting per line, whereas my list has four counting per line. This is the head of my ID list:

OG1.5_9691: aco|TRINITY_DN39707_c3_g4_i1.p1 bio|GFMW01138197.1.p1 lym|FX192122.1.p1 physa|Contig31631.p1
OG1.5_9693: aco|TRINITY_DN34744_c0_g1_i2.p1 bio|GFMW01140870.1.p1 lym|FX194372.1.p1 physa|Contig299.p1
OG1.5_9694: aco|TRINITY_DN40605_c7_g1_i1.p1 bio|GFMW01145544.1.p1 lym|FX194851.1.p1 physa|Contig70050.p1
OG1.5_9695: aco|Contig7627.p1 bio|GFMW01145616.1.p1 lym|FX202590.1.p1 physa|Contig22503.p1

I would really appreciate any help you can provide to extract my sequences from the FASTA file.

modified 11 months ago by Pierre Lindenbaum • written 11 months ago by ahmed_bio82

I have edited the question for you this time, but for future reference, this is a Question not a Tutorial.

It's not clear to me what you mean by extracting IDs by 'counting'.

Can you show your input data? It looks like you've only shown us one of the 2 files.

modified 11 months ago • written 11 months ago by Joe

The faSomeRecords utility from Jim Kent should extract the sequences, as long as the FASTA headers match exactly in both files. Linux version linked. Remember to run chmod a+x faSomeRecords after downloading it and before executing it. I assume "4 counting" means there are 4 identifiers separated by spaces in your headers?

modified 11 months ago • written 11 months ago by genomax
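
For anyone who would rather not depend on an external binary, here is a minimal Python sketch of the same idea. It rests on assumptions the thread does not confirm: that each list line is a group label ending in a colon followed by space-separated IDs (as in the sample shown in the question), and that the first whitespace-delimited word of each FASTA header is exactly one of those IDs, species prefix included (e.g. aco|Contig7627.p1). The function names parse_id_list, read_fasta and extract_records are made up for this illustration.

```python
import io


def parse_id_list(lines):
    """Yield every sequence ID from lines like 'OG1.5_9691: id1 id2 id3 id4'."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Assumption: the group label ends at the first colon.
        # If a line has no colon, treat the whole line as IDs.
        _label, sep, rest = line.partition(":")
        yield from (rest if sep else line).split()


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file."""
    header, chunks = None, []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def extract_records(fasta_handle, wanted_ids):
    """Yield records whose first header word is in wanted_ids."""
    wanted = set(wanted_ids)
    for header, seq in read_fasta(fasta_handle):
        words = header.split()
        # Assumption: the record name is the first word of the header.
        if words and words[0] in wanted:
            yield header, seq


if __name__ == "__main__":
    # Tiny in-memory stand-ins for the real ID list and FASTA file.
    id_lines = ["OG1.5_9695: aco|Contig7627.p1 physa|Contig22503.p1"]
    fasta = io.StringIO(
        ">aco|Contig7627.p1 some description\nMKV\nLLA\n"
        ">bio|unrelated.p1\nMMM\n"
    )
    for header, seq in extract_records(fasta, parse_id_list(id_lines)):
        print(f">{header}\n{seq}")
```

Writing the output of parse_id_list to a file, one ID per line, also gives the one-identifier-per-line list that the existing scripts mentioned in the question expect. If the FASTA headers do not carry the species prefix, the IDs would need to be trimmed at the "|" before matching.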