How to extract multi-line protein sequences with IDs present in headers?
20 months ago

Hi all, can anyone tell me how to retrieve multi-line protein sequences using IDs that are present in the headers?

>gi|1706522686|gb|QDM68077.1| CraA [Acinetobacter baumannii]
MKNIQTTALNRTTLMFPLALVLFEFAVYIGNDLIQPAMLAITEDFGVSATWAPSSMSFYLLGGASVAWLL
GPLSDRLGRKKVLLSGVLFFALCCFLILLTRQIEHFLTLRFLQGIGLSVISAVGYAAIQENFAERDAIKV
MALM
>gi|1818457412|dbj|BCA98153.1| 1-acyl-sn-glycerol-3-phosphate acyltransferase [Acinetobacter baumannii]
MTQTQSIVNSTLKKFSKIGLYGKKVTSATAAISEGFYLVYRHGLYKDPNNPVNTRYVQYFCRRLCQVFNL
EVQVHGTIPREPALWVSNHISWLDIAVLGSGARVFFLAKAEIEKWPILGNLAKGGGTLFIKRGSGDSIKI

>gi|1818457412|dbj|BCA98158.1| 1-acyl-phosphate acyltransferase [Acinetobacter baumannii]
MTQTQSIVNSTLKKFSKIGLYGKKVTSATAAISEGFYLVYRHGLYKDPNNPVNTRYVQYFCRRLCQVFNL
EVQVHGTIPREPALWVSNHISWLDIAVLGSGARVFFLAKAEIEKWPILGNLAKGGGTLFIKRGSGDSIKI


and I have IDs like:

QDM68077.1
BCA98153.1


Please let me know how to retrieve the sequences for these IDs. I would appreciate it if someone could tell me how to use seqkit for this. I have used seqkit like this:

seqkit grep -nrif remaining_except_core 307_DR_determinats.fasta

but I get nothing from this command.

PERL awk Sed

Take a look at the Similar posts section on the right-hand side of the page for related questions and the corresponding solutions.

For example, "Retrieve multi-line fasta sequences using list of locus tag" is a very similar question with an accepted solution that you could try.

Also take a look at: https://bioinf.shenwei.me/seqkit/usage/#grep
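
If seqkit keeps returning nothing, a plain awk approach may be worth trying as a cross-check. The following is only a minimal sketch, not the accepted solution from the linked post: the function name extract_by_ids is made up for illustration, and it assumes each accession (e.g. QDM68077.1) appears as one complete "|"-separated field of the header, as in the example records in the question. It also strips carriage returns and stray whitespace from the ID list, since a list saved with Windows line endings is one possible (unconfirmed) reason for patterns not matching.

```shell
# Hypothetical helper: print every record of FASTA file $2 whose header
# contains, as a complete "|"-delimited field, an ID listed in file $1.
extract_by_ids() {
  awk '
    NR == FNR {                        # first file: the ID list (must not be empty)
      sub(/\r$/, "")                   # tolerate Windows line endings
      gsub(/^[ \t]+|[ \t]+$/, "")      # trim leading/trailing whitespace
      if ($0 != "") wanted[$0] = 1
      next
    }
    /^>/ {                             # header line: decide whether to keep this record
      keep = 0
      n = split(substr($0, 2), field, "[|]")
      for (i = 1; i <= n; i++)
        if (field[i] in wanted) { keep = 1; break }
    }
    keep && NF                         # print kept headers and sequence lines, skip blank lines
  ' "$1" "$2"
}
```

With the files from the question, the call would look like `extract_by_ids remaining_except_core 307_DR_determinats.fasta > selected.fasta` (the output name selected.fasta is arbitrary). Because whole fields are compared rather than substrings, BCA98153.1 would not accidentally match a header carrying BCA98153.10.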


If you can do this with a GUI application, then take a look at SEDA (https://www.sing-group.org/seda/).

You can apply the Pattern filtering operation (https://www.sing-group.org/seda/manual/operations.html#pattern-filtering) to the headers (check the Header radio button) using the sequence IDs you want (Note: you can use the Import patterns option to import these IDs from a TXT file instead of typing them manually into the GUI).