Question: How to extract sequences containing a special character, such as a gap, from a multiple sequence alignment?
11 weeks ago by
Shaminur40
Dhaka University
Shaminur40 wrote:

I have multiple sequence alignment files of DNA in FASTA format that contain insertions and deletions. I want to extract only the sequences that contain at least one gap character (-). Is there a program that can do this?

My input looks like this:

>A
ATGCATGCATGCAGCATGC
>B
GCATGCAT---GCATACATGC
>C
ATGC-ATGCATGCATGCA---

The output should be:

>B
GCATGCAT---GCATACATGC
>C
ATGC-ATGCATGCATGCA---

Thanks in advance.

sequence alignment • 103 views
11 weeks ago by
shenwei3565.2k
China
shenwei3565.2k wrote:

Try seqkit grep:

seqkit grep -s -p "-" in.fa > out.fa
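
In the command above, -s makes seqkit search the sequence rather than the ID, and -p supplies the pattern, here a single gap character. If seqkit is not available, the same filter can be sketched in plain Python. This is only a minimal illustration, not part of the original answer: the helper names read_fasta and records_with_gaps are hypothetical, and it assumes ordinary FASTA input in which a record's sequence may be wrapped over several lines.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines may be wrapped, so collect and join them.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def records_with_gaps(lines, gap="-"):
    """Return the records whose sequence contains at least one gap character."""
    return [(h, s) for h, s in read_fasta(lines) if gap in s]


# The example input from the question.
example = [
    ">A", "ATGCATGCATGCAGCATGC",
    ">B", "GCATGCAT---GCATACATGC",
    ">C", "ATGC-ATGCATGCATGCA---",
]

for header, seq in records_with_gaps(example):
    print(f">{header}")
    print(seq)
```

To run it on a file, pass an open file handle instead of the example list, for instance records_with_gaps(open("in.fa")), where in.fa is the same placeholder file name used in the seqkit command.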

Thank you very much, Shenwei.

written 11 weeks ago by Shaminur40