Question: How to extract sequences that contain gap characters from a multiple sequence alignment?
Shaminur (Dhaka University) wrote, 6 months ago:

I have multiple sequence alignment files of DNA in FASTA format that contain insertions and deletions. I want to extract only the sequences that contain at least one gap character (-). Is there an existing program that can do this?

My input looks like this:

>A
ATGCATGCATGCAGCATGC
>B
GCATGCAT---GCATACATGC
>C
ATGC-ATGCATGCATGCA---

The output should be:

>B
GCATGCAT---GCATACATGC
>C
ATGC-ATGCATGCATGCA---

Thanks in advance.

shenwei356 (China) wrote, 6 months ago:

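Try seqkit grep. The -s flag matches the pattern against the sequence instead of the ID, so this command keeps every record whose sequence contains a gap:

seqkit grep -s -p "-" in.fa > out.fa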
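
For readers who do not have seqkit installed, the same filter can be sketched in plain Python. This is only an illustration under stated assumptions, not part of the original answer: the helper names `read_fasta` and `records_with_gaps` are hypothetical, the gap character is assumed to be `-`, and the input is the example from the question.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None and line:
            # Sequence lines may be wrapped, so collect and join them.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def records_with_gaps(handle, gap="-"):
    """Return the records whose sequence contains the gap character."""
    return [(h, s) for h, s in read_fasta(handle) if gap in s]


# Example input copied from the question.
example = """>A
ATGCATGCATGCAGCATGC
>B
GCATGCAT---GCATACATGC
>C
ATGC-ATGCATGCATGCA---
"""

kept = records_with_gaps(io.StringIO(example))
for header, seq in kept:
    print(f">{header}")
    print(seq)
```

On the example input this prints records B and C only. To run it on a real file, pass `open("in.fa")` instead of the `io.StringIO` object.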

Thank you very much, Shenwei.

Reply written 6 months ago by Shaminur