How to extract the first and last 2 nt from multiple sequences in a FASTA file?
8 weeks ago
praasu ▴ 40

Hi everyone, I have a FASTA file containing multiple intron sequences. I would like to extract the first and last 2 nucleotides of each sequence in that file. Do you have any suggestions or a short piece of code for this?

perl awk fasta genome sequence • 189 views
8 weeks ago

Try seqkit subseq

# example
$ cat t.fasta
>s1
acntg
>s2
ACNNTG

# first 2 bases
$ seqkit subseq -r 1:2 t.fasta 
>s1
ac
>s2
AC

# last 2 bases
$ seqkit subseq -r -2:-1 t.fasta 
>s1
tg
>s2
TG

# first 2 + last 2 joined into one sequence, although the result looks rather strange
$ seqkit concat <(seqkit subseq -r 1:2 t.fasta) <(seqkit subseq -r -2:-1 t.fasta)
>s1
actg
>s2
ACTG
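
If seqkit is not available, the same slicing can be sketched in plain Python. This is only a minimal illustration, not part of seqkit: the helper names `read_fasta` and `intron_ends` are made up for this example, and it assumes a well-formed FASTA where a sequence may be wrapped over several lines.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs; wrapped sequence lines are joined."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def intron_ends(seq, n=2):
    """Return the first n and last n bases of seq (hypothetical helper)."""
    return seq[:n], seq[-n:]


# Same records as t.fasta above, given inline so the sketch runs on its own.
example = """>s1
acntg
>s2
ACNNTG
""".splitlines()

for name, seq in read_fasta(example):
    first, last = intron_ends(seq)
    print(name, first, last, sep="\t")
```

To run it on a real file, pass an open file handle instead of `example`, for instance `read_fasta(open("t.fasta"))`. Sequences shorter than 2 nt simply return whatever bases they have.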

Thank you very much
