Sequence search using a pattern
7.4 years ago

How can I retrieve complete amino acid sequences in FASTA format from the non-redundant bacterial protein FASTA file (downloaded and concatenated), using a known string of 8 contiguous amino acids that occurs at the end of my sequence of interest?


What have you found so far?


Pierre,

Thanks for your reply. I am not a bioinformatician or a computer programmer. So far, I have found that grep can search for my query, but it does not restrict matches to the end of the sequence, and I do not know how to extract the full sequences with grep based on my 8 amino acid query.

Thanks

Amar

7.4 years ago

Similar post: A: String research in a fasta file

Use SeqKit to search for the pattern in the last 100 amino acids of each sequence (replace the example pattern with your own 8-residue motif):

seqkit grep --by-seq --region -100:-1 --ignore-case --use-regexp --pattern TCTGACC Seq.fa > result.fa
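For comparison, here is a rough sketch of the same idea without SeqKit, using plain awk. It is not how SeqKit works internally; it only assumes a standard FASTA file where header lines start with `>` and sequences may be wrapped over several lines. The function name `fasta_ends_with` is made up for illustration. It joins each record's sequence lines, then keeps the record only if the joined sequence ends with the motif (case-insensitive), which is the stricter "at the very end" condition from the question.

```shell
# Hypothetical helper: print FASTA records whose sequence ends with MOTIF.
# Usage: fasta_ends_with MOTIF < input.fa > result.fa
fasta_ends_with() {
  awk -v motif="$1" '
    # Print the current record if its joined sequence ends with the motif.
    function emit(   mlen, slen) {
      if (header == "") return
      mlen = length(motif)
      slen = length(seq)
      if (slen >= mlen &&
          toupper(substr(seq, slen - mlen + 1)) == toupper(motif)) {
        print header
        print seq
      }
    }
    # A header line closes the previous record and starts a new one.
    /^>/ { emit(); header = $0; seq = ""; next }
    # Sequence lines: strip whitespace and append to the current record.
    { gsub(/[ \t\r]/, ""); seq = seq $0 }
    END { emit() }
  '
}
```

Matching records are written with the sequence on a single line. If the database marks stop codons with a trailing `*`, that character would need to be stripped first; the sketch does not handle it.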
7.4 years ago

shenwei356,

Thank you very much for your help.

Amar

7.4 years ago

shenwei356, your suggestion works well, but after printing some results the terminal shows the following message:

panic: runtime error: slice bounds out of range

goroutine 1 [running]:
panic(0x308920, 0x10838030)
    /usr/local/app/go/src/runtime/panic.go:500 +0x325
github.com/shenwei356/bio/seq.(Seq).SubSeq(0x10d9dad0, 0x0, 0x7, 0x0)
    /home/shenwei/shenwei/script/Go/project/src/github.com/shenwei356/bio/seq/Seq.go:121 +0x442
github.com/shenwei356/seqkit/seqkit/cmd.glob..func5(0x4a8c00 0x108da8c0, 0x1, 0x8)
    /home/shenwei/shenwei/script/Go/project/src/github.com/shenwei356/seqkit/seqkit/cmd/grep.go:195 +0x10e3
github.com/spf13/cobra.(Command).execute(0x4a8c00, 0x108da740, 0x8, 0x8, 0x0, 0x0)
    /home/shenwei/shenwei/script/Go/project/src/github.com/spf13/cobra/command.go:636 +0x676
github.com/spf13/cobra.(Command).ExecuteC(0x4a92c0, 0x4a8c00, 0x0, 0x0)
    /home/shenwei/shenwei/script/Go/project/src/github.com/spf13/cobra/command.go:722 +0x397
github.com/spf13/cobra.(Command).Execute(0x4a92c0, 0x0, 0x0)
    /home/shenwei/shenwei/script/Go/project/src/github.com/spf13/cobra/command.go:681 +0x25
github.com/shenwei356/seqkit/seqkit/cmd.Execute()
    /home/shenwei/shenwei/script/Go/project/src/github.com/shenwei356/seqkit/seqkit/cmd/root.go:52 +0x21
main.main()
    /home/shenwei/shenwei/script/Go/project/src/github.com/shenwei356/seqkit/seqkit/main.go:48 +0x11

Let me know what you think of this.

Amar


Please click ADD COMMENT to respond to an answer rather than creating a new answer.

This looks like a bug. You may have sequences shorter than 100 aa. Please change or drop the --region argument, depending on your case.
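To illustrate the short-sequence case, here is a small awk sketch that searches only the last N residues but falls back to the whole sequence when it is shorter than N, so the window start never goes out of range. This is an illustration of the clamping idea, not SeqKit's actual code or fix; the function name `fasta_tail_contains` and its arguments are assumptions made up for this example.

```shell
# Hypothetical helper: print FASTA records whose last N residues contain MOTIF.
# Usage: fasta_tail_contains MOTIF N < input.fa > result.fa
fasta_tail_contains() {
  awk -v motif="$1" -v n="$2" '
    function emit(   start, tail) {
      if (header == "") return
      # Start of the window; clamp to 1 when the sequence is shorter than n.
      start = length(seq) - n + 1
      if (start < 1) start = 1
      tail = toupper(substr(seq, start))
      if (index(tail, toupper(motif)) > 0) {
        print header
        print seq
      }
    }
    # A header line closes the previous record and starts a new one.
    /^>/ { emit(); header = $0; seq = ""; next }
    # Sequence lines: strip whitespace and append to the current record.
    { gsub(/[ \t\r]/, ""); seq = seq $0 }
    END { emit() }
  '
}
```

For example, `fasta_tail_contains AAAA 100 < Seq.fa > result.fa` would keep a 6-residue sequence such as MKAAAA, because the window simply shrinks to the full sequence instead of failing.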


I've fixed this bug in v0.3.9. Please download the latest version.
