Extract DNA sequences based on header in a multi-FASTA file
3.5 years ago
Optimist ▴ 180

Hello all,

My multifasta file with rRNA sequences looks like this.

>16S_rRNA::1:4-522(-)
TGCCTTCGGGAACTCTGAGACAGGTGCTGCATGGCTGTCGTCAGCTCGTGTTGTGAAATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTTATCCTTTGTTGCCAGCGCGTGATGGCGGGAACTCAAAGGAGACTGCCGGTGATAAACCGGAGGAAGGTGGGGATGACGTCAAGTCATCATGGCCCTTACGAGTAGGGCTACACACGTGCTACAATGGCGTATACAAAGGGAAGCGACCCCGCGAGGGCAAGCGGAACTCATAAAGTACGTCGTAGTCCGGATTGGAGTCTGCAACTCGACTCCATGAAGTCGGAATCGCTAGTAATCGTAGATCAGAATGCTACGGTGAATACGTTCCCGGGCCTTGTACACACCGCCCGTCACACCATGGGAGTGGGTTGCAAAAGAAGTAGGTAGCTTAACCTCCGGGAGGGCGCTTACCACTTTGTGATTCATGACTGGGGTGAAGTCGTAACAAGGTAACCGTAGGGGAACCTGCGGTTGGATCACCTCCTT
>16S_rRNA::0-508(-)
TTGACGTTACCGACAGAAGAAGCACCGGCTAACTCCGTGCCAGCAGCCGCGGTAATACGGAGGGTGCAAGCGTTAATCGGAATTACTGGGCGTAAAGCGCACGCAGGCGGTTGATTGAGTCAGATGTGAAATCCCCGGGCTTAACCCGGGAATTGCATCTGATACTGGTCAGCTAGAGTCTTGTAGAGGGGGGTAGAATTCCATGTGTAGCGGTGAAATGCGTAGAGATGTGGAGGAATACCGGTGGCGAAGGCGGCCCCCTGGACAAAGACTGACGCTCAGGTGCGAAAGCGTGGGGAGCAAACAGGATTAGATACCCTGGTAGTCCACGCTGTAAACGATGTCGACTTGGAGGTTGTGCCCTTGAGGCGTGGCTTCCGGAGCTAACGCGTTAAGTCGACCGCCTGGGGAGTACGGCCGCAAGGTTAAAACTCAAATGAATTGACGGGGGCCCGCACAAGCGGTGGAGCATGTGGTTTAATTCGATGCAACGCGAAGAACCTTACCT
>5S_rRNA::1-80(-)
TGGCGGCCGTAGCGCGGTGGTCCCACCTGACCCCATGCCGAACTCAGAAGTGAAACGCCGTAGCGCCGATGGTAGTGTG
>5S_rRNA::1-77(-)
TGGCGGCCGTAGCGCGGTGGTCCCACCTGACCCCATGCCGAACTCAGAAGTGAAACGCCGTAGCGCCGATGGTAGT
>5S_rRNA::1:4731-4814(-)
TGACCCCATGCCGAACTCAGAAGTGAAACGCCGTAGCGCCGATGGTAGTGTGGGGTCTCCCCATGTGAGAGTAGGGAACCGCC

I want to extract only the 16S_rRNA sequences into a new file. Is there a way to do this? I need to apply it to a large dataset of 500 files.

Thank you.

fasta sequence extraction • 638 views
3.5 years ago

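Try either of these:

$ awk '$0 ~ /^>16S_rRNA/ {getline seq; print $0 "\n" seq}' test.fa
$ sed -n '/^>16S_rRNA/{N;p;}' test.fa

Redirect the output to a new file. Don't use sed -i here: combined with -n, it would overwrite test.fa in place, leaving only the extracted records.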
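Both one-liners rely on each sequence sitting on a single line right after its header, as in the example file. If some of the 500 files have sequences wrapped over several lines, something like the following sketch may be more robust. The function names `extract_16s` and `extract_all`, the `*.fa` extension, and the `16S_only/` output directory are assumptions made for illustration; adjust them to the real layout.

```shell
# Hypothetical helper: print every record whose header starts with ">16S_rRNA".
# On each header line, decide whether to keep the record; every following
# line is printed until the next header changes the decision, so wrapped
# (multi-line) sequences are handled too.
extract_16s() {
    awk '/^>/ { keep = ($0 ~ /^>16S_rRNA/) } keep' "$1"
}

# Assumed layout: the input files are *.fa in the current directory.
# Results are written to a separate 16S_only/ directory, so the
# original files are never modified.
extract_all() {
    mkdir -p 16S_only
    for f in *.fa; do
        [ -e "$f" ] || continue   # nothing to do if no .fa file matches
        extract_16s "$f" > "16S_only/$f"
    done
}
```

After defining the functions, running `extract_all` in the directory that holds the FASTA files would write one filtered file per input, while `extract_16s test.fa > test_16S.fa` handles a single file.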


Thanks for the trick. awk has worked for me.

