help with file processing
3.8 years ago
ravi.eshwari ▴ 10

Hi all,

I have a FASTA file with sequences as shown below:

>ATGATCTATCGTGTATCACGGTCA(1) TGACCGTGATACACGATAGATCAT
>TACGGTTCTGAAACGGAGAGTTCG(1) CGAACTCTCCGTTTCAGAACCGTA
>GCTTGCGACGACTGAGTTGGAG(1) GCTTGCGACGACTGAGTTGGAG
>ATTACTTGTTGTGATTGTTGGCCT(1) ATTACTTGTTGTGATTGTTGGCCT
>ATGCCGTCGGAAATAATGAGTTTA(1) ATGCCGTCGGAAATAATGAGTTTA
>AACAGATCCGCTGTAGCACATCGG(1) CCGATGTGCTACAGCGGATCTGTT
>TTGGCACGAGTGACTCCTTAGAC(1) GTCTAAGGAGTCACTCGTGCCAA
>TTAAGCATGACTTAGACTATC(2) TTAAGCATGACTTAGACTATC
>CAAAGGAACCGTGAGCTCAACT(1) CAAAGGAACCGTGAGCTCAACT

I need the output to look like this:

>AAGTATGATTGATAATTCGTGATT(1) 
    AATCACGAATTATCAATCATACTT
>ATGGATGAAATGACATGGAATACAC(2) 
    GTGTATTCCATGTCATTTCATCCAT
>CATGGATAAGAGAGAAAAGGACACAAGAAGCCA(1) 
    CATGGATAAGAGAGAAAAGGACACAAGAAGCCA
>ATCGGTTGCAGGTAGACCGAGCTT(1) 
    AAGCTCGGTCTACCTGCAACCGAT

Can you please suggest the easiest way to do this, or any code that runs on Ubuntu?

sequence ubuntu • 576 views

Please use the formatting bar (especially the code option) to present your post better.

Thank you!

With the code button your input will look like this.

>ATGATCTATCGTGTATCACGGTCA(1) TGACCGTGATACACGATAGATCAT
>TACGGTTCTGAAACGGAGAGTTCG(1) CGAACTCTCCGTTTCAGAACCGTA
>GCTTGCGACGACTGAGTTGGAG(1) GCTTGCGACGACTGAGTTGGAG
>ATTACTTGTTGTGATTGTTGGCCT(1) ATTACTTGTTGTGATTGTTGGCCT
>ATGCCGTCGGAAATAATGAGTTTA(1) ATGCCGTCGGAAATAATGAGTTTA
>AACAGATCCGCTGTAGCACATCGG(1) CCGATGTGCTACAGCGGATCTGTT
>TTGGCACGAGTGACTCCTTAGAC(1) GTCTAAGGAGTCACTCGTGCCAA
>TTAAGCATGACTTAGACTATC(2) TTAAGCATGACTTAGACTATC
>CAAAGGAACCGTGAGCTCAACT(1) CAAAGGAACCGTGAGCTCAACT

Try either of these:

$ sed 's/\s\+/\n/' test.txt

or

$ awk '{print $1"\n"$2}' test.txt

>ATGATCTATCGTGTATCACGGTCA(1)
TGACCGTGATACACGATAGATCAT
>TACGGTTCTGAAACGGAGAGTTCG(1)
CGAACTCTCCGTTTCAGAACCGTA
>GCTTGCGACGACTGAGTTGGAG(1)
GCTTGCGACGACTGAGTTGGAG
>ATTACTTGTTGTGATTGTTGGCCT(1)
ATTACTTGTTGTGATTGTTGGCCT
>ATGCCGTCGGAAATAATGAGTTTA(1)
ATGCCGTCGGAAATAATGAGTTTA
>AACAGATCCGCTGTAGCACATCGG(1)
CCGATGTGCTACAGCGGATCTGTT
>TTGGCACGAGTGACTCCTTAGAC(1)
GTCTAAGGAGTCACTCGTGCCAA
>TTAAGCATGACTTAGACTATC(2)
TTAAGCATGACTTAGACTATC
>CAAAGGAACCGTGAGCTCAACT(1)
CAAAGGAACCGTGAGCTCAACT
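
The requested output in the question shows each sequence indented by four spaces under its header, which the commands above do not produce. If that indentation is actually wanted (an assumption; it may just be a formatting accident in the post), a minimal awk sketch could look like the following. The function name `split_indented` is made up for illustration, and the demo input reuses two records from the question.

```shell
# Hypothetical helper: reads "header sequence" lines on stdin and prints
# the header ($1) on one line, then the sequence ($2) indented by four spaces.
split_indented() {
    awk '{ print $1; print "    " $2 }'
}

# Demo with two records taken from the question.
printf '%s\n' \
    '>GCTTGCGACGACTGAGTTGGAG(1) GCTTGCGACGACTGAGTTGGAG' \
    '>TTAAGCATGACTTAGACTATC(2) TTAAGCATGACTTAGACTATC' | split_indented
```

On a real file this would be run as `split_indented < test.txt > out.txt`. Dropping the `"    "` prefix gives the same result as the awk one-liner above.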
3.8 years ago
GenoMax 141k

It could be as simple as:

$ cat your_file | tr " " "\n"
>ATGATCTATCGTGTATCACGGTCA(1)
TGACCGTGATACACGATAGATCAT
>TACGGTTCTGAAACGGAGAGTTCG(1)
CGAACTCTCCGTTTCAGAACCGTA
>GCTTGCGACGACTGAGTTGGAG(1)
GCTTGCGACGACTGAGTTGGAG
>ATTACTTGTTGTGATTGTTGGCCT(1)
ATTACTTGTTGTGATTGTTGGCCT
>ATGCCGTCGGAAATAATGAGTTTA(1)
ATGCCGTCGGAAATAATGAGTTTA
>AACAGATCCGCTGTAGCACATCGG(1)
CCGATGTGCTACAGCGGATCTGTT
>TTGGCACGAGTGACTCCTTAGAC(1)
GTCTAAGGAGTCACTCGTGCCAA
>TTAAGCATGACTTAGACTATC(2)
TTAAGCATGACTTAGACTATC
>CAAAGGAACCGTGAGCTCAACT(1)
CAAAGGAACCGTGAGCTCAACT
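
One caveat with a plain `tr " " "\n"`: every single space becomes a newline, so a doubled or trailing space in the input would leave empty lines in the output. If the real file might contain such stray spaces (an assumption; the sample in the question does not obviously have them), a hedged variant is to squeeze repeats with `-s`. The function name `split_squeezed` is hypothetical, and the demo input is two records from the question with extra spaces deliberately added.

```shell
# Hypothetical helper: turn spaces into newlines, and squeeze runs of
# newlines in the output so doubled or trailing spaces leave no blank lines.
split_squeezed() {
    tr -s ' ' '\n'
}

# Demo: first record has a doubled space and a trailing space added on purpose.
printf '%s\n' \
    '>TTGGCACGAGTGACTCCTTAGAC(1)  GTCTAAGGAGTCACTCGTGCCAA ' \
    '>CAAAGGAACCGTGAGCTCAACT(1) CAAAGGAACCGTGAGCTCAACT' | split_squeezed
```

Note that `-s` also collapses any genuinely empty lines already present in the input, which is usually harmless for FASTA but worth knowing.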