Question: Put each sequence on a separate line from its header
9 weeks ago by
Bulbul Ahmed20
United States
Bulbul Ahmed20 wrote:

I have a FASTA file in this format (one line):

>accession1     GGGGAGCTACGGCAGCGGCGGCGGGGTGCTGCCGCTGGCGTCGCTTAA
>accession2     TTCCGGTAGAAAATCCATTATTGCCAATGGAAGAAGTGA

How can I convert it into the format below (sequence on a separate line), using a Perl script or any other way?

>accession1     
GGGGAGCTACGGCAGCGGCGGCGGGGTGCTGCCGCTGGCGTCGCTTAA
>accession2     
TTCCGGTAGAAAATCCATTATTGCCAATGGAAGAAGTGA
rna-seq perl • 366 views
modified 9 weeks ago by genomax34k • written 9 weeks ago by Bulbul Ahmed20

Substitute the tab or space with a newline, using Unix tr.

written 9 weeks ago by Ashutosh Pandey11k

Which command should I use on Red Hat?

written 9 weeks ago by Bulbul Ahmed20
cat yourinput | tr '\t' '\n' > youroutput.fa

This assumes a tab, though; we can't see which whitespace is between your accession identifier and the actual sequence.

modified 9 weeks ago • written 9 weeks ago by WouterDeCoster22k
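
If the separator might be spaces rather than a tab, one possible variant is to translate both characters and squeeze the result with tr -s. This is only a sketch, not the command posted above: the two records are copied from the question, the variable name is made up for illustration, and it assumes the header holds nothing but the accession (a description containing spaces would be split as well).

```shell
# Two records in the one-line layout from the question.
# The first uses spaces as the separator, the second a tab.
split=$(printf '>accession1     GGGGAGCTACGGCAGCGGCGGCGGGGTGCTGCCGCTGGCGTCGCTTAA\n>accession2\tTTCCGGTAGAAAATCCATTATTGCCAATGGAAGAAGTGA\n' \
  | tr -s ' \t' '\n\n')   # map space and tab to newline, then squeeze repeated newlines

printf '%s\n' "$split"
# >accession1
# GGGGAGCTACGGCAGCGGCGGCGGGGTGCTGCCGCTGGCGTCGCTTAA
# >accession2
# TTCCGGTAGAAAATCCATTATTGCCAATGGAAGAAGTGA
```

Because -s squeezes repeated newlines, a run of several spaces yields a single line break, and any blank lines in the input disappear as a side effect.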

Thank you so much, sir. I will try this; hopefully it will work.

written 9 weeks ago by Bulbul Ahmed20

Maybe sed -r 's#\s+#\n#' input >output then?

modified 9 weeks ago • written 9 weeks ago by Ram12k

Bah, I prefer:

sed -r 's|\s+|\n|' input >output
written 9 weeks ago by WouterDeCoster22k
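
For readers unsure what the two sed commands above do differently, a small check on the records from the question suggests they behave the same, since # and | are only alternative delimiters for the s command. The sketch assumes GNU sed (as far as I know, the -r flag, \s, and \n in the replacement are GNU behaviours that other sed implementations may not accept); the variable names are hypothetical.

```shell
# Records in the one-line layout from the question.
input='>accession1     GGGGAGCTACGGCAGCGGCGGCGGGGTGCTGCCGCTGGCGTCGCTTAA
>accession2     TTCCGGTAGAAAATCCATTATTGCCAATGGAAGAAGTGA'

# Same substitution, two different delimiters.
with_hash=$(printf '%s\n' "$input" | sed -r 's#\s+#\n#')
with_pipe=$(printf '%s\n' "$input" | sed -r 's|\s+|\n|')

# Report whether the two outputs match, then show one of them.
if [ "$with_hash" = "$with_pipe" ]; then
  echo "identical output"
fi
printf '%s\n' "$with_pipe"
```

Without the g flag, only the first run of whitespace on each line is replaced, which is what the one-line layout needs: the header and the sequence end up on separate lines.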

So, a different delimiter?

written 9 weeks ago by Ram12k

Exactly ;-)

[just some slight Friday night trolling]

written 9 weeks ago by WouterDeCoster22k

Strictly speaking, this is not really bioinformatics.

written 9 weeks ago by Ram12k

I don't know... it seems like an awful lot of bioinformatics is just reformatting text files :)

Personally, in this case, I would copy and paste into Notepad++, which allows searching for \t and replacing it with \n. But then I had never seen "tr" before, so I learned something from the thread!

written 9 weeks ago by Brian Bushnell14k

tr is good, but I use it more for squeezing consecutive whitespace (tr -s) or for quick deletion (tr -d) than for replacing. I prefer sed for all replace operations, as it gives finer-grained control.

written 9 weeks ago by Ram12k
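
The two tr uses mentioned above can be illustrated roughly as follows. The inputs are invented for the example (a shortened record, and a line with a Windows-style carriage return), as are the variable names.

```shell
# tr -s: squeeze a run of spaces down to a single space.
squeezed=$(printf '>accession1     GGGGAGCT\n' | tr -s ' ')
printf '%s\n' "$squeezed"   # >accession1 GGGGAGCT

# tr -d: delete every occurrence of a character, here the carriage return.
stripped=$(printf 'GGGGAGCT\r\n' | tr -d '\r')
printf '%s\n' "$stripped"   # GGGGAGCT
```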

"I have a FASTA file in this format (one line)"

Then it's not a FASTA. While it's not a bioinformatics question per se, the OP is at least using a file with sequence information.

modified 9 weeks ago • written 9 weeks ago by st.ph.n1.8k

Yeah, it satisfies that, but really? A find+replace operation?

written 9 weeks ago by Ram12k