Question: sequence in different line
11 months ago by
Bulbul Ahmed20
United States
Bulbul Ahmed20 wrote:

I have a FASTA file in this format (header and sequence on one line):

>accession1     GGGGAGCTACGGCAGCGGCGGCGGGGTGCTGCCGCTGGCGTCGCTTAA
>accession2     TTCCGGTAGAAAATCCATTATTGCCAATGGAAGAAGTGA

How can I convert it into the format below (sequence on a separate line), using a Perl script or any other way?

>accession1     
GGGGAGCTACGGCAGCGGCGGCGGGGTGCTGCCGCTGGCGTCGCTTAA
>accession2     
TTCCGGTAGAAAATCCATTATTGCCAATGGAAGAAGTGA
rna-seq perl • 733 views
modified 11 months ago by genomax52k • written 11 months ago by Bulbul Ahmed20

Substitute the tab or space with a newline, using Unix tr.

written 11 months ago by Ashutosh Pandey11k

Which command should I use on Red Hat?

written 11 months ago by Bulbul Ahmed20
cat yourinput | tr '\t' '\n' > youroutput.fa

Although we can't see which whitespace is between your accession identifier and the actual sequence.
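
One way to check which whitespace is there, and to sidestep the question entirely, is sketched below. The sample records and the file names yourinput and youroutput.fa are placeholders, and the awk step assumes each line holds exactly two whitespace-separated fields (identifier, then sequence).

```shell
# Hypothetical sample input: header and sequence separated by a single tab.
printf '>accession1\tGGGGAGCT\n>accession2\tTTCCGGTA\n' > yourinput

# Dump the first line character by character; a tab is shown as \t.
head -n 1 yourinput | od -c

# awk's default field splitting treats any run of spaces or tabs as one
# separator, so this works whichever whitespace the file actually contains.
# Assumption: exactly two fields per line (identifier and sequence).
awk '{ print $1; print $2 }' yourinput > youroutput.fa
cat youroutput.fa
```

If the identifier itself can contain spaces, this two-field assumption no longer holds and the split would need to be done on the last field instead.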

modified 11 months ago • written 11 months ago by WouterDeCoster30k

Thank you so much, sir. I will try this; hopefully it will work.

ADD REPLYlink written 11 months ago by Bulbul Ahmed20

Maybe sed -r 's#\s+#\n#' input >output then?
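
A rough sketch of why \s+ may help when the separator is several spaces rather than one tab: sed replaces the whole run with a single newline, while tr maps every space to its own newline. The sample records and the file names input, output and output_tr are made up, and GNU sed is assumed, since -r, \s and \n in the replacement are GNU extensions.

```shell
# Hypothetical sample: five spaces between identifier and sequence.
printf '>accession1     GGGGAGCT\n>accession2     TTCCGGTA\n' > input

# GNU sed assumed. Without the g flag, only the first whitespace run
# on each line is replaced, which is the identifier/sequence gap.
sed -r 's#\s+#\n#' input > output

# For comparison: tr turns each of the five spaces into a newline,
# which leaves blank lines between identifier and sequence.
tr ' ' '\n' < input > output_tr

wc -l < output      # line count of the sed result
wc -l < output_tr   # line count of the tr result
```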

modified 11 months ago • written 11 months ago by Ram16k

Bah, I prefer:

sed -r 's|\s+|\n|' input >output
written 11 months ago by WouterDeCoster30k

So, a different delimiter?

written 11 months ago by Ram16k

Exactly ;-)

[just some slight Friday night trolling]

written 11 months ago by WouterDeCoster30k

Strictly speaking, this is not really bioinformatics.

written 11 months ago by Ram16k

I don't know... it seems like an awful lot of bioinformatics is just reformatting text files :)

Personally, in this case, I would copy and paste into Notepad++, which allows search/replace of \t with \n. But then I had never seen "tr" before, so I learned something from the thread!

written 11 months ago by Brian Bushnell15k

tr is good, but I use it more for squeezing consecutive whitespace (tr -s) or for quick deletion (tr -d) than for replacing. I prefer sed for all replace operations, as it offers finer-grained control.
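
For anyone who has not used those two flags, a small sketch with made-up input strings and output file names:

```shell
# -s squeezes each run of the listed character down to one occurrence.
printf 'a    b   c\n' | tr -s ' ' > squeezed.txt

# -d deletes every occurrence of the listed character, here carriage
# returns such as those left by Windows line endings.
printf 'ACGT\r\nTTGA\r\n' | tr -d '\r' > unix.txt

# -s with two sets translates first, then squeezes the result, so a run
# of spaces ends up as a single newline.
printf '>accession1     GGGGAGCT\n' | tr -s ' ' '\n' > split.fa

cat squeezed.txt unix.txt split.fa
```

Note that the last form would also collapse any blank lines already present in the input, since it squeezes all repeated newlines.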

written 11 months ago by Ram16k

I have fasta file in this format (one line)

Then it's not a FASTA. While it's not a bioinformatics question per se, the OP is at least using a file with sequence information.

modified 11 months ago • written 11 months ago by st.ph.n2.3k

Yeah, it satisfies that, but really? A find+replace operation?

written 11 months ago by Ram16k