Question: Putting each sequence on a separate line
5 months ago, Bulbul Ahmed (United States) wrote:

I have a FASTA file in this format (header and sequence on one line):

>accession1     GGGGAGCTACGGCAGCGGCGGCGGGGTGCTGCCGCTGGCGTCGCTTAA
>accession2     TTCCGGTAGAAAATCCATTATTGCCAATGGAAGAAGTGA

How can I convert it into the format below (sequence on a separate line) using a Perl script or any other way?

>accession1     
GGGGAGCTACGGCAGCGGCGGCGGGGTGCTGCCGCTGGCGTCGCTTAA
>accession2     
TTCCGGTAGAAAATCCATTATTGCCAATGGAAGAAGTGA
rna-seq perl • 516 views
ADD COMMENTlink modified 5 months ago by genomax40k • written 5 months ago by Bulbul Ahmed20

Substitute the tab or space with a newline using Unix tr.

ADD REPLYlink written 5 months ago by Ashutosh Pandey11k

Which command should I use on Red Hat?

ADD REPLYlink written 5 months ago by Bulbul Ahmed20
cat yourinput | tr '\t' '\n' > youroutput.fa

Note, though, that we can't see which whitespace character separates your accession identifier from the actual sequence, so this only works if it is a tab.

ADD REPLYlink modified 5 months ago • written 5 months ago by WouterDeCoster24k
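
Since the separator in the question could be a tab or a run of spaces, here is a minimal sketch of how one might check it and then split on either. The sample line and the function name split_on_whitespace are made up for illustration, and the approach assumes the header itself contains no spaces.

```shell
# Hypothetical sample line; the real file may use a tab or spaces here.
sample=$(printf '>accession1\tGGGGAGCTACGG')

# od -c prints every byte: a tab shows up as \t, spaces as blanks.
printf '%s\n' "$sample" | od -c

# Turn tabs and spaces into newlines; -s then squeezes each run of
# newlines into one. Assumes headers contain no internal spaces.
split_on_whitespace() {
  tr -s ' \t' '\n\n'
}

printf '%s\n' "$sample" | split_on_whitespace
```

On a real file, something like head -n 1 yourinput | od -c would show the separator, and split_on_whitespace < yourinput > youroutput.fa would do the conversion. Note that squeezing newlines also drops any blank lines in the input.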

Thank you so much, sir. I will try this; hopefully it will work.

ADD REPLYlink written 5 months ago by Bulbul Ahmed20

Maybe sed -r 's#\s+#\n#' input >output then?

ADD REPLYlink modified 5 months ago • written 5 months ago by Ram13k

Bah, I prefer:

sed -r 's|\s+|\n|' input >output
ADD REPLYlink written 5 months ago by WouterDeCoster24k

So, a different delimiter?

ADD REPLYlink written 5 months ago by Ram13k

Exactly ;-)

[just some slight Friday night trolling]

ADD REPLYlink written 5 months ago by WouterDeCoster24k
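
The two sed commands above differ only in the delimiter character, which sed lets you choose freely. As far as I know, \s and \n in the replacement are GNU sed behaviour, so here is a hedged sketch of a variant that should also run on BSD sed. The function name split_first_ws is made up for illustration.

```shell
# -E and [[:space:]] are understood by both GNU and BSD sed.
# A backslash followed by a literal newline puts a newline into
# the replacement text.
split_first_ws() {
  sed -E 's/[[:space:]]+/\
/'
}

# Hypothetical sample input with spaces and a tab as separators.
printf '>accession1     GGGG\n>accession2\tTTCC\n' | split_first_ws
```

Without the g flag, only the first run of whitespace on each line is replaced. That suits the two-column layout in the question, but a header with a description after the identifier would be split after the identifier rather than before the sequence.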

Strictly speaking, this is not really bioinformatics.

ADD REPLYlink written 5 months ago by Ram13k

I don't know... it seems like an awful lot of bioinformatics is just reformatting text files :)

Personally, in this case, I would copy and paste into Notepad++, which allows search/replace of \t with \n. But then I had never seen "tr" before, so I learned something from the thread!

ADD REPLYlink written 5 months ago by Brian Bushnell15k

tr is good, but I use it more for squeezing consecutive whitespace (tr -s) or for quick deletion (tr -d) than for replacing. I prefer sed for all replace operations, as it gives finer-grained control.

ADD REPLYlink written 5 months ago by Ram13k
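
For anyone who has not used the two tr modes mentioned above, a small sketch follows. The function names and the sample inputs (including the Windows-style carriage return) are made up for illustration.

```shell
# tr -s squeezes each run of the listed character into a single one.
squeeze_spaces() {
  tr -s ' '
}

# tr -d deletes every occurrence of the listed characters, here the
# carriage returns left behind by Windows line endings.
strip_cr() {
  tr -d '\r'
}

printf '>accession1     GGGG\n' | squeeze_spaces
printf '>accession2\r\n' | strip_cr
```

The first call prints the header and sequence separated by a single space; the second prints the header without the trailing carriage return.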

I have fasta file in this format (one line)

Then it's not a FASTA. While it's not a bioinformatics question per se, the OP is at least using a file with sequence information.

ADD REPLYlink modified 5 months ago • written 5 months ago by st.ph.n2.0k

Yeah, it satisfies that, but really? A find+replace operation?

ADD REPLYlink written 5 months ago by Ram13k