How to split a protein FASTA file into columns in R?
16 months ago

Using R, how can I split a protein FASTA file into three columns: the ID (such as NP_001007096.1) in the first column, the rest of the header (the descriptive name) in the second column, and the sequence in the third column?

column Fasta R • 799 views

Provide an example of the headers.

NP_001343651.1 Uncharacterized protein CELE_F07F6.2, partial [Caenorhabditis elegans]
IEAKCNLRLHLRWYSVGLIFFSFIPIYYSIIVCQPQQFKIDGFELINPVFNKHHSTRSCTSATKSLQNGLIALFIFYALKIYKKVYLIVLYIILIIHFGFEIRNAKSETSRKYIAISTREMFMYYVELLLLYFQNLLLLPYICGGYFLIRHVHRIPSKEEVDRQTLKFKEEARRIKRLMIEEDWHVKEANEDVKNKIEQEGLKRKDMEFEEQLYHLRIEKVKRREQVLKQKLEEKKAKRRQNAERRKKRREIAMEQREQ

Is this a FASTA? Will the first line always start with '>'? Is the sequence always on a single line regardless of length or will it wrap?
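
Depending on the answers, something like the following base R sketch might work. It assumes every record starts with a header line beginning with '>', that the ID is everything up to the first whitespace in the header, and that the sequence may wrap over several lines. The function name `fasta_to_df` and the column names are made up for illustration, and the example sequence is shortened from the one posted above.

```r
# Hypothetical helper: parse FASTA text lines into a three-column data frame.
# Assumes headers start with ">" and the ID ends at the first whitespace.
fasta_to_df <- function(lines) {
  lines <- trimws(lines)
  lines <- lines[nzchar(lines)]                 # drop blank lines
  is_header <- startsWith(lines, ">")
  if (length(lines) == 0 || !is_header[1]) {
    stop("input does not start with a '>' header line")
  }
  record <- cumsum(is_header)                   # record number of every line
  headers <- sub("^>", "", lines[is_header])
  # Group sequence lines by record; records without sequence yield ""
  groups <- split(lines[!is_header],
                  factor(record[!is_header], levels = seq_along(headers)))
  seqs <- vapply(groups, paste, character(1), collapse = "")
  data.frame(
    id          = sub("[[:space:]].*$", "", headers),
    description = trimws(sub("^[^[:space:]]+", "", headers)),
    sequence    = unname(seqs),
    stringsAsFactors = FALSE
  )
}

# Example with a wrapped (and truncated) sequence; for a real file,
# pass readLines("your_file.fasta") instead (file name is a placeholder).
example_lines <- c(
  ">NP_001343651.1 Uncharacterized protein CELE_F07F6.2, partial [Caenorhabditis elegans]",
  "IEAKCNLRLHLRWYSVGLIF",
  "FSFIPIYYSIIVCQPQQFKI"
)
df <- fasta_to_df(example_lines)
print(df$id)
```

If the headers really have no leading '>', as in the pasted example, this would stop with an error and the header detection would need a different rule. If Bioconductor is an option, `Biostrings::readAAStringSet()` reads protein FASTA files directly and `names()` on the result returns the full header lines, which could then be split the same way.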


In addition, if you don't preserve the structure of a FASTA file, you will end up with a worthless FASTA file. The line after the header must be the sequence itself; you cannot arbitrarily put any other information there.
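
To illustrate the point about structure: a three-column table is fine for analysis, but it is no longer FASTA, so it has to be converted back before other tools can read it. A minimal base R sketch, assuming a data frame with columns named `id`, `description` and `sequence` (hypothetical names, as is the function `df_to_fasta`), could look like this:

```r
# Hypothetical helper: turn a three-column data frame back into FASTA lines.
# Each record becomes one ">" header line followed by the wrapped sequence.
df_to_fasta <- function(df, width = 60) {
  wrap_seq <- function(s) {
    starts <- seq(1, max(nchar(s), 1), by = width)
    substring(s, starts, pmin(starts + width - 1, nchar(s)))
  }
  unlist(lapply(seq_len(nrow(df)), function(i) {
    header <- trimws(paste0(">", df$id[i], " ", df$description[i]))
    c(header, wrap_seq(df$sequence[i]))
  }))
}

# Example record (sequence shortened for illustration)
records <- data.frame(
  id          = "NP_001343651.1",
  description = "Uncharacterized protein CELE_F07F6.2, partial [Caenorhabditis elegans]",
  sequence    = "IEAKCNLRLHLRWYSVGLIFFSFIPIYYSIIVCQPQQFKI",
  stringsAsFactors = FALSE
)
fasta_lines <- df_to_fasta(records, width = 20)
print(fasta_lines)
# writeLines(fasta_lines, "out.fasta")  # placeholder file name
```

The wrap width of 60 is only a common convention, not a requirement; what matters is that each header line is followed by nothing but sequence lines.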
