How to extract FASTA headers in R
19 months ago
WUSCHEL ▴ 810

I have downloaded a reference UniProtKB FASTA file. How can I extract only the FASTA header of each gene (row-wise, one header per row) into a CSV file using R?

r • 1.1k views

I came across this handy tool, CSV⬌FASTA: https://cdcgov.github.io/CSV2FASTA/


This does not answer your question, because it does not use R. Please accept the answer that addresses your question, and do not add answers that break a stated requirement without saying so.

I'm moving this answer to a comment now.

19 months ago
cfos4698 ★ 1.1k

Here's one example using R:

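library(seqinr)
# UniProtKB files contain protein sequences, so read them as amino acids
fa <- read.fasta(file = "samples.fasta", seqtype = "AA")
con <- file("output.csv", "w")
writeLines("genes", con)     # column header
writeLines(names(fa), con)   # one sequence name per row, without the leading ">"
close(con)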
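
As far as I know, names() on the result of read.fasta() keeps only the first word of each header by default. If the complete header lines are wanted, a base R sketch along these lines should work without seqinr. It assumes every header line starts with ">"; the helper name extract_fasta_headers and the demo records are made up for illustration.

```r
# Hypothetical helper: return FASTA header lines without the leading ">"
extract_fasta_headers <- function(path) {
  lines <- readLines(path)
  headers <- lines[startsWith(lines, ">")]
  sub("^>", "", headers)
}

# Tiny demo file so the sketch runs on its own (made-up records)
demo <- tempfile(fileext = ".fasta")
writeLines(c(
  ">sp|X00001|DEMO1_HUMAN Demo protein, isoform 1 OS=Homo sapiens OX=9606 GN=DEMO1 PE=1 SV=1",
  "MKTAYIAKQR",
  ">sp|X00002|DEMO2_HUMAN Another demo protein OS=Homo sapiens OX=9606 GN=DEMO2 PE=1 SV=2",
  "MSDNELLKQA"
), demo)

headers <- extract_fasta_headers(demo)

# write.csv quotes strings, so a comma inside a header stays in one column
out <- tempfile(fileext = ".csv")
write.csv(data.frame(header = headers), out, row.names = FALSE)
```

To use it on a real file, replace demo with the path to the downloaded FASTA (for example "samples.fasta") and out with "output.csv".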

However, is R necessary in this case? Can you just use simpler UNIX tools?

echo "genes" > output.csv
grep "^>" samples.fasta | cut -c2- >> output.csv

Edit: if you want only specific parts of the FASTA headers, you will need to say which parts in your question.
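
For what it's worth, UniProtKB headers normally follow the layout ">db|Accession|EntryName ProteinName OS=... OX=... GN=... PE=... SV=...", so individual fields can be pulled out with a few string operations. Below is a rough base R sketch, assuming that layout holds for every record. The function name parse_uniprot_headers, the column names, and the sample headers are invented for illustration.

```r
# Hypothetical helper: split UniProtKB-style headers into a data frame
parse_uniprot_headers <- function(headers) {
  headers <- sub("^>", "", headers)          # leading ">" is optional
  ids <- sub(" .*$", "", headers)            # "db|Accession|EntryName"
  parts <- strsplit(ids, "|", fixed = TRUE)
  # GN= is optional in UniProtKB headers, so missing gene names become NA
  has_gene <- grepl(" GN=", headers, fixed = TRUE)
  gene <- ifelse(has_gene,
                 sub("^.* GN=([^ ]+).*$", "\\1", headers),
                 NA_character_)
  data.frame(
    db         = vapply(parts, `[`, character(1), 1),
    accession  = vapply(parts, `[`, character(1), 2),
    entry_name = vapply(parts, `[`, character(1), 3),
    gene       = gene,
    stringsAsFactors = FALSE
  )
}

# Made-up example headers, one with and one without a gene name
demo_headers <- c(
  ">sp|X00001|DEMO1_HUMAN Demo protein OS=Homo sapiens OX=9606 GN=DEMO1 PE=1 SV=1",
  ">tr|X00002|X00002_HUMAN Uncharacterized protein OS=Homo sapiens OX=9606 PE=4 SV=1"
)
parsed <- parse_uniprot_headers(demo_headers)

# One row per header, one column per extracted field
csv_path <- tempfile(fileext = ".csv")
write.csv(parsed, csv_path, row.names = FALSE)
```

Headers that do not follow the UniProtKB layout will give NA or odd values in some columns, so it is worth checking the result before relying on it.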
