How to extract specific samples (by ID) from Fasta file to new fasta file in R
2.4 years ago
Katya • 0

I have a question about extracting sequences from a multi-FASTA file by their sequence headers (IDs) and writing them to a new FASTA file in R. I have been experimenting and searching all over the internet for a solution, but surprisingly, nothing really matches what I want to do.

code R • 757 views
2.4 years ago
ATpoint 81k
#/ In R using Biostrings:
library(Biostrings)

fa <- readDNAStringSet("~/foo.fa")
fa
#> DNAStringSet object of length 3:
#>     width seq                names
#> [1]     4 ATCG               chr1
#> [2]    12 GGATGTGTGTCA       chr2
#> [3]     6 GTAGCT             chr3

#/ Say we want chr2 and chr3:
fa_new <- fa[c("chr2", "chr3")]
fa_new
#> DNAStringSet object of length 2:
#>     width seq                names
#> [1]    12 GGATGTGTGTCA       chr2
#> [2]     6 GTAGCT             chr3

#/ write back to a file:
writeXStringSet(fa_new, "~/out.fa")
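
Subsetting by name as above only works when the names match the wanted IDs exactly. readDNAStringSet() uses the full header line as the name, so headers that carry a description after the ID will not match. A minimal sketch of one way to handle that, assuming the ID is the first whitespace-delimited token of each header; the toy sequences, the `wanted` vector and the output path are made up for illustration:

```r
library(Biostrings)

# Toy data standing in for readDNAStringSet("~/foo.fa");
# two of the headers carry extra description text after the ID.
fa <- DNAStringSet(c("ATCG", "GGATGTGTGTCA", "GTAGCT"))
names(fa) <- c("chr1 first record", "chr2 second record", "chr3")

# Hypothetical IDs to extract; in practice these could come from
# something like readLines("ids.txt"). "chr9" is deliberately absent.
wanted <- c("chr2", "chr3", "chr9")

# Assumption: the ID is everything before the first whitespace in the header.
ids <- sub("\\s.*$", "", names(fa))

# Collect IDs that are not in the file rather than erroring on them.
missing_ids <- setdiff(wanted, ids)

# match() keeps the order of `wanted`; drop the entries that were not found.
idx <- match(wanted, ids)
fa_new <- fa[idx[!is.na(idx)]]

# Write the subset to a new FASTA file (temporary path used here).
out_file <- tempfile(fileext = ".fa")
writeXStringSet(fa_new, out_file)
```

If the headers should be reduced to the bare ID in the output as well, `names(fa_new) <- sub("\\s.*$", "", names(fa_new))` before writing would do that. Note that match() returns only the first hit per ID, so duplicated IDs in the file would need separate handling.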