How to extract variable nucleotide regions from a list of contigs
3.6 years ago
kayrouz.1 • 0

I have a list of about 6,000 NCBI contig accession numbers, and I'd like to extract a specific 30 kb region from each contig. In a separate file, I have a list of "begin" and "end" indices that mark the region of interest for each contig. Is there a way to retrieve a FASTA file of these trimmed contigs using an Entrez query? I'm not a very skilled programmer, so I would have just put the sequences in an Excel spreadsheet and trimmed them there, but the sequence strings are too long to fit in an Excel cell. Is there a simple way to do this via E-utilities?
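For the E-utilities route asked about here, a minimal sketch follows. It relies on the `seq_start` and `seq_stop` parameters that `efetch` accepts for the `nuccore` database, so only the requested region is downloaded. The tab-separated input layout (accession, begin, end), the function names, and the assumption that the indices are 1-based and inclusive are all illustrative and not taken from the post.

```python
import time
from urllib.parse import urlencode
from urllib.request import urlopen

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_region_url(accession, begin, end):
    """Build an efetch URL for one region (begin/end assumed 1-based, inclusive)."""
    params = {
        "db": "nuccore",
        "id": accession,
        "rettype": "fasta",
        "retmode": "text",
        "seq_start": begin,
        "seq_stop": end,
    }
    return EFETCH + "?" + urlencode(params)


def parse_regions(lines):
    """Parse 'accession<TAB>begin<TAB>end' lines (hypothetical file layout)."""
    regions = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        accession, begin, end = line.split("\t")
        begin, end = int(begin), int(end)
        if begin > end:
            raise ValueError(f"begin > end for {accession}")
        regions.append((accession, begin, end))
    return regions


def fetch_regions(regions, out_path, delay=0.4):
    """Download each region and append it to one FASTA file.

    The pause keeps the request rate under NCBI's limit of three
    requests per second for clients without an API key.
    """
    with open(out_path, "w") as out:
        for accession, begin, end in regions:
            with urlopen(efetch_region_url(accession, begin, end)) as resp:
                out.write(resp.read().decode())
            time.sleep(delay)
```

With a file such as `regions.tsv` in that layout, `fetch_regions(parse_regions(open("regions.tsv")), "trimmed.fa")` would write all trimmed contigs to one file. At roughly 6,000 requests and a 0.4 s pause, the run would take on the order of 40 minutes plus download time.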

blast contig extract refseq • 1.1k views

Do you have a reference genome? If you do, you could use bedtools getfasta.
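To use bedtools getfasta, the begin/end list first has to be in BED format. The sketch below does that conversion, assuming the indices in the asker's file are 1-based and inclusive (BED uses a 0-based start and an exclusive end) and that the contigs are already in a local FASTA file whose sequence names match the accessions. The file names and function names are illustrative.

```python
def to_bed_line(accession, begin, end):
    """Convert a 1-based inclusive region to a BED line (0-based start, exclusive end)."""
    if begin < 1 or end < begin:
        raise ValueError(f"invalid region for {accession}: {begin}-{end}")
    return f"{accession}\t{begin - 1}\t{end}"


def write_bed(regions, bed_path):
    """Write (accession, begin, end) tuples to a BED file, one region per line."""
    with open(bed_path, "w") as bed:
        for accession, begin, end in regions:
            bed.write(to_bed_line(accession, begin, end) + "\n")


# Once regions.bed exists, the extraction itself is a single command:
#   bedtools getfasta -fi contigs.fa -bed regions.bed -fo trimmed.fa
```

If the indices turn out to be 0-based already, the `begin - 1` adjustment should be dropped.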

3.6 years ago

A to-the-point explanation of bedtools getfasta is here
