How to extract variable nucleotide regions from a list of contigs
3.6 years ago
kayrouz.1 • 0

I have a list of about 6,000 NCBI contig accession numbers, and I'd like to extract a specific 30 kb region from each contig. In a separate file, I have a list of "begin" and "end" indices that define the region of interest for each contig. Is there a way to retrieve a FASTA file of these trimmed contigs using an Entrez query? Since I'm not a very skilled programmer, I would have just put the sequences in an Excel spreadsheet and trimmed them accordingly, but the sequence strings are too long to fit in an Excel cell. Is there a simple way to do this via E-utilities?

blast contig extract refseq • 1.1k views

Do you have a reference genome? If you do, you could use bedtools getfasta, which extracts the intervals listed in a BED file from a FASTA file.
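For anyone trying this route, here is a minimal Python sketch for turning a begin/end list into the BED file that bedtools getfasta expects. It assumes the begin/end indices are 1-based and inclusive (the question does not say), and the accession and file names are placeholders of mine, not from the original post. BED uses a 0-based start and an exclusive end, so only the start is shifted.

```python
import csv


def to_bed_row(accession, begin, end):
    """Convert a 1-based inclusive region to a BED row.

    Assumption: begin/end are 1-based and inclusive. BED intervals use a
    0-based start and an exclusive end, so only the start changes.
    """
    if begin < 1 or end < begin:
        raise ValueError(f"invalid region for {accession}: {begin}-{end}")
    return (accession, begin - 1, end)


def write_bed(regions, bed_path):
    """Write (accession, begin, end) tuples as a tab-separated BED file."""
    with open(bed_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        for accession, begin, end in regions:
            writer.writerow(to_bed_row(accession, int(begin), int(end)))


if __name__ == "__main__":
    # Hypothetical accession and coordinates, for illustration only.
    write_bed([("CONTIG_ACCESSION.1", 1001, 31000)], "regions.bed")
```

With the contigs already downloaded into one FASTA file, the extraction would then be something like `bedtools getfasta -fi contigs.fasta -bed regions.bed -fo trimmed.fasta`. The first BED column has to match the sequence names in the FASTA headers exactly.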

3.6 years ago

A to-the-point explanation of bedtools getfasta is here
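Since the question asked about E-utilities specifically: EFetch accepts `seq_start` and `seq_stop` parameters (1-based coordinates), so the trimmed region can be requested directly without downloading whole contigs. The following is only a rough sketch under a few assumptions: the contigs are in the `nuccore` database, the begin/end values are 1-based and inclusive, and the function names, accession, and output file name are placeholders of mine. The pause between requests is there because NCBI limits clients without an API key to about three requests per second.

```python
import time
import urllib.parse
import urllib.request

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession, begin, end):
    """Build an EFetch URL for a FASTA sub-sequence of one nucleotide record.

    Assumption: begin/end are 1-based inclusive coordinates.
    """
    params = {
        "db": "nuccore",
        "id": accession,
        "rettype": "fasta",
        "retmode": "text",
        "seq_start": begin,
        "seq_stop": end,
    }
    return EFETCH_URL + "?" + urllib.parse.urlencode(params)


def fetch_regions(regions, out_path, delay=0.4):
    """Download each (accession, begin, end) region and append it to one FASTA file."""
    with open(out_path, "w") as out:
        for accession, begin, end in regions:
            url = build_efetch_url(accession, begin, end)
            with urllib.request.urlopen(url) as response:
                out.write(response.read().decode("utf-8"))
            # Stay under NCBI's request rate limit for clients without an API key.
            time.sleep(delay)


if __name__ == "__main__":
    # Hypothetical accession and coordinates, for illustration only.
    print(build_efetch_url("CONTIG_ACCESSION.1", 1001, 31000))
```

Calling `fetch_regions` with the full list of about 6,000 regions would take roughly 40 minutes at this delay. There is no retry logic here, so a failed request stops the run.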