Question: How to extract a particular region from a nucleotide contig?
0
12 months ago by
saadleeshehreen70 wrote:

Hi,

I have downloaded a contig from NCBI. It is around 100 kbp long and has an integrated prophage region. The prophage region is between 54501-90604 bp. How can I extract only 54501-90604 bp from this contig?

Cheers

sequence assembly • 264 views
2
12 months ago by
Belgium
WouterDeCoster44k wrote:

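Use samtools faidx to index the FASTA and retrieve those coordinates, for example: samtools faidx input.fasta SEQNAME:54501-90604, where SEQNAME is the sequence ID from the header line and input.fasta is a placeholder file name. Region coordinates are 1-based and inclusive, so the numbers can be used as given.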

2
12 months ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum131k wrote:
grep -v -E '^>|^$' input.fasta | tr -d '\n' | cut -c 54501-90604 | fold -w 60
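For anyone who wants to see what each stage of that pipeline does, here is a rough pure-Python equivalent. It is only a sketch: `extract_region` and the toy contig are made-up names for illustration, and like the shell version it concatenates every sequence in the input, so it assumes a single-record FASTA.

```python
def extract_region(fasta_text, start, end, width=60):
    """Return 1-based, inclusive positions start..end, folded to `width` columns."""
    # Like grep -v: drop header lines ('>') and blank lines.
    # Like tr -d '\n': join what is left into one long string.
    seq = "".join(
        line.strip()
        for line in fasta_text.splitlines()
        if line.strip() and not line.startswith(">")
    )
    # cut -c is 1-based and inclusive, so only the start is shifted by one.
    region = seq[start - 1:end]
    # Like fold -w: re-wrap the result into fixed-width lines.
    return [region[i:i + width] for i in range(0, len(region), width)]


# Toy input (hypothetical), small enough to check by eye.
toy = ">toy_contig\nACGTAC\nGTTTGA\n\nCCAT\n"
print("\n".join(extract_region(toy, 5, 12, width=4)))
```

For the real contig the call would be `extract_region(open("input.fasta").read(), 54501, 90604)`. Note that, as with the one-liner, the output has no header line, so add one if the result needs to be valid FASTA.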
1
12 months ago by
United Kingdom
Joe18k wrote:

The easiest (for me) would be:

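from Bio import SeqIO

# 'fasta' is assumed here; use 'genbank' if that is the format you downloaded
rec = SeqIO.read('/path/to/file', 'fasta')  # assuming only one sequence in the file
# 1-based inclusive 54501-90604 is the 0-based, end-exclusive slice 54500:90604
prophage = rec[54500:90604]

# write sequence out or do whatever analysis next.

Double check the coordinates. Python indexing starts at 0 and a slice excludes its end position, so only the start needs 1 subtracted: positions 54501-90604 correspond to rec[54500:90604], which should be 36,104 bp long.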
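The coordinate question is easy to check on a toy string before touching the real contig. The sketch below uses only plain Python strings, and `to_python_slice` is a hypothetical helper name, not part of Biopython. It assumes the prophage coordinates are 1-based and inclusive, which is the usual convention for NCBI annotations.

```python
def to_python_slice(start, end):
    """Convert 1-based inclusive coordinates to a 0-based, end-exclusive slice."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    # Only the start moves: the end stays the same because slices exclude it.
    return slice(start - 1, end)


# Toy check: 1-based positions 3..6 of AACCGGTT.
toy = "AACCGGTT"
region = toy[to_python_slice(3, 6)]
print(region, len(region))

# The coordinates from the question.
s = to_python_slice(54501, 90604)
print(s.start, s.stop, s.stop - s.start)
```

The same slice object should also work on a Biopython record, e.g. `rec[to_python_slice(54501, 90604)]`, and checking that the length of the result equals end - start + 1 is a quick sanity test.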
