Identify the coordinates of coding and non-coding regions
8 weeks ago
G.S ▴ 50

Hi,

I would like to calculate the start and end positions of the coding and non-coding regions in my genome sequence. Is there a tool or script to do this? My consensus sequence differs from the NCBI sequence: it has a stretch of N's at the beginning.

Any help would be much appreciated. Thanks in advance

coding non_coding • 266 views

Why do you have those N's at the beginning of the sequence? If the remainder of the sequence matches 100% then the initial N's may be wrong in your assembly.
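One way to test this is to count the leading N's and compare the rest of the consensus with the reference at the same positions. This is only a minimal sketch: the function names and the toy sequences are made up for illustration, and it assumes the consensus and the reference share one coordinate system (same length, no indels).

```python
def leading_n_count(seq: str) -> int:
    """Number of N/n characters at the start of the sequence."""
    return len(seq) - len(seq.lstrip("Nn"))


def remainder_matches(consensus: str, reference: str) -> bool:
    """True if everything after the leading N run equals the reference
    at the same positions (assumes both share one coordinate system)."""
    offset = leading_n_count(consensus)
    return consensus[offset:].upper() == reference[offset:].upper()


# Toy sequences for illustration only, not real data
reference = "ATGGCCTTAGGA"
consensus = "NNNGCCTTAGGA"
print(leading_n_count(consensus))               # 3
print(remainder_matches(consensus, reference))  # True
```

If the remainder matches, the N's most likely reflect missing or filtered coverage at the start rather than a real difference in the sequence.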


Hmm, I am not sure. This is how I generated my consensus sequence:

# Get consensus FASTQ file
samtools mpileup -uf KT992094.1.fasta seq-89_markup.bam | bcftools call -c | vcfutils.pl vcf2fq > seq-89_markup_sorted.fastq

# Convert .fastq to .fasta
seqtk seq seq-89_markup_sorted.fastq > seq-89.fasta
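A consensus called this way is built position by position along KT992094.1. So, assuming no indels shift the sequence, the coding coordinates from the reference annotation should apply to the consensus directly, and the non-coding regions are simply the gaps between them. Below is a rough sketch of that gap calculation. The function name and the CDS coordinates are hypothetical (they are not taken from the KT992094.1 record), and intervals are 1-based and inclusive, as in GenBank feature tables.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # 1-based, inclusive (start, end)


def noncoding_intervals(coding: List[Interval], seq_len: int) -> List[Interval]:
    """Return the gaps between coding intervals over positions 1..seq_len."""
    gaps = []
    pos = 1  # next position not yet covered by a coding interval
    for start, end in sorted(coding):
        if start > pos:
            gaps.append((pos, start - 1))
        pos = max(pos, end + 1)  # also handles overlapping CDS features
    if pos <= seq_len:
        gaps.append((pos, seq_len))
    return gaps


# Hypothetical CDS coordinates for illustration only
cds = [(100, 400), (350, 900), (1200, 1500)]
print(noncoding_intervals(cds, 2000))
# [(1, 99), (901, 1199), (1501, 2000)]
```

The real coding intervals would have to come from the CDS features of the reference record, and the sequence length from the consensus itself.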