How to handle 'N' in nucleotide/gene sequences retrieved from NCBI GenBank?

6.0 years ago

ammaraakhtar3 • 0

Some sequences retrieved from NCBI contain the letter 'N', which means that the base at that position could not be determined and is left as an unidentified nucleotide. Should I replace N with another base (A, G, T, or C), assuming N can be any nucleotide, or should I exclude such sequences, assuming the sequencing was not of good quality? If neither, what can I do with such sequences in my dataset?

sequencing genome alignment gene • 944 views

updated 6.0 years ago by Pierre Lindenbaum 166k • written 6.0 years ago by ammaraakhtar3 • 0

Context is missing. What is the purpose of those sequences? See: How To Ask Good Questions On Technical And Scientific Forums

6.0 years ago by Pierre Lindenbaum 166k

Most good tools will accept Ns in sequences, but how to treat them depends on what you intend to do with the sequences.

The one thing you almost certainly should not do is replace them with a random nucleotide.
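As a rough illustration of an alternative to replacement, the sketch below measures how much of each sequence is N and separates records for review instead of altering them. It assumes plain FASTA text as input; the function names (`parse_fasta`, `n_fraction`, `split_by_n_content`) and the 5% cutoff are hypothetical, chosen only for this example, and a suitable threshold depends on the downstream analysis.

```python
from typing import Dict, Iterable, Iterator, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def n_fraction(seq: str) -> float:
    """Fraction of positions that are N (case-insensitive); 0.0 if empty."""
    if not seq:
        return 0.0
    return seq.upper().count("N") / len(seq)


def split_by_n_content(
    records: Iterable[Tuple[str, str]],
    max_n_fraction: float = 0.05,  # hypothetical cutoff, not a standard value
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Separate records into (kept, flagged) without modifying any sequence."""
    kept, flagged = {}, {}
    for header, seq in records:
        target = kept if n_fraction(seq) <= max_n_fraction else flagged
        target[header] = seq
    return kept, flagged


# Toy input: seq1 has no Ns, seq2 is half Ns.
example = """>seq1
ACGTACGTAC
>seq2
ACGNNNNNAC
""".splitlines()

kept, flagged = split_by_n_content(parse_fasta(example))
print(sorted(kept))     # ['seq1']
print(sorted(flagged))  # ['seq2']
```

Flagged sequences are left unchanged, so they can still be passed to tools that handle Ns or be excluded, depending on the purpose.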

6.0 years ago by Joe 22k
