N stretches in high-confidence genes
3.0 years ago
lessismore ★ 1.4k

Dear all,

As part of a project to identify and describe members of a specific gene family, we used several routine steps to identify, annotate, and analyze gene sequences whose gene models were already published and available in public sequence databases. Looking at the analyses in more detail, we noticed that some of these sequences contain stretches of N's in their genomic sequences (almost all in introns). The N portion of these sequences ranges between 4 and 22% of the whole genomic sequence, which makes me a bit uncomfortable, as these were supposed to be high-confidence gene predictions (according to the genome paper that annotated them). What would you suggest: discard them, or keep them, given that these N's do not affect the protein domain characteristic of this gene family? Thanks in advance for any tips.
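
For anyone wanting to reproduce the numbers, here is a minimal sketch of how the N fraction and the positions of the N stretches could be measured, and how one might check whether any stretch overlaps an exon. The function names, the toy sequence, and the exon coordinates are all made up for illustration; they are not the actual genes or pipeline used in the project.

```python
import re

def n_stats(seq):
    """Return (fraction of N bases, list of N runs as 0-based half-open (start, end))."""
    seq = seq.upper()
    runs = [(m.start(), m.end()) for m in re.finditer(r"N+", seq)]
    n_total = sum(end - start for start, end in runs)
    frac = n_total / len(seq) if seq else 0.0
    return frac, runs

def runs_overlapping(runs, features):
    """Return the N runs that overlap any (start, end) feature interval, e.g. exons."""
    return [
        run for run in runs
        if any(run[0] < f_end and f_start < run[1] for f_start, f_end in features)
    ]

# Toy example (hypothetical data): two exons separated by an N-containing intron.
gene = "ATGGCC" + "N" * 4 + "GGTTAA" + "NN" + "TT"
exons = [(0, 6), (10, 16)]  # assumed exon coordinates, 0-based half-open

frac, runs = n_stats(gene)
print(f"N fraction: {frac:.2f}")
print("N runs:", runs)
print("N runs overlapping exons:", runs_overlapping(runs, exons))
```

If the last list is empty for a gene, the N's are confined to introns or flanking sequence, which is the situation described above.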

gene annotation family Nstretch • 423 views