I'm working on annotating mouse transcripts. What I have are genomic coordinates, chromosome info, and polyA signals. I'm using ftp://ftp.ncbi.nih.gov/genomes/Mus_musculus/CHR_01/ as a reference. Currently I'm matching genomic coordinates within a window of ±100,000 bp, but I suspect that is too broad. Is there a smaller window I ought to use, based on the usual methods of gene annotation?
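To make the question concrete, here is a rough sketch of the kind of window matching I mean. It is not my actual pipeline: the function name `genes_within_window`, the tuple layout, and the placeholder gene records are all made up for illustration, and it assumes the reference has already been parsed into (name, chromosome, start, end, strand) records.

```python
from typing import List, Tuple

# Hypothetical record layout: (name, chromosome, start, end, strand).
Gene = Tuple[str, str, int, int, str]


def genes_within_window(
    chrom: str, pos: int, genes: List[Gene], window: int = 100_000
) -> List[Tuple[int, str]]:
    """Return (distance, gene name) pairs for genes on `chrom` that lie
    within `window` bp of `pos`, sorted nearest first.

    Distance is 0 when `pos` falls inside the gene span, otherwise the
    distance to the closer gene boundary.
    """
    hits = []
    for name, g_chrom, start, end, _strand in genes:
        if g_chrom != chrom:
            continue  # only compare features on the same chromosome
        if start <= pos <= end:
            dist = 0
        else:
            dist = min(abs(pos - start), abs(pos - end))
        if dist <= window:
            hits.append((dist, name))
    return sorted(hits)


# Placeholder records, not real annotations.
example_genes = [
    ("GeneA", "chr1", 1_000, 5_000, "+"),
    ("GeneB", "chr1", 60_000, 70_000, "-"),
    ("GeneC", "chr2", 1_000, 5_000, "+"),
]

# The same position matched with the current window and a much narrower one.
wide = genes_within_window("chr1", 5_500, example_genes, window=100_000)
narrow = genes_within_window("chr1", 5_500, example_genes, window=1_000)
```

With the wide window both placeholder genes on chr1 match the position; with the narrow one only the nearest does. That ambiguity is what makes me think ±100,000 bp is too permissive.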
I'm new to this so would appreciate any advice and guidance! Thank you.