Question: RefSeq vs original genome assembly and issues
3 months ago, Biogeek wrote:

A long-winded post with multiple questions, to gauge the consensus on the 'correct' approach to RNA-seq alignment when both a RefSeq version and a published assembly version of a genome exist.

My scenario is that a published genome is available as v1.0 and v1.1 (GenBank), and also on RefSeq.

I've aligned my mRNA-seq reads with STAR to Aiptasia v1.1 using the GenBank (GCA_) files, and I've now performed gene counts and differential expression. When I use http://aiptasia.reefgenomics.org/download/aipgene_to_kxj.tsv.gz to convert the NCBI gene accessions back to the original AIPGENE identifiers and retrieve functional gene annotations, I notice that v1.1 has 5 more genes than the v1.0 functional annotations (the aipgene_to_kxj.tsv file is what I use to map back). This has me questioning whether to use version 1.1 at all. I've looked for these 5 missing genes on NCBI, and they do have functional annotations there.
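To pin down exactly which genes drop out of the mapping, one option is a set difference between the gene IDs in the count table and the IDs present in the mapping file. Below is a minimal Python sketch of that check. It assumes the mapping file is a two-column, tab-separated table with the AIPGENE identifier first and the NCBI accession second; that layout is a guess and should be checked against the actual file. The helper names and the toy IDs are made up for illustration.

```python
import csv
import gzip
import io


def load_mapping(handle):
    """Read a tab-separated mapping into {ncbi_accession: aipgene_id}.

    Assumed layout (not verified against the real file):
    column 1 = AIPGENE identifier, column 2 = NCBI accession.
    """
    mapping = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank, short, or comment lines.
        if len(row) < 2 or row[0].startswith("#"):
            continue
        aipgene_id, ncbi_id = row[0].strip(), row[1].strip()
        mapping[ncbi_id] = aipgene_id
    return mapping


def load_mapping_file(path):
    """Open a gzip-compressed mapping table, e.g. aipgene_to_kxj.tsv.gz."""
    with gzip.open(path, "rt", newline="") as handle:
        return load_mapping(handle)


def unmapped_genes(count_gene_ids, mapping):
    """Return the gene IDs from the count table that have no mapping entry."""
    return sorted(set(count_gene_ids) - set(mapping))


# Toy data with hypothetical IDs, standing in for the real files.
toy_table = io.StringIO("AIPGENE1\tKXJ00001.1\nAIPGENE2\tKXJ00002.1\n")
toy_mapping = load_mapping(toy_table)
toy_counts = ["KXJ00001.1", "KXJ00002.1", "KXJ00003.1"]
missing = unmapped_genes(toy_counts, toy_mapping)
print(missing)
```

With the real data, `load_mapping_file("aipgene_to_kxj.tsv.gz")` would replace the toy table, and the gene IDs would come from the first column of the count matrix. The resulting list is what one would then look up on NCBI.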

My questions are:

Do I revert to v1.0, where everything seems to be complete (the original version, published before it was submitted to GenBank), or do I download the RefSeq GFF3 and scaffolds and redo my analysis? If the latter, how can I obtain the functional gene annotations from RefSeq to know what the genes are, and how can I then get GO terms for downstream analysis?
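On the functional-annotation part of the question, one possible route is to read the gene descriptions straight from the GFF3 attribute column. The sketch below is hedged: it assumes the RefSeq GFF3 carries a `product=` attribute on `mRNA` features and links each mRNA to its gene through `Parent=`, which should be confirmed on the actual file. The function names and the toy record are hypothetical, and this does not address GO terms.

```python
from urllib.parse import unquote


def parse_attributes(field):
    """Split a GFF3 column-9 string into a dict, decoding %-escapes."""
    attrs = {}
    for part in field.strip().split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            attrs[key.strip()] = unquote(value.strip())
    return attrs


def gene_products(gff_lines):
    """Map each gene ID to the product of its first annotated mRNA.

    Assumes mRNA rows carry both Parent=<gene ID> and product=<description>.
    """
    products = {}
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue
        columns = line.rstrip("\n").split("\t")
        if len(columns) != 9 or columns[2] != "mRNA":
            continue
        attrs = parse_attributes(columns[8])
        if "Parent" in attrs and "product" in attrs:
            # Keep the first product seen for each gene.
            products.setdefault(attrs["Parent"], attrs["product"])
    return products


# Toy GFF3 lines with made-up IDs, standing in for the real annotation.
toy_gff = [
    "##gff-version 3",
    "scaf1\tGnomon\tgene\t1\t900\t.\t+\t.\tID=gene-LOC1;Name=LOC1",
    "scaf1\tGnomon\tmRNA\t1\t900\t.\t+\t.\t"
    "ID=rna-XM_1;Parent=gene-LOC1;product=heat shock protein 70%2C variant X1",
]
toy_products = gene_products(toy_gff)
print(toy_products)
```

For the real file, passing an open file handle (`with open(path) as fh: gene_products(fh)`) works the same way, since the function only iterates over lines.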

Thanks for your time!

3 months ago, genomax (United States) wrote:

Prior discussion about this also took place in: NCBI genome version vs published genome version - what's 'better'?
