How to properly modify genome .gff and .fa to combine 2 loci that are really 1 gene
7.3 years ago
michael.nagle ▴ 100

There are 2 loci in a genome that are annotated separately and were thought to be two separate genes. Experimental evidence from cDNA sequencing now confirms that they are actually one gene. I need to modify the .gff and .fa files and redo the bowtie/tophat/cufflinks analysis to get corrected expression values for the combined gene. I have a few questions about this.

  1. Column 8 refers to: "frame - One of '0', '1' or '2'. '0' indicates that the first base of the feature is the first base of a codon, '1' that the second base is the first base of a codon, and so on." What is the best way to determine what this value should be? I could do it by eye, but is there software better suited to this?

  2. Is there software that can take my CDS sequencing data and generate corrected versions of these .gff and .fa files? Or is the best way to modify them by hand in a text editor?

Thanks for the help.

genomics • 1.5k views
7.3 years ago

To partially answer your first question: exons don't necessarily consist of a number of nucleotides divisible by 3. It's perfectly possible that one nucleotide of a codon is in exon 42 and the remaining two nucleotides are in exon 43.


Of course. I have modified the question to ask whether this needs to be determined by eye for every exon or whether there is software for it.

5.9 years ago
Juke34 8.5k

You could use this service to look at the longest ORF in each exon. For the first CDS piece it is easy: the value is 0 (if you have the complete gene).
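Because the first CDS piece of a complete gene starts at 0, the values for the later pieces can also be derived by carrying the leftover bases from one piece to the next. The following is a minimal sketch of that idea, not a tool referenced in this thread: the function name `cds_phases` and the `(start, end)` tuple input are assumptions made for illustration, coordinates are taken to be 1-based and inclusive as in GFF, and the merged gene is assumed to be complete (it begins at the start codon).

```python
def cds_phases(segments, strand="+"):
    """Return (start, end, phase) for each CDS segment in transcript order.

    segments: list of (start, end) tuples, 1-based inclusive (assumed input).
    strand:   "+" or "-"; on "-" the transcript runs from high to low coordinates.
    Phase is the number of bases to skip at the 5' end of the segment
    to reach the first base of the next complete codon.
    """
    # Order segments 5' -> 3' along the transcript.
    ordered = sorted(segments, key=lambda s: s[0], reverse=(strand == "-"))

    result = []
    phase = 0  # assumption: the first segment starts at the start codon
    for start, end in ordered:
        result.append((start, end, phase))
        length = end - start + 1
        # Bases left after completing the carried-over codon, modulo 3,
        # tell how many bases the next segment must contribute first.
        phase = (3 - (length - phase) % 3) % 3
    return result


if __name__ == "__main__":
    for row in cds_phases([(1, 10), (21, 25), (31, 40)], strand="+"):
        print(*row, sep="\t")
```

The returned values would go in column 8 of the corresponding CDS lines. If the cDNA does not cover the start codon, the initial value of 0 no longer holds and the first piece has to be checked against the translation.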

A good approach would be to load your two genes into a genome browser that allows manual curation, merge the genes manually, and then download the result in GFF format.
