I need to retrieve the genomic locations of all genes in the given read count matrix
3.1 years ago
pavelasquezv ▴ 50

Hi,

I hope this message finds you well.

I am trying to retrieve the genomic locations of all genes in a given read count matrix, using the get_intervals function from the TEffectR package. Unfortunately, the genome of Helicoverpa armigera (the species I am working on) is not available in the database that function queries (ensembl.org). When I run the function, I get the following error:

gene.annotation <- get_intervals(x = rownames(exprs), assembly="GCA_002156985.1 ",
                                 ID.type = "ensembl_gene_id",
                                 URL="https://feb2021.archive.ensembl.org" )

**Error in returnEnsembl(assembly, URL) : object 'y' not found**

In my case, I am using NCBI gene IDs for Helicoverpa armigera (e.g. "LOC110378257", "LOC110378189").

I would really appreciate any help!

Many thanks

All the best,

Alex

RNAseq annotation TEffectR r • 773 views
3.1 years ago
GenoMax 141k

This organism is not available in the Ensembl database, so one option is to download the feature table file for this assembly (GCF_002156985.1_Harm_1.0_feature_table.txt) from NCBI. Uncompress the file, then use the following command to get the gene names and their coordinates (gene symbol, start, end, strand).

$ awk -F " " '{ if ($1 == "gene" && $2 == "protein_coding") print $12,$9,$10,$11}' GCF_002156985.1_Harm_1.0_feature_table.txt

The output should look like this:

LOC110380464 9115 9504 +
LOC110380463 45742 71499 -
LOC110380466 187153 191796 -
LOC110380461 192371 207227 +
LOC110380465 207715 209876 -
LOC110380455 210916 219783 +
LOC110380462 221306 229606 +
LOC110380457 230027 248078 +
LOC110380460 253219 280667 +
LOC110380459 281676 289117 +
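
The one-liner above splits on runs of whitespace, so its field numbers depend on how many words appear in columns such as the assembly unit. A variant that splits on tabs, also reports the sequence accession, and keeps only the genes present in the count matrix could look like the sketch below. It runs on a tiny stand-in file rather than the real download. The column positions (7 = genomic accession, 8 = start, 9 = end, 10 = strand, 15 = symbol) are my understanding of the NCBI feature table layout and should be checked against the header line of your own file. The accession `scaffold_A` and the file names `sample_feature_table.txt`, `gene_ids.txt`, `gene_coordinates.tsv`, and `genes_in_matrix.tsv` are placeholders.

```shell
# Build a small stand-in for the feature table (tab-delimited, 20 columns).
# "scaffold_A" is a placeholder accession, not a real sequence name.
{
  printf 'gene\tprotein_coding\tGCF_002156985.1\tPrimary Assembly\tunplaced scaffold\t\tscaffold_A\t9115\t9504\t+\t\t\t\t\tLOC110380464\t110380464\t\t390\t\t\n'
  printf 'mRNA\t\tGCF_002156985.1\tPrimary Assembly\tunplaced scaffold\t\tscaffold_A\t9115\t9504\t+\t\t\t\t\tLOC110380464\t110380464\t\t390\t\t\n'
  printf 'gene\tprotein_coding\tGCF_002156985.1\tPrimary Assembly\tunplaced scaffold\t\tscaffold_A\t45742\t71499\t-\t\t\t\t\tLOC110380463\t110380463\t\t25758\t\t\n'
} > sample_feature_table.txt

# Keep protein-coding gene rows; print symbol, accession, start, end, strand.
awk -F '\t' -v OFS='\t' \
  '$1 == "gene" && $2 == "protein_coding" { print $15, $7, $8, $9, $10 }' \
  sample_feature_table.txt > gene_coordinates.tsv

# Hypothetical list of gene IDs, one per line (e.g. the row names of the count matrix).
printf 'LOC110380463\n' > gene_ids.txt

# First file: remember the wanted IDs. Second file: print only matching rows.
awk -F '\t' 'NR == FNR { keep[$1]; next } $1 in keep' \
  gene_ids.txt gene_coordinates.tsv > genes_in_matrix.tsv

cat genes_in_matrix.tsv
```

For the real data, replace `sample_feature_table.txt` with the uncompressed feature table and write the row names of the count matrix to `gene_ids.txt`.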

Many thanks,

You brought me back to life!

All the best

Alex
