Question: Annotations with biomart for ballgown
10 months ago by
schlogl70
Brazil-Florianopolis
schlogl70 wrote:

Hi guys, sorry to be here again, but I did look for an answer before coming here to bother you all. I am following a tutorial to learn to work with R and some RNA-Seq data, and from time to time I run into a different problem. Until now I have been dealing with the biomaRt annotation and had some errors finding the exact attributes and so on, but I sorted those out by searching on Google.

But at the last step of the annotation I did not get any annotated genes.

This is the head of the merged gtf file that I got with stringtie2:

Chr1    StringTie   transcript  3631    5899    .   +   .   transcript_id "AT1G01010.1.TAIR10"; gene_id "MSTRG.1"; gene_name "AT1G01010.TAIR10"; xloc "XLOC_000001"; ref_gene_id "AT1G01010.TAIR10"; cmp_ref "AT1G01010.1.TAIR10"; class_code "="; tss_id "TSS1";
Chr1    StringTie   exon    3631    3913    .   +   .   transcript_id "AT1G01010.1.TAIR10"; gene_id "MSTRG.1"; exon_number "1";
Chr1    StringTie   exon    3996    4276    .   +   .   transcript_id "AT1G01010.1.TAIR10"; gene_id "MSTRG.1"; exon_number "2";
Chr1    StringTie   exon    4486    4605    .   +   .   transcript_id "AT1G01010.1.TAIR10"; gene_id "MSTRG.1"; exon_number "3";
Chr1    StringTie   exon    4706    5095    .   +   .   transcript_id "AT1G01010.1.TAIR10"; gene_id "MSTRG.1"; exon_number "4";
Chr1    StringTie   exon    5174    5326    .   +   .   transcript_id "AT1G01010.1.TAIR10"; gene_id "MSTRG.1"; exon_number "5";
Chr1    StringTie   exon    5439    5899    .   +   .   transcript_id "AT1G01010.1.TAIR10"; gene_id "MSTRG.1"; exon_number "6";

It looks very similar to the file from the authors of the tutorial. But at the end I got no annotated genes.

g_sign<-subset(results_genes_no_filter,pval<0.01 & abs(log2fc)>0.584)
head(g_sign)

> head(g_sign)
     feature               id           fc         pval      qval    log2fc
125     gene AT1G04610.TAIR10 5.611861e+01 0.0001284833 0.5662515  5.810407
128     gene AT1G04660.TAIR10 3.523963e+01 0.0021592230 0.5662515  5.139127
205     gene AT1G07450.TAIR10 2.100165e+00 0.0021455935 0.5662515  1.070503
220     gene AT1G07985.TAIR10 1.037975e+02 0.0002824740 0.5662515  6.697627
676     gene AT1G21830.TAIR10 4.241355e+00 0.0024521123 0.5662515  2.084525
1003    gene AT1G31360.TAIR10 4.043344e+05 0.0057775911 0.5662515 18.625189

I got this far after looking up some adjustments:

    #### ANNOTATION - BioMart ########

    listMarts(host="plants.ensembl.org")

    mart=useMart("plants_mart", host="plants.ensembl.org")
    listDatasets(mart)[grep("thaliana", listDatasets(mart)[,1]),]

    mart <- useMart("plants_mart", dataset="athaliana_eg_gene", host="plants.ensembl.org")
    listFilters(mart)
    head(listFilters(mart),20)
    listFilters(mart)[grep("ensembl",listFilters(mart)[,1]),]
    searchAttributes(mart = mart, pattern = "ensembl_gene_id")
    listAttributes(mart = mart, page="feature_page")
    thale_data_frame <- getBM(attributes=c("ensembl_gene_id","ensembl_transcript_id","ensembl_peptide_id","ensembl_exon_id","description"), mart=mart)
    head(thale_data_frame)
    tail(thale_data_frame)

    ## Now match the genes from our list to this dataset
    annotated_genes = subset(thale_data_frame, ensembl_gene_id %in% g_sign$id)
    dim(annotated_genes )
[1] 0 3

Is there any way to fix it?

I followed the steps and got similar results. The one thing I couldn't fix is that my biomaRt package (2.40.5) is old for my R version; I tried to update it, without success.

> version
               _                           
platform       x86_64-pc-linux-gnu         
arch           x86_64                      
os             linux-gnu                   
system         x86_64, linux-gnu           
status                                     
major          3                           
minor          6.1                         
year           2019                        
month          07                          
day            05                          
svn rev        76782                       
language       R                           
version.string R version 3.6.1 (2019-07-05)
nickname       Action of the Toes

Any suggestions or directions?

Thank you for your time! Paulo

rna-seq R

Can no one share any ideas or suggestions? 8(

schlogl70 wrote:

For anyone who needs an answer to the same question in the future:

https://github.com/wolf-adam-eily/rnaseq_for_model_plant#Fourth_Point_Header
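
One likely cause is visible in the output posted in the question: the IDs in `g_sign$id` carry a `.TAIR10` suffix (e.g. `AT1G04610.TAIR10`), while Ensembl Plants gene IDs for Arabidopsis have no suffix (e.g. `AT1G04610`), so `%in%` never matches. Below is a minimal sketch of stripping the suffix before matching. It is not necessarily what the linked tutorial does. The two data frames are small hypothetical stand-ins for the real `g_sign` and `thale_data_frame`, and the `description` values are placeholders, not real annotations.

```r
# Hypothetical stand-ins for the objects in the question: the real g_sign
# comes from ballgown and the real thale_data_frame from getBM().
g_sign <- data.frame(
  id = c("AT1G04610.TAIR10", "AT1G04660.TAIR10", "MSTRG.42"),
  log2fc = c(5.810407, 5.139127, 1.2),
  stringsAsFactors = FALSE
)
thale_data_frame <- data.frame(
  ensembl_gene_id = c("AT1G04610", "AT1G04660", "AT1G07450"),
  description = c("placeholder 1", "placeholder 2", "placeholder 3"),
  stringsAsFactors = FALSE
)

# Strip the trailing ".TAIR10" so the IDs have the same form as the
# ensembl_gene_id column. IDs without that suffix are left unchanged.
g_sign$ensembl_gene_id <- sub("\\.TAIR10$", "", as.character(g_sign$id))

# Same subset as in the question, but matching on the cleaned IDs.
annotated_genes <- subset(thale_data_frame,
                          ensembl_gene_id %in% g_sign$ensembl_gene_id)
dim(annotated_genes)

# merge() keeps the ballgown statistics next to the annotation columns.
annotated_stats <- merge(g_sign, thale_data_frame, by = "ensembl_gene_id")
annotated_stats
```

Novel StringTie genes (IDs starting with `MSTRG.`) have no Ensembl counterpart, so they will still drop out of the match.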
