Question: Variant Annotation in R
adakoury wrote:

I am trying to perform variant annotation using the "VariantAnnotation" package in R. I was trying to make a TxDb object from a GFF file, but I got the warning messages shown below. Has anybody run into the same problem, and if so, how was it solved?

> txdb <- makeTxDbFromGFF(gtffile, format= "gtf",circ_seqs=character())
Import genomic features from the file as a GRanges object ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK
Warning messages:
1: In .local(con, format, text, ...) :
  gff-version directive indicates version is 3, not 2
2: In .extract_transcripts_from_GRanges(tx_IDX, gr, type, ID, Name) :
  The following transcripts have multiple parts that were merged:
3: Named parameters not used in query: internal_chrom_id, chrom, length, is_circular
Tags: snp, R
modified 2.7 years ago • written 2.7 years ago by adakoury

A warning is not an error, so if the rest of your commands run fine, this isn't something to worry about.

written 2.7 years ago by WouterDeCoster

Here is the whole process. The outcome was not what I was expecting. Any idea where I made a mistake or missed something?

> vcf <- readVcf("chroA01.vcf", "Brassica_napus.annotation_v5_sorted_modified.gff")
> vcf
class: CollapsedVCF 
dim: 11847 189 
rowRanges(vcf):
  GRanges with 5 metadata columns: paramRangeID, REF, ALT, QUAL, FILTER
info(vcf):
  DataFrame with 3 columns: NS, DP, AF
info(header(vcf)):
      Number Type    Description                
   NS 1      Integer Number of Samples With Data
   DP 1      Integer Total Depth                
   AF .      Float   Allele Frequency           
geno(vcf):
  SimpleList of length 5: GT, AD, DP, GQ, PL
geno(header(vcf)):
      Number Type    Description                                               
   GT 1      String  Genotype                                                  
   AD .      Integer Allelic depths for the reference and alternate alleles ...
   DP 1      Integer Read Depth (only filtered reads used for calling)         
   GQ 1      Float   Genotype Quality                                          
   PL 3      Float   Normalized, Phred-scaled likelihoods for AA,AB,BB genot...
> gtffile <- file.path("K:/ClubrootGenetics/Reference genomes/B. napus reference genome v5/modified files/Brassica_napus.annotation_v5_sorted_modified.gff")
> txdb <- makeTxDbFromGFF(gtffile, format= "gtf",circ_seqs=character())
Import genomic features from the file as a GRanges object ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK
Warning messages:
1: In .local(con, format, text, ...) :
  gff-version directive indicates version is 3                                                          , not 2
2: In .extract_transcripts_from_GRanges(tx_IDX, gr, type, ID, Name) :
  The following transcripts have multiple parts that were merged:
3: Named parameters not used in query: internal_chrom_id, chrom, length, is_circular 
> seqlevels(vcf) <- "chrA01"
> rd <- rowRanges(vcf)
> loc <- locateVariants(rd, txdb, CodingVariants())
>  head(loc, 3)
GRanges object with 0 ranges and 9 metadata columns:
   seqnames    ranges strand | LOCATION  LOCSTART    LOCEND   QUERYID      TXID
      <Rle> <IRanges>  <Rle> | <factor> <integer> <integer> <integer> <integer>
           CDSID      GENEID       PRECEDEID        FOLLOWID
   <IntegerList> <character> <CharacterList> <CharacterList>
  -------
  seqinfo: no sequences
modified 2.7 years ago • written 2.7 years ago by adakoury

Hint: Highlight text and then use the "101" button to format code/output to make it readable.

written 2.7 years ago by genomax

Which text do you refer to?

written 2.7 years ago by adakoury

For example, in your previous post, everything from "> vcf" through "seqinfo: no sequences".

modified 2.7 years ago • written 2.7 years ago by genomax

I did. Is it readable now?

written 2.7 years ago by adakoury

You could check if the variables gtffile, txdb and rd look "as expected" to figure out which step went wrong.
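One way to start such a check is to look at the annotation file itself before building the TxDb. The sketch below is only an illustration in base R, not part of VariantAnnotation or GenomicFeatures: `inspect_gff` is a hypothetical helper name, and it assumes the file is plain (uncompressed) tab-separated GFF/GTF text. It reports the declared `##gff-version`, how many records do not have 9 columns, which sequence names occur, and how many features of each type are present.

```r
# Hypothetical helper (not from any package): summarise a GFF/GTF file.
# Assumes an uncompressed, tab-separated text file.
inspect_gff <- function(path, n_max = 100000L) {
  lines <- readLines(path, n = n_max, warn = FALSE)

  # The "##gff-version" directive, if present, declares the format version.
  directive <- grep("^##gff-version", lines, value = TRUE)
  version <- if (length(directive) > 0) {
    trimws(sub("^##gff-version", "", directive[1]))
  } else {
    NA_character_
  }

  # Drop comment/directive lines and blank lines, then split on tabs.
  records <- lines[!grepl("^#", lines) & nzchar(trimws(lines))]
  fields <- strsplit(records, "\t", fixed = TRUE)
  ok <- lengths(fields) == 9L  # a well-formed record has 9 columns

  list(
    version   = version,
    malformed = sum(!ok),
    seqnames  = unique(vapply(fields[ok], `[`, character(1), 1L)),
    types     = table(vapply(fields[ok], `[`, character(1), 3L))
  )
}

# Example use with the variables from this thread:
# info <- inspect_gff(gtffile)
# info$version   # declared GFF version
# info$malformed # number of records without 9 columns
# info$seqnames  # compare with seqlevels(vcf)
# info$types     # feature types in column 3
```

If `info$version` is "3", passing `format = "gff3"` (or leaving the format at its default) to `makeTxDbFromGFF` may be more appropriate than `format = "gtf"`. An empty result from `locateVariants(..., CodingVariants())` would also be expected if the file contains no CDS features, or if its sequence names do not match the seqlevels set on the VCF (here "chrA01"). Comparing `seqlevels(txdb)` with `seqlevels(rd)` is a quick way to check the latter.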

written 2.7 years ago by WouterDeCoster

I solved the problem. There was an error in the GFF file.

written 2.7 years ago by adakoury