Question: FASTA File for ExomeDepth and CNV Calling
Jimbou (Germany) wrote, 6.2 years ago:

Hello,

I have a problem using ExomeDepth with a reference FASTA file (the referenceFasta argument). I analysed approx. 500 target genes (>7000 exons, hg19), and without a FASTA file and the GC content everything worked fine.

As GC content influences amplification, coverage and CNV detection, I want to include this factor. I therefore tried several things to get appropriate FASTA sequences.

  1. BioMart

    mart <- useMart("ensembl", dataset="hsapiens_gene_ensembl")
    FASTA <- getSequence(chromosom,start,stop,type="hgnc_symbol",seqType="gene_exon", mart = mart)

Error:

  `Reference fasta file provided so exomeDepth will compute the GC content in each window
   Error in (function (classes, fdef, mtable)  : 
   unable to find an inherited method for function ‘scanFa’ for signature ‘"data.frame", "GRanges"’`

  2. BSgenome

    FASTA<-getSeq(BSgenome.Hsapiens.UCSC.hg19,chromosom,start,stop)

    show(FASTA)

    A DNAStringSet instance of length 10
        width seq
    [1]   845 GTTGAAAAGTGATCAGGTTCATTTTATTGACTACACAGAAGCAATTCCATTT...GAGGAGGCAGATCACGGCGAAGACAATGAAGCTGTACGGGCCGAGGCCCTC
    [2]   129 CCTGGATGAACGGGAAGATCAAGCCCACGGTGAAGTTGGAGAGCCAGTGCAC...GCCGAGAGGACTGCAGGAAGATCTCAGTGATGAGCAGCGCGGGTATGGGAC
    [3]    77 CTGGGCCCGAGGGCATGTCCTATGACGTAGGAGATGACACAGACGATGCTGATGTATGGCATCCAGGACACTGTGTC
    [4]   103 CCTGCAGTGCCAGAGCTGCAGTGAGCACGCAGCAGGCTATGAGGCAGATGGAGAAGCCCAGCAGCAGCAGCAGCCTCCGACCCAGGAGCTCCACCACGAACAC
    [5]   112 CGGCGCAGAAGGTCATGACCACGTTCACGGCCCCGGTGCCGGCCGTCACGTA...CTCCTCCGGCACGCCGGCGCTCAGGTAGATCTGGTCCGCGTAGTAGTAGAT

Error:

Reference fasta file provided so exomeDepth will compute the GC content in each window
Error in (function (classes, fdef, mtable)  : unable to find an inherited method for function ‘scanFa’ for signature ‘"DNAStringSet", "GRanges"’

3. Downloading human_g1k_v37.fasta.gz

Chromsome, Start and Stop contain only the 7000 exon locations. With methods 1 and 2 I was able to get the target sequences, but the counting (getBamCounts) didn't work; different errors occur, indicating that something is wrong with the FASTA input. I think there are some file formatting issues.

    my.counts <- getBamCounts(data.frame(Chromsome, Start, Stop), bam.files = bam.files, include.chr = FALSE, referenceFasta = FASTA)

I haven't tried human_g1k_v37.fasta.gz so far, because I don't know how to load it in R.
Do you have an idea how to transform the files so that they work, or how to load the whole-genome FASTA file?

exon R ngs calling cnv exome • 2.7k views
modified 6.2 years ago • written 6.2 years ago by Jimbou

It is always a good idea to include the output of sessionInfo() as well as any error messages you get when posting an R question.

written 6.2 years ago by Sean Davis

I edited the post and attached the errors.

written 6.2 years ago by Jimbou
Sean Davis (National Institutes of Health, Bethesda, MD) wrote, 6.2 years ago:

The reference FASTA file should be the same as that used for your mapping step, so you should have it already.
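
The scanFa errors in the question suggest that referenceFasta is handed straight to Rsamtools::scanFa, which reads from an indexed FASTA file on disk rather than from a data.frame or DNAStringSet. So a file path is probably what is wanted there. A minimal sketch, assuming the BAM files were aligned to human_g1k_v37 (not confirmed in this thread), that R.utils and Rsamtools are installed, and that `exons` is a hypothetical data.frame of target regions with chromosome, start and end columns (recent ExomeDepth versions may also expect a name column); `prepare_reference` is an illustrative helper, not part of any package:

```r
# Sketch only: the file name and the `exons` data.frame are assumptions.
prepare_reference <- function(fasta_gz) {
  # scanFa cannot read a plain-gzip FASTA, so work on a decompressed copy.
  fasta <- sub("\\.gz$", "", fasta_gz)
  if (!file.exists(fasta)) {
    R.utils::gunzip(fasta_gz, destname = fasta, remove = FALSE)
  }
  # Build the .fai index that scanFa needs for random access.
  if (!file.exists(paste0(fasta, ".fai"))) {
    Rsamtools::indexFa(fasta)
  }
  fasta
}

# Usage (commented out because it needs the real files):
# fasta <- prepare_reference("human_g1k_v37.fasta.gz")
# my.counts <- ExomeDepth::getBamCounts(bed.frame = exons,
#                                       bam.files = bam.files,
#                                       include.chr = FALSE,
#                                       referenceFasta = fasta)
```

The point of the sketch is that referenceFasta receives a path to the whole-genome file; the per-exon sequences are then extracted internally from the target coordinates.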

written 6.2 years ago by Sean Davis

The problem is that I didn't map the BAM files myself. I only have the already mapped files.

modified 6.2 years ago • written 6.2 years ago by Jimbou

You'll need to establish what the reference file was so that you can get it or simply remap the files yourself. When inheriting something other than raw data, it is always a good idea to establish the data provenance, including auxiliary files so that you can publish your results.
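
One place to look for the reference is the BAM header itself: aligners often record their command line in @PG records, and @SQ records may carry AS (assembly) or UR (URI) tags. Whether these are present depends on how the files were produced, so this is a sketch under that assumption. `reference_hints` is a hypothetical helper and "sample.bam" a placeholder file name:

```r
# Sketch: collect the header records that usually point at the reference.
reference_hints <- function(header_text) {
  # header_text: the `text` element returned by Rsamtools::scanBamHeader(),
  # a list named by record type ("@HD", "@SQ", "@PG", ...), where each
  # element holds the tab-separated fields of one header line.
  tags <- names(header_text)
  pg <- header_text[tags == "@PG"]
  sq <- header_text[tags == "@SQ"]
  # Keep only the @SQ fields that name an assembly or a reference location.
  sq_fields <- unlist(
    lapply(sq, function(x) grep("^(AS|UR):", x, value = TRUE)),
    use.names = FALSE
  )
  list(programs = pg, assembly = unique(sq_fields))
}

# Usage (commented out because it needs a real BAM file):
# hdr <- Rsamtools::scanBamHeader("sample.bam")[[1]]
# reference_hints(hdr$text)
```

If neither record type names a reference, asking whoever produced the alignments remains the reliable route.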

written 6.2 years ago by Sean Davis

Ahh OK. So I have to use exactly the same reference file that was used for mapping the reads. It will not work to "build" a new FASTA file for the target regions, even though I'm sure about the details (human, hg19 and so on)?

modified 6.2 years ago • written 6.2 years ago by Jimbou

You'll save yourself some headache by getting the reference used for the mapping, but you can always experiment to see if something works. In particular, you'll want to get the right "build" of the genome and make sure that at least the chromosome names match the alignment chromosome names.
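
A quick way to test a candidate FASTA is to compare its contig names and lengths with those in the BAM header. UCSC hg19 uses names like "chr1", while human_g1k_v37 uses "1", and the mitochondrial sequence differs between the two, so mismatches tend to show up immediately. A minimal sketch; `compare_contigs` is a hypothetical helper, and the file names in the usage lines are placeholders:

```r
# Sketch: compare contig names and lengths between a BAM header and a FASTA.
compare_contigs <- function(bam_targets, fasta_lengths) {
  # Both arguments are named integer vectors: contig name -> length.
  shared <- intersect(names(bam_targets), names(fasta_lengths))
  differs <- unname(bam_targets[shared] != fasta_lengths[shared])
  list(
    only_in_bam = setdiff(names(bam_targets), names(fasta_lengths)),
    only_in_fasta = setdiff(names(fasta_lengths), names(bam_targets)),
    length_mismatch = shared[differs]
  )
}

# Usage (commented out because it needs real files and an existing .fai index):
# library(Rsamtools)
# bam_targets <- scanBamHeader("sample.bam")[[1]]$targets
# idx <- scanFaIndex("human_g1k_v37.fasta")
# fasta_lengths <- setNames(width(idx), as.character(seqnames(idx)))
# compare_contigs(bam_targets, fasta_lengths)
```

Three empty results make it plausible, though not certain, that the FASTA matches the build used for mapping.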

written 6.2 years ago by Sean Davis