Question: CNV with Cufflinks?
Adrian Pelin wrote (5.8 years ago):


I want to know the copy number of all ORFs in my genome. Does it make sense to map genomic reads to these ORFs using either BWA or Bowtie, and then quantify FPKM values with cufflinks?
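For reference, the FPKM value being discussed is just a fragment count scaled by feature length and by the total number of mapped fragments. The sketch below computes it by hand; the ORF names, counts, and lengths are made-up illustrative values, and the function is a plain restatement of the FPKM definition rather than Cufflinks' own (more involved) abundance estimation.

```python
def fpkm(fragments, length_bp, total_mapped_fragments):
    """Fragments per kilobase of feature per million mapped fragments."""
    if length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total mapped fragments must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (fragments per million)
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


# Hypothetical input: ORF name -> (fragments mapped to the ORF, ORF length in bp)
orf_counts = {
    "orf1": (500, 1000),
    "orf2": (1000, 1000),
    "orf3": (250, 500),
}
total_mapped = 2_000_000  # hypothetical library size

fpkms = {
    name: fpkm(count, length, total_mapped)
    for name, (count, length) in orf_counts.items()
}
```

With these numbers, orf1 and orf3 end up with the same FPKM despite different lengths, while orf2 comes out at twice that value.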

I expect most ORFs are in 1 copy, so they will have about the same FPKM. I can then normalize all FPKM values to the value corresponding to a single copy ORF, and see how many copies other ORFs have.
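A minimal sketch of that normalization, assuming most ORFs really are single copy so that the median FPKM can stand in for the single-copy level. The ORF names and FPKM values are hypothetical, `estimate_copy_number` is an illustrative helper rather than part of any tool, and no correction for coverage bias is attempted.

```python
from statistics import median


def estimate_copy_number(fpkm_by_orf, baseline_copies=1):
    """Scale each FPKM by the median FPKM, taken as the single-copy level."""
    baseline = median(fpkm_by_orf.values())
    if baseline <= 0:
        raise ValueError("median FPKM must be positive to normalize")
    return {
        orf: baseline_copies * value / baseline
        for orf, value in fpkm_by_orf.items()
    }


# Hypothetical FPKM values for five ORFs
example = {"orfA": 100.0, "orfB": 98.0, "orfC": 205.0, "orfD": 102.0, "orfE": 310.0}

copies = estimate_copy_number(example)
rounded = {orf: round(c) for orf, c in copies.items()}
```

Using the median rather than one hand-picked ORF makes the baseline less sensitive to a single oddly covered gene. For a diploid organism, `baseline_copies=2` would report copies per cell instead of copies per haploid genome.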

Does this make sense? Are there any pitfalls? Should I use BWA or Bowtie for this?

Thank you for your advice, Adrian

cufflinks cnv bwa • 1.8k views
modified 5.8 years ago by Charles Warden • written 5.8 years ago by Adrian Pelin
Charles Warden (Duarte, CA) wrote (5.8 years ago):

I think your strategy sounds similar to the one used in CoNIFER (on a BWA alignment), except that CoNIFER has an extra (important) step that uses SVD to correct for biases in coverage.

Either way, CoNIFER has an RPKM function, so that can potentially make your life easier. However, I'm not sure how comfortable I'd be with copy number calls on a single sample (where you can't apply the SVD step).
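To illustrate the idea behind the SVD step (this is not CoNIFER's code), the sketch below z-scores each region across samples and then removes the leading singular component, which tends to capture coverage bias shared across samples. It is written under several assumptions: the RPKM matrix is made up, only one component is removed, and plain-Python power iteration stands in for a full SVD routine such as `numpy.linalg.svd`.

```python
import math


def zscore_rows(matrix):
    """Z-score each region (row) across samples (columns)."""
    out = []
    for row in matrix:
        mean = sum(row) / len(row)
        sd = math.sqrt(sum((x - mean) ** 2 for x in row) / len(row))
        out.append([(x - mean) / sd if sd > 0 else 0.0 for x in row])
    return out


def top_right_singular_vector(z, iterations=200):
    """Power iteration on Z^T Z; returns a unit vector over samples."""
    n = len(z[0])
    # Non-uniform start: z-scored rows sum to zero, so a constant vector
    # would be orthogonal to every row.
    v = [float(j + 1) for j in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        zv = [sum(row[j] * v[j] for j in range(n)) for row in z]
        w = [sum(z[i][j] * zv[i] for i in range(len(z))) for j in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break  # nothing left to remove (e.g. a single sample)
        v = [x / norm for x in w]
    return v


def remove_top_component(z):
    """Project the leading singular direction out of every region's profile."""
    v = top_right_singular_vector(z)
    corrected = []
    for row in z:
        proj = sum(a * b for a, b in zip(row, v))
        corrected.append([a - proj * b for a, b in zip(row, v)])
    return corrected, v


# Hypothetical RPKM matrix: 4 regions (rows) x 4 samples (columns)
rpkm = [
    [10.0, 12.0, 20.0, 22.0],
    [50.0, 55.0, 90.0, 95.0],
    [30.0, 31.0, 60.0, 58.0],
    [15.0, 40.0, 14.0, 16.0],
]

z = zscore_rows(rpkm)
corrected, v = remove_top_component(z)
```

With a single sample, every z-score is zero and there is no shared component to estimate, which is one way to see why the correction needs multiple samples.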

modified 5.8 years ago • written 5.8 years ago by Charles Warden

CoNIFER is the right tool to account for sequence alignment biases. Some regions are easier to generate reads from and will appear multicopy if you don't correct for this.

written 5.8 years ago by karl.stamm
Devon Ryan (Freiburg, Germany) wrote (5.8 years ago):

Firstly, unless you're working in a prokaryote, only looking at chromosome X or Y (or the equivalent for your organism), or working on single-cell sequencing, you should normally expect 2 copies of an allele.

Secondly, why would you want to shoe-horn cufflinks into an analysis for which there are already numerous pre-made programs? CNVnator is the first example that comes to mind, but there are a LOT of packages out there. Have a read through this paper for a relatively recent overview of what's out there.

written 5.8 years ago by Devon Ryan

You only expect 2 copies of an allele when the organism is diploid. Even if you are working on single cell sequencing, unless the cell is a gamete, you expect 2 copies if the organism is diploid.

However, the good news is that the 2 alleles don't have to be very divergent, so in most cases, if you adjust the settings on stringency in read mapping, reads from both alleles will map to any one of them, and thus that gene would be considered single copy.

Why do I want to use Cufflinks? Because I already know what the input needs to be (the .bam file), and I know what the output will look like: my FASTA headers, which are the ORFs, each with its FPKM value beside it.

The solution you proposed is very weak in documentation, and seems to measure copy numbers among specified regions. I am only interested in feeding it with the fasta file containing all ORFs and getting back the copy number.

I will look through the paper now.

written 5.8 years ago by Adrian Pelin

Yeah, the single-cell thing was a silly mistake on my part (I had RNAseq on the brain, perhaps because of the mention of ORFs), mea culpa.

CNVnator was just an example of one solution, which happens to be geared more toward whole-genome analyses. Mapping reads arising from the whole genome (or even the whole exome or similar targeted regions) to only the ORFs is not a great idea, if that's what you're proposing (it'll lead to biased mappings). Normally you would just map genomic reads to the whole genome and call CNVs based on that (whether you end up using a genomic-window method like CNVnator or something else is up to you).
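As a rough picture of what a genomic-window approach does at its simplest, the sketch below bins read start positions into fixed windows along one contig and scales each window's count by the median count. It is only an illustration under stated assumptions: the read positions are synthetic, `window_copy_number` is a hypothetical helper, and real tools such as CNVnator do considerably more (for example, correcting for GC content).

```python
from statistics import median


def window_copy_number(read_starts, contig_length, window=1000, ploidy=2):
    """Estimate copy number per window from 0-based read start positions."""
    n_windows = (contig_length + window - 1) // window
    counts = [0] * n_windows
    for pos in read_starts:
        if 0 <= pos < contig_length:
            counts[pos // window] += 1
    baseline = median(counts)  # assumed to reflect the normal-ploidy level
    if baseline == 0:
        raise ValueError("median window count is zero; coverage too low")
    return [ploidy * c / baseline for c in counts]


# Synthetic example: 10 reads in each 1 kb window of a 5 kb contig,
# plus 10 extra reads in the third window (a duplicated region).
starts = [w * 1000 + i * 50 for w in range(5) for i in range(10)]
starts += [2000 + i * 50 + 25 for i in range(10)]

copy_numbers = window_copy_number(starts, contig_length=5000)
```

Because the reads are mapped to the whole genome first, windows overlapping an ORF of interest can be looked up afterwards, rather than forcing every read to align to an ORF-only reference.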

written 5.8 years ago by Devon Ryan