Correct GC bias using read counts and Loess regression method
1
0
Entering edit mode
7.2 years ago

I read this post ( C: Compute Binned Gc-Normalized Read Counts From Bam File ) and I am wondering how can I remove GC content bias with Loess regression?

loess GC bias GC content read counts • 5.6k views
ADD COMMENT
1
Entering edit mode

If you have an issue with GC-bias, why not use computeGCBias and correctGCBias from deepTools?

ADD REPLY
0
Entering edit mode

Actually I want write my own code. I want to know about the method that use loess output and do correction for read counts. but Thanks for your answer. I will look at them maybe I can find their methods.

ADD REPLY
3
Entering edit mode
5.1 years ago
Leon ▴ 130

Get 300kb bin window:

bedtools makewindows -g hg38.chrom.size -w 300000 > hg38_300.bed

hg38.chrom.size like this:

chr13 114364328
chr18 80373285
chr21 46709983

Get gc content

bedtools nuc -fi hs38DH.fa -bed hg38_300.bed | cut -f 1-3,5 > 300.gc.bed

Get depth in each bin:

bedtools coverage -a hg38_300.bed -b S1901020.aln.bam > S1901020.counts

combine gc content and depth:

paste <(grep -v '#' 300.gc.bed) <(cut -f4 S1901020.counts)|sed '1i chr\tstart\tend\tGC\tRC' > nipt

Rscript

RC_DT<- read.table('nipt',sep='\t',head=TRUE) gcCount.loess <- loess(RC~GC,data=RC_DT,control = loess.control(surface = "direct"),degree=2) predictions1<- predict (gcCount.loess,RC_DT$GC)
resi <- RC_DT$RC-predictions1
RC_DT$RC <- resi
the corrected RC in RC_DT$RC, more help about loess can be got in help document in R.

ADD COMMENT
1
Entering edit mode

for MacOS users, use gsed instead if sed, otherwise you will get an error sed: 1: "1i chr\tstart\tend\tGC\tRC": command i expects \ followed by tex

ADD REPLY

Login before adding your answer.

Traffic: 2480 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6