Calculating GC content around a specific genomic location
2
0
Entering edit mode
5 months ago
hkarakurt ▴ 110

Hello everyone, I am looking for a method that calculates GC content of a specific genomic region. I could not find an automated Python (or R) library that calculates GC content around a location.

For example; I want the GC contect of + and - 20bp around Chr3 16500 location. Is it possible to do it?

Thank you in advance.

WGS WES GC genome SNP • 265 views
ADD COMMENT
0
Entering edit mode

Hi,

You can use Biopython to import a sequence, to subset a sequence to the region of interest and then just apply the function GC(), like appears explained in the documentation:

>>> from Bio.Seq import Seq
>>> from Bio.SeqUtils import GC
>>> my_seq = Seq("GATCGATGGGCCTATATAGGATCGAAAATCGC")
>>> GC(my_seq)
46.875

I hope this helps,

António

ADD REPLY
1
Entering edit mode
5 months ago
samtools faidx indexed_ref.fasta `echo "chr3:16500" | awk -F: '{P=int($2);X=20;printf("%s:%d-%d",$1,(P-X<1?1:P-X),P+X);}'` | tail -n+2 | awk '{T+=length($0);gsub(/[AaTtWwNn]/,"");N+=length($0);} END{print N/T;}'
ADD COMMENT
1
Entering edit mode
5 months ago
benformatics ★ 2.1k

R version:

library(Biostrings)
library(GenomicFeatures)

## load fasta
dna <- readDNAStringSet('your_genome.fasta')

## find your relevant chromosome
grep('chr3',names(z.dna),value=T)

## if it is present subset by that chr and rename for convenience
dna.chr3 <- dna[grep('chr3',names(dna))]
names(dna.chr3) <- 'chr3'

## setup your genomic coordinates
gr <- GRanges('chr3:16500')

## flank your location
flank.gr <- flank(gr,20,both=TRUE)

## extract sequence
flank.seq <- getSeq(dna.chr3, flank.gr)
## if you wanted to get ONLY the 20 bp flanking sites - not including the internal section
## flank.gr <- setdiffflank.gr,gr)

## get AGCTN+ frequency
flank.alphafreq <- alphabetFrequency(flank.seq)

## get G/C content as a fraction
sum(flank.alphafreq[,c('G','C')])/sum(flank.alphafreq[,c('A','G','C','T')])
ADD COMMENT

Login before adding your answer.

Traffic: 2361 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6