Question: GC content from bam to fasta file
0
gravatar for GK1610
16 months ago by
GK161080
United States
GK161080 wrote:

I want to get mean_gc content of a given bed regions from a bam file

I use

samtools fasta $bam > $bam.fasta
bedtools nuc -fi $bam.fasta -bed $regions.bed >$bam_gc_content.bed

this gives me the error 1:67000041-67000044) beyond the length of chr1 size (0 bp). Skipping.

my regions.bed looks like this

1   27719   30282
1   34575   35119
1   178413  180107
1   196349  196596
1   198074  200761

fasta file bam file (paired end)

>HWI-D00309:105:C7HAGANXX:1:2104:19269:80273/2
AACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC
>HWI-D00309:105:C7HAGANXX:1:2212:13536:55714/1
CCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCC
>HWI-D00309:105:C7HAGANXX:1:2307:16222:92746/2
ACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACC
>HWI-D00309:105:C7HAGANXX:1:1210:15413:21872/2
AGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGG
>HWI-D00309:105:C7HAGANXX:1:1313:15716:47136/1
CTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTA
>HWI-D00309:105:C7HAGANXX:1:2104:19269:80273/1
TGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGG
>HWI-D00309:105:C7HAGANXX:1:2212:13536:55714/2
TGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGG
>HWI-D00309:105:C7HAGANXX:1:2307:16222:92746/1
TGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGG
>HWI-D00309:105:C7HAGANXX:1:1313:15716:47136/2
GTTTGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTT
>HWI-D00309:105:C7HAGANXX:1:1105:12543:6619/1
CCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCAAGGACC
>HWI-D00309:105:C7HAGANXX:1:1210:15413:21872/1
TAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCAACCCTAACCCTAACCCTAACCCTAACCCTAAC
>HWI-D00309:105:C7HAGANXX:1:2202:2887:84643/1

How do i resolve this issue?

chip-seq • 402 views
ADD COMMENTlink written 16 months ago by GK161080
0
gravatar for genomax
16 months ago by
genomax85k
United States
genomax85k wrote:

Your bam.fasta file is now essentially original reads converted to fasta format. How can bedtools correlate them with your regions.bed file which contains chromosomal identifiers?
You should extract reads that span the intervals with samtools view first (A: Extract Reads From A Bam File That Fall Within A Given Region ) and then convert them to fasta for GC analysis.

ADD COMMENTlink modified 16 months ago • written 16 months ago by genomax85k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 714 users visited in the last hour