Question: GC content from bam to fasta file
0
gravatar for GK1610
11 weeks ago by
GK161060
United States
GK161060 wrote:

I want to get mean_gc content of a given bed regions from a bam file

I use

samtools fasta $bam > $bam.fasta
bedtools nuc -fi $bam.fasta -bed $regions.bed >$bam_gc_content.bed

this gives me the error 1:67000041-67000044) beyond the length of chr1 size (0 bp). Skipping.

my regions.bed looks like this

1   27719   30282
1   34575   35119
1   178413  180107
1   196349  196596
1   198074  200761

fasta file bam file (paired end)

>HWI-D00309:105:C7HAGANXX:1:2104:19269:80273/2
AACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC
>HWI-D00309:105:C7HAGANXX:1:2212:13536:55714/1
CCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCC
>HWI-D00309:105:C7HAGANXX:1:2307:16222:92746/2
ACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACC
>HWI-D00309:105:C7HAGANXX:1:1210:15413:21872/2
AGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGG
>HWI-D00309:105:C7HAGANXX:1:1313:15716:47136/1
CTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTA
>HWI-D00309:105:C7HAGANXX:1:2104:19269:80273/1
TGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGG
>HWI-D00309:105:C7HAGANXX:1:2212:13536:55714/2
TGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGG
>HWI-D00309:105:C7HAGANXX:1:2307:16222:92746/1
TGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGG
>HWI-D00309:105:C7HAGANXX:1:1313:15716:47136/2
GTTTGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTT
>HWI-D00309:105:C7HAGANXX:1:1105:12543:6619/1
CCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCAAGGACC
>HWI-D00309:105:C7HAGANXX:1:1210:15413:21872/1
TAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCAACCCTAACCCTAACCCTAACCCTAACCCTAAC
>HWI-D00309:105:C7HAGANXX:1:2202:2887:84643/1

How do i resolve this issue?

chip-seq • 145 views
ADD COMMENTlink written 11 weeks ago by GK161060
0
gravatar for genomax
11 weeks ago by
genomax67k
United States
genomax67k wrote:

Your bam.fasta file is now essentially original reads converted to fasta format. How can bedtools correlate them with your regions.bed file which contains chromosomal identifiers?
You should extract reads that span the intervals with samtools view first (A: Extract Reads From A Bam File That Fall Within A Given Region ) and then convert them to fasta for GC analysis.

ADD COMMENTlink modified 11 weeks ago • written 11 weeks ago by genomax67k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 2300 users visited in the last hour