Question: GC content from bam to fasta file
0
gravatar for GK1610
22 months ago by
GK161090
United States
GK161090 wrote:

I want to get mean_gc content of a given bed regions from a bam file

I use

samtools fasta $bam > $bam.fasta
bedtools nuc -fi $bam.fasta -bed $regions.bed >$bam_gc_content.bed

this gives me the error 1:67000041-67000044) beyond the length of chr1 size (0 bp). Skipping.

my regions.bed looks like this

1   27719   30282
1   34575   35119
1   178413  180107
1   196349  196596
1   198074  200761

fasta file bam file (paired end)

>HWI-D00309:105:C7HAGANXX:1:2104:19269:80273/2
AACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC
>HWI-D00309:105:C7HAGANXX:1:2212:13536:55714/1
CCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCC
>HWI-D00309:105:C7HAGANXX:1:2307:16222:92746/2
ACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACC
>HWI-D00309:105:C7HAGANXX:1:1210:15413:21872/2
AGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGG
>HWI-D00309:105:C7HAGANXX:1:1313:15716:47136/1
CTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTA
>HWI-D00309:105:C7HAGANXX:1:2104:19269:80273/1
TGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGG
>HWI-D00309:105:C7HAGANXX:1:2212:13536:55714/2
TGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGG
>HWI-D00309:105:C7HAGANXX:1:2307:16222:92746/1
TGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGG
>HWI-D00309:105:C7HAGANXX:1:1313:15716:47136/2
GTTTGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTT
>HWI-D00309:105:C7HAGANXX:1:1105:12543:6619/1
CCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCAAGGACC
>HWI-D00309:105:C7HAGANXX:1:1210:15413:21872/1
TAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCAACCCTAACCCTAACCCTAACCCTAACCCTAAC
>HWI-D00309:105:C7HAGANXX:1:2202:2887:84643/1

How do i resolve this issue?

chip-seq • 496 views
ADD COMMENTlink written 22 months ago by GK161090
0
gravatar for GenoMax
22 months ago by
GenoMax94k
United States
GenoMax94k wrote:

Your bam.fasta file is now essentially original reads converted to fasta format. How can bedtools correlate them with your regions.bed file which contains chromosomal identifiers?
You should extract reads that span the intervals with samtools view first (A: Extract Reads From A Bam File That Fall Within A Given Region ) and then convert them to fasta for GC analysis.

ADD COMMENTlink modified 22 months ago • written 22 months ago by GenoMax94k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1952 users visited in the last hour
_