BedGraph file
4 months ago
daffodil ▴ 10

Hi,

I want to generate a BigWig file from a BAM file, and I wrote these commands:

bedtools genomecov -pc -ibam SRR212304_marked_duplicates.bam > SRR212304.bedGraph
sort -k1,1 -k2,2n SRR212304.bedGraph > SRR212304_sorted.bedGraph
./bedGraphToBigWig SRR212304.bedGraph mm10.chrom.sizes SRR21230488.bw

But I got this error:

end (223) before the start (252) line 253 of SRR21230488.bedGraph

I opened the file with nano and fixed that line, but when I checked the file there were many more lines with the same problem. What should I do?

Best

bam BigWig • 1.4k views

I am relatively sure you need to use the -bg (or -bga to include zero intervals) argument to genomecov to trigger bedGraph output. Otherwise it's a histogram-like output. Try that. What is the output of head SRR212304.bedGraph?
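
One quick way to tell the two outputs apart is to count columns: the default histogram has five tab-separated columns, while bedGraph has four (chrom, start, end, value). Below is a minimal sketch of such a check. The helper name `looks_like_bedgraph` is made up for illustration, and the histogram line is an invented example, not real data.

```shell
# Hypothetical helper: succeeds only if every line read from stdin has
# exactly 4 tab-separated fields with integer start < end.
looks_like_bedgraph() {
    awk -F '\t' '
        NF != 4 || $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ || $2 + 0 >= $3 + 0 { bad = 1; exit }
        END { exit bad }
    '
}

# An invented histogram-style line (5 columns) and a bedGraph-style line (4 columns).
hist_line=$(printf 'chr1\t0\t980\t1000\t0.98')
bg_line=$(printf 'chr1\t3000588\t3000620\t1')

if printf '%s\n' "$bg_line" | looks_like_bedgraph; then
    echo "second sample looks like bedGraph"
fi
if ! printf '%s\n' "$hist_line" | looks_like_bedgraph; then
    echo "first sample is not bedGraph (histogram output?)"
fi
```

On a real file, something like `head SRR212304.bedGraph | looks_like_bedgraph` would give a first indication before running the converter.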


Thank you so much for your response. I used -bg to generate a new bedGraph file and then sorted it:

sort -k1,1 -k2,2n SRR21230488_bg.bedGraph > SRR21230488_bg_sorted.bedGraph

But I got this error when I tried to generate the BigWig file:

SRR21230488_bg.bedGraph is not case-sensitive sorted at line 25062706.  Please use "sort -k1,1 -k2,2n" with LC_COLLATE=C

I would appreciate it if you could help me.

LC_ALL=C sort -t $'\t' -k1,1 -k2,2n SRR21230488_bg.bedGraph
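
Note that this command prints the sorted lines to standard output, and the error message above names SRR21230488_bg.bedGraph itself, so it may be worth checking that the sorted result is written to a new file and that the new file is the one passed to bedGraphToBigWig. A minimal sketch on a made-up four-line file (the file names are placeholders, not files from this thread):

```shell
# Work in a scratch directory with a tiny made-up bedGraph.
tmpdir=$(mktemp -d)
tab=$(printf '\t')
printf 'chr2\t0\t10\t1\nchr1\t50\t60\t2\nchrM\t0\t5\t1\nchr1\t5\t20\t4\n' > "$tmpdir/demo.bedGraph"

# Sort bytewise by chromosome, then numerically by start, into a NEW file.
LC_ALL=C sort -t "$tab" -k1,1 -k2,2n "$tmpdir/demo.bedGraph" > "$tmpdir/demo.sorted.bedGraph"

# With -c, sort only checks the order and exits non-zero if the file is not sorted.
if LC_ALL=C sort -t "$tab" -k1,1 -k2,2n -c "$tmpdir/demo.sorted.bedGraph"; then
    echo "sorted file is in order"
fi
cat "$tmpdir/demo.sorted.bedGraph"
```

Running the same `sort -c` check on the file that is actually given to bedGraphToBigWig shows whether that particular file is in the expected order.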

I ran this command: LC_ALL=C sort -t $'\t' -k1,1 -k2,2n SRR21230488_bg.bedGraph but I got this error again:

SRR21230488_bg.bedGraph is not case-sensitive sorted at line 25062706.  Please use "sort -k1,1 -k2,2n" with LC_COLLATE=C,  or bedSort and try again.

try with LC_COLLATE=C sort -t $'\t' -k1,1 -k2,2n SRR21230488_bg.bedGraph but I don't think it will change anything....

otherwise, show us the context of line 25062706....


Yes, Sure

sed -n '25062706p' SRR21230488_bg.bedGraph
chr1    3000588 3000620 1

Is there any output for

awk -F '\t' '(int($2) > int($3))' SRR21230488_bg.bedGraph | head

?

awk: cmd. line:2: (int($2) > int($3)
awk: cmd. line:2:                   ^ unexpected newline or end of string

missing parenthesis, fixed


I ran awk -F '\t' '(int($2) > int($3))' SRR21230488_bg.bedGraph | head and there was no output. Then I ran this again:

./bedGraphToBigWig SRR21230488_bg.bedGraph mm10.chrom.sizes SRR21230488.bw

and I got this error:

SRR21230488_bg.bedGraph is not case-sensitive sorted at line 25062706.  Please use "sort -k1,1 -k2,2n" with LC_COLLATE=C,  or bedSort and try again.

let's try another thing.

what is the output of

 awk  '(NF!=4)' SRR21230488_bg.bedGraph | head
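
The checks suggested so far (field count, start versus end, and ordering) can also be combined in one pass. A rough sketch, assuming tab-separated input; `check_bedgraph` is a made-up helper name, and the ordering test only approximates what bedGraphToBigWig checks, it is not its actual implementation:

```shell
# Made-up helper: print the first problem found in a bedGraph file and
# return non-zero; print nothing and return 0 if no problem is found.
check_bedgraph() {
    LC_ALL=C awk -F '\t' '
        NF != 4            { print "line " NR ": expected 4 fields, found " NF; bad = 1; exit }
        $2 + 0 >= $3 + 0   { print "line " NR ": start is not before end"; bad = 1; exit }
        # Appending "" forces a string (bytewise under LC_ALL=C) comparison.
        ($1 "") < (chrom "") { print "line " NR ": chromosome out of order"; bad = 1; exit }
        ($1 "") == (chrom "") && $2 + 0 < start { print "line " NR ": start out of order"; bad = 1; exit }
        { chrom = $1; start = $2 + 0 }
        END { exit bad }
    ' "$1"
}

# Made-up two-line file where chr10 comes after chr2 (wrong in bytewise order).
demo=$(mktemp)
printf 'chr2\t0\t10\t1\nchr10\t0\t5\t2\n' > "$demo"
check_bedgraph "$demo" || echo "file needs fixing or sorting"
```

The helper stops at the first problem, which keeps it fast on a file with tens of millions of lines.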

Furthermore, are you using the latest version of bedGraphToBigWig?


Thanks a lot! It finally worked and the .bw file was generated!


So what was the error? And please at least accept my answer below.


I removed bedGraphToBigWig and installed it again.


Just adding for completeness: you could use bamCoverage from deeptools. It is relatively slow, but a single command does it all, and there are lots of options for customization. It could save you trouble.


To normalize the BigWig file with bamCoverage, I use this command:

bamCoverage -b SRR21230488_marked_duplicates.bam -o output.bw --normalizeUsing RPKM

I would appreciate your comments.

Best

4 months ago

https://bedtools.readthedocs.io/en/latest/content/tools/genomecov.html

By default, bedtools genomecov will compute a histogram of coverage for the genome file provided. The default output format is as follows:

1. chromosome (or entire genome)
2. depth of coverage from features in input file
3. number of bases on chromosome (or genome) with depth equal to column 2
4. size of chromosome (or entire genome) in base pairs
5. fraction of bases on chromosome (or entire genome) with depth equal to column 2

So you don't get CHROM/START/END lines but CHROM/DEPTH/COUNT...

You want the -bg option:

the -bg option instead produces genome-wide coverage output in BEDGRAPH format. This is a much more concise representation since consecutive positions with the same coverage are reported as a single output line describing the start and end coordinate of the interval having the coverage level, followed by the coverage level itself.
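
Putting the pieces of this thread together, the whole conversion could look roughly like the sketch below. It uses only the commands and options already mentioned on this page (`bedtools genomecov -bg -ibam`, a C-locale `sort -k1,1 -k2,2n`, and `bedGraphToBigWig in.bedGraph chrom.sizes out.bw`). The function name `bam_to_bigwig` and the naming of the intermediate file are assumptions made for illustration, and both tools are assumed to be on the PATH.

```shell
# Illustrative wrapper (the name is made up): BAM -> sorted bedGraph -> BigWig.
# Usage: bam_to_bigwig in.bam chrom.sizes out.bw
bam_to_bigwig() {
    bam=$1; sizes=$2; out=$3
    sorted="${out%.bw}.sorted.bedGraph"

    # -bg switches genomecov from the default histogram to bedGraph output;
    # the sort is bytewise (LC_ALL=C) by chromosome, then numeric by start.
    bedtools genomecov -bg -ibam "$bam" \
        | LC_ALL=C sort -k1,1 -k2,2n > "$sorted" || return 1

    # Pass the SORTED file, not the raw genomecov output.
    bedGraphToBigWig "$sorted" "$sizes" "$out"
}

# Example call, using the file names from the question:
# bam_to_bigwig SRR212304_marked_duplicates.bam mm10.chrom.sizes SRR212304.bw
```

Other genomecov options, such as the -pc used in the question, can be added to the first command if they apply to the data.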
