Hello!
I am trying to visualize some .wig files on the UCSC Genome Browser, but I keep getting the following error message while uploading my custom track:
Error line 39996036 of somefile.gz: 'chrEBV' is not a valid sequence name in hg38
I have already tried to filter out all the reads mapping to chrEBV from the starting .bam files, using the procedure suggested in Remove mitochondrial reads from BAM files.
Unfortunately, I still get the same error when trying to upload the new .wig file.
How can I make sure I get rid of all the chrEBV reads and finally visualize my data on the Genome Browser?
Thanks!
How did you do that? What was the command line? What is the output of `samtools idxstats`?
My command line was:
samtools idxstats input.bam | cut -f 1 | grep -v chrEBV | xargs samtools view -b input.bam > output_filtered.bam
It looks OK. And how did you create the wig?
I first created .bigWig files with the following command line:
bamCoverage -b input.bam -o output.bw -of bigwig -bs 20 -p 6 --effectiveGenomeSize 2747877777 --normalizeUsing RPKM -e 76 --centerReads
and then converted it from .bigWig to .wig, using this command line:
bigWigToWig input.bw output.wig
bamCoverage does not allow me to create a .wig file directly, which is why I follow this two-step procedure. Also, when uploading tracks to the UCSC Genome Browser, I prefer .wig.gz files over .bigWig, since for .bigWig I first need to upload them to a web server and then provide a URL (as far as I understand).
Yes, I think ATPoint is right. If bamCoverage uses the SAM header dictionary to create the wig, it will create some records containing the chrEBV chromosome.
Yes, indeed his approach worked fine. Thanks anyway for the reply and the time dedicated to my issue :)
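For anyone landing here with the same error: if chrEBV records do end up in the wig because the contig is still listed in the BAM header, one possible workaround (not necessarily the approach ATPoint suggested) is to drop those records from the wig text itself before uploading. The sketch below is only an illustration under assumptions: `drop_contig` is a hypothetical helper name, and it assumes the wig text is either bedGraph-style (with `#bedGraph section` comment lines) or fixedStep/variableStep blocks.

```shell
# Hypothetical helper: read wig text on stdin and drop every record that
# belongs to the contig given as $1 (for example chrEBV).
drop_contig() {
  awk -v c="$1" '
    # fixedStep/variableStep declaration: start skipping if it names the contig
    /^(fixedStep|variableStep)/ {
      skip = 0
      for (i = 2; i <= NF; i++) if ($i == ("chrom=" c)) skip = 1
      if (!skip) print
      next
    }
    # bedGraph section comment: drop it when the region is on the contig
    /^#bedGraph section / {
      if (index($3, c ":") != 1) print
      next
    }
    # data lines: drop them inside a skipped block or when column 1 is the contig
    !skip && $1 != c
  '
}
```

With the file name from the error message, usage could look like `gzip -dc somefile.gz | drop_contig chrEBV | gzip > somefile.noEBV.wig.gz`, where the output name is made up. Running `samtools view -H output_filtered.bam | grep chrEBV` should also show whether the contig is still declared in the header of the filtered BAM.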