BigWig file of strand-specific RNA-seq data
1
1
Entering edit mode
3.4 years ago
yuabrahamliu ▴ 60

Hi all, I began to process strand-specific RNA-seq data from dUTR method library these days. I added the flag --rna-strandness RF to the default hisat2 commandline on paired-end RNA-seq data and each line of the final bam file also got a XS:A:+ or XS:A:- value to indicate the plus or minus strand. Then, I coverted the bam files to bigwig files and submited them to UCSC browser to check the data. However, the reads in the browser still displayed no strand specificity. I mean, a read would be displayed by both plus and minus strand, rather than only one strand. (See the figure below). I feel confused and don't known what's wrong with that. Could any one give me some help? My code to covert bam to bigwig is like this:  genomeCoverageBed -bg -split -ibam KO.bam -strand + -g mm10.chrom.sizes > KO.p.bedgraph   genomeCoverageBed -bg -split -ibam KO.bam -strand '-' -g mm10.chrom.sizes > KO.m.bedgraph   norm_bedgraph.pl -t 33441081 -i 'KO.p.bedgraph KO.m.bedgraph' -m 'KO.p.bedgraph'   bedGraphToBigWig KO.p.bedgraph.normolized mm10.chrom.sizes KO.p.bw   bedGraphToBigWig KO.m.bedgraph.normolized mm10.chrom.sizes KO.m.bw 

RNA-Seq strand-specific dUTP BigWig • 4.2k views
1
Entering edit mode
3.4 years ago

There is no concept of strand in bigWig files, they only hold positions with values associated with them. If you want to have strand-specific bigWig files then you'll need to make a bigWig file for each strand. You can do that with bamCoverage from deepTools, using the --filterRNAstrand option.

genomeCoverageBed is using a different understanding of strand than what you want. It's splitting individual reads by their orientation, so since you presumably have paired-end data, the mates from each pair will end up in different files. What you want instead is a tool that will treat mates the same and count them (or not) depending on the orientation of the fragment from which they arose.

0
Entering edit mode

Thank you so much for your help. I have changed genomeCoverageBed to bamCoverage, but unfortunately, the plus and minus strands still cannot be distinguished. I feel really confused, I don't know if there is something wrong with my code. Do you have any suggestions on dUTP data? Thank you.  trimmomatic PE -threads 20 SRR_1.fastq SRR_2.fastq trimmed_1.fastq unpaired_1.fastq trimmed_2.fastq unpaired_2.fastq SLIDINGWINDOW:4:30   hisat2 -p 20 --rna-strandness RF -x mm10hisatidx -1 trimmed_1.fastq -2 trimmed_2.fastq |samtools sort -@ 20 > test.bam   samtools index -@ 20 test.bam   bamCoverage -p 20 --normalizeTo1x 2150570000 --filterRNAstrand forward -b test.bam -o test.m.bw   bamCoverage -p 20 --normalizeTo1x 2150570000 --filterRNAstrand reverse -b test.bam -o test.p.bw 

0
Entering edit mode

Look at the scale, there is a clear distinction with presumably either some low level of anti-sense transcription or (more likely) the strandedness of the library isn't perfect.