How to plot alignment statistics; how many reads are mapped to the genome?
2.1 years ago
anamaria ▴ 180

Hello,

I am doing RNA-seq analysis and plan to run these steps:

hisat2 -p 12 --new-summary --summary-file $OUTPUT.hisat2.summary -x $REF -1 $R1 -2 $R2 -S $OUTPUT.sam
# 1- Convert sam to bam
samtools view -bS -o $OUTPUT.bam $OUTPUT.sam
# 2- Sort bam file
samtools sort $OUTPUT.bam -o $OUTPUT.sorted.bam
# 3- Generate index for bam file
samtools index $OUTPUT.sorted.bam


I know that I can get the mapped and unmapped reads with:

samtools view -b -f 2 $OUTPUT.bam > mapped.bam
samtools view -b -F 2 $OUTPUT.bam > unmapped.bam


Can someone please recommend code to make a plot like the one attached?

samtools RNA-Seq • 1.0k views
2.1 years ago

MultiQC will automatically generate reports for HISAT2 and a bunch of other software, such as FastQC.


Thank you so much! So basically, if I use hisat2 with the --new-summary flag, I will get summary stats that I can use with MultiQC to generate plots? Do you have any tutorial on exactly how MultiQC is used for that purpose?

Or all I need to run is: multiqc .

and it will generate the output from whatever it finds in the current directory? Please advise.


It will look through all the files and directories contained within the directory you specify for compatible results/reports.

So, for example, if your project directory contains one directory with your fastqc results and another with your hisat2 results, pointing multiqc at that project directory will generate a report that includes both the fastqc results and the hisat2 results.
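
A minimal sketch of that, in case it helps. The directory names here (my_project, multiqc_out) are placeholders I made up, so swap in your own. It assumes your hisat2 summary files and fastqc outputs live somewhere under the project directory, and it only calls multiqc if it is actually installed:

```shell
# Hypothetical names, assumed for illustration only:
#   my_project   = directory holding your fastqc and hisat2 result directories
#   multiqc_out  = where the report should be written
PROJECT_DIR="${PROJECT_DIR:-my_project}"
REPORT_DIR="${REPORT_DIR:-multiqc_out}"

# Make sure the output directory exists before writing the report into it.
mkdir -p "$REPORT_DIR"

if ! command -v multiqc >/dev/null 2>&1; then
    # MultiQC is not installed, so there is nothing to run.
    echo "multiqc not found on PATH" >&2
elif [ ! -d "$PROJECT_DIR" ]; then
    echo "project directory '$PROJECT_DIR' does not exist" >&2
else
    # MultiQC searches PROJECT_DIR and its subdirectories for results it recognises;
    # -o sets the directory the report is written to.
    multiqc "$PROJECT_DIR" -o "$REPORT_DIR"
fi
```

Running plain "multiqc ." from inside the project directory does the same search, just with the report written to the current directory instead.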