Coverage drops in fastq alignment against custom Immunoglobulin reference
15 months ago
Gama313 ▴ 90

I am working with HiSeq 2000/2500 single-end RNA-seq reads from leukemia samples. I am interested in aligning all the reads belonging to the immunoglobulin (Ig) genes for further analysis. The task is difficult for two main reasons:

  • The final Ig genes are a "collage" of separate genomic regions, which makes alignment against the genome difficult;
  • Ig genes are hypermutated, so many mutations interfere with the alignment.

I decided to develop my own pipeline since the available ones are not appropriate. Here is a brief summary of the workflow:

  • STAR alignment against hg38;
  • Extraction of unmapped reads;
  • Extraction of reads mapping to the Ig locus;
  • De novo construction of the Ig with Trinity;
  • Alignment of de novo sequences with IgBlast;
  • Select the "correct" Ig;
  • Align sequences against the custom Ig reference with bowtie2 (--very-sensitive-local).
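
For reference, the alignment and extraction steps above can be sketched roughly as follows. This is only an illustration in Python that builds the command lines without running them. The function name `build_commands`, the output file names, and the intermediate FASTQ of candidate reads are assumptions made for this example, not my exact scripts. The Trinity and IgBlast steps are left out, and the Ig region must be supplied by the caller.

```python
def build_commands(sample, fastq, star_index, ig_region, ig_index):
    """Build (but do not run) the command lines for the alignment steps.

    All file names derived from `sample` are hypothetical naming choices.
    """
    prefix = f"{sample}."
    # STAR writes this file name when a coordinate-sorted BAM is requested.
    star_bam = f"{prefix}Aligned.sortedByCoord.out.bam"
    return {
        # Align to hg38 and keep unmapped reads inside the BAM.
        "star": [
            "STAR", "--genomeDir", star_index, "--readFilesIn", fastq,
            "--outSAMtype", "BAM", "SortedByCoordinate",
            "--outSAMunmapped", "Within",
            "--outFileNamePrefix", prefix,
        ],
        # Flag 4 selects reads that did not map.
        "unmapped": [
            "samtools", "view", "-b", "-f", "4",
            "-o", f"{sample}.unmapped.bam", star_bam,
        ],
        # Region queries need an indexed BAM (samtools index).
        "ig_locus": [
            "samtools", "view", "-b",
            "-o", f"{sample}.ig.bam", star_bam, ig_region,
        ],
        # Final alignment against the custom Ig reference.
        "bowtie2": [
            "bowtie2", "--very-sensitive-local", "-x", ig_index,
            "-U", f"{sample}.ig_candidates.fastq",
            "-S", f"{sample}.custom_ig.sam",
        ],
    }


# Example with placeholder values; replace the region with real coordinates.
commands = build_commands(
    sample="S01",
    fastq="S01.fastq.gz",
    star_index="hg38_star_index",
    ig_region="chr14:START-END",  # placeholder, not real coordinates
    ig_index="S01_custom_ig",
)
for step, argv in commands.items():
    print(step, " ".join(argv))
```

Each entry is a plain argument list, so it could be passed to `subprocess.run` once the paths are real.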

I've analyzed around 100 samples. About 70% of them align correctly and display uniform coverage. The remaining 30% show a non-uniform coverage profile (see the attached figure). As you can see, there is the common 3' bias, with coverage increasing toward the end of the Ig. However, there is also a huge drop in the middle of the Ig, and I cannot tell whether it is an artifact (the reads are genuinely missing) or whether the alignment parameters are wrong.
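
To make "drop" less subjective, one option is to flag stretches where the per-base depth falls well below the median depth of the reference. The snippet below is a minimal sketch of that idea, assuming the depths have already been read into a list (for example from per-position depth output). The function name `find_coverage_drops` and the thresholds `fraction=0.25` and `min_length=10` are arbitrary choices for illustration, not values I have validated.

```python
from statistics import median


def find_coverage_drops(depths, fraction=0.25, min_length=10):
    """Return 0-based, half-open (start, end) intervals of low coverage.

    A position is "low" when its depth is below `fraction` times the
    median depth; only runs of at least `min_length` positions are kept.
    """
    if not depths:
        return []
    threshold = fraction * median(depths)
    drops = []
    start = None  # start of the current low-coverage run, if any
    for pos, depth in enumerate(depths):
        if depth < threshold:
            if start is None:
                start = pos
        else:
            if start is not None and pos - start >= min_length:
                drops.append((start, pos))
            start = None
    # Close a run that extends to the end of the reference.
    if start is not None and len(depths) - start >= min_length:
        drops.append((start, len(depths)))
    return drops


# Toy profile: flat coverage, a dip in the middle, then a 3' increase.
toy = [100] * 50 + [5] * 20 + [100] * 30 + [300] * 20
print(find_coverage_drops(toy))
```

Using the median rather than the mean keeps the 3' increase from inflating the threshold, which matters here because the 3' bias is present in every sample.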

To summarize, I cannot understand how it is possible to observe different coverage profiles for samples belonging to the very same experiment. Any suggestions would be helpful.

Alignment RNASeq Immunoglobulin • 252 views
