Question: Failed read alignment in Plasmodium falciparum
RT 10 (Singapore) wrote 4.8 years ago:

I am analyzing ChIP-Seq data from Plasmodium falciparum (whose genome is well known to be ~80% AT, ~20% GC). The reads are 75 bp paired-end and were mapped to the genome using Bowtie (1 and 2) and BWA. I am getting a very low alignment rate, less than 10% for the sample of interest, even though the FastQC report suggests the data quality is good. FastQC does complain about a few other things, such as duplicates and GC content, which I suppose is normal for this AT-biased genome.
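
Since the P. falciparum genome is roughly 20% GC, the per-read GC distribution is a cheap first check for contamination: reads from a genome such as E. coli (about 50% GC) should form a separate peak. A minimal sketch in plain Python, assuming an uncompressed FASTQ with four lines per record; the function names and the two toy reads are illustrative only and not taken from any real run:

```python
from collections import Counter


def gc_percent(seq):
    """Return GC percentage among unambiguous bases (A, C, G, T); 0.0 if none."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return 100.0 * gc / total if total else 0.0


def read_fastq_sequences(lines):
    """Yield the sequence line of each record (assumes 4 lines per record)."""
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line.strip()


def gc_histogram(seqs, bin_width=10):
    """Count reads per GC-percentage bin, keyed by the lower bin edge."""
    hist = Counter()
    for seq in seqs:
        hist[int(gc_percent(seq) // bin_width) * bin_width] += 1
    return dict(sorted(hist.items()))


# Toy input (made up): one AT-only read and one GC-rich read.
fastq = [
    "@read1", "ATATATTTAAATATATAAAT", "+", "I" * 20,
    "@read2", "GCGCATGCCGTAGGCTAGCC", "+", "I" * 20,
]
print(gc_histogram(read_fastq_sequences(fastq)))  # {0: 1, 70: 1}
```

On real data, `read_fastq_sequences` can be fed an open file handle; a large share of reads in the 40 to 60 bins would point away from P. falciparum.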

I tried to BLAST the sequences, and the majority of the queries match E. coli and many other bacterial sequences, though the post-doc who performed the assay says they never used a plasmid in the pipeline!
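
To put a number on "the majority", the BLAST hits can be tallied per subject. The sketch below assumes BLAST was run with tabular output (`-outfmt 6`, default column order, where column 2 is the subject ID and column 12 is the bit score) and keeps only the best-scoring hit for each read. The subject IDs `ecoli_chr` and `pf_chr01` and the rows themselves are hypothetical placeholders:

```python
from collections import Counter


def best_hits(blast_lines):
    """Map each query ID to (subject ID, bit score) of its highest-scoring hit."""
    best = {}
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


def tally_subjects(blast_lines):
    """Count how many queries have each subject as their best hit."""
    return Counter(subject for subject, _ in best_hits(blast_lines).values())


# Made-up rows in outfmt 6 column order, for illustration only.
rows = [
    "read1\tecoli_chr\t100.0\t75\t0\t0\t1\t75\t1000\t1074\t1e-30\t139",
    "read1\tpf_chr01\t85.0\t40\t6\t0\t1\t40\t500\t539\t1e-05\t50.0",
    "read2\tecoli_chr\t98.7\t75\t1\t0\t1\t75\t2000\t2074\t1e-28\t134",
    "read3\tpf_chr01\t100.0\t75\t0\t0\t1\t75\t900\t974\t1e-30\t139",
]
print(tally_subjects(rows))
```

Subject IDs identify sequences rather than organisms, so grouping by species would need a separate ID-to-organism lookup, which is not shown here.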

I welcome any suggestions on how else we could improve the alignments or troubleshoot this.

PS: The control sample aligned at 50%, in contrast to the treated one. The sample is blood cells infected with P. falciparum, so there should be no other source of genomic contamination either. This is the second time we have run the ChIP-Seq; last time the alignment was around 22%. :( I just read about the GEM mappability tool and am planning to try it.

Update: Mapping to the host (human) genome was not fruitful, but the BLAST results strongly implicate E. coli, so we are now mapping against E. coli. Just curious: has anyone handled low-complexity libraries like Plasmodium falciparum? We would appreciate some advice, as we suspect that may be the problem: this is our first time with ChIP-Seq, and the first time the facility that ran the experiment has handled an AT-rich genome. Thanks!
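
When the same reads are mapped against several candidate references (P. falciparum, human, E. coli), the alignment rates can be compared directly from the SAM output. A rough sketch, assuming plain-text SAM input: it relies only on the standard FLAG bits (0x4 unmapped, 0x100 secondary, 0x800 supplementary) and counts each primary record once. The function name and the toy records are illustrative:

```python
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def alignment_rate(sam_lines):
    """Return the fraction of primary SAM records that are mapped."""
    total = mapped = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue  # count each read once, via its primary record
        total += 1
        if not flag & UNMAPPED:
            mapped += 1
    return mapped / total if total else 0.0


# Made-up records: two mapped reads, one unmapped, one secondary alignment.
sam = [
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t0\tchr1\t10\t42\t5M\t*\t0\t0\tACGTA\tIIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGTA\tIIIII",
    "r3\t16\tchr1\t50\t30\t5M\t*\t0\t0\tACGTA\tIIIII",
    "r3\t272\tchr1\t200\t0\t5M\t*\t0\t0\tACGTA\tIIIII",
]
print(f"{alignment_rate(sam):.1%}")
```

Running this on the SAM file from each reference gives one rate per candidate genome; for paired-end data each mate is counted as its own record.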

modified 4.8 years ago by Istvan Albert ♦♦ 80k • written 4.8 years ago by RT 10

Could some of the samples have high amounts of host DNA carry-over? Try aligning against the host species as well.

written 4.8 years ago by Devon Ryan 89k

Thanks Ryan, I will try it. We did try that with the previous data, but no luck!

written 4.8 years ago by RT 10
Istvan Albert ♦♦ 80k (University Park, USA) wrote 4.8 years ago:

If your sequences match E. coli or other bacteria, then that's that. You have sequenced E. coli and other bacteria. The sample preparation is not what it was claimed to be.

written 4.8 years ago by Istvan Albert ♦♦ 80k

Yeah it looks like it. Thanks for the suggestions. 

written 4.8 years ago by RT 10