Question

Failed read alignment in Plasmodium falciparum

1

9.8 years ago

RT ▴ 10

I am analyzing ChIP-Seq data from Plasmodium falciparum (well known for its ~80% AT, ~20% GC genome). The reads are 75 bp paired-end and were mapped to the genome using Bowtie (1 and 2) and BWA. I am getting a very low alignment rate, less than 10% for the sample of interest. According to the FastQC report the data quality seems good, although it complains about a few things such as duplicates and GC content, which I suppose is normal for this AT-biased genome.

I tried BLASTing the sequences, and the majority of the queries match E. coli and many other bacterial sequences, though the post-doc who performed the assay says they never used a plasmid in the pipeline!

I welcome any suggestions on how else we could improve the alignments or troubleshoot this.

PS: The control sample's alignment rate was 50%, as against the treated one. The sample is blood cells infected with P. falciparum, so there are no other sources of genomic contamination either. This is the second time we have run the ChIP-Seq; last time the alignment rate was around 22%. :( I just read about the GEM mappability tool and am planning to try it.

Update: Mapping to the host (human) was not fruitful, but the BLAST results strongly implicate E. coli, so we are now mapping against E. coli. Just curious, has anyone handled low-complexity libraries like Plasmodium falciparum? We would appreciate any advice, as we suspect that is the problem now: this is our first time with ChIP-Seq, and the first time the facility that ran the experiment has handled an AT-rich genome! Thanks

short-reads-mapping plasmodium ChIP-Seq • 2.5k views

updated 2.3 years ago by Ram 43k • written 9.8 years ago by RT ▴ 10

0

Could some of the samples have high amounts of host DNA carry-over? Try aligning against the host species as well.
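One way to act on this, as a rough sketch only: align the reads against a combined reference (parasite plus host plus any suspected contaminant) and count where the primary alignments land. The snippet below parses plain SAM text using only the standard SAM columns (FLAG in column 2, RNAME in column 3). The contig names `pf_chr1` and `ecoli` and the toy records are hypothetical placeholders, not names from any real reference build.

```python
from collections import Counter

def tally_references(sam_lines):
    """Count primary alignments per reference name from SAM text lines."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):        # header line, not an alignment
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:            # SAM records have 11 mandatory fields
            continue
        flag = int(fields[1])
        if flag & 0x900:                # skip secondary (0x100) and supplementary (0x800)
            continue
        if flag & 0x4:                  # 0x4 = read is unmapped
            counts["unmapped"] += 1
        else:
            counts[fields[2]] += 1      # RNAME = reference sequence name
    return counts

# Toy records with hypothetical contig names, for illustration only.
toy_sam = [
    "@SQ\tSN:pf_chr1\tLN:1000",
    "r1\t0\tpf_chr1\t10\t42\t10M\t*\t0\t0\tATATATATTA\tIIIIIIIIII",
    "r2\t0\tecoli\t5\t42\t10M\t*\t0\t0\tGCGCATGCAT\tIIIIIIIIII",
    "r2\t256\tecoli\t50\t0\t10M\t*\t0\t0\tGCGCATGCAT\tIIIIIIIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tGGCCGGCCAT\tIIIIIIIIII",
]

if __name__ == "__main__":
    for name, n in tally_references(toy_sam).most_common():
        print(name, n)
```

For real data, the same function can be fed an open SAM file handle. With paired-end reads each mate is counted as its own record.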

9.8 years ago by Devon Ryan 104k

0

Thanks Ryan, I will try it. We did try that with the previous data, but no luck!

updated 2.5 years ago by Ram 43k • written 9.8 years ago by RT ▴ 10

Ram · Answer 1 · 2014-07-10

0

9.8 years ago

Istvan Albert 100k

If your sequences match E. coli or other bacteria, then that's that: you have sequenced E. coli and other bacteria. The data preparation is not what it was claimed to be.
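A quick sanity check that does not depend on any aligner is the per-read GC distribution. P. falciparum reads should cluster around ~20% GC, while E. coli is roughly 50% GC, so a large second peak near 50% would be consistent with the BLAST hits. The following is a minimal sketch, assuming uncompressed FASTQ with exactly four lines per record. The function names and toy reads are illustrative, not part of any existing tool.

```python
from collections import Counter

def gc_percent(seq):
    """Integer GC percentage of a read, ignoring N and other non-ACGT symbols."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return (100 * gc) // acgt if acgt else 0

def read_fastq_sequences(lines):
    """Yield the sequence line of each record, assuming 4-line FASTQ records."""
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line.strip()

def gc_histogram(seqs, bin_width=10):
    """Count reads per GC bin; keys are the lower bound of each bin in percent."""
    hist = Counter()
    for seq in seqs:
        # Clamp so that 100% GC falls into the top bin instead of a bin of its own.
        lower = min(gc_percent(seq) // bin_width * bin_width, 100 - bin_width)
        hist[lower] += 1
    return hist

# Toy FASTQ lines for illustration only.
toy_fastq = [
    "@r1", "ATATATATTA", "+", "IIIIIIIIII",
    "@r2", "ATGCATATAT", "+", "IIIIIIIIII",
    "@r3", "GCGCATGCAT", "+", "IIIIIIIIII",
    "@r4", "GGCCGGCCAT", "+", "IIIIIIIIII",
]

if __name__ == "__main__":
    hist = gc_histogram(read_fastq_sequences(toy_fastq))
    for lower in sorted(hist):
        print(f"{lower:3d}-{lower + 10:3d}% GC: {hist[lower]} reads")
```

On real data, pass an open file handle (or a `gzip.open(path, "rt")` handle for compressed files) instead of the toy list. A GC histogram is only a hint, not a species assignment, so it complements rather than replaces BLAST or alignment against the candidate genome.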

updated 2.5 years ago by Ram 43k • written 9.8 years ago by Istvan Albert 100k

0

Yeah, it looks like it. Thanks for the suggestions.

updated 2.5 years ago by Ram 43k • written 9.8 years ago by RT ▴ 10