Question: BBSplit xenograft human-mouse: raw counts
6 months ago, giovannaventola3es9 wrote:

Hi, I have a problem with BBSplit. I have xenograft mouse-human RNA-seq samples (paired FASTQ files) and I thought of using BBSplit to remove the mouse contamination.

So I used this command line:

bbsplit.sh in1=reads1.fq in2=reads2.fq ref=human.fa,mouse.fa ambiguous2=toss basename=out_%.fq refstats=Statistics_%.txt

Then I remapped the output FASTQ file to the human reference with STAR, and I would like to use featureCounts to reconstruct the raw gene counts, but it doesn't work well.

Can you recommend a pipeline to follow for RNA-seq data after using BBSplit? Thanks so much for the reply.


but it doesn't work well.

What does not work well? After you bin the reads they should be able to map to human/mouse genomes normally. Can you post what the refstats looked like?

There is also XenofilteR (https://github.com/PeeperLab/XenofilteR) but it sounds like your problem is not with the binning.

written 6 months ago by genomax, modified 6 months ago

RefStats of one of my samples:

name  %unambiguousReads  unambiguousMB  %ambiguousReads  ambiguousMB  unambiguousReads  ambiguousReads  assignedReads  assignedBases
HG38  90.69745           4525.569508    6.03784          301.545669   59646218          3970720         63616938       4827115177
mm10  2.89875            144.634382     6.03784          301.545669   1906330           3970720         1906330        144634382
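For anyone who wants to pull these numbers out of a refstats file programmatically, here is a minimal sketch. It assumes the file on disk is tab-separated with a single header line and the column order shown above; `refstats_summary` and the file name `Statistics.txt` are hypothetical names used only for illustration.

```shell
# Print reference name, %unambiguousReads (2 decimals) and assignedReads
# from a BBSplit refstats file. Assumes tab-separated columns in the order
# shown in the table above, with one header line that is skipped.
refstats_summary() {
  awk -F'\t' 'NR > 1 { printf "%s\t%.2f\t%d\n", $1, $2, $8 }' "$1"
}

# Example call (hypothetical file name):
# refstats_summary Statistics.txt
```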

It seems that about 90% of the total reads map to HG38, is that right? Then I map this single FASTQ file to the GENCODE reference using STAR and obtain: Uniquely mapped reads % | 83.15%

Then I run featureCounts and obtain:

Total alignments : 77189115                                           
Successfully assigned alignments : 8748229 (11.3%)
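As a quick sanity check, the assignment rate can be recomputed from the two counts in the log above. This is only a sketch of the arithmetic; `pct` is a hypothetical helper name.

```shell
# Percentage of numerator over denominator, one decimal place.
pct() {
  awk -v n="$1" -v d="$2" 'BEGIN { printf "%.1f\n", 100 * n / d }'
}

# Successfully assigned alignments / total alignments from the log above
pct 8748229 77189115   # prints 11.3
```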

This seems very low compared with the STAR mapping rate. Why?

By contrast, if I skip BBSplit and map the original FASTQ files directly to the human reference, I obtain 75.6% uniquely mapped reads with STAR and 23.1% assigned with featureCounts.

How is this possible? How can I fix it? Which of the two analyses is better?

written 6 months ago by giovannaventola3es9, modified 6 months ago by genomax

Are you using the correct strand option (-s) when counting with featureCounts?

BBSplit (which uses bbmap.sh under the covers) is clearly able to assign 90% of your reads, so there should be no reason why the split file should not align well directly.

written 6 months ago by genomax, modified 6 months ago

My original FASTQ files were prepared with a reverse-stranded kit. When I remove the mouse reads from them and get a single FASTQ file, could I lose this information?

written 6 months ago by giovannaventola3es9, modified 6 months ago

FASTQ files themselves are not reverse stranded. The kit that was used for the prep captured the reverse strand. Did you try using the -s 2 option when counting reads? Have you examined your aligned files to ensure that reads are aligning properly (e.g. that there is no widespread alignment outside of exons, which would suggest DNA contamination in your prep)?

written 6 months ago by genomax

Yes, that's right! However, I used -s 2 and still obtained low counts. I don't know whether there is DNA contamination... but what would that mean?

written 6 months ago by giovannaventola3es9

Have you examined your aligned files in IGV? Go to genes you know should be there and see what the alignments there look like. What happens if you use -s 0? Does the assignment % go up? Just to be sure: we are discussing all this for the human part of your data, and the mouse part has been separated?
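One way to run the strand comparison suggested above is to count with each -s setting and compare the assigned fraction from the `.summary` file that featureCounts writes next to its output. This is a minimal sketch, not a tested pipeline: `assigned_pct` and `run_strand_sweep` are hypothetical helper names, the BAM and GTF paths are placeholders, and depending on your featureCounts version you may also need `--countReadPairs` alongside `-p` to count fragments rather than reads.

```shell
# Percent of alignments assigned, read from a featureCounts .summary file
# (tab-separated: a header line, then one "Status<TAB>count" row per category).
assigned_pct() {
  awk -F'\t' 'NR > 1 { total += $2; if ($1 == "Assigned") assigned = $2 }
              END { if (total > 0) printf "%.1f\n", 100 * assigned / total }' "$1"
}

# Count the same BAM with -s 0 (unstranded), 1 (stranded), 2 (reverse stranded).
# Arguments: $1 = BAM file, $2 = GTF annotation (both placeholders).
run_strand_sweep() {
  bam=$1
  gtf=$2
  for s in 0 1 2; do
    featureCounts -p -s "$s" -a "$gtf" -o "counts_s${s}.txt" "$bam"
    printf 's=%s\t%s%%\n' "$s" "$(assigned_pct "counts_s${s}.txt.summary")"
  done
}

# Example call (placeholder file names):
# run_strand_sweep aligned.bam annotation.gtf
```

If the assigned percentage is similar for -s 1 and -s 2 and highest for -s 0, the library is probably behaving as unstranded; if all three are low, the problem is more likely the annotation or the alignments themselves.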

written 6 months ago by genomax