No hits for influenza virus in RNA-seq data using FastQ Screen (from the Babraham Institute)
6 months ago

Hi all, I have been trying to extract the reads that map to the influenza virus genome (negative-sense RNA) from RNA-seq data from a chicken infection experiment that I recently ran, using FastQ Screen (https://www.bioinformatics.babraham.ac.uk/projects/fastq_screen/). It is a nice tool in which you edit the configuration file to add the database(s) to screen against and to set the aligner. I have used two aligners, BWA and Bowtie2, and have indexed the genomes for the three databases (mouse as a negative control, chicken, and influenza virus of the same strain used for the infection, plus another reference strain). The indexing completed correctly:

bwa index Guangdong_HA
bowtie2-build Guangdong_HA.fasta Guangdong_HA 
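
For readers following along, the configuration described above might look roughly like the sketch below. It is only an illustration: the index locations under /path/to/ and the database names are placeholders, not the poster's actual setup. Each DATABASE line is meant to point at the index basename (the prefix the index files share), not at the FASTA file itself.

```shell
# Write a minimal FastQ Screen configuration file.
# All paths and database names below are placeholders (assumptions).
conf_file="fastq_screen_flu.conf"

{
    printf '# Aligner locations can be set here if they are not on PATH, e.g.\n'
    printf '# BWA\t/path/to/bwa\n'
    printf '# BOWTIE2\t/path/to/bowtie2\n'
    printf 'THREADS\t8\n'
    # One DATABASE line per genome: name, then index basename
    printf 'DATABASE\t%s\t%s\n' Chicken /path/to/indices/chicken/chicken_genome
    printf 'DATABASE\t%s\t%s\n' Mouse   /path/to/indices/mouse/mouse_genome
    printf 'DATABASE\t%s\t%s\n' Flu_HA  /path/to/indices/flu/Guangdong_HA
} > "$conf_file"

# Quick check: list the databases FastQ Screen would be asked to screen.
awk '$1 == "DATABASE" {print $2}' "$conf_file"
```

If the basename on a DATABASE line does not match the prefix of the index files on disk, the aligner has nothing to search, so this is worth comparing against a directory listing of each index folder.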

Then I ran FastQ Screen. Here I used BWA, but I have done the same with Bowtie2. I am telling the tool to align my FASTQ file (infected, 6 hpi) against the three databases.

FastQ-Screen-0.15.3/fastq_screen --aligner bwa /mnt/lustre/RDS-live/samir/ephemeral/Infection_expe/fastq/Ross-6h-A_R1_trimmed.fastq

I have done the same for the FASTQ files produced by the STAR alignment, i.e. the reads left after removing those that aligned to the chicken genome.

FastQ-Screen-0.15.3/fastq_screen --aligner bwa /mnt/lustre/RDS-live/samir/ephemeral/Infection_expe/fastq/unmapped_fastq/Ross_6h_A1Unmapped.out.mate1.fastq
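
One thing that may be worth checking with commands like the two above: according to its documentation, FastQ Screen by default screens only a subset of the reads (100,000) rather than the whole file, so a rare viral signal could be missed. The `--subset 0` option asks it to process every read, and `--conf` selects an explicit configuration file. The sketch below only builds and prints such a command; the configuration and FASTQ file names are placeholders, not real paths.

```shell
# Build a FastQ Screen command that screens every read instead of a subset.
# The config and FASTQ names passed in below are placeholders (assumptions).
build_screen_cmd() {
    conf="$1"
    fastq="$2"
    # --subset 0 : process the whole file rather than a sample of reads
    # --conf     : use an explicit configuration file
    printf 'fastq_screen --aligner bwa --subset 0 --conf %s %s\n' "$conf" "$fastq"
}

cmd=$(build_screen_cmd fastq_screen_flu.conf sample_R1_trimmed.fastq)
echo "$cmd"       # inspect the command first
# eval "$cmd"     # uncomment to run it once the paths are real
```

Screening all reads takes longer, but with roughly 40 million reads a 100,000-read sample represents only a small fraction of the data.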

I have also run this for the infection at 12 hpi. I obtained good mapping to chicken, no unique mapping to mouse, and no mapping to any of the flu RNA databases, see below:

[image: FastQ Screen results plot]

I am sure it is something to do with the aligner: does BWA work when aligning RNA reads (my samples) against an RNA genome (the flu database)? Does Bowtie2?
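
A way to test this independently of any aligner is to look for a short stretch of the viral reference directly in the reads, in both orientations. Both BWA and Bowtie2 try reads against both strands by default, so the sense of the genome should not matter by itself, but a plain text search removes the aligner from the question entirely. The sketch below is a rough check only: the function names are made up for illustration, the k-mer would in practice be copied from the HA reference (20 to 30 bases), and an exact match will miss reads with mismatches.

```shell
# Reverse-complement a DNA string (uses rev and tr).
revcomp() {
    printf '%s\n' "$1" | rev | tr 'ACGTacgt' 'TGCAtgca'
}

# Count reads whose sequence line contains the k-mer or its reverse complement.
# $1 = k-mer taken from the viral reference, $2 = FASTQ file (4 lines per read)
count_reads_with_kmer() {
    rc=$(revcomp "$1")
    awk -v k="$1" -v rc="$rc" \
        'NR % 4 == 2 && (index($0, k) || index($0, rc)) {n++} END {print n + 0}' "$2"
}

# Tiny made-up FASTQ to show the behaviour.
demo_fq=$(mktemp)
printf '@r1\nTTTACGTAGGCATT\n+\nIIIIIIIIIIIIII\n' >  "$demo_fq"   # forward k-mer
printf '@r2\nAAGCCTACGTAA\n+\nIIIIIIIIIIII\n'     >> "$demo_fq"   # reverse complement
printf '@r3\nGGGGGGGGGGGG\n+\nIIIIIIIIIIII\n'     >> "$demo_fq"   # no match

count_reads_with_kmer ACGTAGGC "$demo_fq"   # prints 2
```

If several k-mers from different parts of the reference all return zero on the full FASTQ, the problem is more likely in the sample or library than in the aligner.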

Could any of you explain why I do not get any reads mapped to the virus? The people who did the infection told me the samples were well infected.

Thanks

alignment Linux mapping • 595 views
6 months ago
Chris Dean ▴ 390

Some possible reasons why you did not observe any hits to the viral reference segments in your database are that (1) there was no viral RNA in the sample you collected; (2) a lack of sequencing depth precluded you from identifying any viral RNA that might have been in your sample; or (3) the viral reference segments do not match closely enough to the viral RNA in your sample.

Assuming the x-axis represents the total proportion of reads in your sample, these reasons seem plausible, as a majority of your reads were assigned to the chicken and mouse reference genomes. Consider extracting the unmapped reads (the grey proportion at the bottom of your graphic) and running a simple BLAST search on them to see what they come back as.
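
BLAST takes FASTA input, so the unmapped FASTQ reads need converting first. A minimal sketch, assuming standard four-line FASTQ records; the function name is made up for illustration, and it simply takes the first N reads, which is not a random sample.

```shell
# Convert the first N reads of a FASTQ file to FASTA.
# $1 = FASTQ file (4 lines per read), $2 = number of reads to keep
fastq_to_fasta() {
    awk -v n="$2" '
        NR % 4 == 1 {print ">" substr($0, 2)}   # header: swap leading @ for >
        NR % 4 == 2 {print}                     # sequence line
        NR >= 4 * n {exit}                      # stop after N complete records
    ' "$1"
}

# Tiny made-up FASTQ to show the behaviour.
demo_unmapped=$(mktemp)
printf '@read1\nACGTACGT\n+\nIIIIIIII\n' >  "$demo_unmapped"
printf '@read2\nTTGGCCAA\n+\nIIIIIIII\n' >> "$demo_unmapped"
printf '@read3\nGGGGCCCC\n+\nIIIIIIII\n' >> "$demo_unmapped"

fastq_to_fasta "$demo_unmapped" 2
```

A few hundred reads is usually enough to paste into the BLAST web form and get a first impression of what the unmapped fraction contains.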


Thanks for your comments. (1) "There was no viral RNA in the sample you collected": this is something we cannot validate unless we run RT-qPCR on the sample, which is not feasible at the moment. It could be a weak infection, because the virus is actually infecting chicken eggs.

(2) "A lack of sequencing depth precluded you from identifying any viral RNA that might have been in your sample": we sequenced to a depth of 40 million reads, which is high. Do you think a BLAST search could pick up any remnants if the sequencing depth was not high enough to capture the virus?

(3) "The viral reference segments do not match closely enough to the viral RNA in your sample": I do not think so, because I ran the alignment against the actual virus that was used in the infection. There might be changes during the infection itself, which is why I also aligned against a reference influenza strain; that did not produce any hits either.

A general last question: do you think the poly(A) enrichment during library preparation for sequencing could prevent detection of the virus reads, as they are not polyadenylated?


(1) Propagating an influenza virus in embryonated chicken eggs is a common practice for enrichment purposes, but based on the graphs you are showing it does not look like it was effective, and there is no confirmation of the virus by qPCR or a hemagglutination assay.

(2) If the sequencing depth was not high enough to capture the virus, BLAST will not be able to find it. My point was to take the reads that did not map to anything and BLAST them to see what hits you get. You may get hits to a viral segment not currently represented in your reference database, e.g., the polymerase segments PB2, PB1 and PA (segments 1 to 3), NA (segment 6) or NS1/NS2 (segment 8), or to human/bacterial background DNA.

(3) This sounds fine. Try adding in the other segments if they are available, i.e., those representing the polymerase, neuraminidase and non-structural proteins.
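
Adding the other segments could be done along the lines of the sketch below, assuming one FASTA file per segment in a single directory. The function name, directory layout and file names are hypothetical; the output is deliberately given a different extension so the glob does not pick it up again.

```shell
# Merge per-segment FASTA files into one reference and report the record count.
# $1 = directory with one *.fasta per segment, $2 = combined output file
combine_segments() {
    seg_dir="$1"
    out_fa="$2"
    # awk 1 copies every line and adds a final newline if a file lacks one,
    # so a header never gets glued onto the previous sequence.
    awk 1 "$seg_dir"/*.fasta > "$out_fa"
    grep -c '^>' "$out_fa"
}

# Tiny made-up example with two segment files.
demo_dir=$(mktemp -d)
printf '>segment_a_demo\nACGTACGT\n' > "$demo_dir/segment_a.fasta"
printf '>segment_b_demo\nTTGGCCAA'   > "$demo_dir/segment_b.fasta"   # no trailing newline

n_segments=$(combine_segments "$demo_dir" "$demo_dir/all_segments.fa")
echo "records in combined reference: $n_segments"

# The combined file would then be indexed again, for example:
# bowtie2-build all_segments.fa all_segments
# bwa index -p all_segments all_segments.fa
```

For a complete influenza A genome the record count should come out as eight, one per segment, and the FastQ Screen configuration would need to point at the new index basename.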

I do not have an educated opinion about your last question.
