I was given recently sequenced small RNA data from the pathogenic fungi Aspergillus fumigatus and Candida albicans and asked to perform differential expression analysis of miRNAs in response to human blood infection. I therefore had 2 replicates for each RNA-Seq experiment, for both fungi alone and after their extraction from infected human blood: namely Af_1, Af_2, Ca_1, Ca_2 and HsAf-Af_1, HsAf-Af_2, HsCa-Ca_1, HsCa-Ca_2. I was supposed to develop a workflow and prepare differential expression tables.
I did this, but I failed to impress my mentor and I don't know why :(
1- Checked the quality of the FASTQ files with FastQC, then removed the Illumina small RNA 3' adapters and reads shorter than 15 bp with bbduk.
2- Downloaded Aspergillus_fumigatus.CADRE.32.gtf.gz, Aspergillus_fumigatus.CADRE.dna.toplevel.fa.gz, Candida_albicans_sc5314.ASM18296v2.32.gtf.gz and Candida_albicans_sc5314.ASM18296v2.dna.toplevel.fa.gz from Ensembl
3- Built the genome indexes by bowtie2-build -f genome.fa genome
4- Mapped the cleaned reads to the reference genome with tophat by tophat -p 10 -G file.gtf genome file.fq
5- Assembled transcripts by cufflinks -p 10 accepted_hits.bam
6- Created a merged transcriptome annotation by cuffmerge -g file.gtf -s genome.fa -p 10 assemblies.txt
7- Identified differentially expressed genes with cuffdiff, for example cuffdiff -o diff_out -b genome.fa -p 10 -L Af,HsAf -u merged_asm/merged.gtf Af_1.bam,Af_2.bam HsAf-Af_1.bam,HsAf-Af_2.bam
8- Finally, I extracted only the significant genes
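To make step 8 concrete, here is a minimal sketch of the filtering. It assumes the usual tab-separated cuffdiff table (for example diff_out/gene_exp.diff) with a header row containing a column named significant that holds yes or no. The function name and the output file name are only illustrative.

```shell
# Keep the header row plus every row that cuffdiff flagged as significant.
# Assumption: tab-separated table whose header has a "significant" column
# with values "yes"/"no" (the usual layout of cuffdiff *.diff files).
extract_significant() {
    # $1: path to a cuffdiff table, e.g. diff_out/gene_exp.diff
    awk -F '\t' '
        NR == 1 {
            # Locate the "significant" column by name instead of by position.
            for (i = 1; i <= NF; i++) if ($i == "significant") col = i
            print
            next
        }
        col && $col == "yes"
    ' "$1"
}

# Hypothetical usage (file names are placeholders):
# extract_significant diff_out/gene_exp.diff > Af_vs_HsAf.significant.tsv
```

Looking the column up by name keeps the filter working even if the table has a different number of columns than expected. If no such column is found, only the header is printed.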
I was going to use a miRNA GTF, but there were no such files for these fungi in miRBase. There was a GFF3 in Ensembl that contains miRNAs, but when I converted it to GTF I got an error, so I used the GTFs mentioned above.
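Before converting the GFF3, it may help to check which feature types it actually contains. The sketch below assumes an uncompressed GFF3 file with the feature type in the third tab-separated column. Both function names are made up for illustration, and the miRNA pattern is a guess about how the features are labelled, so it may need adjusting.

```shell
# Count how often each feature type (GFF3 column 3) occurs,
# skipping comment/directive lines that start with '#'.
gff3_feature_types() {
    # $1: path to an uncompressed GFF3 file
    awk -F '\t' '!/^#/ && NF >= 3 { print $3 }' "$1" | sort | uniq -c
}

# Print only the rows whose feature type contains "miRNA".
# Assumption: the exact label (miRNA, pre_miRNA, ...) depends on the file;
# check the counts from gff3_feature_types first and adjust the pattern.
gff3_mirna_rows() {
    awk -F '\t' '!/^#/ && $3 ~ /miRNA/' "$1"
}

# Hypothetical usage (file name is a placeholder):
# gff3_feature_types annotation.gff3
# gff3_mirna_rows annotation.gff3 > mirna_only.gff3
```

If the second function prints nothing, the annotation probably does not label any features as miRNA under that name.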