FPKM (Fragments Per Kilobase of exon model per Million mapped reads) is a length- and depth-normalized measure of gene expression (abundance). Here I describe how to obtain the FPKM value of a gene of interest.
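To make the definition concrete, here is a minimal sketch of the textbook formula, FPKM = fragments * 10^9 / (exon length in bp * total mapped fragments). The function name `fpkm` and its argument order are my own choices for illustration. Cufflinks estimates abundances with a statistical model rather than by simple counting, so its reported values will generally not match this arithmetic exactly.

```shell
# Hypothetical helper (not part of any tool): textbook FPKM from raw numbers.
#   $1 = fragments mapped to the gene
#   $2 = total exon length of the gene, in bp
#   $3 = total mapped fragments in the sample
fpkm() {
    awk -v c="$1" -v l="$2" -v n="$3" \
        'BEGIN { printf "%.2f\n", c * 1e9 / (l * n) }'
}
```

For example, `fpkm 500 2000 10000000` (500 fragments on a 2 kb gene in a sample with 10 million mapped fragments) prints 25.00.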
fastq-dump: converts an SRA file to FASTQ files.
bowtie2: an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences.
samtools: converts the SAM alignments to BAM and sorts them.
cufflinks: assembles transcripts, estimates their abundances, and tests for differential expression and regulation in RNA-Seq samples.
gffread: converts a GFF3 file to a GTF file.
website: http://cufflinks.cbcb.umd.edu/ (gffread is included in the cufflinks package)
- Download genome.fa and genes.gff3 from the genome website, and download the SRA file from NCBI.
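$ fastq-dump -I --split-files SRR123456789.sra  # convert the SRA file to paired FASTQ files
$ gffread -E genes.gff3 -T -o genes.gtf  # convert GFF3 to GTF (-T selects GTF output)
$ bowtie2-build genome.fa genome  # build the bowtie2 index
$ bowtie2 -x genome -1 SRR123456789_1.fastq -2 SRR123456789_2.fastq -S SRR123456789.sam  # align the paired reads
$ samtools view -bS SRR123456789.sam > SRR123456789.bam  # convert SAM to BAM
$ samtools sort -o SRR123456789.sorted.bam SRR123456789.bam  # sort by coordinate (samtools 1.3 or later; older releases: samtools sort SRR123456789.bam SRR123456789.sorted)
$ cufflinks -G genes.gtf -o result SRR123456789.sorted.bam  # estimate FPKM for the annotated genes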
After these steps, the FPKM value of any gene can be looked up by its gene ID in the genes.fpkm_tracking file, which cufflinks writes to the result directory.
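The lookup can be sketched as a small shell function. This assumes the tab-separated genes.fpkm_tracking layout with a header row containing `gene_id` and `FPKM` columns; the function name `get_fpkm` is my own, not something shipped with cufflinks. Columns are located by header name rather than by position, so the sketch does not depend on the exact column order.

```shell
# Hypothetical helper: print "gene_id<TAB>FPKM" for one gene.
#   $1 = gene ID to look up
#   $2 = path to a cufflinks genes.fpkm_tracking file
get_fpkm() {
    awk -F'\t' -v id="$1" '
        # Header row: remember which column number each name lives in.
        NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
        # Data rows: print the FPKM of every row whose gene_id matches.
        $(col["gene_id"]) == id { printf "%s\t%s\n", $(col["gene_id"]), $(col["FPKM"]) }
    ' "$2"
}
```

A call would look like `get_fpkm MyGeneID result/genes.fpkm_tracking`, where MyGeneID is a placeholder for a gene ID from genes.gtf. The function prints nothing if the ID is not in the file.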
Note: If you find any errors, please contact me.