Determine type of mutation
9 weeks ago

I mapped RNA-seq reads against the genome using hisat2, and I now want to identify the missense, nonsense, and silent mutations. I also have a ".gff" file that contains CDS, gene, and other features. I could do this manually by loading the genome, the alignment (.bam), and the annotation (.gff) into the IGV browser and examining each site, but how can I automate it when I have thousands of mutations? As a result, I would like to get a table like this:

gene  position  type
A1    1020      missense
A1    1040      silent
B2    2000      silent
hisat2 alignment annotation • 542 views

Have you looked at the literature at all? The table you show is the result of a fairly extensive pipeline made up of several different tools, as described, for instance, in "Variant analysis pipeline for accurate detection of genomic variants from transcriptome sequencing data" (Adetunji et al., 2019) and in the references listed therein or linked from that article. There are many ways to solve this problem, and none of them is simple, but you should probably look over the references or tutorials so you can sketch out a plan.


I did not think it would be too hard. I have already performed variant calling and have a .vcf file. I thought there would be a way to parse all these files, annotation (.gff), variants (.vcf), alignment (.bam), and genome (.fasta), to get the result I want.
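To illustrate what "parsing all these files" boils down to for a single SNV, here is a minimal sketch of the core classification step. It assumes you have already extracted the spliced coding sequence of the affected transcript (from the .fasta and .gff), converted the variant position to a 0-based offset within that CDS, and expressed the alternate base on the coding strand. The names `CODON_TABLE` and `classify_snv` are hypothetical, and the `stop_lost` label is an extra category not in the table above. Strand handling, splicing, indels, and multiple transcripts per gene are exactly the parts that dedicated annotation tools take care of, so treat this as a sketch rather than a replacement for them.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): _AMINO[i]
    for i, codon in enumerate(product(_BASES, repeat=3))
}


def classify_snv(cds, offset, alt):
    """Classify a single-base substitution inside a coding sequence.

    cds    -- spliced coding sequence, 5'->3' on the coding strand (assumption)
    offset -- 0-based position of the variant within cds (assumption)
    alt    -- alternate base, already on the coding strand (assumption)
    """
    cds = cds.upper()
    codon_start = offset - (offset % 3)
    ref_codon = cds[codon_start:codon_start + 3]
    if len(ref_codon) < 3:
        raise ValueError("variant falls in an incomplete codon")

    # Substitute the alternate base at its position within the codon.
    alt_codon = list(ref_codon)
    alt_codon[offset % 3] = alt.upper()
    alt_codon = "".join(alt_codon)

    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]

    if ref_aa == alt_aa:
        return "silent"
    if alt_aa == "*":
        return "nonsense"
    if ref_aa == "*":
        return "stop_lost"  # extra category, not in the requested table
    return "missense"
```

For example, with the toy CDS `ATGGAATGGTAA` (Met-Glu-Trp-stop), `classify_snv("ATGGAATGGTAA", 3, "T")` turns `GAA` into `TAA` and is reported as nonsense.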

9 weeks ago

You can run tools like SnpEff or the Variant Effect Predictor.
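As a rough sketch of how the requested table could be produced from an annotated VCF: SnpEff writes its predictions into the `ANN` INFO field as comma-separated, pipe-delimited entries, where the second sub-field is the effect term (for example `missense_variant`, `stop_gained`, `synonymous_variant`) and the fourth is the gene name. The function name `summarize_ann` and the mapping to the three labels from the question are my own assumptions, and the reported position is the genomic POS from the VCF. The parser below is deliberately simplistic and ignores every other effect type.

```python
# Map Sequence Ontology terms in the ANN field to the labels used in the question.
SO_TO_LABEL = {
    "missense_variant": "missense",
    "stop_gained": "nonsense",
    "synonymous_variant": "silent",
}


def summarize_ann(vcf_lines):
    """Return (gene, position, type) rows from SnpEff-annotated VCF lines."""
    rows = []
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        pos, info = int(fields[1]), fields[7]

        # Pull the ANN=... entry out of the semicolon-separated INFO column.
        ann = next(
            (kv[len("ANN="):] for kv in info.split(";") if kv.startswith("ANN=")),
            None,
        )
        if ann is None:
            continue

        seen = set()  # avoid duplicate rows when several transcripts agree
        for entry in ann.split(","):
            parts = entry.split("|")
            if len(parts) < 4:
                continue
            gene = parts[3]
            # Multiple effects on one entry are joined with '&'.
            for term in parts[1].split("&"):
                label = SO_TO_LABEL.get(term)
                if label and (gene, label) not in seen:
                    seen.add((gene, label))
                    rows.append((gene, pos, label))
    return rows
```

Calling `summarize_ann(open("annotated.vcf"))` (the file name is a placeholder) and printing each row tab-separated would give output in the gene / position / type layout from the question.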

I trust you have already performed steps like base quality score recalibration and variant filtration to ensure that your calls are accurate. Variant calling from RNA-seq data is a tricky subject and considerably harder than from WGS data.
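As a very rough illustration of the variant filtration step mentioned above, the sketch below applies two hard filters to the INFO column of a VCF record: Fisher strand bias (`FS`) and quality by depth (`QD`), both of which are written by GATK-style callers. The function name `passes_hard_filters` is hypothetical, and the default thresholds are placeholders loosely modeled on hard filters often used for RNA-seq calls; they are assumptions that should be tuned to your data, not recommended values. Records that lack either tag are treated as passing here, which is also an assumption.

```python
def passes_hard_filters(vcf_line, max_fs=30.0, min_qd=2.0):
    """Return True if a VCF record passes simple FS/QD hard filters.

    max_fs and min_qd are placeholder thresholds (assumptions), not
    recommendations; missing tags are treated as passing.
    """
    info = vcf_line.rstrip("\n").split("\t")[7]
    # Build a dict from key=value INFO entries; flag-only entries are skipped.
    tags = dict(kv.split("=", 1) for kv in info.split(";") if "=" in kv)
    fs = float(tags.get("FS", 0.0))
    qd = float(tags.get("QD", "inf"))
    return fs <= max_fs and qd >= min_qd
```

In practice you would more likely let the variant caller's own filtration tool mark such records in the FILTER column and then keep only records flagged as PASS before running the effect annotation.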


That is exactly what I was looking for! Thank you!