Question: How to do NGS Analysis of a Particular Gene?
18 months ago, user5212 wrote:

I have reads (FASTQ data) of a particular gene from the human genome hg38. I also have the GenBank (GBK) file and the FASTA file of the gene of interest. I want to know which variants map to each exon of that gene, and the coverage of each exon.

For example, I want to be able to say: variant T > A occurs at hg38 reference position chr12:88813734 which is exon 1/14 of the gene, variant TGGGA > TA occurs at hg38 reference position chr12:88847259 which is exon 5/14 of the gene, and so on. Exon 1 has 10000X coverage, Exon 2 has 11500X coverage, and so on.

Is there a variant caller that does this? If not, what protocol would you use to do this kind of analysis?

I understand that not all reported variants will map to exons. Is there a variant caller that will tell me which intron (IVS) a variant maps to? For example, variant G > C occurs at hg38 reference position chr12:88846432, which is intron 7 of the gene (IVS-7).


If you are only interested in one specific gene there may be other techniques that would be more cost effective compared to NGS.

written 18 months ago by genomax
18 months ago, finswimmer (Germany) wrote:

Hello,

Variant calling is just one of several steps needed to get the output you want. At a minimum, you first need to map and align the reads from your FASTQ file to the reference genome (e.g. with bwa). Then you can call variants only in your region of interest (e.g. with GATK HaplotypeCaller or freebayes). To find out which exon or intron a variant falls in, you need an annotation step (e.g. with SnpSift/SnpEff). The variant caller also reports the read depth at each variant position. If you need the coverage for every region, such as each exon, you have to use a tool like bedtools.
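The steps above can be sketched as a list of shell commands, assembled here in Python without running anything. This is only a rough outline under assumptions: the file names, the region string, the `prefix`, the location of `snpEff.jar`, and the SnpEff database name `hg38` are placeholders, and freebayes is used as one of the two callers mentioned. The helper `build_pipeline` is hypothetical, not part of any of these tools.

```python
import shlex


def build_pipeline(ref, fastq, region, exons_bed, prefix="sample"):
    """Assemble shell commands for map -> call -> annotate -> coverage.

    Nothing is executed; the commands are only returned as strings.
    All paths and the region are caller-supplied placeholders.
    """
    q = shlex.quote
    bam = f"{prefix}.sorted.bam"
    vcf = f"{prefix}.vcf"
    ann_vcf = f"{prefix}.ann.vcf"
    cov = f"{prefix}.exon_cov.txt"
    return [
        # One-time steps: index the reference for bwa and for the caller.
        f"bwa index {q(ref)}",
        f"samtools faidx {q(ref)}",
        # Align the reads and sort the alignments by coordinate.
        f"bwa mem {q(ref)} {q(fastq)} | samtools sort -o {q(bam)} -",
        f"samtools index {q(bam)}",
        # Call variants only inside the region of interest.
        f"freebayes -f {q(ref)} -r {q(region)} {q(bam)} > {q(vcf)}",
        # Annotate the variants (database name 'hg38' is an assumption).
        f"java -jar snpEff.jar hg38 {q(vcf)} > {q(ann_vcf)}",
        # Mean read depth for each exon listed in the BED file.
        f"bedtools coverage -a {q(exons_bed)} -b {q(bam)} -mean > {q(cov)}",
    ]


if __name__ == "__main__":
    # Placeholder inputs; the region is illustrative, not the real gene bounds.
    for cmd in build_pipeline(
        "hg38.fa", "reads.fastq", "chr12:88800000-88900000", "exons.bed"
    ):
        print(cmd)
```

The exon BED file would have to be prepared separately, for example from the gene annotation, with one line per exon.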

As you can see, there is a lot of work to do. But don't hesitate to ask concrete questions along the way.
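To make the annotation and coverage steps a bit more concrete, here is a small plain-Python illustration of what "exon k/N", "intron k (IVS-k)" and per-exon mean depth mean. It is a toy sketch, not what SnpEff or bedtools do internally: the function names are made up, exons are assumed to be 1-based inclusive `(start, end)` pairs of a single transcript, and the coordinates in the usage example are invented.

```python
def label_position(pos, exons, strand="+"):
    """Label a 1-based position as 'exon k/N', 'intron k (IVS-k)' or 'outside gene'.

    exons: list of (start, end) tuples, 1-based inclusive, one transcript.
    Exons are numbered in transcript order, so the order is reversed
    for genes on the minus strand.
    """
    ordered = sorted(exons, reverse=(strand == "-"))
    n = len(ordered)
    for k, (start, end) in enumerate(ordered, start=1):
        if start <= pos <= end:
            return f"exon {k}/{n}"
    # Intron k lies between exon k and exon k+1 (transcript order).
    for k in range(1, n):
        left, right = sorted((ordered[k - 1], ordered[k]))  # genomic order
        if left[1] < pos < right[0]:
            return f"intron {k} (IVS-{k})"
    return "outside gene"


def mean_depth(depths, start, end):
    """Mean per-base depth over a 1-based inclusive interval.

    depths: dict mapping position -> read depth; missing positions count as 0.
    """
    length = end - start + 1
    return sum(depths.get(p, 0) for p in range(start, end + 1)) / length


if __name__ == "__main__":
    toy_exons = [(100, 200), (300, 400), (500, 600)]  # invented coordinates
    print(label_position(150, toy_exons))
    print(label_position(250, toy_exons))
    print(mean_depth({100: 10, 101: 20}, 100, 102))
```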

fin swimmer


Thanks for your help. As a brief follow-up question: I know my gene is located on chromosome 12. Instead of mapping my reads to the entire reference genome (hg38), can I simply map and align my reads to chromosome 12 of the hg38 reference genome?

written 18 months ago by user5212

This is possible, but I wouldn't recommend it. Depending on the method used for library preparation, you will always have some sequences that don't belong to your target region. It's better to map these reads to their real origin. If you don't provide the reference for that origin, the reads may be mapped to the wrong place.

You don't have to worry about the time needed for mapping against the whole genome. The mapping time depends mainly on the number of reads you have and their length, not on the reference.
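After mapping against the whole genome, one way to see how many reads really come from the target is to count alignments inside and outside the region. The sketch below uses plain tuples as toy input; in practice the `(chrom, pos)` pairs would come from the BAM file (for example via pysam, which is not used here). The function name and the coordinates are invented for illustration.

```python
def on_target_summary(alignments, chrom, start, end):
    """Count alignments inside and outside a target region.

    alignments: iterable of (chrom, pos) pairs, pos = 1-based leftmost position.
    The target region is chrom:start-end, 1-based inclusive.
    """
    on = off = 0
    for a_chrom, a_pos in alignments:
        if a_chrom == chrom and start <= a_pos <= end:
            on += 1
        else:
            off += 1
    total = on + off
    return {
        "on_target": on,
        "off_target": off,
        "fraction_on_target": on / total if total else 0.0,
    }


if __name__ == "__main__":
    # Toy alignments: two inside chr12:100-200, two elsewhere.
    toy = [("chr12", 150), ("chr12", 5000), ("chr3", 120), ("chr12", 180)]
    print(on_target_summary(toy, "chr12", 100, 200))
```

If only chromosome 12 were given as reference, a read like the toy `chr3` one would have no correct place to go and could end up misaligned inside the target.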

fin swimmer

written 18 months ago by finswimmer