Finding gene sequences from WGS data
3 months ago
analyst ▴ 70

I have performed variant calling and annotation on WGS data. Now I need to get the sequences of a few genes that contain variants.

How can I get the sequences of particular variant-containing genes?

gene WGS • 893 views

I know this isn't the exact answer to your question, but the most common workflow for seeing the impact of variants on genes is to run a variant effect predictor.

OP seems to have at least tried doing this (if it is the same data): building snpeff database for plant

Yes, I have done variant annotation.

It's rice data; I used the available rice database from snpEff.

My PI also wants to perform structural analysis, such as comparing the structure of the normal gene with that of the annotated gene containing the variant.

Therefore I need to extract the sequences of only 3 or 4 genes from our WGS data.

Your guidance is highly appreciated.

Thank you!

3 months ago

It's tricky to get full sequences out of a BAM, so your best bet is to make a fixed consensus sequence using your original reference FASTA and your VCF.
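To make the idea concrete, here is a minimal Python sketch of what "reference plus VCF gives consensus" means for one region. The function name `apply_variants`, the toy sequence, and the toy variant list are all made up for illustration; the variants are assumed to be already filtered to the region, non-overlapping, and given as (POS, REF, ALT) with a single ALT allele. For real data a dedicated tool such as `bcftools consensus` is probably the more robust choice; this only shows the logic.

```python
def apply_variants(ref_seq, variants, region_start=1):
    """Return ref_seq with VCF-style variants applied.

    ref_seq      -- reference sequence of the region (string)
    variants     -- iterable of (pos, ref, alt), 1-based reference positions,
                    assumed to lie inside the region and not to overlap
    region_start -- 1-based reference coordinate of ref_seq[0]
    """
    pieces = []
    cursor = 0  # 0-based index into ref_seq of the next unconsumed base
    for pos, ref, alt in sorted(variants):
        idx = pos - region_start
        if idx < cursor:
            raise ValueError(f"variant at {pos} overlaps a previous one or lies outside the region")
        if ref_seq[idx:idx + len(ref)].upper() != ref.upper():
            raise ValueError(f"REF allele does not match the reference at {pos}")
        pieces.append(ref_seq[cursor:idx])  # unchanged bases before the variant
        pieces.append(alt)                  # the alternate allele
        cursor = idx + len(ref)             # skip the replaced reference bases
    pieces.append(ref_seq[cursor:])         # unchanged tail
    return "".join(pieces)


# Toy example: a 12 bp "gene" whose first base is reference position 101
reference = "ACGTACGTACGT"
calls = [
    (103, "G", "T"),    # SNV
    (106, "CG", "C"),   # 1 bp deletion
    (110, "C", "CAA"),  # 2 bp insertion
]
consensus = apply_variants(reference, calls, region_start=101)
print(consensus)  # ACTTACTACAAGT
```

The REF check is there on purpose: if it fails, the VCF was most likely called against a different reference than the FASTA being edited.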

Thanks swbarnes2!

I need to extract the sequences of only 3 or 4 genes, not all genes.

It's probably simpler to just make the whole altered consensus, then pick out what you want, instead of only making the consensus for 4 regions. You can also then realign to that consensus and see if your genes of interest look good in IGV.
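One thing to watch when picking genes out of a whole altered consensus: indels shift coordinates, so the gene coordinates from the reference annotation no longer line up exactly. The sketch below shows one way to account for that. The helper names `ref_to_consensus` and `extract_gene`, the toy sequence, and the toy variants are hypothetical, and it assumes the gene boundaries do not fall inside an indel.

```python
def ref_to_consensus(pos, variants):
    """Map a 1-based reference position to its 1-based consensus position."""
    shift = 0
    for vpos, ref, alt in variants:
        # Only variants that end before `pos` move it left or right
        if vpos + len(ref) - 1 < pos:
            shift += len(alt) - len(ref)
    return pos + shift


def extract_gene(consensus, start, end, variants):
    """Slice a gene (1-based, inclusive reference coordinates) out of the consensus."""
    new_start = ref_to_consensus(start, variants)
    # Map the base after the gene, then step back one, so indels inside the gene count
    new_end = ref_to_consensus(end + 1, variants) - 1
    return consensus[new_start - 1:new_end]


# Toy data: 20 bp reference "TTTTACGTACGTACGTGGGG" with a gene at positions 5-16
variants = [
    (3, "T", "TCC"),   # 2 bp insertion upstream of the gene
    (10, "CG", "C"),   # 1 bp deletion inside the gene
]
consensus = "TTTCCTACGTACTACGTGGGG"  # the reference with both variants applied

gene = extract_gene(consensus, 5, 16, variants)
print(gene)  # ACGTACTACGT
```

For a gene annotated on the minus strand, the extracted sequence would still need to be reverse-complemented before comparing it with the normal gene.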
