I recently obtained genome sequencing data for a plant subspecies for which I have a well-annotated reference sequence. My objective is a comparative analysis of the promoter regions of a few genes. How can I get the promoter regions from the NGS genome sequencing reads? Can anyone suggest a correct method for that?
Thanks
Thanks, @Devon Ryan. I will look into it. Also, I worry whether I can retrieve the sequence rather than doing a direct analysis against the reference sequence.
Sorry, I can't parse your last sentence.
Sorry, what I mean is that in my case I need to retrieve the ~5 kb region upstream of that gene (because I need that sequence for a subsequent MSA with sequences from other subspecies that I have). I am new to variant calling in promoter regions. My impression is that with that method I can identify the variation but cannot fetch the sequence. @Devon Ryan
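To make "the ~5 kb upstream region" concrete, here is a minimal Python sketch of how the window could be located from the gene coordinates in the annotation. It assumes 1-based inclusive coordinates as in GFF/GTF and that the chromosome sequence is already loaded as a string; the function names `upstream_region` and `reverse_complement` are hypothetical helpers, not part of any library. On the minus strand, "upstream" lies after the gene end, and the result is reverse-complemented so that it reads in the gene's orientation.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A/C/G/T/N, either case)."""
    table = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(table)[::-1]


def upstream_region(chrom_seq, gene_start, gene_end, strand, length=5000):
    """Extract up to `length` bases upstream of a gene.

    gene_start and gene_end are assumed to be 1-based and inclusive (GFF style).
    The window is truncated at the chromosome ends instead of raising an error.
    """
    if strand == "+":
        # Upstream is to the left of the gene start.
        hi = gene_start - 1
        lo = max(0, hi - length)
        return chrom_seq[lo:hi]
    if strand == "-":
        # Upstream is to the right of the gene end; flip to gene orientation.
        lo = gene_end
        hi = min(len(chrom_seq), lo + length)
        return reverse_complement(chrom_seq[lo:hi])
    raise ValueError("strand must be '+' or '-'")


if __name__ == "__main__":
    toy_chrom = "ACGTACGGTTCA"  # toy sequence, not real data
    print(upstream_region(toy_chrom, 5, 7, "+", length=3))
    print(upstream_region(toy_chrom, 5, 7, "-", length=3))
```

For a real genome the chromosome would be read from the reference FASTA, and the gene coordinates and strand taken from the annotation file.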
Ah, right, yes, calling the variants won't change the sequence. GATK has a tool that will convert a reference fasta file according to variants in a VCF file though.
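The GATK tool in question is, as far as I know, FastaAlternateReferenceMaker; `bcftools consensus` does a similar job. To illustrate the idea only, here is a minimal Python sketch that substitutes called variants into a reference sequence. It is not how either tool is implemented. It assumes VCF-style variants given as `(pos, ref, alt)` tuples with 1-based positions, a single ALT allele each, and no overlapping records; `apply_variants` is a hypothetical name.

```python
def apply_variants(ref_seq, variants):
    """Return ref_seq with each (pos, ref, alt) variant substituted in.

    pos is 1-based, as in VCF. Variants must not overlap. Handles SNVs and
    simple insertions/deletions written in VCF style (e.g. AC -> A).
    """
    pieces = []
    cursor = 0  # 0-based index of the first reference base not yet copied
    for pos, ref, alt in sorted(variants):
        start = pos - 1
        if start < cursor:
            raise ValueError(f"variant at {pos} overlaps a previous variant")
        if ref_seq[start:start + len(ref)].upper() != ref.upper():
            raise ValueError(f"REF allele mismatch at position {pos}")
        pieces.append(ref_seq[cursor:start])  # unchanged stretch before the variant
        pieces.append(alt)                    # alternate allele replaces REF
        cursor = start + len(ref)
    pieces.append(ref_seq[cursor:])           # remainder after the last variant
    return "".join(pieces)


if __name__ == "__main__":
    # Toy data: one SNV, one deletion, one insertion.
    toy_ref = "ACGTACGT"
    toy_variants = [(2, "C", "T"), (5, "AC", "A"), (8, "T", "TGG")]
    print(apply_variants(toy_ref, toy_variants))
```

Note that insertions and deletions shift coordinates in the resulting sequence, so a window defined on the reference (such as a promoter region) no longer sits at the same positions afterwards. One way around that is to cut the region out of the reference first and apply only the variants that fall inside it, with their positions offset accordingly.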
Thanks a lot, @Devon Ryan. I will look into it.