Question: Blat Input From Vcf File
6.3 years ago by
GPR310
Mexico
GPR310 wrote:

Hello, I want to run a BLAT alignment on the SNPs I got in my analysis. My VarScan output is in VCF format. Any hints on how to convert a VCF file into something BLAT accepts as input? Thanks, G.

vcf blat • 2.4k views
ADD COMMENTlink modified 6.3 years ago by Liping30 • written 6.3 years ago by GPR310

You might want to consider explaining why you need to align variant calls to a reference genome. Are you trying to visualize the location of your variants? If so, you might consider using something like IGV and loading your aligned reads and vcf.

ADD REPLYlink written 6.3 years ago by Matt Shirley8.9k

I want to identify alignments to paralogous genes and eliminate these, to filter out potential false positive calls in my analysis.

ADD REPLYlink written 6.3 years ago by GPR310
6.3 years ago by
gs10
gs10 wrote:

I don't know BLAT; what should its input look like? I'm just reading/converting the 1000 genomes/.../omni VCF files.

testing how to post here

--------edit-------------

OK, http://en.wikipedia.org/wiki/BLAT_(bioinformatics)

------------------edit------------------

downloaded blat ...

blat - Standalone BLAT v. 33x5 fast sequence search command line tool
usage:
   blat database query [-ooc=11.ooc] output.psl
where:
   database and query are each either a .fa, .nib or .2bit file,
   or a list of these files, one file name per line.

So, convert the .vcf to FASTA? That's what I'm currently doing (but using only the SNP positions, to reduce the size).

But why align the FASTA with BLAT afterwards? The variants should already be aligned, since the positions in the VCF all refer to the same reference...

... unless you want to merge two VCFs with different positions?

ADD COMMENTlink modified 6.3 years ago • written 6.3 years ago by gs10
6.3 years ago by
Liping30
United States
Liping30 wrote:

One possible solution to this is:

Extract the flanking sequence of each SNV in your VCF file with BEDTools, then run BLAT on those flanking sequences. BED coordinates are 0-based and half-open, so the start below is POS-51 (clamped at 0 for variants near the start of a chromosome), which gives a 101 bp window centered on each SNV.

  1. grep -v '^#' your.vcf | awk 'BEGIN{OFS="\t"}{s=$2-51; if(s<0)s=0; print $1,s,$2+50,$1 ":" $2}' > try.bed
  2. bedtools getfasta -fi ucsc.hg19.fasta -bed try.bed -fo VCF_SNVs.fasta -name
  3. blat ucsc.hg19.fasta VCF_SNVs.fasta output.psl
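
Once output.psl exists, a possible next step toward the paralog filter is to look for flanking sequences that hit the genome more than once. The following is only a sketch under some assumptions, not a tested pipeline: the function names multi_hit_queries and drop_flagged_variants are made up for illustration, the 0.9 cutoff (matching bases divided by query size) is an arbitrary starting point, and query names are assumed to be the CHROM:POS labels written into try.bed. Some bedtools versions append "::chr:start-end" to names given with -name, so that suffix is stripped if present.

```shell
# Print query names (CHROM:POS) that have more than one BLAT hit whose
# matching bases cover at least min_frac of the query.
# PSL columns used: 1 = matches, 10 = qName, 11 = qSize.
multi_hit_queries() {
    psl="$1"
    min_frac="${2:-0.9}"    # assumed cutoff, tune for your data
    awk -v f="$min_frac" '
        # PSL header lines do not start with a number, so this skips them
        $1 ~ /^[0-9]+$/ && NF >= 21 {
            q = $10
            sub(/::.*/, "", q)          # drop a "::chr:start-end" suffix if present
            if ($1 / $11 >= f) hits[q]++
        }
        END {
            for (q in hits) if (hits[q] > 1) print q
        }' "$psl" | sort
}

# Print the VCF without the records whose CHROM:POS appears in the
# flagged list (one name per line). Header lines are kept.
drop_flagged_variants() {
    flagged="$1"
    vcf="$2"
    awk -F'\t' '
        FILENAME == ARGV[1] { bad[$1] = 1; next }
        /^#/ { print; next }
        { key = $1 ":" $2; if (!(key in bad)) print }
    ' "$flagged" "$vcf"
}

# Example usage (file names as in the steps above):
#   multi_hit_queries output.psl 0.9 > multi_hit.txt
#   drop_flagged_variants multi_hit.txt your.vcf > filtered.vcf
```

Every flanking sequence should hit its own locus once, so only a second strong hit is treated as suspicious here. Whether a second hit really means a paralog is a judgment call, so it is worth inspecting multi_hit.txt before dropping anything.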
ADD COMMENTlink modified 6.3 years ago • written 6.3 years ago by Liping30

Will try this. Many thanks. G.

ADD REPLYlink written 6.3 years ago by GPR310
6.3 years ago by
GPR310
Mexico
GPR310 wrote:

I want to identify alignments to paralogous genes and eliminate these, to filter out potential false positive calls in my analysis. Thanks for your help. G.

ADD COMMENTlink written 6.3 years ago by GPR310