Manhattan plot p-value
2.8 years ago

Hello, I'm Shamsur. I have a SNP file for a goat breed. The file contains the columns CHROM, POS, ID, REF, ALT, QUAL, FILTER and INFO (see link 1; I took a screenshot from my terminal because the file was easiest to open there). I want to make a Manhattan plot from this SNP file to visualize significant SNPs for a GWAS, but a Manhattan plot requires p-values.

How am I supposed to obtain these p-values? Thanks in advance.

2.8 years ago

Hi,

Based on the field names, your file fits the VCF (Variant Call Format) specification. Can you explain how this file was produced?
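For readers unfamiliar with the format, here is a minimal Python sketch of how one VCF data line splits into the eight fixed columns named in the question. The function name `parse_vcf_line` and the example line are hypothetical (the values are made up, not taken from the poster's file), and the sketch ignores header lines and any per-sample columns.

```python
# Sketch only: split one tab-separated VCF data line into its eight fixed columns.
VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]

def parse_vcf_line(line):
    """Return a dict of the eight fixed VCF columns; INFO becomes a dict."""
    fields = line.rstrip("\n").split("\t")
    record = dict(zip(VCF_COLUMNS, fields[:8]))
    record["POS"] = int(record["POS"])
    info = {}
    for item in record["INFO"].split(";"):
        if item in ("", "."):
            continue  # empty or missing INFO entry
        key, sep, value = item.partition("=")
        info[key] = value if sep else True  # flag entries carry no value
    record["INFO"] = info
    return record

# Hypothetical example line with made-up values.
example = "chr1\t12345\t.\tA\tG\t50.0\t.\tAC=1;AF=0.5;DP=20"
rec = parse_vcf_line(example)
print(rec["CHROM"], rec["POS"], sorted(rec["INFO"]))
```

Listing the INFO keys this way is a quick check of whether any p-value-like annotation is present in the file at all.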

Yes, you need p-values in order to generate a Manhattan plot, and these p-values typically represent loci scattered across the genome. Typically, we would generate such a plot using p-values from a comparison between groups, e.g., Asthmatic Adults versus Healthy Control Adults. However, as you have not explained anything about your experiment, we have no way of knowing where to look for these p-values. They may be encoded in the VCF. Please paste some lines from the VCF.
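To illustrate what "p-values from a comparison between groups" can look like, here is a minimal Python sketch. It assumes you already have reference and alternate allele counts per SNP in two groups (cases and controls) and applies a plain 2x2 allelic chi-square test with 1 degree of freedom. The function name and the counts are hypothetical, and dedicated GWAS software would normally be used instead; this is only meant to show where the numbers on the y-axis of a Manhattan plot come from.

```python
import math

def allelic_chi2_pvalue(case_ref, case_alt, ctrl_ref, ctrl_alt):
    """P-value of a 2x2 allelic chi-square test (1 df, no continuity correction)."""
    n = case_ref + case_alt + ctrl_ref + ctrl_alt
    row_case = case_ref + case_alt
    row_ctrl = ctrl_ref + ctrl_alt
    col_ref = case_ref + ctrl_ref
    col_alt = case_alt + ctrl_alt
    denom = row_case * row_ctrl * col_ref * col_alt
    if denom == 0:
        return 1.0  # monomorphic site or empty group: no evidence either way
    chi2 = n * (case_ref * ctrl_alt - case_alt * ctrl_ref) ** 2 / denom
    # Upper tail of a chi-square distribution with 1 df.
    return math.erfc(math.sqrt(chi2 / 2.0))

# Hypothetical allele counts: (chrom, pos, case_ref, case_alt, ctrl_ref, ctrl_alt).
snps = [
    ("chr1", 1000, 60, 40, 60, 40),  # identical frequencies in both groups
    ("chr1", 2000, 80, 20, 50, 50),  # clearly different frequencies
]

# A Manhattan plot shows genomic position on x and -log10(p) on y.
manhattan_points = []
for chrom, pos, a, b, c, d in snps:
    p = allelic_chi2_pvalue(a, b, c, d)
    manhattan_points.append((chrom, pos, -math.log10(max(p, 1e-300))))

for point in manhattan_points:
    print(point)
```

The key point is that the test needs two groups to compare; a VCF of called variants alone does not define them.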

Kevin


Hello Kevin Blighe,

Some very common tools were used to produce this VCF file: bwa, samtools, gatk and vcftools. The ARS1 goat assembly was used as the reference genome. The commands are below; please ignore the prefixes in the commands.

bwa index ref.fa
bwa aln ref.fa read1.fq > aln1.sai
bwa aln ref.fa read2.fq > aln2.sai
bwa sampe ref.fa aln1.sai aln2.sai read1.fq read2.fq > aln.sam

samtools view -bS -o aln.raw.bam aln.sam
samtools sort aln.raw.bam aln.sort

java -jar MarkDuplicates.jar \
    ASSUME_SORTED=TRUE \
    REMOVE_DUPLICATES=TRUE \
    VALIDATION_STRINGENCY=LENIENT \
    INPUT=aln.sort.bam \
    OUTPUT=aln.bam \
    METRICS_FILE=aln.dupli

java -jar AddOrReplaceReadGroups.jar \
    INPUT=aln.bam \
    OUTPUT=aln.rg.bam \
    SORT_ORDER=coordinate \
    CREATE_INDEX=true \
    RGID=Rice01 \
    RGLB=Rice3k \
    RGPL=Illumina \
    RGPU=ATGGGC \
    RGSM=Rice VALIDATION_STRINGENCY=SILENT

java -Xmx1g -jar GenomeAnalysisTK.jar \
    -T HaplotypeCaller -R $genome -I $BAM \
    -o $prefix.gatk.raw.vcf \
    -nct $cpu \
    --genotyping_mode DISCOVERY \
    -stand_call_conf 30 \
    -stand_emit_conf 10

java -Xmx1g -jar GenomeAnalysisTK.jar \
    -T SelectVariants \
    -R $genome \
    -V $prefix.gatk.raw.vcf \
    -selectType SNP \
    -o $prefix.gatk.snp.raw.vcf

I searched for how to get p-values for SNPs and came across the "vcftools --hardy" command, which seems to produce p-values from a VCF file. Are you familiar with the "vcftools --hardy" command?

Thank you,
Shamsur
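As context for the "vcftools --hardy" question: that option reports per-site Hardy-Weinberg equilibrium (HWE) statistics, so its p-values describe how far genotype counts deviate from HWE expectations, not how strongly a SNP is associated with a trait. The Python sketch below shows the idea with a simple chi-square approximation on hypothetical genotype counts. It is an illustration only and is not guaranteed to reproduce the values vcftools writes to its output; the function name and counts are made up.

```python
import math

def hwe_chi2_pvalue(n_hom_ref, n_het, n_hom_alt):
    """Chi-square (1 df) p-value for deviation from Hardy-Weinberg proportions."""
    n = n_hom_ref + n_het + n_hom_alt
    if n == 0:
        return 1.0  # no genotypes observed
    p = (2 * n_hom_ref + n_het) / (2 * n)  # reference allele frequency
    q = 1 - p
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    observed = [n_hom_ref, n_het, n_hom_alt]
    if min(expected) == 0:
        return 1.0  # monomorphic site: the test is not informative
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return math.erfc(math.sqrt(chi2 / 2.0))

# Hypothetical genotype counts across 100 animals.
print(hwe_chi2_pvalue(25, 50, 25))  # counts match HWE expectations exactly
print(hwe_chi2_pvalue(50, 0, 50))   # no heterozygotes at all: strong deviation
```

Note that the counts are taken across many individuals, so a test of this kind needs genotypes from a population of samples rather than from a single animal.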
