Question: Base calling for PacBio data
14 months ago by
kamel30
kamel30 wrote:

I looked at some links about running GATK on PacBio data, but I cannot work out which values should be used for -variant_index_parameter, -stand_emit_conf and -stand_call_conf during base calling with HaplotypeCaller.

I have the same question about the thresholds for VariantFiltration of SNPs and indels:

-filterExpression 'QD || FS || MQ || HaplotypeScore || MappingQualityRankSum || ReadPosRankSum'

Could you give me an example of base calling on PacBio data, with values for these parameters?

gatk snp pacbio indel genome
modified 14 months ago by lakhujanivijay • written 14 months ago by kamel

I'm not sure exactly what you're asking, but I'm happy to try to help. Are you looking for recommended GATK HaplotypeCaller and VariantFiltration parameters for PacBio data? What type of PacBio data are you using, and how was it generated?

written 14 months ago by wrowell

I want to know what values should be used for these parameters on PacBio Sequel data (whole-genome data from a haploid fungus). For example, on Illumina data I use the values below:

For Base calling

$ HaplotypeCaller -R Reference.fa -I sorted_marked.bam -ploidy 1 -ERC GVCF --variant_index_type LINEAR --variant_index_parameter 128000 -stand_emit_conf 10 -stand_call_conf 30 -o raw_gVCF.vcf

For SNP filtration

$ VariantFiltration -R reference.fa -V raw-snps.vcf --filterExpression 'QD < 2.0 || FS > 60.0 || MQ < 40.0 || HaplotypeScore > 13.0 || MappingQualityRankSum < -12.5 || ReadPosRankSum < -8.0' --filterName "my_snp_filter" -o filtered_snps.vcf

For Indel Filtration

$ VariantFiltration -R reference.fa -V raw-indel.vcf --filterExpression 'QD < 2.0 || FS > 200.0 || ReadPosRankSum < -20.0' --filterName "my_indel_filter" -o filtered_indels.vcf

I have no idea which values should be used for these parameters on PacBio Sequel data. Should I use the same values?
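For readers unsure what the two `||` expressions above actually test, here is a minimal Python sketch of the logic. It does not call GATK, and the helper names (`parse_info`, `failed_clauses`, `is_filtered`) are made up for illustration. The thresholds are the Illumina values from the commands above, not recommendations for PacBio data. One assumption to note: a missing annotation is treated here as simply not triggering its clause, which may differ from how GATK itself handles missing values.

```python
# Illustrative sketch only: re-implements the hard-filter logic from the
# commands above in plain Python. It does not call GATK.

# Thresholds copied from the Illumina commands in the post.
SNP_RULES = [
    ("QD", "<", 2.0),
    ("FS", ">", 60.0),
    ("MQ", "<", 40.0),
    ("HaplotypeScore", ">", 13.0),
    ("MappingQualityRankSum", "<", -12.5),
    ("ReadPosRankSum", "<", -8.0),
]
INDEL_RULES = [
    ("QD", "<", 2.0),
    ("FS", ">", 200.0),
    ("ReadPosRankSum", "<", -20.0),
]


def parse_info(info_field):
    """Turn a VCF INFO string such as 'QD=1.5;FS=3.2' into a dict of floats."""
    values = {}
    for item in info_field.split(";"):
        if "=" not in item:
            continue  # flag-style entries carry no value
        key, raw = item.split("=", 1)
        try:
            values[key] = float(raw)
        except ValueError:
            pass  # skip non-numeric or multi-valued annotations
    return values


def failed_clauses(info, rules):
    """Return the annotation names whose clause is true for this record.

    Assumption: a missing annotation does not trigger its clause.
    """
    failed = []
    for name, op, threshold in rules:
        if name not in info:
            continue
        value = info[name]
        if (op == "<" and value < threshold) or (op == ">" and value > threshold):
            failed.append(name)
    return failed


def is_filtered(info_field, rules):
    """A record is flagged when any clause of the '||' expression is true."""
    return bool(failed_clauses(parse_info(info_field), rules))
```

Because the clauses are joined with `||`, one failing annotation is enough to flag a record. The same INFO string can also pass one rule set and fail the other, for example an FS of 100.0 exceeds the SNP threshold (60.0) but not the indel threshold (200.0).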

modified 14 months ago • written 14 months ago by kamel

Hello kamel,

Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time.

Thank you!

modified 14 months ago • written 14 months ago by lakhujanivijay

Are these subreads, or CCS?

written 14 months ago by wrowell

These are subreads generated on a Sequel.

written 14 months ago by kamel

I have parameters for calling and filtering short variants with GATK HC on Q20+ CCS reads, but I haven't had much luck with raw subreads.

written 14 months ago by wrowell