Question: hg19 or hg38 for variant calling
7 months ago, jsneaththompson wrote:

I've recently been troubleshooting an error in part of my variant calling pipeline. I traced it back to using BAM files aligned to hg38 as input for an Agilent deduplication tool that has yet to migrate from hg19 to hg38. My current workaround is to align to hg19, deduplicate, split the resulting SAM back into FASTQs, and re-align to hg38, which seems convoluted.

Should I continue working with hg38 once I'm past this step, or should I stick with hg19 all the way? How do other people balance pipelines when some tools/datasets are in hg38 and others have yet to switch over from hg19? Any advice on this whole hg19 v. hg38 issue would be appreciated.

Edit: The tool is LocatIt, which is used for deduplication of reads by the molecular barcodes used in the HaloPlex HS Target Enrichment System. https://www.agilent.com/cs/library/software/Public/AGeNT%20ReadMe.pdf

modified 7 months ago by h.mon • written 7 months ago by jsneaththompson

And you have to use this Agilent deduplication tool? There are alternatives, unless it's something specific you need.

written 7 months ago by WouterDeCoster

Second that. Look at Clumpify (see the post "Introducing Clumpify: Create 30% Smaller, Faster Gzipped Fastq Files"), which can also remove duplicates.

written 7 months ago by genomax

The tool is LocatIt, which is used for deduplication of reads by the molecular barcodes used in the HaloPlex HS Target Enrichment System. https://www.agilent.com/cs/library/software/Public/AGeNT%20ReadMe.pdf

written 7 months ago by jsneaththompson

What is the tool? What kind of data? Is it the AgilentMBCDedup tool, used to process the Molecular Barcodes (MBC) of HaloPlex runs?

written 7 months ago by h.mon

7 months ago, h.mon (Brazil) wrote:

From the documentation you linked, LocatIt does not necessarily expect or use hg19; it just expects the chromosome names to follow its conventional naming scheme. Maybe you have random / unplaced / alt chromosomes? Anyway, did you try the -H parameter?

-H SAM header file: By default, LocatIt expects hg19 names, chr1-chrM. If the contig names are different (for example, GRCh37 names or nonhuman), one can use this option and provide a SAM header file containing a dictionary of the contigs used by the data files, SAM/BAM and, optionally, the bed file.
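If it helps, here is a rough Python sketch of the two checks implied above: listing contigs in a SAM header that fall outside the expected names, and building a minimal SAM header from a FASTA index (.fai) that could be tried with -H. The function names are made up for illustration, and I am assuming "chr1-chrM" means chr1 to chr22 plus chrX, chrY and chrM. Whether LocatIt accepts a header built this way is untested. The header text can come from `samtools view -H`, and the .fai from `samtools faidx`.

```python
# Assumption: "chr1-chrM" in the quoted documentation means
# chr1..chr22, chrX, chrY and chrM. Adjust if the tool expects otherwise.
EXPECTED = {f"chr{i}" for i in range(1, 23)} | {"chrX", "chrY", "chrM"}


def contig_names(sam_header_text):
    """Return contig names (SN tags) from the @SQ lines of a SAM header."""
    names = []
    for line in sam_header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Each field after the record type looks like TAG:VALUE.
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        if "SN" in fields:
            names.append(fields["SN"])
    return names


def unexpected_contigs(sam_header_text):
    """Return contigs not in EXPECTED (e.g. random / unplaced / alt)."""
    return [n for n in contig_names(sam_header_text) if n not in EXPECTED]


def fai_to_sam_header(fai_text):
    """Build a minimal SAM header from the text of a FASTA index (.fai).

    A .fai row is tab-separated; the first two columns are the contig
    name and its length, which is all an @SQ line needs.
    """
    lines = ["@HD\tVN:1.6"]
    for row in fai_text.splitlines():
        if not row.strip():
            continue
        name, length = row.split("\t")[:2]
        lines.append(f"@SQ\tSN:{name}\tLN:{int(length)}")
    return "\n".join(lines) + "\n"
```

Running `unexpected_contigs` on the header of the hg38 BAM shows quickly whether alt or unplaced contigs are present. Writing the output of `fai_to_sam_header` to a file gives something to pass to -H.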

written 7 months ago by h.mon