Question: Undo alignment using bamtofastq, after dedup
4.1 years ago by
Stanford University
dmyersturnbull70 wrote:

I'm a newcomer to the sequencing world. From another lab, I have a large (full human genome) mate-pair BAM file produced from the following steps:

1. Trimming of reads to 30bp for PHRED scores under 20 (software unknown).

2. Alignment against GRCh37 with BWA 0.5.8a.

3. De-duplication with GATK 1.0.4.

4. Local realignment around known indels and base score recalibration with GATK 1.0.4.

5. Picard's FixMateInformation (version unknown).

I want to realign the reads against GRCh38 using newer software; in other words, I want to undo steps 1–5, or at least 2–5.

Will SamTools bamtofastq handle this correctly? My specific worry is that de-duplication using an alignment against GRCh37 (step 3) may have permanently changed the BAM by removing reads that might be aligned differently against GRCh38. However, the only command from GATK I could find for de-duplication is MarkDuplicates, which flags duplicates rather than deleting them, so I will assume this is what was used. Are there any other steps that would be an issue, and is bamtofastq the right way to do this?

I understand these steps algorithmically but don't know how the data in BAM format is actually altered.

Thanks!

next-gen genome • 1.4k views

Did they remove duplicates or just mark them?

written 4.1 years ago by Devon Ryan85k

Thanks! I updated my post.

written 4.1 years ago by dmyersturnbull70

4.1 years ago by
Devon Ryan85k
Freiburg, Germany
Devon Ryan85k wrote:

I assume you mean samtools bam2fq. It will write out marked duplicates; it only ignores supplementary and secondary alignments.
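
To make the flag logic concrete, here is a minimal Python sketch. It assumes the default exclusion mask is 0x900 (secondary 0x100 plus supplementary 0x800), which is what current samtools documents as the -F default for its fastq command; check the manual for the version you have installed. The function name is_exported is hypothetical and only for illustration.

```python
# SAM FLAG bits, as defined in the SAM specification.
SECONDARY = 0x100
DUPLICATE = 0x400
SUPPLEMENTARY = 0x800

# Assumed default exclusion mask: secondary + supplementary (0x900).
DEFAULT_EXCLUDE = SECONDARY | SUPPLEMENTARY


def is_exported(flag: int, exclude_mask: int = DEFAULT_EXCLUDE) -> bool:
    """Return True if a record with this FLAG passes the exclusion mask."""
    return (flag & exclude_mask) == 0


if __name__ == "__main__":
    # 99 = paired, proper pair, mate reverse, first in pair.
    for flag in (99, 99 | DUPLICATE, 99 | SECONDARY, 99 | SUPPLEMENTARY):
        print(hex(flag), is_exported(flag))
```

The duplicate bit (0x400) is not part of the mask, so reads that were only marked as duplicates still end up in the FASTQ output.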

4.1 years ago by
Stanford University
dmyersturnbull70 wrote:

Thanks—that answers my question.

For anyone else interested in this question, the Broad Institute has a guide:

http://gatkforums.broadinstitute.org/discussion/2908/howto-revert-a-bam-file-to-fastq-format
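
As a rough illustration of what reverting a single record involves (my own sketch, not taken from the guide): reverse-strand reads are stored reverse-complemented in the BAM, and recalibration may have kept the pre-recalibration qualities in an OQ tag. The Python below assumes a tab-separated SAM text line, assumes OQ is present only if the recalibration step was run with original qualities retained, and uses a hypothetical function name. It does not add /1 or /2 mate suffixes, and it cannot undo trimming (step 1) or recover reads that were actually removed.

```python
# Translation table for complementing bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

REVERSE = 0x10  # SAM FLAG bit: SEQ is stored reverse-complemented


def sam_line_to_fastq(line: str) -> str:
    """Convert one SAM text record to a FASTQ record in original read orientation."""
    fields = line.rstrip("\n").split("\t")
    name, flag = fields[0], int(fields[1])
    seq, qual = fields[9], fields[10]

    # Prefer the original (pre-recalibration) qualities if an OQ tag exists.
    for tag in fields[11:]:
        if tag.startswith("OQ:Z:"):
            qual = tag[len("OQ:Z:"):]
            break

    # Undo the strand flip applied at alignment time.
    if flag & REVERSE:
        seq = seq.translate(_COMPLEMENT)[::-1]
        qual = qual[::-1]

    return f"@{name}\n{seq}\n+\n{qual}\n"


if __name__ == "__main__":
    record = "r1\t16\t1\t100\t60\t4M\t*\t0\t0\tAACG\tIIII\tOQ:Z:ABCD"
    print(sam_line_to_fastq(record), end="")
```

In practice a dedicated tool should do this on the whole file; the sketch is only meant to show which fields of the record are touched.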
