Question: How to align given specific region in genome with manifest file in bwa?
1
gravatar for bayramliaziz
11 months ago by
bayramliaziz0 wrote:

Usually when I use alignment with bwa, I follow this script:

bwa aln reference.fa sample.fastq.gz > sample.sai

But this time I have manifest.txt and I want to target specific regions with this and I want to speed the alignment process with specific output. If It is possible in BWA, I would like to get your answers. Any help is much appreciated.

Thanks in advance.

manifest bwa alignment genome • 649 views
ADD COMMENTlink modified 11 months ago • written 11 months ago by bayramliaziz0
2

I don't think you can limit alignments to certain regions from genome using an intervals file with bwa. You will need to filter the alignments afterwards if you are only interested in specific regions.

You could create a reduced representation genome by extracting the areas you are interested in and creating a new index. This always carries the risk that reads may be forcibly aligned to regions where they did not come from. So the approach of post-filtering alignment will always be better.

Note: If you are doing this for an amplicon workflow then you can go with creating a reduced representation reference with areas of your interest.

ADD REPLYlink modified 11 months ago • written 11 months ago by genomax78k

If I can not do this with bwa then which tool can I use to do the alignments for certain region?

ADD REPLYlink written 11 months ago by bayramliaziz0

MiSeq reporter manual says this for custom amplicon workflow:

The Targeted RNA workflow utilizes the banded Smith-Waterman aligner. Improvements have been added to allow alignments across very small amplicon targets (often less than 10 bp).

PCR amplicon protocol uses bwa.

If you are looking to reproduce either workflows then you will need to take this into consideration. How this is exactly implemented in Illumina software is not stated in the manual.

If you are doing PCR amplicon, use the strategy noted by @Pierre and then use bwa.

ADD REPLYlink modified 11 months ago • written 11 months ago by genomax78k
1

what is manifest.txt ?

https://commons.wikimedia.org/wiki/Category:Crystal_balls#/media/File:Paulinefrederick_5_retouched.jpg

ADD REPLYlink modified 11 months ago • written 11 months ago by Pierre Lindenbaum126k

Manifest.txt is the file that specifies a reference genome and targeted reference regions to be used in the alignment step. This definition is from MiSeq System Guide. Manifest.txt is given to MiSeq, but I want this process done on computer. Example for manifest file and I would like to know If I can do same procedure with BWA.

[Header]                        
XT Manifest Version 1                   
ReferenceGenome Homo_sapiens\UCSC\hg19\Sequence\WholeGenomeFasta                    

[Regions]                       
Name    Chromosome  Amplicon Start  Amplicon End    Upstream Probe Length   Downstream Probe Length 
FMF         chr16           3293140                 3306587          50                                 50  
BRCA1   chr17           41194312           41279500          50                                 50
ADD REPLYlink modified 11 months ago by Pierre Lindenbaum126k • written 11 months ago by bayramliaziz0
3
gravatar for Pierre Lindenbaum
11 months ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum126k wrote:

you need to map everything first, convert this "manifest.txt" file as a bed file and extract the regions. Samtools And Region List

ADD COMMENTlink modified 11 months ago • written 11 months ago by Pierre Lindenbaum126k

Thanks you helped me a lot. The following script did the job: samtools view -b -L manifest.bed input.bam > manifest_output.bam

I have one more question. Script above outputs alignments overlapping with the manifest.bed file. Is it possible to get output with non-overlapping alignments?

ADD REPLYlink written 11 months ago by bayramliaziz0

Is it possible to get output with non-overlapping alignments?

  -U FILE  output reads not selected by filters to FILE [null]
ADD REPLYlink modified 11 months ago • written 11 months ago by Pierre Lindenbaum126k

I tried that but I couldn't get it to work. It just gives me the same bam file. I changed -L with -U.

ADD REPLYlink written 11 months ago by bayramliaziz0

you need to use both options

ADD REPLYlink written 11 months ago by Pierre Lindenbaum126k

samtools view -b -L manifest.bed -U non-overlapping.bam input.bam > overlapping.bam

I used both options this way. But this time my overlapping.bam changed and It is not the same output with only -L. I am newbie so, I am sorry If I can not see the obvious.

ADD REPLYlink written 11 months ago by bayramliaziz0

ok I think I'm wrong with '-U' that do not work in conjonction of '-L'

another idea is to use 'bedtools' complement https://bedtools.readthedocs.io/en/latest/content/tools/complement.html to get the complement of the bed and run samtools view -L complement.bed

ADD REPLYlink written 11 months ago by Pierre Lindenbaum126k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1022 users visited in the last hour