Question: align spike in database
Björn wrote:

Hi, I have FASTA sequences from a sample that includes spike-in controls. How can I align the spike-ins against a database using bowtie2 and filter them to create FASTQ files of the mapped reads? Any references or scripts for this task would be appreciated. Thanks.

fastqc rna-seq spike-in bowtie2
modified 2.4 years ago by swbarnes2 • written 2.4 years ago by Björn

Brian Bushnell wrote:

Depending on how long and specific your spike-ins are, I recommend using BBMap's Seal, which can both remove and quantify them using kmer-matching. For example:

seal.sh in=reads.fq ref=spikeins.fa pattern=spikein_%.fq outu=clean.fq stats=stats.txt k=31
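
To illustrate the general idea behind kmer-matching, here is a minimal Python sketch. It is not Seal's implementation: the function names (`kmers`, `build_index`, `assign_read`) and the toy sequences are hypothetical, and it ignores reverse complements and reads that match more than one reference, which a real tool would typically handle.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(refs, k):
    """Map each reference k-mer to the name of a reference containing it."""
    index = {}
    for name, seq in refs.items():
        for kmer in kmers(seq.upper(), k):
            index.setdefault(kmer, name)  # first reference wins on a shared k-mer
    return index


def assign_read(read, index, k):
    """Return the first reference sharing a k-mer with the read, else None."""
    read = read.upper()
    for i in range(len(read) - k + 1):
        name = index.get(read[i:i + k])
        if name is not None:
            return name
    return None


# Toy data (hypothetical): one 12-nt reference and two reads.
refs = {"spikeA": "ACGTACGTTGCA"}
index = build_index(refs, k=8)
reads = {"r1": "TTACGTACGTTGG", "r2": "GGGGGGGGGGGGG"}
for read_id, seq in reads.items():
    print(read_id, assign_read(seq, index, k=8))
```

Reads assigned to a reference name would go to that reference's bin, and reads assigned `None` would go to the cleaned output.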
written 2.4 years ago by Brian Bushnell

swbarnes2 wrote:

Make a new reference file that includes the spike-in sequences. Rebuild the index for that genome with Bowtie2 (bowtie2-build), then realign.
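
As a rough sketch of those two steps, the Python snippet below assembles the command lines for `bowtie2-build` and `bowtie2` and prints them. The file names (`spikeins.fa`, `reads.fastq`, `mapped.fastq`, `unmapped.fastq`, `alignments.sam`), the index prefix, and the helper name `spikein_commands` are placeholders, and single-end reads are assumed. The `--al` and `--un` options ask bowtie2 to write the reads that aligned and the reads that failed to align to separate FASTQ files.

```python
import shlex


def spikein_commands(reference_fasta, reads_fastq, index_prefix="ref_with_spikeins"):
    """Build the two command lines: index the reference, then align single-end reads."""
    build = ["bowtie2-build", reference_fasta, index_prefix]
    align = [
        "bowtie2",
        "-x", index_prefix,        # index prefix produced by bowtie2-build
        "-U", reads_fastq,         # unpaired (single-end) reads
        "--al", "mapped.fastq",    # reads that aligned at least once
        "--un", "unmapped.fastq",  # reads that failed to align
        "-S", "alignments.sam",    # SAM output
    ]
    return build, align


build, align = spikein_commands("spikeins.fa", "reads.fastq")
print(shlex.join(build))
print(shlex.join(align))
```

To actually run them, each list can be passed to `subprocess.run(cmd, check=True)` once bowtie2 is installed. If the reference holds only the spike-ins, the `--al` file contains the spike-in reads; with a combined genome plus spike-in reference, the reads would instead need to be separated by reference name in the SAM output. Reads that are longer than the spike-ins may need adapter trimming or `--local` mode first, depending on the data.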

written 2.4 years ago by swbarnes2

Hi swbarnes2, I prepared a new *.fa file with all the spike-in sequences, as shown below:

>UniSP100
TCCCAAATGTAGACAAAGCA
>UniSP101
TGAAGCTGCCAGCATGATCTA
>UniSP102
CAGCCAAGGATGACTTGCCGG

Would you be kind enough to send me a Bowtie2 script, given that my test file is B12_015.fastq? Thank you very much.

written 2.4 years ago by Björn

With such short sequences, I think kmer-matching will probably work better than alignment... are those the full length of the spike-ins?

written 2.4 years ago by Brian Bushnell

Yes, they are the full length of the spike-ins. Although 12 spike-ins were used, I gave only 3 as examples, which should be enough to work out the command line for removing them.

written 2.4 years ago by Björn

In that case, if you decide to use Seal, change the flag "k=31" to "k=20", or to whatever the length of the shortest spike-in is. You may also want to allow a substitution with the "hdist" flag, e.g.

seal.sh in=reads.fq ref=spikeins.fa pattern=spikein_%.fq outu=clean.fq stats=stats.txt k=20 hdist=1
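
The effect of picking k from the shortest spike-in and tolerating one substitution can be sketched in Python as follows. This is a conceptual illustration under simplifying assumptions, not Seal's implementation: the helper names (`hamming_neighbors`, `match`) are hypothetical, only the three example spike-ins from this thread are used, and reverse complements are ignored.

```python
def hamming_neighbors(kmer, alphabet="ACGT"):
    """Return kmer plus every sequence differing from it by one substitution."""
    variants = {kmer}
    for i, base in enumerate(kmer):
        for alt in alphabet:
            if alt != base:
                variants.add(kmer[:i] + alt + kmer[i + 1:])
    return variants


spikeins = {
    "UniSP100": "TCCCAAATGTAGACAAAGCA",
    "UniSP101": "TGAAGCTGCCAGCATGATCTA",
    "UniSP102": "CAGCCAAGGATGACTTGCCGG",
}

# k is the length of the shortest spike-in, so every spike-in yields at least one k-mer.
k = min(len(seq) for seq in spikeins.values())

# Index every spike-in k-mer and its one-substitution neighbours by spike-in name.
index = {}
for name, seq in spikeins.items():
    for i in range(len(seq) - k + 1):
        for variant in hamming_neighbors(seq[i:i + k]):
            index.setdefault(variant, name)


def match(read):
    """Return the spike-in with a k-mer within one substitution of some read k-mer."""
    for i in range(len(read) - k + 1):
        hit = index.get(read[i:i + k])
        if hit is not None:
            return hit
    return None


print(k)
print(match("TCCCAAATGTAGACAAAGCA"))  # exact copy of UniSP100
print(match("TCCCAAATGTCGACAAAGCA"))  # UniSP100 with one substitution
print(match("TCCCAAATGTCGACTAAGCA"))  # UniSP100 with two substitutions
```

With a Hamming distance of 1, the read carrying a single substitution is still assigned to its spike-in, while the read carrying two substitutions is not.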
modified 2.4 years ago • written 2.4 years ago by Brian Bushnell