Using magicblast for short sequences
0
0
Entering edit mode
3.8 years ago
goonerrn • 0

I am trying to use magicBlast using database of length 30-80 bp to extract sequences from reads I have thousands of sequences of length 30- 80 bp. Each sequence was named based on location(eg. chr1: 24331-23372). I created a BLAST database and want to search for reads in fastq file or directly SRA file.

I looked at post from https://plindenbaum.blogspot.com/search?q=magic+

The headers i obtain, I think are the names of the sequence in the BLAST used as contigs. (normally they are chr1, chr2...etc) This is a sample line output from the SAM file (output of MagicBlast)

@HD VN:1.0  GO:query

@SQ SN:chr1:23506-23543 LN:38

@SQ SN:chr1:25089-25134 LN:46

@SQ SN:chr1:29039-29082 LN:44

@SQ SN:chr1:34360-34397 LN:38

'#'Sample line

**SRR1237994.908191 163 chr14:69256143-69256224 1   255 41S60M  =   20  82  CACATTCTGAGGTGCTGGGGGAAAGGGGTTGAGCTTAACCTTGTTAATGTAGGGCCTGTGGGGAATGGGATGGGTAGGGAGAAGAGGGTATGGGATGTGGG   *   NH:i:1  AS:i:60 NM:i:0**

To overcome the contig issue, I made genome reference(assembly) containing the regions as the headers(from the SAM file). The fasta file used to blast initially against the reads is now the reference file(to use in samtools, mpileup ) I am confused in this step.

I am interested in identifying variants in the region, just through the reads. Is there an alternative, better suited to this method, or is my way not the right way at all? Can i use GATK pipeline on the initial SAM file i obtained? Any suggestions are helpful. Thanks. I am looking at this method to be useful for WGS data, if i can only analyze the required region, would save a lot of time and complexity.

Code used to generate the files

magicblast -sra SRR1237994 -db my_reference

Then I ran mpileup, with the assembly and call.

bcftools mpileup $currBAM -f $currAssembly --max-depth 1000000 -a FORMAT/AD -a INFO/AD  > tmp.vcf

cat tmp.vcf | bcftools call -mv -Ov -o $currVariantFile
alignment RNA-Seq genome BLAST magicBLAST • 869 views
ADD COMMENT

Login before adding your answer.

Traffic: 2301 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6