Reference Genome Path In Samtools
11.4 years ago
win ▴ 970

Hi all, one of the classic commands in samtools is as follows:

samtools mpileup -uf ref.fa aln1.bam aln2.bam | bcftools view -bvcg - > var.raw.bcf

In the command above ref.fa is the reference fasta file.

I wanted to know if samtools can work with a reference file that is the output of a script (e.g. http://mysite.org/ref.php), where ref.php would write FASTA data to the output stream.

thanks, a

genome • 3.0k views
11.4 years ago

Do you mean using a stream as the reference? No, because samtools uses random access (fseek) to fetch specific regions of the genome, and a stream cannot be seeked. That is also why a genome must be indexed with 'samtools faidx'.
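
To illustrate why the index matters, here is a minimal Python sketch (not samtools' actual code) of how the columns of a .fai line (sequence length, byte offset of the first base, bases per line, bytes per line) let a reader jump straight to a region. The helper names `build_fai` and `fetch` and the tiny FASTA content are made up for the example.

```python
import os
import tempfile


def build_fai(fasta_path):
    """Scan a FASTA once; return {name: (length, offset, linebases, linewidth)}.

    Assumes all sequence lines of a record except the last share one width,
    which 'samtools faidx' also requires.
    """
    index = {}
    name = None
    pos = 0  # byte position of the start of the current line
    with open(fasta_path, "rb") as fh:
        for line in fh:
            if line.startswith(b">"):
                name = line[1:].split()[0].decode()
                # offset = byte position of the first base of this record
                index[name] = [0, pos + len(line), 0, 0]
            elif name is not None:
                bases = len(line.rstrip(b"\r\n"))
                entry = index[name]
                if entry[2] == 0:  # first sequence line fixes the line geometry
                    entry[2], entry[3] = bases, len(line)
                entry[0] += bases
            pos += len(line)
    return {k: tuple(v) for k, v in index.items()}


def fetch(fasta_path, index, name, start, end):
    """Return bases [start, end) (0-based, end exclusive) via seek, not a scan."""
    length, offset, linebases, linewidth = index[name]
    end = min(end, length)
    if start >= end:
        return ""
    # Every full line of `linebases` bases occupies `linewidth` bytes on disk.
    first = offset + (start // linebases) * linewidth + start % linebases
    last = offset + (end // linebases) * linewidth + end % linebases
    with open(fasta_path, "rb") as fh:
        fh.seek(first)  # this is the step a pipe or HTTP stream cannot do
        raw = fh.read(last - first)
    return raw.replace(b"\n", b"").replace(b"\r", b"").decode()


# Hypothetical two-record FASTA, written to a temporary file for the demo.
with tempfile.NamedTemporaryFile("wb", suffix=".fa", delete=False) as tmp:
    tmp.write(b">chr1\nACGTACGT\nTTGGCCAA\nAC\n>chr2\nGGGG\n")
fai = build_fai(tmp.name)
print(fai["chr1"])                          # (18, 6, 8, 9)
print(fetch(tmp.name, fai, "chr1", 6, 12))  # GTTTGG
os.remove(tmp.name)
```

The `fh.seek(first)` call is the point: without a seekable file, the only way to reach a region is to read everything before it.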

But you can always save the output of your PHP script to a file and use it as a regular FASTA file.
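
A minimal sketch of that workaround in Python, assuming the script serves plain FASTA over HTTP. The function names `save_stream` and `download_reference` and the output path are hypothetical; the URL is the one from the question.

```python
import shutil
from urllib.request import urlopen


def save_stream(stream, path, chunk_size=1 << 20):
    """Copy a binary stream to `path` in chunks, so the genome never sits in memory."""
    with open(path, "wb") as out:
        shutil.copyfileobj(stream, out, chunk_size)
    return path


def download_reference(url, path):
    """Fetch FASTA from `url` (e.g. http://mysite.org/ref.php) into a local file."""
    with urlopen(url) as response:
        return save_stream(response, path)
```

After something like `download_reference("http://mysite.org/ref.php", "ref.fa")`, run `samtools faidx ref.fa` and pass ref.fa to mpileup as usual.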

Edit: yes, it could be possible to use a database, but you would have to change many lines of code in samtools, for example this function in faidx.c:

char *faidx_fetch_seq(const faidx_t *fai, char *c_name, int p_beg_i, int p_end_i, int *len)

and accessing the sequence through a database would certainly slow down the whole analysis.
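
To give an idea of what a database-backed replacement for that function would have to do, here is a rough Python sketch using SQLite. Everything in it is an assumption made for illustration: the table layout, the chunk size, and the names `load` and `fetch_seq`. The coordinates are taken to be 0-based and inclusive, which is how I understand faidx_fetch_seq to behave.

```python
import sqlite3

CHUNK = 4  # bases per row; tiny for the demo, a real table would use much larger chunks


def load(conn, name, seq):
    """Store `seq` under `name`, split into fixed-size chunks (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ref ("
        "name TEXT, chunk INTEGER, bases TEXT, PRIMARY KEY (name, chunk))"
    )
    rows = [(name, i // CHUNK, seq[i:i + CHUNK]) for i in range(0, len(seq), CHUNK)]
    conn.executemany("INSERT INTO ref VALUES (?, ?, ?)", rows)


def fetch_seq(conn, name, beg, end):
    """Return bases beg..end, 0-based and inclusive; empty string if nothing matches."""
    beg = max(beg, 0)
    if end < beg:
        return ""
    first, last = beg // CHUNK, end // CHUNK
    cur = conn.execute(
        "SELECT bases FROM ref WHERE name = ? AND chunk BETWEEN ? AND ? ORDER BY chunk",
        (name, first, last),
    )
    joined = "".join(row[0] for row in cur)
    skip = beg - first * CHUNK  # bases to drop from the first chunk
    return joined[skip:skip + (end - beg + 1)]


conn = sqlite3.connect(":memory:")
load(conn, "chr1", "ACGTACGTTTGGCCAAAC")
print(fetch_seq(conn, "chr1", 6, 11))  # GTTTGG
```

Each region lookup becomes a query plus string assembly instead of one fseek and one read, which is where the slowdown would come from.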

Another (geek) solution would be to implement a proxy to the database using FUSE (http://fuse.sourceforge.net/), so that the database appears to samtools as an ordinary seekable file.


Yes, I did mean using a stream as the reference. Thanks for the info about "fseek". Actually, I wanted to load the reference file into a database (I know this may not be the most recommended approach, but I want it for a specific reason). So what I am really looking for is whether it's possible to read the reference FASTA from a database.


That sounds like one possible approach, thank you.
