Question: Extract Consensus Sequence from DIAMOND
21 months ago by
c.pouchon0 wrote:

Hi everyone,

I am working with shotgun genomic sequences from plants, and I want to map my sequences to a list of reference genes (e.g. BUSCO) in order to retrieve a consensus for each of my samples and then compare them in phylogenetic analyses.

I ran DIAMOND with my reads/contigs (from SPAdes) against my reference proteins and identified different hits. But I am wondering how I can retrieve a consensus (in nucleotides) for each reference gene, and whether there is a proper way to do this, as with the mpileup function in samtools. Do you have any ideas?

Thanks, Best regards

Charles P.


It is not clear what you are asking above. Let me try to state it as follows. Let us know if that is correct.

You want to extract the reads/contigs that are "aligning" to a particular protein (you mention both above; is that what you used for the DIAMOND search?), either as a consensus sequence or so that you can eventually generate a consensus from them?

written 21 months ago by GenoMax95k

Thanks, and sorry for the unclear question. You are right, I ran two kinds of analyses. But I am interested in the reads: I want to extract the reads aligning to a particular protein (by DIAMOND) and then generate a consensus from them.

written 21 months ago by c.pouchon0

Then I suggest you extract the read names from your DIAMOND output (if you have tabular output, some combination of grep and cut commands should work) and then get those reads from your original fastq data. You can use a tool from the BBMap suite as one of the options to do that.
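
If you would rather script the two steps than chain grep and cut, here is a rough Python sketch of the same idea. It assumes the DIAMOND output is in the default BLAST-style tabular format (query name in column 1, reference protein in column 2), that the fastq uses plain 4-line records, and that the read name reported by DIAMOND is the fastq header up to the first whitespace. The function names and the toy data are made up for illustration, so adapt them to your own files.

```python
import io
from collections import defaultdict


def read_names_by_protein(diamond_tsv):
    """Group query (read) names by the reference protein they hit.

    Assumes tab-separated lines with the query name in column 1 and the
    reference protein name in column 2.
    """
    hits = defaultdict(set)
    for line in diamond_tsv:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # not a tabular hit line
        query_id, subject_id = fields[0], fields[1]
        hits[subject_id].add(query_id)
    return hits


def filter_fastq(fastq, wanted):
    """Yield 4-line fastq records whose read name is in `wanted`.

    Assumes each record is exactly 4 lines (no wrapped sequences).
    """
    while True:
        header = fastq.readline()
        if not header:
            break  # end of file
        seq = fastq.readline()
        plus = fastq.readline()
        qual = fastq.readline()
        # Read name = header without the leading '@', up to the first whitespace.
        parts = header[1:].split()
        name = parts[0] if parts else ""
        if name in wanted:
            yield header + seq + plus + qual


if __name__ == "__main__":
    # Toy data standing in for a real DIAMOND table and fastq file.
    tsv = io.StringIO("read1\tprotA\t95.0\nread2\tprotB\t90.0\n")
    fq = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nGGGG\n+\nIIII\n")
    hits = read_names_by_protein(tsv)
    for record in filter_fastq(fq, hits["protA"]):
        print(record, end="")
```

With real data you would open the DIAMOND table and the fastq with `open(...)` and write one fastq per reference protein. For large fastq files, collecting the names for all proteins first and making a single pass over the reads is likely to be faster than one pass per protein.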

written 21 months ago by GenoMax95k