Question

Designing degenerate primers using alignment of protein sequences from other species

1

7.2 years ago

kamran.shekh ▴ 10

I am trying to design PCR primers for a gene whose sequence is not known. Even the whole-transcriptome sequencing done in our lab did not identify this gene. So I think my only remaining option is to design degenerate primers: align the protein sequence from several related species and design primers based on the conserved regions. Could you please explain the process of primer design by this method? I know some of the steps, which I have described below:

Step 1: download the sequence of the protein in question from related species from NCBI in FASTA format

Step 2: perform the alignment using Clustal Omega

Step 3: identify the conserved regions

I am not sure how to move forward from here. I tried j-CODEHOP for the next steps, but it asks for many parameters that I do not understand. Thanks
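As a rough illustration of step 3, the sketch below scans an aligned protein FASTA (for example, Clustal Omega output saved in FASTA format) for gap-free windows in which every column is highly conserved. The function names, the window width of 7 residues, and the identity threshold are assumptions chosen for illustration, not values taken from any primer design tool.

```python
from collections import Counter


def read_aligned_fasta(text):
    """Parse aligned FASTA text into a list of equal-length sequences."""
    seqs, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current:
                seqs.append("".join(current))
                current = []
        else:
            current.append(line.upper())
    if current:
        seqs.append("".join(current))
    if len({len(s) for s in seqs}) > 1:
        raise ValueError("sequences differ in length; is this an alignment?")
    return seqs


def column_identity(seqs):
    """Fraction of sequences sharing the most common residue, per column.

    Columns that contain a gap are scored 0.0 so they never count as conserved.
    """
    scores = []
    for column in zip(*seqs):
        if "-" in column:
            scores.append(0.0)
        else:
            top_count = Counter(column).most_common(1)[0][1]
            scores.append(top_count / len(column))
    return scores


def conserved_windows(seqs, width=7, min_identity=0.9):
    """Return (start, consensus) for windows where every column meets min_identity.

    start is a 0-based alignment column; consensus is the majority residue
    at each column of the window.
    """
    columns = list(zip(*seqs))
    scores = column_identity(seqs)
    hits = []
    for start in range(len(scores) - width + 1):
        if all(s >= min_identity for s in scores[start:start + width]):
            consensus = "".join(
                Counter(col).most_common(1)[0][0]
                for col in columns[start:start + width]
            )
            hits.append((start, consensus))
    return hits
```

Overlapping windows are reported separately, so a long conserved block shows up as several consecutive hits. A stretch of 6 to 8 residues corresponds to an 18 to 24 nt primer, which is why a width around 7 is a reasonable starting assumption.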

degenerate primers protein sequence alignment • 3.4k views

ADD COMMENT • link 7.2 years ago by kamran.shekh ▴ 10

0

Hello kamran.shekh!

It appears that your post has been cross-posted to another site: https://biology.stackexchange.com/questions/55707/designing-degenerate-primers-using-alignment-of-protein-sequences-from-other-spe

This is typically not recommended as it runs the risk of annoying people in both communities.

ADD REPLY • link 7.2 years ago by Michael 54k

0

The species in my question is white sturgeon, a very ancient fish with a largely cartilaginous skeleton. For transcriptomics, the RNA-Seq library was loaded onto a MiSeq v3 150-cycle cartridge and run as 75 base pair (bp) paired-end reads on a MiSeq sequencer (Illumina). No public databases for either the genome or the transcriptome of white sturgeon were available. Therefore, a comprehensive reference transcriptome was constructed by de novo assembly of reads from white sturgeon liver. Please let me know if you need further information on the assembly. Actually, I am interested in two genes: SLC39a8 (a zinc transporter) and ECaC (a calcium transporter). Both are very common transporters and are found in abundance across the animal and plant kingdoms. Their sequences are known in many species, but as mentioned above, the transcriptomic data from our lab failed to identify them. More importantly, the ECaC sequence is even known in a very closely related fish, lake sturgeon.

ADD REPLY • link 7.2 years ago by kamran.shekh ▴ 10

0

Thanks for the update! And how did you perform the de-novo assembly (Trinity?) and search (tblastn?) using a closely related species as template?

ADD REPLY • link 7.2 years ago by Michael 54k

0

Contigs for the reference transcriptome were de novo assembled from the merged reads and unmerged paired-end reads from individual sequencing reads by use of CLC Genomics Workbench v.5.0 (CLC Bio) with default parameters. Contigs comprising the reference transcriptome were annotated by use of BLASTX searches in Blast2GO v.2.5.0 against sequences in the NCBI non-redundant protein database for zebrafish.

ADD REPLY • link 7.2 years ago by kamran.shekh ▴ 10

0

You should try a different assembler. I have not used CLC myself, but I don't think it is top notch for your TSA. What I have heard is that it is easy to use and generates assemblies that look good on paper (N50, etc.), but when we used it for our genome, it created a lot of chimeric contigs. Try Trinity (with default parameters) to get a second assembly, then run a different search strategy:

Get a few sequences for your genes of interest from the most closely related species, plus some more fish, and search with only those as templates, using tblastn or tblastx with Trinity.fasta as the BLAST database. Then extract the best hits and search them against NR again for validation. Good luck!
I/we can help you with that pipeline as well, if you want.
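A minimal sketch of that search step, assuming NCBI BLAST+ is installed and the template proteins are in a FASTA file. The file names `queries.faa`, `sturgeon_tx` and `hits.tsv` and the E-value cutoff are hypothetical placeholders; only `Trinity.fasta` comes from the suggestion above. The code builds the two command lines and picks the best-scoring contig per query from tabular (`-outfmt 6`) output.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def blast_commands(transcriptome="Trinity.fasta", queries="queries.faa",
                   db="sturgeon_tx", out="hits.tsv", evalue="1e-5"):
    """Build makeblastdb and tblastn command lines (file names are placeholders)."""
    makedb = ["makeblastdb", "-in", transcriptome, "-dbtype", "nucl", "-out", db]
    search = ["tblastn", "-query", queries, "-db", db,
              "-evalue", evalue, "-outfmt", "6", "-out", out]
    return makedb, search


def best_hits(tabular_text):
    """Return {query id: best hit as a dict}, ranked by bit score."""
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        hit = dict(zip(FIELDS, row))
        query = hit["qseqid"]
        score = float(hit["bitscore"])
        if query not in best or score > float(best[query]["bitscore"]):
            best[query] = hit
    return best
```

Each command list can be passed to `subprocess.run(cmd, check=True)`. The contigs named in `sseqid` are the ones to pull out of the assembly and search against NR for validation.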

ADD REPLY • link 7.2 years ago by Michael 54k

0

Thanks for the detailed response. I will try the approach you suggested. Thanks

ADD REPLY • link 7.2 years ago by kamran.shekh ▴ 10

score 1 · Answer 1 · 2017-01-28

I am not sure this is going to work; it is suspicious that your transcriptome does not contain a matching transcript. How did you do the transcriptome sequencing and assembly? Which gene are we actually talking about, and can you retrieve sequences from closely related species? There is a chance that the gene is really missing. Could you try a proteomics approach or a western blot instead?

If you want to go for the degenerate primer approach, here is a random blog post I found: https://skeetersays.wordpress.com/2008/07/10/degenerate-pcr-a-guide-and-tutorial/

When the post says "now stare at the printout of the alignment", you had better do this with a GUI tool like Jalview, showing the conservation score and using "color by sequence identity". There are also automated tools for this, listed here: https://omictools.com/degenerate-primers-category, e.g. GeneFisher2
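Once a conserved peptide has been picked from the alignment, a quick way to judge whether it is usable is to back-translate it into IUPAC degenerate codes and count the pool size. The sketch below assumes the standard genetic code and merges all codons of an amino acid position by position; this is a simplification of my own, not what GeneFisher2 or CODEHOP do internally, and the function names are made up for illustration.

```python
from math import prod

BASES = "TCAG"
# Standard genetic code (NCBI table 1), codons enumerated in TCAG order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]

# IUPAC code for each set of bases.
IUPAC = {frozenset(bases): code for code, bases in {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}.items()}

COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def degenerate_codon(residue):
    """Return (IUPAC codon, pool size) covering every codon for one residue.

    Merging per position over-covers Ser, Arg and Leu (e.g. WSN for Ser also
    matches non-Ser codons), so conserved stretches rich in these are poor picks.
    """
    codons = [c for c, aa in zip(CODONS, AMINO_ACIDS) if aa == residue]
    if not codons or residue == "*":
        raise ValueError(f"not a standard amino acid: {residue!r}")
    position_sets = [frozenset(c[i] for c in codons) for i in range(3)]
    codon = "".join(IUPAC[s] for s in position_sets)
    return codon, prod(len(s) for s in position_sets)


def degenerate_primer(peptide):
    """Back-translate a peptide into a degenerate primer and its degeneracy."""
    parts = [degenerate_codon(res) for res in peptide.upper()]
    primer = "".join(codon for codon, _ in parts)
    return primer, prod(size for _, size in parts)


def reverse_complement(primer):
    """Reverse complement of an IUPAC sequence, for the reverse primer."""
    return primer.translate(COMPLEMENT)[::-1]


print(degenerate_primer("WDPCHNF"))  # ('TGGGAYCCNTGYCAYAAYTTY', 128)
```

Peptides rich in Trp and Met (one codon each) and two-codon residues such as Asp, His or Cys keep the degeneracy low, which is the main thing to look for when staring at the alignment. The reverse primer is the reverse complement of the back-translated downstream peptide.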