Question: Retrieving All Abstracts And Searching For Gene Names
8.7 years ago by
Random160 wrote:

I'm currently looking for sex-related genes and their pseudogenes in a reptile that I sequenced, for which there is no reference.

I would like to search my BLAST results for specific gene names that match the description above.

My idea was to go through the literature that contains keywords such as "sex", "bird", "X", "Y", "ZW/ZZ", and "reptiles", extract the words that are all in capitals, because those might well be gene names, and then get rid of common abbreviations.
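A minimal sketch of that capital-word idea, assuming the abstracts are already available as plain strings. The stop list `COMMON_ABBREVIATIONS`, the function name, and the two example sentences (including the placeholder symbol `GENE1`) are made up for illustration:

```python
import re
from collections import Counter

# Hypothetical stop list of common all-caps abbreviations; extend as needed.
COMMON_ABBREVIATIONS = {"DNA", "RNA", "PCR", "BLAST", "XX", "XY", "ZZ", "ZW"}

# A capital letter followed by one or more capitals or digits (e.g. FOXL2).
CANDIDATE = re.compile(r"\b[A-Z][A-Z0-9]{1,}\b")


def candidate_gene_names(abstracts):
    """Count all-caps tokens across abstracts, skipping known abbreviations."""
    counts = Counter()
    for text in abstracts:
        for token in CANDIDATE.findall(text):
            if token not in COMMON_ABBREVIATIONS:
                counts[token] += 1
    return counts


# Made-up example text, not real abstracts.
abstracts = [
    "FOXL2 is required for ovarian development; DNA was amplified by PCR.",
    "Expression of GENE1 and FOXL2 differs between ZZ and ZW embryos.",
]
print(candidate_gene_names(abstracts).most_common())
```

Note that this only catches symbols written fully in capitals; a mixed-case spelling such as "Foxl2" would be missed unless the pattern is relaxed.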

I just saw where Lars provides a good solution.

Does anyone have any other suggestions (which may not even involve abstract retrieval) that would be more effective for what I'm trying to do, or might that be the best one?

Thanks in advance

modified 8.6 years ago by Leonor Palmeira3.7k • written 8.7 years ago by Random160

What's the content of your BLAST output? Do you have the name of a gene in your hits? What do you mean by "there is no reference": has your organism not been sequenced, are these unknown genes, or do you have no bibliographic reference?

written 8.7 years ago by Pierre Lindenbaum127k

The BLAST hits have the match_description, for example "Salmo salar clone BAC 261D01 Foxl2-like protein (Foxl2) gene"

FOXL2 is involved in ovarian development so it may be of interest.

The reason I have to do this is that the sequences were heavily contaminated with bacteria, so I need to identify the specific sequences (in this case contigs, because they have been assembled) that might belong to my lizard.

By "no reference" I mean that there's no reference genome for my species.

written 8.7 years ago by Random160

Since I'm working with a W sex chromosome, the most similar one that may exist is the chicken W. Among lizards, Anolis has been sequenced, but that lizard has an XX/XY system, while mine has ZZ/ZW.

My BLAST output has the following columns: query_id match_description %_identity alignment_length mismatches gap_openings q_start q_end s_start s_end evalue
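With that layout, one way to pull candidate gene symbols out of the hits is to read the table and take whatever sits in parentheses in match_description, as in "(Foxl2)". This is only a sketch: it assumes the file is tab-separated with no header row, the function name is hypothetical, and the contig name and numeric values in the example row are invented.

```python
import csv
import io
import re

# Column names follow the layout listed above ("%_identity" renamed for convenience).
COLUMNS = [
    "query_id", "match_description", "pct_identity", "alignment_length",
    "mismatches", "gap_openings", "q_start", "q_end", "s_start", "s_end", "evalue",
]

# A symbol-like token in parentheses, e.g. "(Foxl2)".
SYMBOL_IN_PARENS = re.compile(r"\(([A-Za-z][A-Za-z0-9-]*)\)")


def gene_symbols_by_contig(handle):
    """Map each query_id to the set of upper-cased symbols found in its hits."""
    hits = {}
    reader = csv.DictReader(handle, fieldnames=COLUMNS, delimiter="\t")
    for row in reader:
        for symbol in SYMBOL_IN_PARENS.findall(row["match_description"]):
            hits.setdefault(row["query_id"], set()).add(symbol.upper())
    return hits


# Invented row: only the description text comes from the thread.
example = io.StringIO(
    "contig_1\tSalmo salar clone BAC 261D01 Foxl2-like protein (Foxl2) gene\t"
    "85.0\t200\t30\t0\t1\t200\t501\t700\t1e-40\n"
)
print(gene_symbols_by_contig(example))
```

The resulting symbols can then be compared against a list of sex-related gene names gathered from the literature.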

written 8.7 years ago by Random160

I have no idea what's on the W chromosome, so any tandem repeats that I find, or any genes linked to sex, may well be of interest.

written 8.7 years ago by Random160
8.7 years ago by
Pasta1.3k wrote:

You could make it simple.

1- Concatenate all your keywords using NCBI query syntax, for example: "sex" OR "ovarian" OR "bird" OR "Gene1" OR "grandad" ... whatever.

2- Copy and paste this query into the NCBI query box and submit. (Is there a limit on the number of characters you can use? I don't know.)

3- Choose "Send to", then "File".

4- Then you can parse/work on the downloaded file with some script.

You might give it a try, it should work fine if your query is not too long.
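For a long keyword list, step 1 can be scripted rather than typed by hand. The sketch below joins the keywords with OR and, as an alternative to pasting into the web form, builds a request URL for NCBI's E-utilities `esearch` endpoint. The function names are hypothetical, and the `retmax` value is an arbitrary choice; nothing is sent over the network here.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint (used here instead of the web form).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_or_query(keywords):
    """Quote each keyword and join them with OR, as in step 1."""
    return " OR ".join(f'"{keyword}"' for keyword in keywords)


def esearch_url(keywords, retmax=100):
    """Build (but do not send) a PubMed esearch URL for the OR query."""
    params = {"db": "pubmed", "term": build_or_query(keywords), "retmax": retmax}
    return f"{ESEARCH}?{urlencode(params)}"


keywords = ["sex", "ovarian", "bird"]
print(build_or_query(keywords))
print(esearch_url(keywords))
```

The first printed line is what would go into the query box; the second is a URL that a script could fetch to get matching PubMed IDs.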

You can also try SRS; it's badass for big queries.

written 8.7 years ago by Pasta1.3k
8.7 years ago by
Casey Bergman18k
Athens, GA, USA
Casey Bergman18k wrote:

There are a few tools that take sequences as input and generate links to relevant PubMed articles:

Maybe one of these tools could help you achieve what you are attempting to do.

modified 6 months ago by RamRS26k • written 8.7 years ago by Casey Bergman18k
8.7 years ago by
Leonor Palmeira3.7k
Liège, Belgium
Leonor Palmeira3.7k wrote:

Here's a side answer that might help you drastically reduce your BLAST hits: since you have a large bacterial contamination, I would BLAST with all bacterial sequences excluded. If you are using NCBI's online BLAST, you can exclude a complete taxonomic ID (Bacteria = 2). To reduce your BLAST hits further, you can also consider BLASTing against only a specific TaxID (this can be something as broad as Sauropsida, for instance).
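As a small sketch of how such a restriction can be written down, the helper below formats a TaxID as an Entrez organism term of the form `txid2[ORGN]`, optionally negated. The function name is hypothetical. As far as I know, a string like this can be used as an Entrez query restriction in NCBI BLAST (for example via the `-entrez_query` option of BLAST+ when searching with `-remote`), but treat that as an assumption and check the BLAST help for your version.

```python
def taxid_filter(taxid, exclude=False):
    """Format a TaxID as an Entrez organism restriction, optionally negated."""
    term = f"txid{int(taxid)}[ORGN]"
    return f"NOT {term}" if exclude else term


# Exclude Bacteria (TaxID 2, as given above).
print(taxid_filter(2, exclude=True))
```

To restrict the search to a group such as Sauropsida instead, look up its TaxID in NCBI Taxonomy and call the helper with `exclude=False`.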

written 8.7 years ago by Leonor Palmeira3.7k