Hello, I am new to bioinformatics and new to this forum as well. I have a list of 1400+ gene names with their chromosome numbers and exon coordinates (3 exons per gene). I am looking for a way to get the domains associated with all of these exons. Is there any way to retrieve the sequences for each of these exons (for all genes!) in batch and feed them to some other tool (*) to get the associated domains? Is there an existing thread on this?
(*) - I only know of Batch CD-Search, which requires a protein query, so please let me know if there is any other batch conserved-domain tool that works with a nucleotide query.
You can probably use NCBI datasets to download this information in batch. See: https://www.ncbi.nlm.nih.gov/datasets/docs/v2/how-tos/genes/download-gene-data-package/
You can take a look at an old post that I wrote: https://crazyhottommy.blogspot.com/2015/04/get-all-promoter-sequences-of-human.html
Just change the promoter coordinates to the exon coordinates.
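To make the "swap in exon coordinates" step concrete, here is a minimal Python sketch of turning an exon list into BED lines and pulling out the sequences. It is only an illustration under some assumptions that the question does not state: the exon coordinates are taken to be 1-based and inclusive, a strand is known for each exon, and the genome is already loaded as a dict of chromosome name to sequence. The function names (`exons_to_bed`, `extract_sequences`) and the `GENE_exonN` naming are made up for this example. For a real genome you would more likely write the BED lines to a file and run `bedtools getfasta` with `-s -name` against the reference FASTA.

```python
from typing import Dict, Iterable, List, Tuple

# One exon record: (gene, chrom, start, end, strand).
# ASSUMPTION: start/end are 1-based and inclusive, as in most gene tables.
Exon = Tuple[str, str, int, int, str]

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def exons_to_bed(exons: Iterable[Exon]) -> List[str]:
    """Convert exon records to 6-column BED lines (0-based, half-open)."""
    counts: Dict[str, int] = {}
    lines = []
    for gene, chrom, start, end, strand in exons:
        counts[gene] = counts.get(gene, 0) + 1
        name = f"{gene}_exon{counts[gene]}"  # hypothetical naming scheme
        # BED start is 0-based, so subtract 1; BED end is exclusive, so keep it.
        lines.append(f"{chrom}\t{start - 1}\t{end}\t{name}\t0\t{strand}")
    return lines


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_sequences(bed_lines: Iterable[str], genome: Dict[str, str]) -> Dict[str, str]:
    """Slice each BED interval out of an in-memory genome, honoring strand."""
    sequences = {}
    for line in bed_lines:
        chrom, start, end, name, _score, strand = line.split("\t")
        seq = genome[chrom][int(start):int(end)]
        sequences[name] = reverse_complement(seq) if strand == "-" else seq
    return sequences


if __name__ == "__main__":
    # Toy data only; replace with your own exon table and reference genome.
    toy_genome = {"chr1": "ACGTACGTAC"}
    toy_exons = [("GENE1", "chr1", 2, 5, "+"), ("GENE1", "chr1", 7, 9, "-")]
    bed = exons_to_bed(toy_exons)
    for name, seq in extract_sequences(bed, toy_genome).items():
        print(f">{name}\n{seq}")
```

Note that single exons often do not start in frame, so for a protein-based domain search you would usually join the exons of each gene (or use the annotated protein sequence) before translating.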
Just a note: it looks like this was cross-posted to Reddit: https://www.reddit.com/r/bioinformatics/comments/101ftv4/how_to_retrieve_sequences_in_batch_using_exon/
GenoMax Ming Tang @mohammadhassanj Thanks for the replies. I did try some of the solutions, especially BioMart, and they didn't work out. Somehow, I figured out how to use the Table Browser and I got what I wanted. But I will definitely try these solutions again in my free time. Thanks a bunch.