Question: where to find hg38 CDS without UTRs
10 months ago by ta_awwad (210), Frankfurt am Main
ta_awwad wrote:

Hello everyone, I am looking for human protein-coding transcript sequences without UTRs (i.e. the CDS only), in FASTA format if possible. Any idea where to find such a file?

Best, TA

modified 10 months ago by genomax (68k) • written 10 months ago by ta_awwad (210)

10 months ago by genomax (68k), United States
genomax wrote:

You should be able to get them from BioMart at Ensembl. A video tutorial for BioMart is available.

written 10 months ago by genomax (68k)

10 months ago by cpad0112 (11k), India
cpad0112 wrote:
  1. Get a GTF with all annotations (for all coding genes).
  2. Drop the 3' and 5' UTR features from the GTF.
  3. Use a tool such as bedtools getfasta to extract the protein-coding transcript sequences.
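
The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not a tested pipeline: the function name `gtf_cds_to_bed` and the toy GTF rows are made up for the example, and it assumes a standard 9-column GTF in which coding segments carry the feature type `CDS` and the attributes column contains a `transcript_id`. Keeping only `CDS` rows drops UTRs implicitly, and the output is BED6 (0-based start) so that it can be handed to a sequence extractor.

```python
import re

def gtf_cds_to_bed(gtf_lines):
    """Yield BED6 tuples (chrom, start0, end, name, score, strand) for CDS rows."""
    for line in gtf_lines:
        if line.startswith("#"):          # skip header/comment lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "CDS":
            continue                      # UTR, exon, gene, ... rows are ignored
        chrom, strand, attrs = fields[0], fields[6], fields[8]
        start, end = int(fields[3]), int(fields[4])
        match = re.search(r'transcript_id "([^"]+)"', attrs)
        name = match.group(1) if match else "."
        # GTF is 1-based inclusive; BED is 0-based half-open.
        yield (chrom, start - 1, end, name, ".", strand)

# Toy input (invented coordinates and IDs, for illustration only).
example_gtf = [
    "#!genome-build GRCh38\n",
    'chr1\tsrc\tUTR\t100\t119\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n',
    'chr1\tsrc\tCDS\t120\t150\t.\t+\t0\tgene_id "G1"; transcript_id "T1";\n',
    'chr1\tsrc\texon\t100\t150\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n',
    'chr2\tsrc\tCDS\t500\t560\t.\t-\t0\tgene_id "G2"; transcript_id "T2";\n',
]

bed_rows = list(gtf_cds_to_bed(example_gtf))
for row in bed_rows:
    print("\t".join(str(value) for value in row))
```

Written to a file (say `cds.bed`, a placeholder name), these rows could then be passed to something like `bedtools getfasta -fi genome.fa -bed cds.bed -s -name`, where `-s` requests strand-aware extraction. That yields one sequence per CDS segment, not one per transcript.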
modified 10 months ago • written 10 months ago by cpad0112 (11k)

Simply grepping or awking for 'CDS' as the feature type (column 3) in the GTF, followed by bedtools getfasta on the resulting coordinates, does the job.
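
One detail worth keeping in mind: extracting each CDS row separately gives one sequence per coding segment, so for multi-exon transcripts the pieces still have to be joined in transcript order, and minus-strand transcripts need to be reverse-complemented. A small Python sketch of that stitching step, under stated assumptions: the helper names (`reverse_complement`, `assemble_cds`) and the toy genome are hypothetical, segments use 0-based half-open coordinates, and all segments passed in belong to one transcript on one strand.

```python
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def assemble_cds(segments, genome):
    """Join the CDS segments of ONE transcript into a single coding sequence.

    segments: list of (chrom, start0, end, strand) tuples, 0-based half-open.
    genome:   dict mapping chromosome name to its sequence string.
    """
    # Sort by genomic start, concatenate the forward-strand pieces,
    # then flip the whole sequence once if the transcript is on '-'.
    ordered = sorted(segments, key=lambda seg: seg[1])
    forward = "".join(genome[chrom][start:end] for chrom, start, end, _ in ordered)
    strand = ordered[0][3]
    return reverse_complement(forward) if strand == "-" else forward

# Toy genome and segments (invented, for illustration only).
toy_genome = {"chrT": "CCATGAAAGGTTTTGACC"}
plus_segments = [("chrT", 13, 16, "+"), ("chrT", 2, 8, "+")]   # deliberately unsorted
minus_segments = [("chrT", 2, 8, "-"), ("chrT", 13, 16, "-")]

plus_cds = assemble_cds(plus_segments, toy_genome)
minus_cds = assemble_cds(minus_segments, toy_genome)
print(plus_cds)
print(minus_cds)
```

Sorting by start and reverse-complementing once at the end is equivalent to reverse-complementing each segment and joining them in descending genomic order, which is what per-segment strand-aware extraction would require.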

modified 10 months ago • written 10 months ago by ATpoint (17k)