Question

Problem reading the annotation file GPL10999_family.soft

5.1 years ago

modarzi ▴ 170

Hi,

I have downloaded RNA-seq data from GSE45510. For the annotation file, I downloaded the SOFT-formatted family file(s) for GPL10999. Now I want to read that file with the code below:

library(data.table)
annot <- fread("GPL10999_family.soft", skip = "!platform_table_begin", data.table = FALSE)

but I got the error below:

Error in fread("GPL10999_family.soft", skip = "!platform_table_begin",  : 
  File 'GPL10999_family.soft' does not exist or is non-readable.

Now, I want to know how I can get the annotation file for this data set (GSE45510). If downloading "GPL10999_family.soft" is the wrong approach, how else can I get the annotation for this data set?

I would appreciate it if anybody could share his/her comments with me.

Best Regards,

annotation • 1.2k views

score 2 · Answer 1 · 2019-03-27

5.1 years ago

GenoMax 141k

Platform GPL10999 is an Illumina sequencer, not a microarray, so there is no probe annotation table to read from its platform record.

If you intend to do something specific with this data, you may want to download the raw sequence data and align/count/analyze it yourself. Otherwise, the data processing is described in a section on the linked page.

ADD COMMENT • link 5.1 years ago by GenoMax 141k

Thank you for your comment. OK, I have downloaded the SRA files of GSE45510. Now I need the annotation file for this study, and my problem is that I don't know how to get it.

I would appreciate it if you could guide me in getting that file.

Best Regards,

ADD REPLY • link 5.1 years ago by modarzi ▴ 170

If you are going to do the alignments yourself using the raw data, then use whichever sequence/annotation combination you prefer. This would be the best way anyway, instead of trying to use counts from SRA.

You can find annotations/sequences for human/mouse data at the GENCODE site.
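
As a rough illustration of what "annotation" means for sequencing data, the sketch below pulls a transcript-to-gene table out of a GENCODE GTF file in base R. The file name `gencode.annotation.gtf.gz` is a placeholder, and the helpers `get_attr()` and `read_tx2gene()` are hypothetical names introduced here. It assumes the usual GENCODE attribute layout (`gene_id "..."; transcript_id "..."; gene_name "...";`) and that no field contains a `#` character.

```r
# Extract one attribute (e.g. gene_id) from GTF attribute strings.
# Returns NA where the attribute is absent.
get_attr <- function(attrs, key) {
  pattern <- sprintf('.*\\b%s "([^"]*)".*', key)
  hit <- grepl(pattern, attrs, perl = TRUE)
  out <- rep(NA_character_, length(attrs))
  out[hit] <- sub(pattern, "\\1", attrs[hit], perl = TRUE)
  out
}

# Read a (optionally gzipped) GTF and return one row per transcript.
read_tx2gene <- function(gtf_path) {
  if (!file.exists(gtf_path)) {
    stop("File not found: ", gtf_path, " (working directory: ", getwd(), ")")
  }
  gtf <- read.delim(
    gzfile(gtf_path), header = FALSE, comment.char = "#", quote = "",
    stringsAsFactors = FALSE,
    col.names = c("seqname", "source", "feature", "start", "end",
                  "score", "strand", "frame", "attribute")
  )
  tx <- gtf[gtf$feature == "transcript", ]
  data.frame(
    transcript_id = get_attr(tx$attribute, "transcript_id"),
    gene_id       = get_attr(tx$attribute, "gene_id"),
    gene_name     = get_attr(tx$attribute, "gene_name"),
    stringsAsFactors = FALSE
  )
}

# Placeholder file name: substitute the GTF you downloaded from GENCODE.
gtf_file <- "gencode.annotation.gtf.gz"
if (file.exists(gtf_file)) {
  tx2gene <- read_tx2gene(gtf_file)
  print(head(tx2gene))
}
```

A table like this is what you would use to summarise transcript-level quantifications to gene level.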

ADD REPLY • link 5.1 years ago by GenoMax 141k

You will also have to convert the SRA files to FASTQ, and then use something like Salmon or kallisto to pseudo-align these to the GENCODE reference transcriptome (available at GenoMax's link, above).
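
The steps above could be driven from R roughly as follows. This is only a sketch under several assumptions: `fasterq-dump` (SRA Toolkit) and `salmon` are installed and on the PATH, the reads are single-end (for paired-end data, replace `-r` with `-1`/`-2`), and `SRRXXXXXXX` and `gencode.transcripts.fa.gz` are placeholders for a real run accession and the GENCODE transcript FASTA. The function names `build_commands()` and `run_commands()` are hypothetical.

```r
# Build the command lines for one sequencing run (nothing is executed here).
build_commands <- function(run_id, transcripts_fa, index_dir = "salmon_index",
                           out_dir = "fastq", threads = 4) {
  # Assumption: single-end data, so fasterq-dump writes <run_id>.fastq
  fastq <- file.path(out_dir, paste0(run_id, ".fastq"))
  list(
    # Index the GENCODE transcriptome (only needs to be done once)
    index = c("salmon", "index", "-t", transcripts_fa, "-i", index_dir, "--gencode"),
    # Convert the SRA run to FASTQ
    dump  = c("fasterq-dump", run_id, "--outdir", out_dir, "--threads", threads),
    # Quantify against the index; "-l A" lets salmon infer the library type
    quant = c("salmon", "quant", "-i", index_dir, "-l", "A", "-r", fastq,
              "-p", threads, "-o", file.path("quant", run_id))
  )
}

# Run each command in turn and stop at the first failure.
run_commands <- function(cmds) {
  for (cmd in cmds) {
    status <- system2(cmd[1], args = shQuote(cmd[-1]))
    if (status != 0) stop("Command failed: ", paste(cmd, collapse = " "))
  }
  invisible(TRUE)
}

cmds <- build_commands("SRRXXXXXXX", "gencode.transcripts.fa.gz")
for (cmd in cmds) cat(paste(cmd, collapse = " "), "\n")
# run_commands(cmds)  # uncomment once the tools and real inputs are in place
```

Printing the commands first makes it easy to check them, or to paste them into a terminal instead of running them from R.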

If all of that is too difficult, then just download the normalised data (as TPM counts) - it is in the file GSE45510_99LMS.transcriptome_TPM_round.txt.gz on the home page (at bottom): https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE45510
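
A minimal sketch of loading that file in R, assuming it has been downloaded by hand from the GEO page into the working directory and that it is a tab-delimited table with a header row (the layout is not confirmed in this thread, so inspect the first lines before relying on it). The helper name `read_tpm_table()` is hypothetical. It checks that the file exists first, since a wrong working directory is one possible cause of a "does not exist or is non-readable" error.

```r
# Read a gzipped, tab-delimited expression table into a data.frame.
read_tpm_table <- function(path) {
  if (!file.exists(path)) {
    stop("File not found: ", path, " (working directory: ", getwd(), ")")
  }
  # Assumption: tab-separated with a header row
  read.delim(gzfile(path), check.names = FALSE, stringsAsFactors = FALSE)
}

tpm_file <- "GSE45510_99LMS.transcriptome_TPM_round.txt.gz"
if (file.exists(tpm_file)) {
  tpm <- read_tpm_table(tpm_file)
  print(dim(tpm))
  # Show the first rows of (at most) the first five columns
  print(head(tpm[, seq_len(min(5, ncol(tpm))), drop = FALSE]))
} else {
  message("Download ", tpm_file, " from the GEO page first, or check getwd().")
}
```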

ADD REPLY • link 5.1 years ago by Kevin Blighe 87k