Question: Generating Counts Data From Fastq Sequence Files
5.7 years ago, josph.sh wrote:

I'm new to sequencing and currently have several FASTQ files from sequencing experiments (sequenced on an Illumina MiSeq).

I was hoping to carry out some expression analysis (probably with edgeR), but first I need to generate a counts matrix from this data. Could somebody explain how to generate count data from FASTQ files?

5.7 years ago, Ashutosh Pandey (Philadelphia) wrote:
  1. You will first have to align the FASTQ files against the reference genome to produce SAM/BAM files. TopHat, STAR, and many other splice-aware RNA-seq aligners are available for this task. It is always good to preprocess your reads beforehand, including QC and trimming off low-quality bases.

  2. Then you need a tool that will generate count data for you. Basically, you provide the aligned BAM file and the gene annotation file (GFF3, GTF, or BED format) for your reference genome. HTSeq and Cufflinks are some of the tools available for this task. Search Biostar and you will find the names of other tools.
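
Once step 2 has produced one count file per sample, the files still have to be combined into a single genes-by-samples matrix before edgeR can use them. Below is a minimal Python sketch of that merge. It assumes each file is a two-column, tab-separated table (gene ID, count) in the style of htseq-count output, with summary rows such as `__no_feature` prefixed by a double underscore. The function names and the sample names `s1` and `s2` are made up for illustration and are not part of any tool mentioned above.

```python
import csv
import io
from pathlib import Path


def read_htseq_counts(handle):
    """Parse one per-sample table with gene_id<TAB>count on each line."""
    counts = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and summary rows (assumed to start with "__").
        if len(row) < 2 or row[0].startswith("__"):
            continue
        counts[row[0]] = int(row[1])
    return counts


def build_count_matrix(sample_counts):
    """Combine {sample: {gene: count}} into a header and rows.

    Genes missing from a sample are filled with 0.
    """
    samples = sorted(sample_counts)
    genes = sorted(set().union(*(c.keys() for c in sample_counts.values())))
    header = ["gene_id"] + samples
    rows = [[g] + [sample_counts[s].get(g, 0) for s in samples] for g in genes]
    return header, rows


def write_matrix(header, rows, handle):
    """Write the matrix as tab-separated text."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)


def merge_count_files(paths, out_path):
    """Read per-sample files (sample name = file stem) and write one matrix."""
    sample_counts = {}
    for p in map(Path, paths):
        with p.open(newline="") as fh:
            sample_counts[p.stem] = read_htseq_counts(fh)
    header, rows = build_count_matrix(sample_counts)
    with open(out_path, "w", newline="") as out:
        write_matrix(header, rows, out)


if __name__ == "__main__":
    # Tiny in-memory demo with two hypothetical samples.
    s1 = read_htseq_counts(io.StringIO("geneA\t10\ngeneB\t0\n__no_feature\t5\n"))
    s2 = read_htseq_counts(io.StringIO("geneA\t3\ngeneC\t7\n__ambiguous\t1\n"))
    header, rows = build_count_matrix({"s1": s1, "s2": s2})
    demo = io.StringIO()
    write_matrix(header, rows, demo)
    print(demo.getvalue(), end="")
```

The resulting tab-separated file can then be loaded in R (for example with `read.delim`) and passed to edgeR as the counts for a `DGEList`. Note that edgeR expects raw integer counts, so the per-sample files should hold unnormalized read counts.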


Thank you for your reply.

Reply written 5.7 years ago by josph.sh
5.7 years ago, Xingyu Yang (Atlanta) wrote:

http://www.nature.com/nprot/journal/v7/n3/abs/nprot.2012.016.html


Thanks for the link.

Reply written 5.7 years ago by josph.sh