Question: Converting from BED to SAF/GFF
0
3.5 years ago by
ccag30
Boston, MA
ccag30 wrote:

I called peaks using MACS2 and would now like to use my narrowPeak file as the annotation for featureCounts. As far as I can tell, featureCounts only accepts SAF or GFF formats. Does anyone know an easy way to convert a BED file to one of these formats for dm3?

featurecount gff bed saf • 5.0k views
modified 2.1 years ago by ATpoint36k • written 3.5 years ago by ccag30
4
2.1 years ago by
ATpoint36k
Germany
ATpoint36k wrote:
awk 'BEGIN {OFS="\t"} {print $1"."$2"."$3, $1, $2, $3, "."}' in.narrowPeak > out.saf

The first column is the identifier: either a gene name or, for genomic regions, simply the concatenated genomic coordinates, e.g. separated by dots, such as chr1.100.1000000. For genomic regions I always set the strand to "."; for genes one can set the strand on which the gene is located.
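
As a rough sketch of the column layout described above, the same conversion can be wrapped in a small shell function that also writes a header row. The function name narrowpeak_to_saf and the file names are made up for illustration, and the header row follows the SAF examples in the Subread guide; I am assuming featureCounts accepts it, so drop the header print if it does not.

```shell
# Sketch: narrowPeak/BED -> SAF (GeneID, Chr, Start, End, Strand).
# narrowpeak_to_saf is a hypothetical helper name, not part of any tool.
narrowpeak_to_saf() {
    awk 'BEGIN {
             OFS = "\t"
             # Header row as in the Subread guide examples (assumed acceptable to featureCounts).
             print "GeneID", "Chr", "Start", "End", "Strand"
         }
         # Skip blank lines and UCSC-style track/browser lines.
         NF >= 3 && $1 != "track" && $1 != "browser" {
             # Identifier = concatenated coordinates; strand "." for unstranded regions.
             print $1 "." $2 "." $3, $1, $2, $3, "."
         }' "$@"
}

# Example with placeholder file names:
#   narrowpeak_to_saf in.narrowPeak > out.saf
#   featureCounts -F SAF -a out.saf -o counts.txt sample.bam
```

Coordinates are passed through unchanged here, exactly as in the one-liner.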

modified 4 months ago • written 2.1 years ago by ATpoint36k

I am wondering whether $2 should be incremented by 1, because narrowPeak is 0-based. But I could hardly find any description of the SAF coordinate system :).

written 4 months ago by shangguandong19960
1

Section 6.2.2 of https://bioconductor.org/packages/release/bioc/vignettes/Rsubread/inst/doc/SubreadUsersGuide.pdf says that the SAF format is inclusive for both the start and the end coordinate. I interpret this as "leave narrowPeak as it is". I noticed that if you use e.g. bedtools makewindows to make windows across the whole genome, then transform them to SAF and add 1 to the start, you will not have 100% of reads assigned to that SAF but only 99.x%, because that one start nucleotide is missing. Without modifying the BED coordinates, 100% are assigned. I am not sure this is a correct statement; I am always a bit confused by these different coordinate systems. I also never understood why the featureCounts developers could not simply use BED format instead of SAF.
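
Since the thread does not settle which convention is right, here is a hedged sketch that makes the start offset an explicit argument, so both variants can be generated and compared on the same BAM file. The name bed_to_saf_shift is hypothetical. A shift of 0 reproduces the one-liner from the answer; a shift of 1 assumes that SAF starts are 1-based like GTF, which nothing in this thread confirms.

```shell
# bed_to_saf_shift SHIFT [FILE...]
# SHIFT is added to the BED start column (hypothetical helper, not part of any tool).
#   SHIFT=0: coordinates passed through unchanged, as in the answer.
#   SHIFT=1: assumes SAF uses 1-based starts (unconfirmed assumption).
bed_to_saf_shift() {
    start_shift="$1"
    shift
    awk -v s="$start_shift" 'BEGIN { OFS = "\t" }
         # The identifier keeps the original BED coordinates; only the Start column moves.
         NF >= 3 { print $1 "." $2 "." $3, $1, $2 + s, $3, "." }' "$@"
}

# Example with placeholder file names:
#   bed_to_saf_shift 0 windows.bed > windows.shift0.saf
#   bed_to_saf_shift 1 windows.bed > windows.shift1.saf
```

For a BED interval 0-100 (100 bases, end-exclusive), shift 0 gives an inclusive interval 0-100 and shift 1 gives 1-100, so the two files differ by exactly one position per region.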

modified 4 months ago • written 4 months ago by ATpoint36k

Thanks for your reply :). I will take a look at it.

written 4 months ago by shangguandong19960
0
3.5 years ago by
Jeffin Rockey1.1k
Karimannoor
Jeffin Rockey1.1k wrote:

Option 1: GenomeTools

Option 2: Go to the Galaxy oqtans page here; in the left pane there is a "GFF Toolkit", which has a BED_to_GFF3 converter.

Option 3: A longer route. From here, use bedToGenePred followed by genePredToGtf to get a GTF file first. Then you can use any of the many tools that do a GTF to GFF3 conversion.

I am mentioning multiple options because, depending on whether your BED file has 6 columns, 12 columns, or some other layout, some of them may not be applicable.
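
For simple 3- to 6-column BED files, a direct awk conversion is another possibility. This is only a minimal sketch, not what any of the tools above do internally: the function name bed_to_gff3, the source label, and the feature type "region" are arbitrary choices of mine. It relies on BED starts being 0-based and GFF3 starts being 1-based, so the start is incremented by 1 and the end is kept.

```shell
# Sketch: simple BED (3 to 6 columns) -> GFF3.
# bed_to_gff3 is a hypothetical helper; source and type labels are arbitrary.
bed_to_gff3() {
    awk 'BEGIN { OFS = "\t"; print "##gff-version 3" }
         NF >= 3 && $1 != "track" && $1 != "browser" {
             id = $1 "." $2 "." $3
             # Use column 6 as strand only if it is a valid strand symbol.
             strand = (NF >= 6 && ($6 == "+" || $6 == "-")) ? $6 : "."
             # BED start is 0-based, GFF3 start is 1-based: add 1. End is unchanged.
             print $1, "bed_to_gff3", "region", $2 + 1, $3, ".", strand, ".", "ID=" id
         }' "$@"
}

# Example with placeholder file names:
#   bed_to_gff3 in.bed > out.gff3
```

BED12 block structure (exons) is ignored here, so for 12-column files one of the options above is the better choice.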

modified 3.5 years ago • written 3.5 years ago by Jeffin Rockey1.1k