Question: Splice Junction file intersection with genome annotation
4.8 years ago, ruchiksy (Singapore) wrote:

Hello,

I have a tab-delimited splice junction file that looks like this:

chr1    11212    12009    1    1    0    0    2    48
chr1    11672    12009    1    1    0    0    1    31
chr1    11845    12009    1    1    0    0    1    28
chr1    12228    12612    1    1    1    0    1    32
chr1    12722    13220    1    1    1    0    3    9
chr1    14830    14969    2    2    1    0    218    50
chr1    15039    15795    2    2    1    0    98    50
chr1    15948    16606    2    2    1    1    10    48
chr1    16766    16857    2    2    1    0    24    44
chr1    16766    16875    2    2    0    0    2    36


The task is to filter out lines in which Column 6 has value 1, Column 7 has value 1 and Column 8 has value 10 or greater. 

I have been going through the bedtools documentation, but I am not sure how to get started and would appreciate a few pointers. My input file is tab-delimited, and I also have the GENCODE v19 GTF file for annotation.

Thanks!

*** Edit ***

Column 1: chromosome
Column 2: first base of the intron (1-based)
Column 3: last base of the intron (1-based)
Column 4: strand
Column 5: intron motif: 0: non-canonical; 1: GT/AG, 2: CT/AC, 3: GC/AG, 4: CT/GC, 5: AT/AC, 6: GT/AT
Column 6: 0: unannotated, 1: annotated (only if splice junctions database is used)
Column 7: number of uniquely mapping reads crossing the junction
Column 8: number of multi-mapping reads crossing the junction
Column 9: maximum spliced alignment overhang


Added the field names.
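
As a quick way to read files in this layout, here is a minimal awk sketch that adds a header row and replaces the numeric motif and annotation codes with the labels from the field list above. The header names and the output file name decoded.tsv are made up for illustration, and the strand code is passed through unchanged.

```shell
# Two junction lines taken from the example in the question, written tab-delimited.
printf 'chr1\t11212\t12009\t1\t1\t0\t0\t2\t48\n'    >  sj_two_lines.tab
printf 'chr1\t14830\t14969\t2\t2\t1\t0\t218\t50\n'  >> sj_two_lines.tab

awk -F'\t' -v OFS='\t' '
BEGIN {
    # Motif codes 1-6 exactly as listed in the field description; 0 is non-canonical.
    split("GT/AG CT/AC GC/AG CT/GC AT/AC GT/AT", motif, " ")
    motif[0] = "non-canonical"
    # Hypothetical header names, chosen only for readability.
    print "chrom", "intron_start", "intron_end", "strand_code", "motif",
          "annotation", "unique_reads", "multi_reads", "max_overhang"
}
{
    $5 = ($5 in motif) ? motif[$5] : "unknown"      # column 5: intron motif
    $6 = ($6 == 1) ? "annotated" : "unannotated"    # column 6: annotation flag
    print
}' sj_two_lines.tab > decoded.tsv

cat decoded.tsv
```

The same script can be pointed at a full junction file in place of sj_two_lines.tab.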

modified 4.0 years ago by shirley0818 • written 4.8 years ago by ruchiksy

Hello ruchiksy!

It appears that your post has been cross-posted to another site: SeqAnswers.

This is typically not recommended as it runs the risk of annoying people in both communities.

modified 4.8 years ago • written 4.8 years ago by Devon Ryan

Hi,

Could you please tell me where I can get the splice junction annotation file for human?

written 2.9 years ago by Govardhan Anande130

You can download a GTF file from Ensembl.

written 2.9 years ago by Devon Ryan
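
If you need the annotated junctions themselves rather than the whole GTF, one possible approach is to derive introns from consecutive exons of each transcript. The sketch below runs on a made-up three-exon transcript; toy.gtf, the transcript ID T1, and the output name annotated_introns.tab are all hypothetical. It assumes a standard nine-column GTF with a transcript_id attribute on every exon line.

```shell
# Toy GTF: one hypothetical transcript with three exons (coordinates are invented).
# printf reuses the format string for each pair of start/end arguments.
printf 'chr1\ttoy\texon\t%s\t%s\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n' \
    100 200 300 400 500 600 > toy.gtf

# Step 1: keep exon records and pull out transcript ID, chromosome, start, end, strand.
awk -F'\t' -v OFS='\t' '
$3 == "exon" && match($9, /transcript_id "[^"]+"/) {
    # Skip the 15 characters of the prefix (transcript_id ") and the closing quote.
    tid = substr($9, RSTART + 15, RLENGTH - 16)
    print tid, $1, $4, $5, $7
}' toy.gtf |
# Step 2: order exons by transcript, then by start coordinate.
sort -k1,1 -k3,3n |
# Step 3: an intron is the gap between consecutive exons of the same transcript.
awk -F'\t' -v OFS='\t' '
$1 == prev_tid { print $2, prev_end + 1, $3 - 1, $5 }
{ prev_tid = $1; prev_end = $4 }' |
# Step 4: introns shared by several transcripts are reported once.
sort -u > annotated_introns.tab

cat annotated_introns.tab
```

The intron start and end come out 1-based and inclusive, which is the same convention as columns 2 and 3 of the junction file in the question. Note that Ensembl GTFs name chromosomes without the chr prefix (1 rather than chr1), so the names may need harmonizing before comparing against a chr-style file.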
4.8 years ago, Devon Ryan (Freiburg, Germany) wrote:

Since this isn't a BED file, it'd be extra work to get bedtools to deal with it. Just use awk:

awk '{if($6!=1 && $7!=1 && $8<10) print $0}' original.txt > filtered.txt

For your example, that would print:

chr1    11212    12009    1    1    0    0    2    48
chr1    11672    12009    1    1    0    0    1    31
chr1    11845    12009    1    1    0    0    1    28
chr1    16766    16875    2    2    0    0    2    36
written 4.8 years ago by Devon Ryan

It worked, thanks Mr. Devon. One small correction: I wanted read counts greater than 10, not less than 10, so I changed the comparison when I ran it.

written 4.8 years ago by ruchiksy
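
For reference, here is a sketch of the filter under both readings of "filter out", run on the ten example lines from the question. The file names sj_example.tab, kept.tab, and dropped.tab are placeholders. It uses >= 10 as in the original question; change it to > 10 if you want strictly greater than 10.

```shell
# Recreate the ten example junctions as a tab-delimited file (placeholder name).
tr ' ' '\t' > sj_example.tab <<'EOF'
chr1 11212 12009 1 1 0 0 2 48
chr1 11672 12009 1 1 0 0 1 31
chr1 11845 12009 1 1 0 0 1 28
chr1 12228 12612 1 1 1 0 1 32
chr1 12722 13220 1 1 1 0 3 9
chr1 14830 14969 2 2 1 0 218 50
chr1 15039 15795 2 2 1 0 98 50
chr1 15948 16606 2 2 1 1 10 48
chr1 16766 16857 2 2 1 0 24 44
chr1 16766 16875 2 2 0 0 2 36
EOF

# Reading A: keep only lines where column 6 is 1, column 7 is 1, and column 8 is at least 10.
awk -F'\t' '$6 == 1 && $7 == 1 && $8 >= 10' sj_example.tab > kept.tab

# Reading B: remove exactly those lines and keep everything else
# by negating the whole condition.
awk -F'\t' '!($6 == 1 && $7 == 1 && $8 >= 10)' sj_example.tab > dropped.tab

wc -l kept.tab dropped.tab
```

On the example, kept.tab contains only the chr1 15948 16606 junction and dropped.tab contains the other nine lines. Negating the whole condition is not the same as negating each comparison separately: $6!=1 && $7!=1 && $8<10 keeps only the lines that fail all three tests.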
4.8 years ago, Ann (Concord NC USA) wrote:

It's hard to tell from your example what each field is meant to represent as there are many possible ways you could use BED format to indicate splicing patterns. If you can give more detail it will be easier to recommend your next step.

written 4.8 years ago by Ann

I have just added the field names; I should have done that in the first place. Thanks!

written 4.8 years ago by ruchiksy
4.0 years ago, shirley0818 (United States) wrote:

Hi ruchiksy,

I found your input BED file quite useful. May I ask which tool you used to obtain your splice junction file?

Many thanks,

Shirley

written 4.0 years ago by shirley0818

I think it's STAR.

written 3.9 years ago by Lhl730

Definitely looks like my recent STAR output to me, in case another vote was needed.

written 3.6 years ago by calizarr10