Question

How to convert NGS .tabular file to excel file?

0

Entering edit mode

6.4 years ago

Glory Basumata ▴ 140

I am using Galaxy for the NGS analysis. When I use tools like featureCounts, DESeq, etc., it gives output file in .tabular format. I need to see this data in a well-organized table in Microsoft Excel (xlsx) format. Could anyone help, please?

RNA-Seq tabular excel format galaxy • 6.8k views

ADD COMMENT • link updated 6.3 years ago by writetoroopali ▴ 20 • written 6.4 years ago by Glory Basumata ▴ 140

2

Entering edit mode

try importing it just the way you import a tab separated file in excel and also post few lines from example file, from next time Glory Basumata

ADD REPLY • link 6.4 years ago by cpad0112 21k

6

Entering edit mode

enter image description here

ADD REPLY • link 6.4 years ago by Pierre Lindenbaum 165k

0

Entering edit mode

Hi all, Thanks for the instructions. I will try this.

@Pierre: I like the cartoon icon :) Will try opening the file in R instead.

ADD REPLY • link 6.4 years ago by Glory Basumata ▴ 140

0

Entering edit mode

Look at this post to see a live example of what Excel does to some gene names.

ADD REPLY • link 6.4 years ago by h.mon 35k

0

Entering edit mode

Hi All, Thank you for your responses to my question. I am sharing the DESeq2 result output file so that you could try opening it. I am trying to filter out top 10 to 50 genes which are either upregulated or downregulated for this file.

Link to file: https://drive.google.com/file/d/1ItlSqE8jH645z_REd2gc_oVJo6_CaHPf/view?usp=sharing

ADD REPLY • link 6.4 years ago by Glory Basumata ▴ 140

1

Entering edit mode

Hello Glory Basumata ,

you should better start a new thread for this as it doesn't fit to your original question. But before doing it, try to solve this by yourself as this is a trivial task with unix commands sort and heador tail.

fin swimmer

ADD REPLY • link 6.4 years ago by finswimmer 16k

0

Entering edit mode

Hey can you please tell me how did you open the .tabular file in R or excel?

ADD REPLY • link 6.3 years ago by writetoroopali ▴ 20

0

Entering edit mode

What have you tried? Did you check with a simple read.table?

ADD REPLY • link 6.3 years ago by Ram 44k

score 2 · Answer 1 · 2018-09-18

2

Entering edit mode

6.4 years ago

finswimmer 16k

Hello Glory Basumata ,

first of all: You don't realy want to do this. It is very likely, that at a later points you will convert it back to something, that a too, which you like to use for further analysis, needs and then the nightmare begins ...

If you don't take care of my warning than you simply can open those files in excel. It will ask you how to convert and you have to select that those file are tab delimited. In the next step you can choose what is the decimal seperator.

But you have to be aware that excel also might try to convert some values to something it believe it could be. So e.g. a gene called SEP09can become 01.09.2009.

So again: DON'T DO THIS!

fin swimmer

ADD COMMENT • link 6.4 years ago by finswimmer 16k

0

Entering edit mode

Hi finswimmer,

Thank you for the warning. I appreciate your concern. I understand that excel is not the right tool for NGS study. My main intention to open/visualize tabular file is to check the DESeq result and filter out the differentially expressed gene. I also want to filter out the top 10 or 20 or 50 upregulated and downregulated genes. Furthermore, I would like to do GO (Gene Ontology) enrichment using GOseq. Kindly suggest a tool to study such tabular files. Thank you.

ADD REPLY • link 6.4 years ago by Glory Basumata ▴ 140

1

Entering edit mode

All of that can and should be done in R. It has a steeper learning curve than just using Excel, but investing time now will definitely pay off in the long run.

ADD REPLY • link 6.4 years ago by WouterDeCoster 47k

1

Entering edit mode

You can use any tool that treats the file as a plain text file and the data in it as text. Excel does not do the latter implicitly. Read this paper to see a few examples of what can happen.

ADD REPLY • link 6.4 years ago by Ram 44k

0

Entering edit mode

Thanks for sharing the paper. I'll add it to my knowledge and personal folder.

ADD REPLY • link 6.4 years ago by Glory Basumata ▴ 140

score 2 · Answer 2 · 2018-09-18

2

Entering edit mode

6.4 years ago

Paul ★ 1.5k

Hi,

here is an python script which takes as first argument TSV and second argument creates xlsx file.

import csv
import sys
reload(sys)
sys.setdefaultencoding('utf8')
from xlsxwriter.workbook import Workbook

# Add some command-line logic to read the file names.
tsv_file = sys.argv[1]
xlsx_file = sys.argv[2]

# Create an XlsxWriter workbook object and add a worksheet.
workbook = Workbook(xlsx_file)
worksheet = workbook.add_worksheet()

# Create a TSV file reader.
tsv_reader = csv.reader(open(tsv_file, 'rb'), delimiter='\t')

# Read the row data from the TSV file and write it to the XLSX file.
for row, data in enumerate(tsv_reader):
    worksheet.write_row(row, 0, data)

# Close the XLSX file.
workbook.close()

You can save this code to file as conversion.py and make executable:

chmod +x conversion.py

Then run it as:

./conversion.py input.tsv output.xlsx

ADD COMMENT • link 6.4 years ago by Paul ★ 1.5k

2

Entering edit mode

What is the advantage of this script over a plain import? Does this bypass the auto datatype assignment that Excel does?

ADD REPLY • link 6.4 years ago by Ram 44k

0

Entering edit mode

Hi Paul, thanks for sharing the script. However, does it work with the format (.tabular)? Also how useful is this?

ADD REPLY • link 6.4 years ago by Glory Basumata ▴ 140

2

Entering edit mode

.tabular is a file extension, not a data format. Your data format is probably tab-separated values, and this script should work on any tab-separated dataset.

ADD REPLY • link 6.4 years ago by Ram 44k

score 2 · Answer 3 · 2018-09-18

2

Entering edit mode

6.4 years ago

swbarnes2 14k

Have you tried just opening the output in Excel? It's likely just tab delimited text, Excel shouldn't have an issue with that. Just pay attention to what Excel does to your gene names.

ADD COMMENT • link 6.4 years ago by swbarnes2 14k

0

Entering edit mode

Unless of course, the data has an unusable number of rows or columns. Or the process of Excel-opening it crashes the app or worse, the computer. Better off opening a 10kb chunk of it.

ADD REPLY • link 6.4 years ago by Ram 44k

0

Entering edit mode

In which case, converting it's format won't help. The poster in that case will need to use R, or a database program. But the poster almost certainly does not need to actually convert their file to .xlsx format.

ADD REPLY • link 6.4 years ago by swbarnes2 14k

0

Entering edit mode

I'm not saying "converting format will help". I'm saying "Don't use Excel even for opening/viewing the file"

ADD REPLY • link 6.4 years ago by Ram 44k

0

Entering edit mode

Hi swbarnes2, I tried opening it with excel but it just refuses to recognize the file extension .tabular. Howeve, this file type/extention could be imported in the program R and viewed.

ADD REPLY • link 6.4 years ago by Glory Basumata ▴ 140