How to convert NGS .tabular file to excel file?
3
0
Entering edit mode
3.1 years ago

I am using Galaxy for the NGS analysis. When I use tools like featureCounts, DESeq, etc., it gives output file in .tabular format. I need to see this data in a well-organized table in Microsoft Excel (xlsx) format. Could anyone help, please?

RNA-Seq tabular excel format galaxy • 3.3k views
2
Entering edit mode

try importing it just the way you import a tab separated file in excel and also post few lines from example file, from next time Glory Basumata

6
Entering edit mode

0
Entering edit mode

Hi all, Thanks for the instructions. I will try this.

@Pierre: I like the cartoon icon :) Will try opening the file in R instead.

0
Entering edit mode

Look at this post to see a live example of what Excel does to some gene names.

0
Entering edit mode

Hi All, Thank you for your responses to my question. I am sharing the DESeq2 result output file so that you could try opening it. I am trying to filter out top 10 to 50 genes which are either upregulated or downregulated for this file.

1
Entering edit mode

Hello Glory Basumata ,

you should better start a new thread for this as it doesn't fit to your original question. But before doing it, try to solve this by yourself as this is a trivial task with unix commands sort and heador tail.

fin swimmer

0
Entering edit mode

Hey can you please tell me how did you open the .tabular file in R or excel?

0
Entering edit mode

What have you tried? Did you check with a simple read.table?

2
Entering edit mode
3.1 years ago

Hello Glory Basumata ,

first of all: You don't realy want to do this. It is very likely, that at a later points you will convert it back to something, that a too, which you like to use for further analysis, needs and then the nightmare begins ...

If you don't take care of my warning than you simply can open those files in excel. It will ask you how to convert and you have to select that those file are tab delimited. In the next step you can choose what is the decimal seperator.

But you have to be aware that excel also might try to convert some values to something it believe it could be. So e.g. a gene called SEP09can become 01.09.2009.

So again: DON'T DO THIS!

fin swimmer

0
Entering edit mode

Hi finswimmer,

Thank you for the warning. I appreciate your concern. I understand that excel is not the right tool for NGS study. My main intention to open/visualize tabular file is to check the DESeq result and filter out the differentially expressed gene. I also want to filter out the top 10 or 20 or 50 upregulated and downregulated genes. Furthermore, I would like to do GO (Gene Ontology) enrichment using GOseq. Kindly suggest a tool to study such tabular files. Thank you.

1
Entering edit mode

All of that can and should be done in R. It has a steeper learning curve than just using Excel, but investing time now will definitely pay off in the long run.

1
Entering edit mode

You can use any tool that treats the file as a plain text file and the data in it as text. Excel does not do the latter implicitly. Read this paper to see a few examples of what can happen.

0
Entering edit mode

Thanks for sharing the paper. I'll add it to my knowledge and personal folder.

2
Entering edit mode
3.1 years ago
Paul ★ 1.4k

Hi,

here is an python script which takes as first argument TSV and second argument creates xlsx file.

import csv
import sys
sys.setdefaultencoding('utf8')
from xlsxwriter.workbook import Workbook

# Add some command-line logic to read the file names.
tsv_file = sys.argv[1]
xlsx_file = sys.argv[2]

# Create an XlsxWriter workbook object and add a worksheet.
workbook = Workbook(xlsx_file)

# Create a TSV file reader.

# Read the row data from the TSV file and write it to the XLSX file.
for row, data in enumerate(tsv_reader):
worksheet.write_row(row, 0, data)

# Close the XLSX file.
workbook.close()


You can save this code to file as conversion.py and make executable:

chmod +x conversion.py


Then run it as:

./conversion.py input.tsv output.xlsx

2
Entering edit mode

What is the advantage of this script over a plain import? Does this bypass the auto datatype assignment that Excel does?

0
Entering edit mode

Hi Paul, thanks for sharing the script. However, does it work with the format (.tabular)? Also how useful is this?

2
Entering edit mode

.tabular is a file extension, not a data format. Your data format is probably tab-separated values, and this script should work on any tab-separated dataset.

2
Entering edit mode
3.1 years ago

Have you tried just opening the output in Excel? It's likely just tab delimited text, Excel shouldn't have an issue with that. Just pay attention to what Excel does to your gene names.

0
Entering edit mode

Unless of course, the data has an unusable number of rows or columns. Or the process of Excel-opening it crashes the app or worse, the computer. Better off opening a 10kb chunk of it.

0
Entering edit mode

In which case, converting it's format won't help. The poster in that case will need to use R, or a database program. But the poster almost certainly does not need to actually convert their file to .xlsx format.

0
Entering edit mode

I'm not saying "converting format will help". I'm saying "Don't use Excel even for opening/viewing the file"

0
Entering edit mode

Hi swbarnes2, I tried opening it with excel but it just refuses to recognize the file extension .tabular. Howeve, this file type/extention could be imported in the program R and viewed.