Question: How to convert NGS .tabular file to excel file?
0
gravatar for Glory Basumata
12 weeks ago by
India/Guwahati/Indian Institute of Technology Guwahati
Glory Basumata10 wrote:

I am using Galaxy for the NGS analysis. When I use tools like featureCounts, DESeq, etc., it gives output file in .tabular format. I need to see this data in a well-organized table in Microsoft Excel (xlsx) format. Could anyone help, please?

ADD COMMENTlink modified 26 days ago by writetoroopali0 • written 12 weeks ago by Glory Basumata10
2

try importing it just the way you import a tab separated file in excel and also post few lines from example file, from next time Glory Basumata

ADD REPLYlink written 12 weeks ago by cpad011210k
6

enter image description here

ADD REPLYlink written 12 weeks ago by Pierre Lindenbaum115k

Hi all, Thanks for the instructions. I will try this.

@Pierre: I like the cartoon icon :) Will try opening the file in R instead.

ADD REPLYlink written 12 weeks ago by Glory Basumata10

Look at this post to see a live example of what Excel does to some gene names.

ADD REPLYlink modified 12 weeks ago • written 12 weeks ago by h.mon21k

Hi All, Thank you for your responses to my question. I am sharing the DESeq2 result output file so that you could try opening it. I am trying to filter out top 10 to 50 genes which are either upregulated or downregulated for this file.

Link to file: https://drive.google.com/file/d/1ItlSqE8jH645z_REd2gc_oVJo6_CaHPf/view?usp=sharing

ADD REPLYlink written 12 weeks ago by Glory Basumata10
1

Hello Glory Basumata ,

you should better start a new thread for this as it doesn't fit to your original question. But before doing it, try to solve this by yourself as this is a trivial task with unix commands sort and heador tail.

fin swimmer

ADD REPLYlink written 12 weeks ago by finswimmer8.0k

Hey can you please tell me how did you open the .tabular file in R or excel?

ADD REPLYlink written 26 days ago by writetoroopali0

What have you tried? Did you check with a simple read.table?

ADD REPLYlink written 25 days ago by RamRS19k
2
gravatar for finswimmer
12 weeks ago by
finswimmer8.0k
Germany
finswimmer8.0k wrote:

Hello Glory Basumata ,

first of all: You don't realy want to do this. It is very likely, that at a later points you will convert it back to something, that a too, which you like to use for further analysis, needs and then the nightmare begins ...

If you don't take care of my warning than you simply can open those files in excel. It will ask you how to convert and you have to select that those file are tab delimited. In the next step you can choose what is the decimal seperator.

But you have to be aware that excel also might try to convert some values to something it believe it could be. So e.g. a gene called SEP09can become 01.09.2009.

So again: DON'T DO THIS!

fin swimmer

ADD COMMENTlink written 12 weeks ago by finswimmer8.0k

Hi finswimmer,

Thank you for the warning. I appreciate your concern. I understand that excel is not the right tool for NGS study. My main intention to open/visualize tabular file is to check the DESeq result and filter out the differentially expressed gene. I also want to filter out the top 10 or 20 or 50 upregulated and downregulated genes. Furthermore, I would like to do GO (Gene Ontology) enrichment using GOseq. Kindly suggest a tool to study such tabular files. Thank you.

ADD REPLYlink written 12 weeks ago by Glory Basumata10
1

All of that can and should be done in R. It has a steeper learning curve than just using Excel, but investing time now will definitely pay off in the long run.

ADD REPLYlink written 12 weeks ago by WouterDeCoster35k
1

You can use any tool that treats the file as a plain text file and the data in it as text. Excel does not do the latter implicitly. Read this paper to see a few examples of what can happen.

ADD REPLYlink modified 12 weeks ago • written 12 weeks ago by RamRS19k

Thanks for sharing the paper. I'll add it to my knowledge and personal folder.

ADD REPLYlink written 12 weeks ago by Glory Basumata10
2
gravatar for Paul
12 weeks ago by
Paul1.2k
European Union
Paul1.2k wrote:

Hi,

here is an python script which takes as first argument TSV and second argument creates xlsx file.

import csv
import sys
reload(sys)
sys.setdefaultencoding('utf8')
from xlsxwriter.workbook import Workbook

# Add some command-line logic to read the file names.
tsv_file = sys.argv[1]
xlsx_file = sys.argv[2]

# Create an XlsxWriter workbook object and add a worksheet.
workbook = Workbook(xlsx_file)
worksheet = workbook.add_worksheet()

# Create a TSV file reader.
tsv_reader = csv.reader(open(tsv_file, 'rb'), delimiter='\t')

# Read the row data from the TSV file and write it to the XLSX file.
for row, data in enumerate(tsv_reader):
    worksheet.write_row(row, 0, data)

# Close the XLSX file.
workbook.close()

You can save this code to file as conversion.py and make executable:

chmod +x conversion.py

Then run it as:

./conversion.py input.tsv output.xlsx
ADD COMMENTlink written 12 weeks ago by Paul1.2k
2

What is the advantage of this script over a plain import? Does this bypass the auto datatype assignment that Excel does?

ADD REPLYlink written 12 weeks ago by RamRS19k

Hi Paul, thanks for sharing the script. However, does it work with the format (.tabular)? Also how useful is this?

ADD REPLYlink written 12 weeks ago by Glory Basumata10
2

.tabular is a file extension, not a data format. Your data format is probably tab-separated values, and this script should work on any tab-separated dataset.

ADD REPLYlink written 12 weeks ago by RamRS19k
2
gravatar for swbarnes2
12 weeks ago by
swbarnes24.5k
United States
swbarnes24.5k wrote:

Have you tried just opening the output in Excel? It's likely just tab delimited text, Excel shouldn't have an issue with that. Just pay attention to what Excel does to your gene names.

ADD COMMENTlink written 12 weeks ago by swbarnes24.5k

Unless of course, the data has an unusable number of rows or columns. Or the process of Excel-opening it crashes the app or worse, the computer. Better off opening a 10kb chunk of it.

ADD REPLYlink written 12 weeks ago by RamRS19k

In which case, converting it's format won't help. The poster in that case will need to use R, or a database program. But the poster almost certainly does not need to actually convert their file to .xlsx format.

ADD REPLYlink written 12 weeks ago by swbarnes24.5k

I'm not saying "converting format will help". I'm saying "Don't use Excel even for opening/viewing the file"

ADD REPLYlink written 12 weeks ago by RamRS19k

Hi swbarnes2, I tried opening it with excel but it just refuses to recognize the file extension .tabular. Howeve, this file type/extention could be imported in the program R and viewed.

ADD REPLYlink written 12 weeks ago by Glory Basumata10
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1030 users visited in the last hour