Question: fastq file data
2.0 years ago by
Sam130
Sam130 wrote:

Hello

How can I show the nucleotide distribution and the sequence length distribution of multiple fastq files (1.fq, 2.fq, ..., 6.fq) in just 2 separate graphs?

Thanks

rna-seq fastq • 920 views
ADD COMMENTlink modified 2.0 years ago by Istvan Albert ♦♦ 81k • written 2.0 years ago by Sam130

What have you tried/found so far?

ADD REPLYlink written 2.0 years ago by Pierre Lindenbaum124k

The FASTX-Toolkit, but it works on just one library at a time and I want to merge the data from all libraries into one graph.

ADD REPLYlink written 2.0 years ago by Sam130

What about merging the fastq files before running fastx?

ADD REPLYlink written 2.0 years ago by Pierre Lindenbaum124k

I don't want to merge all reads. I just want to show the sequence lengths of multiple fq files separately (for instance, one color per file) in one graph.

ADD REPLYlink written 2.0 years ago by Sam130

I don't want to merge all reads

http://hannonlab.cshl.edu/fastx_toolkit/commandline.html

"Tools can read from STDIN"

I think you should consider this option.

ADD REPLYlink written 2.0 years ago by Pierre Lindenbaum124k

I should use the output of fastx_quality_stats as input for the FASTA/Q Nucleotide Distribution tool, so is this the right command?

fastx_nucleotide_distribution_graph.sh -i file1.TXT file2.TXT file3.TXT [-t TITLE]  [-o OUTPUT]
ADD REPLYlink written 2.0 years ago by Sam130

No harm in trying the command out :)

ADD REPLYlink written 2.0 years ago by genomax74k

I think it should be something like

gunzip -c *.fq.gz | fastx_nucleotide_distribution_graph.sh -o OUTPUT
ADD REPLYlink modified 2.0 years ago • written 2.0 years ago by Pierre Lindenbaum124k

This is probably about Illumina data, but it never hurts to mention this!

ADD REPLYlink written 2.0 years ago by WouterDeCoster42k

The commands above do not work. Apart from the quick replies about "hurts" and "harm", could you help me find a way?

ADD REPLYlink written 2.0 years ago by Sam130

You didn't tell us where you got the data from. So, Illumina data?

The commands above do not work

You'll have to tell us a bit more about how those don't work.

ADD REPLYlink written 2.0 years ago by WouterDeCoster42k

The commands above do not work

ADD REPLYlink written 2.0 years ago by Pierre Lindenbaum124k

But I explained the whole story: fastx_nucleotide_distribution_graph.sh only takes the TXT output file of fastx_quality_stats as input. With one txt input file fastx_nucleotide_distribution_graph.sh works well, but with two input files (-i file1.txt file2.txt) I got this error:

gnuplot> set term png size 1048,768
                  ^
         line 0: unknown or ambiguous terminal type; type just 'set terminal' for a list

WARNING: Plotting with an 'unknown' terminal.

These are Illumina fq files. Could you suggest other scripts?

ADD REPLYlink modified 2.0 years ago • written 2.0 years ago by Sam130

Anything wrong with FastQC?

ADD REPLYlink written 2.0 years ago by WouterDeCoster42k

No, everything is OK with FastQC. I can post only 5 posts per 6 hours; is there any way to increase that limit?

ADD REPLYlink written 2.0 years ago by Sam130

I can post only 5 posts per 6 hours; is there any way to increase that limit?

After you have been on Biostars for a while this restriction will be removed.

ADD REPLYlink written 2.0 years ago by WouterDeCoster42k

fastx calls gnuplot; judging by the error, your gnuplot installation seems to be incomplete (it does not recognize the png terminal):

https://stackoverflow.com/questions/22816030

ADD REPLYlink written 2.0 years ago by Pierre Lindenbaum124k

It works without any problem with -i file1.txt, and I get a nice plot as output. The problem is that fastx_nucleotide_distribution_graph.sh is not compatible with more than one input file, so I need to find alternative scripts to merge the nucleotide distribution graphs from distinct fq files.

ADD REPLYlink modified 2.0 years ago • written 2.0 years ago by Sam130
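
One possible alternative, independent of the fastx scripts, is to count the bases per read position for each file with awk and write everything into a single table. This is only a rough sketch: the function name fastq_base_table and the output layout (file, position, base, count) are made up for illustration, and it assumes uncompressed FASTQ files with exactly 4 lines per record.

```shell
#!/bin/sh
# Hypothetical helper: per-position base counts for several FASTQ files.
# Output is tab-separated: file, position, base, count.
# Assumes plain 4-line FASTQ records (sequences are not wrapped).
fastq_base_table() {
    for fq in "$@"; do
        awk -v name="$fq" '
            NR % 4 == 2 {                      # line 2 of each record is the sequence
                n = length($0)
                if (n > maxlen) maxlen = n
                for (i = 1; i <= n; i++)
                    count[i, toupper(substr($0, i, 1))]++
            }
            END {
                nb = split("A C G T N", bases, " ")
                for (i = 1; i <= maxlen; i++)
                    for (b = 1; b <= nb; b++)
                        printf "%s\t%d\t%s\t%d\n", name, i, bases[b], count[i, bases[b]] + 0
            }
        ' "$fq"
    done
}

# Example: fastq_base_table 1.fq 2.fq 3.fq 4.fq 5.fq 6.fq > bases.tsv
```

Because the file name is the first column, the resulting table can be loaded into R, gnuplot or a spreadsheet and plotted with one color or panel per file.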

If you need just one plot for the entire dataset, then cat'ing the fastq files together to generate one input (per read, R1/R2) may be the way to go.

ADD REPLYlink written 2.0 years ago by genomax74k

I am thinking of using awk to get the sequence length frequencies for each fq file, and then merging them in Excel to get a single sequence length graph with a different color for each library.

ADD REPLYlink modified 2.0 years ago • written 2.0 years ago by Sam130
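
The awk idea above could look roughly like the following. It is a minimal sketch, not a tested pipeline: the function name fastq_length_table is hypothetical, and it assumes uncompressed FASTQ files with exactly 4 lines per record. It prints one row per file and read length, so the table can be opened in Excel and plotted with one series per library.

```shell
#!/bin/sh
# Hypothetical helper: read-length frequencies for several FASTQ files.
# Output is tab-separated: file, read length, number of reads.
# Assumes plain 4-line FASTQ records (sequences are not wrapped).
fastq_length_table() {
    for fq in "$@"; do
        awk -v name="$fq" '
            NR % 4 == 2 { count[length($0)]++ }   # line 2 of each record is the sequence
            END {
                for (len in count)
                    printf "%s\t%d\t%d\n", name, len, count[len]
            }
        ' "$fq" | sort -t "$(printf '\t')" -k2,2n   # order rows by read length
    done
}

# Example: fastq_length_table 1.fq 2.fq 3.fq 4.fq 5.fq 6.fq > lengths.tsv
```

For gzipped files, the same awk program could read from `gunzip -c file.fq.gz` instead of taking the file as an argument.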
2.0 years ago by
Istvan Albert ♦♦ 81k
University Park, USA
Istvan Albert ♦♦ 81k wrote:

FastQC produces both plots.

ADD COMMENTlink written 2.0 years ago by Istvan Albert ♦♦ 81k