Question: Aligning multiple FASTQ files
23 months ago, me.sr1510 wrote:

Can we align two FASTQ files against one another without using a reference? What are the other options, apart from comparing the two VCF files generated with samtools? These are the results of Illumina whole-genome sequencing.

Kindly advise.

alignment • 1.3k views
modified 22 months ago by Biostar ♦♦ (20) • written 23 months ago by me.sr1510 (0)

What is your end goal?

written 23 months ago by Devon Ryan (91k)

To align the FASTQ files and get the uncommon genes between them.

written 23 months ago by me.sr1510 (0)

The goal is fine, but the way you want to go about it seems very odd. You could compare the two files, remove the reads that appear in both, and then look at what is left, though that approach is not without its own problems.
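As a rough illustration of that idea, here is a minimal Python sketch that keeps only the read sequences found in one file but not the other. It assumes plain 4-line FASTQ records and exact sequence matching; the function names and the toy reads are made up for the example. Exact matching means a read carrying a single sequencing error, or sequenced from the opposite strand, counts as "unique", which is one of the problems with this approach.

```python
import io

def read_sequences(handle):
    """Yield the sequence line of each 4-line FASTQ record."""
    for i, line in enumerate(handle):
        if i % 4 == 1:  # line 2 of every record holds the bases
            yield line.strip()

def unique_reads(handle_a, handle_b):
    """Return (sequences only in A, sequences only in B) by exact match."""
    seqs_a = set(read_sequences(handle_a))
    seqs_b = set(read_sequences(handle_b))
    return seqs_a - seqs_b, seqs_b - seqs_a

# Toy in-memory FASTQ data (hypothetical reads, for illustration only).
sample_a = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nGGCC\n+\nIIII\n")
sample_b = io.StringIO("@r1\nACGT\n+\nIIII\n@r3\nTTAA\n+\nIIII\n")

only_a, only_b = unique_reads(sample_a, sample_b)
print(sorted(only_a), sorted(only_b))  # ['GGCC'] ['TTAA']
```

For real files you would pass handles from `open(...)` instead of `io.StringIO`, keeping in mind that both sets of sequences are held in memory.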

modified 23 months ago • written 23 months ago by genomax (69k)

I would like to know whether there is any way to compare the two files with each other.

written 23 months ago by me.sr1510 (0)

Comparing raw data is not going to be of much help. Your best bet is to follow @Brian's suggestion below.

written 23 months ago by genomax (69k)

Define "uncommon genes". Are these genes that have differences in coverage, difference in sequence, exist in one but not the other according to de novo assembly, something else completely...?

written 23 months ago by Devon Ryan (91k)

Difference in sequence

written 23 months ago by me.sr1510 (0)

Not to mention that the average FASTQ file contains thousands, if not millions, of separate reads, so you'd have an impossibly large dataset to analyse even if you could align them all against one another.
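To put a number on that: aligning every read in one file against every read in the other means one comparison per read pair, so the work grows as the product of the two read counts. A quick back-of-the-envelope sketch in Python (the read counts are illustrative assumptions, not taken from the poster's data):

```python
def pairwise_alignments(reads_a, reads_b):
    """Number of read pairs when every read in A is aligned to every read in B."""
    return reads_a * reads_b

# Illustrative read counts per file (assumed values).
for n in (10_000, 1_000_000, 100_000_000):
    pairs = pairwise_alignments(n, n)
    print(f"{n:>11,} reads per file -> {pairs:.1e} read pairs")
```

At a million reads per file that is already 10^12 read pairs.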

written 22 months ago by jrj.healey (13k)

It seems like it makes more sense for your goal to assemble both datasets, call and annotate the genes, and then compare the annotated genes.
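The final comparison step could look something like the sketch below. It assumes the assembly and annotation have already been done with whatever tools you choose, that each sample's genes have been loaded into a dictionary of gene name to sequence, and that gene names are comparable between the two annotations. The function name and the toy genes are hypothetical.

```python
def compare_gene_sets(genes_a, genes_b):
    """Compare two {gene_name: sequence} dicts from annotated assemblies.

    Returns (genes only in A, genes only in B, shared genes whose sequences differ).
    """
    names_a, names_b = set(genes_a), set(genes_b)
    only_a = sorted(names_a - names_b)
    only_b = sorted(names_b - names_a)
    differing = sorted(g for g in names_a & names_b if genes_a[g] != genes_b[g])
    return only_a, only_b, differing

# Toy annotated gene sets (made-up names and sequences, for illustration only).
sample_a = {"geneX": "ATGAAATAG", "geneY": "ATGCCCTAG", "geneZ": "ATGGGGTAG"}
sample_b = {"geneX": "ATGAAATAG", "geneY": "ATGCCATAG", "geneW": "ATGTTTTAG"}

only_a, only_b, differing = compare_gene_sets(sample_a, sample_b)
print(only_a, only_b, differing)  # ['geneZ'] ['geneW'] ['geneY']
```

The third list covers the "difference in sequence" case; the first two cover genes present in one assembly but not the other.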

Please update your question with more specifics about what kind of organisms and data you have, and your specific goal.

written 23 months ago by Brian Bushnell (16k)

Since you do not have a reference (or do not want to use one), this would be de novo assembly. You would not get genes, as there is no reference; instead you would get scaffolds and/or contigs. From those you may then get the scaffolds and/or contigs that differ between the samples. Aligning one raw file against another raw file (especially raw reads) does not make sense to me, at least to my limited knowledge.

written 23 months ago by cpad0112 (11k)