Question: trim galore for 10Xgenomics
Asked 15 months ago by Nunu:

Hello helpers, I just got raw reads from 10x Genomics and have started on trimming and assembly. I'm new to this technology and have some basic questions below. I want to use Trim Galore for trimming first, and I would like to clarify these points so that I can choose the proper options.

1, I only have one library. In the FASTQ file, what does the first line mean: @K00162:247:HNJVCBBXX:5:1101:1357:1314 1:N:0:NCTCGTTT? How can I tell whether the sequences are from different samples? I guess that if they are all from the same species, I don't need to do any demultiplexing?

2, Are the adapter sequences used by 10x Genomics the same as Illumina's?

3, Does 10x Genomics use ASCII+33 (Phred+33) quality scores?

Thanks a lot for your help! Nunu

sequencing assembly genome • 1.1k views
written 15 months ago by Nunu

You may want to use one of the 10x Genomics software pipelines to do the analysis. What kind of data is this? Here is an example of pipelines for single cell RNAseq data. You can find others at that same site.

Answers for your questions.

1 - The samples may already have been demultiplexed. Did you get a folder per sample with multiple files in it? 10x uses 4 Illumina indexes per sample. They also output index reads in a separate file.
2 - Yes.
3 - Yes.
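
To illustrate points 1 and 3, here is a minimal Python sketch that splits the header line from the question into its fields and decodes a Phred+33 quality string. It assumes the standard Illumina (Casava 1.8+) header layout; the function names `parse_illumina_header` and `phred33_to_scores` and the example quality string are made up for illustration.

```python
def parse_illumina_header(line):
    """Split a Casava 1.8+ style FASTQ header into named fields."""
    read_id, _, comment = line.lstrip("@").strip().partition(" ")
    instrument, run, flowcell, lane, tile, x, y = read_id.split(":")
    read, filtered, control, index = comment.split(":")
    return {
        "instrument": instrument,
        "run_number": int(run),
        "flowcell": flowcell,
        "lane": int(lane),
        "tile": int(tile),
        "x": int(x),
        "y": int(y),
        "read": int(read),               # 1 or 2 for paired-end data
        "is_filtered": filtered == "Y",  # Y means the read failed the filter
        "control_bits": int(control),
        "index": index,                  # sample index (barcode) sequence
    }


def phred33_to_scores(quality):
    """Convert a Phred+33 encoded quality string to integer scores."""
    return [ord(ch) - 33 for ch in quality]


header = parse_illumina_header(
    "@K00162:247:HNJVCBBXX:5:1101:1357:1314 1:N:0:NCTCGTTT"
)
print(header["flowcell"], header["lane"], header["index"])  # HNJVCBBXX 5 NCTCGTTT
print(phred33_to_scores("AAFFJ"))  # [32, 32, 37, 37, 41]
```

The last field of the header is the sample index read, so reads from different index sets would show different sequences there.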

Edit: As pointed out by @Igor, I overlooked the mention of assembly in the original post. Apologies for that. @Igor links the 10x assembler in the comment below.

Note: If you do not use the 10x assembler, you may lose the linked-read information (among other things), which is one of the reasons for doing 10x in the first place. Supernova also produces diploid assemblies.

modified 15 months ago • written 15 months ago by GenoMax

The original post mentioned assembly, so this may be about De Novo Assembly.

Either way, 10x Genomics files are organized in a unique way, so following generic tutorials based on standard Illumina data will be confusing.

written 15 months ago by igor

Hi, thanks a lot for the information. Yes, I'm doing de novo assembly. I have checked this website, but it seems that they only recommend Supernova for diploid species, and mine is not diploid, so I have to try other assembly pipelines. The target genome is around 1 Gb. For now, I want to use Trim Galore for trimming, followed by ABySS. Any comments? Thanks.

written 15 months ago by Nunu

I would try Supernova first to see what you get. That would serve as a good control.

If you use Trim Galore and ABySS, you skip the generation of linked (long) reads, which is the benefit of 10x Genomics technology over standard short-read sequencing.

written 15 months ago by igor

Thanks a lot! My data is genomic data from a protozoan, so I can't follow the single cell RNA-seq pipelines. I only have one folder, since it was sequenced a long time ago by another technician. I used the following to check the index file:

bioawk -c fastx '{print $seq}' mysample_I1_001.fastq.gz | \
    sort | uniq -c | sort -k1 -nr | head
85275613 ATGACCGC
75129227 TCTCGTTT
72863044 GGCTAGCG
51064073 CAAGTAAA
1752653 NTGACCGC
1540668 NCTCGTTT
1486006 NGCTAGCG
1045797 NAAGTAAA

So what does this have to do with my trimming and assembly?

modified 15 months ago by GenoMax • written 15 months ago by Nunu

It looks like only the first 4 are valid 10x indexes, corresponding to code SI-GA-C12. The N-prefixed entries are the same four sequences with an uncalled first base.
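
As a rough illustration, the following Python sketch mirrors the bioawk count and then folds each observed index into the expected index it matches within one mismatch, so that the N-containing variants are added to their parent. The function names are hypothetical, the expected list is simply the four top sequences from the output above, and the one-mismatch tolerance is an assumption made for this example.

```python
import gzip
from collections import Counter


def count_index_reads(path):
    """Count every sequence in a gzipped FASTQ index file."""
    counts = Counter()
    with gzip.open(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 1:  # sequence is the 2nd line of each record
                counts[line.strip()] += 1
    return counts


def collapse_to_expected(counts, expected, max_mismatches=1):
    """Add each observed index to the one expected index it matches."""
    collapsed = Counter()
    for observed, n in counts.items():
        hits = [
            e for e in expected
            if len(e) == len(observed)
            and sum(a != b for a, b in zip(e, observed)) <= max_mismatches
        ]
        key = hits[0] if len(hits) == 1 else "unassigned"
        collapsed[key] += n
    return collapsed


# Counts copied from the output shown above (top entries only).
observed = {
    "ATGACCGC": 85275613, "TCTCGTTT": 75129227,
    "GGCTAGCG": 72863044, "CAAGTAAA": 51064073,
    "NTGACCGC": 1752653, "NCTCGTTT": 1540668,
    "NGCTAGCG": 1486006, "NAAGTAAA": 1045797,
}
expected = ["ATGACCGC", "TCTCGTTT", "GGCTAGCG", "CAAGTAAA"]

collapsed = collapse_to_expected(observed, expected)
for index, n in collapsed.most_common():
    print(index, n)
```

For a real file, `count_index_reads("mysample_I1_001.fastq.gz")` would supply the `observed` counts instead of the hard-coded dictionary.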

modified 15 months ago • written 15 months ago by GenoMax

Thanks. Do you know what I should do with these indexes? Also, do I need to figure out the ploidy of my target species first in order to choose an assembler?

written 15 months ago by Nunu

I would still suggest trying Supernova first to see what you get. If you don't want to do that, then we are in uncharted territory. You could potentially treat the entire dataset as a normal paired-end dataset and then use the assembler of your choice. Even though there are multiple indexes, you have just one sample.

modified 15 months ago • written 15 months ago by GenoMax

Thanks. I will use Supernova. Do I need to do trimming first? In my folder there are 3 files: one index file and the two paired-end read files. I only have one sample, so I don't need to care about the index file and can just proceed with the paired-end reads. Is this correct? From what I read about Supernova, I just need to run supernova run and then supernova mkoutput? Thanks a lot.

written 15 months ago by Nunu

10x provides extensive documentation on how to run Supernova:
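
As a rough sketch of the two steps asked about above (supernova run followed by supernova mkoutput), the Python snippet below only builds and prints the two command lines; it does not execute them. The run ID, FASTQ directory, and output prefix are placeholders, and the options shown (`--id`, `--fastqs`, `--maxreads`, `--style`, `--asmdir`, `--outprefix`) should be checked against the documentation for the installed Supernova version, since required options can differ between releases.

```python
def supernova_commands(run_id, fastq_dir, out_prefix, max_reads="all"):
    """Build the argument lists for an assembly run and a FASTA export."""
    run_cmd = [
        "supernova", "run",
        f"--id={run_id}",            # name of the output folder
        f"--fastqs={fastq_dir}",     # folder holding the FASTQ files
        f"--maxreads={max_reads}",   # assumption: use all reads
    ]
    mkoutput_cmd = [
        "supernova", "mkoutput",
        "--style=pseudohap",                 # one of several output styles
        f"--asmdir={run_id}/outs/assembly",  # assembly folder written by the run
        f"--outprefix={out_prefix}",
    ]
    return run_cmd, mkoutput_cmd


# Placeholder names, for illustration only.
for cmd in supernova_commands("protozoa_asm", "/path/to/fastqs", "protozoa_pseudohap"):
    print(" ".join(cmd))
```

Printing the commands first makes it easy to review them before running them by hand on the compute server.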

written 15 months ago by igor