Question

Trim Galore for 10x Genomics data

4.5 years ago

Nunu ▴ 20

Hello helpers, I just got raw reads from 10x Genomics and have started on trimming and assembly. I’m new to this technology and have some basic questions below. I want to use Trim Galore for trimming first, and would like to clarify these points so that I can choose the proper options.

1, I only have one library. In the fastq file, what does the first line mean: @K00162:247:HNJVCBBXX:5:1101:1357:1314 1:N:0:NCTCGTTT ? How can I tell if the sequences are from different samples? I guess if they are all from the same species, I don’t need to do any demultiplexing?

2, Are the adapter sequences used by 10x Genomics the same as Illumina's?

3, Does 10x Genomics use ASCII+33 quality scores as Phred scores?

Thanks a lot for your help! Nunu

Assembly genome sequencing • 3.6k views

ADD COMMENT • link 4.5 years ago by Nunu ▴ 20

You may want to use one of the 10x Genomics software pipelines to do the analysis. ~~What kind of data is this? Here is an example of pipelines for single cell RNAseq data. You can find others at that same site.~~

Answers for your questions.

1 - Samples may already have been demultiplexed. Did you get a folder per sample with multiple files in it? 10x uses 4 Illumina indexes per sample. They also output index reads in a separate file.
2 - Yes.
3 - Yes.
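To make questions 1 and 3 concrete, here is a minimal Python sketch that splits the header from the original post into its fields and decodes a Phred+33 quality string. The field names follow the usual Illumina (Casava 1.8+) header layout; the function names `parse_illumina_header` and `phred33_to_scores` are made up for this example, and the quality string is invented, not taken from the poster's data.

```python
from typing import Dict, List


def parse_illumina_header(header: str) -> Dict[str, str]:
    """Split a Casava 1.8+ style FASTQ header line into named fields."""
    if not header.startswith("@"):
        raise ValueError("FASTQ header must start with '@'")
    # The part before the space identifies the cluster, the part after
    # it describes the read itself.
    read_id, _, comment = header[1:].partition(" ")
    instrument, run, flowcell, lane, tile, x, y = read_id.split(":")
    read_no, is_filtered, control, index = comment.split(":")
    return {
        "instrument": instrument,
        "run": run,
        "flowcell": flowcell,
        "lane": lane,
        "tile": tile,
        "x": x,
        "y": y,
        "read": read_no,             # 1 or 2 for paired-end data
        "is_filtered": is_filtered,  # "N" means the read passed the filter
        "control": control,
        "index": index,              # sample index sequence seen for this read
    }


def phred33_to_scores(quality: str) -> List[int]:
    """Decode a Phred+33 quality string into integer quality scores."""
    return [ord(ch) - 33 for ch in quality]


fields = parse_illumina_header(
    "@K00162:247:HNJVCBBXX:5:1101:1357:1314 1:N:0:NCTCGTTT"
)
print(fields["lane"], fields["read"], fields["index"])

# Invented quality string, for illustration only.
print(phred33_to_scores("AAFFJ"))
```

The last header field is the index sequence recorded for that read, so reads from different samples on the same lane would normally carry different index sequences there.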

Edit: As pointed out by @Igor, I overlooked the assembly mention in the original post. Apologies for that. @Igor links to the 10x assembler in a comment below.

Note: If you do not use the 10x assembler then you may lose information about linked reads etc., which is one of the reasons to do 10x in the first place. Supernova also produces diploid assemblies.

ADD REPLY • link 4.5 years ago by GenoMax 141k

The original post mentioned assembly, so this may be about De Novo Assembly.

Either way, 10x Genomics files are organized in a unique way, so following generic tutorials based on standard Illumina data will be confusing.

ADD REPLY • link 4.5 years ago by igor 13k

Hi, thanks a lot for the information. Yes, I'm doing de novo assembly. I have checked this website, but it seems that they only recommend Supernova for diploid species, and mine is not diploid, so I have to try other assembly pipelines. The target genome is around 1 Gb. For now, I want to use Trim Galore for trimming, followed by ABySS. Any comments? Thanks

ADD REPLY • link 4.5 years ago by Nunu ▴ 20

I would try Supernova first to see what you get. That would serve as a good control.

If you use Trim Galore and Abyss, you skip generation of linked (long) reads, which is the benefit of 10x Genomics technology over standard short-read sequencing.

ADD REPLY • link 4.5 years ago by igor 13k

Thanks a lot! My data is genomic data from a protozoan, so I can't follow single-cell RNA-seq pipelines. I only have one folder, since it was sequenced a long time ago by another technician. I used the following to check the index file:

bioawk -cfastx '{print($seq)}' mysample_I1_001.fastq.gz | \
    sort | uniq -c | sort -k1 -nr | head
85275613 ATGACCGC
75129227 TCTCGTTT
72863044 GGCTAGCG
51064073 CAAGTAAA
1752653 NTGACCGC
1540668 NCTCGTTT
1486006 NGCTAGCG
1045797 NAAGTAAA
349133 TCTCGGTT
302770 CAAGGAAA

So what does this have to do with my trimming and assembly?

ADD REPLY • link updated 4.5 years ago by GenoMax 141k • written 4.5 years ago by Nunu ▴ 20

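Looks like only the first 4 are valid 10x indexes, corresponding to code SI-GA-C12. The remaining entries each differ from one of those four by a single base (an N call or a mismatch).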
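As a rough illustration of how the listing above reduces to four indexes, here is a small Python sketch that assigns each observed index sequence to a valid index when it differs by at most one base. The four valid sequences are simply the top four from the listing; the one-mismatch tolerance and the helper names (`hamming`, `closest_valid`, `collapse`) are assumptions made for this example, not something taken from a 10x tool.

```python
from collections import Counter
from typing import Dict, Optional, Sequence

# The four most abundant index sequences from the listing above.
VALID_INDEXES = ("ATGACCGC", "TCTCGTTT", "GGCTAGCG", "CAAGTAAA")

# Counts copied from the `uniq -c` output in the post.
OBSERVED: Dict[str, int] = {
    "ATGACCGC": 85275613,
    "TCTCGTTT": 75129227,
    "GGCTAGCG": 72863044,
    "CAAGTAAA": 51064073,
    "NTGACCGC": 1752653,
    "NCTCGTTT": 1540668,
    "NGCTAGCG": 1486006,
    "NAAGTAAA": 1045797,
    "TCTCGGTT": 349133,
    "CAAGGAAA": 302770,
}


def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def closest_valid(
    seq: str,
    valid: Sequence[str] = VALID_INDEXES,
    max_mismatches: int = 1,  # assumed tolerance, chosen for this sketch
) -> Optional[str]:
    """Return the single valid index within the tolerance, else None."""
    hits = [v for v in valid if hamming(seq, v) <= max_mismatches]
    return hits[0] if len(hits) == 1 else None


def collapse(observed: Dict[str, int]) -> Counter:
    """Sum observed counts per valid index; the rest go to 'unassigned'."""
    collapsed: Counter = Counter()
    for seq, count in observed.items():
        target = closest_valid(seq)
        collapsed[target if target is not None else "unassigned"] += count
    return collapsed


collapsed = collapse(OBSERVED)
for index, count in collapsed.most_common():
    print(index, count)
```

With these ten entries, every sequence lands on one of the four valid indexes, which fits the reading that all of these reads belong to a single sample.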

ADD REPLY • link 4.5 years ago by GenoMax 141k

Thanks, do you know what I should do with these indexes? Also, do I first need to figure out the ploidy of my target species in order to choose assembly software?

ADD REPLY • link 4.5 years ago by Nunu ▴ 20

I would still suggest trying Supernova first to see what you get. If you don't want to do that, then we are in uncharted territory. You could potentially treat the entire dataset as a normal paired-end dataset and then use an assembler of your choice. Even though there are multiple indexes, you have just one sample.

ADD REPLY • link 4.5 years ago by GenoMax 141k

Thanks. I will use Supernova. Do I need to do trimming first? In my folder there are 3 files: one index file and two paired-end read files. I only have one sample, so I don't need to care about the index file and can just proceed with the paired-end reads. Is this correct? After checking the Supernova documentation, it looks like I just need to use supernova run and supernova mkoutput? Thanks a lot.

ADD REPLY • link 4.5 years ago by Nunu ▴ 20

10x provides extensive documentation on how to run Supernova: https://support.10xgenomics.com/de-novo-assembly/software/pipelines/latest/using/fastq-input

ADD REPLY • link 4.5 years ago by igor 13k