Hunting invisible characters?
3.0 years ago
geneticatt ▴ 140

Hi all,

I have a set of adapters that a collaborator gave me in a regular text file (i5R.txt). I moved the file onto my institution's Linux HPC and tried to use it to pull sequences from a fastq with grep -f, like so:

grep -f i5R.txt myseqs.fastq

This returned nothing, which was surprising: I know the adapters are there because I can match them in vim. Suspecting some pesky invisible characters, I retyped the sequences in vim into a new text file called i5R.seqs. This fixed the pattern matching issue with grep.

Here is the diff of the two files, to show that they appear identical.

[geneticatt]$ diff i5R.txt i5R.seqs
1,8c1,8
< CCTGATAC
< TTAAGTTG
< CGGACAGT
< GCACTACA
< TGGTGCCT
< TCCACGGC
< ATGTCGTG
< CCACGACA
---
> CCTGATAC
> TTAAGTTG
> CGGACAGT
> CGACTACA
> TGGTGCCT
> TCCACGGC
> ATGTCGTG
> CCACGACA

What type of character could be the culprit? I searched for \r because I've had problems with that one before, but this is another invisible character. How does one go about hunting down and removing the invisible characters that plague their workflow? Further, what preventative measures can I take to make sure I don't get hung up on something like this again?

adaptor adapter grep • 1.1k views

You could have looked at the file using cat -vet, which shows every character in the file, printable and non-printable.
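As a rough illustration of what that looks like for Windows-style line endings, the sketch below builds a small throwaway file and runs cat -vet on it. The temp file and its contents (the first two adapters from the question, with CRLF endings added) are assumptions made for the demo, not the actual i5R.txt. A carriage return shows up as ^M just before the $ that marks the end of each line.

```shell
# Hypothetical demo file with Windows (CRLF) line endings
tmp=$(mktemp)
printf 'CCTGATAC\r\nTTAAGTTG\r\n' > "$tmp"

# -v prints control characters in caret notation (carriage return -> ^M),
# -e also marks each end of line with $, -t also prints tabs as ^I
shown=$(cat -vet "$tmp")
printf '%s\n' "$shown"
# CCTGATAC^M$
# TTAAGTTG^M$

rm -f "$tmp"
```

A clean Unix file would show only the $ at the end of each line, with no ^M in front of it.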


Another way to see hidden characters is to pipe the file through octal dump: cat infile | od -c. This prints out hidden characters, newlines, etc.
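A minimal sketch of the od -c approach, using a single made-up line with a CRLF ending rather than the real adapter file: od -c prints one byte per column, so a carriage return appears as the escape \r right before the \n.

```shell
# Feed one CRLF-terminated line (demo data, not the real file) to od -c
dump=$(printf 'CCTGATAC\r\n' | od -c)
printf '%s\n' "$dump"
# The sequence letters are followed by \r and \n;
# a file with Unix line endings would show only \n there
```

Bytes that have no escape of their own are printed as three-digit octal values, which makes other stray bytes stand out as well.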

3.0 years ago
Mensur Dlakic ★ 27k

You may want to read this. I think you may be able to fix your adapter file by typing:

dos2unix i5R.txt

If an error pops up saying that the command doesn't exist, this should work:

sed -i 's/\r//' i5R.txt
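To see why the carriage returns break grep -f, and to check a file before trusting it, here is a small self-contained sketch. All file names and the one-read fastq are made up for the demo. It uses tr -d '\r' instead of dos2unix or sed, only because tr is available almost everywhere; unlike the commands above, it writes a new file rather than editing in place.

```shell
work=$(mktemp -d)

# Demo inputs: adapters saved with CRLF endings, and a one-read fastq
printf 'CCTGATAC\r\nTTAAGTTG\r\n' > "$work/adapters_crlf.txt"
printf '@read1\nAACCTGATACGG\n+\nIIIIIIIIIIII\n' > "$work/reads.fastq"

# Count lines that contain a carriage return (a quick pre-flight check)
cr_lines=$(grep -c "$(printf '\r')" "$work/adapters_crlf.txt" || true)

# Every pattern ends in an invisible \r, so no read can match
before=$(grep -c -f "$work/adapters_crlf.txt" "$work/reads.fastq" || true)

# Delete all carriage returns and try again
tr -d '\r' < "$work/adapters_crlf.txt" > "$work/adapters_unix.txt"
after=$(grep -c -f "$work/adapters_unix.txt" "$work/reads.fastq" || true)

echo "cr_lines=$cr_lines before=$before after=$after"
# cr_lines=2 before=0 after=1

rm -rf "$work"
```

Running the carriage-return count on any file that came from a collaborator or a Windows machine is a cheap preventative step: a result other than 0 means the file should be converted before it is used as a pattern list.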

Thank you, using dos2unix worked perfectly!
