7

I have two "identical" 5-character strings in my text editors (Sublime Text 2 and Notepad++).

The first string was copied from Gmail and the second one just typed by hand.

When I select the first string, I see 6 characters selected. When I select the second string, I see 5 characters selected.


When I select both strings in Sublime Text 2 at the same time, I can see that there is an extra space selected after the first string.


I enabled "Display all characters" in Notepad++ but don't see anything obviously different between the first and the second string.

The file uses UTF-8 encoding, and the issue is consistent in both text editors.

Can anyone please advise how to remove the invisible extra character and where it came from?

BustedSanta
  • 1,922

3 Answers

4

This worked for me in Sublime Text 3 without a hex editor, using the normal search and replace:

  • Open the replace dialog (Ctrl + H)
  • Enter the Unicode character U+200B in 'Find What' (* see below for tips)
  • leave the 'Replace With' empty
  • Replace All

* To get the Unicode character in there in the first place, use your OS's method:

  • Windows - Hold Alt and type the Unicode code
  • Linux - Ctrl + Shift + u, then, without releasing Ctrl and Shift, type the code
  • Sublime under Linux - As for Linux, except it's Ctrl + Alt + Shift (Sublime 3 binds Ctrl + Shift + u to 'soft redo')

Also, if you know where the character is in Sublime, you can just select it with Shift + Arrow. You'll know you've got it because the cursor doesn't move; it just gets a bit thicker :-)

Unicode Composition in Sublime Text

Mbo42
  • 186
3

Based on the ANSI string that you got, gffk9​, it appears that the additional character present in the text is a zero-width space. Zero-width spaces are used to indicate where a program displaying text may "safely" break a line when the text does not actually visibly contain spaces. Since you copied it from Gmail, it seems likely that this came from an email that used HTML to format the text.

How you can go about removing the extra character may depend on your system. This hex viewer plugin for Sublime Text looks promising since it offers some search capabilities, but it does not explicitly mention searching by hex string or replacement. Since you are using Notepad++, I assume you are on Windows. XVI32 will let you search and replace hex strings in a file.

For reference, if you are in a Unix-like environment, sed would allow you to replace occurrences of a hex string in a file using the process described in this post.

In any case, the hex string that you would be looking to find and replace is E2 80 8B, which is the UTF-8 encoding of U+200B (zero-width space).
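If no hex editor is at hand, the same byte-level find and replace can be sketched in a few lines of Python. This is only an illustration: the helper name `strip_zwsp` is made up for this example, and it assumes the data is UTF-8, as the question states.

```python
# UTF-8 encoding of U+200B (zero-width space): the bytes E2 80 8B.
ZWSP_UTF8 = b"\xe2\x80\x8b"


def strip_zwsp(data: bytes) -> bytes:
    """Return data with every E2 80 8B byte sequence removed.

    strip_zwsp is a hypothetical helper for this example, not part of any library.
    """
    return data.replace(ZWSP_UTF8, b"")


# The 6-character string from the question: five visible characters plus U+200B.
raw = "gffk9\u200b".encode("utf-8")
cleaned = strip_zwsp(raw)

print(len(raw.decode("utf-8")))      # 6
print(len(cleaned.decode("utf-8")))  # 5
```

To clean a whole file, read it in binary mode (`open(path, "rb")`), pass the bytes through the helper, and write the result back in binary mode (`"wb"`). Working on bytes avoids any re-encoding of the rest of the file.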

0

You can also use hexdump -C to see the strange characters. Look for characters that are shown as . (dot) where you would expect a space.
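Where hexdump is not available, a rough equivalent is to list the invisible characters with Python's standard `unicodedata` module. This is a sketch under one assumption: only characters in the Unicode "Cf" (format) category are reported, which includes U+200B. The function name `find_invisible` is invented for this example.

```python
import unicodedata


def find_invisible(text: str):
    """Yield (index, code point, Unicode name) for each format character.

    find_invisible is a hypothetical helper; it checks only category "Cf",
    which covers zero-width characters such as U+200B.
    """
    for i, ch in enumerate(text):
        if unicodedata.category(ch) == "Cf":
            yield i, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")


hits = list(find_invisible("gffk9\u200b"))
print(hits)  # [(5, 'U+200B', 'ZERO WIDTH SPACE')]
```

The index tells you where in the string the extra character sits, and the name tells you what it is, which is the same information the hex dump gives in a less readable form.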
