If you’ve ever tried changing the color of, say, accent marks or combined characters, you know it can be a pain. Here is an example.

That post explains both what we as developers want (colorising parts of a glyph) and the issue with it (the system can’t tell the accent from the base character). Let’s use the example from the post. We have the text (in Thai) วาีม, which consists of the characters ว + า + ี + ม. The า + ี combine into าี.

Both the code that tries to set the colors and the explanation of why it doesn’t work are in the post linked above.

The tl;dr of the issue is that the system can’t tell the characters apart when the text is being rendered, and fonts specify their own way of rendering characters. I honestly have not dug into this, so I can’t speak with confidence about the details. I have simply accepted the fact that there is no easy way of doing it.

Failed attempts

I spent many weeks on solutions that sort of worked but fell short on the first non-trivial case I encountered. Unfortunately I can’t get into why they failed, as it was a long time ago and the code for those attempts is more or less gone. So I will focus on the method that works.

“Render by DNA”

I call this solution “render by DNA” because it essentially identifies characters by the way they are rendered, then changes the color for the specific character. This might be clearer with some examples.

With this technique, let’s look at the example again.

Now let’s see how this is achieved. I’ll start by explaining the DNA part of the solution. To begin, let’s examine a character being drawn in iOS. If we dig deep enough (or rather, google a bit) we can find that iOS renders CGGlyphs, each of which refers to an outline in the font: basically something that says “put a point here, then here, then a line here” and so on to draw shapes (in our case, characters). Honestly, it gets pretty complicated with all the CGGlyph, CGPath, CTLineRef, CTRun and all sorts of low level APIs to draw and deal with fonts.

In the end, all these APIs lead to one fact: text is drawn with bezier curves, and it is possible to get each point and curve by iterating over the paths these APIs return.

To visualise this, I bring a character into a vector editor and convert the font into a path.

This path has different types of building blocks: points. If we color-code each type, we end up with something like this.

The different points have different meanings: there are points such as “start path”, “end path”, “add line” and “add curve”. If we lay the points out in a straight line, we get this sequence, or dare I say DNA: the DNA of the specific character.

Now we get back to the low level APIs in iOS. The point of them was to convert the text into points, just like the vector program does. Doing so, we get a sequence of CGPoints for each character. To each point I also assign a numeric identifier for its type. For example, “start path” is 0 and “add curve” is 2. The sequence we get from the character ว is 02222222212222122224. If I colorise the points based on their type, we get the following.
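To make the numbering concrete, here is a minimal sketch in Python (chosen only because the logic is easy to run anywhere; it does not touch the iOS APIs). Only 0 = “start path” and 2 = “add curve” are stated in the text. The names given to 1, 3 and 4 are an assumption that the codes follow the raw values of Core Graphics’ CGPathElementType.

```python
from collections import Counter

# Codes 0 and 2 are named in the text. The names for 1, 3 and 4 are an
# assumption: they follow the raw values of Core Graphics' CGPathElementType.
POINT_TYPES = {
    "0": "start path",
    "1": "add line",
    "2": "add curve",
    "3": "add cubic curve",  # assumed; does not occur in the DNA below
    "4": "end path",
}

def describe(dna):
    """Turn a DNA string into the list of point-type names it stands for."""
    return [POINT_TYPES[code] for code in dna]

def tally(dna):
    """Count how many points of each type a DNA string contains."""
    return Counter(describe(dna))

WO_WAEN_DNA = "02222222212222122224"  # DNA of ว as quoted in the text

print(describe(WO_WAEN_DNA)[:3])
print(tally(WO_WAEN_DNA))
```

Read this way, the DNA of ว is a single closed outline: it starts once, ends once, and is mostly curves with two straight segments.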

As we see, this matches up with what the vector program gave us. Note that the sequence will be different if the font is different, so this solution will be font specific.

We are not limited to doing this per character. The low level APIs in iOS are happy to give us all paths and points for all characters in an NSString. So if we look at the DNA sequence for the whole example string วาีม we get this.

So now what do we do with this? Well we can start by finding our character ว because we know what its DNA string looks like.
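Finding a character then comes down to a substring search. The sketch below is an illustration in Python, not the code from the project. The DNA of ว is the one quoted earlier, while the DNA for the rest of the string is made up, since the real values depend on the font.

```python
def find_dna(full_dna, char_dna, start=0):
    """Return (begin, end) of the first match at or after start, or None."""
    begin = full_dna.find(char_dna, start)
    if begin == -1:
        return None
    return begin, begin + len(char_dna)

WO_WAEN_DNA = "02222222212222122224"  # ว, quoted in the text

# Hypothetical DNA for whatever follows ว in the string. Real values would
# have to be measured from the font.
REST_OF_STRING_DNA = "0121214" + "022124"

full_dna = WO_WAEN_DNA + REST_OF_STRING_DNA
match = find_dna(full_dna, WO_WAEN_DNA)
print(match)
```

The returned pair is the range of points in the full sequence that belongs to the character, which is exactly what we need in order to color it.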

Applying the color to each individual character’s DNA sequence yields something like this.

The non-colored parts are essentially the holes in each character, which are actually also included in each character, but I remove them for simplicity’s sake.

This means that, for each character in each font, we need to know its DNA even before rendering. The process of gathering this is fairly simple, albeit a bit time consuming. What I personally did was to check the DNA of each character individually and add it into a lookup table for use while rendering.
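A lookup table like this can be as simple as a dictionary per font. The following Python sketch shows one possible shape. The callback `dna_for_character`, the font name and the DNA of “x” are all hypothetical. On iOS the callback would lay out a single character in the font and read back its point types.

```python
def build_lookup_table(characters, dna_for_character):
    """Map each character to its type-only DNA for one font.

    dna_for_character is a hypothetical callback that measures the DNA of a
    single character in the font the table is built for.
    """
    return {character: dna_for_character(character) for character in characters}

def shared_dna(table):
    """Group characters that ended up with the same DNA (types only)."""
    groups = {}
    for character, dna in table.items():
        groups.setdefault(dna, []).append(character)
    return [chars for chars in groups.values() if len(chars) > 1]

# Stand-in for real measurements. Only the value for ว comes from the text.
MEASURED = {"ว": "02222222212222122224", "x": "01114", "y": "01114"}

table = build_lookup_table(["ว", "x", "y"], MEASURED.__getitem__)
lookup_tables = {"HypotheticalFont-Regular": table}  # one table per font

print(shared_dna(table))
```

Because the table stores only point types, two characters can end up with identical entries, so a check like `shared_dna` is a cheap way to see where that happens in a given font.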

Before the actual “render” explanation, we need to look slightly deeper at each point in the DNA sequence.

Each point actually includes data about where and how it should render*: for example, the exact coordinates of the point in the bezier path and of its control points. If you’ve ever played with bezier curves, you’ll probably understand what this means. It also contains metadata about what kind of point it is, so that we can re-create CGPaths later for the actual rendering of the character.

* NOTE: This only applies to the DNA generated at runtime (because only the system knows metadata like positions, since they depend on the font and font size). The lookup table only includes the order of point types (used to find the matching runtime DNA, which carries the metadata).
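The difference between the two kinds of DNA can be sketched as a small data structure. This is an illustrative Python model, not the project’s actual types: the `PathPoint` name, its fields and the coordinates are all assumptions.

```python
from dataclasses import dataclass
from typing import Tuple

Coordinate = Tuple[float, float]

@dataclass(frozen=True)
class PathPoint:
    """One runtime DNA entry: the point type plus where it should render."""
    kind: int                        # e.g. 0 = start path, 2 = add curve
    coords: Tuple[Coordinate, ...]   # end point and any control points

def type_only_dna(points):
    """Drop the coordinates and keep the order of point types.

    This is the form stored in the lookup table.
    """
    return "".join(str(point.kind) for point in points)

# Made-up runtime points describing a small triangle.
runtime_points = [
    PathPoint(0, ((0.0, 0.0),)),   # start path
    PathPoint(1, ((10.0, 0.0),)),  # add line (code 1 is an assumed value)
    PathPoint(1, ((5.0, 8.0),)),   # add line
    PathPoint(4, ()),              # end path (code 4 is an assumed value)
]

print(type_only_dna(runtime_points))
```

The runtime points keep everything needed to redraw the shape, while the string derived from them is what gets compared against the lookup table.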

Now let’s get into how we can use this to render the text.

So what we do now is the following steps:

  1. Generate the DNA sequence (and points data!) for the string วาีม
  2. Loop through each character in ว + า + ี + ม
  3. For each character:
    • Look up the DNA for the character in the lookup table
    • Find the matching DNA in the full DNA sequence of the string we want to render
    • Extract the DNA (including the points)
    • Create a CGPath based on each point
    • Render
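The steps above can be sketched end to end as follows. This is a simplified Python model under stated assumptions, not the code from the example project: `PathPoint`, `split_by_character` and `to_path_commands` are hypothetical names, the command list stands in for a real CGPath, and searching from where the previous match ended is my own assumption so that repeated characters do not claim the same points twice.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class PathPoint:
    kind: int                                # point-type code
    coords: Tuple[Tuple[float, float], ...]  # runtime position data

def split_by_character(text, runtime_points, lookup_table):
    """Pair every character with the runtime points that draw it."""
    full_dna = "".join(str(p.kind) for p in runtime_points)  # step 1
    cursor = 0  # assumption: outlines appear in the same order as characters
    pieces = []
    for character in text:                                   # steps 2 and 3
        char_dna = lookup_table[character]                   # look up the DNA
        begin = full_dna.find(char_dna, cursor)              # find the match
        if begin == -1:
            raise ValueError(f"no DNA match for {character!r}")
        end = begin + len(char_dna)
        pieces.append((character, runtime_points[begin:end]))  # extract
        cursor = end
    return pieces

def to_path_commands(points):
    """Stand-in for creating a CGPath: one (name, coords) command per point."""
    names = {0: "move", 1: "line", 2: "quad_curve", 3: "curve", 4: "close"}
    return [(names[p.kind], p.coords) for p in points]

# Made-up characters, DNA and coordinates, purely for illustration.
lookup_table = {"a": "0114", "b": "01114"}
runtime_points = [PathPoint(int(code), ((float(i), 0.0),))
                  for i, code in enumerate("0114" + "01114")]
colors = {"a": "red", "b": "black"}

for character, points in split_by_character("ab", runtime_points, lookup_table):
    # "Render": a real implementation would fill the path with the color.
    print(character, colors[character], to_path_commands(points))
```

Each character ends up with its own path and can therefore be filled with its own color, which is the whole point of the exercise.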

Example project

Find code and example project at https://github.com/jontelang/DNALabel-proof-of-concept.

Issues with this solution

This solution is obviously not optimal; in fact I ended up not using it at all in my app. But for completeness I’ll mention some issues I know it has, or might have.

  • Since the lookup table is created manually, we would need to repeat that work for each font we want to use.

  • Since the lookup table only stores the “type” of each point, two characters could technically be mistaken for each other. For example, take the characters ป and บ. They have the exact same order and type of points in their DNA; the only difference is that one of them has two points a bit higher when rendered. In practice, though, this isn’t an issue, because we will still find one character before the other, and the match will carry the correct metadata in its DNA points to render correctly. There might be cases where this breaks, but I haven’t been able to find an example that doesn’t work at the moment.

  • It’s kinda complicated for what it achieves. It’s probably easier to just show an image in the few cases where you might want to highlight something.

  • It doesn’t (yet?) work if rendering multiple lines. This might be a simple fix but since I abandoned the idea I never bothered looking into it.

  • Probably many more.