Unveiling the Mystery of Google Fonts
What You Don't Know Can Hurt Your Privacy [Conspiracy Corner]
Pre-fonting
Most of the conspiracies we hear about wear out quickly, becoming dull and repetitive. Have people simply lost their imagination, or did they never have any to begin with? It's possible they're not even true “conspiracy theorists” – whatever that means. Whatever you want to call it, whatever you want to call me – I have a great imagination.
When I'm thinking about what's possible, I couldn't care less about the "truthfulness" of what I'm thinking about. Worrying about that takes away from the very thing that makes the idea exciting. Balancing imagination with reason can be challenging, but it's essential.
For the most part I like to exist right on the border - not an immigrant and not a citizen, but something in between. No, this isn't a statement on the current state of affairs in the world. If that's where your brain went, it's possibly evidence of a limiting-belief mindset. Sometimes we should learn to let go of preconceived notions so we can allow our brains the mental flexibility they crave.
Sure, I have my own limiting beliefs and boundaries, which I'm acutely aware of. Strong convictions in the real world help defend against those who manipulate language for negative purposes.
Email and the Brain
I emailed Google the other day. Is that a bizarre thing for a regular person to do? Maybe. But I wanted to, so I did. I suppose I should give you some backstory.
First, I have a confession to make: my brain doesn't seem to function the same way as other people's. Not simply because I have ADHD, no, there's something else. I have this burning curiosity inside me. When I'm interested in something I want to know everything. I want to know how it works.
What makes it tick? What issues are there? Why do people think or feel the way they do about it? Does it have flaws? Could someone exploit this system in some way? Who? For what reason? What potential consequences arise from its exploitation? Financial? Health? Can it be, or is it, used to exploit people? Why does seemingly nobody care if it can? Okay, I think you get the picture.
I love to ask questions, and some people hate that I love to ask questions. Why? Most likely because it challenges their beliefs. People really don't like having their seemingly hard-wired beliefs challenged, especially when they've dedicated so much of their time to believing them. Sometimes it really can take a lot of effort and dedication to maintain steadfast beliefs. Think about it: you have to actively block out anything that goes against your beliefs. You have to avoid speaking to or engaging with people who don't share the same beliefs. When you do run into such people, you may need to respond defensively or passive-aggressively. It's a hell of a lot of work. It's a whole thing.
Sometimes I ask questions that seem dumb, at least to the casual observer. Ask yourself why people ask questions. The answer may shock you! How's that for clickbait? All I'll say is that it's not always to simply gain knowledge or to find something out for the sake of it. Questions are powerful, use them.
Why did I email Google? Short answer: I had a totally insane, crackpot idea about privacy with Google Fonts and I wanted to see if Google would entertain the idea, humour me a little, throw me a bone, anything. Spoiler: they didn't. I already knew the likely response before typing a single keystroke; it felt predestined. Yet I still reached out to them. Me, just one dude in a torrential sea of people unintentionally throwing caution and our data to the wind of Google's gargantuan privacy-invading sails.
I'll briefly outline the idea I had on [I don’t remember the exact day], then share my inquiry to Google about their Font privacy practices, their response, my thoughts, and dive into the details.
It all began when I spent three hours creating my own custom font to use on Substack, only to discover that their idea of a custom font is in fact a choice between pre-installed / pre-selected fonts.
My journey of font creation may end up as an article in future; if it does, I'll likely link it here. Naturally, after spending three hours in the wonderful world of fonts and following various braingential detours (brain + tangential: I just made it up, so make sure you credit me when it makes you a millionaire in some way, and if you find that someone else came up with it before me, thank them instead!), I ended up at Google Fonts.
You see, much like Godwin's Law (the chance of a thread or discussion online referencing "Nazis" increases the longer it gets), a similar phenomenon occurs in the world of fonts: when someone searches for information about fonts online, it becomes inevitable that Google Fonts appears. The threshold for this to happen is actually far lower than with Godwin's Law. It's a fact, you can look it up – literally look up at what I just wrote, because I made that up as well. I don't have a name for this law yet, so drop your suggestions in the comments and I'll give you partial credit.
Intro-font-matter
Do you know what Google Fonts are? Do you know how they're used? Where are they stored? Where do they come from? (Where'd they go? Cotton Eye Joe?) Isn't it strange that companies can technically put files on our computers and phones without us being consciously aware? On top of that, we don't really know what's in them or exactly what they do. Even if you read through all of their terms and conditions and privacy policies, you wouldn't be any closer to really understanding all the technical goings-on. We have to trust they're doing the right thing. Trust in large corporations is scarce these days, if it ever existed at all.
WHAT THE FONT?
Do you know what a font is? You know how everyone has their own unique way of writing? Each person's handwriting is different. What makes each person's handwriting different? It's the shapes, the whirls, the lengths of the individual parts that make up each letter, how each letter connects with the next (or doesn't). If we call each person’s handwriting their own style, we can do the same thing with fonts. A font is basically the particular style those letters have. While there are thousands of different font styles, most people only use a handful. When you type a document on your computer or phone you will most likely be using the default font.
SIDE QUEST: Who gets to decide which font will become the default font? When creating a system, different teams perform various functions. One of those teams would be the UX (User Experience) team. Their job is to make sure, as best they can, that all the buttons you click on or tap are easy to find, the colours don't strain your eyes, and so on. They are also the people who decide which font is best to have as the default option. It's not simply a matter of "Eeny, Meeny, Miny, Moe"; the process would include user surveys and various other steps. Not everyone will agree on the default font in the end, but you can't please everyone! For example, back in January of this year (2024 for those of you playing along from the future) Microsoft seemingly changed the default font in its Office apps, from Calibri to Aptos, out of nowhere. In reality, people have talked about this since 2021, though honestly, how many people eagerly await or keep up with news about fonts?
BUT, BUT, HOW THE FONT?
So we now have a basic idea about what fonts are. The next question is, how? How does your computer, phone or Hiptop (do you have to look this up?) manage to display the different fonts? Your device, whichever one it is, has small files that contain the fonts (not all of them, but a select number). The different programs or apps, and the operating system itself, use these files to display text. If you ever see files that end in “.ttf”, these are font files.
FONTS ON THE INTERNET?
What if someone has made their own website and they want to use some font no one has ever heard of, or maybe even their own font? Would you let "Frank's Flat Earth Society Blog" install an unknown file onto your computer? Here we would be in a bit of a bind, a font privacy bind: how on Earth can we display Frank's "my-awsum-font-v2-3-final-fancy.ttf" font so the website doesn't look super plain? Good question! When you build a website, you must define which fonts to use for different elements (headings, subheadings, body text, etc.). These can come from various sources, though for the sake of this article, which is about Google Fonts, we’ll stick to that.
GOOGLE FONTS
Many, many websites on the Internet use Google Fonts. I dare say Google Fonts would be the most used font ‘CDN’ on the entire Internet. I don’t have figures to back this up, it’s just a hunch. I put it down primarily to the following:
Wide adoption
Many website Content Management Systems (CMS), like WordPress, have widely adopted and integrated Google Fonts.
Free and Open Source
Google Fonts are free to use and open source, and there is a massive variety available, so everyone has plenty of choice.
Ease of Use
To use Google Fonts on your website it’s generally just a matter of copying and pasting the boilerplate code into the header of your website and Bob’s your uncle.
Optimised
Because it’s Google, they can use their global Content Delivery Network (CDN) to deliver fonts quickly.
Updates
Google Fonts are updated regularly, adapting to changes and improvements in screens and devices.
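To give the “Ease of Use” point some shape, here is a minimal Python sketch that builds the kind of stylesheet link a site owner pastes into their page header. The css2 address and the family and display parameters are how I understand Google's public CSS API to work; the function name and the example font are my own choices for illustration, not anything official.

```python
from urllib.parse import urlencode

# Base address of the Google Fonts CSS API (version 2).
GOOGLE_FONTS_CSS = "https://fonts.googleapis.com/css2"


def font_link_tag(family: str, display: str = "swap") -> str:
    """Return an HTML <link> tag that asks Google Fonts for one font family.

    `font_link_tag` is a made-up helper name used only for this sketch.
    """
    # urlencode turns spaces into "+", the form Google Fonts URLs use.
    query = urlencode({"family": family, "display": display})
    return f'<link href="{GOOGLE_FONTS_CSS}?{query}" rel="stylesheet">'


print(font_link_tag("Open Sans"))
```

The printed tag is the whole integration: every visitor's browser that reads it goes off to Google to ask for the font.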
Here’s a quick breakdown on exactly how Google Fonts work:
A user (the person creating the website, for instance) chooses a font from the Google Fonts library
The website owner integrates the font/s into their website
When a visitor opens the website, the visitor's browser sends a request to Google’s servers to retrieve the font
Google serves fonts dynamically, only when they are needed
The visitor’s browser determines which font file format to load (WOFF2, WOFF, TTF) to ensure the fonts load as quickly as possible
The visitor’s browser displays the font in the way the developer intended!
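The format choice in the steps above can be pictured as a simple preference list. This is only a sketch: the ordering (WOFF2 first, then WOFF, then TTF) just follows the formats named above, and the function and its input are hypothetical, not a description of how any real browser or Google's servers are built.

```python
from typing import Iterable, Optional

# Preferred formats, most compact first (an assumed ordering for this sketch).
PREFERRED_FORMATS = ("woff2", "woff", "ttf")


def pick_font_format(supported: Iterable[str]) -> Optional[str]:
    """Return the most preferred format found in `supported`, or None."""
    supported_lower = {fmt.lower() for fmt in supported}
    for fmt in PREFERRED_FORMATS:
        if fmt in supported_lower:
            return fmt
    return None  # None of the known formats is supported.


print(pick_font_format(["TTF", "WOFF"]))  # prints "woff"
```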
The Conspiracy
When I first started thinking about all this, one of the first thoughts I had was, no one is going to believe this is possible. So naturally I got to work writing all my thoughts about it in preparation for publishing it! As I mentioned earlier, I had this crackpot idea about Google Fonts. Also, just to be clear, I don’t necessarily think this is plausible or that it actually exists, it’s simply me regurgitating ideas originally conceptualised within my brain. In the following sections there will be parts that are a mock-up of what a research paper into this might look like–don’t mistake it for actual research, because it isn’t. (I didn’t end up including this so I could publish the article sooner).
If you're logged into a website like Facebook, for example, who, if anyone, do you think would be able to read your private messages? The cynical and paranoid among you may answer something to the effect of, "oh, all Facebook employees would be able to read everyone's messages." This is quite a common frame of thinking, and sometimes it feels as though there are only two opinions. Either people trust that the company is doing the right thing, or they assume it's probably reading everyone's messages.
Do you understand everything that would be involved for employees to be able to read your private messages? I can't speak for Facebook directly, but a reputable company would have controls in place to ensure this can't happen, or can only happen in very limited circumstances. What if there was another way?
Let's take a really "out there" example to suit this really out there hypothesis, putting aside the ethics and legality. Say you're part of a police organisation that has been tasked with investigating data breaches, the ones you might hear about in the news. A source has led you to a particular forum online that is known for discussing and sharing these types of breaches. You've managed to narrow down your suspect to less than a handful of users on the forum.
The trouble is that the owner of this forum is giving you a hard time and is not willing to share information about their forum's users. Not without a warrant signed by a judge, which you don't have. You suddenly have a lightbulb-over-your-head moment, and your face lights up exactly like you would see in a cartoon.
You remember one time you were helping with building an internal website for your department, one that shouldn't be connected at all to the Internet. It was to be on a local network only. You were having some issues with the fonts loading properly. You put in a request to be able to use Google Fonts. In the request you included how all connections to the Internet would be blocked, except for those needed for loading Google Fonts. It was a long shot, but you knew it was the easiest way to solve the font loading issue you were having.
The request was denied, though not for the reason you were expecting. Below is the rejection:
You always thought it was weird that it was rejected under 37-B because Google was considered a trusted third-party provider for public, Internet-facing websites. Your sergeant wouldn't elaborate and refused to go into more detail. He kept pointing you to the memorandum and saying, "I strongly encourage you to fully understand the memorandum—I mean really understand it and its implications".
You must have read through the policy and memorandum a hundred times, and it never clicked until now. There is a snippet of the memorandum that has been playing over and over in your brain:
... is the potential for inadvertent data exposure through interactions...seemingly routine requests or interactions may inadvertently disclose sensitive information or patterns of behaviour
You understood that your browser can be singled out due to fingerprinting. There are many data points in your browser that make it unique out of all the millions of users across the web. No, this went deeper than that, there was something more, and you think you just figured out what it was.
When your browser requests a font from Google Fonts, Google optimises this process to ensure the experience isn't slowed down for the user. Most of the time it's not necessary to load the entire font set, or every letter on every single request. To ensure the experience is as fast as possible for the user, only the necessary sets of characters are sent. For example, if the character set {E, L, O, V} is frequently requested together, the word "LOVE" could be inferred.
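To make the idea concrete, here is a small Python sketch of how a piece of text collapses into the set of characters a font would need to draw it. It assumes, as the hypothesis does, that an observer could see exactly which characters were requested; the helper name is made up for illustration.

```python
def characters_needed(text: str) -> str:
    """Return the distinct letters in `text`, upper-cased and sorted.

    This mimics the "character set" an observer would see in this
    hypothetical: which letters were needed, nothing more.
    """
    letters = {ch.upper() for ch in text if ch.isalpha()}
    return "".join(sorted(letters))


print(characters_needed("love"))         # prints "ELOV"
print(characters_needed("Love, love!"))  # prints "ELOV"
```

Both calls give the same result, which is the catch: the character set records which letters appeared, not their order or how often they were used.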
Challenging, But is it Doable?
There are a number of practical issues to take into consideration when thinking about whether Google Font data collection and analysis in this way is possible. I feel as though I can’t in good faith throw Google under the bus, even with a completely hypothetical scenario.
Caching and Reuse
I think, at least from Google’s hypothetical perspective, caching and reuse of font data would be the biggest hurdle. A browser does not load the font data from Google Fonts every time a page is requested. In reality, caching stores the font or character set locally after the first download, so later requests never reach Google. Caching would need to either be permanently disabled or wiped during the periods when a person to be monitored would be online. In theory it is certainly possible, though it could potentially have a big impact on performance, noticeable at the very least.
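A toy Python sketch shows why caching gets in the way. It models a local cache sitting between the reader and the font server, and keeps a list of the only requests the server would ever see. The class and its names are invented for this illustration and don't describe any real cache.

```python
class FontCache:
    """A toy local cache for font character sets (hypothetical)."""

    def __init__(self) -> None:
        self._stored = set()
        # Only these requests would be visible to the font server.
        self.upstream_requests = []

    def load(self, charset: str) -> str:
        """Serve `charset` locally if possible, else fetch it upstream."""
        key = "".join(sorted(set(charset.upper())))
        if key in self._stored:
            return "cache hit"
        self._stored.add(key)
        self.upstream_requests.append(key)
        return "fetched"


cache = FontCache()
print(cache.load("LOVE"))       # prints "fetched"
print(cache.load("love"))       # prints "cache hit"
print(cache.upstream_requests)  # prints "['ELOV']"
```

After the first fetch, every repeat is answered locally, so the would-be observer sees one request where the reader made two.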
Non-sequential Characters
When we’re typing words, in general we type the letters in the correct order. However, when the character sets are downloaded from Google Fonts, they’re not necessarily downloaded in the same order. This ups the complexity of the analysis task significantly.
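Here is a rough Python sketch of what that extra complexity looks like. Given only an unordered set of letters, it lists every word from a small word list that uses exactly those letters. The word list is made up for the example; a real analyst would need a far bigger one, which only makes the ambiguity worse.

```python
from typing import Iterable, List


def candidate_words(observed: str, wordlist: Iterable[str]) -> List[str]:
    """Return the words whose distinct letters match `observed` exactly."""
    target = frozenset(observed.upper())
    return sorted(w for w in wordlist if frozenset(w.upper()) == target)


# A tiny, made-up word list for illustration only.
WORDS = ["love", "vole", "evolve", "level", "solve", "lovely"]

print(candidate_words("ELOV", WORDS))  # prints "['evolve', 'love', 'vole']"
```

Three different words fit the same four letters, and that is with six words to choose from. Knowing the context, as in the forum scenario, is what would let an analyst shrink the word list to something workable.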
Out-of-Context
In the scenario that I posed earlier, we knew the context of the font analysis–data breach forums. When the context is known, it makes the analyst’s job a lot easier because they will have a general idea of the types of words that are going to come up. It would be much more difficult to achieve if the context was not known.
Repetition Repetition
Some letters will appear a lot and be repeated often, like vowels. These letters add little unique value.
What’s Possible?
Let’s ignore the challenges listed above and focus our attention on what the user data might yield. The possibilities are endless, but not really. Looking into the potential Google Font security concerns, I believe there are two primary avenues that would be the most likely.
Artificial Intelligence (AI) / Machine Learning (ML)
Google has vast amounts of data they’ve been collecting over the years. If we have a look at the table below, it really puts into perspective the number of times Google Fonts are actually used:
Yes, that ‘T’ to the right-hand side of the numbers stands for trillion. It’s safe to assume that Google has enough data to do pretty much whatever they want with it.
If Google were to utilise AI / ML models, it’s possible that at the very least they could make some broad inferences and identify patterns.
Behavioural Analysis
By analysing font request patterns alongside other metadata such as IP addresses, browser user agents and so on, it could be possible to infer any number of things.
Concluding Remarks
This is a completely hypothetical, made-up from my own brain folds, deliberate ‘conspiracy’ I thought I’d share. Google has been in its fair share of privacy-invading, user-trust-violating scenarios and I really don’t think what I’ve presented here is completely out of the realms of possibility for Google. On the other hand, we have to ask ourselves, what does Google have to gain from doing this? The amount of effort, resources and time it would take to get even a small amount of useful data would be significant. It would have to be a highly targeted operation, at the level of a state adversary, to warrant such attention. Oh well, another thought for another day!
That’s it for now–as always, good luck, stay safe and be well.
And, yes, the appeal of a custom font!
[even if they are not downloaded a trillion times].
I had wondered if Google was capable of something like this since they acquired Blogger and fonts became an issue there.
[Though one mostly uses the system fonts and/or platform-agnostic fonts - platform-agnostic was really important to me when switching from Mac to Windows to Mac again and through various iThings and Androids].
Cambridge Analytica of typography?