Why ‘dark social’ and AI traffic make it harder than ever to make sense of website analytics

An increasing challenge for those of us working in the world of website analytics is the impact on traffic from Artificial Intelligence (AI) tools such as ChatGPT and Gemini. The traffic – or as I will come on to discuss, the lack of traffic – from such platforms is a relatively new phenomenon. However, it adds to other relatively recent challenges such as a rise in what is known as ‘direct’ traffic as well as the mysterious sounding source of traffic known as ‘dark social’.

Putting things in buckets

Let me back up a bit. Website analytics platforms routinely try to categorise traffic to web pages into different ‘buckets’ based on what they know about how someone arrived on a website.

Analytics platforms can tell you that a page view on your website originated from someone who arrived using a search engine, or from a public post on LinkedIn, or that your site was linked to from another website such as Wikipedia.

These different buckets all have special names, but the one that is occupying more of my time lately is the bucket called ‘direct’ traffic.

Historically, this captured traffic from people who went ‘directly’ to your website by typing an address into their browser or by clicking on a browser bookmark. But these days, the ‘direct’ bucket of traffic captures a lot more than that.

A better choice of name would be ‘untraceable’. If Google Analytics (and other website analytics platforms) can’t detect how the visitor came to your website, that traffic gets put in the ‘direct’ bucket.

This means that direct traffic can also include things such as clicks on links in emails, or on links in most messaging apps. Other sources of direct traffic include links on platforms like Apple News or links inside private Facebook groups.
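To make the bucketing idea concrete, here is a minimal sketch in Python of how a referrer might be sorted into a bucket. It is not how Google Analytics or any other platform actually does it: the function name, the bucket names and the short host lists are all my own assumptions for illustration, and real platforms use far longer lists and extra signals such as campaign tags.

```python
from urllib.parse import urlparse

# Hypothetical, deliberately short host lists (assumptions for illustration only).
SEARCH_HOSTS = {"google.com", "bing.com", "duckduckgo.com"}
SOCIAL_HOSTS = {"linkedin.com", "facebook.com"}


def classify_visit(referrer):
    """Return a bucket name for a visit, given its referrer URL (or None)."""
    if not referrer:
        # No referrer at all: typed address, bookmark, email client, messaging app...
        return "direct"
    host = urlparse(referrer).hostname or ""
    if host.startswith("www."):
        host = host[4:]
    if host in SEARCH_HOSTS:
        return "search"
    if host in SOCIAL_HOSTS:
        return "social"
    if host:
        # Any other recognisable website counts as a plain referral.
        return "referral"
    # The referrer was present but unparseable, so treat it as untraceable.
    return "direct"
```

The point to notice is that ‘direct’ is simply the fallback: anything without a usable referrer ends up there, whatever its real origin.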

The dark side clouds everything

‘Dark social’ is the broad (and nebulous) term that has emerged to describe a subset of this untraceable direct traffic. It was coined by Alexis Madrigal in 2012 in his blog post Dark social: we have the whole history of the web wrong.

It is typically used to describe traffic that might come from some social media platforms and messaging apps. WhatsApp is very likely a big source of ‘dark social’ traffic for many websites.

Because direct traffic is – by its very nature – untraceable, you can’t really ever know where it came from. One telltale sign of dark social traffic is that it goes to a specific page that might have been shared by someone, e.g. a news story or blog post.

It is also probably more likely to have come from a mobile device, as most social media is still consumed on phones. If you didn’t know, analytics platforms can track a lot of data about the technology used to visit a web page (desktop PC vs mobile phone, Mac vs Windows, Firefox vs Chrome etc.).

There are other signals that can also be used to potentially identify subsets of direct traffic, but ultimately this is all just educated guessing and you can never know for sure.
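As a rough illustration of that educated guessing, the two signals above could be combined into a simple check like the one below. This is only a sketch: the function name, its parameters and the ‘direct’ and ‘mobile’ labels are hypothetical, and treating mobile as a requirement is a simplification, since plenty of shared links get opened on a desktop.

```python
def looks_like_dark_social(bucket, landing_path, device):
    """Guess whether a visit might be dark social. A heuristic, not a fact."""
    # Only untraceable traffic is a candidate in the first place.
    is_direct = bucket == "direct"
    # A deep page (anything but the homepage) is more likely to have been shared as a link.
    is_deep_page = landing_path.strip("/") != ""
    # Most social media and messaging is consumed on phones.
    is_mobile = device == "mobile"
    return is_direct and is_deep_page and is_mobile
```

A visit flagged this way is best reported as ‘presumed’ dark social, never as confirmed.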

Rise of the machines

AI is rapidly changing the landscape of how website traffic is captured. First and foremost, if you run an information-heavy website with lots of resources, then AI tools might be leading to a big decline in your traffic.

This is simply because many more people are getting their answers from an AI chatbot and no longer need to click through to a website to find out more. This is assuming that chatbots provide a link to your site which may not always be the case…and even if they do, that link might be very easy to miss.

A lot of AI tools leave a digital fingerprint that enables web analytics platforms to put them in the ‘referral’ bucket of traffic. I.e. where traffic to your site was a referral from another website or tool.

A recent update to the website analytics platform that I routinely use (Matomo Analytics) has started capturing this known AI traffic in a new ‘AI Assistants’ bucket which lets me see traffic from popular tools such as ChatGPT, Copilot, Perplexity etc.

This is a screengrab of Matomo Analytics’ ‘AI Assistants’ data. This is displayed as a simple table with various AI tools in the first column (ChatGPT, Copilot, Gemini are the first three) and then associated columns of data such as ‘Visits’.

AI Assistants traffic as it appears in Matomo Analytics

This traffic is growing rapidly but doesn’t neatly explain all AI-related traffic. If you use Google’s new AI mode in their search interface and then click through to a website, this is still captured in website analytics platforms as search engine traffic. But if you use a tool like Siri or Alexa to open a webpage, this might show up as direct traffic.

A tangled mess

All of this means that it is getting harder than ever to disentangle the various strands of traffic that bring people to your website. Dark social contributes an unknowable, and possibly increasing, subset of direct traffic and I imagine that the growth of WhatsApp is a big part of this.

AI tools – which are causing a reduction in traffic to many websites – might record what traffic they do generate in several different ways.

I’m finding that I’m increasingly using terms like ‘presumed WhatsApp’ traffic in some of my regular analytics reporting as the real answer about where our website traffic is coming from is increasingly ‘I don’t know’.

Regular recording of data is helpful as you can then start spotting trends that might reveal which areas of your website are seeing differences in traffic.
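If you record visit counts per section at regular intervals, spotting those differences can be as simple as comparing two periods. Here is a minimal sketch in Python; the section names and figures are made up purely for illustration, and the function name is my own.

```python
def percentage_change_by_section(previous, current):
    """Compare visit counts per site section between two periods."""
    changes = {}
    for section, before in previous.items():
        if before == 0:
            continue  # no baseline to compare against
        after = current.get(section, 0)
        changes[section] = round((after - before) * 100 / before, 1)
    return changes


# Made-up figures, not real data.
last_year = {"/guides": 1000, "/news": 400, "/about": 100}
this_year = {"/guides": 700, "/news": 420, "/about": 100}

changes = percentage_change_by_section(last_year, this_year)
# The section with the largest percentage fall is the first place to investigate.
biggest_drop = min(changes, key=changes.get)
```

A comparison like this won’t tell you why a section has dropped, but it does tell you where to start looking.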

I’d also recommend using Google Search Console if you are concerned about AI tools displacing and replacing traditional search engine traffic. I’ve already noticed that the biggest drops in search engine traffic are to those sections of the website which are ‘informationally rich’, i.e. the content that is more likely to have been regurgitated as part of an AI tool’s answer to someone’s question.

Good luck with your website analytics detective work…it’s a jungle out there.