AOL Leak: Deep-diving into hundreds of thousands of users' personal search histories
In 2006, the internet witnessed one of its first great data scandals. Not through a hack, but through a colossal mistake. AOL publicly released a "research dataset" containing the search histories of roughly 660,000 users.
We will start by exploring this data through a standard aggregated approach to understand its limits, before diving into the raw logs to uncover the heavy, deeply personal stories hidden behind the numbers. The aggregated section was deliberately kept simple. The point is not to showcase the analysis, but to make the raw data shine brighter by contrast.
If I did my job correctly, you'll understand how much aggregated analysis can destroy information that was already speaking for itself.
We are diving into people's intimacy here. Although I won't dwell on the most disturbing things I've found, some themes remain very difficult to swallow. The first part of this article is focused on the statistical challenges, so it's safe to read. The sensitive content will be flagged with a Trigger warning.
The naive part: The Aggregated Data Approach
Preparing the Data
The raw dataset is surprisingly clean; splitting on tabs was enough to load it into a DataFrame. At first glance, it looks like any standard Kaggle dataset: plenty of rows, but not much soul.
A quick sum-up:
- 660,000 unique users.
- 3,614,506 total queries (1,244,496 unique).
- On average, each unique query is repeated ~2.9 times.
- 1,935,613 queries led to a click, resulting in an overall Click-Through Rate (CTR) of ~53.5%.
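As a rough illustration of how figures like these can be derived, here is a minimal sketch using only the Python standard library. The column names (`AnonID`, `Query`, `QueryTime`, `ItemRank`, `ClickURL`) and the rule "a non-empty `ClickURL` counts as a click" are assumptions made for this example, and the sample rows are invented; this is not the exact code behind the numbers above.

```python
import csv
import io

def summarize(tsv_text):
    """Compute headline figures from tab-separated log text.

    Assumes a header row with AnonID, Query and ClickURL columns.
    """
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    users, queries = set(), set()
    total = clicked = 0
    for row in reader:
        total += 1
        users.add(row["AnonID"])
        queries.add(row["Query"])
        if row["ClickURL"]:  # assumption: non-empty URL means a click
            clicked += 1
    return {
        "users": len(users),
        "total_queries": total,
        "unique_queries": len(queries),
        "repeat_factor": total / len(queries),
        "ctr": clicked / total,
    }

# Invented sample rows, only to show the expected layout.
SAMPLE = (
    "AnonID\tQuery\tQueryTime\tItemRank\tClickURL\n"
    "1\tgoogle\t2006-03-01 10:00:00\t1\thttp://www.google.com\n"
    "1\tgoogle\t2006-03-01 10:05:00\t\t\n"
    "2\tmap quest\t2006-03-02 09:00:00\t1\thttp://www.mapquest.com\n"
    "2\tweather\t2006-03-02 09:10:00\t\t\n"
)
stats = summarize(SAMPLE)
```

The same arithmetic applied to the real totals gives 3,614,506 / 1,244,496 ≈ 2.9 repetitions per unique query and 1,935,613 / 3,614,506 ≈ 53.5% CTR.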
To make things a bit more interesting, I tried to categorize the queries into themes. That's where the challenge began.
The Normalization Wall
I started by normalizing the queries (lowercasing and removing accents) and aggregating them by "blueprint." By sorting the words within each query alphabetically, I could group them by signature. For instance, `google maps` and `maps google` share the exact same blueprint.
Even then, this only reduced the number of unique queries by about 37,000 (from 1,244,496 to 1,207,707). I was still left with more than 1.2 million "blueprint" queries to deal with.
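The blueprint idea can be sketched in a few lines. This is a minimal version under my own assumptions (accent removal via Unicode decomposition, whitespace tokenization); the function name `blueprint` is hypothetical and the actual normalization may have differed in its details.

```python
import unicodedata

def blueprint(query):
    """Lowercase a query, strip accents, and sort its words alphabetically."""
    # NFKD splits accented letters into base letter + combining mark.
    decomposed = unicodedata.normalize("NFKD", query.lower())
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Sorting the words makes word order irrelevant to the signature.
    return " ".join(sorted(no_accents.split()))

same_signature = blueprint("google maps") == blueprint("Maps Google")
```

Note that this signature is deliberately lossy: word order and accents are discarded, which is exactly the kind of information loss discussed later in the article.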
The Statistical Trap
One might argue that, according to the "sacred laws" of statistics and probability, sampling 100k queries at random should provide a representative sample of the vocabulary needed to build a categorization dictionary. But it doesn't work that way. In human search behavior, the most vital information is held by the least frequent queries. Look at the volume distribution:
- At 50% of total query volume, we've only covered 8.1% of the unique vocabulary.
- At 80%, we reach 40.4%.
- At 90%, we're at 70.1%.
- At 95%, we still only account for 85%.
The queries with the lowest frequencies are the ones holding all the interesting information. We already know the common uses of a search engine; what I'm interested in are the specific queries, and they are incredibly diverse. The distribution is far from normal: the data has a positive skew above 800, confirming that the vast majority of "the story" is hidden in the long tail of rare, highly specific searches.
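To make the coverage figures above concrete, here is one way such a measurement can be computed: rank unique queries from most to least frequent, then count how many of them are needed to reach a given share of total volume. The function name `vocabulary_coverage` and the toy data are illustrative assumptions, not the original analysis code.

```python
from collections import Counter

def vocabulary_coverage(queries, volume_share):
    """Fraction of unique queries needed to reach `volume_share` of all
    query volume, taking the most frequent queries first."""
    counts = sorted(Counter(queries).values(), reverse=True)
    target = volume_share * sum(counts)
    running = 0
    for used, count in enumerate(counts, start=1):
        running += count
        if running >= target:
            return used / len(counts)
    return 1.0

# Toy log: one dominant query and a short tail of rare ones.
toy = ["google"] * 6 + ["ebay"] * 2 + ["rare query a", "rare query b"]
half = vocabulary_coverage(toy, 0.5)
most = vocabulary_coverage(toy, 0.8)
```

Even in this tiny example, half of the volume is covered by a quarter of the vocabulary; on the real data the imbalance is far more extreme.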
Bruteforcing my way into a categorization dictionary
I spent an absurd amount of time defining regex patterns based on what I could find within the extremely numerous yet infrequent queries (those with a frequency of 1 or 2). I did the same for the mid-frequency queries. The most common queries, however, were categorized as "Other," as they rarely contain anything of interest.
Although this approach makes the overall data much more understandable, allowing us to finally build some analysis, it has significant drawbacks:
- False Positives: A query like `how to kill someone` could be categorized as "Reference & How-to" simply because it matches the `how to` pattern.
- The Precision Trap: You either reduce the data to a small set of categories to make aggregation possible (and lose vital information), or you use a multi-categorical approach and end up with so many unique combinations that you can no longer extract a meaningful "high-level" perspective.
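The false-positive problem is easy to reproduce with a toy rule set. The two patterns below are hypothetical stand-ins for the real dictionary, which was much larger; the first matching pattern wins and everything else falls into "Other."

```python
import re

# Hypothetical patterns for illustration; the first match wins.
PATTERNS = [
    ("Reference & How-to", re.compile(r"^how (to|do i)\b")),
    ("URL Navigation", re.compile(r"^www\.|\.(com|net|org)$")),
]

def categorize(query):
    """Return the label of the first matching pattern, else 'Other'."""
    for label, pattern in PATTERNS:
        if pattern.search(query):
            return label
    return "Other"

labels = {q: categorize(q) for q in (
    "how to kill someone",  # matched by the prefix, meaning ignored
    "google.com",
    "revenge tactics",
)}
```

The first query lands in "Reference & How-to" purely because of its prefix. Adding more patterns moves the problem around without solving it, since the rule only ever sees surface text.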
I always advocate for a high-granularity approach, which is straightforward when dealing with numerical values. But here, we are talking about strings and sentences. Statistics can't truly understand context beyond basic patterns. Even sophisticated token clustering reaches its limits fairly quickly. After all, we all know how "dumb" an LLM can be when context gets messy.
Anyway, here are some numbers:
View categorization results table
| Category | Total Queries | Unique Users | % Share | Avg Queries / User |
|---|---|---|---|---|
| Other | 1,802,226 | 54,800 | 49.86% | 32.89 |
| URL Navigation | 576,839 | 51,658 | 15.96% | 11.17 |
| Technology & Computers | 183,435 | 29,592 | 5.07% | 6.20 |
| Entertainment (Movies/TV) | 106,451 | 15,361 | 2.94% | 6.93 |
| Adult & Sexual | 94,610 | 7,594 | 2.62% | 12.46 |
| Travel & Tourism | 68,164 | 11,946 | 1.89% | 5.71 |
| Health & Medical | 66,732 | 9,846 | 1.85% | 6.78 |
| Education | 65,019 | 10,044 | 1.80% | 6.47 |
| Automotive | 64,351 | 9,799 | 1.78% | 6.57 |
| Fashion & Beauty | 49,159 | 7,574 | 1.36% | 6.49 |
| Shopping & E-commerce | 45,951 | 10,636 | 1.27% | 4.32 |
| Real Estate & Housing | 43,124 | 7,746 | 1.19% | 5.57 |
| Finance & Banking | 42,561 | 9,207 | 1.18% | 4.62 |
| Sports | 41,405 | 7,555 | 1.15% | 5.48 |
| Reference & How-to | 34,238 | 9,044 | 0.95% | 3.79 |
| Home, Garden & DIY | 34,157 | 6,171 | 0.94% | 5.54 |
| Food & Restaurants | 33,961 | 6,385 | 0.94% | 5.32 |
| Gaming | 33,582 | 6,248 | 0.93% | 5.37 |
| Pets & Animals | 32,233 | 5,089 | 0.89% | 6.33 |
| Government & Politics | 27,777 | 6,292 | 0.77% | 4.41 |
| Legal, Law & Crime | 23,005 | 4,456 | 0.64% | 5.16 |
| Employment & Jobs | 21,973 | 4,375 | 0.61% | 5.02 |
| Religion & Spirituality | 20,111 | 3,240 | 0.56% | 6.21 |
| Family & Parenting | 17,734 | 3,253 | 0.49% | 5.45 |
| Crafts & Hobbies | 12,506 | 2,084 | 0.35% | 6.00 |
| Dating & Relationships | 11,920 | 2,555 | 0.33% | 4.67 |
| Science | 11,477 | 2,703 | 0.32% | 4.25 |
| News & Media | 10,498 | 2,967 | 0.29% | 3.54 |
| Weather | 9,942 | 3,331 | 0.28% | 2.98 |
| People Search | 7,306 | 2,115 | 0.20% | 3.45 |
| Occult & Paranormal | 5,821 | 1,144 | 0.16% | 5.09 |
| Music | 5,554 | 1,769 | 0.15% | 3.14 |
| Home Improvement | 4,010 | 991 | 0.11% | 4.05 |
| Social Networking | 4,005 | 1,392 | 0.11% | 2.88 |
| Military & Weapons | 3,048 | 450 | 0.08% | 6.77 |
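
The columns of this table are tied together by simple arithmetic: the share is a category's query count divided by the total, and the average is the query count divided by the number of unique users (for "Other," 1,802,226 / 54,800 ≈ 32.89). A minimal sketch of that aggregation, assuming each query has already been labeled with a single category (the function name and input shape are my own for this example):

```python
from collections import defaultdict

def category_table(rows):
    """Aggregate (user_id, category) pairs, one per query, into table rows
    sorted by descending query count."""
    totals = defaultdict(int)
    users = defaultdict(set)
    for user_id, category in rows:
        totals[category] += 1
        users[category].add(user_id)
    grand_total = sum(totals.values())
    table = []
    for category, total in sorted(totals.items(), key=lambda kv: -kv[1]):
        n_users = len(users[category])
        table.append({
            "category": category,
            "queries": total,
            "unique_users": n_users,
            "share_pct": round(100 * total / grand_total, 2),
            "avg_per_user": round(total / n_users, 2),
        })
    return table

# Invented sample: user 1 asks two "Other" queries, user 2 one of each.
sample = [(1, "Other"), (1, "Other"), (2, "Other"), (2, "Automotive")]
result = category_table(sample)
```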
Few Observations
- The "Google in Google" Syndrome: About 16% of queries were simply users typing a URL to access a website. Yes, searching for `google` inside Google was already a common habit in 2006.
- The Effort Gap: On average, users need fewer queries to find a car than they do to find porn (~6.57 queries per user for "Automotive" vs. 12.46 for "Adult & Sexual").
- Geographical Markers: If it wasn't obvious already, the "Military & Weapons" category, and the specific nature of those searches, clearly indicates we are looking at a U.S.-based dataset.
Now look at that table again. What does it actually tell us? That people in 2006 searched for porn, tech support, travel, and health information. That nearly half the data falls into a catch-all "Other" bin that resists any meaningful label. That the categories which sound the most interesting on paper, "Family & Parenting," "Dating & Relationships," "Legal, Law & Crime," together account for less than 1.5% of total volume.
This is the fundamental problem with aggregate analysis on behavioral data: the numbers confirm what you already assumed and obscure everything you didn't. The table above is a perfect mirror of generic internet usage. It tells you absolutely nothing about the 660,000 individual human beings behind it. You could show this table to anyone with a basic understanding of the internet and they would shrug. There is absolutely nothing new under the sun here.
This is why I titled this chapter the "naive part": not because the technical process wasn't a challenge (it was actually quite fun to do), but because of the sheer lack of narrative value in the results.
The revealing part: Why granular data holds so much more information
Statistics is the wrong tool for the job
Have you ever read letters from soldiers to their families during wartime? Even though these men were technically identical, same uniform, same age, same rank, same frontline and same tragic destiny, each one possessed a singular narrative. They all spoke of the same themes and likely followed the same structure, yet every letter touches you in a unique way.
The same thing is happening here. Nothing truly interesting can emerge from such a "living" dataset through statistical analysis alone. Transforming written intent, desperate questions, or daily habits into raw numbers kills the original motive and the story behind it. Hell, even normalizing accents might have killed some information here.
Consider what my categorization engine does to the query `please shoot me`: it files it under "Military & Weapons" because it matches the pattern for firearms. In reality, it is the cry of a woman on the verge of homelessness who has exhausted every other option. Or consider `how to cut wrists`, which my regex confidently labels "Reference & How-to" because of the `how to` prefix. The statistical model sees a pattern. The human sees someone about to end their life. This is not an edge case or a minor inaccuracy. This is the entire point: the most important queries in this dataset are precisely the ones that statistical categorization will always get wrong, because their meaning lives in context, in sequence, in the silence between two searches, not in the words themselves or how many characters the query holds.
Yes, the method used here is far from advanced, but reaching for more advanced tools would just be a waste of time, because this isn't the right approach. To understand the AOL leak, you have to stop looking at the forest and start looking at the trees, one UserID at a time.
Let's delve into personal logs
To exemplify, I've summarized some heavy stories I found. I've flagged the sensitive ones with a trigger warning. Note that it involves interpretation on my side, and I might be wrong.
I highly recommend clicking View search logs under each user introduction and reading the actual queries yourself. Some will hit you like a truck. But you'll understand why reading a government report on "exclusion and poverty" tells you less about a human being than thirty lines of their search history.
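For readers who want to rebuild this kind of per-user view themselves, here is a minimal sketch of how a "First Date / Query / Count" log can be assembled for a single UserID. The input shape (tuples of user ID, query, and a `YYYY-MM-DD HH:MM:SS` timestamp) and the function name `user_log` are assumptions for this example; the sample rows are invented.

```python
def user_log(rows, user_id):
    """Return (first_date, query, count) tuples for one user,
    ordered by the date each query first appeared."""
    first_seen, counts = {}, {}
    for uid, query, timestamp in rows:
        if uid != user_id:
            continue
        date = timestamp[:10]  # keep only the YYYY-MM-DD part
        if query not in first_seen or date < first_seen[query]:
            first_seen[query] = date
        counts[query] = counts.get(query, 0) + 1
    # sorted() is stable, so same-day queries keep their original order.
    return sorted(
        ((first_seen[q], q, counts[q]) for q in counts),
        key=lambda row: row[0],
    )

rows = [
    ("42", "google", "2006-03-03 10:00:00"),
    ("42", "map quest", "2006-03-04 09:00:00"),
    ("7", "weather", "2006-03-04 09:30:00"),
    ("42", "google", "2006-03-05 11:00:00"),
]
log = user_log(rows, "42")
```

Nothing is categorized or scored here. The only processing is grouping repeats, so every query stays readable in the user's own words.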
User 2708: The Humiliated Lover
This is the story of a woman who got dumped by her companion, probably in a very ungentlemanly way. She's literally grinding the Maslow pyramid of sorrow query by query:
- She seeks active revenge tactics: ideas, methods. She seems determined to send `free gay content` to his mailbox and cover her tracks with `erase hardrive`. (Ironically, her data was leaked forever.)
- Her mood shifts at certain moments, when she listens to James Blunt or gets into a dating app, before going back, determined, to her pursuit of revenge.
- Your interpretation is part of the story, so I suggest reading the queries one by one.
View search logs (User 2708)
| First Date | Query | Count |
|---|---|---|
| 2006-03-01 | revenge tactics | 15 |
| 2006-03-01 | marblehead massachusetts library | 4 |
| 2006-03-01 | the woman's book of revenge | 3 |
| 2006-03-01 | alt.revenge | 4 |
| 2006-03-01 | dirty tricks for chicks | 1 |
| 2006-03-01 | out of business | 1 |
| 2006-03-01 | out of business by dennis fieriy | 1 |
| 2006-03-01 | 21st century revenge | 1 |
| 2006-03-01 | victor santoro's samples | 1 |
| 2006-03-01 | stories or samples of revenge | 3 |
| 2006-03-01 | belered.ddo.jp | 1 |
| 2006-03-01 | encyclopedia of revenge | 7 |
| 2006-03-01 | voice changer | 4 |
| 2006-03-01 | underground help for revenge | 3 |
| 2006-03-01 | help for revenge | 3 |
| 2006-03-01 | real-time spy | 2 |
| 2006-03-02 | mass lottery | 1 |
| 2006-03-02 | bravin home services | 4 |
| 2006-03-03 | how to humiliate someone | 1 |
| 2006-03-03 | cd's to order and buy | 1 |
| 2006-03-03 | 12 cd's for the price of one | 1 |
| 2006-03-03 | bill me pay later for cd's | 2 |
| 2006-03-03 | scams to play on people | 1 |
| 2006-03-03 | playing tricks on someone | 1 |
| 2006-03-03 | how humiliate someone | 3 |
| 2006-03-03 | how to make someone misreable | 3 |
| 2006-03-03 | how to drive someone crazy | 3 |
| 2006-03-03 | how to get revenge on an old lover | 3 |
| 2006-03-03 | i hate my ex boyfriend | 1 |
| 2006-03-03 | how to really make someone hurt for the pain they caused to someone else | 1 |
| 2006-03-03 | sean p parsons | 1 |
| 2006-03-03 | things to send to emails that are free | 5 |
| 2006-03-03 | google.com | 2 |
| 2006-03-03 | columbia house | 1 |
| 2006-03-03 | google | 8 |
| 2006-03-04 | wetcircle.com | 6 |
| 2006-03-04 | free porn mpegs | 1 |
| 2006-03-04 | map quest | 1 |
| 2006-03-04 | advice on how to get revenge on an old lover | 1 |
| 2006-03-04 | hate.com | 1 |
| 2006-03-04 | advice from women who have seeked revenge on old lovers | 3 |
| 2006-03-05 | makehimsweat.com | 1 |
| 2006-03-05 | makehimsuffer.com | 2 |
| 2006-03-05 | makehimpay.com | 2 |
| 2006-03-05 | makehimsweat.net | 1 |
| 2006-03-05 | makehimwonder.com | 1 |
| 2006-03-05 | makehimpay.net | 26 |
| 2006-03-05 | gettingback.net | 1 |
| 2006-03-05 | gettingrevenge.net | 1 |
| 2006-03-05 | gettingrevenge | 1 |
| 2006-03-05 | makehimsuffer | 1 |
| 2006-03-05 | make him suffer | 1 |
| 2006-03-05 | how to make an old lover suffer | 2 |
| 2006-03-05 | www.damnedgames.com | 1 |
| 2006-03-05 | evo.qksrv.net | 2 |
| 2006-03-05 | verizon.net | 7 |
| 2006-03-05 | a desert terrain where fifteen different salts crunch underfoot where is this place | 1 |
| 2006-03-05 | pleistocene islands | 1 |
| 2006-03-05 | mailman who delivered in the clutch. | 1 |
| 2006-03-05 | people who have been dumped and got revenge | 2 |
| 2006-03-05 | commercial sites to give email addresses | 1 |
| 2006-03-05 | how to say goodbye hurtfully | 1 |
| 2006-03-05 | voice ringtones | 3 |
| 2006-03-05 | sounds of different voices | 1 |
| 2006-03-05 | voices i can download | 4 |
| 2006-03-05 | voice ringtones only | 6 |
| 2006-03-07 | anonymous sms text messenger | 8 |
| 2006-03-08 | lolitampegs.com | 1 |
| 2006-03-08 | porn com | 1 |
| 2006-03-08 | makehimsuffer.net | 1 |
| 2006-03-09 | robert gray | 1 |
| 2006-03-09 | david gray | 1 |
| 2006-03-09 | hurting from an old lover | 1 |
| 2006-03-10 | black and white photos | 1 |
| 2006-03-11 | to send anonymous text | 4 |
| 2006-03-11 | artist | 1 |
| 2006-03-11 | music artist | 1 |
| 2006-03-11 | how to report child neglect in the state of new hampshire | 1 |
| 2006-03-11 | free info on gay life | 1 |
| 2006-03-11 | free into you can get the mail on gay life | 1 |
| 2006-03-11 | free gay magazine | 1 |
| 2006-03-11 | free gay magazines | 4 |
| 2006-03-11 | free gay literture | 1 |
| 2006-03-11 | where to get free mailing on gay life | 4 |
| 2006-03-11 | free articles on gay life that can be mailed to me | 3 |
| 2006-03-11 | messages on gay life that can be emailed to me | 1 |
| 2006-03-11 | sites for men haters | 4 |
User 1244374: The Soon-to-be Homeless Senior
Trigger Warning: Suicidal impulse.
This is where you realize just how far raw data is from capturing deep human despair.
User 1244374 is a senior citizen, likely a woman in her 50s or 60s based on specific queries like `woman 50s will work for room` and `knitters wanted`, who is on the verge of losing everything. The search logs paint a devastating picture of someone spiraling into poverty and isolation:
- Logistics of eviction: She is actively looking for cheap or free storage in New York to keep whatever belongings she has left before hitting the streets.
- Desperate hustle: She is searching for any source of income (from security guard to factory jobs), charitable housing, or simply a room in exchange for housekeeping work.
- Darkest thoughts: Mixed in with searches for senior job banks and Catholic charities are active, escalating cries for help: `how to commit suicide`, `why are suicide methods not available`, and `please shoot me`.
If you follow the timeline to the very end, the tragedy deepens. After a severe peak in suicidal queries on March 26th and 27th (`how di i end this life`, `what will stop you heart`, `i really want help`), the final day of her logs, March 28th, shifts entirely to cold, defeated logistics. Her very last recorded query, `block my primary screen name`, was the equivalent of "going stealth" in the AOL system back in the day. It means digitally disappearing. There are no more logs after that. We can only hope she ultimately found the help she was looking for.
What I found most disturbing is that the two searches run in parallel, with the same practical energy. She looks for a way to die the same way she looks for a job: methodically, persistently, as if both had become equally urgent needs.
View search logs (User 1244374)
| First Date | Query | Count |
|---|---|---|
| 2006-03-01 | sleeping pills | 2 |
| 2006-03-02 | www.plentyoffish..com | 15 |
| 2006-03-02 | www plenentyoffish.com | 10 |
| 2006-03-02 | free personals | 2 |
| 2006-03-02 | widows widowers | 1 |
| 2006-03-02 | marriage | 2 |
| 2006-03-02 | looking for a room to rent | 1 |
| 2006-03-02 | i need a job | 2 |
| 2006-03-02 | seniors | 2 |
| 2006-03-03 | www plentyoffish.com | 10 |
| 2006-03-03 | www.new york times.com | 2 |
| 2006-03-03 | jobs | 12 |
| 2006-03-03 | aarp | 1 |
| 2006-03-03 | aarp jobs | 1 |
| 2006-03-03 | wcbscares | 1 |
| 2006-03-03 | wcbs.cares | 2 |
| 2006-03-03 | depression | 1 |
| 2006-03-04 | roslong | 1 |
| 2006-03-04 | rooming houses | 1 |
| 2006-03-04 | rooming houses new york | 1 |
| 2006-03-04 | at 60 is life worth living | 1 |
| 2006-03-04 | i am being evicted | 1 |
| 2006-03-04 | cheap apartment wanted | 1 |
| 2006-03-04 | where is the cheapest place to live | 1 |
| 2006-03-05 | www.nyclottery.gov | 1 |
| 2006-03-05 | www.newyork lottery.org | 1 |
| 2006-03-05 | www.nylottery.org | 1 |
| 2006-03-06 | poor seniors | 1 |
| 2006-03-06 | driving schools | 2 |
| 2006-03-06 | driving licence | 1 |
| 2006-03-06 | learn to drive | 1 |
| 2006-03-07 | christianmingle | 1 |
| 2006-03-07 | suicide | 1 |
| 2006-03-07 | drugs | 3 |
| 2006-03-07 | how to commit suicide | 3 |
| 2006-03-07 | fatal if swallowed | 1 |
| 2006-03-07 | tallahasee | 1 |
| 2006-03-07 | florida | 1 |
| 2006-03-07 | where do most white people live in usa | 1 |
| 2006-03-07 | mail order brides | 2 |
| 2006-03-07 | channel2 | 1 |
| 2006-03-07 | wcbstv | 1 |
| 2006-03-08 | monsterjobs | 1 |
| 2006-03-08 | hotjobs | 1 |
| 2006-03-08 | knitting | 3 |
| 2006-03-08 | free storage | 4 |
| 2006-03-08 | www.storing stuff | 1 |
| 2006-03-08 | storing stuff in nyc | 1 |
| 2006-03-08 | free storing nyc | 1 |
| 2006-03-08 | public assistance - storage | 1 |
| 2006-03-08 | nyc sponsored free storage | 1 |
| 2006-03-08 | www.new york times jobs.com | 5 |
| 2006-03-08 | civil service jobs | 1 |
| 2006-03-08 | www.nyc | 1 |
| 2006-03-08 | civil service new york | 1 |
| 2006-03-08 | civil service jobs - new york | 1 |
| 2006-03-08 | guns | 1 |
| 2006-03-08 | usa civil war riffel | 1 |
| 2006-03-08 | hotel jobs | 1 |
| 2006-03-08 | manufacturing jobs | 1 |
| 2006-03-08 | factory jobs | 2 |
| 2006-03-08 | euthanisa | 1 |
| 2006-03-08 | www.jobs for seniors | 2 |
| 2006-03-08 | catholic church | 1 |
| 2006-03-09 | www.americas job bank.com | 1 |
| 2006-03-09 | job bank | 2 |
| 2006-03-09 | stop smoking | 1 |
| 2006-03-10 | www.america's job bank | 1 |
| 2006-03-10 | seniors job bank | 2 |
| 2006-03-10 | www.job bank | 1 |
| 2006-03-10 | end it all | 1 |
| 2006-03-10 | how to kill yourself | 2 |
| 2006-03-10 | destitute old people | 1 |
| 2006-03-10 | destitute old people new york | 1 |
| 2006-03-10 | www.new york jobs.com | 1 |
| 2006-03-10 | new york times jobs | 2 |
| 2006-03-11 | personals | 4 |
| 2006-03-13 | housing - seniors | 4 |
| 2006-03-13 | housing poor seniors | 1 |
| 2006-03-14 | cheap storage | 2 |
| 2006-03-14 | storage lockers | 1 |
| 2006-03-14 | storage | 3 |
| 2006-03-14 | www.public storage | 1 |
| 2006-03-14 | barter | 1 |
| 2006-03-14 | loot magazine | 1 |
| 2006-03-14 | new york loot | 1 |
| 2006-03-14 | older brides | 1 |
| 2006-03-14 | husbands wanted | 1 |
| 2006-03-14 | senior apt. sharing | 3 |
| 2006-03-14 | investment tools | 1 |
| 2006-03-14 | www seniors.com | 3 |
| 2006-03-14 | www.new yotk times jobs.com | 1 |
| 2006-03-14 | www.free cards.com | 2 |
| 2006-03-14 | free greetings | 1 |
| 2006-03-15 | public storage | 2 |
| 2006-03-15 | public storage new york | 3 |
| 2006-03-15 | manufacturing job | 1 |
| 2006-03-15 | factory jobs new york | 3 |
| 2006-03-15 | sewing jobs | 4 |
| 2006-03-15 | new york civil service | 1 |
| 2006-03-15 | nyc civil service | 6 |
| 2006-03-15 | hair research | 1 |
| 2006-03-15 | www. | 1 |
| 2006-03-16 | www.nyc civil service | 2 |
| 2006-03-16 | www.jobs | 3 |
| 2006-03-16 | www.senior job bank.com | 2 |
| 2006-03-16 | seniorjobbank | 4 |
| 2006-03-16 | www.new york times jobs | 1 |
| 2006-03-16 | www.vemmabuilder | 2 |
| 2006-03-16 | www.andre rieu.com | 2 |
| 2006-03-17 | public assistance | 2 |
| 2006-03-17 | dept. housing | 1 |
| 2006-03-17 | www.social security survivors.com | 2 |
| 2006-03-17 | www.ssi.gov | 1 |
| 2006-03-17 | www.mysocialsecurity.gov | 3 |
| 2006-03-17 | www.mysocialsecurity.org | 1 |
| 2006-03-18 | www.women shelter | 1 |
| 2006-03-18 | www.nyc.shelter | 1 |
| 2006-03-18 | wwwshelters.nyc | 4 |
| 2006-03-19 | goodwill | 7 |
| 2006-03-19 | peoplefindersplus | 1 |
| 2006-03-19 | unwanted | 1 |
| 2006-03-19 | www.nys.gov | 1 |
| 2006-03-20 | www.new york times jobs.com | 2 |
| 2006-03-20 | www1.whitehouse.gov | 1 |
| 2006-03-20 | dept. for aging | 2 |
| 2006-03-20 | dept. for aging new york | 3 |
| 2006-03-21 | humnycki shaw.ca | 1 |
| 2006-03-21 | desperate widow seeks room | 1 |
| 2006-03-21 | www.shelters.gov | 1 |
| 2006-03-22 | www.newyorkcityaptssinc.com | 2 |
| 2006-03-22 | www.seniorjobs.com | 2 |
| 2006-03-22 | www.mysocialsecurity.com | 2 |
| 2006-03-22 | room wanted 55 or older | 2 |
| 2006-03-22 | www.senior rent | 1 |
| 2006-03-22 | senior help line | 1 |
| 2006-03-23 | work from home | 2 |
| 2006-03-23 | work at home | 1 |
| 2006-03-23 | www.work at home | 1 |
| 2006-03-23 | www.work from home | 1 |
| 2006-03-23 | charitable organizations | 1 |
| 2006-03-23 | poor widows in us | 1 |
| 2006-03-23 | shelters | 1 |
| 2006-03-23 | where can i get a real suicide drug | 4 |
| 2006-03-23 | where can i get a gun | 1 |
| 2006-03-23 | rooms in new york.com | 2 |
| 2006-03-23 | cbs cares | 1 |
| 2006-03-23 | jobs at cbs | 1 |
| 2006-03-23 | www.cbc.com | 1 |
| 2006-03-23 | www.cbs.com | 1 |
| 2006-03-23 | www.cbs jobs.com | 1 |
| 2006-03-23 | jobs in hospitals | 2 |
| 2006-03-23 | service jobs | 2 |
| 2006-03-23 | www.share apt | 3 |
| 2006-03-23 | wia | 1 |
| 2006-03-23 | seniors helpline | 1 |
| 2006-03-23 | www.agingcarefl.org | 1 |
| 2006-03-23 | helpline for seniors | 1 |
| 2006-03-24 | security persons | 1 |
| 2006-03-24 | security training | 1 |
| 2006-03-24 | airport security | 1 |
| 2006-03-24 | security officer training | 1 |
| 2006-03-24 | security guard training | 1 |
| 2006-03-24 | courses in security guard | 1 |
| 2006-03-24 | be a security guard | 1 |
| 2006-03-24 | www.seniorjobbank | 7 |
| 2006-03-24 | senior jobs | 3 |
| 2006-03-24 | gun for hire | 1 |
| 2006-03-24 | new york coillition for homeless | 2 |
| 2006-03-24 | coillicion for homeless | 3 |
| 2006-03-24 | co-illicion for homeless | 1 |
| 2006-03-24 | coallicion for homeless | 2 |
| 2006-03-24 | co-alition for homeless | 1 |
| 2006-03-24 | coallition for homeless | 1 |
| 2006-03-24 | www.government grants.com | 1 |
| 2006-03-24 | www.government grants nyc | 1 |
| 2006-03-24 | new york coalition for homeless | 1 |
| 2006-03-25 | www.housingfirst net | 1 |
| 2006-03-25 | knitwear | 1 |
| 2006-03-25 | www.oodles of rooms. | 1 |
| 2006-03-25 | seniors want rooms | 1 |
| 2006-03-25 | room in exchange for work | 5 |
| 2006-03-25 | knitters wanted | 2 |
| 2006-03-25 | foreign dating | 2 |
| 2006-03-25 | personals uk | 1 |
| 2006-03-25 | www.seniors want rooms | 1 |
| 2006-03-26 | www.seniors meet | 1 |
| 2006-03-26 | personals. | 1 |
| 2006-03-26 | catholic charities | 1 |
| 2006-03-26 | charitible housing | 2 |
| 2006-03-26 | www.matchdoctor.cm | 3 |
| 2006-03-26 | i need help | 1 |
| 2006-03-26 | free room for work | 1 |
| 2006-03-26 | barter ny | 1 |
| 2006-03-26 | woman 50s will work for room | 1 |
| 2006-03-26 | work for free room | 1 |
| 2006-03-26 | the work house | 1 |
| 2006-03-26 | the work house ny | 1 |
| 2006-03-26 | ican knit | 2 |
| 2006-03-26 | handknitter wanted | 1 |
| 2006-03-26 | contract knitting | 1 |
| 2006-03-26 | live-in househeeper | 1 |
| 2006-03-26 | get something to end it all | 1 |
| 2006-03-26 | why are suicide methods not available | 2 |
| 2006-03-26 | help to do it | 1 |
| 2006-03-26 | www.oldagehomes | 1 |
| 2006-03-26 | i realy want it end this life | 1 |
| 2006-03-26 | seniors i want it to end | 1 |
| 2006-03-26 | www.hillary clinton.org | 1 |
| 2006-03-26 | united states senate | 1 |
| 2006-03-27 | usa what is the right color | 1 |
| 2006-03-27 | if you not jewish black | 1 |
| 2006-03-27 | drug dealers | 5 |
| 2006-03-27 | euthanasia | 2 |
| 2006-03-27 | can cold medication kill you | 3 |
| 2006-03-27 | what will stop you heart | 3 |
| 2006-03-27 | www.self storage | 1 |
| 2006-03-27 | ny cheap storage | 2 |
| 2006-03-27 | domestic storage | 1 |
| 2006-03-27 | storage ny | 2 |
| 2006-03-27 | free box storage | 1 |
| 2006-03-27 | how di i end this life | 1 |
| 2006-03-27 | adult needs room for work | 1 |
| 2006-03-27 | president bush | 1 |
| 2006-03-27 | free cheap storage | 1 |
| 2006-03-27 | domestic storage cheap | 1 |
| 2006-03-27 | mercy killing | 1 |
| 2006-03-27 | please shoot me | 2 |
| 2006-03-27 | i really want help | 1 |
| 2006-03-27 | the pope | 2 |
| 2006-03-27 | catholic church in new york | 1 |
| 2006-03-28 | www.stop and stor | 3 |
| 2006-03-28 | man with van | 1 |
| 2006-03-28 | movers | 1 |
| 2006-03-28 | nyc.rr.com | 1 |
| 2006-03-28 | www.village voice.com | 1 |
| 2006-03-28 | www.villagevoice.com | 1 |
| 2006-03-28 | liberty storage | 1 |
| 2006-03-28 | www.stop and stor.com | 1 |
| 2006-03-28 | storage for homeless | 1 |
| 2006-03-28 | storage for ny homeless | 1 |
| 2006-03-28 | block my primary screen name | 1 |
User 99322: The Parent's Burden
Is this a bipolar patient or a parent? The data eventually gives us the answer. While the user spends weeks researching `bipolar ii` and `mood disorder`, the queries on April 8th and May 3rd, `support groups for parents wth bipolar child` and `bipolar parenting`, reveal a guardian trying to help their teenager with everything they can find.
What makes this profile unique is the clear split in the search history. On one side, we see the cold, medical reality of managing a disorder: researching specialized clinics (like the Pfeiffer Center, known for biochemical approaches), exploring theories on mineral deficiencies (zinc and copper), and looking for support groups.
On the other side, there is a deep dive into holistic therapies and alternative healing. The user searches for yoga among friends, explores the healing power of animals, and looks up animal totems and spiritual animal portraits. They mix these spiritual concepts with grounding, traditional crafts like looking for an antique spinning wheel.
A possible reading: the timeline moves from clinical medicine to spirituality over the course of three months. It doesn't feel like randomness. It could be the slow exhaustion of conventional answers. When the Pfeiffer Center and the zinc supplements and the support groups haven't fixed your kid, maybe you start looking for something else. The animal totems and the paintings of garden ponds could be distractions. They could also be a parent rebuilding a world around a child they can't cure, trying to fill it with beauty and calm instead of, or alongside, prescriptions.
View search logs (User 99322)
| First Date | Query | Count |
|---|---|---|
| 2006-03-13 | symptoms of bipolar | 2 |
| 2006-03-19 | pfeiffer treatment center / bio chemical disturbances | 11 |
| 2006-03-24 | yoga among friends | 3 |
| 2006-03-30 | zinc defiency / copper defiency | 7 |
| 2006-04-06 | mood disorder | 3 |
| 2006-04-08 | support groups for parents wth bipolar child | 2 |
| 2006-04-08 | bipolar depression in adolecents | 3 |
| 2006-04-23 | antique spinning wheel | 5 |
| 2006-05-03 | bipolar parenting | 5 |
| 2006-05-06 | healing power of animals | 10 |
| 2006-05-06 | spiritual animal portraits / animal totems | 16 |
| 2006-05-13 | paintings of garden ponds / photos of birdhouses | 8 |
| 2006-05-19 | bipolar ii | 8 |
User 1367320: A woman, a search bar, and three months of freefall.
Trigger Warning: Suicide, toxic family environment.
This user's logs tell the story of a woman unraveling. They are also confusing: I'm not certain whether she has a daughter or whether she is questioning her own childhood.
Over three months, the search bar becomes a confessional: extreme health anxiety, suicidal ideation, addiction, self-medication, then the death of her daughter's father, and a brutal moment of self-awareness.
For the first six weeks, the keyboard is dominated by two obsessions that feed each other. The first is health anxiety. The word candida (fungal infection) appears dozens of times, alongside acidosis, blood pH, digestion, bloating, stool consistency, dehydration, and every dietary theory imaginable. This is clinical-grade hypochondria, a woman convinced her body is failing her, searching for answers at all hours.
The second starts on March 11th at 5:46 PM, when the word suicide enters the search bar for the first time. What makes these logs so hard to read is how naturally the two threads coexist. There is no clean break between the health searches and the suicidal ones. They share the same evenings, sometimes the same hour.
View search logs: March 11, the day the word appears
| Date & Time | Raw Query |
|---|---|
| 2006-03-11 14:53:34 | dehydration and hangover |
| 2006-03-11 15:17:51 | alcohol and depressant |
| 2006-03-11 16:19:37 | depression |
| 2006-03-11 16:28:05 | alcohol and acne |
| 2006-03-11 17:46:43 | suicide |
| 2006-03-11 17:56:43 | suicide and depression |
| 2006-03-11 18:10:27 | bipolar |
| 2006-03-11 18:14:16 | nervous breakdown |
| 2006-03-11 18:28:10 | ph balance and blood |
Hangover, alcohol as depressant, depression, suicide, bipolar, nervous breakdown, then straight back to blood pH. One continuous stream of consciousness bouncing between body panic and psychological freefall.
The suicide queries go silent for a month. Then, on April 12th, opiates enter the picture:
View search logs: April 12, self-medication
| Date & Time | Raw Query |
|---|---|
| 2006-04-12 22:20:04 | vicodine and depression |
| 2006-04-12 22:21:56 | vicodine and addiction |
| 2006-04-12 22:30:55 | candida and depression |
| 2006-04-12 22:39:38 | depression and vicodine |
| 2006-04-12 22:40:24 | vicodine helps depression |
| 2006-04-12 22:42:05 | opiates |
| 2006-04-12 22:43:27 | codeine |
| 2006-04-12 22:44:58 | vicodine and codeine |
| 2006-04-12 22:45:58 | what's in vicodin |
`vicodine helps depression` is someone justifying her own self-medication. And `candida and depression` appears right in the middle: the health obsession and the psychological crisis are the same person, the same evening, the same spiral. Five days later, the lowest point:
View search logs: April 17, the darkest moment
| Date & Time | Raw Query |
|---|---|
| 2006-04-17 20:06:58 | suicide |
| 2006-04-17 20:07:19 | suicide |
| 2006-04-17 20:09:23 | how to cut wrists |
| 2006-04-17 20:10:43 | health insurance |
`how to cut wrists`, followed one minute later by `health insurance`. The catastrophic and the mundane occupying the same minute.
In early May, she stops searching for what is wrong with her body and starts searching for what is wrong with her family.
View search logs: May 4-5, naming the dysfunction
| Date & Time | Raw Query |
|---|---|
| 2006-05-04 17:21:36 | dysfunctional family |
| 2006-05-04 17:27:16 | toxic parents |
| 2006-05-04 17:27:49 | toxic family |
| 2006-05-04 17:29:42 | adult children of alcoholics |
| 2006-05-05 21:47:26 | addictions |
| 2006-05-05 21:50:13 | codependent |
| 2006-05-06 01:52:04 | insecurity |
Dysfunctional family. Toxic parents. Adult children of alcoholics. Codependent. It reads like someone digging into her own history.
Then, on May 6th, something happens to her supposed daughter's father. The keyboard explodes. Over 80 searches in a single day, almost all centered on one theme: girls who lose their fathers. The same query reformulated 10, 15, 20 times.
View search logs: May 6th, the full day (40+ entries)
| Date & Time | Raw Query |
|---|---|
| 2006-05-06 17:50:09 | death of a parent |
| 2006-05-06 17:52:44 | fatherless daughters |
| 2006-05-06 17:55:21 | fatherless daughters |
| 2006-05-06 17:56:00 | fatherless women |
| 2006-05-06 17:59:10 | fatherless women |
| 2006-05-06 18:06:01 | fatherless girls |
| 2006-05-06 18:07:04 | fatherless children |
| 2006-05-06 18:09:52 | daughters without fathers |
| 2006-05-06 18:13:16 | daughters without fathers |
| 2006-05-06 18:14:08 | girls without fathers |
| 2006-05-06 18:17:28 | women without fathers |
| 2006-05-06 18:19:17 | death of the father |
| 2006-05-06 18:25:38 | fatherless |
| 2006-05-06 18:27:14 | divorce death |
| 2006-05-06 18:33:09 | a daughter loses her father |
| 2006-05-06 18:43:34 | women who lose their fathers |
| 2006-05-06 18:49:15 | girls without dads |
| 2006-05-06 19:53:06 | absent fathers |
| 2006-05-06 20:20:24 | growing up without a father |
| 2006-05-06 20:24:50 | fatherless daughters |
| 2006-05-06 20:27:41 | parental death for a child |
| 2006-05-06 20:30:58 | grieving child |
| 2006-05-06 20:39:03 | greif is a broken line |
| 2006-05-06 20:39:23 | grief is a broken line |
| 2006-05-06 20:40:56 | child abuse |
| 2006-05-06 20:42:07 | recovering from abuse |
| 2006-05-06 20:48:11 | grief |
| 2006-05-06 20:48:34 | melodie beattie |
| 2006-05-06 20:56:08 | children of alcoholics |
| 2006-05-06 20:57:28 | dysfunctional families |
| 2006-05-06 21:09:16 | the meadows |
| 2006-05-06 21:20:38 | pia melody |
| 2006-05-06 21:35:51 | intimacy |
| 2006-05-06 22:41:59 | childhood trauma |
| 2006-05-06 23:22:39 | daughters and fathers and death and childhood |
| 2006-05-06 23:30:26 | father loss |
| 2006-05-06 23:33:09 | early loss of father |
| 2006-05-06 23:35:50 | daughter loses father |
| 2006-05-06 23:36:31 | death of father and depression |
| 2006-05-06 23:42:34 | fathers and daughters and death |
| 2006-05-06 23:48:39 | girls without dads |
| 2006-05-06 23:52:20 | life without a dad |
| 2006-05-06 23:53:23 | raising a daughter without a dad |
| 2006-05-06 23:56:49 | girls without dads |
| 2006-05-06 23:57:42 | death and trauma and grief |
| 2006-05-06 23:59:55 | repressed grief |
At 11:53 PM: raising a daughter without a dad. A mother(?), alone, at midnight, staring at the magnitude of what comes next, or maybe putting herself in her mother's shoes.
Minutes after midnight, she turns the lens on herself.
View search logs: May 7-8, the mirror
| Date & Time | Raw Query |
|---|---|
| 2006-05-07 00:06:57 | crying |
| 2006-05-07 00:07:55 | impact of divorce and death on children |
| 2006-05-07 00:11:02 | bipolar mother |
| 2006-05-07 00:15:34 | bipolar mother |
| 2006-05-07 00:15:57 | schizophenia |
| 2006-05-07 00:16:06 | schizophrenia |
| 2006-05-07 00:22:06 | lithium |
| 2006-05-07 00:23:29 | bipolar |
| 2006-05-07 01:12:56 | schizophrenia |
| 2006-05-07 11:15:03 | childhood traumas |
| 2006-05-07 11:25:46 | children of adult alcoholics |
| 2006-05-07 11:52:51 | children of adult alcoholics |
| 2006-05-07 11:54:28 | string stool and parasites |
| 2006-05-07 11:59:28 | children and alcoholism |
| 2006-05-07 20:21:15 | effects of alcoholic family |
| 2006-05-07 21:28:19 | alcohol and cirrhosis |
| 2006-05-08 17:43:07 | mothers who rage |
| 2006-05-08 17:43:29 | raging mothers |
bipolar mother. mothers who rage. raging mothers. This might be a woman typing these words about herself. She knows she's unstable.
Even here, the health anxiety never lets go: string stool and parasites at 11:54 AM, wedged between two searches about children of alcoholics.
There is a possible second reading here. Adult children of alcoholics, toxic parents, childhood traumas: these may not describe her daughter's situation at all. They may describe her own childhood. And bipolar mother could refer to her own mother, not to herself as a mother. If so, the searches are oscillating between her daughter's future and her own past, a woman watching the same pattern threaten to repeat itself. It's impossible to know for certain. But the two readings are not mutually exclusive.
Then grief gives way to logistics. The keyboard is now used to organize a funeral in San Antonio, in the middle of the night.
View search logs: funeral and legal
| Date & Time | Raw Query |
|---|---|
| 2006-05-10 01:35:35 | death and grief |
| 2006-05-10 02:09:49 | funeral preparation |
| 2006-05-10 02:15:11 | funeral preparation |
| 2006-05-10 02:15:27 | porter loring |
| 2006-05-10 03:03:23 | farias ranch |
| 2006-05-11 23:57:03 | trinity and marguerite |
| 2006-05-11 23:58:37 | trinity university and marguerite chapel |
| 2006-05-11 23:59:53 | marguerite b parker chapel |
| 2006-05-16 00:26:33 | probate |
| 2006-05-16 00:27:24 | probate law |
| 2006-05-16 00:31:04 | reading of the will |
Funeral preparation at 2 AM. Probate law past midnight. A woman who can't sleep. Alongside these, new practical concerns appear: cheap health insurance, medicaid, salary and benefits.
The final weeks show a slow, uneven return to the mundane. Travel searches (Cancun, Cozumel), salmon recipes, the French Open. But the wound keeps surfacing:
View search logs: the aftermath
| Date & Time | Raw Query |
|---|---|
| 2006-05-16 00:43:33 | death and grief |
| 2006-05-16 17:57:21 | spoiled and selfish |
| 2006-05-16 17:59:29 | anorexia |
| 2006-05-16 18:14:13 | verbal abuse |
| 2006-05-23 18:30:22 | anderson cooper |
| 2006-05-23 18:40:39 | anderson cooper and suicide |
| 2006-05-24 01:40:38 | karma |
| 2006-05-24 01:41:18 | synchronicity |
| 2006-05-24 22:38:13 | happiness |
And between karma, synchronicity, and happiness, maybe someone is trying to rebuild meaning in a world that just lost all of it.
Obvious Biases
- Timestamps may be misleading. I refer to them often to build a narrative, but they likely reflect the database timezone, not the user's local time. The US spans multiple time zones. A query logged at 2 AM could be a midnight search, or a 4 AM search. The chronological order holds, but the "what time of day" interpretation should be taken loosely.
- You are the narrator. The meaning of these logs is built by the reader. What you've just read is shaped by your own projection of what is really happening behind each query. We can't know for certain. A search for `please shoot me` could be despair, or an inside joke, or a song lyric. The stories I've told here are the most plausible readings I could construct, but they remain interpretations, not facts.
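To make the timestamp caveat concrete, here is a minimal sketch of how a single logged time reads across the four continental US zones. It assumes, purely for illustration, that the log clock runs on US Eastern time; the dataset does not say which clock it uses, and `OFFSETS_FROM_EASTERN` and `local_readings` are hypothetical names, not part of the analysis above.

```python
from datetime import datetime, timedelta

# ASSUMPTION (not stated anywhere in the dataset): the log clock is US Eastern.
# Hours to add to an Eastern timestamp to get each zone's wall-clock time.
OFFSETS_FROM_EASTERN = {"Eastern": 0, "Central": -1, "Mountain": -2, "Pacific": -3}

def local_readings(logged):
    """Map one logged timestamp to the wall-clock time it would be in each zone."""
    ts = datetime.strptime(logged, "%Y-%m-%d %H:%M:%S")
    return {
        zone: (ts + timedelta(hours=shift)).strftime("%Y-%m-%d %H:%M")
        for zone, shift in OFFSETS_FROM_EASTERN.items()
    }

# The first "funeral preparation" query from the logs above.
readings = local_readings("2006-05-10 02:09:49")
for zone, local in readings.items():
    print(f"{zone:<8} {local}")
```

Under that assumption, a query logged at 2:09 AM is a 2 AM search on the East Coast but an 11 PM search, on the previous calendar day, on the West Coast. The order of the queries survives; the "middle of the night" reading may not.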
The power of query sequences
Google understood the power of these queries long ago. I've worked as a PPC manager, and you'd be surprised how accurately Google can serve ads tailored to very specific needs. People genuinely transcribe their inner states into a search bar. It just grew into a business model.
You've probably noticed that search engines, voice assistants, and even LLMs now display a "You might need help" message with a support hotline number when you type something alarming. The question is: are they putting as much effort into helping desperate people as they are into selling ads to them?
These were only 4 stories; there are many more to tell
Each UserID in this dataset probably has a story of its own, and there is absolutely no better way to access them than to open the file and read the lines one by one, the same way a historian reads letters in an archive.
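If you want to do that reading yourself, a minimal sketch could look like the following. It assumes a tab-separated file whose header names the columns `AnonID`, `Query` and `QueryTime`; check your own copy, since the column names may differ. The helper `user_timeline`, the user IDs and the in-memory sample are placeholders so the snippet runs without the real file.

```python
import csv
import io
from datetime import datetime

def user_timeline(lines, user_id):
    """Return (timestamp, query) pairs for one user, oldest first."""
    # QUOTE_NONE: treat quote characters inside queries as plain text.
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = [
        (datetime.strptime(row["QueryTime"], "%Y-%m-%d %H:%M:%S"), row["Query"])
        for row in reader
        if row["AnonID"] == user_id
    ]
    return sorted(rows)

# Placeholder sample standing in for the real file; the IDs are made up.
sample = io.StringIO(
    "AnonID\tQuery\tQueryTime\n"
    "0000\tsynchronicity\t2006-05-24 01:41:18\n"
    "9999\tgoogle maps\t2006-05-24 01:41:00\n"
    "0000\tkarma\t2006-05-24 01:40:38\n"
)

timeline = user_timeline(sample, "0000")
for ts, query in timeline:
    print(ts, "|", query)
```

With the real file, you would pass an open file handle instead of `sample`. The code only puts the lines in order; the reading is still yours to do.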
The timestamp, the order of the queries, the way one search follows another and slowly builds a unique narrative: none of this can be summarized, clustered, or fully understood by any algorithm, no matter how sophisticated. Not by regex. Not by topic modeling. Not even by hyper-modern Natural Language Processing. A language model can tell you that `fatherless daughters` belongs to the "Family & Parenting" category. It cannot tell you that the same woman searched for `how to cut wrists` less than three weeks earlier on the same keyboard, or that `grief is a broken line` typed at 8:39 PM is not a book review but the sound of someone falling apart.
The whole dataset is accessible here