The big stores that track your every online move

Holiday shopping with Big Brother is always a bummer.

A study by Princeton researchers came to light earlier this month, revealing that more than 400 of the world's most popular websites use the equivalent of hacking tools to spy on you without your knowledge or consent.

So before you get all hopped up on eggnog and go hogwild doing your Black Friday or Cyber Monday shopping, you might want to find out which sites are seriously spying on you.

Using "session replay scripts" from third-party companies, websites are recording your every act, from mouse moves to clicks, to keylogging what you type and extracting your personal info off the page. If you accidentally paste something into a text field from your clipboard, like an address or password you didn't want to type out, the scripts can record, transmit and store that, too.

What these sites are doing with this information, and how much they anonymize or secure it, is a crapshoot.

Among top retail offenders recording your every move and mistake are Costco, Gap.com, Crate and Barrel, Old Navy, Toys R Us, Fandango, Adidas, Boots, Neiman Marcus, Nintendo, Nest, the Disney Store, and Petco.

After publication of the study, called "No Boundaries," both Bonobos and Walgreens said they would stop using session-replay scripts.

The study is the first in a series from Princeton's Center for Information Technology Policy (CITP). The researchers examined the world of "session replay scripts," software running on a site that records everything you do there. This can include your mouse moves, hovers, clicks, and typing -- even if it's something you wrote and deleted. They looked at seven of the most popular session replay companies: Clicktale, FullStory, Hotjar, SessionCam, Smartlook, UserReplay and Yandex (the Russian search engine).

It's not a new thing, but few people know it's happening. If you're a privacy nerd -- and who isn't these days? -- the study's data release was especially fascinating in its breakdowns. It's a who's-who of companies, news outlets, stores, services and even a few porn sites. All of which know way too much about what you did and didn't do on their websites.

Tech and security websites spying on users include HP.com, Norton, Lenovo, Intel, Autodesk, Windows, Kaspersky, Redhat.com, ESET.com, WP Engine, Logitech, Crunchbase, HPE.com (Hewlett Packard Enterprise), Akamai, Symantec, Comodo.com, and MongoDB.

Other sites you might recognize that are also using active session recording are RT.com, Xfinity, T-Mobile, Comcast, Sputnik News, iStockphoto, IHG (InterContinental Hotels), British Airways, NatWest, Western Union, FlyFrontier.com, Spreadshirt, Deseret News, Bose and Chevrolet.com.

Even passwords are included in these session recordings, according to Princeton's Center for Information Technology Policy, and in one instance involving SessionCam, they were sent to one of the companies providing session-replay scripts. "We found at least one website where the password entered into a registration form leaked to SessionCam, even if the form is never submitted," wrote CITP.

In a blog post one day after the study's release, SessionCam founder and CEO Kevin Goodings wrote, "Everyone at SessionCam can get behind the CITP's conclusion: 'Improving user experience is a critical task for publishers. However, it shouldn't come at the expense of user privacy.' The whole team at SessionCam lives these values every day."

The session-replay scripts used by these websites are rather insidious. "These scripts record your keystrokes, mouse movements and scrolling behavior, along with the entire contents of the pages you visit, and send them to third-party servers," CITP's writeup on the study explained. "Unlike typical analytics services that provide aggregate statistics, these scripts are intended for the recording and playback of individual browsing sessions, as if someone is looking over your shoulder."

If you want to look up a specific site, Princeton's researchers released the data in a handy searchable database, complete with its methodology and caveats. For instance, they note that the appearance of a website in their database "DOES NOT necessarily mean that session recordings occur, as website developers may choose not [to] enable session-recording functionality."

However, they add that for some sites they do have clear evidence of session recordings occurring. "We mark these with the tag 'evidence of session recording.' For these sites, our measurement bots were able to detect a recording in progress, as detailed in our detection methodology," they explained. "For sites not marked with this tag, it does not mean that recordings don't occur -- simply that we don't know if they do."

All the sites listed above are among the sites found by the study to be actively recording sessions -- there is no ambiguity as to whether or not sites like Old Navy are spying on your every move, and your every mistake. They are. (We've reached out to Gap/Old Navy for comment.)

Session replay scripts are made palatable by selling themselves as useful for finding web-page mistakes and problems with user interactions with a page. But according to Princeton's research, a whole lot could be going wrong here.

More specifically, the study notes that "Collection of page content by third-party replay scripts may cause sensitive information such as medical conditions, credit card details and other personal information displayed on a page to leak to the third-party as part of the recording. This may expose users to identity theft, online scams, and other unwanted behavior." They add, "The same is true for the collection of user inputs during checkout and registration processes."

Companies that provide session-replay scripting as a service are supposed to have a system of checks and balances in place for safety. "The replay services offer a combination of manual and automatic redaction tools that allow publishers to exclude sensitive information from recordings," the study notes.

Indeed. But the onus is on companies such as Petco to redact that information, and it's not as simple as it sounds. Princeton's Center for Information Technology Policy wrote that "in order for leaks to be avoided, publishers would need to diligently check and scrub all pages which display or accept user information. For dynamically generated sites, this process would involve inspecting the underlying web application's server-side code." To our collective dismay, they add "this process would need to be repeated every time a site is updated or the web application that powers the site is changed."
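
To see why that upkeep is so easy to get wrong, consider a minimal sketch in Python of field-based redaction. Everything in it is an assumption made for illustration: the event format, the field names and the `redact_events` helper are hypothetical and are not taken from any replay vendor's actual tools.

```python
# Hypothetical sketch: the event shape and field names are assumptions,
# not any session-replay vendor's real format.
SENSITIVE_FIELDS = {"password", "card_number", "cvv"}


def redact_events(events, sensitive_fields=SENSITIVE_FIELDS):
    """Return copies of recorded input events with sensitive values masked."""
    redacted = []
    for event in events:
        clean = dict(event)  # copy so the original recording is untouched
        if clean.get("field") in sensitive_fields:
            clean["value"] = "*" * len(clean.get("value", ""))
        redacted.append(clean)
    return redacted


events = [
    {"field": "email", "value": "me@example.com"},
    {"field": "password", "value": "hunter2"},
    # A site update renamed this field, so the redaction list no longer matches it.
    {"field": "card-number", "value": "4111111111111111"},
]

safe = redact_events(events)
for event in safe:
    print(event["field"], event["value"])
```

In this sketch the password is masked, but the card number goes out in the clear because the field was renamed from `card_number` to `card-number` and nobody updated the list. That is the kind of slip the researchers are describing when they say the process has to be repeated with every site change.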

That's right: The sites snatching your information before you even have a chance to agree to any Terms of Service are the same ones responsible for administering a redaction process if they decide to do so. And if we've learned anything about company responsibility and user safety and security, it's that if something can go ignored or be sloppily done, it already was and got covered up, and we're all screwed, thanks.

Princeton's researchers say that our data can't reasonably be expected to stay anonymous in these conditions. "In fact," they state, "some companies allow publishers to explicitly link recordings to a user's real identity." As in, they're linking the recordings of you to your account on their websites.

The session recording companies themselves were found to have troubling security practices. The researchers wrote:

Once a session recording is complete, publishers can review it using a dashboard provided by the recording service. The publisher dashboards for Yandex, Hotjar and Smartlook all deliver playbacks within an HTTP page, even for recordings which take place on HTTPS pages. This allows an active man-in-the-middle to inject a script into the playback page and extract all of the recording data.

Worse yet, Yandex and Hotjar deliver the publisher-page content over HTTP -- data that was previously protected by HTTPS is now vulnerable to passive network surveillance.
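
The downgrade the researchers describe boils down to a simple comparison, sketched below in Python. The `exposes_recording` helper and the example URLs are hypothetical and exist only to illustrate the idea; this is not how CITP ran its measurements.

```python
from urllib.parse import urlparse


def exposes_recording(recorded_page_url, playback_url):
    """Return True when a page recorded over HTTPS is replayed over plain HTTP."""
    recorded_scheme = urlparse(recorded_page_url).scheme.lower()
    playback_scheme = urlparse(playback_url).scheme.lower()
    return recorded_scheme == "https" and playback_scheme == "http"


# Hypothetical URLs: a checkout page served securely, replayed insecurely.
downgraded = exposes_recording(
    "https://shop.example/checkout",
    "http://dashboard.example/playback/123",
)
print(downgraded)
```

When the check comes back true, whatever encryption protected your checkout page is gone by the time someone at the company presses play.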

To counter session replay spying, you'll need to use an ad blocker browser plugin that addresses these scripts. At the time of the study's publication, CITP noted that "two commonly used ad-blocking lists, EasyList and EasyPrivacy, do not block FullStory, Smartlook or UserReplay scripts. EasyPrivacy has filter rules that block Yandex, Hotjar, ClickTale and SessionCam."

Those lists, notably EasyPrivacy, are included in the AdBlockPlus plugin. In the days following the release and subsequent press, EasyPrivacy was changed to block FullStory, Smartlook and UserReplay -- so AdBlockPlus is a good solution for some, but not all of these pernicious privacy purloiners.
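
For a rough idea of why a blocklist only protects you from the companies someone has already added to it, here is a simplified sketch in Python. The domains and the `is_blocked` helper are hypothetical, and real filter lists such as EasyPrivacy use a much richer rule syntax that this sketch does not attempt to reproduce.

```python
from urllib.parse import urlparse


def is_blocked(script_url, blocked_domains):
    """Return True if the script's host is a blocked domain or one of its subdomains."""
    host = (urlparse(script_url).hostname or "").lower()
    return any(host == domain or host.endswith("." + domain)
               for domain in blocked_domains)


# Hypothetical blocklist with a single replay vendor on it.
blocked = {"replay-vendor.example"}

listed = is_blocked("https://cdn.replay-vendor.example/record.js", blocked)
unlisted = is_blocked("https://cdn.new-replay-vendor.example/record.js", blocked)
print(listed, unlisted)
```

The first script is stopped because its vendor is on the list. The second sails through until a list maintainer adds a rule for it, which is what happened with FullStory, Smartlook and UserReplay after the study came out.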

Since the study's publication, session-replay companies are scrambling to counter the negative press. Like SessionCam, they're suggesting the study was overly dramatic and clinging to the spin that their tools are simply data collection for the purposes of improvement. SessionCam's CEO exclaimed, "Using behavioral analytics solutions to understand website visitors better is more often a sign of good intent -- the company wants to get it right."

Unfortunately, for most of us, the "to make your experience better online" line is too tough to swallow in the era of Facebook.

I don't know about you, but I don't feel like helping any of these companies do their jobs right by sitting quietly as they take my personal information without my ability to say no. Especially not because I just happened to land on their websites. No one should.

Images: PA Images (Girl on Laptop); Getty (Holiday shopping)