If You Want to Stop Big Data Breaches, Start With Databases

The root of so many high-profile data leaks is the insecure databases that underpin the internet.
Alex Williamson/Getty Images

Over the past few years, large-scale data breaches have become so common that even tens of millions of records leaking feels unremarkable. One frequent culprit that gets buried beneath the headlines? Poorly secured databases that connect directly to the internet.

While companies commonly use these databases to store tempting troves of customer and financial data, they often do so with outdated and weak default security configurations. And while any type of database can be left open or unprotected, a string of breaches over the last few years has centered around one type in particular: open-source “NoSQL” databases, particularly those using the popular MongoDB database program. Of course, there are many types of hacks that can ultimately lead to data breaches, like using spear phishing to gain access to a network, but securing exposed databases is a relatively easy and concrete step organizations can take to strengthen their data defense.

All Your Database Are Belong To Us

Traditional relational databases concentrate data in one or a handful of related servers. By contrast, the newer NoSQL generation of databases scales quickly by arranging massive amounts of data across many servers. Because these databases are open source, anyone can easily implement them. That's good for attracting customers and getting developers up and running quickly when they're on deadline, but it also means that MongoDB and the other companies that make NoSQL databases don't have control over how users set up and secure them.

That disconnect has led to extensive fallout. Memorable unprotected database breaches include the 2015 MacKeeper incident in which usernames, passwords and other data leaked for more than 13 million of the security scanner's customers. In April 2016, lead MacKeeper security researcher Chris Vickery discovered an exposed database containing the full names, addresses, birthdays and voter registration numbers for all 93.4 million Mexican voters, which had been accessible online for seven months. Also in April, hackers stole user data for 1.1 million people from the insecure database of the dating website BeautifulPeople.com, and in October hackers compromised personal data from 58 million customers of the data storage firm Modern Business Solutions. And those are just some of the most publicized hacks.

The attacks have not only continued but evolved. At the beginning of 2017, a rash of "ransomware" incidents hit exposed MongoDB databases. In these cases, attackers actually just deleted a database's files, but made it seem like paying a Bitcoin ransom worth a few hundred dollars would trigger data restoration.

Open Source, Open Sesame

Security experts have been warning about NoSQL configuration insecurity for years, and MongoDB specifically has suffered from two issues. First, it used to have some problematic defaults, like not requiring password authentication and granting users overly broad privileges. MongoDB updated these configurations a few years ago. But, second, because MongoDB is open source, it's easy to find installers online that incorporate outdated or misguided security settings. Someone who doesn't have a lot of tech experience, or just isn't paying attention, can easily wind up accessing and relying on flawed configuration files while setting up a database.
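To make those problematic defaults concrete, here is a minimal Python sketch of the kind of check an administrator might run against a MongoDB configuration. It assumes the settings have already been parsed into a dictionary, and the function name audit_mongod_config is hypothetical, not part of any MongoDB tool. The option names it inspects (security.authorization, net.bindIp and net.bindIpAll) are MongoDB configuration options, but the rules themselves are an illustration rather than a complete audit.

```python
# Hypothetical helper (not a MongoDB tool): flags two risky settings in a
# mongod configuration that has already been parsed into a dict.

def audit_mongod_config(config: dict) -> list:
    """Return a list of human-readable warnings for risky settings."""
    findings = []

    # Without security.authorization set to "enabled", clients are not
    # required to authenticate with a username and password.
    security = config.get("security") or {}
    if security.get("authorization") != "enabled":
        findings.append("security.authorization is not enabled")

    # Listening on every network interface makes the database reachable
    # from anywhere the host itself is reachable.
    net = config.get("net") or {}
    bind_ip = str(net.get("bindIp", ""))
    addresses = [a.strip() for a in bind_ip.split(",") if a.strip()]
    if net.get("bindIpAll") or any(a in ("0.0.0.0", "::") for a in addresses):
        findings.append("net.bindIp listens on all interfaces")

    return findings


if __name__ == "__main__":
    # Example: a configuration with no authentication, bound to all interfaces.
    risky = {"net": {"port": 27017, "bindIp": "0.0.0.0"}}
    for warning in audit_mongod_config(risky):
        print("WARNING:", warning)
```

A check like this only covers the two settings named here. A database can still be exposed in other ways, such as accounts with overly broad privileges.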

"It’s not as though attackers have exploited any flaw in these technologies, they haven’t exposed any flaw in MongoDB," says Mat Keep, the director of product and market analysis at MongoDB. "What’s happened is there have been a very small number of users who have not applied the security controls that come as standard with the database and they’ve exposed those databases publicly to the open internet."

Unprotected databases are also trivial to find. Criminals and researchers alike use network visibility tools like the search engine Shodan, which indexes internet-connected devices, to get a sense of how many exposed databases are out there. Currently, searching "MongoDB" on Shodan reveals more than 50,000 exposed databases. They may or may not be vulnerable to attack, but simply being visible increases their risk.
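For administrators who want to know whether their own database is visible in this way, a rough first check is simply whether its port accepts connections from outside the network. The Python sketch below is an illustration under a few assumptions: the function name is_port_reachable is hypothetical, 27017 is MongoDB's default port, and the check should only be pointed at hosts you administer. A successful connection shows only that the port is reachable, not that the database is unauthenticated.

```python
import socket


def is_port_reachable(host: str, port: int = 27017, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection performs the TCP handshake and raises OSError
        # (refused, unreachable, timed out) if it cannot connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Example: check the local machine. To test real exposure, run this from
    # outside your own network against a host you are responsible for.
    print(is_port_reachable("127.0.0.1"))
```

Running it from inside the same network only confirms local access. The result that matters is the one seen from the public internet.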

The ubiquity of outdated MongoDB installers and tutorials contributes to the problem, but databases built with current MongoDB releases have been breached as well, usually because whoever set up the database intentionally disabled the default security protections. MacKeeper security researcher Chris Vickery, who has identified many high-profile database leaks over the last few years, says that poor institutional communication and planning are a hurdle when groups create NoSQL databases. "A problem is that somebody will set up a MongoDB in an insecure way, but safely behind a firewall. And then for whatever reason the device gets plugged in in front of the firewall or the firewall goes down and then all of a sudden the database is exposed," Vickery says. "The people who set it up never thought it would be exposed to the world, and they never talked to the people who are now taking down the firewall."

This problem also applies to databases on test servers that are built quickly, with intentionally few security measures in place so that it's easy to work on development projects. If that project becomes a legitimate service without anyone remembering to update its security settings, the database goes from being a private testing ground to a public exposure. "All these servers were placed on the internet without any authentication and Shodan indexed them," says Niall Merrigan, a solutions architect who compiles information about exposed databases. "This meant that there was an easy way to find open servers."

Course Correct

Despite MongoDB's improvements, researchers say that they haven't yet seen an overall decline in exposed NoSQL databases. "We have tried to be very proactive in enabling people to get the most out of our security features," MongoDB's Keep says. "It’s frustrating for us that despite being warned, a tiny minority of people are still failing to apply the most basic protections to their databases."

At least, though, the years of work by researchers and companies like MongoDB to raise awareness have resulted in mainstream recognition of the problem. "This issue has been known in the security community for a long time, and it's really not just a Mongo problem, but it was difficult for us to get anybody to care about it at first," says John Matherly, the creator of Shodan, who has been tracking MongoDB exposure for years.

Unfortunately, the urgency of evolving threats like the recent "ransomware" hacks is providing much of that belated motivation. Previously, an exposed database could cause an embarrassing breach, but didn't seem to pose a risk beyond that. The threat of losing entire databases that companies rely on for daily operations, though, has forced people to pay attention. "The whole MongoDB 'apocalypse' situation is bad for PR, but it made such a big splash that everybody will hear about it in the tech community, and then maybe it’ll spread awareness faster about securing your Mongo databases. Otherwise, there are real consequences," MacKeeper's Vickery says.

Of course, there were always consequences for the millions of people whose data was exposed. But now that companies feel those pressures too, something might actually be done about it.