Tuesday, July 7, 2009

Social Security Numbers Deduced From Public Data

By Hadley Leggett


For years, government officials have urged people to protect their Social Security numbers by giving out the nine-digit codes only when absolutely necessary. Now it turns out that all the caution in the world may not be enough: New research shows that Social Security numbers can be predicted from publicly available birth information with a surprising degree of accuracy.

By analyzing a public data set called the “Death Master File,” which contains SSNs and birth information for people who have died, computer scientists from Carnegie Mellon University discovered distinct patterns in how the numbers are assigned. In many cases, knowing the date and state of an individual’s birth was enough to predict a person’s SSN.

“We didn’t break any secret code or hack into an undisclosed data set,” said privacy expert Alessandro Acquisti, co-author of the study published Monday in the journal Proceedings of the National Academy of Sciences. “We used only publicly available information, and that’s why our result is of value. It shows that you can take personal information that’s not sensitive, like birth date, and combine it with other publicly available data to come up with something very sensitive and confidential.”

With just two attempts, the researchers correctly guessed the first five digits of SSNs for 60 percent of deceased Americans born between 1989 and 2003. With fewer than 1,000 attempts, they could identify the entire nine digits for 8.5 percent of the group.
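For a sense of scale, the arithmetic behind those figures can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration, not the researchers' method (which the paper deliberately leaves partly undisclosed). The function name `blind_guess_coverage` is made up for this example, and the sketch assumes every digit string is a possible number, which overstates the real space.

```python
from fractions import Fraction

TOTAL_DIGITS = 9          # length of a full SSN
KNOWN_PREFIX_DIGITS = 5   # digits the study often pinned down first
ATTEMPTS = 1000           # guess budget cited in the article


def blind_guess_coverage(attempts: int, unknown_digits: int) -> Fraction:
    """Fraction of the candidate space covered by distinct blind guesses.

    Assumption for illustration: all 10**unknown_digits digit strings are
    equally possible. Real SSN assignment never used some of them.
    """
    space = 10 ** unknown_digits
    return Fraction(min(attempts, space), space)


# Guessing all nine digits with no prior knowledge.
full = blind_guess_coverage(ATTEMPTS, TOTAL_DIGITS)

# Guessing only the last four digits once the first five are known.
tail = blind_guess_coverage(ATTEMPTS, TOTAL_DIGITS - KNOWN_PREFIX_DIGITS)

print(f"Blind, nine digits unknown: {float(full):.6%} of the space")
print(f"First five digits known:    {float(tail):.0%} of the space")
```

Under these simplified assumptions, 1,000 blind guesses cover one millionth of the nine-digit space, but a tenth of what remains once the first five digits are known. That gap is why predicting the leading digits matters so much.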

There are only a few short steps between making a statistical prediction about a person’s SSN and verifying the actual number, Acquisti said. Through a process called “tumbling,” hackers can exploit instant online credit approval services, or even the Social Security Administration’s own verification database, to test multiple numbers until they find the right one. Although these services usually block users after several failed attempts, criminals can use networks of compromised computers called botnets to try thousands of numbers at a time.

“A botnet can be programmed to try variations of a Social Security number to apply for an instant credit card,” Acquisti said. “In 60 seconds, these services tell you whether you are approved or not, so they can be abused to tell whether you’ve hit the right social security number.”

To keep identity thieves from exploiting their research, the scientists left a few key details about their method out of the paper, and they released the document to government agencies before making it public.

After developing an algorithm using the Death Master File, the researchers tested it on birthday and hometown information taken from a social networking site (the researchers declined to say which one). Again, they were able to predict Social Security numbers with a high degree of accuracy.

“It worked a little worse in the online social test for obvious reasons,” Acquisti said. “Some people may not reveal the right date of birth, or they may call hometown where they went to high school, not where they were born. There’s more noise in online social networking, but nevertheless the two studies confirmed each other.”

It also turns out that some SSNs are easier to predict than others. Because of the way numbers are assigned, younger people and those born in less populated states are more at risk, Acquisti said. Before 1988, many people didn’t apply for an SSN until they left for college or got their first job. But thanks to an anti-fraud effort in 1988 called the “Enumeration at Birth” initiative, parents started applying for their child’s number at birth, making the number much easier to predict from a person’s birthday.

The new findings remind consumers that they should use caution when sharing data online, even when the information itself doesn’t seem particularly sensitive. But Acquisti said his real message is for policymakers.

“We really wanted to come public with this result because the issue goes way beyond individual response,” he said. “It’s not just about remembering to shred your documents or to remove personal identification off your mail. As much as you try to protect your personal info, the info is already out there.”

According to information privacy experts, Social Security numbers were never meant to be used for authentication purposes, and using them as passwords puts all consumers at risk for identity theft.

“I have long argued that Congress or the Federal Trade Commission should prohibit companies from using SSNs as a means to verify identity,” Daniel J. Solove, professor of law at George Washington University Law School, wrote in an e-mail. “Merely protecting against their disclosure is insufficient since Acquisti and Gross demonstrate that they can readily be predicted.”

As a first step, the researchers suggest that the Social Security Administration start randomizing the assignment of SSNs. But randomization is only a Band-Aid, Acquisti said.

“It can buy us more time, but it isn’t going to change the underlying problem,” he said. “These numbers are supposed to be secret, but your bank has it, your insurance company has it, even your doctor has it. As long as we rely on numbers that are used as both identifiers and authenticators, then we are a system that remains insecure.”

Privacy law expert Chris Hoofnagle of the University of California, Berkeley, says the response must be drastic. “Their paper points to a radical solution: Perhaps we should stop trying to protect the secrecy of the SSN, and just publish all of them to prevent their use as passwords.”