Is Open Data About You?
Perhaps more than you think, and wrong or stale open data could damage your reputation and hurt you financially. Whether the cause is the incredible growth of sensor data captured about our internet behavior, the consequences of unintended open data joins, or the massive increase in data breaches, you may be at risk.
Yes, there are concerns about which open data set contains what information about you and me, as I learned on June 24th at the Navigating and Prospering in an Open Data World round table, held at IBM’s NYC headquarters. Meeting participants included John N. Stewart, Sr. Vice President and Chief Security Officer of Cisco Systems; Robert M. Groves, former Director of the US Census, now Provost of Georgetown University; John Walton, CIO of San Mateo County, Calif.; Jane P. Edwards, Privacy Counsel for IBM; and a dozen other open data stars.
But there is good news too. Analysis of open data can help us all. For example, sharing demographic data can help with health issues. CIO Walton believes that a combination of government statistics and journalistic stories will lead to political action, for example by using population data to enable outreach that helps control the explosion of diabetes in the US.
Open Data About You may be Eye Opening
The experts made the following points abundantly clear.
- Every hour, terabytes of potentially “revealing” sensor data are collected
- Concentration of data means concentration of risk through unforeseen data use
- “Accidental” joins of open data sets will produce unintended consequences from unintended insights
- Data breaches are up 350% in the past 12 months, here in the US
- According to Cisco, one-third of all devices have been infected with malware, and
- Last year, 47% of all US adults had some of their Personally Identifiable Information (PII) copied.
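To make the “accidental join” point concrete, here is a minimal sketch of how two data sets that each look harmless can be linked on shared fields. Everything in it is an assumption for illustration: the records are made up, and the field names (`zip`, `birth_year`, `gender`) and the function `join_on_quasi_identifiers` are hypothetical, not taken from any real open data set.

```python
# Hypothetical, made-up records for illustration only.
# A "de-identified" health data set: no names, but some demographic fields.
health_records = [
    {"zip": "10001", "birth_year": 1961, "gender": "M", "diagnosis": "diabetes"},
    {"zip": "10001", "birth_year": 1975, "gender": "F", "diagnosis": "asthma"},
    {"zip": "94401", "birth_year": 1961, "gender": "M", "diagnosis": "hypertension"},
]

# A separate public list that carries names alongside the same fields.
voter_roll = [
    {"name": "Pat Example", "zip": "10001", "birth_year": 1961, "gender": "M"},
    {"name": "Sam Sample", "zip": "94401", "birth_year": 1980, "gender": "F"},
]

# Fields that are not identifying on their own but can be in combination.
QUASI_IDENTIFIERS = ("zip", "birth_year", "gender")


def join_on_quasi_identifiers(left, right, keys=QUASI_IDENTIFIERS):
    """Return merged rows wherever left and right agree on every key."""
    # Index the right-hand rows by their key tuple for quick lookup.
    index = {}
    for row in right:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)

    joined = []
    for row in left:
        for match in index.get(tuple(row[k] for k in keys), []):
            joined.append({**match, **row})
    return joined


linked = join_on_quasi_identifiers(health_records, voter_roll)
for row in linked:
    print(row["name"], "->", row["diagnosis"])
```

Neither list reveals much by itself. Once joined, a name sits next to a diagnosis, which is the kind of unintended insight the panel warned about.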
Long after your activities and transactions fade from memory, the data that records them lives on as history. What can you do when the open data about you is wrong, and/or gets into the wrong hands? Being interested and informed regarding the open data about you is critically important. Ensuring that data anomalies are corrected is the next step. When it is wrong, get it fixed. Later, I’ll show you how I fixed some errors in data used by marketers.
“Privacy is the first victim of Convenience”
So said John Walton, the San Mateo CIO, as we discussed the balance of privacy and utility of open data. On the other hand, Robert Groves, the former Census director, believed that data brokers, for-profit aggregators of open and other data, often miss the poor and the disenfranchised. These are the people who most need to be visible, especially for government services.
Julia Lane, Institute Fellow of the American Institutes for Research, opined that data brokers, presently unregulated in the US, need regulation. Emily Shaw, National Policy Manager of The Sunlight Foundation, agreed, saying that we need to ensure that people are protected from discrimination through the use of stored data. Let’s have a look at some of the open data that is out there.
Big Brother – Data.Gov
It wasn’t hard to guess that our government has a huge store of open data about us. Where can you find it? At Data.gov, the official US Government portal for open data. Today, there are more than 100,000 data sets accessible from the portal. Click the Data.Gov logo to see the data policy statements. Not all the data accessible from the site is from the federal government. Some may be sourced from universities, states, or other suppliers. Non-federal data may have restrictions on its use, indicated by a banner.
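If you would rather query the catalog than browse it, the following is a rough sketch of a keyword search. It assumes the Data.gov catalog is served by CKAN and exposes the standard CKAN `package_search` action at the URL shown; that endpoint and the helper names `build_search_url`, `summarize`, and `search_catalog` are assumptions for illustration, so check the portal’s current developer documentation before relying on them.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the catalog exposes the standard CKAN Action API at this address.
CKAN_SEARCH_URL = "https://catalog.data.gov/api/3/action/package_search"


def build_search_url(query, rows=5):
    """Build a CKAN package_search URL for a free-text query."""
    return CKAN_SEARCH_URL + "?" + urlencode({"q": query, "rows": rows})


def summarize(response):
    """Pull the match count and data set titles out of a CKAN search response."""
    result = response.get("result", {})
    titles = [pkg.get("title", "(untitled)") for pkg in result.get("results", [])]
    return result.get("count", 0), titles


def search_catalog(query, rows=5):
    """Run the search over the network and return (count, titles)."""
    with urlopen(build_search_url(query, rows), timeout=30) as resp:
        return summarize(json.load(resp))
```

Calling `search_catalog("diabetes")` would return the number of matching data sets and the first few titles, which gives a quick sense of how much there is to wade through on any one topic.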
A Riddle Unwrapped by An Enigma
So, what open data about you and me is in those data sets? Is the data that each contains correct? Searching manually, you would spend an eternity exploring, only to discover a data anomaly when it arrives as a problem on your doorstep. There are other ways to gain insight beyond a brute force approach. Visiting the Impact link on the Data.Gov site leads you to a list of third-party vendors making the information in this portal accessible for you.
Here is one that looked useful. You can click the graphic and visit their website, as I did.
Let’s see what I’ve been up to
After registering at Enigma.io, I launched a “Stuart Selip” search against all the databases Enigma accesses. I found one piece of information about me, shown below. It is correct.
I’m an outside director of a start-up called Woxxer, Inc. and this is an SEC filing statement about their financing. What if that information were wrong? A Google search revealed that SEC Form D filing instructions are found here, and there are instructions for filing an amendment to prior filings. Had I found an error, I would have needed to have an amendment submitted. Not as bad as I thought it might be.
Fixing a Problem with Marketing Information
There is a boat-load of open data used by marketers to assign you to demographic categories, identify your buying interests, and so on. If that information is wrong, you may find yourself getting all sorts of odd and unwanted email and snail-mail solicitations, advertisements, and similar material.
Can you find out what data aggregators have collected about you to be used for marketing purposes? In at least one case, the answer is yes. Better still, you can fix wrong information right online. I did exactly that. IBM Privacy Counsel Jane Edwards suggested I look at data aggregator Acxiom’s site aboutthedata.com. There, I found some errors in the data about me and fixed them.
At the left, you can see the types of information Acxiom has on file. I decided to look at Household Vehicle Data, wondering whether it was up-to-date.
Well, it wasn’t. I haven’t owned a truck (a Bronco) since 2011, I have never owned an RV, and my insurance doesn’t renew in October. I fixed it. The slide show fills in the details. By the way, our household has several vehicles not in the database. Is that an error of omission, or are those vehicles not of interest to marketers? I have no idea.
Fair Credit and Right to be Forgotten
One area of data aggregation where US citizens have specific rights is in viewing and correcting errors in credit history. Your credit score and history are used for many, often unexpected purposes, and you should ensure the information is correct. Here is a summary of your rights under the Fair Credit Reporting Act. If you need to dispute a finding on your report, look here for guidance.
Recently, the Court of Justice of the European Union addressed the issue of search engines and the right to be forgotten. In this case, the right to be forgotten means:
“Individuals have the right – under certain conditions – to ask search engines to remove links with personal information about them. This applies where the information is inaccurate, inadequate, irrelevant or excessive for the purposes of the data.”
The Bottom Line
The potential harm of wrong/stale open data about you can be great. Taking an interest in which information is out there and whether it is accurate and up-to-date is your responsibility. Whether the “you” is a corporate you, or an individual you, bad data in the wrong hands can be disastrous.
There are tools and approaches available to let you explore your open data presence and gain perspective on how you look to the casual inquirer, the marketer, and the credit evaluator. Take action today, right after you read this post.
Our ongoing survey into the effects of poor data quality on business outcomes has revealed that the state of organizations’ data, especially about their customers, is poor. If organizations with a profit motive can’t invest sufficiently to keep data about their customers clean and useful, you know that your open data may be at risk.
If you have a poor quality open data experience that you are willing to share, please do contact us to discuss. Sunlight is the best purifier, so please help us shine some light in the open data shadows.
Finally, I want to thank Gordon Feller, Director of Urban Innovations, Cisco Systems HQ, and Co-founder, Meeting of the Minds, and Rich Michos, Vice President, Smarter Cities, at IBM, who together organized the Navigating and Prospering in an Open Data World round table, for inviting me to participate.