Social media is integrated into day-to-day society to the point that we frequently take it for granted. We update Twitter, Facebook, Pinterest and countless other sites by habit, without thinking. It’s not uncommon to see an entire table of young people – and the not so young – sitting at a restaurant staring at their phones. It’s not that they’re ignoring each other; they are simply updating their social media sites to reflect the fact that they are out with each other, and perhaps informing the rest of the Internet as to what they were doing before they arrived. No rudeness is intended; no exclusion is being performed; it is simply that, for a large subset of smartphone users, including the Internet at large in their activities is just something they do.
In a “free country” whose citizens are under constant surveillance by their own government, it’s worth remembering that the single most effective means of tracking our activities, interests and opinions is not a government program at all. The most effective catalog of individual activity is provided by those individuals when they volunteer it. Your social media posts, your likes, your pictures, your location updates – all of these create a web of interlocking data that describes you and your movements to anyone who cares to look. The process of exploiting this cache, this repository of your volunteered information, is called data mining.
You already know that, of course. For example, you know that cookies left on your computer inform certain websites of where you’ve been. Look at a shirt at K-Mart’s website today, and tomorrow you’ll see that same shirt in a margin ad on a different website. Facebook tracks your interests, as do various quizzes and games that you play on it and through it. Facebook’s purpose, the intent behind what it does, is the same as that of the cookies, if more sophisticated: The people behind the programming want to better pitch you things you’ll actually want to buy.
Selling you things is not the only use to which data mined from social media can be put. Analysis of social media is being used to map conflict in Syria. As explained by Rakesh Sharma in Forbes, the project “is an initiative… to examine the massive amounts of citizen-generated information related to the Syrian conflict that is available online. Posts on social media, among other things, help to details [sic] the growth of opposition armed groups in each governorate within Syria; show the current geographic delineation of pro and anti-government forces and provides up-to-date analysis on the current state of the conflict. Initially, the goal of the Syria conflict mapping project was to provide information to neutral parties working toward a peaceful end to the crisis. With no end in sight, providing information about the strength and location of armed groups in each region of Syria also helps humanitarian organizations to know where it’s ‘safe’ to operate and where it’s not.”
You also know that Google automatically analyzes the texts of your emails in order to better target advertising to you. Ryan Neal reports that “a lawsuit making its way through federal court in California accuses Google of scanning emails sent and received by students using [Google] Apps for Education, and using information in these emails to build profiles of the students and target them with advertising.” The software in question is a suite of free “productivity apps” featured on Google’s Chromebook laptops. At issue is whether Google’s data mining of student information (if this is, in fact, occurring) violates a 1974 law protecting the privacy of educational records.
Meanwhile, posts on the micro-blogging site Twitter are actually being used to identify physical sickness. Google has already used the frequency of sickness-related search terms to provide estimates of flu infection, giving researchers cause to be “optimistic about the potential of the Internet as a medium for data mining.” According to Thomas Claburn, “Penn State researchers say they have created ‘a system for making an accurate influenza diagnosis based on an individual’s publicly available Twitter data.’ … [T]he researchers set out to analyze the tweets from [differing patient groups] in their study to determine whether they could diagnose influenza from Twitter posts.” As it turns out, such a determination could, in fact, be made – and with greater than 99 percent accuracy.
The implications are powerful. Simply by analyzing the opinions individuals voice and the data those individuals share through social media, real facts about those posters’ physical condition can be identified. But this is a double-edged technology, a methodology that can be used for good or for ill. This brings us to the Internal Revenue Service, which has begun to use data mining of social media to flag tax filings that seem incongruous (or are missing altogether).
“The taxman is reportedly using data from social media on people who file fishy-seeming taxes or don’t file at all,” writes Dara Kerr. “In its quest to find and audit tax dodgers, the IRS is said to use online activity trackers to sift through the mass amounts of data available on the Internet. … This data is then added to the information the agency already has on people, such as Social Security numbers, health records, banking statements, and property.” Kerr goes on to cite the IRS’ recent attempt to read citizens’ email without search warrants, an initiative from which the agency backpedaled due to civil-liberties backlash. Even if they don’t read your email, they’re reading your Facebook and Twitter pages (among others).
The urge to update is unlikely to change. We are becoming more interconnected, not less, and individuals’ desire to share, along with the erosion of their reluctance to disclose private data, will only increase. Given this, we cannot afford to lose sight of the implications of our collective drive to socialize online. Anything we share can and will be used to learn about us … and we may not like the consequences of what we inadvertently admit.
Media wishing to interview Phil Elmore, please contact firstname.lastname@example.org.