Data sprawl, privacy, and cybersecurity

easyDNS is pleased to sponsor Jesse Hirsh’s “Future Fibre / Future Tools” segments of his new email list, Metaviews.

Thinking about our digital shadow

There’s irony in the correlation that the less our bodies move the more our data sprawls. Our bias towards the material world creates a blind spot that hides the extent to which we travel extensively in the digital.

As the pandemic has encouraged physical distancing, it has also fostered increased digital activity across a wide range of servers and devices. While the concept of data sprawl is largely used to describe dynamics within corporations, it can also be applied to individuals and their personal information.

The general concern is that our digital shadow has grown, and if we’re unaware of it, it could be used against us, or at least undermine our privacy and cybersecurity. Before we explore the concept on an individual level, let’s look at how it is being framed for the enterprise.

IT executives increasingly worry about the extent to which employees have saved their company’s data in unprotected devices or sent sensitive information through insecure services, according to a survey released by data-governance firm Egnyte last week.

The survey, conducted in August, found that more than three-quarters of CIOs had concerns about content sprawl, with 38% very concerned about the issue. While the degree of data sprawl often depends on the department, the rapid move to remote work because of the coronavirus pandemic has become the No. 1 reason cited by CIOs for data replicating to insecure environments.

Employees may copy data to their home systems, even if those systems are not maintained or visible to the company, says Kris Lahiri, chief technology officer of Egnyte.

“In a lot of cases, the worker has problems getting stuff done, so they take an easier solution, whether it was insecurely sending something over email or a personal device,” he says. “People need to realize that basic digital hygiene is important to visit.”

This last point is important. Data sprawl is partly enabled by a desire or need to get stuff done. Like an electrical current, we seek the path of least resistance, and in a digital world that path is constantly changing.

Just as a river changes shape and route with each new season, there’s always an easier way to do something digitally: a new service, app, or platform that offers a new feature or another way of connecting or sharing with people.

With each new login and each import, we’re extending our digital shadow and making another weak link in an identity and data network that we rarely pay attention to and probably are not in a position to quantify.

For corporations, this can easily sprawl out of control. Before the pandemic, the sprawl was largely confined to intranets and internal systems. Even in that limited context it was becoming unmanageable, and many companies felt compelled to turn to algorithms and machine learning tools to help identify and organize their data.

Now, however, thanks to the pandemic, the personal world and the professional world have blended together, and it’s not just personal devices that have experienced the encroachment of company data, but the social cloud as a whole.

Which brings us back to the concept of a digital shadow. As individual users, we cast this shadow anywhere we might create or leave behind data. For a company, however, this shadow comprises the sum of those individual shadows, which can sprawl to all sorts of places and locations.

While some companies take this sprawl seriously, I’d hypothesize that most do not. Insofar as people do their jobs, or at least try, and the work gets done, that’s good enough. Sometimes when people leave jobs or go to competitors, this issue may arise, but it otherwise remains below the company’s level of organizational awareness.

Yet this pandemic may change that. The seismic shift towards working remotely may not only illustrate the need to conceive of, if not manage, a company’s digital shadow (and resources), but also encourage people to think of their own individual shadows.

That means thinking about where we leave data, where we’ve left login credentials, and how our digital shadow is directly linked to our ability to protect and preserve our privacy and security.

Alternatively, the surveillance technology that has arisen in response to remote work and remote learning could be combined with machine learning models to track, monitor, and control data sprawl.

Part of the concern is the manner in which individuals turn to their personal devices to get work done. While companies could respond to this by issuing corporate devices for home usage, there is still the issue of privacy, whether the machine is issued by an employer or personally owned. The surveillance tech is not necessarily designed to respect the differences that arise in such a nuanced situation.

The nature of sprawl is that it usually happens so quickly that it exceeds plans and becomes a mess. Sprawl is rarely the result of design; rather, it reflects a lack of design.

In this case, data sprawl reflects our willful ignorance of the implications of our digital lives and habits. I use the phrase digital shadow as an acknowledgement of the invisibility of it all, but what if another description were digital garbage: a reflection of our digital irresponsibility and our failure to clean up our digital mess?

It’s only invisible to us because we were not told or encouraged to be responsible and not leave our data or personal information lying around. Those accounts we create to get things done can be deleted once the task is complete. Yet because we cannot see or understand the consequences, we allow all this data to sprawl.

While corporations may be able to invest in tools that can help automate or remind people to tame their sprawl, what about the rest of us? The consequences of those tools will probably be the further colonization of our technology by employers and providers.

I suppose our larger alternative is to attempt to prevent sprawl in the first place. Be careful of where you take your data, and choose tools that work for you rather than working for your tool. 😉 #metaviews
