
Law: Cloud Big Data

Posted on August 21, 2014.

“Analytics in the Cloud: Traversing a Legal Minefield” – To avoid legal liability, organizations that want to reap the benefits of cloud-based big data analytics must carefully vet partner technology.

by Dr. Brian Bandey, Doctor of Law

When a corporation mines the Big Data within its IT Infrastructure, a number of Laws will automatically be in play. However, if that corporation wants to analyze the same Big Data in the Cloud, a new tier of legal obligations and restrictions arises, some of them quite foreign to a management previously accustomed to dealing with its own data within its own infrastructure.

A corporation holding Big Data will possess different types of data, which the Law will automatically classify and to which it will attach law-based obligations.

Some of that data may not be owned by the corporation. It may be a third party’s data which it holds pursuant to a Confidentiality Agreement. Such agreements may not only produce obligations of nondisclosure, but may also restrict the uses to which the data can be put and define what level of security is to be employed.

Other data might be owned by the corporation, but identifies living individuals (whether directly or indirectly). Data Protection Law (as it’s generally known) is concerned with access to, use of and movement of “Personal Identifying Information” (PII), and with the technological safeguards that prevent its disclosure.

A corporation will also own secrets about itself which, if disclosed, might cause irreparable damage. Officers owe stakeholders a legally binding ‘duty of care’ to take all reasonable precautions to ensure the security of such information.

Due to restrictions on processing, both from Data Protection and Confidentiality Laws, care will need to be taken when building the Data Warehouse to be analyzed. Certain classes of data may need to be excluded.
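
As a rough illustration of that exclusion step, the sketch below filters records by legal class before they are loaded into the warehouse. The class names, the `Record` type and the `analytics_permitted` flag are all hypothetical, invented for this example; in practice the classification would follow from legal review of each data source.

```python
from dataclasses import dataclass
from enum import Enum, auto


class DataClass(Enum):
    """Hypothetical legal classes, mirroring the data types described above."""
    THIRD_PARTY_CONFIDENTIAL = auto()
    PERSONAL = auto()
    CORPORATE_SECRET = auto()
    UNRESTRICTED = auto()


@dataclass(frozen=True)
class Record:
    record_id: str
    data_class: DataClass
    # Hypothetical flag: set only after legal review confirms that analysis
    # is a permitted use of this restricted record.
    analytics_permitted: bool = False


def select_for_warehouse(records):
    """Keep unrestricted records and restricted ones explicitly cleared."""
    return [
        r for r in records
        if r.data_class is DataClass.UNRESTRICTED or r.analytics_permitted
    ]


records = [
    Record("r1", DataClass.UNRESTRICTED),
    Record("r2", DataClass.THIRD_PARTY_CONFIDENTIAL),
    Record("r3", DataClass.PERSONAL, analytics_permitted=True),
    Record("r4", DataClass.CORPORATE_SECRET),
]
cleared_ids = [r.record_id for r in select_for_warehouse(records)]
```

The filter excludes by default: a restricted record enters the warehouse only when someone has positively cleared it.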

All of these different types of Law intersect over the area of Big Data storage, security and processing. They produce a matrix of law-based obligations which, in many areas, cannot be delegated or avoided – only met.

But what happens when we translate that matrix into the Cloud?

The first matter is that of Security. Breaches occasioning the loss of data can cause an abundance of law-based difficulties: breach of contract, fines under Data Protection Law, uncapped damages due to the release of third-party secrets, and so on. But why is this the ‘first matter’?

The corporation cedes actual security to its cloud services provider. Instead of the corporation implementing its own security directly, that role is handed to the cloud services provider. A great deal is said about the utility and importance of Service Level Agreements (SLAs) on this subject. Frankly, I don’t see it that way. What remedies are available to the corporation under an SLA other than contractual remedies? Usually none!

In my opinion, it is highly likely that money damages will not put the corporation back in the position it would have been in but for the security/contractual breach.

No. What is needed is the choice of a correct cloud security architecture of sufficient robustness. One may ask: why is that a legal topic? Surely it is strictly an IT matter? I take the view that one must look to the propensity of the cloud technology itself to cause the corporation legal exposure.

The Duty of Care owed by officers to their stakeholders, the corporation’s duty to those persons whose PII it holds, and the contractual obligations it owes with respect to third-party confidential information all compel the corporation to exercise expertise, care and prudence in the selection of a technologically secure cloud computing environment. This means that it must look beyond the Cloud Services Provider per se, and discharge the Duty of Care through Due Diligence on the technology underpinning it. How capable is the architecture of securing the data? Is the architecture built to be secure and resistant to the correct range of security threats? How robust and secure is it against measurable benchmarks?

Secondly, there are significant technological differences between a cloud computing environment and a corporation’s ‘owned’ infrastructure. I am referring especially to the integrity of Multi-Tenancy Architectures. A leaky Multi-Tenancy system creates a significant probability that the corporation will be in breach of its obligations to many prospective litigants. Thus real attention will need to be given to the architecture that isolates one ‘data set’ from another and keeps it isolated. These are not matters of academic technical interest, but go to the ability of the corporation to discharge what are often non-delegable, unavoidable legal duties.
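
To make the isolation point concrete, here is a deliberately small sketch of one way a store can keep tenants' data sets apart: every read and write is keyed by tenant, so one tenant's lookup can never return another tenant's value. The `TenantScopedStore` class is hypothetical and in-memory only; real multi-tenant platforms enforce isolation at several layers (storage, network, hypervisor), which this does not attempt to model.

```python
class TenantScopedStore:
    """Hypothetical in-memory store where every access names a tenant."""

    def __init__(self):
        # Rows are keyed by (tenant_id, key), so identical keys belonging
        # to different tenants never collide.
        self._rows = {}

    def put(self, tenant_id, key, value):
        self._rows[(tenant_id, key)] = value

    def get(self, tenant_id, key):
        # Raises KeyError if this tenant has no such key, even when
        # another tenant does hold a value under the same key.
        return self._rows[(tenant_id, key)]

    def keys(self, tenant_id):
        """List only the keys that belong to the given tenant."""
        return sorted(k for (t, k) in self._rows if t == tenant_id)


store = TenantScopedStore()
store.put("tenant_a", "q3_forecast", "A's figures")
store.put("tenant_b", "q3_forecast", "B's figures")
store.put("tenant_a", "payroll", "A's payroll")
```

The design choice worth noting is that there is no method that reads without a tenant identifier, so a cross-tenant read cannot be requested by accident.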

Moving on from Security, there is the matter that is generally referred to as the Trans-Border Movement of PII. Many countries either restrict or prohibit the exporting of PII. To do so can even be a corporate crime, certainly exposing the wrongful exporter to the likelihood of a hefty fine, adverse publicity and reputational loss.

Thus the problem for our conceptual corporation is the nature of Cloud Computing itself. By that I mean that the advantages of scalability, flexibility and economies of scale are achieved by distributing data across a number of servers, which may not all be in the same country. Thus PII may be automatically, and illegally, exported.

There are two avenues open to the corporation to obviate this ‘unlawfulness’. The first is to choose a Big Data Warehousing and Analytics architecture which, with certainty, can confine data storage and processing to servers residing in nominated legal jurisdictions. The Cloud Computing Architecture must be able to identify what data is in which jurisdiction and, if necessary, keep it there. The second is to transform the PII so that it no longer constitutes, in Law, PII. Data which is not PII cannot be subject to Data Protection Law.
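
The first avenue amounts to being able to answer, for every data set, "which jurisdictions is this stored in, and are they all permitted?" A minimal sketch of such a check follows. The region names, the region-to-country map and the allowed set are illustrative assumptions, not any real provider's regions or API.

```python
# Hypothetical mapping from a provider's storage regions to legal jurisdictions.
REGION_TO_JURISDICTION = {
    "eu-central": "DE",
    "eu-west": "FR",
    "us-east": "US",
}

# Hypothetical set of jurisdictions nominated by the corporation.
ALLOWED_JURISDICTIONS = {"DE", "FR"}


def placement_violations(placements, allowed):
    """Return {dataset: [offending regions]} for data outside `allowed`.

    `placements` maps a data set name to the regions holding a copy of it.
    A region missing from REGION_TO_JURISDICTION is treated as a violation,
    because its jurisdiction cannot be established.
    """
    violations = {}
    for dataset, regions in placements.items():
        bad = sorted(
            r for r in regions
            if REGION_TO_JURISDICTION.get(r) not in allowed
        )
        if bad:
            violations[dataset] = bad
    return violations


placements = {
    "customer_pii": ["eu-central", "us-east"],
    "web_logs": ["eu-west"],
    "backups": ["eu-central", "unknown-region"],
}
found = placement_violations(placements, ALLOWED_JURISDICTIONS)
```

A check like this only reports where data sits; keeping it there still depends on the architecture being able to pin storage and processing to the nominated regions.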

For some time now, medical researchers have shared patient information internationally through a process of either Anonymisation or Pseudonymisation. Anonymisation is a process whereby the identifier sub-data is removed, prior to export, thus enabling any type of processing, anywhere. The data needs to be of a configuration that can still be effectively processed in the absence of identifier sub-data. Where the presence of a form of identifier sub-data is required for processing (or analysis), Pseudonymisation is used: the identifiers are replaced with substitute values rather than removed.

The aim of these two forms of de-identification is to obscure the identifier sub-data items within the patient records sufficiently that the risk of potential identification of the subject of a patient record is minimized to acceptable and permissible legal levels. Although the risk of identification cannot be fully removed, it can often be minimized so as to fall below the defining threshold.
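
The two forms of de-identification described above can be sketched as follows. This is an illustration under stated assumptions, not a legally sufficient implementation: the field names are hypothetical, and using a keyed hash (HMAC-SHA256) to generate pseudonyms is one common technique chosen for this example rather than a method named in the article. Remaining fields such as dates or postcodes can still identify a person in combination, so whether the output falls below the legal threshold is a question for legal assessment.

```python
import hashlib
import hmac

# Hypothetical identifier fields in a patient record.
IDENTIFIER_FIELDS = ("name", "patient_id")


def anonymise(record):
    """Remove identifier fields entirely before export."""
    return {k: v for k, v in record.items() if k not in IDENTIFIER_FIELDS}


def pseudonymise(record, secret_key):
    """Replace each identifier with a keyed hash of its value.

    The same input and key always yield the same pseudonym, so records for
    one patient can still be linked during analysis. The key stays with the
    exporting organisation and is not shipped with the data.
    """
    out = dict(record)
    for field in IDENTIFIER_FIELDS:
        if field in out:
            message = str(out[field]).encode("utf-8")
            out[field] = hmac.new(secret_key, message, hashlib.sha256).hexdigest()
    return out


patient = {"name": "Jane Doe", "patient_id": "P-1001", "diagnosis": "asthma"}
key = b"example-key-kept-by-the-data-owner"  # placeholder value for the sketch

anon = anonymise(patient)
pseudo = pseudonymise(patient, key)
```

Here `anon` keeps only the diagnosis, while `pseudo` keeps all three fields with the two identifiers replaced by 64-character hexadecimal digests.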

There is no reason, in Law, why Big Data Analytics cannot be performed lawfully in the Cloud. However, in order to do so, significant attention needs to be directed to the actual software and hardware architectures to be employed, and to matching those to the matrix of Laws which operate over the storage, use, processing and movement of data. It may seem strange that I am advocating an almost technology-centric solution to what is clearly (and perhaps solely) a law-based problem. But as I said before, money damages in these scenarios will never, in my opinion, be sufficient compensation for the owners of Big Data.

Rather, the requirements of the Law need to be soundly and accurately matched and, indeed, mapped onto the cloud computing technology at hand. Only then can the minefield of Big Data Analytics in the Cloud be successfully traversed – without an explosion.

