What is a data complex business? – An Interview

Is your business data complex? How do you even know? In this interview with Chris, Director at Cybata, we uncover what data complexity means for you and your business.

This interview is focused on GDPR, Data Management, Data Processes and Data Protection.



What is data complexity? And how does that impact upon individual businesses?



Okay, so for me, the problem is most people don’t understand that question. When people are thinking about data protection and GDPR, if you look at the industry and the market as a whole, people will price services around a small business, a medium-sized business, a large business. In my experience, the size of the business makes no difference to the work. It makes a difference to the time or the length of the project. But generally, what matters is the complexity of the business.

So what I mean by that is you can have a very large commercial business, or B2B business, that has GDPR requirements to live up to. But you can have a small sports organisation or a small charity that also has GDPR requirements to live up to, and theirs are more significant, more resource-intensive and more costly. It’s the data that they process, the way in which they process it, the supply chain they use to process that data, and the security measures they have to put in place and evidence that they’ve assessed in the supply chain, that make them data complex. So I think the point is, if someone needs to look at the resources they need to put in place to become GDPR compliant, it’s not about the size of their organisation, it’s about how data complex they are. And that’s a challenge.



So what are some of those signs that you are data complex? Maybe this is oversimplifying it, but for the purpose of this, I think it’s quite a useful exercise. Are there some key indicators or key signs that you are more data complex? Or that you could be data complex?



So when I look at any industry, one of the questions I’ll ask is how many data subjects you have. Now, a data subject can be an athlete, it can be a patient, it can be a medical professional, it could be a director of a business, it could be a staff member of the business, it can be contractors, it can be people in other national governing bodies when you come to sport. So each of those is a data subject type: a group of people that you take data from and whose data you process. So how many types of data subjects you have is an indicator of the complexity. In that group, you’ve also got adults and children. As soon as you start processing personal data of children, the risk goes up, the complexity goes up. Okay, so the next one, which is often significant: if you’ve got more data captured per data subject type, obviously, the complexity goes up.

What I mean by that is that with some data subject groups, say a supplier, you may only have a name, email address, and phone number. But for a patient, you may have all of those contact records, you may have financial records, and you may have lots of health records. Staff usually have a superset of data, because you’re collecting medical data, you’re collecting leave data, you’re collecting their personal information, you’re collecting financial information for them, you’re collecting information to allow you to process their pensions. So if you imagine an organisation that’s got three data subject groups, staff, contractors and customers, and that you’re only collecting, say, 5 pieces of data on average across all of them, you’ve got 5 times 3, which is 15, so 15 things to think about. In small charities and sports organisations, we will often find there will be 10 to 15 data subject groups, sometimes adults only, sometimes adults and children. And for each of, say, 10 data subject groups, you may be collecting on average 10 pieces of data: 10 times 10. That means 100 pieces of data that you are capturing for your business, and that begins to indicate the complexity. So it’s the number of data subjects and the amount of data you collect for each data subject. And within the data subjects, adults are one thing, but children add complexity.
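The arithmetic in this answer can be sketched in a few lines of Python. This is only an illustration: the function name `data_points` is made up here, and the figures are the ones quoted in the answer.

```python
def data_points(subject_groups: int, avg_items_per_group: int) -> int:
    """Rough count of data items to think about:
    number of data subject groups times average items collected per group."""
    return subject_groups * avg_items_per_group


# Staff, contractors and customers, about 5 items each
simple_business = data_points(subject_groups=3, avg_items_per_group=5)

# A small charity or sports body: 10 groups, about 10 items each
small_charity = data_points(subject_groups=10, avg_items_per_group=10)

print(simple_business, small_charity)  # 15 100
```

The count is deliberately crude. It ignores children's data and special category data, which the interview treats as separate drivers of complexity.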

The next thing you have is that within the data, you can have standard data and what’s called special category data. If you’re only collecting standard data, so names, contact details, financial information (the regulations lay this out), then that’s one thing. But if, for a number of data subjects, you are also collecting special category data, which is biometric data, medical or health data, information about sex life or sexual orientation, and information about disabilities, for example, then the complexity of that organisation from a data protection and GDPR perspective goes up significantly. So, to go back: the number of data subjects and the amount of individual data that you collect per data subject increase complexity, and children add complexity.

If you’re collecting special category data, that adds more complexity. You then have to look at how you’re collecting this information. You may be collecting it directly from the data subject on paper, or you might be collecting it directly via electronic means. You might be collecting it via email systems or website forms, or you may be getting the information from other organisations and partners. So when you start to look at this, you’ll see that organisations are typically collecting more information from more places by more means than they ever realised. The more of that is going on, the more complexity there is.
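As a rough way of pulling these indicators together, here is a minimal Python sketch. The class `DataProfile`, the function `complexity_drivers` and the example club are all hypothetical and exist only for illustration. The sketch simply lists the drivers that apply; it does not score or weight them.

```python
from dataclasses import dataclass


@dataclass
class DataProfile:
    """Hypothetical summary of the complexity indicators from the interview."""
    subject_groups: list[str]        # e.g. athletes, staff, suppliers
    includes_children: bool          # is any children's personal data processed?
    special_category: bool           # e.g. health or biometric data
    collection_channels: list[str]   # paper, email, web forms, partners...


def complexity_drivers(profile: DataProfile) -> list[str]:
    """List the factors that add to data complexity for this profile."""
    drivers = [
        f"{len(profile.subject_groups)} data subject groups",
        f"{len(profile.collection_channels)} collection channels",
    ]
    if profile.includes_children:
        drivers.append("children's data")
    if profile.special_category:
        drivers.append("special category data")
    return drivers


club = DataProfile(
    subject_groups=["athletes", "coaches", "staff"],
    includes_children=True,
    special_category=True,
    collection_channels=["paper", "web form", "email", "partner organisations"],
)
print(complexity_drivers(club))
```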



Yeah, so on that last point, you just said that businesses are putting themselves in a position where they are more complex than they actually need to be, specifically when they’re collecting all this data. So is it that they didn’t realise, and actually they don’t need to be anywhere near as complex, because they’re collecting data which could be irrelevant and which is forcing them to be more and more complex? Or is it the fact that this is just the way that business is, and actually most businesses do become data complex, because they are trying to hold so much data?



It’s both, actually. Great spot. For example, if you look at sports organisations, they will often say, we’re just a small sports organisation, and they try to do the right thing for the people who are coming to their sport. Charities will often say, well, we’re a small charity. But when you look at what they do, the actual function they perform, and the way they deliver that service, in my experience they are by their nature going to be more complex than many commercial businesses.

But your other point is also true. I think most businesses, charities, sporting bodies, any organisation, over time end up being more complex than they need to be in a general sense. Most organisations are collecting more information about people than they actually need to provide their function to the customer. And actually, the GDPR calls for data minimisation. In your words, that helps simplify the risk. Because if you don’t collect data in the first place, your GDPR management is simpler, your risks are reduced, and your costs are reduced.

So I think there are some industries where simply the fact of what they do makes them data complex, but that complexity absolutely can be reduced with good processes and procedures. And actually, fundamentally, it’s about having a good understanding of what you do; then you can decide how to make it simpler. And that’s one of the key things that people miss around the remediation actions: how do we make it simpler?



So, developing that a bit further, we talked about potentially simplifying the data landscape that you have, in terms of trying to reduce the amount of complexity. But obviously there’s only a certain amount that you can simplify. So for example, your sports organisation collects sexual orientation data that you don’t necessarily need, because actually the national governing body may do that, as opposed to you as a sports club. So you may not need to do it, but you’ve done it anyway. Say we’ve simplified that, because we don’t need that data as a club, but we’ve obviously still got plenty of data that we do need to collect and organise. So how do we start managing that complexity, not just simplifying it? Part of management is making it simple, but then you’ve still got a huge amount of data left. How do we start managing the data that’s left?



Okay, so the first step for me in any project when we work with customers is education and knowledge. So helping that senior team to understand what GDPR is and what it’s not, and helping them identify how complex they are, so they have an understanding of the risks. And then the next bit, phase three effectively, is the piece that you talked about. What you need to do before you start to simplify is document, in a coherent and structured way, the data that you process for your organisation. So you end up creating what’s called a record of processing activity. In most cases, people document it in a spreadsheet, and there are great templates around for that. I’ve seen people document it in Word documents, and if they’re very small, that can work. I’ve seen people document it in PowerPoint, and that’s not particularly effective, but for a very small organisation it could be. Most people have a spreadsheet. And what the spreadsheet does is help you structure your thoughts. It helps you capture the data subjects that you have. It helps you capture the specific data you capture for each data subject group. It helps you understand whether you’ve got standard data or special category data, it helps you understand where you are collecting this information from, it helps you understand what systems you’re storing the information on, and it helps you understand what information you are sharing with other people. And once you pull this thing together, your highest risk areas naturally fall out of it, and therefore so does your to-do list in risk priority order.

So for example, you might initially think that it’s your staff data, all the staff data in the organisation, that is the highest risk area. You go and do this data mapping, and you put it into a record of processing activity. And you find that actually, it’s the data that you share with other sports organisations that’s the biggest risk, because you’re sharing it with people you don’t have contracts with, people you don’t have data sharing agreements with, people your athletes have never given permission to have that information.

So you may decide that’s your highest risk area. Then you look at that process, and you say, how can we simplify it? It might be the fact that you just need to tell the athletes that you’re collecting that data for the purpose of x, y and z, and you tell them that it will be third-party coaches that are accessing this data. You make sure you update your consent, you make sure you have contracts with your coaches, and you make sure you have a data sharing agreement with the coaches. So you’ve protected yourself as an organisation, and you’ve been open, clear and transparent with the athlete on what’s happening with their data. Now, until you’ve done this documentation and put this record of processing together, you don’t have a true picture of what data you process. You don’t know how data complex you are. Therefore, any actions you take towards becoming what you think is GDPR compliant could be absolutely the wrong actions. You could be spending time, money and effort in the wrong areas.
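To make the idea of a record of processing activity more concrete, here is a minimal Python sketch of one row per data subject group, with a to-do list that falls out in risk order. It is an illustration only: the field names, the 1 to 5 `risk` score and the sample rows are assumptions made for this sketch, not a prescribed ROPA template.

```python
from dataclasses import dataclass, field


@dataclass
class RopaEntry:
    """One hypothetical row of a record of processing activity."""
    data_subject: str
    data_items: list[str]
    special_category: bool
    collected_from: list[str]
    stored_in: list[str]
    shared_with: list[str] = field(default_factory=list)
    risk: int = 1  # assumed scale: 1 (low) to 5 (high), set by the reviewer


def todo_list(entries: list[RopaEntry]) -> list[str]:
    """Return data subject groups ordered from highest to lowest risk."""
    ranked = sorted(entries, key=lambda entry: entry.risk, reverse=True)
    return [entry.data_subject for entry in ranked]


ropa = [
    RopaEntry("staff", ["name", "bank details", "leave records"], True,
              ["HR forms"], ["HR system"], risk=3),
    # Shared with third parties without contracts or agreements: highest risk
    RopaEntry("athletes", ["name", "medical notes"], True,
              ["web form"], ["membership system"],
              shared_with=["third-party coaches"], risk=5),
    RopaEntry("suppliers", ["name", "email", "phone"], False,
              ["email"], ["accounts system"], risk=1),
]

print(todo_list(ropa))  # ['athletes', 'staff', 'suppliers']
```

In this made-up example the athlete data outranks the staff data because it is shared with third parties, which mirrors the scenario described in the answer.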



So is a ROPA pretty much the starting point of a data complex journey? Is a ROPA the first stage on the movement towards becoming compliant, or is it just a very early stage? And would you do a gap analysis or an audit before you do a ROPA, or as part of the audit, to try to uncover what they currently have?



Great questions. So terminology is helpful sometimes, and confusing and difficult at others. For larger organisations, my colleagues and I would typically be talking about an audit or a gap analysis, but I think those terms can sometimes be interchanged. For me, the process is quite simple. If you want to get to somewhere, B, and you want to know how to get to B, you have to know A, your starting point. If you don’t know your starting point, you could be drawing a map to get to B, and you actually find that map is pointless, because you weren’t starting where you thought you were starting.

For me, the start of this journey is knowledge and education. Have good education for the senior team in the organisation on what GDPR is, what the key facets are, what the priorities are, and the context in the world in which they are operating. That is the first thing, and in most projects we typically recommend that for the senior team. It’s a half-day session. Once we’re all singing from the same song sheet, once we understand the landscape and we understand the map, then we can say, right, where are you on this map as your starting point? So that would be an audit or gap analysis.

Now, for very small customers who may be very data complex, and we can figure that out very quickly when we talk to them, in, you know, a 10-minute conversation, what we would then do is say to them, right, your starting point is the record of processing activity. We can come in for a couple of days, we can have conversations with all your key people, and we can pull your first-level, high-level data map, your record of processing activity, together. And what naturally falls out of this is a list of your high risks and your action plan. Okay, for larger organisations that may be more disparate, with more facets, the language of audit and gap analysis works better.

Small organisations tend to find that language very off-putting. When anyone talks about an audit or a gap analysis, they think that means it is going to be very expensive. So the language we use for very small customers usually is, let’s put together your picture, your map, your record of processing activity. And from that, we can sit down, talk about the high risk areas, and talk about the key next steps. And then if they want to work with us to go and put those things in place, they can. If they want to do it themselves, they can. If they want to go and find someone else, they can do that as well.



Yeah, that makes a lot of sense. So I get that recognition of it being easier to describe and understand. Words are important to help understand the eventual outcomes, but the way in which it’s described needs to be done a different way. And that, I guess, is part of the issue with data. It can be seen as quite a jargon-laden industry. So how do we try to break that down to make it easy for people to understand?



We work hard, we work very, very hard, to take out the very technical language in the early phases of any project with a new client, because it’s about getting engagement and getting an understanding. The detail, the nuances, the very specific language, for me, come as they develop their understanding at a high level. What I find is that when people start with very technical language, it puts others off. They’re bamboozled, they don’t get it, they don’t understand it.

Our approach is very much about helping them really understand what this is about: why it’s important for them, why it’s important for them and their families, why it’s important for them and their business. Once we do that, we get buy-in, and then we can start to introduce the nuances, the specific technical language, as and when that is appropriate within that organisation. And that usually doesn’t come first. For me, it’s about getting them to really understand the principles, the need, the purposes and the benefits of doing this, rather than being absolutely right about the language.



Perfect. So we’ve talked about using a ROPA as part of understanding and managing data complexity, and its role within that. What other tools will help businesses manage that complexity? Because as you mentioned, a ROPA is generally a start.

I think it helps you understand the picture, and it helps you understand all the bits and pieces and how they work together. But obviously it can still be quite a big beast, and I’m conscious that there may be other tools that you might need to think about in terms of managing your data complexity and future-proofing it. A ROPA will help you understand where you are now, and you can add to it, change it and keep it up to date, but is there more to it than just a ROPA?



For me, the core of any organisation’s GDPR compliance plan and system is the record of processing activity. Now, for many customers, an Excel-based version of that can be adequate, if they’re not highly complex. If things are not changing significantly in their organisation, then an Excel-based ROPA can be adequate. But we have found that some organisations have lots of change, sport being one of them. Charities slightly less, but in sport there seems to be a lot of change. In those organisations, we find that the ROPA, the record of processing activity you’re writing, must be kept up to date. It’s a living, breathing document. It’s not something you do once and forget about; it should reflect the data processing you actually do. And if you change a process, say you add more data subjects into that project, that sporting activity or that activity with your patients, you should go back and update the ROPA. The ROPA might suddenly tell you the risk has gone up for that processing activity, or it might tell you the risk has gone down. So it should absolutely reflect the real world.

But here is what we have found with the record of processing activity. I’ve got small customers who started off thinking they were very simple. Then the record of processing activity we built in Excel ended up being 472 lines long, and it wasn’t even complete. Even for someone like me and my colleagues, who are used to doing these things, that becomes a very difficult document from which to bring out easily all of the things that we want to communicate to the senior management team: the things that they need to be looking at and addressing next. So all of the issues are probably buried in that spreadsheet, but it’s very hard to surface them.

One of my other clients said this once we had put in a GDPR management system tool for him. We had migrated him from an Excel-based ROPA into a software tool that helps him manage his job as a data protection officer. He said that suddenly the issues that needed to be front and centre in his business were immediately visible to him. Prior to that, he would probably have had to have some instinct that the problem was there and go and delve into the spreadsheet. And he probably would have found it. But what we want is that data protection issues should be very easily and simply made available to the senior management team, in language that they can understand, so action can be taken.

You’re also right that there are tools out there. I often talk to clients about these tools when I think it is appropriate. It’s not appropriate for every client. For some clients, say with fewer data subjects, not huge amounts of special category data or no special category data, where they’re pretty static and not much changes over time, where they’re not using lots of methods of collecting data and they are not getting lots of data from partner companies, an Excel-based ROPA can suffice, absolutely. But where you start to see that change, where you start to see lots of inputs, you’re taking data from lots of organisations, you’re sharing with lots of organisations, and your supply chain is very complex in terms of how you deliver the services using the data, then absolutely, there are software tools out there that can help. They can document what you do and store all documentation related to GDPR and cybersecurity in the same platform. They can have project management modules. They have the ability to provide your privacy notice directly from the platform into your website, so it always reflects reality. Some of them have processes and procedures embedded in them, and some of them can store your own processes and procedures.

So yes, they are available, and there is a monetary cost to them per month. But what we do see is that they can take away the need for lots of resource time at a client and reduce that to an acceptable level, because the tool is supporting their activity and making it more productive.



Yeah, that makes sense. And the final question I have for you: does every data complex business need a DPO, a Data Protection Officer? Is it a legal requirement or a recommendation? Or is it a case that as long as you are keeping a ROPA up to date, it doesn’t really matter?



GDPR explicitly describes the data protection officer as a specific role. It comes with some additional HR responsibilities for the company. Companies that don’t meet the requirements don’t have to formally appoint a DPO, but they can still voluntarily appoint one.

What I would say is, every organisation should have a director who is responsible for data protection, whether they have a formal data protection officer in post or not. Now, what some organisations do is rely on their compliance managers to do this. Certainly in the health sector, they often have compliance managers, as it is a highly regulated environment. They’ll already have a massive workload, and then this data protection function gets dumped on them as well. However, they are typically not data experts, and so we find some real issues in that area.

So to come back to your question: there are rules in the GDPR that talk about when you would put a DPO formally in place. It’s about the risk of the processing. It’s about the volume of the processing.

We’re actually working with a client right now, where they are absolutely sitting on the fence. They’re very small, they’re in health. And the question is, should they appoint a DPO?

What we’re saying to that client is: put everything in place. You don’t have to appoint a DPO right now. However, when you are talking to the health service, talking to your potential clients, if you go into those conversations and say, we already have a DPO appointed voluntarily, because we don’t have to formally yet, that would count as a very positive sign to those patients, customers or users. And many of the questions that they may ask you may go away, because they see that you are taking data protection seriously from the outset.

So there are many organisations where, absolutely, they don’t have to have a formal DPO. But they should absolutely still have someone, a director, who understands and takes responsibility for data protection and cybersecurity.



Brilliant, that answers the question quite neatly, thank you very much. Is there anything else you wanted to say that you feel might be useful?



The other thing I would say is, as you know, there are far too many organisations that simply put in place a privacy notice on their website, then went out and got permission or re-permission to talk to their customers in 2017/18, around the introduction of GDPR. They have a belief, a misguided belief, that they’ve done everything they need to in GDPR. In reality it is probably more like 5% of the things that an organisation needs to be doing around GDPR.

For me, these organisations need to have some education. GDPR is not going away. In fact, it’s developing. Larger organisations are asking smaller supply chain partners to prove their GDPR credentials and their related cybersecurity credentials. Increasingly, organisations are saying that if you can’t provide suitable responses, then you may not carry on with the contract, or you may not re-bid for a contract.

Organisations are going to have to look beyond the privacy notice and the permission they think they have to talk to their data subjects. They need to get that high-level training to really understand what they need to do in their context. They then need to do this data mapping, and if that’s as a result of doing an audit or a gap analysis, that’s fine, but they need to actually understand how complex they are from a data perspective. Then from that understanding, they can start to work out where they put their time, effort and resources. Is it into protecting the staff data more, or is it more into protecting the patient data or the athlete data? And from that, what measures do they then implement? Is it better processes and procedures? Is it better technology? Is it better training for the staff? Is it the fact they need to simplify their supply chain as a whole?

You know, you can only deal with those issues once you have that good understanding of what GDPR is and what it’s not, what data you have, what your processes are, and how much of a risk that poses to your organisation. So yeah, understanding the data you have and this record of processing is absolutely crucial.