Restricting Data for Privacy

We have a need to protect privacy of our data - which basically means users are able to query record level data, but results are redacted when conditions are not met.

Pre-aggregating is not an option as a large part of the data population will end up in groups that are too small.

Row level security is not applicable - users are allowed to see the data, just not at the record level.

 

Using a formula such as below gets part way there:

if(sum(head_count) >10) then sum(head_count) else null

This will return null unless there are at least 10 records in a row in the result table.

 

However there is a secondary requirement to suppress based on the number of rows with data. As a simple example, consider the following workforce data:

Gender

Headcount

Female

15

Male

5

 

Using the formula above would return:

Gender

Headcount

Female

15

Male

null

 

This is not acceptable because a user could query the overall headcount (20) as well as the above and be able to derive the Male headcount. 

 

In theory something like below would be acceptable to control this scenario:

if (count ( group_aggregate (

       if(sum ( head_count )>10) then sum(head_count) else null

       , query_groups ( ) , query_filters ( ) )

     ) >1 

    and sum(head_count) >10) 

then group_aggregate ( sum ( head_count ) , query_groups ( ) , query_filters ( ) ) else null 

 

Unfortunately this formula returns the same result as above. The "count" condition is not working as I'd hoped, which is to perform the group_aggregate calculation to return a list and then count the elements in the list.

Group_count might be a possibility, but it doesn't accept the wild card query_groups() option.

Is anyone able to suggest a solution?

3replies Oldest first
  • Oldest first
  • Newest first
  • Active threads
  • Popular
Like Follow
  • Status Answered
  • 2 mths agoLast active
  • 3Replies
  • 72Views
  • 2 Following