Nested Aggregation - Percentage compared to Max aggregated value

Hi community,

I'm stuck with (maybe) a simple calculus regarding flexible aggregation functions.

For each category, I need to calculate the percentage compared to the max values (both aggregated).

This could be a sample dataset:

Date Category Value
01/01/2019 A 2000
02/01/2019 B 1000
03/01/2019 A 1000
01/01/2019 B 1000
02/01/2019 C 500
03/01/2019 A 1000

And this is the output I'd love to get (see image attached).

Category Value % compared to Max
A 4000 100%
B 2000 50%
C 500 12,50%

Basically I'm stuck with the nested calculus because I need "first" to get Sum for each caterory (A-4000; B-2000; C-500), and "then" comparing those results to the maximum aggregated value (4000).

Probably I'm missing something because I end up with some chasm trap.

 

Can anyone help?

 

Thanks in advance.

8replies Oldest first
  • Oldest first
  • Newest first
  • Active threads
  • Popular
  • I am not at a system to check this at the moment, but I would have thought that you could use the group_aggregate function to help with this, e.g. something like

    safe_divide(sum(value),max(group_aggregate(sum(value),{category},{})))

    Reply Like
  • Actually, not sure that will work, I suspect it will still only give you the max within the category, rather than across categories.  Will have a rethink.

    Reply Like
  • I resolve this using a view. I would be interested if this can be resolved without a view.

    Step 1: Create View at Category Level

    Step 2: Use View as source to create ratio where the formulas are:

    max_category = group_aggregate (max ( total value ) , {} , query_filters ( ) ) 

    ratio = (total value / max_category ) *100

    Reply Like 1
  • Damian Waldron nice work! 

    Reply Like
  • Paul Froggatt 
    Thanks for your feedback, but actually every category returns always 100%.

    Reply Like
  • Damian Waldron 

    Hi Damian, thanks for your suggestion.

    I got your point doing a pre-aggregation first but as far as I understand, this method may not work once you want to filter date or other potential dimension in the original dataset.

    At the end of the day I would need a formula at original datasource level in order to combine 2 different aggregation calculated in 2 different steps:
    1) sum values for each category
    2) compare "those" sum(values) to the overall max(values)

    Something like:
    sum(sales)/window_max(sum(sales))

     

    Does this make sense to you?

     

    Thanks a lot in advance

    Reply Like
  • You may be able to use the nested In keyword to achieve something like this 

    Reply Like
  • Marc Price 

    I don't think that the IN keyword could be useful in this scenario because of the different level of detail of the 2 steps; furthermore I'm not sure it can be used "inside" a formula.

    Did you use it for a similar purpouse?

    Reply Like
Like1 Follow
  • 1 Likes
  • 2 wk agoLast active
  • 8Replies
  • 28Views
  • 5 Following