Help on a Calculated Formula for a Group.

Hey ThoughtSpotters!

I have a dataset as follows where a customer can be in multiple different categories.  I would like to create some type of “IF / OR" statement that creates group that I can use a filter in my pinboard or even dimension.

Note:  I would creating this in my “combined” worksheet that references a tables. 

Custno

NewGuy

FunGuy

BoringGuy

NerdGuy

Money

123456

TRUE

TRUE

FALSE

FALSE

50

234561

TRUE

FALSE

FALSE

FALSE

150

345612

FALSE

TRUE

FALSE

FALSE

250

456123

FALSE

TRUE

FALSE

TRUE

350

561234

FALSE

FALSE

TRUE

FALSE

450

612345

FALSE

FALSE

TRUE

TRUE

550

654321

FALSE

FALSE

FALSE

TRUE

650

543216

TRUE

FALSE

FALSE

TRUE

750

 

Here’s a sample of a total report with the new dimension as a filter.

Custno

NewGuy

FunGuy

BoringGuy

NerdGuy

8 Cust

3

3

2

4

Money

950

650

1000

2300

9replies Oldest first
  • Oldest first
  • Newest first
  • Active threads
  • Popular
  • Roland Schlichting it is possible to create a report as follows using count_if and sum_if. Will that suffice? Not sure what you mean by the filter.

     

    Cust Count

    NewGuy Count

    NewGuy Money

    FunGuy Count

    FunGuy Money

    BoringGuy Count

    BoringGuy Money

    NerdGuy Count

    NerdGuy Money

    8

    3

    950

    3

    650

    2

    1000

    4

    2300

    Reply Like
  • I am not exactly clear on your requirements, could you confirm if the following is what you are trying to do.

    1. You want a single field: Guy Type. This field has the values NewGuy, FunGuy, BoringGuy, NerdGuy. This field can be used as a PinBoard filter or a dimension.

    2. A customer could be classified as multiple Guy Types for a single transaction. Refer to 456123 above

    3. You three measures: Sum Money, Count Unique CustNo, Count CustNo per GuyType.

    If this is the case you could solve this with modelling the data into separate tables. 

    Table 1: Fact

    CustNo, Money

    123456, 50

    234561, 150

    345612, 250

     

    Table 2: Guy Type

    CustNo, Guy Typ

    123456, New Guy

    123456, Fun Guy

    234561, New Guy

    345612, Fun Guy

    Reply Like
  • Thank you Siva Singaram and Damian Waldron for responding and following up so quickly.  So to add some additional clarity:

    I would like to optimally create a field like "Guy Type" (as Damian mentioned) that I could add to a pinboard.  The pinboard would have metrics (in this case, # of Customers, Total Money).  When I select the "Guy Type" as a Pinboard Filter it updates the two metrics.

    A single client can indeed be different "Guy Types".  For example I think of myself as customer 612345 (Boring and Nerdy) lol. 

    The reason why I would like this:  Currently, if i want a pinboard for "New Guy" I would have to create an answer for New Guy = "true".  Then add to a pinboard.  If the user of the Pinboard wants to know the "Boring Guy" stats, another Answer needs to be created. 

    In a perfect world, I wouldn't need to create 4 answers to answer 4 questions, I would provide a pinboard and give the option to select on "Guy Type" Filter. Keeping it simple and leveraging one pinboard for multiple uses.

    Siva, for the idea of using a count_if and sum_if...  Does this mean I would create 8 formulas (2 each for each metric and guy type?).  When i'm profiling customers, i'm going to have many more "metrics" and "guy types" ...

    Thank you both in advance.

    Roland 

    Reply Like
  • Roland Schlichting I think you're looking to do a simple bucketing formula in the worksheet, like this one we do in training which buckets by customer age:

    if ( customer_age < 30 ) then 'under 30'
    else if ( customer_age >= 30 and customer_age < 50 ) then '30-49'
    else if ( customer_age >= 50 and customer_age < 65 ) then '50-64'
    else '65+'

    This formula acts like a derived column, and end users will be able to sort or filter on it, just like any column.

    The above works fine, but it does calculate on the fly. This means that if your dataset is very large (millions of rows), you're better off pushing this logic into the ETL to get the best performance for end users.

    Reply Like
  • Alicia Avrach Hello Alicia and hope all is well. 

    hmmm  I would say almost, but note that i am looking at 4 different fields (unlike the age example above with one) and the client can be in 1 or several of those fields. If i'm misreading this or you have a sample syntax i could use for my example i'm willing to try it out?

    Thanks...

    Reply Like
  • Roland Schlichting i agree as the result is not mutually exclusive so conditional IF statements will not really work. Can you split the tables as suggested in the model?

    Reply Like
  • Damian Waldron Not following Damian.  split tables?

    Reply Like
  • Roland Schlichting 

    As per my first post instead of loading the date as a single table with columns for each guytype load the data as two tables. Where table 1 has the customer ID and money and table two has the customer ID and a single column with the values applicable for the customer. Therefore there would be multiple rows for each customer if they exist in multiple columns. The below link is demonstrating this in excel with the unpivot function. Therefore changing the data from CrossTab to Tabular.

     

    https://www.excel-university.com/unpivot-excel-data/

    Reply Like
  • Damian Waldron  This looks complicated and time consuming to set up.  Sometimes the answer is "can't be done" or "no simple solution".  This may be one of those times, which is totally fine.

    Reply Like
Like Follow
  • 1 mth agoLast active
  • 9Replies
  • 80Views
  • 4 Following