Monday, November 13, 2017

HASH GROUP BY not used when using more that 354 aggregate functions

Few days ago we had performance problem with one of our main application views. This was complex view that used a lot of aggregate function. Functions were used to transpose rows into columns.

When developer added few more aggregate functions for a new columns, query performance changed significantly and we had performance problem.

After quick analysis I have noticed one change in the execution plan.

HASH GROUP BY aggregation was replaced with less performant SORT GROUP BY. I have tried to force HASH GROUP BY using hints but nothing helped.


We tried to reproduce problem using dummy tables and then colleague found what was triggering plan change.

In this example I have query with 354 unique aggregate functions which is using HASH GROUP BY.

SELECT
          *
      FROM (SELECT LEVEL ID
             FROM DUAL CONNECT BY LEVEL < 1000) VANJSKI,
          (  SELECT
                    123 UNUTARNJI_ID,
                     sum(1) kolona0,
                     sum(1) kolona1,
                     sum(2) kolona2,
...
...
...
                     sum(350) kolona350    ,
                     sum(351) kolona351    ,
                     sum(352) kolona352    ,
                     sum(353) kolona353    ,
                     sum(354) kolona354
               FROM DUAL
           GROUP BY 123) UNUTARNJI
    WHERE     VANJSKI.ID = UNUTARNJI.UNUTARNJI_ID(+);

Plan hash value: 2294628051
---------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| A-Rows |   A-Time   |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |      |      1 |        |       |     5 (100)|    999 |00:00:00.01 |       |       |          |
|*  1 |  HASH JOIN OUTER               |      |      1 |      1 |  4631 |     5  (20)|    999 |00:00:00.01 |  2293K|  2293K| 1549K (0)|
|   2 |   VIEW                         |      |      1 |      1 |    13 |     2   (0)|    999 |00:00:00.01 |       |       |          |
|   3 |    CONNECT BY WITHOUT FILTERING|      |      1 |        |       |            |    999 |00:00:00.01 |       |       |          |
|   4 |     FAST DUAL                  |      |      1 |      1 |       |     2   (0)|      1 |00:00:00.01 |       |       |          |
|   5 |   VIEW                         |      |      1 |      1 |  4618 |     2   (0)|      1 |00:00:00.01 |       |       |          |
|   6 |    HASH GROUP BY               |      |      1 |      1 |       |     2   (0)|      1 |00:00:00.01 |   677K|   677K|  723K (0)|
|   7 |     FAST DUAL                  |      |      1 |      1 |       |     2   (0)|      1 |00:00:00.01 |       |       |          |
---------------------------------------------------------------------------------------------------------------------------------------

Notice what will happen if I change "sum(1) kolona0" function and add one more unique function.

SELECT
          *
      FROM (SELECT LEVEL ID
             FROM DUAL CONNECT BY LEVEL < 1000) VANJSKI,
          (  SELECT
                    123 UNUTARNJI_ID,
                     sum(355) kolona0,
                     sum(1) kolona1,
                     sum(2) kolona2,
...
...
...
                     sum(350) kolona350    ,
                     sum(351) kolona351    ,
                     sum(352) kolona352    ,
                     sum(353) kolona353    ,
                     sum(354) kolona354
               FROM DUAL
           GROUP BY 123) UNUTARNJI
    WHERE     VANJSKI.ID = UNUTARNJI.UNUTARNJI_ID(+);

Plan hash value: 2326946862
---------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| A-Rows |   A-Time   |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |      |      1 |        |       |     5 (100)|    999 |00:00:00.01 |       |       |          |
|*  1 |  HASH JOIN OUTER               |      |      1 |      1 |  4631 |     5  (20)|    999 |00:00:00.01 |  2293K|  2293K| 1645K (0)|
|   2 |   VIEW                         |      |      1 |      1 |    13 |     2   (0)|    999 |00:00:00.01 |       |       |          |
|   3 |    CONNECT BY WITHOUT FILTERING|      |      1 |        |       |            |    999 |00:00:00.01 |       |       |          |
|   4 |     FAST DUAL                  |      |      1 |      1 |       |     2   (0)|      1 |00:00:00.01 |       |       |          |
|   5 |   VIEW                         |      |      1 |      1 |  4618 |     2   (0)|      1 |00:00:00.01 |       |       |          |
|   6 |    SORT GROUP BY               |      |      1 |      1 |       |     2   (0)|      1 |00:00:00.01 | 20480 | 20480 |18432  (0)|
|   7 |     FAST DUAL                  |      |      1 |      1 |       |     2   (0)|      1 |00:00:00.01 |       |       |          |
---------------------------------------------------------------------------------------------------------------------------------------

Query execution plan changed - HASH GROUP BY was replaced with SORT GROUP BY.


This was obviously limitation for HASH GROUP BY but I couldn't find more information using Oracle docs or Google so I have asked Oracle support for a confirmation.

From Oracle support I have received answer that similar case was bug closed as not bug, without workaround. Using default DB_BLOCK_SIZE, the limitation is set to 354 aggregate functions.

To solve performance problem we have changed view to avoid HASH GROUP BY limitation.

Testing environment - Oracle Database 12c Enterprise Edition Release 12.1.0.2.0



2 comments:

  1. >To solve performance problem we have changed view to avoid HASH GROUP BY limitation.
    Have you tried forsing old plan with a hint?

    ReplyDelete
  2. Yep! Testing with HINT was one of my first steps but couldn't force HASH GROUP BY.
    As you can see this test example is using dummy DUAL table so you can easily run it at your environment.

    Maybe I was doing something wrong :-)

    Regards,
    Marko

    ReplyDelete