My test table will have 1000000 rows with 1000 NULL values for “OBJECT_NAME” column.
Tests are performed on 11gR1 version.
Table/column info:
select c.table_name, c.column_name, c.data_type, c.num_nulls, t.num_rows table_num_rows from dba_tab_columns c, dba_tables t where 1=1 and c.table_name = t.table_name and c.table_name = 'OBJECTS' and c.column_name = 'OBJECT_NAME'; TABLE_NAME COLUMN_NAME DATA_TYPE NUM_NULLS TABLE_NUM_ROWS ------------- --------------- ------------ ---------- -------------- OBJECTS OBJECT_NAME VARCHAR2 1000 1000000
Goal:
Count how many OBJECT_NAMEs are NULL.
Index object_name column using classic B-tree index.
create index idx_objname on objects(object_name); select count(*) from objects where object_name is null; COUNT(*) ---------- 1000 ------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | | | 4252 (100)| | | 1 | SORT AGGREGATE | | 1 | 25 | | | |* 2 | TABLE ACCESS FULL| OBJECTS | 1000 | 25000 | 4252 (1)| 00:00:52 | ------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter("OBJECT_NAME" IS NULL) drop index idx_objname;
FULL SCAN is used because NULL values are not indexed for single column indexes.
For a workaround you could create function based index to help yourself. This was (till now) my preferred option.
create index idx_objname_f on objects(nvl(object_name,'XX')); select count(*) from objects where nvl(object_name,'XX')='XX'; COUNT(*) ---------- 1000 ----------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 3 (100)| | | 1 | SORT AGGREGATE | | 1 | 17 | | | |* 2 | INDEX RANGE SCAN| IDX_OBJNAME_F | 10000 | 166K| 3 (0)| 00:00:01 | ----------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("OBJECTS"."SYS_NC00017$"='XX')
Now INDEX is used but as you can see I had to modify initial query using function to get number of NULLs.
Besides that, this index is unusable if I want to get other values then NULL.
select count(*) from objects where object_name='TEST'; COUNT(*) ---------- 16 ------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | | | 4253 (100)| | | 1 | SORT AGGREGATE | | 1 | 25 | | | |* 2 | TABLE ACCESS FULL| OBJECTS | 28 | 700 | 4253 (1)| 00:00:52 | ------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter("OBJECT_NAME"='TEST') drop index idx_objname_f;
The trick that I wanted to show.
In composite index NULL values are stored for leading column when second column is not NULL. I will use dummy value for second column.
create index idx_objname_d on objects(object_name,'X'); select count(*) from objects where object_name is null; COUNT(*) ---------- 1000 ----------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 8 (100)| | | 1 | SORT AGGREGATE | | 1 | 25 | | | |* 2 | INDEX RANGE SCAN| IDX_OBJNAME_D | 1000 | 25000 | 8 (0)| 00:00:01 | ----------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("OBJECT_NAME" IS NULL)
Index is used to count NULL values.
You can also use this index to get other values.
select count(*) from objects where object_name='TEST'; COUNT(*) ---------- 16 ----------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 3 (100)| | | 1 | SORT AGGREGATE | | 1 | 25 | | | |* 2 | INDEX RANGE SCAN| IDX_OBJNAME_D | 28 | 700 | 3 (0)| 00:00:01 | ----------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("OBJECT_NAME"='TEST') drop index idx_objname_d;
This trick was so nice discovery for me that deserved blog post.
It is worth to mention that bitmap index also stores NULL values.
create bitmap index idx_objname_d on objects(object_name); select count(*) from objects where object_name is null; COUNT(*) ---------- 1000 --------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 3 (100)| | | 1 | SORT AGGREGATE | | 1 | 25 | | | | 2 | BITMAP CONVERSION COUNT | | 1000 | 25000 | 3 (0)| 00:00:01 | |* 3 | BITMAP INDEX SINGLE VALUE| IDX_OBJNAME_D | | | | | --------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - access("OBJECT_NAME" IS NULL)
Clean up!
drop table objects;
Good one
ReplyDelete