| ||||||||||
Oracle News |
Home > Oracle Optimizer
Prior to beginning the actual work to satisfy a query Oracle takes a look
around and decides on what it thinks is the best way to perform the query. A
table may have a number of indexes, they may be composite, unique, functional
etc. The statement may have a number of joins across many tables with many
conditions and there could be various ways to perform the same operation.
The optimiser gathers information about the various options and considers how
a full table scan, an index scan and different ordering of those scans would
result in different performance and workload.
There are two main modes in which the optimiser can operate. The Optimiser
Mode as it is known is set on the database instance by the DBA's. Oracle DBA can find
out which mode is in use with the following query.
In a batch operation where no users are involved it is preferable to have the
entire job completed with the minimum of work whereas when Oracle DBA have a user
waiting for a result, it is preferable to have the first item in a list appear
while Oracle is still working on obtaining the rest of the list.
As a very simple rule of thumb, databases on Oracle versions upto 7.3 are
usually configured to use the rule based optimiser. From version 8 onwards, the
cost based optimiser became the default, recommended method and is usually in
use.
Learn Oracle - Table and index statistics For rule based optimisation, table and index statistics are not required. For
cost based optimisation the statistics need to exist on atleast one of the
tables being referenced.
Statistics are needed by the cost based optimiser in order for it to
determine the best access path to the data Oracle DBA require. Consider a table
containing the gender and age of 20000 people. If Oracle DBA wanted to obtain all the
males aged 34 and the table had an index on gender and an index on age there
would be two possibilities.
It is clear then that somehow Oracle needs to know the makeup of an index, ie
how selective it is for the given data (or index expression). Oracle
documentation is not particularly clear to me on the terminology used. They
refer to selectivity and cardinality. I'll probably get shot down for this, but
in my mind I call it the granularity of the index.
The granularity of the gender index could be said to be about 0.5 and the
granularity of the age index about 0.01.
These figures can be obtained by instructing Oracle to Analyze the table and
it's indexes.
Since writing this, it has been brought to my attention that Tom Kyte (a
virtual reference library of Oracle knowledge) has said that for gathering
statistics ... When instructing Oracle to perform object or schema analysis Oracle DBA can also
indicate whether it should compute the statistics (perform exact counts to get
the statistics) or estimate the statistics (work on a subset of the data to come
up with a reasonable estimation). For very large tables it is usually best to
estimate the statistics.
Refer to Oracle documentation for exact syntax of the ANALYZE TABLE command,
DBMS_UTILITY functions and DBMS_STATS functions applicable to your database
version.
When SQL is being optimised through the RULE based optimiser there is no
benefit to having statistics on your tables or indexes. The statistics are not
used to effect the actual access paths.
When using the cost based optimiser Oracle DBA should also ensure that your
statistics are up to date. The statistics on a table or index are correct at the
time that table or index was last analysed. If the demographics of your table
have changed substantially Oracle DBA should bring the statistics up to date by having
the statistics re-estimated or computed.
Learn Oracle - The rule based optimiser
The rule based optimiser, as it's name suggests follows a set of fairly
comprehensive rules for accessing data. The rules are ranked in order of usual
performance and Oracle will always use the highest ranked rules to perform the
data access.
Essentially, each predicate (part of your where clause) causes workload. By
looking at what indexes are available and what would be the likely workload
against that index, Oracle forms it's execution plan. For instance, Oracle would
usually execute an equals predicate first rather than a range predicate over an
index. Quite simply the equals predicate is generally more likely to return less
data and so have less disk I/O.
The rules are a little more complex than this and the Oracle documentation
goes into some detail explaining each of the rules and their ranks.
Prior to version 8, the rule based optimiser was the optimiser of choice, and
the default for a database. The cost based optimiser was still a little flaky
and the reliability of the rule based optimiser to produce consistent access
plans was a definate advantage.
Learn Oracle - The cost based optimiser
The cost based optimiser determines the cost of various different access
methods against a query and will always use the method with the lowest cost.
The Oracle documentation both gives a measurable unit to the cost (one unit
of Disk I/O but can also be configured to measure network traffic or CPU work)
and also states that the Cost has no actual measure, ie it is there as a guide.
Thanks guys.
Whilst the cost figure can be a useful indicator of a statement, it should
not be used as the final target for your tuning goals. As the cost may or may
not only measures disk I/O it may or may not not include the amount of memory
used by your query, or the amount of CPU time.
That all sounds a bit confusing, so in plain terms the Cost is calculated as
some arbitrary unit of work. It's only value is in comparing one execution plan
to another, but do not rely on it as the single measure of workload.
Consider the following... There are some scenarios where the cost based optimiser can get it wrong. In
the standard form of gathering statistics there is no allowance for how data may
be 'skewed'. An example would be a table recording the favourite number of 1
million different people. Quite likely a large number of people might choose 7
as their favourite number. For arguments sake let's say it's 95%.
We could have the "number" column indexed but under the cost based optimiser
with standard statistics, a query on that table of the form To allow the optimiser to understand the data skew, histograms can be
generated on the data as part of the statistics gathering process. The actuall
processes involved and the background to understanding histograms appears fairly
involved. I need to understand them far better myself before writing about them
here. Watch for a separate paper on histograms.
The cost based optimiser is the optimiser of choice for databases from
version 8 onwards. It does require that upto date statistics be maintained for
your indexes and tables and while preferable to the rule based optimiser it
still has some pitfalls.
Want to share or request Oracle Tutorial articles to become a Oracle DBA. Direct your requests
to
surbhidba@yahoo.com |
|||||||||
|---|---|---|---|---|---|---|---|---|---|---|