Introduction
Oracle
Partitioning, first introduced in Oracle 8.0 in 1997, is one of the most
important and successful functionalities of the Oracle DB, improving the
performance, manageability, and availability for tens of thousands of
applications.
11gR2 introduces
the 9th generation of partitioning, enabling customers to model even more
business scenarios and making partitioning easier to use, enabling
“Partitioning for the masses”.
Partitioning allows a table, index, or index-organized
table to be subdivided into smaller pieces called partitions using single or set of columns called partitioning key. For ex, orders table
can be partitioned based on order_date as partitioning key. Each piece of such
a DB object is called a partition. Each partition has its own name, and
may optionally have its own storage characteristics.
Basics
of Partitioning
- Partitioning addresses key issues in supporting very large tables and indexes by decomposing them into smaller and more manageable pieces called partitions, which are entirely transparent to an application.
- SQL queries and DML statements do not need to be modified to access partitioned tables. However, after partitions are defined, DDL statements can access and manipulate individual partitions rather than entire tables or indexes. This is how partitioning can simplify the manageability of large database objects.
- Each partition of a TABLE / INDEX must have the same logical attributes, such as column names, data types, and constraints, but each partition can have separate physical attributes, such as compression enabled or disabled, physical storage settings, and tablespaces.
- Database objects - tables, indexes, and index-organized tables - are partitioned using a 'partitioning key', a set of columns that determine in which partition a given row will reside.
- However, the DBA can manage and store each monthly partition individually, potentially using different storage tiers, applying table compression to the older data, or store complete ranges of older data in read only tablespaces.
- From the perspective of a DBA, a partitioned object has multiple pieces that can be managed either collectively / individually. This gives the DBA considerable flexibility in managing a partitioned object.
- From the perspective of the Application, a partitioned table is identical to a non-partitioned table; no modifications are necessary when accessing a partitioned table using SQL DML commands.
- Note: All partitions of a partitioned object must reside in tablespaces of a single block size.
Partitioning Key: Each row in a partitioned table is
unambiguously assigned to a single partition. Partitioning key consists of one
or more columns that determine the partition where each row is stored. Oracle
automatically directs insert, update, and delete operations to the appropriate
partition with the partitioning key.
Partitioned Tables: Any table can be
partitioned into a million separate partitions except those tables containing
columns with LONG / LONG RAW data types. You can, however, use tables
containing columns with CLOB / BLOB data types.
Partitioned
Index-Organized Tables: They are very useful
for providing improved performance, manageability, and availability for
index-organized tables. For partitioning an index-organized table:
- Partition columns must be a subset of the primary key columns.
- Secondary indexes can be partitioned (both locally and globally).
- OVERFLOW data segments are always equi-partitioned with the table partitions.
Exactly, when to
use partitioning is a rather subjective decision. Some general guidelines that
Oracle and I suggest are listed below.
When to Partition
a Table
- Tables greater than 2 GB should always be considered as candidates for partitioning.
- Tables containing historical data, in which new data is added into the newest partition. A typical ex. is a historical table where only the current month's data is updatable and the other 11 months are read only.
- When the contents of a table need to be distributed across different types of storage devices.
When
to Partition an Index
- Avoid rebuilding the entire index when data is removed.
- Perform maintenance on parts of the data without invalidating the entire index.
- Reduce the impact of index skew caused by an index on a column with a monotonically increasing value
Benefits
of Partitioning
- It enables the DB objects to be managed and accessed at a finer level of granularity.
- It improves the performance of certain queries or maintenance operations by an order of magnitude.
- It can greatly reduce the total cost of data ownership, using a “tiered archiving” approach of keeping older relevant information still online on low cost storage devices.
- It enables an efficient and simple, yet very powerful approach when considering Information Lifecycle Management (ILM) for large environments.
- It also enables DB designers and DBA's to tackle some of the toughest problems posed by cutting-edge applications.
- It is a key tool for building multi-terabyte systems or systems with extremely high availability requirements.
- It can provide tremendous benefit to a wide variety of applications by improving Performance, Manageability, and Availability.
Partitioning
for
- Availability: Partitioned DB objects provide partition independence. This characteristic of partition independence can be an important part of a high-availability strategy.
- Manageability: It allows tables and indexes to be partitioned into smaller, more manageable units, providing DBA with the ability to pursue a "divide and conquer" approach to data management.
- Performance: By limiting the amount of data to be examined / operated on, partitioning provides a number of performance benefits. These features include: Partitioning Pruning and Partition-wise Joins.
Oracle
additionally provides a comprehensive set of SQL commands for managing
partitioning tables. These include commands for adding new partitions,
dropping, splitting, moving, merging, truncating, and optionally compressing
partitions.
Partitioning
Pruning / Partition Elimination
- It is the simplest and also the most substantial means to improve performance using partitioning. Partition pruning can often improve query performance by several orders of magnitude.
- In Partition Pruning, the optimizer analyzes FROM and WHERE clauses in SQL statements to eliminate unneeded partitions when building the partition access list. As a result, Oracle DB performs operations only on those partitions that are relevant to the SQL statement. Partition pruning dramatically reduces the amount of data retrieved from disk and shortens processing time, thus improving query performance and optimizing resource utilization.
- Partition pruning works with all of Oracle's other performance features. Oracle will utilize partition pruning in conjunction with any indexing technique, join technique, or parallel access method.
Partition-wise
Joins
- It breaks a large join into smaller joins that occur between each of the partitions, completing the overall join in less time. It can be applied when two tables are being joined together, and at least one of these tables is partitioned on the join key or when a reference partitioned table is joined with its parent table.
- It reduces query response time by minimizing the amount of data exchanged among parallel execution servers when joins execute in parallel. This significantly reduces response time and improves the use of both CPU and memory resources. Partition-wise joins can be Full or Partial. Oracle DB decides which type of join to use.
- It break a large join into smaller joins of “identical” data sets for the joined tables. “Identical” here is defined as covering exactly the same set of partitioning key values on both sides of the join, thus ensuring that only a join of these 'identical' data sets will produce a result and that other data sets do not have to be considered.
- Oracle is using either the fact of already (physical) equi-partitioned tables for the join or is transparently redistributing (= “repartitioning”) one table at runtime to create equi-partitioned data sets matching the partitioning of the other table, completing the overall join in less time. This offers significant performance benefits both for serial and parallel execution.
Evolution
of Partitioning in Oracle
Oracle Version #
|
Core functionality
|
Performance
|
Manageability
|
8
|
Range
partitioning
Global
range indexes
|
“Static”
partition pruning
|
Basic
maintenance operations:
add,
drop, exchange
|
8i
|
Hash
partitioning
Composite
range-hash partitioning
|
Partition-wise
joins
“Dynamic”
pruning
|
Merge
operation
|
9i
|
List
partitioning
|
Global
index maintenance
|
|
9i R2
|
Composite
range-list partitioning
|
Fast
partition split
|
|
10g
|
Global
hash indexes
|
Local
index maintenance
|
|
10g R2
|
1M
partitions per table
|
“Multi-dimensional”
pruning
|
Fast
drop table
|
11g
|
More
composite choices
REF
partitioning
Virtual
column partitioning
|
Interval
partitioning
Partition
Advisor
|
Partitioning
Strategies in 11g R2
Each partitioning strategy has different advantages and design
considerations. Thus, each strategy is more appropriate for a particular
situation.
Basic Partitioning Strategies
|
Partitioning Extensions
|
|
Single-level
partitioning
|
Composite
partitioning
|
Manageability
Extensions
|
Partitioning Strategies
|
Data Distribution
|
Sample Business Case
|
Single-level
partitioning
|
Based on consecutive
ranges of values
Based on unordered lists
of values
Based on a hash algorithm
|
Orders table range
partitioned by order_date
Orders table list
partitioned by country
Orders table range
partitioned by customer_id
|
Composite partitioning
|
Based on a combination of
two of the above mentioned basic techniques of Range, List, Hash, and
Interval
|
Orders table is range
partitioned by order_date and sub_partitioned by
|
Interval partitioning
|
An extension to Range
Partition.
Defined by an interval,
providing equi-width ranges. With exception of the first partition all
partitions are automatically created on-demand when matching data arrives.
|
Orders table partitioned
by order_date with a predefined daily interval, starting with '01-Jan-2012'
|
REF Partitioning or
Partition by Reference
|
Partitioning for a child
table is inherited from the parent table through a primary key - foreign key
relationship. The partitioning keys are not stored in actual columns in the
child table.
|
(parent) Orders table
range partitioned by order_date and inherits the partitioning technique to
(child) order lines table. Column order_date is only present in the parent
orders table
|
Virtual
column-based
partitioning
|
Defined by one of the
above mentioned partition techniques and the partitioning key is based on a
virtual column. Virtual columns are not stored on disk and only exist as
metadata.
|
Orders table has a virtual
column that derives the sales region based on the first three digits of the
customer account number. The orders table is then list partitioned by sales
region.
|
Single-Level Partitioning: A table is
defined by specifying one of the following data distribution methodologies,
using one or more columns as the partitioning key:
- Range: It maps data to partitions based on ranges of values of the partitioning key that you establish for each partition. It is the most common type of partitioning and is often used with dates.
- List: It enables you to explicitly control how rows map to partitions by specifying a list of discrete values for the partitioning key in the description for each partition. The advantage of list partitioning is that you can group and organize unordered and unrelated sets of data in a natural way.
- Hash: It maps data to partitions based on a hashing algorithm that Oracle applies to the partitioning key that you identify. Hashing algorithm evenly distributes rows among partitions, giving partitions approximately the same size.
Composite
Partitioning: It is a combination of the basic data
distribution methods; a table is partitioned by one data distribution method
and then each partition is further subdivided into subpartitions using a second
data distribution method. It supports historical operations, such as adding new
range partitions, but also provides higher degrees of potential partition
pruning and finer granularity of data placement through subpartitioning.
- Range-Range: It enables logical range partitioning along two dimensions; for example, partition by order_date and range subpartition by shipping_date.
- Range-Hash: It partitions data using the range method, and within each partition, subpartitions it using the hash method. Composite range-hash partitioning provides the improved manageability of range partitioning and the data placement, striping, and parallelism advantages of hash partitioning.
- Range-List: It partitions data using the range method, and within each partition, subpartitions it using the list method. Composite range-list partitioning provides the manageability of range partitioning and the explicit control of list partitioning for the subpartitions.
- List-Range: It enables logical range subpartitioning within a given list partitioning strategy; for example, list partition by country_id and range subpartition by order_date.
- List-Hash: It enables hash subpartitioning of a list-partitioned object; for example, to enable partition-wise joins.
- List-List: It enables logical list partitioning along two dimensions; for example, list partition by country_id and list subpartition by sales_channel.
Manageability Extensions: The following extensions
significantly enhance the manageability of partitioned tables
Interval
Partitioning: It is an extension of range partitioning
which instructs the DB to automatically create partitions of a specified
interval when data inserted into the table exceeds all of the existing range
partitions. Range partitioning key value determines the high value of the range
partitions, which is called the transition point, and DB creates interval
partitions for data with values that are beyond that transition point.
When using interval
partitioning, consider the following restrictions:
- You can only specify one partitioning key column, and it must be of NUMBER or DATE type.
- Interval partitioning is not supported for index-organized tables.
- You cannot create a domain index on an interval-partitioned table.
You can create
single-level interval partitioned tables and the following composite
partitioned tables:
- Interval-Range
- Interval-Hash
- Interval-List
Partition Advisor: It is part of the SQL Access Advisor. Beginning with 11gR2, the SQL Access Advisor
has been enhanced to generate partitioning recommendations, in addition to the
ones it already provides for indexes, materialized views, and materialized view
logs.
- It can recommend a partitioning strategy for a table based on a supplied workload of SQL statements which can be supplied by the SQL Cache, a SQL Tuning set, or be defined by the user.
- Recommendations generated by the SQL Access Advisor - either for Partitioning only or holistically - will show the anticipated performance gains that will result if they are implemented. The generated script can either be implemented manually or submitted onto a queue within Oracle Enterprise Manager.
- With the extension of partitioning advice, customers not only can get recommendation specifically for partitioning but also a more comprehensive holistic recommendation of SQL Access Advisor, improving the collective performance of SQL statements overall.
- The Partition Advisor, integrated into the SQL Access Advisor, is part of Oracle's Tuning Pack, an extra licensable option. It can be used from within Enterprise Manager or via a command line interface.
Partitioning
Key Extensions: The following extensions extend the
flexibility in defining partitioning keys
Reference partitioning: It enables the partitioning of two tables that are related to one
another by referential constraints. The partitioning key is resolved through an
existing parent-child relationship, enforced by enabled and active primary key
and foreign key constraints.
- The benefit of this extension is that tables with a parent-child relationship can be logically equi-partitioned by inheriting the partitioning key from the parent table without duplicating the key columns. The logical dependency also automatically cascades partition maintenance operations, thus making application development easier and less error-prone.
- All basic partitioning strategies are available for reference partitioning. Interval partitioning cannot be used with reference partitioning.
- Virtual columns remove that restriction and enable the partitioning key to be defined by an expression, using one or more existing columns of a table. The expression is stored as metadata only.
- Virtual column-based partitioning is supported with all basic partitioning strategies, including reference partitioning, and interval and interval-* composite partitioning.
Partitioned Indexes
Just like partitioned tables, partitioned indexes improve manageability,
availability, performance, and scalability. They can either be partitioned
independently (global indexes) or automatically linked to a table's
partitioning method (local indexes). In general, you should use global indexes
for OLTP applications and local indexes for DW or DSS applications. Also,
whenever possible, try to use local indexes because they are easier to manage.
Types of Partitioned
Indexes
- Local Indexes: A local index is an index on a partitioned table that is coupled with the underlying partitioned table, 'inheriting' the partitioning strategy from the table. Consequently, each partition of a local index corresponds to one - and only one - partition of the underlying table. The coupling enables optimized partition maintenance; for example, when a table partition is dropped, Oracle simply has to drop the corresponding index partition as well. No costly index maintenance is required. Local indexes are most common in DW environments.
- Global Partitioned Indexes: A global partitioned index is an index on a partitioned or non-partitioned table that is partitioned using a different partitioning-key / partitioning strategy than the table. Global-partitioned indexes can be partitioned using range or hash partitioning and are uncoupled from the underlying table. For ex, a table could be range-partitioned by month and have 12 partitions, while an index on that table could be hash-partitioned using a different partitioning key and have a different number of partitions. Global partitioned indexes are more common for OLTP than for DW environments.
- Global Non-Partitioned Indexes: A global non-partitioned index is essentially identical to an index on a non-partitioned table. The index structure is not partitioned and uncoupled from the underlying table. In DW environments, the most common usage of global non-partitioned indexes is to enforce PK constraints. OLTP environments on the other hand mostly rely on global non-partitioned indexes.
Deciding on the Type of Partitioned Index to
use: Consider the following guidelines as shown below
- If the table partitioning column is a subset of the index keys, then use a local index. If this is the case, then you are finished. If this is not the case, then continue to guideline 2.
- If the index is unique and does not include the partitioning key columns, then use a global index. If this is the case, then you are finished. Otherwise, continue to guideline 3.
- If your priority is manageability, then use a local index. If this is the case, then you are finished. If this is not the case, continue to guideline 4.
- If the application is an OLTP type and users need quick response times, and then use a global index. If the application is a DSS type and users are more interested in throughput, and then use a local index.
Miscellaneous
Information about Creating Indexes on Partitioned Tables
- You can create bitmap indexes on partitioned tables, with the restriction that the bitmap indexes must be local to the partitioned table. They cannot be global indexes.
- Global indexes can be unique. Local indexes can only be unique if the partitioning key is a part of the index key.
Partitioned
Indexes on Composite Partitions
Here are a few points to remember when using partitioned indexes on
composite partitions:
- Subpartitioned indexes are always local and stored with the table subpartition by default.
- Tablespaces can be specified at either index or index subpartition levels.