| Oracle®
Database Data Warehousing Guide 10g Release 1 (10.1) Part Number B10736-01 |
|
|
View PDF |
This chapter covers tuning in a parallel execution environment and discusses:
Miscellaneous Parallel Execution Tuning Tips
|
Note: Some features described in this chapter are available only if you have purchased Oracle Database Enterprise Edition with the Oracle Real Application Clusters Option. |
Parallel execution dramatically reduces response time for data-intensive operations on large databases typically associated with decision support system s (DSS) and data warehouses. You can also implement parallel execution on certain types of online transaction processing (OLTP) and h ybrid systems. Parallel execution improves processing for:
Queries requiring large table scans, joins, o r partitioned index scans
Creation of large indexes
Creation of large table s (including materialized views)
Bulk inserts, updates, merges, and deletes
You can als o use parallel execution to access object types within an Oracle database. For example, you can use parallel execution to access larg e objects (LOBs).
Parallel execution benefits systems with all of the following chara cteristics:
Symmetric multiprocessors (SMPs), clusters, or massively parallel systems
Underutilized or intermittently used CPUs (for example, systems where CPU usage is typically less than 30%)
Sufficient memory to support additional memory-intensive processes, such as sorts, hashing, and I/O buffers
If your system lacks any of these characteristics, parallel execution might not significantly improve performance. In fact, parallel execution may reduce system performance on overutilized systems or systems with small I/O bandwidth.
The benefits of parallel execution can be seen in DSS and data warehousing envi
ronments. OLTP systems can also benefit from parallel execution during batch processing and during schema maintenance operations such
as creation of indexes. The average simple DML or SELECT statements that characterize OLTP applications would not see a
ny benefit from being executed in parallel.
Parallel execution is not normally useful for:
Environments in which the typical query or transaction is very short (a few seconds or less). This includes most online transaction systems. Parallel execution is not useful in these environments because there is a cost associated with coordinating the parallel execution servers; for short transactions, the cost of this coordination may out weigh the benefits of parallelism.
Environments in which the CPU, memory, or I/O resources are already h eavily utilized. Parallel execution is designed to exploit additional available hardware resources; if no such resources are availabl e, then parallel execution will not yield any benefits and indeed may be detrimental to performance.
You can use parallel execution for any of the following:
Access methods
Some examples are table s cans, index full scans, and partitioned index range scans.
Join methods
Some examples are nested loop, sort merge, hash, and star transformation.
DDL statements
Some examples are CREATE TABLE AS SELECT, CREATE INDEX, REBUILD INDEX
, REBUILD INDEX PARTITION, and MOVE/SPLIT/COALESCE PARTITION.
You can normally use parallel DDL where you use regular DDL. There are, however, some additional de
tails to consider when designing your database. One important
restriction is that parallel DDL cannot be used on tables with object or LOB columns.
All of these DDL operations can be performed in NOLOGGING mode f
or either parallel or serial execution.
The CREATE TABLE statement for an index-organized table can be paralle
lized either with or without an AS SELECT clause.
Different parallelism is used for different operat ions. Parallel create (partitioned) table as select and parallel create (partitioned) index run with a degree of parallelism equal to the number of partitions.
Parallel operations require accurate statistics to perform optimally.
DML statements
Some examples are INSERT AS SELECT, updates, deletes, and MERGE operations.
Parallel DML (parallel insert, update, merge, and delete) uses parallel execution mechanisms to speed up or s
cale up large DML operations against large database tables and indexes. You can also use INSERT ... SELECT
statements to insert rows into multiple tables as part of a single DML statement. You can normally use parallel DML where you use reg
ular DML.
Although data manipulation language (DML) normally includes queries, the term parallel DML refers only to inserts, u pdates, upserts and deletes done in parallel.
Miscellaneous SQL operations
Some examples are BY, NOT IN, SELECT DISTINCT, UNION, ALL, CUBE, and ROLLUP, as well as aggregate and table functions.
Parallel query
You can parallelize queries and subqueries in SELECT statements, as well as the que
ry portions of DDL statements and DML statements (INSERT, UPDATE, DELETE, and MERGE).
SQL*Loader
You can parallelize the use of SQL*Loader, where large amounts of data are routinely encountered. To speed up your loads, you can use a parallel direct-path load as in the following example:
sqlldr USERI D=SCOTT/TIGER CONTROL=LOAD1.CTL DIRECT=TRUE PARALLEL=TRUE sqlldr USERID=SCOTT/TIGER CONTROL=LOAD2.CTL DIRECT=TRUE PARALLEL=TRUE sqlld r USERID=SCOTT/TIGER CONTROL=LOAD3.CTL DIRECT=TRUE PARALLEL=TRUE
You can also use a parameter file to achieve the same th ing.
An important point to remember is that indexes are not maintained during a parallel load.
Parallel executio n divides the task of executing a SQL statement into multiple small units, each of which is executed by a separate process. Also the incoming data (tables, indexes, partitions) can be divided into parts called granules. The user shadow process that wants to execute a query in parallel takes on the role as parallel execution coordinator or query coordinator. The query coordinator does the followin g:
Parses the query and determines the degree of parallelism
Allocates one or two set of slaves (threads or processes)
Controls the query and sends instructions to the PQ slaves p>
Determines which tables or indexes need to be scanned by the PQ slaves
Produ ces the final output to the user
The parallel execution coordinator may enlist two or more of the instance's parallel execution servers to process a SQL statement. The number of parallel execution servers associated with a s ingle operation is known as the degree of parallelism.
A single operation is a part of a SQL statement such as an order by, a full table scan to perform a join on a nonindexed column table.
Note that the degree of parallelism applies directly only to i ntra-operation parallelism. If inter-operation parallelism is possible, the total number of parallel execution servers for a statemen t can be twice the specified degree of parallelism. No more than two sets of parallel execution servers can run simultaneously. Each set of parallel execution servers may process multiple operations. Only two sets of parallel execution servers need to be active to g uarantee optimal inter-operation parallelism.
Parallel execution is designed to effectively use multiple CPUs and disks to ans wer queries quickly. When multiple users use parallel execution at the same time, it is easy to quickly exhaust available CPU, memory , and disk resources.
Oracle Database provides several ways to manage resource utilization in conjunction with parallel execut ion environments, including:
The adaptive multiuser algorithm, which is enabled by default, reduces the degree of parallelism as the load on the system increases.
User resource limits and profiles, which allo w you to set limits on the amount of various system resources available to each user as part of a user's security domain.
The Database Resource Manager, which lets you allocate resources to different groups of users.
When an instance starts up, Oracle creat
es a pool of parallel execution servers which are available for any parallel operation. The initialization parameter PARAL
LEL_MIN_SERVERS specifies the number of parallel execution servers that Oracle Database creates at instance startup.
Wh en executing a parallel operation, the parallel execution coordinator obtains parallel execution servers from the pool and assigns th em to the operation. If necessary, Oracle can create additional parallel execution servers for the operation. These parallel exe cution servers remain with the operation throughout job execution, then become available for other operations. After the statement ha s been processed completely, the parallel execution servers return to the pool.
Note that the parallel execution coordinator a nd the parallel execution servers can only service one statement at a time. A parallel execution coordinator cannot coordinate, for e xample, a parallel query and a parallel DML statement at the same time.
When a user issues a SQL statement, the optimizer deci des whether to execute the operations in parallel and determines the degree of parallelism (DOP) for each operation. You can specify the number of parallel execution servers required for an operation in various ways.
If the optimizer targets the statement for parallel processing, the following sequence of events takes place:
The SQL statement's foregroun d process becomes a parallel execution coordinator.
The parallel execution coordinator obtains as many parallel exec ution servers as needed (determined by the DOP) from the server pool or creates new parallel execution servers as needed.
Oracle executes the statement as a sequence of operations. Each operation is performed in parallel, if possible.
When statement processing is completed, the coordinator returns any resulting data to the user process that issued the statement a nd returns the parallel execution servers to the server pool.
The parallel execution coordinator calls upon the parallel execution servers during the execution of the
SQL statement, not during the parsing of the statement. Therefore, when parallel execution is used with the shared server, the server
process that processes the EXECUTE call of a user's statement becomes the parallel execution coordinator for the statem
ent. See "Setting the Degree of Parallelism for Parallel Execution" for more information.
If the number of parallel operations processed concurrently by an instance changes sign ificantly, Oracle automatically changes the number of parallel execution servers in the pool.
If the number of parallel operations increases, Oracle creates additional parallel
execution servers to handle incoming requests. However, Oracle never creates more parallel execution servers for an instance than the
value specified by the initialization parameter PARALLEL_MAX_SERVERS.
<
a id="sthref1898" name="sthref1898">If the number of parallel operations decreases, Oracle terminates any parallel execution serv
ers that have been idle for a threshold period of time. Oracle does not reduce the size of the pool less than the value of PARA
LLEL_MIN_SERVERS, no matter how long the parallel execution servers have been idle.
Oracle can process a parallel operation with fewer than the requested number of pro cesses.
If all parallel execution servers in the pool are occupied and the maximum number of parallel execution servers has be en started, the parallel execution coordinator switches to serial processing.
See "Minimum Number of Paral
lel Execution Servers" for information about using the initialization parameter PARALLEL_MIN_PERCENT and "Tuning General Parameters for Parallel Execution" for information about the PARALLEL_MIN_PERCENT and PARALLEL_MAX_SERVERS initialization parameters.
To execute a query in parallel, Oracle generally creates a producer queue server and a consumer se rver. The producer queue server retrieves rows from tables and the consumer server performs operations such as join, sort, DML, and D DL on these rows. Each server in the producer execution process set has a connection to each server in the consumer set. This means t hat the number of virtual connections between parallel execution servers increases as the square of the DOP.
Each communicatio n channel has at least one, and sometimes up to four memory buffers. Multiple memory buffers facilitate asynchronous communication am ong the parallel execution servers.
A single-instance environment uses at most three buffers for each communication channel. A n Oracle Real Application Clusters environment uses at most four buffers for each channel. Figure 24-1 illust rates message buffers and how producer parallel execution servers connect to consumer parallel execution servers.
Figure 24-1 Parallel Execution Server Connections and Buffers

When a connection is between two processes on the same instance , the servers communicate by passing the buffers back and forth. When the connection is between processes in different instances, the messages are sent using external high-speed network protocols. In Figure 24-1, the DOP is equal to the numbe r of parallel execution servers, which in this case is n. Figure 24-1 does not show the parallel exe cution coordinator. Each parallel execution server actually has an additional connection to the parallel execution coordinator.
div>Each SQL statement undergoes an optimizat ion and parallelization process when it is parsed. When the data changes, if a more optimal execution or parallelization plan becomes available, Oracle can automatically adapt to the new situation.
After the optimizer determines the execution plan of a statem ent, the parallel execution coordinator determines the parallelization method for each operation in the plan. For example, the parall elization method might be to parallelize a full table scan by block range or parallelize an index range scan by partition. The coordi nator must decide whether an operation can be performed in parallel and, if so, how many parallel execution servers to enlist. The nu mber of parallel execution servers in one set is the DOP. See "Setting the Degree of Parallelism for Parallel Exe cution" for more information.
T he parallel execution coordinator examines the redistribution requirements of each operation. An operation's redistribution requireme nt is the way in which the rows operated on by the operation must be divided or redistributed among the parallel execution servers. p>
After determining the redistribution requirement for each operation in the execution plan, the optimizer determines the order i n which the operations must be performed. With this information, the optimizer determines the data flow of the statement.
As a n example of parallel query with intra- and inter-operation parallelism, consider the following, more complex query:
SELECT /*+ PARALLEL(employees 4) PARALLEL(departments 4)
USE_HASH(employees) ORDERED */ MAX(salary), AVG(salary
)
FROM employees, departments
WHERE employees.department_id = departments.department_id
GROUP BY employees.department_id;
<
p>Note that hints have been used in the query to force the join order and join method, and to specify the DOP of the tables emp
loyees and departments. In general, you should let the optimizer determine the order and method.
Figure 24-2 illustrates the data flow graph or query plan for this query.
PARALLEL hint in the query that specifies the DOP. In other words, the DOP will be four because each set of parallel execution servers wi
ll have four processes.
Slave set SS1 first scans the table employees while SS2 will fetch rows from SS1 and buil
d a hash table on the rows. In other words, the parent servers in SS2 and the child servers in SS2 work concurrently: one in scanning
employees in parallel, the other in consuming rows sent to it from SS1 and building the hash table for the hash join in
parallel. This is an example of inter-operation parallelism.
After SS1 has finished scanning the entire table employees
(that is, all granules or task units for employees are exhausted), it scans the table departments i
n parallel. It sends its rows to servers in SS2, which then perform the probes to finish the hash-join in parallel. After SS1 is done
scanning the table departments in parallel and sending the rows to SS2, it switches to performing the GROUP BY in parallel. This is how two server sets run concurrently to achieve inter-operation parallelism across various ope
rators in the query tree while achieving intra-operation parallelism in executing each operation in parallel.
Another importan
t aspect of parallel execution is the re-partitioning of rows while they are sent from servers in one server set to another. For the
query plan in Figure 24-2, after a server process in SS1 scans a row of employees, which server
process of SS2 should it send it to? The partitioning of rows flowing up the query tree is decided by the operator into which the row
s are flowing into. In this case, the partitioning of rows flowing up from SS1 performing the parallel scan of employees
into SS2 performing the parallel hash-join is done by hash partitioning on the join column value. That is, a server process scanning
employees computes a hash function of the value of the column employees.employee_id to decide the number o
f the server process in SS2 to send it to. The partitioning method used in parallel queries is explicitly shown in the EXPLAIN<
/code> PLAN of the query. Note that the partitioning of rows being sent between sets of execution servers should not be
confused with Oracle's partitioning feature whereby tables can be partitioned using hash, range, and other methods.
Operations that require the output of other operations are known as consumer operations. In
Figure 24-2, the GROUP BY SORT operation is the producer of the HASH JOIN operation because GROUP BY SORT requires the HASH
JOIN output.
Producer operations can begin consuming rows as soon as the producer operations have produced rows.
In the previous example, while the parallel execution servers are producing rows in the FULL SCAN dep
artments operation, another set of parallel execution servers can begin to perform the HASH JOIN ope
ration to consume the rows.
Each of the two operations performed concurrently is given its own set of parallel execution servers. Therefore, both query operations and the data flow tree i tself have parallelism. The parallelism of an individual operation is called intraoperation parallelism and the parallelism between o perations in a data flow tree is called interoperation parallelism. Due to the producer-consumer nature of the Oracle server's operat ions, only two operations in a given tree need to be performed simultaneously to minimize execution time. To illustrate intraoperatio n and interoperation parallelism, consider the following statement:
SELECT * FROM employees ORDER BY la st_name;
The execution plan implements a full scan of the employees table. This operation is followed by a s
orting of the retrieved rows, based on the value of the last_name column. For the sake of this example, assume the last_name column is not indexed. Also assume that the DOP for the query is set to 4, which means that four parallel execution
servers can be active for any given operation.
Figure 24-3 illustrates the parallel execution of the example query.
Figure 24-3 Interoperation Parallelism and Dynamic Partitioning
< img src="./dwhsg090.gif" alt="Description of dwhsg090.gif follows" />As you can see from Figure 24-3, there are actually eight parallel execution servers involved in the query even though the DOP is 4. Thi s is because a parent and child operator can be performed at the same time (interoperation parallelism).
Also note that all of
the parallel execution servers involved in the scan operation send rows to the appropriate parallel execution server performing the
SORT operation. If a row scanned by a parallel execution server contains a value for the last_name column b
etween A and G, that row gets sent to the first ORDER BY parallel execution serve
r. When the scan operation is complete, the sorting processes can return the sorted results to the coordinator, which, in turn, retur
ns the complete query results to the user.
The following parallel scan methods are supported on index-organized tables:
Parallel fast f ull scan of a nonpartitioned index-organized table
Parallel fast full scan of a partitioned index-organi zed table
Parallel index range scan of a partitioned index-organized table
These scan m ethods can be used for index-organized tables with overflow areas and for index-organized tables that contain LOBs.
Parallel query on a nonpartitioned index-organized table uses parallel fa st full scan. The DOP is determined, in decreasing order of priority, by:
A PARALLEL
hint (if present)
An ALTER SESSION FORCE PARALLEL QUERY
code> statement
The parallel degree associated with the table, if the parallel degree is specified in the CREA
TE TABLE or ALTER TABLE statement
The allocation of work is done by di viding the index segment into a sufficiently large number of block ranges and then assigning the block ranges to parallel execution s ervers in a demand-driven manner. The overflow blocks corresponding to any row are accessed in a demand-driven manner only by the pro cess which owns that row.
Both index range scan and fast fu ll scan can be performed in parallel. For parallel fast full scan, parallelization is exactly the same as for nonpartitioned index-or ganized tables. For parallel index range scan on partitioned index-organized tables, the DOP is the minimum of the degree picked up f rom the previous priority list (like in parallel fast full scan) and the number of partitions in the index-organized table. Depending on the DOP, each parallel execution server gets one or more partitions (assigned in a demand-driven manner), each of which contains the primary key index segment and the associated overflow segment, if any.
Parallel queries can be performed on object type tables and tables containing object type columns. Parallel query for object types supports all of the features that are a vailable for sequential queries on object types, including:
Methods on object types
Attribute access of object types
Constructors to create object type instances
Object views
PL/SQL and OCI queries for object types
There are no limitatio ns on the size of the object types for parallel queries.
The following restrictions apply to using parallel query for object types.
A MAP function is needed to parallelize queries involving joins and sorts (through ORDER BY, GROUP BY, or set operations). In the absence of a MAP function, the query will automatically be executed serial
ly.
Parallel DML and p arallel DDL are not supported with object types. DML and DDL statements are always performed serially.
In all cases where the query cannot execute in parallel because of any of these restrictions, the whole query executes serially without giving an error message.
This section includes the following topics on parallelism for DDL statements:
You can parallelize DDL statements for tables and indexes that are nonpartitioned or pa rtitioned. Table 24-3 summarizes the operations that can be parallelized in DDL statements.
The parall el DDL statements for nonpartitioned tables and indexes are:
CREATE INDEX
CREATE TABLE ... AS SELECT
ALTER INDEX ... REBUILD
The parallel DDL statements for partitioned tables and indexes are:
C
REATE INDEX
CREATE TABLE ... AS SELECT<
/code>
ALTER TABLE ... [MOVE|SPLIT|COALESCE] PARTITION
ALTER INDEX ... [REBUILD|SPLIT] PARTITION
This statement can be executed in parallel only if the (global) index partition being split is usable.
All of these DDL operations can be performed in no-logging mode for either parallel or serial execution.
CREATE TABLE for an index-organized t
able can be parallelized either with or without an AS SELECT clause.
Different parallelism is used f
or different operations (see Table 24-3). Parallel CREATE TABLE ... AS
SELECT statements on partitioned tables and parallel CREATE INDEX statements on partitioned i
ndexes execute with a DOP equal to the number of partitions.
Partition parallel analyze table is made less necessary by the {TABLE, INDEX} PARTITION statements, since parallel analyze of an entire partitioned tab
le can be constructed with multiple user sessions.
Par
allel DDL cannot occur on tables with object columns. Parallel DDL cannot occur on non-partitioned tables with LOB colum
ns.
For performa nce reasons, < /a>decision support applications often require large amounts of data to be summarized or rolled up int o smaller tables for use with ad hoc, decision support queries. Rollup occurs regularly (such as nightly or weekly) during a short pe riod of system inactively.
Parallel execution lets you parallelize the query and crea te operations of creating a table as a subquery from another table or set of tables.
Clustered tables cannot be created and po pulated in parallel.
Figure 24-4 illustrates creating a table from a subquery in parallel.
When summary ta ble data is derived from other tables' data, recoverability from media failure for the smaller summary table may not be important and can be turned off during creation of the summary table.
If you disable logging during parallel table creation (or any other p arallel DDL operation), you should back up the tablespace containing the table once the table is created to avoid loss of the table d ue to media failure.
Use the NOLOGGING clause of the CREATE TABLE, CREATE
INDEX, ALTER TABLE, and ALTER INDEX statements to disable undo and r
edo log generation.
< a id="sthref1965" name="sthref1965">Creating a table or index in parallel has space management implications that affect both the storage space required during a parallel operation and the free space available after a table or index has been created.
When creating a table or index in parallel, each parallel ex
ecution server uses the values in the STORAGE clause of the CREATE statement to create temporary segments t
o store the rows. Therefore, a table created with a NEXT setting of 5 MB and a PARALLEL DEGREE
of 12 consumes at least 60 megabytes (MB) of storage during table creation because each process starts with an extent of 5 MB. When
the parallel execution coordinator combines the segments, some of the segments may be trimmed, and the resulting table may be smaller
than the requested 60 MB.
When you create indexes and tables in parallel, each parallel execution server allocates a new extent and fills the extent with the table or index data. Thus, if you create an index with a DOP o f 3, the index will have at least three extents initially. Allocation of extents is the same for rebuilding indexes in parallel and f or moving, splitting, or rebuilding partitions in parallel.
Serial operations require the schema object to have at least one e xtent. Parallel creations require that tables or indexes have at least as many extents as there are parallel execution servers creati ng the schema object.
When you create a table or index in parallel, it is possible to create pockets of free space—either externa l or internal fragmentation. This occurs when the temporary segments used by the parallel execution servers are larger than what is n eeded to store the rows.
<
/a>If the unused space in each temporary segment is larger than the value of the MINIMUM EXTENT parameter s
et at the tablespace level, then Oracle trims the unused space when merging rows from all of the temporary segments into the table or
index. The unused space is returned to the system free space and can be allocated for new extents, but it cannot be coalesced into a
larger segment because it is not contiguous space (external fragmentation).
If the unused space in each
temporary segment is smaller than the value of the MINIMUM EXTENT parameter, then unused space cannot be t
rimmed when the rows in the temporary segments are merged. This unused space is not returned to the system free space; it becomes par
t of the table or index (internal fragmentation) and is available only for subsequent inserts or for updates that require additional
space.
For example, if you specify a DOP of 3 for a CREATE TABLE ... AS SELECT statement, but there is only one datafile in the tablespace, then internal fragmentation may occur, as shown in Figure 24-5. The pockets of free space within the internal table extents of a datafile cannot be coalesced with oth
er free space and cannot be allocated as extents.
See Ora cle Database Performance Tuning Guide for more information about creating tables and indexes in parallel.
Figure 24-5 Unusable Free Space (Internal Fragmentation)

Par
allel DML (PARALLEL INSERT, UPDATE, DELETE, and MERGE) uses parallel
execution mechanisms to speed up or scale up large DML operations against large database tables and indexes.
|
Note: Although DML generally includes queries, i n this chapter the term DML refers only to inserts, updates, merges, and deletes. |
This section discusses the following parallel DML topics:
You can parallelize DML operations manually by issuing multiple DML statements simultaneously again st different sets of data. For example, you can parallelize manually by:
Issuing multiple INSERT
code> statements to multiple instances of an Oracle Real Application Clusters to make use of free space from multiple free list block
s.
Issuing multiple UPDATE and DELETE statements with different key value rang
es or rowid ranges.
However, manual parallelism has the following disadvantages:
It is difficult to use. You have to open multiple sessions (possibly on different instances) and issue multiple statements.
There is a lack of transactional properties. The DML statements are issued at different times; and, as a result, the ch anges are done with inconsistent snapshots of the database. To get atomicity, the commit or rollback of the various statements must b e coordinated manually (maybe across instances).
The work division is complex. You may have to query the table in order to find out the rowid or key value ranges to correctly divide the work.
The calculation is complex. The calculation of the degree of parallelism can be complex.
There is a lack of affinity and resource information. You need to know affinity information to issue the right DML statement at the right instance when running an O racle Real Application Clusters. You also have to find out about current resource usage to balance workload across instances.
Parallel DML removes these disadvantages by performing inserts, updates, and deletes in parallel automatically.
< a id="sthref1992" name="sthref1992">Parallel DML operations are mainly used to speed up large DML operations against large database objects . Parallel DML is useful in a DSS environment where the performance and scalability of accessing large objects are important. Paralle l DML complements parallel query in providing you with both querying and updating capabilities for your DSS databases.
The overhead of setting up parallelism makes parallel DML operations infeasible for short OLTP tran sactions. However, parallel DML operations can speed up batch jobs running in an OLTP database.
Some of the scenarios where pa rallel DML is used include:
In a data warehouse system, large tables need to be refreshed (updated) periodically with new or modified data from the productio
n system. You can do this efficiently by using parallel DML combined with updatable join views. You can also use the MERGE statement.
The data that needs to be refreshed is generally loaded into a temporary table before starting the refresh proce
ss. This table contains either new rows or rows that have been updated since the last refresh of the data warehouse. You can use an u
pdatable join view with parallel UPDATE to refresh the updated rows, and you can use an anti-hash join with parallel
In a DSS environment, many applications require complex computations that involve constructing and manipulati ng many large intermediate summary tables. These summary tables are often temporary and frequently do not need to be logged. Parallel DML can speed up the operations against these large intermediate tables. One benefit is that you can put incremental results in the intermediate tables and perform parallel update.
In addition, the summary tables may contain cumulative or comparison informat ion which has to persist beyond application sessions; thus, temporary tables are not feasible. Parallel DML operations can speed up t he changes to these large summary tables.
Many DSS applications score customers periodically based on a set of criter ia. The scores are usually stored in large DSS tables. The score information is then used in making a decision, for example, inclusio n in a mailing list.
This scoring activity queries and updates a large number of rows in the large table. Parallel DML can spe ed up the operations against these large tables.
Historical tables describe the
business transactions of an enterprise over a recent time interval. Periodically, the DBA deletes the set of oldest rows and inserts
a set of new rows into the table. Parallel INSERT ... SELECT and parallel DELETE operations c
an speed up this rollover task.
Although you can also use parallel direct loader (SQL*Loader) to insert bulk data from an exte
rnal source, parallel INSERT ... SELECT is faster for inserting data that already exists in another table i
n the database.
Dropping a partition can also be used to delete old rows. However, to do this, the table has to be partitioned by date and with the appropriate time interval.
Batch jobs executed in an OLTP databas e during off hours have a fixed time window in which the jobs must complete. A good way to ensure timely job completion is to paralle lize their operations. As the work load increases, more machine resources can be added; the scaleup property of parallel operations e nsures that the time constraint can be met.
A DML statement can be parallelized only if you have explicitly enabled parallel DML in the session wi
th the ENABLE PARALLEL DML clause of the ALTER SESSION statement. Th
is mode is required because parallel DML and serial DML have different locking, transaction, and disk space requirements.
The
default mode of a session is DISABLE PARALLEL DML. When parallel DML is disabled, no DML will
be executed in parallel even if the PARALLEL hint is used.
When parallel DML is enabled in a session, all DML sta tements in this session will be considered for parallel execution. However, even if parallel DML is enabled, the DML operation may st ill execute serially if there are no parallel hints or no tables with a parallel attribute or if restrictions on parallel operations are violated.
The session's PARALLEL DML mode does not influence the parallelism of SELECT statements, DDL statements, and the query portions of DML statements. Thus, if this mode is not set, the DML operation is not par
allelized, but scans or join operations within the DML statement may still be parallelized.
To execute a DML oper ation in parallel, the parallel execution coordinator acquires or spawns parallel execution servers, and each parallel execution serv er executes a portion of the work under its own parallel process transaction.
Each parallel execution se rver creates a different parallel process transaction.
If you use rollback segments instead of Automatic Undo Management, you may want to reduce contention on the rollback segments. To do this, only a few parallel process transactions sh ould reside in the same rollback segment. See Oracle Database SQL Reference for more information.
The coordinator also has its own coordinator transaction, which can have its own rollback segment. In order to ensure user-level transactional atomicity, the coordinator uses a t wo-phase commit protocol to commit the changes performed by the parallel process transactions.
A session that is enabled for p arallel DML may put transactions in the session in a special mode: If any DML statement in a transaction modifies a table in parallel , no subsequent serial or parallel query or DML statement can access the same table again in that transaction. This means that the re sults of parallel modifications cannot be seen during the transaction.
Serial or parallel statements that attempt to access a table that has already been modified in parallel within the same transaction are rejected with an error message.
If a PL/SQL p rocedure or block is executed in a parallel DML enabled session, then this rule applies to statements in the procedure or block.
< /div>If you use rollback segments instead of A utomatic Undo Management, there are some restrictions when using parallel DML. See Oracle Database SQL Reference for information about restrictions for parallel DML and rollback segments.
< a id="sthref2019" name="sthref2019">The time required to roll back a parallel DML operat ion is roughly equal to the time it takes to perform the forward operation.
Oracle supports parallel rollback after transactio n and process failures, and after instance and system failures. Oracle can parallelize both the rolling forward stage and the rolling back stage of transaction recovery.
See Oracle Database Backup and Recovery Basics for details about parallel rollback.
A us er-issued rollback in a transaction failure due to statement error is performed in parallel by the parallel execution coordinator and the parallel execution servers. The rollback takes approximately the same amount of time as the forward transaction.
Recovery from the failure of a parallel execution c oordinator or parallel execution server is performed by the PMON process. If a parallel execution server or a parallel execution coor dinator fails, PMON rolls back the work from that process and all other processes in the transaction roll back their changes.
Recovery from a system failure requires a ne
w startup. Recovery is performed by the SMON process and any recovery server processes spawned by SMON. Parallel DML statements may b
e recovered using parallel rollback. If the initialization parameter COMPATIBLE is set to 8.1.3 or greater, Fast-Start O
n-Demand Rollback enables terminated transactions to be recovered, on demand one block at a time.
Parallel UPDATE<
/code> uses the space in the existing object, while direct-path INSERT gets new segments for the data.
Space usag e characteristics may be different in parallel than sequential execution because multiple concurrent child transactions modify the ob ject.
A parallel DML operation's lock and enqueue resource req
uirements are very different from the serial DML requirements. Parallel DML holds many more locks, so you should increase the startin
g value of the ENQUEUE_RESOURCES and DML_LOCKS parameters. See "DML_LOCKS" for more
information.
The following restrictions apply to parallel DML (including direct-path INSERT):
Intra-partition parallelism for UPDATE, MERGE, and DELETE operations require that the
COMPATIBLE initialization parameter be set to 9.2 or greater.
INSERT, U
PDATE, MERGE, and DELETE operations on nonpartitioned tables are not parallelized if there is a bitm
ap index on the table. If the table is partitioned and there is a bitmap index on the table, the degree of parallelism will be restri
cted to at most the number of partitions accessed.
A transaction can contain multiple parallel DML state ments that modify different tables, but after a parallel DML statement modifies a table, no subsequent serial or parallel statement ( DML or query) can access the same table again in that transaction.
This restriction also exists after
a serial direct-path INSERT statement: no subsequent SQL statement (DML or query) can access the modified table during t
hat transaction.
Queries that access the same table are allowed before a parallel DML or direct-path < code>INSERT statement, but not after.
Any serial or parallel statements attempting to access a
table that has already been modified by a parallel UPDATE, DELETE, or MERGE, or a direct-path
INSERT during the same transaction are rejected with an error message.
Parallel DML operations cannot be done on tables with triggers.
Replication functionality is not supported for parallel DML.
Parallel DM
L cannot occur in the presence of certain constraints: self-referential integrity, delete cascade, and deferred integrity. In additio
n, for direct-path INSERT, there is no support for any referential integrity.
Parallel DML can be done on tables with object columns provided you are not touching the object columns.
Parallel DML can be done on tables with LOB columns provided the table
is partitioned. However, intra-partition parallelism is not supported.
A transaction involved in a para llel DML operation cannot be or become a distributed transaction.
Clustered tables are not supported.
Violations of these restrictions cause the statement to execute serially without warnings or error messages (except fo r the restriction on statements accessing the same table in a transaction, which can cause error messages). For example, an update is serialized if it is on a nonpartitioned table.
You can only update the partitioning key of a partitioned table to a new value if the update does not cause the row to move to a new partition. The update is possible if the table is defined with the row movement clause enabled.
The function restricti ons for parallel DML are the same as those for parallel DDL and parallel query. See "Parallel Execution of Functi ons" for more information.
This section describes the interactions of integrity con straints and parallel DML statements.
These types of integrity constraints are allowed. They are not a p roblem for parallel DML because they are enforced on the column and row level, respectively.
Restrictions for referential integrity occur whenever a DML operation on one table could cause a recursive DML operatio n on another table. These restrictions also apply when, in order to perform an integrity check, it is necessary to see simultaneously all changes made to the object being modified.
Table 24-1 lists all of the operations that are possib le on tables that are involved in referential integrity constraints.
Table 24-1 Referential Integrity Restrictions
| DML Statement | Issued on Parent | Issued on Child | Self-Referential |
|---|---|---|---|
INSERT |
(Not applicable) | Not parallelize d | Not parallelized |
MERGE |
(Not applicable) | Not parallelized | Not parallelized |
UPDATE No Action |
Supported | Supported | Not parallel ized |
DELETE No Action |
Supported | Supported | Not parallelized |
DELETE Cascade |
Not parallelized | (Not applicable) | Not parallelized |
Delete on tables having a foreign key with delete cascade is not parallelized because parallel execution servers will try to delete rows from multiple partitions (parent and child tables).
A DML operation will not be parallelized if the affected tables contain enabled triggers that may get fired as a result of the statement. This implies that DML statements on tables that are being replicated will not be parallelized.
Relevant triggers must be disabled in order to parallelize DML on the table. Note that, if you enable or disable triggers, the depend ent shared cursors are invalidated.
A DML operation cannot be parallelized if it is in a distributed transaction or if t he DML or the query operation is against a remote object.
This section contains several examples of distributed transaction processing.
Example 24-1 Distributed Transaction Paralleliz ation
In this example, the DML statement queries a remote object:
INSERT /* APPEND PARALLEL (t3,2) */ INTO t3 SELECT * FROM t4@dblink;
The query operation is executed serially without notification because it references a remote object.
Example 24-2 Distributed Transaction Parallelization
In this example, the DML operation is applied to a remote object:
DELETE /*+ PARALLEL (t1, 2 ) */ FROM t1@dblink;
The DELETE operation is not parallelized because it references a remote object.
Example 24-3 Distributed Transaction Parallelization
In this example, the DML operation i s in a distributed transaction:
SELECT * FROM t1@dblink; DELETE /*+ PARALLEL (t2,2) */ FROM t2; COMMIT ;
The DELETE operation is not parallelized because it occurs in a distributed transaction (which is started
by the SELECT statement).
In a parallel DML or DDL statement, as in a parallel query, a user-written function may be execut
ed in parallel if it has been declared with the PARALLEL_ENABLE keyword, if it is declared in a package or type and has
a PRAGMA RESTRICT_REFERENCES that indicates all of RNDS, WNDS, RNPS,
and WNPS, or if it is declared with CREATE FUNCTION and the system can analyze the body of th
e PL/SQL code and determine that the code neither reads nor writes to the database or reads nor modifies package variables.
Fo r a parallel DML statement, any function call that cannot be executed in parallel causes the entire DML statement to be executed seri ally.
For an INSERT ... SELECT or CREATE TABLE ... AS S
ELECT statement, function calls in the query portion are parallelized according to the parallel query rules in the prior parag
raph. The query may be parallelized even if the remainder of the statement must execute serially, or vice versa.
In addition to par allel SQL execution, Oracle can use parallelism for the following types of operations:
Parallel recovery
Parallel propagation (replication)
Parallel load (the SQL*Loader utility)< /p>
Like parallel SQL, parallel recovery and propagation are performed by a parallel execution coordinator and multiple parallel execution servers. Parallel load, however, uses a different mechanism.
The behavior of the parallel execution coordin ator and parallel execution servers may differ, depending on what kind of operation they perform (SQL, recovery, or propagation). For example, if all parallel execution servers in the pool are occupied and the maximum number of parallel execution servers has been st arted:
In parallel SQL, the parallel execution coordinator switches to serial processing.
In parallel propagation, the parallel execution coordinator returns an error.
For a given session, th e parallel execution coordinator coordinates only one kind of operation. A parallel execution coordinator cannot coordinate, for exam ple, parallel SQL and parallel recovery or propagation at the same time.
|
See Also:
|
Parallel execution is enabled by default. The initial computed values of the parallel execution parameters should be acc eptable for the majority of installations. These parameters affect memory usage and the degree of parallelism used for parallel opera tions.
Oracle Database computes defaults for these parameters based on the value at database startup of CPU_COUNT
and PARALLEL_THREADS_PER_CPU. The parameters can also be manually tuned, increasing or decreasing their values to suit
specific system configurations or performance goals.
For example:
On systems where parallel execu
tion will never be used, PARALLEL_MAX_SERVERS can be set to zero.
On large systems with abu
ndant SGA memory, PARALLEL_EXECUTION_MESSAGE_SIZE can be increased to improve throughput.
You can also manually tune parallel execution parameters; however, Oracle recommends using default settings for parallel execution. Manual tuning of parallel execution is more complex than using default settings for two reasons: manual parallel execution tuning requires more att entive administration than automated tuning, and manual tuning is prone to user-load and system-resource miscalculations.
Init ializing and tuning parallel execution involves the following steps:
How Oracle Determines the Degree of Parallelism for Operations
Degree of Parallelism and Adaptive Multiuser: How They Interact
By de fault, Oracle automatically sets parallel execution parameters, as shown in Table 24-2. For most systems, you do not need to make further adjustments to have an adequately tuned parallel execution environment.
Table 24-2 Parameters and Their Defaults
| Parameter | Default | Comments |
|---|---|---|
PARALLEL_ADAPTIVE_MULTI_USER |
TRUE |
Causes parallel execution SQL to throttle DOP re quests to prevent system overload. |
PAR
ALLEL_MAX_SERVERS |
CPU_COUNT x PARALLEL_THREADS_PRE_CPU x (2 if |
Use this limit to maximize the number of processes that parallel execution uses. |
PARALLEL_EXECUTION_MESSAGE_SIZE |
2 KB (port specifi c) | Increase to 4k or 8k to improve parallel execution performance if sufficient SGA m emory exists. |
Note that you can set some parameters in su
ch a way that Oracle will be constrained. For example, if you set PROCESSES to 20, you will not be able to get 25 slaves
.
The parallel execution coordinator may enlist two or more of the instance's parallel execution servers to process a SQL statement . The number of parallel execution servers associated with a single operation is known as the a>degree of parallelism.
The DOP is specified in the following ways:
At the statement level with hints and with the PARALLEL clause
At the session
level by issuing the ALTER SESSION FORCE PARALLEL statement
At the table level in the table's definition
At the index level in the index's definition
The following example shows a statement that sets the DOP to 4 on a table:
ALTER TABLE order s PARALLEL 4;
This next example sets the DOP on an index to 4:
ALTER INDEX iorders PARALLEL 4;
This last example sets a hint to 4 on a query:
SELECT /*+ PARALLEL(orders, 4) */ COUNT(*) FROM orders;
Note that the DOP applies directly only to intraoperation parallelism. If interoperation parallelism is poss ible, the total number of parallel execution servers for a statement can be twice the specified DOP. No more than two operations can be performed simultaneously.
Parallel execution is designed to effectively use multiple CPUs and disks to answer queries quick ly. When multiple users employ parallel execution at the same time, available CPU, memory, and disk resources may be quickly exhauste d. Oracle provides several ways to deal with resource utilization in conjunction with parallel execution, including:
The adaptive multiuser algorithm, which reduces the DOP as the load on the system increases. By default, the adaptive mu ltiuser algorithm is enabled, which optimizes the performance of systems with concurrent parallel SQL execution operations.
User resource limits and profiles, which allow you to set limits on the amount of various system resources availa ble to each user as part of a user's security domain.
The Database Resource Manager, which enables you t o allocate resources to different groups of users.
The parallel execution coordinator determines the DOP by considering several specifications. The coordinator:
< ul>Checks for hints or a PARALLEL clause specified in the SQL statement itself.
Checks for a session value set by the ALTER SESSION FORCE PARALLEL stat
ement.
Looks at the table's or index's definition.
After a DOP is found in one of these specifications, it becomes the DOP for the operation.
Hints, PARALLEL clauses, table or index definitions, and d
efault values only determine the number of parallel execution servers that the coordinator requests for a given operation. The actual
number of parallel execution servers used depends upon how many processes are available in the parallel execution server pool and wh
ether interoperation parallelism is possible.
|
< strong>See Also: |
You can speci fy hints in a SQL statement to set the DOP for a table or index and for the caching behavior of the operation.
The PARALLEL hint is used only
for operations on tables. You can use it to parallelize queries and DML statements (INSERT, UPDATE, M
ERGE, and DELETE).
The PARALLEL_INDEX hint parallelizes an index range scan of a partitioned index. (In an index operation
, the PARALLEL hint is not valid and is ignored.)
See Oracle Database Performance Tuning Guide for information about using hints in SQL statements and the spe
cific syntax for the PARALLEL, NO_PARALLEL, PARALLEL_INDEX, CACHE, and NOCA
CHE hints.
You can specify the DOP within a table or index
definition by using one of the following statements: CREATE TABLE, ALTER TABLE, <
code>CREATE INDEX, or ALTER INDEX.
The default DOP is used when you ask to parallelize an operation but you do n ot specify a DOP in a hint or within the definition of a table or index. The default DOP is appropriate for most applications.
The default DOP for a SQL statement is determined by the following factors:
The value of the parameter
CPU_COUNT, which is, by default, the number of CPUs on the system, the number of RAC instances, and the value of the par
ameter PARALLEL_THREADS_PER_CPU.
For parallelizing by partition, the number of partitions t hat will be accessed, based on partition pruning.
For parallel DML operations with global index maintena nce, the minimum number of transaction free lists among all the global indexes to be updated. The minimum number of transaction free lists for a partitioned global index is the minimum number across all index partitions. This is a requirement to prevent self-deadloc k.
These factors determine the default number of parallel execution servers to use. However, the actual number of pr
ocesses used is limited by their availability on the requested instances during run time. The initialization parameter PARALLEL
_MAX_SERVERS sets an upper limit on the total number of parallel execution servers that an instance can have.
If a mini
mum fraction of the desired parallel execution servers is not available (specified by the initialization parameter PARALLEL_MIN
_PERCENT), a user error is produced. You can retry the query when the system is less busy.
With the adaptive multiuser algorithm, the parallel execution coordinator varies the DOP according to the system load. Oracle determines the load by calculating the number of active Oracle Serve r processes. If the number of server processes currently allocated is larger than the optimal number of server processes, given the n umber of available CPUs, the algorithm reduces the DOP. This reduction improves total system throughput by avoiding overallocation of resources.
The initialization parameter PARALLEL_MIN_PERCENT specifies the desired minimum per
centage of requested parallel execution servers. This parameter affects DML and DDL operations as well as queries.
For example
, if you specify 50 for this parameter, then at least 50 percent of the parallel execution servers requested for any parallel operati
on must be available in order for the operation to succeed. If 20 parallel execution servers are requested, then at least 10 must be
available or an error is returned to the user. If PARALLEL_MIN_PERCENT is set to null, then all parallel operations will
proceed as long as at least two parallel execution servers are available for processing.
In Oracle Real Applica tion Clusters, instance groups can be used to limit the number of instances that participate in a parallel operation. You can create any number of instance groups, each consisting of one or more instances. You can then specify which instance group is to be used for any or all parallel operations. Parallel execution servers will only be used on instances which are members of the specified instance group. See Oracle Real Application Clusters Administrator's Guide< /a> and Oracle Real Application Clusters Deployment and Perfor mance Guide for more information about instance groups.
To optimize performance, all parallel execution servers should have equal workloads. For SQL statements parallel ized by block range or by parallel execution servers, the workload is dynamically divided among the parallel execution servers. This minimizes workload skewing, which occurs when some parallel execution servers perform significantly more work than the other processe s.
For the relatively few SQL statements parallelized by partitions, if the workload is evenly distributed among the partition s, you can optimize performance by matching the number of parallel execution servers to the number of partitions or by choosing a DOP in which the number of partitions is a multiple of the number of processes. This applies to partition-wise joins and PDML on tables created before Oracle9i. See "Limitation on the Degree of Parallelism" for details regarding this to pic.
For example, suppose a table has 10 partition, and a parallel operation divides the work evenly among them. You can use 1 0 parallel execution servers (DOP equals 10) to do the work in approximately one-tenth the time that one process would take. You migh t also use five processes to do the work in one-fifth the time, or two processes to do the work in one-half the time.
If, howe ver, you use nine processes to work on 10 partitions, the first process to finish its work on one partition then begins work on the 1 0th partition; and as the other processes finish their work, they become idle. This configuration does not provide good performance w hen the work is evenly divided among partitions. When the work is unevenly divided, the performance varies depending on whether the p artition that is left for last has more or less work than the other partitions.
Similarly, suppose you use four processes to w ork on 10 partitions and the work is evenly divided. In this case, each process works on a second partition after finishing its first partition, but only two of the processes work on a third partition while the other two remain idle.
In general, you cannot as sume that the time taken to perform a parallel operation on a given number of partitions (N) with a given number of parallel executio n servers (P) will be N/P. This formula does not take into account the possibility that some processes might have to wait while other s finish working on the last partitions. By choosing an appropriate DOP, however, you can minimize the workload skew and optimize per formance.
A SQL statement can be p
arallelized if it includes a parallel hint or if the table or index being operated on has been declared PARALLEL with a
CREATE or ALTER statement. In addition, a DDL statement can be parallelized by using the PARALLEL clause. However, not all of these methods apply to all types of SQL statements.
Parallelization has two components: the de cision to parallelize and the DOP. These components are determined differently for queries, DDL operations, and DML operations.
< p>To determine the DOP, Oracle looks at the referen ce objects:Parallel query looks at each table and index, in the portion of the query being parallelized , to determine which is the reference table. The basic rule is to pick the table or index with the largest DOP.
For parallel DML (INSERT, UPDATE, MERGE, and DELETE), the reference objec
t that determines the DOP is the table being modified by an insert, update, or delete operation. Parallel DML also adds some limits t
o the DOP to prevent deadlock. If the parallel DML statement includes a subquery, the subquery's DOP is the same as the DML operation
.
For parallel DDL, the reference object that determines the DOP is the table, index, or partition being created, rebuilt, split, or moved. If the parallel DDL statement includes a subquery, the subquery's DOP is the same as the DDL oper ation.
This section discusses some rules f or parallelizing queries.
A SELECT statement ca
n be parallelized only if the following conditions are satisfied:
The query includes a parallel hint spe
cification (PARALLEL or PARALLEL_INDEX) or the schema objects referred to in the query have a PARALLE
L declaration associated with them.
At least one of the tables specified in the query requir es one of the following:
A full table scan
An index range scan spanning mul tiple partitions
No scalar subqueries are in the SELECT list
The DOP for a query is determined by the following rules:
The query uses the maximum DOP taken from all of the table declarations involved in the query and all of the potential indexes t hat are candidates to satisfy the query (the reference objects). That is, the table or index that has the greatest DOP determines the query's DOP (maximum query directive).
If a table has both a parallel hint specification in the query a nd a parallel declaration in its table specification, the hint specification takes precedence over parallel declaration specification . See Table 24-3 for precedence rules.
The DOP is determined by the same rules as for the queries. Note that in the case of UPDATE and DELETE<
/code> operations, only the target table to be modified (the only reference object) is involved. Thus, the parallel hint specification takes precedence over the parallel declaration specification of the target table. In oth
er words, the precedence order is: UPDATE or MERGE, UPDATE, DELETE hint > Session > Parallel decla
ration specification of target table. See Table 24-3 for precedence rules.
A parallel execution server can update or merge into, or delete from multiple partitions, but each partition can only be updated or deleted by one parallel exec ution server.
If the DOP is less than the nu mber of partitions, then the first process to finish work on one partition continues working on another partition, and so on until th e work is finished on all partitions. If the DOP is greater than the number of partitions involved in the operation, then the excess parallel execution servers will have no work to do.
UPDATE tbl_1 SET c1=c1+1 WHERE c1>100;
If tbl_1 is a partitioned table and its table definition has a p
arallel clause, then the update operation is parallelized even if the scan on the table is serial (such as an index scan), assuming t
hat the table has more than one partition with c1 greater than 100.
An INSERT ... SELECT statement parallelizes its INSERT and
You can specify a parallel hint after the INSERT k
eyword in an INSERT ... SELECT statement. Because the tables being queried are usually not the same as the
table being inserted into, the hint enables you to specify parallel directives specifically for the insert operation.
You have
the following ways to specify parallel directives for an INSERT ... SELECT statement (assuming that
PARALLEL DML mode is enabled):
SELECT parallel hints specified at the s
tatement
Parallel clauses specified in the definition of tables being selected
INSERT parallel hint specified at the statement
Parallel clause specified in the defini tion of tables being inserted into
You can use the ALTER SESSION FORCE PARALLEL DML statement to override parallel clauses for subsequent INSERT operations in a session. Parallel hints in insert operations override the ALTER SESSION PARALLEL DML statement.
The following rule determines whet
her the INSERT operation should be parallelized in an INSERT ... SELECT statement:
The
INSERT operation will be parallelized if and only if at least one of the following is true:
The PARALLEL hint is specified after the INSERT in the DML statement.
The tab
le being inserted into (the reference object) has a PARALLEL declaration specification.
An ALTER SESSION FORCE PARALLEL DML
code> statement has been issued previously during the session.
The decision to parallelize the INSERT o
peration is made independently of the SELECT operation, and vice versa.
Once the decision to parallelize the SELECT or INSERT operation is made, one parallel directiv
e is picked for deciding the DOP of the whole statement, using the following precedence rule Insert hint directive > Session> P
arallel declaration specification of the inserting table > Maximum query directive.
In this context, maximum query directiv e means that among multiple tables and indexes, the table or index that has the maximum DOP determines the parallelism for the query operation.
The chosen parallel directive is applied to both the SELECT and INSERT operations.
Example 24-6 P arallelization: Example 3
The DOP used is 2, as specified in the INSERT hint:
INSERT /*+ PARALLEL(tbl_ins,2) */ INTO tbl_ins SELECT /*+ PARALLEL(tbl_sel,4) */ * FROM tbl_sel;
You need to keep the fo llowing in mind when parallelizing DDL statements.
DDL operations can be parallelized if a PA
RALLEL clause (declaration) is specified in the syntax. In the case of CREATE INDEX and ALTER<
/code> INDEX ... REBUILD or ALTER INDEX ... REBUILD PARTITION<
/code>, the parallel declaration is stored in the data dictionary.
You can use the SESSION FORCE PARALLEL DDL statement to override the parallel cl
auses of subsequent DDL statements in a session.
The DOP is determined b
y the specification in the PARALLEL clause, unless it is overridden by an ALTER SESSION
FORCE PARALLEL DDL statement. A rebuild of a partitioned index is never parallelized.
Paralle
l clauses in CREATE TABLE and ALTER TABLE statements specify table parallelism. I
f a parallel clause exists in a table definition, it determines the parallelism of DDL statements as well as queries. If the DDL stat
ement contains explicit parallel hints for a table, however, those hints override the effect of parallel clauses for that table. You
can use the ALTER SESSION FORCE PARALLEL DDL statement to override p
arallel clauses.
The following rules apply:
The CREATE INDEX and ALTER INDEX ... REBUILD statements can be paral
lelized only by a PARALLEL clause or an ALTER SESSION FORCE PARALLEL
DDL statement.
ALTER INDEX ... REBUILD can be parallelized only for a non
partitioned index, but ALTER INDEX ... REBUILD PARTITION can be parallelized by a
PARALLEL clause or an ALTER SESSION FORCE PARALLEL DDL
statement.
The scan operation for ALTER INDEX ... REBUILD (nonpartitioned), ALTE
R INDEX ... REBUILD PARTITION, and CREATE INDEX has the same
parallelism as the REBUILD or CREATE operation and uses the same DOP. If the DOP is not specified for REBUILD or CREATE, the default is the number of CPUs.
The ALTERINDEX ... MOVE PARTITION
and ALTERINDEX ...SPLIT PARTITION statements can be parallel
ized only by a PARALLEL clause or an ALTER SESSION FORCE PARALLEL MOVE or SPLIT operations. If the DOP is not specified, the default is the number of CPUs.
The CREATE TABLE ..
. AS SELECT statement contains two parts: a CREATE part (DDL) and a SELECT part (
query). Oracle can parallelize both parts of the statement. The CREATE part fol
lows the same rules as other DDL operations.
The query part of a CREATE <
code>TABLE ... AS SELECT statement can be parallelized only if the following conditions are satisfie
d:
The query includes a parallel hint specification (PARALLEL or PARALLEL_INDEX) or the CREATE part of the statement has a PARALLEL clause specification or the schema objects referred t
o in the query have a PARALLEL declaration associated with them.
At least one of the t ables specified in the query requires one of the following: a full table scan or an index range scan spanning multiple partitions.
The DOP for the query part of a CREATE
TABLE ... AS SELECT statement is determined by one of the following rules:
The query part uses the values specified in the PARALLEL clause of the CREATE part.
If the PARALLEL clause is not specified, the default DOP is the number of CPUs.
If the CREATE is serial, then the DOP is determined by the query.
Note that any values specified in a hint for parallelism are ignored.
The CREATE operation of CREATE TABLE ... AS SELECT<
/code> can be parallelized only by a PARALLEL clause or an ALTER SESSION FORCE DDL statement.
When the CREATE operation of CREATE TABLE ... AS SELECT is parallelized, Oracle also parallelizes the scan operation if possible. The scan operati
on cannot be parallelized if, for example:
The SELECT clause has a NO_PARALLEL
hint
The operation scans an index of a nonpartitioned table
When the CREATE operation is not parallelized, the SELECT can be parallelized if it has a PARALLEL hint or if the select
ed table (or partitioned index) has a parallel declaration.
CREATE operation, and for the SELECT operation if it is parallelized, is specified by the
PARALLEL clause of the CREATE statement, unless it is overridden by an ALTER SESSION FORCE PARALLEL DDL statement. If the PARALLEL clause does not specify the DO
P, the default is the number of CPUs.Table 24-3 shows how various types of SQL statements can be parallelized and indicates which methods o f specifying parallelism take precedence.
The priority (1) specification overrides priority (2) and prio rity (3).
The priority (2) specification overrides priority (3).
Table 24-3 Parallelization Rules
The DOP of tables involved in parallel operations affect the DOP for operations on thos
e tables. Therefore, after setting parallel tuning
parameters, you must also enable parallel execution for each table you want parallelized, using the PARALLEL clause of t
he CREATE TABLE or ALTER TABLE statements. <
/a>You can also use the PARALLEL hint with SQL statements to enable parallelism for that operation only, or use the ALTER SESSION statement to enable parallelism for all subsequent operations in
the session.
When you parallelize tables, you can also specify the DOP or allow Oracle to use a default DOP. The value of the
default DOP is derived automatically, based on the value of PARALLEL_THREADS_PER_CPU and the number of CPUs available t
o Oracle.
ALTER TABLE employees PARALLEL; -- uses default DOP ALTER TABLE employees PARALLEL 4; -- users DOP of 4
The DOP specifies the number of available processes, or threads, used in parallel operations. Each parallel thread can use one or two query processes, depending on the query's complexity.
The adaptive multiuser feature adjusts the DOP based on user load. For example, you might have a table with a DOP of 5. This DOP might be acceptable with 10 users. However, if 10 more users enter the system and you enable the < code>PARALLEL_ADAPTIVE_MULTI_USER feature, Oracle reduces the DOP to spread resources more evenly according to the perceived O racle load.
Once Oracle determines the DOP for a query, the DOP does not change for the duration of the query.
It is be
st to use the parallel adaptive multiuser feature w
hen users process simultaneous parallel execution operations. By default, PARALLEL_ADAPTIVE_MULTI_USER is set to T
RUE, which optimizes the performance of systems with concurrent parallel SQL execution operations. If PARALLEL_ADAPTIVE_
MULTI_USER is set to FALSE, each parallel SQL execution operation receives the requested number of parallel execu
tion server processes regardless of the impact to the performance of the system as long as sufficient resources have been configured.
The adaptive multiuser algorithm has s
everal inputs. The algorithm first considers the number of active Oracle Server processes as calculated by Oracle. The algorithm then
considers the default settings for parallelism as set in the initialization parameter file, as well as parallelism options used in <
code>CREATE TABLE and ALTER TABLE statements and SQL hints.
When a system is ove rloaded and the input DOP is larger than the default DOP, the algorithm uses the default degree as input. The system then calculates a reduction factor that it applies to the input DOP. For example, using a 16-CPU system, when the first user enters the system and it is idle, it will be granted a DOP of 32. The next user will be give a DOP of eight, the next four, and so on. If the system settles into a steady state of eight users issuing queries, all the users will eventually be given a DOP of 4, thus dividing the system evenl y among all the parallel users.
If you are sure you want to execute in parallel and want to avoid setting the DOP for a table or modifying the queries involved, you can force parallelism with the following statement:
ALT ER SESSION FORCE PARALLEL QUERY;
All subsequent queries will be executed in parallel provided no restrictions are violate d. You can also force DML and DDL statements. This clause overrides any parallel clause specified in subsequent statements in the ses sion, but is overridden by a parallel hint.
In typical OLTP environments, for example, the tables are not set parallel, but ni ghtly batch scripts may want to collect data from these tables in parallel. By setting the DOP in the session, the user avoids alteri ng each table in parallel and then altering it back to serial when finished.
The initialization parameter PARALLEL_THREADS_PER_CPU affects algorithms controlling both the DOP and the adaptive multiuser feature. O
racle multiplies the value of PARALLEL_THREADS_PER_CPU by the number of CPUs for each instance to derive the number of t
hreads to use in parallel operations.
The adaptive multiuser feature also uses the default DOP to compute the target number of
query server processes that should exist in a system. When a system is running more processes than the target number, the adaptive a
lgorithm reduces the DOP of new queries as required. Therefore, you can also use PARALLEL_THREADS_PER_CPU to control the
adaptive algorithm.
PARALLEL_THREADS_PER_CPU enables you to adjust for hardware configurations with I/O subsyste
ms that are slow relative to the CPU speed and for application workloads that perform few computations relative to the amount of data
involved. If the system is neither CPU-bound nor I/O-bound, then the PARALLEL_THREADS_PER_CPU value should be increased
. This increases the default DOP and allow better utilization of hardware resources. The default for PARALLEL_THREADS_PER_CPU
code> on most platforms is 2. However, the default for machines with relatively slow I/O subsystems can be as high as eight.
This section discusses the following topics:
The parameters that establish resource limits are:
The PARALLEL_MAX_SEVERS parameter sets a resource limit on the maximum
number of processes available for parallel execution. Most parallel operations need at most twice the number of query server processe
s as the maximum DOP attributed to any table in the operation.
Oracle sets PARALLEL_MAX_SERVERS to a default valu
e that is sufficient for most systems. The default value for PARALLEL_MAX_SERVERS is as follows:
(CPU_COUNT x PARALLEL_THREADS_PER_CPU x (2 if PGA_AGGREGATE_TARGET > 0; otherwise 1) x 5)
This might not be eno
ugh for parallel queries on tables with higher DOP attributes. We recommend users who expects to run queries of higher DOP to set
2 x DOP x NUMBER_OF_CONCURRENT_USERS
For example
, setting PARALLEL_MAX_SERVERS to 64 will allow you to run four parallel queries simultaneously, assuming that each quer
y is using two slave sets with a DOP of eight for each set.
If the hardware system is neither CPU bound nor I/O bound, then yo
u can increase the number of concurrent parallel execution users on the system by adding more query server processes. When the system
becomes CPU- or I/O-bound, however, adding more concurrent users becomes detrimental to the overall performance. Careful setting of
PARALLEL_MAX_SERVERS is an effective method of restricting the number of concurrent parallel operations.
If users
initiate too many concurrent operations, Oracle might not have enough query server processes. In this case, Oracle executes the oper
ations sequentially or displays an error if P
ARALLEL_MIN_PERCENT is set to a value other than the default value of 0 (zero).
This condition can be verified through
the GV$SYSSTAT view by comparing the statistics for parallel operations not downgraded and parallel operations downgrade
d to serial. For example:
SELECT * FROM GV$SYSSTAT WHERE name LIKE 'Parallel operation%';
When concurrent users have too many query server processe
s, memory contention (paging), I/O contention, or excessive context switching can occur. This contention can reduce system throughput
to a level lower than if parallel execution were not used. Increase the PARALLEL_MAX_SERVERS value only if the system has sufficient memory and I/O bandwidth for the resu
lting load.
You can use operating system performance monitoring tools to determine how much memory, swap space and I/O bandwid th are free. Look at the runq lengths for both your CPUs and disks, as well as the service time for I/Os on the system. Verify that t he machine has sufficient swap space exists on the machine to add more processes. Limiting the total number of query server processes might restrict the number of concurrent users who can execute parallel operations, but system throughput tends to remain stable.
To increase the number of concurrent users, you must restrict the resource usage of each individual user. You can ach ieve this by using the parallel adaptive multiuser feature or by using resource consumer groups. See Oracle Database Administrator's Guide and Oracle Database Concepts for more information about resource consumer groups and the Database Resour ce Manager.
< /a>You can limit the amount of parallelism available to a given user by establishing a resou rce consumer group for the user. Do this to limit the number of sessions, concurrent logons, and the number of parallel processes tha t any one user or group of users can have.
Each query server process working on a parallel execution statement is logged on wi th a session ID. Each process counts against the user's limit of concurrent sessions. For example, to limit a user to 10 parallel exe cution processes, set the user's limit to 11. One process is for the parallel coordinator and the other 10 consist of two sets of que ry server servers. This would allow one session for the parallel coordinator and 10 sessions for the parallel execution processes.
See Oracle Database Administrator's Guide for more
information about managing resources with user profiles and Oracle
Real Application Clusters Administrator's Guide for more information on querying GV$ views.
The recommended value for the PARALLEL_MIN_SERVERS parameter is 0 (zero), which is the def
ault.
This parameter lets you specify in a single instance the number of processes to be started and reserved for parallel ope rations. The syntax is:
PARALLEL_MIN_SERVERS=n
The n variable is the num ber of processes you want to start and reserve for parallel operations.
Setting PARALLEL_MIN_SERVERS balances the
startup cost against memory usage. Processes started using PARALLEL_MIN_SERVERS do not exit until the database is shut
down. This way, when a query is issued the processes are likely to be available. It is desirable, however, to recycle query server pr
ocesses periodically since the memory these processes use can become fragmented and cause the high water mark to slowly increase. Whe
n you do not set PARALLEL_MIN_SERVERS, processes exit after they are idle for five minutes.
Parallel execution requires memory resources in addition to those required by serial SQL execution. Additional memory is used for communication and passing data between query server processes and the query coordinator.
Oracle Database allocates memory for query server processes from the shared pool. Tune the shared pool as follows:
Allow for other clients of the shared pool, such as shared cursors and stored procedures.
Remember that larger values improve performance in multiuser systems, but smaller values use less memory.
You must also take into account that using parallel execution generates more cursors. Look at statistics in the V$SQLAREA
view to determine how often Oracle recompiles cursors. If the cursor hit ratio is poor, increase the size of the pool. This h
appens only when you have a large number of distinct queries.
You can then monitor the number of buffers used by parallel exec
ution and compare the shared pool PX msg pool to the current high wa
ter mark reported in output from the view V$PX_PROCESS_SYSSTAT.
|
Note: If you do not have enough memory available, error message 12853 occurs (insuffi cient memory for PX buffers: current stringK, max needed stringK). This is caused by having insufficient SGA memory available for PX buffers. You need to reconfigure the SGA to have at least (MAX - CURRENT) bytes of additi
onal memory. |
By default, Oracle allocates parallel ex ecution buffers from the shared pool.
If Oracle displays the following error on startup:
ORA-271 02: out of memory SVR4 Error: 12: Not enough space
You should reduce the value for SHARED_POOL_SIZE low en
ough so your database starts. After reducing the value of SHARED_POOL_SIZE, you might see the error:
ORA-04031: unable to allocate 16084 bytes of shared memory
("SHARED pool","unknown object","SHARED pool heap","PX msg
pool")
If so, execute the following query to determine why Oracle could not allocate the 16,084 bytes:
SELECT NAME, SUM(BYTES) FROM V$SGASTAT WHERE POOL='SHARED POOL' GROUP BY ROLLUP (NAME);
Your output shou ld resemble the following:
NAME SUM(BYTES)
-------------------------- ----------
PX msg pool 1474572
free memory 562132
2036704
If
you specify SHARED_POOL_SIZE and the amount of memory you need to reserve is bigger than the pool, Oracle does not alloc
ate all the memory it can get. Instead, it leaves some space. When the query runs, Oracle tries to get what it needs. Oracle uses the
560 KB and needs another 16KB when it fails. The error does not report the cumulative amount that is needed. The best way of determi
ning how much more memory is needed is to use the formulas in "Adding Memory for Message Buffers".
To
resolve the problem in the current example, increase the value for SHARED_POOL_SIZE. As shown in the sample output, the
SHARED_POOL_SIZE is about 2 MB. Depending on the amount of memory available, you could increase the value of SHARE
D_POOL_SIZE to 4 MB and attempt to start your database. If Oracle continues to display an ORA-4031 message, gradu
ally increase the value for SHARED_POOL_SIZE until startup is successful.
After you determine the initial setting for the shared pool, you must calculate additional memory requirements for message buffers and determine how much additional space you need for cursors.
You must increase the value for the SHARED_POOL_SIZE par
ameter to accommodate message buffers. The message buffers allow query server processes to communicate with each other.
Oracle
uses a fixed number of buffers for each virtual connection between producer query servers and consumer query servers. Connections in
crease as the square of the DOP increases. For this reason, the maximum amount of memory used by parallel execution is bound by the h
ighest DOP allowed on your system. You can control this value by using either the PARALLEL_MAX_SERVERS parameter or by u
sing policies and profiles.
To calculate the amount of memory required, use one of the following formulas:
For SMP systems:
mem in bytes = (3 x size x users x groups x connections)
For SMP Real Application Clusters and MPP systems:
mem in bytes = ((3 x local) + (2 x re mote)) x (size x users x groups) / instances
Each instance uses the memory computed by the formula.
T he terms are:
SIZE = PARALLEL_EXECUTION_MESSAGE_SIZE
USERS = the number of concurrent parallel execution users that you expect to have running with the optimal DOP
GROUPS = the number of query server process groups used for each query
A simple SQL state ment requires only one group. However, if your queries involve subqueries which will be processed in parallel, then Oracle uses an ad ditional group of query server processes.
CONNECTIONS = (DOP2 + 2 x DOP)
If your system is a cluster or MPP, then you should account for the number of instances because this will increase the DOP. In other
words, using a DOP of 4 on a two instance cluster results in a DOP of 8. A value of PARALLEL_MAX_SERVERS times the numbe
r of instances divided by four is a conservative estimate to use as a starting point.
LOCAL
= CONNECTIONS/INSTANCES
REMOTE = CONNECTIONS -
Add this amount to your original setting for the shared pool. However, before setting a value for eit her of these memory structures, you must also consider additional memory for cursors, as explained in the following section.
Parallel execution plans consume more space in the SQL area than serial execution plans. You should regularly monitor shared pool resource use to ensure that the memory used by both messages an d cursors can accommodate your system's processing requirements.
The formulas in this section are just starting points. Whether you are using automated or m anual tuning, you should monitor usage on an on-going basis to make sure the size of memory is not too large or too small. To do this , tune the shared pool using the following query:
SELECT POOL, NAME, SUM(BYTES) FROM V$SGASTAT WHERE PO OL LIKE '%pool%' GROUP BY ROLLUP (POOL, NAME);
Your output should resemble the following:
POOL NAME SUM(BYTES)
----------- -------------------------- ----------
shared pool Checkpoint queue
38496
shared pool KGFF heap 1964
shared pool KGK heap 4372
shared poo
l KQLS heap 1134432
shared pool LRMPD SGA Table 23856
shared pool PLS non-lib hp
2096
shared pool PX subheap 186828
shared pool SYSTEM PARAMETERS 55756
shared pool State obj
ects 3907808
shared pool character set memory 30260
shared pool db_block_buffers 200000
s
hared pool db_block_hash_buckets 33132
shared pool db_files 122984
shared pool db_handles
52416
shared pool dictionary cache 198216
shared pool dlm shared memory 5387924
shared pool
enqueue_resources 29016
shared pool event statistics per sess 264768
shared pool fixed allocation callback
1376
shared pool free memory 26329104
shared pool gc_* 64000
shared pool latch nowai
t fails or sle 34944
shared pool library cache 2176808
shared pool log_buffer 24576
sha
red pool log_checkpoint_timeout 24700
shared pool long op statistics array 30240
shared pool message pool freequeue
116232
shared pool miscellaneous 267624
shared pool processes 76896
shared pool se
ssion param values 41424
shared pool sessions 170016
shared pool sql area 954
9116
shared pool table columns 148104
shared pool trace_buffers_per_process 1476320
shared pool transactions
18480
shared pool trigger inform 24684
shared pool 52248968
90641768
Evaluate the memory used as shown in your output, and alter the setting for
SHARED_POOL_SIZE based on your processing needs.
To obtain more memory usage statistics, execute the following q uery:
SELECT * FROM V$PX_PROCESS_SYSSTAT WHERE STATISTIC LIKE 'Buffers%';
Your output shoul d resemble the following:
STATISTIC VALUE ------------------- ----- Buffers Allocated 23225 Buffers Freed 23225 Buffers Current 0 Buffers HWM 3620
The amount of memory used appears in the Buffers Curre
nt and Buffers HWM statistics. Calculate a value in bytes by multiplying the number of buffers by th
e value for PARALLEL_EXECUTION_MESSAGE_SIZE. Compare the high water mark to the parallel execution message pool size to
determine if you allocated too much memory. For example, in the first output, the value for large pool as shown in px pool is 38,092,812 or 38 MB. The Buffers HWM from the second output is 3,620, w
hich when multiplied by a parallel execution message size of 4,096 is 14,827,520, or approximately 15 MB. In this case, the high wate
r mark has reached approximately 40 percent of its capacity.
The recommended value for the PARALLEL_MIN_PERCENT parameter is 0 (zero).
This parameter enables users to wait for an acceptable DOP, dep
ending on the application in use. Setting this parameter to values other than 0 (zero) causes Oracle to return an error when the requ
ested DOP cannot be satisfied by the system at a given time. For example, if you set PARALLEL_MIN_PERCENT to 50, which t
ranslates to 50 percent, and the DOP is reduced by 50 percent or greater because of the adaptive algorithm or because of a resource l
imitation, then Oracle returns ORA-12827. For example:
SELECT /*+ PARALLEL(e, 8, 1) */ d.d epartment_id, SUM(SALARY) FROM employees e, departments d WHERE e.department_id = d.department_id GROUP BY d.department_id ORDER BY d .department_id;
Oracle responds with this message:
ORA-12827: insufficient parallel query s laves available
The first group of parameters discussed in this section affects memory and resource consumption for all parallel operations, in particular, for parallel execution. These parameters are:
A second subset of parameters discussed in this section explains parameters affecting parall el DML and DDL.
To control resource consumption, you should configure memory at two levels:
At the Oracle l evel, so the system uses an appropriate amount of memory from the operating system.
At the operating sys tem level for consistency. On some platforms, you might need to set operating system parameters that control the total amount of virt ual memory available, summed across all processes.
The SGA is typically part of real physical memory. The SGA is sta tic and of fixed size; if you want to change its size, shut down the database, make the change, and restart the database. Oracle allo cates the shared pool out of the SGA.
A large percentage of the memory used in data warehousing operations is more dynamic. Th
is memory comes from process memory (PGA), and both the size of process memory and the number of processes can vary greatly. Use the
PGA_AGGREGATE_TARGET parameter to control both the process memory and the number of processes.
You ca
n simplify and improve the way PGA memory is allocated by enabling automatic PGA memory management. In this mode, Oracle dynamically
adjusts the size of the portion of the PGA memory dedicated to work areas, based on an overall PGA memory target explicitly set by th
e DBA. To enable automatic PGA memory management, you have to set the initialization parameter PGA_AGGREGATE_TARGET. See
Oracle Database Performance Tuning Guide for descripti
ons of how to use PGA_AGGREGATE_TARGET in different scenarios.
The PARALLEL_EXECUTION_MESSAGE_SIZE parameter specifies the size of the buffer used for parallel execution messages. The
default value is os specific, but is typically 2K. This value should be adequate for most applications, however, increasing this val
ue can improve performance. Consider increasing this value if you have adequate free memory in the shared pool or if you have suffici
ent operating system memory and can increase your shared pool size to accommodate the additional amount of memory required.
The parameters that affect parallel DML and parallel DDL resource consumption are:
Parallel inserts, updates, and deletes require more resources than seria
l DML operations. Similarly, PARALLEL CREATE TABLE ..
. AS SELECT and PARALLEL CREATE IN
DEX can require more resources. For this reason, you may need to increase the value of several additional initialization param
eters. These parameters do not affect resources for queries.
F
or parallel DML and DDL, each query server process starts a transaction. The parallel coordinator uses the two-phase commit protocol to commit transactions; therefore, the number of transactions being processed increases by t
he DOP. As a result, you might need to increase the value of the TRANSACTIONS initialization parameter.
The TRANSACTIONS parameter specifies the maximu
m number of concurrent transactions. The default assumes no parallelism. For example, if you have a DOP of 20, you will have 20 more
new server transactions (or 40, if you have two server sets) and 1 coordinator transaction. In this case, you should increase T
RANSACTIONS by 21 (or 41) if the transactions are running in the same instance. If you do not set this parameter, Oracle sets
it to a value equal to 1.1 x SESSIONS. This discussion does not apply if you are using server-managed undo.
If a system fails when there are uncom
mitted parallel DML or DDL transactions, you can speed up transaction recovery during startup by using the FAST_START_PARALLEL_ROLLBACK parameter.
This parameter co
ntrols the DOP used when recovering terminated transactions. Terminated transactions are transactions that are active before a system
failure. By default, the DOP is chosen to be at most two times the value of the CPU_COUNT parameter.
If the defa
ult DOP is insufficient, set the parameter to the HIGH. This gives a maximum DOP of at most four times the value of the
CPU_COUNT parameter. This feature is available by default.
Check the statistic buffer allocation retries in the V$SYSSTAT view. If this value is high relative to redo blockswritten, try to increase the LOG_BUFFER size. A common LOG_BUFFER size for a system genera
ting numerous logs is 3 MB to 5 MB. If the number of retries is still high after increasing LOG_BUFFER size, a problem m
ight exist with the disk on which the log files reside. In that case, tune the I/O subsystem to increase the I/O rates for redo. One
way of doing this is to use fine-grained striping across multiple disks. For example, use a stripe size of 16 KB. A simpler approach
is to isolate redo logs on their own disk.
Th
is parameter specifies the maximum number of DML locks. Its value should equal the total number of locks on all tables referenced by
all users. A parallel DML operation's lock and enqueue resource requirement is very different from serial DML. Parallel DML holds man
y more locks, so you should increase the value of the ENQUEUE_RESOURCES and DML_LOC
KS parameters by equal amounts.
Table 24-4 shows the types of locks acquired by coordinator and parallel execution server processes for different types of parallel DML statements. Using this information, you can determine the va lue required for these parameters.
Table 24-4 Locks Acquired by Parallel DML Statements
| Type of statement | Coordinator process acquires : | Each parallel execution server acquires: |
|---|---|---|
Parallel UPDATE or DELETE into partitioned table; WHERE clause pruned to a s
ubset of partitions or subpartitions |
1 table lock SX
1 partition lock X for eac h pruned (sub)partition |
1 table lock SX
1 partition lock 1 partition-wait lock S for each pruned (sub)partition owned by the query server process |
Parallel row-migrating UPDATE into partitioned table; WHERE clause pruned to a subset of (sub)partitions<
/td>
| 1 table lock SX | 1 table lock SX td> |
| 1 partition X lock for each pruned (sub)partition | 1 partition lock NULL for each pruned (sub)pa
rtition owned by the query server process
1 partition-wait lock S for each pruned partition owned by the query server process | |
| 1 partition lock SX for all othe r (sub)partitions | 1 partition lock SX for all other (sub)partitions | tr>|
Parallel UPDATE, MERGE
, DELETE, or INSERT into partitioned table |
1 table lock S
X
Partition locks X for all (sub)partitions |
1 table lock SX
1 partition lock NULL for each (sub)partition 1 partition-wait lock S for each (sub)partition |
Parallel INSERT into partitioned table; destination table with parti
tion or subpartition clause |
1 table lock SX
1 partition lock X for each specifi ed (sub)partition |
1 table lock SX
1 partition lock 1 partition-wait lock S for each specified (sub)partition |
Parallel INSERT into nonpartitioned table |
1 table lock X | None |
|
Note: Table, partition, and partition-wait DML locks all appear as TM locks in theV$LOCK view. |
| Insert Type | Parallel | Serial | NOLOGGING |
|---|---|---|---|
| Conventional | No | Yes | No |
Direct-pathINSERT
( |
Yes
, but requires:
|
Yes, but requires:
|
Yes, but requi
res:
|
If parallel DML is enabled and there is a PARALLEL hint or PARALLEL attribute set for the table in the data dictionary, then inserts are parallel
and appended, unless a restriction applies. If either the PARALLEL hint or PARALLEL attribute is missing,
the insert is performed serially.
The append mode is the default during a parallel insert: data is always inserted into a new block which is allocate
d to the table. Therefore the APPEND hint is optional. You should use append mo
de to increase the speed of INSERT operations, but not when space utilization needs to be optimized. You can use <
a id="sthref2420" name="sthref2420">NOAPPEND to override append mode.
The APPEND hint applies to both
serial and parallel insert: even serial inserts are faster if you use this hint. APPEND, however, does require more spac
e and locking overhead.
You can use NOLOGGING with APPEND t
o make the process even faster. NOLOGGING means that no redo log is generated for the operation. NOLOGGING
is never the default; use it when you wish to optimize performance. It should not normally be used when recovery is needed for the ta
ble or partition. If recovery is needed, be sure to take a backup immediately after the operation. Use the ALTER TABLE [NO]LOGGING statement to set the appropriate value.
Wh
en the table or partition has the PARALLEL attribute in the data dictionary, th
at attribute setting is used to determine parallelism of INSERT, UPDATE, and DELETE statements
as well as queries. An explicit PARALLEL hint for a table in a statement overrides the effect of the PARALLEL attribute in the data dictionary.
You can use the NO_PARALLEL hint to override a PARALLEL attrib
ute for the table in the data dictionary. In general, hints take precedence over attributes.
DML operations are considered for
parallelization only if the session is in a PARALLEL DML enabled mode. (Use ALTER SESSI
ON ENABLE PARALLEL DML to enter this mode.) The mode does not affect parallelization of
queries or of the query portions of a DML statement.
In the INSERT ...
SELECT statement you can specify a PARALLEL hint after the INSERT keyword, in addition to the hint a
fter the SELECT keyword. The PARALLEL hint after the INSERT keyword applies to the INSER
T operation only, and the PARALLEL hint after the SELECT keyword applies to the SELECT
operation only. Thus, parallelism of the INSERT and SELECT operations are independent of each other. If one
operation cannot be performed in parallel, it has no effect on whether the other operation can be performed in parallel.
The
ability to parallelize inserts causes a change in existing behavior if the user has explicitly enabled the session for parallel DML a
nd if the table in question has a PARALLEL attribute set in the data dictionary entry. In that case, existing INSE
RT ... SELECT statements that have the select operation parallelized can also have their insert operation paralle
lized.
If you query multiple tables, you can specify multiple SELECT PARALLEL hints and multiple
Example 24-7 Parallelizing INSERT ... SELECT
Add the new employee
s who were hired after the acquisition of ACME.
INSERT /*+ PARALLEL(EMP) */ INTO employees SELECT /*+ PARALLEL(ACME_EMP) */ * FROM ACME_EMP;
The APPEND keyword is not required in this example becau
se it is implied by the PARALLEL hint.
The PARALLEL hint (placed immediately after the UPDATE or DELETE keyword
) applies not only to the underlying scan operation, but also to the UPDATE or DELETE operation. Alternativ
ely, you can specify UPDATE or DELETE parallelism in the PARALLEL clause specified in the defi
nition of the table to be modified.
If you have explicitly enabled parallel DML for the session or transaction, UPDATE
code> or DELETE statements that have their query operation parallelized can also have their UPDATE or DELETE operation parallelized. Any subqueries or updatable views in the statement can have their own separate PARALLEL<
/code> hints or clauses, but these parallel directives do not affect the decision to parallelize the update or delete. If these opera
tions cannot be performed in parallel, it has no effect on whether the UPDATE or DELETE portion can be perf
ormed in parallel.
Tables must be partitioned in order to support parallel UPDATE and DELETE.
Example 24-8 < strong>Parallelizing UPDATE and DELETE
Give a 10 percent salary raise to all clerks in Dallas. p>
UPDATE /*+ PARALLEL(EMP) */ employees SET SAL=SAL * 1.1 WHERE JOB='CLERK' AND DEPTNO IN (SELECT DEPTNO FROM DEPT WHERE LOCATION='DALLAS');
The PARALLEL hint is applied to the UPDATE operation as we
ll as to the scan.
Example 24-9 Parallelizing UPDATE and DELETE
R emove all products in the grocery category because the grocery business line was recently spun off into a separate company.
DELETE /*+ PARALLEL(PRODUCTS) */ FROM PRODUCTS WHERE PRODUCT_CATEGORY ='GROCERY';
Again, the parall
elism is applied to the scan as well as UPDATE operation on table employees.
Parallel DML combined with the updatable join views facility provides an efficient soluti on for refreshing the tables of a data warehouse system. To refresh tables is to update them with the differential data generated fro m the OLTP production system.
In the following example, assume that you want to refresh a table named customer th
at has columns c_key, c_name, and c_addr. The differential data contains either new rows or ro
ws that have been updated since the last refresh of the data warehouse. In this example, the updated data is shipped from the product
ion system to the data warehouse system by means of ASCII files. These files must be loaded into a temporary table, named diff_
customer, before starting the refresh process. You can use SQL*Loader with both the parallel and direct options to efficiently
perform this task. You can use the APPEND hint when loading in parallel as well.
Once diff_customer
is loaded, the refresh process can be started. It can be performed in two phases or by merging in parallel, as demonstrated in the f
ollowing:
The following statement is a straightfo rward SQL implementation of the update using subqueries:
UPDATE customers SET(c_name, c_addr) = (SELECT c_name, c_addr FROM diff_customer WHERE diff_customer.c_key = customer.c_key) WHERE c_key IN(SELECT c_key FROM diff_customer);Unfortunately, the two subqueries in this statement affect performance.
An alternative is to rewrite this query using u pdatable join views. To do this, you must first add a primary key constraint to the
diff_customertable to ensure that t he modified columns map to a key-preserved table:CREATE UNIQUE INDEX diff_pkey_ind ON diff_customer(c_ key) PARALLEL NOLOGGING; ALTER TABLE diff_customer ADD PRIMARY KEY (c_key);You can then update the
customerstable with the following SQL statement:UPDATE /*+ PARALLEL(cust_joinview) */ (SELECT /*+ PARALLEL(cu stomers) PARALLEL(diff_customer) */ CUSTOMER.c_name AS c_name CUSTOMER.c_addr AS c_addr, diff_customer.c_name AS c_newname, diff_cust omer.c_addr AS c_newaddr WHERE customers.c_key = diff_customer.c_key) cust_joinview SET c_name = c_newname, c_addr = c_newaddr;The base scans feeding the join view
cust_joinvieware done in parallel. You can then parallelize the updat e to further improve performance, but only if thecustomerstable is partitioned.
The last phase of the refresh process consi
sts of inserting the new rows from the diff_customer temporary table to the customer table. Unlike the upda
te case, you cannot avoid having a subquery in the INSERT statement:
INSERT /*+PARALLEL(cu stomers)*/ INTO customers SELECT * FROM diff_customer s);
However, you can guarantee that the subquery is transformed int
o an anti-hash join by using the HASH_AJ hint. Doing so enables you to use parallel INSERT to execute the p
receding statement efficiently. Parallel INSERT is applicable even if the table is not partitioned.
You can combine updates and inserts into one statement, commonly known as a merge. The following statement achieves the same result as all of the statements in "Updating the Table in Parallel" and "Inserting the New Rows into the Table in Parallel":
MERGE INTO customers USING diff_customer ON (diff_customer.c_key = customer.c_key) WHEN MATCHED THEN UPDATE SET (c_name, c_addr) = (SELECT c_name, c_addr FROM diff_customer WHERE diff_customer.c_key = customers.c_key) WHEN NOT MATCHED T HEN INSERT VALUES (diff_customer.c_key,diff_customer.c_data);
Query optimization is a sophisticated approach to finding the best execution plan for SQL statements. Oracle automatically uses query optimization with parallel execution.
You must use the <
code>DBMS_STATS package to gather current statistics for cost-based optimization. In particular, tables used in parallel shoul
d always be analyzed. Always keep your statistics current by using the DBMS_STATS package. Failure to do so may result i
n degraded execution performance due to non-optimal execution plans.
Use discretion in employing hints. If used, hints should come as a final step in tuning and only when they demonstrate a necessary and significant performance advantage. In such cases, begin with the execution plan recommended by query optimization, and go on to test the effect of hints only after you have quantified your performance expectations. Remember that hints are powerful. If you use them and the underlying data changes, you might need to chang e the hints. Otherwise, the effectiveness of your execution plans might deteriorate.
The FIRST_ROWS(n) hint enables the optimizer to use a new optimization mode to optimize the query to return n row
s in the shortest amount of time. Oracle Corporation recommends that you use this new hint in place of the old FIRST_ROWS hint for online queries because the new optimization mode may improve the response time compared to the old optimization mode.
Use the FIRST_ROWS(n) hint in cases where you want the first n number of rows in the shortest possible time.
For example, to obtain the first 10 rows in the shortest possible time, use the hint as follows:
SELEC T /*+ FIRST_ROWS(10) */ article_id FROM articles_tab WHERE CONTAINS(article, 'Oracle')>0 ORDER BY pub_date DESC;
Dynamic sampling allows Oracle to derive m ore accurate statistics and thereby improve query performance when statistics do not exist or are out of date. This is particularly u seful in data warehousing environments or when you expect long transactions or queries. In these situations, making sure that Oracle uses the best execution plan is important. Dynamic sampling does, however, have a small cost, so you should use it when that cost is likely to be a small fraction of the total execution time.
If you enable dynamic statistic sampling, Oracle determines at comp ile time whether a query would benefit from dynamic sampling. If so, a recursive SQL statement is issued to scan a small, random samp le of the table's blocks, and to apply the relevant single table predicates to estimate predicate selectivities. Relevant table, inde x and column statistics are also estimated. More accurate selectivity and statistics estimates allow the optimizer to produce better performing plans.
Dynamic sampling is controlled with the initialization parameter OPTIMIZER_DYNAMIC_SAMPLING, wh
ich can be set to a value between 0 and 10, inclusive. Increasing the value of the parameter will result in more aggressive applicati
on of dynamic sampling, in terms of both the type (unanalyzed/analyzed) of tables sampled and the amount of I/O spent on sampling.
Oracle also provides the table-specific hint DYNAMIC_SAMPLING. If the table name is omitted, the hint is considered
cursor-level. The table-level hint forces dynamic sampling for the table.
See Oracle Database Performance Tuning Guide for more information regarding dynamic sampling.