A function-based index includes columns that are either transformed by a function, such as the UPPER function, or included in an expression, such as col1 + col2. With a function-based index, you can store computation-intensive expressions in the index.
Defining a function-based index on the transformed column or expression allows that data to be returned using the index when that function or expression is used in a WHERE clause or an ORDER BY clause. This allows Oracle to bypass computing the value of the expression when processing SELECT and DELETE statements. Therefore, a function-based index can be beneficial when frequently-executed SQL statements include transformed columns, or columns in expressions, in a WHERE or ORDER BY clause.
Oracle treats descending indexes as function-based indexes. The columns marked DESC are sorted in descending order.
For example, function-based indexes defined with the UPPER(column_name) or LOWER(column_name) keywords allow case-insensitive searches. The index created in the following statement:
CREATE INDEX uppercase_idx ON employees (UPPER(last_name));
facilitates processing queries such as:
SELECT * FROM employees
WHERE UPPER(last_name) = 'MARKSON';
Oracle 10 DBA Interview questions and answers
1. Is the following SQL statement syntactically correct? If not, please rewrite it correctly.
SELECT col1 FROM tableA WHERE NOT IN (SELECT col1 FROM tableB);
Ans. SQL is incorrect.
Correct SQL : SELECT col1 FROM tableA WHERE col1 NOT IN (SELECT col1 FROM tableB);
2. What is a more efficient way to write this query, to archive the same set?
Ans: SELECT col1 from tableA minus SELECT col1 from tableB
3.How would you determine that the new query is more efficient than the original query?
Ans: Run explain plan on both query and see the result .
4.How can we find the location of the database trace files from within the data dictionary?
Ans: Generally trace file on the database server machine is located in one of two locations:
1. If you are using a dedicated server connection, the trace file will be generated in the directory specified by
the USER_DUMP_DEST parameter.
2.If you are using a shared server connection, the trace file will be generated in the directory specified by the
BACKGROUND_DUMP_DEST parameter.
you can run sqlplus>SHOW PARAMETER DUMP_DEST
or
select name, value
from v$parameter
where name like '%dump_dest%'
5. What is the correct syntax for a UNIX endless WHILE loop?
while :
do
commands
done
6. Write the SQL statement that will return the name and size of the largest datafile in the database.
SQL> select name,bytes from v$datafile where bytes=(select max(bytes) from v$datafile);
7. What are the proper steps to changing the Oracle database block size?
cold backup all data files and backup controlfile to trace, recreate your database
with the new block size using the backup control file, and restore. It may be easier
with rman. You can not change datbase block size on fly.
8. Using awk, write a script to print the 3rd field of every line.
Ans:
awk '{print }'
awk '{print $3}
awk '{print $3}
9.Under what conditions, is a nested loop better than a merge join?
Ans:
Optimizer uses nested loop when we are joining tables containing small number of rows with an efficient driving
condition.
It is important to have an index on column of inner join table as this table is probed every time for a new value
from outer table.
Optimizer may not use nested loop in case:
1. No of rows of both the table is quite high
2. Inner query always results in same set of records
3. The access path of inner table is independent of data coming from outer table.
merge join is used to join two independent data sources. They perform better than nested loop when the volume of
data is big in tables
but not as good as hash joins in general.
10.Which database views would you use to ascertain the number of commits a user's session has performed?
Joining V$SESSTAT ,V$STATNAME
select * from V$SESSTAT a ,V$STATNAME b where b.CLASS=a.STATISTIC# and b.NAME='user commits' and a.sid=
11.What does the #!bin/ksh at the beginning of a shell script do? Why should it be there?
Ans: On the first line of an interpreter script, the "#!", is the name of a program which should be used to
interpret the contents of the file.
For instance, if the first line contains "#! /bin/ksh", then the contents of the file are executed as a korn shell
script.
12.What command is used to find the status of Oracle 10g Clusterware (CRS) and the various components it manages
(ONS, VIP, listener, instances, etc.)?
Ans:
$ocrcheck
13.Describe a scenario in which a vendor clusterware is required, in addition to the Oracle 10g Clusterware.
If you choose external redundancy for the OCR and voting disk, then to enable redundancy, the disk subsystem must be configurable for RAID mirroring/vendor clusterware. Otherwise, your system may be vulnerable because the OCR and voting disk are single points of failure.
14.How would you find the interconnect IP address from any node within an Oracle 10g RAC configuration?
using oifcfg command.
se the oifcfg -help command to display online help for OIFCFG. The elements of OIFCFG commands, some of which are
optional depending on the command, are:
*nodename—Name of the Oracle Clusterware node as listed in the output from the olsnodes command
*if_name—Name by which the interface is configured in the system
*subnet—Subnet address of the interface
*if_type—Type of interface: public or cluster_interconnect
You can use OIFCFG to list the interface names and the subnets of all of the interfaces available on the local node
by executing the iflist keyword as shown in this example:
oifcfg iflist
hme0 139.185.141.0
qfe0 204.152.65.16
You can also retrieve specific OIFCFG information with a getif command using the following syntax:
oifcfg getif [ [-global | -node nodename] [-if if_name[/subnet]] [-type if_type] ]
To store a new interface use the setif keyword. For example, to store the interface hme0, with the subnet
139.185.141.0, as a global interface (to be used as an interconnect for all of the RAC instances in your cluster),
you would use the command:
oifcfg setif -global hme0/139.185.141.0:cluster_interconnect
For a cluster interconnect that exists between only two nodes, for example rac1 and rac2, you could create the cms0
interface with the following commands, assuming 139.185.142.0 is the subnet addresses for the interconnect on rac1
and rac2 respectively:
oifcfg setif -global cms0/139.185.142.0:cluster_interconnect
Use the OIFCFG delif command to delete the stored configuration for global or node-specific interfaces. A specific
node-specific or global interface can be deleted by supplying the interface name, with an optional subnet, on the
command line. Without the -node or -global options, the delif keyword deletes either the given interface or all of
the global and node-specific interfaces on all of the nodes in the cluster. For example, the following command
deletes the global interface named qfe0 for the subnet 204.152.65.0:
oifcfg delif -global qfe0/204.152.65.0
On the other hand, the next command deletes all of the global interfaces stored with OIFCFG:
oifcfg delif -global
15.What is the Purpose of the voting disk in Oracle 10g Clusterware?
Voting disk record node membership information. Oracle Clusterware uses the voting disk to determine which instances are members of a cluster. The voting disk must reside on a shared disk. For high availability, Oracle recommends that you have a minimum of three voting disks. If you configure a single voting disk, then you should use external mirroring to provide redundancy. You can have up to 32 voting disks in your cluster.
16.What is the purpose of the OCR in Oracle 10g Clusterware?
Ans: Oracle Cluster Registry (OCR) is a component in 10g RAC used to store the cluster configuration information. It is a shared disk component, typically located in a shared raw volume that must be accessible to all nodes in the cluster.
The daemon OCSSd manages the configuration info in OCR and maintains the changes to cluster in the registry.
17. In Oracle Streams archived log downstream capture, which database view can be used to determine which archived
logs are no longer needed by the capture process?
Ans: V$ARCHIVE_DEST_STATUS
SELECT col1 FROM tableA WHERE NOT IN (SELECT col1 FROM tableB);
Ans. SQL is incorrect.
Correct SQL : SELECT col1 FROM tableA WHERE col1 NOT IN (SELECT col1 FROM tableB);
2. What is a more efficient way to write this query, to archive the same set?
Ans: SELECT col1 from tableA minus SELECT col1 from tableB
3.How would you determine that the new query is more efficient than the original query?
Ans: Run explain plan on both query and see the result .
4.How can we find the location of the database trace files from within the data dictionary?
Ans: Generally trace file on the database server machine is located in one of two locations:
1. If you are using a dedicated server connection, the trace file will be generated in the directory specified by
the USER_DUMP_DEST parameter.
2.If you are using a shared server connection, the trace file will be generated in the directory specified by the
BACKGROUND_DUMP_DEST parameter.
you can run sqlplus>SHOW PARAMETER DUMP_DEST
or
select name, value
from v$parameter
where name like '%dump_dest%'
5. What is the correct syntax for a UNIX endless WHILE loop?
while :
do
commands
done
6. Write the SQL statement that will return the name and size of the largest datafile in the database.
SQL> select name,bytes from v$datafile where bytes=(select max(bytes) from v$datafile);
7. What are the proper steps to changing the Oracle database block size?
cold backup all data files and backup controlfile to trace, recreate your database
with the new block size using the backup control file, and restore. It may be easier
with rman. You can not change datbase block size on fly.
8. Using awk, write a script to print the 3rd field of every line.
Ans:
awk
awk '{print $3}
awk '{print $3}
9.Under what conditions, is a nested loop better than a merge join?
Ans:
Optimizer uses nested loop when we are joining tables containing small number of rows with an efficient driving
condition.
It is important to have an index on column of inner join table as this table is probed every time for a new value
from outer table.
Optimizer may not use nested loop in case:
1. No of rows of both the table is quite high
2. Inner query always results in same set of records
3. The access path of inner table is independent of data coming from outer table.
merge join is used to join two independent data sources. They perform better than nested loop when the volume of
data is big in tables
but not as good as hash joins in general.
10.Which database views would you use to ascertain the number of commits a user's session has performed?
Joining V$SESSTAT ,V$STATNAME
select * from V$SESSTAT a ,V$STATNAME b where b.CLASS=a.STATISTIC# and b.NAME='user commits' and a.sid=
11.What does the #!bin/ksh at the beginning of a shell script do? Why should it be there?
Ans: On the first line of an interpreter script, the "#!", is the name of a program which should be used to
interpret the contents of the file.
For instance, if the first line contains "#! /bin/ksh", then the contents of the file are executed as a korn shell
script.
12.What command is used to find the status of Oracle 10g Clusterware (CRS) and the various components it manages
(ONS, VIP, listener, instances, etc.)?
Ans:
$ocrcheck
13.Describe a scenario in which a vendor clusterware is required, in addition to the Oracle 10g Clusterware.
If you choose external redundancy for the OCR and voting disk, then to enable redundancy, the disk subsystem must be configurable for RAID mirroring/vendor clusterware. Otherwise, your system may be vulnerable because the OCR and voting disk are single points of failure.
14.How would you find the interconnect IP address from any node within an Oracle 10g RAC configuration?
using oifcfg command.
se the oifcfg -help command to display online help for OIFCFG. The elements of OIFCFG commands, some of which are
optional depending on the command, are:
*nodename—Name of the Oracle Clusterware node as listed in the output from the olsnodes command
*if_name—Name by which the interface is configured in the system
*subnet—Subnet address of the interface
*if_type—Type of interface: public or cluster_interconnect
You can use OIFCFG to list the interface names and the subnets of all of the interfaces available on the local node
by executing the iflist keyword as shown in this example:
oifcfg iflist
hme0 139.185.141.0
qfe0 204.152.65.16
You can also retrieve specific OIFCFG information with a getif command using the following syntax:
oifcfg getif [ [-global | -node nodename] [-if if_name[/subnet]] [-type if_type] ]
To store a new interface use the setif keyword. For example, to store the interface hme0, with the subnet
139.185.141.0, as a global interface (to be used as an interconnect for all of the RAC instances in your cluster),
you would use the command:
oifcfg setif -global hme0/139.185.141.0:cluster_interconnect
For a cluster interconnect that exists between only two nodes, for example rac1 and rac2, you could create the cms0
interface with the following commands, assuming 139.185.142.0 is the subnet addresses for the interconnect on rac1
and rac2 respectively:
oifcfg setif -global cms0/139.185.142.0:cluster_interconnect
Use the OIFCFG delif command to delete the stored configuration for global or node-specific interfaces. A specific
node-specific or global interface can be deleted by supplying the interface name, with an optional subnet, on the
command line. Without the -node or -global options, the delif keyword deletes either the given interface or all of
the global and node-specific interfaces on all of the nodes in the cluster. For example, the following command
deletes the global interface named qfe0 for the subnet 204.152.65.0:
oifcfg delif -global qfe0/204.152.65.0
On the other hand, the next command deletes all of the global interfaces stored with OIFCFG:
oifcfg delif -global
15.What is the Purpose of the voting disk in Oracle 10g Clusterware?
Voting disk record node membership information. Oracle Clusterware uses the voting disk to determine which instances are members of a cluster. The voting disk must reside on a shared disk. For high availability, Oracle recommends that you have a minimum of three voting disks. If you configure a single voting disk, then you should use external mirroring to provide redundancy. You can have up to 32 voting disks in your cluster.
16.What is the purpose of the OCR in Oracle 10g Clusterware?
Ans: Oracle Cluster Registry (OCR) is a component in 10g RAC used to store the cluster configuration information. It is a shared disk component, typically located in a shared raw volume that must be accessible to all nodes in the cluster.
The daemon OCSSd manages the configuration info in OCR and maintains the changes to cluster in the registry.
17. In Oracle Streams archived log downstream capture, which database view can be used to determine which archived
logs are no longer needed by the capture process?
Ans: V$ARCHIVE_DEST_STATUS
Monday, October 20, 2008
Choosing Composite Indexes
A composite index contains more than one key column. Composite indexes can provide additional advantages over single-column indexes:
*Improved selectivity
Sometimes two or more columns or expressions, each with poor selectivity, can be combined to form a composite index with higher selectivity.
*Reduced I/O
If all columns selected by a query are in a composite index, then Oracle can return these values from the index without accessing the table.
A SQL statement can use an access path involving a composite index if the statement contains constructs that use a leading portion of the index.
Note:This is no longer the case with index skip scans. See "Index Skip Scans".
A leading portion of an index is a set of one or more columns that were specified first and consecutively in the list of columns in the CREATE INDEX statement that created the index. Consider this CREATE INDEX statement:
CREATE INDEX comp_ind
ON table1(x, y, z);
*x, xy, and xyz combinations of columns are leading portions of the index
*yz, y, and z combinations of columns are not leading portions of the index
Choosing Keys for Composite Indexes
Follow these guidelines for choosing keys for composite indexes:
*Consider creating a composite index on keys that are used together frequently in WHERE clause conditions combined with AND operators, especially if their combined selectivity is better than the selectivity of either key individually.
*If several queries select the same set of keys based on one or more key values, then consider creating a composite index containing all of these keys.
Of course, consider the guidelines associated with the general performance advantages and trade-offs of indexes described in the previous sections.
Ordering Keys for Composite Indexes
Follow these guidelines for ordering keys in composite indexes:
*Create the index so the keys used in WHERE clauses make up a leading portion.
*If some keys are used in WHERE clauses more frequently, then be sure to create the index so that the more frequently selected keys make up a leading portion to allow the statements that use only these keys to use the index.
*If all keys are used in the WHERE clauses equally often but the data is physically ordered on one of the keys, then place that key first in the composite index.
*Improved selectivity
Sometimes two or more columns or expressions, each with poor selectivity, can be combined to form a composite index with higher selectivity.
*Reduced I/O
If all columns selected by a query are in a composite index, then Oracle can return these values from the index without accessing the table.
A SQL statement can use an access path involving a composite index if the statement contains constructs that use a leading portion of the index.
Note:This is no longer the case with index skip scans. See "Index Skip Scans".
A leading portion of an index is a set of one or more columns that were specified first and consecutively in the list of columns in the CREATE INDEX statement that created the index. Consider this CREATE INDEX statement:
CREATE INDEX comp_ind
ON table1(x, y, z);
*x, xy, and xyz combinations of columns are leading portions of the index
*yz, y, and z combinations of columns are not leading portions of the index
Choosing Keys for Composite Indexes
Follow these guidelines for choosing keys for composite indexes:
*Consider creating a composite index on keys that are used together frequently in WHERE clause conditions combined with AND operators, especially if their combined selectivity is better than the selectivity of either key individually.
*If several queries select the same set of keys based on one or more key values, then consider creating a composite index containing all of these keys.
Of course, consider the guidelines associated with the general performance advantages and trade-offs of indexes described in the previous sections.
Ordering Keys for Composite Indexes
Follow these guidelines for ordering keys in composite indexes:
*Create the index so the keys used in WHERE clauses make up a leading portion.
*If some keys are used in WHERE clauses more frequently, then be sure to create the index so that the more frequently selected keys make up a leading portion to allow the statements that use only these keys to use the index.
*If all keys are used in the WHERE clauses equally often but the data is physically ordered on one of the keys, then place that key first in the composite index.
Choosing Columns and Expressions to Index
A key is a column or expression on which you can build an index. Follow these guidelines for choosing keys to index:
* Consider indexing keys that are used frequently in WHERE clauses.
* Consider indexing keys that are used frequently to join tables in SQL statements. For more information on optimizing joins, see the section "Using Hash Clusters for Performance".
* Choose index keys that have high selectivity. The selectivity of an index is the percentage of rows in a table having the same value for the indexed key. An index's selectivity is optimal if few rows have the same value.
Note:Oracle automatically creates indexes, or uses existing indexes, on the keys and expressions of unique and primary keys that you define with integrity constraints.
Indexing low selectivity columns can be helpful if the data distribution is skewed so that one or two values occur much less often than other values.
*Do not use standard B-tree indexes on keys or expressions with few distinct values. Such keys or expressions usually have poor selectivity and therefore do not optimize performance unless the frequently selected key values appear less frequently than the other key values. You can use bitmap indexes effectively in such cases, unless the index is modified frequently, as in a high concurrency OLTP application.
*Do not index columns that are modified frequently. UPDATE statements that modify indexed columns and INSERT and DELETE statements that modify indexed tables take longer than if there were no index. Such SQL statements must modify data in indexes as well as data in tables. They also generate additional undo and redo.
*Do not index keys that appear only in WHERE clauses with functions or operators. A WHERE clause that uses a function, other than MIN or MAX, or an operator with an indexed key does not make available the access path that uses the index except with function-based indexes.
*Consider indexing foreign keys of referential integrity constraints in cases in which a large number of concurrent INSERT, UPDATE, and DELETE statements access the parent and child tables. Such an index allows UPDATEs and DELETEs on the parent table without share locking the child table.
*When choosing to index a key, consider whether the performance gain for queries is worth the performance loss for INSERTs, UPDATEs, and DELETEs and the use of the space required to store the index. You might want to experiment by comparing the processing times of the SQL statements with and without indexes. You can measure processing time with the SQL trace facility.
* Consider indexing keys that are used frequently in WHERE clauses.
* Consider indexing keys that are used frequently to join tables in SQL statements. For more information on optimizing joins, see the section "Using Hash Clusters for Performance".
* Choose index keys that have high selectivity. The selectivity of an index is the percentage of rows in a table having the same value for the indexed key. An index's selectivity is optimal if few rows have the same value.
Note:Oracle automatically creates indexes, or uses existing indexes, on the keys and expressions of unique and primary keys that you define with integrity constraints.
Indexing low selectivity columns can be helpful if the data distribution is skewed so that one or two values occur much less often than other values.
*Do not use standard B-tree indexes on keys or expressions with few distinct values. Such keys or expressions usually have poor selectivity and therefore do not optimize performance unless the frequently selected key values appear less frequently than the other key values. You can use bitmap indexes effectively in such cases, unless the index is modified frequently, as in a high concurrency OLTP application.
*Do not index columns that are modified frequently. UPDATE statements that modify indexed columns and INSERT and DELETE statements that modify indexed tables take longer than if there were no index. Such SQL statements must modify data in indexes as well as data in tables. They also generate additional undo and redo.
*Do not index keys that appear only in WHERE clauses with functions or operators. A WHERE clause that uses a function, other than MIN or MAX, or an operator with an indexed key does not make available the access path that uses the index except with function-based indexes.
*Consider indexing foreign keys of referential integrity constraints in cases in which a large number of concurrent INSERT, UPDATE, and DELETE statements access the parent and child tables. Such an index allows UPDATEs and DELETEs on the parent table without share locking the child table.
*When choosing to index a key, consider whether the performance gain for queries is worth the performance loss for INSERTs, UPDATEs, and DELETEs and the use of the space required to store the index. You might want to experiment by comparing the processing times of the SQL statements with and without indexes. You can measure processing time with the SQL trace facility.
Friday, October 17, 2008
Finding Locks and Lock Holders
Query V$LOCK to find the sessions holding the lock. For every session waiting for the event enqueue, there is a row in V$LOCK with REQUEST <> 0. Use one of the following two queries to find the sessions holding the locks and waiting for the locks.
If there are enqueue waits, you can see these using the following statement:
SELECT * FROM V$LOCK WHERE request > 0;
To show only holders and waiters for locks being waited on, use the following:
SELECT DECODE(request,0,'Holder: ','Waiter: ') ||
sid sess, id1, id2, lmode, request, type
FROM V$LOCK
WHERE (id1, id2, type) IN (SELECT id1, id2, type FROM V$LOCK WHERE request > 0)
ORDER BY id1, request;
If there are enqueue waits, you can see these using the following statement:
SELECT * FROM V$LOCK WHERE request > 0;
To show only holders and waiters for locks being waited on, use the following:
SELECT DECODE(request,0,'Holder: ','Waiter: ') ||
sid sess, id1, id2, lmode, request, type
FROM V$LOCK
WHERE (id1, id2, type) IN (SELECT id1, id2, type FROM V$LOCK WHERE request > 0)
ORDER BY id1, request;
TM enqueue
TM enqueue
The most common reason for waits on TM locks tend to involve foreign key constraints where the constrained columns are not indexed. Index the foreign key columns to avoid this problem.
The most common reason for waits on TM locks tend to involve foreign key constraints where the constrained columns are not indexed. Index the foreign key columns to avoid this problem.
HW enqueue
The HW enqueue is used to serialize the allocation of space beyond the high water mark of a segment.
*V$SESSION_WAIT.P2 / V$LOCK.ID1 is the tablespace number.
*V$SESSION_WAIT.P3 / V$LOCK.ID2 is the relative dba of segment header of the object for which space is being allocated.
If this is a point of contention for an object, then manual allocation of extents solves the problem.
*V$SESSION_WAIT.P2 / V$LOCK.ID1 is the tablespace number.
*V$SESSION_WAIT.P3 / V$LOCK.ID2 is the relative dba of segment header of the object for which space is being allocated.
If this is a point of contention for an object, then manual allocation of extents solves the problem.
ST enqueue
ST enqueue
If the contended-for enqueue is the ST enqueue, then the problem is most likely to be dynamic space allocation. Oracle dynamically allocates an extent to a segment when there is no more free space available in the segment. This enqueue is only used for dictionary managed tablespaces.
To solve contention on this resource:
*Check to see whether the temporary (that is, sort) tablespace uses TEMPFILES. If not, then switch to using TEMPFILES.
*Switch to using locally managed tablespaces if the tablespace that contains segments that are growing dynamically is dictionary managed.
#If it is not possible to switch to locally managed tablespaces, then ST enqueue resource usage can be decreased by changing the next extent sizes of the growing objects to be large enough to avoid constant space allocation. To determine which segments are growing constantly, monitor the EXTENTS column of the DBA_SEGMENTS view for all SEGMENT_NAMEs. See Oracle Database Administrator's Guide for information about displaying information about space usage.
#Preallocate space in the segment, for example, by allocating extents using the ALTER TABLE ALLOCATE EXTENT SQL statement.
If the contended-for enqueue is the ST enqueue, then the problem is most likely to be dynamic space allocation. Oracle dynamically allocates an extent to a segment when there is no more free space available in the segment. This enqueue is only used for dictionary managed tablespaces.
To solve contention on this resource:
*Check to see whether the temporary (that is, sort) tablespace uses TEMPFILES. If not, then switch to using TEMPFILES.
*Switch to using locally managed tablespaces if the tablespace that contains segments that are growing dynamically is dictionary managed.
#If it is not possible to switch to locally managed tablespaces, then ST enqueue resource usage can be decreased by changing the next extent sizes of the growing objects to be large enough to avoid constant space allocation. To determine which segments are growing constantly, monitor the EXTENTS column of the DBA_SEGMENTS view for all SEGMENT_NAMEs. See Oracle Database Administrator's Guide for information about displaying information about space usage.
#Preallocate space in the segment, for example, by allocating extents using the ALTER TABLE ALLOCATE EXTENT SQL statement.
TX enqueue
TX enqueue
These are acquired exclusive when a transaction initiates its first change and held until the transaction does a COMMIT or ROLLBACK.
* Waits for TX in mode 6: occurs when a session is waiting for a row level lock that is already held by another session. This occurs when one user is updating or deleting a row, which another session wishes to update or delete. This type of TX enqueue wait corresponds to the wait event enq: TX - row lock contention.
The solution is to have the first session already holding the lock perform a COMMIT or ROLLBACK.
* Waits for TX in mode 4 can occur if the session is waiting for an ITL (interested transaction list) slot in a block. This happens when the session wants to lock a row in the block but one or more other sessions have rows locked in the same block, and there is no free ITL slot in the block. Usually, Oracle dynamically adds another ITL slot. This may not be possible if there is insufficient free space in the block to add an ITL. If so, the session waits for a slot with a TX enqueue in mode 4. This type of TX enqueue wait corresponds to the wait event enq: TX - allocate ITL entry.
The solution is to increase the number of ITLs available, either by changing the INITRANS or MAXTRANS for the table (either by using an ALTER statement, or by re-creating the table with the higher values).
*Waits for TX in mode 4 can also occur if a session is waiting due to potential duplicates in UNIQUE index. If two sessions try to insert the same key value the second session has to wait to see if an ORA-0001 should be raised or not. This type of TX enqueue wait corresponds to the wait event enq: TX - row lock contention.
The solution is to have the first session already holding the lock perform a COMMIT or ROLLBACK.
*Waits for TX in mode 4 is also possible if the session is waiting due to shared bitmap index fragment. Bitmap indexes index key values and a range of ROWIDs. Each 'entry' in a bitmap index can cover many rows in the actual table. If two sessions want to update rows covered by the same bitmap index fragment, then the second session waits for the first transaction to either COMMIT or ROLLBACK by waiting for the TX lock in mode 4. This type of TX enqueue wait corresponds to the wait event enq: TX - row lock contention.
*Waits for TX in Mode 4 can also occur waiting for a PREPARED transaction.
*Waits for TX in mode 4 also occur when a transaction inserting a row in an index has to wait for the end of an index block split being done by another transaction. This type of TX enqueue wait corresponds to the wait event enq: TX - index contention.
These are acquired exclusive when a transaction initiates its first change and held until the transaction does a COMMIT or ROLLBACK.
* Waits for TX in mode 6: occurs when a session is waiting for a row level lock that is already held by another session. This occurs when one user is updating or deleting a row, which another session wishes to update or delete. This type of TX enqueue wait corresponds to the wait event enq: TX - row lock contention.
The solution is to have the first session already holding the lock perform a COMMIT or ROLLBACK.
* Waits for TX in mode 4 can occur if the session is waiting for an ITL (interested transaction list) slot in a block. This happens when the session wants to lock a row in the block but one or more other sessions have rows locked in the same block, and there is no free ITL slot in the block. Usually, Oracle dynamically adds another ITL slot. This may not be possible if there is insufficient free space in the block to add an ITL. If so, the session waits for a slot with a TX enqueue in mode 4. This type of TX enqueue wait corresponds to the wait event enq: TX - allocate ITL entry.
The solution is to increase the number of ITLs available, either by changing the INITRANS or MAXTRANS for the table (either by using an ALTER statement, or by re-creating the table with the higher values).
*Waits for TX in mode 4 can also occur if a session is waiting due to potential duplicates in UNIQUE index. If two sessions try to insert the same key value the second session has to wait to see if an ORA-0001 should be raised or not. This type of TX enqueue wait corresponds to the wait event enq: TX - row lock contention.
The solution is to have the first session already holding the lock perform a COMMIT or ROLLBACK.
*Waits for TX in mode 4 is also possible if the session is waiting due to shared bitmap index fragment. Bitmap indexes index key values and a range of ROWIDs. Each 'entry' in a bitmap index can cover many rows in the actual table. If two sessions want to update rows covered by the same bitmap index fragment, then the second session waits for the first transaction to either COMMIT or ROLLBACK by waiting for the TX lock in mode 4. This type of TX enqueue wait corresponds to the wait event enq: TX - row lock contention.
*Waits for TX in Mode 4 can also occur waiting for a PREPARED transaction.
*Waits for TX in mode 4 also occur when a transaction inserting a row in an index has to wait for the end of an index block split being done by another transaction. This type of TX enqueue wait corresponds to the wait event enq: TX - index contention.
Thursday, October 16, 2008
sar - Top 20 DBA unix commands
SAR: Displays the activity for the CPU.
Syntax
sar [-a] [-A] [-b] [-c] [-d] [-g] [-k] [-m] [-p] [-q] [-r] [-u] [-v] [-w] [-y ] [ -o filename ] t [ n ]
sar [-a] [-A] [-b] [-c] [-d] [-g] [-k] [-m] [-p] [-q] [-r] [-u] [-v] [-w] [-y ] [ -e time ] [ -f filename ] [-i sec ] [ -s time ]
-a Report use of file access system routines: iget/s, namei/s, dirblk/s
-A Report all data. Equivalent to -abcdgkmpqruvwy.
-b Report buffer activity:
bread/s, bwrit/s
transfers per second of data between system buffers and disk or other block devices.
lread/s, lwrit/s
accesses of system buffers.
%rcache, %wcache
cache hit ratios, that is, (1-bread/lread) as a percentage.
pread/s, pwrit/s
transfers using raw (physical) device mechanism.
-c Report system calls:
scall/s system calls of all types.
sread/s, swrit/s, fork/s, exec/s
specific system calls.
rchar/s, wchar/s
characters transferred by read and write system calls. No incoming or outgoing exec and fork calls are reported.
-d Report activity for each block device (for example, disk or tape drive) with the exception of XDC disks and tape drives. When data is displayed, the device specification dsk- is generally used to represent a disk drive. The device specification used to represent a tape drive is machine dependent. The activity data reported is:
%busy, avque
portion of time device was busy servicing a transfer request, average number of requests outstanding during that time.
read/s, write/s, blks/s
number of read/write transfers from or to device, number of bytes transferred in 512-byte units.
avwait average wait time in milliseconds.
avserv average service time in milliseconds.
-g Report paging activities:
pgout/s
page-out requests per second.
ppgout/s
pages paged-out per second.
pgfree/s
pages per second placed on the free list by the page stealing daemon.
pgscan/s
pages per second scanned by the page stealing daemon.
%ufs_ipf
the percentage of UFS inodes taken off the freelist by iget which had reusable pages associated with them. These pages are flushed and cannot be reclaimed by processes. Thus, this is the percentage of igets with page flushes.
-k Report kernel memory allocation (KMA) activities:
sml_mem, alloc, fail
information about the memory pool reserving and allocating space for small requests: the amount of memory in bytes KMA has for the small pool, the number of bytes allocated to satisfy requests for small amounts of memory, and the number of requests for small amounts of memory that were not satisfied (failed).
lg_mem, alloc, fail
information for the large memory pool (analogous to the information for the small memory pool).
ovsz_alloc, fail
the amount of memory allocated for oversize requests and the number of oversize requests which could not be satisfied (because oversized memory is allocated dynamically, there is not a pool).
-m Report message and semaphore activities:
msg/s, sema/s
primitives per second.
-p Report paging activities:
atch/s
page faults per second that are satisfied by reclaiming a page currently in memory (attaches per second).
pgin/s
page-in requests per second.
ppgin/s
pages paged-in per second.
pflt/s
page faults from protection errors per second (illegal access to page) or "copy-on-writes".
vflt/s
address translation page faults per second (valid page not in memory).
slock/s
faults per second caused by software lock requests requiring physical I/O.
-q Report average queue length while occupied, and percent of time occupied:
runq-sz, %runocc
run queue of processes in memory and runnable.
swpq-sz, %swpocc
these are no longer reported by sar .
-r Report unused memory pages and disk blocks:
freemem average pages available to user processes.
freeswap disk blocks available for page swapping.
-u Report CPU utilization (the default):
%usr, %sys, %wio, %idle
portion of time running in user mode, running in system mode, idle with some process waiting for block I/O, and otherwise idle.
-v Report status of process, i-node, file tables:
proc-sz, inod-sz, file-sz, lock-sz
entries/size for each table, evaluated
ov
overflows that occur between sampling points for each table.
-w Report system swapping and switching activity:
swpin/s, swpot/s, bswin/s, bswot/s
number of transfers and number of 512-byte units transferred for swapins and swapouts (including initial loading of some programs).
pswch/s
process switches.
-y Report TTY device activity:
rawch/s, canch/s, outch/s
input character rate, input character rate processed by canon, output character rate.
rcvin/s, xmtin/s, mdmin/s
receive, transmit and modem interrupt rates.
-o filename Save samples in file, filename, in binary format.
-e time Select data up to time . Default is 18:00.
-f filename Use filename as the data source for sar . Default is the current daily data file /var/adm/sa/sadd.
-i sec Select data at intervals as close as possible to sec seconds.
Examples
sar - display today's CPU activity so far. You must have permission and the information must be present on the computer.
top: top - display top CPU processes
SYNOPSIS
top [-] [d delay] [p pid] [q] [c] [C] [S] [s] [i] [n iter] [b]
DESCRIPTION
top provides an ongoing look at processor activity in real time. It displays a listing of the most CPU-intensive tasks on the system, and can provide an interactive interface for manipulating processes. It can sort the tasks by CPU usage, memory usage and runtime. can be better configured than the standard top from the procps suite. Most features can either be selected by an interactive command or by specifying the feature in the personal or system-wide configuration file. See below for more information.
COMMAND-LINE OPTIONS
-d
Specifies the delay between screen updates. You can change this with the s interactive command.
-p
Monitor only processes with given process id. This flag can be given up to twenty times. This option is neither available interactively nor can it be put into the configuration file.
-q
This causes top to refresh without any delay. If the caller has superuser privileges, top runs with the highest possible priority.
-S
Specifies cumulative mode, where each process is listed with the CPU time that it as well as its dead children has spent. This is like the -S flag to ps(1). See the discussion below of the S interactive command.
-s
Tells top to run in secure mode. This disables the potentially dangerous of the interactive commands (see below). A secure top is a nifty thing to leave running on a spare terminal.
-i
Start top ignoring any idle or zombie processes. See the interactive command i below.
-C
display total CPU states instead of individual CPUs. This option only affects SMP systems.
-c
display command line instead of the command name only. The default behavior has been changed as this seems to be more useful.
-H
Show all threads.
-n
Number of iterations. Update the display this number of times and then exit.
-b
Batch mode. Useful for sending output from top to other programs or to a file. In this mode, top will not accept command line input. It runs until it produces the number of iterations requested with the n option or until killed. Output is plain text suitable for display on a dumb terminal.
FIELD DESCRIPTIONS
top displays a variety of information about the processor state. The display is updated every 5 seconds by default, but you can change that with the d command-line option or the s interactive command.
"uptime"
This line displays the time the system has been up, and the three load averages for the system. The load averages are the average number of process ready to run during the last 1, 5 and 15 minutes. This line is just like the output of uptime(1). The uptime display may be toggled by the interactive l command.
processes
The total number of processes running at the time of the last update. This is also broken down into the number of tasks which are running, sleeping, stopped, or undead. The processes and states display may be toggled by the t interactive command.
"CPU states"
Shows the percentage of CPU time in user mode, system mode, niced tasks, iowait and idle. (Niced tasks are only those whose nice value is positive.) Time spent in niced tasks will also be counted in system and user time, so the total will be more than 100%. The processes and states display may be toggled by the t interactive command.
Mem
Statistics on memory usage, including total available memory, free memory, used memory, shared memory, and memory used for buffers. The display of memory information may be toggled by the m interactive command.
Swap
Statistics on swap space, including total swap space, available swap space, and used swap space. This and Mem are just like the output of free(1).
PID
The process ID of each task.
PPID
The parent process ID each task.
UID
The user ID of the task's owner.
USER
The user name of the task's owner.
PRI
The priority of the task.
NI
The nice value of the task. Negative nice values are higher priority.
SIZE
The size of the task's code plus data plus stack space, in kilobytes, is shown here.
TSIZE
The code size of the task. This gives strange values for kernel processes and is broken for ELF processes.
DSIZE
Data + Stack size. This is broken for ELF processes.
TRS
Text resident size.
SWAP
Size of the swapped out part of the task.
D
Size of pages marked dirty.
LC
Last used processor. (That this changes from time to time is not a bug; Linux intentionally uses weak affinity. Also notice that the very act of running top may break weak affinity and cause more processes to change current CPU more often because of the extra demand for CPU time.)
RSS
The total amount of physical memory used by the task, in kilobytes, is shown here. For ELF processes used library pages are counted here, for a.out processes not.
SHARE
The amount of shared memory used by the task is shown in this column.
STAT
The state of the task is shown here. The state is either S for sleeping, D for uninterruptible sleep, R for running, Z for zombies, or T for stopped or traced. These states are modified by trailing < for a process with negative nice value, N for a process with positive nice value, W for a swapped out process (this does not work correctly for kernel processes).
WCHAN
depending on the availability of either /boot/psdatabase or the kernel link map /boot/System.map this shows the address or the name of the kernel function the task currently is sleeping in.
TIME
Total CPU time the task has used since it started. If cumulative mode is on, this also includes the CPU time used by the process's children which have died. You can set cumulative mode with the S command line option or toggle it with the interactive command S. The header line will then be changed to CTIME.
%CPU
The task's share of the CPU time since the last screen update, expressed as a percentage of total CPU time per processor.
%MEM
The task's share of the physical memory.
COMMAND
The task's command name, which will be truncated if it is too long to be displayed on one line. Tasks in memory will have a full command line, but swapped-out tasks will only have the name of the program in parentheses (for example, "(getty)").
"A , WP"
these fields from the kmem top are not supported.
INTERACTIVE COMMANDS
Several single-key commands are recognized while top is running. Some are disabled if the s option has been given on the command line.
space
Immediately updates the display.
^L
Erases and redraws the screen.
h or ?
Displays a help screen giving a brief summary of commands, and the status of secure and cumulative modes.
k
Kill a process. You will be prompted for the PID of the task, and the signal to send to it. For a normal kill, send signal 15. For a sure, but rather abrupt, kill, send signal 9. The default signal, as with kill(1), is 15, SIGTERM. This command is not available in secure mode.
i
Ignore idle and zombie processes. This is a toggle switch.
I
Toggle between Solaris (CPU percentage divided by total number of CPUs) and Irix (CPU percentage calculated solely by amount of time) views. This is a toggle switch that affects only SMP systems.
n or #
Change the number of processes to show. You will be prompted to enter the number. This overrides automatic determination of the number of processes to show, which is based on window size measurement. If 0 is specified, then top will show as many processes as will fit on the screen; this is the default.
q
Quit.
r
Re-nice a process. You will be prompted for the PID of the task, and the value to nice it to. Entering a positve value will cause a process to be niced to negative values, and lose priority. If root is running top, a negative value can be entered, causing a process to get a higher than normal priority. The default renice value is 10. This command is not available in secure mode.
S
This toggles cumulative mode, the equivalent of ps -S, i.e., that CPU times will include a process's defunct children. For some programs, such as compilers, which work by forking into many separate tasks, normal mode will make them appear less demanding than they actually are. For others, however, such as shells and init, this behavior is correct. In any case, try cumulative mode for an alternative view of CPU use.
s
Change the delay between updates. You will be prompted to enter the delay time, in seconds, between updates. Fractional values are recognized down to microseconds. Entering 0 causes continuous updates. The default value is 5 seconds. Note that low values cause nearly unreadably fast displays, and greatly raise the load. This command is not available in secure mode.
f or F
Add fields to display or remove fields from the display. See below for more information.
o or O
Change order of displayed fields. See below for more information.
l
toggle display of load average and uptime information.
m
toggle display of memory information.
t
toggle display of processes and CPU states information.
c
toggle display of command name or full command line.
N
sort tasks by pid (numerically).
A
sort tasks by age (newest first).
P
sort tasks by CPU usage (default).
M
sort tasks by resident memory usage.
T
sort tasks by time / cumulative time.
W
Write current setup to ~/.toprc. This is the recommended way to write a top configuration file.
The Field and Order Screens
After pressing f, F, o or O you will be shown a screen specifying the field order on the top line and short descriptions of the field contents. The field order string uses the following syntax: If the letter in the filed string corresponding to a field is upper case, the field will be displayed. This is furthermore indicated by an asterisk in front of the field description. The order of the fields corresponds to the order of the letters in the string.
From the field select screen you can toggle the display of a field by pressing the corresponding letter.
From the order screen you may move a field to the left by pressing the corresponding upper case letter resp. to the right by pressing the lower case one.
Configuration Files
The command top reads it's default configuration from two files, /etc/toprc and ~/.toprc. The global configuration file may be used to restrict the usage of top to the secure mode for non-non-privileged users. If this is desired, the file should contain a 's' to specify secure mode and a digit d (2<=d<=9) for the default delay (in seconds) on a single line. The personal configuration file contains two lines. The first line contains lower and upper letters to specify which fields in what order are to be displayed. The letters correspond to the letters in the Fields or Order screens from top. As this is not very instructive, it is recommended to select fields and order in a running top process and to save this using the W interactive command.
The second line is more interesting (and important). It contains information on the other options. Most important, if you have saved a configuration in secure mode, you will not get an insecure top without removing the lower 's' from the second line of your ~/.toprc.
A digit specifies the delay time between updates, a capital 'S' cumulative mode, a lower 'i' no-idle mode, a capital 'I' Irix view. As in interactive mode, a lower 'm', 'l', and 't' suppresses the display of memory, uptime resp. process and CPU state information. Currently changing the default sorting order (by CPU usage) is not supported.
Syntax
sar [-a] [-A] [-b] [-c] [-d] [-g] [-k] [-m] [-p] [-q] [-r] [-u] [-v] [-w] [-y ] [ -o filename ] t [ n ]
sar [-a] [-A] [-b] [-c] [-d] [-g] [-k] [-m] [-p] [-q] [-r] [-u] [-v] [-w] [-y ] [ -e time ] [ -f filename ] [-i sec ] [ -s time ]
-a Report use of file access system routines: iget/s, namei/s, dirblk/s
-A Report all data. Equivalent to -abcdgkmpqruvwy.
-b Report buffer activity:
bread/s, bwrit/s
transfers per second of data between system buffers and disk or other block devices.
lread/s, lwrit/s
accesses of system buffers.
%rcache, %wcache
cache hit ratios, that is, (1-bread/lread) as a percentage.
pread/s, pwrit/s
transfers using raw (physical) device mechanism.
-c Report system calls:
scall/s system calls of all types.
sread/s, swrit/s, fork/s, exec/s
specific system calls.
rchar/s, wchar/s
characters transferred by read and write system calls. No incoming or outgoing exec and fork calls are reported.
-d Report activity for each block device (for example, disk or tape drive) with the exception of XDC disks and tape drives. When data is displayed, the device specification dsk- is generally used to represent a disk drive. The device specification used to represent a tape drive is machine dependent. The activity data reported is:
%busy, avque
portion of time device was busy servicing a transfer request, average number of requests outstanding during that time.
read/s, write/s, blks/s
number of read/write transfers from or to device, number of bytes transferred in 512-byte units.
avwait average wait time in milliseconds.
avserv average service time in milliseconds.
-g Report paging activities:
pgout/s
page-out requests per second.
ppgout/s
pages paged-out per second.
pgfree/s
pages per second placed on the free list by the page stealing daemon.
pgscan/s
pages per second scanned by the page stealing daemon.
%ufs_ipf
the percentage of UFS inodes taken off the freelist by iget which had reusable pages associated with them. These pages are flushed and cannot be reclaimed by processes. Thus, this is the percentage of igets with page flushes.
-k Report kernel memory allocation (KMA) activities:
sml_mem, alloc, fail
information about the memory pool reserving and allocating space for small requests: the amount of memory in bytes KMA has for the small pool, the number of bytes allocated to satisfy requests for small amounts of memory, and the number of requests for small amounts of memory that were not satisfied (failed).
lg_mem, alloc, fail
information for the large memory pool (analogous to the information for the small memory pool).
ovsz_alloc, fail
the amount of memory allocated for oversize requests and the number of oversize requests which could not be satisfied (because oversized memory is allocated dynamically, there is not a pool).
-m Report message and semaphore activities:
msg/s, sema/s
primitives per second.
-p Report paging activities:
atch/s
page faults per second that are satisfied by reclaiming a page currently in memory (attaches per second).
pgin/s
page-in requests per second.
ppgin/s
pages paged-in per second.
pflt/s
page faults from protection errors per second (illegal access to page) or "copy-on-writes".
vflt/s
address translation page faults per second (valid page not in memory).
slock/s
faults per second caused by software lock requests requiring physical I/O.
-q Report average queue length while occupied, and percent of time occupied:
runq-sz, %runocc
run queue of processes in memory and runnable.
swpq-sz, %swpocc
these are no longer reported by sar .
-r Report unused memory pages and disk blocks:
freemem average pages available to user processes.
freeswap disk blocks available for page swapping.
-u Report CPU utilization (the default):
%usr, %sys, %wio, %idle
portion of time running in user mode, running in system mode, idle with some process waiting for block I/O, and otherwise idle.
-v Report status of process, i-node, file tables:
proc-sz, inod-sz, file-sz, lock-sz
entries/size for each table, evaluated
ov
overflows that occur between sampling points for each table.
-w Report system swapping and switching activity:
swpin/s, swpot/s, bswin/s, bswot/s
number of transfers and number of 512-byte units transferred for swapins and swapouts (including initial loading of some programs).
pswch/s
process switches.
-y Report TTY device activity:
rawch/s, canch/s, outch/s
input character rate, input character rate processed by canon, output character rate.
rcvin/s, xmtin/s, mdmin/s
receive, transmit and modem interrupt rates.
-o filename Save samples in file, filename, in binary format.
-e time Select data up to time . Default is 18:00.
-f filename Use filename as the data source for sar . Default is the current daily data file /var/adm/sa/sadd.
-i sec Select data at intervals as close as possible to sec seconds.
Examples
sar - display today's CPU activity so far. You must have permission and the information must be present on the computer.
top: top - display top CPU processes
SYNOPSIS
top [-] [d delay] [p pid] [q] [c] [C] [S] [s] [i] [n iter] [b]
DESCRIPTION
top provides an ongoing look at processor activity in real time. It displays a listing of the most CPU-intensive tasks on the system, and can provide an interactive interface for manipulating processes. It can sort the tasks by CPU usage, memory usage and runtime. can be better configured than the standard top from the procps suite. Most features can either be selected by an interactive command or by specifying the feature in the personal or system-wide configuration file. See below for more information.
COMMAND-LINE OPTIONS
-d
Specifies the delay between screen updates. You can change this with the s interactive command.
-p
Monitor only processes with given process id. This flag can be given up to twenty times. This option is neither available interactively nor can it be put into the configuration file.
-q
This causes top to refresh without any delay. If the caller has superuser privileges, top runs with the highest possible priority.
-S
Specifies cumulative mode, where each process is listed with the CPU time that it as well as its dead children has spent. This is like the -S flag to ps(1). See the discussion below of the S interactive command.
-s
Tells top to run in secure mode. This disables the potentially dangerous of the interactive commands (see below). A secure top is a nifty thing to leave running on a spare terminal.
-i
Start top ignoring any idle or zombie processes. See the interactive command i below.
-C
display total CPU states instead of individual CPUs. This option only affects SMP systems.
-c
display command line instead of the command name only. The default behavior has been changed as this seems to be more useful.
-H
Show all threads.
-n
Number of iterations. Update the display this number of times and then exit.
-b
Batch mode. Useful for sending output from top to other programs or to a file. In this mode, top will not accept command line input. It runs until it produces the number of iterations requested with the n option or until killed. Output is plain text suitable for display on a dumb terminal.
FIELD DESCRIPTIONS
top displays a variety of information about the processor state. The display is updated every 5 seconds by default, but you can change that with the d command-line option or the s interactive command.
"uptime"
This line displays the time the system has been up, and the three load averages for the system. The load averages are the average number of process ready to run during the last 1, 5 and 15 minutes. This line is just like the output of uptime(1). The uptime display may be toggled by the interactive l command.
processes
The total number of processes running at the time of the last update. This is also broken down into the number of tasks which are running, sleeping, stopped, or undead. The processes and states display may be toggled by the t interactive command.
"CPU states"
Shows the percentage of CPU time in user mode, system mode, niced tasks, iowait and idle. (Niced tasks are only those whose nice value is positive.) Time spent in niced tasks will also be counted in system and user time, so the total will be more than 100%. The processes and states display may be toggled by the t interactive command.
Mem
Statistics on memory usage, including total available memory, free memory, used memory, shared memory, and memory used for buffers. The display of memory information may be toggled by the m interactive command.
Swap
Statistics on swap space, including total swap space, available swap space, and used swap space. This and Mem are just like the output of free(1).
PID
The process ID of each task.
PPID
The parent process ID each task.
UID
The user ID of the task's owner.
USER
The user name of the task's owner.
PRI
The priority of the task.
NI
The nice value of the task. Negative nice values are higher priority.
SIZE
The size of the task's code plus data plus stack space, in kilobytes, is shown here.
TSIZE
The code size of the task. This gives strange values for kernel processes and is broken for ELF processes.
DSIZE
Data + Stack size. This is broken for ELF processes.
TRS
Text resident size.
SWAP
Size of the swapped out part of the task.
D
Size of pages marked dirty.
LC
Last used processor. (That this changes from time to time is not a bug; Linux intentionally uses weak affinity. Also notice that the very act of running top may break weak affinity and cause more processes to change current CPU more often because of the extra demand for CPU time.)
RSS
The total amount of physical memory used by the task, in kilobytes, is shown here. For ELF processes used library pages are counted here, for a.out processes not.
SHARE
The amount of shared memory used by the task is shown in this column.
STAT
The state of the task is shown here. The state is either S for sleeping, D for uninterruptible sleep, R for running, Z for zombies, or T for stopped or traced. These states are modified by trailing < for a process with negative nice value, N for a process with positive nice value, W for a swapped out process (this does not work correctly for kernel processes).
WCHAN
depending on the availability of either /boot/psdatabase or the kernel link map /boot/System.map this shows the address or the name of the kernel function the task currently is sleeping in.
TIME
Total CPU time the task has used since it started. If cumulative mode is on, this also includes the CPU time used by the process's children which have died. You can set cumulative mode with the S command line option or toggle it with the interactive command S. The header line will then be changed to CTIME.
%CPU
The task's share of the CPU time since the last screen update, expressed as a percentage of total CPU time per processor.
%MEM
The task's share of the physical memory.
COMMAND
The task's command name, which will be truncated if it is too long to be displayed on one line. Tasks in memory will have a full command line, but swapped-out tasks will only have the name of the program in parentheses (for example, "(getty)").
"A , WP"
these fields from the kmem top are not supported.
INTERACTIVE COMMANDS
Several single-key commands are recognized while top is running. Some are disabled if the s option has been given on the command line.
space
Immediately updates the display.
^L
Erases and redraws the screen.
h or ?
Displays a help screen giving a brief summary of commands, and the status of secure and cumulative modes.
k
Kill a process. You will be prompted for the PID of the task, and the signal to send to it. For a normal kill, send signal 15. For a sure, but rather abrupt, kill, send signal 9. The default signal, as with kill(1), is 15, SIGTERM. This command is not available in secure mode.
i
Ignore idle and zombie processes. This is a toggle switch.
I
Toggle between Solaris (CPU percentage divided by total number of CPUs) and Irix (CPU percentage calculated solely by amount of time) views. This is a toggle switch that affects only SMP systems.
n or #
Change the number of processes to show. You will be prompted to enter the number. This overrides automatic determination of the number of processes to show, which is based on window size measurement. If 0 is specified, then top will show as many processes as will fit on the screen; this is the default.
q
Quit.
r
Re-nice a process. You will be prompted for the PID of the task, and the value to nice it to. Entering a positve value will cause a process to be niced to negative values, and lose priority. If root is running top, a negative value can be entered, causing a process to get a higher than normal priority. The default renice value is 10. This command is not available in secure mode.
S
This toggles cumulative mode, the equivalent of ps -S, i.e., that CPU times will include a process's defunct children. For some programs, such as compilers, which work by forking into many separate tasks, normal mode will make them appear less demanding than they actually are. For others, however, such as shells and init, this behavior is correct. In any case, try cumulative mode for an alternative view of CPU use.
s
Change the delay between updates. You will be prompted to enter the delay time, in seconds, between updates. Fractional values are recognized down to microseconds. Entering 0 causes continuous updates. The default value is 5 seconds. Note that low values cause nearly unreadably fast displays, and greatly raise the load. This command is not available in secure mode.
f or F
Add fields to display or remove fields from the display. See below for more information.
o or O
Change order of displayed fields. See below for more information.
l
toggle display of load average and uptime information.
m
toggle display of memory information.
t
toggle display of processes and CPU states information.
c
toggle display of command name or full command line.
N
sort tasks by pid (numerically).
A
sort tasks by age (newest first).
P
sort tasks by CPU usage (default).
M
sort tasks by resident memory usage.
T
sort tasks by time / cumulative time.
W
Write current setup to ~/.toprc. This is the recommended way to write a top configuration file.
The Field and Order Screens
After pressing f, F, o or O you will be shown a screen specifying the field order on the top line and short descriptions of the field contents. The field order string uses the following syntax: If the letter in the filed string corresponding to a field is upper case, the field will be displayed. This is furthermore indicated by an asterisk in front of the field description. The order of the fields corresponds to the order of the letters in the string.
From the field select screen you can toggle the display of a field by pressing the corresponding letter.
From the order screen you may move a field to the left by pressing the corresponding upper case letter resp. to the right by pressing the lower case one.
Configuration Files
The command top reads it's default configuration from two files, /etc/toprc and ~/.toprc. The global configuration file may be used to restrict the usage of top to the secure mode for non-non-privileged users. If this is desired, the file should contain a 's' to specify secure mode and a digit d (2<=d<=9) for the default delay (in seconds) on a single line. The personal configuration file contains two lines. The first line contains lower and upper letters to specify which fields in what order are to be displayed. The letters correspond to the letters in the Fields or Order screens from top. As this is not very instructive, it is recommended to select fields and order in a running top process and to save this using the W interactive command.
The second line is more interesting (and important). It contains information on the other options. Most important, if you have saved a configuration in secure mode, you will not get an insecure top without removing the lower 's' from the second line of your ~/.toprc.
A digit specifies the delay time between updates, a capital 'S' cumulative mode, a lower 'i' no-idle mode, a capital 'I' Irix view. As in interactive mode, a lower 'm', 'l', and 't' suppresses the display of memory, uptime resp. process and CPU state information. Currently changing the default sorting order (by CPU usage) is not supported.
iostat - Top 20 DBA unix commands
iostat: Reports Central Processing Unit (CPU) statistics and input/output statistics for the entire system, adapters, tty devices, disks and CD-ROMs.
Syntax
iostat [ -s ] [ -a ] [ -d | -t ] [ PhysicalVolume ... ] [ Interval [ Count ] ]
Description
The iostat command is used for monitoring system input/output device loading by observing the time the physical disks are active in relation to their average transfer rates. The iostat command generates reports that can be used to change system configuration to better balance the input/output load between physical disks and adapters.
The first report generated by the iostat command provides statistics concerning the time since the system was booted. Each subsequent report covers the time since the previous report. All statistics are reported each time the iostat command is run. The report consists of a tty and CPU header row followed by a row of tty and CPU statistics. On multiprocessor systems, CPU statistics are calculated system-wide as averages among all processors.
If the -s flag is specified, a system-header row is displayed followed by a line of statistics for the entire system. The hostname of the system is printed in the system-header row.
If the -a flag is specified, an adapter-header row is displayed followed by a line of statistics for the adapter. This will be followed by a disk-header row and the statistics of all the disks/CD-ROMs connected to the adapter. Such reports are generated for all the disk adapters connected to the system.
A disks header row is displayed followed by a line of statistics for each disk that is configured. If the PhysicalVolume parameter is specified, only those names specified are displayed.
If the PhysicalVolume parameter is specified, one or more alphabetic or alphanumeric physical volumes can be specified. If the PhysicalVolume parameter is specified, the tty and CPU reports are displayed and the disk report contains statistics for the specified drives. If a specified logical drive name is not found, the report lists the specified name and displays the message Drive Not Found. If no Logical Drive Names are specified, the report contains statistics for all configured disks and CD-ROMs. If no drives are configured on the system, no disk report is generated. The first character in the PhysicalVolume parameter cannot be numeric.
The Interval parameter specifies the amount of time in seconds between each report. The first report contains statistics for the time since system startup (boot). Each subsequent report contains statistics collected during the interval since the previous report. The Count parameter can be specified in conjunction with the Interval parameter. If the Count parameter is specified, the value of count determines the number of reports generated at Interval seconds apart. If the Interval parameter is specified without the Count parameter, the iostat command generates reports continuously.
The iostat command is useful in determining whether a physical volume is becoming a performance bottleneck and if there is potential to improve the situation. The % utilization field for the physical volumes indicates how evenly the file activity is spread across the drives. A high % utilization on a physical volume is a good indication that there may be contention for this resource. Since the CPU utilization statistics are also available with the iostat report, the percentage of time the CPU is in I/O wait can be determined at the same time. Consider distributing data across drives if the I/O wait time is significant and the disk utilization is not evenly distributed across volumes.
Note: Some system resource is consumed in maintaining disk I/O history for the iostat command. Use the sysconfig subroutine, or the System Management Interface Tool (SMIT) to stop history accounting.
Reports
The iostat command generates four types of reports, the tty and CPU Utilization report, the Disk Utilization report, the System throughput report and the Adapter throughput report.
tty and CPU Utilization Report
The first report generated by the iostat command is the tty and CPU Utilization Report. For multiprocessor systems, the CPU values are global averages among all processors. Also, the I/O wait state is defined system-wide and not per processor. The report has the following format:
Column Description
tin Shows the total number of characters read by the system for all ttys.
tout Shows the total number of characters written by the system to all ttys.
% user Shows the percentage of CPU utilization that occurred while executing at the user level (application).
% sys Shows the percentage of CPU utilization that occurred while executing at the system level (kernel).
% idle Shows the percentage of time that the CPU or CPUs were idle and the system did not have an outstanding disk I/O request.
% iowait Shows the percentage of time that the CPU or CPUs were idle during which the system had an outstanding disk I/O request.
This information is updated at regular intervals by the kernel (typically sixty times per second). The tty report provides a collective account of characters per second received from all terminals on the system as well as the collective count of characters output per second to all terminals on the system.
Methods Used to Compute CPU Disk I/O Wait Time
Operating system version 4.3.3 and later contain enhancements to the method used to compute the percentage of CPU time spent waiting on disk I/O (wio time).The method used in AIX 4.3.2 and earlier versions of the operating system can, under certain circumstances, give an inflated view of wio time on SMPs. The wio time is reported by the commands sar (%wio), vmstat (wa) and iostat (% iowait).
The method used in AIX 4.3.2 and earlier versions is as follows: At each clock interrupt on each processor (100 times a second per processor), a determination is made as to which of the four categories (usr/sys/wio/idle) to place the last 10 ms of time. If the CPU was busy in usr mode at the time of the clock interrupt, then usr gets the clock tick added into its category. If the CPU was busy in kernel mode at the time of the clock interrupt, then the sys category gets the tick. If the CPU was not busy, a check is made to see if any I/O to disk is in progress. If any disk I/O is in progress, the wio category is incremented. If no disk I/O is in progress and the CPU is not busy, the idle category gets the tick. The inflated view of wio time results from all idle CPUs being categorized as wio regardless of the number of threads waiting on I/O. For example, systems with just one thread doing I/O could report over 90 percent wio time regardless of the number of CPUs it has.
The method used in AIX 4.3.3 and later is as follows: The change in operating system version 4.3.3 is to only mark an idle CPU as wio if an outstanding I/O was started on that CPU. This method can report much lower wio times when just a few threads are doing I/O and the system is otherwise idle. For example, a system with four CPUs and one thread doing I/O will report a maximum of 25 percent wio time. A system with 12 CPUs and one thread doing I/O will report a maximum of 8 percent wio time. NFS client reads/writes go through the VMM, and the time that biods spend in the VMM waiting for an I/O to complete is now reported as I/O wait time.
Disk Utilization Report
The second report generated by the iostat command is the Disk Utilization Report. The disk report provides statistics on a per physical disk basis. The report has a format similar to the following:
% tm_act Indicates the percentage of time the physical disk was active (bandwidth utilization for the drive).
Kbps Indicates the amount of data transferred (read or written) to the drive in KB per second.
tps Indicates the number of transfers per second that were issued to the physical disk. A transfer is an I/O request to the physical disk. Multiple logical requests can be combined into a single I/O request to the disk. A transfer is of indeterminate size.
Kb_read The total number of KB read.
Kb_wrtn The total number of KB written.
Statistics for CD-ROM devices are also reported.
For large system configurations where a large number of disks are configured, the system can be configured to avoid collecting physical disk input/output statistics when the iostat command is not executing. If the system is configured in the above manner, the first Disk report displays the message Disk History Since Boot Not Available instead of the disk statistics. Subsequent interval reports generated by the iostat command contain disk statistics collected during the report interval. Any tty and CPU statistics after boot are unaffected. If a system management command is used to re-enable disk statistics keeping, the first iostat command report displays activity from the interval starting at the point that disk input/output statistics were enabled.
System Throughput Report
This report is generated if the -s flag is specified. This report provides statistics for the entire system. This report has the following format:
Kbps Indicates the amount of data transferred (read or written) in the entire system in KB per second.
tps Indicates the number of transfers per second issued to the entire system.
Kb_read The total number of KB read from the entire system.
Kb_wrtn The total number of KB written to the entire system.
Adapter Throughput Report
This report is generated if the -a flag is specified. This report provides statistics on an adapter-by-adapter basis.This report has the following format:
Kbps Indicates the amount of data transferred (read or written) in the adapter in KB per second.
tps Indicates the number of transfers per second issued to the adapter.
Kb_read The total number of KB read from the adapter.
Kb_wrtn The total number of KB written to the adapter.
Disk Input/Output History
To improve performance on, the collection of disk input/output statistics has been disabled. To enable the collection of this data, enter:
chdev -l sys0 -a iostat=true
To display the current settings, enter:
lsattr -E -l sys0 -a iostat
If the collection of disk input/output history is disabled, the first disk report of iostat output displays the message Disk History Since Boot Not Available instead of disk statistics. As before, subsequent interval reports generated by the iostat command contain disk statistics collected during the report interval.
Flags
-a Displays the adapter throughput report.
-d The -d flag is exclusive of the -t flag and displays only the disk utilization report.
-s Displays the system throughput report.
-t The -t flag is exclusive of the -d flag and displays only the tty and cpu usage reports.
Notes:
* The -s and -a flags can both be specified to display both the system and adapter throughput reports.
* If the -a flag is specified with the -t flag, the tty and CPU report is displayed, followed by the adapter throughput report. Disk Utilization reports of the disks connected to the adapters, will not be displayed after the Adapter throughput report.
* If the -a flag is specified with the -d flag, tty and CPU report will not be displayed. If Physical Volume parameter is specified, the Disk Utilization Report of the specified Physical volume will be printed under the corresponding adapter to which it belongs.
Examples
1. To display a single history since boot report for all tty, CPU, and Disks, enter:
iostat
2. To display a continuous disk report at two second intervals for the disk with the logical name disk1, enter:
iostat -d disk1 2
3. To display six reports at two second intervals for the disk with the logical name disk1, enter:
iostat disk1 2 6
4. To display six reports at two second intervals for all disks, enter:
iostat -d 2 6
5. To display six reports at two second intervals for three disks named disk1, disk2, disk3, enter:
iostat disk1 disk2 disk3 2 6
6. To print the System throughput report, enter:
iostat -s
7. To print the Adapter throughput report, enter:
iostat -a
8. To print the System and Adapter throughput reports, with only the tty and CPU report (no disk reports), enter:
iostat -sat
9. To print the System and Adapter throughput reports with the Disk Utilization reports of hdisk0 and hdisk7, enter
iostat -sad hdisk0 hdisk7
File
/usr/bin/iostat Contains the iostat command.
Syntax
iostat [ -s ] [ -a ] [ -d | -t ] [ PhysicalVolume ... ] [ Interval [ Count ] ]
Description
The iostat command is used for monitoring system input/output device loading by observing the time the physical disks are active in relation to their average transfer rates. The iostat command generates reports that can be used to change system configuration to better balance the input/output load between physical disks and adapters.
The first report generated by the iostat command provides statistics concerning the time since the system was booted. Each subsequent report covers the time since the previous report. All statistics are reported each time the iostat command is run. The report consists of a tty and CPU header row followed by a row of tty and CPU statistics. On multiprocessor systems, CPU statistics are calculated system-wide as averages among all processors.
If the -s flag is specified, a system-header row is displayed followed by a line of statistics for the entire system. The hostname of the system is printed in the system-header row.
If the -a flag is specified, an adapter-header row is displayed followed by a line of statistics for the adapter. This will be followed by a disk-header row and the statistics of all the disks/CD-ROMs connected to the adapter. Such reports are generated for all the disk adapters connected to the system.
A disks header row is displayed followed by a line of statistics for each disk that is configured. If the PhysicalVolume parameter is specified, only those names specified are displayed.
If the PhysicalVolume parameter is specified, one or more alphabetic or alphanumeric physical volumes can be specified. If the PhysicalVolume parameter is specified, the tty and CPU reports are displayed and the disk report contains statistics for the specified drives. If a specified logical drive name is not found, the report lists the specified name and displays the message Drive Not Found. If no Logical Drive Names are specified, the report contains statistics for all configured disks and CD-ROMs. If no drives are configured on the system, no disk report is generated. The first character in the PhysicalVolume parameter cannot be numeric.
The Interval parameter specifies the amount of time in seconds between each report. The first report contains statistics for the time since system startup (boot). Each subsequent report contains statistics collected during the interval since the previous report. The Count parameter can be specified in conjunction with the Interval parameter. If the Count parameter is specified, the value of count determines the number of reports generated at Interval seconds apart. If the Interval parameter is specified without the Count parameter, the iostat command generates reports continuously.
The iostat command is useful in determining whether a physical volume is becoming a performance bottleneck and if there is potential to improve the situation. The % utilization field for the physical volumes indicates how evenly the file activity is spread across the drives. A high % utilization on a physical volume is a good indication that there may be contention for this resource. Since the CPU utilization statistics are also available with the iostat report, the percentage of time the CPU is in I/O wait can be determined at the same time. Consider distributing data across drives if the I/O wait time is significant and the disk utilization is not evenly distributed across volumes.
Note: Some system resource is consumed in maintaining disk I/O history for the iostat command. Use the sysconfig subroutine, or the System Management Interface Tool (SMIT) to stop history accounting.
Reports
The iostat command generates four types of reports, the tty and CPU Utilization report, the Disk Utilization report, the System throughput report and the Adapter throughput report.
tty and CPU Utilization Report
The first report generated by the iostat command is the tty and CPU Utilization Report. For multiprocessor systems, the CPU values are global averages among all processors. Also, the I/O wait state is defined system-wide and not per processor. The report has the following format:
Column Description
tin Shows the total number of characters read by the system for all ttys.
tout Shows the total number of characters written by the system to all ttys.
% user Shows the percentage of CPU utilization that occurred while executing at the user level (application).
% sys Shows the percentage of CPU utilization that occurred while executing at the system level (kernel).
% idle Shows the percentage of time that the CPU or CPUs were idle and the system did not have an outstanding disk I/O request.
% iowait Shows the percentage of time that the CPU or CPUs were idle during which the system had an outstanding disk I/O request.
This information is updated at regular intervals by the kernel (typically sixty times per second). The tty report provides a collective account of characters per second received from all terminals on the system as well as the collective count of characters output per second to all terminals on the system.
Methods Used to Compute CPU Disk I/O Wait Time
Operating system version 4.3.3 and later contain enhancements to the method used to compute the percentage of CPU time spent waiting on disk I/O (wio time).The method used in AIX 4.3.2 and earlier versions of the operating system can, under certain circumstances, give an inflated view of wio time on SMPs. The wio time is reported by the commands sar (%wio), vmstat (wa) and iostat (% iowait).
The method used in AIX 4.3.2 and earlier versions is as follows: At each clock interrupt on each processor (100 times a second per processor), a determination is made as to which of the four categories (usr/sys/wio/idle) to place the last 10 ms of time. If the CPU was busy in usr mode at the time of the clock interrupt, then usr gets the clock tick added into its category. If the CPU was busy in kernel mode at the time of the clock interrupt, then the sys category gets the tick. If the CPU was not busy, a check is made to see if any I/O to disk is in progress. If any disk I/O is in progress, the wio category is incremented. If no disk I/O is in progress and the CPU is not busy, the idle category gets the tick. The inflated view of wio time results from all idle CPUs being categorized as wio regardless of the number of threads waiting on I/O. For example, systems with just one thread doing I/O could report over 90 percent wio time regardless of the number of CPUs it has.
The method used in AIX 4.3.3 and later is as follows: The change in operating system version 4.3.3 is to only mark an idle CPU as wio if an outstanding I/O was started on that CPU. This method can report much lower wio times when just a few threads are doing I/O and the system is otherwise idle. For example, a system with four CPUs and one thread doing I/O will report a maximum of 25 percent wio time. A system with 12 CPUs and one thread doing I/O will report a maximum of 8 percent wio time. NFS client reads/writes go through the VMM, and the time that biods spend in the VMM waiting for an I/O to complete is now reported as I/O wait time.
Disk Utilization Report
The second report generated by the iostat command is the Disk Utilization Report. The disk report provides statistics on a per physical disk basis. The report has a format similar to the following:
% tm_act Indicates the percentage of time the physical disk was active (bandwidth utilization for the drive).
Kbps Indicates the amount of data transferred (read or written) to the drive in KB per second.
tps Indicates the number of transfers per second that were issued to the physical disk. A transfer is an I/O request to the physical disk. Multiple logical requests can be combined into a single I/O request to the disk. A transfer is of indeterminate size.
Kb_read The total number of KB read.
Kb_wrtn The total number of KB written.
Statistics for CD-ROM devices are also reported.
For large system configurations where a large number of disks are configured, the system can be configured to avoid collecting physical disk input/output statistics when the iostat command is not executing. If the system is configured in the above manner, the first Disk report displays the message Disk History Since Boot Not Available instead of the disk statistics. Subsequent interval reports generated by the iostat command contain disk statistics collected during the report interval. Any tty and CPU statistics after boot are unaffected. If a system management command is used to re-enable disk statistics keeping, the first iostat command report displays activity from the interval starting at the point that disk input/output statistics were enabled.
System Throughput Report
This report is generated if the -s flag is specified. This report provides statistics for the entire system. This report has the following format:
Kbps Indicates the amount of data transferred (read or written) in the entire system in KB per second.
tps Indicates the number of transfers per second issued to the entire system.
Kb_read The total number of KB read from the entire system.
Kb_wrtn The total number of KB written to the entire system.
Adapter Throughput Report
This report is generated if the -a flag is specified. This report provides statistics on an adapter-by-adapter basis.This report has the following format:
Kbps Indicates the amount of data transferred (read or written) in the adapter in KB per second.
tps Indicates the number of transfers per second issued to the adapter.
Kb_read The total number of KB read from the adapter.
Kb_wrtn The total number of KB written to the adapter.
Disk Input/Output History
To improve performance on, the collection of disk input/output statistics has been disabled. To enable the collection of this data, enter:
chdev -l sys0 -a iostat=true
To display the current settings, enter:
lsattr -E -l sys0 -a iostat
If the collection of disk input/output history is disabled, the first disk report of iostat output displays the message Disk History Since Boot Not Available instead of disk statistics. As before, subsequent interval reports generated by the iostat command contain disk statistics collected during the report interval.
Flags
-a Displays the adapter throughput report.
-d The -d flag is exclusive of the -t flag and displays only the disk utilization report.
-s Displays the system throughput report.
-t The -t flag is exclusive of the -d flag and displays only the tty and cpu usage reports.
Notes:
* The -s and -a flags can both be specified to display both the system and adapter throughput reports.
* If the -a flag is specified with the -t flag, the tty and CPU report is displayed, followed by the adapter throughput report. Disk Utilization reports of the disks connected to the adapters, will not be displayed after the Adapter throughput report.
* If the -a flag is specified with the -d flag, tty and CPU report will not be displayed. If Physical Volume parameter is specified, the Disk Utilization Report of the specified Physical volume will be printed under the corresponding adapter to which it belongs.
Examples
1. To display a single history since boot report for all tty, CPU, and Disks, enter:
iostat
2. To display a continuous disk report at two second intervals for the disk with the logical name disk1, enter:
iostat -d disk1 2
3. To display six reports at two second intervals for the disk with the logical name disk1, enter:
iostat disk1 2 6
4. To display six reports at two second intervals for all disks, enter:
iostat -d 2 6
5. To display six reports at two second intervals for three disks named disk1, disk2, disk3, enter:
iostat disk1 disk2 disk3 2 6
6. To print the System throughput report, enter:
iostat -s
7. To print the Adapter throughput report, enter:
iostat -a
8. To print the System and Adapter throughput reports, with only the tty and CPU report (no disk reports), enter:
iostat -sat
9. To print the System and Adapter throughput reports with the Disk Utilization reports of hdisk0 and hdisk7, enter
iostat -sad hdisk0 hdisk7
File
/usr/bin/iostat Contains the iostat command.
vmstat - Top 20 DBA Unix commands
VMSTAT: Reports virtual memory statistics.
Syntax
vmstat [ -f ] [ -i ] [ -s ] [ -I ] [ -t ] [ PhysicalVolume ... ] [ Interval [ Count ] ]
Description
The vmstat command reports statistics about kernel threads, virtual memory, disks, traps and CPU activity. Reports generated by the vmstat command can be used to balance system load activity. These system-wide statistics (among all processors) are calculated as averages for values expressed as percentages, and as sums otherwise.
If the vmstat command is invoked without flags, the report contains a summary of the virtual memory activity since system startup. If the -f flag is specified, the vmstat command reports the number of forks since system startup. The PhysicalVolume parameter specifies the name of the physical volume.
The Interval parameter specifies the amount of time in seconds between each report. The first report contains statistics for the time since system startup. Subsequent reports contain statistics collected during the interval since the previous report. If the Interval parameter is not specified, the vmstat command generates a single report and then exits. The Count parameter can only be specified with the Interval parameter. If the Count parameter is specified, its value determines the number of reports generated and the number of seconds apart. If the Interval parameter is specified without the Count parameter, reports are continuously generated. A Count parameter of 0 is not allowed.
AIX 4.3.3 and later contain enhancements to the method used to compute the percentage of CPU time spent waiting on disk I/O (wio time).The method used in AIX 4.3.2 and earlier versions of the operating system can, under certain circumstances, give an inflated view of wio time on SMPs.
The method used in AIX 4.3.2 and earlier versions is as follows: At each clock interrupt on each processor (100 times a second per processor), a determination is made as to which of the four categories (usr/sys/wio/idle) to place the last 10 ms of time. If the CPU was busy in usr mode at the time of the clock interrupt, then usr gets the clock tick added into its category. If the CPU was busy in kernel mode at the time of the clock interrupt, then the sys category gets the tick. If the CPU was not busy, a check is made to see if any I/O to disk is in progress. If any disk I/O is in progress, the wio category is incremented. If no disk I/O is in progress and the CPU is not busy, the idle category gets the tick. The inflated view of wio time results from all idle CPUs being categorized as wio regardless of the number of threads waiting on I/O. For example, systems with just one thread doing I/O could report over 90 percent wio time regardless of the number of CPUs it has. The wio time is reported by the commands sar (%wio), vmstat (wa) and iostat (% iowait).
The method used in operating system AIX 4.3.3 and later is as follows: The change in operating system AIX 4.3.3 is to only mark an idle CPU as wio if an outstanding I/O was started on that CPU. This method can report much lower wio times when just a few threads are doing I/O and the system is otherwise idle. For example, a system with four CPUs and one thread doing I/O will report a maximum of 25 percent wio time. A system with 12 CPUs and one thread doing I/O will report a maximum of 8 percent wio time. NFS client reads/writes go through the VMM, and the time that biods spend in the VMM waiting for an I/O to complete is now reported as I/O wait time.
The kernel maintains statistics for kernel threads, paging, and interrupt activity, which the vmstat command accesses through the use of the knlist subroutine and the /dev/kmem pseudo-device driver. The disk input/output statistics are maintained by device drivers. For disks, the average transfer rate is determined by using the active time and number of transfers information. The percent active time is computed from the amount of time the drive is busy during the report.
The following example of a report generated by the vmstat command contains the column headings and their description:
kthr: kernel thread state changes per second over the sampling interval.
r Number of kernel threads placed in run queue.
b Number of kernel threads placed in wait queue (awaiting resource, awaiting input/output).
Memory: information about the usage of virtual and real memory. Virtual pages are considered active if they have been accessed. A page is 4096 bytes.
avm Active virtual pages.
fre Size of the free list.
Note: A large portion of real memory is utilized as a cache for file system data. It is not unusual for the size of the free list to remain small.
Page: information about page faults and paging activity. These are averaged over the interval and given in units per second.
re Pager input/output list.
pi Pages paged in from paging space.
po Pages paged out to paging space.
fr Pages freed (page replacement).
sr Pages scanned by page-replacement algorithm.
cy Clock cycles by page-replacement algorithm.
Faults: trap and interrupt rate averages per second over the sampling interval.
in Device interrupts.
sy System calls.
cs Kernel thread context switches.
Cpu: breakdown of percentage usage of CPU time.
us User time.
sy System time.
id CPU idle time.
wa CPU idle time during which the system had outstanding disk/NFS I/O request(s). See detailed description above.
Disk: Provides the number of transfers per second to the specified physical volumes that occurred in the sample interval. The PhysicalVolume parameter can be used to specify one to four names. Transfer statistics are given for each specified drive in the order specified. This count represents requests to the physical device. It does not imply an amount of data that was read or written. Several logical requests can be combined into one physical request.
If the -I flag is specified, an I/O oriented view is presented with the following column changes.
kthr The column p will also be displayed besides columns r and b.
p
Number of threads waiting on actual physical I/O per second.
page New columns fi and fo will be displayed instead of re and cy columns.
fi
File page-ins per second.
fo
File page-outs per second.
Flags
Note: Both the -f and -s flags can be entered on the command line, but the system will only accept the first flag specified and override the second flag.
-f Reports the number of forks since system startup.
-i Displays the number of interrupts taken by each device since system startup.
-I Displays I/O oriented view with the new columns of output, p under heading kthr,and columns fi and fo under heading page instead of the columns re and cy in the page heading.
-s Writes to standard output the contents of the sum structure, which contains an absolute count of paging events since system initialization. The -s option is exclusive of the other vmstat command options. These events are described as follows:
address translation faults
Incremented for each occurrence of an address translation page fault. I/O may or may not be required to resolve the page fault. Storage protection page faults (lock misses) are not included in this count.
page ins
Incremented for each page read in by the virtual memory manager. The count is incremented for page ins from page space and file space. Along with the page out statistic, this represents the total amount of real I/O initiated by the virtual memory manager.
page outs
Incremented for each page written out by the virtual memory manager. The count is incremented for page outs to page space and for page outs to file space. Along with the page in statistic, this represents the total amount of real I/O initiated by the virtual memory manager.
paging space page ins
Incremented for VMM initiated page ins from paging space only.
paging space page outs
Incremented for VMM initiated page outs to paging space only.
total reclaims
Incremented when an address translation fault can be satisfied without initiating a new I/O request. This can occur if the page has been previously requested by VMM, but the I/O has not yet completed; or if the page was pre-fetched by VMM's read-ahead algorithm, but was hidden from the faulting segment; or if the page has been put on the free list and has not yet been reused.
zero-filled page faults
Incremented if the page fault is to working storage and can be satisfied by assigning a frame and zero-filling it.
executable-filled page faults
Incremented for each instruction page fault.
pages examined by the clock
VMM uses a clock-algorithm to implement a pseudo least recently used (lru) page replacement scheme. Pages are aged by being examined by the clock. This count is incremented for each page examined by the clock.
revolutions of the clock hand
Incremented for each VMM clock revolution (that is, after each complete scan of memory).
pages freed by the clock
Incremented for each page the clock algorithm selects to free from real memory.
backtracks
Incremented for each page fault that occurs while resolving a previous page fault. (The new page fault must be resolved first and then initial page faults can be backtracked.)
lock misses
VMM enforces locks for concurrency by removing addressability to a page. A page fault can occur due to a lock miss, and this count is incremented for each such occurrence.
free frame waits
Incremented each time a process is waited by VMM while free frames are gathered.
extend XPT waits
Incremented each time a process is waited by VMM due to a commit in progress for the segment being accessed.
pending I/O waits
Incremented each time a process is waited by VMM for a page-in I/O to complete.
start I/Os
Incremented for each read or write I/O request initiated by VMM. This count should equal the sum of page-ins and page-outs.
iodones
Incremented at the completion of each VMM I/O request.
CPU context switches
Incremented for each CPU context switch (dispatch of a new process).
device interrupts
Incremented on each hardware interrupt.
software interrupts
Incremented on each software interrupt. A software interrupt is a machine instruction similar to a hardware interrupt that saves some state and branches to a service routine. System calls are implemented with software interrupt instructions that branch to the system call handler routine.
traps
Not maintained by the operating system.
syscalls
Incremented for each system call.
-t Prints the time-stamp next to each line of output of vmstat. The time-stamp is displayed in the HH:MM:SS format.
Note: Time stamp will not be printed if -f, -s, or -i flags are specified.
Examples
1. To display a summary of the statistics since boot, enter:
vmstat
2. To display five summaries at 2-second intervals, enter:
vmstat 2 5
The first summary contains statistics for the time since boot.
3. To display a summary of the statistics since boot including statistics for logical disks scdisk13 and scdisk14, enter:
vmstat scdisk13 scdisk14
4. To display fork statistics, enter:
vmstat -f
5. To display the count of various events, enter:
vmstat -s
6. To display time-stamp next to each column of output of vmstat, enter:
vmstat -t
7. To display the new I/O oriented view with an alternative set of columns, enter:
vmstat -I
Files
/unix Symbolic link to the kernel boot image.
/usr/bin/vmstat Contains the vmstat command.
Syntax
vmstat [ -f ] [ -i ] [ -s ] [ -I ] [ -t ] [ PhysicalVolume ... ] [ Interval [ Count ] ]
Description
The vmstat command reports statistics about kernel threads, virtual memory, disks, traps and CPU activity. Reports generated by the vmstat command can be used to balance system load activity. These system-wide statistics (among all processors) are calculated as averages for values expressed as percentages, and as sums otherwise.
If the vmstat command is invoked without flags, the report contains a summary of the virtual memory activity since system startup. If the -f flag is specified, the vmstat command reports the number of forks since system startup. The PhysicalVolume parameter specifies the name of the physical volume.
The Interval parameter specifies the amount of time in seconds between each report. The first report contains statistics for the time since system startup. Subsequent reports contain statistics collected during the interval since the previous report. If the Interval parameter is not specified, the vmstat command generates a single report and then exits. The Count parameter can only be specified with the Interval parameter. If the Count parameter is specified, its value determines the number of reports generated and the number of seconds apart. If the Interval parameter is specified without the Count parameter, reports are continuously generated. A Count parameter of 0 is not allowed.
AIX 4.3.3 and later contain enhancements to the method used to compute the percentage of CPU time spent waiting on disk I/O (wio time).The method used in AIX 4.3.2 and earlier versions of the operating system can, under certain circumstances, give an inflated view of wio time on SMPs.
The method used in AIX 4.3.2 and earlier versions is as follows: At each clock interrupt on each processor (100 times a second per processor), a determination is made as to which of the four categories (usr/sys/wio/idle) to place the last 10 ms of time. If the CPU was busy in usr mode at the time of the clock interrupt, then usr gets the clock tick added into its category. If the CPU was busy in kernel mode at the time of the clock interrupt, then the sys category gets the tick. If the CPU was not busy, a check is made to see if any I/O to disk is in progress. If any disk I/O is in progress, the wio category is incremented. If no disk I/O is in progress and the CPU is not busy, the idle category gets the tick. The inflated view of wio time results from all idle CPUs being categorized as wio regardless of the number of threads waiting on I/O. For example, systems with just one thread doing I/O could report over 90 percent wio time regardless of the number of CPUs it has. The wio time is reported by the commands sar (%wio), vmstat (wa) and iostat (% iowait).
The method used in operating system AIX 4.3.3 and later is as follows: The change in operating system AIX 4.3.3 is to only mark an idle CPU as wio if an outstanding I/O was started on that CPU. This method can report much lower wio times when just a few threads are doing I/O and the system is otherwise idle. For example, a system with four CPUs and one thread doing I/O will report a maximum of 25 percent wio time. A system with 12 CPUs and one thread doing I/O will report a maximum of 8 percent wio time. NFS client reads/writes go through the VMM, and the time that biods spend in the VMM waiting for an I/O to complete is now reported as I/O wait time.
The kernel maintains statistics for kernel threads, paging, and interrupt activity, which the vmstat command accesses through the use of the knlist subroutine and the /dev/kmem pseudo-device driver. The disk input/output statistics are maintained by device drivers. For disks, the average transfer rate is determined by using the active time and number of transfers information. The percent active time is computed from the amount of time the drive is busy during the report.
The following example of a report generated by the vmstat command contains the column headings and their description:
kthr: kernel thread state changes per second over the sampling interval.
r Number of kernel threads placed in run queue.
b Number of kernel threads placed in wait queue (awaiting resource, awaiting input/output).
Memory: information about the usage of virtual and real memory. Virtual pages are considered active if they have been accessed. A page is 4096 bytes.
avm Active virtual pages.
fre Size of the free list.
Note: A large portion of real memory is utilized as a cache for file system data. It is not unusual for the size of the free list to remain small.
Page: information about page faults and paging activity. These are averaged over the interval and given in units per second.
re Pager input/output list.
pi Pages paged in from paging space.
po Pages paged out to paging space.
fr Pages freed (page replacement).
sr Pages scanned by page-replacement algorithm.
cy Clock cycles by page-replacement algorithm.
Faults: trap and interrupt rate averages per second over the sampling interval.
in Device interrupts.
sy System calls.
cs Kernel thread context switches.
Cpu: breakdown of percentage usage of CPU time.
us User time.
sy System time.
id CPU idle time.
wa CPU idle time during which the system had outstanding disk/NFS I/O request(s). See detailed description above.
Disk: Provides the number of transfers per second to the specified physical volumes that occurred in the sample interval. The PhysicalVolume parameter can be used to specify one to four names. Transfer statistics are given for each specified drive in the order specified. This count represents requests to the physical device. It does not imply an amount of data that was read or written. Several logical requests can be combined into one physical request.
If the -I flag is specified, an I/O oriented view is presented with the following column changes.
kthr The column p will also be displayed besides columns r and b.
p
Number of threads waiting on actual physical I/O per second.
page New columns fi and fo will be displayed instead of re and cy columns.
fi
File page-ins per second.
fo
File page-outs per second.
Flags
Note: Both the -f and -s flags can be entered on the command line, but the system will only accept the first flag specified and override the second flag.
-f Reports the number of forks since system startup.
-i Displays the number of interrupts taken by each device since system startup.
-I Displays I/O oriented view with the new columns of output, p under heading kthr,and columns fi and fo under heading page instead of the columns re and cy in the page heading.
-s Writes to standard output the contents of the sum structure, which contains an absolute count of paging events since system initialization. The -s option is exclusive of the other vmstat command options. These events are described as follows:
address translation faults
Incremented for each occurrence of an address translation page fault. I/O may or may not be required to resolve the page fault. Storage protection page faults (lock misses) are not included in this count.
page ins
Incremented for each page read in by the virtual memory manager. The count is incremented for page ins from page space and file space. Along with the page out statistic, this represents the total amount of real I/O initiated by the virtual memory manager.
page outs
Incremented for each page written out by the virtual memory manager. The count is incremented for page outs to page space and for page outs to file space. Along with the page in statistic, this represents the total amount of real I/O initiated by the virtual memory manager.
paging space page ins
Incremented for VMM initiated page ins from paging space only.
paging space page outs
Incremented for VMM initiated page outs to paging space only.
total reclaims
Incremented when an address translation fault can be satisfied without initiating a new I/O request. This can occur if the page has been previously requested by VMM, but the I/O has not yet completed; or if the page was pre-fetched by VMM's read-ahead algorithm, but was hidden from the faulting segment; or if the page has been put on the free list and has not yet been reused.
zero-filled page faults
Incremented if the page fault is to working storage and can be satisfied by assigning a frame and zero-filling it.
executable-filled page faults
Incremented for each instruction page fault.
pages examined by the clock
VMM uses a clock-algorithm to implement a pseudo least recently used (lru) page replacement scheme. Pages are aged by being examined by the clock. This count is incremented for each page examined by the clock.
revolutions of the clock hand
Incremented for each VMM clock revolution (that is, after each complete scan of memory).
pages freed by the clock
Incremented for each page the clock algorithm selects to free from real memory.
backtracks
Incremented for each page fault that occurs while resolving a previous page fault. (The new page fault must be resolved first and then initial page faults can be backtracked.)
lock misses
VMM enforces locks for concurrency by removing addressability to a page. A page fault can occur due to a lock miss, and this count is incremented for each such occurrence.
free frame waits
Incremented each time a process is waited by VMM while free frames are gathered.
extend XPT waits
Incremented each time a process is waited by VMM due to a commit in progress for the segment being accessed.
pending I/O waits
Incremented each time a process is waited by VMM for a page-in I/O to complete.
start I/Os
Incremented for each read or write I/O request initiated by VMM. This count should equal the sum of page-ins and page-outs.
iodones
Incremented at the completion of each VMM I/O request.
CPU context switches
Incremented for each CPU context switch (dispatch of a new process).
device interrupts
Incremented on each hardware interrupt.
software interrupts
Incremented on each software interrupt. A software interrupt is a machine instruction similar to a hardware interrupt that saves some state and branches to a service routine. System calls are implemented with software interrupt instructions that branch to the system call handler routine.
traps
Not maintained by the operating system.
syscalls
Incremented for each system call.
-t Prints the time-stamp next to each line of output of vmstat. The time-stamp is displayed in the HH:MM:SS format.
Note: Time stamp will not be printed if -f, -s, or -i flags are specified.
Examples
1. To display a summary of the statistics since boot, enter:
vmstat
2. To display five summaries at 2-second intervals, enter:
vmstat 2 5
The first summary contains statistics for the time since boot.
3. To display a summary of the statistics since boot including statistics for logical disks scdisk13 and scdisk14, enter:
vmstat scdisk13 scdisk14
4. To display fork statistics, enter:
vmstat -f
5. To display the count of various events, enter:
vmstat -s
6. To display time-stamp next to each column of output of vmstat, enter:
vmstat -t
7. To display the new I/O oriented view with an alternative set of columns, enter:
vmstat -I
Files
/unix Symbolic link to the kernel boot image.
/usr/bin/vmstat Contains the vmstat command.
Find how much memory is in your unix machine/server?
Solaris
1. dmesg | grep mem
2. prtdiag | grep Memory
3. prtconf -v | grep Memory
AIX
1. bootinfo -r
2. lsattr -E1 sys0 -a realmem
3. getconf REAL_MEMORY
HPUX
1. dmesg | grep Physical
2. /opt/ignite/bin/print_manifest | grep Memory
3. machinfo | grep Memory
Linux
1. dmesg | grep Memory
2. grep -i memtotal /proc/meminfo
3. free
OpenVMS
1. show mem /page
FreeBSD
1. dmesg | grep memory
2. grep memory /var/run/dmesg.boot
3. sysctl -a | grep mem
1. dmesg | grep mem
2. prtdiag | grep Memory
3. prtconf -v | grep Memory
AIX
1. bootinfo -r
2. lsattr -E1 sys0 -a realmem
3. getconf REAL_MEMORY
HPUX
1. dmesg | grep Physical
2. /opt/ignite/bin/print_manifest | grep Memory
3. machinfo | grep Memory
Linux
1. dmesg | grep Memory
2. grep -i memtotal /proc/meminfo
3. free
OpenVMS
1. show mem /page
FreeBSD
1. dmesg | grep memory
2. grep memory /var/run/dmesg.boot
3. sysctl -a | grep mem
Difference between Session, Connection and Process
A connection is a physical circuit between you and the database. A connection might be
one of many types -- most popular begin DEDICATED server and SHARED server. Zero, one or
more sessions may be established over a given connection to the database as show above
with sqlplus. A process will be used by a session to execute statements. Sometimes
there is a one to one relationship between CONNECTION->SESSION->PROCESS (eg: a normal
dedicated server connection). Sometimes there is a one to many from connection to
sessions (eg: like autotrace, one connection, two sessions, one process). A process does
not have to be dedicated to a specific connection or session however, for example when
using shared server (MTS), your SESSION will grab a process from a pool of processes in
order to execute a statement. When the call is over, that process is released back to
the pool of processes.
one of many types -- most popular begin DEDICATED server and SHARED server. Zero, one or
more sessions may be established over a given connection to the database as show above
with sqlplus. A process will be used by a session to execute statements. Sometimes
there is a one to one relationship between CONNECTION->SESSION->PROCESS (eg: a normal
dedicated server connection). Sometimes there is a one to many from connection to
sessions (eg: like autotrace, one connection, two sessions, one process). A process does
not have to be dedicated to a specific connection or session however, for example when
using shared server (MTS), your SESSION will grab a process from a pool of processes in
order to execute a statement. When the call is over, that process is released back to
the pool of processes.
Wednesday, October 15, 2008
Moving data files
Moving or renaming a datafile while the database is in the MOUNT stage requires the use of the ALTER DATABASE
command. When using the ALTER DATABASE method to move datafiles, the datafile is moved after the instance is shut
down. A summary of the steps involved follows:
1. Shutdown the instance
2. Use operating system commands to move or rename the files(s).
3. Mount the database and use the ALTER DATABASE to rename the file within the database.
4. Opening the Database
% sqlplus "/ as sysdba"
SQL> shutdown immediate
SQL> !mv /u05/app/oradata/ORA920/indx01.dbf /u06/app/oradata/ORA920/indx01.dbf
SQL> startup mount
SQL> alter database rename file '/u05/app/oradata/ORA920/indx01.dbf' to '/u06/app/oradata/ORA920/indx01.dbf';
Do not disconnect after this step. Stay logged in
and proceed to open the database!
SQL> alter database open;
SQL> exit
Moving or renaming a datafile while the database is in the 'OPEN' stage requires the use of the ALTER TABLESPACE
command. When using the ALTER TABLESPACE method to move datafiles, the datafile is moved while the instance is
running. A summary of the steps involved follows:
1. Take the tablespace OFFLINE.
2. Use operating system commands to move or rename the file(s).
3. Use the ALTER TABLESPACE command to rename the file within the database.
4. Bring the tablespace back ONLINE.
NOTE: This method can only be used for non-SYSTEM tablespaces. It also cannot be used for tablespaces that contain
active ROLLBACK segments or TEMPORARY segments.
% sqlplus "/ as sysdba"
SQL> alter tablespace INDX offline;
SQL> !mv /u05/app/oradata/ORA920/indx01.dbf /u06/app/oradata/ORA920/indx01.dbf
SQL> alter tablespace INDX
2 rename datafile '/u05/app/oradata/ORA920/indx01.dbf' to '/u06/app/oradata/ORA920/indx01.dbf';
Do not disconnect after this step. Stay logged in
and proceed to bring the tablespace back online!
SQL> alter tablespace INDX online;
SQL> exit
command. When using the ALTER DATABASE method to move datafiles, the datafile is moved after the instance is shut
down. A summary of the steps involved follows:
1. Shutdown the instance
2. Use operating system commands to move or rename the files(s).
3. Mount the database and use the ALTER DATABASE to rename the file within the database.
4. Opening the Database
% sqlplus "/ as sysdba"
SQL> shutdown immediate
SQL> !mv /u05/app/oradata/ORA920/indx01.dbf /u06/app/oradata/ORA920/indx01.dbf
SQL> startup mount
SQL> alter database rename file '/u05/app/oradata/ORA920/indx01.dbf' to '/u06/app/oradata/ORA920/indx01.dbf';
Do not disconnect after this step. Stay logged in
and proceed to open the database!
SQL> alter database open;
SQL> exit
Moving or renaming a datafile while the database is in the 'OPEN' stage requires the use of the ALTER TABLESPACE
command. When using the ALTER TABLESPACE method to move datafiles, the datafile is moved while the instance is
running. A summary of the steps involved follows:
1. Take the tablespace OFFLINE.
2. Use operating system commands to move or rename the file(s).
3. Use the ALTER TABLESPACE command to rename the file within the database.
4. Bring the tablespace back ONLINE.
NOTE: This method can only be used for non-SYSTEM tablespaces. It also cannot be used for tablespaces that contain
active ROLLBACK segments or TEMPORARY segments.
% sqlplus "/ as sysdba"
SQL> alter tablespace INDX offline;
SQL> !mv /u05/app/oradata/ORA920/indx01.dbf /u06/app/oradata/ORA920/indx01.dbf
SQL> alter tablespace INDX
2 rename datafile '/u05/app/oradata/ORA920/indx01.dbf' to '/u06/app/oradata/ORA920/indx01.dbf';
Do not disconnect after this step. Stay logged in
and proceed to bring the tablespace back online!
SQL> alter tablespace INDX online;
SQL> exit
Subscribe to:
Posts (Atom)
Blog Archive
-
▼
2008
(277)
-
▼
October
(14)
- Using Function-based Indexes for Performance
- Choosing Composite Indexes
- Choosing Columns and Expressions to Index
- Finding Locks and Lock Holders
- TM enqueue
- HW enqueue
- ST enqueue
- TX enqueue
- sar - Top 20 DBA unix commands
- iostat - Top 20 DBA unix commands
- vmstat - Top 20 DBA Unix commands
- Find how much memory is in your unix machine/server?
- Difference between Session, Connection and Process
- Moving data files
-
▼
October
(14)