|
1. What is PBXT?
PrimeBase XT (PBXT) is a transactional storage engine for MySQL.
MySQL 5.1 provides a pluggable storage engine API which allows 3rd parties to extend the storage capabilities of the server. Storage engines are built as shared libraries and can be loaded at runtime by the server. Most storage engines provide an alternative way of storing the table data. By doing this they add functionality to the server in terms of transactional support, special indexing methods, data warehousing and archiving facilities or even clustering and high-availability capabilities.
To use a particular engine, you specify the name of the engine when creating a table in MySQL. For example:
CREATE TABLE my_test_tab (
n_id INTEGER PRIMARY KEY AUTO_INCREMENT,
n_text LONGBLOB,
n_string VARCHAR(255)
) ENGINE=PBXT;
This example creates a table using the PBXT storage engine.
The PBXT engine is a general purpose transactional storage engine. PBXT is fully "ACID" compliant. This means PBXT can be used as a replacement for other MySQL transactional engines, such as InnoDB.
PBXT features include the following:
- MVCC Support:
MVCC stands for Multi-version Concurrency Control. MVCC allows reading the database without locking.
- Fully ACID compliant:
This means that all transactions are: atomic, consistent, isolated and durable.
- Row-level locking:
When updating, PBXT uses row-level locking. Row-level locking is also used during SELECT FOR UPDATE.
- Fast Rollback and Recovery:
PBXT uses a specialized method to identify garbage which makes "undo" unnecessary. This make both rollback of transactions and recovery after restart very fast.
- Deadlock Detection:
PBXT identifies all kinds of deadlocks immediately.
- Write-once:
PBXT uses a log-based storage which makes it possible to write transactional data directly to the database, without first being written to the transaction log.
- Referential Integrity:
PBXT supports foreign key definitions, including cascaded updates and deletes.
- BLOB streaming:
In combination with the BLOB Streaming engine PBXT can stream binary and media directly in and out of the database.
2. How to build PBXT
PBXT can be built as a standalone, pluggable storage engine. This creates a shared library which can be installed by a running MySQL server. Note that MySQL 5.1 or 6.0 is required for this functionality.
This section describes how to build the pluggable storage engine on MacOS X and Linux. How to build PBXT under Windows is explained in section 4.
To build PBXT you will need the following:
- A UNIX user: <unix-user>, For example: my-user
- A "build" directory: <build-dir>. For example: /home/my-user/build
- A MySQL test directory: <mysql-dir>. For example: /home/my-user/mysql
- The MySQL 5.1 source tree: <mysql-src>.tar.gz. For example: mysql-5.1.38-rc.tar.gz
- The PBXT source tree: <pbxt-src>.tar.gz. For example: pbxt-1.0.06-beta.tar.gz
If you already have download and compiled MySQL then skip to 2.4.
2.1 Download MySQL Source Distribution
Download the latest source distribution of MySQL from the MySQL web-site: MySQL 5.1 or MySQL 6.0. You will find the "Compressed GNU TAR archive (tar.gz)" version at the bottom of the download page.
Unpack the .tar.gz in your build directory:
$ cd <build-dir>
$ gunzip -c <mysql-src>.tar.gz | tar -x
This will create the following directory: <build-dir>/<mysql-src>
2.2 Configure and build MySQL 5.1
Configure MySQL according to your requirements. For example:
$ ./configure --prefix=<mysql-dir> --enable-assembler --with-extra-charsets=complex --enable-thread-safe-client --with-big-tables --with-readline --with-ssl --with-plugins=max-no-ndb --with-embedded-server --enable-local-infile --with-mysqld-user=<unix-user>
After configuration, make and install MySQL:
$ make install
This will install MySQL in the directory <mysql-dir>.
2.3 MySQL Configuration Options
You may want to change or specify the following configuration options according to your requirements:
--prefix=<mysql-dir>
This options specifies the MySQL installation directory. The default is /usr/local.
This means that by default, make install will install all the files in /usr/local/bin, /usr/local/lib etc.
This is inconvenient for 2 reasons:
- You need root privileges to install in the default location.
- You may already have a production version of MySQL installed here.
The directory /usr/local/mysql is often used. This directory also requires root privileges for installation.
--with-mysqld-user=<unix-user>
Specify an alternative user for the mysqld executable (default is mysql). When started, mysqld will attempt change to running as this user. If the user that started mysqld is not already this user, or has not been started as root, this will fail.
To avoid having to start mysqld as root, set --with-mysqld-user to the name of the user you are using to build mysql.
--with-debug=yes/no/full
For development or debugging purposes use the option --with-debug=yes or --with-debug=full. If you wish to perform performance tests, then make sure you use --with-debug=no, or omit this option.
Note that whichever option you use, you must build PBXT with the same options or it will not load, or otherwise crash (see below).
2.4 Install the MySQL database
Before you can start the MySQL server you have to install the mysql database which contains the system information about the MySQL installation:
$ cd <mysql-dir>
$ bin/mysqld_install_db --no-defaults
Use the --no-defaults option to ignore any my.cnf file that may be installed on your system. This may have been installed as part of a standard or default MySQL installation on your system.
The mysqld_install_db script creates the MySQL data directory: <mysql-dir>/var, or <mysql-dir>/data.
2.5 Start the MySQL server
You can start the MySQL in the foreground as follows:
$ cd <mysql-dir>
$ libexec/mysqld --no-defaults
Starting MySQL in the foreground will print any errors that may occur immediately to the console. The saves having to hunt for the log file if an error occurs, and MySQL fails to start.
To start MySQL in the background enter the following:
$ bin/mysqld_safe &
You can shutdown the MySQL server as follows:
$ bin/mysqladmin -uroot shutdown
2.6 Determine the MySQL plugin directory
The MySQL plugin directory, is the directory searched by MySQL when loading pluggable storage engine shared libraries.
You can determine this directory by first starting the MySQL shell:
$ bin/mysql -uroot
Now enter the following:
mysql> show variables like "%plugin%";
+---------------+--------------------------------------+
| Variable_name | Value |
+---------------+--------------------------------------+
| plugin_dir | /home/my-user/mysql/lib/mysql/plugin |
+---------------+--------------------------------------+
1 row in set (0.01 sec)
If you are using MySQL 5.1.24 you can expect the plugin directory to be <mysql-dir>/lib/mysql/plugin. Before this version 5.1.24 it was <mysql-dir>/lib/mysql.
2.7 Download PBXT
You can download the PBXT source distribution from www.primebase.org or from launchpad.net.
Unpack the .tar.gz in your build directory:
$ cd <build-dir>
$ gunzip -c <pbxt-src>.tar.gz | tar -x
This will create the directory: <build-dir>/<pbxt-src>
Alternatively you can checkout the latest source tree from Launchpad.net using a bazaar client, as follows
$ cd <build-dir>
$ bzr branch lp:pbxt <pbxt-src>
2.8 Configure and build PBXT
Configure PBXT as follows:
$ ./configure --with-mysql=<build-dir>/<mysql-src> --with-plugindir=<mysql-dir>/lib/mysql/plugin
Note, the --with-mysql option is required but --with-plugindir may be omitted. If used --with-plugindir should be set to the value of the plugin_dir system variable in step 2.6 above.
Now make and install PBXT:
$ make
$ make install
These commands will build the PBXT plug-in, which is a shared library called libpbxt.so, and copy the plug-in to the MySQL plugin directory. It will also build the xtstat executable, and copy it to the bin directory (see Monitoring PBXT using xtstat).
Before you can use the PBXT storage engine you need to install the plug-in as described in the next section: Installing PBXT as a "pluggable" Storage Engine.
2.9 PBXT Configuration options
PBXT configure supports the following options:
--with-mysql=<build-dir>/<mysql-src>
This option is required and specifies the path to the MySQL source tree.
--with-plugindir=<mysql-plugin-dir>
Use this option to specify where the pluggable storage engine should be installed. This is usually in the lib/mysql/plugin directory in the MySQL home/installation directory. In our example above, this is the directory: /home/my-user/mysql/lib/mysql/plugin.
If you omit the --with-plugindir option then it will automatically be set to <mysql-dir>/lib/mysql/plugin.
--with-debug=no/yes/only/full/prof
Set this option to yes if you wish to debug PBXT. It is possible to debug PBXT even if the MySQL server was not built with the debug option on.
If --with-debug is omitted then the debug and optimization flags used will be identical to those used by MySQL.
Option supported are:
no - No debugging. This builds an optimized version of PBXT, even if MySQL was built with debug on.
yes - Debugging on. This builds PBXT for debug symbols enabled (-g), and turns off optimization.
only - This option turns debug symbols on but retains the optimization options set for MySQL.
full - Full debugging. This builds PBXT for debug symbols enabled, and also turns on assertions, memory checking and other debug code.
prof - Profile version. This builds an optimized version of the plug-in which will generate profiling information.
3. Installing PBXT as a "pluggable" Storage Engine
PBXT is "pluggable", which means that it can be loaded dynamically by MySQL at runtime. When PBXT is built outside of the MySQL source tree it creates a shared library called libpbxt.so, which can be installed and loaded by the MySQL server.
To install the plug, it has to be copied to the MySQL plugin directory. This can be determined by retrieving the value of the plugin_dir system variable:
mysql> show variables like "%plugin%";
+---------------+----------------------------+
| Variable_name | Value |
+---------------+----------------------------+
| plugin_dir | /usr/local/mysql/lib/mysql |
+---------------+----------------------------+
1 row in set (0.01 sec)
Once you have copied the libpbxt.so to the plugin directory, you install the plugin by entering the following command:
mysql> INSTALL PLUGIN pbxt SONAME 'libpbxt.so';
The server will load and register the PBXT plug-in in the mysql.plugin table. The plug-in is then available whenever the server is started.
Once you have installed the plug-in, you can create tables that use the PBXT engine by setting ENGINE=PBXT in the table declaration. For example:
mysql> CREATE TABLE t1 (c1 int, c2 text) engine=pbxt;
You can also convert existing tables to PBXT, for example
mysql> ALTER TABLE t1 engine=pbxt;
Note that if you use the correct value for the --libdir option when configuring PBXT, then the plug-in will be copied to the MySQL plugin directory when you execute make install.
4. Building PBXT as part of the MySQL source tree
PBXT can also built directly as part of the MySQL source tree. To do this, just copy the PBXT source code tree into the MySQL source storage directory. Here you should rename the PBXT directory to pbxt, for example:
cd storage
wget http://www.primebase.org/download/pbxt-1.0.06-beta.tar.gz
gunzip -c pbxt-1.0.06-beta.tar.gz | tar -x
rm pbxt-1.0.06-beta.tar.gz
mv pbxt-1.0.06-beta pbxt
Now return to the top-most directory of the MySQL source tree and enter the following:
rm -rf autom4te.cache/
aclocal
autoconf
autoheader
automake -a
Now the MySQL configure command will recognize the PBXT storage engine, for example:
./configure --help
...
=== PBXT Storage Engine ===
Plugin Name: pbxt
Description: MVCC-based transactional engine
Supports build: static and dynamic
Configurations: max, max-no-ndb
You can now build MySQL normally, and the PBXT storage engine will be compiled in as part of the build process.
5. Building PBXT under Windows
5.1 Download the MySQL sources
Download the MySQL source tree and unpack the archive in a working directory (for example C:\workdir).
5.2 Install the PBXT sources
Move the PBXT package into the MySQL storage directory, and rename it to 'pbxt'.
For example, if you are using MySQL 5.1.38, the path to this directory would then be:
C:\workdir\mysql-5.1.38-rc\storage\pbxt
5.3. Configure the build with PBXT
To configure the build with PBXT, use the WITH_PBXT_STORAGE_ENGINE option. For example:
cscript win\configure.js WITH_PBXT_STORAGE_ENGINE WITH_INNOBASE_STORAGE_ENGINE WITH_BLACKHOLE_STORAGE_ENGINE WITH_PARTITION_STORAGE_ENGINE WITH_ARCHIVE_STORAGE_ENGINE WITH_FEDERATED_STORAGE_ENGINE __NT__
5.4 Generate MS Visual Studio projects
Generation of the MS Visual Studio projects requires CMake:
For Visual Studio 8 (Microsoft Visual Studio 2005), execute:
win\build-vs8
For Visual Studio 7.1, execute:
win\build-vs71
Confirm that the following project file has been generated:
storage\pbxt\pbxt.vproj
5.5 Build MySQL
Double-click on the file mysql.sln (in the mysql-5.1.38-rc directory) to open it with Visual Studio, and build the solution.
NOTE: You will find more details about how to build MySQL for window in the file win\README.
6. PBXT System Variables
The following system variables are available for controlling the behavior of the PBXT engine.
System variable can be set in the my.cnf file or on the mysqld command line. Variable that can be set dynamically can be set using the SET GLOBAL command which is sent to the server. System variables that can be set dynamically are specified as such below.
Variables that specify a number of bytes may include a unit indication after the value. For example: 100KB, 64MB, etc. There should be no space between the number and the unit. Units are case insensitive (KB = Kb = kb). If no unit is specified then bytes is assumed. The recognized units are:
KB (or K) - Kilobyte, 1024 byte
MB (or M) - Megabyte, 1024 KB
GB (or G) - Gigabyte, 1024 MB
TB (or T) - Terabyte, 1024 GB
PB (or P) - Petabyte, 1024 TB
Variable that use this type of value are: pbxt_index_cache_size, pbxt_record_cache_size, pbxt_log_cache_size, pbxt_log_file_threshold, pbxt_checkpoint_frequency, pbxt_data_log_threshold, pbxt_log_buffer_size, pbxt_data_file_grow_size and pbxt_row_file_grow_size.
3.1 Cache Memory Variables
PBXT has 3 caches: index, record and log. The current memory usage in these 3 areas can be determined using SHOW ENGINE PBXT STATUS.
- pbxt_index_cache_size
This is the amount of memory allocated to the index cache. Default value is 32MB. The memory allocated here is used only for caching index pages (.xti files).
- pbxt_record_cache_size
This is the amount of memory allocated to the record cache used to cache table data. The default value is 32MB. This memory is used to cache changes to the handle data (.xtd) and row index (.xtr) files.
- pbxt_log_cache_size (version 1.0/1.1)
pbxt_trx_log_cache_size (version 1.5+)
The amount of memory allocated to the transaction log cache used to cache on transaction log data (xlog-*.xt files). The default is 16MB.
- pbxt_data_log_cache_size (version 1.5+)
The amount of memory allocated to the data log cache used to cache data log data (dlog-*.xt files). The default is 16MB.
3.2 Transaction Manager Variables
The transaction manager variable control various aspects of the PBXT transactional system.
- pbxt_log_file_threshold (version 1.0/1.1)
pbxt_trx_log_threshold (version 1.5+)
The size of a transaction log file (xlog-*.xt files) before "rollover", and a new log file is created. The default value is 32MB.
- pbxt_transaction_buffer_size (version 1.0/1.1)
pbxt_trx_log_buffer_size (version 1.5+)
The size of the global transaction log buffer (the engine allocates 2 buffers of this size). The default is 1MB. Data to be written to a transaction log file is first written to the transaction log buffer. Since the buffer is flushed on transaction commit, it only makes sense to use a large transaction log buffer if you have longer running transactions, or many transaction running in parallel.
-
- pbxt_checkpoint_frequency
The amount of data written to the transaction log before a checkpoint is performed. The default value is 24MB.
- pbxt_sweeper_priority
Determines the priority of the background Sweeper thread.:
0 - Low (default).
1 - Normal (same as user threads).
2 - High.
The Sweeper is responsible for removing deleted records and index entries (deleted records also result from UPDATE statements). If many old deleted records accumulate search operations become slower. Therefore it may improve performance to increase the priority of the Sweeper on a machine with 4 or more cores.
- pbxt_auto_increment_mode
The parameter determines how PBXT manages auto-increment values. There are 2 options:
0 - MySQL standard (default).
1 - Previous ID's are never re-used.
In the standard MySQL mode it is possible that an auto-increment value is re-issued. This occurs when the maximum auto-increment value is deleted, and then MySQL is restarted. This occurs because the next auto-increment value to be issued is determined at startup by retrieving the current maximum auto-increment value from the table
In mode 1, auto-increment values are never re-issued because PBXT automatically incrementing the table level AUTO_INCREMENT table option. The AUTO_INCREMENT table is incremented in steps of 100. Since this requires the table file to be flushed to disk, this can influence performance.
- pbxt_flush_log_at_trx_commit
This variable specifies the durability of recently committed transactions. By reducing the durability, the speed of write operations can be increased:
0 - Lowest durability, the transaction log is not written or flushed on transaction commit. In this case it is possible to loose transactions if the server executable crashes.
1 - Full-durability, the transaction log is written and flushed on every transaction commit (default).
2 - Medium durabilty, the transaction log is written, but not flushed on transaction commit. In this case it is possible to loose transactions of the server machine crashes (for example, a power failer).
In all cases, the transaction log is flushed at least once every second. This means that it is only every possible to loose database changes that occurred within the last second.
-
- pbxt_log_file_count
The number of transaction log files on disk before logs that are no longer required are deleted, default value is 3. The number of transaction logs on disk may exceed this number if the logs are still being read.
If a transaction log has been read (i.e. the log is offline), it will be recycled for writing again, unless it must be deleted because the number of logs on disk exceeds this threshold. Recycling logs is an optimization because the writing a pre-allocated file is faster then writing to the end of a file.
Note: an exception to this rule is Mac OS X. On Mac OS X old log files are not recycled because writing pre-allocated file is slower, then writing to the end of file (see below).
- pbxt_offline_log_function
This variable determines what happens to a transaction log when it is offline. A log is offline if PBXT is no longer reading or writing to the log. There are 3 possibilities:
0 - Recycle log (default). This means the log is renamed and written again.
1 - Delete log (default on Mac OS X).
2 - Keep log. The logs can be used to repeat all operations that were applied to the database.
- pbxt_record_write_threshold (version 1.1+)
This value determines when the writer will start to transfer data from the transaction log to the record pointer (.xtr) and handle data (.xtd) files. The default is 4MB. If the value is set to zero then the parameter is ignored. The writer will also begin to transfer data when the transaction log cache is exhausted, or the record cache is full.
- pbxt_index_dirty_threshold (version 1.1+)
This is a percentage value. When the number of dirty pages in the index cache reaches this level, all indices are flushed. The default value is 80. The value 0 indicates that the indices should only be flushed when no cache pages can be freed.
- pbxt_support_xa (version 1.1+)
This variable pbxt_support_xa determines if XA (2-phase commit) support is enabled. Note: due to MySQL bug #47134, enabling XA support could lead to a crash. Default is disabled.
3.3 Data Log Variables
PBXT stores part of the database in the data logs. This is mostly data from rows containing long VARCHAR fields or BLOB data. The data logs are managed by the "compactor" thread. When a record is deleted from a data log, the data is marked as garbage. When the total garbage in a data log reaches a certain threshold, the compactor thread compacts the data log by copying the valid data to a new data log, and deleting the old data log.
The following system variables control the way data logs are managed.
-
- pbxt_data_log_threshold
The maximum size of a data log file. The default value is 64MB. PBXT can create a maximum of 32000 data logs, which are used by all tables. So the value of this variable can be increased to increase the total amount of data that can be stored in the database.
-
- pbxt_data_log_buffer_size (version 1.5+)
The size of the global data log buffer (the engine allocates 2 buffers of this size). Data to be written to the data logs is first written to the data log buffer. Just like the transaction log buffer, the data log buffer is flushed on transaction commit.
-
- pbxt_log_buffer_size
In version 1.0 and 1.1, this is the size of the buffer used when writing a data log. The default is 256MB. The engine allocates one buffer per thread, but only if the thread is required to write a data log.
In version 1.5 this is the size of the buffer used to cache data from transaction and data logs during sequential scans. The engine allocates 3 buffers of this size, one for each of the system background processes: Compactor, Sweeper and Writer. The default is 512Kb.
-
- pbxt_garbage_threshold
The percentage of garbage in a data log file before it is compacted. This is a value between 1 and 99. The default is 50.
3.4 File Grow Variables
The handle data and row index files are grown in chunks.
- pbxt_data_file_grow_size
The grow size of the handle data (.xtd) files. The default is 2MB.
- pbxt_row_file_grow_size
The grow size of the row index (.xtr) files. The default is 256Kb.
3.5 Data log 2nd Level Cache Variables (version 1.5+)
The Data log 2nd Level Cache stores data log records that fall out of the RAM based data log cache. The cache stores the records in a file on disk. The file should be located on hardware that allows fast random access, such as a Solid State Drive (SSD).
- pbxt_dlog_lev2_cache_file (versions 1.5+)
The full path of the file in which the 2nd level data log cache is to be stored (default is: ./pbxt/dlog-lev2-cache.xt).
- pbxt_dlog_lev2_cache_size (versions 1.5+)
The size of the data log level 2 cache (default is 16GB). On startup the cache file with this size will be created if it does not exist. This can take a while.
- pbxt_dlog_lev2_cache_enabled (versions 1.5+)
Enable PBXT data log level 2 cache (default is 0 - i.e. not enabled).
7. Table Options
table_option:
ENGINE [=] PBXT
| AUTO_INCREMENT [=] value
| [DEFAULT] CHARACTER SET [=] charset_name
| [DEFAULT] COLLATE [=] collation_name
| COMMENT [=] 'string'
| AVG_ROW_LENGTH [=] byte_length
PBXT supports the following table options:
- ENGINE [=] PBXT
Set the ENGINE option to PBXT to create a PBXT table.
- AUTO_INCREMENT
This option determines the minimum auto-increment value on the table.
- [DEFAULT] CHARACTER SET
Specify a default character set for the table. By default the database character set is used.
- [DEFAULT] COLLATE
Specify a default collation for the table.
- COMMENT
A comment for the table. Maximum of 60 characters.
- AVG_ROW_LENGTH
Specifies the average row size of the table. This size will be used for the length of the fixed data section of a row. Use CHECK TABLE to determine the best value for the average row size of an existing table. Note that CHECK TABLE dumps the information required to the MySQL log.
8. Differences to other Engines
The following is a list of differences between PBXT and other storage engines that are shipped by default with MySQL.
8.1 AUTO_INCREMENT option in SHOW CREATE TABLE
On tables created with the AUTO_INCREMENT table option, PBXT displays the original value set by the CREATE TABLE or ALTER TABLE statement.
For example:
CREATE TABLE t1 (
ID int primary key AUTO_INCREMENT,
NAME varchar(200)
) AUTO_INCREMENT = 1000 ENGINE=PBXT;
INSERT t1 (NAME) VALUES ("abc"), ("def"), ("ghi");
SHOW CREATE TABLE t1;
The result returned is:
CREATE TABLE `t1` (
`ID` int(11) NOT NULL AUTO_INCREMENT,
`NAME` varchar(200) DEFAULT NULL,
PRIMARY KEY (`ID`);
) ENGINE=PBXT AUTO_INCREMENT=1000 DEFAULT CHARSET=latin1
With ENGINE=MyISAM, the result is:
CREATE TABLE `t1` (
`ID` int(11) NOT NULL AUTO_INCREMENT,
`NAME` varchar(200) DEFAULT NULL,
PRIMARY KEY (`ID`);
) ENGINE=MyISAM AUTO_INCREMENT=1003 DEFAULT CHARSET=latin1
8.2 No Statement Level Sub-transactions
PBXT does not support statement level sub-transactions. This means that if an error occurs, PBXT will rollback the entire transaction, not just the current statement.
The only exception to this is the duplicate key error. PBXT undoes the INSERT or UPDATE statement when an duplicate key error occurs. The user is then free to decide whether to continue, or rollback the transaction.
Note that this can cause problems for replication, and means that INSERT with multiple rows should not be used when replicating PBXT tables. For example:
use test;
drop table if exists t;
create table t ( i int not null, unique index i (i) ) engine = PBXT;
set autocommit = 0;
insert into t values (1),(2),(2);
# error 1062 (duplicate key) occurs
commit;
select * from t;
i
1
2
In the above example, the INSERT statement is not completely rolled back. MySQL replication, however, will not replicate any part of the statement because it assumes that a transactional storage engine supports statement level sub-transactions.
9. Monitoring PBXT using xtstat
The executable xtstat can be used to monitor all internal activity of PBXT. The binary is built and installed automatically, along with the plug-in when you execute make install, as described above. xtstat is installed in the bin directory.
xtstat polls the INFORMATION_SCHEMA.PBXT_STATISTICS table. The poll interval can be set using the --delay option, and is 1 second by default.
For the most statistics xtstat will display the difference in values between the current and previous polls. For example, if bytes written current value is 1000, and on the previous call it was 800, then xtstat display 200. This means that 200 bytes were written to disk in the intervening period.
Invoke xtstat as follows:
$ xtstat [ options ]
For example:
$ xtstat -D10
In order to poll every 10 seconds.
Note that statistic counters are never reset, even if a rollback occurs. For example, if an UPDATE statement is rolled back, the xtstat will still indicate that one write statement (see stat-write below) was executed.
If MySQL shuts down or crashes, xtstat will attempt to reconnect. xtstat can be terminated any time using the CTRL-C key cimbination.
Before PBXT has recovered, not all statistics are available. In particular, the statistics relating to PBXT background threads are not available (including the sweep and chkpnt statistics).
9.1 Command line options
xtstat options are as follows:
- -?, --help
Prints help text.
- -h, --host=value
Connect to host.
- -u, --user=value
User for login if not current user.
- -p, --password[=value]
Password to use when connecting to server. If password is not given it's asked from the tty.
- -d, --database=value
Database to be used (pbxt or information_schema required), default is information_schema.
- -P, --port=value
Port number to use for connection.
- -S, --socket=value
Socket file to use for connection.
- -D, --delay=value
Delay in seconds between polls of the database. Default is 1 second.
- --protocol=value
Connection protocol to use: default/tcp/socket/pipe/memory.
- --display=value
Columns to display: use short names separated by '-', partial matches are allowed.
Use --display=all to display all columns available. See section below for details.
Connection options will also be taken from the MySQL config file if available.
9.2 Display options
The display string specified using the --display opton is a list of statistics to be displayed. It has the following format:
--display=group-type,group-type,group-type,...
Each statistic has a group and a type. Together both names identify a particular statistic (see below). You may specify a group or type on its own to specify all statistics in the group or type.
Note, for diagnostics it is best to capture all statistics. The reason is because you never know where a problem might turn up, so without certain statistics you may not be able to identify the problem.
In version 1.0 and 1.1 you have to specify -type, to select all statistics of a particular type. In version 1.5 specifications will match either group or type if no '-' is used.
In version 1.0 and 1.1 use '--display=all' to indicate display all statistics, the default is 'time-msec,commt,row-ins,rec,ind,ilog,xlog,data,to,dirty'.
In version 1.5 the keyword 'all' or '*' can be used to specify a group or type, or both. In this way you can select all statistics of a certain group or type. Version 1.5 also includes a number of "non-standard" statistics, including: scan-*, retry-*, xlog-%use, data-%use. Add '+' to including non-standard statistics in a specification. For example xlog+, includes xlog-%use, while xlog includes all the transaction log statistics without the percentage cache usage.
In version 1.5 you can prefix an item with '~' to exclude the statistic (or statistics if a group or type is specified).
Example 1: Display all standard statistics:
--display=all
--display=all-*
Example 2: Include the non-standard statistics:
--display=all+
--display=xlog+
Example 3: Display the current time statistics and the index cache hits and misses:
--display=time,ind-hits,ind-miss
Example 4: Display a number statistic groups and types:
--display=time,ind,xlog,in,out
Example 5: Display read, write and sync times:
--display=wr.ms,rd.ms,sy.ms
Example 6: Display all stastistics accept transaction log stastistics:
--display=all,~xlog
9.3 Size indicators
Values display by xtstat are either a time in milliseconds, or a value in bytes, or a counter. If these values are to large to be displayed then the value is rounded and a size indicator is added.
The following size indicators are used:
- K - Kilobytes (1,024 bytes)
- M - Megabytes (1,048,576 bytes)
- G - Gigabytes (1,073,741,024 bytes)
- T - Terabytes (1,099,511,627,776 bytes)
- t - thousands (1,000s)
- m - millions (1,000,000s)
- b - billions (1,000,000,000s)
9.4 Statistics
The following is a list of the statistics displayed by xtstat. Each statistic as a 2 part display name. The first part is the group and the second part is the type.
|
Display name |
Version |
Name |
Description |
|
time-curr |
1.0 |
Current Time |
The current time in seconds |
|
time-msec |
1.0 |
Time Since Last Call |
Time passed in milliseconds since last statistics call |
|
xact-commt |
1.0 |
Commit Count |
Number of transactions committed |
|
xact-rollb |
1.0 |
Rollback Count |
Number of transactions rolled back |
|
xact-waits |
1.0 |
Wait for Xact Count |
Number of times waited for another transaction |
|
xact-dirty |
1.0 |
Dirty Xact Count |
Number of transactions still to be cleaned up. This also includes all the currently running transactions. Cleanup means that the Sweeper thread must still scan the transcation and collect/mark any "garbage" left by the transaction. Garbage is, for example, versions of rows that are no longer visiable by any transaction. |
|
stat-read |
1.0 |
Read Statements |
Number of SELECT statements |
|
stat-write |
1.0 |
Write Statements |
Number of UPDATE/INSERT/DELETE statements |
|
rec-in |
1.0 |
Record Bytes Read |
Bytes read from the record/row files |
|
rec-out |
1.0 |
Record Bytes Written |
Bytes written to the record/row files. This data is transfered from the transaction logs to the handle data (xtd) and the row index files (xtr). |
|
rec-rd.ms (before 1.5 use rec-r/ms) |
1.1 |
Record Read Time |
The time in milliseconds spent reading record and row files = |
|
rec-wr.ms (before 1.5 use rec-w/ms) |
1.1 |
Record Write Time |
The time in milliseconds spent writing record and row files |
|
rec-syncs/ms |
1.0 |
Record File Flushes |
2 values separated by a '/': the number of flushes to data handle (.xtd) and row index (.xtr) files and the time taken in milliseconds to perform the flush operations. |
|
rec-hits |
1.0 |
Record Cache Hits |
Hits when accessing the record cache. The record cache caches the data handle (.xtd) and row index (.xtr) files. |
|
rec-miss |
1.0 |
Record Cache Misses |
Misses when accessing the record cache |
|
rec-frees |
1.0 |
Record Cache Frees |
Number of record cache pages freed |
|
rec-%use |
|
Record Cache Usage |
Percentage of record cache in use. This value is displayed by xtstat as a percentage of the total cache available, but the value returned by PBXT_STATISTICS table is in bytes used. |
|
ind-in |
1.0 |
Index Bytes Read |
Bytes read from the index files |
|
ind-out |
1.0 |
Index Bytes Written |
Bytes written to the index files. This data is transfered from the index log files (ilog) to the index files (xti), during a consistent flush of the index. |
|
ind-rd.ms (before 1.5 use ind-r/ms) |
1.1 |
Index Read Time |
The time in milliseconds spent reading index files |
|
ind-wr.ms (before 1.5 use ind-w/ms) |
1.1 |
Index Write Time |
The time in milliseconds spent writing index files |
|
ind-syncs/ms |
1.0 |
Index File Flushes |
2 values separated by a '/': the number of flushes to index files and the time taken for the flush operations in milliseconds. |
|
ind-hits |
1.0 |
Index Cache Hits |
Hits when accessing the index cache |
|
ind-miss |
1.0 |
Index Cache Misses |
Misses when accessing the index cache |
|
ind-%use |
1.0 |
Index Cache Usage |
Percentage of index cache used. This value is displayed by xtstat as a percentage of the total cache available, but the value returned by PBXT_STATISTICS table is in bytes used. |
|
ind-%dty |
1.1 |
Index Cache Dirty |
Percentage of index cache that is dirty |
|
ilog-in |
1.0 |
Index Log Bytes In |
Bytes read from the index log files |
|
ilog-out |
1.0 |
Index Log Bytes Out |
Bytes written to the index log files. This data is transfered from the index cache in main memory to the index log files (ilog) during a consistent flush of the index. |
|
ilog-rd.ms |
1.5 |
Index Log Read Time |
The time in milliseconds spent reading index log files |
|
ilog-wr.ms (before 1.5 use ilog-w/ms) |
1.1 |
Index Log Write Time |
The time in milliseconds spent writing index log files |
|
ilog-syncs/ms |
1.0 |
Index Log File Syncs |
2 values separated by a '/': the number of flushes to index log files and the time taken for the flush operations in milliseconds |
|
xlog-in |
1.0 |
Xact Log Bytes In |
Bytes read from the transaction log files |
|
xlog-out |
1.0 |
Xact Log Bytes Out |
Bytes written to the transaction log files. This is data transfered from the transaction log buffer (pbxt_transaction_buffer_size) to the transaction log files (.xlog). This transfer occurs on commit or when the transaction log buffer is full. |
|
xlog-rd.ms (before 1.5 use log-r/ms) |
1.1 |
Trx Log Read Time |
The time in milliseconds spent reading transaction log files. In version 1.1, this statistic incuded data-rd.ms (i.e. the read time for both transaction and data logs) |
|
xlog-wr.ms (before 1.5 use xlog-w/ms) |
1.1 |
Trx Log Write Time |
The time in milliseconds spent writing transaction log files |
|
xlog-syncs |
1.0 |
Trx Log File Syncs |
Number of syncs (flushes) to transaction log files |
|
xlog-sy.ms (before 1.5 use xlog-msec) |
1.0 |
Trx Log Sync Time |
The time in milliseconds spent syncing (flushing) transaction log files |
|
xlog-hits |
1.0 |
Trx Log Cache Hits |
Hits when accessing the transaction log cache |
|
xlog-miss |
1.0 |
Trx Log Cache Misses |
Misses when accessing the transaction log cache |
|
xlog-%use |
1.0 |
Trx Log Cache Usage |
Percentage of transaction log cache used. This value is displayed by xtstat as a percentage of the total cache available, but the value returned by PBXT_STATISTICS table is in bytes used. |
|
data-in |
1.0 |
Data Log Bytes In |
Bytes read from the data log files |
|
data-out |
1.0 |
Data Log Bytes Out |
Bytes written to the data log files. This data is transfered from the data log buffer (pbxt_log_buffer_size) to the data log files (.dlog), when the buffer is full, or on commit. |
|
data-rd.ms |
1.5 |
Data Log Read Time |
The time in milliseconds spent reading data log files. In version 1.1 this statistic was included in the log-r/ms statistic (see xlog-rd.ms above) |
|
data-wr.ms (before 1.5 use data-w/ms) |
1.1 |
Data Log Write Time |
The time in milliseconds spent writing data log files. |
|
data-syncs |
1.0 |
Data Log File Syncs |
Number of flushes to data log files |
|
data-sy.ms (before 1.5 use data-msec) |
1.0 |
Data Log Sync Time |
The time in milliseconds spent flushing data log files |
|
data-hits |
1.5 |
Data Log Cache Hits |
Hits when accessing the data log cache |
|
data-miss |
1.5 |
Data Log Cache Misses |
Misses when accessing the data log cache |
|
data-%use |
1.5 |
Data Log Cache Usage |
Percentage of data log cache used. |
|
l2data-in |
1.0 |
Data Log L2 Cache Bytes In |
Bytes read from the data log level 2 cache file |
|
l2data-out |
1.0 |
Data Log L2 Cache Bytes Out |
Bytes written to the data log level 2 cache file |
|
l2data-rd.ms |
1.5 |
Data Log L2 Cache Read Time |
The time in milliseconds spent reading the data log level 2 cache |
|
l2data-wr.ms |
1.5 |
Data Log L2 Cache Write Time |
The time in milliseconds spent writing the data log level 2 cache |
|
l2data-hits |
1.5 |
Data Log L2 Cache Cache Hits |
Hits when accessing the data log level 2 cache |
|
l2data-miss |
1.5 |
Data Log L2 Cache Cache Misses |
Misses when accessing the data log level 2 cache |
|
l2data-%use |
1.5 |
Data Log L2 Cache Cache Usage |
Percentage of data log level 2 cache used |
|
to-chkpt |
1.0 |
Bytes to Checkpoint |
Bytes written to the transaction log since the last checkpoint |
|
to-write |
1.0 |
Log Bytes to Write |
Bytes written to the transaction log, still to be written to the database |
|
to-sweep |
1.0 |
Log Bytes to Sweep |
Bytes written to the transaction log, still to be read by the Sweeper thread |
|
sweep-waits |
1.0 |
Sweeper Wait on Xact |
Attempts to cleanup a transaction |
|
scan-index |
1.0 |
Index Scan Count |
Number of index scans |
|
scan-table |
1.0 |
Table Scan Count |
Number of table scans |
|
row-sel |
1.0 |
Select Row Count |
Number of rows selected |
|
row-ins |
1.0 |
Insert Row Count |
Number of rows inserted |
|
row-upd |
1.0 |
Update Row Count |
Number of rows updated |
|
row-del |
1.0 |
Delete Row Count |
Number of rows deleted |
|
retry-iscan |
1.5 |
Index Scan Retries |
Index scans restarted because of locked record |
|
retry-rlist |
1.5 |
Record List Rereads |
Record list rescanned due to lock |
|