MySQL guru advice needed: MySQL 2,560,000 records per table

[expired user #3333]'s profile image [expired user #3333] posted 16 years ago in General Permalink
This is not a HeidiSQL question. Let me know if I should post elsewhere.

I need to add 2,560,000 records to a MyISAM table. Each record/row will only have around 32kb of info.

The most rows/records I've ever needed to manage before within a table was slightly less than 175,000.

Any of you MySQL guru's have advice, or know anything I should watch out for before adding the over two and a half million records?
[expired user #3333]'s profile image [expired user #3333] posted 16 years ago Permalink
Now here: http://dev.mysql.com/doc/refman/5.1/en/column-count-limit.html

it says:

Every table has a maximum row size of 65,535 bytes. This maximum applies to all storage engines, but a given engine might have additional constraints that result in a lower effective maximum row size.

The maximum row size constrains the number of columns because the total width of all columns cannot exceed this size.



To me, that means exactly what it says, MAXIMUM ROW SIZE IS 65,535. That's fine, so I'm assuming I can have as many rows that I want per table, just as long as the size of each row doesn't exceed 65k. No problem... as I stated in the first post, I anticipate each row only being 37k.


But what throws me off is this:

Each table has an .frm file that contains the table definition. The .frm file size limit is fixed at 64KB. If a table definition reaches this size, no more columns can be added.



When rows are added to a table, does the .frm file increase? Or does the .frm file ONLY increase when you alter the size of the columns within that table?


Yeah, for a MySQL nut like myself I should know this. I'm sure it's MySQL 101.
ansgar's profile image ansgar posted 16 years ago Permalink

When rows are added to a table, does the .frm file increase? Or does the .frm file ONLY increase when you alter the size of the columns within that table?



The .frm file only contains structure while the .myd file contains data. 64K should be more than enough to hold a lot of column definitions.
ansgar's profile image ansgar posted 16 years ago Permalink

Every table has a maximum row size of 65,535 bytes. This maximum applies to all storage engines, ...



What?? That sounds totally wrong. This is a documentation bug I guess. Unless I understand this totally wrong ...
[expired user #1125]'s profile image [expired user #1125] posted 16 years ago Permalink
Sounds correct to me..

The largest column data type that you can store inline in row data is char/varchar at 8192 bytes, meaning a table with 8 char(8192) columns should work, but you cannot add any more columns.

Remember that BLOB and TEXT columns are stored outside the normal row data flow, so they only occupy 4 or 8 bytes for the pointer, or something close to that (check 'mysql internals' for the actual details).
[expired user #1125]'s profile image [expired user #1125] posted 16 years ago Permalink
Dr. Dave: hard to give good advice without either a SHOW CREATE TABLE or a quick rundown of what kind of data you're going to store, plus an estimate of how you're going to use the data (any/many updates/deletes? insert/select only? random inserts or largely bulk inserts?)
[expired user #3333]'s profile image [expired user #3333] posted 16 years ago Permalink
CREATE TABLE /*!32312 IF NOT EXISTS*/ `rooms` (
`roomID` int(10) unsigned NOT NULL,
`virtualID` int(10) unsigned DEFAULT '0',
`status` tinyint(3) unsigned DEFAULT '0',
`roomName` varchar(200) CHARACTER SET latin1 DEFAULT NULL,
`roomDesc` text CHARACTER SET latin1,
`roomDesc2` text CHARACTER SET latin1,
`roomDesc3` text CHARACTER SET latin1,
`roomDesc4` text CHARACTER SET latin1,
`zone` int(10) unsigned DEFAULT '0',
`sector` int(10) unsigned DEFAULT '0',
`combat` int(10) unsigned DEFAULT '1',
`laylineV` int(10) unsigned DEFAULT '0',
`laylineH` int(10) unsigned DEFAULT '0',
`area` int(10) unsigned DEFAULT '0',
`roam` int(10) unsigned DEFAULT '0',
`type` int(10) unsigned DEFAULT '0',
`capacity` int(10) unsigned DEFAULT '50',
`race_province` int(10) unsigned DEFAULT '0',
`elevation` int(10) unsigned DEFAULT '0',
`temperature` int(10) unsigned DEFAULT '0',
`house` int(10) unsigned DEFAULT '0',
`room` int(10) unsigned DEFAULT '0',
`npc1` int(10) unsigned DEFAULT '0',
`npc1_interval` int(10) unsigned DEFAULT '0',
`npc2` int(10) unsigned DEFAULT '0',
`npc2_interval` int(10) unsigned DEFAULT '0',
`npc3` int(10) unsigned DEFAULT '0',
`npc3_interval` int(10) unsigned DEFAULT '0',
`npc4` int(10) unsigned DEFAULT '0',
`npc4_interval` int(10) unsigned DEFAULT '0',
`roomFlags_int` int(10) unsigned DEFAULT '0',
`roomFlags_string` varchar(35) CHARACTER SET latin1 DEFAULT '0',
`exit_N` int(10) unsigned DEFAULT '0',
`exit_S` int(10) unsigned DEFAULT '0',
`exit_E` int(10) unsigned DEFAULT '0',
`exit_W` int(10) unsigned DEFAULT '0',
`exit_NE` int(10) unsigned DEFAULT '0',
`exit_NW` int(10) unsigned DEFAULT '0',
`exit_SE` int(10) unsigned DEFAULT '0',
`exit_SW` int(10) unsigned DEFAULT '0',
`exit_UP` int(10) unsigned DEFAULT '0',
`exit_DOWN` int(10) unsigned DEFAULT '0',
`exit_OUT` int(10) unsigned DEFAULT '0',
`specialNotes` text CHARACTER SET latin1,
PRIMARY KEY (`roomID`),
UNIQUE KEY `roomID` (`roomID`),
KEY `roomID_2` (`roomID`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_general_cs;


Within this table will be 2.5 million records.

As far as accessing/altering. Once the records are entered, they will not be UPDATEd, or REPLACEd, or have any changes to them. Each record will only be accessed once (whenever the program that uses the data reboots) and stored into program memory. After the initial loading of the records into the program memory, the entire table will not be referenced.

I hope this gives enough insight to warrant advice. :)
[expired user #1125]'s profile image [expired user #1125] posted 16 years ago Permalink
Actual structure and queries always help ;-).

1) A MySQL primary key is a unique index. (named 'PRIMARY' and with the additional requirement that all of it's columns must have a NOT NULL constraint). The unique index over roomID is superfluous.

From your description, I'm guessing that all queries are similar to 'SELECT * FROM rooms WHERE roomID=123' ?

2) If all SELECTs (I assume it's SELECT performance that you care about) are done using a 'WHERE roomID=...' clause, then performance should be fine. That's because the row offset in the data file can be looked up via the primary index, making random access possible, eg there'll be no hefty scanning of the data file. That means size (nr of records) ceases to matter. With that out of the way, performance is mostly proportional to the number of disk seeks caused by a query. The primary index is kept permanently in RAM in the key buffer, so if you make sure that the key buffer is large enough to hold it, that should cause no disk activity. After a lookup in the primary index, there'll be one disk seek to locate each relevant row, and one additional disk seek for each of the text fields of each row (at worst). There's a bit of CPU usage for joining the TEXT columns with the main row data and shipping results to the client, which should be very minimal.

If the queries are not of the form mentioned above, all bets are off, of course ;-). You can use EXPLAIN SELECT to have MySQL tell you how fast a particular statement is going to execute. There's also BENCHMARK for testing.
[expired user #1125]'s profile image [expired user #1125] posted 16 years ago Permalink

The unique index over roomID is superfluous.



So is the third key over that same column, get rid of that too!
Extra indexes hurt INSERT performance and fills up the key buffer.

If you're absolutely sure that there'll only ever be 2,6 mio records, then the size of the primary index can be reduced by changing roomID from an int (32-bit) to a mediumint (24-bit, max 16 mio records). This frees up a bit of space in the key buffer (1 byte per record, total 2,6 mio bytes saved).
[expired user #3333]'s profile image [expired user #3333] posted 16 years ago Permalink
Excellent advice. I certainly need to make a few changes then.

Here are the queries that are run on every record when the program is booted.
void roomFunction()
{
// Some code
// Some code
snprintf(szQueryText, sizeof(szQueryText), "SELECT * FROM game_data.rooms WHERE status = '1' ORDER BY roomID ASC");
My_result = MySQL__query(szQueryText);
MySQL__endquery(My_result);
snprintf(szQueryText, sizeof(szQueryText), "UPDATE game_data.rooms SET virtualID = '%d', zone = '%d', roomFlags_int = '%d' WHERE roomID = '%d'", data, data, data, data);
My_result = MySQL__query(szQueryText);
MySQL__endquery(My_result);
// Some code
// Some code
}


Yes, there is an UPDATE in there I missed before.


Anyway, I suppose I'm getting way off track. My original question has definitely been answered, and I thank you for the awesome responses.
[expired user #3333]'s profile image [expired user #3333] posted 16 years ago Permalink
Here's a much more usable way.
CREATE TABLE /*!32312 IF NOT EXISTS*/ `rooms` (
`roomID` mediumint(7) unsigned NOT NULL,
`virtualID` mediumint(7) unsigned DEFAULT '0',
`status` tinyint(3) unsigned DEFAULT '0',
`roomName` varchar(200) CHARACTER SET latin1 DEFAULT NULL,
`roomDesc` text CHARACTER SET latin1,
`roomDesc2` text CHARACTER SET latin1,
`roomDesc3` text CHARACTER SET latin1,
`roomDesc4` text CHARACTER SET latin1,
`zone` smallint(5) unsigned DEFAULT '0',
`sector` smallint(5) unsigned DEFAULT '0',
`combat` tinyint(2) unsigned DEFAULT '1',
`laylineV` tinyint(3) unsigned DEFAULT '0',
`laylineH` tinyint(3) unsigned DEFAULT '0',
`area` smallint(5) unsigned DEFAULT '0',
`roam` mediumint(7) unsigned DEFAULT '0',
`type` mediumint(7) unsigned DEFAULT '0',
`capacity` smallint(5) unsigned DEFAULT '50',
`race_province` tinyint(2) unsigned DEFAULT '0',
`elevation` smallint(5) unsigned DEFAULT '0',
`temperature` smallint(5) unsigned DEFAULT '0',
`house` mediumint(7) unsigned DEFAULT '0',
`room` smallint(5) unsigned DEFAULT '0',
`npc1` int(7) unsigned DEFAULT '0',
`npc1_interval` int(10) unsigned DEFAULT '0',
`npc2` mediumint(7) unsigned DEFAULT '0',
`npc2_interval` int(10) unsigned DEFAULT '0',
`npc3` mediumint(7) unsigned DEFAULT '0',
`npc3_interval` int(10) unsigned DEFAULT '0',
`npc4` mediumint(7) unsigned DEFAULT '0',
`npc4_interval` int(10) unsigned DEFAULT '0',
`roomFlags_int` bigint(20) unsigned DEFAULT '0',
`roomFlags_string` varchar(35) CHARACTER SET latin1 DEFAULT '0',
`exit_N` mediumint(7) unsigned DEFAULT '0',
`exit_S` mediumint(7) unsigned DEFAULT '0',
`exit_E` mediumint(7) unsigned DEFAULT '0',
`exit_W` mediumint(7) unsigned DEFAULT '0',
`exit_NE` mediumint(7) unsigned DEFAULT '0',
`exit_NW` mediumint(7) unsigned DEFAULT '0',
`exit_SE` mediumint(7) unsigned DEFAULT '0',
`exit_SW` mediumint(7) unsigned DEFAULT '0',
`exit_UP` mediumint(7) unsigned DEFAULT '0',
`exit_DOWN` mediumint(7) unsigned DEFAULT '0',
`exit_OUT` mediumint(7) unsigned DEFAULT '0',
`specialNotes` text CHARACTER SET latin1,
PRIMARY KEY (`roomID`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_general_cs;


That look better? That should take care of a lot of unnecessary overhead.

Please login to leave a reply, or register at first.