distal-attribute
distal-attribute
distal-attribute
distal-attribute

MySQL guru advice needed: MySQL 2,560,000 records per table

User, date Message
Written by Dr. DaveMoney, Euro
6 years ago
Category: General
100 posts since Sun, 27 Jan 08
This is not a HeidiSQL question. Let me know if I should post elsewhere.

I need to add 2,560,000 records to a MyISAM table. Each record/row will only have around 32kb of info.

The most rows/records I've ever needed to manage before within a table was slightly less than 175,000.

Any of you MySQL guru's have advice, or know anything I should watch out for before adding the over two and a half million records?
Written by Dr. DaveMoney, Euro
6 years ago
100 posts since Sun, 27 Jan 08
Now here: http://dev.mysql.com/doc/refman/5.1/en/column-count-limit.html

it says:

wrote: Every table has a maximum row size of 65,535 bytes. This maximum applies to all storage engines, but a given engine might have additional constraints that result in a lower effective maximum row size.

The maximum row size constrains the number of columns because the total width of all columns cannot exceed this size.



To me, that means exactly what it says, MAXIMUM ROW SIZE IS 65,535. That's fine, so I'm assuming I can have as many rows that I want per table, just as long as the size of each row doesn't exceed 65k. No problem... as I stated in the first post, I anticipate each row only being 37k.


But what throws me off is this:

wrote: Each table has an .frm file that contains the table definition. The .frm file size limit is fixed at 64KB. If a table definition reaches this size, no more columns can be added.



When rows are added to a table, does the .frm file increase? Or does the .frm file ONLY increase when you alter the size of the columns within that table?


Yeah, for a MySQL nut like myself I should know this. I'm sure it's MySQL 101.
Written by ansgar
6 years ago
4793 posts since Fri, 07 Apr 06

Dr. Dave wrote: When rows are added to a table, does the .frm file increase? Or does the .frm file ONLY increase when you alter the size of the columns within that table?



The .frm file only contains structure while the .myd file contains data. 64K should be more than enough to hold a lot of column definitions.
Written by ansgar
6 years ago
4793 posts since Fri, 07 Apr 06

wrote: Every table has a maximum row size of 65,535 bytes. This maximum applies to all storage engines, ...



What?? That sounds totally wrong. This is a documentation bug I guess. Unless I understand this totally wrong ...
Written by rosenfield
6 years ago
127 posts since Wed, 24 Jan 07
Sounds correct to me..

The largest column data type that you can store inline in row data is char/varchar at 8192 bytes, meaning a table with 8 char(8192) columns should work, but you cannot add any more columns.

Remember that BLOB and TEXT columns are stored outside the normal row data flow, so they only occupy 4 or 8 bytes for the pointer, or something close to that (check 'mysql internals' for the actual details).
Written by rosenfield
6 years ago
127 posts since Wed, 24 Jan 07
Dr. Dave: hard to give good advice without either a SHOW CREATE TABLE or a quick rundown of what kind of data you're going to store, plus an estimate of how you're going to use the data (any/many updates/deletes? insert/select only? random inserts or largely bulk inserts?)
Written by Dr. DaveMoney, Euro
6 years ago
100 posts since Sun, 27 Jan 08

CREATE TABLE /*!32312 IF NOT EXISTS*/ `rooms` (
`roomID` int(10) unsigned NOT NULL,
`virtualID` int(10) unsigned DEFAULT '0',
`status` tinyint(3) unsigned DEFAULT '0',
`roomName` varchar(200) CHARACTER SET latin1 DEFAULT NULL,
`roomDesc` text CHARACTER SET latin1,
`roomDesc2` text CHARACTER SET latin1,
`roomDesc3` text CHARACTER SET latin1,
`roomDesc4` text CHARACTER SET latin1,
`zone` int(10) unsigned DEFAULT '0',
`sector` int(10) unsigned DEFAULT '0',
`combat` int(10) unsigned DEFAULT '1',
`laylineV` int(10) unsigned DEFAULT '0',
`laylineH` int(10) unsigned DEFAULT '0',
`area` int(10) unsigned DEFAULT '0',
`roam` int(10) unsigned DEFAULT '0',
`type` int(10) unsigned DEFAULT '0',
`capacity` int(10) unsigned DEFAULT '50',
`race_province` int(10) unsigned DEFAULT '0',
`elevation` int(10) unsigned DEFAULT '0',
`temperature` int(10) unsigned DEFAULT '0',
`house` int(10) unsigned DEFAULT '0',
`room` int(10) unsigned DEFAULT '0',
`npc1` int(10) unsigned DEFAULT '0',
`npc1_interval` int(10) unsigned DEFAULT '0',
`npc2` int(10) unsigned DEFAULT '0',
`npc2_interval` int(10) unsigned DEFAULT '0',
`npc3` int(10) unsigned DEFAULT '0',
`npc3_interval` int(10) unsigned DEFAULT '0',
`npc4` int(10) unsigned DEFAULT '0',
`npc4_interval` int(10) unsigned DEFAULT '0',
`roomFlags_int` int(10) unsigned DEFAULT '0',
`roomFlags_string` varchar(35) CHARACTER SET latin1 DEFAULT '0',
`exit_N` int(10) unsigned DEFAULT '0',
`exit_S` int(10) unsigned DEFAULT '0',
`exit_E` int(10) unsigned DEFAULT '0',
`exit_W` int(10) unsigned DEFAULT '0',
`exit_NE` int(10) unsigned DEFAULT '0',
`exit_NW` int(10) unsigned DEFAULT '0',
`exit_SE` int(10) unsigned DEFAULT '0',
`exit_SW` int(10) unsigned DEFAULT '0',
`exit_UP` int(10) unsigned DEFAULT '0',
`exit_DOWN` int(10) unsigned DEFAULT '0',
`exit_OUT` int(10) unsigned DEFAULT '0',
`specialNotes` text CHARACTER SET latin1,
PRIMARY KEY (`roomID`),
UNIQUE KEY `roomID` (`roomID`),
KEY `roomID_2` (`roomID`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_general_cs;



Within this table will be 2.5 million records.

As far as accessing/altering. Once the records are entered, they will not be UPDATEd, or REPLACEd, or have any changes to them. Each record will only be accessed once (whenever the program that uses the data reboots) and stored into program memory. After the initial loading of the records into the program memory, the entire table will not be referenced.

I hope this gives enough insight to warrant advice. :)
Written by rosenfield
6 years ago
127 posts since Wed, 24 Jan 07
Actual structure and queries always help ;-).

1) A MySQL primary key is a unique index. (named 'PRIMARY' and with the additional requirement that all of it's columns must have a NOT NULL constraint). The unique index over roomID is superfluous.

From your description, I'm guessing that all queries are similar to 'SELECT * FROM rooms WHERE roomID=123' ?

2) If all SELECTs (I assume it's SELECT performance that you care about) are done using a 'WHERE roomID=...' clause, then performance should be fine. That's because the row offset in the data file can be looked up via the primary index, making random access possible, eg there'll be no hefty scanning of the data file. That means size (nr of records) ceases to matter. With that out of the way, performance is mostly proportional to the number of disk seeks caused by a query. The primary index is kept permanently in RAM in the key buffer, so if you make sure that the key buffer is large enough to hold it, that should cause no disk activity. After a lookup in the primary index, there'll be one disk seek to locate each relevant row, and one additional disk seek for each of the text fields of each row (at worst). There's a bit of CPU usage for joining the TEXT columns with the main row data and shipping results to the client, which should be very minimal.

If the queries are not of the form mentioned above, all bets are off, of course ;-). You can use EXPLAIN SELECT to have MySQL tell you how fast a particular statement is going to execute. There's also BENCHMARK for testing.
Written by rosenfield
6 years ago
127 posts since Wed, 24 Jan 07

rosenfield wrote: The unique index over roomID is superfluous.



So is the third key over that same column, get rid of that too!
Extra indexes hurt INSERT performance and fills up the key buffer.

If you're absolutely sure that there'll only ever be 2,6 mio records, then the size of the primary index can be reduced by changing roomID from an int (32-bit) to a mediumint (24-bit, max 16 mio records). This frees up a bit of space in the key buffer (1 byte per record, total 2,6 mio bytes saved).
Written by Dr. DaveMoney, Euro
6 years ago
100 posts since Sun, 27 Jan 08
Excellent advice. I certainly need to make a few changes then.

Here are the queries that are run on every record when the program is booted.


void roomFunction()
{
// Some code
// Some code
snprintf(szQueryText, sizeof(szQueryText), "SELECT * FROM game_data.rooms WHERE status = '1' ORDER BY roomID ASC");
My_result = MySQL__query(szQueryText);
MySQL__endquery(My_result);
snprintf(szQueryText, sizeof(szQueryText), "UPDATE game_data.rooms SET virtualID = '%d', zone = '%d', roomFlags_int = '%d' WHERE roomID = '%d'", data, data, data, data);
My_result = MySQL__query(szQueryText);
MySQL__endquery(My_result);
// Some code
// Some code
}



Yes, there is an UPDATE in there I missed before.


Anyway, I suppose I'm getting way off track. My original question has definitely been answered, and I thank you for the awesome responses.
Written by Dr. DaveMoney, Euro
6 years ago
100 posts since Sun, 27 Jan 08
Here's a much more usable way.

CREATE TABLE /*!32312 IF NOT EXISTS*/ `rooms` (
`roomID` mediumint(7) unsigned NOT NULL,
`virtualID` mediumint(7) unsigned DEFAULT '0',
`status` tinyint(3) unsigned DEFAULT '0',
`roomName` varchar(200) CHARACTER SET latin1 DEFAULT NULL,
`roomDesc` text CHARACTER SET latin1,
`roomDesc2` text CHARACTER SET latin1,
`roomDesc3` text CHARACTER SET latin1,
`roomDesc4` text CHARACTER SET latin1,
`zone` smallint(5) unsigned DEFAULT '0',
`sector` smallint(5) unsigned DEFAULT '0',
`combat` tinyint(2) unsigned DEFAULT '1',
`laylineV` tinyint(3) unsigned DEFAULT '0',
`laylineH` tinyint(3) unsigned DEFAULT '0',
`area` smallint(5) unsigned DEFAULT '0',
`roam` mediumint(7) unsigned DEFAULT '0',
`type` mediumint(7) unsigned DEFAULT '0',
`capacity` smallint(5) unsigned DEFAULT '50',
`race_province` tinyint(2) unsigned DEFAULT '0',
`elevation` smallint(5) unsigned DEFAULT '0',
`temperature` smallint(5) unsigned DEFAULT '0',
`house` mediumint(7) unsigned DEFAULT '0',
`room` smallint(5) unsigned DEFAULT '0',
`npc1` int(7) unsigned DEFAULT '0',
`npc1_interval` int(10) unsigned DEFAULT '0',
`npc2` mediumint(7) unsigned DEFAULT '0',
`npc2_interval` int(10) unsigned DEFAULT '0',
`npc3` mediumint(7) unsigned DEFAULT '0',
`npc3_interval` int(10) unsigned DEFAULT '0',
`npc4` mediumint(7) unsigned DEFAULT '0',
`npc4_interval` int(10) unsigned DEFAULT '0',
`roomFlags_int` bigint(20) unsigned DEFAULT '0',
`roomFlags_string` varchar(35) CHARACTER SET latin1 DEFAULT '0',
`exit_N` mediumint(7) unsigned DEFAULT '0',
`exit_S` mediumint(7) unsigned DEFAULT '0',
`exit_E` mediumint(7) unsigned DEFAULT '0',
`exit_W` mediumint(7) unsigned DEFAULT '0',
`exit_NE` mediumint(7) unsigned DEFAULT '0',
`exit_NW` mediumint(7) unsigned DEFAULT '0',
`exit_SE` mediumint(7) unsigned DEFAULT '0',
`exit_SW` mediumint(7) unsigned DEFAULT '0',
`exit_UP` mediumint(7) unsigned DEFAULT '0',
`exit_DOWN` mediumint(7) unsigned DEFAULT '0',
`exit_OUT` mediumint(7) unsigned DEFAULT '0',
`specialNotes` text CHARACTER SET latin1,
PRIMARY KEY (`roomID`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_general_cs;



That look better? That should take care of a lot of unnecessary overhead.
 

Please login to leave a reply, or register at first.