creating a rowid field

fish posted 10 years ago in General
I am trying to create a rowid field in a table and I am having a lot of problems. Here is the background of why I am doing this, my proposed solution and specific question related to it:

I have a large table of geographically referenced bathymetric (water depth) data. So there are 3 fields (longitude,latitude,depth) and 22*10^6 observations. My problem is that to create maps with depth contours in the programme R I need to import (or run an sql query from R to fetch the data from mysql) and then draw the contours. R is not very good with large data sets and because the bathymetric data table has many more data observations than I need for the detail of contour I am looking for, I can thin the data by at least a factor of 10. The only way I can figure to do this in sql is with the "mod" function in a command like

select * From bathymetry where rowid mod 10 = 1;

And in this case, rowid is a unique identifier for an observation. I do not want to randomly sample the data but select just every 10th observation so that my observations are equally spaced so that contours have equal resolution along their entire length.

I need to create a unique rowid for each of the 22 million observations. My solution so far has been to import the data as a CSV file then in another program create a vector from 1 to 22*10^6 and import to mysql and then perform a cartesian product join on the two but this is sucking my computing resources dry and it seems clumsy but it commands like "sequence" do not appear to be supported in mysql and I cannot seem to make heidi auto-increment a field from 1:end of table.

Is there an easier way to create and rowid, hidden or visible that can be directly queried? In fact it would be preferable to have the id hidden.

Thanks very much,
ansgar posted 10 years ago
I'm not 100% what you mean by rowid, but I have the feeling you want to numerate each record, preferrably automatically. If yes, you can let MySQL do that with the following query:
ansgar posted 10 years ago
Btw, that query can partly fix this bug: http://bugs.heidisql.com/1685659
fish posted 10 years ago
Thanks anse, it works great and is a lot better than my solution.

Oracle automatically creates a hidden field in tables called "rownum" which just goes from 1:end.of.table and can be queried directly. It is a very useful feature for example if you just want to see a particular row or say the first 10 rows in a table just to get a taste of what it is -- this is something I do often especially when browsing a new database:

select * from strangetable where rownum<10;

It would be interesting for me at least to have this automatically done in mysql and I suppose heidisql could fake it by having that field made on each import. Anyway, just something I think could be a useful feature but I am not sure anyone else would think so and it may take too much space and resource in queries.

Thanks very much again for your help and keep up the good work on HeidiSQL, I appreciate your efforts.
SQareLogic posted 10 years ago
"rowid" is Oracle speak for what MySQL calls auto_increment value. In Oracle, it is a pseudo column. In MySQL it is a standard data column of type INTEGER which is user-defined. It is common practice to use the auto_increment (or rowid field) as the primary key in SQL. If you define the type UNSIGNED INT(10) for this column you get around four billion unique identifiers which should be enough for your purposes. If not, you can make the auto_increment field part of a composite index or use BIGINT instead.

In Oracle, rowid is different from rownum. Unfortunately there is no equivalent for rownum in MySQL. To get the first 10 rows in MySQL you would use LIMIT instead:

SELECT * FROM strangetable LIMIT 10

to return the first 10 rows or:

SELECT * FROM strangetable LIMIT 10, 10

to retrieve the next ten rows (11-20), and so on.

Cheers, SQareLogic
fish posted 10 years ago
Thanks SQareLogic. Very useful stuff. I have used three databases (Oracle, SAS, MySQL) which I have accessed with SQL and each has a slightly different implementation. For instance the function "distinct" for "unique" threw me off for awhile, it is interesting to find different solutions to similar problems in each.


Please login to leave a reply, or register at first.