I'm just starting to learn about database stuff, playing around with PostgreSQL (and spgsql). So, if I have a data structure that contains an ordered list of another data structure, what's the best way to design the tables? I can think of at least two ways to do it, but I'm not sure I like either: add ordinal numbers to the second table, or store arrays in the first table. In the former, I'd have to renumber if I want to insert something in the middle of a list; in the latter, I'd have to generate a synthetic primary key to use in the arrays. Which I guess isn't so bad, but I don't know how to do this atomically in SQL. Any hints? Is there some other standard idiom for ordered lists? Maybe I need a good online tutorial for database design, any pointers?

From: [identity profile] rikchik.livejournal.com


Also, good (non-mySQL) DBMSes allow you to create arbitrarily complex atomic actions. I don't know if PostgreSQL has this ability, but I'd expect it to.

From: [identity profile] dougo.livejournal.com


Well, yeah, I guess I could make a stored procedure that does nextID++ atomically, but I was wondering if there was some mechanism that already did this. I'm sort of surprised/annoyed that rows don't already have unique identities, but I guess that'd just be an OODB. (Which is maybe what I really want, hm.)

From: [identity profile] ahkond.livejournal.com


In some database systems you can declare a field using a special data type which is effectively a system-generated unique ID. When you insert rows into the table you omit that field from the list of values you provide, and the system generates the unique ID using a seed/increment system. Microsoft SQL Server and Access both have this, for instance. But I've found that they're a lot of trouble (in terms of managing things six months later when we have to change the table because of a new feature) and I only use them as a last resort.

This is generally not there by default because the philosophy of a relational database is that the rows of a table are inherently un-ordered. They're just instances of a class-like sort of thing. For example, states of the union could be ordered by name, by population, by date they joined the union. No particular ordering is inherently the default ordering. So if you set up a table for "States" with those fields and started typing, the DB is free to store them in whatever order it likes and if you select * from the States table you won't necessarily get them back in the order you typed. You could say "ORDER BY name" or "ORDER BY area" and that would work, but how it works is the DB's business.

You can use unique ID numbers for setting up incoming foreign keys but these should be inherently meaningless, just placeholders for the relationship. The database generally makes no guarantees about what order rows are stored in a table unless you create a clustered (sorted) index.

From: [identity profile] dougo.livejournal.com


I understand that rows (and columns) are unsorted, that's why I have to deal with ordinals in the first place. But generating unique IDs is independent of ordering-- I don't care if the IDs are in order, or even consecutive, I just need to guarantee they're unique, even if two INSERTs happen concurrently.

After a little browsing around, I see that PostgreSQL does in fact have this, in the form of CREATE SEQUENCE and sequence functions. Good to know, but I think I don't need it yet.

Anyway, thanks for all your advice.

From: [identity profile] dougo.livejournal.com


Turns out it's even easier: PostgreSQL has a SERIAL column keyword that will handle the auto-increment automatically whenever a row is inserted with that column being DEFAULT (or no value). Cool.
.

Most Popular Tags

Powered by Dreamwidth Studios

Style Credit

Expand Cut Tags

No cut tags