I'm just starting to learn about database stuff, playing around with PostgreSQL (and spgsql). So, if I have a data structure that contains an ordered list of another data structure, what's the best way to design the tables? I can think of at least two ways to do it, but I'm not sure I like either: add ordinal numbers to the second table, or store arrays in the first table. In the former, I'd have to renumber if I want to insert something in the middle of a list; in the latter, I'd have to generate a synthetic primary key to use in the arrays. Which I guess isn't so bad, but I don't know how to do this atomically in SQL. Any hints? Is there some other standard idiom for ordered lists? Maybe I need a good online tutorial for database design, any pointers?

From: [identity profile] ahkond.livejournal.com


I might use ordinal numbers in the second table because then you get the benefits of table implementation for the second structure. If you put arrays in the first table there's no quickie way to ensure that they're all set up right.

Another possibility that still lets you put the second structure in a table is to have each row in that second table have a field containing the ID value(s) of the NEXT such instance. That is, each row in the second table acts like a member of a linked list.

The second table would be like this:

unique ID
foreign key field pointing to table 1 for the big grouping
foreign key pointing to table 2 to reference the next entry after this one
(maybe ditto for previous entry)
data a
data b
data c
etc.

In this case, if you need to insert new members, all you have to do is update some rows, like stitching something into the middle of a linked list.

A foreign key that points from a table to itself is no big deal. Typically you leave it NULL to indicate the beginning and end of the list; then you can easily find the beginning of the list by querying for that.

The downside is that it's hard to say "give me all the rows in table 2 that belong to element x of table 1 ... IN ORDER". Instead you have to select them all and put them into a linked list in your client program (if such a thing exists), and then traverse the linked list outside the DB. If you need to go through the list in order as part of a data query then ordinals would probably be easier.

Dealing with ordinals isn't all that bad because you can say "update FOO set ordinalfield = ordinalfield + 1 where ordinalfield > (some breakpoint)" to shift everybody up one (above a certain point) to make a gap in the middle.


From: [identity profile] dougo.livejournal.com


Huh, cute. There is something appealing about using a linked list and mapping it directly to a Scheme list, but I think that would probably end up being more trouble than it's worth. Sounds like ordinals will be less painful than I was thinking.

Do I still need that unique ID, though? The combination of the ordinal plus the foreign key back-pointer will do as a primary key, but I don't expect to have any references into this table anyway since the association has been inverted.

From: [identity profile] ahkond.livejournal.com


In that case you probably don't need the unique ID. For me it's just become a habit.

From: [identity profile] dougo.livejournal.com


By the way, I guess it wasn't clear, I was intending to put arrays of foreign keys into the first table, not arrays with actual data.

From: [identity profile] ahkond.livejournal.com


Ugh! Let the database deal with the foreign keys using its own foreign key machinery. I don't know about PostgreSQL, but the stuff I use at work lets you declare foreign keys as such and it will enforce them for you. If you hide them in arrays the system won't verify them for you, no?

Besides the question of enforcing data integrity, the database might be smart enough to use real foreign keys to help it optimize queries.

In other words, if your data has relationships, try to embody them using the machinery of the database system (indexes, keys, etc.) and your DB will perform better.

From: [identity profile] dougo.livejournal.com


Yeah, I was thinking I could tell it that the array had foreign keys, but I guess you can only tell it whether a whole column is a single foreign key. So, dumb idea.
.

Most Popular Tags

Powered by Dreamwidth Studios

Style Credit

Expand Cut Tags

No cut tags