dougo | Database newbie question

You're viewing

dougo's journal
Create a Dreamwidth Account Learn More

Reload page in style: site light

I'm just starting to learn about database stuff, playing around with PostgreSQL (and spgsql). So, if I have a data structure that contains an ordered list of another data structure, what's the best way to design the tables? I can think of at least two ways to do it, but I'm not sure I like either: add ordinal numbers to the second table, or store arrays in the first table. In the former, I'd have to renumber if I want to insert something in the middle of a list; in the latter, I'd have to generate a synthetic primary key to use in the arrays. Which I guess isn't so bad, but I don't know how to do this atomically in SQL. Any hints? Is there some other standard idiom for ordered lists? Maybe I need a good online tutorial for database design, any pointers?

Current Music: Jesus Lizard, "Pop Song"

Flat | Top-Level Comments Only

Date: 2004-12-28 04:29 am (UTC)

From:

ahkond.livejournal.com

no subject

I might use ordinal numbers in the second table because then you get the benefits of table implementation for the second structure. If you put arrays in the first table there's no quickie way to ensure that they're all set up right.

Another possibility that still lets you put the second structure in a table is to have each row in that second table have a field containing the ID value(s) of the NEXT such instance. That is, each row in the second table acts like a member of a linked list.

The second table would be like this:

unique ID
foreign key field pointing to table 1 for the big grouping
foreign key pointing to table 2 to reference the next entry after this one
(maybe ditto for previous entry)
data a
data b
data c
etc.

In this case, if you need to insert new members, all you have to do is update some rows, like stitching something into the middle of a linked list.

A foreign key that points from a table to itself is no big deal. Typically you leave it NULL to indicate the beginning and end of the list; then you can easily find the beginning of the list by querying for that.

The downside is that it's hard to say "give me all the rows in table 2 that belong to element x of table 1 ... IN ORDER". Instead you have to select them all and put them into a linked list in your client program (if such a thing exists), and then traverse the linked list outside the DB. If you need to go through the list in order as part of a data query then ordinals would probably be easier.

Dealing with ordinals isn't all that bad because you can say "update FOO set ordinalfield = ordinalfield + 1 where ordinalfield > (some breakpoint)" to shift everybody up one (above a certain point) to make a gap in the middle.

Date: 2004-12-28 04:59 am (UTC)

From:

dougo.livejournal.com

no subject

Huh, cute. There is something appealing about using a linked list and mapping it directly to a Scheme list, but I think that would probably end up being more trouble than it's worth. Sounds like ordinals will be less painful than I was thinking.

Do I still need that unique ID, though? The combination of the ordinal plus the foreign key back-pointer will do as a primary key, but I don't expect to have any references into this table anyway since the association has been inverted.

Date: 2004-12-28 01:59 pm (UTC)

From:

ahkond.livejournal.com

no subject

In that case you probably don't need the unique ID. For me it's just become a habit.

Date: 2004-12-28 05:04 am (UTC)

From:

dougo.livejournal.com

no subject

By the way, I guess it wasn't clear, I was intending to put arrays of foreign keys into the first table, not arrays with actual data.

Date: 2004-12-28 02:03 pm (UTC)

From:

ahkond.livejournal.com

no subject

Ugh! Let the database deal with the foreign keys using its own foreign key machinery. I don't know about PostgreSQL, but the stuff I use at work lets you declare foreign keys as such and it will enforce them for you. If you hide them in arrays the system won't verify them for you, no?

Besides the question of enforcing data integrity, the database might be smart enough to use real foreign keys to help it optimize queries.

In other words, if your data has relationships, try to embody them using the machinery of the database system (indexes, keys, etc.) and your DB will perform better.

Date: 2004-12-28 04:57 pm (UTC)

From:

dougo.livejournal.com

no subject

Yeah, I was thinking I could tell it that the array had foreign keys, but I guess you can only tell it whether a whole column is a single foreign key. So, dumb idea.

Flat | Top-Level Comments Only

Profile

dougo

The Steak Place

Page Summary

ahkond.livejournal.com - (no subject)

Style Credit

Style: Black for Negatives by phoenix

Expand Cut Tags

No cut tags

Doug Orleans's Journal

Database newbie question

no subject

no subject

no subject

no subject

no subject

no subject

Profile

Most Popular Tags

Page Summary

Style Credit

Expand Cut Tags