database menu

Databases from scratch II:
Simple Database Design

Have you ever noticed advertisements in the job classifieds for ‘database architects’ or ‘database administrators’? It makes database design sound pretty intimidating, doesn’t it? If they’re so difficult to use that they require architects and administrators, what hope do we mere mortals have?

In fact, databases range from being ultra-simple to use to ultra-complex. In the world of personal computers, there are two main types of database programs. At the simple end of the scale are flat-file databases, also called single-file or list managers. These programs are as easy to learn as a word processor or spreadsheet. The initial concepts may take a little more time to absorb than word processing, but they’re well within anyone’s grasp. If you’re using a database at home, in a class at school or in a small club or organisation, chances are the simple flat-file database will fill your needs.

Relational databases

Things can get more complex when you use the other type of PC database program, called a relational database. With a relational database program you can create a range of databases, from flat-file structures to demanding multi-file systems. If you’re using a database in your small business, a large organisation, or an ambitious school project, you’re likely to need at least some of the features of a more complex relational database.

Whichever type of database program you use, the most crucial step in using it is to design your database structure carefully. The way you structure your data will affect every other action. It will determine how easy it is to enter information into the database; how well the database will trap inconsistencies and exclude duplicate records; and how flexibly you will be able to get information out of the database.

A simple example

Let’s take an ultra simple example: the phonebook. Say you’ve been given the job of placing your school or organisation’s phone book on computer.

It should be easy: all you need is the name, the address and the phone number of each person. Your initial table design thus consists of three fields: name, address and phone number. Right?

Let’s check.

Testing your design

The way to see if your database design works is to test it with some sample data, so feed the following records into your hypothetical table:

J. T. Apples, 100 Megalong Dr Haberfield, 4992122
B. York, 2/53 Alice Leichhardt, 5050011
M. R. Sullivan, 9 Jay Leichhardt, 4893892
B. J. Anderson, 71 Wally Rd Glebe, 2298310

Now tell the database program to sort the information:

B. J. Anderson, 71 Wally Rd Glebe, 2298310
B. York, 2/53 Alice Leichhardt, 5050011
J. T. Apples, 100 Megalong Dr Haberfield, 4992122
M. R. Sullivan, 9 Jay Leichhardt, 4893892

Revising your design

Immediately, you can see this is not what you want. You want the table sorted alphabetically by last name, not initials.

How can you fix this? Well, you could do some complex manipulation using a database feature called ‘string functions’ – if your program supports such a feature. Or, you could come up with a better table design in the first place: last name, initials, address and phone number. Feed your test data into this revised structure then tell the database program to sort your table using the last name followed by the initials. This time, you’ll get the correct alphabetical listing:

Anderson, B. J., 71 Wally Rd Glebe, 2298310
Apples, J. T., 100 Megalong Dr Haberfield, 4992122
Sullivan, M. R., 9 Jay Leichhardt, 4893892
York, B., 2/53 Alice Leichhardt, 5050011

Keep on refining

Don’t stop there. The table can be even more effective if you break the structure down further. For instance, if you’d like to have an easy way to list only those people who live in Leichhardt, this design won’t help you. But with a little more work, you can break your database structure down further into last name, initials, street address, suburb, and phone number.

With this structure, you’ll be able to sort your database alphabetically by last name or by suburb, and you’ll be able to pluck out all those people who live in a particular suburb.

Take a look at this process of table refining in action.

More is less

Notice how creating an efficient table structure consists of breaking down your fields into simpler and simpler components? You end up with a table with many more fields than you might originally have thought necessary, but each of those fields houses much more basic information.

There’s a technical term for this process: normalisation. If you wanted to become a database architect you’d have to become conversant with normalisation and functional dependencies and normal forms. If you’re happy to remain a normal human, you can safely ignore these terms, provided you keep in mind that your task is to create a database structure that provides an efficient store for your information and that makes it flexible and easy to extract useful information.

In some ways, creating a database that’s effective and simple to use is almost an anti-intuitive process. For example, our initial structure:

phone number

seems like it’s a simpler design than our end result:

first name
last name
street address
phone number

Creating useful fields

What you need to remember is that while the structure might look more complex, the contents of each field have been reduced to the simplest useful components. I say “useful” because we could, of course, break each field down further. For instance, we could break the street address field into two fields, street number and street name. But what would be the point? There’s no need to extract or sort information in the database simply by street number or name, and so it’s not a useful basis for breaking up the field. On the other hand, if we wanted to deal with multi-line addresses which are common for businesses, such as:

Suite 5
122 Jones Street

then it makes sense to break the address field down into two simpler fields, address line 1 and address line 2. You’re not likely to want to sort information based on only one of these fields, nor are you likely to use either of these fields in isolation. What is likely is that you’ll want to have an easy way to print address line 1 and address line 2 as separate lines when addressing envelopes. So this field division becomes useful when getting information out of your database.

Permissable design infractions

I should add that creating these address sub-fields is not, technically, a good solution. Notice how with this table design, many records will have nothing in the address line 2 field, as most people’s address consists of one line, rather than two. So you’ll be wasting a lot of space in your database. Additionally, what about addresses that require more than two lines, such as:

Suite 5
Level III, Building A20
122 Jones Street

Our new table structure can’t cope with this. Should we add another address line to ensure we cater to the infrequent address that needs three lines? Should we add a fourth line just in case…?

You can see the problems you can create by not getting the design right. A technically rigorous solution is to remove the address lines from our phonebook table altogether, and stick them in an address table, that we then link to the phonebook table by a common field.

However, there’s no need to fuss too much about such details. Many databases get by with minor infractions of database design rules, and you shouldn’t feel hampered by such rules provided your table structures:

Computer-less design

One thing I hope you’ve noticed is that we’ve done all our design without the aid of a computer. This is as it should be: it lets you focus on the significance of the task without the distractions of trying to learn a database program at the same time.

You can design and test your database structure without going near a computer. The only thing you really need to know is the type of database program you’ll use: if it’s a flat-file database, such as Microsoft Works, you’ll be limited to single-table database design. If it’s a relational program, such as FileMaker’s FileMaker Pro, Microsoft Access or Lotus Approach, you can design single- or multi-table databases.

In the next article in this series, we’ll move on to relational design. You’ve seen how breaking down your fields into simpler components in a single table can help make it easier to get useful information out of your database. When we look at relational design, you’ll discover how extending this process lets you address the other two goals of database design: eliminating redundant information and excluding inconsistencies.

top home database menu