Monday, October 2, 2006

How to Reverse Engineer a Database with Microsoft Visio

What do we study in Information Systems Grad School? If there's a single topic that comes up in every class, it's databases. If we haven't memorized the first three normal forms by now, we haven't learned much. While few of us will bother going into Boyce-Codd Normal Form, 4NF and 5NF, every specification for a system we write that has a database needs an entity relationship diagram. An ERD is a visual representation of your data model, and your data model is probably the single most important part of any system you design. A good data model will survive several major versions of your software; a poor data model will make your system useless. Thus, we spend a lot of time doing data models and documenting them with ERDs.

As much as I love Visio, drawing the things from scratch is somewhat tedious. It's much easier to design and test in Access. (I have it on good authority that even elite Oracle DBAs who hand-tune Solaris for better performance will design and test in Access just because it's easy.) So what do you do when you have a decent test DB in Access and you don't want to diagram every little change in your masterful Visio ERD? Reverse engineer.

In Visio, it's fairly easy, but there are a couple of spots where it doesn't behave as nicely as it should. I'm going to refer to Visio 2007, still in Beta and free for the download and registration. Visio 2003 is almost the same. Visio 2003 Enterprise Architect Edition will create the database from your diagram, in case you can design an enterprise DB but don't know how to create the tables in SQL. (Not really someone you'd want touching your SQL server.)

1. Open Visio and select New | Software & Database | Database Model Diagram with the units of your choice.
2. Now that you have a database model diagram open, the database menu will appear. Select "Reverse Engineer" off of the database diagram.
3. A confusing dialog box will appear. Use it to verify you have the right drivers installed.
4. For Microsoft Access, choose Microsoft Access as your driver, and hit Next.
5. A username and password dialog box will pop up. Unless you've assigned a username and pw to the database, leave it blank and hit OK.
6. Navigate your filesystem and select your database. Ignore the clunkiness and be grateful that you can see filenames longer than 8.3.
7. When you have found your .MDB file, choose it and hit OK.
8. Select the types of objects you would like to import and hit Next. (No, you don't get stored procedures and triggers in Access, but you would in SQL.)
9. Select the specific tables, queries, etc. you want to see in your diagram and hit Next.
10. Select Yes to add them to your current diagram, and hit Finish. (Select no you have a lot of tables, queries, etc.)
11. You should see your tables in the diagram.
12. To add the crow's feet and cardinality, select options on the database menu. (Database | Options | Document )
13. You get three sections to change here: The General tab covers symbol sets: IDEF1X or Relational, Conceptual, Physical, both, or names based on symbol set. The table tab lets you display keys, indexes, non keys, and the IDEF1X optionality 0. The relationship tab lets you display relationships (duh), crow's feet, cardinality, and referential actions. You must select cardinality before you select crow's feet. (caridinality gets greyed out when crow's feet is checked.)
14. To update your diagram, select "Refresh Model" on the Database menu...

Now that you can see your information model, you know why it's messed up. That database that your business/organization/department runs on -- it's not in any kind of normal form. Or it has about 100 tables more than you thought it should.

The database people in my office have a debate: were the software engineers just trying to make it impossible to wean your organization off of their support, or were they just bad at information modeling?