UML  Database Modeling

© Cincom Systems, Inc. 2007
All Rights Reserved

 

1.              Background

This paper proposes a database modeling language based on the Unified Modeling Language.

TheUnified Modeling Language (UML) is a standard language which was developed for object oriented software modeling.  See http://www.omg.org/uml/ .  It can be applied in a natural way for object oriented database modeling.  As the language is flexible it can also be used to model relational databases.  Several mechanisms are available for extending the language where necessary.

Several authors have proposed data modeling languages based on UML.  See:

http://www.agiledata.org/essays/umlDataModelingProfile.html
http://www.rational.com/media/whitepapers/tp162.pdf?SMSESSION=NO

This paper builds on the models proposed in the above work.  It defines in more detail the elements of the language, defines the use of constraints and ties together relational and object oriented concepts.  This language does not include SQL3 elements but could be extended to include these.  The paper assumes familiarity with the UML and the UML Object Constraint Language (OCL).  For an introduction see http://www.omg.org/gettingstarted/what_is_uml.htm .

The Database UML profile (DB-UML) is defined using UML extension mechanisms. Stereotypes are used to create subtypes of standard UML elements.  Standard methods and attributes are defined for elements.  For elements which do not have attributes or methods, standard tagged values are defined.

The language is useful for modeling object relational databases.  By ignoring the object oriented structures the language is also useful for modeling relational databases.

 

1.             Example Model

Throughout the paper we will use the following example of an auction database.  The database organizes a set of items for sale at an auction.  For each item a record of bids is kept.  The bid records identify the company submitting the bid and records the amount.  Also the auction’s seller is defined.  The table definitions might be as follows:

create schema Auction;

create table Auction.Company
(
   name String primary key
);

create table Auction.Item
(
   name String primary key
);

create table Auction.Auction
(
   saleDate DATE not null,
   saleItems sequence of Item,
   seller Company not null
);

create table Auction.Bid
(
   myItem String not null,
   amount Monetary not null,
   bidder String not null,
   constaint fk_myItem foreign key(myItem) references Item
   constaint fk_bidder foreign key(bidder) references Company
);

create view Auction.HighBidders
(
   name String,
   highAmount Monetary
)
as select name, amount from Auction.Company, Auction.Bid
where name = bidder and amount > 10,000;

 

The database shows both object oriented structures and relational structures.  The Bid table shows the use of relational structures as the column myItem is a foreign key to the Item table.  And the bidder column is a foreign key to the Company table.  The Auction table shows the use of object oriented structures.  The seller column is a reference to an instance (a row) in the Company table.  The saleItems column is an array of references to instances in the Item table.

The database also shows a view that derives data from several of the tables.  The view HighBidders shows companies that have bid over a certain amount in the auction.

A complete modeling diagram in DBUML for the elements and relationships follows.  Each of the elements and relationships is defined in later sections with an explanation of the properties that are relevant for database modeling.

2.             DBUML Language Elements

Database

A Database is the highest level modeling element.  It is used to organize database Schemas and to associate model elements with a physical database.

The stereotype <<Database>> when used as a UML Component defines a physical database.  A Database is represented by the following figure:

A Database has the following standard tagged values: driver, URL, user and password.  As a Database identifies a user and password it can be used to define a connection to a physical database.  

A Database can be associated with one or more Schema elements where the Schema elements have a dependency as a client.

Schema

Database schemas are typically used as an organization mechanism.  They are used to qualify tables and to facilitate security.   The implementation of schemas varies by database.  For example, some databases may use a table’s user as the schema for the table.  

The stereotype <<Schema>> when used as a UML Package defines a database schema.  A Schema is represented by the following figure:

A Schema can be associated with a Database element through a dependency.  A Schema can have and one or more Table or View elements as owned elements.

If the Schema name is the same as the associated Database name, the Schema is considered to be the “default” Schema.  The owned elements of a Default Schema have database catalog names unqualified by a Schema name.  If a Database does not support Schemas (the catalog does not support Schema qualified names) then only the default Schema may have an association to the Database.  If a Database does support Schemas then other Schema elements may have an association to the Database.  The owned elements of these Schemas have database catalog names qualified by the Schema names. 

Table

A table is a set of database rows with the same structure. 

The stereotype <<Table>> when used as a UML Class defines a database table.  A Table is represented by the following figure:

For the auction database example a diagram showing an Auction Schema and the Tables associated with the Schema might look like the following:

Table Columns

 

A column is the basic database organization unit used to store data.  Each row of a table is composed of columns.  Columns are modeled as class attributes. 

 

Not Null Constraint. 

Columns defined as “not null” are modeled using an invariant attribute constraint with the “null” method.  For example, in the auction database the Auction Table defines the seller column as “not null”.  Using the Object Constraint Language (OCL) this could be defined as:

 

            context Auction inv nullCon:
            null(seller.oclAsType(object)) <> 1

 

This defines an invariant constraint named nullCon on the Auction Table.  The constraint tests that the unique method does not return 1 for the seller attribute. 

 

Unique Constraint.  

Unique columns are modeled using an invariant attribute constraint.  The constraint uses the OCL function “isUnique”.  For example, in the auction database the Item Table defines the name column as unique.  Using OCL this could be defined as:

 

            context Item inv uniCon:
            isUnique(name.oclAsType(object)) = 1

 

This defines an invariant constraint named uniCon on the Item Table.  The constraint tests that the unique method returns 1 for the name attribute.  The name attribute is cast to object using the oclAsType method.

 

Primary Key Constraint. 

In SQL, a column defined to have the “not null” and “unique” constraints can simply be defined as having a “primary key” constraint..  A Column defined with a primary key constraint is modeled by an attribute with stereotype <<PK>>. 

 

If the SQL primary key constraint is named, the name is modeled with a tagged value “SQL_constraint”.

 

Foreign Key Constraint.  

A foreign key constraint is modeled as an association between the table with the foreign key and the table with the primary key.  Also, the foreign key column is modeled by an attribute with the stereotype <<FK>>. 

 

For example, in the auction database there is a association between the Bid Table and the Item Table.  The Bid Table has a foreign key myItem associated with the primary key name in the Item Table.  This is shown in the following diagram:

 

The association has the same name and stereotype as the foreign key attribute.  The association has the same end name and stereotype as the the primary key attribute. 

 

If the SQL foreign key constraint is named, the name is modeled with a tagged value “SQL_constraint”.  For  multi-column SQL foreign key constraints the key sequence number is modeled by the tagged value “KEY_SEQ”.

 

 

If a referential integrity constraint is defined for a foreign key then an invariant constraint is defined for the table.  For example, if a cascade delete action is defined for the Bid entries associated with an Item then a constraint would be defined as follows:

 

            context Bid inv del_Item:
            onDelete(fk_myItem, “cascade”) = 1

 

This defines an invariant constrant named del_Item on the Bid table.  The constraint tests that the onDelete method returns a 1 for the fk_myItem relationship with the “cascade” option.

 

 

Index.  

Constraints are also used to model database indices.  A database index is a physical structure used to improve performance.  An index defined over a column is modeled using an invariant attribute constraint with the “index” method.  For example, in the auction database we might have an index defined over the amount column in the Bid Table.  This would help performance when sorting results by amount.  Using OCL this could be defined as:

 

            context Bid inv indexCon:
            index(amount.oclAsType(object), 1) = 1

 

This defines an invariant constraint named indexCon on the Bid Table.  The constraint tests that the index method returns 1 for the amount attribute.  The second parameter in the index method defines the position of the attribute in a multiple attribute index.

 

Data Types

 

SQL-92 types are modeled as primitive datatypes.  The datatypes are organized in the package types.sql.  Types specific to a database can also be defined in new packages.

 

For object oriented databases, attributes can have a type defined by another Table.  This is modeled by a one-to-one association.   For example, in the Auction database the Auction Table has a reference to the Company Table defined as the seller attribute.

 

 

Object oriented databases can also define array attributes.  These are modeled by one-to-many associations.  For example, in the Action database, the Auction table has a sequence of references to the Item table defined in the saleItems attribute.

 

 

 

Default Values

 

Default values for columns are modeled as initial values for attributes.

 

Operations

Tables have the following standard operations:

·         Boolean null( object )  This is used to define a “not null” constraint on attributes.

·         Boolean index( object, int )  This is used to define an index constraint on attributes.

·         Boolean onUpdate( object, String )  This is used to define a referential integrity constraint for a Foreign Key..

·         Boolean onDelete( object, String )  This is used to define a referential integrity constraint for a Foreign Key..

Tables may also have user-defined methods which can be used to model database procedures and triggers.  Tables may be associated with Schemas as owned elements.

 

           

 

Views

 

A database view is a virtual table with the source of data being other tables or other views.  Like a table, a view defines columns and methods.  But it also defines how to derive these columns from other tables and views. 

The stereotype <<View>> when used as a UML Class defines a database view.  A View is represented by the following figure:

View columns are modeled as attributes.  The attributes can have the same types and properties as Table attributes except that stereotypes <<FK>> and <<PK>> can not be used on View attributes.  Also the constraints for uniqueness, not null and index do not apply to View attributes.

Views are dependent on the tables and views used to derive data.  This relationship is modeled as a dependency with stereotype <<Derive>>.  The dependency defines the query used to derive data.  The query is defined as a tagged value.

For example, in the auction database we might have a view that only shows companies that have bid a certain amount in the auction.  The View derives data from both the Bid Table and the Company Table.

Views are associated with Schemas as owned elements.

Generalization

Object Oriented databases allow the use of generalization so that a table can inherit the columns and procedures from another table.  Also generalization can be used with views.

For example, in the auction database we might want to use the Company table as a generalization.  We could define sub-tables of Company to represent companies specializing in building or in machine tools.

3.                 DBUML Diagrams

Database Deployment Diagrams

Deployment diagrams show the configuration of run time nodes and components that live on them.  They are used to model the static deployment of a system.  The diagrams address the distribution, delivery and installation parts of the physical system.

Database deployment diagrams show run time nodes and the databases deployed on them.  They also show the relationship of databases and schemas.  For example, the following database deployment diagram shows the demo Database and two associated Schemas.  The Auction Schema is shown and the default Schema is shown.

Database Schema Diagrams

Database Schema diagrams are similar to class diagrams.  Schema diagrams show schemas, tables and views and their relationships.  .