Database Design SQL. Introduction to data modeling

SQL tutorial

Previous page Previous      TOC Contents      Next pageNext

Lesson 1 - Introduction to data modeling

A short history lesson

Once upon a time all files were stored on magnetic tape and all access was sequential. Then came the disk drive and random access and there was joy in the land! But you know, because your parents told you, that too much of a good thing is bad for you. And so it is with random access - unless you organize your data it will be so randomized that you'll never find it again.

So, somebody came up with a way to store large amounts of data in such a way that it could be updated and retrieved when needed. They called the structure containing the data a database and the programs doing all the administrative work of handling the data a Database Management System (DBMS).

The first databases were modeled upon COBOL data structures (in those days every programmer was a COBOL programmer) and were called hierarchical because of the way in which the data are structured. Eventually they improved upon the first model and came up with a network model, which has absolutely nothing to do with Novell or NT, but describes the way the data elements relate to one another. Some of those large databases are still in use today in legacy applications all over the place.

In the early 70's Dr. E.F.Codd, who happened to be a mathematician rather than a programmer, came up with a new model he called relational. This relational model, built on the mathematics of set theory, was powerful, flexible and easy to use. But it turned out to be such a hog for disk space and processing time that it wasn't really a viable alternative to the previous models. It wasn't until hardware performance improved in the late 70's that the model started gaining acceptance.

By the mid 80's, cheap PC's with ever-increasing capabilities made it possible to develop small versions of relational databases. It was Oracle Corp. that really put relational database development on the map and today Oracle is still the leader in the field.


Starting the design process

  • Step 1: Feasibility study
    • Is the project too small? Too big? Technically feasible?

    • What are the costs involved - for development, for maintenance?
      Is it cost-effective - are the savings greater than the costs?

    • Is the timeframe realistic? Can it be done in the time alloted?

  • Step 2: Detailed analysis
    • Start with the output and work backward. What output does the client want to see? What input will be required to produce that output?
      Describe the screens, the reports, the results that will be produced.

    • Study the existing system (there's always something, even if it's done with quill and ink).
      Determine what must be kept, what must be changed and what must be scrapped.

    • Keep this in mind: Any system you deliver must perform at least as well as the system the client is using now. Although that sounds simple enough, I could come up with a pile of cases where an improved system took twice as long to do the work and cost twice as much to operate as the inadequate system it replaced.

  • Step 3: Data modeling
    • Create a model (drawing, graphic representation, schema) of the data. Use a pencil and paper if you have to or, preferably, a software modeling tool. This is the equivalent of blueprints for a house. It does not require much effort to add or remove things from a drawing. It is a lot harder to do once the house is built or the database is coded.

    • The model is created with the help of the client. The client knows what needs to be done although he may not know how it will be done - that's your job. Always keep the client involved at the design stage.


Data modeling

  • Definitions
    • Entity: an object, a thing in the system about which data is kept - equivalent to a file - it will be implemented as a table in the database.

    • Attribute: an item of data refering to an entity - equivalent to a field - it will be implemented as a column in a table in the database.

    • Primary key: the attribute (or combination of attributes) that uniquely identifies every occurence of an entity.

    • Relationship: the way entities link to one another

  • Examples

    Entity  Student,
    RegisteredStudents in "School" application
    Customer in "Billing" application
    Employee in "Payroll" application 
    Attribute  Student_Id,
    Student_Major for "Student" entity
    Cust_Ship_to_Address for "Customer" entity
    Employee_Salary for "Employee" entity 
    Primary key  Student_Id for "Student" entity
    Class_Number for "Class" entity
    Student_Id + Class_Number for "RegisteredStudents" entity 
    Relationship  Professor teaches Class
    Student registered in Class
    Customer orders Product 

  • Graphical representation
    • The main tool in the modeling process is called an Entity-Relationship diagram or E-R diagram for short. It shows all the components we have been discussing:Entities, Attributes of Entities, Key attributes of Entities and Relationships between Entities.

    • There is one symbol that appears in the diagram that we haven't yet discussed: the line with a crow's foot at the end.
      The line tells us that the entities are related and the ends of the line describe the degrees of relationship, also called cardinality. Degrees identify how many occurences of one entity are related to how many occurences of another entity. Degrees are expressed in one of 3 ways:
      • One-to-one
      • One-to-many
      • Many-to-many

      For example: the Student <--> Class relationship is many to many - a given student (an occurence of the Student entity) may take many classes (an occurence of the Class entity) and each class may contain many students. That is shown on the diagram by a crow's foot at each end of the line. The o with the crow's foot says that a student may be signed-up for no classes (a football player?) and a given class may have no students (not offered this term).

      The Class <--> Professor relationship is one to many: a given professor may teach zero or many classes but each class must have one and only one professor.

      If you were told that each Professor only teaches one Class and that each Class only has one Professor, you would be looking at a one to one relationship.

      It is very important to describe the degrees of relationships accurately when you do the preliminary design. The client will not say: "There is a one to many relationship between Class and Professor". He'll tell you, if you bother to ask: "In this School, every class only has one Professor assigned to it". In another school, you may hear: "We're very proud of our team-teaching approach. A class may be taught by several Professors working together." There you're looking at a many to many relationship and you will have to implement the database accordingly.

If you haven't found the database resource you're looking for,
use our Google Search box for more information.



Home | Tutorials | Contact | Sitemap | Add URL | Privacy policy