Database Design and SQL. Queries

SQL tutorial

Previous page Previous     TOC Contents     Next pageNext

Lesson 4 - Queries

The database

Before going any further, please make sure that the ProjectMgt database you are working with matches the model we created initially. You may have experimented with the tables and the columns in the previous lesson and that is perfectly OK! But before going on to the query statements, it is recommended that you consult Fig. 2-8 in Lesson 2 and match your database to that model. Don't worry about primary keys and relationships and so on at this point. We'll take care of that later. But do enter some meaningful data in the tables so that your queries will have something to display when you run them (15 or 20 records in each table should be enough).

When you input data into the tables, if you haven't created the relationships in Access, try to maintain referential integrity. That is: when you assign a department number to an employee, that department number should already exist in the Department table. When you create a Timesheet record, the employee number should exist in the Employee table and the project number should exist in the Project table.

You may use SQL statements to change the database or you may do it with Access. If you're really, sorry, really busy, you can download the database from the Download area after the last lesson.

Import from Access

Since we've already built the Project management database in Access, it seems a shame to waste all that work.

Fortunately, DBManager has a Wizard to convert the Access database into MySQL.

Creating Queries

As we mentionned earlier, it is quite probable that 95% of your work with SQL will consist of questions to the database. If the database structure is well-built and the information has been input, any question can be answered, no matter how tricky. "How many widgets were bought by women aged between 25 and 30 on Tuesdays in months ending in R over the past 5 years?" Give us 5 minutes and we'll build a query that will answer that for you. That's called an ad hoc query, which means "as needed" rather than one which has to be planned and programmed in advance. It can impress the hell out of the Boss or the Sales Manager! Hey! before you know it we'll be as good at this stuff as the guys who do the baseball stats on TV. "Yes Frank, it's amazing that this guy hit 255 when batting left-handed against right-handed pitchers in night games when the moon was full and the temperature was over 75 degrees and there was a light breeze from the west!"

The only statement needed to build a query is the SELECT statement.

The basic syntax of the SELECT statement is:
SELECT column1, column2, ...
      FROM table_name1, table_name2, ...
      WHERE condition;

The SELECT clause lets you specify which columns to display (they may be table columns or they may be calculated from the data in other columns). The FROM clause lets you specify the table or tables from which the data will be obtained. Note that the standard SELECT statement allows you to get the data from as many tables as you need. If you have to access the Employee table and the Timesheet table to build the query, you can do it. If you have to access 15 tables, you can do it. But that's a lot more involved and we'll leave it for another day, more specifically, Lesson 7. For the next few lessons we'll master the SELECT statement to access any information we need in one table at a time. Finally, the WHERE clause (see below) will determine which records, also refered to as rows, will be selected.

Here are some examples of the SELECT in action:

             Fig. 4-1

Instead of listing the columns, use the * to mean 'All the columns'

And the result is:

             Fig. 4-2

Or display only certain columns:

             Fig. 4-3

and the result is:

             Fig. 4-4

To get data from the Employee table:

             Fig. 4-5

from which we get:

             Fig. 4-6

Same query, different look:

             Fig. 4-7

The 'AS' clause allows you to display a column heading that is more representative
than the field name usually displayed by the query. Compare with Fig. 4.6.

             Fig. 4-8

This is what it look like with the Query Editor


As stated previously, the WHERE clause is in fact an IF statement. If a record returns TRUE to the WHERE clause, it is selected to be displayed.

If the table contains 10,000 records, or rows, you may wish to see only a few or even only one. In that case you would specify the condition as "... WHERE primary_key_column = 'value' ...".

The WHERE clause uses the usual operators to build the condition:

=          >          <          >=          <=           <>    or     !=

and a few you may not be as familiar with but which we'll see in the examples:

BETWEEN          LIKE          IN          NOT

For the next examples, suppose we have a new table called Products. Note that we can create this table in the ProjectMgt database, even though it ha absolutely nothing to do with the application. We have to put the table somewhere and that's as good a place as any. It's important to understand that the tables have no relationships between each other until we define those relationships. If we want to create a table to be used on it's own and then drop it when we're done, there is no problem with that.


Fig. 4-9


SELECT * FROM Products
         WHERE ProdNum = 'A1234';

SELECT ProdNum, ProdName, SellPrice
         FROM Products
                  WHERE SellPrice > 50;

SELECT ProdNum, SellPrice, (SellPrice * 1.1)
         FROM Products;

we can display a calculated column, in this case, what a 10% price increase would look like

use the usual arithmetic operators:

+           -          *           /           ^           (   )

There is a common misconception about calculated columns in the SELECT statement - people think that the calculation will somehow change the data in the table. That is impossible. The SELECT statement is strictly a display statement. Any calculations done are read-only. There is no way that a SELECT can modify a table. The only statements that can do that are the ones we looked at in the previous lesson: INSERT, UPDATE and DELETE.

SELECT ProdNum, SellPrice, Cost, (SellPrice - Cost) AS [Profit]
     FROM Products
                  WHERE ProdNum LIKE 'A*';

we can display the calculated column with an appropriate title, for all products whose number starts with 'A'.

* and ? are the widcard characters

* = character string (any number of characters)

? = 1 character

SELECT ProdNum, ProdName
         FROM Products
                  WHERE ProdNum LIKE "A?5??";

SELECT ProdNum, ProdName, SellPrice
         FROM Products
                  WHERE SellPrice BETWEEN 50 AND 150;

could also be written as >= 50 AND <= 150

SELECT ProdNum, ProdName
         FROM Products
                  WHERE ProdName LIKE "*general*";

display if the name contains the string 'general'

SELECT ProdNum, ProdName
         FROM Products
                  WHERE ProdNum IN ('A100', 'A200', 'B500', 'D800');

if the product number is one of those named

AND and OR are used like in all other languages:

SELECT * FROM Products
         WHERE ProdName LIKE "A*" AND SellPrice > 500;

SELECT * FROM Products
         WHERE (SellPrice - Cost) < 10 OR (SellPrice - Cost) > 500;

display the low-profit and the high-profit items

Working with dates

Whenever you develop a commercial application, there is absolutely no way that you can get by without using date fields. There are Birth dates, Hire dates, Delivery dates, Order dates, and so on, and so on ....

In ancient times, like 20 years ago, dates were stored as strings and we all remember what that brought about in 1999. Now all DBMSs handle dates in a Date/Time format, which makes our lives a lot simpler, but we have to be aware of the particular properties of Date formats.

To begin with, know that you can do calculations with dates as you do with numbers.

#2001-01-31# - #2001-01-01# will return 30, the number of days between the 2 dates.

#2001-01-01# + 3 will return #2001-01-04# because a numeric constant is always taken to mean days.

When using the comparison operators, > #date1# is taken to mean later than or after and < #date1# is taken to mean earlier than or before.

In the WHERE ... clause, ... BETWEEN #date1# AND #date2# sets a date between date1 and date2, inclusive.

To work with date fields in SQL, we'll use the Date and Time functions that Access supplies. Note that those functions are available in just about every environment that supports SQL.

The main functions: NOW( ) and DATE( ) return the current date. The difference between the two is that NOW( ) returns date and time, at this moment, and DATE( ) returns only the current date.

In Access, a date or time constant must be identified with # ... #, as in:
... WHERE p_startdate = #2001-01-01#;

Date formats
If you intend to do e-commerce in the global village, you have to understand that different folks have different ways of doing things.

For example, if you are American and you tell your French girlfriend, the love of your life, that you'll meet her under the Eiffel tower on 01/02/03, there is a good chance that you'll never see her again. To you it is obvious that you specified the date as January 2nd, 2003. In France, as in other French areas, like Quebec, the date is understood to be the 1st of February, 2003. In your case, it may work out. If you straighten out the misunderstanding in time, you go back a month later and she's waiting for you. Good luck!

To avoid problems, get used to using the ANSI international standard date format: yyyy-mm-dd, as: 2003-01-02. Note the use of the 4-digit year. Remember all that anguish we went through in 1999 with the 2-digit 00 year? We don't want that to happen again. Also, note that the separator is the dash character - , and not the slash /.

To set the date format, go through the Windows Control Panel, Regional settings. Since SQL and Access get their formatting from Windows, the format will be selected automatically.

In Access and SQL, one of the most useful functions is called: DateDiff( )

DateDiff('interval', #date1#, #date2#) returns the time difference between date1 and date2, expressed in interval units which could be: days, months, years, weeks or hours.

The interval is specified as: 'd' for days, 'w' for weeks, 'm' for months and 'yyyy' for years.

For example:

Datediff('d', #2001-01-01#, now()) returns the number of days between January 1st and today.

Datediff('m', p_StartDate, p_EndDate) returns the length of the project, in months.

If the result displays too many numbers after the decimal, use the ROUND(number, digits) function to display the number rounded to 'digits' positions after the decimal:
ROUND(Datediff('m', p_StartDate, p_EndDate), 2).

In theory, Datediff('yyyy', e_BirthDate, now()) returns the employee's age, expressed in years. In practice however, you will find that it works or doesn't work depending on whether the employee has had his birthday yet this year or not.

To calculate the exact age, use the following formula:

INT(Datediff('d', e_BirthDate, now())/365.25)

Calculate the number of days and divide by the exact number of days in a year, which, as you know, is 365.25 and not 365. That takes leap years into account.
The INT( ) function truncates the result so that 25.9 becomes 25, for example; the employee is 25 years old until the day she turns 26; after the age of 5, you rarely hear people say that they are 25 and a half years old.

When working with age, remember that you can often use Date-of-birth directly, without doing the age calculation. Don't forget that the smallest date refers to the oldest person.

Eliminating duplicates

To close out this section on SELECTs, we'll look at how to eliminate duplicate lines from query results.

For example, suppose we want to see the list of countries where we have projects. If we do this:

SELECT p_Country FROM Projects;

we get all the countries for all the projects; if there are 5 projects in Canada, "Canada" will appear 5 times in the list.

If we want only the different countries, we can do this:

SELECT DISTINCT p_Country FROM Projects;

where the DISTINCT clause will list only different occurences; if an item is selected more than once, only one will appear.

If you haven't found the database resource you're looking for,
use our Google Search box for more information.



Home | Tutorials | Contact | Sitemap | Add URL | Privacy policy