Database Design and SQL. More queries

SQL tutorial

Previous page Previous     TOC Contents     Next pageNext

Lesson 5 - More queries

The database

Our organization,, runs development projects in foreign countries.

To actually operate the projects, FAM calls upon companies from the private sector.

Each project has a name, a project chief (or Head), a start and end date and an amount of money budgeted to complete the project.

For every project, companies are invited to submit bids. They must specify how much they will charge to do the project. FAM will compare the bids and will select a company from the list of bidders.

In order to submit a bid, a company must be registered with FAM. It must provide details such as: name of CEO, number of employees and date of creation of the company. City and state in which the company HQ is located are also important because, as you may guess, there will be politics involved.

The purpose of the database is to track the bids. There are hundreds of projects ongoing or planned at any given time and there are dozens of companies bidding on those projects. Just tracking all the bids is a big job.
FAM wants to be able to see which companies bid on which projects, how much the bids differ from the budget amounts, how qualified are the companies in terms of experience and size, what are the project start and end dates for all the various projects, etc.

Download sample database

You can download the sample Bids management database from the Download Area

The database is created in MySQL and MySQL Query Browser is used to execute the managers' requests.

Which companies bid on project 05-7777?



You use aggregates in the SELECT statement when you want to get summary information (statistics) on sets of data.

Here we assume that you've done enough programming to know that a function is a system-defined program that accepts a parameter from the user and returns an answer. A function is always composed of a keyword followed by parentheses ( ).

Aggregate functions
SUM (expression) total values in a numeric expression
AVG (expression) average values in a numeric expression
COUNT (expression) the number of non-null values
COUNT (*) the number of selected rows
MAX (expression) the highest value in the expression
MIN (expression) the lowest value in the expression

First, note that an aggregate function will always return only one row. That's because it answers a question refering to a group or set of data. You can find the biggest value in the set but, you can't know what item that biggest value refers to. Same with smallest value or average.

It looks like a good idea to write something like:

SELECT ProdNum, ProdName, MIN(Cost) FROM Products;
to get the name of the lowest-cost item. But it won't work because the aggregates don't work on individual rows.


To obtain the biggest SellPrice in the Products table:

SELECT MAX(SellPrice) FROM Products;

To obtain the number of rows in the Products table, in fact the number of products carried:


Now, the previous statement will count the number of products based on the number of primary keys or Product numbers entered.

If you thought that there might be many duplicates in the items carried, you assume that the duplicates would have the same ProdName; so by counting ProdName and DISTINCT ProdName you would get an idea of how many duplicates there are, although you cannot establish what they are:

SELECT COUNT(ProdName) FROM Products;


The WHERE clause can also be used with aggregates to define the set of data to be calculated.

To find out how many big-profit items you have (assuming that big means more that $500), you do this:

SELECT COUNT(*) AS [Number of big-profit items]
        FROM Products
                 WHERE (SellPrice - Cost) > 500;

Or, in this case, to get the average cost of sportings goods, assuming that items in the Sports department all have a number starting with 'S':

SELECT AVG(Cost) AS [Average cost of Sports]
        FROM Products
                 WHERE ProdNum LIKE "S*";

The AVG function will return the average of a set of numerical values and SUM will return a total:

SELECT AVG(SellPrice) FROM Products;

SELECT SUM(Cost) FROM Products;

Although you are not allowed to work on individual rows, you are allowed to use several aggregates in the same statement:

SELECT SUM(Cost), COUNT(Cost), AVG(Cost), AVG(SellPrice)
        FROM Products;



We said earlier that you cannot use an aggregate function with an individual query. You can find what the biggest SellPrice is but you can't find what that Product is. Although that's true with the normal SELECT statement, there is a way to work around it. It's called a subquery and it relies on what we call the priority of operators in programming - the fact that any operation in parentheses, ( ), is executed first in a statement because ( ) is the operator with the highest priority.

When we do this:

SELECT MAX(SellPrice) FROM Products;
the query returns the value of the biggest SellPrice.

Now, if we enclose that statement in parentheses and use it as a subquery in another statement like this:

SELECT ProdNum, ProdName, SellPrice FROM Products
        WHERE SellPrice = (SELECT MAX(SellPrice) FROM Products);
the subquery returns a single value which is then used in the WHERE clause of the main statement to display the number and name of the product having the biggest SellPrice. If more than one products have the max price, several rows will be displayed.

What products cost more than the average cost of products?
First, calculate the average cost in a subquery and then, compare the table with that value:

SELECT ProdNum, ProdName, Cost FROM Products
        WHERE Cost >= (SELECT AVG(Cost) FROM Products);

The subquery can also be used to answer questions where you have to compare data with other rows from the same table.

Getting back to our ProjectMgt example, we'll use the Employee table.

How would you answer this: "Which employees live in the same city as employee '1234'?".

You could do it in steps.

First you have to find the employee's city:

SELECT e_city FROM Employee
        WHERE e_Id = '1234';
and, if it is, let's say, 'Boston', use that in the next statement:

SELECT e_Id, e_Fname, e_Lname FROM Employee
         WHERE e_City = 'Boston';

Or, you could decide to do it efficiently and use the subquery technique:

SELECT e_Id, e_Fname, e_Lname FROM Employee
         WHERE e_City =
                (SELECT e_city FROM Employee WHERE e_Id = '1234');

Which employees are older than John Smith?

SELECT e_Fname, e_Lname, e_BirthDate FROM Employee
         WHERE e_BirthDate <
                (SELECT e_BirthDate FROM Employee
                         WHERE e_Fname = 'John' AND e_Lname = 'Smith' );

Note that the subquery must return one and only one value. The WHERE clause in the main query can only compare to a single value and that means one column from one row. In the previous statement, if there is more than one 'John Smith' in the company, we've got a problem. In that case we would have to use e_Id instead of name to identify the person.

You have to recognize that the following 2 statements don't make any kind of sense:

SELECT e_Id, e_Fname, e_Lname FROM Employee
         WHERE e_City =
                (SELECT * FROM Employee WHERE e_Id = '1234');

SELECT e_Id, e_Fname, e_Lname FROM Employee
         WHERE e_City =
                (SELECT e_city FROM Employee );

If you haven't found the database resource you're looking for,
use our Google Search box for more information.



Home | Tutorials | Contact | Sitemap | Add URL | Privacy policy