SELECT - step by step

The SQL command for querying data in a database is very powerfull and therefore can be quite complex. To adres sthat problem I provide here a small step by step tutorial how to build a SELECT command. May be it is easier for you to answer the following questions and to build so your custom SELECT command step by step:

  1. Which data do you need? In which data fields is this data stored?
  2. AT first you should know and, when you do not know, search, in which data fields of which database tables (or database views) the needed information is located/stored. In this example we want a relatively simple statistics about the customer count per salesman. The id and name of the salesman (vnum, vname) and the customer id (knum) should be enough for this purpose.

    SELECT vnum, vname, knum
    ;
  3. In which tables are the needed data fields stored?
  4. The name and id of the salesman (vnum, vname) are stored in the salesman table (tverkauf) and for the customer id (knum) the customer table (tkunden) seems to be most suitable.

    SELECT a.vnum, a.vname, b.knum
    FROM tverkauf a, tkunden b
    ;
  5. What kind of relationship do the needed tables have to each other, how can they be joined?
  6. We have following knowledge about the two tables needed in this example:
    Not every salesman has customers (may be a new employee), so the data must be joined without loosing any salesman, even he has no corresponding customers: we need all records from the table tverkauf and search matching records from the table tkunden --> LEFT JOIN
    In the customer table (tkunde) we find the salesman id (vnum) and, because this is the primary key of the salesman table (tverkauf), we can use this data field for joining both of the tables together --> a.vnum = b.vnum

    SQL92::
    SELECT a.vnum, a.vname, b.knum
    FROM tverkauf a LEFT JOIN tkunden b ON a.vnum=b.vnum
    ;
    Standard-SQL::
    SELECT a.vnum, a.vname, b.knum
    FROM tverkauf a, OUTER tkunden b
    WHERE a.vnum=b.vnum
    ;

    When you omit LEFT JOIN or the OUTER, then you get only salesman, which have at least one corresponding record in the customer table (tkunden), so we get only salesman with customers.
    When you omit the join criteria a.vnum=b.vnum, then you get a so called cartesian product - the combination of all salesman with all customers, in our case quite a bunch of data vaste.

  7. Do you need all of the data or do you have to filter it?
  8. Let us assume, we are interested only on salesman from Berlin:

    SQL92::
    SELECT a.vnum, a.vname, b.knum
    FROM tverkauf a LEFT JOIN tkunden b ON a.vnum=b.vnum
    WHERE a.stadt='Berlin'
    ;
    Standard-SQL::
    SELECT a.vnum, a.vname, b.knum
    FROM tverkauf a, OUTER tkunden b
    WHERE a.vnum=b.vnum
    AND a.stadt='Berlin'
    ;
  9. Do you need the original record fields or is a statistics, an aggregation of the original data needed here?
  10. Because we want to know, how much custimers do each salesman have, we should count the customers per salesman --> COUNT.
    That means, we have to aggregate, to group data --> GROUP BY

    SQL92::
    SELECT a.vnum, a.vname, COUNT(DISTINCT b.knum) AS anzahl
    FROM tverkauf a LEFT JOIN tkunden b ON a.vnum=b.vnum
    WHERE a.stadt='Berlin'
    GROUP BY vnum, vname
    ;
    Standard-SQL::
    SELECT a.vnum, a.vname, COUNT(*) AS anzahl
    FROM tverkauf a, OUTER tkunden b
    WHERE a.vnum=b.vnum
    AND a.stadt='Berlin'
    GROUP BY vnum, vname
    ;
  11. Is all of the resulting aggregated statistics data needed?
  12. Let as futher assume, we are not interested in good salesman with lots of customers. And let us assume, that a good salesman have 100 or more customers. Then we have to choose only salesman with less than 100 customers:

    SELECT a.vnum, a.vname, COUNT(DISTINCT b.knum) AS anzahl
    FROM tverkauf a LEFT JOIN tkunden b ON a.vnum=b.vnum
    WHERE a.stadt='Berlin'
    GROUP BY vnum, vname
    HAVING COUNT(DISTINCT b.knum) 
    ;
  13. Should the query results be sorted?
  14. In our search for bad salesman we probably want to see salesman with less customers first. So we have to order the results by the customer count ascending:

    SELECT a.vnum, a.vname, COUNT(DISTINCT b.knum) AS anzahl
    FROM tverkauf a LEFT JOIN tkunden b ON a.vnum=b.vnum
    WHERE a.stadt='Berlin'
    GROUP BY vnum, vname
    HAVING COUNT(DISTINCT b.knum) ORDER BY anzahl ASC
    ;
  15. Should the amount of the resulting data be limited to a specific number of records?
  16. Because we do not want to talk with every laisy employee, we probably want to limit the amount of the resulting data records. Let us set the limit to 10 records:

    SQL92::
    SELECT a.vnum, a.vname, COUNT(DISTINCT b.knum) AS anzahl
    FROM tverkauf a LEFT JOIN tkunden b ON a.vnum=b.vnum
    WHERE a.stadt='Berlin'
    GROUP BY vnum, vname
    HAVING COUNT(DISTINCT b.knum) LIMIT 10
    ;
    Informix::
    SELECT FIRST 10 a.vnum, a.vname, COUNT(DISTINCT b.knum) AS anzahl
    FROM tverkauf a LEFT JOIN tkunden b ON a.vnum=b.vnum
    WHERE a.stadt='Berlin'
    GROUP BY vnum, vname
    HAVING COUNT(DISTINCT b.knum) 
    
  17. Should the query results be stored in a new or temporary table?
  18. When the resulting data have to be processed in some you sometimes want to have it in a handy (temporary) database table, so you can easily use ot in other, futher database query. So the salesmanager can probably decide by him self, how much employees he want to speak. In this case we need to remove the record limit from our select query and write the records into a new (temporary) table:

    Temporäre Tabelle
    SELECT a.vnum, a.vname, COUNT(DISTINCT b.knum) AS anzahl
    FROM tverkauf a LEFT JOIN tkunden b ON a.vnum=b.vnum
    WHERE a.stadt='Berlin'
    GROUP BY vnum, vname
    HAVING COUNT(DISTINCT b.knum) INTO TEMP verkauf_score
    ;
    Regular, existing table (must be created before with CREATE TABLE)
    INSERT INTO verkauf_score
    SELECT a.vnum, a.vname, COUNT(DISTINCT b.knum) AS anzahl
    FROM tverkauf a LEFT JOIN tkunden b ON a.vnum=b.vnum
    WHERE a.stadt='Berlin'
    GROUP BY vnum, vname
    HAVING COUNT(DISTINCT b.knum) 
    
Categories: