In this article we will delve into the features of the PostgreSQL object-relational database environment and what makes it special. We will focus on features that make PostgreSQL different from most relational database management systems (RDBMS).
What is PostGreSQL?
PostGreSQL is an open-source object relational DBMS (ORDBMS) that traces its roots back to Academia. It traces its roots back to a database called Postgres (developed at UC Berkley in the early 80’s). It was officially known as PostGreSQL around 1996 mostly to reflect the added on ANSI SQL compliant translator. It is perhaps the most feature-rich robust open-source database around and perhaps the most feature rich even among non-open source databases. In this article, we’ll focus on the features that make Postgres a rare gem. Our focus is on the most recent versions of PostgreSQL (7.1.3 and 7.2).
What makes PostGreSQL Different from other Relational Databases
- PostGreSQL in addition to being a relational database, is object-relational as well. What this means is that it has some object-oriented features such as concept of inheritance and ability to define complex datatypes with special functions to deal with these datatypes, but is for the most part relational in nature. In fact, most uses of Postgres do not take advantage of it’s extensive object-oriented functionality. It has features that you may not find in even expensive well-known commercial relational database management systems (RDBMS) and ORDBMS systems. Below are a couple of neat features that make it stand out from the pack.
- Pretty much full ANSI-SQL 92 compliance and a lot of ANSI 99 compliance as well and extensive support for transactions, BEFORE and AFTER triggers, stored procs, constraints, foreign keys, referential integretity, cascade update/delete.
- Support for numerous languages in creation of user-defined database functions (e.g you can use native SQL, PgSQL (postgres counterpart to Oracle PL/SQL or MS SQL Server’s/Sybase TransactSQL), Java, C, C++, TCL, Python, and Perl bindings and to define new PL languages to incorporate into PostgresSQL.
- Inheritance of table structures – this is probably a feature that is rarely used, but the feature comes in handy in certain situations. We’ll provide an example use for such a rare feature.
- Built-in complex data types such as IP Address, Geometries (Points, lines, circles), arrays as a database field type and ability to define your own custom datatypes with properties, operators and functions that work with them.
- Ability to define Aggregate functions – these are functions that work on a set of records rather than a single record.
- Concept of collections and sequences.
- Support for multiple OSes. Some popular ones (Linux, Windows, Unix, Mac)
- It makes a great Web database because it is fast and feature rich.
- It has freely available ODBC drivers and Level 4 JDBC drivers.
In the following sections we’ll explore some of these features and examples of use.
ANSI-SQL 92/99 Compliance
PostGreSQL supports the standard ANSI SQL operators and predicates such as SELECT, DISTINCT, SELECT INTO, INNER JOIN, OUTER JOIN, LEFT JOIN, RIGHT JOIN, CROSS JOIN, FULL, HAVING, GROUP BY, common aggregrate functions such as SUM, AVG, MAX, MIN, and subselects. In addition to those – it supports some non-common ones such as INTERSECT, EXCEPT, and DISTINCT ON. For more information on how to use these and other SQL SELECT predicates – check out the postgres manual sql-select page.
Below is an example of how one would use INTERSECT. Note that these predicates can be mimicked in systems that do not support them using other join constructs, but these provide a cleaner and more logical way of expressing these.
SELECT flight_no, flight_date FROM flightbook WHERE destination = 'BOS' INTERSECT SELECT flight_no, flight_date FROM flightreservations WHERE cust_state='CA';
Would provide a listing of all flights in flightbook that currently have reservations from people who reside in california and whose destination is Boston, MA.
Support for Numerous languages
As of this writing there are 4 (Procedural Languages (PLs) that come with the Postgres source (PL/pgsql, PL/Python, PL/Tcl, and PL/Perl – others are in the works such as a PL/Java that will be similar to Oracle’s PL Java). (PL languages allow one to write Postgres functions and stored procedures natively in the language as opposed to simply binding to a library of already compiled functions.. The Postgres architecture allows to define new PL languages by providng PL handlers. This feature makes Postgres very extensible – for example one can imagine creating a specialized rules language and then providing this as a PL extension to Postgres. This is extremely rare among databases in which introducing a new language option requires reachitecting the system.
Pretty much any language you can name, you can use to define functions that can then be bound to a postgres database to be used in SQL queries or to create new specialty operators (e.g. +, = etc.) that work on particular kinds of data. Note function binding is a little different from creating functions with a PL’d language. When binding, you have a compiled set of functions, classes etc. that you then map to skeleton Postgres stored procedure/function names. The skeleton merely defines the function signature (inputs , outputs and name of the function in a compiled library) as opposed to a PL’d stored procedure in which the stored procedure or function code is actually stored in the database, but interpretated by a PL language handler.
Inheritance of Table Structures
Table inheritance is a feature that is not found in a mere RDBMS, but is one of the hallmarks of an ORDBMS. This feature provides a compromise for those looking for an object orientated database, but who wish to also have the simplicity and speed that a relational database system provides.
Here is an example of such a rare commodity.
Suppose you are put to the task of developing a database that inventories various types of computer equipment. You want to be able to keep track of various kinds of attributes of an item, but don’t want the design to get unwieldy and you also want to make it easy report on. You have several kinds of equipment – computers, printers, routers, switches etc. There are several properties that each shares in common, but also properties that are specific to say only a computer.
There are several ways of doing this – here are a couple of options that come to mind
- Have a single table with all the fields you will need to keep track of each item. The downside of this approach is that for most items, you will have several fields that are not necessary and it would be difficult to explain to someone coming aboard what fields to query when looking for a specific item. E.g for Printers the OS is blank etc.
- Have one table with fields in common across all components and separate tables to store the additional detail of each type of component. The problem with this is that everytime you need to query a particular kind of component, you need to do an inner join between the two tables. Also if you need to insert new items, you need to make sure you insert records into both tables.
- Have a very expandable system where you have one table with fieldsin common and another storing extra attribute info and each attribute is stored in a separate record. While this has great room for growth – it is difficult to do simple datasheet reporting and difficult to validate each different kind of field e.g. date, IP address, etc.
- If you have a system such as Postgres, you can define a table – call it “component” and additional tables that inherit from component, but have more specific info. E.g. You can have a Printer table, Computer table, RoutingDevice table etc.
In fact Postgres allows you to inherit from more than one table or inherit from another inhertied table thus achieving single inheritance as well as multiple inheritance Below is what the table structure definitions might look like
CREATE SEQUENCE "component_component_id_seq" START 1; CREATE TABLE "component" ("component_id" int4 DEFAULT nextval('component_component_id_seq') NOT NULL,"component_name" varchar(50), "ip" inet, "po" varchar(20), "mac" macaddr, "department" varchar(20), CONSTRAINT "component_pkey" PRIMARY KEY ("component_id") ); CREATE TABLE "computer" ( "os" varchar(20) ) INHERITS("component");
With this, to get the full inventory of items, you can write SELECT * FROM component;
or Just computers SELECT * FROM computer; If you want to update say a department field of all components within a particular IP broadcast range and this will update all components whether computers, printers etc.
UPDATE component SET department='a good department' WHERE ip << inet '192.168.1/24';
Built-in complex data types and ability to define new ones
The above example demonstrates a complex data type called inet which can be used for storing ip addresses, subnet masks etc. Although PostgreSQL comes with quite a few complex datatypes, one can define new ones or get extensions to PostgreSQL that extend these.
One such product we’ve found is a product called PostGIS which is an opensource spatial engine that spatially enables PostgreSQL. Although PostgreSQL comes with some simple geometry data types, PostGIS defines new ones and provides an OpenGIS SFSQL interface to these.
Makes a Great Web Database
PostgreSQL is a fairly fast database with ample support in web languages such as PHP, Perl and the ODBC and JDBC drivers make it easily usuable in other languages such as ASP, ASP.Net , ColdFusion and Java. It is often compared with MySQL – one of the fastest databases on the web (opensource or non). It’s querying speed is in line with MySQL and in fact studies have shown that it scales better with more users than MySQL. In terms of features, it is leagues above MySQL – but the new version of MySQL coming out which will provide transactional support and trigger support, and subselects will change some of that . MySQL is in general an easier install for those who don’t need all the extended functionality provided by PostGreSQL. In terms of features though – if you want functionality as extensive as Oracle or SQL Server and don’t want to spend an arm and a leg, then PostGreSQL is definitely a database to take a second look at.
Links of Interest
|PostGreSQL||PostGreSQL is a powerful opensource object-relational database. PHPPgAdmin is a PHP web-based admin interface to PostgreSQL|
|CygWin||CygWin is a Unix emulation layer for Windows that comes prepackaged with PostGreSQL|
|PgAdmin||A windows-based administrative interface to PostgreSQL – similar to look and feel of SQL Server Enterprise Manager Pg Admin|
|PostGIS Spatial Database||An opensource Spatial database engine developed by Refractions Research that seats on top of PostGreSQL DBMS and provides OpenGIS (Spatial Features SQL) SFSQL functionality to PostGreSQL. If you are looking for an easy to deploy affordable spatial database engine with the power of the more expensive ones such as Oracle Spatial, Arc SDE or DB Spatial, look no further.|
|MySQL database||Want an easy to setup relatively powerful and free database. Try downloading and installing MySQL. Also grab the PHPMyAdmin for administration .|
|PHPBuilder.com||This site has a lot of useful articles on how to get started with PHP and some useful articles on using PostGreSQL in PHP|
|Open Source Databases: As the Tables Turn||An article by Tim Perdue comparing MySQL with PostGreSQL|
©Copyright Leo Hsu