Accessing a database from PHP

PHP has a set of functions for each kind of database. We have PostgreSQL on Osiris, so use the functions whose names start with pg_ . If you should change databases, to MySQL, say, you would have to replace all pg_ with mysql_.

You connect to a database (on Osiris), do any number of 'queries', look at the results of select queries, and close the connection (or else it is closed at the end of the script.) For a quick start, see commented example

Connecting: the connection string

pg_connect opens a connection to the database server on the terms specified in the "connection string", here are some examples. If the keyword "host" is not present, a local connection on the server machine is made, otherwise, a TCP/IP connection is made to the default port (5432). If user is not present, it will be assumed to be the user that is currently running the script. In the case of a php page on Osiris, the user is apache.
This returns a "connection object", or FALSE upon failure.
$conn=pg_connect ('dbname=timetable') or die ('Connect failed ');
$conn=pg_connect ('dbname=jsmith user=jsmith password=fishStory');
$conn=pg_connect ('host=localhost dbname=jsmith user=jsmith password=fishStory');
In the first case, a local connection is made with the permissions of the user 'apache', since "user authentication" is in effect on Osiris. This means that you must grant the necessary permissions to 'apache'. These should be the minimum necessary for your scripts, I avoid DELETE permissions, for example. This is the safest way to connect.
GRANT SELECT, INSERT ON students, sports TO apache;

The second case is, again a local connection, this time using jsmith's permissions. Presumably jsmith is the owner of the database. The third case is the same, except that a TCP connection is made. Host could have been given as Osiris, or even cs-linux (a different host).

There are some drawbacks to connecting as the database owner:

  1. jsmith cannot now restrict operations (to only SELECT, in the extreme case), so if any user of jsmith's pages manages to inject malicious SQL statements (something we will be very careful to prevent), she could manage to "drop table students" for example. See cartoon.
  2. Any of us with accounts on Osiris can now know jsmith's password, and access the database server as jsmith. So, jsmith should be entering his password using a form, every time he wants to manage his database, he should not hard-code the password in any PHP file.

Policy changes

At any time, a system administrator may change the policies for accessing a database. For example, there is also a PostgreSQL server running on cs-linux, and over the years, first user authentication was prohibited, and then local connection (so only the third example worked.) In response, I created (on cs-linux) the database user 'mickey', to whom one could grant the permissions desired for apache, without revealing ones own password.
Should a policy change occur, you may have to modify all your connection strings in all your scripts. That's a real bummer.
I promise not to do that on Osiris. I would be generally prudent to put the connection string once in a file which you may include.

Querying the connection

In all the functions, the connection object is the first argument, but it is optional since typically you only make one connection in a script. For simplicity I will omit the connection, which we assume is (the last one to be) made.

A query can either modify the database (INSERT, UPDATE, DELETE) or return results (SELECT). These 2 sorts return somewhat different result objects.

Simple query (pg_query)

When the query can be written in advance, and does not depend on any form data, we simply send the query over the connection, and receive the result:
$query = "SELECT given, family FROM students"
$result = pg_query($query) or die("query failed".pg_last_error());

We can then get the names of students from $result, see below

Query with parameters, such as form data

When the query depends upon form data, we cannot write it in advance. Examples are search criteria for SELECT, and data to be INSERTed or used for UPDATE.
A lot of things could go wrong, because we don't know what the users might input, character data must be delimited by single-quotes (') in SQL statements, so apostrophes (') must be "escaped" or doubled in the constructed statement. Do not use pg_query for variable data. The name O'Hara will fail, and O'; drop table students -- will kill you! This is a cracker trick called "SQL injection". Cartoon at http://xkcd.com/327/

Instead, we write the query with placeholders of the form $1, $2, etc. and then pass the variable data as an array, as follows.

$query2 = 'INSERT into students (given, family, linux) values ($1, $2, $3)';
$resultinsert = pg_query_params($query2, array($g, $f, $linux))
 or die("query failed".pg_last_error());

$query1 ='SELECT * FROM courses where prof=$1'; // will come from form
$result1 = pg_query_params($query, array($profname));

The function pg_query_params escapes the data (doubling any single-quotes), encloses each item in single quotes (ex: 'O''Hara') and puts the result in the query at each placeholder. Note: There must be exactly as many elements in the array as there are placeholders.

Repeated queries

This can be separated into 2 steps, should the same query with placeholders be needed several times, by using the combination pg_prepare() pg_execute(). The above insert example could be done equivalently by
		pg_prepare('instu',$query2);
$resultinsert = pg_execute('instu', array($g, $f, $linux))
 or die("query failed".pg_last_error());
... and we could have more pg_execute() statements for the same named query

Accessing the results

Modify results

The result of a modify could be FALSE, if the statement failed. This could happen if it violated some constraint, such as primary key or character data too long. If it succeeds, the only information we might need is how many rows were inserted (0 or 1) updated or deleted (0, 1, or more). For a modify query, use

print pg_affected_rows ($resultinsert) . " rows inserted for $linux<br>\n";

SELECT results

A select query always returns a table of rows and columns (unless it fails). We might want to know the number of rows in the table (could be 0),
$nrows = pg_num_rows($result);
print "There are $nrows students";
and we probably want to fetch all the rows, one after another. The "fetch" functions normally return successive rows, each one as an array of fields, and FALSE when all have been fetched.

This example prints the field names followed by their values, one field per line. (A fancier example could print the result table as an HTML table.) The use of  htmlentities is optional, it prevents misunderstandings about encoding of special characters.

while ( $line = pg_fetch_assoc($result1) ) 	// all rows
foreach ($line as $key => $value) // all columns
print "$key = ".htmlentities($value, ENT_COMPAT, 'UTF-8')."<br>\n";
pg_free_result($result1); /* Free resultset */

Another example, also loops through the rows, and for each row print the columns on one line and then a <br> tag:

while ($row = pg_fetch_row($result) )
{ foreach ($row as $item)
print $item;
print "<br>\n"; // a new line (break) in html
}

Of course, you would discover that all the items will be scrunched up together, so at least you want to add some spaces between them, or else add the appropriate tags to define an html table. You could add column headings to a table, this php code does not get the column names.

Tidying up

Result sets may be large, and should be freed when no longer needed, if one is not at the end of the script. In addition, it is a good idea to close the connection. Remember $conn? It has been implicit all along.

pg_close($conn);

See also some lab instructions: How to access a database on osiris
Lin Jensen,