Unbuffered queries for large data sets

resource mysql_unbuffered_query ( string query [, resource link_identifier])

So far we've been using the mysql_query() function to do all our data searching in PHP, and it works well enough for the vast majority of cases. However, consider this: how does the mysql_num_rows() function know how many rows mysql_query() returned? The answer is simple: mysql_query() runs the query, and fetches and buffers it all so that it has the complete result set available. The mysql_num_rows() function then has access to all the rows, and so can return the true row count.

But, what do you do if you have a large number of rows and don't want to wait before MySQL has finished fetching them all before you start using them? In this scenario, mysql_unbuffered_query() comes into play: it executes the query and returns a resource pointing to the result of the query while MySQL is still working, which means you can start reading before the query has finished.

Earlier I used the example of a golfscores table with 1,000,000 rows. It takes MySQL about 20 seconds to return all that data to PHP when using mysql_query(), and if you consider that on top of that PHP might do another 20 seconds of work to format and print out that data, the total is 40 seconds of work, of which you see nothing for the first 20 seconds. Using mysql_query() therefore has several obvious disadvantages:

  • PHP must wait while the entire query is executed and returned before it can start processing.

  • In order to return the whole result to PHP at once, all the data must be held in RAM. Thus, if you have 100MB of data to return, the PHP variable to hold it all will be 100MB.

The disadvantages of mysql_query() happen to be the advantages of using mysql_unbuffered_query():

  • The PHP script can parse the results immediately, giving immediate feedback to users.

  • Only one row at a time need be held in RAM.

One nice feature of mysql_unbuffered_query() is that, internally to PHP, it is almost identical to mysql_query(). As a result, you can almost use them interchangeably inside your scripts. For example, this script works fine with either mysql_query() or mysql_unbuffered_query():

mysql_connect("localhost", "php", "alm65z");
    mysql_select_db("phpdb");
    $result = mysql_unbuffered_query("SELECT ID, Name FROM conferences;");
    
    while ($row = mysql_fetch_assoc($result)) {
        extract($row, EXTR_PREFIX_ALL, "conf");
        print "$conf_Name\n";
    }

Before you rush off to make all your queries unbuffered, be aware that there are drawbacks to using mysql_unbuffered_query() that can make it no better than mysql_query():

  • You must read all rows from the return value. If you're thinking of using as a quick way to find something then stop processing the rows part way through, you're way off track - sorry!

  • If you issue another query before you finish processing all the rows from the previous query, PHP will issue a warning. SELECTs within SELECTs are not possible with unbuffered queries.

  • Functions such as mysql_num_rows() only return the number of rows read so far. This will be 0 as soon as the query returns, but as you call mysql_fetch_assoc() it will increment until it has the correct number of rows at the end.

  • Between the time the call to mysql_unbuffered_query() is issued and your processing the last row, the table remains locked by MySQL and cannot be written to by other queries. If you plan to do lengthy processing on each row, this is not good.

Choosing whether you want to run a buffered query or not takes a little thinking - if you're not sure what's best, you should almost certainly use mysql_query().

 

Next chapter: phpMyAdmin >>

Previous chapter: Reading auto-incrementing values

Jump to:

 

Home: Table of Contents

Follow us on Identi.ca or Twitter

Username:   Password:
Create Account | About TuxRadar