Hudzilla Coding Academy: Project Seven

Code
Hudzilla Coding Academy

 

There are two kinds of data sources you need to worry about. The first is XML, which we covered already in Project Five. The second is databases, which come in all sorts of shapes and forms but have the same common behaviour: they store lots of data in a neatly ordered manner, and make it easy to retrieve that data.

In this tutorial you're going to learn how to access a database with C#. I'm going to keep it quite easy for now because I just want to show you how much great stuff you can do if you have access to information. We'll go into more depth about creating databases and writing data in Project Seven.

For now, what we're going to produce is a command-line Firefox quickstarter. You see, Firefox 3 stores all its history information in a database, which means we can load it and query it from the command line. Our little tool will accept any text you give it and use that to match against any page titles in your Firefox history. If it finds matches, it will take the one you have been to most often and load that in Firefox. Easy! Let's get started...

First steps

Start MonoDevelop and create a new command-line project called FoxySearch, then add a Reference to Mono.Data.Sqlite. Create a new class called FoxySearch, then change your Main.cs Main() method to this:

FoxySearch app = new FoxySearch();
app.Run(args);

Inside FoxySearch.cs, give it these "using" statements at the top:

using Mono.Data.Sqlite;
using System;
using System.Diagnostics;
using System.IO;

There are two new ones there: Mono.Data.Sqlite will be used for reading the Firefox database, and System.Diagnostics will be used for launching Firefox.

OK, your project is now all set up and ready, so let's backtrack a little: what is a database?

Databases 101

As I said in the introduction, databases come in all shapes and forms. By that I meant that there are lots of different database apps, lots of different data formats, lots of ways and reading and writing your data... well, basically there are lots of ways of doing just about everything in databases! For now, the database variety we care about is called SQLite, which is pronounced Ess-Cue-Ell-Ite, not Ess-Cue-Ell-Lite or something stranger!

SQL stands for "Structured Query Language", and it's a very simple programming language that lets you tell a database system what you want to do rather than you having to do the work yourself. For example, if you want to read all the data about employees in a company, you shouldn't have to care how that data is stored - you just ask the database manager for it, and it will give you back the results. This is the magic of SQL: there's a database manager that looks after the data for you, so you don't have to worry about storing it, optimising it and the like, you just need to know what to ask for.

We're going to be using SQLite for two reasons. First, it stores all its data in a local file. That might sound rather basic to you, but actually most database systems use what's called a client/server architecture, meaning that data is stored in a big server somewhere and multiple clients connect to it to query information. This client/server system is popular because it means that the data can be shared amongst a large number of people, whereas because SQLite stores its files locally it can only be used locally. The second reason we're going to be using SQLite is pragmatic: that's the system Firefox uses for its data!

You see, SQLite's method of storing local data is perfect for Firefox: it needs to store a lot of data, but it doesn't need client/server nonsense because Firefox is only ever used locally. By storing all its information inside SQLite, Firefox can search it all more intelligently, and, happily, it also means we can search it too!

So, SQLite is one of my database systems out there. But what does a database actually contain? Well, the bits we care about are called tables, which are what hold the actual data. One database file (a physical SQLite file) can hold multiple tables, each holding their own data. Think of a database file as being a container, grouping together related database tables.

Each table contains multiple columns rows and multiple rows. A column is the name of some data you want to search, such as "FirstName", "LastName", or "Age". Each row has its own value for each of those columns, so you might end up with something like this:

FirstName LastName Age
Bilbo Baggins 111
Frodo Baggins 28

We can get that data by running an SQL query, which is a simple SQL question that will return data for us to work with. But first we need to lay our hands on Firefox's history data...

Finding places.sqlite

The file we care about - the file where Firefox stores all its information about sites you've been to - is called places.sqlite, and lives somewhere under your home directory. Usually its path is something like this: /home/XXXXXXXX/.mozilla/firefox/YYYYYYYY.default, where "XXXXXXXX" should be replaced by your username and "YYYYYYYY" should be replaced by a seemingly random string of letters and numbers. Thanks, Firefox!

Firefox's settings directory contains several .sqlite files you can open and work with.

Firefox's settings directory contains several .sqlite files you can open and work with.

What we need to do is fairly straightforward on paper:

  • Find out where the user's home directory is, thus giving us /home/XXXXXXXX.
  • Check whether /home/XXXXXXXX/.mozilla/firefox exists. If it doesn't, bail out.
  • Get a list of all subdirectories that match "*.default" and pick the first one (there usually is only one)
  • Look in that directory for places.sqlite
  • If it finds places.sqlite, save the directory to a hidden file called .foxysearch to avoid having to search later.
  • If we fail at any point, exit and tell the user to create .foxysearch for themselves.
  • Of course, if .foxysearch exists in the first, skip all those checks and just read the directory from there!

To do all that, we need a few new methods:

  • System.Environment.GetFolderPath(). This neat little method can give us the user's home directory, or indeed any other special directory we might want.
  • Directory.Exists(). This works just like the File.Exists() method we've used previously.
  • NeedsCustomConfig(). This is a custom method that we'll write that will print a message telling the user they need to configure FoxySearch by hand because our search method failed.
  • Environment.Exit(). This method will be used by NeedsCustomConfig() - it tells the program to exit immediately.

With all that, here's how the code should look - put this into your FoxySearch constructor method:

// this will store the directory containing places.sqlite
string directory;			

if (File.Exists(".foxysearch")) {
	// our cached location of the file exists - use it!
	directory = File.ReadAllText(".foxysearch").Trim();
} else {
	// step 1: find the user's home directory
	string home = System.Environment.GetFolderPath(System.Environment.SpecialFolder.Personal);
	
	// do they have a .mozilla/firefox directory?
	if (!Directory.Exists(home + "/.mozilla/firefox")) {
		// no - run away!
		NeedsCustomConfig();
	}
	
	// get a list of all profiles in .mozilla/firefox
	string[] profiles = Directory.GetDirectories(home + "/.mozilla/firefox", "*.default");
	if (profiles.Length == 0) {
		// there aren't any profiles - run away!
		NeedsCustomConfig();
	}
	
	// use the first profile (there's usually only one)
	directory = profiles[0];
	
	// does that contain places.sqlite?
	if (!File.Exists(directory + "/places.sqlite")) {
		// no - run away some more!
		NeedsCustomConfig();
	}
	
	// if we've made it this far, we have a working directory: save it for next time
	File.WriteAllText(".foxysearch", directory);
}

Once all that code has run, the program has either called NeedsCustomConfig() (which will terminate the app) or has put the name of the Firefox configuration directory into the "directory" variable. Before we move on, here's the NeedsCustomConfig() method - paste this new method somewhere into your FoxySearch.cs file:

void NeedsCustomConfig() {
	Console.WriteLine("FoxySearch was unable to locate your Firefox 3 places.sqlite file.");
	Console.WriteLine("Please create the filename .foxysearch in your home directory,");
	Console.WriteLine("and put in there the name of the directory containing places.sqlite.");
	Environment.Exit(0);
}

You can use the Environment.Exit() method whenever you want your program to terminate immediately. There is one situation where Exit() won't immediately stop your app, but you really don't need to know about it for a while yet!

Opening an SQLite database

By now, we've located places.sqlite and are all set to open it. But before we dive into the code necessary to do just that (even thought it's not that hard!), I want to point out two problems:

  1. The places.sqlite file could be corrupt or missing. For example, if someone configured the .foxysearch file by hand and pointed it to the wrong place by accident, we need to make sure our program works gracefully. Equally, if Firefox makes a change to the file (ie, upgrades to a very new version of SQLite or encrypts it), we need to be able to handle that nicely too.
  2. To ensure that no one can write changes to the database while someone else is reading it (which could cause problems), Firefox locks its database while it's using it. If this happens and we try to open it for reading, SQLite will just pause for a long time before exiting to say that the database is locked.

We can work around both those problems very easily, but it's important that you're aware of them!

Working with a database requires three new classes: SqliteConnection, SqliteCommand and SqliteDataReader, which connect to the database, execute commands, and handle results respectively. Put these three lines just before the start of your constructor ("public FoxySearch()"):

static public SqliteConnection SettingsConn;
static public SqliteCommand SettingsComm = new SqliteCommand();
static public SqliteDataReader SettingsReader;

Right now we're really just interested in the first two, because we want to connect to the SQLite database file and execute commands through that connection - the reader will come later.

To connect to a database, you need to tell Mono where to find it - this is known as a "connection string", and as well as the location of the database you can pass in other information such as a username and password if needed. So, what we need to do is put together a connection string pointing at the location of places.sqlite, open that connection, then tell Mono we want to execute future commands through that connection. Here's the code - put this at the end of your Main() method:

string connstring = string.Format("Data Source={0}/places.sqlite;", directory);
SettingsConn = new SqliteConnection(connstring);
SettingsConn.Open();
SettingsComm.Connection = SettingsConn;

Once that's done, we should have the database open for reading, with our SQL command pointing at it ready to go. But what about those two problems I mentioned? Well, the first one is that places.sqlite could be missing, which means that trying to open it with SQLite will cause your program to crash. The quick solution to this is to use something called exception handling, which is one of the few programming fundamentals you still have left to learn. Put simply, exception handling lets you tell Mono, "listen, if anything goes wrong in this next bit of code, here's what I want you to do."

There are several neat things about exception handling:

  • It lets you localise your error checking and correction. Rather than scattering code all over the place, exception handling lets you put the error-handling code right next to the code it works with.
  • You can catch multiple problems from one piece of code, being as specific of as general as you like - Mono will pick the right one for you.
  • They cost next to nothing to use (in terms of processing time) unless something actually goes wrong, in which case it's more important to handle the error nicely than it is to run quickly!

If you handle your exceptions properly, there's no reason your programs should crash any more - you can basically recover from the worst the world can throw at you. Exception handling is something we're going to be looking into much more in the future, but for now all we're going to do is tell Mono "if anything goes wrong, here's what you do." To do that, replace the above four lines of code with these:

try 
	string connstring = string.Format("Data Source={0}/places.sqlite;", directory);
	SettingsConn = new SqliteConnection(connstring);
	SettingsConn.Open();
	SettingsComm.Connection = SettingsConn;
} catch (Exception) {
	if (!File.Exists(directory + "/places.sqlite")) {
		Console.WriteLine("Unable to open your places.sqlite file - sorry!");
	} else {
		Console.WriteLine("places.sqlite was not found in " + directory);
	}
	
	Environment.Exit(0);
}

Putting code inside a "try" block is what makes it safe. If Mono is busily executing code in a "try" block and finds an error, it will stop what it's done and jump to "catch" block. In this case, we're telling Mono we want to catch "Exception", which is the most general type of exception and will match anything going wrong. So if SQLite tries to open the places.sqlite file and can't find it, Mono will jump to the "catch (Exception)" block and start executing there - in the code above it will check whether the file is missing or not, and print the appropriate message.

It's important to understand that Mono won't transfer control back to the "try" block once it has finished with the "catch" block: once it has exited "try" it won't go back. Not that it matters here, though, because in the code above we finish our "catch" block with a call to Environment.Exit() so that the program definitely terminates if the SQLite database can't be found.

Try/catch in C# explained

In this example, the Crash() method call causes a big program error to occur. What will happen is that Mono will look at the type of error, then look at the list of "catch" blocks to see what should be done.

If the error type is "Something", Mono will run the Eek() method and carry on at the end of the try/catch block. For the SomethingElse block note that I've asked Mono to give the SomethingElse error a name, se, so I can send it to the DoSomethingElse() method as a parameter.

Finally there's the catch-all "Exception", which will get a variable "e" and print out the error message to the screen - this is fairly common for debugging purposes.

The other problem I mentioned earlier is the way Firefox locks its database. We can work around that very easily just by copying the places.sqlite file to a temporary location and loading that temporary copy. Change the line "string connstring" to this:

File.Copy(directory + "/places.sqlite", ".places.sqlite.foxysearch", true);
string connstring = "Data Source=.places.sqlite.foxysearch;";

With that in place, Mono will copy the places.sqlite file to the file .places.sqlite.foxysearch before loading the copy into SQLite. Remember, starting a filename with a full stop/period on Linux causes it to be hidden, which is a smart thing for temporary files like this one.

Reading Firefox's history

We're finally in a position to read Firefox's history data, which is where we need to delve into just a little SQL. The constructor method already ensured we have access to places.sqlite for reading, and it has been opened in SettingsConn so we can work with it. What the Run() method needs to do is pretty straightforward:

  • Check how many arguments were passed. If 0, print usage information then exit. Otherwise, use that as our query.
  • Run our SQL query and grab the results.
  • If there wasn't any matching data for our search, tell the user that and exit.
  • If there is data, read the first match and use that URL to load Firefox.

There are only two parts of that are new: running an SQL query and loading Firefox. Let's start with SQL, which, as I mentioned already, is a whole new language for working with databases. While there is a C#-specific system for reading database information, SQL is far more common because it's used in lots of other programming languages, so everything you learn here and in later projects will serve you well no matter what you program in the future.

Right now, we're interested in reading information, which in SQL is done with a SELECT statement, like this:

SELECT field1 FROM table;
SELECT field1, field2, field3 FROM table;
SELECT field1, field2, field3 FROM table WHERE field4 = 'searchcriteria';
SELECT field1, field2, field3 FROM table WHERE field4 = 'searchcriteria' ORDER BY field2 DESC;
SELECT field1 FROM table WHERE field4 = 'searchcriteria' ORDER BY field2 DESC LIMIT 5;

That first query will pull out only one row from the table, and will return all the rows in the table. The second will return three fields for all the rows. The third will return three fields from the table, but will only return rows where field4 is set to 'searchcriteria'. The fourth one will do the same, except will sort the results by field2 descending, meaning that "zebra" comes before "apple" and 1000 comes before 100. Finally, the latest search will return only the first five rows after the data has been sorted.

From those examples, I hope you can see that SQL queries can be really simple, but if you want to add more work to them it's not hard. If you think about it, to sort your results all you have to do is write "ORDER BY somefield" to make them ascending or "ORDER BY somefield DESC" to make them descending - that would take a lot more work if you did it in C#!

In the Firefox places database, there are several tables of data such as what favicons have been cached, what you typed into the URL box before pressing Enter or selecting one of the options in the Awesome Bar, and, importantly, what places you've been to. It's that last one we're interested in, and it's called "moz_places". Inside moz_places there are three fields we care about: title (the page title), url (the URL of the site) and frecency.

Frecency is a word invented by those wacky folks at Mozilla to combine "frequency" and "recency", and is used to help Firefox choose the site you're most likely to want to go to based on how often you've been there and how recently you've been there. Think about it: if you went to a site 1000 times, clearly you like the site. But what if your last visit was a year ago? Well, then you probably don't want to go there right now, so those 1000 times don't mean as much as if you went to that site yesterday.

What we need to do is ask SQLite this:

  • Please tell us the URL (SELECT url)
  • From the list of places (FROM moz_places)
  • Where the title is equal to our search (WHERE title = 'searchcritera')
  • Reverse ordering the results by frecency (ORDER BY frecency DESC)
  • And, by the way, we only the need the top result - the most likely place (LIMIT 1)

I'm hoping you've spotted a small problem in that plan. If not, read over it again. Still stuck? Consider this: if the "title" field was "fizz buzz foo bar" and we searched using "WHERE title = 'fizz buzz'", we wouldn't get a match. The problem is that searching using = looks for an exact match, which means our program would work only if someone typed the exact title field they were looking for, which is pretty unlikely.

The solution is to use a new kind of search, called LIKE. This lets you provide part of a field and, by using the percent symbol (%), tell SQLite that anything can come before or after the field. So, if you want to match "fizz buzz" whether it's the only thing in the field, whether it comes at the beginning and is followed by other stuff, or whether it's in the middle of thousands of other lines of text, you should use this: "WHERE title LIKE '%fizz buzz%'".

Once the query is prepared, you run it using the SettingsComm object and it will return values that can be read. With all that in mind, here's the first version of the Run() method:

public void Run(string[] args) {
	switch (args.Length) {
		case 0:
			Console.WriteLine("Usage: FoxySearch \"search terms here\"");
			Console.WriteLine("");
			
			break;
			
		default:
			// use the first argument as the search criteria
			string search = args[0];

			// we use string.Format() to inject the search criteria easily
			// I've had to split this across two lines - you should make it one
			SettingsComm.CommandText = string.Format("SELECT url FROM moz_places
			   WHERE title LIKE '%{0}%' ORDER BY frecency DESC LIMIT 1;", search);
			
			// this runs the query and saves the result in SettingsReader.
			SettingsReader = SettingsComm.ExecuteReader();
			
			if (SettingsReader.HasRows) {
				Console.WriteLine("Got matches!");
			} else {
				Console.WriteLine("No matches.");
			}
			
			break;
	}
}

When the ExecuteReader() method is executed, it returns all the matching rows to SettingsReader, where we can loop over them or whatever. For now, though, what we care about is whether any rows were returned that matched the search, which is easily checked thanks to the HasRows property of SettingsReader - if that's true, at least one row of data was returned, meaning that we have successfully read data from Firefox's history file.

Loading the best match

Once SettingsReader has some data to work with, it's really easy to pull out the best match URL and load it in Firefox. There are a few new things you need to know:

  • To read a row from SettingsReader, you need to use its Read() method. Usually if there are lots of rows, you call Read() each time you want to move to the next row. But we only have one row, so we just need to call it once.
  • You can access fields from SettingsReader just by specifying their name and converting them to a string, like this: SettingsReader["url"].ToString().
  • When you're done working with a reader and with a connection, you should close them using their Close() methods.
  • To start a new Firefox process, you need to use the Process class and provide it with the name of the program and any parameters you want to pass to it.

Those four things can be done in eight lines of code (plus my comments!) - replace "Console.WriteLine("Got matches!");" with these:

// read the first row of data in
SettingsReader.Read();

// grab the URL field from that row
string url = SettingsReader["url"].ToString();

// now close the reader and the connection
SettingsReader.Close();
SettingsConn.Close();

// create a new process - this lets us start programs
Process firefox = new Process();

// tell it the program's filename is "firefox" and that we want
// to pass the URL to Firefox as an argument
firefox.StartInfo.FileName = "firefox";
firefox.StartInfo.Arguments = url;

// now launch the program
firefox.Start();

You have to admit, it's pretty cool to be able to say "firefox.Start()" - it really makes your source code easy to read and easy to maintain.

The best bit is that we're done: your program is finished and ready to run. But before I let you go, there's one last thing I want to show you:

string search = args[0].Replace("'", "''");

Replace the current "string search" line with that one. What it does is convert any instances of ' into '', which is SQLite's escape sequence. That is, when SQLite sees ' it things the string is ending, but if you give it '' instead it will understand that you just mean the text '.

The finished program launches Firefox from our Mono/C# program.

The finished program works: I ran "FoxySearch tux" from the command line, it read Firefox's history and saw that I like going to TuxRadar.com the most, and loaded that into Firefox for me.

Let's wrap up

This has, I hope, been a very gentle introduction to databases. We haven't covered how to create them or how to add and remove data, but we have done something meaningful: our app is actually very useful, because you can now run the command "FoxySearch mono" and it will look through all the pages you've been to with "mono" in the title (note: it's not case-sensitive, so Mono and MONO will be matched also) and start Firefox with whichever page you have been to most often and recently.

So, while you may only have touched the tip of SQL, you have also learned a bit about Firefox, and I hope you're inspired to see what other apps you can make with Firefox's data - it makes extensive use of SQLite, and all that data is open for you to read and play with. Happy hacking!

Homework

If you're following this with a tutor, you will be required to complete the following homework before continuing. If you're working by yourself, I strongly recommend you find someone who can help check your work and provide feedback.

The homework for this project is made up of three coding problems; all are required.

  • Change the database search so that it checks whether the title or the URL is like the search criteria. You need the word "WHERE" only once for all your checks, and should use "OR" to check multiple things.
  • Allow the program to accept two parameters. If only one parameter is given, it should work as-is. If two parameters are given and the first one is "list" then the program should print out a list of all page titles and URLs that match the second parameter, rather than just loading the top hit into Firefox.
  • Right now we're taking a backup of places.sqlite before reading from it, to avoid Firefox locking the database and thus making it impossible to read. Have the program delete this copy once it finishes.

If you have problems, try to solve them yourself - you might not succeed, but you'll learn a lot by trying! If you're still having problems, drop your tutor an email and ask for help.

The small print

This work was produced for TuxRadar.com as part of the Hudzilla Coding Academy series. All source code for this project is licensed under the GNU General Public License v3 or later. All other rights are reserved.

You should follow us on Identi.ca or Twitter


Your comments

I wonder...

Nice project, but I wonder, why C# of all languages? It's not even Linux native... Perl or Python would be a better choice, IMHO.

Mono?

C# has been ported (somewhat) to Linux under the Mono project.

There are a lot of good parts to the language like having a common runtime.

If you look at the code, it's not difficult to translate to Python or Ruby - perhaps that is it's strength?

Why Mono?

Why use Mono when there are so many other programming languages out there? Do you want to be able to port your programs to Windows? I didnt think so. Then I ask why not use C++, Python, Ruby, NASM, perl, or attempt a bash script? I would even try Brainfuck before the apparent trap that is .Net/Mono.

Apparent Trap?

How is .Net/Mono a trap if it is a standard and we have the patent agreement and M$ flew people in and helped develop it?

Oh thats right since it is a wqeak language and doesn't program well i.e. Tombot, Banshee, F-Spot. We might as well copy the code and make it C++ and double our efforts and have a weaker program to satisfy the closed minded, take persons freedom away RMS and his crew.

If I want Mono and I think open standards are enough why should we lose our freedom from running it?

You all use so much Apple crap (i.e. CUPS its crap) and they are more scary in my books.

Correction

I found a typo...

The connection string i.e. the try/catch block should reside in the constructor block instead of the main.cs file.

Thank you!

Thank you for these cool tutorials. I like that you have choosen Mono!

Thanks!

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Post new comment

CAPTCHA
We can't accept links (unless you obfuscate them). You also need to negotiate the following CAPTCHA...

Username:   Password:
Create Account | About TuxRadar