Beat the CIA
The World Wide Web is the greatest system for sharing information ever created – but how do you stop it sharing too much? Ben Everard investigates.
You’re not paranoid – they really are watching you. Criminals, web companies and governments all have a reason to spy on your online life, and the methods that they use are becoming increasingly sophisticated.
2011 was the most dangerous year to be an online citizen, particularly if you happened not to agree with everything your government said.
199 people around the world were arrested or detained because of content they posted online. Many are still languishing in jail.
The offending information ranged from exposés of environmental damage to religious instruction and criticism of unelected autocrats.
In addition, there has been a recent increase in the use of netizens’ information by web companies. Privacy policies have been extended, and Twitter now sells the rights to users’ data. Some of the self-protection methods shown here will have an impact on how you can use a computer.
For most people, implementing all of them would be over the top.
What we’re aiming to do here is show you who can find out what about you, and how to stop them. What you do with that information is, of course, up to you. Whether you are concerned about the scale of information gathered by web companies, or you are hiding from a corrupt government, read on to find out how to keep your data yours.
You can find out just how much information you’re revealing to the world using Wireshark. This tool captures all information passing through your network interfaces and allows you to search and filter for particular patterns. It takes information from your network interface, so any information displayed in it is visible to other (potentially malicious) people on the network.
Wireshark should be available through your package manager, or from wireshark.org. Once installed, you can start it with:
You will get a message telling you that you’ve started it with super user privileges and this isn’t a good way of doing it. If you plan on using the tool a lot, you should follow the Wireshark project’s guide on a better set-up, but for a one-off, you can ignore this. Click on your network device in the interface list (probably eth0 for a wired network and wlan0 for a wireless one) to start a capture. As soon as you start using the network, the top part of the screen will fill with variously coloured packets. The tool has a filter to help you make some sense of this multi-coloured mess. For example, you can keep a prying eye on duckduckgo.com searches using the filter:
http.request.full_uri contains "duckduckgo.com?q"
If you now do a search using http://duckduckgo.com, it will appear in the list, and the search term will be in the Info column. A similar technique could be used on any of the popular search engines.
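To see why the search term is so easy to pick out, it helps to look at how it travels inside the URL. The following is a minimal Python sketch using only the standard library. The captured URL is an invented example of the kind of request Wireshark would show, not a real capture, and search_terms is a name chosen for illustration.

```python
from urllib.parse import urlsplit, parse_qs

def search_terms(full_uri):
    """Return the value(s) of the q parameter in a URL, as an eavesdropper could."""
    query = urlsplit(full_uri).query
    return parse_qs(query).get("q", [])

# Hypothetical request URI of the kind shown in Wireshark's Info column.
captured = "http://duckduckgo.com/?q=linux+privacy+tools"
print(search_terms(captured))
```

Anyone who can see the request can run the same two lines of parsing, which is why a plain HTTP search is effectively public on the network it crosses.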
The author's devious website (bottom) uses the same icon as a legitimately secure website (top) to try to trick users into thinking that it's safe
Keep up to date
Your web browser and SSL libraries should be considered critical applications and be kept up to date with security patches to stop any malicious agents taking advantage of bugs in them to side-step the encryption.
You may not be concerned about people being able to read your search terms, but exactly the same technique can be used to pull usernames and passwords that are sent in plain text. For example, most forums send passwords in plain text (because they’re not a serious security risk, and secure certificates can be expensive). The www.linuxformat.com forums are set up in this way.
To sniff LinuxFormat.com passwords, fire up Wireshark and start a packet capture using the filter:
http.request.uri contains "login.php"
When you log in to www.linuxformat.com/forums/index.php (you will need to create an account if you don’t already have one), the filter will capture the packet. The line-based text data will contain:
How many computers are you sharing this information with? Depending on your network set-up, probably every other computer on the LAN or wireless network, plus every computer that sits on the route between you and the server you’re communicating with. To discover what these are, use traceroute to map the path the packets take. For example,
If your computer’s behind a firewall, you may find that this just outputs a series of asterisks. In this case, you can use a web-based traceroute such as the ones indexed at www.traceroute.org. This list is a little out of date, and not all of the servers are still hosting traceroute, but you should be able to find one that works in your area.
Do you know who’s running these computers? Or who has remote access to them? Do you want these people to be able to see everything you do online?
If you use services with unsecured passwords (and there’s no reason you shouldn’t, as long as you understand the implications), then it’s important not to use the same password for a secure service.
Figure 1. Wireshark lets you add extra columns to the packet list.
Secure (using SSL)
The most basic piece of the web privacy puzzle is the Secure Sockets Layer (SSL). This rather obscure-sounding protocol is a way of creating an encrypted channel between an application running on your computer and an application running on another computer. For each insecure network protocol, there’s a secure one that does the same basic task, but through an SSL channel. See the table, above, for details. Any time you use a protocol from the left column, an eavesdropper can read what you send, but if you use one from the right column, only the intended recipient can see the data.
For web browsing, it’s the last one on the table that’s important. As we saw before, many computers can read what we send in HTTP, but if we perform the same test again, but using duckduckgo’s secure web page – https://www.duckduckgo.com (note the s) – then you will find that the information does not appear in Wireshark.
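As a rough illustration of what a client does when it opens a secure channel, here is a minimal Python sketch using the standard ssl module (which implements TLS, the successor to SSL). The function name open_secure_channel is our own, and the function is defined but not called, so nothing is sent over the network when the block runs.

```python
import socket
import ssl

def open_secure_channel(host, port=443, timeout=10):
    """Connect to host over TLS and return the negotiated protocol version."""
    context = ssl.create_default_context()  # verifies certificate and hostname
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as secure:
            return secure.version()

# The default context refuses servers whose certificate cannot be verified.
context = ssl.create_default_context()
print(context.verify_mode == ssl.CERT_REQUIRED)
print(context.check_hostname)
```

The two checks printed at the end are what stop a server with an untrusted certificate from being accepted silently. They rely on the certificate authorities installed on the machine, so they are only as trustworthy as that machine.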
Some web browsers show a padlock when connected to a secure website, but this can be spoofed easily using favicons (see image, left). If you’re unsure, click on the icon.
A legitimate padlock will open a pop-up telling you about the security on the page.
Of course, this ensures only that the information can’t be read as it’s being transmitted between your computer and the server. Once there, the organisation running the server could pass it on to third parties, or transmit it insecurely between their data centres. Once you send information, you lose control of it. Before hitting Submit, always ask yourself, do you trust the organisation receiving the data? If not, don’t send it.
HTTPS is a great way to keep your web browsing private. However, because of the way it has been bolted on top of HTTP, it isn’t always easy to make sure you use it. For example, if you use https://www.google.com to search for ‘wikipedia’, it will direct you to the HTTP version of the encyclopaedia, not the HTTPS version.
The Electronic Frontier Foundation (EFF), a non-profit dedicated to defending digital rights, has developed an extension for Firefox that forces browsers to use HTTPS wherever it’s available. A Chrome version is currently in beta. Get this from https://www.eff.org/https-everywhere to keep your web usage away from eavesdroppers.
Like all forms of encryption, SSL has a weakness, and that’s the keys which are stored in certificates. Just as a hacker can easily get in to your accounts if they know your password, they can easily eavesdrop on SSL encrypted data – or spoof it – if they can trick your computer into using their certificates. The main point here is that they are stored on the computer, not in your memory like passwords. If an attacker can put files on your system, they can break SSL encryption. You are at particular risk when using a computer you haven’t personally installed the operating system on, such as a work machine or at an internet café.
You should be able to view the current certificates and authorities in your browser’s security settings, but it isn’t always easy to identify things that shouldn’t be there. Here, live distros come to the rescue, since you can carry a trusted operating system with you and use that whenever you are at a computer of dubious provenance.
Using SSL will keep your data safe from eavesdroppers, but what if the companies that you’re communicating with are spying on you?
Google, Facebook, Twitter and others have built business models out of providing users with a free service in return for information about them. This information can then be used to target advertisements at you. Twitter has even gone a step further and sold users’ tweets to market researchers. Some people may consider this a fair trade, but privacy campaigners are becoming increasingly concerned about the sheer quantity of data these companies are holding about us. And this data goes way beyond what we voluntarily hand over to them. Both Google and Facebook have established relationships with literally millions of other websites to help them track your movements around the web using cookies. These may sound like tasty treats, but are actually pieces of information stored on your computer to help sites identify you when your browser reaches them.
To find out just how much these companies are tracking us, we can use Wireshark to monitor our network connection and watch for the cookie data being sent back.
Start Wireshark and capture on your main network interface. In the filter box enter
This will now show only packets that relate to cookies that are being sent to web servers. To display a little more of the information that is being acquired, go to the middle pane and click on the arrow next to Hypertext Transfer Protocol. There are two sections in here that allow the web company to track us: the host and the referrer. Right-click on each of these and select Apply As Column (see Figure 1). This will then add these fields to the main view. Each of these two domains allows the host (the organisation receiving the cookie) to monitor your activity on the referrer. In addition to this, the host uses a unique ID to track your activity between sessions.
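To make the roles of the host and the referrer concrete, the sketch below parses a single HTTP request and lists what the receiving organisation learns from it. It is written in Python with the standard library only. The request text, the domain names and the cookie value are all invented for illustration and do not come from a real capture.

```python
from urllib.parse import urlsplit

# Hypothetical request of the kind the filter reveals; all names are invented.
RAW_REQUEST = (
    "GET /pixel.gif HTTP/1.1\r\n"
    "Host: ads.tracker.example\r\n"
    "Referer: http://news.example/articles/42\r\n"
    "Cookie: uid=abc123\r\n"
    "\r\n"
)

def parse_headers(raw):
    """Return the request headers as a dict with lower-case names."""
    headers = {}
    for line in raw.split("\r\n")[1:]:   # skip the request line
        if not line:
            break                        # a blank line ends the headers
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers

def what_the_host_learns(raw):
    """Summarise who receives the cookie and which site the user was on."""
    headers = parse_headers(raw)
    return {
        "host": headers.get("host"),
        "visited_site": urlsplit(headers.get("referer", "")).hostname,
        "cookie": headers.get("cookie"),
    }

print(what_the_host_learns(RAW_REQUEST))
```

When the host and the visited site are different domains, as here, the cookie is a third-party one: the same ID turns up again on every other site that embeds content from that host.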
You can delete the evercookie LSO, but it will reappear next time you visit the site.
Google uses its advertising services to monitor what we do, whereas Facebook uses its Like buttons. There’s no way of knowing exactly what these companies are doing with the data they collect – we can see only what they’re receiving.
Fortunately, most browsers allow you to control cookies. Depending on your personal feelings, you may choose to limit cookies to certain websites (where they can be useful to remember preferences), or block them completely. If you use Firefox, go to Edit > Preferences > Privacy, and change Firefox Will to Use Custom Settings for History.
When Tor doesn't work
If you’ve read the section on Tor and think it provides the perfect way to update your music/film collection for free without the risk of getting caught, think again. There are two good reasons not to do this: Firstly, it will put a heavy load on the network and slow it down for legitimate users; secondly, it probably won’t work because many popular file sharing protocols will leak your IP address. The Tor anonymisation stops the TCP packet headers identifying you as the sender, but this information is included in the data. This is a bit like using email. If you have an anonymous email address, the header won’t reveal who you are, but if you type your address into the main body, the recipient knows where to look.
The simple way of ensuring you remain anonymous online is to use the internet only through the official Tor browser bundle and not any other applications. Use webmail rather than email and beware of documents that can contain web links (such as DOC and PDF files).
If you untick Accept Cookies From Sites, Firefox will not store any cookies. To do the same in Chromium go to Preferences (the spanner by the address bar) > Under the Bonnet and change Cookies to Block Sites From Setting Any Data. In Konqueror, this can be done through Settings > Configure Konqueror > Cookies and unchecking Enable Cookies. For lightweight KDE users, it can be done in Rekonq by going to Settings (the spanner by the address bar) > Network > Cookies and unchecking Enable Cookies. As well as allowing you to completely block cookies, both Firefox and Chromium give you the option of blocking third-party cookies (In Konqueror and Rekonq, this is Only Accept Cookies From Originating Server). This means they block cookies from domains other than that of the current website. If you do this, websites can store data about you, such as your preferences, and can track your movements within the site, but other sites won’t be able to follow your movements once you leave the domain. This will stop companies from tracking your movements across the web.
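The same idea of refusing cookies from particular domains also exists outside the browser. As a loose analogy, and not a description of how any browser implements its settings, Python's standard http.cookiejar module lets a script refuse cookies from a list of domains. The tracker domain below is invented.

```python
from http.cookiejar import CookieJar, DefaultCookiePolicy

# Hypothetical tracker domain; the leading-dot entry also covers subdomains.
policy = DefaultCookiePolicy(
    blocked_domains=["tracker.example", ".tracker.example"],
)
jar = CookieJar(policy=policy)  # a jar using this policy ignores blocked domains

for domain in ("ads.tracker.example", "tracker.example", "news.example"):
    print(domain, "blocked" if policy.is_blocked(domain) else "allowed")
```

A block list like this only stops the domains you already know about, which is why the blanket third-party option in the browser is the more practical setting for most people.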
Hostip.info can locate our IP address to within a mile of LXF Towers.
If you set this up, then run cookie tracking in Wireshark, as was done above, you will see that the referrer and the host are always the same domain. For many users, this will be a happy medium: cookies still serve their original purpose of letting sites recognise returning visitors, but organisations are blocked from following their online movements.
Cookies aren’t the only way that websites can track you. Even if you have browser cookies disabled, sites can still store tracking information on your computer using Locally Shared Objects (LSOs). These function exactly like cookies, except that they’re accessed through Flash rather than directly through your browser. To view and control which websites are using these, go to Macromedia’s Website Storage Settings Panel at www.macromedia.com/support/documentation/en/flashplayer/help/settings_manager07.html
Webmasters intent on tracking you can use a combination of techniques to create zombie cookies. These store the same information in more than one place so that when you destroy one, they regenerate using the others. For example, if you delete all browser cookies, the website can recreate the cookie from an LSO, and vice versa. As long as one of these remains, all the others can regenerate. Samy Kamkar has taken this to the extreme at samy.pl/evercookie, where he uses 12 different methods to resurrect
Using Vidalia, which comes with the Tor bundle, you can see how your anonymous connection moves around the world.
Using open source software
Perhaps the biggest risk to privacy is letting code you can’t verify run on your computer – as some iOS users found out when researchers discovered in 2011 that their devices were silently tracking their movements. Governments have also used proprietary systems to hide tracking abilities.
Also in 2011, The Economist reported that an American company sold Saddam Hussein’s government photocopiers that, as most similar systems do, came bundled with a proprietary system. Hidden away inside this sealed package was a method for transmitting GPS co-ordinates to the US army. When George Bush launched Operation Enduring Freedom, these tracking units flicked into life and guided missiles directly to the government departments in Baghdad. The full story is at www.economist.com/node/18527456
Richard Stallman restricts his hardware to machines that run entirely free software, from the BIOS to all device drivers. Because of this, he uses the Lemote Yeelong with PMON to boot and the gNewSense flavour of Linux (apologies to Mr Stallman, who will be fuming if he reads this, but for reasons of brevity, we omit the GNU/).
While using open source software will help ensure there are no gremlins in the code, it’s essentially impossible for most users to ensure that there are no malicious aspects hidden in their hardware. The only way to protect yourself is to get the hardware from trusted sources.
We think running the NoScript extension for Firefox should prevent this type of cookie from working, but it also disables the method of testing it! We found that neither Private mode in Firefox, nor Incognito mode in Chromium were able to prevent this. If you need to be sure that your web browsing isn’t being tracked across sessions, the best solution is to use a non-persistent system. That is, a system that doesn’t carry any information over from one session to the next. You can still be tracked during a browsing session, but not between them.
For Linux users, the most obvious option is a live DVD. This doesn’t have to be a physical disc running live – an ISO running in a virtual machine will do the job. This means that all data that the websites can use to track you is reset each time you restart the virtual machine. You can also run more than one virtual machine simultaneously to prevent anyone linking two sessions. If it ever comes into being, a live version of Boot To Gecko would be a particularly convenient way to do this, but this is still in development.
There is one, slightly more devious, technique that websites can use to identify you. This is by amalgamating all information about the capabilities of your browser and system into a digital fingerprint. Because of the amount of information that your browser will, if asked, reveal about you, this fingerprint can often be used to uniquely identify you to a site.
Once again, the EFF is active in this area, and hosts a website to help you understand what your fingerprint is. Point your browser to panopticlick.eff.org to see how unique you are. At the time of writing, more than two million people had used the site to check their browsers, and we still found that most of our machines could be uniquely identified. This means any website could track us even without cookies, LSOs or any of the other storage techniques. At the moment, this is a theoretical vulnerability, and there have been no known cases of browser fingerprinting in the wild.
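The idea behind a fingerprint can be sketched in a few lines of Python. This is an illustration under our own assumptions, not the method the EFF's site uses: the attribute names and values are invented, and the two function names are ours. It combines a handful of browser properties into one identifier and shows how many bits of identifying information it takes to single out one browser among two million.

```python
import hashlib
import math

def fingerprint(attributes):
    """Hash a dict of browser attributes into a short, stable identifier."""
    canonical = "|".join(f"{key}={attributes[key]}" for key in sorted(attributes))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]

def bits_of_identity(one_in_n):
    """Bits of identifying information if one browser in one_in_n shares a value."""
    return math.log2(one_in_n)

# Invented attribute values for illustration only.
browser = {
    "user_agent": "Mozilla/5.0 (X11; Linux x86_64) Firefox/12.0",
    "screen": "1920x1080x24",
    "timezone": "UTC+0",
    "fonts": "DejaVu Sans,Liberation Serif,Ubuntu",
    "plugins": "Flash 11.2,Java 1.6",
}

print(fingerprint(browser))
print(round(bits_of_identity(2_000_000), 1))
```

Being unique among two million browsers corresponds to roughly 21 bits of information. A long font list or an unusual set of plug-ins can supply a large share of that on its own, which is why no stored cookie is needed.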
The author found 40 cookies on his installation of Chromium, and it's not even his main browser.
The Collusion extension for Firefox creates a graph of how cookies' information passes around sites. Once the extension is installed, click on the white circle in the bottom-right to start.
If you’re concerned about being tracked this way, the best way to prevent it is to stop scripts from running. This massively reduces the amount of information that a website can use to form the fingerprint. The NoScript extension for Firefox provides an easy way to control which scripts run on a site. However, this will severely limit the function of many interactive websites.
Web pages are made up of a number of different elements that your browser reassembles to make a single document. These elements may come from many different places, organisations and servers. Any of these could contain some degree of monitoring using a technique called web bugging (also known as web beacons or pixel tags). These use images to generate HTTP requests that log your activities with a different server to the one hosting the website. These could potentially track you using browser fingerprinting, but they’re also used more widely. They’re not restricted to web pages, and can be used in any HTML document. Most commonly, they’re used by spammers to identify active email addresses.
Viruses and trojans aren’t the preserve of cyber criminals and geeks with a point to prove. Governments can use them to spy on their populations, and some have. Wikileaks published documents from 2008 showing the Bavarian Police and Prosecutor’s office splitting the cost of a system for intercepting encrypted Skype calls. The State government has since admitted to using the software to intercept calls. The trojan they used evades encryption by grabbing the data from the person’s computer before it is secured.
More recently, the Syrian government has used malware to gain access to dissidents’ computers. This software is capable of logging keystrokes and activating the users’ webcams. This enabled the regime to link people with anonymous internet accounts, and impersonate people in online forums.
To avoid these threats, take all normal precautions against malware. While there are no known viruses capable of infecting modern Linux machines (the few that have existed used security holes that have been closed), the same is not true of trojan horses. As we showed in LXF154, these are easy to create for Linux.
The issue of trojans in Linux briefly hit the headlines in March 2012, when Anonymous-OS was released by persons unknown. Shortly after its release, @YourAnonNews announced on Twitter that it was “wrapped in trojans”. However, security analysts seemed strangely silent on the matter – most just muttered that it wasn’t sensible to run software from an unverified source.
This highlighted the problem that malware identification is much less mature under Linux than on Windows. Tools such as rkhunter will flush out some of the more common trojans, but it missed the one we created in LXF154. One analyst verified that the packages were those from the official servers, and hadn’t been tampered with, but there are many other places to hide malware. Running Tor compounds the problem. This means that you can’t simply monitor the traffic leaving your computer – it’s encrypted and anonymised. The moral of the story is that it’s important to prevent malware getting onto your Linux box, because without good quality anti-virus software, it can be hard to spot, let alone get rid of. The most important ways to protect your system are to install only software signed by a trusted source (such as your distro’s repository) and use full disk encryption to prevent a malicious agent putting it on there.
Running 'sudo rkhunter --check' will check your system for known malware.
If you open an email containing one of these images, the spammer will be able to tell that you’re checking the address, and that you can be persuaded to open spam emails.
Fortunately, most email clients and web mail providers disable image loading by default.
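To show what a web bug looks like in practice, the following Python sketch scans a fragment of HTML for tiny images served from a different host. It uses only the standard library. The HTML, the host names and the WebBugFinder class are invented for illustration, and real bugs are not always marked with a 1x1 size, so treat the rule as a rough heuristic.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

class WebBugFinder(HTMLParser):
    """Collect images that are tiny and served from a different host."""

    def __init__(self, page_host):
        super().__init__()
        self.page_host = page_host
        self.suspects = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src") or ""
        host = urlsplit(src).hostname
        tiny = (attributes.get("width") in ("0", "1")
                and attributes.get("height") in ("0", "1"))
        if host and host != self.page_host and tiny:
            self.suspects.append(src)

# Hypothetical message body: one ordinary logo and one tracking pixel.
HTML = """
<p>Special offer!</p>
<img src="http://news.example/logo.png" width="200" height="60">
<img src="http://tracker.example/open.gif?id=user42" width="1" height="1">
"""

finder = WebBugFinder(page_host="news.example")
finder.feed(HTML)
print(finder.suspects)
```

The id parameter in the image URL is the important part: simply fetching the image tells the tracker which recipient opened the message.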
When you connect to the internet, your service provider assigns you an IP (Internet Protocol) address. This tells web servers and other computers you communicate with where to send the information. Any computer you interact with online can tell which IP address you use. From this, they can find out some information, mainly your service provider and approximate location. Check out www.hostip.info to find out what you’re transmitting to the world. Since IP addresses change periodically, web servers can’t get closer to you than this. However, government agencies can force your service provider to reveal which subscriber was allocated to which IP address at what time.
In short, they can link an online act with a physical computer.
For example, in April 2004 Shi Tao, a Chinese journalist, used Yahoo web mail to email the Asia Democracy Foundation with details of the Chinese Government’s attempts to stifle news reports on the 15th anniversary of the Tiananmen Square massacre. His government got the IP address he used from Yahoo and, since the ISP was state-controlled, could find out exactly where the email was sent from. In November, he was arrested, and in March 2005 he was sentenced to ten years in prison.
Wireshark captures vast quantities of information. Fortunately though, it also has filters to help you understand it.
The I2P project aims to implement an anonymised and secure network. Unlike Tor or HTTPS, this isn’t a link into the web, but a separate peer to peer network. The software is still in the beta stage, and so it shouldn’t be used in situations where anonymity is essential. However, the project is growing and developing, and hopefully it will soon be another weapon in the war against digital intrusion.
To protect yourself from this level of scrutiny, you need to make sure that there’s no link between you (and your IP) and the server you’re communicating with. Simply encrypting your communication isn’t enough, because it still allows the server to know who sent it – it just prevents eavesdroppers. You can achieve the necessary privacy by passing your data through a series of encrypted relays. This technique is called onion routing, and has been implemented by the Tor Project. It works like this:
- Communicate with the Tor directory server, which will reply with three random relays.
- Encrypt your data with keys for each of the relays.
- Send this encrypted package to the first relay. This server knows your IP address, but doesn’t know what you’re doing, since your data is encrypted with the keys to the other relays. The only piece of information it can access is the location of the second relay.
- The first relay sends the encrypted package to the second relay, which can decrypt only the location of the third relay. This computer knows the location of the other two relays, but not your IP or what you are trying to communicate with.
- The second relay sends the encrypted package to the third. This computer can decrypt your message and send it out of the Tor network on to the intended recipient. The third relay can see the final recipient of your data (and if you’re using an unencrypted protocol, the actual data), and the location of the second relay, but it doesn’t know your identity.
- The recipient gets your request as though it had come from the third relay. They don’t know your identity, or even that there is someone hidden behind the third relay. They respond to the third relay.
- The third relay passes the information back to you through the Tor network in the same manner as you sent it.
No one on the network knows both the identity of the original sender, and the recipient. However, Tor is an anonymisation system, not an encryption system. While the data is encrypted as it passes through the relays, once it leaves the network, it’s no more or less secure than any other information on the internet. To keep your data private, you need to use the same precautions you would if you were not using Tor – ie use one of the encrypted protocols listed in the table, above.
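The layering described in the steps above can be sketched in Python. This is a toy model under loud assumptions: the cipher is a home-made XOR keystream that is NOT secure, the relay names and keys are invented, and real Tor negotiates keys with each relay and uses proper cryptography. The point is only to show that each relay can remove exactly one layer and learns nothing beyond the next hop.

```python
import base64
import hashlib
import json

def keystream_xor(key, data):
    """Toy stream cipher: XOR data with a SHA-256 based keystream. NOT secure."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)

def wrap(key, next_hop, payload):
    """Add one layer: encrypt the next hop and the inner payload for one relay."""
    layer = json.dumps({"next": next_hop,
                        "payload": base64.b64encode(payload).decode()})
    return keystream_xor(key, layer.encode())

def peel(key, blob):
    """Remove one layer, as a relay would, revealing only the next hop."""
    layer = json.loads(keystream_xor(key, blob).decode())
    return layer["next"], base64.b64decode(layer["payload"])

# Hypothetical relays and keys, for illustration only.
relays = [("relay1", b"key-one"), ("relay2", b"key-two"), ("relay3", b"key-three")]
message = b"GET / HTTP/1.1"

# Build the onion from the inside out: the exit relay's layer is innermost.
onion = wrap(relays[2][1], "destination.example", message)
onion = wrap(relays[1][1], "relay3", onion)
onion = wrap(relays[0][1], "relay2", onion)

# Each relay peels exactly one layer and forwards what is left.
hops = []
for name, key in relays:
    next_hop, onion = peel(key, onion)
    hops.append(next_hop)
    print(f"{name} forwards to {next_hop}")

print(onion)  # what the exit relay finally sends to the destination
```

Note that the last thing printed is the original request in the clear. That is the exit relay's view, and it is the reason an encrypted protocol is still needed on top of Tor.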
Just because something isn’t on the internet, doesn’t mean it can’t be traced. Theoretically, at least, it’s possible to trace DVDs, CDs, some USB sticks and other data storage devices through serial numbers. Using these, government agencies could trace them to the distributor and, depending on the payment method, to the purchaser. If you intend to make private information public through a service such as Wikileaks, using an offline method (such as the postal service), you need to be sure that such things cannot be traced back to you. Not because we think Wikileaks may pass on your information, but because it may be intercepted. To avoid this risk, purchase the devices (including CD/DVD burners) in cash away from your home. And don’t send from a postal location that can be linked to you, such as a close post box or a post office with CCTV.
Sounds complicated? Fortunately, the Tor Project has put all the necessary tools in a single package with a secure version of Firefox. It’s on the disk, or available from www.torproject.org – just unzip the file and run start-tor-browser. It will connect to the network and open a secure browser. If you are on the run (in any sense of the phrase), you can browse securely via Orbot for Android or Covert Browser for iOS.
Even the utterly devious evercookie can't track you between different virtual machines.
There are potential statistical attacks against the network. For example, if an organisation can see all the data going into the network, and all the data coming out of it, the timing and quantities of packets may reveal which user sent what. However, due to the worldwide nature of the system, this would require co-ordinated and systematic monitoring across many countries.
You may think that using an internet account not linked to a physical location – such as mobile or satellite phones – will improve this situation, but it does the opposite. Mobile phone signals can be triangulated, and many satellite phones include the GPS co-ordinates of the phone in the connection to the service provider. Polish firm TS2 sells a product that can pinpoint a satellite phone user: www.ts2.pl/en/News/1/151. It’s possible that technology similar to this was used by the Syrian regime to target and kill journalists in Homs earlier this year. See the box on When Tor doesn’t work for details of why using Tor would not have protected them from this form of surveillance.
Some regimes, most notably in China, appear to have taken steps to stop their citizens accessing Tor. The simplest way of doing this is to download a list of Tor relays and stop all connections to those machines. To allow users to bypass this, Tor has introduced a series of bridges. These are routes into the network that aren’t published. A game of cat and mouse has now begun between the Tor Project and organisations trying to block access to the anonymisation service.
Like many community-based projects, Tor needs volunteers. However, unusually for a free software project, programmers are not the most needed people. Running a relay or bridge will help keep people anonymous. Translators and people working in advocacy are also in demand. To see how you can help people maintain both their privacy and freedom of speech, check out www.torproject.org/getinvolved/volunteer
If you’re interested in privacy, then the chances are you use full disk encryption. If you don’t then you may wish to consider it. It’s easy to set up, usually just a tickbox during the distro install, and on a modern system the performance penalty should be minor for most purposes. Note that partial disk encryption is considerably less secure – in issue 154 we showed one of many methods for circumventing it. Modern encryption methods using algorithms such as AES are unbreakable without the passphrase, provided a sufficiently long key is used (AES-128 should be considered a minimum. If the CIA is on your tail, then AES-256 is better). There are a few methods a government agency can use to acquire this passphrase. Unfortunately, the easiest (for them) is torture. The second easiest is to try to guess your passphrase using a dictionary attack. However, let’s assume that you’ve picked an unguessable passphrase, and managed to jump out of the window and flee when the knock at the door came. Your secrets will be safe, right? Well, not quite. When you’re using an encrypted drive, the computer stores the decryption keys in the memory.
If they smash through the door just in time to see the computer shutting down, they could put a memory scrubbing tool in your computer and restart it. Contrary to popular belief, the RAM in your computer isn’t wiped when it’s powered off, just very soon afterwards. Researchers at Princeton were able to steal encryption keys from the memory of restarted computers. The tools they created to do this are available from https://citp.princeton.edu/research/memory
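Returning to the dictionary attack mentioned earlier, the sketch below shows why a guessable passphrase undermines even a strong cipher. It is written in Python and assumes a PBKDF2-style key derivation, which is an illustrative choice on our part and not the on-disk format of any particular encryption tool. The salt, the iteration count and the four-word list are all invented.

```python
import hashlib

def derive_key(passphrase, salt, iterations=100_000):
    """Stretch a passphrase into a 256-bit key with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode("utf-8"),
                               salt, iterations, dklen=32)

def dictionary_attack(target_key, salt, wordlist, iterations=100_000):
    """Try each candidate passphrase; return the one that reproduces target_key."""
    for candidate in wordlist:
        if derive_key(candidate, salt, iterations) == target_key:
            return candidate
    return None

salt = b"example-salt"    # a salt is not secret; the attacker is assumed to have it
iterations = 10_000       # kept low so the demonstration runs quickly
wordlist = ["password", "letmein", "dragon", "trustno1"]   # tiny invented list

weak = derive_key("letmein", salt, iterations)
strong = derive_key("correct horse battery staple 7!", salt, iterations)

print(dictionary_attack(weak, salt, wordlist, iterations))
print(dictionary_attack(strong, salt, wordlist, iterations))
```

The key length is the same in both cases. What differs is whether the passphrase appears in the attacker's list, which is why an unguessable passphrase matters as much as the choice between AES-128 and AES-256.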
Email has become an indispensable communication tool. However, it comes from the days before electronic spying and privacy concerns. In fact, the contents of your mails are sent unencrypted and can be read by anyone with access to the network. Using web mail over HTTPS will limit the number of people who can listen in, but only slightly, because once your mail is at the mail provider’s server, it is decrypted before being sent on to its intended recipient.
If you want to secure your email, you’ll need to encrypt it yourself before you let it leave your computer. The easiest method for doing this in Linux is GPG (GNU Privacy Guard), which has plugins for most popular email clients. These offer a simple, easy-to-use method for keeping the contents of your mail away from everyone other than the intended recipient. As with all encryption, the security is dependent on the quality of the keys that are used (and their secrecy).
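If you prefer to see what the email plugins do on your behalf, the Python sketch below builds the gpg command that encrypts a file for one recipient. The recipient address and file name are invented, the two function names are ours, and it assumes you have already imported the recipient's public key. The command is only printed here; encrypt_file runs it, and only if gpg is installed.

```python
import shutil
import subprocess

def gpg_encrypt_command(recipient, path):
    """Build the gpg command line that encrypts path for recipient."""
    return ["gpg", "--armor", "--encrypt", "--recipient", recipient, path]

def encrypt_file(recipient, path):
    """Run gpg if it is installed; return True if it reports success."""
    if shutil.which("gpg") is None:
        return False
    result = subprocess.run(gpg_encrypt_command(recipient, path))
    return result.returncode == 0

# Hypothetical recipient and file name.
print(" ".join(gpg_encrypt_command("alice@example.org", "message.txt")))
```

With --armor the output is plain ASCII text, written by default to a file with .asc added to the name, so it can be pasted into the body of an ordinary email.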
If you only locked or suspended your computer, then the situation’s even worse.
In these cases, the spooks will have time to freeze the memory before rebooting it (or transferring to a computer set up to scrub the memory). At room temperature, memory typically becomes unusable after a few seconds. If it’s frozen to around -50˚C (which is achievable using cheap aerosols), that time increases to several minutes.
To avoid this style of attack, you need to stop them being able to access usable memory. Don’t leave your machine locked or suspended. If you have valuable information on it, turn it off. And prevent booting from devices other than the hard drive without a password. This will stop them booting straight into a tool such as the Princeton researchers’ USB scrubbing tool. By the time they’ve managed to bypass your BIOS’s security, the memory will be useless.
Using longer encryption keys will also help, since slight errors often creep in during the scrubbing process. The longer your key, the more of these errors it’s likely to pick up.
If the men in black are really on your tail, then you could consider running your laptop without its battery. This means that you have only to pull out the power cable before running away.