Kurt Roeckx's journal

9th September 2013

With all the articles about the NSA going around it seems to be hard to follow what is still safe. After reading various things it boils down to:

Properly done encryption is still safe.
They go after the weak things like passwords, bugs in software, ...
Some software might contain backdoors.

I don't think this should really surprise anybody.

There is also speculation that maybe the NSA has better algorithms for reducing the complexity of public key cryptography or that maybe RC4 might be broken. Those things are certainly possible, but it's not clear whether they are true. Nobody has suggested that AES has any problems, and I think that's still safe to use.

One of the questions then becomes what properly done encryption means. There are various factors to this and many applications using various protocols. SSL/TLS is the most used, so I'm going to concentrate on the parts in that.

Bit sizes

As far as I understand it, 80 bits of security is around what is currently the upper practical limit of a brute force attack. In 2003 NIST said to stop using 80 bits by 2015, in 2005 they said to stop using it by 2010.

When using something like SSL/TLS you are using various things, each having its own size. So it's important to know which is the weakest part.

Public key encryption (RSA, DSA, DH) doesn't offer the same amount of security per bit as symmetric keys. The equivalent sizes I find are:

Symmetric	Public
80	1024
112	2048
128	3072
256	15360

There are other sources that have other numbers, but they're similar.

So this basically means you really should stop using 80 bit symmetric and 1024 bit public keys, that your public key really should be at least 2048 bit, and that you should maybe even consider going to 3072 bit.
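To make the "weakest part" idea concrete, here is a minimal sketch that looks up the table above. The function names are made up for illustration, and the numbers are only the ones from the table, not an authoritative source.

```python
# Equivalent strengths copied from the table above:
# (symmetric bits, public key bits for RSA/DSA/DH).
EQUIVALENT_SIZES = [(80, 1024), (112, 2048), (128, 3072), (256, 15360)]


def symmetric_strength(public_bits):
    """Largest symmetric strength from the table that this public key size reaches."""
    strength = 0  # below 1024 bit the table has no entry
    for symmetric, public in EQUIVALENT_SIZES:
        if public_bits >= public:
            strength = symmetric
    return strength


def weakest_link(public_bits, cipher_bits):
    """The connection is only as strong as its weakest part."""
    return min(symmetric_strength(public_bits), cipher_bits)
```

For example, `weakest_link(1024, 128)` shows that a 128 bit cipher combined with a 1024 bit public key still only gives you about 80 bits.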

There seems to be a push to move to Diffie-Hellman (DH) to provide Perfect Forward Secrecy (PFS). There are many variants of this. We want to use ephemeral (temporary) keys. But it basically boils down to 2 variants:

Standard Ephemeral DH (DHE/EDH) with a public key
Elliptic Curve Ephemeral DH (ECDHE)

With DHE you first negotiate a temporary DH key for this session, (hopefully) using a random number on both sides, and then use that DH key to exchange a normal symmetric key for the session. This temporary DH key is what provides the PFS, assuming that this temporary key is thrown away afterwards. The drawback of this is that creating a session takes a lot more CPU time.
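A toy sketch of such an ephemeral exchange, to show where the random numbers and the throw-away keys come in. The modulus and base below are assumptions picked only to keep the example short (a 127 bit prime is far too small for real use), and hashing the shared value into a key is a simplification, not what TLS actually does.

```python
import hashlib
import secrets

# Toy parameters, assumed for illustration only: a 127 bit Mersenne prime
# and base 3. Real DHE uses vetted groups of 2048 bits or more.
P = 2**127 - 1
G = 3


def ephemeral_keypair():
    """Pick a fresh random private value and compute the public value from it."""
    private = secrets.randbelow(P - 3) + 2  # random value in [2, P - 2]
    public = pow(G, private, P)
    return private, public


def session_key(own_private, peer_public):
    """Both sides end up with the same shared value and derive a key from it."""
    shared = pow(peer_public, own_private, P)
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()


# Each side generates a temporary key pair and only sends the public part.
client_private, client_public = ephemeral_keypair()
server_private, server_public = ephemeral_keypair()
client_key = session_key(client_private, server_public)
server_key = session_key(server_private, client_public)
```

Forward secrecy depends on both private values being discarded once the session key exists. The two `pow()` calls per side are also where the extra CPU time goes.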

The standard DH has the same security as other public key encryption. So we really also want to stop using 1024 bit keys for that.

With ECDHE the key size needs to be double the symmetric size. This is also faster than DHE, so people want to move to this. There are various curves that can be used for this, and some of them were generated by the NSA and people have no trust in them. Some people don't even trust any of the curves. Most people seem to think that ECDH is the way forward, but then limit the curves they support.

There are also hashes used for various things including signatures and MACs. Depending on how it's used the properties of the hash are important. They have 2 important properties: collision resistance and preimage resistance.

For a collision attack the upper limit is a little more than 2 to the power of half the output size in bits. But there are known attacks that need less work than that. I found these numbers:

MD5	2^20
SHA-1	2^60
SHA-2	No known attack (so SHA-256 would have 1.2 * 2^128)
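The "half the output size" limit is the birthday bound, and you can see it at small scale. The sketch below truncates SHA-256 to a few bits (the truncation is an assumption to make the search finish quickly, it says nothing about full SHA-256) and looks for two different inputs with the same truncated hash.

```python
import hashlib


def truncated_hash(data, bits):
    """The first `bits` bits of SHA-256, as an integer."""
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)


def find_collision(bits):
    """Hash the messages "0", "1", "2", ... until two of them collide."""
    seen = {}
    counter = 0
    while True:
        message = str(counter).encode()
        value = truncated_hash(message, bits)
        if value in seen:
            return seen[value], message, counter + 1
        seen[value] = message
        counter += 1


first, second, tries = find_collision(24)
```

With a 24 bit output you would expect this to need a few thousand tries, somewhere around 2^12, instead of the 2^24 a preimage search would take.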

For preimage attacks I found:

MD5	2^123
SHA-1	No known attack (2^160)
SHA-256	No known attack (2^256)

So I think that MD5 is still safe in uses where a collision attack isn't possible, but it should really be avoided for anything else, and SHA-1 should probably be considered the same. So you want to avoid them in things like certificates.

SSL/TLS also uses MD5 / SHA-1, but as part of an HMAC. I understand that those are not vulnerable to the collision attack, and that it's preimage resistance that is important there, so they are still considered safe.
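For reference, this is roughly what computing and checking an HMAC looks like with Python's standard library. It's only a sketch of the primitive: the real TLS record MAC also covers things like sequence numbers, and the key and message here are made up.

```python
import hashlib
import hmac


def make_tag(key, message):
    """HMAC-SHA1 over the message with a secret key."""
    return hmac.new(key, message, hashlib.sha1).digest()


def verify_tag(key, message, tag):
    """Recompute the tag and compare it without leaking where it differs."""
    return hmac.compare_digest(make_tag(key, message), tag)


key = b"k" * 20  # placeholder key, a real one must be random
tag = make_tag(key, b"some record data")
```

An attacker who doesn't know the key can't produce a valid tag for a changed message, which is why finding two colliding inputs for the bare hash doesn't directly help here.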

Randomness

A lot of the algorithms depend on good random numbers. That means that the attacker can't guess (or make a likely guess at) the random number you've selected. There have been many cases of a bad RNG that then resulted in things getting broken. It's hard to tell from the output of most random number generators whether they are secure or not.

One important thing is that the RNG gets seeded with random information (entropy) to begin with. If it gets no random information, a very limited number of possible inputs, or guessable information as input, it can appear to give random numbers, but they end up being predictable. There have been many cases where this was broken.
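A small sketch of why a limited seed space is fatal. It uses Python's `random` module, which is not a cryptographic RNG at all, purely to show the effect; the 16 bit seed space is an assumption picked so the search is instant.

```python
import random


def generate_key(seed):
    """A 128 bit "key" that looks random but is fully determined by the seed."""
    return random.Random(seed).getrandbits(128)


def recover_seed(observed_key, seed_bits=16):
    """The attacker simply tries every possible seed."""
    for candidate in range(2**seed_bits):
        if generate_key(candidate) == observed_key:
            return candidate
    return None


victim_key = generate_key(12345)
recovered = recover_seed(victim_key)
```

The key is 128 bits long, but there are only 65536 possible keys, so it offers 16 bits of security at best.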

Ciphers

There are various encryption algorithms. But since the public key algorithms take a lot of time we usually want to limit the amount we do with them and do most with a symmetric key encryption. There are basically 2 types of symmetric ciphers: stream and block ciphers.

For block ciphers there are various ways of combining the blocks. The most popular was CBC, but in combination with TLS 1.0 it was vulnerable to the BEAST attack, although there is a workaround for this. Another mode is GCM, but it's not yet widely deployed. This resulted in the stream cipher RC4 suddenly getting popular again.

Some ciphers have known weaknesses. For instance triple DES has 3*56=168 bits, but only provides for 2*56=112 bits of security. Other attacks make this even less.

The algorithms themselves might be secure, but they might be attacked by other mechanisms known as side-channel attacks. The most important of those are timing attacks. If the algorithm has branches, each branch might take a different amount of time. This might result in an attacker finding out some information about the secret key that is being used. It's important that the difference is reduced as much as possible. This is usually not a problem if it's implemented in hardware.
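The classic small example of such a branch is comparing a secret value byte by byte. This is only a sketch of the idea: in an interpreted language like Python you can't really guarantee constant time yourself, which is why the standard library ships `hmac.compare_digest`.

```python
import hmac


def leaky_equals(a, b):
    """Returns at the first mismatch, so the running time depends on the data."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False  # early exit: this is the branch that leaks
    return True


def constant_time_equals(a, b):
    """Looks at every byte and only decides at the end, no early exit."""
    if len(a) != len(b):
        return False
    diff = 0
    for x, y in zip(a, b):
        diff |= x ^ y
    return diff == 0


def stdlib_equals(a, b):
    """What you would actually use in Python."""
    return hmac.compare_digest(a, b)
```

With the leaky version an attacker who can measure the time can guess a secret one byte at a time, because a correct first byte takes slightly longer to reject than a wrong one.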

SSL/TLS Session

A typical SSL/TLS session uses these things:

A certificate chain. Each of those certificates contains:
- A public key. That needs to be big enough.
- A signature, which is usually a combination of a hash with an encryption algorithm. It is made by the next CA in the chain (using its private key), or by the certificate's own key for a self-signed certificate like a root CA. The hash algorithm's collision resistance is important. But this isn't important for a self-signed certificate, like a root CA, since you either trust that or don't.
A key exchange method so that you can share a secret key for the session. This might contain a separate temporary key in the case of (EC)DHE, in which case it needs to be big enough.
A cipher.
A MAC.

Important aspects of using certificates:

You need to check that they are still valid. This can be done using a CRL or via OCSP. But nobody seems to be downloading the CRLs and not all CAs offer OCSP.
You need to check that the hostname of the certificate you get actually matches the hostname you're trying to connect to.
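As an example of what that looks like in a client, here is a minimal sketch using Python's `ssl` module. `create_default_context()` already turns on both chain verification and hostname checking; the function name is made up and the settings are only repeated to make the intent visible. As far as I know it does not check revocation (CRL or OCSP) by default.

```python
import ssl


def strict_client_context():
    """A client context that verifies the chain and the hostname."""
    context = ssl.create_default_context()
    # These are already the defaults, set explicitly to show the intent.
    context.check_hostname = True
    context.verify_mode = ssl.CERT_REQUIRED
    return context


context = strict_client_context()
```

You would then pass the hostname you intended to reach as `server_hostname` to `context.wrap_socket()`, so that a valid certificate for some other name gets rejected.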

The client sends its list of supported ciphers to the server and the server will then pick one. This is usually the first from that list that is supported by the server, but it might also be the order that is configured on the server that is used.
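The two ways of picking can be sketched like this. The function is hypothetical and the cipher names are just OpenSSL-style examples, not a recommendation.

```python
def pick_cipher(client_ciphers, server_ciphers, server_preference=False):
    """Pick the first cipher in the preferred side's order that both sides support."""
    if server_preference:
        preferred, other = server_ciphers, client_ciphers
    else:
        preferred, other = client_ciphers, server_ciphers
    allowed = set(other)
    for cipher in preferred:
        if cipher in allowed:
            return cipher
    return None  # no cipher in common, the handshake fails


client = ["RC4-SHA", "DHE-RSA-AES128-SHA", "ECDHE-RSA-AES128-GCM-SHA256"]
server = ["ECDHE-RSA-AES128-GCM-SHA256", "DHE-RSA-AES128-SHA", "RC4-SHA"]
```

With the client's order this picks `RC4-SHA`, with the server's order it picks `ECDHE-RSA-AES128-GCM-SHA256`. So which side's order is used matters a lot for what you actually end up with.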

Current status in software

If you go looking at all of those things, you find problems in most if not all software. Some of that is to be compatible with old clients or servers, but most of it is not.

If you want to test web servers for various of those properties you can do that at ssllabs. Based on that there are stats available at SSL-pulse.

Apache only supports 1024 bit DHE keys and they don't want to apply a patch to support other bit sizes. Apache 2.2 doesn't offer ECDHE but 2.4 does.

I don't know of any good site that lists which browser supports which ciphers, but you can test your own browser here. It lacks information about EC curves that your client offers. You can of course also look at this with wireshark.

Mozilla currently has a proposal to change its list of supported ciphers here. They currently don't have TLS 1.1 or 1.2 enabled yet, but as far as I understand the NSS library should support it. They should now have support for GCM, but they still have an open bug for a constant time patch. In general Mozilla seems to be moving very slowly with adopting things.

Last I heard Apple still hadn't patched Safari for the BEAST attack.

For XMPP see client to server, server to server, clients.

For TOR see here.

No tags

Anti-piracy and privacy

10th August 2013

In the Dutch language area (Netherlands and Flanders) most e-book publishers changed from using DRM to using a watermark since February. It's now something like 62% with watermark, 24% with DRM and 14% without either. I was hoping that more would move to the watermark, but it doesn't seem to change much.

The watermark in the e-books is used to find the original buyer in case of piracy. The information in the e-book does not itself identify the buyer, it's just some random ID. That ID can then be used to trace back to the buyer.

We work with a 3rd party to deliver the e-books to our consumers. They get a random string from us that we can use to trace back the order, and from there we can trace back who bought it. The 3rd party then generates their own random string and puts that in the e-book. They only keep those 2 random strings, and so have no way to trace this back to the buyer; we are the only one that knows who bought it. And I think this is how it should work. I think this is also what the privacy laws require us to do.
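The separation described above can be sketched with two lookup tables that are kept by different parties. All class and method names here are made up for illustration, this is not the actual software of either side.

```python
import secrets


class Shop:
    """The shop knows the buyer, but never sees the watermark ID."""

    def __init__(self):
        self.orders = {}  # shop token -> buyer

    def new_order(self, buyer):
        token = secrets.token_hex(16)  # random string handed to the 3rd party
        self.orders[token] = buyer
        return token

    def buyer_for(self, token):
        return self.orders.get(token)


class Distributor:
    """The 3rd party knows the watermark ID, but never sees the buyer."""

    def __init__(self):
        self.watermarks = {}  # watermark ID -> shop token

    def deliver(self, shop_token):
        watermark = secrets.token_hex(16)  # random ID embedded in the e-book
        self.watermarks[watermark] = shop_token
        return watermark

    def shop_token_for(self, watermark):
        return self.watermarks.get(watermark)


shop = Shop()
distributor = Distributor()
token = shop.new_order("buyer #1")
watermark = distributor.deliver(token)
```

Going from a found file back to a person takes both tables: the distributor can only map the watermark to the shop's token, and only the shop can map that token to a buyer.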

But now we got a new contract that states that we must directly give information about the buyer if some anti-piracy agency (BREIN) finds an e-book file online. We must keep the information about the buyer for a minimum of 2 years and a maximum of 5 years. And if we don't sign the contract we won't be allowed to sell e-books with a watermark anymore.

So this means that they want to bypass the normal judicial system, and probably contact those buyers they accuse of piracy directly. I questioned whether this was legal. They say that it is legal according to the Dutch privacy law, but I have a hard time interpreting any of the options in article 8 as saying that we can give that information without the explicit consent of the person.

I guess you can interpret option f in several ways, and that their lawyers do it differently than me. This will probably remain unclear until some court decides on it.

No tags

iDEAL and ING

6th December 2012

For the e-webshops site I run I started implementing iDEAL because we have a lot of customers in the Netherlands. iDEAL is a standard used in the Netherlands to do online payments.

We needed to open a bank account for that in the Netherlands, so we did that with ING in October. It first took them a month to reject us, stating we needed to register our company in Belgium. Of course we're already registered, we couldn't be in business in the first place if we weren't. After pointing them to the government website it seems to have sorted itself out a few days later.

So once we were approved I could actually start to test the software for iDEAL. It uses XML sig which I think is a horrible standard. Once I started testing it I directly ran into a problem. They give me an error back saying that the signature wasn't valid. So I contacted them about 2 weeks ago and have yet to get an answer back that's useful.

So far it went like this:

I gave them a full explanation on how they could manually check this using openssl. I showed them the canonical versions of all the XML parts, and how to check that it's valid. I showed the fingerprint of my certificate and how that was the same as on the website.
They replied saying that I either didn't upload a certificate or that it was expired, or that I did something else weird I didn't even understand.
I found that the iDEAL standard says there is a limit of 5 years for certificates while their documentation says to generate one for 10 years. So I uploaded a new certificate but that didn't help, and said so.
They replied back that the signature on my self-signed certificate should have been one using sha256 and not sha1. This makes little sense for a self-signed certificate. The iDEAL standard does say that the signature needs to use at least sha256 in case you don't use a self-signed certificate though.
So I mailed them back saying I created a new certificate with a sha256WithRSAEncryption signature algorithm, but that I still have the same problem.
They mail me back saying I have an invalid certificate and that I should read the manual and they can't help me. They give no indication of exactly what is invalid about the certificate, just that it's invalid.
So I say that I follow all requirements of the standard and used the commands they provided.
They reply back that they checked the logs and that we're using the old version and that we should switch to the new version of iDEAL. Of course I'm using the new version and am sending this to the correct URL.
They provide software (no license mentioned) in php, java and .NET. But that of course is not integrated in anything, it's just the library to do the calls. So I tried the php version and sent a request using the same certificate and actually do get a reply back now. Both my version and the php generated version properly verify.

So I'm now back to step 1.

No tags

Getting frustrated by Google

29th March 2012

I recently started working on a webshop called e-webshops which mainly sells e-books. If I want to attract customers there are various things to do so that people can actually find you. So I started using various services from Google, looking at my results in queries and things like that, and noticed that some queries didn't return the expected results. One of the problems is that I can spell and a lot of people can't, and Google doesn't consider the words to be synonyms. The proper way to spell e-book in Dutch is either e-boek (preferred) or e-book, but lots of people write it as ebook (or eBook). I of course use the preferred spelling everywhere.

So what do I do? I contact Google asking them to make them synonyms. Almost 2 weeks later I get a mail back from someone that doesn't seem to be getting what I'm asking for, so I reply back. Finally after another month I get a mail back pointing me to a blog post, which suggests posting it to a forum or tweeting about it. I don't think either form is a good way to contact a company about such a thing, which is why I contacted them by email in the first place. I have strong doubts that the right people would actually read all the forum posts. How hard can it be to forward an email to the right people?

At the same time I have various other problems with Google, like things that don't work as advertised, broken documentation links, incorrect documentation, no way to contact them about the topic you want and you need to select a random other one, asking me to rate a help page I visited that day and then give an obviously broken URL, and a whole bunch more.

No tags

Canon EOS 500D

27th December 2009

I recently bought myself a Canon EOS 500D also known as EOS Rebel T1i, and EOS Kiss Digital X3. I also got myself a Canon EF 50mm f/1.4 USM lens.

One of the new things about the 500D, compared to my previous 400D, is that it has live view. So I wanted to see if that's useful or not. I'm not that happy with it. If you want to let it auto focus you would normally half press the release button, but instead you need to press the AE lock (*) button; half pressing the release button doesn't have any effect. The live view auto focus is also very slow, it takes about 5 seconds on average. It also has a habit of not focusing at all, or going in the wrong direction to find the focus. I sometimes suddenly feel the focus ring vibrate on the 50 f/1.4 USM while it normally never moves. There is also some weird clicking noise on my zoom lens, it appears to keep trying to send it beyond the smallest focus range while the object is meters away. There is also a "quick" focus mode, so that if you press the AE lock button the mirror goes down, temporarily turning live view off, doing the focus the normal way, and then the mirror goes up again. In that mode the auto focus takes about 2 seconds instead. The only thing live view seems useful for is that you can zoom in to see if something is in focus.

I'm also seeing some weird behavior when using the flash. When using the 50mm f/1.4 it really seems to prefer using f/2.8, sometimes f/1.4 or f/4.0, and you can't select anything else in the auto, CA or P mode. Sometimes the depth of field is just too small, so I tried to set it to Av mode and manually select the aperture I want. What I didn't expect in that case is that it's calculating the shutter time like it won't be using the flash, and you suddenly get shutter times of 5 seconds. Other modes have similar problems, only setting everything manually seems to work properly.

No tags

Software used in Belgian elections.

5th June 2009

Like Machtelt Garrels points out, the source code for the software used on the voting computers for this year is not yet available. It is only made available to the general public on the day of the election, after the voting has closed. The government has a FAQ about it here (available in Dutch, French and German).

As it says, software from previous years is available, but you need to be able to find it. So here are the links: 2003 2004 2007

There are 2 pieces of software that can be used: Jites or Digivote. They're not compatible with each other, so it depends on the municipality you have to vote in which one is being used.

Here you can find a study of the Belgian voting system and a comparison with other countries. It's available in Dutch, French and English.

Update: The source is available before the vote, just not to the general public. It's available to the political parties and the parliament can appoint a specialist to inspect the source code.

No tags

Audio ripping: Drive offsets

21st May 2009

Thanks to Thomas Vander Stichele I've found out that when ripping an audio CD you get a different file depending on the CD drive you use because they don't all start reading at the same position and that for most drives this is a constant offset. So if you know what offset your drive has, you can compare rips done with a different drive.
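A minimal sketch of how such a comparison could work once both offsets are known. The sign convention is an assumption on my part: here a drive with offset +k returns data that starts k samples later on the disc. Samples are just list items, and the function name is made up.

```python
def align_rips(rip_a, offset_a, rip_b, offset_b):
    """Cut two rips down to the part of the disc that both of them cover.

    Assumes rip[i] holds disc sample i + offset for a drive with that offset.
    """
    start = max(offset_a, offset_b)
    end = min(offset_a + len(rip_a), offset_b + len(rip_b))
    if end <= start:
        return [], []  # the rips don't overlap at all
    part_a = rip_a[start - offset_a:end - offset_a]
    part_b = rip_b[start - offset_b:end - offset_b]
    return part_a, part_b


# A fake "disc" where every sample is its own position.
disc = list(range(1000))
rip_a = disc[12:512]  # drive with offset +12
rip_b = disc[6:506]   # drive with offset +6
part_a, part_b = align_rips(rip_a, 12, rip_b, 6)
```

After aligning, the two parts should be identical if both rips were error free. You lose a few samples at the edges, which is the part one of the drives never returned.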

I was under the impression that this offset was always a multiple of the sector size, because audio CDs do not have a reliable way to detect the position. But it seems this offset can be in the middle of a sector too.

It looks like they agreed on some standard to compare the drive offsets. You can find a list of drives here and here. Cdparanoia has an option you can use to correct for that offset.

But I started wondering where this error comes from. There is no real reason why all CD drives shouldn't be able to return the same thing. There is very little real info about it out there, and I basically get the impression that someone just decided to use something as base value. It isn't really a problem, since this is about a very small amount of time.

To fully understand this, I think it's important to understand the basic layout of an audio CD. A sector (frame) of a CD contains 2352 bytes (588 samples) of main data and 96 bytes of sub-channel data. You have 75 such sectors per second, giving you a total of 44100 samples per second.

A sector is made up of 98 sub-frames. Each such sub-frame has 24 bytes (6 samples) of main data, 4 bytes of C1 error correction and 4 bytes of C2 error correction, and 1 "byte" of sub-channel data.
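The numbers above all follow from each other, which is easy to check. The only thing added here is that a sample is 4 bytes (16 bit, 2 channels), which is how CD audio is stored.

```python
BYTES_PER_SAMPLE = 4          # 16 bit stereo: 2 channels of 2 bytes each
SUBFRAMES_PER_SECTOR = 98
MAIN_BYTES_PER_SUBFRAME = 24
SECTORS_PER_SECOND = 75

samples_per_subframe = MAIN_BYTES_PER_SUBFRAME // BYTES_PER_SAMPLE
main_bytes_per_sector = SUBFRAMES_PER_SECTOR * MAIN_BYTES_PER_SUBFRAME
samples_per_sector = main_bytes_per_sector // BYTES_PER_SAMPLE
samples_per_second = samples_per_sector * SECTORS_PER_SECOND

print(samples_per_subframe, main_bytes_per_sector,
      samples_per_sector, samples_per_second)
```

So an offset that is a multiple of 6 samples is a whole number of sub-frames, and a multiple of 588 samples is a whole number of sectors.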

The first 2 sub-frames are used to mark the beginning of the sector. They contain a special pattern at the location of the sub-channel data: the first sub-frame contains S0 and the second S1. So you are left with 96 bytes of sub-channel data. The highest bit (bit 7) of each sub-channel byte is used for the P sub-channel, bit 6 for the Q sub-channel, and so on until the W sub-channel. So each sub-channel has 96 bits (12 bytes) per sector and each bit is in a different sub-frame.
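Collecting those bits back into the 8 sub-channels looks roughly like this. It's a sketch that assumes you already have the 96 raw sub-channel bytes of one sector, and that the bit from the first sub-frame becomes the most significant bit of the result; the function name is made up.

```python
CHANNELS = "PQRSTUVW"  # P is bit 7 of every byte, W is bit 0


def split_subchannels(raw):
    """Turn 96 raw sub-channel bytes into 12 bytes for each of the 8 channels."""
    if len(raw) != 96:
        raise ValueError("expected the 96 sub-channel bytes of one sector")
    result = {}
    for index, name in enumerate(CHANNELS):
        bit = 7 - index
        value = 0
        for byte in raw:  # one bit of this channel per sub-frame
            value = (value << 1) | ((byte >> bit) & 1)
        result[name] = value.to_bytes(12, "big")
    return result


# A sector in which only the P flag is set in every sub-frame.
channels = split_subchannels(bytes([0x80] * 96))
```

In that example the P channel comes out as 12 bytes of 0xff and all other channels as zeros, which is what a sector inside the "pause" before a track would look like for P.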

If you go look at the offsets, you'll notice there are a lot of drives that have an offset that's a multiple of 6 samples, but there are also a lot that aren't a multiple of 6.

In the P and Q sub-channels you can see at what place you're reading. The P channel is a "pause" flag. At least 2 seconds before the track starts the P flag is turned on, and in the second sector of the track it's turned off again. All bits of the P sub-channel should be the same for the whole sector, it can only change after an S0/S1.

In the Q channel you can have different information depending on what is in the first byte. The most useful one is "mode 1", which contains things like the track number, the index in the track, a time indication for both the current track and the whole disc, and a CRC. Mode 1 should be in at least 9 out of 10 sectors. So you know perfectly at which location you are. There is just 1 small problem: the sub-channels do not have any error correction, although the Q channel does have a CRC. And there are lots of sectors with the information in it.

So I wondered if the drives just didn't look at S0/S1 and that the sub-channel data was shifted too. So I changed cdparanoia to log the sub-channel data and dumped it. I've tried this with a total of 6 drives, and also looked at what the drive offset is. These are the drives I've used and the offsets they have:

Drive	offset
SONY DVD RW DRU-700A VY03	+12
PLEXTOR CD-R PX-W4012A 1.00	+98
PLEXTOR CD-R PX-R412C 1.06	+355
TSSTcorp CD/DVDW TS-L532A TI5a1	+116
HL-DT-ST DVD-ROM GDR-H30N 1.00i	+6
ASUS CD-S520/A 1.7L	+1858

Some notes:

They all start exactly at the sector boundary, sub-channel data is not shifted. So they all perfectly know where a sector starts.
Except for the ASUS, they all start at the correct sector. The ASUS starts 2 sectors too fast, which is probably why it has that 1858 samples offset, which is more than 3 sectors.
Both Plextors add 588 to the offset when reading the main data + the sub-channel data. So they start reading 1 sector earlier if you request the sub-channel data.
The PLEXTOR CD-R PX-R412C sometimes also has an offset of +354.

I can perfectly understand that it might be off by 1 or more sectors, like the ASUS. I could also understand that it might be off by 1 or more sub-frames, but then I wouldn't expect to get proper values for the sub-channel data. Having an offset that is not a multiple of the sub-frame makes no sense to me. So I can only come to the conclusion that these are all firmware issues.
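To see which of the measured offsets are whole sectors or whole sub-frames, you can split them up like this. The drive names and offsets are the ones from the table above, the function name is made up.

```python
SAMPLES_PER_SECTOR = 588
SAMPLES_PER_SUBFRAME = 6

OFFSETS = {
    "SONY DVD RW DRU-700A VY03": 12,
    "PLEXTOR CD-R PX-W4012A 1.00": 98,
    "PLEXTOR CD-R PX-R412C 1.06": 355,
    "TSSTcorp CD/DVDW TS-L532A TI5a1": 116,
    "HL-DT-ST DVD-ROM GDR-H30N 1.00i": 6,
    "ASUS CD-S520/A 1.7L": 1858,
}


def decompose(offset):
    """Split an offset in samples into (sectors, sub-frames, leftover samples)."""
    sectors, rest = divmod(offset, SAMPLES_PER_SECTOR)
    subframes, samples = divmod(rest, SAMPLES_PER_SUBFRAME)
    return sectors, subframes, samples


for drive, offset in OFFSETS.items():
    print(drive, offset, decompose(offset))
```

Only the SONY (+12) and the HL-DT-ST (+6) come out as a whole number of sub-frames. All the others have leftover samples, so they start somewhere in the middle of a sub-frame.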

No tags

Debian and (non-free) firmware: where to put it?

21st December 2008

One of the discussions that always seems to come back is what to do with firmware. It's mostly about freedom versus usability. I think the main question is where we want to put non-free firmware in our archive.

Opinions seem to range from that it should be in the non-free archive to that it should be in the main archive, with many opinions in between, and they all have good arguments.

We have 3 options that are related to it in our current vote that seem to come down to:

Move all firmware that doesn't comply with the DFSG to non-free
Change the DFSG to not require source for firmware so that it can be in main
Assume that the blobs in the kernel are source so that it can be in main

I don't see the point of that second option, and really have to wonder if there is any firmware under a license that would comply with the DFSG except that the source isn't available.

The third option can make sense for some of the blobs in the kernel that are basically settings that are written to the device. You could write an editor for it, assuming there is documentation for it. But I doubt that they're all settings, and it just seems to ignore the problem.

I find Theodore Ts'o's suggestion about creating a new section for firmware the best idea so far, especially if we can agree that our official CD/DVD images will have that section on it.

It has the advantage that all software in our main archive can comply with the DFSG, that you don't need to add the non-free section just to be able to use hardware that requires firmware and that you can just take a CD/DVD that works.

No tags
