RC4 encryption in C
January 24th, 2008 11:49:44 am pst by Sterling Camden
A couple of years ago, I published a routine to perform RC4 encryption in Synergy/DE, translated from the original VBScript of Mike Shaffer with his permission. Lots of people searching for the RC4 algorithm have landed on that post, most of them not knowing the Synergy/DE language. A couple of days ago, a reader going by the alias “crazysoccer” asked if I could provide an example in C. The download below contains a straight C version of the famous “EnDeCrypt” function, along with a simple “main” test program. The flavor of C I used is vendor and platform neutral. It only presumes the availability of stdio and malloc. Naturally, you could tweak this code to use whatever I/O and memory allocation mechanisms your development framework prefers.
I’ve been asked before about the legality of publishing cryptographic algorithms on the web. According to Wikipedia (which is not a lawyer, nor do I think it has ever played one on TV), rules for exporting cryptography have been relaxed in recent years, including “the effective elimination of export controls on … open source software containing cryptography.” Since I’m making this freely available to all, and I welcome any contributions to improving the code, I’ll consider this “open source”. Besides, I wasn’t the first one to publish this algorithm on the web, and if the Bureau of Industry and Security wants to come after me, it’ll have to go get 4GuysFromRolla, too. But if you plan to use this algorithm in software that will be exported from the US or other nations with similar rules, you might want to consult your lawyer on any restrictions you need to follow.
UPDATE: Changed the API to include a specified length, since the encrypted data can contain a null.
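For readers who can't get at the download, here is a minimal sketch of textbook RC4 with an explicit length parameter. It is not the EnDeCrypt() code from the download: the name rc4_crypt and its signature are my own assumptions for illustration. Because RC4 just XORs a keystream over the data, the same call both encrypts and decrypts.

```c
#include <stddef.h>

/* Hypothetical helper (not the downloadable EnDeCrypt): applies RC4 to
 * len bytes of in, writing the result to out. out may alias in.
 * Returns 0 on success, -1 if the key is empty. */
int rc4_crypt(const unsigned char *key, size_t key_len,
              const unsigned char *in, unsigned char *out, size_t len)
{
    unsigned char S[256];
    unsigned char tmp;
    unsigned int i, j;
    size_t n;

    if (key == NULL || key_len == 0)
        return -1;

    /* Key scheduling: start from the identity permutation and mix in the key. */
    for (i = 0; i < 256; i++)
        S[i] = (unsigned char)i;
    for (i = 0, j = 0; i < 256; i++) {
        j = (j + S[i] + key[i % key_len]) & 0xFF;
        tmp = S[i]; S[i] = S[j]; S[j] = tmp;
    }

    /* Keystream generation: one keystream byte per data byte, XORed in.
     * The loop is driven by len, never by a terminating null. */
    i = 0;
    j = 0;
    for (n = 0; n < len; n++) {
        i = (i + 1) & 0xFF;
        j = (j + S[i]) & 0xFF;
        tmp = S[i]; S[i] = S[j]; S[j] = tmp;
        out[n] = in[n] ^ S[(S[i] + S[j]) & 0xFF];
    }
    return 0;
}
```

Calling it a second time with the same key on the output should give back the original bytes, as long as you pass the same length both times.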





May I humbly suggest that this is not a secure implementation?
Sure, Stu. How would you improve it?
Tip: always fill the buffers with random bit patterns before you free them. Never leave content in freed space, lest a parallel garbage collector inspect what’s there. That’s just one error. Can you see the rest now?
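One way to act on this tip might look like the sketch below. Two caveats: the helper names secure_wipe and wipe_and_free are hypothetical, and the sketch overwrites with zeros rather than the random bit patterns suggested above, which is a simpler substitute. The volatile pointer is there to discourage the compiler from optimizing away writes to memory that is about to be freed.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: overwrite len bytes at p with zeros.
 * Writing through a volatile pointer discourages the compiler from
 * dropping the stores as "dead" just before a free(). */
void secure_wipe(void *p, size_t len)
{
    volatile unsigned char *vp = (volatile unsigned char *)p;
    size_t i;

    if (p == NULL)
        return;
    for (i = 0; i < len; i++)
        vp[i] = 0;
}

/* Hypothetical helper: wipe a malloc'd buffer, then release it.
 * The caller must pass the buffer's size, since free() cannot report it. */
void wipe_and_free(void *p, size_t len)
{
    secure_wipe(p, len);
    free(p);
}
```

In a test harness like the one described here, the plaintext, ciphertext, and key buffers would all be candidates for this treatment.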
Thanks for that tip, Stu.
There’s a lot wrong with using the test harness as a real-world example. For one thing, it doesn’t even begin to address the mechanism for sharing keys. The routine main() was only intended to exercise the encryption algorithm and demonstrate that it works.
Do you see any flaws specifically within the EnDeCrypt() function?
I tried to encrypt the string “fqwerty.c” with your code, and I ran into a problem.
Could you fix it?
I used the key “123”. Sorry for not including it in my earlier comment.
That’s very strange: it only seems to fail with the key “123”, chopping off the length at 5 characters. I’ll have to debug this and figure out what’s going on.
It turns out that it’s possible for the encrypted text to contain a null. I’ve changed the API to include a specified length, and I changed the test harness to account for that.
I am currently building a client-server application and want to use RC4 to encrypt the data. While encrypting and transferring some data, though, some special characters get introduced. What might be the cause? Can you help?
When the data is encrypted, it could contain any character, even a null. Your transmission mechanism needs to take that into account.
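One common way to take that into account is to send the byte count ahead of the data, so the receiver never has to look for a terminator. The sketch below assumes a made-up wire format (a 4-byte big-endian length followed by the raw bytes), and the names frame_encode and frame_decode are hypothetical. Your own protocol may already have its own framing.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wire format: 4-byte big-endian length, then the raw bytes.
 * Returns the total frame size, or 0 if the frame buffer is too small. */
size_t frame_encode(const unsigned char *payload, size_t len,
                    unsigned char *frame, size_t frame_cap)
{
    if (frame_cap < 4 || frame_cap - 4 < len || len > 0xFFFFFFFFUL)
        return 0;
    frame[0] = (unsigned char)((len >> 24) & 0xFF);
    frame[1] = (unsigned char)((len >> 16) & 0xFF);
    frame[2] = (unsigned char)((len >> 8) & 0xFF);
    frame[3] = (unsigned char)(len & 0xFF);
    memcpy(frame + 4, payload, len);   /* copies nulls like any other byte */
    return len + 4;
}

/* Reads the length prefix and points *payload at the data inside frame.
 * Returns 0 on success, -1 if the frame is shorter than it claims. */
int frame_decode(const unsigned char *frame, size_t frame_len,
                 const unsigned char **payload, size_t *len)
{
    unsigned long n;

    if (frame_len < 4)
        return -1;
    n = ((unsigned long)frame[0] << 24) | ((unsigned long)frame[1] << 16) |
        ((unsigned long)frame[2] << 8) | (unsigned long)frame[3];
    if (frame_len - 4 < n)
        return -1;
    *payload = frame + 4;
    *len = (size_t)n;
    return 0;
}
```

The point of the design is that nothing on either side calls strlen() or treats the encrypted bytes as text.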
re: security of the implementation
Wikipedia has this to say about RC4:
I haven’t looked through your code, and my C is rusty enough that I’m not sure it would do any good, but this might give you some checklist items to address in your implementation. Of course, the “single keystream used twice” issue is probably outside the scope of your implementation. There’s more security stuff further down in the RC4 article at Wikipedia, though.
Thanks for that information, apotheon. Caveat utilisator.
I’m glad it’s helpful — or at least impressed with your ability to make me feel useful even if I’ve said something dumb, in light of the fact I haven’t even read the source.
Heh. But it is helpful, as were Stu’s comments above.
I don’t know how many people have asked me about encryption algorithms and were looking for an easy, instant solution that they could just plop into place without any thought. The truth is, secure encryption is tough. Not necessarily the algorithm itself, but securing keys and any other artifacts of the encryption process.
Does anyone have RC4 in the nesC language?
I’ve never used nesC, but since it’s based on C you could probably adapt the algorithm here.
Hi, in your code the program faults when it reaches free(pDecrypted), but free(pEncrypted) does not fault. Why?
I don’t know why freeing a malloc’d pointer should ever give a fault. Nothing else is referencing it after that point, either. Have you modified the code in any way?
Very useful, thanks.
Quite welcome.
I think that there’s a problem with the Mike Shaffer implementation – namely, it doesn’t take into account the impact of character encoding.
IMO, an encryption algorithm should not perform bitwise operations on characters/strings directly, and encrypted data should never be stored as a string unless it has been converted using something like base64 that will remove the concerns of character encoding. These types should be encoded as a byte array beforehand using a well-defined encoding (ASCII, UTF8, etc), and encrypted data should be treated as raw, binary data.
Here’s a comparison… using the plaintext “1234567890” and a key “key1”, and using a standard RC4 implementation that encodes the plaintext as an array of bytes using UTF8:
f6 4d a5 3e 80 bf f4 3a 49 7b
And using EnDeCrypt (VB version):
f6 4d a5 3e 20 ac bf f4 3a 49 7b
There isn’t always a discrepancy, btw, so it will be a hit and miss kind of thing (using the key “key” instead of “key1” will yield identical results in both methods, for example).
Although the Shaffer method will be able to encrypt and decrypt its own data, this is a problem when it comes to transmitting data to another application that needs to decrypt the data, or on a different set of hardware where perhaps the default encoding is different.
I might be interpreting these results incorrectly, but as RC4 is designed to operate as a stream cipher, and transformations (presumably?) happen per byte instead of per char or int, I think it is a valid concern.
If the VB result is longer than the original, then I think that would be a bug in the VB version. My C version gives the first result. You may have correctly identified the problem, but I always thought that strings were really just byte arrays in VBScript. I wouldn’t think that “1234567890” would involve any multi-byte characters, anyway. Hmm.
“1234567890” doesn’t involve multi-byte characters. However, the result of the encryption when stored as a string (depending on the key used), can include values that may be perceived as part of multi-byte characters when they are, say, stored to a database or a flat-file depending on the default encoding for your environment. Characters are cast one at a time, and in VB at least, there is no guarantee that they will all be cast using the same character set (certain characters will require a 16-bit encoding, and at least VB will switch to 16-bit encoding for _only_ those characters). You were correct that strings are just byte arrays, but a character encoding gives that string context. The same byte array can be a different string in a UTF8 context than in an ASCII context.
The problem is that binary data is being treated as a string; it may not be an issue inside the context of a C script, but it is where high-level languages or data storage is involved (what happens to a string in a database when a character encrypts to ‘null’?) When read out for use in a Java application, or even the same application on another platform, unpredictable results may ensue. A database containing encrypted information will not be portable at best, and at worst will contain irretrievable data.
The reason I bring this up: I had to work with a vendor who used the original VB version of this method to store passwords, and it failed to decrypt about 1 of every 30 passwords that were encrypted on the same platform. The code they used was copied directly from the VB script you’ve linked to. I think another commenter below is running into this very problem – the solution was to encode the string as a byte array using a standard encoding, encrypt the byte array, and base64 encode the byte array for storage as a string.
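The last step of that fix, base64 encoding the encrypted bytes for storage, can be sketched in plain C as follows. Standard C has no base64 routine, so the function base64_encode here is a hypothetical helper written for illustration, using the usual alphabet with '=' padding. It is not the vendor's code or anything from the download.

```c
#include <stddef.h>

/* Hypothetical helper: base64-encode len raw bytes into a null-terminated
 * ASCII string. out must hold at least 4 * ((len + 2) / 3) + 1 chars.
 * Returns the number of characters written, excluding the terminator. */
size_t base64_encode(const unsigned char *in, size_t len, char *out)
{
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    size_t i, o = 0;

    /* Full 3-byte groups become 4 output characters each. */
    for (i = 0; i + 2 < len; i += 3) {
        out[o++] = tbl[in[i] >> 2];
        out[o++] = tbl[((in[i] & 0x03) << 4) | (in[i + 1] >> 4)];
        out[o++] = tbl[((in[i + 1] & 0x0F) << 2) | (in[i + 2] >> 6)];
        out[o++] = tbl[in[i + 2] & 0x3F];
    }

    /* A trailing group of 1 or 2 bytes is padded with '='. */
    if (i < len) {
        out[o++] = tbl[in[i] >> 2];
        if (i + 1 < len) {
            out[o++] = tbl[((in[i] & 0x03) << 4) | (in[i + 1] >> 4)];
            out[o++] = tbl[(in[i + 1] & 0x0F) << 2];
        } else {
            out[o++] = tbl[(in[i] & 0x03) << 4];
            out[o++] = '=';
        }
        out[o++] = '=';
    }
    out[o] = '\0';
    return o;
}
```

The output contains only letters, digits, '+', '/' and '=', so it should survive being stored as a string regardless of the default character encoding.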
That’s a very good point, Randall. Yes, to be safe, you should always treat the string as a byte array. The result of encryption will be a byte array containing who knows what values (anything from 0-255 inclusive). If I were wanting to store that in a database, I would probably convert it to its hex character representation first.
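A minimal sketch of that hex conversion, assuming a hypothetical helper named hex_encode (it is not part of the download):

```c
#include <stddef.h>

/* Hypothetical helper: write len raw bytes as lowercase hex digits.
 * out must hold at least 2 * len + 1 chars (two digits per byte plus
 * the terminating null). */
void hex_encode(const unsigned char *in, size_t len, char *out)
{
    static const char digits[] = "0123456789abcdef";
    size_t i;

    for (i = 0; i < len; i++) {
        out[2 * i]     = digits[in[i] >> 4];    /* high nibble */
        out[2 * i + 1] = digits[in[i] & 0x0F];  /* low nibble  */
    }
    out[2 * len] = '\0';
}
```

The ten bytes f6 4d a5 3e 80 bf f4 3a 49 7b quoted earlier in this thread would come out as the 20-character string f64da53e80bff43a497b, which is safe to store in any text column.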
Hi Sterling. Your program is capable of encrypting the given string with its key and returning some value like f6 4d a5 3e 80 bf f4 3a 49 7b. What is that value, sir? Also, once I enter the key value and press Enter, it automatically displays the encrypted text as well as the decrypted text, i.e. the original string. If I copy the encrypted text, e.g. f6 4d a5 3e 80 bf f4 3a 49 7b, and enter the same key value, it does not give back the original text that I entered before. How can I debug it?
The encrypted text is shown using hex values, because it probably isn’t printable — are you entering the hex characters, or the string equivalent of the hex values?
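If you want to feed that hex display back in, it has to be turned back into raw bytes first. A sketch of that conversion follows. The function names hex_value and hex_decode are hypothetical, and the assumption that whitespace may separate the digit pairs is mine, based on how the values are shown above.

```c
#include <ctype.h>
#include <stddef.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical helper: parse text such as "f6 4d a5" into raw bytes.
 * Whitespace between digit pairs is skipped. Returns the number of bytes
 * written, or -1 on a bad digit, an odd digit count, or a full buffer. */
long hex_decode(const char *text, unsigned char *out, size_t out_cap)
{
    size_t n = 0;

    while (*text != '\0') {
        int hi, lo;

        if (isspace((unsigned char)*text)) {
            text++;
            continue;
        }
        hi = hex_value((unsigned char)text[0]);
        if (hi < 0)
            return -1;
        /* If the string ends after one digit, text[1] is the null
         * terminator, which hex_value rejects. */
        lo = hex_value((unsigned char)text[1]);
        if (lo < 0 || n >= out_cap)
            return -1;
        out[n++] = (unsigned char)((hi << 4) | lo);
        text += 2;
    }
    return (long)n;
}
```

The resulting bytes, together with their count, are what should be passed to the decryption routine, not the characters 'f', '6', and so on.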
wow
Just in case anyone needs the equivalent PHP function for Mike Shaffer’s EnDeCrypt, I thought I’d share it:
Thanks for that, Josh. Fortunately, PHP doesn’t consider whitespace significant, so it’s not a problem that HTML rendering swallows them. But I went ahead and added <pre><code> tags around it to make it render as you intended. For future reference, those tags are allowed in your comment and will preserve your formatting.