Hacking WinRAR for fun and profit (github.com/taviso)
137 points by taviso on Sept 28, 2012 | 47 comments



The screenshot at the bottom is one of the coolest things I've ever seen.


Is this article implying that there is the possibility of including a malicious application inside a .rar file that would run upon decompression? That is crazy if true. It also makes me think using 7zip might be a safer option than WinRAR if they handle the decompression differently.


Yes, but theoretically those applications wouldn't be able to do anything other than modify the data that you're uncompressing, which simply means that you have to decide whether you want to trust the data that you uncompressed.

Now of course there may be exploitable vulnerabilities here, but just as in any other piece of software.

So no, just because this runs a VM doesn't mean it's intrinsically more dangerous than anything else. Basically just any program that takes input from the outside world (a PDF reader, an MP3 player, you name it) is vulnerable to attacks.


But the fact that the VM has not seen much scrutiny by the community, together with the following snippet from the linked page:

"There are several known bugs in the RarVM.

[redacted as some have security consequences]"

makes it seem quite likely that there are some ugly vulnerabilities.


If anyone sent me an .rar these days, it would go straight into the Trash. (Ditto for Koreans who insist on sending me .alz archives.) Sure, 7-Zip can probably open it, but I still prefer .zip which has much better compatibility overall.

I can understand why those who have used WinRAR for many years might keep using the .rar format out of habit even in 2012, but is there any other reason for anyone else to compress new files with .rar at this point? Sure, you might be able to shave a few more kilobytes off a large file, but small differences like that are becoming increasingly irrelevant compared to interoperability. Are there other technical advantages to the .rar format that other formats like .zip and .tar.gz lack?


Encryption that isn't completely useless. Recovery records. Huge file support. Better compression in specialised cases (x86 binary code, multimedia). A nicer GUI - 7zip is a clunky, fugly thing and WinZip... urgh. Garish as Las Vegas. A real x64 version (for no reason other than OCD-ness about mixed 32/64 bit binaries). And tar.gz is grotty for random-access, I-know-it's-in-here-somewhere, use cases.

That said, I do still use zip for compatibility 90% of the time - but I was only going to pay for one commercial archiver for Windows, and WinRar was it. Why pay for WinZip? (this was years ago, amortized cost is now less than pennies/day)

Of course you could ask why pay for software at all, but that's a different problem. I'm happy to pay someone for software that works better, does things that otherwise can't be done or would be tedious, or is a de-facto standard of some sort. Hence at one time or another I've paid for Adobe Creative Suite, Cubase, MS Office, VMWare Workstation, Rhino 3D, Autopano Pro, and others.


> If anyone sent me an .rar these days, it would go straight into the Trash.

So, a customer sends you a RAR file and it goes straight to the trash? You sound like a swell guy.


We have customers who send .ace files.


multi-volume, robust recovery (up to 20%), built-in solid encryption scheme

Of course in the terminal world you could chain it all. But RAR provides all of it in one convenient package, with a GUI. Zip files are painful to deal with when the system encoding is different from the one that created the archive.


.zip has the table of contents at the end of the file; .rar has it at the front, so if you've downloaded 50% of a .rar file you can easily get 50% of the content. You can't easily extract a zip file until you've downloaded the entire thing. So if downloading over a slow connection, which does still happen depressingly often, .rar does have a significant advantage over zip at least.


ZIP contains the full listing at the end of the file. That said, ZIP also prefixes each file's data with a local file header, which allows partial zips to be decompressed. ZIP recovery software uses this to recover broken zips.
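
To make the local-header point concrete, here is a minimal Python sketch, not how any particular recovery tool works. It scans a zip whose central directory is missing for local file header signatures and inflates whatever complete entries it finds. `recover_entries` and `make_truncated_zip` are names made up for this example. It only handles stored and deflated entries, and a signature that happens to occur inside compressed data could confuse it.

```python
import io
import struct
import zipfile
import zlib

# Fixed 30-byte part of a ZIP local file header (little-endian).
LOCAL_HEADER = struct.Struct("<4sHHHHHIIIHH")
LOCAL_SIG = b"PK\x03\x04"


def recover_entries(blob):
    """Yield (name, data) for every entry whose header and data fully survive."""
    pos = 0
    while True:
        pos = blob.find(LOCAL_SIG, pos)
        if pos < 0 or pos + LOCAL_HEADER.size > len(blob):
            return
        fields = LOCAL_HEADER.unpack_from(blob, pos)
        method, comp_size = fields[3], fields[7]
        name_len, extra_len = fields[9], fields[10]
        name_start = pos + LOCAL_HEADER.size
        data_start = name_start + name_len + extra_len
        name = blob[name_start:name_start + name_len].decode("utf-8", "replace")
        if method == 8:  # deflate: the stream marks its own end
            inflater = zlib.decompressobj(-15)
            try:
                data = inflater.decompress(blob[data_start:])
            except zlib.error:
                pos += 4
                continue
            if not inflater.eof:  # stream cut off, or not a real entry
                pos += 4
                continue
            consumed = len(blob) - data_start - len(inflater.unused_data)
        elif method == 0 and data_start + comp_size <= len(blob):  # stored
            data = blob[data_start:data_start + comp_size]
            consumed = comp_size
        else:
            pos += 4
            continue
        yield name, data
        pos = data_start + consumed


def make_truncated_zip():
    """Build a two-file zip in memory and cut it off before the central directory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("a.txt", b"hello " * 100)
        zf.writestr("b.txt", b"world " * 100)
        last = zf.infolist()[-1]
    end_of_data = (last.header_offset + LOCAL_HEADER.size
                   + len(last.filename) + len(last.extra) + last.compress_size)
    return buf.getvalue()[:end_of_data]


if __name__ == "__main__":
    for name, data in recover_entries(make_truncated_zip()):
        print(name, len(data))
```

The standard `zipfile` module refuses to open the truncated blob because it looks for the end-of-file listing first, while the scan above still finds both files.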


> If anyone sent me an .rar these days, it would go straight into the Trash.

These are exactly the sort of stupid statements that make so many people frown on the HN community.


Why do you think it's stupid? Here are my reasons for the statement you quoted:

1. It's been years since anyone has emailed me an .rar attachment. In fact, I don't think I've had a legitimate .rar file in my Inbox since the turn of the millennium.

2. The last time I regularly opened .rar files was when I was into warez. Those things often contained malware.

3. As a result of 1) and 2), I would be very suspicious if somebody sent me an .rar file.

Of course, that's just my perception, so other people might associate the .rar extension with better things.


Multi-volume is the biggest point I think.

Almost everything you'd get on Usenet uses RAR, fwiw. I know Usenet is a dinosaur, but it's still a highly functioning one.


Built in Windows support of the zip format can't handle compressed archives over 2GB.


Zip has a filename encoding problem. Rar doesn't. Sure I can use 7zip but many people don't have it installed.


> If anyone sent me an .rar these days, it would go straight into the Trash.

Where do you work?


Koreans should only be allowed to use .azn archives.


Why does WinRAR have its own VM and what does WinRAR do when it creates a .rar file? I feel like this is missing a lot of elucidation.


(Not sure exactly what WinRAR does, but generally speaking for a compressor..) Backwards compatibility.

Let's say you're putting together WinRAR 4.5 and have a fancy new compression method you want to implement. .RAR files generated with Version 4.5 that use it won't be compatible with earlier versions.

The way around this is to embed, in the .RAR file itself, a program (written in the WinRAR VM) to decompress the data using the new method. WinRAR 4.0 can just run this embedded program without knowing the precise details of the Version 4.5 algorithm.

So you get some of the advantages of a self-executable archive without the portability and security issues.

I think that WinRAR in particular implements a lot of pre-filters, which are things you run over the data before your main compression algorithm in order to make the data more compressible. This seems like a good use for such a VM - they're simple, less-speed-critical than your core compressor, and you can always write more of them.

Examples of prefilters are BCJ2 (http://en.wikipedia.org/wiki/7z#Pre-processing_filters) and the PNG image filters (http://www.w3.org/TR/PNG-Filters.html).
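
As a rough illustration of why a pre-filter helps, here is a small Python sketch of a byte-wise delta filter in the spirit of PNG's Sub filter. It is not one of WinRAR's actual filters. The "smooth signal" is synthetic test data invented for this example, and `delta_encode`, `delta_decode` and `smooth_signal` are hypothetical names.

```python
import random
import zlib


def delta_encode(data):
    """Replace each byte with its difference from the previous byte (mod 256)."""
    out = bytearray()
    prev = 0
    for value in data:
        out.append((value - prev) & 0xFF)
        prev = value
    return bytes(out)


def delta_decode(data):
    """Undo delta_encode by keeping a running sum (mod 256)."""
    out = bytearray()
    prev = 0
    for diff in data:
        prev = (prev + diff) & 0xFF
        out.append(prev)
    return bytes(out)


def smooth_signal(n, seed=0):
    """Synthetic data: a value that drifts up or down by at most 1 per step."""
    rng = random.Random(seed)
    value = 128
    out = bytearray()
    for _ in range(n):
        value = (value + rng.choice((-1, 0, 1))) & 0xFF
        out.append(value)
    return bytes(out)


if __name__ == "__main__":
    signal = smooth_signal(10000)
    filtered = delta_encode(signal)
    print("raw + zlib:     ", len(zlib.compress(signal)))
    print("filtered + zlib:", len(zlib.compress(filtered)))
```

The filter itself saves nothing; it just turns many distinct byte values into a handful of small differences, which the main compressor then handles much better. The decompressor runs `delta_decode` afterwards to get the original bytes back.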


Why didn't the WinRAR team add digital signing for the VM code? Old WinRAR versions would contain the VM interpreter, a public RSA key and signature verification code; new WinRAR versions would embed already-signed blobs of VM code into archive files; the private RSA key would stay on the WinRAR developers' computers only. Then no one would be able to execute arbitrary VM code on an end user's PC.


What exactly is the drawback of executing arbitrary VM code on an end user PC? If there aren't security flaws in the VM, then the code is sandboxed, and worst case you extract a really big file or hang WinRAR.

Now, if there's a security flaw, that's a different story. But it looks like the VM just gets to play around with memory and registers and doesn't get any libraries or IO, and doesn't rely on type safety for correctness -- which eliminates the typical sources of security holes in more complex VMs such as the JVM. And if you don't need great performance, then you can put bounds checks everywhere.

Safe as houses. Unless someone screwed up.
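
For the curious, a toy Python sketch of what "memory and registers only, bounds checks everywhere" can look like. The instruction set here is invented for the example and has nothing to do with the real RarVM opcodes. The step budget is an extra assumption, added to show how the "hang" case could be capped.

```python
class VMError(Exception):
    """Raised when a program breaks one of the sandbox rules."""


def run(program, memory_size=16, max_steps=1000):
    """Interpret (op, a, b) tuples with no I/O: only registers and a byte array."""
    memory = bytearray(memory_size)
    regs = [0, 0, 0, 0]

    def reg(index):
        if not 0 <= index < len(regs):
            raise VMError("bad register %r" % (index,))
        return index

    def addr(address):
        if not 0 <= address < memory_size:
            raise VMError("memory access out of bounds: %r" % (address,))
        return address

    pc = 0
    for _ in range(max_steps):
        if pc == len(program):
            return memory, regs  # ran off the end: normal halt
        if not 0 <= pc < len(program):
            raise VMError("jump outside the program")
        op, a, b = program[pc]
        pc += 1
        if op == "movi":     # regs[a] = constant b
            regs[reg(a)] = b & 0xFFFFFFFF
        elif op == "addi":   # regs[a] += constant b
            regs[reg(a)] = (regs[reg(a)] + b) & 0xFFFFFFFF
        elif op == "store":  # memory[regs[a]] = low byte of regs[b]
            memory[addr(regs[reg(a)])] = regs[reg(b)] & 0xFF
        elif op == "jmp":    # absolute jump to instruction index a
            pc = a
        else:
            raise VMError("unknown opcode %r" % (op,))
    raise VMError("step budget exhausted")


if __name__ == "__main__":
    memory, regs = run([("movi", 0, 3), ("movi", 1, 0x41), ("store", 0, 1)])
    print(memory)
```

Every memory and register access goes through a check, so the worst a hostile program can do here is raise `VMError`. The "unless someone screwed up" part is exactly one missing check.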


Because it's better to be safe than sorry.


I think you mean forward compatibility


Nice to see more FOSS tools trying to dissect the RAR format.

Now if only a decompressor existed for current versions of RAR. A FOSS unrar tool exists that decompresses RAR 1, RAR 2, and some RAR 3 archives, but not current RAR 3 archives.


No, there is now unar, a FOSS tool that works for all RAR versions.


Hadn't seen that one; thanks for the pointer! Since "unar" proved difficult to search for: http://unarchiver.c3.cx/


"Known Bugs There are several known bugs in the RarVM.

[redacted as some have security consequences]"

I wonder if torrent seeders have exploited this to spread malware/bots....


Doubtful when it is so much easier to get people with 'download this codec to play this video' or putting them in cracks/keygens.


> I wonder if torrent seeders have exploited this to spread malware/bots....

Seeders cannot modify the content of the files they're seeding. Torrent client checks hashes for each chunk of data it downloads.

Only the original uploader could do that, and there's a regular reputation system at play there. There are known uploaders, known scene groups, etc.

Plus, most torrents, with the exception of applications where you can just modify the installer, don't contain compressed archives.
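
The per-chunk hash check mentioned above looks roughly like this in Python. In BitTorrent v1 the .torrent metadata carries a SHA-1 hash for each fixed-size piece; the helper names and the tiny piece length below are just for illustration.

```python
import hashlib


def piece_hashes(data, piece_length):
    """SHA-1 of each fixed-size piece, as the original uploader would publish them."""
    return [
        hashlib.sha1(data[start:start + piece_length]).digest()
        for start in range(0, len(data), piece_length)
    ]


def verify_piece(index, piece, expected_hashes):
    """What a downloading client does before accepting a piece from a peer."""
    return hashlib.sha1(piece).digest() == expected_hashes[index]


if __name__ == "__main__":
    original = bytes(range(256)) * 4
    hashes = piece_hashes(original, 256)
    good = original[256:512]
    tampered = b"X" + good[1:]
    print(verify_piece(1, good, hashes), verify_piece(1, tampered, hashes))
```

A seeder who alters a piece would have to produce data with the same SHA-1 as the original, so in practice the altered piece is simply rejected and fetched again from someone else.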


Wow, had no idea WinRAR contained a mini-VM. Cool, weird and maybe slightly disconcerting


It's funny, so does Bitcoin, Adobe Reader, a lot of things.

Maybe we need to rewrite that old law:

All programs expand until they contain a virtual machine. Those that cannot do so are replaced by ones that can.


What does Bitcoin do with a virtual machine?

(Unless you count "running the Bitcoin script" as a "virtual machine"; in that case yes, it is technically a virtual machine, but very Turing-incomplete.)


Yes, I meant the script. Not Turing complete, sure, but still quite flexible.


Greenspun's Rule basically covers that. Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp.

Although nowadays the art of programming is maturing enough that the inner platform sometimes is deliberately designed and well specified.


Yeah would it be possible to have a .rar containing some VM instructions (maybe pulling data from a payload within the archive) that would automatically execute upon unzipping?


Yes. That's exactly what unzipping (or, well, unrarring) does.


Can someone explain exactly how WinRAR uses this VM to improve compression?


If you read the linked blog post you will see one example. For instance, calls to a foobar function are encoded with relative addresses. The preprocessor can detect this, calculate the absolute address, encode that, and calculate the original relative address at decompression time. I like the idea apart from the security concerns! I was excited to find an unknown aspect of such a prolific program!
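
A simplified Python sketch of that relative-to-absolute trick for x86 `CALL rel32` (opcode 0xE8). This is the general idea only, not RAR's exact filter, which as far as I know applies extra range checks. It assumes the code is loaded at address 0, and `e8_encode`, `e8_decode` and `sample_code` are names made up for the example.

```python
import struct

CALL = 0xE8  # x86 opcode for CALL with a 32-bit relative operand


def _transform(code, encode):
    out = bytearray(code)
    i = 0
    while i + 5 <= len(out):
        if out[i] != CALL:
            i += 1
            continue
        (operand,) = struct.unpack_from("<I", out, i + 1)
        next_instr = i + 5  # relative operands count from the next instruction
        if encode:
            operand = (operand + next_instr) & 0xFFFFFFFF  # relative -> absolute
        else:
            operand = (operand - next_instr) & 0xFFFFFFFF  # absolute -> relative
        struct.pack_into("<I", out, i + 1, operand)
        i += 5
    return bytes(out)


def e8_encode(code):
    """Run before compression."""
    return _transform(code, True)


def e8_decode(code):
    """Run after decompression to restore the original bytes."""
    return _transform(code, False)


def sample_code():
    """NOP padding with three calls to the same target at address 0x100."""
    code = bytearray(b"\x90" * 0x110)
    for site in (0, 16, 32):
        code[site] = CALL
        struct.pack_into("<i", code, site + 1, 0x100 - (site + 5))
    return bytes(code)


if __name__ == "__main__":
    code = sample_code()
    encoded = e8_encode(code)
    for site in (0, 16, 32):
        print(code[site:site + 5].hex(), "->", encoded[site:site + 5].hex())
```

Before the filter, the three calls to the same function have three different operands; afterwards all three carry the same absolute address, so the compressor sees a repeated 5-byte pattern. Any 0xE8 byte that is not really a call gets transformed too, which is harmless because the decoder undoes it the same way.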


In somewhat related news, here is Russ Cox's excellent blog post about cool hacks with the zip format:

http://research.swtch.com/zip


Didn't the winrar author pass away several years ago? Vaguely remember.

Is it still being developed?

Oh okay, it was the distributor, not author: http://en.wikipedia.org/wiki/Ron_Dwight


Eugene Roshal is the author. Of his other products, I use FAR Manager for Windows every day (a Midnight Commander / Norton Commander type of file manager) - http://en.wikipedia.org/wiki/Eugene_Roshal

FAR Manager has been open-sourced - http://www.farmanager.com/


Watch out, the linked post [1] is blocked by my TrendMicro.

[1] http://blog.cmpxchg8b.com/2012/09/fun-with-constrained-progr...


It's probably picking up the text and assuming that it might be a malicious payload.

This sort of thing can happen to some virus scanners when they come across a page with the text of things like malicious VBScript, batch files, etc., even if it's text that's displayed on the page.


TrendMicro ..

I really don't miss Windows.


For what it's worth, 7zip contains a virtual machine too. Not sure how capable it is compared to WinRAR's, though.



