Survey of Safari screenshot tools

As part of my research, I need to create screenshots of web pages exactly as they appear on my screen in Safari, but covering the full length of the page rather than just the part that is visible.

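I compiled a small survey of tools that help to create web page shots under Mac OS X. These are the contenders:

  • Print as PDF (Safari built-in)
  • Export as PDF via Saft
  • webkit2png
  • Paparazzi!
  • Netfixer
  • SnapWeb
  • Web Snapper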

I included only tools that create shots from Safari, or at least claim to create shots that look as if they came from Safari (because they are based on WebKit). I am aware that there are a number of add-ons for Firefox and other browsers, but these are outside the scope of this article.

Introduction

Of course you can produce screenshots any time by invoking the Mac OS screen capture tool with SHIFT-CMD-4. When you press the space bar before clicking into the Safari window you want to grab, this will create a shot of this window only. Under Leopard, the shot will be in PNG format, but you can change the format to PDF with

defaults write com.apple.screencapture type pdf
killall SystemUIServer

(As far as I can see, there is no advantage in creating PDF shots with the screen capture tool, since the PDFs will be image rather than vector-based.)
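If you switch formats often, the two commands above can be wrapped in a small script. This is only a sketch: the function names are my own, the list of accepted formats is limited to the two mentioned here, and it only makes sense on a Mac where `defaults` and `killall` are available.

```python
import subprocess

# Only the two formats discussed above; other values are not covered here.
KNOWN_FORMATS = ("png", "pdf")


def format_commands(image_type):
    """Return the two commands that switch the screen capture format."""
    if image_type not in KNOWN_FORMATS:
        raise ValueError("unsupported format: %s" % image_type)
    return [
        ["defaults", "write", "com.apple.screencapture", "type", image_type],
        # SystemUIServer has to be restarted to pick up the new setting.
        ["killall", "SystemUIServer"],
    ]


def set_capture_format(image_type):
    """Run the commands one after the other (Mac OS X only)."""
    for command in format_commands(image_type):
        subprocess.check_call(command)
```

Calling `set_capture_format("pdf")` switches to PDF, and `set_capture_format("png")` switches back.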

The major disadvantage of this method — and the reason why I started to investigate other approaches — is that you will get a shot of only that part of the web page that is currently on your screen; longer web pages will be truncated.

Print as PDF (Safari built-in)

This is the most obvious method to produce a screen shot. From the File -> Print menu, you select PDF to generate a PDF of the page you are currently viewing.

But this method has some flaws:

  • It will insert page breaks when the page reaches the end of your paper size, but you can work around this problem by setting up a paper format with a very large page length.
  • On pages with frames, it will print only the selected frame, not the page as a whole.
  • Another problem appears when a web page assigns a special CSS stylesheet for printing. In this case, the PDF shot will not resemble the original page at all, but honor the instructions on the print stylesheet.
  • It will not print any Flash content on the page.
  • It won’t print some elements exactly as they appear on the screen, e.g. select boxes look strange.
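Whether the print stylesheet problem applies to a given page can be checked roughly before printing. The following sketch scans a page's HTML for print-specific CSS using only the Python standard library. The class and function names are my own, and the check is a crude heuristic: it looks at `media` attributes and at inline `@media print` rules, but it does not follow external stylesheets or `@import` statements.

```python
from html.parser import HTMLParser


class PrintStyleFinder(HTMLParser):
    """Collect hints that a page ships print-specific CSS."""

    def __init__(self):
        super().__init__()
        self.hits = []
        self._in_style = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        media = (attrs.get("media") or "").lower()
        rel = (attrs.get("rel") or "").lower()
        if tag == "link" and "stylesheet" in rel:
            if "print" in media:
                self.hits.append("link: %s" % attrs.get("href"))
        elif tag == "style":
            self._in_style = True
            if "print" in media:
                self.hits.append("style element with media=print")

    def handle_endtag(self, tag):
        if tag == "style":
            self._in_style = False

    def handle_data(self, data):
        # Inline CSS: look for a print media rule inside <style> blocks.
        if self._in_style and "@media print" in data.lower():
            self.hits.append("@media print rule")


def find_print_styles(html):
    """Return a list of print-CSS hints found in the HTML string."""
    finder = PrintStyleFinder()
    finder.feed(html)
    finder.close()
    return finder.hits
```

An empty result does not prove that the PDF will match the screen, but a non-empty one is a good sign that Print as PDF will produce something different from what you see in the browser.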

Printing PDFs has an advantage over creating screenshots: the generated PDFs are vector-based, the text is encoded as (extractable and searchable) text, and even the links are preserved. This raised my interest, and I looked for another method to produce PDFs from web pages.

Export as PDF via Saft

It was a little surprise to find out that I already had this method available, because it is part of the excellent Saft extension for Safari that I rely on every day (it has so many useful functions that I can’t list them here). Saft adds a feature called “Export as PDF” to the context menu of a Safari web view. The Saft export function solves the first three problems of the built-in PDF export: it exports pages without inserting any page breaks, it prints frame-based layouts as they appear on screen, and it ignores any print stylesheets. It does suffer, though, from the other PDF printing constraints: Flash doesn’t get through, and the generated PDF is close to, but not exactly, what you saw on the screen.

If you need automation, there is a further limitation: you can’t script the Export as PDF function, because it sits on a context menu and there is no way (that I am aware of) to drive context menus from AppleScript. I contacted the developer, and he says that he will add AppleScript support to Saft in the future.

webkit2png

webkit2png is a Python program that launches a WebKit component to fetch and render a web page. Since it is a terminal program, webkit2png is perfect for inclusion in batch scripts. webkit2png will render everything that WebKit can render (that is, everything that Safari can render). It does not produce any of the artefacts the PDF export methods suffer from (select boxes look just fine), and it will display Flash content. The program has to be tweaked, though, to delay the shot until all Flash components on a page have been loaded.
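As an illustration of the batch use, here is a minimal sketch that calls webkit2png once per URL. The function names are my own. The flags `-F` (full-size image only) and `-D` (output directory) are given as I understand them from the script's usage text, so please verify them against `webkit2png --help` for your version.

```python
import subprocess


def webkit2png_command(url, out_dir, script="webkit2png"):
    """Build the command line for one full-size shot of `url`.

    -F and -D are assumptions about the installed version; check
    `webkit2png --help` before relying on them.
    """
    return [script, "-F", "-D", out_dir, url]


def shoot_all(urls, out_dir):
    """Run webkit2png once per URL and return the URLs that failed."""
    failed = []
    for url in urls:
        try:
            subprocess.check_call(webkit2png_command(url, out_dir))
        except (OSError, subprocess.CalledProcessError):
            # Script not found, or it exited with an error for this URL.
            failed.append(url)
    return failed
```

Collecting the failures instead of stopping at the first one keeps a long batch going when a single page refuses to load.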

But there is a big drawback to webkit2png and to all applications that run independently from your Safari session: they cannot produce shots of web pages that are buried somewhere in the so-called hidden web, that is, the part of the web where a page cannot be referred to by a URL and can be reached only by filling out some (search) forms.
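To make the point about URLs concrete, the sketch below builds (but does not send) the kind of request a browser issues when a search form is submitted. The address and field names are made up for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical search form; the address and field names are invented.
form_url = "http://example.com/search"
fields = {"query": "safari screenshots", "page": "2"}

# The form values travel in the request body, not in the address.
request = Request(form_url, data=urlencode(fields).encode("ascii"))

print(request.get_method())  # the method used when a body is attached
print(request.full_url)      # the address alone, without the form values
print(request.data)          # the encoded form values
```

A tool that is given only the address will issue a plain request to it and never see the result page, because the information that selects the result is not part of the URL.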

Paparazzi!

Paparazzi! does everything webkit2png does (I think it is based on webkit2png), but it comes as a GUI application. I think it is quite popular, and I can understand why: it is completely free and quite comfortable to use. You just enter the URL you want to grab and select Capture!. It has the option to delay the shot already built in, and you can choose whether to save the shot in PNG, PDF, or even JPEG or TIFF format. The PDF will have selectable text and links. Paparazzi! handles frames and Flash, and it even has an AppleScript interface. The last version that works under Leopard is 0.4.3. There is a 0.5 beta, but it only supports Tiger (the developer is still on 10.4), so I could not test it.

Major drawback: No possibility to make shots of the hidden web.

Netfixer

Netfixer is an open source program created by the Italian company Shiny Frog (interesting, because paparazzo / paparazzi is an Italian word as well). The interface is as simple as it can be: there is just a single line where you can enter the URL, and a “Shoot” button. While it is not as feature-rich as Paparazzi! (there is no way to export PDF), and I had some stability issues, its open source nature makes it very interesting.

It suffers, though, from the same problem as all other applications so far: it can’t access the hidden web.

SnapWeb

SnapWeb from Brain Tickling Software costs money ($17.90). It is highly customizable and supports many different output formats. When exporting PNGs, it handles Flash nicely, but it loses all Flash content when exporting PDF. It is the only tool where you can choose between two different PDF variants: the PDF image variant, which looks like the original, and the PDF text variant, which suffers from the artefacts that I described above.

Because you can click into the web pages SnapWeb renders, you can follow links and even type text, which means you can reach the hidden web. You have to do that inside the SnapWeb session, though, isolated from your Safari session.

Web Snapper

Web Snapper (formerly known as Red Snapper) from Tasty Apps costs $15 and looks pretty nice. Although it is now a stand-alone application where you can type in your URL (Red Snapper was a plug-in), it is also tightly integrated into Safari: Web Snapper can install a button that makes a shot of the current page and sends it to the Web Snapper app, where all shots are collected and can be printed or saved. Other programs also offer a bookmarklet, but that only sends the current URL to the screenshot program. The Web Snapper button, by contrast, makes a shot of the page itself, even when it’s a page from the hidden web.

Web Snapper’s PDFs suffer from the usual artefact problems, but, quite strangely, these artefacts also appear when you save as PNG. It seems that the Safari integration depends on the PDF export functionality, and that the resulting file is only later converted into a PNG image. I am curious whether the older Red Snapper already had this problem, or whether it was introduced after Apple broke the plug-in architecture and Tasty Apps had to develop the new Web Snapper. I tried to contact Tasty Apps twice, also because I wanted to know about AppleScript support, but have not received a reply yet. If they were able to fix these problems, Web Snapper could be a winner.

Conclusion

The perfect Safari screenshot tool does not exist (yet). Only the separate app solutions can produce exact duplicates of original web pages, but they can’t access the hidden web. Web Snapper is the exception here: it can access the hidden web, but it doesn’t produce perfect copies. So far, SnapWeb is the only application that can produce exact duplicates of all pages on both the surface (standard) web and the hidden web, but its Safari integration is very weak.

I am currently looking into a hack that would be sufficient to produce the kind of shots I am after (exact duplicates of surface and hidden web pages, without the fancy functionality some of the apps provide). I’ll update this post when I know more.

Comments

  1. With Paparazzi, if you’re logged into a “hidden website” with Safari, Paparazzi will use those credentials. It did for me, anyway.

  2. In the part about using webkit2png, you mention that the program has to be tweaked to allow flash items to load. Can you tell me how to accomplish this?

  3. webkit2png will load Flash items by default; it just does not give them enough time to show up. You will have to add a delay in captureView(). For example, if you add these two lines to the top of captureView (line 140, check for correct indentation),

    import time
    time.sleep(25)

    it will wait for 25 seconds. Obviously, you should make this a command line argument. Hope that helps.
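A rough sketch of the command line variant could look like this. It is a stand-alone illustration, not webkit2png's actual option handling: the `--delay` option and both function names are hypothetical, and the way the value reaches captureView() depends on how your copy of the script passes its options around.

```python
import argparse
import time


def parse_delay(argv):
    """Read a hypothetical --delay option (seconds, default 0)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--delay", type=float, default=0.0,
                        help="seconds to wait before taking the shot")
    # Ignore everything else (URLs, other options) in this sketch.
    options, _rest = parser.parse_known_args(argv)
    if options.delay < 0:
        parser.error("--delay must not be negative")
    return options.delay


def wait_before_capture(delay):
    """What the top of captureView() would do instead of a fixed sleep."""
    if delay > 0:
        time.sleep(delay)
```

With this in place, `parse_delay(sys.argv[1:])` would be called once at start-up, and the result handed to `wait_before_capture()` in place of the hard-coded `time.sleep(25)`.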