Apparent bug with bookmark generation
Posted: 11 December 2010 04:38 AM
Total Posts:  20
Joined  2008-04-03

If I open a PDF with Clerk, create some bookmarks, and export as a single file, everything works fine. If I reopen that exported file in Clerk, add some more bookmarks, and export again, the resulting file is corrupted. It looks fine, but the font information is corrupt: you can’t search, and if you copy a portion and paste it into another document you get garbage. For example, “Nearly half of the U.S. fleet of reactors” copies as “3+/1<H!#/<2!‘2!-#+!M@F@!2<++-!‘2!1+/(-‘1%”

Posted: 11 December 2010 06:13 AM   [ # 1 ]
Total Posts:  475
Joined  2007-03-23

I can’t reproduce the issue on my system using the procedure you describe (although I’ve noticed similar issues in the past). Do you get this issue with any file you treat this way, even with, say, the About Stacks PDF that comes by default in your Documents folder? I doubt that adding the bookmarks has any real influence; I would rather expect it has something to do with the nature of your source material, in combination with the PDF parsing/creation code in Mac OS X. Could you send me a sample file?

Unfortunately these are issues in the PDF handling code of Mac OS X, and therefore there is no way I can fix them. Apple has to do that, and for quite a while now they have shown themselves either rather unconcerned about, or incapable of, fixing these and other problems in their PDF routines. (Also, and I regard this as an issue with the PDF format itself, it is perfectly ‘legal’ for a PDF document to be constructed such that pages display and print just fine while the textual content, the character information, is missing. That is what has happened when you get results like yours.)
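To illustrate with the sample from the first post: the pasted garbage is not random. A minimal Python sketch (the function name `substitution_table` is made up for this illustration) checks whether the garbled string is a consistent one-to-one substitution of the original, meaning each source character always maps to the same garbled character and vice versa. That pattern would be consistent with the file keeping font-specific character codes while losing the mapping back to real characters. That interpretation is an assumption, not something verified against the file itself, and the forum software may have altered some characters (such as the curly quote) in the quoted string.

```python
# Sample strings as quoted in the first post of this thread.
original = "Nearly half of the U.S. fleet of reactors"
garbled = "3+/1<H!#/<2!‘2!-#+!M@F@!2<++-!‘2!1+/(-‘1%"


def substitution_table(plain, cipher):
    """Return a plain->cipher dict if the two strings are related by a
    consistent one-to-one character substitution, otherwise None.

    Hypothetical helper written for this illustration only.
    """
    if len(plain) != len(cipher):
        return None
    forward, backward = {}, {}
    for p, c in zip(plain, cipher):
        # setdefault stores the first pairing seen; any later
        # disagreement means the substitution is not consistent.
        if forward.setdefault(p, c) != c or backward.setdefault(c, p) != p:
            return None
    return forward


table = substitution_table(original, garbled)
print(table is not None)  # is it a consistent substitution?
print(len(table))         # number of distinct characters mapped
```

For this sample the check passes, with 16 distinct characters mapped (for instance, every "e" comes out as "+" and every space as "!").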

You might try to see if activating the PDFX-3 filter upon PDF export solves the issue for you. (And let us know whether it does.) Another option is to try to print to PDF, instead of exporting to PDF, as that will yield different results. (It will flatten the file, and probably lose the bookmarks, but at least you can see whether you can still extract the text afterwards. Possibly, just possibly, printing the raw source document to PDF first, and then using the resulting document to make your changes, would work around the problem, so you can try that too.)
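When comparing the results of these workarounds, a quick way to judge whether copied text is usable is to paste it somewhere and measure how much of it consists of letters and spaces. The following is only a rough sketch: the function names and the 0.7 threshold are arbitrary assumptions for illustration, and the heuristic only makes sense for text in a mostly alphabetic language.

```python
def readable_fraction(text):
    """Fraction of characters that are letters or whitespace.

    A crude heuristic (assumption for illustration), not a real
    test of PDF text integrity.
    """
    if not text:
        return 0.0
    ok = sum(1 for ch in text if ch.isalpha() or ch.isspace())
    return ok / len(text)


def looks_extractable(text, threshold=0.7):
    """True if the pasted text looks like ordinary prose.

    The default threshold of 0.7 is an arbitrary choice.
    """
    return readable_fraction(text) >= threshold


# The two strings quoted in the first post of this thread.
good = "Nearly half of the U.S. fleet of reactors"
bad = "3+/1<H!#/<2!‘2!-#+!M@F@!2<++-!‘2!1+/(-‘1%"

print(looks_extractable(good))
print(looks_extractable(bad))
```

With the strings from the first post, the original sentence passes and the garbled copy does not.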


António Nunes