Cleaning PDF metadata


If you want to remove PDF metadata, there doesn't seem to be a good native tool on either Mac or Windows.

ExifTool is recommended fairly often, but if you run exiftool -all= "/tmp/example.pdf", you will get a warning:

ExifTool PDF edits are reversible. Deleted tags may be recovered!
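The warning exists because ExifTool edits a PDF by appending an incremental update, so the original metadata objects stay in the file's bytes. Here is a rough way to see that for yourself. It is only a sketch: the helper names are my own, and counting %%EOF markers is a heuristic (more than one marker is a hint that the file has appended revisions, not proof).

```python
def count_eof_markers(data: bytes) -> int:
    """Count %%EOF markers; each appended incremental update normally adds one."""
    return data.count(b"%%EOF")


def still_contains(data: bytes, value: str) -> bool:
    """Check whether a supposedly deleted value is still readable in the raw bytes."""
    return value.encode("latin-1") in data


# Synthetic stand-in for a PDF before and after an appended update.
original = b"%PDF-1.4\n1 0 obj\n<< /Author (Jane Doe) >>\nendobj\n%%EOF\n"
updated = original + b"1 0 obj\n<< >>\nendobj\n%%EOF\n"

print(count_eof_markers(original), count_eof_markers(updated))
print(still_contains(updated, "Jane Doe"))
```

For a real file, read the bytes with open("/tmp/example.pdf", "rb").read() and pass them to the same helpers.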

There is a tool called MAT2 that seems to fit the job exactly:

Metadata within a file can tell a lot about you. Cameras record data about when a picture was taken and what camera was used. Office documents like PDF or Office automatically adds author and company information to documents and spreadsheets. Maybe you don't want to disclose those information. This is precisely the job of mat2: getting rid, as much as possible, of metadata.

However, the thorough clean involves a re-render, and my compact 10 MB PDF grew into an 80 MB monster. Even with the --lightweight flag, the file size still increased from 10 MB to 20 MB.

Because I still wanted to keep my file size down, here's what I did:

  1. Used MAT2 to display the current metadata.
  2. Opened the PDF in TextEdit / Notepad, then searched for and manually edited out the metadata I did not want. (Use the MAT2 output from step 1 to know what you are looking for.)
  3. Scanned the updated PDF with MAT2 again to confirm the information is no longer there. Note that MAT2 warns against using the --show flag as a way to decide whether a file must be cleaned or not.
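The manual edit in step 2 can also be scripted. The sketch below is not what I actually ran, and the function names are made up for illustration. It assumes the metadata sits in the Info dictionary as plain, uncompressed literal strings such as /Author (Jane Doe). Hex strings, compressed object streams, encrypted files and XMP packets are not handled. Each value is overwritten with spaces of the same length, so the file size and the byte offsets in the cross-reference table stay the same.

```python
import re

# Text keys of the PDF Info dictionary; the date keys are left out on purpose.
INFO_KEYS = (b"Title", b"Author", b"Subject", b"Keywords", b"Creator", b"Producer")

# Matches e.g. "/Author (Jane Doe)". Backslash escapes are handled,
# nested unescaped parentheses are not.
_PATTERN = re.compile(
    rb"/(" + b"|".join(INFO_KEYS) + rb")\s*\(((?:\\.|[^\\()])*)\)",
    re.DOTALL,
)


def list_info_strings(data: bytes) -> list[tuple[str, bytes]]:
    """Return (key, raw value) pairs for every match in the raw bytes."""
    return [(m.group(1).decode("ascii"), m.group(2)) for m in _PATTERN.finditer(data)]


def blank_info_strings(data: bytes) -> bytes:
    """Overwrite each matched value with spaces of the same length."""

    def _blank(m: re.Match) -> bytes:
        whole = m.group(0)
        start = m.start(2) - m.start(0)
        end = m.end(2) - m.start(0)
        return whole[:start] + b" " * (end - start) + whole[end:]

    return _PATTERN.sub(_blank, data)


# Synthetic stand-in for the relevant part of a PDF.
sample = b"1 0 obj\n<< /Author (Jane Doe) /Producer (SomeTool 1.0) >>\nendobj\n"
cleaned = blank_info_strings(sample)

print(list_info_strings(sample))
print(len(sample) == len(cleaned))
```

Because this works on raw bytes, it also avoids the risk of a text editor re-encoding the binary parts of the file. I would still scan the result with MAT2 afterwards, as in step 3.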
