Introducing Pxsum!

Have you ever wanted a quick and easy way to evaluate the visual equality of images — i.e. their pixels — independent of complicating factors like format and encoding?

Whether you're looking for proof of losslessness or want to clean up an old wp-content/uploads directory with ten thousand copies of the same logo, our new x86-64 Linux CLI tool pxsum can help!

Pxsum works much like traditional checksum tools such as md5sum and b3sum, but rather than hashing files directly, it decodes each image into a neutral RGBA bitmap, then hashes the resulting pixel byte stream.

Under the hood, it uses the modern BLAKE3 hashing algorithm, so results are both fast and collision-resistant.

All of the "popular" image formats (and JPEG 2K/XL, haha) are supported.

Crunching

The interface mirrors that of other checksum tools like md5sum, so it should be immediately familiar to the target audience.

# STDIN to STDOUT.
cat image.jpeg | pxsum

# Source path(s) specified as arguments.
pxsum image.jpeg another.png

# Crunch an entire directory.
find ~/Pictures -type f -exec pxsum {} +

The output looks like you'd expect:

fc6e48e935f7b7330cb6bc95c8c725f57e8d9b1efe01b7afc90ea53e9d968aa9  ./assets/ash.jpg
84372180f4763895ff1165487003106774953ad6bce56b5b1344893c52f175c2  ./assets/carl.jpg
2212ead939f81398e078745cbdfeb57f67398c6d3ff16f822449df035bafee0f  ./assets/cmyk.JPG
3c112bc262d2f292adec259c35ad1981088e9cd45ccc5669ada0c9b7efe08559  ./assets/dingo.png
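
Because the checksum is always the first field on each line, the plain output is easy to post-process with standard tools. As a minimal sketch of the losslessness check mentioned at the top, the helper below succeeds only when every line it reads carries the same checksum. Note that same_pixels is a made-up name, not part of pxsum, and it assumes the output format shown above: a checksum, two spaces, then the path.

```shell
# Hypothetical helper, not part of pxsum: reads pxsum-style lines
# ("<checksum>  <path>") on STDIN and succeeds only if all of the
# checksums are identical.
same_pixels() {
    # Keep the first field, collapse repeats, and count what is left.
    [ "$(cut -d' ' -f1 | sort -u | wc -l)" -eq 1 ]
}

# Example usage (assumes pxsum is installed; file names are placeholders):
# pxsum original.png converted.bmp | same_pixels && echo "Pixel-identical"
```

Empty input counts as a failure here, which seems like the safer default for a proof-of-equality check.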

Alternatively, you can group the results by checksum using the -g/--group-by-checksum flag:

2212ead939f81398e078745cbdfeb57f67398c6d3ff16f822449df035bafee0f
  ./assets/cmyk.JPG
2ac10da49b973cd4c44bd3b15cf4e0882f3bfd9bd0edf99bca28b3376ff1a70f
  ./assets/santo.bmp
  ./assets/santo.ico
  ./assets/symto.bmp
3c112bc262d2f292adec259c35ad1981088e9cd45ccc5669ada0c9b7efe08559
  ./assets/dingo.png

Grouped output makes it easier to spot duplicate images. If that's your use case, though, reach for the --only-dupes flag, which additionally filters out any one-offs:

2ac10da49b973cd4c44bd3b15cf4e0882f3bfd9bd0edf99bca28b3376ff1a70f
  ./assets/santo.bmp
  ./assets/santo.ico
  ./assets/symto.bmp
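
For cleanup jobs, a grouped listing like this can be turned into a list of redundant copies. The following is only a sketch: redundant_copies is a hypothetical helper, not a pxsum feature, and it assumes the exact layout shown above (a checksum on its own line, followed by paths indented with two spaces). It keeps the first path in each group and prints the rest.

```shell
# Hypothetical helper, not part of pxsum: reads grouped output on STDIN
# and prints every path except the first one listed under each checksum.
redundant_copies() {
    awk '
        NF == 0  { next }                    # Skip blank lines.
        /^[^ ]/  { first = 1; next }         # A checksum line starts a new group.
        first    { first = 0; next }         # Keep the first path in the group.
                 { sub(/^  /, ""); print }   # Print the remaining paths.
    '
}

# Example usage (assumes pxsum is installed):
# pxsum --only-dupes ./assets/* | redundant_copies
```

Which copy counts as the "keeper" is arbitrary in this sketch (simply whichever is listed first), so review the list before deleting anything.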

Either way, you can save the results to a file the usual way:

pxsum -d ~/Pictures > pictures.chk

Verification

Previously generated results can be fed back into the program to check that nothing's changed. Just set the -c/--check flag:

# These are all equivalent.
cat pictures.chk | pxsum -c
cat pictures.chk | pxsum -c - # The STDIN dash is optional.
pxsum -c pictures.chk
pxsum pictures.chk -c         # The order doesn't matter.

This output also looks just the way you'd expect:

./assets/ash.jpg: OK
./assets/dingo.png: OK
./assets/cmyk.JPG: OK
./assets/carl.jpg: FAILED
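
When checking a large collection, the failures are usually the only lines of interest. Here is a small sketch for pulling them out, assuming every result line ends in ": OK" or ": FAILED" exactly as shown above (failed_paths is a hypothetical name, not a pxsum feature):

```shell
# Hypothetical helper, not part of pxsum: reads check results on STDIN
# and prints only the paths whose line ends in ": FAILED".
failed_paths() {
    sed -n 's/: FAILED$//p'
}

# Example usage (assumes pxsum is installed):
# pxsum -c pictures.chk | failed_paths
```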

Interested?

As with everything we do, pxsum is licensed WTFPL. You can use and abuse it freely.

For more information, sources, and build/install instructions, visit the GitHub repo.

Josh Stoik
30 September 2024