Friday, May 25, 2012

OSX: Finding when I took photos, or, How to List all Files Recursively

I set myself a little project. I was wondering whether, in the time I've been doing digital photography, I've now taken pictures on every day of the year, including Feb 29th.


My photos are arranged in one of two methods:
  1. If I've been on a trip, I create a 'collection' called eg "NZ_Sept_2011", with sub-folders for each camera eg "D300", and within those sub-folders eg "NEFs" and "JPGs".
  2. For regular day-to-day stuff, I create numerically based folders eg "Nikon_01200", which would contain Nikon photos 01201 to 01300.
Also, I use tools such as jhead to change the time stamp of processed files to match the EXIF date/time, so a file's modification date is effectively the date the photo was taken. That means the plan is:
  1. Create a listing of all my photos with their modification date; in theory I can just recursively list all files.
  2. Then process the results, shedding extraneous data (including the year), so I'm left with just the day, month and full path of each file.

Here's what I came up with:


  1. First off, run the command
    $ find /dir/ -ls > all-photos.txt

    which creates a (large) file, with details such as

    3538871      232 -rwxrwxrwx    1 user            staff              116581 15 Oct  2008 /path/to/photo/John/IMG_7048.jpg

  2. This list brings back everything, including Photoshop, Autopano and other files, so we need to remove all the extra entries

    $ cat all-photos.txt | grep -i jpg | grep -v -i ".pano" | grep -v -i ".pld" | grep -v -i ".psd" | grep -v -i ".plb" > all-photos-jpg.txt

  3. Now, print only the fields needed: looking at the example above, the day is $8, the month is $9 and the path starts at $11 ($10 is the year, which gets dropped). Can do this by inclusion ($12 to $16 just catch paths that contain spaces)
    $ awk '{print $8,$9,$11,$12,$13,$14,$15,$16;}' all-photos-jpg.txt > all-photos-jpg-fields.txt

    or exclusion

    $ awk '{ $1=""; $2=""; $3=""; $4="";$5="";$6="";$7="";$10="";print $0 }' all-photos-jpg.txt > all-photos-jpg-fields.txt

  4. Finally, change the separator to a comma so we can import into a spreadsheet and create a pivot table (steps 1 to 4 are also rolled into a single pipeline, sketched just after this list)

    $ sed 's: /:,/:g' all-photos-jpg-fields.txt > all-photos-jpg-fields.csv

  5. Import into Google Docs, then add a row at the top. You need the first column to be called, say, 'Date' and the second 'Photo'.



  6. Select both columns, then click Data > Pivot table report. When it loads, set:

    Rows - "Date"
    Values - "Photo", summarised by COUNTA

    (If you'd rather skip the spreadsheet, a command-line version of the same count is sketched after this list.)




  7. And here we are:



  8. And even better, by using filtering you can easily see which photos you took on which date.
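
As mentioned in step 4, steps 1 to 4 can be rolled into one pipeline. This is just a sketch combining the same commands as above (same /dir/ and output name, so adjust to taste):

    $ find /dir/ -ls |                                       # step 1: recursive listing with dates
        grep -i jpg | grep -v -i ".pano" | grep -v -i ".pld" | grep -v -i ".psd" | grep -v -i ".plb" |   # step 2: JPGs only
        awk '{print $8,$9,$11,$12,$13,$14,$15,$16;}' |       # step 3: day, month and path fields
        sed 's: /:,/:g' > all-photos-jpg-fields.csv          # step 4: comma-separate for import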


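And if you just want a quick sanity check of the per-date counts without the spreadsheet (roughly what the pivot table gives you), here's a sketch using cut, sort and uniq on the CSV from step 4:

    $ cut -d, -f1 all-photos-jpg-fields.csv | sort | uniq -c | sort -n | head   # dates with the fewest photos first

Dates missing from this output (or from the pivot table) are the days with no photos at all.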



6 comments:

humbytheory said...

You could just use awk and find:

find /dir/to/pictures -type f \( -name "*.jpg" -or -name "*.JPG" \) -exec ls -lrSo {} \; | awk '{printf "%s %s %s,",$5,$6,$7; for(i=8;i<=NF;i++){ printf "%s ", $i } print "" }'

-humbytheory

humbytheory said...

Whoops, you wanted last modification time... (need to add 'u' to the 'ls' command):

find /dir/to/pictures -type f \( -name "*.jpg" -or -name "*.JPG" \) -exec ls -lrSou {} \; | awk '{printf "%s %s %s,",$5,$6,$7; for(i=8;i<=NF;i++){ printf "%s ", $i } print "" }'

-humbytheory

Steve Mansfield said...

Thanks! I made a couple of alterations: removed the 'year' field, and fed it through sed to turn it into a CSV:

find /path/to/photos/ -type f \( -name "*.jpg" -or -name "*.JPG" \) -exec ls -lrSou {} \; | awk '{printf "%s %s,",$5,$6; for(i=8;i<=NF;i++){ printf "%s ", $i } print "" }' | sed s:" /":,/:g > all_1.csv

Then upload into Google Docs, add the top line, create pivot, job done.

Easy as!

Steve Mansfield said...

For some reason, your second option, ie using 'u', produces the wrong dates on my OSX box. Using the first gives the file system modification time, which is usually what I'm looking for.

previouslysilent said...

why not use

find /foo/bar -type f -name '*.jpg'

rather than grep out the unwanted entries?

Steve Mansfield said...

Well, I could, but I need the date/time stamp of the file. The previous suggestion/comment works really quite well.