Digging Into Spotlight
I came across a nice article on
advanced searching using Spotlight that talked
about search keywords like kind and
date, and as I read it I thought to
myself, "I wonder what other cool keywords you
can use in Spotlight?" At first blush, this
seems like the type of innocent question that
should be quick and easy to answer, but as is
often the case with technology, things are a bit
more complicated than they first appear.
Dropping to the Command Line
My first thought was just to search for help on
Spotlight, and that's not a bad starting place, as
the help page covers a number of keywords, more
correctly called metadata attributes.
Spotlight's help covers kind,
author, date,
created and by as well as
the boolean operators AND,
OR and NOT. But I knew
there were many other metadata fields that were
commonly used in files such as images and audio
files. I had come across the mdls
shell command before, which lists the metadata
fields on a file. A quick check of a JPEG image
revealed all kinds of interesting data:
da-imac-01:stuff david$ mdls IMG_3564.JPG
kMDItemAcquisitionMake = "Canon"
kMDItemAcquisitionModel = "Canon EOS 10D"
kMDItemAperture = 0.970855712890625
kMDItemBitsPerSample = 32
kMDItemColorSpace = "RGB"
kMDItemContentCreationDate = 2008-04-22 18:40:25 -0400
kMDItemContentModificationDate = 2008-04-22 18:40:25 -0400
...
kMDItemFlashOnOff = 0
kMDItemFNumber = 1.399999976158142
kMDItemFocalLength = 50
...
This is a truncated list of the 50+ fields in one of my image files. Note some of the interesting ones like the aperture, flash setting and focal length.
I played around with another of the
Spotlight/metadata shell commands:
mdfind. This lets you do the
equivalent of a Spotlight search from the command
line, and after a bit of trial and error, guessing
the keyword names and value formats was fairly
easy:
da-imac-01:stuff david$ mdfind make:canon focallength:50 flash:0 iso:125
/Users/david/Desktop/Turks, April 2008/IMG_3564.JPG
/Users/david/Pictures/iPhoto Library/Originals/2008/Turks, April 2008/IMG_3564.JPG
/Users/david/Pictures/iPhoto Library/Originals/2008/Museum Visit/IMG_3273.JPG
/Users/david/Pictures/iPhoto Library/Originals/2008/Museum Visit/IMG_3276.JPG
/Users/david/Pictures/iPhoto Library/Originals/2008/Mar 23, 2008/IMG_3288.JPG
...
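mdfind also accepts a raw query written with the
kMDItem attribute names that mdls prints, so no
keyword guessing is needed. Here is a minimal sketch
of that form. The raw_query helper and the
~/Pictures search folder are my own illustrative
choices, and I left out the ISO test because its
attribute name does not appear in the mdls output
above. I have not checked that it returns exactly the
same files as the keyword search.

```shell
#!/bin/sh
# Hypothetical helper: build a raw Spotlight query from a camera make,
# a focal length and a flash setting, using attribute names from mdls.
# The "c" after the quoted string asks for a case-insensitive match.
raw_query() {
    printf 'kMDItemAcquisitionMake == "%s"c && kMDItemFocalLength == %s && kMDItemFlashOnOff == %s\n' \
        "$1" "$2" "$3"
}

# Only run the search where mdfind is available (that is, on a Mac).
# -onlyin limits the search to one folder; the folder is an assumption.
if command -v mdfind >/dev/null 2>&1; then
    mdfind -onlyin "$HOME/Pictures" "$(raw_query Canon 50 0)"
fi
```

Building the query in a function keeps the awkward
quoting in one place: the whole query has to reach
mdfind as a single argument.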
Peeling the Onion with DTrace
Although these shell commands are very useful,
the man pages for the commands do not list the
valid search keywords. I knew there must be a list
of the keywords used in the Spotlight search bar
that mapped to these constant names, so I thought
*what a great time to learn dtrace!*
For those of you who haven't heard of
dtrace, I encourage you to play around
with it. It's a very powerful tool for doing live
probing and tracing of low level activity in the
operating system. After skimming
this nice tutorial, I tried this command in one
window:
da-imac-01:bin david$ sudo dtrace -n 'syscall::open*:entry /execname == "mdfind"/ { printf("%s %s", execname, copyinstr(arg0)); }'
Password:
dtrace: description 'syscall::open*:entry ' matched 3 probes
and then ran my mdfind command
again in another Terminal window. The
dtrace "script" says to trace all
system calls whose name begins with "open" when the
system call is entered, but only if they were
called from the mdfind process, and
then print out the process name and the
first argument (which in the case of
open is the file or device name). The name of
the system call itself appears in the FUNCTION:NAME
column that dtrace prints by default. That
resulted in lots of calls like these, showing
mdfind opening all kinds of metadata
importer files, which I assume are libraries that
know how to manipulate certain types of metadata
attributes:
CPU     ID                    FUNCTION:NAME
  0  18390              open_nocancel:entry mdfind /System/Library/Spotlight/Audio.mdimporter
  0  18390              open_nocancel:entry mdfind /System/Library/Spotlight/Audio.mdimporter/Contents
  0  17604                       open:entry mdfind /dev/autofs_nowait
  0  17604                       open:entry mdfind /System/Library/Spotlight/Audio.mdimporter/Contents/Info.plist
  0  18390              open_nocancel:entry mdfind /System/Library/Spotlight/Chat.mdimporter
  0  18390              open_nocancel:entry mdfind /System/Library/Spotlight/Chat.mdimporter/Contents
...
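With this much output, it helps to boil the trace
down to the distinct files that were opened. Here is
a minimal sketch, assuming the dtrace output was
saved to a file (trace.txt is a made-up name; dtrace's
-o option can write one). The function names
opened_paths and resource_files are mine.

```shell
#!/bin/sh
# Read dtrace output on stdin and print each distinct path that mdfind
# opened. Lines without ":entry mdfind " (such as the header) are dropped.
opened_paths() {
    sed -n 's/^.*:entry mdfind //p' | LC_ALL=C sort -u
}

# Narrow the list to property lists and strings files, which are the
# likeliest places for a keyword table to live.
resource_files() {
    opened_paths | grep -E '\.(plist|strings)$'
}

# Usage, assuming the trace was saved to a hypothetical trace.txt:
#   resource_files < trace.txt
```

The sort runs in the C locale so the ordering is the
same on every machine.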
But the part of the trace I was most interested in was near the very end:
...
  1  17604                       open:entry mdfind /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/Metadata.framework/Resources/MDPredicate.plist
  1  17604                       open:entry mdfind /dev/autofs_nowait
  1  17604                       open:entry mdfind /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/Metadata.framework/Resources/English.lproj/MDPredicateKeywords.plist
  1  17604                       open:entry mdfind /dev/autofs_nowait
  1  17604                       open:entry mdfind /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/Metadata.framework/Resources/English.lproj/schema.strings
...
Note the files
MDPredicateKeywords.plist and
schema.strings in the
/System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/Metadata.framework/Resources/English.lproj
folder. I tried looking at the
schema.strings file, but it was in a
binary format. So I tried open
schema.strings and sure enough, Xcode
launched and loaded the file, which contains over
400 lines of mostly metadata keyword definitions.
The ones we are interested in are near the end and
of the form kMDItemXXX.ShortName =
yyy:
"kMDItemPixelHeight.ShortName" = "pixelheight,height";
"kMDItemPixelWidth.ShortName" = "pixelwidth,width";
"kMDItemWhiteBalance.ShortName" = "whitebalance";
"kMDItemAperture.ShortName" = "aperture,fstop";
"kMDItemAudioEncodingApplication.ShortName" = "audioencodingapplication";
"kMDItemComposer.ShortName" = "composer,author,by";
"kMDItemLyricist.ShortName" = "lyricist,author,by";
"kMDItemStarRating.ShortName" = "starrating";
These are just a few of the dozens of entries to
whet your appetite. For the most part, I've found
them to work as expected, with one exception:
starrating. I never got any hits using
it, so I tried using mdls on an MP3
that I knew had a rating set in iTunes. There
was no metadata attribute set on it for the iTunes
rating. So I guess all you Mac developers out there
should "do as Apple says, not as Apple does."
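The full keyword list can also be pulled out of
schema.strings without opening Xcode. Here is a
minimal sketch that reads entries of the form shown
above on standard input and prints each attribute
with its keywords. The short_names function name is
mine. I am also assuming the file looked binary
because it is stored as UTF-16, in which case iconv
can convert it first.

```shell
#!/bin/sh
# Print "kMDItemXXX: keyword1,keyword2" for every line of the form
#   "kMDItemXXX.ShortName" = "keyword1,keyword2";
# and ignore every other line.
short_names() {
    sed -n 's/^[[:space:]]*"\(kMDItem[A-Za-z0-9_]*\)\.ShortName"[[:space:]]*=[[:space:]]*"\([^"]*\)";.*$/\1: \2/p'
}

# Usage, run from the English.lproj folder and assuming the file is
# UTF-16 encoded (an assumption, not something I have confirmed):
#   iconv -f UTF-16 -t UTF-8 schema.strings | short_names
```

Piping the result through grep then makes it easy to
check which attribute a keyword such as by or
author maps to.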
Satisfaction
One of the things that I really like about OS X
is the ability to work with the system at varying
levels of depth. This diversion started with me
playing around with the Spotlight search bar: a
very advanced "desktop search" feature found only
in the most modern operating systems. But when I
wanted to learn more, I was able to easily muck
around at the command line and experiment with the
very same infrastructure that Spotlight is built
on. Finally, I was able to leverage a very
powerful, low level system tool,
dtrace, to probe the details of what
was going on inside OS X, which led me to the answer
I was looking for.