April 5, 2007

Understanding Spotlight and Searching Python and Ruby Files

Filed under: Technology — Cory @ 12:11 am

Tonight I spent a little time learning more about how Mac OS X’s Spotlight system works. Specifically, I wanted to know how it indexes a drive, where it stores the index, and how to trigger a re-index.

Apple has a page of Spotlight tips, but it doesn’t really give much information on how to manage the indexing process. This page offered some interesting insight into the md* commands, which I had also read about in this macosxhints.com article.

It turns out that there are two common ways to tell Spotlight to update its index.

The first way, and the Apple-recommended way, is to drag your “Macintosh HD” icon into the box in the “Privacy” tab of the Spotlight preferences, and then drag it out again. By doing this you are telling Spotlight to ignore that drive, and it removes the index, which is stored in the root directory of the drive in a directory named .Spotlight-V100. When you drag the drive back out of the Privacy box, it begins to rebuild the index. Apparently this can take upwards of an hour on an average Mac with an 80-100 GB drive.

The second, and apparently less predictable, way of forcing a re-index is to use the Spotlight command line tools. The following can be used to initiate a re-index from the root of the main volume (mdutil must be run as root):

sudo mdutil -E /

If you just want Spotlight to re-index a certain folder, you can use the mdimport command (doesn’t need to be run as root):

mdimport ~/Documents

Another interesting thing about Spotlight is that there are “importers” for certain types of files. These importers are used to extract metadata from the file types they understand. For example, peeking into a PDF file is different than looking at an MP3, so Spotlight is designed to load lots of different importers to help it index your data. You can see a list of these importers by running the following:

mdimport -L

Also, you can see the default importers, and rearrange their ordering in search results, by going to the “Search Results” tab of the Spotlight preferences.

When I ran the above mdimport -L command I noticed that there were several custom importers on my system. I did a quick Google search (searching about searching, how meta!) to find out what other importers were out there and I found this page on Apple’s site. It turns out that people have written quite a few custom importers, including one to index Python files and another to index Ruby files.

The Python importer automatically begins indexing after you install it, so be aware that your processor may spike for 10 minutes or so after you have installed the importer. The Ruby one does not automatically index, but if you install it before the Python one, Ruby files will be indexed at that time.

Apparently I had never set up a default application for Python and Ruby files, because when they showed up in search results there was no little TextMate icon beside them. To correct this I just located and clicked on a Python file and a Ruby file, pressed Command-I, selected TextMate in the “open with” section, and clicked the “Change All” button to apply this globally. Now when a Python or Ruby file shows up in search results, selecting it automatically launches TextMate.

There are a couple of other cool importers that I installed as well. I have a lot of zip, tar/gz and tar/bz2 files on my Mac (I keep everything I ever download), and I noticed that there are importers for zip files and tar files, so I went ahead and installed those too!

One other neat thing is that you can do Spotlight searches directly from the command line using the mdfind utility:

$ mdfind firefox
/Applications/Firefox-1.5.0.7.app
/Applications/Firefox.app
/Users/cwright/Library/Application Support/Firefox
...
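
If you want to use those results from a script, something like the following Python sketch should work. It is only a rough illustration: the function names are my own, it assumes a Mac with mdfind on the PATH, and the only mdfind option it uses is -onlyin, which limits the search to one directory.

```python
import subprocess
import sys


def build_mdfind_command(query, only_in=None):
    """Return the argument list for an mdfind search."""
    command = ["mdfind"]
    if only_in is not None:
        # -onlyin restricts the search to a single directory tree.
        command += ["-onlyin", only_in]
    command.append(query)
    return command


def parse_mdfind_output(output):
    """mdfind prints one matching path per line; drop any blank lines."""
    return [line for line in output.splitlines() if line.strip()]


def spotlight_search(query, only_in=None):
    """Run mdfind and return the matching paths (Mac OS X only)."""
    result = subprocess.run(
        build_mdfind_command(query, only_in),
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_mdfind_output(result.stdout)


if __name__ == "__main__" and sys.platform == "darwin":
    # Same search as above, limited to /Applications.
    for path in spotlight_search("firefox", only_in="/Applications"):
        print(path)
```

Keeping the command building and output parsing in separate functions means they can be checked on any machine, even one without Spotlight.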

Apparently Apple is improving Spotlight in Leopard by making it faster, allowing it to index and search network shares, and allowing more specific searches. Woot!

• • •

April 4, 2007

Buy Yourself a Web App

Filed under: Technology — Cory @ 9:44 pm

For the past year or two my buddy Ken has been working on a web app to make building online databases easier. That app is now up for auction on eBay. If you are interested in this sort of thing, go check it out and play with the live demo. And if you happen to know someone who is looking for such an app, and has some spare change hidden in their couch, point them to the auction.

• • •

March 29, 2007

SPF Records for AntiSpam Efforts

Filed under: DNS,Mail,Technology — Cory @ 1:11 am

A few days ago my server got joe-jobbed on a domain that I registered and never used. When it started happening I just changed the MX record for the domain to point to localhost.standblue.net, which is an A record pointing to 127.0.0.1. After doing this I noticed the bounces slowed down as the MTAs tried to connect to themselves, rather than to my server. At that time I also added SPF records to all the domains that I host on my server.

So tonight I figured it was time to configure my mail server to look at SPF records. While searching around for an SPF implementation that seemed reasonable (i.e., not written in Perl), I found python-postfix-policyd-spf, which is written in Python (although the code is not Pythonic at all).

After installing the PyDNS and PySPF module dependencies, I installed python-postfix-policyd-spf by running ‘python setup.py install’, and then things were ready to be configured.

The next step was to configure Postfix, which was actually very easy. I added the following entry to my /etc/postfix/master.cf file:

spfpolicy unix  -       n       n       -       -       spawn
        user=nobody argv=/usr/bin/python /usr/bin/policyd-spf

And the following to /etc/postfix/main.cf:

smtpd_recipient_restrictions =  permit_mynetworks,
                                permit_sasl_authenticated,
                                check_client_access hash:/etc/postfix/pop-before-smtp-relays,
                                check_recipient_maps,
                                reject_unauth_destination,
                                check_recipient_access hash:/etc/postfix/badmailto,
                                check_policy_service inet:127.0.0.1:60000,
                                check_policy_service unix:private/spfpolicy,
                                permit
spfpolicy_time_limit = 3600

Be sure to add the check_policy_service unix:private/spfpolicy line after the reject_unauth_destination line, otherwise you’ll be an open relay.
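
That ordering rule is easy to get wrong when editing a long restriction list, so here is a small Python sketch that checks it. The function names are hypothetical, and it assumes the entries are separated by commas as in my main.cf above (Postfix also accepts plain whitespace, which this sketch does not handle).

```python
def split_restrictions(value):
    """Split a comma-separated smtpd_recipient_restrictions value into entries."""
    # Collapse the indentation and newlines inside each entry to single spaces.
    return [" ".join(entry.split()) for entry in value.split(",") if entry.strip()]


def policy_checks_are_safe(value):
    """Return True if every check_policy_service follows reject_unauth_destination."""
    entries = split_restrictions(value)
    if "reject_unauth_destination" not in entries:
        return False
    guard = entries.index("reject_unauth_destination")
    return all(
        index > guard
        for index, entry in enumerate(entries)
        if entry.startswith("check_policy_service")
    )


MY_RESTRICTIONS = """permit_mynetworks,
    permit_sasl_authenticated,
    check_client_access hash:/etc/postfix/pop-before-smtp-relays,
    check_recipient_maps,
    reject_unauth_destination,
    check_recipient_access hash:/etc/postfix/badmailto,
    check_policy_service inet:127.0.0.1:60000,
    check_policy_service unix:private/spfpolicy,
    permit"""

print(policy_checks_are_safe(MY_RESTRICTIONS))
```

The reason the order matters is that a policy service is allowed to answer with an accept action, and an accept that comes before reject_unauth_destination would let mail for other people's domains through.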

Run ‘postfix reload’ to get Postfix to acknowledge the changes, and that’s it.
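
For anyone curious what Postfix actually says to the policy script, the policy delegation protocol is plain text: Postfix sends name=value lines ending with an empty line, and the service replies with a single action= line followed by an empty line. Here is a minimal Python sketch of that exchange. It is not the policyd-spf code, the function names are my own, and the sender and recipient addresses in the sample request are made up.

```python
import io


def read_policy_request(stream):
    """Read name=value lines up to the first empty line and return them as a dict."""
    attributes = {}
    for line in stream:
        line = line.rstrip("\n")
        if not line:
            break  # an empty line ends the request
        name, _, value = line.partition("=")
        attributes[name] = value
    return attributes


def format_policy_response(action):
    """Format a reply; the action line must be followed by an empty line."""
    return "action=" + action + "\n\n"


# A sample request. The client address and HELO name come from my logs;
# the sender and recipient are placeholders.
SAMPLE_REQUEST = (
    "request=smtpd_access_policy\n"
    "client_address=65.254.160.36\n"
    "helo_name=mail.meckcom.net\n"
    "sender=someone@example.com\n"
    "recipient=me@example.org\n"
    "\n"
)

request = read_policy_request(io.StringIO(SAMPLE_REQUEST))
print(request["client_address"])
# "dunno" tells Postfix that this service has no opinion, so it moves on
# to the next restriction in the list.
print(repr(format_policy_response("dunno")))
```

A real policy service would look up the SPF record for the sender's domain at this point and answer with a reject action when the check fails.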

After setting this up and watching the logs for a while tonight, I noticed a few things.

First, there are a lot more domains using SPF than I thought. I know it’s been out for a few years now, but apparently it has really taken off. The only shame is that most of the domains that have SPF records still seem to end them with ~all (softfail) rather than -all (fail), which basically makes them pointless, since receivers are not supposed to reject mail on a softfail alone.
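
To make the qualifier difference concrete, here is a tiny Python sketch that reports what a record's all mechanism asks for. The four qualifier characters and their results come from the SPF specification; the function name and the second sample record are my own, and the sketch ignores everything in the record except all.

```python
# SPF qualifiers and the result each one produces when its mechanism matches.
SPF_QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def all_policy(record):
    """Return the result named by the record's 'all' mechanism, or None."""
    for term in record.split()[1:]:  # skip the leading "v=spf1"
        if term[0] in SPF_QUALIFIERS:
            qualifier, mechanism = term[0], term[1:]
        else:
            qualifier, mechanism = "+", term  # no qualifier means "+"
        if mechanism.lower() == "all":
            return SPF_QUALIFIERS[qualifier]
    return None


print(all_policy("v=spf1 a mx ip4:38.98.2.0/24 -all"))  # fail
print(all_policy("v=spf1 a mx ~all"))                   # softfail
```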

The second thing I noticed tonight isn’t quite as shocking: spammers are really careless and dumb. The first two messages that were rejected via SPF were because the spammer actually set up SPF records for their spamming domains, but they screwed it up. The log messages for those two are below:

Mar 29 01:07:05 silver policyd-spf[9260]: SPF fail - not authorized:QUEUEID=; 
       identity=mailfrom; client-ip=65.254.160.36; helo=mail.meckcom.net;
       [email protected]; 
       [email protected]; 
Mar 29 01:19:00 silver policyd-spf[9508]: SPF fail - not authorized:QUEUEID=; 
       identity=mailfrom; client-ip=65.254.160.36; helo=mail.meckcom.net; 
       [email protected]; 
       [email protected]; 

Investigating the first one, I found this:

jermaynepaganochristianism.com. 600 IN  TXT     "v=spf1 a mx ip4:38.98.2.0/24 -all"

So the spammer who bought jermaynepaganochristianism.com (which was registered earlier this month) decided to set up a record specifying which hosts could send mail for that domain, and then send the spam through a different server. Brilliant!
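
The ip4 part of that check is simple enough to reproduce with Python's ipaddress module. This is only a sketch with names of my own choosing: it looks at ip4 mechanisms and nothing else, so the a and mx mechanisms (which need DNS lookups) are not evaluated here, and it is not how policyd-spf itself does the work.

```python
import ipaddress


def ip4_networks(record):
    """Return the networks listed in a record's ip4: mechanisms."""
    networks = []
    for term in record.split():
        if term.lower().startswith("ip4:"):
            networks.append(ipaddress.ip_network(term[4:], strict=False))
    return networks


def matches_ip4(record, client_ip):
    """Return True if client_ip falls inside any ip4: network in the record."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in ip4_networks(record))


RECORD = "v=spf1 a mx ip4:38.98.2.0/24 -all"

# The client IP from the log entries above is outside the listed network.
print(matches_ip4(RECORD, "65.254.160.36"))  # False
# An address inside 38.98.2.0/24 would have matched.
print(matches_ip4(RECORD, "38.98.2.17"))     # True
```

Since nothing in the record matched the sending host, the trailing -all applied and the message was rejected.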

Here is another interesting one:

Mar 29 00:58:21 silver policyd-spf[9071]: 
       SPF Permanent Error: Invalid IP4 address: ip4:72.11.154.128/25-all:
       QUEUEID=; identity=mailfrom; client-ip=72.11.154.232; helo=mail.anbermedia.com; 
       [email protected]; [email protected]; 

In this case the spammer who bought anbermedia.com (which was registered today) set up an SPF record, but screwed it up by not placing a space between /25 and -all, thereby making it an invalid record and causing mail to be rejected. Sweet!
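
You can see why the record is invalid by feeding the term to any strict network parser. The Python sketch below uses the standard ipaddress module as a stand-in (the function name is mine, and the SPF library does its own parsing): without the space, "25-all" is not a valid prefix length, so the whole term is rejected.

```python
import ipaddress


def parse_ip4_term(term):
    """Return the network named by an 'ip4:' term, or None if it is malformed."""
    try:
        return ipaddress.ip_network(term[len("ip4:"):], strict=False)
    except ValueError:
        return None


broken = parse_ip4_term("ip4:72.11.154.128/25-all")  # the term as published
fixed = parse_ip4_term("ip4:72.11.154.128/25")       # with the missing space restored

print(broken)  # None
print(ipaddress.ip_address("72.11.154.232") in fixed)  # True
```

The second result is the funny part: with the space in place, the sending host would have been inside the authorized range.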

In the 2 hours I have had SPF in place, it’s blocked 10 messages or so. That isn’t a huge amount, but my server doesn’t move a tremendous amount of mail, especially around midnight. It will be interesting to see how well this works when the server is busy.

Next I plan to try out DomainKeys and see if that helps any.

• • •

March 7, 2007

Frontier Airlines’ Awful Website

Filed under: Technology — Cory @ 12:45 am

Recently I have been paying a lot more attention to web design. I thought we had basically moved past the days when companies would lock out potential customers by requiring a specific browser, but I guess not.

[Screenshot: Frontier Website Sucks]

People have been telling me good things about Frontier Airlines, so I thought I would check out their prices to Vegas for DefCon this summer. When I pulled up the site I immediately got the above page that told me I was not good enough to use their little e-commerce site. Too bad.

When are these companies going to realize that it is really not that difficult to make a site that works for everyone? Maybe 6 years ago you could justify it because all the browsers sucked in different ways, but today the browsers are pretty good. And with 1 of every 5 web users browsing with something other than IE, they are essentially slamming the door in the faces of roughly 20% of their potential customers. With so many cross-platform development toolkits for JavaScript and CSS, there is no excuse for a company the size of Frontier to have such a finicky site.

• • •

March 5, 2007

New Domain Aliases

Filed under: DNS,Technology — Cory @ 11:04 pm

I have been blogging at this address for almost two years now, and people still seem to have a hard time remembering the website address (although this wouldn’t be a problem if everyone got with the times and read the web via RSS).

So, to make it easier for people to find this site I have registered a few other domains and pointed them here. Now you can get to this blog from antsonthemelon.com, corywright.net, or corywright.org.

I actually bought corywright.com back around 1999 or 2000, but unfortunately I let it lapse and now some wannabe fake Cory Wright owns it. Lamo.

• • •

March 2, 2007

Oldest Domains on the Internet

Filed under: DNS — Cory @ 1:12 am

As my geek friends know, I love DNS, so I found this list of the 100 oldest domains on the Internet pretty fascinating.

All the big names are on there, except Microsoft. It’s hard to believe that some domains have been registered for 22 years. No wonder all the good ones are taken.

• • •

March 1, 2007

Speed Up Apple Mail.app

Filed under: Mail — Cory @ 9:00 pm

I came across this tip today that shows a neat trick to speed up Mail.app.

I’ve been using Mail.app for about 3 years now and after running the sqlite command to vacuum the index I noticed a pretty significant speedup.

Update: I showed this to Will and he tried vacuuming all the other tables as well. I tried it too and it made things even faster! Here are the steps:

cd ~/Library/Mail
sqlite3 Envelope\ Index
sqlite> vacuum;
sqlite> .quit
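
The same thing can be done from a script. Here is a minimal Python sketch using the standard sqlite3 module; the function name is my own, and the path in the comment is simply the one from the steps above.

```python
import sqlite3


def vacuum_database(path):
    """Rebuild the SQLite database at path, reclaiming unused space."""
    # isolation_level=None puts the connection in autocommit mode;
    # VACUUM cannot run inside an open transaction.
    connection = sqlite3.connect(path, isolation_level=None)
    try:
        connection.execute("VACUUM")
    finally:
        connection.close()


# Example (not run here):
#   import os
#   vacuum_database(os.path.expanduser("~/Library/Mail/Envelope Index"))
```

VACUUM rebuilds the whole database file, so one call covers every table in it.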

• • •

January 9, 2007

Gimme Gimme Gimme!

Filed under: Technology — Cory @ 10:53 pm

Finally. Yes, I will be getting one.

I have always hated shopping for cell phones, because they all suck. My Verizon contract is up in August, and the iPhone is supposed to be available in June. That gives everyone else two months to buy them, and enough time for Apple to keep the phones in stock before I buy mine.

Seriously, I can’t remember the last time I was this excited about a new gadget.

• • •

December 8, 2006

VLC and the Apple Remote

Filed under: Technology — Cory @ 1:03 am

Tonight I was trying to play some videos that my friend gave me and I wanted to watch them using Front Row. QuickTime would not open the videos, and although I did not spend much time trying, I was not able to convert them into a format that QuickTime would open. VLC, however, played the videos just fine. I really didn’t want to spend much time tonight researching how to make this work, so I just thought I would see if there was any way to get the Apple Remote to work with VLC, and sure enough there is.

I found this thread where a guy points out a tool he wrote that hooks the Apple Remote to VLC, which can be downloaded here.

In under 2 minutes my problem was solved. I love the Internets!

• • •

December 7, 2006

My New MacBook

Filed under: Technology — Cory @ 1:54 am

On Black Friday a couple of my friends took advantage of the discounts that Apple was offering on their online store and ordered new MacBooks. After thinking about it, I decided to order one too. ;)

After waiting patiently for a week, my new machine arrived last Friday just before I left for San Antonio for the weekend. The specs:

  • 13″ White MacBook
  • Core 2 Duo 2 GHz
  • 2 GB Memory
  • 160 GB Drive
  • Superdrive (DVD-RW,CD-RW)

Since I was already using a MacBook and I just wanted to migrate all my data to the new machine, I decided to try the “copy all my data from another Mac” option that the Mac OS X installer gives you during the first boot. Although this is my fourth Apple laptop, I’ve never tried this before so I didn’t know what to expect. It asked me to connect the old and new laptops with a FireWire cable, and then let me choose from a simple list of things that I wanted to copy (users, applications, etc). I selected everything, clicked “continue” and watched as it began copying.

Apparently it took about an hour and a half (I went to sleep in the meantime), but after it was complete I started the new machine and to my surprise everything was identical to the old one. All my desktop icons were scattered in the same places, my browser tabs restored to the pages I was browsing on the old one, my mail filters were still there, etc. I thought it had to be a little too good to be true, so I pulled up a terminal and checked to see if other things on the hard drive were copied, and sure enough they were. Even my DarwinPorts installation in /opt/local was copied over, as well as my daemontools compile in /package, and the services in /service. Amazing! I checked to see about some weird things like the kernel extensions required to run DoubleCommand, and yup, they were there too. The only things I could find that weren’t copied over were non-standard Apple applications that I had installed, such as Xcode and X11.app.

My previous MacBook had a Core Duo 1.83 GHz processor with 1 GB of memory, so it’s a little hard to pinpoint what is making this new machine so much faster, but one thing is for sure: it’s much, much faster. Spotlight finds results instantly as I type, and most applications launch immediately now. Previously iPhoto could take 5-10 seconds to start, but now it is almost instant. I am very, very happy with this new MacBook.

I also ordered a copy of Parallels Desktop for Mac with my MacBook. I had played around with Parallels back when I got my first MacBook in May, but I never bought a copy of it. Incidentally, a new beta version of Parallels was released on Friday that includes a lot of really amazing new features. I spent some time playing around with it, and getting Windows XP and Kubuntu installed on it, and I must say that I have now found the perfect environment. The machine has enough processing power and RAM to easily handle all three operating systems running at once, which I have been doing. I installed XP and Kubuntu at the same time, and the little MacBook didn’t even stumble. I was quite impressed.

So far, 6 of my friends have switched to the Mac since Apple introduced the MacBook, and 4 of these people are non-technical users. This makes me very happy.

• • •