Sunday, November 25, 2007

5 minute performance picture: Paragon NTFS for Mac OS X

I have a MacBook Pro, a large number of files, and I like to play computer games.
So I have a large external FireWire/USB2 hard drive, and Boot Camp.
This also means I care about NTFS access from OS X.
I'd been running MacFUSE + NTFS-3G. The performance wasn't too bad, but it was chock full of bugs. Drives would show up as network drives, and be called "-n External" and "-n" instead of "External". Not to mention that sometimes stuff would just randomly break: files would disappear or move around in the Finder, and sometimes I simply couldn't mount the drive at all. It sucked pretty hard, so I ended up booting up VMware and accessing the drive via VMware's USB2 mapping plus Samba under the Windows VM. Not cool.
Suffice it to say, I was very happy when I saw the release of Paragon NTFS for OS X.
I downloaded it, got rid of MacFUSE and NTFS-3G, and ran some benchmarks.
Before the benchmark results, let me first say that even if it were just as slow as MacFUSE/NTFS-3G, Paragon NTFS would still be worth a look, because it seems (so far) to be rock solid. Drives show up as proper drives in the Finder. The volume labels are fine, as is everything else I can see. There is no lag, and I even have the option now of backing up my Boot Camp partition with Time Machine. Basically it's as if Apple had actually bothered to implement full NTFS support in Leopard. That's cool.
Anyway, Benchmarks:
To get the write speeds, I did this (with the file path pointing at whichever volume I was testing):
dd if=/dev/random of=/tmp/bigfile bs=1m count=200
For the read speeds, I did this:
dd if=/tmp/bigfile of=/dev/null
Yes, I am aware this is a crap method of benchmarking drives/filesystems. I'm not AnandTech and I don't have days to spend on this.
Computer: MacBook Pro 2.2GHz (the cheapest one)
External NTFS drive: 7200RPM 500GB Seagate with 16MB of cache
External HFS+ drive: 7200RPM 160GB Seagate with 8MB of cache
Both use identical dirt-cheap FireWire/USB2 enclosures I found at the local PC shop.
               Write (bytes/sec)  Write (MB/sec)  Read (bytes/sec)  Read (MB/sec)
HFS+ FireWire  6049969            5.77            91697378          87.45
NTFS FireWire  6645725            6.34            19899810          18.98
HFS+ Local     6565372            6.26            90154137          85.98
NTFS Local     6495106            6.19            16776180          16.00
Conclusions:
HFS+ is obviously doing some kind of caching on those reads, as there's no way you can get 85+ MB/sec off a plain old 7200RPM drive, let alone the 5400RPM drive in the MacBook Pro. In actual use, I can't tell the difference between the NTFS and HFS+ drives.
Also, the read/write speeds suck compared to the 30/25-odd MB/sec Windows reports when reading/writing files to the disk. But Windows lets you enable write caching for removable drives; maybe OS X doesn't do this. I don't know.
Apart from that, it keeps up with HFS+ and in some cases beats it.
That's not half bad. I might send some of my hard-earned cash Paragon's way.

Tuesday, October 30, 2007

How to manually send an email using Rails' ExceptionNotifier Plugin

We have a situation in our Rails app where we want to catch an exception and display a custom error message to the user, BUT we still want the exception notifier to fire, so we get all the detailed backtrace data etc. and can deal with it if it's a problem on our end.



Without further ado, here is the code:

begin
    # b0rk b0rk b0rk
rescue => exception
    # ExceptionNotifier's mailer wants a controller and a request, but we're not
    # inside a normal dispatch cycle here, so fake up just enough of a request
    # for the notification email templates to render.
    fake_params = { :id => some_id, :etc => 'etc' }
    fake_request = ActionController::AbstractRequest.new
    fake_request.instance_eval do
        @env = { 'HTTP_HOST' => 'fake_host' }
        @parameters = fake_params
    end

    ExceptionNotifier.deliver_exception_notification( exception, ActionController::Base.new, fake_request )
end

Enjoy :-)



Monday, June 04, 2007

5 Things that I don't like about Ruby

I can't remember the quote or source, but there's a pseudo programmer-interview question which goes something like this: "What's your favourite programming language?" "OK, what are 5 things that are wrong with it that other languages do better?" This is something I've thought about from time to time, so I figure I'll give it a shot. Obviously Ruby is my favourite programming language at the moment, mostly due to the map and inject methods :-)

1. Green Threads are Useless!

Ruby's green threads are co-operatively scheduled - the interpreter can't context-switch a thread unless that thread happens to call back into one of a number of Ruby methods. This means that as soon as you hit a long-running C library function, your entire Ruby process hangs. I encountered this situation, and then tried to ship the work out to another process using DRb. This was even more useless: when you do that, the parent process blocks and waits for the DRb worker process to return from its remote function... which doesn't happen, as the worker is blocking on your C library function :-( I ended up having to create a database table, insert 'jobs' into it, and have a separate worker process which polled the database once a second. STUPID.
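
For what it's worth, here's a minimal sketch of that polling-worker workaround (the Job model, its columns, and SomeSlowCLibrary are all names made up for illustration):

# Runs as its own process, so blocking inside the C library
# only stalls this worker, not the main app
loop do
    if job = Job.find_by_status( 'pending' )    # hypothetical ActiveRecord model
        job.update_attribute( :status, 'running' )
        SomeSlowCLibrary.do_work( job.args )    # the long-running C call
        job.update_attribute( :status, 'done' )
    else
        sleep 1
    end
end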

2. You can't yield from within a define_method, or write a proc which accepts a block

It appears to be to do with the scoping of the block, but in Ruby 1.8.x this code doesn't work:

class Foo
    define_method :bar do |f|
        yield f
    end
end

# This line raises "LocalJumpError: no block given",
# even though there obviously is a block
Foo.new.bar(6){ |x| puts x }

The other way to skin this cat is as follows, which also doesn't work :-(

class Foo
    define_method :bar do |f, &block|
        block.call(f)
    end
end

# The "define_method :bar do |f, &block|" line gives you
# parse error, unexpected tAMPER, expecting '|'
# :-(

This means there is a certain class of cool dynamic method-generating stuff you just can't do, due to stupid syntax issues. Boo :-(
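
One workaround I know of (not mentioned above, so treat it as a sketch) is to define the method from a string via class_eval, since in 1.8 a plain def is allowed to declare an explicit &block parameter even though a block's own parameter list isn't:

class Foo
    # build the method from a string to dodge the 1.8 parser limitation
    class_eval <<-RUBY
        def bar(f, &block)
            block.call(f)
        end
    RUBY
end

Foo.new.bar(6){ |x| puts x }    # prints 6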

3. The standard library is missing a few things

I vote for immediate inclusion of Rails' ActiveSupport sub-project into the Ruby standard library. I'm sure I won't be alone in thinking this.

4. Some of the standard library ruby classes really suck.

Time, I'm looking at you.
Strike 1: Not being able to modify the timezone. Seriously, people need to deal with more than just the 'local' and 'UTC' timezones. Yes, I know there are libraries, but they shouldn't need to exist. Timezones are not a new high-tech feature!
Strike 2: The methods utc and getutc should be called utc! and utc, in keeping with the rest of the language. This alone has caused several nasty and hard-to-spot bugs.
Strike 3: What the heck is up with the Time class vs DateTime vs Date? This stuff should all be rolled into one and simplified.
The Tempfile class is also notably annoying. Why doesn't it just subclass IO like any sane person would expect?
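
To illustrate Strike 2, this is the kind of thing that bites you (a quick sketch of the 1.8 behaviour as I understand it):

t = Time.now     # local time
u = t.getutc     # returns a new Time in UTC, t is left alone
t.utc            # despite the bang-less name, this converts t to UTC in place
t.zone           # => "UTC", because t has been mutated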

5. The RDoc table of contents annoys me

This is probably more of a "Firefox should have a 'search in current frame' feature" complaint, but on http://ruby-doc.org/core/, have you ever had the page for, say, Array open, and wanted to jump to its hash method? I usually do this with Firefox's find-as-you-type, but seriously, try doing that in the RDoc-generated pages with the three frames containing every method ever open. Cry :-(

Monday, April 23, 2007

HOWTO: Create a GParted LiveUSB which actually works WITHOUT LINUX

EDIT:

Turns out there is a windows version of syslinux, to be found HERE.

If I'd kept reading for about two more minutes I would have found that out, and avoided pretty much all of the time-wasting I did last night. Sigh. At least the other people trying to make it work using loadlin indicate I can't have been the only one to get it wrong :-(

Also, the graphics card thing is a non-problem. Just choose Mini X-vesa in the GParted boot menu and it's fine.

Moral of the story? Just because you've found a solution doesn't mean it's the best one. Keep looking until you can be sure it is!


So, I wanted to repartition my hard drive tonight. I've used GParted before and it was brilliant, so off I went to download the liveCD again.

Once at that site, I saw the LiveUSB option in the left-hand menu, and thought "Brilliant, I don't have to waste a CD-R and it will be much quicker anyway!"... Little did I know that PAIN and DESPAIR awaited me. I'll publish how I resolved this in the hope that fewer people will have to go through it.

Step 1: Download the GParted LiveUSB distro

I clicked 'Downloads' in the navigation, followed the LiveUSB links, and wound up here:
http://sourceforge.net/project/showfiles.php?group_id=115843&package_id=195292
I downloaded gparted-liveusb-0.3.1-1.zip, and unzipped it. Hooray, now what?

Problem 1: The GParted LiveUSB documentation is crap!

The GParted LiveUSB information here says that first I need to download a shell script, then run it and copy some files to my USB key... Apart from a link to one forum post here, that's it. Documentation? Instructions? Why would we need those? What could POSSIBLY go wrong?

Problem 2: Running shell scripts on windows doesn't work too well

The above shell script invokes syslinux, and just about everything else on the net that talks about creating bootable floppies/USB keys sooner or later invokes syslinux too. This sets up the boot record on the USB key so that you can boot Linux off it. DOS used to have a utility like this called 'sys' or somesuch, but I can't remember the details. Seems simple enough, except I NEED LINUX TO RUN IT. Actually no I don't... see above. Oops.

In my humble opinion, if I were running Linux already, I wouldn't need the LiveUSB; I'd just apt-get install gparted and run the damn thing. Yes, some travelling sysadmins might have a Linux box at home and also need a USB key to take around, but I'm not one of them. The entire reason I'm trying to get this LiveUSB to run is that I DON'T have Linux.

So, I read that forum post, and noticed at the bottom someone using loadlin to load linux from a DOS system. Aha!

Step 2: A whole crapload of google searching and researching...

As I can't make my USB key Linux-bootable without Linux, I need to make it DOS-bootable, then get loadlin to load the Linux kernel that comes with the GParted LiveUSB. I'm going to skip all the boring details, as it took me frickin' ages, and just explain what to do...

Step 2.1: Download a DOS bootdisk so we have DOS

Go to http://www.bootdisk.com/bootdisk.htm and download the "Windows 98 SE Custom, No Ramdrive" boot disk. This gets you an executable which expects to write to your floppy drive... except I don't have a floppy drive. BAH.

Step 2.2: Extract the DOS bootdisk image with WinImage

  • Go to http://www.winimage.com/download.htm. I went for "winima80.zip" as I just wanted to run it once without the installer guff.
  • Run WinImage. Do File->Open, and point it at the boot98sc.exe file you downloaded in step 2.1.
  • Once this is open, choose Image->Extract, and dump all the DOS system files somewhere.

Step 2.3: Make your thumbdrive bootable

  • Go to http://h18000.www1.hp.com/support/files/serveroptions/us/download/20306.html and download the HP Drive Key format utility. As far as I can tell this is the easiest way to make your USB key bootable. It works with pretty much everything, not just HP keys.
  • Make sure your USB key is plugged in
  • Run the HP program, and format your USB key using FAT (FAT32 should work too, but I didn't try it). Make sure to select "Create a DOS startup disk", and in the "using DOS system files located at:" box, enter the directory where you dumped the DOS system files from WinImage earlier.
  • Hit start, and wait for it to finish. JUST IN CASE YOU FORGOT, THIS WILL ERASE ALL THE FILES ON YOUR USB KEY, SO BACK THEM UP FIRST, K

Step 2.4: Get loadlin

  • Go to http://distro.ibiblio.org/pub/linux/distributions/startcom/DL-3.0.0/os/i386/dosutils/ and download "loadlin.exe" to somewhere on your PC.

Step 2.5: Copy files onto your USB key

  • Unzip "gparted-liveusb-0.3.1-1.zip" if you haven't already, and copy all the files into the root of your USB key. Your USB key should now contain those files, COMMAND.COM, IO.SYS, MSDOS.SYS and nothing else. No directories etc.
  • Also copy loadlin.exe into the root of your USB key

Step 2.6: Make loadlin run automatically

    Note: This is like the approach in the forum post here, except it actually works. I think that one's out of date.
  • In the root of your USB key, create a new file called "loadlin.par"
  • Open it with notepad or something, and put this in it:
    linux noapic initrd=initrd.gz root=/dev/ram0 init=/linuxrc ramdisk_size=65000
    (For those interested, those are the kernel boot parameters, which I stole out of syslinux.cfg from the GParted LiveUSB distro. If that file changes, so should your loadlin parameters.)
  • In the root of your USB key, create a new file called "autoexec.bat"
  • Open it with notepad or something, and put this in it:
    loadlin.exe @loadlin.par

Step 3: GO GO GO

Reboot your computer! If you've set up your BIOS properly to boot off USB keys, your computer should now boot the GParted LiveUSB. HOORAYZ!!!!1111

Step 4: cry

That's as far as I got, because the version of X.org on the LiveUSB doesn't seem to like my NVIDIA 7600GT, so I'm stuck at a command prompt. Those of you with other graphics cards should be fine, however. Whether the live distro includes command-line partitioning tools I dunno; I might go look at that now.

If anyone would like to copy or distribute these instructions, or edit and redistribute copies, you are free to do so, as I am putting this particular blog post in the public domain under the Creative Commons public domain license.

    Sunday, March 04, 2007

    Rails 1.2 changes

    Formats and respond_to

    In earlier versions of rails you could have your actions behave differently depending on what content type the web browser was expecting – eg:

    respond_to do |format|

        format.xml{ render :xml=>image.to_xml }

        format.jpg{ self.image }

    end

    However, to make this work you needed to set the HTTP Accept header in the web request, which is hard to do outside of tests. A new default route has now been added:

    map.connect ':controller/:action/:id.:format'

    The additional format parameter lets you override the format, so you can now load people/12345.xml or image/12345.jpg in your web browser to test what happens, instead of mucking about with HTTP headers.

    Note you still have to register MIME types for the formats you need – for format.jpg I had to put

    Mime::Type.register 'image/jpeg', :jpg

    in my environment.rb, as jpg is not registered by default.

    Named Routes

    map.index '/', :controller=>'home', :action=>'index'

    map.home '/:action/:id', :controller=>'home'

    These create a bunch of helper methods which you can use anywhere you'd supply a URL or parameters for a redirect – eg:

    def first_action

        redirect_to index_url # redirects to /

    end



    def second_action

        redirect_to home_url( :action=>'second' ) # redirects to /second

        # which is the 'home' controller.

    end





    <%= link_to 'home', index_url %>

    <%= link_to 'test', home_path( :action=>'test' ) %>

     

    The difference between foo_url and foo_path is that foo_url gives the entire URL, eg http://www.site.com/people/12345, whereas foo_path just gives /people/12345.

    This gives your code lots more meaning and makes it shorter. A definite win for commonly used URLs.

    Resources

    CRUD means Create, Read, Update, Delete.

    These map to the four HTTP methods – POST, GET, PUT, DELETE.

    HTTP methods let you have shortcuts, so instead of /people/create you can just do an HTTP POST to /people. Also, /people/show/1 maps to GET /people/1, and so on.

    Routes are created differently – for the above it is

    map.resources :people

    Run script/generate scaffold_resource people to have a look
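
    Roughly speaking, that one line wires up the standard CRUD mapping. This is a sketch from memory rather than the generator's exact output, so treat the details (especially the ;edit style member route) as approximate:

    map.resources :people
    # GET    /people         => PeopleController#index
    # GET    /people/new     => PeopleController#new
    # POST   /people         => PeopleController#create
    # GET    /people/1       => PeopleController#show
    # GET    /people/1;edit  => PeopleController#edit
    # PUT    /people/1       => PeopleController#update
    # DELETE /people/1       => PeopleController#destroy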

    NOTE: Rails expects resources in both routes.rb and the controller names to be named in the plural - eg:

    www.example.com/people/1 instead of www.example.com/person/1

    Philosophy

    Basically they are encouraging you to write your controllers and app so that everything revolves around either a create, read, update, or delete of some resource.

    Contrived Example: User Login sessions:

    Old way – Revolves around action:

    User Logs in – post a form to /users/login – this sticks a 'Login Token' of some sort in the session to identify them.

    User does stuff – look up the session and link it back – might put an is_logged_in? method on your user model or something.

    User Logs out – posts a form to /users/logout – this removes the token from the session.

    New way – Revolves around resources

    Identify what the 'resource' is – in this case it's the Login Token.

    User Logs in – Create a LoginToken by POSTing a form to /LoginTokens – stick its id in the session or something

    User does stuff – Find the correct LoginToken based on its id, check it's valid, etc.

    User Logs out – Delete the LoginToken by DELETEing /LoginTokens/1
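
    As a rough sketch of the resource version (my own illustration, with LoginToken and User.authenticate as hypothetical names), the controller might look something like:

    class LoginTokensController < ApplicationController
        # POST /login_tokens ("log in")
        def create
            user  = User.authenticate( params[:username], params[:password] )   # hypothetical
            token = LoginToken.create( :user => user )
            session[:login_token_id] = token.id
            redirect_to index_url
        end

        # DELETE /login_tokens/1 ("log out")
        def destroy
            LoginToken.find( params[:id] ).destroy
            session[:login_token_id] = nil
            redirect_to index_url
        end
    end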

    Conflict of interest?

    This LoginToken is behaving a lot like a model, even though right now it only exists as a controller. In fact you should create a model for it. The LoginTokensController should then only be a lightweight wrapper around this model. This is a definite win if you can structure your app like this, because it separates the different areas of code out.

    In the old way we had the users controller handling login, logout, and whatever else it needed to do – probably about half a dozen other unrelated things. This gets you messy code which is hard to understand/modify. By moving each part out to its own controller we end up with several separate nice clean controllers instead of one big messy one – easier to maintain and to see who's responsible for what. Very important!

    REST APIs

    Many blog entries talk about how you can get an externally accessible API 'for free' by extending these CRUD controllers. The standard example is something like:

    Now that your users are accessible via GET,POST,etc to /users/1, we can extend that controller using respond_to so that you can also query it for an XML or JSON representation of the user – this can then be used by other websites/apps for free! Hooray!

    This is nice in theory but not so nice in practice. Why?

    If you are in a webapp, doing a POST to create a new LoginToken will result in a redirect_to home_url or something like that. However, for an external API you're meant to return an HTTP response code of 201 Created to indicate the create was successful. Trying to jam these two behaviours into the same controller is a mess.

    This does not mean the REST idea is bad, only that you need to think a bit more.

    If you've done the right thing and created a LoginToken model, then you can just create two lightweight controllers – one which fits into your web app, and another, if you like, which serves XML/JSON.
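
    For example (again just a sketch with hypothetical names, not code from the post), the API-flavoured controller might do something like:

    class ApiLoginTokensController < ApplicationController
        # POST /api_login_tokens.xml
        def create
            token = LoginToken.create( params[:login_token] )
            # an API client wants a 201 Created and the new resource, not a redirect
            render :xml => token.to_xml, :status => 201
        end
    end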

    You still get the major benefit, which is that by thinking of stuff as resources, you get a much better design/structure for your app.

    ActiveResource

    If you have an external XML REST API, these resources end up looking a lot like data that you might want to load, update, store, etc. – like a kind of remote database.

    They therefore decided to make something called ActiveResource, which does for XML REST resources what ActiveRecord does for databases (in a limited fashion).

    For Example:

    class RemotePerson < ActiveResource::Base

        set_site 'http://localhost/rest_people' ## the base URI

        set_credentials :username => "some_user", :password => "abcdefg" ## for HTTP basic authentication

        set_object_name 'person' ## the other end will expect data called 'person' not 'RemotePerson'

    end

    You can then do

    RemotePerson.find( 1 )

    This will fire an HTTP request at http://localhost/rest_people/1. It will load the resulting XML and convert it into an object. You can change its data and call save, etc., like you would with a piece of data from the database. When you have two sites that need to communicate with each other, this makes things a WHOLE lot easier.
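
    In other words, something like this (sketched against my limited implementation, so the exact URLs and behaviour may differ):

    person = RemotePerson.find( 1 )    # GET http://localhost/rest_people/1
    person.name = 'New name'           # assumes the remote XML had a 'name' field
    person.save                        # sends the updated XML back to the remote site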

    The bad news – this isn't in Rails 1.2. They pulled it out in one of the beta versions, and there's no indication as to when it's coming back.

    The good news – I wrote one (a limited version thereof, anyway) to replace what didn't ship with Rails.

    to_xml, from_xml, to_json, etc

    For all these ActiveResource things to work, they need an easy way to convert data to and from XML so it can be sent over the HTTP request. There are now new methods - to_xml, from_xml, to_json, and other stuff like that - which will convert objects to and from XML or JSON. These have been added to Hash, ActiveRecord, and other things like that.
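
    A quick example of the Hash versions (the exact XML layout is from memory, so treat it as approximate):

    { :name => 'Bob', :age => 30 }.to_xml
    # => something like "<hash><name>Bob</name><age type="integer">30</age></hash>"

    Hash.from_xml( '<person><name>Bob</name></person>' )
    # => { "person" => { "name" => "Bob" } }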

    Multibyte Support

    Rails 1.2 hacks the string class so it is now sort of Unicode (UTF-8) aware.

    TextHelper#truncate/excerpt and String#at/from/to/first/last will now automatically work with Unicode strings, but for everything else you need to call .chars or you will break the string.

    In other words, if you need to deal with non-ASCII characters (and even in the US you probably will), String#length is broken, and so is string[x..y], or just about anything else you'd want to do! Beware!
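
    For example (assuming a UTF-8 string, with Rails 1.2's multibyte support doing the work behind .chars):

    s = "résumé"
    s.length          # => 8, because it counts bytes, not characters
    s.chars.length    # => 6, the multibyte-aware answer
    s.chars[0..2]     # => "rés", the first three actual characters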

    I have no idea how this is going to impact storing strings in mysql etc.

    Tons of other bits and pieces

    Lots of things have now been deprecated - stuff like referencing @params instead of params, etc. These all get dumped to the logs as deprecation warnings; they still work for now but will break in Rails 2.0.

    image_tag without a file extension is also deprecated. You should always specify one.

     

    See here:

    http://www.railtie.net/articles/2006/09/05/rush-to-rails-1-2-adds-tons-of-new-features