GPU Computing on Macs – a user view
First version – March 2009
Updated March 25 – CUDA 2.1 for MacOS update
Updated April 3 – Comment on which 200 series cards support double precision (level 1.3)
Updated December 2009 (rolling) – New GPUs, Snow Leopard.
This web page is not intended to be definitive or authoritative, but Mac users
starting to get into GPU computation [as opposed to just relying on a GPU to
accelerate general system or graphics (games!) tasks] might find it helpful.
Double precision computation requires (under CUDA) a "Compute Capability" (CC) 1.3 or
higher GPU, e.g. some but not all 200 series NVIDIA cards, or a Tesla. At the last
update my own information is that the 295, 285 (including Mac edition), 280, 275 and 260 desktop GTX cards are OK,
as are the higher level Quadros (including the Mac edition 4800), but not the
desktop 250, the Mac GT120, nor the mobile 260M and 280M [factual correction
welcome]. More on this below. It is slightly awkward, as the CC level is
not routinely given in tech specs but has to be dug out of the appendix to the
CUDA programming guide, which is a little way behind the chips. If you have a
working installed card then running deviceQuery will yield the capability in the
first two lines of the output.
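Once you have read the capability off the deviceQuery output, the 1.3 threshold can be checked mechanically. The following is only a minimal sketch: cc_supports_double is a helper name I have made up (it is not part of the CUDA toolkit), and it expects you to type in the capability as a "major.minor" string rather than parsing deviceQuery itself.

```shell
# Sketch only: cc_supports_double is a hypothetical helper, not a CUDA tool.
# It takes a compute capability such as "1.1" or "1.3" and reports whether
# it meets the 1.3 level needed for double precision under CUDA.
cc_supports_double() {
    cc="$1"
    # Require the "major.minor" form.
    case "$cc" in
        *.*) ;;
        *) echo "unknown"; return 2 ;;
    esac
    major="${cc%%.*}"   # text before the first dot
    minor="${cc#*.}"    # text after the first dot
    # Both parts must be plain non-empty digit strings.
    case "$major" in ''|*[!0-9]*) echo "unknown"; return 2 ;; esac
    case "$minor" in ''|*[!0-9]*) echo "unknown"; return 2 ;; esac
    if [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 3 ]; }; then
        echo "yes"
    else
        echo "no"
    fi
}
```

For example, cc_supports_double 1.3 prints yes and cc_supports_double 1.1 prints no.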
For a useful collection of facts and speculation on the CUDA side, browse the Mac
CUDA forum here.
OpenCL
This offers considerable promise for the medium to long term. You MUST have upgraded to Snow Leopard
(OS X 10.6) and its development tools (Xcode). I recommend Snow Leopard 10.6.2 and
Xcode 3.2.1. For access to the Nvidia tools you must use a very recent version
of the GPU Development Kit (2.3, 3.0). There is now a rich set of examples from
Nvidia.
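If you are not sure which OS X version a machine is running, the Snow Leopard requirement can be checked from a terminal. This is a rough sketch under my own assumptions: os_at_least_snow_leopard is a made-up helper name, and the only system command used is sw_vers -productVersion, which prints the OS X version number. It says nothing about whether the GPU itself is supported.

```shell
# Sketch only: os_at_least_snow_leopard is a hypothetical helper name.
# It takes a version string such as "10.6.2" and reports whether it is
# 10.6 or later, the minimum stated above for OpenCL.
os_at_least_snow_leopard() {
    ver="$1"
    case "$ver" in
        *.*) ;;
        *) echo "unknown"; return 2 ;;
    esac
    major="${ver%%.*}"    # e.g. 10
    rest="${ver#*.}"      # e.g. 6.2
    minor="${rest%%.*}"   # e.g. 6
    case "$major" in ''|*[!0-9]*) echo "unknown"; return 2 ;; esac
    case "$minor" in ''|*[!0-9]*) echo "unknown"; return 2 ;; esac
    if [ "$major" -gt 10 ] || { [ "$major" -eq 10 ] && [ "$minor" -ge 6 ]; }; then
        echo "yes"
    else
        echo "no"
    fi
}

# On a Mac, feed in the real version; elsewhere this step is skipped.
if command -v sw_vers >/dev/null 2>&1; then
    os_at_least_snow_leopard "$(sw_vers -productVersion)"
fi
```

For example, os_at_least_snow_leopard 10.6.2 prints yes and os_at_least_snow_leopard 10.5.8 prints no.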
Some useful public links are: Khronos, and NVIDIA's SIGGRAPH demo of
the N-body example ported from CUDA (PDF, YouTube). I am not yet aware of the
status as regards OpenCL of the ATI Radeon 4870 card announced in March 09 for
the Mac Pro 08 and 09 models.
CUDA GPU computation
This is an increasingly straightforward thing to do under OS X on a Mac, and
requires only that a suitable NVIDIA GPU is present and that the relevant CUDA
tools have been downloaded and installed. I have most experience using 8800GT,
GTX 285 and Quadro 4800 cards on a 2008-model Mac Pro, but others have
documented the use of GPUs in other Macs. Here are some links verifying the
configurations shown (I will add to this as I see explicit comment):
Mac Pro (early 2008 model) with 8800GT (verified by me) running in isolation under
10.5.6 (and under Windows (Boot Camp) together with a 260GTX – see later).
MacBook Pro with 9400M and 9600M GT running under 10.5.6: Raymond Tay's very helpful
blog on Getting started with CUDA 2.0 (this is how I got going on my Mac Pro, though my own
installation is now very different).
NVIDIA's Apple page
lists several 2009 Macs as having NVIDIA GPUs, including the Mac mini (9400M), early
2009 iMac (9400M, GT120, GT130) as well as the MacBooks and 2009-model
(Nehalem) Mac Pro with the GT 120 chip. I have no reason to doubt CUDA works
well on these systems, but have not tried them myself and have not yet seen
explicit comment on the web. My current understanding is that none of these GPUs
is CC 1.3, so they are limited to single precision work. Note that the late 2009
iMac has a 9400M in the base model but ATI 46xx/48xx in the higher models, so
the higher models are candidates for OpenCL but not CUDA.
What to do for OpenCL 1.0 and CUDA 2.3/3.0 under OS X
Official Apple Mac Pro cards:
If you have a Mac Pro without an NVIDIA card you will need to install a Mac Edition
8800GT, GT120, 285GTX, Quadro 4800 or other suitable card.
In the case of the 8800GT note that there are two different cards. One is for the
2007 first gen Intel Mac Pro, and the other is for the 2008 model Intel Mac
Pro. These are no longer sold by Apple but you might find them on eBay. This
card does NOT do double precision.
I have only tried the 08 configuration. (For the 09 Mac Pro the GT120 is available as
an option but can also be found separately here.)
Note that it is necessary for an 8800GT to have the supplied auxiliary PCI-E 6-pin
power cable connected to the motherboard via the supplied connector – it
is not enough just to have the card in the slot. If you are already using both
motherboard connections you will need to find power elsewhere. You must use the
supplied Apple power connector, which is not a "PC standard" form. These
cables can be found independently.
The Quadro 4800 is a double precision card, and like the older 8800GT only needs
one PCI power connector. The GTX 285 is also a double precision card, and it
needs TWO 6-pin PCI power connectors.
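The connector counts above make it easy to see when a combination of cards outgrows the two motherboard connections. Here is a small worked sketch of that arithmetic. The helper names sixpin_needed and power_check and the short card labels are my own inventions for illustration; the counts are just the ones quoted in the text, and any other card is reported as unknown rather than guessed.

```shell
# Sketch only: helper names and card labels are hypothetical.
MOBO_SIXPIN=2   # "both motherboard connections" mentioned in the text

# Number of 6-pin connectors each official Mac card needs, as stated above.
sixpin_needed() {
    case "$1" in
        8800GT)     echo 1 ;;
        Quadro4800) echo 1 ;;
        GTX285)     echo 2 ;;
        *)          echo "unknown card: $1" >&2; return 1 ;;
    esac
}

# Add up the connectors for the listed cards and compare with what the
# motherboard offers.
power_check() {
    total=0
    for card in "$@"; do
        n=$(sixpin_needed "$card") || return 1
        total=$((total + n))
    done
    if [ "$total" -le "$MOBO_SIXPIN" ]; then
        echo "ok: $total of $MOBO_SIXPIN connectors used"
    else
        echo "extra power needed: $total connectors wanted, $MOBO_SIXPIN available"
    fi
}
```

For example, power_check 8800GT GTX285 prints "extra power needed: 3 connectors wanted, 2 available", which is the situation where you have to find power elsewhere.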
Loading PC GPU cards into a Mac Pro
This can be done in various ways. I strongly recommend you read the threads on
forums.macrumors.com for details on how to do it. You will need to sort out
power supply issues (route extra power from an optical bay or an external ATX PSU, as
you would also do if you were loading official Mac cards needing more than two
6-pin connectors). There are then two routes. The first is to put a bigger ROM chip
on the card and flash it with the combination ROM from an official Mac card. I
am not endorsing this route but I know from the web that it can be done. The
second, gentler route works if you have one official Nvidia Mac card and wish to
load a second PC card. Go to
and
get the post-boot injector. This will activate many PC cards (I have got it
working with a Palit 285 and a Zotac 260), but read the forums before proceeding. I have heard of problems with the 275.
Non-Mac Pro configurations
I have no up-to-date information about which GPUs in which Macs do what. I know
several people happily using Nvidia GPUs in MacBook Pros. I am not aware of any mobile
configurations allowing double precision work.
Software stage, probably the same for all GPU-equipped Macs: OpenCL 1.0 (Snow Leopard
only), CUDA 2.3 and 3.0
Go to http://www.nvidia.com/object/cuda_get.html
and choose Mac OS X from the
drop down menu. At the time of writing 2.3 is available (watch out for 3.0).
Follow the instructions and note which driver to get depending on your
GPU/OS.
You will first want to try out the supplied examples. To do this, open a Terminal
(or xterm) window and do the following:
1. Enter:
cd /Developer
2. Enter:
cd GPU*
3. For the CUDA side, enter:
cd C
4. Enter:
make
5. Wait for the make script to run.
6. Enter:
cd bin/darwin/release
7. In that directory (release might be somewhere else but that is where I found it)
you will find a great many fun programs that are almost ready to run. However,
there are some path settings to make first (they seem to be necessary up to 2.3, but not under pre-release
3.0):
export PATH=/usr/local/cuda/bin:$PATH
export DYLD_LIBRARY_PATH=/usr/local/cuda/lib:$DYLD_LIBRARY_PATH
8. You can now try out the examples. I like to start with
./deviceQuery
which will show the card status; then you can try the others, e.g.
./oceanFFT
./nbody
For the OpenCL side you need to go instead to /Developer/GPU Computing/OpenCL
and run make; the resulting binaries are again under a bin/darwin/release
subdirectory. Try out ./oclNbody
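If you get tired of retyping the path settings from step 7, they can be wrapped in a small script. This is a sketch under my own assumptions rather than anything supplied by NVIDIA: prepend_dir is a made-up helper, CUDA_HOME is a variable name I have introduced (defaulting to the /usr/local/cuda location used above), and the SDK path is simply the directory the steps above end up in, which may differ on your installation.

```shell
# Sketch only: prepend_dir and CUDA_HOME are hypothetical names.
# prepend_dir VAR DIR puts DIR at the front of the colon-separated
# variable VAR, unless DIR is already listed there.
prepend_dir() {
    _var="$1"
    _dir="$2"
    eval "_cur=\${$_var:-}"
    case ":$_cur:" in
        *":$_dir:"*) ;;                      # already present, nothing to do
        *)
            if [ -n "$_cur" ]; then
                eval "$_var=\$_dir:\$_cur"
            else
                eval "$_var=\$_dir"
            fi
            export "$_var"
            ;;
    esac
}

# Only touch the environment if the CUDA tools are actually installed.
CUDA_HOME="${CUDA_HOME:-/usr/local/cuda}"
if [ -d "$CUDA_HOME/bin" ]; then
    prepend_dir PATH "$CUDA_HOME/bin"
    prepend_dir DYLD_LIBRARY_PATH "$CUDA_HOME/lib"
fi

# Run deviceQuery if the SDK examples have been built where I found them.
SDK_BIN="/Developer/GPU Computing/C/bin/darwin/release"
if [ -x "$SDK_BIN/deviceQuery" ]; then
    "$SDK_BIN/deviceQuery"
fi
```

Source the file from your shell (rather than executing it) if you want the PATH changes to persist for the rest of the session. Checking for an existing entry keeps the variables from growing each time you do so.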
The results of running ./deviceQuery and ./nbody are shown in this OS X screen shot.
This was done under 2.0. The results of running nbody and oclNbody together are
shown on the grid home page.
Under 3.0 the environment settings seem not to be needed. On some installations
the 2.3 driver seems to unload itself. You can rerun the install routine prior
to a work session, or fiddle with permission settings as described on
forums.nvidia.com in the Mac section. This seems to have been fixed in 3.0.
William Shaw, King's College London