Progress - past, present and future

34 Best Execution Guide to Low Latency 2010
LOWER LATENCY TRADING
time market includes Red Hat’s specialised real-time,
messaging and grid offering dubbed MRG, Novell’s
SUSE Linux Enterprise Real Time option (“SLERT”)
and Sun Solaris, the legacy OS for market data
made contemporary through Solaris 10 and its
specialist features, containers and DTrace, for
tuning. For the brave-hearted, the open source
market for Linux and Unix provides limitless
depth of innovation, subject to the vagaries
of entering the race track with parts from the
neighbourhood enthusiast.
Networking – the connectivity between domains in
the architecture is crucial. Pressure on traditional
Ethernet in its 1 Gigabit form opened the door in 2007
to innovation: specialist, highly engineered
devices and communication products based on
InfiniBand for the financial services industry (FSI),
led by Voltaire on a Mellanox base, accelerate
messages by bypassing bottlenecks in the OS kernel
and traditional network layers.
Now, in late 2009, we see the Ethernet alternative,
10GbE technology, mature, delivered by specialised
switch vendors like Voltaire and Arista combined
with optimised 10GbE network cards for enhanced
performance. Here the NIC market centres on
Mellanox ConnectX, Chelsio TOE, Solarflare
OpenOnload and Intel’s NetEffect iWARP technologies,
which are moving the domain from niche innovation
to mainstream industry standard. Lab-based tests
are delivering the results at this leading edge.
The best performance on Niantic 10GbE (a “standard”
NIC) is around 10-12 microseconds for a one-way
hop (application to application data transfer).
The same packet workload tested on NetEffect
takes 5.5 microseconds. So a roughly 2x performance
improvement is achieved from a hardware
swap-out, which may either give a vital ‘arb’ing’
(arbitrage) opportunity or compensate for
performance issues elsewhere. This speed-up comes
from the way NetEffect talks to the application,
a fundamental change in approach: hardware
features on the NetEffect card let traffic bypass
the kernel and the kernel network stack.
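
The arithmetic behind the "2x" figure can be checked with a short sketch. It uses only the lab figures quoted above (10-12 microseconds for the standard NIC, 5.5 microseconds for NetEffect); the function and variable names are illustrative and not from any vendor tool.

```python
# One-way hop latencies quoted in the text, in microseconds.
STANDARD_NIC_US = (10.0, 12.0)   # best-case range on the "standard" 10GbE NIC
BYPASS_NIC_US = 5.5              # same workload on the kernel-bypass NIC


def speedup(baseline_us: float, accelerated_us: float) -> float:
    """Return how many times faster the accelerated path is."""
    return baseline_us / accelerated_us


# Speed-up at each end of the quoted baseline range.
low = speedup(STANDARD_NIC_US[0], BYPASS_NIC_US)
high = speedup(STANDARD_NIC_US[1], BYPASS_NIC_US)

# Absolute time saved per one-way hop, in microseconds.
saved_us = tuple(base - BYPASS_NIC_US for base in STANDARD_NIC_US)

print(f"speed-up: {low:.2f}x to {high:.2f}x")
print(f"saved per hop: {saved_us[0]} to {saved_us[1]} microseconds")
```

On these figures the improvement sits on either side of 2x depending on which end of the baseline range is used, so "2x" is best read as a round number.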
Acceleration has been a market buzzword for some
time, and trading has provided the opening for
the engineering shop to meet the commercial
market. At the leading edge of statistical arbitrage,
the law of the fastest cat in the game reserve
applies: the first to pounce on a vulnerable price
out in the market wins. Here accelerator devices
using GPGPU (general-purpose computing on graphics
processing units) and FPGA (field-programmable
gate array) technology apply. Feed handling and
market data have been the focus for this innovation.
The considered thinking at this stage favours
heterogeneous architecture designs, which offload
cleansing and feed handling activity around the
pipe to these fast environments. They sit alongside
mainstream CPUs handling maths, analytics,
trade routing and so forth, thereby spreading
the load and using appropriate tools for the job
in hand.
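
As a rough illustration of that division of labour, the sketch below splits the work into two stages joined by a queue. It is a software-only stand-in under stated assumptions: the feed-handling stage runs as an ordinary thread where a real design would use an FPGA or GPGPU, and the "symbol,price" message format, function names and sample ticks are all hypothetical.

```python
import queue
import threading

SENTINEL = None  # marks the end of the feed


def feed_handler(raw_ticks, out_q):
    """Stand-in for the offloaded stage: parse and cleanse raw feed lines."""
    for line in raw_ticks:
        parts = line.strip().split(",")
        if len(parts) != 2:
            continue  # drop malformed messages
        symbol, price_text = parts
        try:
            price = float(price_text)
        except ValueError:
            continue  # drop non-numeric prices
        if price <= 0:
            continue  # drop impossible prices
        out_q.put((symbol, price))
    out_q.put(SENTINEL)


def analytics(in_q, results):
    """Stand-in for the CPU stage: keep the last clean price per symbol."""
    while True:
        item = in_q.get()
        if item is SENTINEL:
            break
        symbol, price = item
        results[symbol] = price


def run_pipeline(raw_ticks):
    """Run both stages concurrently and return the analytics results."""
    q = queue.Queue()
    results = {}
    producer = threading.Thread(target=feed_handler, args=(raw_ticks, q))
    consumer = threading.Thread(target=analytics, args=(q, results))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return results


# Hypothetical sample feed with two bad messages mixed in.
latest = run_pipeline(["VOD,1.40", "garbage", "VOD,1.41", "BP,-3", "BP,5.80"])
print(latest)
```

The point of the split is that the analytics stage only ever sees clean, parsed ticks, so the mainstream CPU spends no cycles on cleansing.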
It is at the leading edge of lowest latency trading
that hardware has rediscovered its value-add
contribution to solution architectures, from the
micro-architecture design and features of the chip
to the slot and carcass capability of blade frames
housing a complete solution environment in a
few square feet. Intel’s Core2 micro-architecture
has seen unprecedented take-up in trading
since 2007, as its alternating tick-tock engineering
upgrade schedule delivers a performance
enhancement every year and with it a direct and
immediate contribution to the horsepower that
drives the solution stack above. The latest
generation, code-named Nehalem and now established
in production as the Xeon 5500 Series, has easily
delivered double-digit percentage gains in
performance for the overall solution, and hence
CPU and server upgrades have been a popular route
to boost performance. HP’s BladeSystems and DL
servers frame a complete range of options, all
optimised for this Intel architecture (“iA”).
The picture painted so far is one of complexity
and interaction between different parts of the
technology infrastructure. It is a window into the