You are currently browsing the monthly archive for January 2012.

While I was feverishly triaging my last-place fantasy basketball team this past weekend, f2bbooks eclipsed 500 hits for January, with no help from us. Must be a record for Business Math blogs. Let’s check in with our nemesis Business Math blog, Mrs. Hooker’s Blog at edublogs – Just another Wicomico County Public Schools Edublog. Sure, f2bbooks has a clear posting lead, but anything can happen in this hypercompetitive social networking world. As long as Mrs. Hooker doesn’t catch on to FPGAs and the credit derivative pricing angle, everything is going to be copacetic in the page views competition.

Tilera started sampling the latest Tile 64-bit processors with 16 and 36 cores at 1.2 GHz, here.

HPC wire: The Year Ahead in HPC – they are going long GPUs (NVIDIA Kepler) and Intel Knights Corner.

Tao and Gowers re: Reed Elsevier journals pricing protest, here

Bunch of JP Morgan Maxeler stuff follows:

Workshop on High Performance Computational Finance at SC10, here.

Rapid computation of value and risk for derivatives portfolios

  1. Stephen Weston
  2. James Spooner
  3. Sébastien Racanière
  4. Oskar Mencer

here.

Practical Quant: Maxeler JP Morgan, here

Maxeler in Ptown at the Equad 31 Jan:  Maximum Performance Computing for Exascale Applications here.

Money Science: here.

Xcell Journal: full text of Rapid computation of value and risk for derivatives portfolios, via Xilinx, here. Wow, this was published in the first quarter of 2011 and matches in many ways the contents of the JPM Stanford talk linked to earlier. I can think of a bunch of follow-up questions. Let’s pull them into a single post after we pick through these references.

Technology in Banking slides, here. Maxeler derivative pricing.

Backups, here, possibly the best thing you will see for a month.

Harvard on Choice Management, here. “The view I had when I started this study, and that I think a lot of economists have today, is that if you just make information salient, if you explain fees, people will understand what’s in their own self-interest and act accordingly,” Laibson said. “I’m no longer a believer in that story. My belief now is that if you give people bad options, even if you explain the characteristics that make them bad, many people will still choose those options.” That explains a lot about some of the code I have seen. No, I am not going to eat after I’ve seen what I’ve seen.

HPCWire – Why Fortran Matters, here. Not that the article says anything wrong, but it does not really explain this reasonably subtle point at all. The old programming languages prediction, “I don’t know what the programming language of the future will be, but I know that it will be called FORTRAN,” partially addresses the issue. It would be nice to read an argument that negotiates this point sensibly, but it would be hard to write. I’m indifferent whether people use FORTRAN now, but it is easier to have a reality-based conversation with them if they understand the issue. If only we could have templates in FORTRAN, that’s the ticket.

Horrifyingly, mathbabe indirectly points out I may be running a Business Math blog, here. Actually, that reminds me of the cash register metric Boorstin used to measure and characterize developments in American history. These trading floors and financial supercomputer installations are just bigger and fancier cash registers that highlight the growth and evolving commercial organization of the United States. Discuss amongst yourselves.

Take a look again at the Flynn case for Dataflow in his Maxeler video, here. I think the argument playing out in the video is predicated on the observation that Moore’s Law ceased delivering extra clock cycles in about 2002 for big commercial microprocessors. Let’s accept that as true, even though I think if I go check, David Patterson will date that “no more clock cycles for you” crossover to something more like 2006. In the absence of any subsequent breakthroughs in Instruction Level Parallelism harvesting by compilers, Dataflow architecture then has, figuratively, a window of opportunity to demonstrate its advantages to the market unlike any previous opportunity that Arvind would have experienced in the 80s and 90s. OK?

The other argument that Flynn plays out is that compiler-assisted parallel coding is challenging and typically has not demonstrated scaling parallel speedups beyond 8 or so conventional cores. Again, fine; that fits with my experience from SGI compilers to now, and I have no issue. Parallel programming with pragmas is no way to write code for fun and profit.
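One way to see why speedups might stall around 8 cores is Amdahl's law. This is a textbook illustration, not Flynn's own argument, and the 90% parallel fraction below is an assumed number picked purely for the example.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: speedup when only part of the work parallelizes."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)

# Assumption: 90% of the runtime parallelizes perfectly (hypothetical figure).
for cores in (2, 4, 8, 16, 64):
    print(cores, round(amdahl_speedup(0.90, cores), 2))
```

With these assumed numbers, 8 cores gets you under 5x, and no core count gets you past 10x, because the serial 10% sets the ceiling.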

The jump to “so therefore dataflow is good for a bunch of general purpose mathematics” needs further justification; maybe it is, I don’t know. In particular, financial applications like closed form expression evaluation, discounting, lattices, and Monte Carlo for big portfolios of trades/positions all hit the memory hierarchy slightly differently. The vectorization opportunities also vary in degree between these financial mathematics applications. I’m not sure about this, but I think there were science experiments conducted somewhere uptown that showed you can teach dolphins Perl and Python and they can parallelize these position inventory calculations to speed; just don’t let them try to do load balancing because that will delay the project.

Perhaps Dataflow has some large performance advantage in 2012, but the costs of converting to FPGAs or waiting for the latest Xilinx FPGA parts in your supercomputer more than offset the advantage. Certainly you do not expect a massive popular dataflow movement where all your friends, holding copies of Dataflow for Dummies, flock about you to find out how to program dataflow on their phone apps. That’s not happening. Best case, you are going to get a cool dataflow programming result that will make you a hero in the dataflow community, but as a reward you will have to simply smile inwardly to yourself knowingly, because really no one else will ever know what you did or what you are talking about. What about debugging the production infrastructure and code, doesn’t that kill you inside a little bit every time you think about it? So, maybe it’s just me, but this Dataflow architecture performance advantage better be pretty big for each of these specific financial computations to account for all the obvious really bad stuff.

Taking Flynn’s argument at face value, I would guess there is a crossover time up to which Dataflow architecture can maintain a significant performance advantage over off-the-shelf architecture, assuming everything else stays the same. I would think the crossover time will be closer to the time we see volume production of 16-core microprocessor chips, 2015? But that assumes nothing else changes. Big assumption.

The thread we started to determine if it made sense to rewrite an optimized Credit derivative P&L and Risk batch (here) can be wrapped up. It is not worth the time to optimize the code. If you need fast Gaussian Copula computing, go make a deal with Maxeler; they have already got one, you see? If you need a fast contemporary credit derivative batch (CDS, CDX, ITRX), you can probably just optimize your conventional serial code and be competitive in 3-6 months. We have shown how to optimize the code in these posts, if you do not already know how to proceed. It would make sense to evaluate Maxeler and the dataflow architecture idea, but check your wallet if someone starts doing that “equivalent to 12,000 x86 core dance” for a given computation unless you are reasonably familiar with the code. We have no problem with the concept that Maxeler may be significantly faster, even for CDS valuation and risk, than a tightly optimized off-the-shelf code (MJ Flynn is a hitter with a record in computation circles), but we haven’t seen the argument yet (and we doubt it is a simple argument). Moreover, why does dataflow pay off architecturally now for Flynn when it did not pay off for Arvind for like 20 years ending in Monsoon?

From a Credit trading perspective, realtime or otherwise, there is nothing to see here, just move on.

From a macro risk perspective it is unambiguously a good thing to have a fast view of the aggregate risk across single name, index, and correlation inventory. Getting solid analytic numbers for an old piece of quant code is notoriously hard. No one with actual market background will check market convention assumptions, trade representations, or the valuations. If some enterprising company has a tied-out version of the old quant code, get it from them.

But there is more to this story. If the broker dealer cannot bring themselves to separate the correlation book from the credit book, arguing that the correlation book is managed to zero risk and standard reserves are held while the trades roll off, then in order to run second order risks like counterparty valuation adjustment, the fast correlation risk becomes important. Unless you get the correlation book running to speed, your second order risk Monte Carlo simulation batch performance will be dominated by the correlation book performance (assuming every other product book batch demonstrates competitive performance and better turnaround time compared to the correlation book). Hello, Maxeler.
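To make the "dominated by the correlation book" point concrete, here is a back-of-the-envelope sketch. The book names, per-scenario timings, scenario count, and the 100x acceleration are all hypothetical numbers chosen for illustration, not measurements from any real batch.

```python
def batch_seconds(per_scenario_seconds, scenarios):
    """Serial wall time for revaluing every book under every scenario."""
    return scenarios * sum(per_scenario_seconds.values())

# Hypothetical per-scenario revaluation times in seconds; not measured numbers.
books = {"rates": 0.5, "fx": 0.2, "cds": 1.0, "correlation": 60.0}
scenarios = 1000

total = batch_seconds(books, scenarios)
correlation_share = books["correlation"] / sum(books.values())

# Speed up only the correlation book by an assumed 100x.
accelerated = dict(books, correlation=books["correlation"] / 100.0)
total_accelerated = batch_seconds(accelerated, scenarios)

print(f"correlation share of batch: {correlation_share:.1%}")
print(f"batch: {total:.0f}s -> {total_accelerated:.0f}s")
```

Under these made-up timings, nearly all of the second order batch time sits in the correlation book, so accelerating that one book is what moves the turnaround time.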

Recipe for Disaster: The Formula That Killed Wall Street, Felix Salmon: Explores the history of David Li and the development and demise of the Gaussian Copula model, here.

The Economist defends the Gaussian Copula, here.

Zerohedge: Hour in the life of CDX trading in Europe, here. One of a kind piece and exceptionally worth your time to read.

Basic explanation of CDOs from RGE monitor, here. Note the age of these links; in credit this stuff has not traded in any sort of volume for many years. The CDO links are useful for getting context around the JPM credit batch performance. In Credit, learning Gaussian Copula in 2012 is kind of like learning how a record store works in 2012.

Markit 2008 Primer on Credit Indices, here.

Creditflux.com, here –  Creditflux is the leading information source globally for credit trading and investing, credit derivatives, structured credit, distressed credit and credit research, publishing a monthly newsletter, online daily news and a comprehensive database of CDOs and credit hedge funds.

Kurzweil is very motivated to stay alive; here is his website.

Video here. In this video, Peter Richards and Stephen Weston from J.P. Morgan present a Stanford Computer Systems Colloquium discussing their use of Maxeler FPGA accelerator technology. This is a straightforward and clear talk worth watching. The optimization stuff they talk about is sensible, and the war stories about conversion from templates to C indicate they are kind of in the right place from the perspective of optimization (the talk is a little light on optimizing compilers, perhaps not surprising given their target is FPGAs). At 23:00 the speaker confirms we are talking about P&L and risk in this batch, not CVA Monte Carlo. Several 100K positions, 1000s of CDO tranches, and some CDO-squared positions. The study quoted FPGAs for bespoke tranche valuation and risk, and GPUs for CDS valuation and risk. They use standard Gaussian Copula with stochastic recovery (see 41:32) for the correlation trades. Optimization boils down to the time required for fitting the correlation cube. Interesting that there are Monte Carlo and tree-based valuation models in flight and that JPM took a 20% ownership stake in Maxeler.
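For context on the kind of computation the correlation trades involve, here is a minimal one-factor Gaussian Copula Monte Carlo sketch for a tranche's expected loss. It is a toy and not the JPM/Maxeler model: it assumes a homogeneous portfolio, a single default probability over the horizon, and a fixed recovery rate (the talk mentions stochastic recovery). Every function name and parameter value here is made up for illustration.

```python
import random
from statistics import NormalDist

def tranche_expected_loss(n_names, default_prob, rho, recovery,
                          attach, detach, n_paths, seed=0):
    """Expected tranche loss as a fraction of tranche notional (toy model)."""
    rng = random.Random(seed)
    threshold = NormalDist().inv_cdf(default_prob)  # default barrier per name
    a, b = rho ** 0.5, (1.0 - rho) ** 0.5           # factor loadings
    total = 0.0
    for _ in range(n_paths):
        market = rng.gauss(0.0, 1.0)                # common factor
        defaults = sum(
            1 for _ in range(n_names)
            if a * market + b * rng.gauss(0.0, 1.0) < threshold
        )
        loss = defaults * (1.0 - recovery) / n_names  # portfolio loss fraction
        hit = min(max(loss - attach, 0.0), detach - attach)
        total += hit / (detach - attach)
    return total / n_paths

# Hypothetical inputs: 50 names, 5% default probability, 30% correlation.
equity = tranche_expected_loss(50, 0.05, 0.3, 0.4, 0.00, 0.03, 2000, seed=1)
mezz = tranche_expected_loss(50, 0.05, 0.3, 0.4, 0.03, 0.07, 2000, seed=1)
print(f"equity 0-3%: {equity:.3f}, mezzanine 3-7%: {mezz:.3f}")
```

Even this toy does a full portfolio simulation per tranche per path, which hints at why thousands of bespoke tranches plus risk bumps add up to a heavy batch.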

Delivering the next generation of supercomputing with reconfigurable hardware video O. Mencer, here.

League tables for CDO arrangers 2011, here. JPM did one CDO/CLO in Europe and one in the US in 2011, according to these folks at Leading Structured Finance News. I don’t believe the other structures are going to be handled in the credit batch. Maybe there is a lot of prop trading in standard tranches that needs realtime correlation risk; let’s look. Need to find global CDO issuance in 2010 and 2011, but here is 2005 through 2009. You can see why fast Gaussian Copula would have been great between 2005 and 2007.

Award-winning technology helps investment bank understand risk profile better

By | Computerworld UK | Published 11:01, 15 December 11

here

The supercomputer solution was developed with HPC solutions provider Maxeler Technologies, which is also providing JP Morgan with the new, upgraded supercomputer for additional areas of the business.

The new dataflow supercomputer – where the computer chips are tailored to perform specific, bespoke tasks – will be equivalent to more than 12,000 conventional x86 cores, providing 128 Teraflops of performance.

One Div Zero: tentative add to our heroes list – James Iry “If cars were built like software then…well, I don’t know squat about building cars so who knows. It might be kinda cool. But probably not.” A Brief and Incomplete and Mostly Wrong History of Programming Languages

Zerohedge on the CDS market trade volume distribution and a proposal for CDS indices to be exchange traded, here.

Kamakura Corporation: risk management tools, and Jarrow is involved somehow. Look at the research and blogs, here.

Cloud Computing gets reviewed by US DOE, Argonne, and Lawrence Berkeley and gets a grade of meh, here from Clusterstock.

Salmon on Udacity, here. Stanford AI professor starts up online university Udacity. Agreed, this looks like it could grow. An AI course at Stanford gets 100K worldwide enrollment? Wow. I would like to hear why the notion that Stanford, Harvard, Ptown, or Oxford should brand this is an obviously bad idea.

HPCWire: Russell Fish @ Venray Technologies has an embedded microprocessor in DRAM play, here. Problem is, apart from Mortgages, I doubt much Street P&L/Risk analytics is intrinsically memory bandwidth starved as opposed to processing starved. Super good at creating pipeline bubbles though.

Wall Street Journal, here: Maxeler Makes Waves with Dataflow Design

HPCWire, here: J.P. Morgan Deploys Maxeler Dataflow Supercomputer for Fixed Income Trading

Peter Cherasia, Head of Markets Strategies at J.P. Morgan, commented: “With the new Maxeler technology, J.P. Morgan’s trading businesses can now compute orders of magnitude more quickly, making it possible to improve our understanding and control of the profile of our complex trading risk.”

insideHPC, here: JP Morgan Fires Up Maxeler FPGA Super

Compute Scotland, here: Near realtime: JP Morgan & Maxeler

Asymco analysis, here, highlights recent trends in computing. It looks increasingly as if you need to adapt to what the commercial market is giving you, even in floating point. I suspect that Joe and Suzy Sixpack will decide how you get your fp cycles. GPUs start to look more attractive in this light, right?

Oh and Terry Tao on Black Scholes, here

“Sandy Bridge-EP” Xeon E5 processors and their related “Romley” server platforms, are now in volume shipment, here

Overclocking insurance for Sandy Bridge from Intel, here

Whoa, more to think about now: Maxeler says Intel’s Knights Ferry simplicity might not suit HPC, here, at The Inquirer. The article by Lawrence Latif has a subtitle that reads “More effort yields better performance”! I think I like the Inquirer; it looks like a fancy version of the old Microprocessor Reports.

Check out the Thalesian Seminar: Stein from Bloomberg talking about CVA at the NY Public Library, 31 Jan, 6pm

FPGAs in HFT Thalesian Seminar Slides from eigen.systems

Intel Xeon E5-2690 Sandy Bridge-EP Performance Revealed, Tom’s Hardware, here

MJ Flynn: Accelerating computation with FPGAs, slides, and @Berkeley video. Nice talk; the audio starts to break up around 31:00 though.

Maxeler Technologies, home page

JP Morgan FPGA Careers factsheet. So, that JP Morgan credit derivative batch @300K CDS and 30K credit curves is going to run in 2 milliseconds (1000x the cheapest MacPro at the Apple Store), but I might only get 200x in practice, so it will be between 2 and 10 milliseconds on the 40-node hybrid FPGA machine. Do I get all Virtex6 chips or is that hybrid as well?
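The 2 to 10 millisecond range is just the arithmetic on the two speedup figures. A quick sketch, where the function names are made up and the 2 second baseline is only what the quoted 2 ms at 1000x implies, not a measured MacPro runtime:

```python
def implied_baseline_seconds(accelerated_seconds, claimed_speedup):
    """Back out the unaccelerated runtime implied by a claimed speedup."""
    return accelerated_seconds * claimed_speedup

def runtime_ms(baseline_seconds, speedup):
    """Accelerated runtime in milliseconds at a given speedup."""
    return 1000.0 * baseline_seconds / speedup

# 2 ms at a claimed 1000x implies roughly a 2 second baseline (an inference).
baseline = implied_baseline_seconds(0.002, 1000)
for speedup in (1000, 200):
    print(f"{speedup}x -> {runtime_ms(baseline, speedup):.1f} ms")
```

So 1000x gives the headline 2 ms and 200x gives 10 ms, which brackets the range.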

Xilinx Virtex-7 FPGA family looks to be 2X faster and 50% lower power than Virtex-6.
