I've been back-testing a particular type of RUT butterfly. It's working beautifully. It's incurred no losses over the first ten months I've tested it.

Why am I skeptical that I'll be able to reproduce that kind of result? Why should you be skeptical of those who report their back-tested results to you? There's a world of difference between back-tested results and real-life trading.

Let's show one example of how back-tested results might not accurately reflect real-life results. When the RUT was at 712.19 on 10/14, I set up a paper-traded 5/10/5 RUT NOV11 butterfly at the 760/710/660 strikes. As you may know, butterflies that are set up near the money tend to be negative delta, and this one was no exception. It had a delta of -34.77. Some traders elect to buy a long call to even out the deltas a bit, and I did that in this back test, buying a NOV11 770 call. The total position cost with commissions of $1.25 per contract was $6,852.00.
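For readers who like to check the arithmetic, here is a minimal sketch of how the position's legs and the commission portion of that cost could be tallied. It assumes the all-put construction mentioned later in this article and a flat $1.25 charged on every contract in every leg. The `Leg` class and the helper names are purely illustrative and don't come from any trading platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Leg:
    qty: int       # signed contract count: positive = long, negative = short
    right: str     # "P" for put, "C" for call
    strike: float


# 5/10/5 NOV11 put butterfly at 760/710/660 plus one long NOV11 770 call.
POSITION = [
    Leg(+5, "P", 760.0),
    Leg(-10, "P", 710.0),
    Leg(+5, "P", 660.0),
    Leg(+1, "C", 770.0),
]

COMMISSION_PER_CONTRACT = 1.25  # dollars per contract, as stated in the article


def contracts_traded(legs):
    """Total contracts across all legs, ignoring direction."""
    return sum(abs(leg.qty) for leg in legs)


def commission(legs, rate=COMMISSION_PER_CONTRACT):
    """Commission for opening every leg once (assumes a flat per-contract rate)."""
    return contracts_traded(legs) * rate


print(contracts_traded(POSITION), commission(POSITION))
```

Under those assumptions the position involves 21 contracts, so $26.25 of the total cost would be commissions. A broker with ticket charges or minimums would produce a different figure.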

The first uncertainties are whether I would have been able to get a fill five minutes before the close that day and whether I would have been able to get a fill at the mark or mid-price. Some traders build in automatic slippage when they're back-testing. I elected not to build in that slippage in this case because there's going to be a bigger issue to face as this trade goes along than a few cents of slippage on entry. However, one trading friend counts on $100 of slippage whenever she gets into or out of a 3- or 4-contract RUT butterfly. She may or may not incur that slippage, depending on market conditions, but that's what she builds in.
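One simple way to build in slippage of the kind my friend assumes is to charge a fixed dollar amount for every fill. The sketch below is only an illustration. The $500 back-tested profit is a made-up number, not a result from this trade, and the function name is hypothetical.

```python
def pnl_after_slippage(backtest_pnl, fills, slippage_per_fill=100.0):
    """Reduce a back-tested P&L by a fixed dollar slippage charge per fill.

    `fills` counts every order that has to be executed: the entry, the exit,
    and each adjustment order in between.
    """
    if fills < 0:
        raise ValueError("fills must be non-negative")
    return backtest_pnl - fills * slippage_per_fill


HYPOTHETICAL_BACKTEST_PNL = 500.0  # illustrative only, not from the article

# Entry plus exit, no adjustments.
print(pnl_after_slippage(HYPOTHETICAL_BACKTEST_PNL, fills=2))
# Entry, one single-order adjustment, and exit.
print(pnl_after_slippage(HYPOTHETICAL_BACKTEST_PNL, fills=3))
```

Her $100 figure applies to a 3- or 4-contract butterfly, so a 5-contract position might reasonably call for a larger allowance. The point is only that each extra order is another chance for slippage to eat into the back-tested number.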

Think-or-Swim has a "thinkBack" application that allows for some limited back-testing. I use another platform for my back-testing, but it's a proprietary one not yet available to the general public. For this article, then, I'm setting up the butterfly on thinkBack, although I haven't figured out how to show you the Greeks or the characteristic tent shape of the butterfly there. I was following both on the proprietary site. I can, however, show you the trades on thinkBack.

Back-Test of Butterfly Trade Initiated on 10/14, as shown on thinkBack:

The added long call brought the deltas up by 20.97, so the position wasn't as delta-negative as the butterfly alone would have been. As it turns out, this trade would have been fine if left alone and not adjusted, with the profit locked in on the last trading day to avoid expiration-day settlement risk. However, those in the trade wouldn't have known that as they watched it day by day. We don't have the benefit of hindsight when we're in an actual trade. By early on October 27, it looked as if the trade was in trouble, and many traders in a similar trade would have adjusted. That day, the RUT moved above the upside expiration breakeven of 746.27. The long call was protecting the trade somewhat, but some traders choose to adjust at the expiration breakeven, and for those traders it was time to adjust. In fact, the RUT was to climb all that day.
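The delta arithmetic and the breakeven trigger are easy to sketch. This assumes the -34.77 figure quoted earlier is the delta of the butterfly alone, before the call was added. The 750.00 price in the example is hypothetical, since the exact RUT level on October 27 isn't given here.

```python
BUTTERFLY_DELTA = -34.77             # butterfly alone (assumed), from the article
CALL_DELTA_ADDED = 20.97             # contribution of the long NOV11 770 call
UPPER_EXPIRATION_BREAKEVEN = 746.27  # upside expiration breakeven, from the article


def net_delta(*deltas):
    """Sum position deltas, rounded to the two decimals the platform displays."""
    return round(sum(deltas), 2)


def breached_upper_breakeven(price, breakeven=UPPER_EXPIRATION_BREAKEVEN):
    """True when the underlying trades above the upside expiration breakeven."""
    return price > breakeven


print(net_delta(BUTTERFLY_DELTA, CALL_DELTA_ADDED))
print(breached_upper_breakeven(712.19))  # RUT level when the trade was opened
print(breached_upper_breakeven(750.00))  # hypothetical level above the breakeven
```

On that reading, the hedged position starts with a delta of about -13.80. The breakeven rule would not have fired at the 712.19 entry level and would fire at any price above 746.27.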

Although I'm far from an expert butterfly trader, I do know that the butterfly can be a flexible trade, although not a fail-proof one. Traders in this situation could have pulled from an arsenal of adjustments that included rolling up five 710/760 verticals, thereby transforming the butterfly into an iron condor; adding an additional butterfly centered at the expiration BE; or even lifting up the original butterfly and replacing it. These are just a few of the many choices. Each has pros and cons. For back-testing purposes, the first two may produce the most reliable results. Keep in mind that I'm referring to how well the back-tested results could be replicated in live trading, not to how well one adjustment performs compared with another.

Why do I say that the first two adjustments may produce the most reliable back-testing results? Both of those procedures can sometimes be accomplished in a single trade. Lifting up the original butterfly and replacing it requires at least two trades, with more opportunities for slippage, especially in a fast market like that of October 27. Just this week, as I was editing this article, I was aware of two active butterfly traders who were taking the third tack to adjust their DEC and JAN butterflies--selling the old butterfly and buying a new, recentered one--and were unable to complete the adjustments on the days they began making them. One added the new butterfly first. There was so much slippage on the price of the old one that he ended up keeping both into the close that day. This experienced butterfly trader thought he could probably get a better fill the next day, and he did, but a less-experienced or less-lucky trader might have found that he got an even worse fill the next day. The other sold the old butterfly first and then wasn't able to buy a new one, so she ended up with none going into the close on the adjustment day. Her trade was already profitable, however, and she elected to wait to see what transpired the next day before deciding whether to replace the butterfly. Neither of their experiences would have been replicated in a back test, where all I have to do is enter the appropriate paper trade.

In contrast, placing an additional butterfly centered at the expiration breakeven is a trade that's relatively easy in most markets although not certain to be filled in a fast market. I usually place my second butterflies ahead of the current price, but that's a judgment call to be made by the trader at the time the adjustment is made. Since the original trade was an all-put butterfly, some traders might elect to place the second butterfly as an all-call one. This avoids the problems with some platforms when one leg of the new butterfly "steps on" a leg of the old one. For example, although I don't believe it's the place where I would have centered a new butterfly in this case, let's imagine that I'd chosen to center it at 760, and that I wanted to have another all-put butterfly.

Adding the Second Butterfly:

[Note: I haven't figured out how to make the current strategy Greeks show up correctly on thinkBack. The deltas for the original butterfly and the single call did not update when I moved thinkBack forward to October 27. My other graphing program indicates that the adjustment would result in a total strategy delta of -52.35.]

When we examine the bottom order in relation to the first butterfly, we see that the -10 NOV11 760 puts we're attempting to sell would be stepping on the +5 NOV11 760 puts that were part of the first butterfly. Similarly, the +5 NOV11 710 puts we would be trying to buy as part of the second butterfly would be stepping on the -10 NOV11 710 puts we have already sold as part of the first butterfly.
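A back-testing script could flag this kind of conflict before assuming a fill. The following is a rough sketch, with legs written as hypothetical `(quantity, right, strike)` tuples. It reports every new leg that trades against a leg already held at the same strike and option type, and it compares the all-put version of the second butterfly with the same strikes placed in calls.

```python
def conflicting_legs(existing, new):
    """Find new legs that would trade against an existing leg.

    Each leg is (qty, right, strike) with qty signed: positive = long.
    Returns (right, strike, held_qty, new_qty) for every leg of `new` whose
    direction is opposite to a leg already held at the same right and strike.
    """
    held = {(right, strike): qty for qty, right, strike in existing}
    conflicts = []
    for qty, right, strike in new:
        held_qty = held.get((right, strike), 0)
        if held_qty * qty < 0:  # opposite signs: this order closes part of a leg
            conflicts.append((right, strike, held_qty, qty))
    return conflicts


# Original position: 760/710/660 put butterfly plus the long 770 call.
EXISTING = [(5, "P", 760), (-10, "P", 710), (5, "P", 660), (1, "C", 770)]

# Second butterfly centered at 760, placed as all puts or as all calls.
NEW_PUT_FLY = [(5, "P", 810), (-10, "P", 760), (5, "P", 710)]
NEW_CALL_FLY = [(5, "C", 710), (-10, "C", 760), (5, "C", 810)]

print(conflicting_legs(EXISTING, NEW_PUT_FLY))
print(conflicting_legs(EXISTING, NEW_CALL_FLY))
```

The put version flags the 760 and 710 puts, while the call version flags nothing. Whether a given platform actually rejects the flagged order is a separate question that a check like this can't answer.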

Some platforms do not allow a single trade that opens some options and closes others. Therefore, an order to open this new put butterfly would be rejected on some platforms. While I could enter it in my back test by just clicking a few buttons, that doesn't mean I'd get it executed in real life. Traders used to be able to execute such a trade on TOS, and experienced traders will recognize that what I've done by ostensibly adding a second put butterfly is to roll out the put credit spread and "condorize" the original butterfly. However, since the integration with TD Ameritrade, I'm not certain this order would still be executed. Some of my trades that formerly would have been allowed are now being rejected as "prohibited," sometimes due to glitches in the integration. I haven't tried this particular trade since the integration because I didn't have any butterflies open when I was first writing this article, but I would not consider it a given that I could continue to execute this trade in this manner. I know it wouldn't execute at OX and some other trading platforms without being forced through by the trading desk. If I had been forced to break up this trade into its components to execute it, you can bet the execution prices would have been much different from those shown above, and probably much different in a bad way. Back-tested and live results wouldn't match. At all.

Executing the second butterfly as a call butterfly would avoid that problem. The condorized shape of the trade would be almost identical, and the Greeks would measure nearly the same, too. Again, though, placing it as a 710/760/810 call butterfly would not guarantee execution, much less execution at the mark.
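To see why the shape is so similar, here is a small sketch that compares the two versions at expiration only. It looks at intrinsic value and ignores the premiums paid, the long 770 call, and everything that happens before expiration, which is where prices and Greeks can differ slightly. The tuple layout and the function name are illustrative, and the 100 multiplier is the standard one for index options.

```python
MULTIPLIER = 100  # dollars per index point per contract


def expiration_value(legs, settle):
    """Intrinsic value of a set of legs at a given settlement price.

    Each leg is (qty, right, strike) with qty signed: positive = long.
    Premiums paid or received are deliberately left out.
    """
    total = 0.0
    for qty, right, strike in legs:
        if right == "P":
            intrinsic = max(strike - settle, 0.0)
        else:
            intrinsic = max(settle - strike, 0.0)
        total += qty * intrinsic
    return total * MULTIPLIER


ORIGINAL_FLY = [(5, "P", 760.0), (-10, "P", 710.0), (5, "P", 660.0)]
SECOND_AS_PUTS = [(5, "P", 810.0), (-10, "P", 760.0), (5, "P", 710.0)]
SECOND_AS_CALLS = [(5, "C", 710.0), (-10, "C", 760.0), (5, "C", 810.0)]

settles = range(600, 901, 5)
with_puts = [expiration_value(ORIGINAL_FLY + SECOND_AS_PUTS, s) for s in settles]
with_calls = [expiration_value(ORIGINAL_FLY + SECOND_AS_CALLS, s) for s in settles]

print(with_puts == with_calls)
print({expiration_value(ORIGINAL_FLY + SECOND_AS_CALLS, s) for s in range(710, 761, 5)})
```

Under these assumptions the two combinations have the same value at every settlement price tested, and the value is flat between 710 and 760, which is the condor-like plateau. None of this says anything about whether either order would fill, or at what price.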

I don't believe those are the strikes I would have chosen in this case, unless I strongly believed that the RUT's rally was overdone and the RUT was ready to roll over. However, this choice, too, if someone had elected this adjustment under those circumstances, would have resulted in a highly profitable trade if no other adjustments were made.

This discussion was meant to pinpoint some of the snafus that might occur when trying to extrapolate back-tested results to live trading. When back-testing, we sometimes just plug in the contracts we want in the amounts we want, without regard to how we would actually execute that trade. Does that mean that we shouldn't back-test or that it's not a valuable process? Of course we should back-test or paper trade simulated trades to follow through the success or failure of various efforts. That helps us weed out strategies that are never going to work.

Next week, we'll look at some of the other adjustments we might have considered and the difficulties we might have encountered trying to translate back-tested results to live ones.