Take the following example, which is fairly typical:

`CGAffineTransform transform = CGAffineTransformMakeTranslation(0, -translation);`

`transform = CGAffineTransformScale(transform, scaleFactor, scaleFactor);`

`view.transform = transform;`

Not bad, right? In just three lines of code, we're able to both scale and translate a view or layer. But in reality, there are quite a few operations going on behind these three lines of code. The *CGAffineTransformScale()* function calls *CGAffineTransformConcat()* to perform a matrix multiplication between two affine matrices. But, as you probably know, you can't multiply a 2x3 matrix by another 2x3 matrix. To multiply affine transformations, they first have to be converted back to full 3x3 matrices.
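To make that concrete, here is a minimal sketch of a naive concatenation in plain C. The `Affine` struct and both function names are hypothetical stand-ins rather than Core Graphics API, and this is not Apple's actual implementation; it only illustrates the expand-to-3x3-and-multiply approach described above.

```c
/* Hypothetical stand-in for CGAffineTransform: the same six fields. */
typedef struct {
    double a, b, c, d, tx, ty;
} Affine;

/* Expand the six stored values into the full 3x3 matrix
 *   [ a   b   0 ]
 *   [ c   d   0 ]
 *   [ tx  ty  1 ]
 */
static void affine_to_3x3(Affine t, double m[3][3])
{
    m[0][0] = t.a;  m[0][1] = t.b;  m[0][2] = 0.0;
    m[1][0] = t.c;  m[1][1] = t.d;  m[1][2] = 0.0;
    m[2][0] = t.tx; m[2][1] = t.ty; m[2][2] = 1.0;
}

/* Naive concatenation: returns t1 * t2 using a full 3x3 product. */
static Affine affine_concat_naive(Affine t1, Affine t2)
{
    double m1[3][3], m2[3][3], r[3][3];
    affine_to_3x3(t1, m1);
    affine_to_3x3(t2, m2);

    for (int i = 0; i < 3; i++) {
        for (int j = 0; j < 3; j++) {
            r[i][j] = 0.0;
            for (int k = 0; k < 3; k++) {
                r[i][j] += m1[i][k] * m2[k][j];
            }
        }
    }

    /* Collapse back to the six values an affine transform stores. */
    Affine out = { r[0][0], r[0][1], r[1][0], r[1][1], r[2][0], r[2][1] };
    return out;
}
```

When one operand is a pure scale and the other a pure translation, most of the multiplications in that loop are by 0 or 1, which is exactly the work the shortcut avoids.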

On today's devices (even mobile devices), this all takes a trivial amount of processing power. But sometimes, when you're doing a lot of these transformations (say, thousands or tens of thousands a second), it can be valuable to avoid that conversion and matrix multiplication.

It just so happens that with certain commonly used CGAffineTransforms, you can cheat. Certain matrices can be joined together without performing matrix multiplication. For example, here are the matrices created by CGAffineTransformMakeScale() and CGAffineTransformMakeTranslation(), respectively:
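Assuming Core Graphics' row-vector layout, in which a transform stores *a*, *b*, *c*, *d*, *tx*, *ty* and the third column is fixed at 0, 0, 1, the scale matrix $S$ and the translation matrix $T$ can be written as:

$$
S = \begin{bmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & 1 \end{bmatrix},
\qquad
T = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ t_x & t_y & 1 \end{bmatrix}
$$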

Go ahead and multiply those two together. Plug in any number for *tx*, *ty*, *sx*, and *sy* and run the numbers. I'll wait. Okay, you don't have to. This is what you'll get:
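Assuming the same row-vector convention, and with the scale matrix on the left (the order used by the `CGAffineTransformScale(transform, sx, sy)` call in the example, which applies the scale first), the product works out to:

$$
\begin{bmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ t_x & t_y & 1 \end{bmatrix}
=
\begin{bmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ t_x & t_y & 1 \end{bmatrix}
$$

The order matters: with the translation on the left instead, the bottom row would become $s_x t_x,\ s_y t_y,\ 1$, so the values cannot simply be dropped in unchanged.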

So, if that's the result we're going to get, why bother going through the matrix multiplication in the first place? Why not just populate the matrix with both the scale and translate values right from the get-go? Well, we can. We can also do the same thing with translate and rotate.
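For the rotate case, a sketch under the same assumptions (row-vector convention, rotation by an angle $\theta$ applied before the translation) gives:

$$
\begin{bmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ t_x & t_y & 1 \end{bmatrix}
=
\begin{bmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ t_x & t_y & 1 \end{bmatrix}
$$

Again the translation values land in the bottom row untouched, so the combined matrix can be filled in directly.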

This is all there is to it:

`static inline CGAffineTransform`

static inline CGAffineTransform
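As a hedged illustration of what such helpers could look like, here is a minimal sketch in plain C. It uses a hypothetical `Affine` struct in place of `CGAffineTransform` so that it compiles without Core Graphics, and the function names are illustrative; the rotate-translate helper in particular is an assumption based on the mention of combining rotate and translate. With Core Graphics available, the equivalent would presumably pass the same six values to `CGAffineTransformMake()`.

```c
#include <math.h>

/* Hypothetical stand-in for CGAffineTransform: the same six fields. */
typedef struct {
    double a, b, c, d, tx, ty;
} Affine;

/* Scale by (sx, sy), then translate by (dx, dy), with no matrix multiply:
 * the six values are written straight into place. */
static inline Affine affine_make_scale_translate(double sx, double sy,
                                                 double dx, double dy)
{
    Affine t = { sx, 0.0, 0.0, sy, dx, dy };
    return t;
}

/* Rotate by `angle` (radians), then translate by (dx, dy).
 * Assumes the row-vector layout [cos, sin, -sin, cos, tx, ty]. */
static inline Affine affine_make_rotate_translate(double angle,
                                                  double dx, double dy)
{
    double cs = cos(angle);
    double sn = sin(angle);
    Affine t = { cs, sn, -sn, cs, dx, dy };
    return t;
}
```

For example, `affine_make_scale_translate(2.0, 2.0, 0.0, -10.0)` fills in *a* = *d* = 2 and *ty* = -10 directly, with no multiplication at all.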

That's it. It only saves you two lines of code:

` view.transform = CGAffineTransformMakeScaleTranslate(scaleFactor, scaleFactor, 0, -translation);`

But your stack allocation is considerably smaller (one CGAffineTransform instead of two CGAffineTransforms and an intermediate 3x3 array). It also saves you eighteen floating point multiplications and nine floating point additions. 99.9% of the time, that number of operations is going to have no noticeable effect on your application; it's a trivial amount of both memory and FLOPS under most normal situations.

But… if you're doing a lot per second, they can add up and it's nice to know there's a way that you can save yourself a little overhead in some situations.

## 7 comments:

Awesome! I'm going to try using this in my pan/pinch & zoom app. Question: why the "static inline"? CGAffineTransformTranslate() uses "extern".

CGAffineTransformTranslate() is a function that lives in a framework. Jeff's goodies are inline functions that get expanded into your code. If you look at the expansion of NS_INLINE in the Cocoa headers, one of the ways it's expressed is static inline.

You are saving some instructions but not as much as you think. The sub-expressions in the matrix multiplication involving zero and one can be eliminated or reduced in the underlying implementation (and Apple does do this). CGAffineTransformScale does not call CGAffineTransformConcat and CGAffineTransformConcat does not need a 3x3 intermediate.

I have the feeling that Jeff's recent blog posts lack the deep analysis and detailed guidance needed to help users understand the material.

There is no introduction to what a CGAffineTransform is or how homogeneous coordinates are formed to make the transformation matrix. Nor is there any sample code or an image showing what the results of the transformation would be.

I can't think of an example where this kind of micro optimisation could lead to an actual performance increase. Why optimise what is probably already one of the fastest parts of your code?

Did you actually use this specific optimisation somewhere?

cjwl:

That could very well be true. I didn't profile down to the instruction level, but it does definitely save calls.

Hoang:

I have already posted long introductory-level posts on CGAffineTransform and matrix transformations. Given that I'm at my desk about 18 hours a day right now, this is all that I can manage. If it's not enough, I'm sure there are other blogs you can read.

Jakob:

I said right in the post that this would only be an issue in rare cases. Yes, I can think of such a rare case, and yes I've used it in production code. I wrote these macros after profiling an existing app I'm working on. Unfortunately, it's under NDA, so I can't show code, but in my situation, I was having to do quite literally tens of thousands of these a second.
