Array of Structures vs Structure of Arrays
2013-08-14 10:54
309 查看
1. 概念
Array of structures 结果是一个array,类型是一个structure! Structureof Arrays结果是一个struture,类型是arrays!2. 原文
Array of Structures vs Structure of ArraysI was trying out a test today on arranging vertex data as an Array of Structures or as a Structure of Arrays:
The array of structure format is the way most applications lay out data.
e.g. A triangle with vertices A, B, C, color and normals can be represented as:
Array of Structures (AOS)
struct Triangle { float A[3]; float B[3]; float C[3]; float RGBA[4]; float Normal[3]; } // one float is four bytes, so this is 64 byes per trianglebut in the Structure of Arrays (SOA) format it would look like:
struct Triangles { float *Ax; float *Ay; float *Az; float *RGB; float* Normal; }
So if you look at the memory layout in RAM, the AOS layout would have all the triangles together, whereas in the SOA layout you would have all the vertex data together in
RAM. So in theory, the SOA layout would be better performing because when you access the vertex data, you'd get more cache hits.
Let us consider the simple operation of computing the normal of the triangle. This involves two vector subtracts (AB = B-A and AC = C-A) and one cross product of AB cross AC. The number of adds, subtracts and multiplies are the same on AOS and SOA, its just the layout that changes.
So lets do some theoretical analysis before we write some code and test (always test the final thing, theory is nice but you need ground truth).
I have an Apple Macbook Pro 15" with an Intel Core Duo processor running at 2 GHZ. According to the Intel product brief it comes with a 32 KB instruction cache, a 32 KB data cache, a 2 MB L2 cache and then 1 GB of RAM.
Assuming a cache line size of 64 bytes (it changes from processor to processor), the AOS algorithm would access B[0] first, which is a cache miss, causing the processor to fetch in the entire Triangle of 64 bytes into the L1 cache. The rest of the operations would access data in the L1 cache, which is fast. Then, it goes on to the next triangle which is 64 bytes beyond the current one and hence in a different cache line, which is again a cache miss. So in theory in the AOS format we'd be having cache misses on every triangle the first time you read vertex B. This is assuming the compiler doesn't put in speculative pre-fetching for the next triangle.
On the other hand the SOA layout would incur a cache miss for the first triangle when it reads the vertex data Ax, Ay, Az, Bx, By etc. However, since a float is four bytes and the cache line model we use is 64 bytes, each cache line would have 16 floats and hence 16 triangles worth of data. This means that for the first triangle you get a cache miss and for the other 15 you'd get cache hits. So in theory the SOA layout should be a lot faster than the AOS layout.
So writing some code to perform the normal computation for some triangles I discovered the following:
Number of Triangles: | 1000 | 10000 | 100000 |
Array of Structures Time (secs): | 0.00001 | 0.00016 | 0.00476 |
Structure of Arrays Time (secs): | 0.00003 | 0.00020 | 0.00257 |
原文连接:http://hectorgon.blogspot.com/2006/08/array-of-structures-vs-structure-of.html
这篇文章的需求是:要计算三角形法向量。AOS结构中从A、B、C数组取数据计算结果存与normal数组中;SOA中从Ax、Ay、Az数组中读取数据存入Normal数组中。
结论:当数据大小适合数据缓存时,AOS要更快;反之则AOS要慢! 结合上面表格数据,AOS中每个三角形64bytes,L2缓存正好能存放32768个三角形,当三角形个数小于等于32768阈值时,AOS要明显快于SOA。当数据超过32768时,SOA要快!
参考:
1. http://coolshell.cn/articles/10249.html相关文章推荐
- Aggregrate:Initialization of structures and arrays in C++
- Performance of STL vector vs plain C arrays
- 【LeetNode2-1-5】Median of two sorted arrays--C++源代码(VS2015)
- LEETCODE--Intersection of Two Arrays II
- Median of Two Sorted Arrays
- Median of Two Sorted Arrays
- Median of Two Sorted Arrays
- 【leetcode75】Intersection of Two Arrays(数组的交集)
- [LeetCode]-algorithms-Median of Two Sorted Arrays
- [LeetCode] 4. Median of Two Sorted Arrays
- [LeetCode] Median of Two Sorted Arrays 两个排序数组的中位数
- LeetCode-349 Intersection of Two Arrays
- LeetCode 4. Median of Two Sorted Arrays
- java io读书笔记(6) Writing Arrays of Bytes
- War of Emacs Vs Vi
- Substring with Concatenation of All Words & Median of Two Sorted Arrays & Divide Two Integers
- The Future of Real-Time SLAM and “Deep Learning vs SLAM”
- 【一天一道LeetCode】#4 Median of Two Sorted Arrays
- LeetCode349 Intersection of Two Arrays
- leetcode Median of Two Sorted Arrays