Unity3D Performance Analysis (Profiling)

Ports that the Unity profiler uses:

MulticastPort: 54998
ListenPorts: 55000 - 55511
Multicast (unit tests): 55512 - 56023

These ports should be accessible from within the network node. That is, the devices you are trying to profile should be able to see these ports on the machine running the Unity Editor with the Profiler on.

First steps

Unity relies on the CPU (heavily optimized for the SIMD part of it, like SSE on x86 or NEON on ARM) for skinning, batching, physics, user scripts, particles, etc.


The GPU is used for shaders, drawcalls, image effects.


CPU or GPU bound

Use the internal profiler to measure the CPU and GPU time in milliseconds.

Pareto analysis

A large majority of problems (80%) are produced by a few key causes (20%).

Use the Editor profiler to get the most problematic function calls and optimize them first. 
Make sure the scripts run only when necessary.
Use OnBecameVisible/OnBecameInvisible to disable inactive objects.
Use coroutines if you don't need some scripts to run every frame. 

// Do some stuff every frame:
void Update () {
}

// Do some stuff every 0.2 seconds:
IEnumerator Start () {
    while (true) {
        yield return new WaitForSeconds (0.2f);
    }
}
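
The OnBecameVisible/OnBecameInvisible advice above could look roughly like the following sketch. The class name PauseWhenHidden and the field expensiveBehaviour are hypothetical names chosen for illustration. Unity only sends these callbacks to scripts on a GameObject that also has a Renderer, and in the Editor the Scene view camera counts as a camera too.

```csharp
using UnityEngine;

// Hypothetical component: switches a costly sibling script off while
// no camera can see this object's renderer, and back on when one can.
public class PauseWhenHidden : MonoBehaviour
{
    // Assumed to be assigned in the Inspector: any expensive script
    // whose Update does not need to run while the object is off screen.
    public MonoBehaviour expensiveBehaviour;

    // Called by Unity when the renderer becomes visible to any camera.
    void OnBecameVisible()
    {
        expensiveBehaviour.enabled = true;
    }

    // Called by Unity when the renderer is no longer visible to any camera.
    void OnBecameInvisible()
    {
        expensiveBehaviour.enabled = false;
    }
}
```

The toggled script is kept separate from the one receiving the callbacks, so the visibility handlers never depend on the enabled state of the behaviour they control.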

Use the .NET System.Threading.Thread class to move heavy calculations to another thread. This allows you to run on multiple cores, but the Unity API is not thread-safe. So buffer the inputs and results, and read and assign them on the main thread.
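
A minimal sketch of that buffering pattern, using only plain .NET types, might look like this. The class BackgroundSum and its methods are hypothetical, and the summation is just a stand-in for real heavy work. The worker never touches the Unity API: it receives a copy of its input and leaves its result in a locked queue for the main thread to collect.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical helper: runs a heavy calculation on a worker thread and
// buffers the result so the main thread can pick it up later.
public class BackgroundSum
{
    private readonly object gate = new object();
    private readonly Queue<long> results = new Queue<long>();

    // Call from the main thread. The input is copied so the worker
    // owns its data and the caller can keep modifying the original.
    public Thread Start(int[] input)
    {
        int[] copy = (int[])input.Clone();
        Thread worker = new Thread(() =>
        {
            long sum = 0;
            foreach (int value in copy)
            {
                sum += value; // stand-in for the heavy calculation
            }
            lock (gate)
            {
                results.Enqueue(sum);
            }
        });
        worker.IsBackground = true; // do not keep the process alive on exit
        worker.Start();
        return worker;
    }

    // Call from the main thread (for example inside Update) to fetch
    // a finished result without blocking on the worker.
    public bool TryGetResult(out long result)
    {
        lock (gate)
        {
            if (results.Count > 0)
            {
                result = results.Dequeue();
                return true;
            }
        }
        result = 0;
        return false;
    }
}
```

In a MonoBehaviour, Update would poll TryGetResult each frame and only then assign the value to Unity objects.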

CPU Profiling

Profile user code

Not all of the user code is shown in the Profiler, but you can wrap the code you care about in Profiler.BeginSample and Profiler.EndSample to make it appear there.
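
As a rough illustration, a custom sample could be marked up as below. PathfindingAgent and RecalculatePath are hypothetical names. The using directive for UnityEngine.Profiling applies to Unity 5.5 and later; in older versions, including the 3.x releases this text refers to, the Profiler class sits directly in the UnityEngine namespace and that directive should be dropped.

```csharp
using UnityEngine;
using UnityEngine.Profiling; // Unity 5.5+; omit on older versions

// Hypothetical script showing a named custom sample in the Profiler.
public class PathfindingAgent : MonoBehaviour
{
    void Update()
    {
        // Everything between BeginSample and EndSample is reported
        // under this label in the CPU view of the Profiler.
        Profiler.BeginSample("PathfindingAgent.RecalculatePath");
        RecalculatePath();
        Profiler.EndSample();
    }

    // Placeholder for the costly work being measured.
    void RecalculatePath()
    {
    }
}
```

Every BeginSample needs a matching EndSample on the same code path, so early returns inside the sampled region should be avoided.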

GPU Profiling

The Unity Editor profiler cannot show GPU data as of now. We're working with hardware manufacturers to make it happen with the Tegra devices being the first to appear in the Editor profiler.



Tools for iOS

Unity internal profiler (not the Editor profiler). This shows the GPU time for the whole scene. 
PowerVR PVRUniSCo shader analyzer. See below.
iOS: Xcode OpenGL ES Driver Instruments can show only high-level info:
'Device Utilization %' - GPU time spent on rendering in total. >95% means the app is GPU bound. 
'Renderer Utilization %' - GPU time spent drawing pixels. 
'Tiler Utilization %' - GPU time spent processing vertices. 
'Split count' - the number of frame splits, where the vertex data didn't fit into allocated buffers. 

PowerVR is a tile-based deferred renderer, so it is impossible to get GPU timings per draw call. However, you can get GPU times for the whole scene using Unity's built-in profiler (the one that prints results to the Xcode output). Apple's tools can currently only tell you how busy the GPU and its parts are, but do not give times in milliseconds.

PVRUniSCo gives cycles for the whole shader, and approximate cycles for each line in the shader code. It runs on Windows and Mac. It won't exactly match what Apple's drivers are doing, but it is still a good ballpark measure.



Tools for Android

Adreno (Qualcomm)
PVRTune, PVRUniSCo (PowerVR)
On Tegra, NVIDIA provides excellent performance tools which do everything you want: GPU time per draw call, cycles per shader, Force 2x2 texture, Null view rectangle. They run on Windows, OSX and Linux. PerfHUD ES does not easily work with consumer devices; you need the development board from NVIDIA.

Qualcomm provides the excellent Adreno Profiler, which is Windows only but works with consumer devices. It features timeline graphs, frame capture, frame debug, API calls, a shader analyzer and live editing.


Graphics related CPU profiling

The internal profiler gives a good overview per module:

time spent in OpenGL ES API
batching efficiency
skinning, animations, particles

Memory

There is Unity memory and Mono memory.


Mono memory

Mono memory handles script objects and wrappers for Unity objects (game objects, assets, components, etc). The garbage collector cleans up when an allocation does not fit in the available memory or on a System.GC.Collect() call.

Memory is allocated in heap blocks. More can be allocated if the data does not fit into the allocated block. Heap blocks are kept by Mono until the app is closed. In other words, Mono does not release any memory it has used back to the OS (Unity 3.x). Once you allocate a certain amount of memory, it is reserved for Mono and not available to the OS. Even when you release it, it becomes available internally to Mono only and not to the OS. The heap memory value in the Profiler will only increase, never decrease.

If the system cannot fit new data into the allocated heap block, Mono runs a garbage collection and can allocate a new heap block (for example, due to fragmentation).


'Too many heap sections' means you've run out of Mono memory (because of fragmentation or heavy usage).


Use System.GC.GetTotalMemory to get the total used Mono memory.
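
A small sketch of how System.GC.GetTotalMemory could be used to watch managed memory is shown below. The class MonoMemoryProbe and its methods are hypothetical helpers, and the figures the runtime reports are approximations, so the measured difference is only a rough guide.

```csharp
using System;

// Hypothetical helper around System.GC.GetTotalMemory.
public static class MonoMemoryProbe
{
    // Bytes the managed heap currently reports as in use.
    // Passing false avoids forcing a collection just to measure.
    public static long UsedBytes()
    {
        return GC.GetTotalMemory(false);
    }

    // Approximate change in used managed memory caused by running
    // the action. A collection is forced first for a cleaner baseline,
    // which itself causes a hiccup, so use this for diagnostics only.
    public static long MeasureAllocation(Action action)
    {
        long before = GC.GetTotalMemory(true);
        action();
        long after = GC.GetTotalMemory(false);
        return after - before;
    }
}
```

Logging UsedBytes at scene transitions is one way to see whether managed memory keeps climbing between levels.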


The general advice is, use as small an allocation as possible.


Unity memory

Unity memory handles Asset data (Textures, Meshes, Audio, Animation, etc), Game objects, Engine internals (Rendering, Particles, Physics, etc). Use Profiler.usedHeapSize to get the total used Unity memory.


Memory map

No tools yet but you can use the following.

Unity Profiler - not perfect, skips stuff, but you can get an overview. It works on the device! 
Internal profiler
Shows Used heap and allocated heap - see mono memory. 
Shows the number of mono allocations per frame. 

Xcode tools - iOS
Xcode Instruments Activity Monitor - Real Memory column.
Xcode Instruments Allocations - net allocations for created and living objects.
VM Tracker
textures usually get allocated with IOKit label. 
meshes usually go into VM Allocate. 

Make your own tool
FindObjectsOfTypeAll (type : Type) : Object[]
FindObjectsOfType (type : Type): Object[]
GetRuntimeMemorySize (o : Object) : int
Profiler.BeginSample/EndSample - profile your own code
UnloadUnusedAssets () : AsyncOperation

References to the loaded objects - There is no way to figure this out. A workaround is to 'Find references in scene' for public variables. 

Memory hiccups

Garbage collector
This fires when the system cannot fit new data into the allocated heap block. 
Don't use OnGUI on mobiles
It fires several times per frame.
It completely redraws the view. 
It creates tons of memory allocation calls that require Garbage Collection to be invoked. 

Creating/removing too many objects too quickly?
This may lead to fragmentation. 
Use the Editor profiler to track the memory activity. 
The internal profiler can be used to track the mono memory activity. 

System.GC.Collect(): you can call this .NET function at moments when a hiccup is acceptable.

New memory allocations
Allocation hiccups
Use lists of preallocated, reusable class instances to implement your own memory management scheme.
Don't make huge allocations per frame; cache and preallocate instead.

Problems with fragmentation?
Preallocate the memory pool.
Keep a List of inactive GameObjects and reuse them instead of Instantiating and Destroying them. 
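
The preallocation and reuse advice above can be sketched as a small generic pool. SimplePool and its members are hypothetical names, and the sketch uses plain C# classes rather than GameObjects so that it stays independent of the Unity version.

```csharp
using System.Collections.Generic;

// Hypothetical minimal object pool: instances are created up front
// and recycled instead of being allocated and discarded repeatedly.
public class SimplePool<T> where T : class, new()
{
    private readonly Stack<T> inactive = new Stack<T>();

    // Preallocate the requested number of instances in one go,
    // ideally during loading rather than during gameplay.
    public SimplePool(int preallocate)
    {
        for (int i = 0; i < preallocate; i++)
        {
            inactive.Push(new T());
        }
    }

    // Number of instances currently waiting to be reused.
    public int InactiveCount
    {
        get { return inactive.Count; }
    }

    // Hand out a pooled instance, allocating only if the pool is empty.
    public T Get()
    {
        return inactive.Count > 0 ? inactive.Pop() : new T();
    }

    // Return an instance to the pool. The caller is responsible for
    // resetting its state before it is handed out again.
    public void Release(T item)
    {
        inactive.Push(item);
    }
}
```

For GameObjects the same idea applies, with the pool holding deactivated instances: deactivate an object when releasing it and reactivate it when taking it out, rather than calling Destroy and Instantiate.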

Out of Mono memory
Profile memory activity - when does the first memory page fill up? 
Do you really need so many gameobjects that a single memory page is not enough? 

Use structs instead of classes for short-lived local data. Class instances are allocated on the heap; struct values held in local variables live on the stack.

class MyClass {
    public int a, b, c;
}

struct MyStruct {
    public int a, b, c;
}

void Update () {
    // BAD
    // allocated on the heap, will be garbage collected later!
    MyClass c = new MyClass();

    // GOOD
    // allocated on the stack, no GC going to happen!
    MyStruct s = new MyStruct();
}

Read the relevant section in the manual: http://docs.unity3d.com/Documentation/Manual/UnderstandingAutomaticMemoryManagement.html

Out of memory crashes

At some point a game may crash with "out of memory" even though in theory it should fit in fine. When this happens, compare your normal game memory footprint with the allocated memory size at the time of the crash. If the numbers are not similar, then there is a memory spike. This might be due to:

Two big scenes being loaded at the same time - use an empty scene between two bigger ones to fix this. 
Additive scene loading - remove unused parts to maintain the memory size. 
Huge asset bundles loaded to the memory 
Loading via WWW or instantiating (a huge amount of) big objects like:
Textures without proper compression (a no go for mobiles). 
Textures having Get/Set pixels enabled. This requires an uncompressed copy of the texture in memory.
Textures loaded from JPEG/PNGs at runtime are essentially uncompressed. 
Big mp3 files marked as decompress on loading. 

Keeping unused assets in unexpected caches such as static MonoBehaviour fields, which are not cleared when changing scenes.

