Unity3D Optimizing Graphics Performance for iOS
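License Comparisons
http://unity3d.com/unity/licenses#iphone

Optimizing Graphics Performance
http://unity3d.com/support/documentation/Manual/Optimizing Graphics Performance.html

iOS
A useful background to iOS optimization can be found on the iOS hardware page.

Alpha-Testing
Alpha-testing is very expensive on iOS devices. Replace alpha-test materials with alpha-blend materials wherever possible. If alpha-testing is unavoidable, keep the number of alpha-tested pixels to a minimum.

Vertex Performance
Generally, keep the number of vertices displayed per frame below 40,000 when targeting the iPhone 3GS, and below 10,000 for older devices (iPhone, iPhone 3G, iPod Touch 1st and 2nd generation).

Lighting Performance
Per-pixel dynamic lighting adds a great deal of rendering work, so never light any object with more than one Pixel Light. Use directional lights wherever possible.
Per-vertex dynamic lighting adds a large amount of vertex transformation work. Avoid having multiple lights illuminate the same object. For static objects, baked lighting is the best approach.

Optimize Model Geometry
When optimizing the geometry of a model, there are two basic rules:
-1- Don't use any more triangles than necessary.
-2- Keep the number of UV mapping seams and hard edges (i.e., doubled-up vertices) as low as possible.
Note that the number of vertices the graphics hardware has to process is usually not the same as the number shown by a 3D application. Modeling applications usually display the number of distinct corner points that make up a model. The graphics card, however, sometimes needs to split a vertex into two or more logical vertices for rendering. A vertex is split if it has multiple normals, UV coordinates or vertex colors. As a result, the vertex count in Unity is usually considerably higher than the count shown by the 3D application.

Texture Compression
Using iOS's native PVRTC compression formats will decrease the size of your textures (resulting in faster load times and smaller memory footprint) and can also dramatically increase rendering performance. Compressed textures use only a fraction of the memory bandwidth needed for uncompressed 32-bit RGBA textures. A comparison of uncompressed vs compressed texture performance can be found in the iOS Hardware Guide.
Some images are prone to visual artifacts in the alpha channels of PVRTC-compressed textures. In such cases, you might need to tweak the PVRTC compression parameters directly in your imaging software. You can do that by installing the PVR export plugin or using PVRTexTool from Imagination Tech, the creators of the PVRTC format. The resulting compressed image file with a .pvr extension will be imported by the Unity editor directly and the specified compression parameters will be preserved.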
If PVRTC-compressed textures do not give good enough visual quality or you need especially crisp imaging (as you might for GUI textures, say) then you should consider using 16-bit textures instead of 32-bit RGBA textures. By doing so, you will reduce the memory bandwidth by half.

Tips for writing high-performance shaders
The GPUs on iOS devices have fully supported pixel and vertex shaders since the iPhone 3GS. However, the performance is nowhere near what you would get from a desktop machine, so you should not expect desktop shaders to port to iOS unchanged. Typically, shaders will need to be hand optimized to reduce calculations and texture reads in order to get good performance.

Complex mathematical operations
Transcendental mathematical functions (such as pow, exp, log, cos, sin, tan, etc) will tax the GPU greatly, so a good rule of thumb is to have no more than one such operation per fragment. Consider using lookup textures as an alternative where applicable.
It is not advisable to attempt to write your own normalize, dot, inversesqrt operations, however. If you use the built-in ones then the driver will generate much better code for you.
Bear in mind also that the discard operation will make your fragments slower.

Floating point operations
You should always specify the precision of floating point variables when writing custom shaders. It is critical to pick the smallest possible floating point format in order to get the best performance.
If the shader is written in GLSL ES then the floating point precision is specified as follows:-
highp - full 32-bit floating point format, suitable for vertex transformations but has the slowest performance.
mediump - reduced 16-bit floating point format, suitable for texture UV coordinates and roughly twice as fast as highp.
lowp - 10-bit fixed point format, suitable for colors, lighting calculation and other high-performance operations and roughly four times faster than highp.
If the shader is written in Cg or it is a surface shader then precision is specified as follows:-
float - analogous to highp in GLSL ES, slowest.
half - analogous to mediump in GLSL ES, roughly twice as fast as float.
fixed - analogous to lowp in GLSL ES, roughly four times faster than float.
For further details about shader performance, please read the Shader Performance page.

Hardware documentation
Take your time to study Apple's documentation on hardware and best practices for writing shaders. Note, however, that we suggest being more aggressive with floating point precision hints.

Bake Lighting into Lightmaps
Bake your scene's static lighting into textures using Unity's built-in Lightmapper. The process of generating a lightmapped environment takes only a little longer than just placing a light in the scene in Unity, but:
It is going to run a lot faster (2-3 times faster for, e.g., 2 pixel lights).
It is going to look a lot better, since you can bake global illumination and the lightmapper can smooth the results.

Share Materials
If a number of objects being rendered by the same camera use the same material, then Unity iOS will be able to employ a large variety of internal optimizations such as:
Avoiding setting various render states to OpenGL ES.
Avoiding calculation of different parameters required to set up vertex and pixel processing.
Batching small moving objects to reduce draw calls.
Batching both big and small objects with the "static" property enabled to reduce draw calls.
All these optimizations will save you precious CPU cycles. Therefore, putting in the extra work to combine textures into a single atlas and making a number of objects use the same material will always pay off. Do it!
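
Combining textures into a single atlas means each object's UV coordinates must be squeezed into the sub-rectangle its texture occupies in the atlas. The following is a minimal sketch of that arithmetic in Python, not Unity code. The names AtlasRect, remap_uv and SLOTS, and the 2x2 layout, are assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class AtlasRect:
    """Sub-rectangle of an atlas in normalized UV units (hypothetical helper)."""
    x: float
    y: float
    width: float
    height: float


def remap_uv(u: float, v: float, rect: AtlasRect) -> Tuple[float, float]:
    """Map a UV coordinate from a texture's own 0..1 space into its atlas slot."""
    return (rect.x + u * rect.width, rect.y + v * rect.height)


# Assumed layout: four equally sized textures packed into a 2x2 atlas.
SLOTS: Dict[str, AtlasRect] = {
    "crate": AtlasRect(0.0, 0.0, 0.5, 0.5),
    "barrel": AtlasRect(0.5, 0.0, 0.5, 0.5),
    "wall": AtlasRect(0.0, 0.5, 0.5, 0.5),
    "floor": AtlasRect(0.5, 0.5, 0.5, 0.5),
}

if __name__ == "__main__":
    # The centre of the "barrel" texture lands in the centre of its atlas slot.
    print(remap_uv(0.5, 0.5, SLOTS["barrel"]))
```

Once every mesh samples from its own slot of the shared atlas, the objects can reference one material instead of four, which is the precondition for the optimizations listed above.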

Simple Checklist to make Your Game Faster
Keep vertex count below:
  40K per frame when targeting iPhone 3GS and newer devices (with SGX GPU)
  10K per frame when targeting older devices (with MBX GPU)
If you're using built-in shaders, pick ones from the Mobile category. Keep in mind that Mobile/VertexLit is currently the fastest shader.
Keep the number of different materials per scene low - share as many materials between different objects as possible.
Set the Static property on non-moving objects to allow internal optimizations.
Use PVRTC formats for textures when possible, otherwise choose 16-bit textures over 32-bit.
Use combiners or pixel shaders to mix several textures per fragment instead of a multi-pass approach. (What does this mean?)
If writing custom shaders, always use the smallest possible floating point format:
  fixed / lowp -- perfect for color, lighting information and normals,
  half / mediump -- for texture UV coordinates,
  float / highp -- avoid in pixel shaders, fine to use in vertex shaders for vertex position calculations.
Minimize use of complex mathematical operations such as pow, sin, cos etc in pixel shaders.
Do not use Pixel Lights when it is not necessary -- choose to have only a single (preferably directional) pixel light affecting your geometry.
Do not use dynamic lights when it is not necessary -- choose to bake lighting instead.
Choose to use fewer textures per fragment.
Avoid alpha-testing, choose alpha-blending instead.
Do not use fog when it is not necessary.
Learn the benefits of Occlusion culling and use it to reduce the amount of visible geometry and draw calls in complex static scenes with lots of occlusion. Plan your levels to benefit from Occlusion culling.
Use skyboxes to "fake" distant geometry.
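
The texture format advice in the checklist is easier to weigh with numbers. Below is a rough Python sketch comparing the memory taken by one texture in each format. It assumes PVRTC at 4 and 2 bits per pixel (the two rates these notes mention in the hardware guide) and ignores mipmaps. The function name texture_bytes and the 1024x1024 size are illustrative choices.

```python
# Bits per pixel for the formats discussed in the checklist.
BITS_PER_PIXEL = {
    "32-bit RGBA": 32,
    "16-bit": 16,
    "PVRTC 4bpp": 4,
    "PVRTC 2bpp": 2,
}


def texture_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Size of the top mip level only; mipmaps are deliberately ignored."""
    return width * height * bits_per_pixel // 8


if __name__ == "__main__":
    for name, bpp in BITS_PER_PIXEL.items():
        size_kib = texture_bytes(1024, 1024, bpp) / 1024
        print(f"{name:12s} {size_kib:8.0f} KiB")
```

Under these assumptions a 1024x1024 texture drops from 4 MiB at 32 bits per pixel to 512 KiB at PVRTC 4bpp, and the memory bandwidth needed to read it shrinks in the same proportion.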

See Also
Optimizing iOS Performance
iOS Hardware Guide
iOS Automatic Draw Call Batching
Modeling Optimized Characters
Rendering Statistics

------------------------------------------------------------------------------------------------
Modeling Characters for Optimal Performance
http://unity3d.com/support/documentation/Manual/Modeling Optimized Characters.html

Below are some tips for designing character models to give optimal rendering speed.

Use a Single Skinned Mesh Renderer
You should use only a single skinned mesh renderer for each character. Unity optimizes animation using visibility culling and bounding volume updates and these optimizations are only activated if you use one animation component and one skinned mesh renderer in conjunction. The rendering time for a model could roughly double as a result of using two skinned meshes in place of a single mesh and there is seldom any practical advantage in using multiple meshes.

Use as Few Materials as Possible
You should also keep the number of materials on each mesh as low as possible. The only reason why you might want to have more than one material on a character is that you need to use different shaders for different parts (eg, a special shader for the eyes). However, two or three materials per character should be sufficient in almost all cases.

Use as Few Bones as Possible
A bone hierarchy in a typical desktop game uses somewhere between fifteen and sixty bones. The fewer bones you use, the better the performance will be. You can achieve very good quality on desktop platforms and fairly good quality on mobile platforms with about thirty bones. Ideally, keep the number below thirty for mobile devices and don't go too far above thirty for desktop games.

Polygon Count
The number of polygons you should use depends on the quality you require and the platform you are targeting. For mobile devices, somewhere between 300 and 1500 polygons per mesh will give good results, whereas for desktop platforms the ideal range is about 1500 to 4000. You may need to reduce the polygon count per mesh if the game can have lots of characters onscreen at any given time. As an example, Half-Life 2 used 2500-5000 triangles per character. Current AAA games running on the PS3 or Xbox 360 usually have characters with 5000-7000 triangles.

Keep Forward and Inverse Kinematics Separate
When animations are imported, a model's inverse kinematic (IK) nodes are baked into forward kinematics (FK) and as a result, Unity doesn't need the IK nodes at all. However, if they are left in the model then they will have a CPU overhead even though they don't affect the animation. You can delete the redundant IK nodes in Unity or in the modeling tool, according to your preference. Ideally, you should keep separate IK and FK hierarchies during modeling to make it easier to remove the IK nodes when necessary.

------------------------------------------------------------------------------------------------
Draw Call Batching
http://unity3d.com/support/documentation/Manual/iphone-DrawCall-Batching.html

To draw an object on the screen, the engine has to issue a draw call to the graphics API (OpenGL ES in the case of iOS). Every single draw call requires a significant amount of work on the part of the graphics API, causing significant performance overhead on the CPU side.
Unity combines a number of objects at runtime and draws them together with a single draw call. This operation is called "batching". The more objects Unity can batch together, the better rendering performance you will get.
Built-in batching support in Unity has a significant benefit over simply combining geometry in the modeling tool (or using the CombineChildren script from the Standard Assets package). Batching in Unity happens after the visibility determination step.
The engine does culling on each object individually, and the amount of rendered geometry is going to be the same as without batching. Combining geometry in the modeling tool, on the other hand, prevents efficient culling and results in a much higher amount of geometry being rendered.

Materials
Only objects sharing the same material can be batched together. Therefore, if you want to achieve good batching, you need to share as many materials among different objects as possible.
If you have two identical materials which differ only in textures, you can combine those textures into a single big texture - a process often called texture atlasing. Once textures are in the same atlas, you can use a single material instead.
If you need to access shared material properties from scripts, then it is important to note that modifying Renderer.material will create a copy of the material. Instead, you should use Renderer.sharedMaterial to keep the material shared.

Dynamic Batching
Unity can automatically batch moving objects into the same draw call if they share the same material. Dynamic batching is done automatically and does not require any additional effort on your side.
Tips:
Batching dynamic objects has a certain overhead per vertex, so batching is applied only to meshes containing fewer than 900 vertex attributes in total.
  If your shader is using Vertex Position, Normal and a single UV, then you can batch up to 300 verts; if your shader is using Vertex Position, Normal, UV0, UV1 and Tangent, then only 180 verts.
  Please note: the attribute count limit might be changed in future.
Don't use scale. Objects with scale (1,1,1) and (2,2,2) won't batch.
Uniformly scaled objects won't be batched with non-uniformly scaled ones.
  Objects with scale (1,1,1) and (1,2,1) won't be batched. On the other hand, (1,2,1) and (1,3,1) will be.
Using different material instances will cause batching to fail.
Objects with lightmaps have an additional (hidden) material parameter: offset/scale in the lightmap. Lightmapped objects therefore won't be batched (unless they point to the same portions of the lightmap).
Multi-pass shaders will break batching. For example, almost all Unity shaders support several lights in forward rendering, effectively doing an additional pass for them.
Instances of a prefab automatically use the same mesh and material.

Static Batching
Static batching, on the other hand, allows the engine to reduce draw calls for geometry of any size (provided it does not move and shares the same material). Static batching is significantly more efficient than dynamic batching. You should choose static batching as it will require less CPU power.
In order to take advantage of static batching, you need to explicitly specify that certain objects are static and will not move, rotate or scale in the game. To do so, you can mark objects as static using the Static checkbox in the Inspector.
Using static batching will require additional memory for storing the combined geometry. If several objects shared the same geometry before static batching, then a copy of the geometry will be created for each object, either in the Editor or at runtime. This might not always be a good idea - sometimes you will have to sacrifice rendering performance by avoiding static batching for some objects to keep a smaller memory footprint. For example, marking trees as static in a dense forest level can have a serious memory impact.
Static batching is only available in Unity iOS Advanced.

Further Reading
Measuring performance with the Built-in Profiler
Rendering Statistics

------------------------------------------------------------------------------------------------
Rendering Statistics Window
http://unity3d.com/support/documentation/Manual/RenderingStatistics.html#RenderingStatisticsIPhone

The Game View has a Stats button in the top right corner.
When the button is pressed, an overlay window is displayed which shows realtime rendering statistics, which are useful for optimizing performance. The exact statistics displayed vary according to the build target.

The Statistics window contains the following information:-
Time per frame and FPS - The amount of time taken to process and render one game frame (and its reciprocal, frames per second). Note that this number only includes the time taken to do the frame update and render the game view; it does not include the time taken in the editor to draw the scene view, inspector and other editor-only processing.
Draw Calls - The total number of meshes drawn after batching was applied. Note that where objects are rendered multiple times (for example, objects illuminated by pixel lights), each rendering results in a separate draw call.
Batched (Draw Calls) - The number of initially separate draw calls that were added to batches. "Batching" is where the engine attempts to combine the rendering of multiple objects into one draw call in order to reduce CPU overhead. To ensure good batching, you should share materials between different objects as often as possible.
Tris and Verts - The number of triangles and vertices drawn. This is mostly important when optimizing for low-end hardware.
Used Textures - The number of textures used to draw this frame and their memory usage.
Render Textures - The number of Render Textures and their memory usage. The number of times the active Render Texture was switched each frame is also displayed.
Screen - The size of the screen, along with its anti-aliasing level and memory usage.
VRAM usage - Approximate bounds of current video memory (VRAM) usage. This also shows how much video memory your graphics card has.
VBO total - The number of unique meshes (Vertex Buffer Objects or VBOs) that are uploaded to the graphics card. Each different model will cause a new VBO to be created. In some cases scaled objects will cause additional VBOs to be created. In the case of static batching, several different objects can potentially share the same VBO.
Visible Skinned Meshes - The number of skinned meshes rendered.
Animations - The number of animations playing.

------------------------------------------------------------------------------------------------
Measuring Performance with the Built-in Profiler
http://unity3d.com/support/documentation/Manual/iphone-InternalProfiler.html

Unity comes with a performance profiler. It is disabled by default, so to enable it you need to open the Unity-generated Xcode project, select the iPhone_Profiler.h file and change the line
#define ENABLE_INTERNAL_PROFILER 0
to
#define ENABLE_INTERNAL_PROFILER 1
Select Run->Console in the Xcode menu to display the output console (GDB) and then run your project. Unity will output statistics to the console window every thirty frames.
For example:
iPhone/iPad Unity internal profiler stats:
cpu-player>    min: 9.8   max: 24.0   avg: 16.3
cpu-ogles-drv> min: 1.8   max: 8.2   avg: 4.3
cpu-waits-gpu> min: 0.8   max: 1.2   avg: 0.9
cpu-present>   min: 1.2   max: 3.9   avg: 1.6
frametime>     min: 31.9   max: 37.8   avg: 34.1
draw-call #>   min: 4   max: 9   avg: 6   | batched: 10
tris #>        min: 3590   max: 4561   avg: 3871   | batched: 3572
verts #>       min: 1940   max: 2487   avg: 2104   | batched: 1900
player-detail> physx: 1.2 animation: 1.2 culling: 0.5 skinning: 0.0 batching: 0.2 render: 12.0 fixed-update-count: 1 .. 2
mono-scripts>  update: 0.5 fixedUpdate: 0.0 coroutines: 0.0
mono-memory>   used heap: 233472 allocated heap: 548864 max number of collections: 1 collection total duration: 5.7

All times are measured in milliseconds per frame. You can see the minimum, maximum and average times over the last thirty frames.

General CPU Activity
cpu-player - Displays the time your game spends executing code inside the Unity engine and executing scripts on the CPU.
cpu-ogles-drv - Displays the time spent executing OpenGL ES driver code on the CPU. Many factors like the number of draw calls, number of internal rendering state changes, the rendering pipeline setup and even the number of processed vertices can have an effect on the driver stats.
cpu-waits-gpu - Displays the time the CPU is idle while waiting for the GPU to finish rendering. If this number exceeds 2-3 milliseconds then your application is most probably fillrate/GPU processing bound.
cpu-present - The amount of time spent executing the presentRenderbuffer command in OpenGL ES.
frametime - Represents the overall time of a game frame. Note that iOS hardware is always locked at a 60Hz refresh rate, so you will always get multiples of ~16.7ms (1000ms/60Hz = ~16.7ms).

Rendering Statistics
draw-call # - The number of draw calls per frame. Keep it as low as possible.
tris # - Total number of triangles sent for rendering.
verts # - Total number of vertices sent for rendering.
You should keep this number below 10000 if you use only static geometry but if you have lots of skinned geometry then you should keep it much lower.
batched - Number of draw calls, triangles and vertices which were automatically batched by the engine. Comparing these numbers with the draw-call and triangle totals will give you an idea of how well your scene is prepared for batching. Share as many materials as possible among your objects to improve batching.

Detailed Unity Player Statistics
The player-detail section provides a detailed breakdown of what is happening inside the engine:-
physx - Time spent on physics.
animation - Time spent animating bones.
culling - Time spent culling objects outside the camera frustum.
skinning - Time spent applying animations to skinned meshes.
batching - Time spent batching geometry. Batching dynamic geometry is considerably more expensive than batching static geometry.
render - Time spent rendering visible objects.
fixed-update-count - Minimum and maximum number of FixedUpdates executed during this frame. Too many FixedUpdates will deteriorate performance considerably. There are some simple guidelines to set a good value for the fixed time delta here.

Detailed Scripts Statistics
The mono-scripts section provides a detailed breakdown of the time spent executing code in the Mono runtime:
update - Total time spent executing all Update() functions in scripts.
fixedUpdate - Total time spent executing all FixedUpdate() functions in scripts.
coroutines - Time spent inside script coroutines.

Detailed Statistics on Memory Allocated by Scripts
The mono-memory section gives you an idea of how memory is being managed by the Mono garbage collector:
allocated heap - Total amount of memory available for allocations. A garbage collection will be triggered if there is not enough memory left in the heap for a given allocation. If there is still not enough free memory even after the collection then the allocated heap will grow in size.
used heap - The portion of the allocated heap which is currently used up by objects. Every time you create a new class instance (not a struct) this number will grow until the next garbage collection.
max number of collections - Number of garbage collection passes during the last 30 frames.
collection total duration - Total time (in milliseconds) of all garbage collection passes that have happened during the last 30 frames.

Page last updated: 2012-01-18

------------------------------------------------------------------------------------------
Optimizing the Size of the Built iOS Player
http://unity3d.com/support/documentation/Manual/iphone-playerSizeOptimization.html

The two main ways of reducing the size of the player are by changing the Active Build Configuration within Xcode and by changing the Stripping Level within Unity.

Building in Release Mode
You can choose between the Debug and Release options on the Active Build Configuration drop-down menu in Xcode. Building as Release instead of Debug can reduce the size of the built player by as much as 2-3MB, depending on the game.
In Release mode, the player will be built without any debug information, so if your game crashes or has other problems there will be no stack trace information available for output. This is fine for deploying a finished game but you will probably want to use Debug mode during development.

iOS Stripping Level (Advanced License feature)
The size optimizations activated by stripping work in the following way:-
Strip assemblies level: the scripts' bytecode is analyzed so that classes and methods that are not referenced from the scripts can be removed from the DLLs and thereby excluded from the AOT compilation phase. This optimization reduces the size of the main binary and accompanying DLLs and is safe as long as no reflection is used.
Strip ByteCode level: any .NET DLLs (stored in the Data folder) are stripped down to metadata only. This is possible because all the code is already precompiled during the AOT phase and linked into the main binary.
Use micro mscorlib level: a special, smaller version of mscorlib is used. Some components are removed from this library, for example, Security, Reflection.Emit, Remoting, non Gregorian calendars, etc. Also, interdependencies between internal components are minimized. This optimization reduces the main binary and mscorlib.dll size but it is not compatible with some System and System.Xml assembly classes, so use it with care.
These levels are cumulative, so level 3 optimization implicitly includes levels 2 and 1, while level 2 optimization includes level 1.
Note: Micro mscorlib is a heavily stripped-down version of the core library. Only those items that are required by the Mono runtime in Unity remain. Best practice for using micro mscorlib is not to use any classes or other features of .NET that are not required by your application.
GUIDs are a good example of something you could omit; they can easily be replaced with custom made pseudo GUIDs and doing this would result in better performance and app size.

Tips
How to Deal with Stripping when Using Reflection
Stripping depends highly on static code analysis and sometimes this can't be done effectively, especially when dynamic features like reflection are used. In such cases, it is necessary to give some hints as to which classes shouldn't be touched. Unity supports a per-project custom stripping blacklist. Using the blacklist is a simple matter of creating a link.xml file and placing it into the Assets folder. An example of the contents of the link.xml file follows. Classes marked for preservation will not be affected by stripping:-
<linker>
  <assembly fullname="System.Web.Services">
    <type fullname="System.Web.Services.Protocols.SoapTypeStubInfo" preserve="all"/>
    <type fullname="System.Web.Services.Configuration.WebServicesConfigurationSectionHandler" preserve="all"/>
  </assembly>
  <assembly fullname="System">
    <type fullname="System.Net.Configuration.WebRequestModuleHandler" preserve="all"/>
    <type fullname="System.Net.HttpRequestCreator" preserve="all"/>
    <type fullname="System.Net.FileWebRequestCreator" preserve="all"/>
  </assembly>
</linker>
Note: it can sometimes be difficult to determine which classes are getting stripped in error even though the application requires them. You can often get useful information about this by running the stripped application on the simulator and checking the Xcode console for error messages.

Simple Checklist for Making Your Distribution as Small as Possible
Minimize your assets: enable PVRTC compression for textures and reduce their resolution as far as possible. Also, minimize the number of uncompressed sounds. There are some additional tips for file size reduction here.
Set the iOS Stripping Level to Use micro mscorlib.
Set the script call optimization level to Fast but no exceptions.
Don't use anything that lives in System.dll or System.Xml.dll in your code. These libraries are not compatible with micro mscorlib.
Remove unnecessary code dependencies.
Set the API Compatibility Level to .Net 2.0 subset. Note that .Net 2.0 subset has limited compatibility with other libraries.
Set the Target Platform to armv6 (OpenGL ES1.1).
Don't use JS Arrays.
Avoid generic containers in combination with value types, including structs.

Can I produce apps of less than 20 megabytes with Unity?
Yes. An empty project would take about 13 MB in the AppStore if all the size optimizations were turned off. This gives you a budget of about 7 MB for compressed assets in your game. If you own an Advanced License (and therefore have access to the stripping option), the empty scene with just the main camera can be reduced to about 6 MB in the AppStore (zipped and DRM attached) and you will have about 14 MB available for compressed assets.

Why did my app increase in size after being released to the AppStore?
When they publish your app, Apple first encrypts the binary file and then compresses it via zip. Most often Apple's DRM increases the binary size by about 4 MB or so. As a general rule, you should expect the final size to be approximately equal to the size of the zip-compressed archive of all files (except the executable) plus the size of the uncompressed executable file.

---------------------------------------------------------------------------------------------
iOS Hardware Guide
http://unity3d.com/support/documentation/Manual/iphone-Hardware.html

Hardware models
The following table summarizes iOS hardware available in devices of various generations:

Common to all iOS devices
  Screen: 320x480 pixels, LCD at 163 ppi (unless stated otherwise)
  Built-in accelerometer
  Wi-Fi

Original iPhone
  ARM11, 412 MHz CPU
  PowerVR MBX Lite 3D graphics processor
  128MB of memory
  2 megapixel camera
  Speaker and microphone
  Vibration support
  Silent switch

iPhone 3G
  ARM11, 412 MHz CPU
  PowerVR MBX Lite 3D graphics processor
  128MB of memory
  2 megapixel camera
  Speaker and microphone
  Vibration support
  Silent switch
  GPS support

iPhone 3GS
  ARM Cortex A8, 600 MHz CPU
  PowerVR SGX graphics processor
  256MB of memory
  3 megapixel camera with video capture capability
  Speaker and microphone
  Vibration support
  Silent switch
  GPS support
  Compass support

iPhone 4
  ARM Cortex-A8 Apple A4 CPU
  PowerVR SGX graphics processor
  512MB of memory
  Cameras
    Rear: 5.0 MP backside illuminated CMOS image sensor with 720p HD video at 30 fps and LED flash
    Front: 0.3 MP (VGA) with geotagging, tap to focus, and 480p SD video at 30 fps
  Screen: 960x640 pixels, LCD at 326 ppi, 800:1 contrast ratio
  Speaker and microphone
  Vibration support
  Silent switch
  GPS support
  Compass support

iPod Touch 1st generation
  ARM11, 412 MHz CPU
  PowerVR MBX Lite 3D graphics processor
  128MB of memory

iPod Touch 2nd generation
  ARM11, 533 MHz CPU
  PowerVR MBX Lite 3D graphics processor
  128MB of memory
  Speaker and microphone

iPad
  1 GHz Apple A4 CPU
  Wi-Fi + Bluetooth + (3G Cellular HSDPA, 2G cellular EDGE on the 3G version)
  Accelerometer, ambient light sensor, magnetometer (for digital compass)
  Mechanical keys: Home, sleep, screen rotation lock, volume
  Screen: 1024x768 pixels, LCD at 132 ppi, LED-backlit

Graphics Processing Unit and Hidden Surface Removal (I did not fully follow this part...)
The iPhone/iPad graphics processing unit (GPU) is a Tile-Based Deferred Renderer. In contrast with most GPUs in desktop computers, the iPhone/iPad GPU focuses on minimizing the work required to render an image as early as possible in the processing of a scene. That way, only the visible pixels will consume processing resources.
The GPU's frame buffer is divided up into tiles and rendering happens tile by tile. First, triangles for the whole frame are gathered and assigned to the tiles. Then, visible fragments of each triangle are chosen. Finally, the selected triangle fragments are passed to the rasterizer (triangle fragments occluded from the camera are rejected at this stage).
In other words, the iPhone/iPad GPU implements a Hidden Surface Removal operation at reduced cost. Such an architecture consumes less memory bandwidth, has lower power consumption and utilizes the texture cache better. Tile-Based Deferred Rendering allows the device to reject occluded fragments before actual rasterization, which helps to keep overdraw low.
For more information see also:-
POWERVR MBX Technology Overview
Apple Notes on iPhone/iPad GPU and OpenGL ES
Apple Performance Advice for OpenGL ES in General
Apple Performance Advice for OpenGL ES Shaders

SGX series
Starting with the iPhone 3GS, newer devices are equipped with the SGX series of GPUs. The SGX series features support for the OpenGL ES2.0 rendering API and vertex and pixel shaders. The Fixed-function pipeline is not supported natively on such GPUs, but instead is emulated by generating vertex and pixel shaders with analogous functionality on the fly.
The SGX series fully supports MultiSample anti-aliasing.

MBX series
Older devices such as the original iPhone, iPhone 3G and iPod Touch 1st and 2nd Generation are equipped with the MBX series of GPUs. The MBX series supports only OpenGL ES1.1, the fixed function Transform/Lighting pipeline and two textures per fragment.

Texture Compression
The only texture compression format supported by iOS is PVRTC. PVRTC provides support for RGB and RGBA (color information plus an alpha channel) texture formats and can compress a single pixel to two or four bits.
The PVRTC format is essential to reduce the memory footprint and to reduce consumption of memory bandwidth (ie, the rate at which data can be read from memory, which is usually very limited on mobile devices).

Vertex Processing Unit
The iPhone/iPad has a dedicated unit responsible for vertex processing which runs calculations in parallel with rasterization. In order to achieve better parallelization, the iPhone/iPad processes vertices one frame ahead of the rasterizer.

Unified Memory Architecture
The CPU and GPU share the same memory. The more memory graphics use, the less memory is available to the game.

Multimedia CoProcessing Unit
The iPhone/iPad main CPU is equipped with a powerful SIMD (Single Instruction, Multiple Data) coprocessor supporting either the VFP or the NEON architecture. The Unity iOS run-time takes advantage of these units for multiple tasks such as calculating skinned mesh transformations, geometry batching, audio processing and other calculation-intensive operations.

--------------------------------------------------------------------------------------------
Texture 2D
http://unity3d.com/support/documentation/Manual/Textures.html

Textures bring your Meshes, Particles, and interfaces to life! They are image or movie files that you lay over or wrap around your objects. As they are so important, they have a lot of properties. If you are reading this for the first time, jump down to Details, and return to the actual settings when you need a reference.
The shaders you use for your objects put specific requirements on which textures you need, but the basic principle is that you can put any image file inside your project.
If it meets the size requirements (specified below), it will get imported and optimized for game use. This extends to multi-layer Photoshop or TIFF files - they are flattened on import, so there is no size penalty for your game. Properties The Texture Inspector looks a bit different from most others: The top section contains a few settings, and the bottom part contains the Texture Importer and the texture preview.
Aniso Level: Increases texture quality when viewing the texture at a steep angle. Good for floor and ground textures, see below.
Filter Mode: Selects how the Texture is filtered when it gets stretched by 3D transformations:
  Point: The Texture becomes blocky up close
  Bilinear: The Texture becomes blurry up close
  Trilinear: Like Bilinear, but the Texture also blurs between the different mip levels
Wrap Mode: Selects how the Texture behaves when tiled:
  Repeat: The Texture repeats (tiles) itself
  Clamp: The Texture's edges get stretched

Texture Importer
Textures all come from image files in your Project Folder. How they are imported is specified by the Texture Importer. You change these settings by selecting the texture file in the Project View and modifying the Texture Importer in the Inspector.

In Unity 3 we simplified all the settings for you: now you just need to select what you are going to use the texture for, and Unity will set default parameters for the type of texture you have selected. Of course, if you want total control of your texture and need specific tweaks, you can set the Texture Type to Advanced. This will present the full set of options that you have available.
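The two Wrap Mode settings described above can be illustrated with a small sketch. It shows how a texture coordinate outside the 0..1 range is conventionally remapped under each mode. The function names are made up for this illustration, and the code is not taken from Unity; it only mirrors the usual GPU convention for "repeat" and "clamp".

```python
import math


def wrap_repeat(u: float) -> float:
    """Repeat: keep only the fractional part, so the texture tiles."""
    return u - math.floor(u)


def wrap_clamp(u: float) -> float:
    """Clamp: coordinates outside [0, 1] stick to the nearest edge,
    which is why the edge pixels appear stretched."""
    return min(max(u, 0.0), 1.0)


if __name__ == "__main__":
    for u in (-0.25, 0.5, 1.25):
        print(u, wrap_repeat(u), wrap_clamp(u))
```

With Repeat, a coordinate of 1.25 samples the same texel as 0.25; with Clamp it samples the edge at 1.0, which matches the "edges get stretched" description.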
Texture Type: Select this to set basic parameters depending on the purpose of your texture.
  Texture: This is the most common setting, used for textures in general.
  Normal Map: Select this to turn the color channels into a format suitable for real-time normal mapping. For more info, see Normal Maps.
  GUI: Use this if your texture is going to be used on any HUD/GUI Controls.
  Reflection: Also known as Cube Maps, used to create reflections on textures. check
keep the edges of your cookie texture solid black in order to get the proper effect. In the Texture Inspector, set the Edge Mode to Clamp.
Generate Alpha from Greyscale: If enabled, an alpha transparency channel will be generated from the image's existing values of light & dark.

The Advanced Texture Importer Settings dialog

Non Power of 2: If the texture has a non-power-of-two size, this defines the scaling behavior at import time (for more info see the Texture Sizes section below):
  None: Texture will be padded to the next larger power-of-two size for use with the GUITexture component.
  To nearest: Texture will be scaled to the nearest power-of-two size at import time. For instance, a 257x511 texture will become 256x512. Note that PVRTC formats require textures to be square (width equal to height), therefore the final size will be upscaled to 512x512.
  To larger: Texture will be scaled to the next larger power-of-two size at import time. For instance, a 257x511 texture will become 512x512.
  To smaller: Texture will be scaled to the next smaller power-of-two size at import time. For instance, a 257x511 texture will become 256x256.
Generate Cube Map: Generates a cubemap from the texture using different generation methods.
Read/Write Enabled: Select this to enable access to the texture data from scripts (GetPixels, SetPixels and other Texture2D functions). Note however that a copy of the texture data will be made, doubling the amount of memory required for the texture asset. Use only if absolutely necessary. This is only valid for uncompressed and DXT compressed textures; other types of compressed textures cannot be read from. Disabled by default.
Generate Mip Maps: Select this to enable mip-map generation. Mip maps are smaller versions of the texture that get used when the texture is very small on screen. For more info, see Mip Maps below.
Correct Gamma: Select this to enable per-mip-level gamma correction.
Border Mip Maps: Select this to avoid colors seeping out to the edge of the lower mip levels. Used for light cookies (see below).
Mip Map Filtering: Two ways of mip map filtering are available to optimize image quality:
  Box: The simplest way to fade out the mipmaps - the mip levels become smoother and smoother as they go down in size.
  Kaiser: A sharpening Kaiser algorithm is run on the mip maps as they go down in size. If your textures are too blurry in the distance, try this option.
Fade Out Mips: Enable this to make the mipmaps fade to gray as the mip levels progress. This is used for detail maps. The leftmost scroll is the first mip level to begin fading out at. The rightmost scroll defines the mip level where the texture is completely grayed out.
Generate Normal Map: Enable this to turn the color channels into a format suitable for real-time normal mapping. For more info, see Normal Maps, below.
Bumpiness: Control the amount of bumpiness.
Filtering: Determine how the bumpiness is calculated:
  Smooth: This generates normal maps that are quite smooth.
  Sharp: Also known as a Sobel filter. This generates normal maps that are sharper than Standard.
Normal Map: Select this if you want to see how the normal map will be applied to your texture.
Lightmap: Select this if you want to use the texture as a lightmap.

Per-Platform Overrides
When you are building for different platforms, you have to think about the resolution of your textures for the target platform, as well as their size and quality. With Unity 3 you can override these options and assign specific values depending on the platform you are deploying to. Note that if you don't select any value to override, the Editor will pick the default values when building your project.

Default settings for all platforms.
Max Texture Size: The maximum imported texture size.
Artists often prefer to work with huge textures - scale the texture down to a suitable size with this.
Texture Format: What internal representation is used for the texture. This is a tradeoff between size and quality. In the examples below we show the final size of an in-game texture of 256 by 256 pixels:
  Compressed: Compressed RGB texture. This is the most common format for diffuse textures. 4 bits per pixel (32 KB for a 256x256 texture).
  16 bit: Low-quality truecolor. Has 16 levels of red, green, blue and alpha.
  Truecolor: Truecolor, this is the highest quality. At 256 KB for a 256x256 texture.

If you have set the Texture Type to Advanced then the Texture Format has different values.

Desktop
Texture Format: What internal representation is used for the texture. This is a tradeoff between size and quality. In the examples below we show the final size of an in-game texture of 256 by 256 pixels:
  RGB Compressed DXT1: Compressed RGB texture. This is the most common format for diffuse textures. 4 bits per pixel (32 KB for a 256x256 texture).
  RGBA Compressed DXT5: Compressed RGBA texture. This is the main format used for diffuse & specular control textures. 1 byte/pixel (64 KB for a 256x256 texture).
  RGB 16 bit: 65 thousand colors with no alpha. Compressed DXT formats use less memory and usually look better. 128 KB for a 256x256 texture.
  RGB 24 bit: Truecolor but without alpha. 192 KB for a 256x256 texture.
  Alpha 8 bit: High quality alpha channel but without any color. 64 KB for a 256x256 texture.
  RGBA 16 bit: Low-quality truecolor. Has 16 levels of red, green, blue and alpha. Compressed DXT5 format uses less memory and usually looks better. 128 KB for a 256x256 texture.
  RGBA 32 bit: Truecolor with alpha - this is the highest quality. At 256 KB for a 256x256 texture, this one is expensive. Most of the time, DXT5 offers sufficient quality at a much smaller size. The main way this is used is for normal maps, as DXT compression there often carries a visible quality loss.

iOS
Texture Format: What internal representation is used for the texture. This is a tradeoff between size and quality. In the examples below we show the final size of an in-game texture of 256 by 256 pixels:
  RGB Compressed PVRTC 4 bits: Compressed RGB texture. This is the most common format for diffuse textures. 4 bits per pixel (32 KB for a 256x256 texture).
  RGBA Compressed PVRTC 4 bits: Compressed RGBA texture. This is the main format used for diffuse & specular control textures or diffuse textures with transparency. 4 bits per pixel (32 KB for a 256x256 texture).
  RGB Compressed PVRTC 2 bits: Compressed RGB texture. Lower quality format suitable for diffuse textures. 2 bits per pixel (16 KB for a 256x256 texture).
  RGBA Compressed PVRTC 2 bits: Compressed RGBA texture. Lower quality format suitable for diffuse & specular control textures. 2 bits per pixel (16 KB for a 256x256 texture).
  RGB Compressed DXT1: Compressed RGB texture. This format is not supported on iOS, but kept for backwards compatibility with desktop projects.
  RGBA Compressed DXT5: Compressed RGBA texture. This format is not supported on iOS, but kept for backwards compatibility with desktop projects.
  RGB 16 bit: 65 thousand colors with no alpha. Uses more memory than PVRTC formats, but could be more suitable for UI or crisp textures without gradients. 128 KB for a 256x256 texture.
  RGB 24 bit: Truecolor but without alpha. 192 KB for a 256x256 texture.
  Alpha 8 bit: High quality alpha channel but without any color. 64 KB for a 256x256 texture.
  RGBA 16 bit: Low-quality truecolor. Has 16 levels of red, green, blue and alpha. Uses more memory than PVRTC formats, but can be handy if you need an exact alpha channel. 128 KB for a 256x256 texture.
  RGBA 32 bit: Truecolor with alpha - this is the highest quality. At 256 KB for a 256x256 texture, this one is expensive. Most of the time, PVRTC formats offer sufficient quality at a much smaller size.
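The sizes quoted in the format lists above all follow from one piece of arithmetic: width x height x bits per pixel, converted to kilobytes. The sketch below reproduces the 256x256 figures. The dictionary keys and the function name are chosen for this illustration only; the bits-per-pixel values are the ones stated in the text, and mip maps are not included.

```python
# Bits per pixel for the formats listed above (values taken from the text).
BITS_PER_PIXEL = {
    "PVRTC 2 bits": 2,
    "PVRTC 4 bits": 4,
    "DXT1": 4,
    "DXT5": 8,
    "Alpha 8 bit": 8,
    "RGB 16 bit": 16,
    "RGBA 16 bit": 16,
    "RGB 24 bit": 24,
    "RGBA 32 bit": 32,
}


def texture_size_kb(width: int, height: int, fmt: str) -> float:
    """Size of the top-level image only, in KB (1 KB = 1024 bytes)."""
    bits = width * height * BITS_PER_PIXEL[fmt]
    return bits / 8 / 1024


if __name__ == "__main__":
    for fmt in BITS_PER_PIXEL:
        print(f"{fmt}: {texture_size_kb(256, 256, fmt):.0f} KB")
```

The same function makes it easy to see what a larger texture costs: going from 256x256 to 512x512 multiplies every figure by four.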
Android

Details

Supported Formats
Unity can read the following file formats: PSD, TIFF, JPG, TGA, PNG, GIF, BMP, IFF, PICT. It should be noted that Unity can import multi-layer PSD & TIFF files just fine. They are flattened automatically on import but the layers are maintained in the assets themselves, so you don't lose any of your work when using these file types natively. This is important as it allows you to just have one copy of your textures that you can use from Photoshop, through your 3D modelling app and into Unity.

Texture Sizes
Ideally texture sizes should be powers of two on the sides. These sizes are as follows: 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024 or 2048 pixels. The textures do not have to be square, i.e. width can be different from height.

It is possible to use other (non power of two) texture sizes with Unity. Non power of two texture sizes work best when used on GUI Textures; if used on anything else they will be converted to an uncompressed RGBA 32 bit format. That means they will take up more video memory (compared to PVRTC (iOS) / DXT (Desktop) compressed textures), will be slower to load and slower to render (if you are in iOS mode). In general you'll use non power of two sizes only for GUI purposes.

Non power of two texture assets can be scaled up at import time using the Non Power of 2 option in the advanced texture type in the import settings. Unity will scale texture contents as requested, and in the game they will behave just like any other texture, so they can still be compressed and very fast to load.

UV Mapping
When mapping a 2D texture onto a 3D model, some sort of wrapping is done. This is called UV mapping and is done in your 3D modelling app. Inside Unity, you can scale and move the texture using Materials. Scaling normal & detail maps is especially useful.

Mip Maps
Mip Maps are a list of progressively smaller versions of an image, used to optimise performance on real-time 3D engines. Objects that are far away from the camera use the smaller texture versions. Using mip maps uses 33% more memory, but not using them can be a huge performance loss. You should always use mipmaps for in-game textures; the only exceptions are textures that will never be minified (e.g. GUI textures). (Note: GUI images do not need mip maps.)

Normal Maps
Normal maps are used by normal map shaders to make low-polygon models look as if they contain more detail. Unity uses normal maps encoded as RGB images. You also have the option to generate a normal map from a grayscale height map image.

Detail Maps
If you want to make a terrain, you normally use your main texture to show where there are areas of grass, rocks, sand, etc. If your terrain has a decent size, it will end up very blurry. Detail textures hide this fact by fading in small details as your main texture gets up close. When drawing detail textures, a neutral gray is invisible, white makes the main texture twice as bright and black makes the main texture completely black.

Reflections (Cube Maps)
If you want to use a texture for reflection maps (e.g. with the Reflective builtin shaders), you need to use Cubemap Textures.

Anisotropic filtering
Anisotropic filtering increases texture quality when viewed from a grazing angle, at some expense of rendering cost (the cost is entirely on the graphics card). Increasing the anisotropy level is usually a good idea for ground and floor textures. In Quality Settings anisotropic filtering can be forced for all textures or disabled completely.
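The "33% more memory" figure in the Mip Maps section above can be checked with a short sketch. It assumes the common convention that each mip level halves the width and height of the previous one until a 1x1 level is reached; that convention and the function names are assumptions of this example, not something stated in the text.

```python
def mip_chain_pixels(width: int, height: int) -> int:
    """Total pixel count of a texture plus all of its mip levels down to 1x1."""
    total = 0
    while True:
        total += width * height
        if width == 1 and height == 1:
            break
        # Each level halves both dimensions, never going below 1.
        width = max(1, width // 2)
        height = max(1, height // 2)
    return total


def mip_overhead(width: int, height: int) -> float:
    """Extra memory used by the mip levels, as a fraction of the base image."""
    base = width * height
    return (mip_chain_pixels(width, height) - base) / base


if __name__ == "__main__":
    print(f"{mip_overhead(256, 256):.2%}")
```

Each level holds a quarter of the pixels of the one before it, so the extra cost is 1/4 + 1/16 + 1/64 + ..., which approaches one third of the base image.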
No anisotropy (left) / Maximum anisotropy (right) used on the ground texture

-------------------------------------------------------------------------------------------------
Optimizing Graphics Performance in Unity (iOS)
http://unity3d.com/support/documentation/Manual/Optimizing Graphics Performance.html

Alpha-Testing
Use "alpha-blend" materials instead of "alpha-test" materials wherever possible.
"Alpha-blend" materials: materials with blending effects, such as Additive and Multiply (standard material library > Mobile > Particles > ...).
"Alpha-test" materials: the materials with transparency in Unity.

Polygon Vertices
Keep the number of vertices displayed per frame below 40,000 (iPhone 3GS).

Lighting
Lights greatly increase the rendering load; avoid using them where possible.

Optimizing Polygon Models
-1- Do not use any more triangles than necessary.
-2- Keep the number of UV mapping seams and hard edges (i.e., doubled-up vertices) as low as possible.
Note that the number of vertices the graphics hardware has to process is usually not the same as the number shown in 3D software. Modeling software usually displays the sum of the distinct corner points that make up a model. The graphics card, however, sometimes has to split one vertex into two or more logical vertices in order to render it. A vertex is split if it has multiple normals, UV coordinates or vertex colors. The vertex count in Unity is therefore usually much higher than the count shown in the 3D software.

Texture Compression
Using iOS's native PVRT compression formats reduces the size of your textures (giving faster load times and a smaller memory footprint).
If PVRT compression does not look good enough (for example, color gradients show visible banding), use 16-bit textures instead.

Complex Mathematical Operations (omitted)
Floating Point Operations (omitted)

Light Baking
Use Unity's built-in Lightmap tool to bake static lights onto the models.

Sharing Materials
Combine multiple textures into a single texture where possible, and let as many objects as possible share the same material.

To sum up, the following points will speed up your game:
Use combiners or pixel shaders to mix several textures per fragment instead of a multi-pass approach. (What does this mean?)
Learn to use occlusion culling to reduce draw calls.
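Use a skybox to fake distant scenery.

=======================================================
Optimizing Characters
http://unity3d.com/support/documentation/Manual/Modeling Optimized Characters.html

Use a single skinned mesh per character.
Use as few materials as possible.
Use as few bones as possible (fewer than 30 on mobile devices).

Polygon Count
On mobile devices, keep each mesh between 300 and 1500 triangles.
On desktop computers, keep each mesh between 1500 and 4000 triangles (for example, Half-Life 2 used 2500-5000 triangles per character). AAA games on PS3 and Xbox use 5000-7000 triangles.

Animation
After baking the animation, remove the IK system.

========================================================
Draw Call Batching
http://unity3d.com/support/documentation/Manual/iphone-DrawCall-Batching.html

To draw 3D graphics on the screen, the engine has to issue draw calls to the graphics API, and every draw call requires a significant amount of work.
Unity combines multiple objects and draws them in a single draw call. This is called "batching". The more objects are batched together, the better the performance.
Unity's built-in batching has advantages over simply merging meshes in a modeling tool: the engine still culls each object individually, and the total amount of geometry rendered stays the same.

Materials
Only objects sharing the same material can be batched together, so let as many objects as possible share the same material.
If two materials differ only in their textures, you can combine those textures into a single texture, so that a single material is enough.

Dynamic Batching
Unity automatically batches moving objects that use the same material.
Dynamic batching is only applied to meshes with fewer than 900 vertices.
If your material uses vertex position (this may refer to height maps), normals and a single UV,
Do not use scaling: objects with scale (1,1,1) and (2,2,2) cannot be batched together.

Static Batching
Static batching lets the engine reduce draw calls, and it is more efficient than dynamic batching.
Objects that do not move have to be explicitly marked as static.
Static batching requires additional memory to store the combined geometry. If several objects shared the same geometry before static batching, an extra copy of the geometry is created for each object, either in the Editor or at run time. So sometimes you have to sacrifice rendering performance to save memory.

==================================================================
Using the Profiler to monitor engine performance (omitted)

==================================================================
Texture Compression
iOS supports only the PVRTC texture compression format. PVRTC supports both RGB and RGBA formats, and it reduces memory usage.

Shared Memory
The CPU and GPU share memory. The more memory graphics use, the less memory is available to the game.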
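Going back to the Draw Call Batching notes above, the rules for dynamic batching (same material, fewer than 900 vertices, same scale) can be made concrete with a rough sketch. This is not Unity's actual batching algorithm: the SceneObject class, the function names and the way draw calls are counted are all hypothetical, and only the three rules themselves come from the notes.

```python
from collections import defaultdict
from dataclasses import dataclass

MAX_DYNAMIC_BATCH_VERTICES = 900  # limit quoted in the notes above


@dataclass(frozen=True)
class SceneObject:
    """Hypothetical stand-in for a renderable object."""
    name: str
    material: str
    vertex_count: int
    scale: tuple = (1, 1, 1)


def group_dynamic_batches(objects):
    """Group objects that could share a draw call under the rules in the notes."""
    batches = defaultdict(list)
    unbatched = []
    for obj in objects:
        if obj.vertex_count < MAX_DYNAMIC_BATCH_VERTICES:
            # Same material and same scale end up in the same batch.
            batches[(obj.material, obj.scale)].append(obj.name)
        else:
            unbatched.append(obj.name)
    return dict(batches), unbatched


def estimate_draw_calls(objects):
    """One draw call per batch plus one per object that cannot be batched."""
    batches, unbatched = group_dynamic_batches(objects)
    return len(batches) + len(unbatched)


if __name__ == "__main__":
    scene = [
        SceneObject("crate_a", "wood", 200),
        SceneObject("crate_b", "wood", 300),
        SceneObject("crate_big", "wood", 200, scale=(2, 2, 2)),
        SceneObject("rock", "stone", 1200),
        SceneObject("barrel", "metal", 100),
    ]
    print(estimate_draw_calls(scene), "draw calls for", len(scene), "objects")
```

In this toy scene the two unscaled wooden crates share a batch, while the scaled crate and the high-vertex rock each cost their own draw call. That is the practical reason the notes recommend sharing materials and avoiding per-object scaling.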
(Source: http://www.xue5.com/Mobile/Mobile/614988.html)