[Repost] Faster Alternatives to glReadPixels and glTexImage2D in OpenGL ES

Posted on 2014-9-22 10:46:26

While developing Shou, I have been using GLSL together with NEON to handle image rotation, scaling, and color conversion before sending frames to the video encoder.

So I need a very efficient way to transfer pixels between OpenGL and main memory. The performance of glTexImage2D and glReadPixels is unacceptable, especially on devices from certain vendors, e.g. Samsung Galaxy devices with Exynos chips.

Compared to glTex(Sub)Image2D, glReadPixels is the real bottleneck: it stalls the entire OpenGL pipeline and causes a delay of about 100 ms when reading back a standard 720p frame.

Here I will share two standard OpenGL approaches to much faster pixel packing, which should be available on all OpenGL implementations. Only glReadPixels is discussed, since glTexImage2D is used in the same way.

Pixel Buffer Object

PBOs were not introduced until OpenGL ES 3.0, which has been available since Android 4.3. With a PBO, the pixel pack operation drops to about 5 ms.

A PBO is created just like any other buffer object:
    GLuint pbo_id;
    glGenBuffers(1, &pbo_id);
    glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo_id);
    // Allocate storage only; no initial data is uploaded.
    glBufferData(GL_PIXEL_PACK_BUFFER, pbo_size, NULL, GL_DYNAMIC_READ);
    glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);
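The post never says how pbo_size is chosen. For a tightly packed RGBA frame it would be width × height × 4 bytes. The following is a minimal sketch, not the author's code: the helper name packed_frame_size is hypothetical, and it assumes rows are padded to the GL_PACK_ALIGNMENT value (4 by default), which gives a size that is always large enough.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the original post): a buffer size large
 * enough to hold a frame written by glReadPixels when each row is padded up
 * to a multiple of pack_alignment (the GL_PACK_ALIGNMENT value: 1, 2, 4 or 8). */
static size_t packed_frame_size(size_t width, size_t height,
                                size_t bytes_per_pixel, size_t pack_alignment)
{
    assert(pack_alignment == 1 || pack_alignment == 2 ||
           pack_alignment == 4 || pack_alignment == 8);
    size_t row = width * bytes_per_pixel;
    /* Round the row length up to the next multiple of pack_alignment. */
    size_t padded_row = (row + pack_alignment - 1) / pack_alignment * pack_alignment;
    return padded_row * height;
}
```

For a 1280 × 720 frame read as GL_RGBA / GL_UNSIGNED_BYTE (4 bytes per pixel) this gives 3,686,400 bytes. RGBA rows are already multiples of 4 bytes, so the default alignment adds no padding.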
According to the glReadPixels reference:
If a non-zero named buffer object is bound to the GL_PIXEL_PACK_BUFFER target (see glBindBuffer) while a block of pixels is requested, data is treated as a byte offset into the buffer object’s data store rather than a pointer to client memory.
When we need to read pixels from an FBO:
    glReadBuffer(GL_COLOR_ATTACHMENT0);
    glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo_id);
    // With a PBO bound, the last argument is a byte offset into the PBO.
    glReadPixels(0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, 0);
    GLubyte *ptr = (GLubyte *)glMapBufferRange(GL_PIXEL_PACK_BUFFER, 0, pbo_size, GL_MAP_READ_BIT);
    if (ptr != NULL) {
        memcpy(pixels, ptr, pbo_size);
        glUnmapBuffer(GL_PIXEL_PACK_BUFFER);
    }
    glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);
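In a real project, consider using two or three PBOs to improve performance: mapping a PBO immediately after glReadPixels still has to wait for the transfer to finish, whereas mapping one that was filled on an earlier frame usually does not.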
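One way to organize double or triple PBOs is as a small ring: each frame issues glReadPixels into one PBO and maps the oldest one, which was filled on an earlier frame. The sketch below covers only the index bookkeeping, with no GL calls. The type pbo_ring and its functions are hypothetical names, not part of OpenGL ES or the original post.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping for a ring of PBOs (not from the original post). */
typedef struct {
    size_t count;  /* number of PBOs in the ring, e.g. 2 or 3 */
    size_t frame;  /* number of frames issued so far */
} pbo_ring;

/* Index of the PBO to bind for this frame's glReadPixels. */
static size_t pbo_ring_write_index(const pbo_ring *r)
{
    assert(r->count > 0);
    return r->frame % r->count;
}

/* Index of the oldest filled PBO to map this frame,
 * or -1 while the ring is still filling up. */
static long pbo_ring_read_index(const pbo_ring *r)
{
    assert(r->count > 0);
    if (r->frame + 1 < r->count) {
        return -1;
    }
    return (long)((r->frame + 1) % r->count);
}

/* Call once per frame, after issuing the read and mapping the old PBO. */
static void pbo_ring_advance(pbo_ring *r)
{
    r->frame++;
}
```

Per frame, the intended order would be: bind the PBO at the write index and call glReadPixels with offset 0, then, if the read index is not -1, bind and map that PBO and copy its contents, then advance the ring. The readback therefore lags the rendering by count - 1 frames.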

EGLImage

EGL_KHR_image_base is a complete EGL extension that achieves the same performance as a PBO but only requires OpenGL ES 1.1 or 2.0.

The function to create an EGLImageKHR is:
    EGLImageKHR eglCreateImageKHR(EGLDisplay dpy,
                                  EGLContext ctx,
                                  EGLenum target,
                                  EGLClientBuffer buffer,
                                  const EGLint *attrib_list);
The Android EGL implementation in frameworks/native/opengl/libagl/egl.cpp implies that the EGLDisplay should be a valid display, the EGLClientBuffer should be an ANativeWindowBuffer, the EGLContext can only be EGL_NO_CONTEXT, and the target can only be EGL_NATIVE_BUFFER_ANDROID.

All the other parameters are obvious, except for ANativeWindowBuffer, which is defined in system/core/include/system/window.h.

To allocate an ANativeWindowBuffer, Android has a simple wrapper called GraphicBuffer, defined in frameworks/native/include/ui/GraphicBuffer.h.
    GraphicBuffer *window = new GraphicBuffer(width, height, PIXEL_FORMAT_RGBA_8888, GraphicBuffer::USAGE_SW_READ_OFTEN | GraphicBuffer::USAGE_HW_TEXTURE);
    struct ANativeWindowBuffer *buffer = window->getNativeBuffer();
    // The original left attribs undefined; this attribute list is an assumed typical value.
    EGLint attribs[] = { EGL_IMAGE_PRESERVED_KHR, EGL_TRUE, EGL_NONE };
    EGLImageKHR image = eglCreateImageKHR(eglGetCurrentDisplay(), EGL_NO_CONTEXT, EGL_NATIVE_BUFFER_ANDROID, (EGLClientBuffer)buffer, attribs);
Then, whenever we want to read pixels from an FBO, we use one of the two functions below:
    void glEGLImageTargetTexture2DOES(GLenum target, GLeglImageOES image);
    void glEGLImageTargetRenderbufferStorageOES(GLenum target, GLeglImageOES image);
These two functions establish all the properties of the target GL_TEXTURE_2D or GL_RENDERBUFFER.
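    uint8_t *ptr;
    glBindTexture(GL_TEXTURE_2D, texture_id);
    // Back the texture with the GraphicBuffer's memory.
    glEGLImageTargetTexture2DOES(GL_TEXTURE_2D, image);
    // lock() takes a void **; it returns a CPU-readable pointer to the buffer.
    window->lock(GraphicBuffer::USAGE_SW_READ_OFTEN, (void **)&ptr);
    // This copy assumes the buffer's row stride equals width.
    memcpy(pixels, ptr, width * height * 4);
    window->unlock();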
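The single memcpy above is only correct when the locked buffer's rows are tightly packed. Gralloc buffers may use a row stride larger than the width (ANativeWindowBuffer carries a stride field, measured in pixels), in which case the copy has to go row by row. The following is a minimal sketch under that assumption. The helper name copy_rows_tight is hypothetical and not part of any Android API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the original post): copy a locked buffer whose
 * rows are src_stride_px pixels apart into a tightly packed destination. */
static void copy_rows_tight(uint8_t *dst, const uint8_t *src,
                            size_t width, size_t height,
                            size_t src_stride_px, size_t bytes_per_pixel)
{
    assert(src_stride_px >= width);
    size_t row_bytes = width * bytes_per_pixel;
    size_t stride_bytes = src_stride_px * bytes_per_pixel;
    for (size_t y = 0; y < height; ++y) {
        /* Copy only the visible pixels of each row, skipping the padding. */
        memcpy(dst + y * row_bytes, src + y * stride_bytes, row_bytes);
    }
}
```

With the names used above, a call might look like copy_rows_tight(pixels, ptr, width, height, buffer->stride, 4), assuming the RGBA_8888 format at 4 bytes per pixel.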
References

GL_PIXEL_PACK_BUFFER: http://www.khronos.org/opengles/ ... lMapBufferRange.xml
EGL_KHR_image_base: http://www.khronos.org/registry/ ... _KHR_image_base.txt
GL_OES_EGL_image: http://www.khronos.org/registry/ ... S/OES_EGL_image.txt
Using direct textures on Android: http://snorp.net/2011/12/16/android-direct-texture.html
Using OpenGL ES to Accelerate Apps with Legacy 2D GUIs: http://software.intel.com/en-us/ ... with-legacy-2d-guis
iOS solution: http://stackoverflow.com/questio ... phone-opengl-es-2-0

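Original poster | Posted on 2016-5-15 05:39:07
The principle is simply to create multiple PBOs, so that the data of frames rendered earlier can be transferred back to memory without waiting for the GPU to finish its entire queue.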

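Original poster | Posted on 2018-3-20 13:39:58
After testing this in practice, the results are rather unpredictable. Sometimes multiple PBOs bring no performance gain, and sometimes they do.
Still, in principle I recommend using multiple PBOs.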

Posted on 2019-12-15 11:35:25
Nice. So you work on graphics programming too?