Android: Getting the SPS from an H.264 Bitstream
2015-11-30 20:56
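This article should be very helpful for anyone who wants to understand how to extract the SPS parameters from an H.264 bitstream but is not very familiar with H.264.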
This is a follow-up to my World’s Smallest h.264 Encoder post. I’ve received several emails asking for precise details about
two entities in the h.264 bitstream: the Sequence Parameter Set (SPS) and the Picture Parameter Set (PPS). Both entities contain information that an h.264 decoder needs to decode the video data, for example the resolution and frame rate of the video.
Recall that an h.264 bitstream contains a sequence of Network Abstraction Layer (NAL) units. The SPS and PPS
are both types of NAL units. The SPS NAL unit contains parameters that apply to a series of consecutive coded video pictures, referred to as a “coded video sequence” in the h.264 standard. The PPS NAL unit contains parameters that apply to the decoding of
one or more individual pictures inside a coded video sequence.
In the case of my simple encoder, we emitted a single SPS and PPS at the start of the video data stream, but in the case of a more complex encoder, it would not be uncommon to see them inserted periodically in the data for two reasons—first, often a decoder
will need to start decoding mid-stream, and second, because the encoder may wish to vary parameters for different parts of the stream in order to achieve better compression or quality goals.
In my trivial encoder, the h.264 SPS and PPS were hardcoded in hex as:
/* h.264 bitstreams */
const uint8_t sps[] =
{0x00, 0x00, 0x00, 0x01, 0x67, 0x42, 0x00, 0x0a, 0xf8, 0x41, 0xa2};
const uint8_t pps[] =
{0x00, 0x00, 0x00, 0x01, 0x68, 0xce, 0x38, 0x80};
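Each array starts with the four-byte start code 00 00 00 01, followed by the one-byte NAL unit header. As a rough sketch of how that header byte splits into fields (the names nal_header_t and parse_nal_header are made up for illustration, and only the four-byte form of the start code is handled):

```c
#include <stddef.h>
#include <stdint.h>

/* Fields of the one-byte NAL unit header (field names follow the spec). */
typedef struct {
    unsigned forbidden_zero_bit; /* u(1), must be 0 */
    unsigned nal_ref_idc;        /* u(2) */
    unsigned nal_unit_type;      /* u(5): 7 = SPS, 8 = PPS */
} nal_header_t;

/* Hypothetical helper: returns 0 on success, or -1 if buf does not
   begin with the four-byte start code 00 00 00 01. */
static int parse_nal_header(const uint8_t *buf, size_t len, nal_header_t *out)
{
    if (len < 5 || buf[0] != 0 || buf[1] != 0 || buf[2] != 0 || buf[3] != 1)
        return -1;
    out->forbidden_zero_bit = (buf[4] >> 7) & 0x01;
    out->nal_ref_idc        = (buf[4] >> 5) & 0x03;
    out->nal_unit_type      =  buf[4]       & 0x1f;
    return 0;
}
```

For the arrays above, the header bytes 0x67 and 0x68 both carry nal_ref_idc 3, with nal_unit_type 7 (SPS) and 8 (PPS) respectively.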
Let’s decode this into something readable from the spec. The first thing I did was to look at section 7 of the h.264 specification. I saw that at a minimum I had to choose how to fill in the SPS parameters in the table below. In the table, as in the standard, the type u(n) indicates an unsigned integer of n bits, and ue(v) indicates an unsigned Exp-Golomb-coded value with a variable number of bits. The spec doesn’t seem to define the maximum number of bits anywhere, but the reference encoder software uses 32. (People wishing to explore the security of decoder software may find it interesting to violate this assumption!)
Parameter Name | Type | Value | Comments |
forbidden_zero_bit | u(1) | 0 | Despite being forbidden, it must be set to 0! |
nal_ref_idc | u(2) | 3 | 3 means it is “important” (this is an SPS) |
nal_unit_type | u(5) | 7 | Indicates this is a sequence parameter set |
profile_idc | u(8) | 66 | Baseline profile |
constraint_set0_flag | u(1) | 0 | We’re not going to honor constraints |
constraint_set1_flag | u(1) | 0 | We’re not going to honor constraints |
constraint_set2_flag | u(1) | 0 | We’re not going to honor constraints |
constraint_set3_flag | u(1) | 0 | We’re not going to honor constraints |
reserved_zero_4bits | u(4) | 0 | Better set them to zero |
level_idc | u(8) | 10 | Level 1, sec A.3.1 |
seq_parameter_set_id | ue(v) | 0 | We’ll just use id 0. |
log2_max_frame_num_minus4 | ue(v) | 0 | Let’s have as few frame numbers as possible |
pic_order_cnt_type | ue(v) | 0 | Keep things simple |
log2_max_pic_order_cnt_lsb_minus4 | ue(v) | 0 | Fewer is better. |
num_ref_frames | ue(v) | 0 | We will only send I slices |
gaps_in_frame_num_value_allowed_flag | u(1) | 0 | We will have no gaps |
pic_width_in_mbs_minus1 | ue(v) | 7 | SQCIF is 8 macroblocks wide |
pic_height_in_map_units_minus1 | ue(v) | 5 | SQCIF is 6 macroblocks high |
frame_mbs_only_flag | u(1) | 1 | We will not do field/frame encoding |
direct_8x8_inference_flag | u(1) | 0 | Used for B slices. We will not send B slices |
frame_cropping_flag | u(1) | 0 | We will not do frame cropping |
vui_parameters_present_flag | u(1) | 0 | We will not send VUI data |
rbsp_stop_one_bit | u(1) | 1 | Stop bit. I missed this at first and it caused me much trouble. |
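The two types used in the table can be read with a small bit reader. The following is only a sketch under some assumptions: the names bitreader_t, read_u and read_ue are hypothetical, bits past the end of the buffer read as zero, an Exp-Golomb prefix longer than 31 zero bits is treated as an error, and emulation prevention bytes (00 00 03) are not removed.

```c
#include <stddef.h>
#include <stdint.h>

/* Minimal MSB-first bit reader over a byte buffer. */
typedef struct {
    const uint8_t *data;
    size_t size;    /* buffer length in bytes */
    size_t bitpos;  /* index of the next bit to read */
} bitreader_t;

/* u(n): read n bits (n <= 32), most significant bit first.
   Bits past the end of the buffer read as 0 in this sketch. */
static uint32_t read_u(bitreader_t *br, unsigned n)
{
    uint32_t v = 0;
    while (n--) {
        uint32_t bit = 0;
        if (br->bitpos < br->size * 8)
            bit = (br->data[br->bitpos / 8] >> (7 - br->bitpos % 8)) & 1u;
        v = (v << 1) | bit;
        br->bitpos++;
    }
    return v;
}

/* ue(v): count leading zero bits up to the first 1 bit, then read that
   many suffix bits. Value = 2^zeros - 1 + suffix. */
static uint32_t read_ue(bitreader_t *br)
{
    unsigned zeros = 0;
    while (read_u(br, 1) == 0) {
        if (++zeros > 31)
            return UINT32_MAX;  /* malformed or truncated input */
    }
    return ((1u << zeros) - 1u) + read_u(br, zeros);
}
```

Feeding it the last three bytes of the sps array (0xf8, 0x41, 0xa2) yields five ue(v) values of 0, a single 0 flag, and then 7 and 5 for the width and height fields, matching the table.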
Some key things here are the profile (profile_idc) and level (level_idc) that I chose, and the picture width and height. If you encode the above table in hex, you will get the values in the SPS array declared above.
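To check that claim, here is one way the table could be turned back into bytes. It is a sketch rather than the encoder's actual code: bitwriter_t, write_u, write_ue and build_sps are hypothetical names, the buffer size is arbitrary, and no emulation prevention bytes are inserted (none are needed for these values).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Minimal MSB-first bit writer; buf must start out zeroed. */
typedef struct {
    uint8_t buf[32];
    size_t bitpos;  /* index of the next bit to write */
} bitwriter_t;

/* u(n): append the low n bits of v (n <= 32), most significant first. */
static void write_u(bitwriter_t *bw, unsigned n, uint32_t v)
{
    while (n--) {
        if (bw->bitpos < sizeof bw->buf * 8 && ((v >> n) & 1u))
            bw->buf[bw->bitpos / 8] |= (uint8_t)(0x80u >> (bw->bitpos % 8));
        bw->bitpos++;
    }
}

/* ue(v): write v + 1 in binary, preceded by (bit length - 1) zero bits.
   Assumes v < UINT32_MAX. */
static void write_ue(bitwriter_t *bw, uint32_t v)
{
    uint32_t code = v + 1;
    unsigned len = 0;
    for (uint32_t t = code; t != 0; t >>= 1)
        len++;
    write_u(bw, len - 1, 0);
    write_u(bw, len, code);
}

/* Writes the start code plus the SPS from the table into out.
   Returns the number of bytes written, or 0 if out is too small. */
static size_t build_sps(uint8_t *out, size_t cap)
{
    static const uint8_t start_code[] = {0x00, 0x00, 0x00, 0x01};
    bitwriter_t bw;
    size_t nbytes;

    memset(&bw, 0, sizeof bw);
    write_u(&bw, 1, 0);   /* forbidden_zero_bit */
    write_u(&bw, 2, 3);   /* nal_ref_idc */
    write_u(&bw, 5, 7);   /* nal_unit_type */
    write_u(&bw, 8, 66);  /* profile_idc */
    write_u(&bw, 4, 0);   /* constraint_set0..3_flag */
    write_u(&bw, 4, 0);   /* reserved_zero_4bits */
    write_u(&bw, 8, 10);  /* level_idc */
    write_ue(&bw, 0);     /* seq_parameter_set_id */
    write_ue(&bw, 0);     /* log2_max_frame_num_minus4 */
    write_ue(&bw, 0);     /* pic_order_cnt_type */
    write_ue(&bw, 0);     /* log2_max_pic_order_cnt_lsb_minus4 */
    write_ue(&bw, 0);     /* num_ref_frames */
    write_u(&bw, 1, 0);   /* gaps_in_frame_num_value_allowed_flag */
    write_ue(&bw, 7);     /* pic_width_in_mbs_minus1 */
    write_ue(&bw, 5);     /* pic_height_in_map_units_minus1 */
    write_u(&bw, 1, 1);   /* frame_mbs_only_flag */
    write_u(&bw, 1, 0);   /* direct_8x8_inference_flag */
    write_u(&bw, 1, 0);   /* frame_cropping_flag */
    write_u(&bw, 1, 0);   /* vui_parameters_present_flag */
    write_u(&bw, 1, 1);   /* rbsp_stop_one_bit */

    /* The zeroed buffer already supplies the trailing alignment bits. */
    nbytes = (bw.bitpos + 7) / 8;
    if (sizeof start_code + nbytes > cap)
        return 0;
    memcpy(out, start_code, sizeof start_code);
    memcpy(out + sizeof start_code, bw.buf, nbytes);
    return sizeof start_code + nbytes;
}
```

The table occupies 55 bits, so one zero alignment bit pads it to 7 bytes, and with the start code the result is the same 11 bytes as the sps array.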
A question I got a couple of times in email was about the width and height parameters: specifically, what to do if the picture width or height is not an integer multiple of macroblock size. Recall that, for the 4:2:0 sampling scheme in my encoder, a macroblock consists of 16×16 luma samples. In this case, you would set the frame_cropping_flag to 1, and reduce the number of pixels in the horizontal and vertical direction with the frame_crop_left_offset, frame_crop_right_offset, frame_crop_top_offset, and frame_crop_bottom_offset parameters, which are conditionally present in the bitstream only if the frame_cropping_flag is set to one.
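As an illustration of the arithmetic, the sketch below derives the size and cropping fields for a given picture size. It assumes 4:2:0 sampling with frame_mbs_only_flag set to 1, in which case each crop offset counts units of 2 luma samples, and it crops only from the right and bottom edges, which is a choice made here rather than a requirement. The names sps_dims_t and dims_for are hypothetical.

```c
#include <stdint.h>

/* Size-related SPS fields (field names follow the spec). */
typedef struct {
    uint32_t pic_width_in_mbs_minus1;
    uint32_t pic_height_in_map_units_minus1;
    uint32_t frame_cropping_flag;
    uint32_t frame_crop_right_offset;
    uint32_t frame_crop_bottom_offset;
} sps_dims_t;

/* Assumes 4:2:0 and frame_mbs_only_flag = 1, so one crop unit is
   2 luma samples in each direction. width and height must be even
   and nonzero. The left and top crop offsets are left at 0. */
static sps_dims_t dims_for(uint32_t width, uint32_t height)
{
    sps_dims_t d;
    uint32_t mbs_w = (width + 15) / 16;   /* round up to whole macroblocks */
    uint32_t mbs_h = (height + 15) / 16;

    d.pic_width_in_mbs_minus1 = mbs_w - 1;
    d.pic_height_in_map_units_minus1 = mbs_h - 1;
    d.frame_crop_right_offset = (mbs_w * 16 - width) / 2;
    d.frame_crop_bottom_offset = (mbs_h * 16 - height) / 2;
    d.frame_cropping_flag =
        (d.frame_crop_right_offset || d.frame_crop_bottom_offset) ? 1 : 0;
    return d;
}
```

For example, 1920×1080 needs 68 macroblock rows (1088 luma rows), so the bottom 8 rows are cropped with frame_crop_bottom_offset set to 4, while the 128×96 SQCIF picture from the table needs no cropping.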
One interesting problem that we see fairly often with h.264 is when the container format (MP4, MOV, etc.) contains different values for some of these parameters than the SPS and PPS. In this case, we find different video players handle the streams differently.
A handy tool for decoding h.264 bitstreams, including the SPS, is the h264bitstream tool. It comes with a command line
program that decodes a bitstream to the parameter names defined in the h.264 specification. Let’s look at its output for a sample mp4 file I downloaded from youtube.
First, I extract the h.264 NAL units from the file using ffmpeg:
ffmpeg.exe -i "Old Faithful.mp4" -vcodec copy -bsf:v h264_mp4toannexb -an of.h264
The NAL units now reside in the file of.h264. I then run the h264_analyze command from the h264bitstream package to produce the following output:
h264_analyze of.h264
!! Found NAL at offset 4 (0x0004), size 25 (0x0019)
==================== NAL ====================
forbidden_zero_bit : 0
nal_ref_idc : 3
nal_unit_type : 7 ( Sequence parameter set )
======= SPS =======
profile_idc : 100
constraint_set0_flag : 0
constraint_set1_flag : 0
constraint_set2_flag : 0
constraint_set3_flag : 0
reserved_zero_4bits : 0
level_idc : 31
seq_parameter_set_id : 0
chroma_format_idc : 1
residual_colour_transform_flag : 0
bit_depth_luma_minus8 : 0
bit_depth_chroma_minus8 : 0
qpprime_y_zero_transform_bypass_flag : 0
seq_scaling_matrix_present_flag : 0
log2_max_frame_num_minus4 : 3
pic_order_cnt_type : 0
log2_max_pic_order_cnt_lsb_minus4 : 3
delta_pic_order_always_zero_flag : 0
offset_for_non_ref_pic : 0
offset_for_top_to_bottom_field : 0
num_ref_frames_in_pic_order_cnt_cycle : 0
num_ref_frames : 1
gaps_in_frame_num_value_allowed_flag : 0
pic_width_in_mbs_minus1 : 79
pic_height_in_map_units_minus1 : 44
frame_mbs_only_flag : 1
mb_adaptive_frame_field_flag : 0
direct_8x8_inference_flag : 1
frame_cropping_flag : 0
frame_crop_left_offset : 0
frame_crop_right_offset : 0
frame_crop_top_offset : 0
frame_crop_bottom_offset : 0
vui_parameters_present_flag : 1
=== VUI ===
aspect_ratio_info_present_flag : 1
aspect_ratio_idc : 1
sar_width : 0
sar_height : 0
overscan_info_present_flag : 0
overscan_appropriate_flag : 0
video_signal_type_present_flag : 0
video_signal_type_present_flag : 0
video_format : 0
video_full_range_flag : 0
colour_description_present_flag : 0
colour_primaries : 0
transfer_characteristics : 0
matrix_coefficients : 0
chroma_loc_info_present_flag : 0
chroma_sample_loc_type_top_field : 0
chroma_sample_loc_type_bottom_field : 0
timing_info_present_flag : 1
num_units_in_tick : 100
time_scale : 5994
fixed_frame_rate_flag : 1
nal_hrd_parameters_present_flag : 0
vcl_hrd_parameters_present_flag : 0
low_delay_hrd_flag : 0
pic_struct_present_flag : 0
bitstream_restriction_flag : 1
motion_vectors_over_pic_boundaries_flag : 1
max_bytes_per_pic_denom : 0
max_bits_per_mb_denom : 0
log2_max_mv_length_horizontal : 11
log2_max_mv_length_vertical : 11
num_reorder_frames : 0
max_dec_frame_buffering : 1
=== HRD ===
cpb_cnt_minus1 : 0
bit_rate_scale : 0
cpb_size_scale : 0
initial_cpb_removal_delay_length_minus1 : 0
cpb_removal_delay_length_minus1 : 0
dpb_output_delay_length_minus1 : 0
time_offset_length : 0
The only additional thing I’d like to point out here is that this particular SPS also contains information about the frame rate of the video (see timing_info_present_flag). These parameters must be closely checked when you generate bitstreams to ensure they agree with the container format that the h.264 will eventually be muxed into. Even a small error, such as 29.97 fps in one place and 30 fps in another, can result in severe audio/video synchronization problems.
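To make the numbers in the dump concrete, here is a small sketch of how the resolution and frame rate are commonly derived from these fields. It assumes frame_mbs_only_flag and fixed_frame_rate_flag are both 1 and that there is no cropping, as in the output above. With those assumptions, each frame spans two ticks, hence the factor of 2. The function names are hypothetical.

```c
#include <stdint.h>

/* Luma width in samples, before any cropping. */
static uint32_t sps_luma_width(uint32_t pic_width_in_mbs_minus1)
{
    return (pic_width_in_mbs_minus1 + 1) * 16;
}

/* Luma height in samples, assuming frame_mbs_only_flag = 1 and no cropping. */
static uint32_t sps_luma_height(uint32_t pic_height_in_map_units_minus1)
{
    return (pic_height_in_map_units_minus1 + 1) * 16;
}

/* Frames per second from the VUI timing info, assuming
   fixed_frame_rate_flag = 1. Returns 0.0 if num_units_in_tick is 0. */
static double sps_frame_rate(uint32_t num_units_in_tick, uint32_t time_scale)
{
    if (num_units_in_tick == 0)
        return 0.0;
    return (double)time_scale / (2.0 * (double)num_units_in_tick);
}
```

With the values from the dump (79, 44, num_units_in_tick 100, time_scale 5994), this gives 1280×720 at 29.97 frames per second, which is the figure to compare against the container.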
Next time I will write about the h.264 Picture Parameter Set (PPS).
Tags: h.264, Video