
PCM formats in FFmpeg

2016-04-20 18:20
FFmpeg enumerates its PCM sample formats as follows:

enum AVSampleFormat {
    AV_SAMPLE_FMT_NONE = -1,
    AV_SAMPLE_FMT_U8,   ///< unsigned 8 bits
    AV_SAMPLE_FMT_S16,  ///< signed 16 bits
    AV_SAMPLE_FMT_S32,  ///< signed 32 bits
    AV_SAMPLE_FMT_FLT,  ///< float
    AV_SAMPLE_FMT_DBL,  ///< double

    AV_SAMPLE_FMT_U8P,  ///< unsigned 8 bits, planar
    AV_SAMPLE_FMT_S16P, ///< signed 16 bits, planar
    AV_SAMPLE_FMT_S32P, ///< signed 32 bits, planar
    AV_SAMPLE_FMT_FLTP, ///< float, planar
    AV_SAMPLE_FMT_DBLP, ///< double, planar

    AV_SAMPLE_FMT_NB    ///< Number of sample formats. DO NOT USE if linking dynamically
};

Formats in OpenAL:

/** Sound samples: format specifier. */
#define AL_FORMAT_MONO8    0x1100
#define AL_FORMAT_MONO16   0x1101
#define AL_FORMAT_STEREO8  0x1102
#define AL_FORMAT_STEREO16 0x1103

WAVEFORMATEX

The WAVEFORMATEX structure specifies the data format of a wave audio stream.

typedef struct {
    WORD  wFormatTag;
    WORD  nChannels;
    DWORD nSamplesPerSec;
    DWORD nAvgBytesPerSec;
    WORD  nBlockAlign;
    WORD  wBitsPerSample;
    WORD  cbSize;
} WAVEFORMATEX, *PWAVEFORMATEX;

Members

wFormatTag
Specifies the waveform audio format type. For more information, see the Comments section below.

nChannels
Specifies the number of channels of audio data. For monophonic audio, set this member to 1. For stereo, set this member to 2.

nSamplesPerSec
Specifies the sample frequency at which each channel should be played or recorded. If wFormatTag = WAVE_FORMAT_PCM, then common values for nSamplesPerSec are 8.0 kHz, 11.025 kHz, 22.05 kHz, and 44.1 kHz. For example, to specify a sample frequency of 11.025 kHz, set nSamplesPerSec to 11025. For non-PCM formats, this member should be computed according to the manufacturer's specification of the format tag.

nAvgBytesPerSec
Specifies the required average data transfer rate in bytes per second. This value is useful for estimating buffer size.

nBlockAlign
Specifies the block alignment in bytes. The block alignment is the size of the minimum atomic unit of data for the wFormatTag format type. If wFormatTag = WAVE_FORMAT_PCM, set nBlockAlign to (nChannels*wBitsPerSample)/8, which is the size of a single audio frame. For non-PCM formats, this member should be computed according to the manufacturer's specification for the format tag.
Playback and record software should process a multiple of nBlockAlign bytes of data at a time. Data written to and read from a device should always start at the beginning of a block.

wBitsPerSample
Specifies the number of bits per sample for the format type specified by wFormatTag, that is, the size of one sample of one channel. If wFormatTag = WAVE_FORMAT_PCM, then wBitsPerSample should be set to either 8 or 16. For non-PCM formats, this member should be set according to the manufacturer's specification for the format tag. Some compression schemes cannot define a value for wBitsPerSample. In this case, set wBitsPerSample to zero.

cbSize
Specifies the size, in bytes, of extra format information appended to the end of the WAVEFORMATEX structure. This information can be used by non-PCM formats to store extra attributes for the wFormatTag. If no extra information is required by the wFormatTag, this member must be set to zero. For WAVE_FORMAT_PCM formats, this member is ignored.

Comments

As a self-contained format descriptor, the WAVEFORMATEX structure is obsolete and has been replaced by the WAVEFORMATEXTENSIBLE structure. WAVEFORMATEXTENSIBLE contains an embedded WAVEFORMATEX structure as a member, but it also contains additional data for describing multichannel formats and sample sizes greater than 16 bits. WAVEFORMATEX cannot unambiguously specify data formats with more than two channels or with sample sizes greater than 16 bits.

For the benefit of older drivers, the WDM audio subsystem provides limited support for the WAVEFORMATEX structure. New drivers should be written to use WAVEFORMATEXTENSIBLE instead. The WDM audio subsystem in all versions of Windows except for Windows 98 "Gold" supports WAVEFORMATEXTENSIBLE. For information about the limited support available for WAVEFORMATEX, see Audio Data Formats and Data Ranges.

The wFormatTag member is set to one of the wave formats that are defined in mmreg.h. Some of the more common nonproprietary formats are listed in the following table.
wFormatTag value             Meaning
WAVE_FORMAT_PCM              PCM (pulse-code modulated) data in integer format.
WAVE_FORMAT_IEEE_FLOAT       PCM data in IEEE floating-point format.
WAVE_FORMAT_DRM              DRM-encoded format (for digital-audio content protected by Microsoft Digital Rights Management).
WAVE_FORMAT_EXTENSIBLE       Extensible WAVEFORMATEX structure (see WAVEFORMATEXTENSIBLE).
WAVE_FORMAT_ALAW             A-law-encoded format.
WAVE_FORMAT_MULAW            Mu-law-encoded format.
WAVE_FORMAT_ADPCM            ADPCM (adaptive differential pulse-code modulated) data.
WAVE_FORMAT_MPEG             MPEG-1 data format (stream conforms to ISO 11172-3 Audio specification).
WAVE_FORMAT_DOLBY_AC3_SPDIF  AC-3 (aka Dolby Digital) over S/PDIF.
See mmreg.h for the complete list of WAVE_FORMAT_XXX formats.

WAVEFORMATEX is nearly identical to the PCMWAVEFORMAT structure, which is an obsolete structure used to specify PCM formats. The only difference is that WAVEFORMATEX contains a cbSize member and PCMWAVEFORMAT does not. By convention, cbSize should be ignored when wFormatTag = WAVE_FORMAT_PCM. This convention allows driver software to treat the WAVEFORMATEX and PCMWAVEFORMAT structures identically in the case of a PCM format. For more information about PCMWAVEFORMAT, see the Microsoft Windows SDK documentation.

When wFormatTag = WAVE_FORMAT_PCM, initialize cbSize to zero. For all other values of wFormatTag, cbSize specifies how many bytes of additional format data are appended to the WAVEFORMATEX structure. When wFormatTag = WAVE_FORMAT_EXTENSIBLE, set cbSize to sizeof(WAVEFORMATEXTENSIBLE)-sizeof(WAVEFORMATEX) plus the size of any format-specific data that is appended to the WAVEFORMATEXTENSIBLE structure.
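The PCM rules quoted above can be sketched as a small helper. This is only an illustration under stated assumptions: WaveFormatSketch is a hypothetical stand-in that mirrors the WAVEFORMATEX field layout so the code builds without the Windows headers, and make_pcm_format is a made-up name, not a Windows API. The nAvgBytesPerSec line uses the usual PCM relation nSamplesPerSec * nBlockAlign, which the excerpt above does not spell out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for WAVEFORMATEX (WORD = 16 bits, DWORD = 32 bits),
   used so the sketch compiles without <windows.h>. */
typedef struct {
    uint16_t wFormatTag;
    uint16_t nChannels;
    uint32_t nSamplesPerSec;
    uint32_t nAvgBytesPerSec;
    uint16_t nBlockAlign;
    uint16_t wBitsPerSample;
    uint16_t cbSize;
} WaveFormatSketch;

/* Same value as WAVE_FORMAT_PCM in mmreg.h; renamed to avoid clashing with it. */
#define SKETCH_WAVE_FORMAT_PCM 0x0001

/* Fill a descriptor for integer PCM following the rules in the text. */
WaveFormatSketch make_pcm_format(uint16_t channels,
                                 uint32_t sample_rate,
                                 uint16_t bits_per_sample)
{
    WaveFormatSketch wf;

    /* Plain WAVEFORMATEX PCM is only well defined for 8 or 16 bits. */
    assert(bits_per_sample == 8 || bits_per_sample == 16);
    assert(channels > 0);

    wf.wFormatTag      = SKETCH_WAVE_FORMAT_PCM;
    wf.nChannels       = channels;
    wf.nSamplesPerSec  = sample_rate;
    wf.wBitsPerSample  = bits_per_sample;
    /* Size of one audio frame: one sample for every channel. */
    wf.nBlockAlign     = (uint16_t)((channels * bits_per_sample) / 8);
    /* For PCM the average byte rate is frames per second times frame size. */
    wf.nAvgBytesPerSec = sample_rate * wf.nBlockAlign;
    /* No extra format bytes follow the structure for PCM. */
    wf.cbSize          = 0;
    return wf;
}
```

For 16-bit stereo at 44.1 kHz, make_pcm_format(2, 44100, 16) yields nBlockAlign = 4 and nAvgBytesPerSec = 176400.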


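In FFmpeg, the channel count can be read from AVCodecContext. One channel corresponds to MONO in OpenAL and two channels to STEREO. Since OpenAL only offers 8-bit and 16-bit mono and stereo formats, the FFmpeg PCM output formats it can handle are these four:

AV_SAMPLE_FMT_U8, AV_SAMPLE_FMT_S16, AV_SAMPLE_FMT_U8P, and AV_SAMPLE_FMT_S16P. Fortunately, most of the audio I came across uses AV_SAMPLE_FMT_S16.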
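The channel and sample-width mapping described here could look roughly like the following. It is a minimal sketch, not code from the player: pick_al_format is a hypothetical helper, and the constants are copied from the OpenAL listing above (guarded so they do not clash with the real <AL/al.h>). The caller is assumed to derive bits_per_sample from the FFmpeg sample format, 8 for U8/U8P and 16 for S16/S16P.

```c
#include <assert.h>

/* Values copied from the OpenAL listing, so the sketch builds without <AL/al.h>. */
#ifndef AL_FORMAT_MONO8
#define AL_FORMAT_MONO8    0x1100
#define AL_FORMAT_MONO16   0x1101
#define AL_FORMAT_STEREO8  0x1102
#define AL_FORMAT_STEREO16 0x1103
#endif

/* Hypothetical helper: map a channel count and sample width to an OpenAL
   buffer format. Returns -1 when none of the four formats matches, for
   example for 32-bit or float samples, or more than two channels. */
int pick_al_format(int channels, int bits_per_sample)
{
    /* Non-positive arguments indicate a caller bug, not an unsupported stream. */
    assert(channels > 0 && bits_per_sample > 0);

    if (channels == 1 && bits_per_sample == 8)  return AL_FORMAT_MONO8;
    if (channels == 1 && bits_per_sample == 16) return AL_FORMAT_MONO16;
    if (channels == 2 && bits_per_sample == 8)  return AL_FORMAT_STEREO8;
    if (channels == 2 && bits_per_sample == 16) return AL_FORMAT_STEREO16;
    return -1;
}
```

A return value of -1 signals that the decoded audio has to be converted to one of the four supported combinations before it is handed to OpenAL.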

While building a player, however, I came across an APE file that decodes to AV_SAMPLE_FMT_S16P.

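Tip: the P suffix stands for planar. Formats without the P suffix are packed, which here means interleaved, not compressed. The two differ in how samples are laid out in memory. Packed formats store the left and right channel samples alternately and use only data[0] of the AVFrame struct. Planar formats store each channel separately, with one data[i] per channel.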
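To make the difference between the two layouts concrete, here is a hand-written interleaving routine for 16-bit stereo. It is only an illustration of the memory layout, not the conversion path used in this post, and interleave_s16_stereo is a hypothetical name. With an AV_SAMPLE_FMT_S16P frame, left and right would correspond to data[0] and data[1] cast to int16_t pointers, and samples_per_channel to the frame's nb_samples.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Interleave two planar 16-bit channels into one packed buffer:
 *   planar:  L0 L1 L2 ...   and   R0 R1 R2 ...
 *   packed:  L0 R0 L1 R1 L2 R2 ...
 * `packed` must have room for 2 * samples_per_channel values. */
void interleave_s16_stereo(const int16_t *left, const int16_t *right,
                           int16_t *packed, size_t samples_per_channel)
{
    size_t i;

    assert(left != NULL && right != NULL && packed != NULL);

    for (i = 0; i < samples_per_channel; ++i) {
        packed[2 * i]     = left[i];   /* even slots hold the left channel */
        packed[2 * i + 1] = right[i];  /* odd slots hold the right channel */
    }
}
```

The packed result is the layout that AL_FORMAT_STEREO16 expects.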

For cases like this, use the swr_convert function (declared in "libswresample/swresample.h", which is also part of FFmpeg) to convert between PCM formats.