gstreamer的疑问?

 

1. 为什么gstreamer是graph-based?

 

2. Pipelines can be visualised by dumping them to a .dot file and creating a PNG image from that

    如何导出(dump)?
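关于这个问题,GStreamer 的常规做法是:运行程序前设置 GST_DEBUG_DUMP_DOT_DIR 环境变量(或在代码中调用 GST_DEBUG_BIN_TO_DOT_FILE),pipeline 状态切换时会自动导出 .dot 文件,再用 graphviz 的 dot 工具转成 PNG。下面是一个假设的命令示例(需要本机装有 GStreamer 与 graphviz,文件名以实际导出为准):

```shell
# 指定 .dot 文件的输出目录(目录必须已存在)
export GST_DEBUG_DUMP_DOT_DIR=/tmp/gst-dot
mkdir -p "$GST_DEBUG_DUMP_DOT_DIR"

# 运行任意 pipeline,状态切换时会自动导出 .dot 文件
gst-launch-1.0 videotestsrc num-buffers=10 ! autovideosink

# 用 graphviz 把 .dot 转成 PNG
dot -Tpng /tmp/gst-dot/*.PAUSED_PLAYING.dot -o pipeline.png
```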

 

3. container和codec的区别?

   container formats: asf, avi, 3gp/mp4/mov, flv, mpeg-ps/ts, mkv/webm, mxf, ogg

 

4. event->structure作用是什么?

 

5. GstStructure的作用?

   GstStructure内部使用GArray来保存字段.

   是Key/Value键值对的集合(Collection)

   一般GstCaps, GstEvent, GstQuery, GstMessage会用到GstStructure.

 

6. GstEvent仅仅提供了几个event,并没有生命周期,谁用它呢?

    GstPad? GstElement? GstPipeline?

    The event class provides factory methods to construct events for sending and functions to query (parse) received events.

 

GObject 设计思想

GObject相关学习文章的list.

 

From gobject tutorial:

Ryan McDougall(2004)

 

http://wenku.baidu.com/link?url=QZYGO4DimnSAPGNRQ3tZSDfHaseTEG5tcLvWaoyjXgF4yEEE6YoTwBk7LFxxWxW-bmvLtcWB-By5xLLP-J7fhdcbfXUZLq8QQF29FOaenfC 

 

目的

本文档可用于两个目的:一是作为一篇学习Glib的GObject类型系统的教程,二是用作一篇按步骤使用GObject类型系统的入门文章。本文从如何用C语言来设计一个面向对象的类型系统着手,将GObject作为假设的解决方案。这种介绍的方式可以更好的解释这个开发库为何采用这种形式来设计,以及使用它为什么需要这些步骤。入门文章被安排在教程之后,使用了一种按步骤的、实际的、简洁的组织形式,这样对于某些更实际的程序员会更有用些。

读者

本文假想的读者是那些熟悉面向对象概念,但是刚开始接触GObject或者GTK+的开发人员。我会认为您已经了解一门面向对象的语言,以及一些C语言的基础知识。

动机

使用一种根本不支持面向对象的语言来编写一个面向对象的系统,这听上去有些疯狂。然而我们的确有一些很好的理由来做这样的事情。在这里我不会试着去证明作者决定的正确性,并且我认为读者自己就有一些使用GLib的好理由。这里我将指出这个系统的一些重要特性:

C是一门可移植性很强的语言
一个完全动态的系统,新的类型可以在运行时被添加上

这样系统的可扩展性要远强于一门标准的语言,所以新的特性也可以被很快的加入进来。

对面向对象语言来说,面向对象的特性和能力是用语法来定义的。然而,因为C并不支持面向对象,所以GObject系统必须手动的将面向对象的能力引入进来。一般来说,要实现这个目标需要做一些乏味的工作,甚至偶尔使用某些奇妙的手段。而我需要做的只是枚举出所有必要的步骤或“咒语”,使得程序执行起来,当然也希望能说明这些步骤对您的程序意味着什么。

1. 创建一个非继承的对象
设计

在面向对象领域,对象包含两种成员类型:数据和方法,它们处于同一个对象引用之下。有一种办法可以使用C来实现对象,那就是C的结构体(struct)。这样,普通公用成员可以是数据,方法则可以被实现为指向函数的指针。然而这样的实现却存在着一些严重的缺陷:别扭的语法,类型安全问题,缺少封装。而更实际的问题是-空间浪费严重。每个实例化后的对象需要一个4字节的指针来指向其每一个成员方法,而这些方法在同样的类封装范围里则是完全相同的,是冗余的。例如我们有一个类需要有4个成员方法,一个程序实例化了1000个这个类的对象,这样我们就浪费了接近16KB的空间。显然我们只需要保留一张包含这些指针的表,供这个类实例出的对象调用,这样就会节省下不少内存资源。

这张表就被称作虚方法表(vtable),GObject系统为每个类在内存中都保存了一份这张表。当你想调用一个虚方法时,必须先向系统请求查找这个对象所对应的虚方法表,而如上所述这张表包含了一个由函数指针组成的结构体。这样你就能复引用这个指针,通过它来调用方法了。
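上面描述的vtable思想可以用纯C(不依赖GLib)给出一个极简示意,其中的类型名与函数名均为假设的示例,仅为原理草图:

```c
#include <assert.h>

/* “类结构体”:整个类只有一份,保存函数指针表(即vtable) */
typedef struct AnimalClass {
    const char *(*speak)(void);
} AnimalClass;

/* “实例结构体”:每个对象一份,首成员指向共享的类结构体 */
typedef struct Animal {
    const AnimalClass *klass;
    int age;
} Animal;

static const char *dog_speak(void) { return "woof"; }

/* 所有实例共享这一份vtable,避免在每个实例中重复存放函数指针 */
static const AnimalClass dog_class = { dog_speak };

static Animal animal_new_dog(int age) {
    Animal a = { &dog_class, age };
    return a;
}

/* 调用虚方法:先查实例所指的类结构体,再解引用函数指针 */
static const char *animal_speak(const Animal *self) {
    return self->klass->speak();
}
```

可以看到,1000个实例也只占一份函数指针表的空间,这正是正文中节省内存的论证。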

我们称这两种成员类型(数据和方法)为“实例结构体”和“类结构体”,并且将这两种结构体的实例分别称为“实例对象”和“类对象“。这两种结构体合并在一起形成了一个概念上的单元,我们称之为“类”,对这个“类”的实例则称作“对象”。

将这样的函数称作“虚函数”的原因是,调用它需要在运行时查找合适的函数指针,这样就能允许继承自它的类覆盖这个方法(只要更改虚函数表中的函数指针指向相应函数入口即可)。这样子类在向上转型(upcast)为父类时就会正常工作,就像我们所了解的C++里的虚方法一样。

尽管这样做可以节省内存和实现虚方法,但从语法上来看,将成员方法与对象用“点操作符”关联起来的能力就不具备了。(译者:因为点操作符关联的将是struct里的方法,而不是vtable里的)。因此我们将使用如下的命名约定来声明类的成员方法:NAMESPACE_TYPE_METHOD (OBJECT*, PARAMETERS)

非虚方法将被实现在一个普通的C函数里。虚方法其实也是实现在普通的C函数中,但不同的是这个函数实现时将调用虚函数表中某个合适的方法。私有成员将被实现为只存活在源文件中,而不被导出声明在头文件中。

注意:面向对象通常使用信息隐藏来作为封装的一部分,但在C语言中却没有简单的办法来隐藏私有成员。一种办法是将私有成员放到一个独立的结构体中,该结构体只定义在源文件中,再向你的公有对象结构体中添加一个指向这个私有类的指针。然而,在开放源代码的世界里,如果用户执意要做错误的事,这种保护也是毫无意义的。大部分开发者也只是简单的写上几句注释,标明这些成员他们应该被保护为私有的,希望用户能尊重这种封装上的区别。

现在为止我们有了两种不同的结构体,但我们没有好办法能通过一个实例化后的对象直接找到其虚方法表。但如我们在上面提到的,这应该是系统的职责,我们只要按要求向系统注册上新声明的类型,就应该能够处理这个问题。系统也要求我们去向它注册(对象的和类的)结构体初始化和销毁函数(以及其他的重要信息),这样我们的对象才能被正确的实例化出来。系统将通过枚举化所有的向它注册的类型来记录新的对象类型,要求所有实例化对象的第一个成员是一个指向它自己类的虚函数表的指针,每个虚函数表的第一个成员是它在系统中保存的枚举类型的数字表示。

注意:类型系统要求所有类型的对象结构体和类结构体的第一个成员是一个特殊结构体。在对象结构体中,该特殊结构体是一个指向其类型的对象。因为C语言保证在结构体中声明的第一个成员是在内存的最前面,因此这个类型对象可以通过将这个原对象的结构体转型而获得到。又因为类型系统要求我们将被继承的父结构体指针声明为子结构体的第一个成员,这样我们只需要在父类中声明一次这个类型对象,以后就能够通过一次转型而找到虚函数表了。
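“首成员在内存最前面”这条规则所支撑的转型技巧,可以用纯C示意如下(类型名为假设的示例):

```c
#include <assert.h>
#include <stddef.h>

typedef struct Base {
    int type_id;   /* 相当于类型系统保存的枚举类型编号 */
} Base;

typedef struct Derived {
    Base parent;   /* 被继承的父结构体必须是第一个成员 */
    int extra;
} Derived;

/* C保证结构体首成员的地址与结构体本身地址相同,
   因此指向Derived的指针可以直接转型为指向Base的指针 */
static int get_type_id(const void *instance) {
    return ((const Base *)instance)->type_id;
}
```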

最后,我们还需要定义一些管理对象生命期的函数:创建类对象的函数,创建实例对象的函数,销毁类对象的函数,但不需要销毁实例对象的函数,因为实例对象的内存管理是一个比较复杂的问题,我们将把这个工作留给更高层的代码来做。

代码(头文件)

a. 用struct来创建实例对象和类对象,实现“C风格”的对象

注意:对结构体命名一般要在名字前添加下划线,然后使用前置类型定义typedef。这是因为C的语法不允许你在SomeObject中声明SomeObject指针(这对定义链表之类的数据结构很方便)(译者:如果非要这样用,则需要在类型前加上struct)。按上面的命名约定,我们还创建了一个命名域,叫做“Some”。

/* “实例结构体”定义所有的数据域,实例对象将是唯一的 */
 

    typedef struct _SomeObject SomeObject;
    struct _SomeObject
    {
    GTypeInstance gtype;

    gint m_a;
    gchar* m_b;
    gfloat m_c;
    };





/* “类结构体”定义所有的方法函数,类对象将是共享的 */
 

    typedef struct _SomeObjectClass SomeObjectClass;
    struct _SomeObjectClass
    {
    GTypeClass gtypeclass;

    void (*method1) (SomeObject *self, gint);
    void (*method2) (SomeObject *self, gchar*);
    };





b. 声明一个"get_type"函数,第一次调用该函数时,函数负责向系统注册对象的类型,并返回系统分配的一个GType类型值,此后的调用就会直接返回该GType值。该值实际上是一个系统用来区别已注册类型的整型数字。由于函数是SomeObject类型特有的,我们在它前面加上“some_object_”。

/* 该方法将返回我们新声明的对象类型所关联的GType类型 */
GType some_object_get_type (void);

c. 声明一些用来管理对象生命期的函数:初始化时创建对象的函数,结束时销毁对象的函数。

/* 类/实例的初始化/销毁函数。它们的标记在gtype.h中定义。 */
void some_object_class_init (gpointer g_class, gpointer class_data);
void some_object_class_final (gpointer g_class, gpointer class_data);
void some_object_instance_init (GTypeInstance *instance, gpointer g_class);

d. 用上面我们约定的方式来命名成员方法函数。

/* 所有这些函数都是SomeObject的方法. */
void some_object_method1 (SomeObject *self, gint); /* virtual */
void some_object_method2 (SomeObject *self, gchar*); /* virtual */
void some_object_method3 (SomeObject *self, gfloat); /* non-virtual */

e. 创建一些样板式代码(boiler-plate code),符合规则的同时也让事情更简单一些

/* 方便的宏定义 */
#define SOME_OBJECT_TYPE (some_object_get_type ())
#define SOME_OBJECT(obj) (G_TYPE_CHECK_INSTANCE_CAST ((obj), SOME_OBJECT_TYPE, SomeObject))
#define SOME_OBJECT_CLASS(c) (G_TYPE_CHECK_CLASS_CAST ((c), SOME_OBJECT_TYPE, SomeObjectClass))
#define SOME_IS_OBJECT(obj) (G_TYPE_CHECK_INSTANCE_TYPE ((obj), SOME_OBJECT_TYPE))
#define SOME_IS_OBJECT_CLASS(c) (G_TYPE_CHECK_CLASS_TYPE ((c), SOME_OBJECT_TYPE))
#define SOME_OBJECT_GET_CLASS(obj) (G_TYPE_INSTANCE_GET_CLASS ((obj), SOME_OBJECT_TYPE, SomeObjectClass))

代码(源程序)

现在可以实现那些刚刚声明过的函数了。

注意:由于虚函数是一些函数指针,我们还要创建一些可被寻址的普通C函数(命名以"impl"结尾,并且不被导出到头文件中),虚函数将被实现为指向这些函数。

 

a. 实现虚方法。

    /* 虚函数中指向的普通函数 */
    void some_object_method1_impl (SomeObject *self, gint a)
    {
    self->m_a = a;
    g_print ("Method1: %i\n", self->m_a);
    }

    void some_object_method2_impl (SomeObject *self, gchar* b)
    {
    self->m_b = b;
    g_print ("Method2: %s\n", self->m_b);
    }





b. 实现所有公有方法。实现虚方法时,我们必须使用“GET_CLASS”宏来从类型系统中获取到类对象,用以调用虚函数表中的虚方法。非虚方法时,直接写实现代码即可。

 

    /* 公有方法 */
    void some_object_method1 (SomeObject *self, gint a)
    {
    SOME_OBJECT_GET_CLASS (self)->method1 (self, a);
    }

    void some_object_method2 (SomeObject *self, gchar* b)
    {
    SOME_OBJECT_GET_CLASS (self)->method2 (self, b);
    }

    void some_object_method3 (SomeObject *self, gfloat c)
    {
    self->m_c = c;
    g_print ("Method3: %f\n", self->m_c);
    }





c. 实现初始化/销毁方法。在这两个方法中,系统传入的参数是指向该对象的泛型指针(我们相信这个指针的确指向一个合适的对象),所以我们在使用它之前必须将其转型为合适的类型。

/* 该函数将在类对象创建时被调用 */
 

    void some_object_class_init (gpointer g_class, gpointer class_data)
    {
    SomeObjectClass *this_class = SOME_OBJECT_CLASS (g_class);

    /* 填写类结构体的方法成员 (本例只存在一个虚函数表) */
    this_class->method1 = &some_object_method1_impl;
    this_class->method2 = &some_object_method2_impl;
    }

    /* 该函数在类对象不再被使用时调用 */
    void some_object_class_final (gpointer g_class, gpointer class_data)
    {
    /* 该对象被销毁时不需要做任何动作,因为它不存在任何指向动态分配的
    资源的指针或者引用。 */
    }

    /* 该函数在实例对象被创建时调用。系统通过g_class参数传入该实例所属的类。 */
    void some_object_instance_init (GTypeInstance *instance, gpointer g_class)
    {
    SomeObject *this_object = SOME_OBJECT (instance);

    /* 填写实例结构体中的成员变量 */
    this_object->m_a = 42;
    this_object->m_b = "init";
    this_object->m_c = 3.14;
    }





d. 实现能够返回给调用者SomeObject的GType的函数。该函数在第一次运行时,它通过向系统注册SomeObject来获取到GType。该 GType将被保存在一个静态变量中,以后该函数再被调用时就无须注册可以直接返回该数值了。虽然可以使用一个独立的函数来注册该类型,但这样的实现可以保证类在使用前是注册过的,该函数通常在实例化第一个对象时被调用。

 

    /* 因为该类没有父类,所以父类函数是空的 */
    GType some_object_get_type (void)
    {
    static GType type = 0;

    if (type == 0)
    {
    /* 这是系统用来完整描述要注册的类型是如何被创建、初始化和销毁的结构体。 */
    static const GTypeInfo type_info =
    {
    sizeof (SomeObjectClass),
    NULL, /* 父类初始化函数 */
    NULL, /* 父类销毁函数 */
    some_object_class_init, /* 类对象初始化函数 */
    some_object_class_final, /* 类对象销毁函数 */
    NULL, /* 类数据 */
    sizeof (SomeObject),
    0, /* 预分配的字节数 */
    some_object_instance_init /* 实例对象初始化函数 */
    };

    /* 因为我们的类没有父类,所以它将被认为是“基础类(fundamental)”,
    因此我们必须要告诉系统,该类既是一个复合结构的类(与浮点型,整型,
    或者指针不同),而且是可以被实例化的(系统可以创建实例对象,相反如接口
    或者抽象类则不能被实例化) */
    static const GTypeFundamentalInfo fundamental_info =
    {
    G_TYPE_FLAG_CLASSED | G_TYPE_FLAG_INSTANTIATABLE
    };

    type = g_type_register_fundamental
    (
    g_type_fundamental_next (), /* 下一个可用的GType */
    "SomeObjectType", /* 类型的名称 */
    &type_info, /* 上面定义的type_info */
    &fundamental_info, /* 上面定义的fundamental_info */
    0 /* 类型不是抽象的 */
    );
    }

    return type;
    }





/* 让我们来编写一个测试用例吧! */

 

    int main()
    {
    SomeObject *testobj = NULL;

    /* 类型系统初始化 */
    g_type_init ();

    /* 让系统创建实例对象 */
    testobj = SOME_OBJECT (g_type_create_instance (some_object_get_type()));

    /* 调用我们定义了的方法 */
    if (testobj)
    {
    g_print ("%d\n", testobj->m_a);
    some_object_method1 (testobj, 32);
    g_print ("%s\n", testobj->m_b);
    some_object_method2 (testobj, "New string.");
    g_print ("%f\n", testobj->m_c);
    some_object_method3 (testobj, 6.9);
    }

    return 0;
    }


 






还需要考虑的

我们已经用C实现了第一个对象,但是做了很多工作,而且这并不算是真正的面向对象,因为我们故意没有提及任何关于“继承”的方法。在下一节我们将看到如何利用别人的代码,使SomeObject继承于内建的类GObject。

尽管在下文中我们将重用上面讨论的思想和模型,但是创建一个基础类使得它能够像其它的GTK+代码一样,是一件非常困难和深入的事情。因此强烈建议您创建新的类时总是继承于GObject,它会帮您做大量背后的工作,使得您的类能符合GTK+的要求。

2.使用内建的宏定义来自动生成代码
设计

您可能已经注意到了,我们上面所做的大部分工作基本上都是机械的、模板化的工作。大多数的函数都并不是通用的,每创建一次类我们就需要重写一遍。很显然这就是为什么我们发明了计算机的原因 - 让工作自动化,让我们的生活更简单!

OK,其实我们很幸运,C的预处理器将允许我们编写宏定义,这些宏定义在编译时会展开成为合适的C代码,来生成我们需要的类型定义。其实使用宏定义还能帮助我们减少一些低级错误。

然而,自动化将使得我们失去对定义处理的灵活性。在上面描述的步骤中,我们能有许多可能的变化,但一个宏定义却只能实现一种展开。如果这个宏定义提供了轻量级的展开,但我们想要的是一个完整的类型,这样我们仍然需要手写一大堆代码。如果宏定义提供了完整的展开,但我们需要的却是一种轻量级的类型,我们将得到许多冗余的代码,花许多时间来填写这些用不上的桩代码,甚至是一些错误的代码。不幸的是C预处理器并没有设计成能够自动发现我们感兴趣的代码生成方式,它只包含了最有限的功能。

代码

创建一个新类型的代码非常简单:

G_DEFINE_TYPE_EXTENDED (TypeName, function_prefix, PARENT_TYPE, GTypeFlags, CODE)。

第一个参数是类的名称。第二个是函数名称的前缀,这使得我们的命名规则能保持一致。第三个是父类的GType。第四个是会被添加到GTypeInfo结构体里的GTypeFlags。第五个是在类型被注册后应该立刻被执行的代码。

看看下面的代码将被展开成为什么样将会给我们更多的启发。

G_DEFINE_TYPE_EXTENDED (SomeObject, some_object, G_TYPE_OBJECT, 0, some_function())

注意:实际展开后的代码将随着系统版本不同而不同。你应该总是检查一下展开后的结果而不是凭主观臆断。

展开后的代码(清理了空格):

 

    static void some_object_init (SomeObject *self);
    static void some_object_class_init (SomeObjectClass *klass);
    static gpointer some_object_parent_class = ((void *)0);

    static void some_object_class_intern_init (gpointer klass)
    {
    some_object_parent_class = g_type_class_peek_parent (klass);
    some_object_class_init ((SomeObjectClass*) klass);
    }

    GType some_object_get_type (void)
    {
    static GType g_define_type_id = 0;
    if ((g_define_type_id == 0))
    {
    static const GTypeInfo g_define_type_info =
    {
    sizeof (SomeObjectClass),
    (GBaseInitFunc) ((void *)0),
    (GBaseFinalizeFunc) ((void *)0),
    (GClassInitFunc) some_object_class_intern_init,
    (GClassFinalizeFunc) ((void *)0),
    ((void *)0),
    sizeof (SomeObject),
    0,
    (GInstanceInitFunc) some_object_init,
    };

    g_define_type_id = g_type_register_static
    (
    G_TYPE_OBJECT,
    "SomeObject",
    &g_define_type_info,
    (GTypeFlags) 0
    );

    { some_function(); }

    }

    return g_define_type_id;
    }




注意:该宏定义声明了一个静态变量some_object_parent_class,它是一个指针,指向我们打算创建对象的父类。当我们要找到虚方法继承自哪里时它会派上用场,可以用于链式触发处理/销毁函数(译者:下面会介绍)。这些处理/销毁函数几乎总是虚函数。我们接下来的代码将不再使用这个变量,因为有其它的函数能够不使用静态变量而做到这一点。

你应该注意到了,这个宏定义没有定义父类的初始化、销毁函数以及类对象的销毁函数。那么如果你需要这些函数,就得自己动手了。

3.创建一个继承自GObject的对象


设计

 

尽管我们现在能够生成一个基本的对象,但事实上我们故意略过了本类型系统的上下文:作为一个复杂的开发库套件的基础 -那就是图形库GTK+。GTK+的设计要求所有的类应该继承自一个根类。这样就至少能允许一些公共的基础功能能够被共享:如支持信号(让消息可以很容易的从一个对象传递到另一个),使用引用计数来管理对象生命期,支持属性(针对对象的数据成员生成简单的setting和getting函数),支持构造和析构函数(用来设置信号、引用计数器、属性)。当我们让对象继承自GObject时,我们就获得了上述的一切,并且当与其它基于GObject的库交互时会很容易。然而,本章节我们不讨论信号、引用计数和属性,或者任何其它专门的特性,这里我们将详细描述类型系统中继承是如何工作的。
我们都知道,如果高档轿车继承自轿车,那么高档轿车就是轿车加上一些新的特性。那如何让系统去实现这样的功能呢?其实可以使用结构体的一个特性来实现:结构体里的第一个成员一定是在内存的最前面。只要我们要求所有的对象将它们的基类声明为它们自己结构体的第一个成员,那么我们就能迅速的将指向某个对象的指针转型为指向它基类的指针!尽管这个技巧很好用,并且语法上非常干净,但这种转型的方式只适用于指针 - 你不能这样来转型一个普通的结构体。
注意:这种转型技巧是类型不安全的。把一个对象转型为它的基类对象虽然合法但不明智。这将要求程序员自己来保障此次转型是安全的。



创建类型的实例

 

了解了这个技术后,那么究竟类型系统是如何实例化出对象的呢?第一次我们使用g_type_create_instance让系统创建一个实例对象时,它必须要先创建一个类对象供实例来使用。如果该类结构体继承自其它类,系统则需要先创建和初始化这些父类。系统依靠我们指定的结构体(*_get_type函数中的GTypeInfo结构体)来完成这个工作,这个结构体描述了每个对象的实例对象大小,类对象大小,初始化函数和销毁函数。

- 要用g_type_create_instance来实例化一个对象:
  • 如果它没有相关联的类对象,先创建类对象并且将其加入到类的层次中
  • 创建实例对象并且返回指向它的指针
当系统创建一个新的类对象时,它先会分配足够的内存来放置这个最终的类对象(译者:“最终的”意指这个新的类对象,相对于其继承的父类们)。然后在继承链上从最顶端的父类开始到最末端的子类对象,用父类的成员域覆写掉这个最终类对象的成员域。这就是子类如何继承自父类的。当把父类的数据复制完后,系统将会在当前状态的类对象中执行父类的“base_init“函数。这个覆写和执行“base_init”的工作将循环多次,直到这个继承链上的每个父类都被处理过后才结束。接下来系统将在这个最终的类对象上执行最终子类的“base_init”和“class_init”函数。函数“class_init”有一个参数,即上文所提到的“class_data”,该参数会是构造函数的参数。
细心的读者可能会问,为什么我们已经有了一个完整的父类对象的拷贝还需要它的base_init函数?因为当完整拷贝无法为每个类重新创建出某些数据时,我们就需要base_init函数。例如,某个类对象成员指向了另外一个对象,拷贝后我们希望每个类对象的成员都指向它自己的对象,而不是只拷贝对象的指针(内存的拷贝只是“浅拷贝”,这时我们需要一次“深拷贝”)。但事实上有经验的GObject程序员告诉我base_init函数会很少用到。
当系统创建一个新的实例对象时,它会先分配足够的内存来将这个实例对象放进去。从继承链的最顶端的父类开始调用它的“instance_init”函数,直到最终的子类。最后,系统在最终类对象上调用最终子类的“instance_init”函数。
我来总结一下上面所描述到的算法:

- 实例化一个类对象
  • 为最终对象分配内存
  • 从父类到子类开始循环:
    复制该父类对象的内容以覆盖掉最终对象的内容,
    并在最终对象上运行该父类自己的base_init函数
  • 在最终对象上运行最终对象的base_init函数
  • 在最终对象上运行最终对象的class_init(附带上类数据)


- 实例化一个实例对象
  • 为最终对象分配内存
  • 从父类到子类开始循环:
    在最终对象上运行该父类的instance_init函数
  • 在最终对象上运行最终对象的instance_init函数

此时创建的类对象和实例对象都已经被初始化,系统将实例对象的类指针指向类对象,这样实例对象就能找到类对象所包含的虚函数表。这就是系统实例化已注册类型的过程,其实GObject实现的构造函数和析构函数语义与上述的方法也是相同的。



创建GObject实例

 

前面我们使用g_type_create_instance来创建一个实例对象。然而事实上GObject给我们提供了一个新的API来创建gobjects,可以完成我们上述所有的工作。这个API调用三个新的方法来创建和销毁新的GObject对象:构造函数(constructor),部署函数(dispose)以及析构函数(finalize)。
因为C语言缺少真正面向对象的语言所具备的多态特性,特别是认出多个构造函数的能力,所以GObject的构造函数需要一些更复杂的实现:
我们怎样才能灵活的传递不同种类的初始化信息到对象中,使得构造对象更容易呢?也许我们会想到限制只使用拷贝构造函数,然后用初始化数据填充一个静态的”初始化对象“,再将这个”初始化对象“传递到这个拷贝构造函数中。方法虽然简单,但是不太灵活。
事实上GObject的作者们提供了一种更加通用的解决方案,同时还提供了方便的getting和setting方法来操作对象成员数据,这种机制被称作”属性“。在系统中我们的属性用字符串来命名,并对它进行边界和类型检查。属性还可以被声明为仅构造时可写,就像C++中的const变量一样。
属性使用了一种多态的类型(GValue),这种类型允许程序员在不了解它实际类型的前提下安全的复制一个值。GValue会记录下它的值所持有的GType,使用类型系统来保证它总是具有一个虚函数,该函数可以处理将其自身复制到另一个GValue或转换为另一种GType。我们将在下一章详细讨论GValues和属性。
要为一个GObject创建一个新的属性,我们要定义它的类型、名字,以及默认值,然后创建一个封装这些信息的“属性规格”对象。在GObject的类初始化函数中,我们可以通过g_object_class_install_property来将属性规格绑定到GObject的类对象上。
注意:任何子对象要添加一个新的属性必须覆盖它从GObject继承下来的set_property和get_property虚方法。将在下一节中介绍这两个方法。
使用属性我们可以向构造函数传递一组属性规格,附上我们希望的初始值,然后简单调用GObject的set_property,这样就能获得属性带给我们的神奇功效。但是事实上,构造函数是不会被我们直接调用的。
GObject构造函数另一个不是那么明显的特性是,每个构造函数需要接受一个GType作为其参数之一,并且当它向上转型为其父类时,需要将这个GType传递给它父类的构造函数。这是因为GObject的构造函数使用子类的GType来调用g_type_create_instance,这样GObject的构造函数必须要知道它的最终子类对象的GType。
注意:如果我们自己定义构造函数,我们则必须覆盖继承自父类的构造函数。自定义的构造函数必须得沿着“继承链”向上,在做任何其他的工作前,先调用完父类的构造函数。然而,因为我们使用了属性,所以事实上我们从来不用覆盖掉默认的构造函数。
我要为上面的离题而道歉,但是这是为了理解系统是如何工作的所必须要克服的困难。如上所述,我们现在能理解GObject的构造函数了-g_object_new。这个函数接受一个子类的GType类型,一系列属性名(字符串)和GValue对作为参数。
这一系列属性对被转换为键值对列表和相关的属性规格,这些属性规格将在类初始化函数里被安装到系统中。调用类对象的构造函数时系统传入GType和构造属性。从最底端的子类构造函数到最顶端的基类构造函数,这条链会一直被触发直到GObject的构造函数被执行 - 这实际上才是第一个真正执行的初始化程序。GObject的构造函数先调用g_type_create_instance,并传下我们通过g_object_new一路带上的GType,这样我们上面所描述的细节将会发生,最终创建出实例。然后它将获得最终对象的类,并对传入的所有构造属性调用set_property方法。这就是为什么我们加入一个新属性时必须要覆盖get_/set_property方法的原因。当这一串构造函数返回后,包含在其中的代码将从基类执行到子类。
当父类构造函数返回后,就轮到子类来执行它自己的初始化代码了。这样执行代码的顺序就成为:
  • 从GObject到ChildObject运行实例初始化函数
  • 从GObject到ChildObject运行构造函数
最后任何剩余的没有传递到构造函数的属性将使用set_property方法一次设置完毕。
读者也许会考虑在什么情况下需要覆盖默认构造函数,将自己的代码放到他们自己的构造函数里。因为我们所有的属性都可以使用虚方法set_property来设置,所以基本上没有覆盖GObject的默认构造函数的必要。
我再尝试使用伪码的方式来总结一下GObject的构造函数过程:

- 使用属性键值对列表创建合适的GObject对象:
  • 在键值对列表中查找对应的属性规格
  • 调用最终对象的构造函数并传入规格列表和类型
  • 递归地向下调用直到GObject的构造函数
  • 对传入的类型调用g_type_create_instance
  • 对构造属性规格列表调用虚方法set_property
  • 对剩下的属性,调用set_property

注意:GObject将属性区分为两类,构造属性和“常规”属性。



销毁GObject实例

 

该做的工作完成后,我们可以看看要清理这个对象时会发生些什么。GObject实现面向对象中的析构时,将其分解成了两步:处理(dispose)和销毁(finalize)。
"处理"方法在对象知道自己将要被销毁时调用。实现该方法时,应该将指向资源的引用释放掉,这样可以避免循环引用或资源稀缺。“处理”方法应该允许被调用多次。要实现这一点,一般的做法是使用一个静态变量来保护”处理“方法。在“处理”方法调用后,对象本身应该依然能够使用,除非产生了不可恢复的错误(如段错误)。因此,“处理”方法不能释放或者改动某些对象成员。对于可恢复的错误,例如返回错误码或者空指针,则不应该受影响。
“销毁”方法会在从内存中清理掉对象之前被调用,用于释放剩余的资源引用。因此它只能被调用一次。析构过程被分成两个步骤降低了引用计数策略中循环引用发生的可能。
注意:如果我们自定义“处理”和“销毁”方法,就必须要覆盖掉继承自父类的相同方法。这两个方法从子类开始调用,沿着继承链向上直到最顶端的父类。
与构造函数不同的是,只要新的对象分配了资源,我们就需要覆盖掉继承自父类的相同方法,自己实现“处理”和“销毁”方法。
判断销毁代码放置到哪个函数不是件容易的事。一般来说,当与实现了引用计数的库(如GTK+)打交道时,我们应该在“处理”方法中解除对其它资源对象的引用,而在“销毁”方法中释放掉所有的内存或者关闭所有的文件描述字。
上面我们讨论过g_object_new,但是我们什么时候来销毁这些对象呢?其实上面也有提示过,GObject使用了引用计数的技术,它保存了一个整型的数据,该数据描述了有多少个对象或函数现在正在使用或者引用这个对象。当你在使用GObject时,如果你希望新创建的对象不在使用时被销毁掉,你就必须及早调用g_object_ref,将对象作为参数传递给它,这样就为引用计数器增加了1。如果你没有做这件事就意味着对象允许被自动销毁,这也许会导致你的程序崩溃。
同样的,当对象完成了它的任务后,你必须要调用g_object_unref。这样会使引用计数器减1,并且系统会检查它是否为0.当计数器为0时,对象将被先调用“处理”方法,最终被“销毁”掉。如果你没有解除到该对象的引用,则会导致内存泄漏,因为计数器永远不会回到0。
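g_object_ref/g_object_unref所遵循的“计数归零即销毁”规则,可以用纯C粗略模拟如下(真实实现当然复杂得多,这里的类型与函数名均为假设,仅为示意):

```c
#include <stdlib.h>

typedef struct RefObject {
    int ref_count;
    int *disposed_flag;   /* 销毁时置1,便于观察对象何时被释放 */
} RefObject;

static RefObject *ref_object_new(int *disposed_flag) {
    RefObject *obj = malloc(sizeof *obj);
    obj->ref_count = 1;   /* 新对象初始引用计数为1 */
    obj->disposed_flag = disposed_flag;
    return obj;
}

static RefObject *ref_object_ref(RefObject *obj) {
    obj->ref_count++;
    return obj;
}

/* 计数减到0时销毁对象;忘记unref则计数永不归零,造成泄漏 */
static void ref_object_unref(RefObject *obj) {
    if (--obj->ref_count == 0) {
        *obj->disposed_flag = 1;
        free(obj);
    }
}
```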
现在我们已经准备好了来写一些代码了!不要让上面冗长复杂的描述吓到您。如果你没有完全理解上面所提到的,别紧张 - GObject的确是很复杂的!继续读下去,你会看到许多细节,试试一些例子程序,或者去睡觉吧,明天再来接着读。
下面的程序与第一个例子很相似,事实上我去掉了更多的不合逻辑的、冗余的代码。



代码(头文件)

 

a. 我们仍然按照上面的方式继续,但是这次将把父类对象放到结构体的第一个成员位置上,事实上就是GObject。

/* “实例结构体”定义所有的数据域,实例对象将是唯一的 */
typedef struct _SomeObject SomeObject;
struct _SomeObject
{
GObject parent_obj;

/* 下面是一些数据 */
};

/* “类结构体”定义所有的方法函数,类对象将是共享的 */
typedef struct _SomeObjectClass SomeObjectClass;
struct _SomeObjectClass
{
GObjectClass parent_class;

/* 下面是一些方法 */
};

b. 头文件剩下的部分与第一个例子相同。



代码(源文件)

 

注意:我们需要增加一些对被覆盖的GObject方法的声明。

/* 这些是GObject的构造和析构方法,它们的用法说明在gobject.h中 */
GObject* some_object_constructor (GType this_type,
guint n_properties,
GObjectConstructParam *properties)
{
/* 如果有子类要继承我们的对象,那么this_type将不是SOME_OBJECT_TYPE;
这里必须对SOME_OBJECT_TYPE取父类,如果对this_type取父类,
子类链式调用时将会造成无穷循环 */

GObjectClass *parent_class = g_type_class_peek (g_type_parent (SOME_OBJECT_TYPE));

/* 很少需要再做其它工作 */
return parent_class->constructor (this_type, n_properties, properties);
}

void some_object_dispose (GObject *self)
{
GObjectClass *parent_class = g_type_class_peek (g_type_parent (SOME_OBJECT_TYPE));
static gboolean first_run = TRUE;

if (first_run)
{
first_run = FALSE;

/* 对引用的所有GObject调用g_object_unref,但是不要破坏这个对象 */

parent_class-> dispose (self);
}
}

void some_object_finalize (GObject *self)
{
GObjectClass *parent_class = g_type_class_peek (g_type_parent (SOME_OBJECT_TYPE));

/* 释放内存和关闭文件 */

parent_class-> finalize (self);
}
注意:GObjectConstructParam是一个有两个成员的结构体,一个是一组GParamSpec类型,用来描述参数定义,另一个是一组GValue类型,是对应参数的值。

/* 这是GObject的Get和Set方法,它们的用法说明在gobject.h中 */
void some_object_get_property (GObject *object,
guint property_id,
GValue *value,
GParamSpec *pspec)
{
}

void some_object_set_property (GObject *object,
guint property_id,
const GValue *value,
GParamSpec *pspec)
{
}

/* 这里是我们覆盖函数的地方,因为我们没有定义属性或者任何域,下面都是不需要的 */
void some_object_class_init (gpointer g_class,
gpointer class_data)
{
GObjectClass*this_class = G_OBJECT_CLASS (g_class);

this_class-> constructor = &some_object_constructor;
this_class-> dispose = &some_object_dispose;
this_class-> finalize = &some_object_finalize;

this_class-> set_property = &some_object_set_property;
this_class-> get_property = &some_object_get_property;
}

要想讨论关于创建和销毁GObject,我们就必须要了解属性和其它特性。我将把操作属性的示例放到下一节来叙述,以避免过于复杂而使得你灰心丧气。在你对这些概念有些实作经验后,它们将开始显现出存在的意义。如上面所言,我们现在只是将自己限制在创建一个基础的GObject类,在下一节我们将真正地编写一些函数。重要的是我们获得了让下面的学习更轻松的工具。



4.属性

 

上面已经提到属性是个很奇妙的东西,也简单介绍了如何使用它。在进一步深入介绍属性之前,我们又得先离一会儿题。



GValues

 

C是一门强类型语言,也就是说变量声明的类型必须和它被使用的方式保持一致,否则编译器就会报错。这是一件好事,它使得程序编写起来更迅速,帮助我们发现可能会导致系统崩溃或者不安全的因素。但这又是件坏事,因为实际上程序员活在一个很难什么事都保持严格的世界上,而且我们也希望声明的类型能够具备多态的能力 -也就是说类型能够根据上下文来改变它们自己的特性。通过C语言的转型我们可以获得一些多态的能力,如上面所讨论过的继承。然而,当使用无类型指针作为参数传递给函数时,可能问题会比较多。幸运的是,类型系统给了我们另外一个C语言没有的工具:GType。
让我们更清楚的描述一下问题吧。我需要一种数据类型,可以实现一个可以容纳多类型元素的链表,我想为这个链表编写一些接口,可以不依赖于任何特定的类型,并且不需要我为每种数据类型声明一个多余的函数。这种接口必然能涵盖多种类型,所以我们称它为GValue(Generic Value,泛型)。该如何实现这样一个类型呢?
我们创建了封装这种类型的结构体,它具有两个成员域:所有基础类型的联合(union),和表示保存在这个union中的值的GType。这样我们就可以将值的类型隐藏在GValue中,并且通过检查对GValue的操作来保证类型是安全的。这样还减少了多余的以类型为基础的操作接口(如get_int,set_float,...),统一换成了g_value_*的形式。
细心的读者会发现每个GValue都占据了最大的基础类型的内存大小(通常是8字节),再加上GType自己的大小。是的,GValues在空间上不是最优的,包含了不小的浪费,因此不应该被大量的使用它。它最常被用在定义一些泛型的API上。
属性是如何工作的这一点稍稍超出了我们要讨论的范围,但是这对于理解属性本身还是很有帮助的。

/* 让我们使用GValue来复制整型数据! */
#define g_value_new(type) g_value_init (g_new (GValue, 1), type)

GValue *a = g_value_new (G_TYPE_UCHAR);
GValue *b = g_value_new (G_TYPE_INT);
int c = 0;

g_value_set_uchar (a, 'a');
g_value_transform (a, b); /* 将uchar值转换为int */

c = g_value_get_int (b);
g_print ("w00t: %d\n", c);

g_free (a);
g_free (b);
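GValue这种“联合+类型标签”的做法本身并不依赖GLib,下面用纯C给出一个假设的极简版本以说明其原理(MyValue等名称均为示例,不是GLib的API):

```c
#include <assert.h>

typedef enum { MY_TYPE_INT, MY_TYPE_FLOAT } MyType;

/* 联合保存值本身,type字段记录当前实际存放的是哪种类型 */
typedef struct MyValue {
    MyType type;
    union { int v_int; float v_float; } data;
} MyValue;

static void my_value_set_int(MyValue *v, int x) {
    v->type = MY_TYPE_INT;
    v->data.v_int = x;
}

/* 取值前检查类型标签,保证类型安全 */
static int my_value_get_int(const MyValue *v) {
    assert(v->type == MY_TYPE_INT);
    return v->data.v_int;
}

/* “转换”:按标签分派,相当于一个极简的g_value_transform */
static void my_value_to_float(const MyValue *src, MyValue *dst) {
    dst->type = MY_TYPE_FLOAT;
    dst->data.v_float = (src->type == MY_TYPE_INT)
                        ? (float)src->data.v_int
                        : src->data.v_float;
}
```

这也解释了正文中的空间代价:每个值都要占据最大基础类型的空间再加一个类型标签。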



设计

 

我们已经在上面接触过属性了,对它们有了初步的认识,现在我们将继续来了解一下设计它们的最初动机。要编写一个泛型的属性设置机制,我们需要一个将其参数化的方法,以及与实例结构体中的成员变量名查重的机制。从外部上看,我们希望使用C字符串来区分属性和公有API,但是内部上来说,这样做会严重的影响效率。因此我们枚举化了属性,使用索引来标识它们。
上面提过属性规格,在Glib中被称作GParamSpec,它保存了对象的gtype,对象的属性名称,属性枚举ID,属性默认值,边界值等,类型系统用GParamSpec来将属性的字符串名转换为枚举的属性ID,GParamSpec也是一个能把所有东西都粘在一起的大胶水。
当我们需要设置或者获取一个属性的值时,传入属性的名字,并且带上GValue用来保存我们要设置的值,调用g_object_set/get_property。g_object_set_property函数将在GParamSpec中查找我们要设置的属性名称,查找我们对象的类,并且调用对象的set_property方法。这意味着如果我们要增加一个新的属性,就必须要覆盖默认的set/get_property方法。而且基类包含的属性将被它自己的set/get_property方法所正常处理,因为GParamSpec就是从基类传递下来的。最后,应该记住,我们必须事先通过对象的class_init方法来传入GParamSpec参数,用于安装上属性!
假设我们已经有了如上一节所描述的那样一个可用的框架,那么现在让我们来为SomeObject加入处理属性的代码吧!



代码(头文件)

 

a. 除了我们增加了两个属性外,其余同上面的一样。

/* “实例结构体”定义所有的数据域,实例对象将是唯一的 */
typedef struct _SomeObject SomeObject;
struct _SomeObject
{
GObject parent_obj;

/* 新增加的属性 */
int a;
float b;

/* 下面是一些数据 */
};



代码(源文件)

 

a. 创建一个枚举类型用来内部记录属性。

enum
{
OBJECT_PROPERTY_A = 1,
OBJECT_PROPERTY_B
};

b. 实现新增的处理属性的函数。

void some_object_get_property (GObject *object, guint property_id, GValue *value, GParamSpec *pspec)
{
SomeObject *self = SOME_OBJECT (object);

switch (property_id)
{
case OBJECT_PROPERTY_A:
g_value_set_int (value, self-> a);
break;

case OBJECT_PROPERTY_B:
g_value_set_float (value, self-> b);
break;

default: /* 没有属性用到这个ID!! */
break;
}
}

void some_object_set_property (GObject *object, guint property_id, const GValue *value, GParamSpec *pspec)
{
SomeObject *self = SOME_OBJECT (object);

switch (property_id)
{
case OBJECT_PROPERTY_A:
self-> a = g_value_get_int (value);
break;

case OBJECT_PROPERTY_B:
self-> b = g_value_get_float (value);
break;

default: /* 没有属性用到这个ID!! */
break;
}
}
c. 覆盖继承自基类的set/get_property方法,并且传入GParamSpecs。

/* 这里是我们覆盖函数的地方 */
void some_object_class_init (gpointer g_class, gpointer class_data)
{
GObjectClass *this_class = G_OBJECT_CLASS (g_class);
GParamSpec *spec;

this_class-> constructor = &some_object_constructor;
this_class-> dispose = &some_object_dispose;
this_class-> finalize = &some_object_finalize;

this_class-> set_property= &some_object_set_property;
this_class-> get_property = &some_object_get_property;

spec = g_param_spec_int
(
"property-a", /* 属性名称 */
"a", /* 属性昵称 */
"Mysterty value 1", /* 属性描述 */
5, /* 属性最大值 */
10, /* 属性最小值 */
5, /* 属性默认值 */
G_PARAM_READABLE |G_PARAM_WRITABLE /* GParamSpecFlags */
);
g_object_class_install_property(this_class,OBJECT_PROPERTY_A, spec);

spec = g_param_spec_float
(
"property-b", /* 属性名称 */
"b", /* 属性昵称 */
"Mysterty value 2" /* 属性描述 */
0.0, /* 属性最大值 */
1.0, /* 属性最小值 */
0.5, /* 属性默认值 */
G_PARAM_READABLE |G_PARAM_WRITABLE /* GParamSpecFlags */
);
g_object_class_install_property (this_class, OBJECT_PROPERTY_B, spec);
}

C++ VS Gobject

GObject相关学习文章的list.

面向对象是一种游戏规则,它不是游戏。C++只是面向对象的一种开发语言。

 

很多人在学校里面

1. 学习了C语言,知道这是写“面向结构的”

2. 学习了C++,知道这是面向对象的;

 

如果这个时候再接触GObject, 就会觉得很怪异,为啥要用C语言去实现一个面向对象的机制呢?

老是拿C++ VS GObject,老是觉得GObject浑身是刺,不好用,复杂的很!

 

如果你学过C语言之后,学校接着教面向对象,之后再教GObject, 你就不会觉得GObject是个异类了。

 

 

C++中用了一个struct, 里面包含了:成员变量,成员函数

GObject用了2个struct, 一个包含成员变量,另外一个包含成员函数;

 

 


http://tigersoldier.is-programmer.com/categories/4800/posts

C vs C++

C和面向对象,这不就是C++么?为什么要搞出另一套东西,而不直接使用C++呢?关于C与C++之争是一个大坑。Linux之父Linus就是力挺C而批判C++的。讨厌C++的人似乎认为C++过于复杂,内部机制陷阱过多等等。自己的经历不多,用C++也很少,达不到大牛们的境界,如果让我给个非要用C而不用C++的理由,我也给不出一个有说服力的。

为什么研究GObject

最原始的动力是,我在使用GTK+进行开发,而GObject是GTK+的基石。如果基础不牢,上层一定不会稳,因此很有必要把GObject给过一遍。知道了它的内部,才知道该如何使用它,明白它的机制与原理,做到心中有数。

但是研究GObject能带来更多。由于C里没有任何面向对象机制,因此GObject把这些机制全部实现了一遍。从中可以看到一些机制的实现原理,从而对面向对象有更多的理性了解。

第一步:封装

面向对象的最基本需求就是封装。所谓封装,按我的理解,就是将一系列相关数据,及对这些数据有关的操作,有序的组织在一个结构中。一个圆形有x坐标、y坐标、半径三个参数,我们可以用这三个变量表示一个圆:

double x, y, radius;

这没什么问题。现在多了一个圆,我们又要用三个变量:

double x1, y1, radius1;

当我们有很多个圆的时候,可能要用到数组:

double x[100], y[100], radius[100];

问题在哪?x、y和radius是相互独立的。我完全可以定义100个x,200个y,150个radius。如果不只有圆,还有矩形,那么矩形的坐标叫什么呢?xx、yy?等你写了一堆代码之后回来看,到底x和y是圆的坐标,还是xx和yy是圆的坐标?

所以有了struct。一个struct对数据进行了很自然的封装:

struct Circle {
double x, y, radius;
};

好了,现在我们有了Circle这个类型。这个类型将圆的三个参数封装到了一起,从现在开始它们就是一个整体了。我可以很自然的声明一个圆,而不是它的三个参数:

struct Circle c;

我们也不用担心x、y、radius的数量不等了,更不用担心圆的坐标和矩形坐标命名冲突——它们定义在Rectangle这个struct里呢:)。

事情还没有完。有了圆这个类型,那么对圆的操作呢?假设一个圆的操作之一为移动(move)。我们可以定义如下函数:

void circle_move (struct Circle *self, double x, double y) {
self->x = x;
self->y = y;
}

我们输入一个圆的指针,以及新的x、y坐标,移动操作帮助我们把指定的圆移动到新的坐标上。注意第一个参数self,是不是有点眼熟?它就是C++里的this。记得学C++时很多同学对this理解相当困难,如果看这个self就不难理解了:self就是我们要操作的那个变量,它是一个指针。C++在对象方法调用时省略了这个参数,它可以被编译器自动设置。在C里面,这个工作要我们自己做。因此移动一个圆要这么调用:

struct Circle cir;
circle_move (&cir, 10.0, 5.0);

注意self是个指针,因为C里没有引用,所以我们只能使用指针来达到传递一个对象,而不是传递它的复制品的效果。

这个方法……不就是普通的函数调用嘛,根本就没把操作给封装呀。好,现给一个看起来像C++中的方法:

struct Circle {
double x, y, radius;
void (*move) (struct Circle *self, double x, double y);
};
...
struct Circle cir;
cir.move = circle_move;
cir.move (&cir, 10.0, 5.0);

通过函数指针,可以让move调用看起来更像C++了。但是,有两个不爽的地方。其一,要显式地将circle_move函数赋值给move函数指针,如果有5个圆,那就要5行赋值的代码(除非用数组+循环)。更为严重的是我们可以为不同的变量指定不同的move操作。其二,调用时依然要显式地指定self,这带来的一个后果是,我们完全可以调用cir1的move,但是传入的是cir2的指针。

对于第一点,可以使用类结构+初始化函数来解决。对于第二点,C语言是没法避免显式地传入self指针的(如果可以的话请告诉我)。因此这种写法只是“像”C++而已,没啥实际的好处。不过在之后我们会看到,GObject会在类结构中使用函数指针来表示对象的操作。

小结

  • 研究C下的面向对象实现,可以让我们更深入地了解面向对象的机理;
  • 要将数据封装,可以使用struct;
  • 要表示对某种对象的操作,定义一组函数,其第一个参数为要操作对象的指针。
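上面小结的几条,特别是“类结构+初始化函数”的解法,可以用纯C合起来写成如下草图(CircleClass等名称为假设的示例):

```c
#include <assert.h>

struct Circle;

/* 类结构:函数指针只存一份,所有Circle实例共享 */
typedef struct CircleClass {
    void (*move)(struct Circle *self, double x, double y);
} CircleClass;

typedef struct Circle {
    const CircleClass *klass;   /* 指向共享的类结构 */
    double x, y, radius;
} Circle;

static void circle_move_impl(struct Circle *self, double x, double y) {
    self->x = x;
    self->y = y;
}

static const CircleClass circle_class = { circle_move_impl };

/* 初始化函数统一设置类指针,避免为每个实例手工赋值move */
static void circle_init(Circle *self, double x, double y, double r) {
    self->klass = &circle_class;
    self->x = x;
    self->y = y;
    self->radius = r;
}
```

这样无论实例化多少个圆,move的赋值只发生在初始化函数里一处,也不可能为不同实例指定不同的move。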

 

struct  _GObject
{
  GTypeInstance  g_type_instance;
  
  /*< private >*/
  volatile guint ref_count;
  GData         *qdata;
};
struct  _GObjectClass
{
  GTypeClass   g_type_class;

  /*< private >*/
  GSList      *construct_properties;

  /*< public >*/
  /* seldomly overidden */
  GObject*   (*constructor)     (GType                  type,
                                 guint                  n_construct_properties,
                                 GObjectConstructParam *construct_properties);
  /* overridable methods */
  void       (*set_property)		(GObject        *object,
                                         guint           property_id,
                                         const GValue   *value,
                                         GParamSpec     *pspec);
  void       (*get_property)		(GObject        *object,
                                         guint           property_id,
                                         GValue         *value,
                                         GParamSpec     *pspec);
  void       (*dispose)			(GObject        *object);
  void       (*finalize)		(GObject        *object);
  /* seldomly overidden */
  void       (*dispatch_properties_changed) (GObject      *object,
					     guint	   n_pspecs,
					     GParamSpec  **pspecs);
  /* signals */
  void	     (*notify)			(GObject	*object,
					 GParamSpec	*pspec);

  /* called when done constructing */
  void	     (*constructed)		(GObject	*object);

  /*< private >*/
  gsize		flags;

  /* padding */
  gpointer	pdummy[6];
};

http://zh.wikipedia.org/wiki/GObject

類別實作

每個 GObject 類別必須包含至少兩個結構:類別結構與實體結構。

類別結構
類別結構相當於 C++ 類別的 vtable。第一個元素必須是父類別的類別結構。裡面包含一組函式指標,也就是類別的虛擬方法。放在這裡的變數,就像是 C++ 類別裡的 const 或類別層級的成員。
實體結構
每個物件實體都將會是這個實體結構的副本,同樣地,第一個元素,必須是實體結構的父類別(這可以確保每個實體都有個指標可以指向類別結構,所有的基礎類別也同樣如此),在歸入父類別之後,可以在結構內放其他的變數,這就相當於 C++ 的成員變數。

C 結構沒有像 "public", "protected" 或 "private" 等的存取層級修飾,一般的方法是藉著在實體結構裡提供一個指向私有資料的指標,照慣例稱作 _priv。私有的結構可以宣告在公有的表頭檔案裡,然後把實體的定義寫在實作的檔案中,這樣作,對使用者來說,他並不知道私有資料是什麼,但對於實作者來說,卻可以很清楚的知道。如果私有結構也註冊在 GType 裡,那麼物件系統將會自動幫它配置空間。

GObject 框架最主要的不利點在於太過於冗長。像是手動定義的類型轉換巨集和難解的型別註冊咒語等的大量模板代碼都是建立新類別所必要的。GObject Builder 或 GOB2 這些工具試圖以提供樣板語法來解決這個問題。以 GOB2 寫的代碼必須事先處理過才能編譯。另外,Vala 可以將 c# 的語法轉換成 C,並編譯出獨立的二進制檔。

 

 

http://blog.csai.cn/user1/265/archives/2006/3301.html

注:此文只许被引用,非经作者同意不得擅自转载


一、概述

gtk是一套跨多种平台的图形工具包,虽然它是采用c语言来编写的,但是gtk显然具有很好的面向对象特性。为了今后自己能更好的使用gtk或扩展gtk系统,在这里俺对gtk对象系统与c++对象系统展开一个综合的比较。

二、支持面向对象的核心引擎

面向对象是一种思想!众所周知,诸如C++、java等语言,我们称它为面向对象的编程语言!那么究竟何谓“面向对象的编程语言”呢?简言之,能够从语言级别上较好的支持“面向对象”特性的,都可以称其为面向对象的编程语言。也即在编译器内部对这种语言予以了面向对象特性的直接支持,如抽象(反映一般和特殊的关系)和继承,聚合(反映局部和整体的关系)和封装等。

但是请注意,对面向对象编程方法的运用,不一定非要求我们必须采用支持“面向对象”的编程语言。因为面向对象编程是一种思想和方法,它本身与某种编程语言并无直接映射关系,它也更不会去依赖某种编程语言。或者换句话说,如果你选择了某种很好地支持了面向对象特性的编程语言,那么你也许能够更轻松地运用到一些面向对象的编程方法,去很好的解决你的业务问题。

因此,gtk虽然是采用了不支持面向对象特性的c语言所编写,但这并不影响gtk也具有了很好的面向对象特性。当然,与C++对象系统相比较,为了使gtk系统具有面向对象的特性,它就必须自己来实现那些由“c++编译器”所实现的对面向对象的支持。总结为一句话:C++对“面向对象”的支持是在“c++编译器”那里给实现的;而gtk系统中对“面向对象”的支持则必须由它自己来亲自实现,也即gtk面向对象的引擎是在gtk的运行库中,例如“类型”的创建、继承关系、“类型”的映射等!

总之,我们需要铭记一点:在gtk系统中,它在运行时刻,会维护一个全局的、大型的、能够反映所有继承关系的“类型关联表”。


三、共同的基类

gtk对象系统中所有的对象都有一个共同的基类,这一点很容易做到;而c++则没有;但java语言中有。

四、如何实现继承,扩展出子类

1、gtk系统的继承是利用struct关键字,它与c++中的class含义如出一辙!下面对重要的几点逐一说明一下:

2、c++中定义一个“类型”时,“成员变量”和“成员函数”都放在class数据结构之中,c++编译器会知道分别该怎么处理它。而gtk对象系统中,定义一个“类型”时,它的“成员变量”和“成员函数”应该显式将它们分开的,原因是“成员变量”的聚合才代表真正的“对象”,它可能是需要被实例化出很多的对象实例;而“对象类”则是“成员函数”的聚合,它表示对该类“对象”中成员变量的处理操作的统一封装,而且“对象类”只需要被实例化一次。示例如下:

//gtkcalendar.h文件中
struct _GtkCalendar
{
//第一个字段表示基类
GtkWidget widget;

//下面是其它数据字段("成员变量"的聚合)
...
}


struct _GtkCalendarClass
{
//第一个字段表示基类
GtkWidgetClass parent_class;

//下面是其它数据字段("成员函数"的聚合)
...
}


3、与c++中不同,由于gtk对象系统中必须自己维护所有继承关系的“类型关联表”,因此除了上面在.h文件中定义了类接口以外,它还必须在.c文件中来显式地实现这种关联。如下示例:

GType
gtk_calendar_get_type (void)
{
static GType calendar_type = 0;

if (!calendar_type)
{
static const GTypeInfo calendar_info =
{
sizeof (GtkCalendarClass),
NULL, /* base_init */
NULL, /* base_finalize */
//指定类型的初始化函数,每种“类型”只被实例化一次。
//在某种“类型”第一次被使用到时,它被实例化,这个函数同时也被调用
(GClassInitFunc) gtk_calendar_class_init,
NULL, /* class_finalize */
NULL, /* class_data */
sizeof (GtkCalendar),
0, /* n_preallocs */
(GInstanceInitFunc) gtk_calendar_init,//注意,这里指定对象的构造函数
};

// 注意!下面这条语句非常关键,尤其是第一个参数的指定,实际上它才代表“类型”的真正继承关系。
// GTK_TYPE_WIDGET代表一个"类型id",每种类型都有唯一对应的一个id,实际上,只有通过这个id才
// 可以访问到gtk系统中"类型”数据结构;而只有访问到了"类型”数据结构,gtk核心才知道该怎么去
// 创建实例化对象(如对象的尺寸大小等信息)
calendar_type = g_type_register_static (GTK_TYPE_WIDGET, "GtkCalendar",
&calendar_info, 0);
}

return calendar_type;
}

4、鉴于习惯的原因,在下文中,我们对gtk系统中相当于c++中“对象”的数据结构称为“构件”,如上面的_GtkCalendar;而相当于c++中“对象类”的gtk数据结构称为“构件类”,如上面的_GtkCalendarClass。


五、“成员变量”的可见性

1、在c++中,“成员变量”的可见性有public、protected、private三种。其实public属性的成员变量基本上不采用,因为我们坚决要杜绝这样做。

2、那么在gtk系统中,“成员变量”的可见性应该怎样控制呢?这里建议你按照如下规则:
(1)public的gtk不支持
(2)protected的,可以直接放在“构件”的数据字段中,它表示可以允许子类访问它
(3)对于需要private属性的,我们应该在“构件”的数据字段中定义一个特殊的字段,如_GtkCalendar构件中的gpointer private_data;再在这个构件的实现文件中定义这个只能被这个构件自己所访问并操作的“私有数据结构”体,如_GtkCalendar构件中的_GtkCalendarPrivateData。
(4)我们可以核对一下现在的gtk系统中每个gtk构件是不是都遵循以上的规则。

六、“成员函数”的可见性

1、同样,在c++中,“成员函数”的可见性也是有public、protected、private三种。public表示外部可见;protected表示子类可见;private表示只有自己可见。

2、在gtk系统中,“成员函数”的可见性又该如何来控制呢?有如下规则或建议:

(1)public属性的函数表示gtk的外部访问接口。注意!它虽然在.h文件中做声明,但是它并不是(也不能)在“构件类”中进行声明的字段,所以与c++中public属性的“成员函数”相对比,gtk中的这类外部访问接口函数,它不能被子类所覆盖或重载。也正因为如此,才导致gtk系统的外部访问接口函数简直是忒多忒多了,个人认为这是gtk对象系统中所不可避免的最大缺陷,因为使用者要花很多的精力来学习它和熟悉它。另外,gtk中的这类外部访问接口函数在声明时,第1个参数都会是“构件”自身,这相当于c++系统中隐式的this指针。示例如下:
//gtkcalendar.h文件中
//gtk_calendar_select_month相当于c++中public属性的“成员函数”,而GtkCalendar *calendar则相当于c++函数中隐式的this指针
gboolean gtk_calendar_select_month (GtkCalendar *calendar,
guint month,
guint year);

(2)请问,c++中protected属性的“成员函数”在gtk系统中如何来体现呢?其实不难理解,那就是那些直接在“构件类”中所声明的数据字段,它们才是来扮演protected属性“成员函数”这个重要角色的功能。当然值得注意的是,这类函数在声明时,都是一个个指向函数的指针(gtk系统中,把它们习惯称为“信号处理函数”,Signal handlers),而不是声明函数本身。其原因就是因为这类函数需要或允许我们能够在子类覆盖或重载它,有时甚至要求某些函数具有“多态性”,而在c++中,语言自身(如virtual所声明的虚函数)就具备了多态性,但在c语言所实现的gtk系统中,我们则可以通过“函数指针”来实现这种“多态性”,在后面的有关内容中,会进一步讨论它。示例如下:
//gtkcalendar.h文件中
struct _GtkCalendarClass
{
GtkWidgetClass parent_class;

/* Signal handlers */
void (* month_changed) (GtkCalendar *calendar);//相当于c++中protected属性的“成员函数”,同样第一个函数参数也是构件自身
void (* day_selected) (GtkCalendar *calendar);
...
};

(3)private属性的成员函数,好象在gtk系统中没有相对应的(似乎也感觉这类函数没多大必要)!其实不然,那些没在.h文件中所声明的,而仅在.c文件中所声明并使用的static函数,我们便可以把它们理解为c++中private属性的“成员函数”,个人认为它们扮演的角色差不多。

7. How "member functions" are overridden, and how polymorphism works

1. The "member functions" discussed here are the function pointers declared in the "widget class" struct (in role and purpose, they correspond to protected member functions in C++).

2. In GTK, all of these functions can be considered polymorphic virtual functions (equivalent to functions declared with the virtual keyword in C++).

3. How do we override such a function in GTK? (How C++ overrides member functions needs no explanation here.)

(1) First, declare and implement a corresponding static signal callback in the .c file, such as the gtk_calendar_realize function defined in gtkcalendar.c.

(2) Then the work happens in the "widget class" initialization function (for GtkCalendarClass this is gtk_calendar_class_init, which, as mentioned earlier, is registered in gtk_calendar_get_type). To override a function, simply assign your callback to the corresponding function pointer in the "widget class" struct. In GTK this is usually described as handling a particular signal; the effect is that the parent class's handler for that signal is overridden.

(3) As shown above, overriding a "member function" is easy. But there is a catch: after overriding a parent-class function, the subclass sometimes still wants to call the parent's implementation. In C++ we can use the scope resolution operator ("::") for static binding; how do we do that in GTK? It is not hard, and takes the following steps:

a. First, declare a static parent-class pointer in the .c file; the GtkCalendar widget declares:
// In gtkcalendar.c
static GtkWidgetClass *parent_class = NULL;

b. Then, in the class initialization function (gtk_calendar_class_init), assign parent_class as follows:
// In gtkcalendar.c
static void gtk_calendar_class_init (GtkCalendarClass *class)
{
...
parent_class = g_type_class_peek_parent (class); // obtain the parent's "widget class"
...
}

c. Finally, in your own "member function" you can freely call "member functions" in the parent class's scope. For example:
// In gtkcalendar.c
static void gtk_calendar_finalize (GObject *object)
{
GtkCalendarPrivateData *private_data;
private_data = GTK_CALENDAR_PRIVATE_DATA (object);

g_free (private_data);

// After releasing our own widget's resources, chain up to the parent's finalize so the parent widget can release the resources it acquired
(* G_OBJECT_CLASS (parent_class)->finalize) (object);
}
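The chain-up pattern in steps a through c can be sketched in plain C without GLib; BaseClass, derived_class_init, and the other names below are hypothetical stand-ins for the GTK machinery:

```c
#include <stddef.h>

typedef struct { void (*finalize)(void *obj); } BaseClass;
typedef struct { BaseClass parent_class; } DerivedClass;

static int base_finalized = 0;
static int derived_finalized = 0;

static void base_finalize(void *obj) { (void)obj; base_finalized = 1; }

/* the moral equivalent of `static GtkWidgetClass *parent_class = NULL;` */
static BaseClass *parent_class = NULL;

static void derived_finalize(void *obj) {
    derived_finalized = 1;        /* release our own resources first */
    parent_class->finalize(obj);  /* then chain up to the parent */
}

/* plays the role of gtk_calendar_class_init */
void derived_class_init(DerivedClass *klass, BaseClass *parent) {
    parent_class = parent;                            /* "peek" the parent class */
    klass->parent_class.finalize = derived_finalize;  /* override the slot */
}

int run_finalize_demo(void) {
    BaseClass base = { base_finalize };
    DerivedClass derived;
    derived_class_init(&derived, &base);
    derived.parent_class.finalize(NULL);  /* virtual dispatch through the vtable */
    return base_finalized && derived_finalized;
}
```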


8. Constructors and destructors

1. One outstanding advantage of object-oriented programming in C++ is that every object has a constructor and a destructor. This makes resource management far more convenient: we typically acquire resources in the constructor and release them in the destructor.

2. GTK has constructors too. For example, the constructor of the _GtkCalendar widget is gtk_calendar_init, which is registered in the GTypeInfo struct set up inside gtk_calendar_get_type.

3. GTK also has destructors, but they are not registered in the GTypeInfo struct. Why not? My take is that there are two reasons: first, there is simply no need to register it there; second, a destructor should generally be a virtual function, which makes the system more robust, rigorous, and simple. So in GTK the destructor is declared as the finalize signal in the "widget class"; see the definition of _GObjectClass in gobject.h.


9. Type conversion between objects

1. In C++ there are two kinds of object type conversion. Upcasting from a subclass to its parent is implicit and happens automatically; downcasting from a parent to a subclass is not implicit and must be explicit, requiring RTTI. You can of course bypass RTTI and brutally force-cast between types, but that is strongly discouraged in C++.

2. The GTK object system likewise has both upcasts and downcasts between widget types, but unlike C++, all conversions between GTK widget types must be explicit, whether upcast or downcast. This is because GTK implements object orientation entirely on its own, without any language-level support from the compiler, so every conversion must rely on the RTTI support in the GTK runtime library.

3. The macros declared at the top of each GTK widget's .h file all exist to support these RTTI-checked conversions. For example:
#define GTK_TYPE_CALENDAR (gtk_calendar_get_type ())
// The most frequently used macro is the one below; it performs the widget type conversion and returns a pointer
#define GTK_CALENDAR(obj) (G_TYPE_CHECK_INSTANCE_CAST ((obj), GTK_TYPE_CALENDAR, GtkCalendar))
#define GTK_CALENDAR_CLASS(klass) (G_TYPE_CHECK_CLASS_CAST ((klass), GTK_TYPE_CALENDAR, GtkCalendarClass))
#define GTK_IS_CALENDAR(obj) (G_TYPE_CHECK_INSTANCE_TYPE ((obj), GTK_TYPE_CALENDAR))
#define GTK_IS_CALENDAR_CLASS(klass) (G_TYPE_CHECK_CLASS_TYPE ((klass), GTK_TYPE_CALENDAR))
#define GTK_CALENDAR_GET_CLASS(obj) (G_TYPE_INSTANCE_GET_CLASS ((obj), GTK_TYPE_CALENDAR, GtkCalendarClass))
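Roughly what G_TYPE_CHECK_INSTANCE_CAST does can be modelled in plain C with a type tag at the start of every instance. This is a simplified sketch with hypothetical names, not the real GType implementation (which walks a full inheritance chain at runtime):

```c
#include <stddef.h>

enum { TYPE_WIDGET = 1, TYPE_CALENDAR = 2 };

/* every instance begins with its RTTI: its own type and its parent type */
typedef struct { int type; int parent_type; } Instance;

/* returns the instance if it is (or directly derives from) `type`, else NULL */
void *checked_cast(void *obj, int type) {
    Instance *inst = (Instance *)obj;
    if (inst == NULL) return NULL;
    if (inst->type == type || inst->parent_type == type) return obj;
    return NULL;
}

/* the moral equivalent of the GTK_CALENDAR(obj) cast macro */
#define MY_CALENDAR(obj) ((Instance *)checked_cast((obj), TYPE_CALENDAR))
```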

10. Finally, a brief look at using GTK widgets: creation and destruction

1. In C++ you can instantiate an object dynamically on the heap with the new keyword, or create a temporary instance (a local variable) on the stack. In GTK, all widgets are created on the heap, and each widget declares a dedicated external interface function for its creation; for example, gtk_calendar_new() creates a _GtkCalendar widget.

2. In C++, an object on the stack is destroyed automatically, while an object on the heap must be destroyed explicitly with the delete keyword. When an object is destroyed, the compiler implicitly inserts a call to its destructor to release the system resources the object holds.

3. In GTK, users generally need not worry about destroying widgets or releasing their resources; GTK handles this tedious work automatically and intelligently. The flow is roughly this: the user calls some external interface such as gtk_widget_destroy(...), which makes the widget emit the destroy signal; the default handler for that signal makes GTK destroy the widget internally and emit the finalize signal, after which GTK frees the widget's heap memory. gtk_widget_destroy also triggers destruction of all the widget's children, so widget destruction is somewhat "self-governing". Note that GTK 1.2 and GTK 2.0 may differ considerably in the details of this flow, though the underlying design ideas are essentially the same; the description above applies to GTK 1.2. I have not studied the destruction flow in GTK 2.0 and later in depth, but I expect it to be broadly similar, except that GTK 2.0 introduced a "reference counting" mechanism, so the trigger sequence may differ slightly.
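The reference-counting idea mentioned for GTK 2.0 can be sketched in plain C; RefObject and the functions below are hypothetical, not the real g_object_ref/g_object_unref API:

```c
#include <stddef.h>

typedef struct RefObject {
    int refcount;
    void (*finalize)(struct RefObject *self);
} RefObject;

void ref_object_init(RefObject *obj, void (*finalize)(RefObject *)) {
    obj->refcount = 1;  /* the creator holds the first reference */
    obj->finalize = finalize;
}

void ref_object_ref(RefObject *obj) { obj->refcount++; }

/* when the last reference drops, the finalize hook runs automatically,
 * much as the finalize signal fires at the end of widget destruction */
int ref_object_unref(RefObject *obj) {
    if (--obj->refcount == 0) {
        if (obj->finalize) obj->finalize(obj);
        return 1;  /* finalized */
    }
    return 0;
}

static int demo_finalized = 0;
static void demo_finalize(RefObject *self) { (void)self; demo_finalized = 1; }

int refcount_demo(void) {
    RefObject obj;
    ref_object_init(&obj, demo_finalize);
    ref_object_ref(&obj);    /* refcount: 2 */
    ref_object_unref(&obj);  /* refcount: 1, not finalized yet */
    ref_object_unref(&obj);  /* refcount: 0, finalize runs */
    return demo_finalized;
}
```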


11. Personal summary

C is a procedural language, yet GTK's designers managed to build such an exquisite object system with it. I am deeply impressed, and my admiration only grows; this is what a real software architect looks like! It also reminds me that a language is not the soul of a piece of software, it is merely a tool. I firmly believe that what a good programmer writes in C can be many times better than what a poor programmer writes in C++.

Of course, since GTK carries a whole graphics system, it is bound to be complex; and implementing such a complex system in a language as simple as C inevitably stamps it with many limitations. For example, its interface surface is huge, making it hard to learn and use; it has no exception-handling system, so robustness is a real concern; extending GTK is relatively difficult, much must be implemented by hand, and the code volume is large, all of which raises the cost of developing new widgets. So personally I suspect GTK will eventually be superseded and exit the stage of history, but what endures, and what deserves to stay with us, are the many design ideas and principles inside it.

GStreamer defines quite a few ABI-related interfaces.

// ZZ (reposted)

An Application Binary Interface (ABI) defines a set of rules that must be followed when compiling applications for, say, PowerPC system software, covering basic data types, use of general-purpose registers, parameter-passing conventions, stack usage, and so on.

An ABI covers many details: data types, sizes, and alignment; calling conventions (which govern how function arguments are passed and return values received); the encoding of system calls and how an application makes system calls to the operating system; and, in a complete operating-system ABI, the binary format of object files, program libraries, and so on. A complete ABI, such as the Intel Binary Compatibility Standard (iBCS)[1], allows a program to run unmodified on any operating system that supports that ABI.

Other ABI standardization details include C++ name mangling[2] and calling conventions between compilers on the same platform[3], but not cross-platform compatibility.

An ABI differs from an application programming interface (API). An API defines the interface between source code and libraries, so the same source can be compiled on any system supporting that API, whereas an ABI allows compiled object code to run without modification on any system with a compatible ABI. In Unix-style operating systems, many related but incompatible operating systems run on the same hardware platform (particularly Intel 80386-compatible systems). There have been several efforts to standardize the ABI so that vendors need less work to port programs to other systems, but so far none has been very successful, although the Linux Standard Base working group is making such an effort for Linux.

An API is, as the name says, a programming interface: the functions and the like that you call when writing an "application". From the kernel's point of view, its "applications" come in two kinds: real user-space applications above it, to which the kernel exposes system calls such as read(2) and write(2); and kernel modules, which live at the same level as the kernel and use exported kernel functions such as kmalloc() and printk(). These interfaces are what you see directly when writing code, and you can use them directly.

The ABI is an interface of another kind: a binary interface. Unless you write assembly directly, you generally cannot use it by hand. For example: which registers (or whether the stack) the kernel uses to pass system-call parameters, which register carries the return value, what the offset of a given field in some kernel struct is, and so on. These are binary-level interfaces, consumed by already-compiled binaries. In other words, if the ABI stays stable, applications and kernel modules compiled against an earlier version can run on the new version without recompilation. Another, more unusual, kind of ABI is the files exported under /proc and /sys: although not directly binary, they affect compiled binaries that use them, so these "interfaces" are also a form of ABI.

The standards you see every day, like POSIX and C99, specify APIs. Standards that specify ABIs are rarer and carry less weight; on Linux the only ABI standards seem to be the ones published by the Linux Foundation.

From the above we can see that keeping a stable ABI is much harder than keeping a stable API. For instance, the kernel function prototype int register_netdevice(struct net_device *dev) basically never changes, so keeping that API stable is easy. Its ABI is another matter: even if the function definition itself is unchanged, so the API is unchanged, a change to the definition of struct net_device, a field gained or lost, changes the ABI, and a previously compiled binary module will very likely misbehave and must be recompiled.

It may surprise you that the upstream Linux kernel not only keeps no stable ABI, it does not even keep a stable API! There is even a rather cocky document about it, stable_api_nonsense.txt. The rationale is that the kernel keeps moving forward, and fast; kernel developers do not want API constraints to slow them down. After all, nobody wants to become the next Windows! :-)

So it is almost impossible for your driver to run unmodified across different kernel versions; even if you are willing to recompile, it may not even build without changes. It may not even work across different distributions of the same major version.

What should you do, then? The best option is to contribute your driver to the community and get it merged into the kernel source tree. Then, whenever a kernel API changes, whoever changes it is obliged to fix your driver's code for you; you only need to review the change (and someone may even help with that), which saves you plenty of time. Why not? The other option is to base your work on a kernel that provides a stable ABI, such as Red Hat's RHEL (if you think this is an ad, please use CentOS, thank you!). Red Hat's enterprise kernel guarantees a stable ABI as long as you do not cross a major version, because the source tree checks for ABI changes, something we have put real effort into.
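The struct net_device point can be made concrete. The sketch below (with hypothetical struct definitions, not the real kernel ones) shows how inserting a field moves the offsets that compiled binaries rely on, breaking the ABI even though no function prototype changed:

```c
#include <stddef.h>

/* version 1 of a hypothetical kernel struct */
struct net_device_v1 {
    int  flags;
    long mtu;
};

/* version 2 inserts a field: same field names, different binary layout */
struct net_device_v2 {
    int  flags;
    long extra;  /* new field added before mtu */
    long mtu;
};

/* a module compiled against v1 reads `mtu` at the v1 offset, which in
 * a v2 struct lands on the wrong bytes */
int mtu_offset_changed(void) {
    return offsetof(struct net_device_v1, mtu)
        != offsetof(struct net_device_v2, mtu);
}
```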

struct _GstElement
{
  GstObject             object;

  /*< public >*/ /* with LOCK */
  GStaticRecMutex      *state_lock;

  /* element state */
  GCond                *state_cond;
  guint32               state_cookie;
  GstState              current_state;
  GstState              next_state;
  GstState              pending_state;
  GstStateChangeReturn  last_return;

  GstBus               *bus;

  /* allocated clock */
  GstClock             *clock;
  GstClockTimeDiff      base_time; /* NULL/READY: 0 - PAUSED: current time - PLAYING: difference to clock */

  /* element pads, these lists can only be iterated while holding
   * the LOCK or checking the cookie after each LOCK. */
  guint16               numpads;
  GList                *pads;
  guint16               numsrcpads;
  GList                *srcpads;
  guint16               numsinkpads;
  GList                *sinkpads;
  guint32               pads_cookie;

  /*< private >*/
  union {
    struct {
      /* state set by application */
      GstState              target_state;
      /* running time of the last PAUSED state */
      GstClockTime          start_time;
    } ABI;
    /* adding + 0 to mark ABI change to be undone later */
    gpointer _gst_reserved[GST_PADDING + 0];
  } abidata;
};
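The abidata union at the end of the struct above is GStreamer's ABI-stability trick: the struct ships with spare pointer-sized slots (`_gst_reserved[GST_PADDING]`), and later releases carve real fields out of the union without growing the struct, so the binary layout stays fixed. A plain-C sketch of the pattern, with hypothetical names:

```c
#include <stddef.h>

#define PADDING 4  /* stand-in for GST_PADDING */

typedef struct {
    int existing_field;
    /* private: reserved tail, identical size in every release */
    union {
        struct {
            int  target_state;  /* field added in a later release */
            long start_time;    /* another later addition */
        } ABI;
        void *_reserved[PADDING];
    } abidata;
} Element;

/* the union keeps the struct size constant as long as the new fields
 * fit inside PADDING pointers' worth of space */
int padding_still_fits(void) {
    return sizeof(((Element *)0)->abidata.ABI)
        <= sizeof(void *) * PADDING;
}
```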

 

struct _GstEvent {
  GstMiniObject mini_object;

  /*< public >*/ /* with COW */
  GstEventType  type;
  guint64       timestamp;
  GstObject     *src;

  GstStructure  *structure;

  /*< private >*/
  union {
    guint32 seqnum;
    gpointer _gst_reserved;
  } abidata;
};
struct _GstMessage
{
  GstMiniObject mini_object;

  /*< private >*//* with MESSAGE_LOCK */
  GMutex *lock;                 /* lock and cond for async delivery */
  GCond *cond;

  /*< public > *//* with COW */
  GstMessageType type;
  guint64 timestamp;
  GstObject *src;

  GstStructure *structure;

  /*< private >*/
  union {
    struct {
      guint32 seqnum;
    } ABI;
    /* + 0 to mark ABI change for future greppage */
    gpointer _gst_reserved[GST_PADDING + 0];
  } abidata;
};
struct _GstClock {
  GstObject	 object;

  GMutex	*slave_lock; /* order: SLAVE_LOCK, OBJECT_LOCK */

  /*< protected >*/ /* with LOCK */
  GstClockTime	 internal_calibration;
  GstClockTime	 external_calibration;
  GstClockTime	 rate_numerator;
  GstClockTime	 rate_denominator;
  GstClockTime	 last_time;
  GList		*entries;
  GCond		*entries_changed;

  /*< private >*/ /* with LOCK */
  GstClockTime	 resolution;
  gboolean	 stats;

  /* for master/slave clocks */
  GstClock      *master;

  /* with SLAVE_LOCK */
  gboolean       filling;
  gint           window_size;
  gint           window_threshold;
  gint           time_index;
  GstClockTime   timeout;
  GstClockTime  *times;
  GstClockID     clockid;

  /*< private >*/
  union {
    GstClockPrivate *priv;
    GstClockTime     _gst_reserved[GST_PADDING];
  } ABI;
};

 

struct _GstPad {
  GstObject			object;

  /*< public >*/
  gpointer			element_private;

  GstPadTemplate		*padtemplate;

  GstPadDirection		 direction;

  /*< public >*/ /* with STREAM_LOCK */
  /* streaming rec_lock */
  GStaticRecMutex		*stream_rec_lock;
  GstTask			*task;
  /*< public >*/ /* with PREROLL_LOCK */
  GMutex			*preroll_lock;
  GCond				*preroll_cond;

  /*< public >*/ /* with LOCK */
  /* block cond, mutex is from the object */
  GCond				*block_cond;
  GstPadBlockCallback		 block_callback;
  gpointer			 block_data;

  /* the pad capabilities */
  GstCaps			*caps;
  GstPadGetCapsFunction		getcapsfunc;
  GstPadSetCapsFunction		setcapsfunc;
  GstPadAcceptCapsFunction	 acceptcapsfunc;
  GstPadFixateCapsFunction	 fixatecapsfunc;

  GstPadActivateFunction	 activatefunc;
  GstPadActivateModeFunction	 activatepushfunc;
  GstPadActivateModeFunction	 activatepullfunc;

  /* pad link */
  GstPadLinkFunction		 linkfunc;
  GstPadUnlinkFunction		 unlinkfunc;
  GstPad			*peer;

  gpointer			 sched_private;

  /* data transport functions */
  GstPadChainFunction		 chainfunc;
  GstPadCheckGetRangeFunction	 checkgetrangefunc;
  GstPadGetRangeFunction	 getrangefunc;
  GstPadEventFunction		 eventfunc;

  GstActivateMode		 mode;

  /* generic query method */
  GstPadQueryTypeFunction	 querytypefunc;
  GstPadQueryFunction		 queryfunc;

  /* internal links */
#ifndef GST_DISABLE_DEPRECATED
  GstPadIntLinkFunction		 intlinkfunc;
#else
#ifndef __GTK_DOC_IGNORE__
  gpointer intlinkfunc;
#endif
#endif

  GstPadBufferAllocFunction      bufferallocfunc;

  /* whether to emit signals for have-data. counts number
   * of handlers attached. */
  gint				 do_buffer_signals;
  gint				 do_event_signals;

  /* ABI added */
  /* iterate internal links */
  GstPadIterIntLinkFunction     iterintlinkfunc;

  /* free block_data */
  GDestroyNotify block_destroy_data;

  /*< private >*/
  union {
    struct {
      gboolean                      block_callback_called;
      GstPadPrivate                *priv;
    } ABI;
    gpointer _gst_reserved[GST_PADDING - 2];
  } abidata;
};
struct _GstTask {
  GstObject      object;

  /*< public >*/ /* with LOCK */
  GstTaskState     state;
  GCond           *cond;

  GStaticRecMutex *lock;

  GstTaskFunction  func;
  gpointer         data;

  gboolean         running;

  /*< private >*/
  union {
    struct {
      /* thread this task is currently running in */
      GThread  *thread;
    } ABI;
    gpointer _gst_reserved[GST_PADDING - 1];
  } abidata;

  GstTaskPrivate *priv;
};

 

 

Using gst-inspect

GStreamer study list.

http://hi.baidu.com/dudangyimian/item/7afde5f9ca862c19e2e3bdb0

 

1. Default output

/YOURGST/bin/gst-inspect

This command is very commonly used; it shows how many plugins are installed on the system and how many features they provide in total.

2. Most verbose output

/YOURGST/bin/gst-inspect -a

This produces a huge amount of output; it is best to redirect it to a file for later searching.

3. List features usable with URIs

/YOURGST/bin/gst-inspect -u

The output consists of sources and sinks, i.e. the starting and ending points for building pipelines.

4. Query a specific plugin

For example, the core plugin:

/YOURGST/bin/gst-inspect --plugin coreelements

5. Query a specific feature

Query the file source element:

/YOURGST/bin/gst-inspect filesrc

Note that features come in three kinds: elements, typefinders, and indexes.

6. Query the information contained in a core plugin file

/YOURGST/bin/gst-inspect /YOURGST//lib/gstreamer-0.10/libgstcoreelements.so

gst-inspect :

rawparse:  videoparse: Video Parse
rawparse:  audioparse: Audio Parse
mulaw:  mulawenc: Mu Law audio encoder
mulaw:  mulawdec: Mu Law audio decoder
avi:  avidemux: Avi demuxer
avi:  avimux: Avi muxer
avi:  avisubtitle: Avi subtitle parser
dccp:  dccpclientsrc: DCCP client source
dccp:  dccpserversink: DCCP server sink
dccp:  dccpclientsink: DCCP client sink
dccp:  dccpserversrc: DCCP server source
jpeg:  jpegenc: JPEG image encoder
jpeg:  jpegdec: JPEG image decoder
jpeg:  smokeenc: Smoke video encoder
jpeg:  smokedec: Smoke video decoder
1394:  dv1394src: Firewire (1394) DV video source
1394:  hdv1394src: Firewire (1394) HDV video source
deinterlace:  deinterlace: Deinterlacer
nsf:  nsfdec: Nsf decoder
debug:  breakmydata: Break my data
debug:  capssetter: CapsSetter
debug:  rndbuffersize: Random buffer size
debug:  navseek: Seek based on left-right arrows
debug:  pushfilesrc: Push File Source
debug:  progressreport: Progress report
debug:  taginject: TagInject
debug:  testsink: Test plugin
debug:  capsdebug: Caps debug
debug:  cpureport: CPU report
ogg:  oggdemux: Ogg demuxer
ogg:  oggmux: Ogg muxer
ogg:  ogmaudioparse: OGM audio stream parser
ogg:  ogmvideoparse: OGM video stream parser
ogg:  ogmtextparse: OGM text stream parser
ogg:  oggparse: Ogg parser
ogg:  oggaviparse: Ogg AVI parser
flac:  flacenc: FLAC audio encoder
flac:  flacdec: FLAC audio decoder
flac:  flactag: FLAC tagger
fsrtcpfilter:  fsrtcpfilter: RTCP Filter element
pango:  textoverlay: Text overlay
pango:  timeoverlay: Time overlay
pango:  clockoverlay: Clock overlay
pango:  textrender: Text renderer
gio:  giosink: GIO sink
gio:  giosrc: GIO source
gio:  giostreamsink: GIO stream sink
gio:  giostreamsrc: GIO stream source
videomaxrate:  videomaxrate: Video maximum rate adjuster
alaw:  alawenc: A Law audio encoder
alaw:  alawdec: A Law audio decoder
pulseaudio:  pulsesink: PulseAudio Audio Sink
pulseaudio:  pulsesrc: PulseAudio Audio Source
pulseaudio:  pulseaudiosink: Bin wrapping pulsesink
pulseaudio:  pulsemixer: PulseAudio Mixer
dc1394:  dc1394src: 1394 IIDC Video Source
smpte:  smpte: SMPTE transitions
smpte:  smptealpha: SMPTE transitions
smooth:  smooth: Smooth effect
ivfparse:  ivfparse: IVF parser
tta:  ttaparse: TTA file parser
tta:  ttadec: TTA audio decoder
mxf:  mxfdemux: MXF Demuxer
mxf:  mxfmux: MXF muxer
dv:  dvdemux: DV system stream demuxer
dv:  dvdec: DV video decoder
effectv:  edgetv: EdgeTV effect
effectv:  agingtv: AgingTV effect
effectv:  dicetv: DiceTV effect
effectv:  warptv: WarpTV effect
effectv:  shagadelictv: ShagadelicTV
effectv:  vertigotv: VertigoTV effect
effectv:  revtv: RevTV effect
effectv:  quarktv: QuarkTV effect
effectv:  optv: OpTV effect
effectv:  radioactv: RadioacTV effect
effectv:  streaktv: StreakTV effect
effectv:  rippletv: RippleTV effect
mpegtsdemux:  tsparse: MPEG transport stream parser
mpegtsdemux:  tsdemux: MPEG transport stream demuxer
coreindexers:  memindex: A index that stores entries in memory
coreindexers:  fileindex: A index that stores entries in file
adder:  adder: Adder
soup:  souphttpsrc: HTTP client source
soup:  souphttpclientsink: HTTP client sink
subparse: subparse_typefind: srt, sub, mpsub, mdvd, smi, txt, dks
subparse:  subparse: Subtitle parser
subparse:  ssaparse: SSA Subtitle Parser
equalizer:  equalizer-nbands: N Band Equalizer
equalizer:  equalizer-3bands: 3 Band Equalizer
equalizer:  equalizer-10bands: 10 Band Equalizer
faceoverlay:  faceoverlay: faceoverlay
vcdsrc:  vcdsrc: VCD Source
udp:  udpsink: UDP packet sender
udp:  multiudpsink: UDP packet sender
udp:  dynudpsink: UDP packet sender
udp:  udpsrc: UDP packet receiver
adpcmdec:  adpcmdec: ADPCM decoder
wavenc:  wavenc: WAV audio muxer
imagefreeze:  imagefreeze: Still frame stream generator
cairo:  cairotextoverlay: Text overlay
cairo:  cairotimeoverlay: Time overlay
cairo:  cairooverlay: Cairo overlay
cairo:  cairorender: Cairo encoder
cutter:  cutter: Audio cutter
openal:  openalsink: Audio sink (OpenAL)
openal:  openalsrc: OpenAL src
speed:  speed: Speed
auparse:  auparse: AU audio demuxer
tcp:  tcpclientsink: TCP client sink
tcp:  tcpclientsrc: TCP client source
tcp:  tcpserversink: TCP server sink
tcp:  tcpserversrc: TCP server source
tcp:  multifdsink: Multi filedescriptor sink
sdp:  sdpdemux: SDP session setup
video4linux2:  v4l2src: Video (video4linux2) Source
video4linux2:  v4l2sink: Video (video4linux2) Sink
video4linux2:  v4l2radio: Radio (video4linux2) Tuner
dataurisrc:  dataurisrc: data: URI source element
modplug:  modplug: ModPlug
pcapparse:  pcapparse: PCapParse
pcapparse:  irtspparse: IRTSPParse
navigationtest:  navigationtest: Video navigation test
nice:  nicesrc: ICE source
nice:  nicesink: ICE sink
sndfile:  sfsink: Sndfile sink
sndfile:  sfsrc: Sndfile source
videosignal:  videoanalyse: Video analyser
videosignal:  videodetect: Video detecter
videosignal:  videomark: Video marker
autodetect:  autovideosink: Auto video sink
autodetect:  autovideosrc: Auto video source
autodetect:  autoaudiosink: Auto audio sink
autodetect:  autoaudiosrc: Auto audio source
dirac:  diracenc: Dirac Encoder
gstsiren:  sirendec: Siren Decoder element
gstsiren:  sirenenc: Siren Encoder element
decodebin:  decodebin: Decoder Bin
inter:  interaudiosrc: FIXME Long name
inter:  interaudiosink: FIXME Long name
inter:  intervideosrc: FIXME Long name
inter:  intervideosink: FIXME Long name
debugutilsbad:  checksumsink: Checksum sink
debugutilsbad:  fpsdisplaysink: Measure and show framerate on videosink
debugutilsbad:  chopmydata: FIXME
debugutilsbad:  compare: Compare buffers
debugutilsbad:  debugspy: DebugSpy
fsvideoanyrate:  fsvideoanyrate: Videoanyrate element
gconfelements:  gconfvideosink: GConf video sink
gconfelements:  gconfvideosrc: GConf video source
gconfelements:  gconfaudiosink: GConf audio sink
gconfelements:  gconfaudiosrc: GConf audio source
gstrtpmanager:  gstrtpbin: RTP Bin
gstrtpmanager:  gstrtpjitterbuffer: RTP packet jitter-buffer
gstrtpmanager:  gstrtpptdemux: RTP Demux
gstrtpmanager:  gstrtpsession: RTP Session
gstrtpmanager:  gstrtpssrcdemux: RTP SSRC Demux
videofilter:  gamma: Video gamma correction
videofilter:  videobalance: Video balance
videofilter:  videoflip: Video flipper
liveadder:  liveadder: Live Adder element
opus:  opusenc: Opus audio encoder
opus:  opusdec: Opus audio decoder
opus:  opusparse: Opus audio parser
opus:  rtpopusdepay: RTP Opus packet depayloader
opus:  rtpopuspay: RTP Opus payloader
app:  appsrc: AppSrc
app:  appsink: AppSink
apetag:  apedemux: APE tag demuxer
efence:  efence: Electric Fence
fsmsnconference:  fsmsncamsendconference: Farstream MSN Sending Conference
fsmsnconference:  fsmsncamrecvconference: Farstream MSN Reception Conference
ossaudio:  ossmixer: OSS Mixer
ossaudio:  osssrc: Audio Source (OSS)
ossaudio:  osssink: Audio Sink (OSS)
videoscale:  videoscale: Video scaler
nuvdemux:  nuvdemux: Nuv demuxer
freeverb:  freeverb: Stereo positioning
mpegvideoparse:  legacympegvideoparse: MPEG video elementary stream parser
multipart:  multipartdemux: Multipart demuxer
multipart:  multipartmux: Multipart muxer
decklink:  decklinksrc: Decklink source
decklink:  decklinksink: Decklink Sink
resindvd:  rsndvdbin: rsndvdbin
cdparanoia:  cdparanoiasrc: CD Audio (cdda) Source, Paranoia IV
dfbvideosink:  dfbvideosink: DirectFB video sink
videomixer:  videomixer: Video mixer
videomixer:  videomixer2: Video mixer 2
coloreffects:  coloreffects: Color Look-up Table filter
coloreffects:  chromahold: Chroma hold filter
spandsp:  spanplc: SpanDSP PLC
mpegdemux2:  mpegpsdemux: The Fluendo MPEG Program Stream Demuxer
mpegdemux2:  mpegtsdemux: The Fluendo MPEG Transport stream demuxer
mpegdemux2:  mpegtsparse: MPEG transport stream parser
colorspace:  colorspace:  Colorspace converter
gsettings:  gsettingsaudiosink: GSettings audio sink
gsettings:  gsettingsaudiosrc: GSettings audio src
gsettings:  gsettingsvideosink: GSettings video sink
gsettings:  gsettingsvideosrc: GSettings video src
videomeasure:  ssim: SSim
videomeasure:  measurecollector: Video measure collector
vmnc:  vmncdec: VMnc video decoder
camerabin:  camerabin: Camera Bin
dtsdec:  dtsdec: DTS audio decoder
postproc:  postproc_hdeblock: LibPostProc hdeblock filter
postproc:  postproc_vdeblock: LibPostProc vdeblock filter
postproc:  postproc_x1hdeblock: LibPostProc x1hdeblock filter
postproc:  postproc_x1vdeblock: LibPostProc x1vdeblock filter
postproc:  postproc_ahdeblock: LibPostProc ahdeblock filter
postproc:  postproc_avdeblock: LibPostProc avdeblock filter
postproc:  postproc_dering: LibPostProc dering filter
postproc:  postproc_autolevels: LibPostProc autolevels filter
postproc:  postproc_linblenddeint: LibPostProc linblenddeint filter
postproc:  postproc_linipoldeint: LibPostProc linipoldeint filter
postproc:  postproc_cubicipoldeint: LibPostProc cubicipoldeint filter
postproc:  postproc_mediandeint: LibPostProc mediandeint filter
postproc:  postproc_ffmpegdeint: LibPostProc ffmpegdeint filter
postproc:  postproc_lowpass5: LibPostProc lowpass5 filter
postproc:  postproc_tmpnoise: LibPostProc tmpnoise filter
postproc:  postproc_forcequant: LibPostProc forcequant filter
postproc:  postproc_default: LibPostProc default filter
bz2:  bz2enc: BZ2 encoder
bz2:  bz2dec: BZ2 decoder
assrender:  assrender: ASS/SSA Render
jpegformat:  jpegparse: JPEG stream parser
jpegformat:  jifmux: JPEG stream muxer
alpha:  alpha: Alpha filter
libvisual:  libvisual_bumpscope: libvisual Bumpscope plugin plugin v.0.0.1
libvisual:  libvisual_corona: libvisual libvisual corona plugin plugin v.0.1
libvisual:  libvisual_infinite: libvisual infinite plugin plugin v.0.1
libvisual:  libvisual_jakdaw: libvisual Jakdaw plugin plugin v.0.0.1
libvisual:  libvisual_jess: libvisual jess plugin plugin v.0.1
libvisual:  libvisual_lv_analyzer: libvisual libvisual analyzer plugin v.1.0
libvisual:  libvisual_lv_scope: libvisual libvisual scope plugin v.0.1
libvisual:  libvisual_oinksie: libvisual oinksie plugin plugin v.0.1
theora:  theoradec: Theora video decoder
theora:  theoraenc: Theora video encoder
theora:  theoraparse: Theora video parser
geometrictransform:  circle: circle
geometrictransform:  diffuse: diffuse
geometrictransform:  kaleidoscope: kaleidoscope
geometrictransform:  marble: marble
geometrictransform:  pinch: pinch
geometrictransform:  rotate: rotate
geometrictransform:  sphere: sphere
geometrictransform:  twirl: twirl
geometrictransform:  waterripple: waterripple
geometrictransform:  stretch: stretch
geometrictransform:  bulge: bulge
geometrictransform:  tunnel: tunnel
geometrictransform:  square: square
geometrictransform:  mirror: mirror
geometrictransform:  fisheye: fisheye
spectrum:  spectrum: Spectrum analyzer
dvdspu:  dvdspu: Sub-picture Overlay
zbar:  zbar: Barcode detector
xvimagesink:  xvimagesink: Video sink
hdvparse:  hdvparse: HDVParser
speex:  speexenc: Speex audio encoder
speex:  speexdec: Speex audio decoder
interleave:  interleave: Audio interleaver
interleave:  deinterleave: Audio deinterleaver
audiofx:  audiopanorama: Stereo positioning
audiofx:  audioinvert: Audio inversion
audiofx:  audiokaraoke: AudioKaraoke
audiofx:  audioamplify: Audio amplifier
audiofx:  audiodynamic: Dynamic range controller
audiofx:  audiocheblimit: Low pass & high pass filter
audiofx:  audiochebband: Band pass & band reject filter
audiofx:  audioiirfilter: Audio IIR filter
audiofx:  audiowsinclimit: Low pass & high pass filter
audiofx:  audiowsincband: Band pass & band reject filter
audiofx:  audiofirfilter: Audio FIR filter
audiofx:  audioecho: Audio echo
ffmpegcolorspace:  ffmpegcolorspace: FFMPEG Colorspace converter
fsrtpconference:  fsrtpconference: Farstream RTP Conference
ffvideoscale:  ffvideoscale: FFMPEG Scale element
ximagesrc:  ximagesrc: Ximage video source
subenc:  srtenc: Srt encoder
subenc:  webvttenc: WebVTT encoder
schro:  schrodec: Dirac Decoder
schro:  schroenc: Dirac Encoder
alsa:  alsamixer: Alsa mixer
alsa:  alsasrc: Audio source (ALSA)
alsa:  alsasink: Audio sink (ALSA)
fragmented:  hlsdemux: HLS Demuxer
aiff:  aiffparse: AIFF audio demuxer
aiff:  aiffmux: AIFF audio muxer
mimic:  mimenc: Mimic Encoder
mimic:  mimdec: Mimic Decoder
videofiltersbad:  scenechange: Scene change detector
videofiltersbad:  zebrastripe: Zebra stripe overlay
dvbsuboverlay:  dvbsuboverlay: DVB Subtitles Overlay
png:  pngdec: PNG image decoder
png:  pngenc: PNG image encoder
apexsink:  apexsink: Apple AirPort Express Audio Sink
typefindfunctions: video/x-ms-asf: asf, wm, wma, wmv
typefindfunctions: audio/x-musepack: mpc, mpp, mp+
typefindfunctions: audio/x-au: au, snd
typefindfunctions: video/x-msvideo: avi
typefindfunctions: audio/qcelp: qcp
typefindfunctions: video/x-cdxa: dat
typefindfunctions: video/x-vcd: dat
typefindfunctions: audio/x-imelody: imy, ime, imelody
typefindfunctions: audio/midi: mid, midi
typefindfunctions: audio/riff-midi: mid, midi
typefindfunctions: audio/mobile-xmf: mxmf
typefindfunctions: video/x-fli: flc, fli
typefindfunctions: application/x-id3v2: mp3, mp2, mp1, mpga, ogg, flac, tta
typefindfunctions: application/x-id3v1: mp3, mp2, mp1, mpga, ogg, flac, tta
typefindfunctions: application/x-apetag: mp3, ape, mpc, wv
typefindfunctions: audio/x-ttafile: tta
typefindfunctions: audio/x-mod: 669, amf, dsm, gdm, far, imf, it, med, mod, mtm, okt, sam, s3m, stm, stx, ult, xm
typefindfunctions: audio/mpeg: mp3, mp2, mp1, mpga
typefindfunctions: audio/x-ac3: ac3, eac3
typefindfunctions: audio/x-dts: dts
typefindfunctions: audio/x-gsm: gsm
typefindfunctions: video/mpeg-sys: mpe, mpeg, mpg
typefindfunctions: video/mpegts: ts, mts
typefindfunctions: application/ogg: anx, ogg, ogm
typefindfunctions: video/mpeg-elementary: mpv, mpeg, mpg
typefindfunctions: video/mpeg4: m4v
typefindfunctions: video/x-h263: h263, 263
typefindfunctions: video/x-h264: h264, x264, 264
typefindfunctions: video/x-nuv: nuv
typefindfunctions: audio/x-m4a: m4a
typefindfunctions: application/x-3gp: 3gp
typefindfunctions: video/quicktime: mov
typefindfunctions: image/x-quicktime: qif, qtif, qti
typefindfunctions: image/jp2: jp2
typefindfunctions: video/mj2: mj2
typefindfunctions: text/html: htm, html
typefindfunctions: application/vnd.rn-realmedia: ra, ram, rm, rmvb
typefindfunctions: application/x-pn-realaudio: ra, ram, rm, rmvb
typefindfunctions: application/x-shockwave-flash: swf, swfl
typefindfunctions: video/x-flv: flv
typefindfunctions: text/plain: txt
typefindfunctions: text/utf-16: txt
typefindfunctions: text/utf-32: txt
typefindfunctions: text/uri-list: ram
typefindfunctions: application/x-hls: m3u8
typefindfunctions: application/sdp: sdp
typefindfunctions: application/smil: smil
typefindfunctions: application/xml: xml
typefindfunctions: audio/x-wav: wav
typefindfunctions: audio/x-aiff: aiff, aif, aifc
typefindfunctions: audio/x-svx: iff, svx
typefindfunctions: audio/x-paris: paf
typefindfunctions: audio/x-nist: nist
typefindfunctions: audio/x-voc: voc
typefindfunctions: audio/x-sds: sds
typefindfunctions: audio/x-ircam: sf
typefindfunctions: audio/x-w64: w64
typefindfunctions: audio/x-shorten: shn
typefindfunctions: application/x-ape: ape
typefindfunctions: image/jpeg: jpg, jpe, jpeg
typefindfunctions: image/gif: gif
typefindfunctions: image/png: png
typefindfunctions: image/bmp: bmp
typefindfunctions: image/tiff: tif, tiff
typefindfunctions: image/x-portable-pixmap: pnm, ppm, pgm, pbm
typefindfunctions: video/x-matroska: mkv, mka
typefindfunctions: video/webm: webm
typefindfunctions: application/mxf: mxf
typefindfunctions: video/x-mve: mve
typefindfunctions: video/x-dv: dv, dif
typefindfunctions: audio/x-amr-nb-sh: amr
typefindfunctions: audio/x-amr-wb-sh: amr
typefindfunctions: audio/iLBC-sh: ilbc
typefindfunctions: audio/x-sid: sid
typefindfunctions: image/x-xcf: xcf
typefindfunctions: video/x-mng: mng
typefindfunctions: image/x-jng: jng
typefindfunctions: image/x-xpixmap: xpm
typefindfunctions: image/x-sun-raster: ras
typefindfunctions: application/x-bzip: bz2
typefindfunctions: application/x-gzip: gz
typefindfunctions: application/zip: zip
typefindfunctions: application/x-compress: Z
typefindfunctions: subtitle/x-kate: no extensions
typefindfunctions: audio/x-flac: flac
typefindfunctions: audio/x-vorbis: no extensions
typefindfunctions: video/x-theora: no extensions
typefindfunctions: application/x-ogm-video: no extensions
typefindfunctions: application/x-ogm-audio: no extensions
typefindfunctions: application/x-ogm-text: no extensions
typefindfunctions: audio/x-speex: no extensions
typefindfunctions: audio/x-celt: no extensions
typefindfunctions: application/x-ogg-skeleton: no extensions
typefindfunctions: text/x-cmml: no extensions
typefindfunctions: application/x-executable: no extensions
typefindfunctions: audio/aac: aac, adts, adif, loas
typefindfunctions: audio/x-spc: spc
typefindfunctions: audio/x-wavpack: wv, wvp
typefindfunctions: audio/x-wavpack-correction: wvc
typefindfunctions: application/postscript: ps
typefindfunctions: image/svg+xml: svg
typefindfunctions: application/x-rar: rar
typefindfunctions: application/x-tar: tar
typefindfunctions: application/x-ar: a
typefindfunctions: application/x-ms-dos-executable: dll, exe, ocx, sys, scr, msstyles, cpl
typefindfunctions: video/x-dirac: no extensions
typefindfunctions: multipart/x-mixed-replace: no extensions
typefindfunctions: application/x-mmsh: no extensions
typefindfunctions: video/vivo: viv
typefindfunctions: audio/x-nsf: nsf
typefindfunctions: audio/x-gym: gym
typefindfunctions: audio/x-ay: ay
typefindfunctions: audio/x-gbs: gbs
typefindfunctions: audio/x-vgm: vgm
typefindfunctions: audio/x-sap: sap
typefindfunctions: video/x-ivf: ivf
typefindfunctions: audio/x-kss: kss
typefindfunctions: application/pdf: pdf
typefindfunctions: application/msword: doc
typefindfunctions: application/octet-stream: DS_Store
typefindfunctions: image/vnd.adobe.photoshop: psd
typefindfunctions: image/vnd.wap.wbmp: no extensions
typefindfunctions: application/x-yuv4mpeg: y4m
typefindfunctions: image/x-icon: no extensions
typefindfunctions: xdgmime-base: no extensions
typefindfunctions: image/x-degas: no extensions
jp2kdecimator:  jp2kdecimator: JPEG2000 decimator
ximagesink:  ximagesink: Video sink
segmentclip:  audiosegmentclip: Audio buffer segment clipper
segmentclip:  videosegmentclip: Video buffer segment clipper
wavpack:  wavpackparse: Wavpack parser
wavpack:  wavpackdec: Wavpack audio decoder
wavpack:  wavpackenc: Wavpack audio encoder
cdxaparse:  cdxaparse: (S)VCD parser
cdxaparse:  vcdparse: (S)VCD stream parser
oss4:  oss4sink: OSS v4 Audio Sink
oss4:  oss4src: OSS v4 Audio Source
oss4:  oss4mixer: OSS v4 Audio Mixer
asfmux:  asfmux: ASF muxer
asfmux:  rtpasfpay: RTP ASF payloader
asfmux:  asfparse: ASF parser
bluetooth: sbc: sbc
bluetooth:  sbcenc: Bluetooth SBC encoder
bluetooth:  sbcdec: Bluetooth SBC decoder
bluetooth:  sbcparse: Bluetooth SBC parser
bluetooth:  avdtpsink: Bluetooth AVDTP sink
bluetooth:  a2dpsink: Bluetooth A2DP sink
bluetooth:  rtpsbcpay: RTP packet payloader
coreelements:  capsfilter: CapsFilter
coreelements:  fakesrc: Fake Source
coreelements:  fakesink: Fake Sink
coreelements:  fdsrc: Filedescriptor Source
coreelements:  fdsink: Filedescriptor Sink
coreelements:  filesrc: File Source
coreelements:  funnel: Funnel pipe fitting
coreelements:  identity: Identity
coreelements:  input-selector: Input selector
coreelements:  output-selector: Output selector
coreelements:  queue: Queue
coreelements:  queue2: Queue 2
coreelements:  filesink: File Sink
coreelements:  tee: Tee pipe fitting
coreelements:  typefind: TypeFind
coreelements:  multiqueue: MultiQueue
coreelements:  valve: Valve element
mpegpsmux:  mpegpsmux: MPEG Program Stream Muxer
annodex:  cmmlenc: CMML streams encoder
annodex:  cmmldec: CMML stream decoder
voaacenc:  voaacenc: AAC audio encoder
xvid:  xvidenc: XviD video encoder
xvid:  xviddec: XviD video decoder
festival:  festival: Festival Text-to-Speech synthesizer
camerabin2:  viewfinderbin: Viewfinder Bin
camerabin2:  wrappercamerabinsrc: V4l2 camera src element for camerabin
camerabin2:  camerabin2: CameraBin2
goom:  goom: GOOM: what a GOOM!
autoconvert:  autoconvert: Select convertor based on caps
autoconvert:  autovideoconvert: Select color space convertor based on caps
id3demux:  id3demux: ID3 tag demuxer
audioresample:  audioresample: Audio resampler
flv:  flvdemux: FLV Demuxer
flv:  flvmux: FLV muxer
level:  level: Level
interlace:  interlace: Interlace filter
adpcmenc:  adpcmenc: ADPCM encoder
pnm:  pnmdec: PNM image decoder
pnm:  pnmenc: PNM image encoder
videocrop:  videocrop: Crop
videocrop:  aspectratiocrop: aspectratiocrop
patchdetect:  patchdetect: Color Patch Detector
audioconvert:  audioconvert: Audio converter
id3tag:  id3mux: ID3 v1 and v2 Muxer
freeze:  freeze: Stream freezer
multifile:  multifilesrc: Multi-File Source
multifile:  multifilesink: Multi-File Sink
multifile:  splitfilesrc: Split-File Source
wavparse:  wavparse: WAV audio demuxer
wildmidi:  wildmidi: WildMidi
taglib:  id3v2mux: TagLib-based ID3v2 Muxer
taglib:  apev2mux: TagLib-based APEv2 Muxer
mpegtsmux:  mpegtsmux: MPEG Transport Stream Muxer
uridecodebin:  decodebin2: Decoder Bin
uridecodebin:  uridecodebin: URI Decoder
fieldanalysis:  fieldanalysis: Video field analysis
volume:  volume: Volume
playback:  playbin: Player Bin
playback:  playbin2: Player Bin 2
playback:  playsink: Player Sink
playback:  subtitleoverlay: Subtitle Overlay
shout2send:  shout2send: Icecast network sink
rtsp:  rtspsrc: RTSP packet receiver
rtsp:  rtpdec: RTP Decoder
y4menc:  y4menc: YUV4MPEG video encoder
sdi:  sdidemux: SDI Demuxer
sdi:  sdimux: SDI Muxer
rtpmux:  rtpmux: RTP muxer
rtpmux:  rtpdtmfmux: RTP muxer
icydemux:  icydemux: ICY tag demuxer
legacyresample:  legacyresample: Audio scaler
soundtouch:  pitch: Pitch controller
soundtouch:  bpmdetect: BPM Detector
gdp:  gdpdepay: GDP Depayloader
gdp:  gdppay: GDP Payloader
kate:  katedec: Kate stream text decoder
kate:  kateenc: Kate stream encoder
kate:  kateparse: Kate stream parser
kate:  katetag: Kate stream tagger
mms:  mmssrc: MMS streaming source
cacasink:  cacasink: A colored ASCII art video sink
shapewipe:  shapewipe: Shape Wipe transition filter
isomp4:  qtdemux: QuickTime demuxer
isomp4:  rtpxqtdepay: RTP packet depayloader
isomp4:  qtmux: QuickTime Muxer
isomp4:  mp4mux: MP4 Muxer
isomp4:  ismlmux: ISML Muxer
isomp4:  3gppmux: 3GPP Muxer
isomp4:  gppmux: 3GPP Muxer
isomp4:  mj2mux: MJ2 Muxer
isomp4:  qtmoovrecover: QT Moov Recover
shm:  shmsrc: Shared Memory Source
shm:  shmsink: Shared Memory Sink
vorbis:  vorbisenc: Vorbis audio encoder
vorbis:  vorbisdec: Vorbis audio decoder
vorbis:  vorbisparse: VorbisParse
vorbis:  vorbistag: VorbisTag
cdaudio:  cdaudio: CD player
ffmpeg:  ffenc_a64multi: FFmpeg Multicolor charset for Commodore 64 encoder
ffmpeg:  ffenc_a64multi5: FFmpeg Multicolor charset for Commodore 64, extended with 5th color (colram) encoder
ffmpeg:  ffenc_asv1: FFmpeg ASUS V1 encoder
ffmpeg:  ffenc_asv2: FFmpeg ASUS V2 encoder
ffmpeg:  ffenc_bmp: FFmpeg BMP image encoder
ffmpeg:  ffenc_cljr: FFmpeg Cirrus Logic AccuPak encoder
ffmpeg:  ffenc_dnxhd: FFmpeg VC3/DNxHD encoder
ffmpeg:  ffenc_dpx: FFmpeg DPX image encoder
ffmpeg:  ffenc_dvvideo: FFmpeg DV (Digital Video) encoder
ffmpeg:  ffenc_ffv1: FFmpeg FFmpeg video codec #1 encoder
ffmpeg:  ffenc_ffvhuff: FFmpeg Huffyuv FFmpeg variant encoder
ffmpeg:  ffenc_flashsv: FFmpeg Flash Screen Video encoder
ffmpeg:  ffenc_flv: FFmpeg Flash Video (FLV) / Sorenson Spark / Sorenson H.263 encoder
ffmpeg:  ffenc_h261: FFmpeg H.261 encoder
ffmpeg:  ffenc_h263: FFmpeg H.263 / H.263-1996 encoder
ffmpeg:  ffenc_h263p: FFmpeg H.263+ / H.263-1998 / H.263 version 2 encoder
ffmpeg:  ffenc_huffyuv: FFmpeg Huffyuv / HuffYUV encoder
ffmpeg:  ffenc_jpegls: FFmpeg JPEG-LS encoder
ffmpeg:  ffenc_ljpeg: FFmpeg Lossless JPEG encoder
ffmpeg:  ffenc_mjpeg: FFmpeg MJPEG (Motion JPEG) encoder
ffmpeg:  ffenc_mpeg1video: FFmpeg MPEG-1 video encoder
ffmpeg:  ffenc_mpeg2video: FFmpeg MPEG-2 video encoder
ffmpeg:  ffenc_mpeg4: FFmpeg MPEG-4 part 2 encoder
ffmpeg:  ffenc_msmpeg4v2: FFmpeg MPEG-4 part 2 Microsoft variant version 2 encoder
ffmpeg:  ffenc_msmpeg4: FFmpeg MPEG-4 part 2 Microsoft variant version 3 encoder
ffmpeg:  ffenc_pam: FFmpeg PAM (Portable AnyMap) image encoder
ffmpeg:  ffenc_pbm: FFmpeg PBM (Portable BitMap) image encoder
ffmpeg:  ffenc_pcx: FFmpeg PC Paintbrush PCX image encoder
ffmpeg:  ffenc_pgm: FFmpeg PGM (Portable GrayMap) image encoder
ffmpeg:  ffenc_pgmyuv: FFmpeg PGMYUV (Portable GrayMap YUV) image encoder
ffmpeg:  ffenc_png: FFmpeg PNG image encoder
ffmpeg:  ffenc_ppm: FFmpeg PPM (Portable PixelMap) image encoder
ffmpeg:  ffenc_qtrle: FFmpeg QuickTime Animation (RLE) video encoder
ffmpeg:  ffenc_roqvideo: FFmpeg id RoQ video encoder
ffmpeg:  ffenc_rv10: FFmpeg RealVideo 1.0 encoder
ffmpeg:  ffenc_rv20: FFmpeg RealVideo 2.0 encoder
ffmpeg:  ffenc_sgi: FFmpeg SGI image encoder
ffmpeg:  ffenc_snow: FFmpeg Snow encoder
ffmpeg:  ffenc_svq1: FFmpeg Sorenson Vector Quantizer 1 / Sorenson Video 1 / SVQ1 encoder
ffmpeg:  ffenc_targa: FFmpeg Truevision Targa image encoder
ffmpeg:  ffenc_tiff: FFmpeg TIFF image encoder
ffmpeg:  ffenc_v410: FFmpeg Uncompressed 4:4:4 10-bit encoder
ffmpeg:  ffenc_wmv1: FFmpeg Windows Media Video 7 encoder
ffmpeg:  ffenc_wmv2: FFmpeg Windows Media Video 8 encoder
ffmpeg:  ffenc_zmbv: FFmpeg Zip Motion Blocks Video encoder
ffmpeg:  ffenc_aac: FFmpeg Advanced Audio Coding encoder
ffmpeg:  ffenc_ac3: FFmpeg ATSC A/52A (AC-3) encoder
ffmpeg:  ffenc_ac3_fixed: FFmpeg ATSC A/52A (AC-3) encoder
ffmpeg:  ffenc_alac: FFmpeg ALAC (Apple Lossless Audio Codec) encoder
ffmpeg:  ffenc_eac3: FFmpeg ATSC A/52 E-AC-3 encoder
ffmpeg:  ffenc_mp2: FFmpeg MP2 (MPEG audio layer 2) encoder
ffmpeg:  ffenc_nellymoser: FFmpeg Nellymoser Asao encoder
ffmpeg:  ffenc_real_144: FFmpeg RealAudio 1.0 (14.4K) encoder encoder
ffmpeg:  ffenc_wmav1: FFmpeg Windows Media Audio 1 encoder
ffmpeg:  ffenc_wmav2: FFmpeg Windows Media Audio 2 encoder
ffmpeg:  ffenc_roq_dpcm: FFmpeg id RoQ DPCM encoder
ffmpeg:  ffenc_adpcm_adx: FFmpeg SEGA CRI ADX ADPCM encoder
ffmpeg:  ffenc_g722: FFmpeg G.722 ADPCM encoder
ffmpeg:  ffenc_g726: FFmpeg G.726 ADPCM encoder
ffmpeg:  ffenc_adpcm_ima_qt: FFmpeg ADPCM IMA QuickTime encoder
ffmpeg:  ffenc_adpcm_ima_wav: FFmpeg ADPCM IMA WAV encoder
ffmpeg:  ffenc_adpcm_ms: FFmpeg ADPCM Microsoft encoder
ffmpeg:  ffenc_adpcm_swf: FFmpeg ADPCM Shockwave Flash encoder
ffmpeg:  ffenc_adpcm_yamaha: FFmpeg ADPCM Yamaha encoder
ffmpeg:  ffdec_aasc: FFmpeg Autodesk RLE decoder
ffmpeg:  ffdec_amv: FFmpeg AMV Video decoder
ffmpeg:  ffdec_anm: FFmpeg Deluxe Paint Animation decoder
ffmpeg:  ffdec_ansi: FFmpeg ASCII/ANSI art decoder
ffmpeg:  ffdec_asv1: FFmpeg ASUS V1 decoder
ffmpeg:  ffdec_asv2: FFmpeg ASUS V2 decoder
ffmpeg:  ffdec_aura: FFmpeg Auravision AURA decoder
ffmpeg:  ffdec_aura2: FFmpeg Auravision Aura 2 decoder
ffmpeg:  ffdec_avs: FFmpeg AVS (Audio Video Standard) video decoder
ffmpeg:  ffdec_bethsoftvid: FFmpeg Bethesda VID video decoder
ffmpeg:  ffdec_bfi: FFmpeg Brute Force & Ignorance decoder
ffmpeg:  ffdec_binkvideo: FFmpeg Bink video decoder
ffmpeg:  ffdec_bmp: FFmpeg BMP image decoder
ffmpeg:  ffdec_bmv_video: FFmpeg Discworld II BMV video decoder
ffmpeg:  ffdec_c93: FFmpeg Interplay C93 decoder
ffmpeg:  ffdec_cavs: FFmpeg Chinese AVS video (AVS1-P2, JiZhun profile) decoder
ffmpeg:  ffdec_cdgraphics: FFmpeg CD Graphics video decoder
ffmpeg:  ffdec_cinepak: FFmpeg Cinepak decoder
ffmpeg:  ffdec_cljr: FFmpeg Cirrus Logic AccuPak decoder
ffmpeg:  ffdec_camstudio: FFmpeg CamStudio decoder
ffmpeg:  ffdec_cyuv: FFmpeg Creative YUV (CYUV) decoder
ffmpeg:  ffdec_dfa: FFmpeg Chronomaster DFA decoder
ffmpeg:  ffdec_dnxhd: FFmpeg VC3/DNxHD decoder
ffmpeg:  ffdec_dpx: FFmpeg DPX image decoder
ffmpeg:  ffdec_dsicinvideo: FFmpeg Delphine Software International CIN video decoder
ffmpeg:  ffdec_dvvideo: FFmpeg DV (Digital Video) decoder
ffmpeg:  ffdec_dxa: FFmpeg Feeble Files/ScummVM DXA decoder
ffmpeg:  ffdec_dxtory: FFmpeg Dxtory decoder
ffmpeg:  ffdec_eacmv: FFmpeg Electronic Arts CMV video decoder
ffmpeg:  ffdec_eamad: FFmpeg Electronic Arts Madcow Video decoder
ffmpeg:  ffdec_eatgq: FFmpeg Electronic Arts TGQ video decoder
ffmpeg:  ffdec_eatgv: FFmpeg Electronic Arts TGV video decoder
ffmpeg:  ffdec_eatqi: FFmpeg Electronic Arts TQI Video decoder
ffmpeg:  ffdec_8bps: FFmpeg QuickTime 8BPS video decoder
ffmpeg:  ffdec_8svx_exp: FFmpeg 8SVX exponential decoder
ffmpeg:  ffdec_8svx_fib: FFmpeg 8SVX fibonacci decoder
ffmpeg:  ffdec_escape124: FFmpeg Escape 124 decoder
ffmpeg:  ffdec_ffv1: FFmpeg FFmpeg video codec #1 decoder
ffmpeg:  ffdec_ffvhuff: FFmpeg Huffyuv FFmpeg variant decoder
ffmpeg:  ffdec_flashsv: FFmpeg Flash Screen Video v1 decoder
ffmpeg:  ffdec_flashsv2: FFmpeg Flash Screen Video v2 decoder
ffmpeg:  ffdec_flic: FFmpeg Autodesk Animator Flic video decoder
ffmpeg:  ffdec_flv: FFmpeg Flash Video (FLV) / Sorenson Spark / Sorenson H.263 decoder
ffmpeg:  ffdec_4xm: FFmpeg 4X Movie decoder
ffmpeg:  ffdec_fraps: FFmpeg Fraps decoder
ffmpeg:  ffdec_FRWU: FFmpeg Forward Uncompressed decoder
ffmpeg:  ffdec_h261: FFmpeg H.261 decoder
ffmpeg:  ffdec_h263: FFmpeg H.263 / H.263-1996, H.263+ / H.263-1998 / H.263 version 2 decoder
ffmpeg:  ffdec_h263i: FFmpeg Intel H.263 decoder
ffmpeg:  ffdec_h264: FFmpeg H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10 decoder
ffmpeg:  ffdec_huffyuv: FFmpeg Huffyuv / HuffYUV decoder
ffmpeg:  ffdec_idcinvideo: FFmpeg id Quake II CIN video decoder
ffmpeg:  ffdec_iff_byterun1: FFmpeg IFF ByteRun1 decoder
ffmpeg:  ffdec_iff_ilbm: FFmpeg IFF ILBM decoder
ffmpeg:  ffdec_indeo2: FFmpeg Intel Indeo 2 decoder
ffmpeg:  ffdec_indeo3: FFmpeg Intel Indeo 3 decoder
ffmpeg:  ffdec_indeo4: FFmpeg Intel Indeo Video Interactive 4 decoder
ffmpeg:  ffdec_indeo5: FFmpeg Intel Indeo Video Interactive 5 decoder
ffmpeg:  ffdec_interplayvideo: FFmpeg Interplay MVE video decoder
ffmpeg:  ffdec_jpegls: FFmpeg JPEG-LS decoder
ffmpeg:  ffdec_jv: FFmpeg Bitmap Brothers JV video decoder
ffmpeg:  ffdec_kgv1: FFmpeg Kega Game Video decoder
ffmpeg:  ffdec_kmvc: FFmpeg Karl Morton's video codec decoder
ffmpeg:  ffdec_lagarith: FFmpeg Lagarith lossless decoder
ffmpeg:  ffdec_loco: FFmpeg LOCO decoder
ffmpeg:  ffdec_mdec: FFmpeg Sony PlayStation MDEC (Motion DECoder) decoder
ffmpeg:  ffdec_mimic: FFmpeg Mimic decoder
ffmpeg:  ffdec_mjpeg: FFmpeg MJPEG (Motion JPEG) decoder
ffmpeg:  ffdec_mjpegb: FFmpeg Apple MJPEG-B decoder
ffmpeg:  ffdec_mmvideo: FFmpeg American Laser Games MM Video decoder
ffmpeg:  ffdec_motionpixels: FFmpeg Motion Pixels video decoder
ffmpeg:  ffdec_mpeg2video: FFmpeg MPEG-2 video decoder
ffmpeg:  ffdec_mpeg4: FFmpeg MPEG-4 part 2 decoder
ffmpeg:  ffdec_msmpeg4v1: FFmpeg MPEG-4 part 2 Microsoft variant version 1 decoder
ffmpeg:  ffdec_msmpeg4v2: FFmpeg MPEG-4 part 2 Microsoft variant version 2 decoder
ffmpeg:  ffdec_msmpeg4: FFmpeg MPEG-4 part 2 Microsoft variant version 3 decoder
ffmpeg:  ffdec_msrle: FFmpeg Microsoft RLE decoder
ffmpeg:  ffdec_msvideo1: FFmpeg Microsoft Video 1 decoder
ffmpeg:  ffdec_mszh: FFmpeg LCL (LossLess Codec Library) MSZH decoder
ffmpeg:  ffdec_mxpeg: FFmpeg Mobotix MxPEG video decoder
ffmpeg:  ffdec_nuv: FFmpeg NuppelVideo/RTJPEG decoder
ffmpeg:  ffdec_pam: FFmpeg PAM (Portable AnyMap) image decoder
ffmpeg:  ffdec_pbm: FFmpeg PBM (Portable BitMap) image decoder
ffmpeg:  ffdec_pcx: FFmpeg PC Paintbrush PCX image decoder
ffmpeg:  ffdec_pgm: FFmpeg PGM (Portable GrayMap) image decoder
ffmpeg:  ffdec_pgmyuv: FFmpeg PGMYUV (Portable GrayMap YUV) image decoder
ffmpeg:  ffdec_pictor: FFmpeg Pictor/PC Paint decoder
ffmpeg:  ffdec_png: FFmpeg PNG image decoder
ffmpeg:  ffdec_ppm: FFmpeg PPM (Portable PixelMap) image decoder
ffmpeg:  ffdec_prores: FFmpeg Apple ProRes (iCodec Pro) decoder
ffmpeg:  ffdec_ptx: FFmpeg V.Flash PTX image decoder
ffmpeg:  ffdec_qdraw: FFmpeg Apple QuickDraw decoder
ffmpeg:  ffdec_qpeg: FFmpeg Q-team QPEG decoder
ffmpeg:  ffdec_qtrle: FFmpeg QuickTime Animation (RLE) video decoder
ffmpeg:  ffdec_r10k: FFmpeg AJA Kona 10-bit RGB Codec decoder
ffmpeg:  ffdec_rl2: FFmpeg RL2 video decoder
ffmpeg:  ffdec_roqvideo: FFmpeg id RoQ video decoder
ffmpeg:  ffdec_rpza: FFmpeg QuickTime video (RPZA) decoder
ffmpeg:  ffdec_rv10: FFmpeg RealVideo 1.0 decoder
ffmpeg:  ffdec_rv20: FFmpeg RealVideo 2.0 decoder
ffmpeg:  ffdec_rv30: FFmpeg RealVideo 3.0 decoder
ffmpeg:  ffdec_rv40: FFmpeg RealVideo 4.0 decoder
ffmpeg:  ffdec_s302m: FFmpeg SMPTE 302M decoder
ffmpeg:  ffdec_sgi: FFmpeg SGI image decoder
ffmpeg:  ffdec_smackvid: FFmpeg Smacker video decoder
ffmpeg:  ffdec_smc: FFmpeg QuickTime Graphics (SMC) decoder
ffmpeg:  ffdec_snow: FFmpeg Snow decoder
ffmpeg:  ffdec_sp5x: FFmpeg Sunplus JPEG (SP5X) decoder
ffmpeg:  ffdec_sunrast: FFmpeg Sun Rasterfile image decoder
ffmpeg:  ffdec_svq1: FFmpeg Sorenson Vector Quantizer 1 / Sorenson Video 1 / SVQ1 decoder
ffmpeg:  ffdec_svq3: FFmpeg Sorenson Vector Quantizer 3 / Sorenson Video 3 / SVQ3 decoder
ffmpeg:  ffdec_targa: FFmpeg Truevision Targa image decoder
ffmpeg:  ffdec_thp: FFmpeg Nintendo Gamecube THP video decoder
ffmpeg:  ffdec_tiertexseqvideo: FFmpeg Tiertex Limited SEQ video decoder
ffmpeg:  ffdec_tiff: FFmpeg TIFF image decoder
ffmpeg:  ffdec_tmv: FFmpeg 8088flex TMV decoder
ffmpeg:  ffdec_truemotion1: FFmpeg Duck TrueMotion 1.0 decoder
ffmpeg:  ffdec_truemotion2: FFmpeg Duck TrueMotion 2.0 decoder
ffmpeg:  ffdec_camtasia: FFmpeg TechSmith Screen Capture Codec decoder
ffmpeg:  ffdec_txd: FFmpeg Renderware TXD (TeXture Dictionary) image decoder
ffmpeg:  ffdec_ultimotion: FFmpeg IBM UltiMotion decoder
ffmpeg:  ffdec_utvideo: FFmpeg Ut Video decoder
ffmpeg:  ffdec_v410: FFmpeg Uncompressed 4:4:4 10-bit decoder
ffmpeg:  ffdec_vb: FFmpeg Beam Software VB decoder
ffmpeg:  ffdec_vble: FFmpeg VBLE Lossless Codec decoder
ffmpeg:  ffdec_vc1: FFmpeg SMPTE VC-1 decoder
ffmpeg:  ffdec_vc1image: FFmpeg Windows Media Video 9 Image v2 decoder
ffmpeg:  ffdec_vcr1: FFmpeg ATI VCR1 decoder
ffmpeg:  ffdec_vmdvideo: FFmpeg Sierra VMD video decoder
ffmpeg:  ffdec_vmnc: FFmpeg VMware Screen Codec / VMware Video decoder
ffmpeg:  ffdec_vp3: FFmpeg On2 VP3 decoder
ffmpeg:  ffdec_vp5: FFmpeg On2 VP5 decoder
ffmpeg:  ffdec_vp6: FFmpeg On2 VP6 decoder
ffmpeg:  ffdec_vp6a: FFmpeg On2 VP6 (Flash version, with alpha channel) decoder
ffmpeg:  ffdec_vp6f: FFmpeg On2 VP6 (Flash version) decoder
ffmpeg:  ffdec_vp8: FFmpeg On2 VP8 decoder
ffmpeg:  ffdec_vqavideo: FFmpeg Westwood Studios VQA (Vector Quantized Animation) video decoder
ffmpeg:  ffdec_wmv1: FFmpeg Windows Media Video 7 decoder
ffmpeg:  ffdec_wmv2: FFmpeg Windows Media Video 8 decoder
ffmpeg:  ffdec_wmv3: FFmpeg Windows Media Video 9 decoder
ffmpeg:  ffdec_wmv3image: FFmpeg Windows Media Video 9 Image decoder
ffmpeg:  ffdec_wnv1: FFmpeg Winnov WNV1 decoder
ffmpeg:  ffdec_xan_wc3: FFmpeg Wing Commander III / Xan decoder
ffmpeg:  ffdec_xan_wc4: FFmpeg Wing Commander IV / Xxan decoder
ffmpeg:  ffdec_xl: FFmpeg Miro VideoXL decoder
ffmpeg:  ffdec_yop: FFmpeg Psygnosis YOP Video decoder
ffmpeg:  ffdec_zlib: FFmpeg LCL (LossLess Codec Library) ZLIB decoder
ffmpeg:  ffdec_zmbv: FFmpeg Zip Motion Blocks Video decoder
ffmpeg:  ffdec_aac: FFmpeg Advanced Audio Coding decoder
ffmpeg:  ffdec_aac_latm: FFmpeg AAC LATM (Advanced Audio Codec LATM syntax) decoder
ffmpeg:  ffdec_ac3: FFmpeg ATSC A/52A (AC-3) decoder
ffmpeg:  ffdec_alac: FFmpeg ALAC (Apple Lossless Audio Codec) decoder
ffmpeg:  ffdec_als: FFmpeg MPEG-4 Audio Lossless Coding (ALS) decoder
ffmpeg:  ffdec_amrnb: FFmpeg Adaptive Multi-Rate NarrowBand decoder
ffmpeg:  ffdec_amrwb: FFmpeg Adaptive Multi-Rate WideBand decoder
ffmpeg:  ffdec_ape: FFmpeg Monkey's Audio decoder
ffmpeg:  ffdec_atrac1: FFmpeg Atrac 1 (Adaptive TRansform Acoustic Coding) decoder
ffmpeg:  ffdec_atrac3: FFmpeg Atrac 3 (Adaptive TRansform Acoustic Coding 3) decoder
ffmpeg:  ffdec_binkaudio_dct: FFmpeg Bink Audio (DCT) decoder
ffmpeg:  ffdec_binkaudio_rdft: FFmpeg Bink Audio (RDFT) decoder
ffmpeg:  ffdec_bmv_audio: FFmpeg Discworld II BMV audio decoder
ffmpeg:  ffdec_cook: FFmpeg COOK decoder
ffmpeg:  ffdec_dca: FFmpeg DCA (DTS Coherent Acoustics) decoder
ffmpeg:  ffdec_dsicinaudio: FFmpeg Delphine Software International CIN audio decoder
ffmpeg:  ffdec_eac3: FFmpeg ATSC A/52B (AC-3, E-AC-3) decoder
ffmpeg:  ffdec_flac: FFmpeg FLAC (Free Lossless Audio Codec) decoder
ffmpeg:  ffdec_gsm: FFmpeg GSM decoder
ffmpeg:  ffdec_gsm_ms: FFmpeg GSM Microsoft variant decoder
ffmpeg:  ffdec_imc: FFmpeg IMC (Intel Music Coder) decoder
ffmpeg:  ffdec_mace3: FFmpeg MACE (Macintosh Audio Compression/Expansion) 3:1 decoder
ffmpeg:  ffdec_mace6: FFmpeg MACE (Macintosh Audio Compression/Expansion) 6:1 decoder
ffmpeg:  ffdec_mlp: FFmpeg MLP (Meridian Lossless Packing) decoder
ffmpeg:  ffdec_mp1float: FFmpeg MP1 (MPEG audio layer 1) decoder
ffmpeg:  ffdec_mp2float: FFmpeg MP2 (MPEG audio layer 2) decoder
ffmpeg:  ffdec_mp3: FFmpeg MP3 (MPEG audio layer 3) decoder
ffmpeg:  ffdec_mp3float: FFmpeg MP3 (MPEG audio layer 3) decoder
ffmpeg:  ffdec_mp3adu: FFmpeg ADU (Application Data Unit) MP3 (MPEG audio layer 3) decoder
ffmpeg:  ffdec_mp3adufloat: FFmpeg ADU (Application Data Unit) MP3 (MPEG audio layer 3) decoder
ffmpeg:  ffdec_mp3on4: FFmpeg MP3onMP4 decoder
ffmpeg:  ffdec_mp3on4float: FFmpeg MP3onMP4 decoder
ffmpeg:  ffdec_mpc7: FFmpeg Musepack SV7 decoder
ffmpeg:  ffdec_mpc8: FFmpeg Musepack SV8 decoder
ffmpeg:  ffdec_nellymoser: FFmpeg Nellymoser Asao decoder
ffmpeg:  ffdec_qcelp: FFmpeg QCELP / PureVoice decoder
ffmpeg:  ffdec_qdm2: FFmpeg QDesign Music Codec 2 decoder
ffmpeg:  ffdec_real_144: FFmpeg RealAudio 1.0 (14.4K) decoder
ffmpeg:  ffdec_real_288: FFmpeg RealAudio 2.0 (28.8K) decoder
ffmpeg:  ffdec_shorten: FFmpeg Shorten decoder
ffmpeg:  ffdec_sipr: FFmpeg RealAudio SIPR / ACELP.NET decoder
ffmpeg:  ffdec_smackaud: FFmpeg Smacker audio decoder
ffmpeg:  ffdec_truehd: FFmpeg TrueHD decoder
ffmpeg:  ffdec_truespeech: FFmpeg DSP Group TrueSpeech decoder
ffmpeg:  ffdec_tta: FFmpeg True Audio (TTA) decoder
ffmpeg:  ffdec_twinvq: FFmpeg VQF TwinVQ decoder
ffmpeg:  ffdec_vmdaudio: FFmpeg Sierra VMD audio decoder
ffmpeg:  ffdec_wmapro: FFmpeg Windows Media Audio 9 Professional decoder
ffmpeg:  ffdec_wmav1: FFmpeg Windows Media Audio 1 decoder
ffmpeg:  ffdec_wmav2: FFmpeg Windows Media Audio 2 decoder
ffmpeg:  ffdec_wmavoice: FFmpeg Windows Media Audio Voice decoder
ffmpeg:  ffdec_ws_snd1: FFmpeg Westwood Audio (SND1) decoder
ffmpeg:  ffdec_pcm_lxf: FFmpeg PCM signed 20-bit little-endian planar decoder
ffmpeg:  ffdec_pcm_s8_planar: FFmpeg PCM signed 8-bit planar decoder
ffmpeg:  ffdec_interplay_dpcm: FFmpeg DPCM Interplay decoder
ffmpeg:  ffdec_roq_dpcm: FFmpeg DPCM id RoQ decoder
ffmpeg:  ffdec_sol_dpcm: FFmpeg DPCM Sol decoder
ffmpeg:  ffdec_xan_dpcm: FFmpeg DPCM Xan decoder
ffmpeg:  ffdec_adpcm_4xm: FFmpeg ADPCM 4X Movie decoder
ffmpeg:  ffdec_adpcm_adx: FFmpeg SEGA CRI ADX ADPCM decoder
ffmpeg:  ffdec_adpcm_ct: FFmpeg ADPCM Creative Technology decoder
ffmpeg:  ffdec_adpcm_ea: FFmpeg ADPCM Electronic Arts decoder
ffmpeg:  ffdec_adpcm_ea_maxis_xa: FFmpeg ADPCM Electronic Arts Maxis CDROM XA decoder
ffmpeg:  ffdec_adpcm_ea_r1: FFmpeg ADPCM Electronic Arts R1 decoder
ffmpeg:  ffdec_adpcm_ea_r2: FFmpeg ADPCM Electronic Arts R2 decoder
ffmpeg:  ffdec_adpcm_ea_r3: FFmpeg ADPCM Electronic Arts R3 decoder
ffmpeg:  ffdec_adpcm_ea_xas: FFmpeg ADPCM Electronic Arts XAS decoder
ffmpeg:  ffdec_g722: FFmpeg G.722 ADPCM decoder
ffmpeg:  ffdec_g726: FFmpeg G.726 ADPCM decoder
ffmpeg:  ffdec_adpcm_ima_amv: FFmpeg ADPCM IMA AMV decoder
ffmpeg:  ffdec_adpcm_ima_dk3: FFmpeg ADPCM IMA Duck DK3 decoder
ffmpeg:  ffdec_adpcm_ima_dk4: FFmpeg ADPCM IMA Duck DK4 decoder
ffmpeg:  ffdec_adpcm_ima_ea_eacs: FFmpeg ADPCM IMA Electronic Arts EACS decoder
ffmpeg:  ffdec_adpcm_ima_ea_sead: FFmpeg ADPCM IMA Electronic Arts SEAD decoder
ffmpeg:  ffdec_adpcm_ima_iss: FFmpeg ADPCM IMA Funcom ISS decoder
ffmpeg:  ffdec_adpcm_ima_qt: FFmpeg ADPCM IMA QuickTime decoder
ffmpeg:  ffdec_adpcm_ima_smjpeg: FFmpeg ADPCM IMA Loki SDL MJPEG decoder
ffmpeg:  ffdec_adpcm_ima_wav: FFmpeg ADPCM IMA WAV decoder
ffmpeg:  ffdec_adpcm_ima_ws: FFmpeg ADPCM IMA Westwood decoder
ffmpeg:  ffdec_adpcm_ms: FFmpeg ADPCM Microsoft decoder
ffmpeg:  ffdec_adpcm_sbpro_2: FFmpeg ADPCM Sound Blaster Pro 2-bit decoder
ffmpeg:  ffdec_adpcm_sbpro_3: FFmpeg ADPCM Sound Blaster Pro 2.6-bit decoder
ffmpeg:  ffdec_adpcm_sbpro_4: FFmpeg ADPCM Sound Blaster Pro 4-bit decoder
ffmpeg:  ffdec_adpcm_swf: FFmpeg ADPCM Shockwave Flash decoder
ffmpeg:  ffdec_adpcm_thp: FFmpeg ADPCM Nintendo Gamecube THP decoder
ffmpeg:  ffdec_adpcm_xa: FFmpeg ADPCM CDROM XA decoder
ffmpeg:  ffdec_adpcm_yamaha: FFmpeg ADPCM Yamaha decoder
ffmpeg:  ffdec_xsub: FFmpeg XSUB decoder
ffmpeg:  ffdemux_aiff: FFmpeg Audio IFF demuxer
ffmpeg:  ffdemux_ape: FFmpeg Monkey's Audio demuxer
ffmpeg:  ffdemux_avs: FFmpeg AVS format demuxer
ffmpeg: fftype_avs: no extensions
ffmpeg:  ffdemux_daud: FFmpeg D-Cinema audio format demuxer
ffmpeg: fftype_daud: 302
ffmpeg:  ffdemux_ea: FFmpeg Electronic Arts Multimedia Format demuxer
ffmpeg: fftype_ea: no extensions
ffmpeg:  ffdemux_ffm: FFmpeg FFM (AVserver live feed) format demuxer
ffmpeg: fftype_ffm: no extensions
ffmpeg:  ffdemux_4xm: FFmpeg 4X Technologies format demuxer
ffmpeg: fftype_4xm: no extensions
ffmpeg:  ffdemux_gxf: FFmpeg GXF format demuxer
ffmpeg: fftype_gxf: no extensions
ffmpeg:  ffdemux_idcin: FFmpeg id Cinematic format demuxer
ffmpeg: fftype_idcin: no extensions
ffmpeg:  ffdemux_ipmovie: FFmpeg Interplay MVE format demuxer
ffmpeg: fftype_ipmovie: no extensions
ffmpeg:  ffdemux_mm: FFmpeg American Laser Games MM format demuxer
ffmpeg: fftype_mm: no extensions
ffmpeg:  ffdemux_mmf: FFmpeg Yamaha SMAF demuxer
ffmpeg: fftype_mmf: no extensions
ffmpeg:  ffdemux_mpc: FFmpeg Musepack demuxer
ffmpeg:  ffdemux_mxf: FFmpeg Material eXchange Format demuxer
ffmpeg:  ffdemux_nsv: FFmpeg Nullsoft Streaming Video demuxer
ffmpeg: fftype_nsv: no extensions
ffmpeg:  ffdemux_nut: FFmpeg NUT format demuxer
ffmpeg: fftype_nut: nut
ffmpeg:  ffdemux_nuv: FFmpeg NuppelVideo format demuxer
ffmpeg:  ffdemux_RoQ: FFmpeg id RoQ format demuxer
ffmpeg: fftype_RoQ: no extensions
ffmpeg:  ffdemux_film_cpk: FFmpeg Sega FILM/CPK format demuxer
ffmpeg: fftype_film_cpk: no extensions
ffmpeg:  ffdemux_smk: FFmpeg Smacker video demuxer
ffmpeg: fftype_smk: no extensions
ffmpeg:  ffdemux_sol: FFmpeg Sierra SOL format demuxer
ffmpeg: fftype_sol: no extensions
ffmpeg:  ffdemux_psxstr: FFmpeg Sony Playstation STR format demuxer
ffmpeg: fftype_psxstr: no extensions
ffmpeg:  ffdemux_swf: FFmpeg Flash format demuxer
ffmpeg:  ffdemux_tta: FFmpeg True Audio demuxer
ffmpeg:  ffdemux_vmd: FFmpeg Sierra VMD format demuxer
ffmpeg: fftype_vmd: no extensions
ffmpeg:  ffdemux_voc: FFmpeg Creative Voice file format demuxer
ffmpeg:  ffdemux_wc3movie: FFmpeg Wing Commander III movie format demuxer
ffmpeg: fftype_wc3movie: no extensions
ffmpeg:  ffdemux_wsaud: FFmpeg Westwood Studios audio format demuxer
ffmpeg: fftype_wsaud: no extensions
ffmpeg:  ffdemux_wsvqa: FFmpeg Westwood Studios VQA format demuxer
ffmpeg: fftype_wsvqa: no extensions
ffmpeg:  ffdemux_yuv4mpegpipe: FFmpeg YUV4MPEG pipe format demuxer
ffmpeg: fftype_yuv4mpegpipe: y4m
ffmpeg:  ffmux_a64: FFmpeg a64 - video for Commodore 64 muxer
ffmpeg:  ffmux_adts: FFmpeg ADTS AAC muxer (not recommended, use aacparse instead)
ffmpeg:  ffmux_adx: FFmpeg CRI ADX muxer
ffmpeg:  ffmux_aiff: FFmpeg Audio IFF muxer (not recommended, use aiffmux instead)
ffmpeg:  ffmux_amr: FFmpeg 3GPP AMR file format muxer
ffmpeg:  ffmux_asf: FFmpeg ASF format muxer (not recommended, use asfmux instead)
ffmpeg:  ffmux_asf_stream: FFmpeg ASF format muxer (not recommended, use asfmux instead)
ffmpeg:  ffmux_au: FFmpeg SUN AU format muxer
ffmpeg:  ffmux_avi: FFmpeg AVI format muxer (not recommended, use avimux instead)
ffmpeg:  ffmux_avm2: FFmpeg Flash 9 (AVM2) format muxer
ffmpeg:  ffmux_daud: FFmpeg D-Cinema audio format muxer
ffmpeg:  ffmux_dv: FFmpeg DV video format muxer
ffmpeg:  ffmux_ffm: FFmpeg FFM (AVserver live feed) format muxer
ffmpeg:  ffmux_filmstrip: FFmpeg Adobe Filmstrip muxer
ffmpeg:  ffmux_flv: FFmpeg FLV format muxer (not recommended, use flvmux instead)
ffmpeg:  ffmux_gxf: FFmpeg GXF format muxer
ffmpeg:  ffmux_ipod: FFmpeg iPod H.264 MP4 format muxer
ffmpeg:  ffmux_ivf: FFmpeg On2 IVF muxer
ffmpeg:  ffmux_latm: FFmpeg LOAS/LATM muxer
ffmpeg:  ffmux_md5: FFmpeg MD5 testing format muxer
ffmpeg:  ffmux_matroska: FFmpeg Matroska file format muxer (not recommended, use matroskamux instead)
ffmpeg:  ffmux_mmf: FFmpeg Yamaha SMAF muxer
ffmpeg:  ffmux_mov: FFmpeg MOV format muxer (not recommended, use qtmux instead)
ffmpeg:  ffmux_mp2: FFmpeg MPEG audio layer 2 formatter (not recommended, use id3v2mux instead)
ffmpeg:  ffmux_mp3: FFmpeg MPEG audio layer 3 formatter (not recommended, use id3v2mux instead)
ffmpeg:  ffmux_mp4: FFmpeg MP4 format muxer (not recommended, use mp4mux instead)
ffmpeg:  ffmux_mpeg: FFmpeg MPEG-1 System format muxer
ffmpeg:  ffmux_vcd: FFmpeg MPEG-1 System format (VCD) muxer
ffmpeg:  ffmux_dvd: FFmpeg MPEG-2 PS format (DVD VOB) muxer
ffmpeg:  ffmux_svcd: FFmpeg MPEG-2 PS format (VOB) muxer
ffmpeg:  ffmux_vob: FFmpeg MPEG-2 PS format (VOB) muxer
ffmpeg:  ffmux_mpegts: FFmpeg MPEG-2 transport stream format muxer (not recommended, use mpegtsmux instead)
ffmpeg:  ffmux_mpjpeg: FFmpeg MIME multipart JPEG format muxer (not recommended, use multipartmux instead)
ffmpeg:  ffmux_mxf: FFmpeg Material eXchange Format muxer (not recommended, use mxfmux instead)
ffmpeg:  ffmux_mxf_d10: FFmpeg Material eXchange Format, D-10 Mapping muxer
ffmpeg:  ffmux_nut: FFmpeg NUT format muxer
ffmpeg:  ffmux_ogg: FFmpeg Ogg muxer (not recommended, use oggmux instead)
ffmpeg:  ffmux_oma: FFmpeg Sony OpenMG audio muxer
ffmpeg:  ffmux_psp: FFmpeg PSP MP4 format muxer
ffmpeg:  ffmux_rm: FFmpeg RealMedia format muxer
ffmpeg:  ffmux_rso: FFmpeg Lego Mindstorms RSO format muxer
ffmpeg:  ffmux_rtsp: FFmpeg RTSP output format muxer
ffmpeg:  ffmux_sap: FFmpeg SAP output format muxer
ffmpeg:  ffmux_segment: FFmpeg segment muxer muxer
ffmpeg:  ffmux_smjpeg: FFmpeg Loki SDL MJPEG muxer
ffmpeg:  ffmux_sox: FFmpeg SoX native format muxer
ffmpeg:  ffmux_spdif: FFmpeg IEC 61937 (used on S/PDIF - IEC958) muxer
ffmpeg:  ffmux_swf: FFmpeg Flash format muxer
ffmpeg:  ffmux_3g2: FFmpeg 3GP2 format muxer
ffmpeg:  ffmux_3gp: FFmpeg 3GP format muxer (not recommended, use gppmux instead)
ffmpeg:  ffmux_rcv: FFmpeg VC-1 test bitstream muxer
ffmpeg:  ffmux_voc: FFmpeg Creative Voice file format muxer
ffmpeg:  ffmux_wav: FFmpeg WAV format muxer (not recommended, use wavenc instead)
ffmpeg:  ffmux_webm: FFmpeg WebM file format muxer (not recommended, use webmmux instead)
ffmpeg:  ffmux_yuv4mpegpipe: FFmpeg YUV4MPEG pipe format muxer (not recommended, use y4menc instead)
ffmpeg:  ffdeinterlace: FFMPEG Deinterlace element
ffmpeg:  ffaudioresample: FFMPEG Audio resampling element
videoparsersbad:  h263parse: H.263 parser
videoparsersbad:  h264parse: H.264 parser
videoparsersbad:  diracparse: Dirac parser
videoparsersbad:  mpegvideoparse: MPEG video elementary stream parser
videoparsersbad:  mpeg4videoparse: MPEG 4 video elementary stream parser
alphacolor:  alphacolor: Alpha color filter
replaygain:  rganalysis: ReplayGain analysis
replaygain:  rglimiter: ReplayGain limiter
replaygain:  rgvolume: ReplayGain volume
rtpvp8:  rtpvp8depay: RTP VP8 depayloader
rtpvp8:  rtpvp8pay: RTP VP8 payloader
flxdec:  flxdec: FLX video decoder
matroska:  matroskademux: Matroska demuxer
matroska:  matroskaparse: Matroska parser
matroska:  matroskamux: Matroska muxer
matroska:  webmmux: WebM muxer
fsrawconference:  fsrawconference: Generic bin
rtp:  rtpdepay: Dummy RTP session manager
rtp:  rtpac3depay: RTP AC3 depayloader
rtp:  rtpac3pay: RTP AC3 audio payloader
rtp:  rtpbvdepay: RTP BroadcomVoice depayloader
rtp:  rtpbvpay: RTP BV Payloader
rtp:  rtpceltdepay: RTP CELT depayloader
rtp:  rtpceltpay: RTP CELT payloader
rtp:  rtpdvdepay: RTP DV Depayloader
rtp:  rtpdvpay: RTP DV Payloader
rtp:  rtpgstdepay: GStreamer depayloader
rtp:  rtpgstpay: RTP GStreamer payloader
rtp:  rtpilbcpay: RTP iLBC Payloader
rtp:  rtpilbcdepay: RTP iLBC depayloader
rtp:  rtpg722depay: RTP audio depayloader
rtp:  rtpg722pay: RTP audio payloader
rtp:  rtpg723depay: RTP G.723 depayloader
rtp:  rtpg723pay: RTP G.723 payloader
rtp:  rtpg726depay: RTP G.726 depayloader
rtp:  rtpg726pay: RTP G.726 payloader
rtp:  rtpg729depay: RTP G.729 depayloader
rtp:  rtpg729pay: RTP G.729 payloader
rtp:  rtpgsmdepay: RTP GSM depayloader
rtp:  rtpgsmpay: RTP GSM payloader
rtp:  rtpamrdepay: RTP AMR depayloader
rtp:  rtpamrpay: RTP AMR payloader
rtp:  rtppcmadepay: RTP PCMA depayloader
rtp:  rtppcmudepay: RTP PCMU depayloader
rtp:  rtppcmupay: RTP PCMU payloader
rtp:  rtppcmapay: RTP PCMA payloader
rtp:  rtpmpadepay: RTP MPEG audio depayloader
rtp:  rtpmpapay: RTP MPEG audio payloader
rtp:  rtpmparobustdepay: RTP MPEG audio depayloader
rtp:  rtpmpvdepay: RTP MPEG video depayloader
rtp:  rtpmpvpay: RTP MPEG2 ES video payloader
rtp:  rtph263ppay: RTP H263 payloader
rtp:  rtph263pdepay: RTP H263 depayloader
rtp:  rtph263depay: RTP H263 depayloader
rtp:  rtph263pay: RTP H263 packet payloader
rtp:  rtph264depay: RTP H264 depayloader
rtp:  rtph264pay: RTP H264 payloader
rtp:  rtpj2kdepay: RTP JPEG 2000 depayloader
rtp:  rtpj2kpay: RTP JPEG 2000 payloader
rtp:  rtpjpegdepay: RTP JPEG depayloader
rtp:  rtpjpegpay: RTP JPEG payloader
rtp:  rtpL16pay: RTP audio payloader
rtp:  rtpL16depay: RTP audio depayloader
rtp:  asteriskh263: RTP Asterisk H263 depayloader
rtp:  rtpmp1sdepay: RTP MPEG1 System Stream depayloader
rtp:  rtpmp2tdepay: RTP MPEG Transport Stream depayloader
rtp:  rtpmp2tpay: RTP MPEG2 Transport Stream payloader
rtp:  rtpmp4vpay: RTP MPEG4 Video payloader
rtp:  rtpmp4vdepay: RTP MPEG4 video depayloader
rtp:  rtpmp4apay: RTP MPEG4 audio payloader
rtp:  rtpmp4adepay: RTP MPEG4 audio depayloader
rtp:  rtpmp4gdepay: RTP MPEG4 ES depayloader
rtp:  rtpmp4gpay: RTP MPEG4 ES payloader
rtp:  rtpqcelpdepay: RTP QCELP depayloader
rtp:  rtpqdm2depay: RTP QDM2 depayloader
rtp:  rtpsirenpay: RTP Payloader for Siren Audio
rtp:  rtpsirendepay: RTP Siren packet depayloader
rtp:  rtpspeexpay: RTP Speex payloader
rtp:  rtpspeexdepay: RTP Speex depayloader
rtp:  rtpsv3vdepay: RTP SVQ3 depayloader
rtp:  rtptheoradepay: RTP Theora depayloader
rtp:  rtptheorapay: RTP Theora payloader
rtp:  rtpvorbisdepay: RTP Vorbis depayloader
rtp:  rtpvorbispay: RTP Vorbis depayloader
rtp:  rtpvrawdepay: RTP Raw Video depayloader
rtp:  rtpvrawpay: RTP Raw Video payloader
goom2k1:  goom2k1: GOOM: what a GOOM! 2k1 edition
removesilence:  removesilence: RemoveSilence
y4mdec:  y4mdec: YUV4MPEG demuxer/decoder
stereo:  stereo: Stereo effect
monoscope:  monoscope: Monoscope
dtmf:  dtmfdetect: DTMF detector element
dtmf:  dtmfsrc: DTMF tone generator
dtmf:  rtpdtmfsrc: RTP DTMF packet generator
dtmf:  rtpdtmfdepay: RTP DTMF packet depayloader
flite:  flitetestsrc: Flite speech test source
rtmp:  rtmpsrc: RTMP Source
rtmp:  rtmpsink: RTMP output sink
jp2k:  jp2kdec: Jasper JPEG2000 image decoder
jp2k:  jp2kenc: Jasper JPEG2000 image encoder
mve:  mvedemux: MVE Demuxer
mve:  mvemux: MVE Multiplexer
jack:  jackaudiosrc: Audio Source (Jack)
jack:  jackaudiosink: Audio Sink (Jack)
vp8:  vp8dec: On2 VP8 Decoder
vp8:  vp8enc: On2 VP8 Encoder
gsm:  gsmenc: GSM audio encoder
gsm:  gsmdec: GSM audio decoder
audiotestsrc:  audiotestsrc: Audio test source
dvb:  dvbsrc: DVB Source
dvb:  dvbbasebin: DVB bin
real:  realvideodec: RealVideo decoder
real:  realaudiodec: RealAudio decoder
cog:  cogdownsample: Scale down video by factor of 2
cog:  cogcolorspace: YCbCr/RGB format conversion
cog:  cogscale: Video scaler
cog:  cogcolorconvert: Convert colorspace
cog:  coglogoinsert: Overlay image onto video
cog:  cogmse: Calculate MSE
encoding:  encodebin: Encoder Bin
audioparsers:  aacparse: AAC audio stream parser
audioparsers:  amrparse: AMR audio stream parser
audioparsers:  ac3parse: AC3 audio stream parser
audioparsers:  dcaparse: DTS Coherent Acoustics audio stream parser
audioparsers:  flacparse: FLAC audio parser
audioparsers:  mpegaudioparse: MPEG1 Audio Parser
voamrwbenc:  voamrwbenc: AMR-WB audio encoder
gdkpixbuf:  gdkpixbufdec: GdkPixbuf image decoder
gdkpixbuf:  gdkpixbufsink: GdkPixbuf sink
gdkpixbuf:  gdkpixbufscale: GdkPixbuf image scaler
bayer:  bayer2rgb: Bayer to RGB decoder for cameras
bayer:  rgb2bayer: RGB to Bayer converter
linsys:  linsyssdisrc: SDI video source
linsys:  linsyssdisink: SDI video sink
aasink:  aasink: ASCII art video sink
musepack:  musepackdec: Musepack decoder
ofa:  ofa: OFA
faad:  faad: AAC audio decoder
scaletempo:  scaletempo: Scaletempo
curl:  curlsink: Curl sink
h264parse:  legacyh264parse: H264Parse
audiovisualizers:  spacescope: Stereo visualizer
audiovisualizers:  spectrascope: Frequency spectrum scope
audiovisualizers:  synaescope: Synaescope
audiovisualizers:  wavescope: Waveform oscilloscope
teletext:  teletextdec: Teletext decoder
audiorate:  audiorate: Audio rate adjuster
gmedec:  gmedec: Gaming console music file decoder
videorate:  videorate: Video rate adjuster
gaudieffects:  burn: Burn
gaudieffects:  chromium: Chromium
gaudieffects:  dilate: Dilate
gaudieffects:  dodge: Dodge
gaudieffects:  exclusion: Exclusion
gaudieffects:  solarize: Solarize
gaudieffects:  gaussianblur: GaussBlur
rsvg:  rsvgoverlay: RSVG overlay
rsvg:  rsvgdec: SVG image decoder
fbdevsink:  fbdevsink: fbdev video sink
fsfunnel:  fsfunnel: Farstream Funnel pipe fitting
videobox:  videobox: Video box filter
rfbsrc:  rfbsrc: Rfb source
videotestsrc:  videotestsrc: Video test source
staticelements:  bin: Generic bin
staticelements:  pipeline: Pipeline object

Total count: 231 plugins (1 blacklist entry not shown), 1116 features

 

 

An introduction to Ogg Vorbis, plus the spec.

http://xiph.org/vorbis/doc/Vorbis_I_spec.html

 

Ogg Vorbis

Contents
 
 

1 OGG Vorbis

The full name of Ogg is OGG Vorbis. It is a new audio compression format, similar to existing music formats such as MP3, with one key difference: it is completely free, open, and unrestricted by patents. One outstanding feature of OGG Vorbis is its support for multiple channels; as it becomes popular, listening to DTS-encoded multichannel works on a portable player will no longer be a dream.
Vorbis is the name of the audio compression scheme, while Ogg is the name of a project that aims to design a completely open multimedia system. So far the project has realized only the Ogg Vorbis part.
Ogg Vorbis files use the .OGG extension. The format's design is quite forward-looking: OGG files created now will play on any future player, so the format can keep improving in file size and sound quality without affecting older encoders or players.

2 Q&A

■ Why use the Ogg Vorbis format?
As everyone knows, MP3 is a lossy format, so the compressed data loses something compared with the original CD audio. Vorbis is also lossy, but it uses a more advanced acoustic model to reduce that loss, so an OGG encoded at the same bit rate sounds better than the corresponding MP3. There is another reason: the MP3 format is patented. If you want to distribute your own work in MP3 format, you must pay patent royalties to Fraunhofer (the company that invented MP3); Vorbis has no such problem. For music fans, the clear benefit of OGG files is superior sound quality from smaller files. And because OGG is completely open and free, creating OGG files is not restricted by any patent, so we can expect a large number of encoders and players. This is also why MP3 encoders are now so few and mostly commercial: Fraunhofer collects royalties.
Having said all that, sound quality is what everyone really cares about.
■ Can Ogg Vorbis's sound quality really match MP3?
Because Vorbis is built on completely different mathematics than MP3, it faces different challenges when compressing music. In current listening tests, Vorbis and MP3 files encoded at the same bit rate have comparable sound quality.
■ Will Ogg Vorbis's sound quality keep improving?
Yes. Because Vorbis uses a flexible format, the sound quality can still be tuned noticeably, and new algorithms trained, even after the file format has been frozen, so it will only get better.
Since Ogg is still in beta, the current encoder has some unresolved problems. The Ogg developers have promised to address them in the next beta release. The key point is that these problems come from the software implementation, not from the algorithms the Ogg format itself uses.
■ How do Ogg file sizes compare with MP3?
If both files are encoded at the same bit rate in CBR (constant bit rate, meaning one bit rate from start to finish), they will of course be the same size. Vorbis currently encodes in VBR (variable bit rate), which lets Ogg files be smaller, because VBR saves space on audio that compresses well (such as silent passages).
■ At what bit rates can Vorbis encode?
In theory there is no fixed bit rate. Vorbis is designed to encode at 16 kbps to 128 kbps per channel, but the specification does not stop you from encoding a file at 512 kbps or at 8 kbps.
■ Does Ogg Vorbis support ID3-style information like MP3?
Yes. The Vorbis format includes a flexible and complete comment field that can hold all kinds of related information.
■ How fast are the encoder and decoder?
The current encoder/decoder is already close in speed to some commercial-grade codecs. Since the code has not yet been optimized, it should at least match MP3 once everything is finished.
■ Does the Ogg Vorbis format support streaming?
Audio streaming is an important part of Vorbis; the format was designed from the very beginning to be easy to stream. Moreover, the Vorbis designers are working with the creators of the Icecast streaming software to make Icecast compatible with Vorbis. By the time the official release comes out, there will be all kinds of software and plugins that support streamed OGG playback.
■ What software and hardware currently support Ogg Vorbis?
So far Sonique, FreeAmp, Winamp, XMMS and kmpg all play Ogg Vorbis files via plugins, and given those players' influence more software will follow. In addition, the two most popular MP3 encoders, LAME and BladeEnc, have announced that they will support encoding Ogg Vorbis files.
■ What features are unique to Ogg Vorbis?
Vorbis has a well-designed, flexible comment system that avoids the clumsy handling of MP3's ID3 tags. Vorbis also supports bit-rate scaling: a file's bit rate can be adjusted without re-encoding. Vorbis files can be split into pieces and edited at sample granularity; Vorbis supports multiple channels; and Vorbis files can be logically concatenated.
■ Where can I find the software, development information and other resources?
ogg-vorbis[1], the official Ogg Vorbis site.
■ How do I create OGG music files?
OGG music files are not yet widespread, so we usually convert from CD audio or from files downloaded in other formats. First, some background on Ogg encoding: the main bit-rate options are ABR, VBR and Quality. Ogg's bit rate is in fact always variable; the simple-to-configure Quality mode is recommended and meets most people's needs.
The list below shows the Quality levels and their corresponding bit rates.
Next, the common ways to convert to OGG:
There are many ways to convert to OGG, but the simplest is Foobar2000 plus the external OGGENC encoder. Foobar2000, now very popular, is an advanced audio player for Windows with excellent sound quality and powerful features. It supports WAV, AIFF, VOC, AU, SND, Ogg Vorbis, MPC, MP2, MP3 and other formats, and via plugins also MPEG-4, AAC, FLAC, Monkey's Audio, WavPack, Speex, CDDA, SPC and various MOD types, which should be more than enough. Foobar2000's conversion feature makes it easy to convert other audio formats to OGG.
Foobar2000 needs an external OGG encoder (OGGENC) to produce OGG. Many builds of OGGENC are available, with slight differences in sound quality.
One encoder, built on aoTuV beta 4.51 (currently the best-sounding), uses SSE to greatly speed up encoding and is very fast:
oggenc[2]
This is a command-line encoder that can easily be invoked from foobar2000, EAC, 千千静听 (TTPlayer) and similar software.
In foobar2000, for example, select the songs to convert, right-click, choose "Convert", then "Convert to same directory" so you won't have to hunt for the output files. Foobar pops up the converter settings; under encoding presets choose "Ogg Vorbis". The default is Q5. Click "..." and, in the command-line encoder settings, drag the quality slider all the way to the right for the highest quality, Q10. Click OK to start converting. If the OGG encoder's location has not been set, Foobar will prompt you to locate it.
In 千千静听, add the songs, right-click one and choose convert format, pick "command-line encoder 1.0" as the output format, click "Configure", create a new encoder profile with any name, set the encoder program to the ogg encoder you downloaded, set the extension to ogg and the command parameters to: -q10 - -o "%d", leave the rest at defaults, confirm, click "Convert now", and wait. For a quality other than Q10, change the "10" to the corresponding number. Easy CD-DA Extractor 9 is even more convenient for conversion, though selecting a third-party encoder in it is a bit fiddly.
3. Choosing a suitable OGG encoding quality
The latest generation of MP3 players has added OGG support, for example Meizu's MiniPlayer (OGG, Q-1 to Q10). If you bought a small-capacity model, Q2 or Q4 is recommended. Q2 OGG files beat 128 kbps MP3 across the board in sound quality while the files are more than a quarter smaller, an excellent balance of quality and size that suits the vast majority of ordinary users. In fact even Q0 sounds better than 64 kbps WMA and comes very close to 128 kbps MP3, whereas 64 kbps WMA simply cannot reach 128 kbps MP3 quality. Users who are not picky about sound quality will be perfectly fine with Q0.
Listeners with higher demands can choose Q4: a Q4 OGG file is about the size of a 128 kbps MP3 yet sounds better than a LAME-encoded 192 kbps VBR MP3, which is very attractive to owners of OGG-capable digital players, since the same capacity holds more high-quality songs. For audiophiles with larger-capacity players who want their digital player to match a CD walkman, Q10 OGG is a godsend: its spectrum is essentially identical to the WAV file at only a third of the size, so a four-minute song is roughly 10 MB.
Before digital players supporting OGG at Q10 appeared, many quality-conscious listeners were unmoved by MP3; even ordinary listeners can hear the difference between a lossless CD track and the highest-quality 320 kbps MP3. New high-fidelity players such as the Meizu MiniPlayer, paired with OGG at Q10, have the potential to fully replace CD walkmans. Q5 and Q6 also sound excellent, exceeding 256 kbps MP3, a good compromise between quality and size; choose according to your own needs.

3 Summary

As the new OGG format catches on, we can enjoy higher-quality music at lower bit rates and smaller file sizes, and digital portable players finally have what it takes to truly replace the CD walkman. It is well worth looking forward to. If you are tempted, try it now: convert your favorite CDs to the compact, convenient OGG format and build yourself a hi-fi portable music world.

 

 

Contents

1 Introduction and Description
1.1 Overview
1.1.1 Application
1.1.2 Classification
1.1.3 Assumptions
1.1.4 Codec Setup and Probability Model
1.1.5 Format Specification
1.1.6 Hardware Profile
1.2 Decoder Configuration
1.2.1 Global Config
1.2.2 Mode
1.2.3 Mapping
1.2.4 Floor
1.2.5 Residue
1.2.6 Codebooks
1.3 High-level Decode Process
1.3.1 Decode Setup
1.3.2 Decode Procedure
2 Bitpacking Convention
2.1 Overview
2.1.1 octets, bytes and words
2.1.2 bit order
2.1.3 byte order
2.1.4 coding bits into byte sequences
2.1.5 signedness
2.1.6 coding example
2.1.7 decoding example
2.1.8 end-of-packet alignment
2.1.9 reading zero bits
3 Probability Model and Codebooks
3.1 Overview
3.1.1 Bitwise operation
3.2 Packed codebook format
3.2.1 codebook decode
3.3 Use of the codebook abstraction
4 Codec Setup and Packet Decode
4.1 Overview
4.2 Header decode and decode setup
4.2.1 Common header decode
4.2.2 Identification header
4.2.3 Comment header
4.2.4 Setup header
4.3 Audio packet decode and synthesis
4.3.1 packet type, mode and window decode
4.3.2 floor curve decode
4.3.3 nonzero vector propagate
4.3.4 residue decode
4.3.5 inverse coupling
4.3.6 dot product
4.3.7 inverse MDCT
4.3.8 overlap_add
4.3.9 output channel order
5 comment field and header specification
5.1 Overview
5.2 Comment encoding
5.2.1 Structure
5.2.2 Content vector format
5.2.3 Encoding
6 Floor type 0 setup and decode
6.1 Overview
6.2 Floor 0 format
6.2.1 header decode
6.2.2 packet decode
6.2.3 curve computation
7 Floor type 1 setup and decode
7.1 Overview
7.2 Floor 1 format
7.2.1 model
7.2.2 header decode
7.2.3 packet decode
7.2.4 curve computation
8 Residue setup and decode
8.1 Overview
8.2 Residue format
8.3 residue 0
8.4 residue 1
8.5 residue 2
8.6 Residue decode
8.6.1 header decode
8.6.2 packet decode
8.6.3 format 0 specifics
8.6.4 format 1 specifics
8.6.5 format 2 specifics
9 Helper equations
9.1 Overview
9.2 Functions
9.2.1 ilog
9.2.2 float32_unpack
9.2.3 lookup1_values
9.2.4 low_neighbor
9.2.5 high_neighbor
9.2.6 render_point
9.2.7 render_line
10 Tables
10.1 floor1_inverse_dB_table
A Embedding Vorbis into an Ogg stream
A.1 Overview
A.1.1 Restrictions
A.1.2 MIME type
A.2 Encapsulation
B Vorbis encapsulation in RTP

1. Introduction and Description

 

1.1. Overview

This document provides a high level description of the Vorbis codec’s construction. A bit-by-bit specification appears beginning in Section 4, “Codec Setup and Packet Decode”. The later sections assume a high-level understanding of the Vorbis decode process, which is provided here.

 

1.1.1. Application

Vorbis is a general purpose perceptual audio CODEC intended to allow maximum encoder flexibility, thus allowing it to scale competitively over an exceptionally wide range of bitrates. At the high quality/bitrate end of the scale (CD or DAT rate stereo, 16/24 bits) it is in the same league as MPEG-2 and MPC. Similarly, the 1.0 encoder can encode high-quality CD and DAT rate stereo at below 48kbps without resampling to a lower rate. Vorbis is also intended for lower and higher sample rates (from 8kHz telephony to 192kHz digital masters) and a range of channel representations (monaural, polyphonic, stereo, quadraphonic, 5.1, ambisonic, or up to 255 discrete channels).

 

1.1.2. Classification

Vorbis I is a forward-adaptive monolithic transform CODEC based on the Modified Discrete Cosine Transform. The codec is structured to allow addition of a hybrid wavelet filterbank in Vorbis II to offer better transient response and reproduction using a transform better suited to localized time events.

 

1.1.3. Assumptions

The Vorbis CODEC design assumes a complex, psychoacoustically-aware encoder and simple, low-complexity decoder. Vorbis decode is computationally simpler than mp3, although it does require more working memory as Vorbis has no static probability model; the vector codebooks used in the first stage of decoding from the bitstream are packed in their entirety into the Vorbis bitstream headers. In packed form, these codebooks occupy only a few kilobytes; the extent to which they are pre-decoded into a cache is the dominant factor in decoder memory usage.

Vorbis provides none of its own framing, synchronization or protection against errors; it is solely a method of accepting input audio, dividing it into individual frames and compressing these frames into raw, unformatted ’packets’. The decoder then accepts these raw packets in sequence, decodes them, synthesizes audio frames from them, and reassembles the frames into a facsimile of the original audio stream. Vorbis is a free-form variable bit rate (VBR) codec and packets have no minimum size, maximum size, or fixed/expected size. Packets are designed so that they may be truncated (or padded) and remain decodable; this is not to be considered an error condition and is used extensively in bitrate management in peeling. Both the transport mechanism and decoder must allow that a packet may be any size, or end before or after packet decode expects.

Vorbis packets are thus intended to be used with a transport mechanism that provides free-form framing, sync, positioning and error correction in accordance with these design assumptions, such as Ogg (for file transport) or RTP (for network multicast). For purposes of a few examples in this document, we will assume that Vorbis is to be embedded in an Ogg stream specifically, although this is by no means a requirement or fundamental assumption in the Vorbis design.

The specification for embedding Vorbis into an Ogg transport stream is in Section A,“Embedding Vorbis into an Ogg stream”.

 

1.1.4. Codec Setup and Probability Model

Vorbis’ heritage is as a research CODEC and its current design reflects a desire to allow multiple decades of continuous encoder improvement before running out of room within the codec specification. For these reasons, configurable aspects of codec setup intentionally lean toward the extreme of forward adaptive.

The single most controversial design decision in Vorbis (and the most unusual for a Vorbis developer to keep in mind) is that the entire probability model of the codec, the Huffman and VQ codebooks, is packed into the bitstream header along with extensive CODEC setup parameters (often several hundred fields). This makes it impossible, as it would be with MPEG audio layers, to embed a simple frame type flag in each audio packet, or begin decode at any frame in the stream without having previously fetched the codec setup header.

Note: Vorbis can initiate decode at any arbitrary packet within a bitstream so long as the codec has been initialized/setup with the setup headers.

Thus, Vorbis headers are both required for decode to begin and relatively large as bitstream headers go. The header size is unbounded, although for streaming a rule-of-thumb of 4kB or less is recommended (and Xiph.Org’s Vorbis encoder follows this suggestion).

Our own design work indicates the primary liability of the required header is in mindshare; it is an unusual design and thus causes some amount of complaint among engineers as this runs against current design trends (and also points out limitations in some existing software/interface designs, such as Windows’ ACM codec framework). However, we find that it does not fundamentally limit Vorbis’ suitable application space.

 

1.1.5. Format Specification

The Vorbis format is well-defined by its decode specification; any encoder that produces packets that are correctly decoded by the reference Vorbis decoder described below may be considered a proper Vorbis encoder. A decoder must faithfully and completely implement the specification defined below (except where noted) to be considered a proper Vorbis decoder.

 

1.1.6. Hardware Profile

Although Vorbis decode is computationally simple, it may still run into specific limitations of an embedded design. For this reason, embedded designs are allowed to deviate in limited ways from the ‘full’ decode specification yet still be certified compliant. These optional omissions are labelled in the spec where relevant.

 

1.2. Decoder Configuration

Decoder setup consists of configuration of multiple, self-contained component abstractions that perform specific functions in the decode pipeline. Each different component instance of a specific type is semantically interchangeable; decoder configuration consists both of internal component configuration, as well as arrangement of specific instances into a decode pipeline. Componentry arrangement is roughly as follows:

Figure 1: decoder pipeline configuration (diagram not reproduced here)

 

1.2.1. Global Config

Global codec configuration consists of a few audio related fields (sample rate, channels), Vorbis version (always ’0’ in Vorbis I), bitrate hints, and the lists of component instances. All other configuration is in the context of specific components.

 

1.2.2. Mode

Each Vorbis frame is coded according to a master ’mode’. A bitstream may use one or many modes.

The mode mechanism is used to encode a frame according to one of multiple possible methods with the intention of choosing a method best suited to that frame. Different modes are, e.g. how frame size is changed from frame to frame. The mode number of a frame serves as a top level configuration switch for all other specific aspects of frame decode.

A ’mode’ configuration consists of a frame size setting, window type (always 0, the Vorbis window, in Vorbis I), transform type (always type 0, the MDCT, in Vorbis I) and a mapping number. The mapping number specifies which mapping configuration instance to use for low-level packet decode and synthesis.

 

1.2.3. Mapping

A mapping contains a channel coupling description and a list of ’submaps’ that bundle sets of channel vectors together for grouped encoding and decoding. These submaps are not references to external components; the submap list is internal and specific to a mapping.

A ’submap’ is a configuration/grouping that applies to a subset of floor and residue vectors within a mapping. The submap functions as a last layer of indirection such that specific special floor or residue settings can be applied not only to all the vectors in a given mode, but also specific vectors in a specific mode. Each submap specifies the proper floor and residue instance number to use for decoding that submap’s spectral floor and spectral residue vectors.

As an example:

Assume a Vorbis stream that contains six channels in the standard 5.1 format. The sixth channel, as is normal in 5.1, is bass only. Therefore it would be wasteful to encode a full-spectrum version of it as with the other channels. The submapping mechanism can be used to apply a full range floor and residue encoding to channels 0 through 4, and a bass-only representation to the bass channel, thus saving space. In this example, channels 0-4 belong to submap 0 (which indicates use of a full-range floor) and channel 5 belongs to submap 1, which uses a bass-only representation.
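The 5.1 example above can be pictured as a small configuration record. This is only an illustrative sketch; the field names ("mux", "submaps", etc.) are invented for clarity and are not the spec's bitstream field names.

```python
# Illustrative sketch of the 5.1 submapping example; field names are
# invented for clarity, not taken from the Vorbis bitstream format.
mapping = {
    # channel-to-submap multiplex: channels 0-4 -> submap 0, channel 5 -> submap 1
    "mux": [0, 0, 0, 0, 0, 1],
    # each submap names the floor and residue instance used to decode it
    "submaps": [
        {"floor": 0, "residue": 0},   # submap 0: full-range floor/residue
        {"floor": 1, "residue": 1},   # submap 1: bass-only floor/residue
    ],
}

# The bass channel (index 5) is decoded with the bass-only configuration.
submap = mapping["submaps"][mapping["mux"][5]]
```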

 

1.2.4. Floor

Vorbis encodes a spectral ’floor’ vector for each PCM channel. This vector is a low-resolution representation of the audio spectrum for the given channel in the current frame, generally used akin to a whitening filter. It is named a ’floor’ because the Xiph.Org reference encoder has historically used it as a unit-baseline for spectral resolution.

A floor encoding may be of two types. Floor 0 uses a packed LSP representation on a dB amplitude scale and Bark frequency scale. Floor 1 represents the curve as a piecewise linear interpolated representation on a dB amplitude scale and linear frequency scale. The two floors are semantically interchangeable in encoding/decoding. However, floor type 1 provides more stable inter-frame behavior, and so is the preferred choice in all coupled-stereo and high bitrate modes. Floor 1 is also considerably less expensive to decode than floor 0.

Floor 0 is not to be considered deprecated, but it is of limited modern use. No known Vorbis encoder past Xiph.Org’s own beta 4 makes use of floor 0.

The values coded/decoded by a floor are both compactly formatted and make use of entropy coding to save space. For this reason, a floor configuration generally refers to multiple codebooks in the codebook component list. Entropy coding is thus provided as an abstraction, and each floor instance may choose from any and all available codebooks when coding/decoding.

 

1.2.5. Residue

The spectral residue is the fine structure of the audio spectrum once the floor curve has been subtracted out. In simplest terms, it is coded in the bitstream using cascaded (multi-pass) vector quantization according to one of three specific packing/coding algorithms numbered 0 through 2. The packing algorithm details are configured by residue instance. As with the floor components, the final VQ/entropy encoding is provided by external codebook instances and each residue instance may choose from any and all available codebooks.

 

1.2.6. Codebooks

Codebooks are a self-contained abstraction that perform entropy decoding and, optionally, use the entropy-decoded integer value as an offset into an index of output value vectors, returning the indicated vector of values.

The entropy coding in a Vorbis I codebook is provided by a standard Huffman binary tree representation. This tree is tightly packed using one of several methods, depending on whether codeword lengths are ordered or unordered, or the tree is sparse.

The codebook vector index is similarly packed according to index characteristic. Most commonly, the vector index is encoded as a single list of values of possible values that are then permuted into a list of n-dimensional rows (lattice VQ).

 

1.3. High-level Decode Process

 

1.3.1. Decode Setup

Before decoding can begin, a decoder must initialize using the bitstream headers matching the stream to be decoded. Vorbis uses three header packets; all are required, in-order, by this specification. Once set up, decode may begin at any audio packet belonging to the Vorbis stream. In Vorbis I, all packets after the three initial headers are audio packets.

The header packets are, in order, the identification header, the comments header, and the setup header.

Identification Header: The identification header identifies the bitstream as Vorbis, the Vorbis version, and the simple audio characteristics of the stream such as sample rate and number of channels.

Comment Header: The comment header includes user text comments (“tags”) and a vendor string for the application/library that produced the bitstream. The encoding and proper use of the comment header is described in Section 5, “comment field and header specification”.

Setup Header: The setup header includes extensive CODEC setup information as well as the complete VQ and Huffman codebooks needed for decode.

 

1.3.2. Decode Procedure

The decoding and synthesis procedure for all audio packets is fundamentally the same.

1. decode packet type flag
2. decode mode number
3. decode window shape (long windows only)
4. decode floor
5. decode residue into residue vectors
6. inverse channel coupling of residue vectors
7. generate floor curve from decoded floor data
8. compute dot product of floor and residue, producing audio spectrum vector
9. inverse monolithic transform of audio spectrum vector, always an MDCT in Vorbis I
10. overlap/add left-hand output of transform with right-hand output of previous frame
11. store right-hand data from transform of current frame for future lapping
12. if not first frame, return results of overlap/add as audio result of current frame

Note that clever rearrangement of the synthesis arithmetic is possible; as an example, one can take advantage of symmetries in the MDCT to store the right-hand transform data of a partial MDCT for a 50% inter-frame buffer space savings, and then complete the transform later before overlap/add with the next frame. This optimization produces entirely equivalent output and is naturally perfectly legal. The decoder must be entirely mathematically equivalent to the specification, it need not be a literal semantic implementation.

Packet type decode: Vorbis I uses four packet types. The first three packet types mark each of the three Vorbis headers described above. The fourth packet type marks an audio packet. All other packet types are reserved; packets marked with a reserved type should be ignored.

Following the three header packets, all packets in a Vorbis I stream are audio. The first step of audio packet decode is to read and verify the packet type; a non-audio packet when audio is expected indicates stream corruption or a non-compliant stream. The decoder must ignore the packet and not attempt decoding it to audio.

Mode decode: Vorbis allows an encoder to set up multiple, numbered packet ’modes’, as described earlier, all of which may be used in a given Vorbis stream. The mode is encoded as an integer used as a direct offset into the mode instance index.

Window shape decode (long windows only): Vorbis frames may be one of two PCM sample sizes specified during codec setup. In Vorbis I, legal frame sizes are powers of two from 64 to 8192 samples. Aside from coupling, Vorbis handles channels as independent vectors and these frame sizes are in samples per channel.

Vorbis uses an overlapping transform, namely the MDCT, to blend one frame into the next, avoiding most inter-frame block boundary artifacts. The MDCT output of one frame is windowed according to MDCT requirements, overlapped 50% with the output of the previous frame and added. The window shape assures seamless reconstruction.

This is easy to visualize in the case of equal-sized windows:

Figure 2: overlap of two equal-sized windows (diagram not reproduced here)

And slightly more complex in the case of overlapping unequal sized windows:

Figure 3: overlap of a long and a short window (diagram not reproduced here)

In the unequal-sized window case, the window shape of the long window must be modified for seamless lapping as above. It is possible to correctly infer window shape to be applied to the current window from knowing the sizes of the current, previous and next window. It is legal for a decoder to use this method. However, in the case of a long window (short windows require no modification), Vorbis also codes two flag bits to specify pre- and post- window shape. Although not strictly necessary for function, this minor redundancy allows a packet to be fully decoded to the point of lapping entirely independently of any other packet, allowing easier abstraction of decode layers as well as allowing a greater level of easy parallelism in encode and decode.

A description of valid window functions for use with an inverse MDCT can be found in [1]. Vorbis windows all use the slope function

y = sin(0.5 · π · sin²((x + 0.5)/n · π)).
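A quick numeric sketch of this slope function (here n is the window size in samples; the function rises from roughly 0 at the window edges to roughly 1 at the centre):

```python
import math

def vorbis_window(x, n):
    """Vorbis window slope: y = sin(0.5*pi * sin^2((x + 0.5)/n * pi))."""
    inner = math.sin((x + 0.5) / n * math.pi)
    return math.sin(0.5 * math.pi * inner * inner)

# Sample the slope over one 256-sample window.
n = 256
w = [vorbis_window(x, n) for x in range(n)]
```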

floor decode: Each floor is encoded/decoded in channel order; however, each floor belongs to a ’submap’ that specifies which floor configuration to use. All floors are decoded before residue decode begins.

residue decode: Although the number of residue vectors equals the number of channels, channel coupling may mean that the raw residue vectors extracted during decode do not map directly to specific channels. When channel coupling is in use, some vectors will correspond to coupled magnitude or angle. The coupling relationships are described in the codec setup and may differ from frame to frame, due to different mode numbers.

Vorbis codes residue vectors in groups by submap; the coding is done in submap order from submap 0 through n-1. This differs from floors which are coded using a configuration provided by submap number, but are coded individually in channel order.

inverse channel coupling: A detailed discussion of stereo in the Vorbis codec can be found in the document Stereo Channel Coupling in the Vorbis CODEC. Vorbis is not limited to only stereo coupling, but the stereo document also gives a good overview of the generic coupling mechanism.

Vorbis coupling applies to pairs of residue vectors at a time; decoupling is done in-place a pair at a time in the order and using the vectors specified in the current mapping configuration. The decoupling operation is the same for all pairs, converting square polar representation (where one vector is magnitude and the second angle) back to Cartesian representation.

After decoupling each pair of vectors on the coupling list, in order, the resulting residue vectors represent the fine spectral detail of each output channel.

generate floor curve: The decoder may choose to generate the floor curve at any appropriate time. It is reasonable to generate the output curve when the floor data is decoded from the raw packet, or it can be generated after inverse coupling and applied to the spectral residue directly, combining generation and the dot product into one step and eliminating some working space.

Both floor 0 and floor 1 generate a linear-range, linear-domain output vector to be multiplied (dot product) by the linear-range, linear-domain spectral residue.

compute floor/residue dot product: This step is straightforward; for each output channel, the decoder multiplies the floor curve and residue vectors element by element, producing the finished audio spectrum of each channel.

One point is worth mentioning about this dot product; a common mistake in a fixed point implementation might be to assume that a 32 bit fixed-point representation for floor and residue and direct multiplication of the vectors is sufficient for acceptable spectral depth in all cases because it happens to mostly work with the current Xiph.Org reference encoder.

However, floor vector values can span 140dB (24 bits unsigned), and the audio spectrum vector should represent a minimum of 120dB (21 bits with sign), even when output is to a 16 bit PCM device. For the residue vector to represent full scale if the floor is nailed to −140dB, it must be able to span 0 to +140dB. For the residue vector to reach full scale if the floor is nailed at 0dB, it must be able to represent −140dB to +0dB. Thus, in order to handle full range dynamics, a residue vector may span −140dB to +140dB entirely within spec. A 280dB range is approximately 48 bits with sign; thus the residue vector must be able to represent a 48 bit range and the dot product must be able to handle an effective 48 bit times 24 bit multiplication. This range may be achieved using large (64 bit or larger) integers, or implementing a movable binary point representation.
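The bit-width arithmetic above can be checked directly: each bit of dynamic range is worth 20·log10(2) ≈ 6.02 dB. A minimal sketch:

```python
import math

DB_PER_BIT = 20 * math.log10(2)   # ~6.02 dB of dynamic range per bit

def bits_for_db(db_range, signed=False):
    """Smallest integer bit count spanning db_range of dynamic range,
    plus one sign bit if the value must be signed."""
    return math.ceil(db_range / DB_PER_BIT) + (1 if signed else 0)

# 140 dB unsigned -> about 24 bits (floor vector)
# 120 dB with sign -> about 21 bits (audio spectrum minimum)
# 280 dB with sign -> about 48 bits (residue vector worst case)
```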

inverse monolithic transform (MDCT): The audio spectrum is converted back into time domain PCM audio via an inverse Modified Discrete Cosine Transform (MDCT). A detailed description of the MDCT is available in [1].

Note that the PCM produced directly from the MDCT is not yet finished audio; it must be lapped with surrounding frames using an appropriate window (such as the Vorbis window) before the MDCT can be considered orthogonal.

overlap/add data: Windowed MDCT output is overlapped and added with the right hand data of the previous window such that the 3/4 point of the previous window is aligned with the 1/4 point of the current window (as illustrated in the window overlap diagram). At this point, the audio data between the center of the previous frame and the center of the current frame is now finished and ready to be returned.
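The overlap/add step itself can be sketched with plain lists; no real MDCT is involved here, the frame halves are just stand-ins for windowed transform output:

```python
def overlap_add(prev_right, cur_left):
    """Add the right half of the previous frame to the left half of the
    current frame; the sum is the finished audio for that region."""
    assert len(prev_right) == len(cur_left)
    return [a + b for a, b in zip(prev_right, cur_left)]

# Stand-in data: fading-out tail of one frame, fading-in head of the next.
prev_right = [0.5, 0.25, 0.0, 0.0]
cur_left   = [0.5, 0.75, 1.0, 1.0]
finished = overlap_add(prev_right, cur_left)   # -> [1.0, 1.0, 1.0, 1.0]
```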

cache right hand data: The decoder must cache the right hand portion of the current frame to be lapped with the left hand portion of the next frame.

return finished audio data: The overlapped portion produced from overlapping the previous and current frame data is finished data to be returned by the decoder. This data spans from the center of the previous window to the center of the current window. In the case of same-sized windows, the amount of data to return is one-half block consisting of and only of the overlapped portions. When overlapping a short and long window, much of the returned range is not actually overlap. This does not damage transform orthogonality. Pay attention however to returning the correct data range; the amount of data to be returned is:

 

window_blocksize(previous_window)/4 + window_blocksize(current_window)/4

from the center of the previous window to the center of the current window.

Data is not returned from the first frame; it must be used to ’prime’ the decode engine. The encoder accounts for this priming when calculating PCM offsets; after the first frame, the proper PCM output offset is ’0’ (as no data has been returned yet).

2. Bitpacking Convention

 

2.1. Overview

The Vorbis codec uses relatively unstructured raw packets containing arbitrary-width binary integer fields. Logically, these packets are a bitstream in which bits are coded one-by-one by the encoder and then read one-by-one in the same monotonically increasing order by the decoder. Most current binary storage arrangements group bits into a native word size of eight bits (octets), sixteen bits, thirty-two bits or, less commonly other fixed word sizes. The Vorbis bitpacking convention specifies the correct mapping of the logical packet bitstream into an actual representation in fixed-width words.

 

2.1.1. octets, bytes and words

In most contemporary architectures, a ’byte’ is synonymous with an ’octet’, that is, eight bits. This has not always been the case; seven, ten, eleven and sixteen bit ’bytes’ have been used. For purposes of the bitpacking convention, a byte implies the native, smallest integer storage representation offered by a platform. On modern platforms, this is generally assumed to be eight bits (not necessarily because of the processor but because of the filesystem/memory architecture. Modern filesystems invariably offer bytes as the fundamental atom of storage). A ’word’ is an integer size that is a grouped multiple of this smallest size.

The most ubiquitous architectures today consider a ’byte’ to be an octet (eight bits) and a word to be a group of two, four or eight bytes (16, 32 or 64 bits). Note however that the Vorbis bitpacking convention is still well defined for any native byte size; Vorbis uses the native bit-width of a given storage system. This document assumes that a byte is one octet for purposes of example.

 

2.1.2. bit order

A byte has a well-defined ’least significant’ bit (LSb), which is the only bit set when the byte is storing the two’s complement integer value +1. A byte’s ’most significant’ bit (MSb) is at the opposite end of the byte. Bits in a byte are numbered from zero at the LSb to n (n = 7 in an octet) for the MSb.

 

2.1.3. byte order

Words are native groupings of multiple bytes. Several byte orderings are possible in a word; the common ones are 3-2-1-0 (’big endian’ or ’most significant byte first’ in which the highest-valued byte comes first), 0-1-2-3 (’little endian’ or ’least significant byte first’ in which the lowest value byte comes first) and less commonly 3-1-2-0 and 0-2-1-3 (’mixed endian’).

The Vorbis bitpacking convention specifies storage and bitstream manipulation at the byte, not word, level; thus host word ordering is of concern only during optimization, when writing high-performance code that operates on a word of storage at a time rather than byte by byte. Logically, bytes are always coded and decoded in order from byte zero through byte n.

 

2.1.4. coding bits into byte sequences

The Vorbis codec has need to code arbitrary bit-width integers, from zero to 32 bits wide, into packets. These integer fields are not aligned to the boundaries of the byte representation; the next field is written at the bit position at which the previous field ends.

The encoder logically packs integers by writing the LSb of a binary integer to the logical bitstream first, followed by next least significant bit, etc, until the requested number of bits have been coded. When packing the bits into bytes, the encoder begins by placing the LSb of the integer to be written into the least significant unused bit position of the destination byte, followed by the next-least significant bit of the source integer and so on up to the requested number of bits. When all bits of the destination byte have been filled, encoding continues by zeroing all bits of the next byte and writing the next bit into the bit position 0 of that byte. Decoding follows the same process as encoding, but by reading bits from the byte stream and reassembling them into integers.
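The packing convention described above can be sketched in a few lines of Python. This is an illustrative toy, not the reference implementation; the names `BitWriter` and `BitReader` are invented here, and 8-bit bytes are assumed:

```python
# Illustrative sketch of the LSb-first bitpacking convention, assuming 8-bit bytes.

class BitWriter:
    def __init__(self):
        self.bytes = bytearray()
        self.bitpos = 0  # next free bit position, counted from bit 0 of byte 0

    def write(self, value, bits):
        """Write the low 'bits' bits of 'value', LSb first."""
        for i in range(bits):
            if self.bitpos % 8 == 0:
                self.bytes.append(0)          # new destination byte, zeroed first
            if (value >> i) & 1:
                self.bytes[-1] |= 1 << (self.bitpos % 8)
            self.bitpos += 1

class BitReader:
    def __init__(self, data):
        self.data = data
        self.bitpos = 0

    def read(self, bits):
        """Read 'bits' bits LSb first; returns None on end-of-packet."""
        if bits == 0:
            return 0                          # zero-bit read: no cursor movement
        if self.bitpos + bits > 8 * len(self.data):
            return None                       # 'end-of-packet' condition
        value = 0
        for i in range(bits):
            byte, bit = divmod(self.bitpos, 8)
            value |= ((self.data[byte] >> bit) & 1) << i
            self.bitpos += 1
        return value
```

Writing the four values from the coding example below (12 in 4 bits, ’-1’ as b111 in 3 bits, 17 in 7 bits, 6969 in 13 bits) produces the byte sequence 0xFC 0x48 0xCE 0x06, matching the figures.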

 

2.1.5. signedness

The signedness of a specific number resulting from decode is to be interpreted by the decoder given decode context. That is, the three bit binary pattern ’b111’ can be taken to represent either ’seven’ as an unsigned integer, or ’-1’ as a signed, two’s complement integer. The encoder and decoder are responsible for knowing if fields are to be treated as signed or unsigned.

 

2.1.6. coding example

Code the 4 bit integer value ’12’ [b1100] into an empty bytestream. Bytestream result:

 

              |
              V
        7 6 5 4 3 2 1 0
byte 0 [0 0 0 0 1 1 0 0] <-
byte 1 [               ]
byte 2 [               ]
byte 3 [               ]
...
byte n [               ]   bytestream length == 1 byte

Continue by coding the 3 bit integer value ’-1’ [b111]:

 

        |
        V
        7 6 5 4 3 2 1 0
byte 0 [0 1 1 1 1 1 0 0] <-
byte 1 [               ]
byte 2 [               ]
byte 3 [               ]
...
byte n [               ]   bytestream length == 1 byte

Continue by coding the 7 bit integer value ’17’ [b0010001]:

 

          |
          V
        7 6 5 4 3 2 1 0
byte 0 [1 1 1 1 1 1 0 0]
byte 1 [0 0 0 0 1 0 0 0] <-
byte 2 [               ]
byte 3 [               ]
...
byte n [               ]   bytestream length == 2 bytes
                           bit cursor == 6

Continue by coding the 13 bit integer value ’6969’ [b110 11001110 01]:

 

                |
                V
        7 6 5 4 3 2 1 0
byte 0 [1 1 1 1 1 1 0 0]
byte 1 [0 1 0 0 1 0 0 0]
byte 2 [1 1 0 0 1 1 1 0]
byte 3 [0 0 0 0 0 1 1 0] <-
...
byte n [               ]   bytestream length == 4 bytes

 

2.1.7. decoding example

Reading from the beginning of the bytestream encoded in the above example:

 

                      |
                      V
        7 6 5 4 3 2 1 0
byte 0 [1 1 1 1 1 1 0 0] <-
byte 1 [0 1 0 0 1 0 0 0]
byte 2 [1 1 0 0 1 1 1 0]
byte 3 [0 0 0 0 0 1 1 0]   bytestream length == 4 bytes

We read two, two-bit integer fields, resulting in the returned numbers ’b00’ and ’b11’. Two things are worth noting here:

  • Although these four bits were originally written as a single four-bit integer, reading some other combination of bit-widths from the bitstream is well defined. There are no artificial alignment boundaries maintained in the bitstream.
  • The second value is the two-bit-wide integer ’b11’. This value may be interpreted either as the unsigned value ’3’, or the signed value ’-1’. Signedness is dependent on decode context.

 

2.1.8. end-of-packet alignment

The typical use of bitpacking is to produce many independent byte-aligned packets which are embedded into a larger byte-aligned container structure, such as an Ogg transport bitstream. Externally, each bytestream (encoded bitstream) must begin and end on a byte boundary. Often, the encoded bitstream is not an integer number of bytes, and so there is unused (uncoded) space in the last byte of a packet.

Unused space in the last byte of a bytestream is always zeroed during the coding process. Thus, should this unused space be read, it will return binary zeroes.

Attempting to read past the end of an encoded packet results in an ’end-of-packet’ condition. End-of-packet is not to be considered an error; it is merely a state indicating that there is insufficient remaining data to fulfill the desired read size. Vorbis uses truncated packets as a normal mode of operation, and as such, decoders must handle reading past the end of a packet as a typical mode of operation. Any further read operations after an ’end-of-packet’ condition shall also return ’end-of-packet’.

 

2.1.9. reading zero bits

Reading a zero-bit-wide integer returns the value ’0’ and does not increment the stream cursor. Reading to the end of the packet (but not past, such that an ’end-of-packet’ condition has not been triggered) and then reading a zero-bit-wide integer shall succeed, returning 0, and shall not trigger an end-of-packet condition. Reading a zero-bit-wide integer after a previous read has set the ’end-of-packet’ condition shall also fail with ’end-of-packet’.

3. Probability Model and Codebooks

 

3.1. Overview

Unlike practically every other mainstream audio codec, Vorbis has no statically configured probability model, instead packing all entropy decoding configuration, VQ and Huffman, into the bitstream itself in the third header, the codec setup header. This packed configuration consists of multiple ’codebooks’, each containing a specific Huffman-equivalent representation for decoding compressed codewords as well as an optional lookup table of output vector values to which a decoded Huffman value is applied as an offset, generating the final decoded output corresponding to a given compressed codeword.

 

3.1.1. Bitwise operation

The codebook mechanism is built on top of the vorbis bitpacker. Both the codebooks themselves and the codewords they decode are unrolled from a packet as a series of arbitrary-width values read from the stream according to Section 2, “Bitpacking Convention”.

 

3.2. Packed codebook format

For purposes of the examples below, we assume that the storage system’s native byte width is eight bits. This is not universally true; see Section 2, “Bitpacking Convention” for discussion relating to non-eight-bit bytes.

 

3.2.1. codebook decode

A codebook begins with a 24 bit sync pattern, 0x564342:

 

byte 0: [ 0 1 0 0 0 0 1 0 ] (0x42)
byte 1: [ 0 1 0 0 0 0 1 1 ] (0x43)
byte 2: [ 0 1 0 1 0 1 1 0 ] (0x56)

16 bit [codebook_dimensions] and 24 bit [codebook_entries] fields:

 

byte 3: [ X X X X X X X X ]
byte 4: [ X X X X X X X X ] [codebook_dimensions] (16 bit unsigned)

byte 5: [ X X X X X X X X ]
byte 6: [ X X X X X X X X ]
byte 7: [ X X X X X X X X ] [codebook_entries] (24 bit unsigned)

Next is the [ordered] bit flag:

 

byte 8: [ X ] [ordered] (1 bit)

Each entry, numbering a total of [codebook_entries], is assigned a codeword length. We now read the list of codeword lengths and store these lengths in the array [codebook_codeword_lengths]. Decode of lengths is according to whether the [ordered] flag is set or unset.

  • If the [ordered] flag is unset, the codeword list is not length ordered and the decoder needs to read each codeword length one-by-one.

    The decoder first reads one additional bit flag, the [sparse] flag. This flag determines whether or not the codebook contains unused entries that are not to be included in the codeword decode tree:

     

    byte 8: [ X 1 ] [sparse] flag (1 bit)

    The decoder now performs for each of the [codebook_entries] codebook entries:

     

    1) if([sparse] is set) {

         2) [flag] = read one bit;
         3) if([flag] is set) {

              4) [length] = read a five bit unsigned integer;
              5) codeword length for this entry is [length]+1;

            } else {

              6) this entry is unused. mark it as such.

            }

       } else the sparse flag is not set {

         7) [length] = read a five bit unsigned integer;
         8) the codeword length for this entry is [length]+1;

       }
  • If the [ordered] flag is set, the codeword list for this codebook is encoded in ascending length order. Rather than reading a length for every codeword, the decoder reads the number of codewords per length. That is, beginning at entry zero:

     

    1) [current_entry] = 0;
    2) [current_length] = read a five bit unsigned integer and add 1;
    3) [number] = read ilog([codebook_entries] - [current_entry]) bits as an unsigned integer
    4) set the entries [current_entry] through [current_entry]+[number]-1, inclusive,
       of the [codebook_codeword_lengths] array to [current_length]
    5) set [current_entry] to [number] + [current_entry]
    6) increment [current_length] by 1
    7) if [current_entry] is greater than [codebook_entries] ERROR CONDITION;
       the decoder will not be able to read this stream.
    8) if [current_entry] is less than [codebook_entries], repeat process starting at 3)
    9) done.
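As a sketch, the ordered branch above might look as follows in Python. This is illustrative only: `read` is a stand-in callable that returns the next already-unpacked unsigned integer of the requested width, and `ilog` follows the definition used elsewhere in this specification (the position of the highest set bit, with ilog(0) = 0):

```python
def ilog(x):
    # ilog as used by the spec: position of the highest set bit; ilog(0) == 0.
    return x.bit_length() if x > 0 else 0

def decode_ordered_lengths(codebook_entries, read):
    """Decode the length-ordered codeword length list; 'read(bits)' returns
    the next unsigned integer of the given bit width from the bitstream."""
    lengths = [0] * codebook_entries
    current_entry = 0
    current_length = read(5) + 1
    while True:
        number = read(ilog(codebook_entries - current_entry))
        for e in range(current_entry, current_entry + number):
            lengths[e] = current_length
        current_entry += number
        current_length += 1
        if current_entry > codebook_entries:
            raise ValueError("undecodable stream")   # ERROR CONDITION in step 7
        if current_entry == codebook_entries:
            return lengths
```

For example, with eight entries and the reads 1 (five bits), 4, 4, the result is four codewords of length 2 followed by four of length 3.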

After all codeword lengths have been decoded, the decoder reads the vector lookup table. Vorbis I supports three lookup types:

1. No lookup
2. Implicitly populated value mapping (lattice VQ)
3. Explicitly populated value mapping (tessellated or ’foam’ VQ)

The lookup table type is read as a four bit unsigned integer:

1) [codebook_lookup_type] = read four bits as an unsigned integer

Codebook decode proceeds according to [codebook_lookup_type]:

  • Lookup type zero indicates no lookup to be read. Proceed past lookup decode.
  • Lookup types one and two are similar, differing only in the number of lookup values to be read. Lookup type one reads a list of values that are permuted in a set pattern to build a list of vectors, each vector of order [codebook_dimensions] scalars. Lookup type two builds the same vector list, but reads each scalar for each vector explicitly, rather than building vectors from a smaller list of possible scalar values. Lookup decode proceeds as follows:

     

    1) [codebook_minimum_value] = float32_unpack( read 32 bits as an unsigned integer)
    2) [codebook_delta_value] = float32_unpack( read 32 bits as an unsigned integer)
    3) [codebook_value_bits] = read 4 bits as an unsigned integer and add 1
    4) [codebook_sequence_p] = read 1 bit as a boolean flag

    if ( [codebook_lookup_type] is 1 ) {

      5) [codebook_lookup_values] = lookup1_values([codebook_entries], [codebook_dimensions] )

    } else {

      6) [codebook_lookup_values] = [codebook_entries] * [codebook_dimensions]

    }

    7) read a total of [codebook_lookup_values] unsigned integers of [codebook_value_bits] each;
       store these in order in the array [codebook_multiplicands]
  • A [codebook_lookup_type] of greater than two is reserved and indicates a stream that is not decodable by the specification in this document.

An ’end of packet’ during any read operation in the above steps is considered an error condition rendering the stream undecodable.

Huffman decision tree representation

The [codebook_codeword_lengths] array and [codebook_entries] value uniquely define the Huffman decision tree used for entropy decoding.

Briefly, each used codebook entry (recall that length-unordered codebooks support unused codeword entries) is assigned, in order, the lowest valued unused binary Huffman codeword possible. Assume the following codeword length list:

 

entry 0: length 2
entry 1: length 4
entry 2: length 4
entry 3: length 4
entry 4: length 4
entry 5: length 2
entry 6: length 3
entry 7: length 3

Assigning codewords in order (lowest possible value of the appropriate length to highest) results in the following codeword list:

 

entry 0: length 2 codeword 00
entry 1: length 4 codeword 0100
entry 2: length 4 codeword 0101
entry 3: length 4 codeword 0110
entry 4: length 4 codeword 0111
entry 5: length 2 codeword 10
entry 6: length 3 codeword 110
entry 7: length 3 codeword 111

Note: Unlike most binary numerical values in this document, we intend the above codewords to be read and used bit by bit from left to right, thus the codeword ’001’ is the bit string ’zero, zero, one’. When determining ’lowest possible value’ in the assignment definition above, the leftmost bit is the MSb.
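The assignment rule can be demonstrated with a deliberately naive Python sketch that, for each entry, scans for the lowest-valued free codeword of the required length. This is illustrative only (`assign_codewords` is an invented name); a real decoder builds the tree far more efficiently:

```python
def assign_codewords(lengths):
    """Assign each entry the lowest-valued unused codeword of its length.
    Codewords are MSb-first bit strings; a None length marks an unused entry
    in a sparse codebook. Brute force, for illustration only."""
    assigned = []
    taken = []
    for length in lengths:
        if length is None:                     # unused entry: no codeword at all
            assigned.append(None)
            continue
        for candidate in range(1 << length):
            word = format(candidate, '0{}b'.format(length))
            # A codeword is free if it is neither a prefix of, nor prefixed
            # by, any codeword already assigned.
            if all(not w.startswith(word) and not word.startswith(w) for w in taken):
                taken.append(word)
                assigned.append(word)
                break
        else:
            raise ValueError("overspecified tree: no codeword available")
    return assigned
```

Running this on the length list above reproduces the codeword table exactly.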

It is clear that the codeword length list represents a Huffman decision tree with the entry numbers equivalent to the leaves numbered left-to-right:

[Figure 4: huffman tree illustration]

As we assign codewords in order, we see that each choice constructs a new leaf in the leftmost possible position.

Note that it’s possible to underspecify or overspecify a Huffman tree via the length list. In the above example, if codeword seven were eliminated, it’s clear that the tree is unfinished:

[Figure 5: underspecified huffman tree illustration]

Similarly, in the original codebook, it’s clear that the tree is fully populated and a ninth codeword is impossible. Both underspecified and overspecified trees are an error condition rendering the stream undecodable. Take special care that a codebook with a single used entry is handled properly; it consists of a single codeword of zero bits, and ’reading’ a value out of such a codebook always returns the single used value and sinks zero bits.

Codebook entries marked ’unused’ are simply skipped in the assigning process. They have no codeword and do not appear in the decision tree, thus it’s impossible for any bit pattern read from the stream to decode to that entry number.

VQ lookup table vector representation

Unpacking the VQ lookup table vectors relies on the following values:

the [codebook_multiplicands] array
[codebook_minimum_value]
[codebook_delta_value]
[codebook_sequence_p]
[codebook_lookup_type]
[codebook_entries]
[codebook_dimensions]
[codebook_lookup_values]

Decoding (unpacking) a specific vector in the vector lookup table proceeds according to [codebook_lookup_type]. The unpacked vector values are what a codebook would return during audio packet decode in a VQ context.

Vector value decode: Lookup type 1

Lookup type one specifies a lattice VQ lookup table built algorithmically from a list of scalar values. Calculate (unpack) the final values of a codebook entry vector from the entries in [codebook_multiplicands] as follows ([value_vector] is the output vector representing the vector of values for entry number [lookup_offset] in this codebook):

 

1) [last] = 0;
2) [index_divisor] = 1;
3) iterate [i] over the range 0 ... [codebook_dimensions]-1 (once for each scalar value in the value vector) {

     4) [multiplicand_offset] = ( [lookup_offset] divided by [index_divisor] using integer
        division ) integer modulo [codebook_lookup_values]

     5) vector [value_vector] element [i] =
          ( [codebook_multiplicands] array element number [multiplicand_offset] ) *
          [codebook_delta_value] + [codebook_minimum_value] + [last];

     6) if ( [codebook_sequence_p] is set ) then set [last] = vector [value_vector] element [i]

     7) [index_divisor] = [index_divisor] * [codebook_lookup_values]

   }

8) vector calculation completed.

Vector value decode: Lookup type 2

Lookup type two specifies a VQ lookup table in which each scalar in each vector is explicitly set by the [codebook_multiplicands] array in a one-to-one mapping. Calculate (unpack) the final values of a codebook entry vector from the entries in [codebook_multiplicands] as follows ([value_vector] is the output vector representing the vector of values for entry number [lookup_offset] in this codebook):

 

1) [last] = 0;
2) [multiplicand_offset] = [lookup_offset] * [codebook_dimensions]
3) iterate [i] over the range 0 ... [codebook_dimensions]-1 (once for each scalar value in the value vector) {

     4) vector [value_vector] element [i] =
          ( [codebook_multiplicands] array element number [multiplicand_offset] ) *
          [codebook_delta_value] + [codebook_minimum_value] + [last];

     5) if ( [codebook_sequence_p] is set ) then set [last] = vector [value_vector] element [i]

     6) increment [multiplicand_offset]

   }

7) vector calculation completed.
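Both unpack procedures can be condensed into one illustrative Python function. `lookup1_values` is defined elsewhere in this specification as the greatest integer v such that v raised to the [codebook_dimensions] power does not exceed [codebook_entries]; the dict-based `cb` argument is an invented convenience for this sketch, not a spec structure:

```python
def lookup1_values(entries, dimensions):
    # Greatest integer v such that v ** dimensions <= entries
    # (the function is defined elsewhere in the specification).
    v = 0
    while (v + 1) ** dimensions <= entries:
        v += 1
    return v

def unpack_vector(lookup_offset, cb):
    """Unpack one VQ vector; 'cb' holds the decoded codebook fields."""
    value_vector = []
    last = 0.0
    if cb['lookup_type'] == 1:
        # Lattice VQ: index each dimension into the shared scalar list.
        index_divisor = 1
        for _ in range(cb['dimensions']):
            offset = (lookup_offset // index_divisor) % cb['lookup_values']
            v = cb['multiplicands'][offset] * cb['delta_value'] + cb['minimum_value'] + last
            value_vector.append(v)
            if cb['sequence_p']:
                last = v
            index_divisor *= cb['lookup_values']
    else:
        # Lookup type 2: scalars are stored explicitly, one-to-one.
        offset = lookup_offset * cb['dimensions']
        for _ in range(cb['dimensions']):
            v = cb['multiplicands'][offset] * cb['delta_value'] + cb['minimum_value'] + last
            value_vector.append(v)
            if cb['sequence_p']:
                last = v
            offset += 1
    return value_vector
```

With two dimensions, four entries, multiplicands [0, 1], a delta of 1.0 and a minimum of 0.0, lattice entry 3 unpacks to the vector [1.0, 1.0].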

 

3.3. Use of the codebook abstraction

The decoder uses the codebook abstraction much as it does the bit-unpacking convention; a specific codebook reads a codeword from the bitstream, decoding it into an entry number, and then returns that entry number to the decoder (when used in a scalar entropy coding context), or uses that entry number as an offset into the VQ lookup table, returning a vector of values (when used in a context desiring a VQ value). Scalar or VQ context is always explicit; any call to the codebook mechanism requests either a scalar entry number or a lookup vector.

Note that VQ lookup type zero indicates that there is no lookup table; requesting decode using a codebook of lookup type 0 in any context expecting a vector return value (even in a case where the vector is of dimension one) is forbidden. If decoder setup or decode requests such an action, that is an error condition rendering the packet undecodable.

Using a codebook to read from the packet bitstream consists first of reading and decoding the next codeword in the bitstream. The decoder reads bits until the accumulated bits match a codeword in the codebook. This process can be thought of as logically walking the Huffman decode tree by reading one bit at a time from the bitstream, and using the bit as a decision boolean to take the 0 branch (left in the above examples) or the 1 branch (right in the above examples). Walking the tree finishes when the decode process hits a leaf in the decision tree; the result is the entry number corresponding to that leaf. Reading past the end of a packet propagates the ’end-of-packet’ condition to the decoder.

When used in a scalar context, the resulting codeword entry is the desired return value.

When used in a VQ context, the codeword entry number is used as an offset into the VQ lookup table. The value returned to the decoder is the vector of scalars corresponding to this offset.
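The codeword read can be sketched as follows, using the codeword table from the earlier example and a `read_bit` callable (both invented names for this illustration); a production decoder would walk a prebuilt tree or table instead of comparing strings:

```python
def read_codeword(read_bit, codewords):
    """Accumulate bits (MSb first) until the accumulated string matches a
    codeword; returns the entry number, or None on end-of-packet."""
    accum = ""
    while True:
        bit = read_bit()
        if bit is None:
            return None                  # end-of-packet propagates to the decoder
        accum += str(bit)
        for entry, word in enumerate(codewords):
            if word == accum:            # hit a leaf of the decision tree
                return entry
```

Feeding the bits 1, 0 against the example table returns entry 5 (codeword ’10’); continuing with 0, 1, 0, 0 returns entry 1 (codeword ’0100’).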

4. Codec Setup and Packet Decode

 

4.1. Overview

This document serves as the top-level reference document for the bit-by-bit decode specification of Vorbis I. This document assumes a high-level understanding of the Vorbis decode process, which is provided in Section 1, “Introduction and Description”. Section 2, “Bitpacking Convention” covers reading and writing bit fields from and to bitstream packets.

 

4.2. Header decode and decode setup

A Vorbis bitstream begins with three header packets. The header packets are, in order, the identification header, the comments header, and the setup header. All are required for decode compliance. An end-of-packet condition during decoding the first or third header packet renders the stream undecodable. End-of-packet decoding the comment header is a non-fatal error condition.

 

4.2.1. Common header decode

Each header packet begins with the same header fields.

 

1) [packet_type] : 8 bit value
2) 0x76, 0x6f, 0x72, 0x62, 0x69, 0x73: the characters ’v’,’o’,’r’,’b’,’i’,’s’ as six octets

Decode continues according to packet type; the identification header is type 1, the comment header type 3 and the setup header type 5 (these types are all odd as a packet with a leading single bit of ’0’ is an audio packet). The packets must occur in the order of identification, comment, setup.
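Assuming byte-aligned access to the start of a header packet, the common header check might be sketched as follows (`parse_common_header` is a hypothetical helper, not part of the specification):

```python
def parse_common_header(packet):
    """Verify the common header fields of a Vorbis header packet and
    return the packet type (1, 3 or 5); raises on a malformed header."""
    if len(packet) < 7 or packet[1:7] != b'vorbis':
        raise ValueError("not a Vorbis header packet")
    packet_type = packet[0]
    if packet_type not in (1, 3, 5):
        raise ValueError("unknown header packet type")
    return packet_type
```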

 

4.2.2. Identification header

The identification header is a short header of only a few fields used to declare the stream definitively as Vorbis, and provide a few externally relevant pieces of information about the audio stream. The identification header is coded as follows:

 

1) [vorbis_version] = read 32 bits as unsigned integer
2) [audio_channels] = read 8 bit integer as unsigned
3) [audio_sample_rate] = read 32 bits as unsigned integer
4) [bitrate_maximum] = read 32 bits as signed integer
5) [bitrate_nominal] = read 32 bits as signed integer
6) [bitrate_minimum] = read 32 bits as signed integer
7) [blocksize_0] = 2 exponent (read 4 bits as unsigned integer)
8) [blocksize_1] = 2 exponent (read 4 bits as unsigned integer)
9) [framing_flag] = read one bit

[vorbis_version] is to read ’0’ in order to be compatible with this document. Both [audio_channels] and [audio_sample_rate] must read greater than zero. Allowed final blocksize values are 64, 128, 256, 512, 1024, 2048, 4096 and 8192 in Vorbis I. [blocksize_0] must be less than or equal to [blocksize_1]. The framing bit must be nonzero. Failure to meet any of these conditions renders a stream undecodable.

The bitrate fields above are used only as hints. The nominal bitrate field especially may be considerably off in purely VBR streams. The fields are meaningful only when greater than zero.

  • All three fields set to the same value implies a fixed rate, or tightly bounded, nearly fixed-rate bitstream
  • Only nominal set implies a VBR or ABR stream that averages the nominal bitrate
  • Maximum and or minimum set implies a VBR bitstream that obeys the bitrate limits
  • None set indicates the encoder does not care to speculate.
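The identification-header sanity checks listed above can be collected into one illustrative predicate (`validate_id_header` is an invented helper; it takes already-decoded field values):

```python
def validate_id_header(vorbis_version, audio_channels, audio_sample_rate,
                       blocksize_0, blocksize_1, framing_flag):
    """Return True when the identification header satisfies the
    compatibility conditions stated in the specification text above."""
    legal_blocksizes = {64, 128, 256, 512, 1024, 2048, 4096, 8192}
    return (vorbis_version == 0
            and audio_channels > 0
            and audio_sample_rate > 0
            and blocksize_0 in legal_blocksizes
            and blocksize_1 in legal_blocksizes
            and blocksize_0 <= blocksize_1
            and framing_flag != 0)
```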

 

4.2.3. Comment header

Comment header decode and data specification is covered in Section 5, “comment field and header specification”.

 

4.2.4. Setup header

Vorbis codec setup is configurable to an extreme degree:

[Figure 6: decoder pipeline configuration]

The setup header contains the bulk of the codec setup information needed for decode. The setup header contains, in order, the lists of codebook configurations, time-domain transform configurations (placeholders in Vorbis I), floor configurations, residue configurations, channel mapping configurations and mode configurations. It finishes with a framing bit of ’1’. Header decode proceeds in the following order:

Codebooks

1. [vorbis_codebook_count] = read eight bits as unsigned integer and add one
2. Decode [vorbis_codebook_count] codebooks in order as defined in Section 3, “Probability Model and Codebooks”. Save each configuration, in order, in an array of codebook configurations [vorbis_codebook_configurations].

Time domain transforms

These hooks are placeholders in Vorbis I. Nevertheless, the configuration placeholder values must be read to maintain bitstream sync.

 

1. [vorbis_time_count] = read 6 bits as unsigned integer and add one
2. read [vorbis_time_count] 16 bit values; each value should be zero. If any value is nonzero, this is an error condition and the stream is undecodable.

Floors

Vorbis uses two floor types; header decode is handed to the decode abstraction of the appropriate type.

 

1. [vorbis_floor_count] = read 6 bits as unsigned integer and add one
2. For each [i] of [vorbis_floor_count] floor numbers:
   a) read the floor type: vector [vorbis_floor_types] element [i] = read 16 bits as unsigned integer
   b) If the floor type is zero, decode the floor configuration as defined in Section 6, “Floor type 0 setup and decode”; save this configuration in slot [i] of the floor configuration array [vorbis_floor_configurations].
   c) If the floor type is one, decode the floor configuration as defined in Section 7, “Floor type 1 setup and decode”; save this configuration in slot [i] of the floor configuration array [vorbis_floor_configurations].
   d) If the floor type is greater than one, this stream is undecodable; ERROR CONDITION

Residues

Vorbis uses three residue types; header decode of each type is identical.

 

1. [vorbis_residue_count] = read 6 bits as unsigned integer and add one
2. For each [i] of [vorbis_residue_count] residue numbers:
   a) read the residue type; vector [vorbis_residue_types] element [i] = read 16 bits as unsigned integer
   b) If the residue type is zero, one or two, decode the residue configuration as defined in Section 8, “Residue setup and decode”; save this configuration in slot [i] of the residue configuration array [vorbis_residue_configurations].
   c) If the residue type is greater than two, this stream is undecodable; ERROR CONDITION

Mappings

Mappings are used to set up specific pipelines for encoding multichannel audio with varying channel mapping applications. Vorbis I uses a single mapping type (0), with implicit PCM channel mappings.

 

1. [vorbis_mapping_count] = read 6 bits as unsigned integer and add one
2. For each [i] of [vorbis_mapping_count] mapping numbers:
   a) read the mapping type: 16 bits as unsigned integer. There’s no reason to save the mapping type in Vorbis I.
   b) If the mapping type is nonzero, the stream is undecodable
   c) If the mapping type is zero:
      i. read 1 bit as a boolean flag
         A. if set, [vorbis_mapping_submaps] = read 4 bits as unsigned integer and add one
         B. if unset, [vorbis_mapping_submaps] = 1
      ii. read 1 bit as a boolean flag
         A. if set, square polar channel mapping is in use:
            • [vorbis_mapping_coupling_steps] = read 8 bits as unsigned integer and add one
            • for each [j] of [vorbis_mapping_coupling_steps] steps:
              • vector [vorbis_mapping_magnitude] element [j] = read ilog([audio_channels] - 1) bits as unsigned integer
              • vector [vorbis_mapping_angle] element [j] = read ilog([audio_channels] - 1) bits as unsigned integer
              • the numbers read in the above two steps are channel numbers representing the channel to treat as magnitude and the channel to treat as angle, respectively. If for any coupling step the angle channel number equals the magnitude channel number, the magnitude channel number is greater than [audio_channels]-1, or the angle channel is greater than [audio_channels]-1, the stream is undecodable.
         B. if unset, [vorbis_mapping_coupling_steps] = 0
      iii. read 2 bits (reserved field); if the value is nonzero, the stream is undecodable
      iv. if [vorbis_mapping_submaps] is greater than one, read channel multiplex settings. For each [j] of [audio_channels] channels:
         A. vector [vorbis_mapping_mux] element [j] = read 4 bits as unsigned integer
         B. if the value is greater than the highest numbered submap ([vorbis_mapping_submaps] - 1), this is an error condition rendering the stream undecodable
      v. for each submap [j] of [vorbis_mapping_submaps] submaps, read the floor and residue numbers for use in decoding that submap:
         A. read and discard 8 bits (the unused time configuration placeholder)
         B. read 8 bits as unsigned integer for the floor number; save in vector [vorbis_mapping_submap_floor] element [j]
         C. verify the floor number is not greater than the highest number floor configured for the bitstream. If it is, the bitstream is undecodable
         D. read 8 bits as unsigned integer for the residue number; save in vector [vorbis_mapping_submap_residue] element [j]
         E. verify the residue number is not greater than the highest number residue configured for the bitstream. If it is, the bitstream is undecodable
      vi. save this mapping configuration in slot [i] of the mapping configuration array [vorbis_mapping_configurations].

Modes

1. [vorbis_mode_count] = read 6 bits as unsigned integer and add one
2. For each [i] of [vorbis_mode_count] mode numbers:
   a) [vorbis_mode_blockflag] = read 1 bit
   b) [vorbis_mode_windowtype] = read 16 bits as unsigned integer
   c) [vorbis_mode_transformtype] = read 16 bits as unsigned integer
   d) [vorbis_mode_mapping] = read 8 bits as unsigned integer
   e) verify ranges; zero is the only legal value in Vorbis I for [vorbis_mode_windowtype] and [vorbis_mode_transformtype]. [vorbis_mode_mapping] must not be greater than the highest number mapping in use. Any illegal values render the stream undecodable.
   f) save this mode configuration in slot [i] of the mode configuration array [vorbis_mode_configurations].
3. read 1 bit as a framing flag. If unset, a framing error occurred and the stream is not decodable.

After reading mode descriptions, setup header decode is complete.

 

4.3. Audio packet decode and synthesis

Following the three header packets, all packets in a Vorbis I stream are audio. The first step of audio packet decode is to read and verify the packet type. A non-audio packet when audio is expected indicates stream corruption or a non-compliant stream. The decoder must ignore the packet and not attempt decoding it to audio.

 

4.3.1. packet type, mode and window decode

 

1. read 1 bit [packet_type]; check that packet type is 0 (audio)
2. read ilog([vorbis_mode_count]-1) bits [mode_number]
3. decode blocksize: [n] is equal to [blocksize_0] if [vorbis_mode_blockflag] is 0, else [n] is equal to [blocksize_1].
4. perform window selection and setup; this window is used later by the inverse MDCT:
   a) if this is a long window (the [vorbis_mode_blockflag] flag of this mode is set):
      i. read 1 bit for [previous_window_flag]
      ii. read 1 bit for [next_window_flag]
      iii. if [previous_window_flag] is not set, the left half of the window will be a hybrid window for lapping with a short block. See paragraph 1.3.2, “Window shape decode (long windows only)” for an illustration of overlapping dissimilar windows. Else, the left half window will have the normal long shape.
      iv. if [next_window_flag] is not set, the right half of the window will be a hybrid window for lapping with a short block. See paragraph 1.3.2, “Window shape decode (long windows only)” for an illustration of overlapping dissimilar windows. Else, the right half window will have the normal long shape.
   b) if this is a short window, the window is always the same short-window shape.

Vorbis windows all use the slope function y = sin((π/2) · sin²(((x + 0.5)/n) · π)), where n is the window size and x ranges over 0 ... n-1, but dissimilar lapping requirements can affect overall shape. Window generation proceeds as follows:

 

1. [window_center] = [n] / 2
2. if ([vorbis_mode_blockflag] is set and [previous_window_flag] is not set) then
   a) [left_window_start] = [n]/4 - [blocksize_0]/4
   b) [left_window_end] = [n]/4 + [blocksize_0]/4
   c) [left_n] = [blocksize_0]/2

   else

   a) [left_window_start] = 0
   b) [left_window_end] = [window_center]
   c) [left_n] = [n]/2
3. if ([vorbis_mode_blockflag] is set and [next_window_flag] is not set) then
   a) [right_window_start] = [n]*3/4 - [blocksize_0]/4
   b) [right_window_end] = [n]*3/4 + [blocksize_0]/4
   c) [right_n] = [blocksize_0]/2

   else

   a) [right_window_start] = [window_center]
   b) [right_window_end] = [n]
   c) [right_n] = [n]/2
4. window from range 0 ... [left_window_start]-1 inclusive is zero
5. for [i] in range [left_window_start] ... [left_window_end]-1, window([i]) = sin((π/2) · sin²((([i]-[left_window_start]+0.5) / [left_n]) · (π/2)))
6. window from range [left_window_end] ... [right_window_start]-1 inclusive is one
7. for [i] in range [right_window_start] ... [right_window_end]-1, window([i]) = sin((π/2) · sin²((([i]-[right_window_start]+0.5) / [right_n]) · (π/2) + π/2))
8. window from range [right_window_end] ... [n]-1 is zero
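Steps 1 through 8 can be sketched directly in Python (`vorbis_window` is an invented helper for illustration; the flags are passed as booleans and the window is returned as a list of floats):

```python
import math

def vorbis_window(n, blocksize_0, long_block, prev_flag, next_flag):
    """Generate the lapping window following the eight steps above."""
    if long_block and not prev_flag:
        left_start = n // 4 - blocksize_0 // 4
        left_end = n // 4 + blocksize_0 // 4
        left_n = blocksize_0 // 2
    else:
        left_start, left_end, left_n = 0, n // 2, n // 2
    if long_block and not next_flag:
        right_start = n * 3 // 4 - blocksize_0 // 4
        right_end = n * 3 // 4 + blocksize_0 // 4
        right_n = blocksize_0 // 2
    else:
        right_start, right_end, right_n = n // 2, n, n // 2
    w = [0.0] * n                                   # steps 4 and 8: zero outside
    for i in range(left_start, left_end):           # step 5: rising slope
        x = (i - left_start + 0.5) / left_n * (math.pi / 2)
        w[i] = math.sin(math.pi / 2 * math.sin(x) ** 2)
    for i in range(left_end, right_start):          # step 6: flat top
        w[i] = 1.0
    for i in range(right_start, right_end):         # step 7: falling slope
        x = (i - right_start + 0.5) / right_n * (math.pi / 2) + math.pi / 2
        w[i] = math.sin(math.pi / 2 * math.sin(x) ** 2)
    return w
```

For a full long window the result is symmetric, and the slope satisfies the lapping power-complement property w(i)² + w(left_n - 1 - i)² = 1 within the rising half.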

An end-of-packet condition up to this point should be considered an error that discards this packet from the stream. An end of packet condition past this point is to be considered a possible nominal occurrence.

 

4.3.2. floor curve decode

From this point on, we assume our decode context is using mode number [mode_number] from configuration array [vorbis_mode_configurations] and the map number [vorbis_mode_mapping] (specified by the current mode) taken from the mapping configuration array [vorbis_mapping_configurations].

Floor curves are decoded one-by-one in channel order.

For each floor [i] of [audio_channels]

1.
[submap_number] = element [i] of vector [vorbis_mapping_mux]
2.
[floor_number] = element [submap_number] of vector [vorbis_submap_floor]
3.
if the floor type of this floor (vector [vorbis_floor_types] element [floor_number]) is zero then decode the floor for channel [i] according to the subsubsection 6.2.2, “packet decode
4.
if the type of this floor is one then decode the floor for channel [i] according to the subsubsection 7.2.3, “packet decode
5.
save the needed decoded floor information for channel [i] for later synthesis
6.
if the decoded floor returned ’unused’, set vector [no_residue] element [i] to true, else set vector [no_residue] element [i] to false

An end-of-packet condition during floor decode shall result in packet decode zeroing all channel output vectors and skipping to the add/overlap output stage.

 

4.3.3. nonzero vector propagate

A possible result of floor decode is that a specific vector is marked ’unused’ which indicates that that final output vector is all-zero values (and the floor is zero). The residue for that vector is not coded in the stream, save for one complication. If some vectors are used and some are not, channel coupling could result in mixing a zeroed and nonzeroed vector to produce two nonzeroed vectors.

for each [i] from 0 ... [vorbis_mapping_coupling_steps]-1

 

1.
if either the [no_residue] entry for channel ([vorbis_mapping_magnitude] element [i]) or for channel ([vorbis_mapping_angle] element [i]) is set to false, then both must be set to false. Note that an ’unused’ floor has no decoded floor information; it is important that this is remembered at floor curve synthesis time.

 

4.3.4. residue decode

Unlike floors, which are decoded in channel order, the residue vectors are decoded in submap order.

for each submap [i] in order from 0 ... [vorbis_mapping_submaps]-1

 

1.
[ch] = 0
2.
for each channel [j] in order from 0 ... [audio_channels] - 1
a)
if channel [j] is in submap [i] (vector [vorbis_mapping_mux] element [j] is equal to [i])
i.
if vector [no_residue] element [j] is true
A.
vector [do_not_decode_flag] element [ch] is set

else

A.
vector [do_not_decode_flag] element [ch] is unset
ii.
increment [ch]
3.
[residue_number] = vector [vorbis_mapping_submap_residue] element [i]
4.
[residue_type] = vector [vorbis_residue_types] element [residue_number]
5.
decode [ch] vectors using residue [residue_number], according to type [residue_type], also passing vector [do_not_decode_flag] to indicate which vectors in the bundle should not be decoded. Correct per-vector decode length is [n]/2.
6.
[ch] = 0
7.
for each channel [j] in order from 0 ... [audio_channels]-1
a)
if channel [j] is in submap [i] (vector [vorbis_mapping_mux] element [j] is equal to [i])
i.
residue vector for channel [j] is set to decoded residue vector [ch]
ii.
increment [ch]

 

4.3.5. inverse coupling

for each [i] from [vorbis_mapping_coupling_steps]-1 descending to 0

 

1.
[magnitude_vector] = the residue vector for channel (vector [vorbis_mapping_magnitude] element [i])
2.
[angle_vector] = the residue vector for channel (vector [vorbis_mapping_angle] element [i])
3.
for each scalar value [M] in vector [magnitude_vector] and the corresponding scalar value [A] in vector [angle_vector]:
a)
if ([M] is greater than zero)
i.
if ([A] is greater than zero)
A.
[new_M] = [M]
B.
[new_A] = [M]-[A]

else

A.
[new_A] = [M]
B.
[new_M] = [M]+[A]

else

i.
if ([A] is greater than zero)
A.
[new_M] = [M]
B.
[new_A] = [M]+[A]

else

A.
[new_A] = [M]
B.
[new_M] = [M]-[A]
b)
set scalar value [M] in vector [magnitude_vector] to [new_M]
c)
set scalar value [A] in vector [angle_vector] to [new_A]
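The four sign cases above are Vorbis’s square-polar inverse coupling; they can be sketched as a Python helper (the function name is illustrative):

```python
def inverse_couple(magnitude, angle):
    """Recover the two channel residue values from a magnitude/angle
    pair, per the four sign cases in the step list above (section
    4.3.5). Returns the updated magnitude and angle vectors."""
    out_m, out_a = [], []
    for M, A in zip(magnitude, angle):
        if M > 0:
            if A > 0:
                new_M, new_A = M, M - A
            else:
                new_A, new_M = M, M + A
        else:
            if A > 0:
                new_M, new_A = M, M + A
            else:
                new_A, new_M = M, M - A
        out_m.append(new_M)
        out_a.append(new_A)
    return out_m, out_a
```

Note that in two of the four cases the magnitude vector ends up holding the angle-derived value, which is why the step list assigns [new_A] before [new_M] in those branches.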

 

4.3.6. dot product

For each channel, synthesize the floor curve from the decoded floor information, according to packet type. Note that the vector synthesis length for floor computation is [n]/2.

For each channel, multiply each element of the floor curve by each element of that channel’s residue vector. The result is the dot product of the floor and residue vectors for each channel; the produced vectors are the length [n]/2 audio spectrum for each channel.
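In other words, the “dot product” here is an elementwise multiply producing a vector, not a scalar; a minimal sketch:

```python
def apply_floor(floor_curve, residue):
    """Elementwise product of the synthesized floor curve and the decoded
    residue vector; both have length n/2, as does the resulting audio
    spectrum vector."""
    assert len(floor_curve) == len(residue)
    return [f * r for f, r in zip(floor_curve, residue)]
```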

One point is worth mentioning about this dot product; a common mistake in a fixed point implementation might be to assume that a 32 bit fixed-point representation for floor and residue and direct multiplication of the vectors is sufficient for acceptable spectral depth in all cases because it happens to mostly work with the current Xiph.Org reference encoder.

However, floor vector values can span 140dB (24 bits unsigned), and the audio spectrum vector should represent a minimum of 120dB (21 bits with sign), even when output is to a 16 bit PCM device. For the residue vector to represent full scale if the floor is nailed to −140dB, it must be able to span 0 to +140dB. For the residue vector to reach full scale if the floor is nailed at 0dB, it must be able to represent −140dB to +0dB. Thus, in order to handle full range dynamics, a residue vector may span −140dB to +140dB entirely within spec. A 280dB range is approximately 48 bits with sign; thus the residue vector must be able to represent a 48 bit range and the dot product must be able to handle an effective 48 bit times 24 bit multiplication. This range may be achieved using large (64 bit or larger) integers, or implementing a movable binary point representation.

 

4.3.7. inverse MDCT

Convert the audio spectrum vector of each channel back into time domain PCM audio via an inverse Modified Discrete Cosine Transform (MDCT). A detailed description of the MDCT is available in [1]. The window function used for the MDCT is the function described earlier.

 

4.3.8. overlap_add

Windowed MDCT output is overlapped and added with the right hand data of the previous window such that the 3/4 point of the previous window is aligned with the 1/4 point of the current window (as illustrated in paragraph 1.3.2, “Window shape decode (long windows only)”). The overlapped portion produced from overlapping the previous and current frame data is finished data to be returned by the decoder. This data spans from the center of the previous window to the center of the current window. In the case of same-sized windows, the amount of data to return is one-half block consisting of and only of the overlapped portions. When overlapping a short and long window, much of the returned range does not actually overlap. This does not damage transform orthogonality. Pay attention however to returning the correct data range; the amount of data to be returned is:

 

window_blocksize(previous_window)/4 + window_blocksize(current_window)/4

from the center (element windowsize/2) of the previous window to the center (element windowsize/2-1, inclusive) of the current window.

Data is not returned from the first frame; it must be used to ’prime’ the decode engine. The encoder accounts for this priming when calculating PCM offsets; after the first frame, the proper PCM output offset is ’0’ (as no data has been returned yet).
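The returned-sample arithmetic above comes down to a quarter of each block; a small sketch (helper name is illustrative):

```python
def returned_samples(prev_blocksize, cur_blocksize):
    """Finished PCM samples returned after overlapping the previous and
    current windows: one quarter of each block (section 4.3.8), i.e. the
    span from the previous window's center to the current window's
    center."""
    return prev_blocksize // 4 + cur_blocksize // 4
```

For same-sized windows this is exactly half a block; a short window following a long one returns less data than the long/long case.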

 

4.3.9. output channel order

Vorbis I specifies only a channel mapping type 0. In mapping type 0, channel mapping is implicitly defined as follows for standard audio applications. As of revision 16781 (20100113), the specification adds defined channel locations for 6.1 and 7.1 surround. Ordering/location for greater-than-eight channels remains ’left to the implementation’.

These channel orderings refer to order within the encoded stream. It is naturally possible for a decoder to produce output with channels in any order. Any such decoder should explicitly document channel reordering behavior.

 

one channel
the stream is monophonic
two channels
the stream is stereo. channel order: left, right
three channels
the stream is a 1d-surround encoding. channel order: left, center, right
four channels
the stream is quadraphonic surround. channel order: front left, front right, rear left, rear right
five channels
the stream is five-channel surround. channel order: front left, center, front right, rear left, rear right
six channels
the stream is 5.1 surround. channel order: front left, center, front right, rear left, rear right, LFE
seven channels
the stream is 6.1 surround. channel order: front left, center, front right, side left, side right, rear center, LFE
eight channels
the stream is 7.1 surround. channel order: front left, center, front right, side left, side right, rear left, rear right, LFE
greater than eight channels
channel use and order is defined by the application

Applications using Vorbis for dedicated purposes may define channel mapping as seen fit. Future channel mappings (such as three and four channel Ambisonics) will make use of channel mappings other than mapping 0.

5. comment field and header specification

 

5.1. Overview

The Vorbis text comment header is the second (of three) header packets that begin a Vorbis bitstream. It is meant for short text comments, not arbitrary metadata; arbitrary metadata belongs in a separate logical bitstream (usually an XML stream type) that provides greater structure and machine parseability.

The comment field is meant to be used much like someone jotting a quick note on the bottom of a CDR. It should be a little information to remember the disc by and explain it to others; a short, to-the-point text note that need not only be a couple words, but isn’t going to be more than a short paragraph. The essentials, in other words, whatever they turn out to be, eg:

 

Honest Bob and the Factory-to-Dealer-Incentives, “I’m Still Around”, opening for Moxy Früvous, 1997.

 

5.2. Comment encoding

 

5.2.1. Structure

The comment header is logically a list of eight-bit-clean vectors; the number of vectors is bounded to 2^32−1 and the length of each vector is limited to 2^32−1 bytes. The vector length is encoded; the vector contents themselves are not null terminated. In addition to the vector list, there is a single vector for vendor name (also 8 bit clean, length encoded in 32 bits). For example, the 1.0 release of libvorbis set the vendor string to “Xiph.Org libVorbis I 20020717”.

The vector lengths and number of vectors are stored lsb first, according to the bit packing conventions of the vorbis codec. However, since data in the comment header is octet-aligned, they can simply be read as unaligned 32 bit little endian unsigned integers.

The comment header is decoded as follows:

 

1) [vendor_length] = read an unsigned integer of 32 bits
2) [vendor_string] = read a UTF-8 vector as [vendor_length] octets
3) [user_comment_list_length] = read an unsigned integer of 32 bits
4) iterate [user_comment_list_length] times {
5)   [length] = read an unsigned integer of 32 bits
6)   this iteration’s user comment = read a UTF-8 vector as [length] octets
   }
7) [framing_bit] = read a single bit as boolean
8) if ( [framing_bit] unset or end-of-packet ) then ERROR
9) done.
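Because every length field is an octet-aligned little-endian 32 bit integer, the decode steps translate almost directly to Python (an illustrative parser, not normative; `data` is assumed to start at the vendor length field, after the common header prefix):

```python
import struct

def parse_comment_header(data):
    """Parse the comment header body per steps 1-9 above. Raises
    ValueError on a truncated packet or an unset framing bit."""
    pos = 0

    def read_u32():
        nonlocal pos
        if pos + 4 > len(data):
            raise ValueError("end of packet")
        (v,) = struct.unpack_from("<I", data, pos)  # unaligned LE uint32
        pos += 4
        return v

    vendor_length = read_u32()
    vendor = data[pos:pos + vendor_length].decode("utf-8")
    pos += vendor_length
    comments = []
    for _ in range(read_u32()):  # user_comment_list_length iterations
        length = read_u32()
        comments.append(data[pos:pos + length].decode("utf-8"))
        pos += length
    # framing bit: one bit read lsb-first, i.e. bit 0 of the next octet
    if pos >= len(data) or not (data[pos] & 1):
        raise ValueError("framing bit unset or end of packet")
    return vendor, comments
```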

 

5.2.2. Content vector format

The comment vectors are structured similarly to a UNIX environment variable. That is, comment fields consist of a field name and a corresponding value and look like:

 

 

comment[0]="ARTIST=me";
comment[1]="TITLE=the sound of Vorbis";

The field name is case-insensitive and may consist of ASCII 0x20 through 0x7D, 0x3D (’=’) excluded. ASCII 0x41 through 0x5A inclusive (characters A-Z) is to be considered equivalent to ASCII 0x61 through 0x7A inclusive (characters a-z).

The field name is immediately followed by ASCII 0x3D (’=’); this equals sign is used to terminate the field name.

0x3D is followed by 8 bit clean UTF-8 encoded value of the field contents to the end of the field.
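A sketch of the name/value split described above (helper name is illustrative; field names are ASCII-only per the preceding rules, so `lower()` is a safe case fold):

```python
def split_comment(comment):
    """Split a comment vector at the first '='. The field name is
    case-insensitive and folded to lowercase; the value is everything
    after the first '=', which may itself contain '=' characters."""
    name, _, value = comment.partition("=")
    return name.lower(), value
```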

Field names

Below is a proposed, minimal list of standard field names with a description of intended use. No single or group of field names is mandatory; a comment header may contain one, all or none of the names in this list.

 

TITLE
Track/Work name
VERSION
The version field may be used to differentiate multiple versions of the same track title in a single collection. (e.g. remix info)
ALBUM
The collection name to which this track belongs
TRACKNUMBER
The track number of this piece if part of a specific larger collection or album
ARTIST
The artist generally considered responsible for the work. In popular music this is usually the performing band or singer. For classical music it would be the composer. For an audio book it would be the author of the original text.
PERFORMER
The artist(s) who performed the work. In classical music this would be the conductor, orchestra, soloists. In an audio book it would be the actor who did the reading. In popular music this is typically the same as the ARTIST and is omitted.
COPYRIGHT
Copyright attribution, e.g., ’2001 Nobody’s Band’ or ’1999 Jack Moffitt’
LICENSE
License information, eg, ’All Rights Reserved’, ’Any Use Permitted’, a URL to a license such as a Creative Commons license (”www.creativecommons.org/blahblah/license.html”) or the EFF Open Audio License (’distributed under the terms of the Open Audio License. see http://www.eff.org/IP/Open_licenses/eff_oal.html for details’), etc.
ORGANIZATION
Name of the organization producing the track (i.e. the ’record label’)
DESCRIPTION
A short text description of the contents
GENRE
A short text indication of music genre
DATE
Date the track was recorded
LOCATION
Location where track was recorded
CONTACT
Contact information for the creators or distributors of the track. This could be a URL, an email address, the physical address of the producing label.
ISRC
International Standard Recording Code for the track; see the ISRC intro page for more information on ISRC numbers.

Implications

Field names should not be ’internationalized’; this is a concession to simplicity, not an attempt to exclude the majority of the world that doesn’t speak English. Field contents, however, use the UTF-8 character encoding to allow easy representation of any language.

We have the length of the entirety of the field and restrictions on the field name so that the field name is bounded in a known way. Thus we also have the length of the field contents.

Individual ’vendors’ may use non-standard field names within reason. The proper use of comment fields should be clear through context at this point. Abuse will be discouraged.

There is no vendor-specific prefix to ’nonstandard’ field names. Vendors should make some effort to avoid arbitrarily polluting the common namespace. We will generally collect the more useful tags here to help with standardization.

Field names are not required to be unique (occur once) within a comment header. As an example, assume a track was recorded by three well known artists; the following is permissible, and encouraged:

 

 

ARTIST=Dizzy Gillespie
ARTIST=Sonny Rollins
ARTIST=Sonny Stitt

 

5.2.3. Encoding

The comment header comprises the entirety of the second bitstream header packet. Unlike the first bitstream header packet, it is not generally the only packet on the second page and may not be restricted to within the second bitstream page. The length of the comment header packet is (practically) unbounded. The comment header packet is not optional; it must be present in the bitstream even if it is effectively empty.

The comment header is encoded as follows (as per Ogg’s standard bitstream mapping which renders least-significant-bit of the word to be coded into the least significant available bit of the current bitstream octet first):

 

1.
Vendor string length (32 bit unsigned quantity specifying number of octets)
2.
Vendor string ([vendor string length] octets coded from beginning of string to end of string, not null terminated)
3.
Number of comment fields (32 bit unsigned quantity specifying number of fields)
4.
Comment field 0 length (if [Number of comment fields] > 0; 32 bit unsigned quantity specifying number of octets)
5.
Comment field 0 ([Comment field 0 length] octets coded from beginning of string to end of string, not null terminated)
6.
Comment field 1 length (if [Number of comment fields] > 1...)...

This is actually somewhat easier to describe in code; implementation of the above can be found in vorbis/lib/info.c, _vorbis_pack_comment() and _vorbis_unpack_comment().

6. Floor type 0 setup and decode

 

6.1. Overview

Vorbis floor type zero uses Line Spectral Pair (LSP, also alternately known as Line Spectral Frequency or LSF) representation to encode a smooth spectral envelope curve as the frequency response of the LSP filter. This representation is equivalent to a traditional all-pole infinite impulse response filter as would be used in linear predictive coding; LSP representation may be converted to LPC representation and vice-versa.

 

6.2. Floor 0 format

Floor zero configuration consists of six integer fields and a list of VQ codebooks for use in coding/decoding the LSP filter coefficient values used by each frame.

 

6.2.1. header decode

Configuration information for instances of floor zero is decoded from the codec setup header (third packet). Configuration decode proceeds as follows:

 

1) [floor0_order] = read an unsigned integer of 8 bits
2) [floor0_rate] = read an unsigned integer of 16 bits
3) [floor0_bark_map_size] = read an unsigned integer of 16 bits
4) [floor0_amplitude_bits] = read an unsigned integer of six bits
5) [floor0_amplitude_offset] = read an unsigned integer of eight bits
6) [floor0_number_of_books] = read an unsigned integer of four bits and add 1
7) array [floor0_book_list] = read a list of [floor0_number_of_books] unsigned integers of eight bits each

An end-of-packet condition during any of these bitstream reads renders this stream undecodable. In addition, any element of the array [floor0_book_list] that is greater than the maximum codebook number for this bitstream is an error condition that also renders the stream undecodable.

 

6.2.2. packet decode

Extracting a floor0 curve from an audio packet consists of first decoding the curve amplitude and [floor0_order] LSP coefficient values from the bitstream, and then computing the floor curve, which is defined as the frequency response of the decoded LSP filter.

Packet decode proceeds as follows:

1) [amplitude] = read an unsigned integer of [floor0_amplitude_bits] bits
2) if ( [amplitude] is greater than zero ) {
3)   [coefficients] is an empty, zero length vector
4)   [booknumber] = read an unsigned integer of ilog( [floor0_number_of_books] ) bits
5)   if ( [booknumber] is greater than the highest numbered decode codebook ) then packet is undecodable
6)   [last] = zero
7)   vector [temp_vector] = read vector from bitstream using codebook number [floor0_book_list] element [booknumber] in VQ context
8)   add the scalar value [last] to each scalar in vector [temp_vector]
9)   [last] = the value of the last scalar in vector [temp_vector]
10)  concatenate [temp_vector] onto the end of the [coefficients] vector
11)  if ( length of vector [coefficients] is less than [floor0_order] ) continue at step 6
   }
12) done.

Take note of the following properties of decode:

  • An [amplitude] value of zero must result in a return code that indicates this channel is unused in this frame (the output of the channel will be all-zeroes in synthesis). Several later stages of decode don’t occur for an unused channel.
  • An end-of-packet condition during decode should be considered a nominal occurrence; if end-of-packet is reached during any read operation above, floor decode is to return ’unused’ status as if the [amplitude] value had read zero at the beginning of decode.
  • The book number used for decode can, in fact, be stored in the bitstream in ilog( [floor0_number_of_books] - 1 ) bits. Nevertheless, the above specification is correct and values greater than the maximum possible book value are reserved.
  • The number of scalars read into the vector [coefficients] may be greater than [floor0_order], the number actually required for curve computation. For example, if the VQ codebook used for the floor currently being decoded has a [codebook_dimensions] value of three and [floor0_order] is ten, the only way to fill all the needed scalars in [coefficients] is to read a total of twelve scalars as four vectors of three scalars each. This is not an error condition, and care must be taken not to allow a buffer overflow in decode. The extra values are not used and may be ignored or discarded.

 

6.2.3. curve computation

Given an [amplitude] integer and [coefficients] vector from packet decode as well as the [floor0_order], [floor0_rate], [floor0_bark_map_size], [floor0_amplitude_bits] and [floor0_amplitude_offset] values from floor setup, and an output vector size [n] specified by the decode process, we compute a floor output vector.

If the value [amplitude] is zero, the return value is a length [n] vector with all-zero scalars. Otherwise, begin by assuming the following definitions for the given vector to be synthesized:

map([i]) = min( [floor0_bark_map_size] - 1, foobar )   for [i] in [0, [n]-1]
map([i]) = -1                                          for [i] = [n]

where

foobar = floor( bark( ([floor0_rate] * [i]) / (2 * [n]) ) * [floor0_bark_map_size] / bark(0.5 * [floor0_rate]) )

and

bark(x) = 13.1 * arctan(0.00074 * x) + 2.24 * arctan(0.0000000185 * x^2) + 0.0001 * x

The above is used to synthesize the LSP curve on a Bark-scale frequency axis, then map the result to a linear-scale frequency axis. Similarly, the below calculation synthesizes the output LSP curve [output] on a log (dB) amplitude scale, mapping it to linear amplitude in the last step:
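The map construction can be sketched in Python (illustrative helper names; parameters follow the bracketed configuration values):

```python
import math

def bark(x):
    """bark(x) = 13.1*arctan(.00074*x) + 2.24*arctan(.0000000185*x^2) + .0001*x"""
    return (13.1 * math.atan(0.00074 * x)
            + 2.24 * math.atan(1.85e-8 * x * x)
            + 0.0001 * x)

def bark_map(n, floor0_rate, floor0_bark_map_size):
    """Build the Bark-scale index map used to warp the synthesized LSP
    curve onto a linear frequency axis; element [n] is the -1 sentinel
    that terminates the iteration in the synthesis loop."""
    scale = floor0_bark_map_size / bark(0.5 * floor0_rate)
    m = [min(floor0_bark_map_size - 1,
             math.floor(bark((floor0_rate * i) / (2 * n)) * scale))
         for i in range(n)]
    m.append(-1)
    return m
```

Because bark() is monotonic, consecutive map entries are nondecreasing; runs of equal entries are what the later "iteration_condition" test exploits to reuse one computed floor value across several output bins.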

 

1.
[i] = 0
2.
[ω] = π * map element [i] / [floor0_bark_map_size]
3.
if ( [floor0_order] is odd )
a)
calculate [p] and [q] according to:
p = (1 - cos^2(ω)) * product over [j] = 0 ... ([floor0_order]-3)/2 of 4 * (cos([coefficients] element [2j+1]) - cos(ω))^2

q = (1/4) * product over [j] = 0 ... ([floor0_order]-1)/2 of 4 * (cos([coefficients] element [2j]) - cos(ω))^2

else [floor0_order] is even

b)
calculate [p] and [q] according to:
p = ((1 - cos(ω)) / 2) * product over [j] = 0 ... ([floor0_order]-2)/2 of 4 * (cos([coefficients] element [2j+1]) - cos(ω))^2

q = ((1 + cos(ω)) / 2) * product over [j] = 0 ... ([floor0_order]-2)/2 of 4 * (cos([coefficients] element [2j]) - cos(ω))^2
4.
calculate [linear_floor_value] according to:
[linear_floor_value] = exp( 0.11512925 * ( ([amplitude] * [floor0_amplitude_offset]) / ((2^[floor0_amplitude_bits] - 1) * sqrt(p + q)) - [floor0_amplitude_offset] ) )

 

5.
[iteration_condition] = map element [i]
6.
[output] element [i] = [linear_floor_value]
7.
increment [i]
8.
if ( map element [i] is equal to [iteration_condition] ) continue at step 5
9.
if ( [i] is less than [n] ) continue at step 2
10.
done

7. Floor type 1 setup and decode

 

7.1. Overview

Vorbis floor type one uses a piecewise straight-line representation to encode a spectral envelope curve. The representation plots this curve mechanically on a linear frequency axis and a logarithmic (dB) amplitude axis. The integer plotting algorithm used is similar to Bresenham’s algorithm.

 

7.2. Floor 1 format

 

7.2.1. model

Floor type one represents a spectral curve as a series of line segments. Synthesis constructs a floor curve using iterative prediction in a process roughly equivalent to the following simplified description:

  • the first line segment (base case) is a logical line spanning from x_0,y_0 to x_1,y_1 where in the base case x_0=0 and x_1=[n], the full range of the spectral floor to be computed.
  • the induction step chooses a point x_new within an existing logical line segment and produces a y_new value at that point computed from the existing line’s y value at x_new (as plotted by the line) and a difference value decoded from the bitstream packet.
  • floor computation produces two new line segments, one running from x_0,y_0 to x_new,y_new and one from x_new,y_new to x_1,y_1. This step is performed logically even if y_new represents no change to the amplitude value at x_new so that later refinement is additionally bounded at x_new.
  • the induction step repeats, using a list of x values specified in the codec setup header at floor 1 initialization time. Computation is completed at the end of the x value list.

Consider the following example, with values chosen for ease of understanding rather than representing typical configuration:

For the below example, we assume a floor setup with an [n] of 128. The list of selected X values in increasing order is 0,16,32,48,64,80,96,112 and 128. In list order, the values interleave as 0, 128, 64, 32, 96, 16, 48, 80 and 112. The corresponding list-order Y values as decoded from an example packet are 110, 20, -5, -45, 0, -25, -10, 30 and -10. We compute the floor in the following way, beginning with the first line:

Figure 7: graph of example floor

We now draw new logical lines to reflect the correction to new_Y, and iterate for X positions 32 and 96:

Figure 8: graph of example floor

Although the new Y value at X position 96 is unchanged, it is still used later as an endpoint for further refinement. From here on, the pattern should be clear; we complete the floor computation as follows:

Figure 9: graph of example floor

Figure 10: graph of example floor

A more efficient algorithm with carefully defined integer rounding behavior is used for actual decode, as described later. The actual algorithm splits Y value computation and line plotting into two steps with modifications to the above algorithm to eliminate noise accumulation through integer roundoff/truncation.

 

7.2.2. header decode

A list of floor X values is stored in the packet header in interleaved format (used in list order during packet decode and synthesis). This list is split into partitions, and each partition is assigned to a partition class. X positions 0 and [n] are implicit and do not belong to an explicit partition or partition class.

A partition class consists of a representation vector width (the number of Y values which the partition class encodes at once), a ’subclass’ value representing the number of alternate entropy books the partition class may use in representing Y values, the list of [subclass] books and a master book used to encode which alternate books were chosen for representation in a given packet. The master/subclass mechanism is meant to be used as a flexible representation cascade while still using codebooks only in a scalar context.

 

1) [floor1_partitions] = read 5 bits as unsigned integer
2) [maximum_class] = -1
3) iterate [i] over the range 0 ... [floor1_partitions]-1 {

4)   vector [floor1_partition_class_list] element [i] = read 4 bits as unsigned integer

   }

5) [maximum_class] = largest integer scalar value in vector [floor1_partition_class_list]
6) iterate [i] over the range 0 ... [maximum_class] {

7)   vector [floor1_class_dimensions] element [i] = read 3 bits as unsigned integer and add 1
8)   vector [floor1_class_subclasses] element [i] = read 2 bits as unsigned integer
9)   if ( vector [floor1_class_subclasses] element [i] is nonzero ) {

10)    vector [floor1_class_masterbooks] element [i] = read 8 bits as unsigned integer

     }

11)  iterate [j] over the range 0 ... (2 exponent [floor1_class_subclasses] element [i]) - 1 {

12)    array [floor1_subclass_books] element [i],[j] = read 8 bits as unsigned integer and subtract one
     }
   }

13) [floor1_multiplier] = read 2 bits as unsigned integer and add one
14) [rangebits] = read 4 bits as unsigned integer
15) vector [floor1_X_list] element [0] = 0
16) vector [floor1_X_list] element [1] = 2 exponent [rangebits]
17) [floor1_values] = 2
18) iterate [i] over the range 0 ... [floor1_partitions]-1 {

19)  [current_class_number] = vector [floor1_partition_class_list] element [i]
20)  iterate [j] over the range 0 ... ([floor1_class_dimensions] element [current_class_number])-1 {
21)    vector [floor1_X_list] element ([floor1_values]) = read [rangebits] bits as unsigned integer
22)    increment [floor1_values] by one
     }
   }

23) done

An end-of-packet condition while reading any aspect of a floor 1 configuration during setup renders a stream undecodable. In addition, a [floor1_class_masterbooks] or [floor1_subclass_books] scalar element greater than the highest numbered codebook configured in this stream is an error condition that renders the stream undecodable. Vector [floor1_X_list] is limited to a maximum length of 65 elements; a setup indicating more than 65 total elements (including elements 0 and 1 set prior to the read loop) renders the stream undecodable. All vector [floor1_X_list] element values must be unique within the vector; a non-unique value renders the stream undecodable.

 

7.2.3. packet decode

Packet decode begins by checking the [nonzero] flag:

 

1) [nonzero] = read 1 bit as boolean

If [nonzero] is unset, that indicates this channel contained no audio energy in this frame. Decode immediately returns a status indicating this floor curve (and thus this channel) is unused this frame. (A return status of ’unused’ is different from decoding a floor that has all points set to minimum representation amplitude, which happens to be approximately −140dB.)

Assuming [nonzero] is set, decode proceeds as follows:

 

1) [range] = vector { 256, 128, 86, 64 } element ([floor1_multiplier]-1)
2) vector [floor1_Y] element [0] = read ilog([range]-1) bits as unsigned integer
3) vector [floor1_Y] element [1] = read ilog([range]-1) bits as unsigned integer
4) [offset] = 2
5) iterate [i] over the range 0 ... [floor1_partitions]-1 {

6)   [class] = vector [floor1_partition_class] element [i]
7)   [cdim] = vector [floor1_class_dimensions] element [class]
8)   [cbits] = vector [floor1_class_subclasses] element [class]
9)   [csub] = (2 exponent [cbits])-1
10)  [cval] = 0
11)  if ( [cbits] is greater than zero ) {

12)    [cval] = read from packet using codebook number (vector [floor1_class_masterbooks] element [class]) in scalar context
     }

13)  iterate [j] over the range 0 ... [cdim]-1 {

14)    [book] = array [floor1_subclass_books] element [class],([cval] bitwise AND [csub])
15)    [cval] = [cval] right shifted [cbits] bits
16)    if ( [book] is not less than zero ) {

17)      vector [floor1_Y] element ([j]+[offset]) = read from packet using codebook [book] in scalar context

       } else [book] is less than zero {

18)      vector [floor1_Y] element ([j]+[offset]) = 0

       }
     }

19)  [offset] = [offset] + [cdim]
   }

20) done

An end-of-packet condition during curve decode should be considered a nominal occurrence; if end-of-packet is reached during any read operation above, floor decode is to return ’unused’ status as if the [nonzero] flag had been unset at the beginning of decode.

Vector [floor1_Y] contains the values from packet decode needed for floor 1 synthesis.

 

7.2.4. curve computation

Curve computation is split into two logical steps; the first step derives final Y amplitude values from the encoded, wrapped difference values taken from the bitstream. The second step plots the curve lines. Also, although zero-difference values are used in the iterative prediction to find final Y values, these points are conditionally skipped during final line computation in step two. Skipping zero-difference values allows a smoother line fit.

Although some aspects of the below algorithm look like inconsequential optimizations, implementors are warned to follow the details closely. Deviation from implementing a strictly equivalent algorithm can result in serious decoding errors.

Additional note: Although [floor1_final_Y] values in the prediction loop and at the end of step 1 are inherently limited by the prediction algorithm to [0, [range]), it is possible to abuse the setup and codebook machinery to produce negative or over-range results. We suggest that decoder implementations guard the values in vector [floor1_final_Y] by clamping each element to [0, [range]) after step 1. Variants of this suggestion are acceptable as valid floor1 setups cannot produce out of range values.

 

step 1: amplitude value synthesis

Unwrap the always-positive-or-zero values read from the packet into +/- difference values, then apply to line prediction.

 

 1) [range] = vector { 256, 128, 86, 64 } element ([floor1_multiplier]-1)
 2) vector [floor1_step2_flag] element [0] = set
 3) vector [floor1_step2_flag] element [1] = set
 4) vector [floor1_final_Y] element [0] = vector [floor1_Y] element [0]
 5) vector [floor1_final_Y] element [1] = vector [floor1_Y] element [1]
 6) iterate [i] over the range 2 ... [floor1_values]-1 {

      7) [low_neighbor_offset] = low_neighbor([floor1_X_list],[i])
      8) [high_neighbor_offset] = high_neighbor([floor1_X_list],[i])

      9) [predicted] = render_point( vector [floor1_X_list] element [low_neighbor_offset],
                                     vector [floor1_final_Y] element [low_neighbor_offset],
                                     vector [floor1_X_list] element [high_neighbor_offset],
                                     vector [floor1_final_Y] element [high_neighbor_offset],
                                     vector [floor1_X_list] element [i] )

     10) [val] = vector [floor1_Y] element [i]
     11) [highroom] = [range] - [predicted]
     12) [lowroom] = [predicted]
     13) if ( [highroom] is less than [lowroom] ) {

           14) [room] = [highroom] * 2

         } else [highroom] is not less than [lowroom] {

           15) [room] = [lowroom] * 2

         }

     16) if ( [val] is nonzero ) {

           17) vector [floor1_step2_flag] element [low_neighbor_offset] = set
           18) vector [floor1_step2_flag] element [high_neighbor_offset] = set
           19) vector [floor1_step2_flag] element [i] = set
           20) if ( [val] is greater than or equal to [room] ) {

                 21) if ( [highroom] is greater than [lowroom] ) {

                       22) vector [floor1_final_Y] element [i] = [val] - [lowroom] + [predicted]

                     } else [highroom] is not greater than [lowroom] {

                       23) vector [floor1_final_Y] element [i] = [predicted] - [val] + [highroom] - 1

                     }

               } else [val] is less than [room] {

                 24) if ([val] is odd) {

                       25) vector [floor1_final_Y] element [i] =
                           [predicted] - (([val] + 1) divided by 2 using integer division)

                     } else [val] is even {

                       26) vector [floor1_final_Y] element [i] =
                           [predicted] + ([val] / 2 using integer division)

                     }

               }

         } else [val] is zero {

           27) vector [floor1_step2_flag] element [i] = unset
           28) vector [floor1_final_Y] element [i] = [predicted]

         }

   }

29) done
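The heart of step 1 is unwrapping the always-positive packet value into a signed offset centered on the predicted point (steps 11 through 26 above). The following Python sketch isolates that unwrap for a single point; the function name `unwrap_post` is ours, not from the specification.

```python
def unwrap_post(predicted, val, rng):
    """Map the always-positive packet value onto a final Y value centered
    on [predicted], mirroring steps 11-26 of the step-1 pseudocode."""
    highroom = rng - predicted          # headroom above the prediction
    lowroom = predicted                 # headroom below the prediction
    room = 2 * min(highroom, lowroom)   # symmetric zone around [predicted]
    if val >= room:
        # Asymmetric tail: remaining values all land on the larger side.
        if highroom > lowroom:
            return val - lowroom + predicted
        return predicted - val + highroom - 1
    # Symmetric zone: odd values go below the prediction, even values above.
    if val % 2 == 1:
        return predicted - (val + 1) // 2
    return predicted + val // 2

# With range 256 and predicted 100, decoded values cover all of [0, 256):
print([unwrap_post(100, v, 256) for v in (0, 1, 2, 199, 200, 255)])
# [100, 99, 101, 0, 200, 255]
```

Note how every value in [0, [range]) is reachable exactly once, which is why the prediction loop cannot by itself produce out-of-range results.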
step 2: curve synthesis

Curve synthesis generates a return vector [floor] of length [n] (where [n] is provided by the decode process calling to floor decode). Floor 1 curve synthesis makes use of the [floor1_X_list], [floor1_final_Y] and [floor1_step2_flag] vectors, as well as [floor1_multiplier] and [floor1_values] values.

Decode begins by sorting the scalars from vectors [floor1_X_list], [floor1_final_Y] and [floor1_step2_flag] together into new vectors [floor1_X_list]', [floor1_final_Y]' and [floor1_step2_flag]' according to ascending sort order of the values in [floor1_X_list]. That is, sort the values of [floor1_X_list] and then apply the same permutation to elements of the other two vectors so that the X, Y and step2_flag values still match.
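The joint sort described above amounts to deriving one permutation from the X values and applying it to all three vectors; a brief Python sketch (helper name ours):

```python
def sort_floor1_vectors(x_list, final_y, step2_flag):
    # One permutation, derived from ascending X order, applied to all three
    # vectors so that X, Y and step2_flag values stay matched.
    order = sorted(range(len(x_list)), key=lambda i: x_list[i])
    return ([x_list[i] for i in order],
            [final_y[i] for i in order],
            [step2_flag[i] for i in order])

xs, ys, flags = sort_floor1_vectors([0, 128, 64, 32], [10, 40, 20, 30],
                                    [True, True, False, True])
print(xs)  # [0, 32, 64, 128]
print(ys)  # [10, 30, 20, 40]
```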

Then compute the final curve in one pass:

 

 1) [hx] = 0
 2) [lx] = 0
 3) [ly] = vector [floor1_final_Y]' element [0] * [floor1_multiplier]
 4) iterate [i] over the range 1 ... [floor1_values]-1 {

      5) if ( [floor1_step2_flag]' element [i] is set ) {

           6) [hy] = [floor1_final_Y]' element [i] * [floor1_multiplier]
           7) [hx] = [floor1_X_list]' element [i]
           8) render_line( [lx], [ly], [hx], [hy], [floor] )
           9) [lx] = [hx]
          10) [ly] = [hy]
         }
    }

11) if ( [hx] is less than [n] ) {

      12) render_line( [hx], [hy], [n], [hy], [floor] )

    }

13) if ( [hx] is greater than [n] ) {

      14) truncate vector [floor] to [n] elements

    }

15) for each scalar in vector [floor], perform a lookup substitution using
    the scalar value from [floor] as an offset into the vector [floor1_inverse_dB_static_table]

16) done

8. Residue setup and decode

 

8.1. Overview

A residue vector represents the fine detail of the audio spectrum of one channel in an audio frame after the encoder subtracts the floor curve and performs any channel coupling. A residue vector may represent spectral lines, spectral magnitude, spectral phase or hybrids as mixed by channel coupling. The exact semantic content of the vector does not matter to the residue abstraction.

Whatever the exact qualities, the Vorbis residue abstraction codes the residue vectors into the bitstream packet, and then reconstructs the vectors during decode. Vorbis makes use of three different encoding variants (numbered 0, 1 and 2) of the same basic vector encoding abstraction.

 

8.2. Residue format

Residue format partitions each vector in the vector bundle into chunks, classifies each chunk, encodes the chunk classifications and finally encodes the chunks themselves using the specific VQ arrangement defined for each selected classification. The exact interleaving and partitioning vary by residue encoding number; however, the high-level process used to classify and encode the residue vector is the same in all three variants.

A set of coded residue vectors are all of the same length. High level coding structure, ignoring for the moment exactly how a partition is encoded and simply trusting that it is, is as follows:

  • Each vector is partitioned into multiple equal sized chunks according to configuration specified. If we have a vector size of n, a partition size residue_partition_size, and a total of ch residue vectors, the total number of partitioned chunks coded is n/residue_partition_size*ch. It is important to note that the integer division truncates. In the below example, we assume an example residue_partition_size of 8.
  • Each partition in each vector has a classification number that specifies which of multiple configured VQ codebook setups are used to decode that partition. The classification numbers of each partition can be thought of as forming a vector in their own right, as in the illustration below. Just as the residue vectors are coded in grouped partitions to increase encoding efficiency, the classification vector is also partitioned into chunks. The integer elements of each scalar in a classification chunk are built into a single scalar that represents the classification numbers in that chunk. In the below example, the classification codeword encodes two classification numbers.
  • The values in a residue vector may be encoded monolithically in a single pass through the residue vector, but more often efficient codebook design dictates that each vector is encoded as the additive sum of several passes through the residue vector using more than one VQ codebook. Thus, each residue value potentially accumulates values from multiple decode passes. The classification value associated with a partition is the same in each pass, thus the classification codeword is coded only in the first pass.



Figure 11: illustration of residue vector format

 

8.3. residue 0

Residue 0 and 1 differ only in the way the values within a residue partition are interleaved during partition encoding (visually treated as a black box–or cyan box or brown box–in the above figure).

Residue encoding 0 interleaves VQ encoding according to the dimension of the codebook used to encode a partition in a specific pass. The dimension of the codebook need not be the same in multiple passes, however the partition size must be an even multiple of the codebook dimension.

As an example, assume a partition vector of size eight, to be encoded by residue 0 using codebook sizes of 8, 4, 2 and 1:

 

original residue vector: [ 0 1 2 3 4 5 6 7 ]

codebook dimensions = 8  encoded as: [ 0 1 2 3 4 5 6 7 ]

codebook dimensions = 4  encoded as: [ 0 2 4 6 ], [ 1 3 5 7 ]

codebook dimensions = 2  encoded as: [ 0 4 ], [ 1 5 ], [ 2 6 ], [ 3 7 ]

codebook dimensions = 1  encoded as: [ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ]

It is worth mentioning at this point that no configurable value in the residue coding setup is restricted to a power of two.
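The interleave pattern above can be reproduced directly: with partition size n and codebook dimension dim, there are step = n/dim codewords, and codeword i covers elements i, i+step, i+2*step, and so on. A short Python check (helper name ours):

```python
def residue0_groups(partition, dim):
    # Elements covered by each VQ codeword under residue 0 interleave:
    # codeword i gathers partition[i], partition[i+step], ...
    step = len(partition) // dim
    return [[partition[i + j * step] for j in range(dim)] for i in range(step)]

v = [0, 1, 2, 3, 4, 5, 6, 7]
print(residue0_groups(v, 4))  # [[0, 2, 4, 6], [1, 3, 5, 7]]
print(residue0_groups(v, 2))  # [[0, 4], [1, 5], [2, 6], [3, 7]]
```

This reproduces the codeword groupings in the example table above.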

 

8.4. residue 1

Residue 1 does not interleave VQ encoding. It represents partition vector scalars in order. As with residue 0, however, partition length must be an integer multiple of the codebook dimension, although dimension may vary from pass to pass.

As an example, assume a partition vector of size eight, to be encoded by residue 1 using codebook sizes of 8, 4, 2 and 1:

 

original residue vector: [ 0 1 2 3 4 5 6 7 ]

codebook dimensions = 8  encoded as: [ 0 1 2 3 4 5 6 7 ]

codebook dimensions = 4  encoded as: [ 0 1 2 3 ], [ 4 5 6 7 ]

codebook dimensions = 2  encoded as: [ 0 1 ], [ 2 3 ], [ 4 5 ], [ 6 7 ]

codebook dimensions = 1  encoded as: [ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ]

 

8.5. residue 2

Residue type two can be thought of as a variant of residue type 1. Rather than encoding multiple passed-in vectors as in residue type 1, the ch passed in vectors of length n are first interleaved and flattened into a single vector of length ch*n. Encoding then proceeds as in type 1. Decoding is as in type 1 with decode interleave reversed. If operating on a single vector to begin with, residue type 1 and type 2 are equivalent.
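The interleave and its inverse are simple index transforms: element (i*ch + j) of the flattened vector comes from element i of channel vector j. A Python sketch (function names ours):

```python
def interleave(vectors):
    # ch vectors of length n -> one vector of length ch*n, with
    # flattened element (i*ch + j) taken from vector j, element i.
    ch, n = len(vectors), len(vectors[0])
    return [vectors[j][i] for i in range(n) for j in range(ch)]

def deinterleave(v, ch):
    # Inverse transform: recover the ch per-channel vectors.
    return [v[j::ch] for j in range(ch)]

left, right = [0, 2, 4], [1, 3, 5]
flat = interleave([left, right])
print(flat)                   # [0, 1, 2, 3, 4, 5]
print(deinterleave(flat, 2))  # [[0, 2, 4], [1, 3, 5]]
```

With ch = 1 both transforms are the identity, which is why residue types 1 and 2 coincide for a single vector.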



Figure 12: illustration of residue type 2

 

8.6. Residue decode

 

8.6.1. header decode

Header decode for all three residue types is identical.

1) [residue_begin] = read 24 bits as unsigned integer
2) [residue_end] = read 24 bits as unsigned integer
3) [residue_partition_size] = read 24 bits as unsigned integer and add one
4) [residue_classifications] = read 6 bits as unsigned integer and add one
5) [residue_classbook] = read 8 bits as unsigned integer

[residue_begin] and [residue_end] select the specific sub-portion of each vector that is actually coded; they implement something akin to a bandpass where, for coding purposes, the vector effectively begins at element [residue_begin] and ends at [residue_end]. Preceding and following values in the unpacked vectors are zeroed. Note that for residue type 2, these values as well as [residue_partition_size] apply to the interleaved vector, not the individual vectors before interleave. [residue_partition_size] is as explained above, [residue_classifications] is the number of possible classifications to which a partition can belong, and [residue_classbook] is the codebook number used to code classification codewords. The number of dimensions in book [residue_classbook] determines how many classification values are grouped into a single classification codeword. Note that the number of entries and dimensions in book [residue_classbook], along with [residue_classifications], overdetermines the possible number of classification codewords. If [residue_classifications]^[residue_classbook].dimensions exceeds [residue_classbook].entries, the bitstream should be regarded as undecodable.

Next we read a bitmap pattern that specifies which partition classes code values in which passes.

 

1) iterate [i] over the range 0 ... [residue_classifications]-1 {

     2) [high_bits] = 0
     3) [low_bits] = read 3 bits as unsigned integer
     4) [bitflag] = read one bit as boolean
     5) if ( [bitflag] is set ) then [high_bits] = read five bits as unsigned integer
     6) vector [residue_cascade] element [i] = [high_bits] * 8 + [low_bits]
   }
7) done

Finally, we read in a list of book numbers, each corresponding to specific bit set in the cascade bitmap. We loop over the possible codebook classifications and the maximum possible number of encoding stages (8 in Vorbis I, as constrained by the elements of the cascade bitmap being eight bits):

 

1) iterate [i] over the range 0 ... [residue_classifications]-1 {

     2) iterate [j] over the range 0 ... 7 {

          3) if ( vector [residue_cascade] element [i] bit [j] is set ) {

               4) array [residue_books] element [i][j] = read 8 bits as unsigned integer

             } else {

               5) array [residue_books] element [i][j] = unused

             }
        }
   }

6) done

An end-of-packet condition at any point in header decode renders the stream undecodable. In addition, any codebook number greater than the maximum numbered codebook set up in this stream also renders the stream undecodable. All codebooks in array [residue_books] are required to have a value mapping. The presence of a codebook in array [residue_books] without a value mapping (maptype equals zero) renders the stream undecodable.

 

8.6.2. packet decode

Format 0 and 1 packet decode is identical except for specific partition interleave. Format 2 packet decode can be built out of the format 1 decode process. Thus we describe first the decode infrastructure identical to all three formats.

In addition to configuration information, the residue decode process is passed the number of vectors in the submap bundle and a vector of flags indicating if any of the vectors are not to be decoded. If the passed in number of vectors is 3 and vector number 1 is marked 'do not decode', decode skips vector 1 during the decode loop. However, even 'do not decode' vectors are allocated and zeroed.

Depending on the values of [residue_begin] and [residue_end], it is obvious that the encoded portion of a residue vector may be the entire possible residue vector or some other strict subset of the actual residue vector size with zero padding at either uncoded end. However, it is also possible to set [residue_begin] and [residue_end] to specify a range partially or wholly beyond the maximum vector size. Before beginning residue decode, limit [residue_begin] and [residue_end] to the maximum possible vector size as follows. We assume that the number of vectors being encoded, [ch], is provided by the higher-level decoding process.

 

1) [actual_size] = current blocksize/2;
2) if residue encoding is format 2
3)   [actual_size] = [actual_size] * [ch];
4) [limit_residue_begin] = minimum of ([residue_begin],[actual_size]);
5) [limit_residue_end] = minimum of ([residue_end],[actual_size]);

The following convenience values are conceptually useful for clarifying the decode process:

 

1) [classwords_per_codeword] = [codebook_dimensions] value of codebook [residue_classbook]
2) [n_to_read] = [limit_residue_end] - [limit_residue_begin]
3) [partitions_to_read] = [n_to_read] / [residue_partition_size]

Packet decode proceeds as follows, matching the description offered earlier in the document.

 1) allocate and zero all vectors that will be returned.
 2) if ([n_to_read] is zero), stop; there is no residue to decode.
 3) iterate [pass] over the range 0 ... 7 {

      4) [partition_count] = 0

      5) while [partition_count] is less than [partitions_to_read]

           6) if ([pass] is zero) {

                7) iterate [j] over the range 0 .. [ch]-1 {

                     8) if vector [j] is not marked 'do not decode' {

                          9) [temp] = read from packet using codebook [residue_classbook]
                             in scalar context
                         10) iterate [i] descending over the range [classwords_per_codeword]-1 ... 0 {

                              11) array [classifications] element [j],([i]+[partition_count]) =
                                  [temp] integer modulo [residue_classifications]
                              12) [temp] = [temp] / [residue_classifications] using integer division

                             }

                        }

                   }

              }

          13) iterate [i] over the range 0 .. ([classwords_per_codeword] - 1) while [partition_count]
              is also less than [partitions_to_read] {

               14) iterate [j] over the range 0 .. [ch]-1 {

                    15) if vector [j] is not marked 'do not decode' {

                         16) [vqclass] = array [classifications] element [j],[partition_count]
                         17) [vqbook] = array [residue_books] element [vqclass],[pass]
                         18) if ([vqbook] is not 'unused') {

                              19) decode partition into output vector number [j], starting at scalar
                                  offset [limit_residue_begin]+[partition_count]*[residue_partition_size]
                                  using codebook number [vqbook] in VQ context
                            }
                       }
                  }

               20) increment [partition_count] by one

             }
    }

21) done

An end-of-packet condition during packet decode is to be considered a nominal occurrence. Decode returns the result of vector decode up to that point.
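Steps 9 through 12 above recover several classification numbers from a single scalar codeword by repeated modulo; equivalently, the codeword is the classification numbers written as digits in base [residue_classifications]. A Python sketch of the unpacking (function name ours):

```python
def split_classword(temp, classifications, classwords_per_codeword):
    # Undo the base-[classifications] packing. The modulo loop runs with
    # [i] descending, so the low-order digit is the *last* classification.
    out = [0] * classwords_per_codeword
    for i in range(classwords_per_codeword - 1, -1, -1):
        out[i] = temp % classifications
        temp //= classifications
    return out

# With 4 classifications and a 2-dimensional classbook, the codeword
# 9 = 2*4 + 1 encodes the classification numbers [2, 1]:
print(split_classword(9, 4, 2))  # [2, 1]
```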

 

8.6.3. format 0 specifics

Format zero decodes partitions exactly as described earlier in the 'Residue Format: residue 0' section. The following pseudocode presents the same algorithm. Assume:

  • [n] is the value in [residue_partition_size]
  • [v] is the residue vector
  • [offset] is the beginning read offset in [v]

 

1) [step] = [n] / [codebook_dimensions]
2) iterate [i] over the range 0 ... [step]-1 {

     3) vector [entry_temp] = read vector from packet using current codebook in VQ context
     4) iterate [j] over the range 0 ... [codebook_dimensions]-1 {

          5) vector [v] element ([offset]+[i]+[j]*[step]) =
             vector [v] element ([offset]+[i]+[j]*[step]) +
             vector [entry_temp] element [j]

        }

   }

6) done

 

8.6.4. format 1 specifics

Format 1 decodes partitions exactly as described earlier in the 'Residue Format: residue 1' section. The following pseudocode presents the same algorithm. Assume:

  • [n] is the value in [residue_partition_size]
  • [v] is the residue vector
  • [offset] is the beginning read offset in [v]

 

1) [i] = 0
2) vector [entry_temp] = read vector from packet using current codebook in VQ context
3) iterate [j] over the range 0 ... [codebook_dimensions]-1 {

     4) vector [v] element ([offset]+[i]) =
        vector [v] element ([offset]+[i]) +
        vector [entry_temp] element [j]
     5) increment [i]

   }

6) if ( [i] is less than [n] ) continue at step 2
7) done

 

8.6.5. format 2 specifics

Format 2 is reducible to format 1. It may be implemented as an additional step prior to and an additional post-decode step after a normal format 1 decode.

Format 2 handles 'do not decode' vectors differently than residue 0 or 1; if all vectors are marked 'do not decode', no decode occurs. However, if at least one vector is to be decoded, all the vectors are decoded. We then request normal format 1 to decode a single vector representing all output channels, rather than a vector for each channel. After decode, deinterleave the vector into independent vectors, one for each output channel. That is:

 

1.
If all vectors 0 through ch-1 are marked 'do not decode', allocate and clear a single vector [v] of length ch*n and skip step 2 below; proceed directly to the post-decode step.
2.
Rather than performing format 1 decode to produce ch vectors of length n each, call format 1 decode to produce a single vector [v] of length ch*n.
3.
Post decode: Deinterleave the single vector [v] returned by format 1 decode as described above into ch independent vectors, one for each output channel, according to:

1) iterate [i] over the range 0 ... [n]-1 {

     2) iterate [j] over the range 0 ... [ch]-1 {

          3) output vector number [j] element [i] = vector [v] element ([i] * [ch] + [j])

        }
   }

4) done

9. Helper equations

 

9.1. Overview

The equations below are used in multiple places by the Vorbis codec specification. Rather than cluttering up the main specification documents, they are defined here and referenced where appropriate.

 

9.2. Functions

 

9.2.1. ilog

The ”ilog(x)” function returns the position number (1 through n) of the highest set bit in the two’s complement integer value [x]. Values of [x] less than zero are defined to return zero.

 

1) [return_value] = 0;
2) if ( [x] is greater than zero ) {

     3) increment [return_value];
     4) logical shift [x] one bit to the right, padding the MSb with zero
     5) repeat at step 2)

   }

6) done

Examples:

  • ilog(0) = 0;
  • ilog(1) = 1;
  • ilog(2) = 2;
  • ilog(3) = 2;
  • ilog(4) = 3;
  • ilog(7) = 3;
  • ilog(negative number) = 0;
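The loop above translates directly to code; a minimal Python equivalent reproducing the listed examples:

```python
def ilog(x):
    # 1-based position of the highest set bit; zero for x <= 0.
    ret = 0
    while x > 0:
        ret += 1
        x >>= 1  # logical right shift (x is non-negative inside the loop)
    return ret

print([ilog(x) for x in (0, 1, 2, 3, 4, 7, -1)])  # [0, 1, 2, 2, 3, 3, 0]
```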

 

9.2.2. float32_unpack

”float32_unpack(x)” is intended to translate the packed binary representation of a Vorbis codebook float value into the representation used by the decoder for floating point numbers. For purposes of this example, we will unpack a Vorbis float32 into a host-native floating point number.

 

1) [mantissa] = [x] bitwise AND 0x1fffff (unsigned result)
2) [sign] = [x] bitwise AND 0x80000000 (unsigned result)
3) [exponent] = ( [x] bitwise AND 0x7fe00000) shifted right 21 bits (unsigned result)
4) if ( [sign] is nonzero ) then negate [mantissa]
5) return [mantissa] * ( 2 ^ ( [exponent] - 788 ) )
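A direct Python rendering of the unpack; the packed format is a sign bit, a 10-bit exponent biased by the constant 788, and a 21-bit mantissa:

```python
def float32_unpack(x):
    # Translate a packed Vorbis codebook float into a host-native float.
    mantissa = x & 0x1FFFFF
    sign = x & 0x80000000
    exponent = (x & 0x7FE00000) >> 21
    if sign:
        mantissa = -mantissa
    return mantissa * 2.0 ** (exponent - 788)

# Exponent 788 with mantissa 1 is exactly 1.0; the sign bit negates it.
x = (788 << 21) | 1
print(float32_unpack(x))               # 1.0
print(float32_unpack(x | 0x80000000))  # -1.0
```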

 

9.2.3. lookup1_values

”lookup1_values(codebook_entries,codebook_dimensions)” is used to compute the correct length of the value index for a codebook VQ lookup table of lookup type 1. The values on this list are permuted to construct the VQ vector lookup table of size [codebook_entries].

The return value for this function is defined to be 'the greatest integer value for which [return_value] to the power of [codebook_dimensions] is less than or equal to [codebook_entries]'.
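That definition can be evaluated with a simple search; a Python sketch (valid setups have at least one entry, so the search starting at 1 terminates):

```python
def lookup1_values(codebook_entries, codebook_dimensions):
    # Greatest integer r with r ** codebook_dimensions <= codebook_entries.
    r = 1
    while (r + 1) ** codebook_dimensions <= codebook_entries:
        r += 1
    return r

print(lookup1_values(16, 4))  # 2, since 2**4 = 16 <= 16 but 3**4 = 81 > 16
print(lookup1_values(27, 3))  # 3
```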

 

9.2.4. low_neighbor

”low_neighbor(v,x)” finds the position n in vector [v] of the greatest value scalar element for which n is less than [x] and vector [v] element n is less than vector [v] element [x].

 

9.2.5. high_neighbor

”high_neighbor(v,x)” finds the position n in vector [v] of the lowest value scalar element for which n is less than [x] and vector [v] element n is greater than vector [v] element [x].
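Under the assumption, guaranteed by valid floor 1 setups, that qualifying elements exist for every position queried, the two neighbor searches can be sketched in Python as:

```python
def low_neighbor(v, x):
    # Position n < x whose value is the greatest value still below v[x].
    candidates = [n for n in range(x) if v[n] < v[x]]
    return max(candidates, key=lambda n: v[n])

def high_neighbor(v, x):
    # Position n < x whose value is the lowest value still above v[x].
    candidates = [n for n in range(x) if v[n] > v[x]]
    return min(candidates, key=lambda n: v[n])

# For X list [0, 128, 64, 32], the neighbors of element 3 (value 32) are
# position 0 (value 0, greatest below) and position 2 (value 64, lowest above):
print(low_neighbor([0, 128, 64, 32], 3), high_neighbor([0, 128, 64, 32], 3))  # 0 2
```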

 

9.2.6. render_point

”render_point(x0,y0,x1,y1,X)” is used to find the Y value at point X along the line specified by x0, x1, y0 and y1. This function uses an integer algorithm to solve for the point directly without calculating intervening values along the line.

 

1) [dy] = [y1] - [y0]
2) [adx] = [x1] - [x0]
3) [ady] = absolute value of [dy]
4) [err] = [ady] * ([X] - [x0])
5) [off] = [err] / [adx] using integer division
6) if ( [dy] is less than zero ) {

     7) [Y] = [y0] - [off]

   } else {

     8) [Y] = [y0] + [off]

   }

9) done
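A compact Python rendering of the steps above; since [ady], [X]-[x0] and [adx] are all non-negative here, Python's floor division matches the truncating integer division the spec intends:

```python
def render_point(x0, y0, x1, y1, X):
    # Integer evaluation of the line through (x0,y0)-(x1,y1) at X,
    # computed directly without stepping through intervening points.
    dy = y1 - y0
    adx = x1 - x0
    ady = abs(dy)
    err = ady * (X - x0)
    off = err // adx
    return y0 - off if dy < 0 else y0 + off

print(render_point(0, 0, 10, 10, 5))  # 5 (rising line)
print(render_point(0, 10, 10, 0, 5))  # 5 (falling line)
```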

 

9.2.7. render_line

Floor decode type one uses the integer line drawing algorithm of ”render_line(x0, y0, x1, y1, v)” to construct an integer floor curve for contiguous piecewise line segments. Note that it has not been relevant elsewhere, but here we must define integer division as rounding division of both positive and negative numbers toward zero.

 

 1) [dy] = [y1] - [y0]
 2) [adx] = [x1] - [x0]
 3) [ady] = absolute value of [dy]
 4) [base] = [dy] / [adx] using integer division
 5) [x] = [x0]
 6) [y] = [y0]
 7) [err] = 0

 8) if ( [dy] is less than 0 ) {

      9) [sy] = [base] - 1

    } else {

     10) [sy] = [base] + 1

    }

11) [ady] = [ady] - (absolute value of [base]) * [adx]
12) vector [v] element [x] = [y]

13) iterate [x] over the range [x0]+1 ... [x1]-1 {

     14) [err] = [err] + [ady];
     15) if ( [err] >= [adx] ) {

          16) [err] = [err] - [adx]
          17) [y] = [y] + [sy]

        } else {

          18) [y] = [y] + [base]

        }

     19) vector [v] element [x] = [y]

   }
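The algorithm above translates to Python as follows. Note the helper `trunc_div` (our name): Python's `//` floors toward negative infinity, while the spec requires division rounding toward zero, so the slope [base] must be computed with truncation.

```python
def trunc_div(a, b):
    # Integer division rounding toward zero, as the spec requires here.
    q = abs(a) // abs(b)
    return -q if (a < 0) != (b < 0) else q

def render_line(x0, y0, x1, y1, v):
    # Bresenham-style integer line draw into vector v over [x0, x1).
    dy = y1 - y0
    adx = x1 - x0
    ady = abs(dy)
    base = trunc_div(dy, adx)
    x, y, err = x0, y0, 0
    sy = base - 1 if dy < 0 else base + 1
    ady -= abs(base) * adx          # fractional remainder of the slope
    v[x] = y
    for x in range(x0 + 1, x1):     # covers [x0]+1 ... [x1]-1 inclusive
        err += ady
        if err >= adx:
            err -= adx
            y += sy
        else:
            y += base
        v[x] = y
    return v

print(render_line(0, 0, 5, 2, [0] * 5))  # [0, 0, 0, 1, 1]
print(render_line(0, 4, 4, 0, [0] * 4))  # [4, 3, 2, 1]
```

The endpoint x1 itself is deliberately never written; in floor 1 synthesis it is either the start of the next segment or beyond [n].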

10. Tables

 

10.1. floor1_inverse_dB_table

The vector [floor1_inverse_dB_table] is a 256 element static lookup table consisting of the following values (read left to right then top to bottom):

 

1.0649863e-07, 1.1341951e-07, 1.2079015e-07, 1.2863978e-07,
1.3699951e-07, 1.4590251e-07, 1.5538408e-07, 1.6548181e-07,
1.7623575e-07, 1.8768855e-07, 1.9988561e-07, 2.1287530e-07,
2.2670913e-07, 2.4144197e-07, 2.5713223e-07, 2.7384213e-07,
2.9163793e-07, 3.1059021e-07, 3.3077411e-07, 3.5226968e-07,
3.7516214e-07, 3.9954229e-07, 4.2550680e-07, 4.5315863e-07,
4.8260743e-07, 5.1396998e-07, 5.4737065e-07, 5.8294187e-07,
6.2082472e-07, 6.6116941e-07, 7.0413592e-07, 7.4989464e-07,
7.9862701e-07, 8.5052630e-07, 9.0579828e-07, 9.6466216e-07,
1.0273513e-06, 1.0941144e-06, 1.1652161e-06, 1.2409384e-06,
1.3215816e-06, 1.4074654e-06, 1.4989305e-06, 1.5963394e-06,
1.7000785e-06, 1.8105592e-06, 1.9282195e-06, 2.0535261e-06,
2.1869758e-06, 2.3290978e-06, 2.4804557e-06, 2.6416497e-06,
2.8133190e-06, 2.9961443e-06, 3.1908506e-06, 3.3982101e-06,
3.6190449e-06, 3.8542308e-06, 4.1047004e-06, 4.3714470e-06,
4.6555282e-06, 4.9580707e-06, 5.2802740e-06, 5.6234160e-06,
5.9888572e-06, 6.3780469e-06, 6.7925283e-06, 7.2339451e-06,
7.7040476e-06, 8.2047000e-06, 8.7378876e-06, 9.3057248e-06,
9.9104632e-06, 1.0554501e-05, 1.1240392e-05, 1.1970856e-05,
1.2748789e-05, 1.3577278e-05, 1.4459606e-05, 1.5399272e-05,
1.6400004e-05, 1.7465768e-05, 1.8600792e-05, 1.9809576e-05,
2.1096914e-05, 2.2467911e-05, 2.3928002e-05, 2.5482978e-05,
2.7139006e-05, 2.8902651e-05, 3.0780908e-05, 3.2781225e-05,
3.4911534e-05, 3.7180282e-05, 3.9596466e-05, 4.2169667e-05,
4.4910090e-05, 4.7828601e-05, 5.0936773e-05, 5.4246931e-05,
5.7772202e-05, 6.1526565e-05, 6.5524908e-05, 6.9783085e-05,
7.4317983e-05, 7.9147585e-05, 8.4291040e-05, 8.9768747e-05,
9.5602426e-05, 0.00010181521, 0.00010843174, 0.00011547824,
0.00012298267, 0.00013097477, 0.00013948625, 0.00014855085,
0.00015820453, 0.00016848555, 0.00017943469, 0.00019109536,
0.00020351382, 0.00021673929, 0.00023082423, 0.00024582449,
0.00026179955, 0.00027881276, 0.00029693158, 0.00031622787,
0.00033677814, 0.00035866388, 0.00038197188, 0.00040679456,
0.00043323036, 0.00046138411, 0.00049136745, 0.00052329927,
0.00055730621, 0.00059352311, 0.00063209358, 0.00067317058,
0.00071691700, 0.00076350630, 0.00081312324, 0.00086596457,
0.00092223983, 0.00098217216, 0.0010459992, 0.0011139742,
0.0011863665, 0.0012634633, 0.0013455702, 0.0014330129,
0.0015261382, 0.0016253153, 0.0017309374, 0.0018434235,
0.0019632195, 0.0020908006, 0.0022266726, 0.0023713743,
0.0025254795, 0.0026895994, 0.0028643847, 0.0030505286,
0.0032487691, 0.0034598925, 0.0036847358, 0.0039241906,
0.0041792066, 0.0044507950, 0.0047400328, 0.0050480668,
0.0053761186, 0.0057254891, 0.0060975636, 0.0064938176,
0.0069158225, 0.0073652516, 0.0078438871, 0.0083536271,
0.0088964928, 0.009474637, 0.010090352, 0.010746080,
0.011444421, 0.012188144, 0.012980198, 0.013823725,
0.014722068, 0.015678791, 0.016697687, 0.017782797,
0.018938423, 0.020169149, 0.021479854, 0.022875735,
0.024362330, 0.025945531, 0.027631618, 0.029427276,
0.031339626, 0.033376252, 0.035545228, 0.037855157,
0.040315199, 0.042935108, 0.045725273, 0.048696758,
0.051861348, 0.055231591, 0.058820850, 0.062643361,
0.066714279, 0.071049749, 0.075666962, 0.080584227,
0.085821044, 0.091398179, 0.097337747, 0.10366330,
0.11039993, 0.11757434, 0.12521498, 0.13335215,
0.14201813, 0.15124727, 0.16107617, 0.17154380,
0.18269168, 0.19456402, 0.20720788, 0.22067342,
0.23501402, 0.25028656, 0.26655159, 0.28387361,
0.30232132, 0.32196786, 0.34289114, 0.36517414,
0.38890521, 0.41417847, 0.44109412, 0.46975890,
0.50028648, 0.53279791, 0.56742212, 0.60429640,
0.64356699, 0.68538959, 0.72993007, 0.77736504,
0.82788260, 0.88168307, 0.9389798, 1.

A. Embedding Vorbis into an Ogg stream

 

A.1. Overview

This document describes using Ogg logical and physical transport streams to encapsulate Vorbis compressed audio packet data into file form.

Section 1, “Introduction and Description”, provides an overview of the construction of Vorbis audio packets.

The Ogg bitstream overview and Ogg logical bitstream and framing spec provide detailed descriptions of Ogg transport streams. This specification document assumes a working knowledge of the concepts covered in these named background documents. Please read them first.

 

A.1.1. Restrictions

The Ogg/Vorbis I specification currently dictates that Ogg/Vorbis streams use Ogg transport streams in degenerate, unmultiplexed form only. That is:

  • A meta-headerless Ogg file encapsulates the Vorbis I packets
  • The Ogg stream may be chained, i.e., contain multiple, contiguous logical streams (links).
  • The Ogg stream must be unmultiplexed (only one stream, a Vorbis audio stream, per link)

This is not to say that it is not currently possible to multiplex Vorbis with other media types into a multi-stream Ogg file. At the time this document was written, Ogg was becoming a popular container for low-bitrate movies consisting of DivX video and Vorbis audio. However, a 'Vorbis I audio file' is taken to imply Vorbis audio existing alone within a degenerate Ogg stream. A compliant 'Vorbis audio player' is not required to implement Ogg support beyond the specific support of Vorbis within a degenerate Ogg stream (naturally, application authors are encouraged to support full multiplexed Ogg handling).

 

A.1.2. MIME type

The MIME type of Ogg files depends on the context. Specifically, complex multimedia and applications should use application/ogg, while visual media should use video/ogg and audio should use audio/ogg. Vorbis data encapsulated in Ogg may appear in any of those types. RTP encapsulated Vorbis should use audio/vorbis + audio/vorbis-config.

 

A.2. Encapsulation

Ogg encapsulation of a Vorbis packet stream is straightforward.

  • The first Vorbis packet (the identification header), which uniquely identifies a stream as Vorbis audio, is placed alone in the first page of the logical Ogg stream. This results in a first Ogg page of exactly 58 bytes at the very beginning of the logical stream.
  • This first page is marked ’beginning of stream’ in the page flags.
  • The second and third Vorbis packets (comment and setup headers) may span one or more pages beginning on the second page of the logical stream. However many pages they span, the third header packet finishes the page on which it ends. The next (first audio) packet must begin on a fresh page.
  • The granule position of these first pages containing only headers is zero.
  • The first audio packet of the logical stream begins a fresh Ogg page.
  • Packets are placed into ogg pages in order until the end of stream.
  • The last page is marked ’end of stream’ in the page flags.
  • Vorbis packets may span page boundaries.
  • The granule position of pages containing Vorbis audio is in units of PCM audio samples (per channel; a stereo stream’s granule position does not increment at twice the speed of a mono stream).
  • The granule position of a page represents the end PCM sample position of the last packet completed on that page. The ’last PCM sample’ is the last complete sample returned by decode, not an internal sample awaiting lapping with a subsequent block. A page that is entirely spanned by a single packet (that completes on a subsequent page) has no granule position, and the granule position is set to ’-1’.

    Note that the last decoded (fully lapped) PCM sample from a packet is not necessarily the middle sample from that block. If, e.g., the current Vorbis packet encodes a ”long block” and the next Vorbis packet encodes a ”short block”, the last decodable sample from the current packet will be at position (3*long_block_length/4) - (short_block_length/4).

  • The granule (PCM) position of the first page need not indicate that the stream started at position zero. Although the granule position belongs to the last completed packet on the page and a valid granule position must be positive, by inference it may indicate that the PCM position of the beginning of audio is positive or negative.
    • A positive starting value simply indicates that this stream begins at some positive time offset, potentially within a larger program. This is a common case when connecting to the middle of a broadcast stream.
    • A negative value indicates that output samples preceding time zero should be discarded during decoding; this technique is used to allow sample-granularity editing of the stream start time of already-encoded Vorbis streams. The number of samples to be discarded must not exceed the overlap-add span of the first two audio packets.

    In both of these cases in which the initial audio PCM starting offset is nonzero, the second finished audio packet must flush the page on which it appears and the third packet begin a fresh page. This allows the decoder to always be able to perform PCM position adjustments before needing to return any PCM data from synthesis, resulting in correct positioning information without any additional seeking logic.

    Note: Failure to do so should, at worst, cause a decoder implementation to return incorrect positioning information for seeking operations at the very beginning of the stream.

  • A granule position on the final page in a stream that indicates less audio data than the final packet would normally return is used to end the stream on other than even frame boundaries. The difference between the actual available data returned and the declared amount indicates how many trailing samples to discard from the decoding process.
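The two rules above are plain arithmetic and can be checked directly in C. The function names and the 2048/256 block lengths below are illustrative (they are common Vorbis window sizes), not part of any reference implementation:

```c
#include <assert.h>

/* Position of the last decodable (fully lapped) sample when a long block
 * is followed by a short block, as described above.  Lengths in samples. */
long last_decodable_sample(long long_block_length, long short_block_length)
{
    return (3 * long_block_length / 4) - (short_block_length / 4);
}

/* Number of trailing samples to discard when the final page's granule
 * position declares less audio than the final packet would return. */
long trailing_discard(long decoded_total, long declared_granulepos)
{
    return decoded_total - declared_granulepos;
}
```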

B. Vorbis encapsulation in RTP

Please consult RFC 5215 “RTP Payload Format for Vorbis Encoded Audio” for a description of how to embed Vorbis audio in an RTP stream.

Colophon

Ogg is a Xiph.Org Foundation effort to protect essential tenets of Internet multimedia from corporate hostage-taking; Open Source is the net’s greatest tool to keep everyone honest. See About the Xiph.Org Foundation for details.

Ogg Vorbis is the first Ogg audio CODEC. Anyone may freely use and distribute the Ogg and Vorbis specification, whether in a private, public or corporate capacity. However, the Xiph.Org Foundation and the Ogg project (xiph.org) reserve the right to set the Ogg Vorbis specification and certify specification compliance.

Xiph.Org’s Vorbis software CODEC implementation is distributed under a BSD-like license. This does not restrict third parties from distributing independent implementations of Vorbis software under other licenses.

Ogg, Vorbis, Xiph.Org Foundation and their logos are trademarks (tm) of the Xiph.Org Foundation. These pages are copyright (C) 1994-2007 Xiph.Org Foundation. All rights reserved.

This document is set using LaTeX.


Gstreamer design documents.

GStreamer study list.

 

These are the original design documents written by the GStreamer designers.

Read them several times, together with the code.

http://cgit.freedesktop.org/gstreamer/gstreamer/tree/docs/design/

 

File                         Size (bytes)
Makefile.am                  1511
draft-klass.txt              7466
draft-metadata.txt           7377
draft-push-pull.txt          3849
draft-tagreading.txt         3209
draft-tracing.txt            3359
part-MT-refcounting.txt      16570
part-TODO.txt                3821
part-activation.txt          4633
part-buffer.txt              5761
part-buffering.txt           12660
part-bufferpool.txt          14984
part-caps.txt                6201
part-clocks.txt              3606
part-context.txt             2507
part-controller.txt          3098
part-conventions.txt         2280
part-dynamic.txt             389
part-element-sink.txt        7895
part-element-source.txt      5251
part-element-transform.txt   13682
part-events.txt              11775
part-framestep.txt           10653
part-gstbin.txt              3918
part-gstbus.txt              1983
part-gstelement.txt          2749
part-gstghostpad.txt         15160
part-gstobject.txt           3278
part-gstpipeline.txt         2987
part-latency.txt             13224
part-live-source.txt         2129
part-memory.txt              5848
part-messages.txt            4706
part-meta.txt                15672
part-miniobject.txt          7194
part-missing-plugins.txt     10964
part-negotiation.txt         13305
part-overview.txt            23151
part-preroll.txt             2071
part-probes.txt              15208
part-progress.txt            9716
part-push-pull.txt           2103
part-qos.txt                 16143
part-query.txt               2450
part-relations.txt           14928
part-scheduling.txt          8610
part-seeking.txt             10200
part-segments.txt            4341
part-sparsestreams.txt       4863
part-standards.txt           1684
part-states.txt              15978
part-stream-status.txt       4248
part-streams.txt             2688
part-synchronisation.txt     8304
part-toc.txt                 6104
part-trickmodes.txt          9172

As can be seen, a pipeline is laid out mainly from elements, bins, and pads; keep the inheritance relationships among these objects firmly in mind. Pads are the connectors. Objects in a pipeline communicate through buffers, events, queries, and messages, while scheduling is implemented with clocks and queues.

My personal reading order:

part-conventions.txt

 

part-overview.txt

Gives a first, rough picture of GStreamer's overall design.

 

part-gstelement.txt

A GstElement is normally created from a GstElementFactory; why not just use g_object_new()?

(The factory name is the plugin name, so the plugin must be loaded first and the corresponding element then created from it.)

 

GstPad is also derived from GstObject, in a branch parallel to GstElement,

but a GstPad normally belongs to a GstElement.

Pads are properties of their element; internally, GstElement keeps its pads in several GList fields.

 

part-element-source.txt

 

part-element-sink.txt

The sink design is somewhat more complex than the source element; both are abstractions of real-world usage.

In practice, sinks simply are more complex than sources.

 

 

part-element-transform.txt

Analyzes the internal behaviour for each use case.

 

part-pads.txt (missing)

What is the relationship between GstPad and GstPadTemplate?

Is GstPadTemplate simply a factory for pads?

push/pull modes

What is the chain function for?

part-caps.txt

Caps actually describe the media type and are stored in a GstStructure;

caps appear to be attached to GstPads/GstPadTemplates.

GstCaps is a data type.

part-negotiation.txt

Caps negotiation: the elements negotiate the format between themselves.

 

 

Objects in a pipeline communicate through buffers, events, queries, and messages; scheduling is implemented with clocks and queues.

 

part-buffering.txt

The purpose of buffering: buffer data so that playback is smoother.

A buffer carries a memory pointer, a memory size, a timestamp, a reference count, and so on. The simple usage pattern is: create a buffer, allocate memory, fill in the data, and pass it to the downstream element; the downstream element reads the data, processes it, and drops its reference. In more complex cases an element modifies the buffer contents in place instead of allocating a new buffer, elements write into hardware memory (video-capture sources, etc.), and a buffer can also be read-only.
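The buffer lifecycle just described can be sketched in plain C. MyBuffer and its functions are hypothetical stand-ins, not the real GstBuffer API; they only model the allocate/fill/pass/unref flow and the reference counting:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative refcounted buffer, modelled on the description above. */
typedef struct {
    unsigned char *data;     /* memory pointer */
    size_t         size;     /* memory size    */
    long long      pts;      /* timestamp      */
    int            refcount;
} MyBuffer;

MyBuffer *my_buffer_new(size_t size)
{
    MyBuffer *buf = malloc(sizeof *buf);
    buf->data = malloc(size);
    buf->size = size;
    buf->pts = -1;
    buf->refcount = 1;       /* the creator holds the first reference */
    return buf;
}

void my_buffer_ref(MyBuffer *buf) { buf->refcount++; }

/* Dropping the last reference frees the memory, as a downstream element
 * does once it has consumed the data.  Returns 1 if the buffer died. */
int my_buffer_unref(MyBuffer *buf)
{
    if (--buf->refcount == 0) {
        free(buf->data);
        free(buf);
        return 1;
    }
    return 0;
}
```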

 

part-bufferlist.txt

 

 

part-events.txt

Events are objects passed around in parallel to the buffer dataflow to notify elements of various events.
Events are received on pads using the event function. Some events should be interleaved with the data stream so they require taking the STREAM_LOCK, others don't.
Different types of events exist to implement various functionalities.
  GST_EVENT_FLUSH_START:   data is to be discarded
  GST_EVENT_FLUSH_STOP:    data is allowed again
  GST_EVENT_EOS:           no more data is to be expected on a pad
  GST_EVENT_NEWSEGMENT:    a new group of buffers with common start time
  GST_EVENT_TAG:           stream metadata
  GST_EVENT_BUFFERSIZE:    buffer size requirements
  GST_EVENT_QOS:           a notification of the quality of service of the stream
  GST_EVENT_SEEK:          a seek should be performed to a new position in the stream
  GST_EVENT_NAVIGATION:    a navigation event
  GST_EVENT_LATENCY:       configure the latency in a pipeline
  * GST_EVENT_DRAIN:       play all data downstream before returning
* not yet implemented, under investigation, might be needed to do still frames in DVD.

 

Events are control packets that can be sent to upstream or downstream elements. Events sent downstream govern the stream state of the elements that follow, carrying information such as discontinuities, flushes, and end-of-stream. Events sent upstream are used for application interaction and event-to-event interaction, to request changes in stream state such as seeks. For the application, only upstream events really matter; downstream events merely maintain a consistent notion of the data stream.

 

typedef enum {
  GST_EVENT_UNKNOWN               = GST_EVENT_MAKE_TYPE (0, 0),
  /* bidirectional events */
  GST_EVENT_FLUSH_START           = GST_EVENT_MAKE_TYPE (1, FLAG(BOTH)),
  GST_EVENT_FLUSH_STOP            = GST_EVENT_MAKE_TYPE (2, FLAG(BOTH) | FLAG(SERIALIZED)),
  /* downstream serialized events */
  GST_EVENT_EOS                   = GST_EVENT_MAKE_TYPE (5, FLAG(DOWNSTREAM) | FLAG(SERIALIZED)),
  GST_EVENT_NEWSEGMENT            = GST_EVENT_MAKE_TYPE (6, FLAG(DOWNSTREAM) | FLAG(SERIALIZED)),
  GST_EVENT_TAG                   = GST_EVENT_MAKE_TYPE (7, FLAG(DOWNSTREAM) | FLAG(SERIALIZED)),
  GST_EVENT_BUFFERSIZE            = GST_EVENT_MAKE_TYPE (8, FLAG(DOWNSTREAM) | FLAG(SERIALIZED)),
  GST_EVENT_SINK_MESSAGE          = GST_EVENT_MAKE_TYPE (9, FLAG(DOWNSTREAM) | FLAG(SERIALIZED)),
  /* upstream events */
  GST_EVENT_QOS                   = GST_EVENT_MAKE_TYPE (15, FLAG(UPSTREAM)),
  GST_EVENT_SEEK                  = GST_EVENT_MAKE_TYPE (16, FLAG(UPSTREAM)),
  GST_EVENT_NAVIGATION            = GST_EVENT_MAKE_TYPE (17, FLAG(UPSTREAM)),
  GST_EVENT_LATENCY               = GST_EVENT_MAKE_TYPE (18, FLAG(UPSTREAM)),
  GST_EVENT_STEP                  = GST_EVENT_MAKE_TYPE (19, FLAG(UPSTREAM)),

  /* custom events start here */
  GST_EVENT_CUSTOM_UPSTREAM       = GST_EVENT_MAKE_TYPE (32, FLAG(UPSTREAM)),
  GST_EVENT_CUSTOM_DOWNSTREAM     = GST_EVENT_MAKE_TYPE (32, FLAG(DOWNSTREAM) | FLAG(SERIALIZED)),
  GST_EVENT_CUSTOM_DOWNSTREAM_OOB = GST_EVENT_MAKE_TYPE (32, FLAG(DOWNSTREAM)),
  GST_EVENT_CUSTOM_BOTH           = GST_EVENT_MAKE_TYPE (32, FLAG(BOTH) | FLAG(SERIALIZED)),
  GST_EVENT_CUSTOM_BOTH_OOB       = GST_EVENT_MAKE_TYPE (32, FLAG(BOTH))
} GstEventType;
static GstEventQuarks event_quarks[] = {
  {GST_EVENT_UNKNOWN, "unknown", 0},
  {GST_EVENT_FLUSH_START, "flush-start", 0},
  {GST_EVENT_FLUSH_STOP, "flush-stop", 0},
  {GST_EVENT_EOS, "eos", 0},
  {GST_EVENT_NEWSEGMENT, "newsegment", 0},
  {GST_EVENT_TAG, "tag", 0},
  {GST_EVENT_BUFFERSIZE, "buffersize", 0},
  {GST_EVENT_SINK_MESSAGE, "sink-message", 0},
  {GST_EVENT_QOS, "qos", 0},
  {GST_EVENT_SEEK, "seek", 0},
  {GST_EVENT_NAVIGATION, "navigation", 0},
  {GST_EVENT_LATENCY, "latency", 0},
  {GST_EVENT_STEP, "step", 0},
  {GST_EVENT_CUSTOM_UPSTREAM, "custom-upstream", 0},
  {GST_EVENT_CUSTOM_DOWNSTREAM, "custom-downstream", 0},
  {GST_EVENT_CUSTOM_DOWNSTREAM_OOB, "custom-downstream-oob", 0},
  {GST_EVENT_CUSTOM_BOTH, "custom-both", 0},
  {GST_EVENT_CUSTOM_BOTH_OOB, "custom-both-oob", 0},

  {0, NULL, 0}
};
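The GST_EVENT_MAKE_TYPE values above pack an event number together with direction and serialization flags in one integer. A plain-C sketch of that encoding, with shift and flag values modelled on the 0.10 headers (treat the exact bit positions as an assumption):

```c
#include <assert.h>

/* Direction/serialization flags, as in FLAG(...) above. */
#define EV_UPSTREAM    (1 << 0)
#define EV_DOWNSTREAM  (1 << 1)
#define EV_SERIALIZED  (1 << 2)
#define EV_BOTH        (EV_UPSTREAM | EV_DOWNSTREAM)
#define EV_NUM_SHIFT   4
#define EV_MAKE_TYPE(num, flags) (((num) << EV_NUM_SHIFT) | (flags))

enum {
    EV_FLUSH_START = EV_MAKE_TYPE(1,  EV_BOTH),
    EV_EOS         = EV_MAKE_TYPE(5,  EV_DOWNSTREAM | EV_SERIALIZED),
    EV_SEEK        = EV_MAKE_TYPE(16, EV_UPSTREAM),
};

/* A pad can decide from the flags alone whether an event travels
 * downstream and whether it must be serialized with the buffer flow. */
int ev_is_downstream(int type) { return (type & EV_DOWNSTREAM) != 0; }
int ev_is_serialized(int type) { return (type & EV_SERIALIZED) != 0; }
```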

 

 

part-bus.txt

 

Messages are sent from the pipeline threads to the application, so that the application is shielded from the notion of threads. By default every pipeline contains a bus, so the application does not need to create one; it only needs to attach a message handler to the bus, much like an object signal handler. Once the main loop is running, the bus is checked periodically for new messages, and the registered callback is invoked when one arrives. When GLib is used, GLib's own mechanism handles this; otherwise the signal mechanism is used.

The data flowing through a pipeline consists of buffers and events. Buffers hold the actual pipeline data; events hold control information, such as seek information and end-of-stream indications. A source element typically creates a new buffer and passes it through a pad to the next element in the chain.
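The post-and-poll behaviour of the bus can be modelled with a toy message queue in plain C. This is only a single-threaded illustration of the idea (fixed capacity, no locking, no overflow check); the real GstBus is thread-safe and integrates with the GLib main loop:

```c
#include <assert.h>
#include <string.h>

#define QUEUE_CAP 16

typedef struct { const char *type; } Msg;

typedef struct {
    Msg slots[QUEUE_CAP];
    int head, tail;
} Bus;

/* Called by the pipeline side to post a message. */
void bus_post(Bus *bus, const char *type)
{
    bus->slots[bus->tail++ % QUEUE_CAP].type = type;
}

/* Called periodically from the main loop; returns NULL when idle. */
const char *bus_poll(Bus *bus)
{
    if (bus->head == bus->tail)
        return 0;
    return bus->slots[bus->head++ % QUEUE_CAP].type;
}
```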

 

Definitions of the individual Gst components.

GObject
 |_____ GstObject
 |          |____ GstPad
 |          |____ GstElement
 |                     |____ GstBin
 |                              |___ GstPipeline
 |_____ GstSignalObject

 

G_DEFINE_ABSTRACT_TYPE (GstObject, gst_object, G_TYPE_OBJECT);

 

G_DEFINE_TYPE (GstSignalObject, gst_signal_object, G_TYPE_OBJECT);

 

G_DEFINE_TYPE_WITH_CODE (GstPad, gst_pad, GST_TYPE_OBJECT, _do_init);

 

G_DEFINE_TYPE (GstPadTemplate, gst_pad_template, GST_TYPE_OBJECT);

 

G_DEFINE_TYPE_WITH_CODE (GstElementFactory, gst_element_factory,
    GST_TYPE_PLUGIN_FEATURE, _do_init);

 

GstElement is defined explicitly:

GType
gst_element_get_type (void)
{
  static volatile gsize gst_element_type = 0;

  if (g_once_init_enter (&gst_element_type)) {
    GType _type;
    static const GTypeInfo element_info = {
      sizeof (GstElementClass),
      gst_element_base_class_init,
      gst_element_base_class_finalize,
      (GClassInitFunc) gst_element_class_init,
      NULL,
      NULL,
      sizeof (GstElement),
      0,
      (GInstanceInitFunc) gst_element_init,
      NULL
    };

    _type = g_type_register_static (GST_TYPE_OBJECT, "GstElement",
        &element_info, G_TYPE_FLAG_ABSTRACT);

    _gst_elementclass_factory =
        g_quark_from_static_string ("GST_ELEMENTCLASS_FACTORY");
    g_once_init_leave (&gst_element_type, _type);
  }
  return gst_element_type;
}
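The g_once_init_enter()/g_once_init_leave() pair above makes the registration run exactly once even if several threads call gst_element_get_type() concurrently; the real GLib version blocks the losing threads until the winner finishes. A simplified C11 sketch of the same lazy-initialization idea, using an atomic compare-and-swap instead (all names hypothetical; register_my_type() stands in for g_type_register_static()):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

static atomic_uintptr_t my_type = 0;
static int registrations = 0;        /* counts how often we registered */

static uintptr_t register_my_type(void)
{
    registrations++;
    return 0xE1;                     /* an arbitrary nonzero type id */
}

uintptr_t my_get_type(void)
{
    uintptr_t t = atomic_load(&my_type);
    if (t == 0) {                    /* not registered yet */
        uintptr_t fresh = register_my_type();
        uintptr_t expected = 0;
        if (atomic_compare_exchange_strong(&my_type, &expected, fresh))
            t = fresh;               /* we won the race */
        else
            t = expected;            /* reuse the winner's id */
    }
    return t;
}
```

Unlike g_once_init_enter(), this CAS version lets a losing thread register a duplicate and then discard it; the GLib primitive instead parks the losers so the initializer truly runs once.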

 

GST_BOILERPLATE_FULL (GstBin, gst_bin, GstElement, GST_TYPE_ELEMENT, _do_init);

 

GST_BOILERPLATE_FULL (GstPipeline, gst_pipeline, GstBin, GST_TYPE_BIN,
    _do_init);

 

GstObject

 |____GstClock

G_DEFINE_TYPE (GstClock, gst_clock, GST_TYPE_OBJECT);

 

GstObject

|____GstTask

G_DEFINE_TYPE_WITH_CODE (GstTask, gst_task, GST_TYPE_OBJECT, _do_init);

GstObject

|____GstTaskPool

G_DEFINE_TYPE_WITH_CODE (GstTaskPool, gst_task_pool, GST_TYPE_OBJECT, _do_init);

 

 

GBoxed

  |____GstCaps

Defined explicitly:

GType
gst_caps_get_type (void)
{
  static GType gst_caps_type = 0;

  if (G_UNLIKELY (gst_caps_type == 0)) {
    gst_caps_type = g_boxed_type_register_static ("GstCaps",
        (GBoxedCopyFunc) gst_caps_copy_conditional,
        (GBoxedFreeFunc) gst_caps_unref);

    g_value_register_transform_func (gst_caps_type,
        G_TYPE_STRING, gst_caps_transform_to_string);
  }

  return gst_caps_type;
}

 

GstMiniObject

 |_____GstEvent

G_DEFINE_TYPE_WITH_CODE (GstEvent, gst_event, GST_TYPE_MINI_OBJECT, _do_init);

 

GstMiniObject

 |_____GstBuffer

G_DEFINE_TYPE_WITH_CODE (GstBuffer, gst_buffer, GST_TYPE_MINI_OBJECT, _do_init);

 

GstMiniObject

|_____GstBufferList

G_DEFINE_TYPE (GstBufferList, gst_buffer_list, GST_TYPE_MINI_OBJECT);

 

 

GType
gst_type_find_get_type (void)
{
  static GType typefind_type = 0;

  if (G_UNLIKELY (typefind_type == 0)) {
    typefind_type = g_pointer_type_register_static ("GstTypeFind");
  }
  return typefind_type;
}

 

GstObject
|____GstPluginFeature

           |_______GstTypeFindFactory

G_DEFINE_TYPE_WITH_CODE (GstTypeFindFactory, gst_type_find_factory,
    GST_TYPE_PLUGIN_FEATURE, _do_init);

 

 

    

 

ID  Title                           Short description
7   Using gst-inspect
6   Gstreamer design documents.     The original design documents; read them carefully
5   Definitions of the Gst components.
4   GstBus SubSystem
3   Gstreamer object hierarchy.
2   gst init.
1   GStreamer reference material.

 

GMutex / GCond

The following two functions:
void g_cond_signal (GCond *cond);
void g_cond_wait (GCond *cond, GMutex *mutex);
 

are used for thread synchronization.
"mutex" is short for "mutual exclusion": a mutex guarantees a thread exclusive access to shared data.

 

GMutex *mutex; // mutex variable

GCond *cond; // wait condition

 

CRITICAL_SECTION (on Windows) is a lightweight thread-synchronization object; compared with a mutex it is considerably more efficient.

A mutex can synchronize across processes, whereas a CRITICAL_SECTION is only valid within a single process.

 

A single process may contain several threads that share the same memory space, while each process has its own independent address space; inter-process communication needs dedicated mechanisms, which adds kernel overhead and lowers system performance. Threads are cheap by comparison: the kernel does not have to duplicate the process's memory space or file descriptors, which saves a great deal of CPU time and makes creating a thread dozens of times faster than creating a process. As a multitasking, concurrent style of programming, multithreading additionally (1) improves application response time, (2) makes better use of multi-CPU systems, and (3) improves program structure.
First, the distinction between Pthread and GThread. Pthread is the POSIX thread standard, which defines an API for creating and manipulating threads. GThread is the threading part of GLib; GLib is the low-level core library underlying GTK+ and GNOME, a general-purpose, practical, lightweight C library. Since this software has a GTK+ user interface, GThread is the natural choice.
Files that use GThread must be compiled with `pkg-config --cflags --libs gthread-2.0`.
(1) gboolean g_thread_supported(); /* test whether multithreading is supported */
(2) void g_thread_init (GThreadFunctions *vtable); /* initialize multithreading support */
(3) void gdk_threads_init (void); /* initialize GDK multithreading support */
(4) void gdk_threads_enter (void); /* enter the mutual-exclusion region */
(5) void gdk_threads_leave (void); /* leave the mutual-exclusion region */
(6) GThread * g_thread_create (GThreadFunc func, gpointer data, gboolean joinable, GError **error);
This creates a thread: func is the function the thread executes, data is the argument passed to it, joinable says whether the thread can be joined, and error receives the error code on failure.
(7) void g_thread_exit (gpointer retval); /* exit the thread; retval is the return value */
(8) GMutex *g_mutex_new (); /* return a new mutex */
(9) void g_mutex_lock(GMutex *mutex); /* lock */
(10) void g_mutex_unlock (GMutex *mutex); /* unlock */
(11) GCond* g_cond_new (); /* return a new condition variable */
(12) void g_cond_signal (GCond *cond); /* signal the condition cond */
(13) void g_cond_wait(GCond *cond, GMutex *mutex); /* wait on the condition cond */

 

#define g_mutex_new()            G_THREAD_UF (mutex_new,      ())
#define G_THREAD_UF(op, arglist)					\
    (*g_thread_functions_for_glib_use . op) arglist

#define g_mutex_new()            (*g_thread_functions_for_glib_use.mutex_new())

 

#define g_cond_new()             G_THREAD_UF (cond_new,       ())

#define g_cond_new()             (*g_thread_functions_for_glib_use.cond_new())

 

typedef gpointer (*GThreadFunc) (gpointer data);

typedef enum
{
  G_THREAD_PRIORITY_LOW,
  G_THREAD_PRIORITY_NORMAL,
  G_THREAD_PRIORITY_HIGH,
  G_THREAD_PRIORITY_URGENT
} GThreadPriority;

typedef struct _GThread         GThread;
struct  _GThread
{
  /*< private >*/
  GThreadFunc func;
  gpointer data;
  gboolean joinable;
  GThreadPriority priority;
};

typedef struct _GMutex          GMutex;
typedef struct _GCond           GCond;
typedef struct _GPrivate        GPrivate;
typedef struct _GStaticPrivate  GStaticPrivate;

typedef struct _GThreadFunctions GThreadFunctions;
struct _GThreadFunctions
{
  GMutex*  (*mutex_new)           (void);
  void     (*mutex_lock)          (GMutex               *mutex);
  gboolean (*mutex_trylock)       (GMutex               *mutex);
  void     (*mutex_unlock)        (GMutex               *mutex);
  void     (*mutex_free)          (GMutex               *mutex);
  GCond*   (*cond_new)            (void);
  void     (*cond_signal)         (GCond                *cond);
  void     (*cond_broadcast)      (GCond                *cond);
  void     (*cond_wait)           (GCond                *cond,
                                   GMutex               *mutex);
  gboolean (*cond_timed_wait)     (GCond                *cond,
                                   GMutex               *mutex,
                                   GTimeVal             *end_time);
  void      (*cond_free)          (GCond                *cond);
  GPrivate* (*private_new)        (GDestroyNotify        destructor);
  gpointer  (*private_get)        (GPrivate             *private_key);
  void      (*private_set)        (GPrivate             *private_key,
                                   gpointer              data);
  void      (*thread_create)      (GThreadFunc           func,
                                   gpointer              data,
                                   gulong                stack_size,
                                   gboolean              joinable,
                                   gboolean              bound,
                                   GThreadPriority       priority,
                                   gpointer              thread,
                                   GError              **error);
  void      (*thread_yield)       (void);
  void      (*thread_join)        (gpointer              thread);
  void      (*thread_exit)        (void);
  void      (*thread_set_priority)(gpointer              thread,
                                   GThreadPriority       priority);
  void      (*thread_self)        (gpointer              thread);
  gboolean  (*thread_equal)       (gpointer              thread1,
				   gpointer              thread2);
};
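GThreadFunctions is effectively a vtable: every threading call is dispatched through a global table of function pointers that g_thread_init() fills in. A stripped-down illustration of the same dispatch pattern, with one hypothetical entry:

```c
#include <assert.h>

/* One-entry analogue of GThreadFunctions. */
typedef struct {
    int (*mutex_trylock)(void *mutex);
} ThreadVTable;

/* A stand-in backend implementation, like what g_thread_init() installs. */
static int fake_trylock(void *mutex) { (void) mutex; return 1; }

/* The global table in use, like g_thread_functions_for_glib_use. */
static ThreadVTable vtable_in_use = { fake_trylock };

/* The analogue of G_THREAD_UF: dereference the table and call through. */
#define THREAD_UF(op, arglist) ((*vtable_in_use.op) arglist)
```

THREAD_UF(mutex_trylock, (m)) expands to a call through the table, exactly as G_THREAD_UF does below; swapping the backend only means filling the table with different function pointers.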

 

Some macros are defined to access these members:

/* shorthands for conditional and unconditional function calls */

#define G_THREAD_UF(op, arglist)					\
    (*g_thread_functions_for_glib_use . op) arglist
#define G_THREAD_CF(op, fail, arg)					\
    (g_thread_supported () ? G_THREAD_UF (op, arg) : (fail))
#define G_THREAD_ECF(op, fail, mutex, type)				\
    (g_thread_supported () ? 						\
      ((type(*)(GMutex*, const gulong, gchar const*))			\
      (*g_thread_functions_for_glib_use . op))				\
     (mutex, G_MUTEX_DEBUG_MAGIC, G_STRLOC) : (fail))

#ifndef G_ERRORCHECK_MUTEXES
# define g_mutex_lock(mutex)						\
    G_THREAD_CF (mutex_lock,     (void)0, (mutex))
# define g_mutex_trylock(mutex)						\
    G_THREAD_CF (mutex_trylock,  TRUE,    (mutex))
# define g_mutex_unlock(mutex)						\
    G_THREAD_CF (mutex_unlock,   (void)0, (mutex))
# define g_mutex_free(mutex)						\
    G_THREAD_CF (mutex_free,     (void)0, (mutex))
# define g_cond_wait(cond, mutex)					\
    G_THREAD_CF (cond_wait,      (void)0, (cond, mutex))
# define g_cond_timed_wait(cond, mutex, abs_time)			\
    G_THREAD_CF (cond_timed_wait, TRUE,   (cond, mutex, abs_time))
#else /* G_ERRORCHECK_MUTEXES */
# define g_mutex_lock(mutex)						\
    G_THREAD_ECF (mutex_lock,    (void)0, (mutex), void)
# define g_mutex_trylock(mutex)						\
    G_THREAD_ECF (mutex_trylock, TRUE,    (mutex), gboolean)
# define g_mutex_unlock(mutex)						\
    G_THREAD_ECF (mutex_unlock,  (void)0, (mutex), void)
# define g_mutex_free(mutex)						\
    G_THREAD_ECF (mutex_free,    (void)0, (mutex), void)
# define g_cond_wait(cond, mutex)					\
    (g_thread_supported () ? ((void(*)(GCond*, GMutex*, gulong, gchar*))\
      g_thread_functions_for_glib_use.cond_wait)			\
        (cond, mutex, G_MUTEX_DEBUG_MAGIC, G_STRLOC) : (void) 0)
# define g_cond_timed_wait(cond, mutex, abs_time)			\
    (g_thread_supported () ?						\
      ((gboolean(*)(GCond*, GMutex*, GTimeVal*, gulong, gchar*))	\
        g_thread_functions_for_glib_use.cond_timed_wait)		\
          (cond, mutex, abs_time, G_MUTEX_DEBUG_MAGIC, G_STRLOC) : TRUE)
#endif /* G_ERRORCHECK_MUTEXES */

#if defined(G_THREADS_ENABLED) && defined(G_THREADS_MANDATORY)
#define g_thread_supported()     1
#else
#define g_thread_supported()    (g_threads_got_initialized)
#endif
#define g_mutex_new()            G_THREAD_UF (mutex_new,      ())
#define g_cond_new()             G_THREAD_UF (cond_new,       ())
#define g_cond_signal(cond)      G_THREAD_CF (cond_signal,    (void)0, (cond))
#define g_cond_broadcast(cond)   G_THREAD_CF (cond_broadcast, (void)0, (cond))
#define g_cond_free(cond)        G_THREAD_CF (cond_free,      (void)0, (cond))
#define g_private_new(destructor) G_THREAD_UF (private_new, (destructor))
#define g_private_get(private_key) G_THREAD_CF (private_get, \
                                                ((gpointer)private_key), \
                                                (private_key))
#define g_private_set(private_key, value) G_THREAD_CF (private_set, \
                                                       (void) (private_key = \
                                                        (GPrivate*) (value)), \
                                                       (private_key, value))
#define g_thread_yield()              G_THREAD_CF (thread_yield, (void)0, ())

#define g_thread_create(func, data, joinable, error)			\
  (g_thread_create_full (func, data, 0, joinable, FALSE, 		\
                         G_THREAD_PRIORITY_NORMAL, error))