Objective-C Messaging
2016-05-16 08:38
Friday Q&A 2009-03-20: Objective-C Messaging
by Mike Ash
This article is also available in Chinese (translation by neoman).
Welcome back to another Friday Q&A. This week I'd like to take Joshua Pennington's idea and elaborate on a particular facet of last week's topic of the Objective-C runtime, namely messaging. How does messaging work, and what exactly does it do? Read on!
Definitions
Before we get started on the mechanisms, we need to define our terms. A lot of people are kind of unclear on exactly what a "method" is versus a "message", for example, but this is critically important for understanding how the messaging system works at the low level.
Method: an actual piece of code associated with a class, and which is given a particular name. Example:
- (int)meaning { return 42; }
Message: a name and a set of parameters sent to an object. Example: sending "meaning" and no parameters to object 0x12345678.
Selector: a particular way of representing the name of a message or method, represented as the type SEL. Selectors are essentially just opaque strings that are managed so that simple pointer equality can be used to compare them, to allow for extra speed. (The implementation may be different, but that's essentially how they look on the outside.) Example: @selector(meaning).
Message send: the process of taking a message and finding and executing the appropriate method.
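The selector definition can be made concrete with a small plain-C sketch. This is not the runtime's implementation: Sel, sel_intern, sel_equal, and the fixed-size table are hypothetical names invented for illustration. It only shows the idea that each distinct name is stored once, so two selectors for the same name are the same pointer.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for SEL: a pointer to a uniqued C string. */
typedef const char *Sel;

#define MAX_SELECTORS 64

static char *sel_table[MAX_SELECTORS]; /* every name registered so far */
static size_t sel_count;

/* Return the unique pointer for `name`, registering it on first use.
   Returns NULL if the table is full or allocation fails.
   Registered names are never freed in this sketch. */
Sel sel_intern(const char *name) {
    for (size_t i = 0; i < sel_count; i++) {
        if (strcmp(sel_table[i], name) == 0) {
            return sel_table[i];
        }
    }
    if (sel_count == MAX_SELECTORS) {
        return NULL;
    }
    size_t len = strlen(name) + 1;
    char *copy = malloc(len);
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, name, len);
    sel_table[sel_count++] = copy;
    return copy;
}

/* Once interned, comparing selectors is a single pointer comparison. */
int sel_equal(Sel a, Sel b) {
    return a == b;
}
```

The string comparison happens only once, at registration time. After that, every comparison is a pointer check, which is where the speed comes from.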
Methods
The next thing that we need to discuss is what exactly a method is at the machine level. From the definition, it's a piece of code given a name and associated with a particular class, but what does it actually end up creating in your application binary?
Methods end up being generated as straight C functions, with a couple of extra parameters. You probably know that self is passed as an implicit parameter; in the generated function it ends up being an explicit parameter. The lesser-known _cmd (which holds the selector of the message being sent) is a second such implicit parameter.
Writing a method like this:
- (int)foo:(NSString *)str { ...
Gets translated to a function like this:
int SomeClass_method_foo_(SomeClass *self, SEL _cmd, NSString *str) { ...
(The name mangling is just illustrative, and gcc doesn't actually generate a linker-visible symbol for methods at all.)
What, then, happens when we write some code like this?
int result = [obj foo:@"hello"];
The compiler ends up generating code that does the equivalent of this:
int result = ((int (*)(id, SEL, NSString *))objc_msgSend)(obj, @selector(foo:), @"hello");
I'm sorry, did you run off screaming there? I'll just wait a moment to give everybody some time to come to their senses and return....
What that ridiculous piece of code after the equals sign does is take the objc_msgSend function, defined as part of the Objective-C runtime, and cast it to a different type. Specifically, it casts it from a function that returns id and takes id, SEL, and variable arguments after that to a function that matches the prototype of the method being invoked.
To put it another way, the compiler generates code that calls objc_msgSend but with parameter and return value conventions matched to the method in question.
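The cast can be imitated in plain C. The following is only a sketch of the casting idea, not of objc_msgSend itself: Object, Sel, GenericFn, foo_impl, erased_foo, and call_foo are all hypothetical names. A function is handed around with its real prototype erased, and the call site casts it back to the prototype it believes is correct before calling.

```c
#include <string.h>

typedef struct Object Object;      /* hypothetical object type */
typedef const char *Sel;           /* hypothetical stand-in for SEL */
typedef void (*GenericFn)(void);   /* function pointer with the prototype erased */

struct Object { int dummy; };

/* A "method" compiled as a C function: self and _cmd come first. */
static int foo_impl(Object *self, Sel _cmd, const char *str) {
    (void)self;
    (void)_cmd;
    return (int)strlen(str);
}

/* Hands the method back with its real prototype erased. */
GenericFn erased_foo(void) {
    return (GenericFn)foo_impl;
}

/* What the call site does: cast back to the prototype it believes in,
   then call. Casting to the wrong prototype would be undefined behavior. */
int call_foo(Object *obj, const char *str) {
    int (*typed)(Object *, Sel, const char *) =
        (int (*)(Object *, Sel, const char *))erased_foo();
    return typed(obj, "foo:", str);
}
```

Nothing checks that the prototype in the cast matches the function actually stored. The caller simply has to get it right.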
Those readers who are really awake and didn't have their wits scared out of them have now noticed that the compiler needs to know the method's prototype even though all it has to work with is the message being sent. How does the compiler deal with this discrepancy? Quite simply, it cheats. It makes a guess at the method prototype based on the methods it can see from the declarations that it has parsed so far. If it can't find one, or there's a mismatch between the declarations it sees and the method that will actually be executed at runtime, Bad Things Happen. This is why Objective-C deals so poorly with multiple methods which have the same name but different argument/return types.
Messaging
A message send in code turns into a call to objc_msgSend, so what does that do? The high-level answer should be fairly apparent. Since that's the only function call present, it must look up the appropriate method implementation and then call it. Calling is easy: it just needs to jump to the appropriate address. But how does it look it up?
The Objective-C header runtime.h includes this as one of the members of the (now opaque, legacy) objc_class structure:
struct objc_method_list **methodLists OBJC2_UNAVAILABLE;
That struct is in turn defined:
struct objc_method_list {
    struct objc_method_list *obsolete OBJC2_UNAVAILABLE;
    int method_count OBJC2_UNAVAILABLE;
#ifdef __LP64__
    int space OBJC2_UNAVAILABLE;
#endif
    /* variable length structure */
    struct objc_method method_list[1] OBJC2_UNAVAILABLE;
} OBJC2_UNAVAILABLE;
Which is just declaring a variable-length struct holding objc_method structs. That one is in turn defined as:
struct objc_method {
    SEL method_name OBJC2_UNAVAILABLE;
    char *method_types OBJC2_UNAVAILABLE;
    IMP method_imp OBJC2_UNAVAILABLE;
} OBJC2_UNAVAILABLE;
So even though we're not supposed to touch these structs (don't worry, all the functionality for manipulating them is provided through functions elsewhere in the header), we can still see what the runtime considers a method to be. It's a name (in the form of a selector), a string containing argument/return types (look up the @encode directive for more information about this one), and an IMP, which is just a function pointer:
typedef id (*IMP)(id, SEL, ...);
Now we know enough to see how this stuff works. All objc_msgSend has to do is look up the class of the object you give it (available by just dereferencing it and obtaining the isa member that all objects contain), get the class's method list, and search through the method list until a method with the right selector is found. If nothing is there, search the superclass's list, and so on up the hierarchy. Once the right method is found, jump to the IMP of the method in question.
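The uncached lookup just described can be sketched in plain C. This is a simplified model under stated assumptions, not the runtime's code: Class, Object, Method, Imp, lookup_imp, msg_send, Base, and Derived are hypothetical names, methods take no extra arguments, and selectors are compared with strcmp only to keep the sketch self-contained. Returning -1 for a missing method is a placeholder for this sketch only.

```c
#include <stddef.h>
#include <string.h>

typedef struct Class Class;
typedef struct Object Object;
typedef const char *Sel;                      /* hypothetical SEL stand-in */
typedef int (*Imp)(Object *self, Sel _cmd);   /* simplified: no extra arguments */

typedef struct {
    Sel name;   /* selector */
    Imp imp;    /* implementation */
} Method;

struct Class {
    const Class *super_class;  /* NULL for a root class */
    const Method *methods;
    size_t method_count;
};

struct Object {
    const Class *isa;          /* every object starts with a pointer to its class */
};

/* Search the class, then each superclass, until the selector is found. */
Imp lookup_imp(const Class *cls, Sel sel) {
    for (; cls != NULL; cls = cls->super_class) {
        for (size_t i = 0; i < cls->method_count; i++) {
            if (strcmp(cls->methods[i].name, sel) == 0) {
                return cls->methods[i].imp;
            }
        }
    }
    return NULL;
}

/* Find the method via the object's isa pointer and call it.
   Returns -1 when no class in the hierarchy implements the selector. */
int msg_send(Object *obj, Sel sel) {
    Imp imp = lookup_imp(obj->isa, sel);
    return imp ? imp(obj, sel) : -1;
}

static int base_meaning(Object *self, Sel _cmd) { (void)self; (void)_cmd; return 42; }
static int base_size(Object *self, Sel _cmd)    { (void)self; (void)_cmd; return 1; }
static int derived_size(Object *self, Sel _cmd) { (void)self; (void)_cmd; return 2; }

static const Method base_methods[] = { {"meaning", base_meaning}, {"size", base_size} };
static const Method derived_methods[] = { {"size", derived_size} };

/* Derived overrides "size" and inherits "meaning" from Base. */
const Class Base = { NULL, base_methods, 2 };
const Class Derived = { &Base, derived_methods, 1 };
```

Sending "size" to a Derived object finds the override in the first list searched, while "meaning" is only found after falling through to Base.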
One more detail needs to be considered here. The above procedure would work but it would be extremely slow. objc_msgSend only takes about a dozen CPU cycles to execute on the x86 architecture, which makes it clear that it's not going through this lengthy procedure every single time you call it. The clue to this is another objc_class member:
struct objc_cache *cache OBJC2_UNAVAILABLE;
And that's defined farther down:
struct objc_cache {
    unsigned int mask /* total = mask + 1 */ OBJC2_UNAVAILABLE;
    unsigned int occupied OBJC2_UNAVAILABLE;
    Method buckets[1] OBJC2_UNAVAILABLE;
};
This defines a hash table that stores Method structs, using the selector as the key. The way objc_msgSend really works is by first hashing the selector and looking it up in the class's method cache. If it's found, which it nearly always will be, it can jump straight to the method implementation with no further fuss. Only if it's not found does it have to do the more laborious lookup, at the end of which it inserts an entry into the cache so that future lookups can be fast.
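The cache-then-fallback behavior can also be sketched in plain C. Everything here is an assumption made for illustration: the names (Cache, Class, class_init, cached_lookup, slow_lookup, sel_hash), the fixed cache size, the address-based hash, and the linear probing are not taken from the runtime. The slow_lookups counter exists only so the effect of the cache can be observed.

```c
#include <stddef.h>
#include <stdint.h>

typedef const char *Sel;          /* hypothetical SEL stand-in, assumed uniqued */
typedef int (*Imp)(void);         /* simplified implementation pointer */

typedef struct { Sel name; Imp imp; } Method;

#define CACHE_SIZE 8              /* must be a power of two */

typedef struct {
    unsigned int mask;            /* total = mask + 1 */
    unsigned int occupied;
    Method buckets[CACHE_SIZE];
} Cache;

typedef struct {
    const Method *methods;
    size_t method_count;
    Cache cache;
    unsigned int slow_lookups;    /* counts trips through the slow path */
} Class;

void class_init(Class *cls, const Method *methods, size_t count) {
    *cls = (Class){0};
    cls->methods = methods;
    cls->method_count = count;
    cls->cache.mask = CACHE_SIZE - 1;
}

/* Hash a uniqued selector by its address. */
static unsigned int sel_hash(Sel sel, unsigned int mask) {
    return (unsigned int)(((uintptr_t)sel >> 2) & mask);
}

/* The laborious path: scan the method list. */
static Imp slow_lookup(Class *cls, Sel sel) {
    cls->slow_lookups++;
    for (size_t i = 0; i < cls->method_count; i++) {
        if (cls->methods[i].name == sel) {
            return cls->methods[i].imp;
        }
    }
    return NULL;
}

Imp cached_lookup(Class *cls, Sel sel) {
    Cache *c = &cls->cache;
    unsigned int start = sel_hash(sel, c->mask);
    /* Linear probing: stop at the selector or at the first empty bucket. */
    for (unsigned int n = 0; n <= c->mask; n++) {
        Method *b = &c->buckets[(start + n) & c->mask];
        if (b->name == sel) {
            return b->imp;                  /* cache hit */
        }
        if (b->name == NULL) {              /* cache miss: fill this bucket */
            Imp imp = slow_lookup(cls, sel);
            if (imp != NULL) {
                b->name = sel;
                b->imp = imp;
                c->occupied++;
            }
            return imp;
        }
    }
    return slow_lookup(cls, sel);           /* cache full: no caching */
}

const char sel_meaning[] = "meaning";
const char sel_size[] = "size";
static int meaning_impl(void) { return 42; }
static int size_impl(void) { return 7; }
const Method demo_methods[] = { {sel_meaning, meaning_impl}, {sel_size, size_impl} };
```

After class_init(&cls, demo_methods, 2), the first cached_lookup for a selector takes the slow path once, and repeated lookups for the same selector leave slow_lookups unchanged.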
(There is actually one more detail beyond this which ends up being extremely important: what happens when no method can be found for a given selector. But that one is so important that it deserves its own post, so look for it next week.)
Conclusion
That wraps up this week's edition. Come back next week for more. Have a question? Think Objective-C's messaging system should be done differently? Post below.