Chapter 2. COM Fundamentals

Now that we've covered the basics of COM components, it is time to look at COM in greater detail. Unfortunately, Visual Basic hides a lot of the details, and Visual J++ (VJ++), although exposing more details than VB, still abstracts COM to a large degree. This leaves Visual C++. I have always maintained that COM has its truest representation in C++, and to truly understand COM, you need to understand how it is implemented in this language.

That said, I realize that there are many developers in the world who lead a perfectly satisfying professional life without programming in C++. But, much like learning Latin, your understanding of your present language can be greatly enhanced by a study of its sometimes more verbose and complex predecessor.

The Role of IUnknown

Let's focus again on interfaces. A number of times in the last chapter, I said that all interfaces must inherit from IUnknown. I said this because IUnknown contains three methods that every interface is required to support. The three methods are as follows:

  • QueryInterface()

  • AddRef()

  • Release()

In some languages, particularly VB, you can never call these methods directly; therefore, you can easily forget IUnknown exists. However, every interface you ever use must inherit from IUnknown; because of this, it is guaranteed to support these three methods. This is important, because even if you don't explicitly call these methods, they are called behind the scenes for the following two reasons:

  • To control the lifetime of the objects (AddRef(), Release())

  • To enable clients to get one or many interfaces from objects (QueryInterface())

This becomes apparent when you take a look at how objects are created. As I said, IUnknown is largely abstracted away in languages like VB, so you need to look at the C++ Win32 way of instantiating an object and getting an interface. In VB, the lines

Dim MyCalc as VBCalcLib.Calc
Set MyCalc=new VBCalcLib.Calc

are equivalent to the VJ++ lines

ICalc MyCalc; 
MyCalc = new Calc();

which are equivalent to the C++

ICalc * pICalc;
const GUID CLSID_Calc =
    { 0x638094e0,0x758f,0x11d1,{ 0x83,0x66,0x00,0x00,0xe8,0x3b,0x6e,0xf3} } ;
const GUID IID_ICalc = 
    { 0x638094e5,0x758f,0x11d1,{ 0x83,0x66,0x00,0x00,0xe8,0x3b,0x6e,0xf3} } ;

CoCreateInstance(CLSID_Calc, 0,CLSCTX_INPROC_SERVER , IID_ICalc,
 (void**)&pICalc);

Some of the preceding syntax might be unfamiliar to non-C++ developers, but it is not complicated. If you cut it down to its most basic level, you end up with the following:

CoCreateInstance(CLSID_Calc, NULL, CLSCTX_ALL, IID_ICalc, (void**)&pICalc);

Remember, our type library definition of ICalc was:

    [
        uuid(638094E5-758F-11d1-8366-0000E83B6EF3)
    ]
    interface ICalc : IUnknown
    {

Our type library definition for the Calc coclass was:

    [
        uuid(638094E0-758F-11d1-8366-0000E83B6EF3)
    ]
    coclass Calc

Here are the C++ lines:

const GUID CLSID_Calc =
    { 0x638094e0,0x758f,0x11d1,{ 0x83,0x66,0x00,0x00,0xe8,0x3b,0x6e,0xf3} } ;
const GUID IID_ICalc =
    { 0x638094e5,0x758f,0x11d1,{ 0x83,0x66,0x00,0x00,0xe8,0x3b,0x6e,0xf3} } ;

These lines show how the GUIDs for the coclass Calc and the interface ICalc are written in C++.
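To make the correspondence between the two notations concrete, here is a small, portable sketch that parses the uuid string form and compares it with the C++ initializer form. The Guid struct and the parseGuid() function are hypothetical stand-ins written for this illustration (the real GUID type comes from the Windows headers); only the field layout is meant to mirror it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical stand-in for the Windows GUID struct (same field layout).
struct Guid {
    std::uint32_t Data1;
    std::uint16_t Data2;
    std::uint16_t Data3;
    std::uint8_t  Data4[8];
};

inline bool operator==(const Guid& a, const Guid& b) {
    return a.Data1 == b.Data1 && a.Data2 == b.Data2 && a.Data3 == b.Data3 &&
           std::memcmp(a.Data4, b.Data4, sizeof a.Data4) == 0;
}

// Parses "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX" as it appears in uuid(...).
inline Guid parseGuid(const std::string& text) {
    std::string hex;
    for (char c : text)
        if (c != '-') hex += c;
    assert(hex.size() == 32);               // 32 hex digits once dashes are gone

    auto field = [&hex](std::size_t pos, std::size_t len) {
        return std::stoul(hex.substr(pos, len), nullptr, 16);
    };

    Guid g{};
    g.Data1 = static_cast<std::uint32_t>(field(0, 8));
    g.Data2 = static_cast<std::uint16_t>(field(8, 4));
    g.Data3 = static_cast<std::uint16_t>(field(12, 4));
    for (std::size_t i = 0; i < 8; ++i)     // last 16 digits become 8 single bytes
        g.Data4[i] = static_cast<std::uint8_t>(field(16 + 2 * i, 2));
    return g;
}

// The same values the chapter uses for the Calc coclass and ICalc interface.
const Guid CLSID_Calc = {0x638094e0, 0x758f, 0x11d1,
                         {0x83, 0x66, 0x00, 0x00, 0xe8, 0x3b, 0x6e, 0xf3}};
const Guid IID_ICalc  = {0x638094e5, 0x758f, 0x11d1,
                         {0x83, 0x66, 0x00, 0x00, 0xe8, 0x3b, 0x6e, 0xf3}};
```

Parsing the string from the ICalc type library entry yields a value equal to IID_ICalc; the two notations are simply different spellings of the same 128-bit number.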

The following code simply says, "COM, please create an instance of the object whose GUID is CLSID_Calc, get me the interface of that object whose GUID is IID_ICalc, and put that interface into the variable pICalc":

   CoCreateInstance(CLSID_Calc, _ , _ , IID_ICalc, (void**)&pICalc);

The following code then becomes possible:

int result;
pICalc->Add(1, 2, &result);  // the [out,retval] value comes back through the last argument

Release()

When you are finished using the object, how do you tell the object that you're done with it? After all, your application might need Calc's services for only a few milliseconds, but it might run for several hours. You don't want Calc hanging around, unused, wasting resources all that time; so there must be some way to let Calc know it's time to go. In VB or VJ++, this is easy to do; you can do one of the following two things:

  • Do nothing and the language eventually performs some form of garbage collection and cleans up the object for you.

  • Get rid of the object by saying the following:

In VB:

   Set MyCalc= Nothing

In J++:

   MyCalc = null;

or

   com.ms.com.ComLib.release(MyCalc);

Implementing Release()

Let's turn our attention to the implementation of Release(). When it comes to releasing an object, different languages impose different degrees of responsibility on the object developer. In C++, for example, the language has no inherent garbage collection, so the developer must deallocate the object himself. Unlike in many 4GLs, the assignment operator in C++ has no special meaning for interface pointers. So, the following line means only that a 0 is placed in the pointer variable pICalc:

   pICalc=NULL; //BAD!

This does not inform the Calc object of anything; in fact, you have made an orphan out of the object because you no longer hold a reference to it. How, then, can you tell Calc that its services are no longer required? Because the ICalc interface you hold is inherited from IUnknown, it must support a function called Release(). This is exactly what Release() does: it informs the object that someone has just lost interest in it. Therefore, to ask your object to destroy itself when you are done with it, you simply use the following code:

   pICalc->Release();

The object now destroys itself. If you wonder how IUnknown got involved with ICalc, just remember that ICalc is inherited from IUnknown. Look again at the type library in Listing 2.1.

Example 2.1.  Type Library

    [
        object,
        uuid(638094E5-758F-11d1-8366-0000E83B6EF3),
        oleautomation,
        helpstring("ICalc Interface"),
        pointer_default(unique)
    ]
    interface ICalc : IUnknown
    {
    [id(1), helpstring("method Add")] HRESULT Add([in] int x, [in] int y,
                                                  [out,retval] int * r );
    [id(2), helpstring("method Divide")] HRESULT Divide([in] int x, [in] int y,
                                                        [out,retval] int * r );
    };

Remember that MIDL.EXE was run on this IDL file, and a header file containing an ICalc proxy class was generated. This header file is to be included in the client application so that it is possible to declare ICalc pointers. Thus, the following line is possible:

ICalc * pICalc;

This line is possible because just above it in the source file is the following line, which includes the MIDL-generated header file:

#include "calc.h" //MIDL generated this from the type library

If you look in this header file, you will find the code shown in Listing 2.2.

Example 2.2.  ICalc Prototype as Seen in MIDL-Generated File, CALC.H

MIDL_INTERFACE("638094E5-758F-11d1-8366-0000E83B6EF3")
    ICalc : public IUnknown
    {
    public:
        virtual /* [helpstring][id] */ HRESULT STDMETHODCALLTYPE Add(
            /* [in] */ int x,
            /* [in] */ int y,
            /* [retval][out] */ int __RPC_FAR *r) = 0;

        virtual /* [helpstring][id] */ HRESULT STDMETHODCALLTYPE Divide(
            /* [in] */ int x,
            /* [in] */ int y,
            /* [retval][out] */ int __RPC_FAR *r) = 0;


    } ;
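Notice what the generated prototype implies for a C++ caller: the method's return value is an HRESULT status, and the [retval] answer comes back through the final pointer argument. The following is a minimal, portable sketch of that calling pattern. HResultLike, kOk, kFail, ICalcLike, and CalcImpl are hypothetical names invented for this illustration; they stand in for HRESULT, its status codes, and the real interface, and IUnknown is left out to keep the focus on the out parameter.

```cpp
#include <cassert>

// Hypothetical, portable stand-ins for HRESULT and two of its values.
using HResultLike = long;
constexpr HResultLike kOk = 0;      // plays the role of a success code
constexpr HResultLike kFail = -1;   // plays the role of a failure code

// Shape of the generated class, minus IUnknown and the Windows macros.
struct ICalcLike {
    virtual HResultLike Add(int x, int y, int* r) = 0;
    virtual HResultLike Divide(int x, int y, int* r) = 0;
protected:
    ~ICalcLike() = default;
};

class CalcImpl final : public ICalcLike {
public:
    HResultLike Add(int x, int y, int* r) override {
        if (r == nullptr) return kFail;
        *r = x + y;                 // the answer travels through the out parameter
        return kOk;                 // the return value only reports success
    }
    HResultLike Divide(int x, int y, int* r) override {
        if (r == nullptr || y == 0) return kFail;
        *r = x / y;
        return kOk;
    }
};

// What a VB-style "result = calc.Add(1, 2)" has to do underneath.
inline void demoOutParameter() {
    CalcImpl impl;
    ICalcLike* calc = &impl;

    int result = 0;
    HResultLike hr = calc->Add(1, 2, &result);
    assert(hr == kOk && result == 3);

    hr = calc->Divide(1, 0, &result);
    assert(hr == kFail && result == 3);   // a failed call leaves result untouched
    (void)hr;
}
```

Languages such as VB hide this pattern and present the [retval] argument as an ordinary function result.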

There is no escape from the unknown (or IUnknown). Every interface inherits from IUnknown, so every interface must support AddRef(), Release(), and QueryInterface(). Normally, however, every interface just delegates all of its method calls to the derived object as shown in Figure 2.1.

The point is that every object supports and implements the methods of IUnknown, and the methods of IUnknown can be called from any interface the object supports.

This means that every COM object ever written must know what to do when Release() is called on any of its interfaces. In the C++ case, this responsibility often falls on the developer (unless he is using the Active Template Library [ATL]), and it is typically addressed as shown in Listing 2.3.

Example 2.3.  A Typical Implementation of Release()

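ULONG SomeObject::Release()
{
    // One fewer client holds a reference to this object.
    ULONG count = --m_ObjectCount;
    if(count==0)
        delete this;   // no references remain, so the object destroys itself
    return count;      // never touch members after delete this
}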

In VB and VJ++, this code is generated automatically, and the developer doesn't have to write it.

AddRef()

Another method of IUnknown is AddRef(). It is just as critical, but not much more complex. AddRef() is the complementary function to Release(). If Release() is called when you want to notify an object that there is one less outstanding reference to it, AddRef() is called to let an object know there is one more. To be more specific, Release() and AddRef() are called to keep track of the number of references held by a client (or clients) to a given object. Basically, whenever you (or COM) give out an interface, AddRef() must be called to let the object know that there is a client somewhere that holds a reference to it.

Figure 2.1. A call to a method in an abstract base class trickles down to the implementation in the derived class.

If this is true, you might wonder why we didn't call AddRef() in the earlier client code. AddRef() is actually called; it just happens automatically inside CoCreateInstance(). If you then share your interface in any way, however, as in the following:

ICalc *pICalc, *pICalc2;
CoCreateInstance(CLSID_COMCalc, _ , _ , IID_ICalc, (void**)&pICalc);
pICalc2=pICalc;

then you are supposed to call AddRef() like this:

pICalc->AddRef();

You do this to let the underlying Calc object know that there is another entity (pICalc2) that holds a reference to it, not just one. If this is the case, the following line does not cause the object to be released, because every AddRef() must be balanced by one complementary Release():

pICalc->Release();

Release() must be called again to balance the number of AddRef()s called. In this way, an object can keep a reference count so as to track the number of clients the object has. Usually, the reference count is kept in an internal variable of the object and is incremented whenever AddRef() is called and decremented whenever Release() is called. An object knows it can destroy itself when this internal variable, this reference count, becomes 0.
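The balancing act described above can be traced in a few lines. This is a minimal sketch and not real COM: CountedCalc is a hypothetical class that implements only the counting contract, and the destroyed flag is an extra I added so that the moment of destruction can be observed from outside.

```cpp
#include <cassert>

// Hypothetical ref-counted object; mirrors the AddRef()/Release() contract only.
class CountedCalc {
public:
    // 'destroyed' lets the caller observe when the object deletes itself.
    explicit CountedCalc(bool* destroyed) : destroyed_(destroyed) {}

    unsigned long AddRef() { return ++refs_; }

    unsigned long Release() {
        assert(refs_ > 0);              // more Release() than AddRef() calls is a bug
        unsigned long left = --refs_;
        if (left == 0) {
            *destroyed_ = true;
            delete this;                // no members may be touched after this
        }
        return left;
    }

private:
    ~CountedCalc() = default;           // only Release() may destroy the object
    unsigned long refs_ = 1;            // creation hands out the first reference
    bool* destroyed_;
};

inline void demoBalancedCalls() {
    bool destroyed = false;
    CountedCalc* pICalc = new CountedCalc(&destroyed);  // count is 1

    CountedCalc* pICalc2 = pICalc;      // a second holder of the same object
    pICalc->AddRef();                   // count is 2

    pICalc->Release();                  // count is 1: the object must survive
    assert(!destroyed);

    pICalc2->Release();                 // count is 0: the object deletes itself
    assert(destroyed);
}
```

Making the destructor private is one way to ensure that the only route to destruction is a balanced sequence of Release() calls.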

Guidelines for Reference Counts and Object Lifetime

A client is solely responsible for the lifetime of an object. Always remember that AddRef() and Release() are member functions implemented by the object itself; they are not COM system functions. They are available to your client only after the object is instantiated and one of its interfaces is delivered to the client. When a client calls AddRef() on any interface, the object should take note of it and increment its internal reference count by one. If Release() is called, the object should decrement its reference count by one. AddRef() never initiates any other kind of action, creation or otherwise. It simply means add one to the reference. Likewise, Release() does nothing more than decrement the reference count until the reference count is 0. When this happens, the object should destroy itself because a reference count of 0 indicates that no clients hold any interface pointers to it.

The following are some guidelines for when these functions should be called:

  • The client must call AddRef() when the client does the following:

    Places the interface pointer in another variable.

    Passes an interface pointer by reference as an [in,out] argument to a method.

  • AddRef() is called implicitly on an object when the following occurs:

    QueryInterface() is called and returns the requested interface. Note that when an object is instantiated QueryInterface() is implicitly called by the object's class factory during its creation.

  • A client should call Release when the following occurs:

    It is finished using an interface it QueryInterfaced for.

    It is finished using an interface given to it when the client instantiated an object using a COM creation API, such as CoCreateInstance().

    It is finished using an interface passed to it as an output argument from a method.
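In C++, one way to make these rules hard to forget is to let a small wrapper class apply them for you. The sketch below is a hedged illustration of that idea only: RefPtr and Counted are hypothetical names invented here, and RefPtr is not ATL's smart pointer class, just the same technique in miniature.

```cpp
#include <cassert>
#include <utility>

// Hypothetical ref-counted object used only to exercise the wrapper.
struct Counted {
    int refs = 1;                       // creation hands out the first reference
    int* live;                          // external counter of living objects
    explicit Counted(int* liveCount) : live(liveCount) { ++*live; }
    void AddRef() { ++refs; }
    void Release() { if (--refs == 0) { --*live; delete this; } }
};

// Minimal smart pointer that applies the AddRef()/Release() guidelines.
template <typename T>
class RefPtr {
public:
    RefPtr() = default;
    // Adopts a pointer whose reference the caller already owns (no AddRef).
    explicit RefPtr(T* adopted) : p_(adopted) {}
    // Copying places the pointer in another variable, so AddRef() is due.
    RefPtr(const RefPtr& other) : p_(other.p_) { if (p_) p_->AddRef(); }
    RefPtr(RefPtr&& other) noexcept : p_(std::exchange(other.p_, nullptr)) {}
    RefPtr& operator=(RefPtr other) noexcept { std::swap(p_, other.p_); return *this; }
    // Going out of scope means "finished with the interface": Release().
    ~RefPtr() { if (p_) p_->Release(); }

    T* operator->() const { return p_; }
    T* get() const { return p_; }
private:
    T* p_ = nullptr;
};

inline void demoRefPtr() {
    int live = 0;
    {
        RefPtr<Counted> a(new Counted(&live));
        assert(live == 1 && a->refs == 1);
        {
            RefPtr<Counted> b = a;      // copy: AddRef() happens automatically
            assert(a->refs == 2);
        }                               // b goes away: Release()
        assert(a->refs == 1 && live == 1);
    }                                   // a goes away: the last Release()
    assert(live == 0);
}
```

The design choice is to tie each rule to a C++ language event: copy construction triggers AddRef(), and destruction triggers Release(), so the counts stay balanced even when a function returns early.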

This is traditional COM. COM+, however, throws a big wrench in all this. Lifetimes are no longer so clear; the reference counting rules should still be followed, but COM+ might elect to keep an object alive even when there are no longer any clients for it and its reference count is zero. I'll explain exactly how this works in Appendix C, "Object Pooling."

QueryInterface()

AddRef() and Release() are used by the client application (and COM) to manage the lifetime of an object. It is really only in C++ that you deal with these methods directly, and even then you can avoid calling these functions by using the ATL and what are called smart pointers. At this point, don't be concerned about understanding all the possible rules regarding when to call AddRef() and Release(). They are relatively straightforward, and in many development environments, you don't even see these functions. What is important is that you understand the rules generally. Let's move on to talk about the final method of IUnknown, QueryInterface().

QueryInterface() is arguably the most important of the three functions of IUnknown, and you will understand its use if I introduce two main tenets of COM:

  • The only way a client can ever interact with an object is through one of its interfaces.

  • The only thing a client can hold is a reference to one of an object's interfaces.

Client applications, be they VB, VJ++, or VC++, can never, ever have a reference or pointer to the object itself; rather, they can only have a reference to one of the interfaces of the object. This makes sense if you begin to look at interfaces for what they really are: channels of communication. An interface is like a radio frequency, and your object is the AM/FM dial that has different programs on different frequencies. When you listen to a talk show during your drive to work, you don't imagine that the host is sitting in the car with you. You realize that this station (or interface) is not the actual show, but a channel into the show, which can be in a far-off city or only two miles away. Either way, you're not concerned where the show is, nor do you care how the show is put together (producer, director, sound engineer, audio equipment, and so on). When you tire of the show, you can always change the station.

This is where QueryInterface() comes in. QueryInterface() is your tuning dial, allowing you to change the communications channel. In the same way you move to another station up or down from whatever frequency you're tuned in to, you can move from any one interface your object supports to any other interface.

QueryInterface() is a function available on any interface that allows you to request any other interface that the object supports.

At first glance, our Calc object appears to support only one interface, ICalc. In actuality, COMCalc supports both ICalc and IUnknown because ICalc is inherited from IUnknown. This brings us to another tenet of COM:

  • Every object, without exception, supports the IUnknown interface.

This is because every interface of an object must be inherited from IUnknown; and because the base object is ultimately responsible for implementing all the methods of all its interfaces, it is also responsible for implementing AddRef(), Release(), and QueryInterface(). By doing this, the object ends up supporting IUnknown. Let's prove this. First, recall the following code:

ICalc * pICalc;
IUnknown * pIUnk;

CoCreateInstance(CLSID_COMCalc, 0,CLSCTX_INPROC_SERVER , IID_ICalc,
(void**)&pICalc);

Now, let's use QueryInterface on the interface you have (ICalc) and ask the object if it supports IUnknown:

pICalc->QueryInterface(IID_IUnknown, (void**)&pIUnk);

You can be sure the object supports IUnknown (all objects must); so upon returning from this function, pIUnk holds an IUnknown reference to object COMCalc. You can't really do much with an IUnknown reference except control the object's lifetime and query for another interface. As you'll see, there are times when an IUnknown reference to an object is a useful thing, but for the most part you don't often ask for it.

Interfaces and Their GUID Identifiers

IID_IUnknown is a GUID that is declared in the Windows header files. The GUID for IUnknown—and in fact, the GUID for any interface—never changes after it's published. Think of the GUID of an interface as its Social Security Number.

Inheriting from IUnknown

ICalc has the QueryInterface() method because it is inherited from IUnknown, like all interfaces. Don't get confused: there is no contradiction between ICalc being inherited from IUnknown and COMCalc supporting both IUnknown and ICalc. It is because ICalc inherits from IUnknown that COMCalc supports IUnknown. And it is because ICalc inherits from IUnknown that it has the QueryInterface() function that can be used by the client to query the object for another interface it supports.

Using QueryInterface()

Let's use QueryInterface() in a more common context. If you recall our earlier Calc example in Chapter 1, "COM+: An Evolution," we proposed that the object have two interfaces, ICalc and IFinancial. In this way, the object can partition its functionality into two distinct groups or interfaces, each of which contains methods grouped by functionality. If the client is interested in basic arithmetic functions, it can ask for ICalc. If it is interested in financial functions, it can ask for IFinancial. If your COMCalc supports both interfaces, the IDL file would look like Listing 2.4.

Example 2.4.  IDL File for COMCalc

import "oaidl.idl";
import "ocidl.idl";

    [
        object,
        uuid(638094E5-758F-11d1-8366-0000E83B6EF3),
        oleautomation,
        helpstring("ICalc Interface"),
        pointer_default(unique)
    ]
    interface ICalc : IUnknown
    {
        [id(1), helpstring("method Add")] HRESULT Add([in] int x, [in] int y,
                                         [out,retval] int * r );
        [id(2), helpstring("method Divide")] HRESULT Divide([in] int x,
                                         [in] int y, [out,retval] int * r);
    } ;

    [
        object,
        uuid(638094E4-758F-11d1-8366-0000E83B6EF3),
        helpstring("IFinancial Interface"),
        pointer_default(unique)
    ]

    interface IFinancial : IUnknown
    {
        [id(1), helpstring("method MortgagePayment")]
        HRESULT MortgagePayment([in] double amount, [in] double percent,
                                [in] int period,
                                [out,retval]float * payment);
        [id(2), helpstring("method GetPrimeRate")]
        HRESULT GetPrimeRate( [out,retval]double * rate);

    } ;


 [
    uuid(638094E1-758F-11d1-8366-0000E83B6EF3),
    version(1.0),
    helpstring("COMCalc 1.0 Type Library")
]
library COMCALCLib
{
    importlib("stdole32.tlb");
    importlib("stdole2.tlb");

    [
        uuid(638094E0-758F-11d1-8366-0000E83B6EF3),
        helpstring("Calc Class")
    ]
    coclass COMCalc
    {
        [default] interface ICalc;
        interface IFinancial;

    };

} ;

The C++ client can choose which interface it wants (see Listing 2.5).

Example 2.5.  Client Code Creating an Instance of COMCalc and Using Its ICalc and IFinancial Interfaces

ICalc * pICalc;
IFinancial * pIFin;

CoCreateInstance(CLSID_COMCalc, 0,CLSCTX_INPROC_SERVER , IID_ICalc, (void**)&pICalc);

int result;
pICalc->Add(1, 2, &result);                 // [out,retval] arrives via the last argument
pICalc->QueryInterface(IID_IFinancial, (void**)&pIFin);
double rate;
pIFin->GetPrimeRate(&rate);
pICalc->Release();
pIFin->Release();
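The listing above shows only the calling side. For a feel of what the object might do when QueryInterface() arrives, here is a simplified, portable sketch. It is not the Windows declaration of IUnknown: the Iid enum stands in for real GUIDs, QueryInterface() returns bool instead of an HRESULT, and the names IUnknownLike, ICalcLike, IFinancialLike, and CalcObject are hypothetical. The 0.0 returned by GetPrimeRate() is a placeholder, not a real rate.

```cpp
#include <cassert>

// Hypothetical interface identifiers standing in for IID_* GUIDs.
enum class Iid { Unknown, Calc, Financial, Other };

struct IUnknownLike {
    virtual bool QueryInterface(Iid iid, void** out) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
protected:
    ~IUnknownLike() = default;
};
struct ICalcLike : IUnknownLike {
    virtual int Add(int x, int y) = 0;
};
struct IFinancialLike : IUnknownLike {
    virtual double GetPrimeRate() = 0;
};

// One object, two interfaces, one shared reference count.
class CalcObject final : public ICalcLike, public IFinancialLike {
public:
    bool QueryInterface(Iid iid, void** out) override {
        if (iid == Iid::Unknown || iid == Iid::Calc)
            *out = static_cast<ICalcLike*>(this);
        else if (iid == Iid::Financial)
            *out = static_cast<IFinancialLike*>(this);
        else {
            *out = nullptr;             // unsupported interface
            return false;
        }
        AddRef();                       // every pointer handed out is counted
        return true;
    }
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        unsigned long left = --refs_;
        if (left == 0) delete this;
        return left;
    }
    int Add(int x, int y) override { return x + y; }
    double GetPrimeRate() override { return 0.0; }   // placeholder value
private:
    unsigned long refs_ = 1;            // creation hands out the first reference
};

inline void demoQueryInterface() {
    ICalcLike* pICalc = new CalcObject;              // count is 1
    void* raw = nullptr;
    bool ok = pICalc->QueryInterface(Iid::Financial, &raw);
    IFinancialLike* pIFin = static_cast<IFinancialLike*>(raw);
    assert(ok && pIFin != nullptr);                  // count is 2
    assert(pICalc->Add(1, 2) == 3);

    unsigned long left = pIFin->Release();           // same counter, either interface
    assert(left == 1);
    left = pICalc->Release();                        // object destroys itself
    assert(left == 0);
    (void)ok; (void)left;
}
```

Because both interfaces are implemented by the same derived class, a Release() made through either pointer decrements the same count.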

Where C++ Representations Come From

IID_IFinancial is the C++ name of IFinancial's GUID. It was generated when MIDL was run on the IDL file, and it is declared in the MIDL-generated header file (included in this source file).

A Visual Basic client is much simpler and can be written as shown in Listing 2.6.

Example 2.6.  A VB Client Creating COMCalc and Using Its IFinancial Interface

Dim myCalc as new VBCalc.COMCalc
Dim Financial as IFinancial

Result=myCalc.Add (1,2)
Set  Financial = myCalc

In Visual Basic, the Set keyword usually translates to a QueryInterface() method call. Although you do not see or work directly with the GUIDs as you do in C++, they are present and maintained by the development environment. Specifically, when you add your component's type library to your project by choosing Project, References, the component's GUIDs are included and implicitly used. The resulting list of type libraries is shown in Figure 2.2.

VB knows all the GUIDs, objects, and so on that it needs. So, the VB line

Set Financial = myCalc

is translated behind the scenes to something like

pICalc->QueryInterface({ 0x638094e4,0x758f,0x11d1,{ 0x83,0x66,0x00,0x00,0xe8,
    0x3b,0x6e,0xf3} } ,(void**)&pIFin);

When a statement is prefaced by the Set keyword, VB reads the left side of the =, looks up the GUID for the interface (IFinancial) in the object's type library, and uses that GUID in a QueryInterface() call made on the object to the right of the = (which is myCalc).

Figure 2.2. Visual Basic lists all the registered type libraries.

Where Does COM Live?

We've talked a lot about COM objects and interfaces and it's important to understand the theory, but we haven't said much about where COM objects reside. I did mention that COM objects can live in DLLs or EXEs; and in the case of COM+, your objects really need to live in the former. But what exactly does it mean for a COM object to be in a Dynamic Link Library (DLL)? How does this DLL get loaded? And how on earth can a client application on one machine get an interface of an object residing in a DLL on another machine? DLLs, after all, are just portable libraries brought into the process space of an application that needs them, so there is really no way for a client application to load and use a DLL across the network. (Well, you could do directory sharing but that is cheating; file sharing is not the same thing as distributed objects.) COM+, however, makes this possible. We'll explore how after we spend a few pages talking about DLLs.

Dynamic Link Libraries (DLLs)

When Windows was originally released, DLLs were really nothing more than groupings of functions that could be loaded into an application when the application wanted them and discarded when the application no longer wanted them. This allowed EXEs to be smaller, because they could outsource a lot of functionality; rather than have a lot of commonly used library routines compiled into the EXE and making the application larger, this functionality could be put in DLLs. DLLs could be loaded and unloaded at will and could even be shared among multiple applications.

Another nice thing about DLLs was that there was a kind of binary standard for them. If your DLLs were compiled properly, you could write the DLL in virtually any language and make your functions available in binary form to any other language. True, your language needed to produce assembly code, and entry points to your functions had to be declared in a certain way so that certain ambiguities were cleared up (who cleans up the call stack, what case will the exported functions be listed in, and so on). That said, however, it quickly became possible to write a DLL in C and call the functions in it from Visual Basic or another language or environment. Now that was power.

Today in COM+, DLLs play a more critical and sophisticated role than ever. COM+ DLLs no longer export an arbitrary number of functions (actually, you'll see soon in the sidebar "Functions Exported by COM DLLs" that they only export four utility functions). Their role has changed. The COM class of today's DLLs goes by far loftier names (OCX, ActiveX Component, ActiveX Control), and these DLLs are even referred to as servers sometimes. This is not to say that there are no longer traditional DLLs that are storehouses for a group of functions; there definitely are. But we're going to concern ourselves with the new role of DLLs: that of COM components.

COM DLLs, sometimes referred to as ActiveX components or servers, are actually repositories for COM objects. When I use the term COM objects, I am referring to the actual implementation of a specific coclass in a type library. A DLL can have as many objects in it as the developer wants to put in. That same DLL also has (stuffed into its resources) a .TLB file. A single DLL then can have multiple COM objects and a type library fully describing those objects and interfaces, all inside of one DLL shell. This is shown in Figure 2.3.

Figure 2.3. A COM DLL that has two coclasses and its type library included in its resource section.

Type Libraries and Resources

So, what does all this have to do with type libraries? Well, another benefit of the resource section of DLLs (and EXEs) is that they are extensible in terms of the kind of data they allow you to store. Think of the resource section as the kitchen junk drawer—it usually has extra twisty-ties, crazy glue, and usually a tape measure, but it can really have just about anything in it. Although there is built-in Win32 support for bitmaps, dialog boxes, icons, and the like, you can put anything you want in the resource section. Some application designers stuff .WAV files or .AVI movie files in the resource section so that they can ship a single DLL or EXE with no dependencies.

What Are Resource Files?

When I teach my COM course, I get to one part of the lesson where I say, "Type libraries are most often stuffed into the resource section of the DLL," and I get a lot of raised eyebrows. At that point, I remember that not all developers have worked with resource files or know what they are. Well, here is the deal with resources: Every dialog box, menu, and most bitmaps, icons, and splash screens used by an application are not compiled into assembly with your program logic. On the contrary, they are stuffed into a reserved section that every DLL and EXE can have called the resources section. You can reverse-compile the resources of most EXEs and DLLs and take a gander at all the dialog boxes, bitmaps, and so on that they use. You can also change any of the resources, and the next time the program is run, you will see that the dialogs, menus, and whatever else you modified have, in fact, changed.

At this point, the students usually get excited and ask if they can change the menus and dialog boxes of Microsoft Word, Excel, and so on. The answer is yes, you can. The Microsoft Compiler (VC++) does not make this easy to do (probably by design), but other compilers include resource editors that make this simple. Personally, I find Watcom's resource editor pretty easy to use when I feel like adding some graffiti to a co-worker's splash screen or changing his menu items to make his day more interesting.

COM takes advantage of this fact, and there is support for stuffing type libraries in the resource section. Remember, this is just a convenience. No one says that type libraries must be in the resources of your DLL. Although most COM DLLs do include type libraries in their resources, you can ship two files: your COM DLL and a separate type library. Microsoft Excel does this. Although Excel is a traditional application EXE, it also supports a host of COM objects that you can use to remote control Excel. The type library for Excel is greater than 0.5MB. That's a lot of IDL, and unless you want to make the EXE that much bigger, shipping the type library as a separate file is not a bad idea.

The Registry and Self-Registration

The discussion of type libraries in the previous section raises the following question: If the type library and the component can be two separate things and can even be in separate files, how can COM associate the two? The answer is that the component must be officially registered with COM. In the good old days of 16-bit VB 3, the VBX, the late-Cretaceous ancestor of modern ActiveX controls, roamed the OS. If you wanted to add a new component to the system so that you could use it in VB 3, all you had to do was put a new VBX file in the C:\Windows\System directory, and that was all there was to it. There would be a new control on your tool palette that you could just drag and drop onto your form. I must admit, sometimes I miss the simplicity of this. Modern COM, however, is more exacting. VBXs are gone and replaced by OCXs, which are really just DLLs with COM objects and a type library inside them. The path to the COM DLL, the objects inside of it, and the location of its type library (whether the type library is inside the DLL or in a separate file) must be known to COM before they can be used. COM DLLs, therefore, are self-registering. By this, I mean that COM DLLs know exactly what COM needs to know and have the capability to put the appropriate entries in the Registry so that COM knows about them and can use them.

Although COM DLLs know how to put the appropriate entries in the Registry, they are powerless to do so by themselves. DLLs are always loaded and acted upon by a host process (EXE), so a COM DLL needs to be loaded by an application and asked to register itself. This creates a chicken-or-egg scenario for COM clients: a client would need to know what DLL an object is in to ask the DLL to register itself; but if it already knows what DLL the object is in, there is no need for the DLL to register itself. The solution is that COM clients do not register COM DLLs. Either you do it manually using a utility called regsvr32.exe, or some Windows Setup.EXE program performs the registration behind the scenes. In the former case, if I send you a COM DLL called comcalc.dll and promise you that there is a calculator object in it, you would register it by going to the command prompt and typing the following:

regsvr32.exe comcalc.dll

VB and Registration

Note to VB Users: Registration is performed automatically when an ActiveX component is compiled. However, if you ship an ActiveX DLL (a VBism for COM DLL) by itself without using VB's Setup Builder, the recipient still needs to register it in this manner.

Regsvr32.exe is a very simple program. It simply loads the DLL, looks for an exported function that all COM DLLs have called DllRegisterServer(), and calls it; the DLL does the rest. When DllRegisterServer() is complete, the function returns and regsvr32.exe exits. The approximate code in C for regsvr32.exe is something like that shown in Listing 2.7.

Example 2.7.  Pseudo-Regsvr32 Utility Written in C

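#include <windows.h>

int main(int argc, char **argv)
{
    HMODULE h;
    HRESULT (STDAPICALLTYPE *v)(void);

    if (argc < 2) return 1;              /* usage: regsvr32 <dll> */
    h = LoadLibraryA(argv[1]);           /* bring the COM DLL into this process */
    if (h == NULL) return 1;
    v = (HRESULT (STDAPICALLTYPE *)(void))GetProcAddress(h, "DllRegisterServer");
    if (v == NULL) return 1;             /* not a self-registering DLL */

    (*v)();                              /* the DLL writes its own Registry entries */

    FreeLibrary(h);
    return 0;
}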

You get the idea. DllRegisterServer() takes the ball and runs with it. In VB and VJ++, the code for this function is created for you. In C++, you write it yourself unless you are using a class library like ATL and the accompanying ATL wizard support in the VC++ IDE. Fundamentally, DllRegisterServer() must do the following two things:

  • Put path, GUID, friendly string names, and threading information in the Registry

  • Register the type library

The first point is not that complex. COM needs to know where the DLL can be located, what the coclasses (objects) are, and what their GUIDs are. It must also indicate how thread-safe the DLL is—that is, can it handle concurrent access from many different threads at once. If the answer is no, COM automatically makes the component thread-safe by providing protection for it. Threads and thread safety are discussed in depth in Chapter 4, "Threading and Apartment Models." Specifically, the minimum Registry entries a DLL enters are shown in Figure 2.4.

Figure 2.4. Registry settings for a typical component.
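To make the first point concrete, here is a minimal sketch of the entries a DllRegisterServer() implementation might write for one coclass. It is deliberately portable: rather than calling the Win32 Registry API, it only builds the list of keys and values (relative to HKEY_CLASSES_ROOT) that would be written. The RegEntry struct, the BuildInprocEntries() function, and every sample argument are hypothetical names chosen for illustration; a real implementation would write each entry to the Registry itself.

```cpp
#include <string>
#include <vector>

// One Registry value: key path under HKEY_CLASSES_ROOT,
// value name ("" means the key's default value), and its data.
struct RegEntry {
    std::string key;
    std::string name;
    std::string data;
};

// Builds the minimal in-process server entries for a single coclass.
// Hypothetical helper: the arguments are sample inputs, not values from a real component.
std::vector<RegEntry> BuildInprocEntries(const std::string& clsid,
                                         const std::string& friendlyName,
                                         const std::string& dllPath,
                                         const std::string& threadingModel)
{
    const std::string clsidKey  = "CLSID\\" + clsid;
    const std::string serverKey = clsidKey + "\\InprocServer32";
    return {
        {clsidKey,  "",               friendlyName},    // friendly string name
        {serverKey, "",               dllPath},         // where COM can find the DLL
        {serverKey, "ThreadingModel", threadingModel},  // for example "Apartment"
    };
}
```

Calling BuildInprocEntries() with a coclass GUID string, a friendly name, the full path to comcalc.dll, and "Apartment" yields three entries: the friendly name on the CLSID key, the DLL path as the default value of InprocServer32, and the ThreadingModel value beside it.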

Common Registry Entries for COM/COM+ Classes

Following is a list (which is by no means all-inclusive) of common Registry settings. Note that most are optional:

  • Implemented Categories (Optional). Clients and utilities can look at this entry to determine what a given coclass is capable of and/or what its requirements are. Specifically, there are a number of GUIDs established by Microsoft (check MSDN for details), each one of which indicates a specific capability or requirement. One or many of these GUIDs can be added as subkeys under this key.

  • InprocServer32 (Required). Contains the full path of the DLL where the coclass being described can be found. Also has a data entry indicating the coclass's threading affinity (discussed in Chapter 4, "Threading and Apartment Models").

  • Programmable (Optional). Indicates that the object supports OLE automation. We'll talk more about automation in Chapter 5. This key is no longer necessary.

  • Typelib (Suggested but optional). Lists the GUID of its type library. This GUID can be looked up under the HKEY_CLASSES_ROOT\TypeLib key, and the actual path to the type library can then be found.

  • Version (Optional). Can be used to indicate a version of the coclass. This is only important if different versions of the same coclass are co-existing, but all need to be available to the client.

The second point is also simple. Type libraries must be separately registered. As I said, they might very well be located in the same DLL as the objects they describe, but this doesn't matter. When registering, the DLL registers its type library by calling a Win32 function called, appropriately, RegisterTypeLib(). This function takes as an argument the path of the type library, and from here, Windows performs some magic. This magic takes the form of placing the type library path and GUID in the Registry in a section called TypeLib, shown in Figure 2.5.

This list is what you see when you select Project, References in VB. This Registry entry, however, is not all that happens. COM reads the type library and also registers additional information about the interfaces contained in the type library. In fact, if you look in HKEY_CLASSES_ROOT\Interface, you will find your ICalc and IFinancial interfaces listed by their GUID, as shown in Figure 2.6.

Figure 2.5. The section of the Registry where type libraries are listed (HKEY_CLASSES_ROOT\TypeLib).

Figure 2.6. Information on interfaces is kept in the Registry at HKEY_CLASSES_ROOT\Interface.

Functions Exported by COM DLLs

All COM DLLs must export the following four functions:

  • DllRegisterServer(). As we discussed, this function makes the Registry entries necessary for COM to use your objects and also registers the type library.

  • DllUnregisterServer(). If you are sure you will never use a certain COM DLL again, this function can be called (run regsvr32.exe with the /U option) and will undo everything done by DllRegisterServer(). Prior to NT Service Pack 4, DllUnregisterServer() would sometimes unregister the core Windows OLE automation marshaler (oleaut32.dll). This would prevent OLE automation from working correctly. Thankfully, this was fixed in NT SP4.

  • DllCanUnloadNow(). This is called by COM every few minutes or so to make sure that COM DLLs that are loaded have a reason for being loaded. In the event that a COM DLL is loaded in a process, but that process no longer has any references to interfaces of any object in the DLL, the DLL should be unloaded to conserve resources. DllCanUnloadNow() tells COM whether it is safe to automatically unload the DLL.

  • DllGetClassObject(). COM calls this function to reach into a COM DLL and request that an object be created. Specifically, this function gives the client an interface to the object's class factory. A class factory is itself a COM object that knows how to create objects of a specific type, and every coclass has exactly one. We will touch on class factories a bit more in Chapter 3, "COM Internals," but this function is a critical step in the creation of a COM object; though the process is hidden for the most part, a class factory is always used to create a new instance of a COM object.
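The relationship between the class factory and the unload decision can be sketched in a few lines. The following is a portable illustration, not real COM code: HRESULT_T, kS_OK, and kS_FALSE are stand-ins for the Win32 definitions, and Calc, CalcFactory, and CanUnloadNow() are hypothetical names. The assumption is the common approach of keeping a count of live objects in the DLL and reporting "safe to unload" only when that count is zero.

```cpp
#include <atomic>

// Stand-ins for the Win32 definitions so the sketch compiles anywhere.
typedef long HRESULT_T;
const HRESULT_T kS_OK    = 0;  // same numeric value as S_OK
const HRESULT_T kS_FALSE = 1;  // same numeric value as S_FALSE

// Number of objects from this (imaginary) DLL that are currently alive.
static std::atomic<long> g_liveObjects(0);

// Hypothetical calculator object; construction and destruction maintain the count.
class Calc {
public:
    Calc()  { ++g_liveObjects; }
    ~Calc() { --g_liveObjects; }
    int Add(int a, int b) const { return a + b; }
};

// Hypothetical class factory: the one object that knows how to create Calc instances.
class CalcFactory {
public:
    Calc* CreateInstance() const { return new Calc(); }
};

// Mirrors the decision DllCanUnloadNow() makes: kS_OK means "safe to unload".
HRESULT_T CanUnloadNow()
{
    return g_liveObjects.load() == 0 ? kS_OK : kS_FALSE;
}
```

Before any object is created, CanUnloadNow() reports kS_OK. Once the factory hands out a Calc, it reports kS_FALSE until that object is destroyed. In real COM the destruction would be triggered by the final Release() call rather than by delete.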

A COM+ Variation—Surrogate Processes

We have discussed traditional COM so far, but COM+ adds a few twists and you should take note. Earlier I said that COM+ objects would live in COM DLLs. DLLs always require a process space (an EXE) to house them, so let's call this space a surrogate.

A remote client application must go through a surrogate process to talk to a COM object that is in a COM DLL. The architecture looks like that shown in Figure 2.7.

Figure 2.7. A DLL containing COM coclasses is hosted by a surrogate that provides the process space and thread(s).

Because a DLL is a passive bit of code that needs to be acted upon, the surrogate process is responsible for providing the body which a COM DLL needs to make its objects available to the outside world. This body consists of an operating system (OS) process space as well as the threads (we will talk more about threads in Chapter 4) necessary to give the body a brain. As you can imagine, this gives the surrogate an enormous amount of power. After all, the surrogate always stands between the client and the COM DLL; so it could, for example, keep an object alive even if the client calls Release() the appropriate number of times and no longer has a reference to it. The surrogate could then present the same object to another client requesting a new instance of the object. In short, the surrogate could provide a form of object pooling.
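The pooling idea can be illustrated with a small, portable sketch. This is not how COM+ implements pooling; it only shows the principle that whoever sits between client and object can keep a released object alive and hand it to the next caller. SurrogatePool, PooledCalc, Acquire(), and Return() are hypothetical names invented for this example.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical pooled object; 'id' lets us see whether an instance was reused.
struct PooledCalc {
    int id;
};

// Plays the role of the surrogate standing between clients and objects.
class SurrogatePool {
public:
    // Hand out an idle instance if one exists; otherwise create a new one.
    std::unique_ptr<PooledCalc> Acquire() {
        if (!idle_.empty()) {
            std::unique_ptr<PooledCalc> obj = std::move(idle_.back());
            idle_.pop_back();
            return obj;
        }
        return std::unique_ptr<PooledCalc>(new PooledCalc{nextId_++});
    }

    // Called when a client gives up its last reference: keep the object alive for reuse.
    void Return(std::unique_ptr<PooledCalc> obj) {
        idle_.push_back(std::move(obj));
    }

    std::size_t IdleCount() const { return idle_.size(); }

private:
    std::vector<std::unique_ptr<PooledCalc>> idle_;  // released but still alive
    int nextId_ = 1;                                 // next id for a brand-new object
};
```

A client that acquires an object and returns it leaves one idle instance in the pool. The next Acquire() receives that same instance instead of a freshly constructed one, and only a further Acquire() with the pool empty creates a new object.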

What else could the surrogate theoretically do? Well, one surrogate could also be the host for a number of different COM objects and allow them to cooperate in different ways; maybe it could track all the database changes of a group of related objects and allow all database modifications to be rolled back if any one of the objects complained.

There have undoubtedly been many developers who have been entranced by the power of surrogates and have undertaken vast development efforts to do what COM+ does: furnish a value-added surrogate that provides additional services to objects. COM+, like MTS before it, uses a surrogate process (DLLHost.exe) for COM DLLs and their objects. By sticking its nose (and its nose comes in the form of something called interceptors) between the client and the objects, it can provide an additional range of services for both the client and the captive objects. This book is all about those services and will detail each one ad nauseam. If, however, you understand the structure that gives rise to these services, I believe you'll write better objects.

A History of Surrogates

Prior to NT Service Pack 4, developers were required to write their own surrogates if the client EXE and COM DLL were on separate machines. SP4, however, provided a default surrogate. With SP4, it was now possible for a client to instantiate and use a remote COM object in a COM DLL over the network.

Summary

The following summarizes some of the key points discussed:

  • COM objects support one or many interfaces.

  • Interfaces are groups of logically related functions.

  • Interfaces are really RPC channels of communication, as discussed in Chapter 1.

  • If a client has any one interface, it can get any other interface the object supports by calling QueryInterface() in C++ or the Set keyword in Visual Basic (Java's native support for interfaces allows this to be done implicitly).

  • An object's interfaces and coclass are described in a type library file.

  • A type library file is really a tokenized, binary IDL file and can be generated from an IDL file using MIDL. It can also be created directly through Win32 calls.

  • A type library can be reverse-compiled into an IDL by using OLEVIEW.EXE, as we saw in Chapter 1.

  • Type libraries are usually stored inside the resources section of the DLL component.

  • A single DLL can have one or many coclasses in it, and each coclass can support one or many interfaces.
