Network Kernel Extensions

The kernel supports extending the network stack at multiple levels through the Network Kernel Extensions (NKE) mechanism. An NKE is no different from a regular KEXT; it is merely a term used to describe a KEXT that interfaces with or extends the network stack.

As such, NKEs are also dynamically loadable and unloadable at runtime. NKEs are not part of the I/O Kit; they are located in the BSD layer. The NKE mechanism is unique to Mac OS X and is not found in BSD UNIX flavors, such as FreeBSD.

An NKE can be used for many purposes. Some examples of use include, but are not limited to, the following:

  • Custom firewall or security mechanisms, such as encryption
  • Adding support for new protocols
  • Adding support for new network interfaces
  • Creating virtual network interfaces
  • Creating custom routing schemes
  • Delaying, modifying, inspecting, or blocking network packets
  • Debugging network stack and drivers

An NKE typically utilizes one of the following KPI/filtering mechanisms:

  • Socket filter: Allows filters to be inserted at various points in the socket layer, and can filter inbound and outbound traffic as well as out-of-band communication. It can filter most protocols supported by the socket API. It is possible to modify, delay, or reject traffic.
  • IP filter: Allows filtering of IP version 4 and 6 traffic.
  • Interface filter: Allows traffic to be monitored and modified on a specific network interface. Since this happens at the end of the stack, all protocols and traffic destined for that interface will be visible.
  • Interface KPI: A programming interface for creating new network interfaces.
  • Protocol plumber: Provides the glue that connects a network protocol to a network interface.

Kernel Control KPI

The kernel control interface <sys/kern_control.h> is a KPI that allows a KEXT to communicate bi-directionally with user space processes. This mechanism is often used in conjunction with NKEs to allow user space programs to control and configure a KEXT. A full discussion of the Kernel Control KPI is provided in Chapter 17.

Socket Filters

A socket filter is a powerful mechanism that allows network and IPC traffic to be intercepted in the kernel's socket layer. The socket layer (and hence the socket filter) is situated between user space and the network protocol stack in the kernel. Consequently, a socket filter cannot peek at the IP or TCP header of an outgoing network packet, because the headers are added later in the processing chain. It is still possible to filter IP-based traffic using a socket filter, because metadata, such as the IP address the packet is destined for, is known.

The same is true for incoming traffic: the protocol stack strips header information before the data enters the socket layer. In effect, we see the reassembled data that will eventually be read by a user space application. A socket filter is therefore not suitable when information from protocol headers is required; use the lower-level IP or interface filters instead.

Another thing to note is that a socket filter cannot filter traffic from protocols that are not initiated through the socket API, because some auxiliary protocols are handled directly in the protocol stack. Examples are ARP and RARP requests, which are handled by the kernel and usually aren't initiated by a user application, but rather happen as a side effect of some other type of traffic. The socket API is most commonly used by user space applications or libraries; however, as previously mentioned, a socket KPI also exists, which allows the kernel to use socket communication in much the same way as user space. Kernel-initiated sockets can also be filtered.

A socket filter isn't restricted to just filtering data packets. It can also intercept out-of-band events, such as socket-related system calls like bind() and listen().

A socket filter is registered by filling out desired callbacks in the sflt_filter structure, as shown in Listing 13-2.

Listing 13-2. The sflt_filter Structure Used to Register a Socket Filter (kpi_socketfilter.h)

struct sflt_filter {
        sflt_handle                       sf_handle;
        int                               sf_flags;
        char                             *sf_name;
        sf_unregistered_func              sf_unregistered;
        sf_attach_func                    sf_attach;
        sf_detach_func                    sf_detach;
        sf_notify_func                    sf_notify;
        sf_getpeername_func               sf_getpeername;
        sf_getsockname_func               sf_getsockname;
        sf_data_in_func                   sf_data_in;
        sf_data_out_func                  sf_data_out;
        sf_connect_in_func                sf_connect_in;
        sf_connect_out_func               sf_connect_out;
        sf_bind_func                      sf_bind;
        sf_setoption_func                 sf_setoption;
        sf_getoption_func                 sf_getoption;
        sf_listen_func                    sf_listen;
        sf_ioctl_func                     sf_ioctl;
        struct sflt_filter_ext {
                unsigned int              sf_ext_len;
                sf_accept_func            sf_ext_accept;
                void                     *sf_ext_rsvd[5];        /* Reserved */
        } sf_ext;
#define sf_len                            sf_ext.sf_ext_len
#define sf_accept                         sf_ext.sf_ext_accept
};

As you can see, there are quite a few callbacks, but only a few, such as sf_attach and sf_detach, are mandatory. Non-mandatory callbacks not needed by a filter can be set to NULL. A socket filter can operate in two modes; which mode is used depends on the flags set in the sf_flags field. There are two possible values:

  • SFLT_GLOBAL: If set, the filter attaches itself to every socket that matches the protocol domain, type, and protocol specified when the filter was registered. Once registered, the filter is invoked for every new socket that matches the criteria.
  • SFLT_PROG: If set, the filter is activated only if an owner of the socket specifically requests it by passing the SO_NKE socket option to the setsockopt() system call.

The first field of the structure, sf_handle, is used to identify the filter to clients when the filter is operating in programmatic mode (SFLT_PROG is set). It is also used to deregister the socket filter after use. The handle consists of a four-character sequence, which should be unique. Apple provides a registration process to apply for a unique character sequence called a creator code. The sf_name field is used for debug purposes and is commonly set to the bundle ID of the containing KEXT, but it can be anything.
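To make the four-character handle more concrete, the following user space sketch packs four characters into a 32-bit value, most significant byte first. This mirrors how common compilers evaluate a multi-character constant such as 'apw0', although that evaluation is implementation-defined. The helper name make_handle is hypothetical and is not part of the socket filter KPI.

```c
#include <stdint.h>

/* Hypothetical helper: packs four characters into a 32-bit handle,
 * with the first character in the most significant byte. */
static uint32_t make_handle(char a, char b, char c, char d)
{
    return ((uint32_t)(unsigned char)a << 24) |
           ((uint32_t)(unsigned char)b << 16) |
           ((uint32_t)(unsigned char)c << 8)  |
            (uint32_t)(unsigned char)d;
}
```

Calling make_handle('a', 'p', 'w', '0') yields 0x61707730, which is simply the ASCII codes of the four characters placed side by side.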

A socket filter is registered with the system using the sflt_register() function.

Building an Application-Level Firewall Using Socket Filters

To better understand how the socket filter mechanism works, let's look at an example of what it can be used for. While Mac OS X already ships with an application-level firewall (ALF.kext), we will build a very simple version to demonstrate the power of socket filters. The AppWall architecture consists of an NKE KEXT, which contains the socket filter. AppWall solves the problem of preventing unauthorized programs from accessing the network. The socket filter can also log information about data transferred in either direction for a specified program, without interfering with its operation. Because AppWall is a proof of concept, we will limit it to IP version 4 and the TCP protocol.

Let's get started by defining the socket filter:

#define APPWALL_FLT_TCP_HANDLE       'apw0'      // codes should be registered with Apple

static struct sflt_filter socket_tcp_filter = {
        APPWALL_FLT_TCP_HANDLE,
        SFLT_GLOBAL,
        "com_osxkernel_AppWall",
        appwall_unregistered,
        appwall_attach,
        appwall_detach,
        NULL,
...
        appwall_data_in,
        appwall_data_out,
        appwall_connect_in,
        appwall_connect_out,
        NULL,
...
};

Tip: The unabridged source for AppWall will be made available on the publisher's website: www.apress.com.

Because of our requirements, we have left out a number of function pointers as NULL, as they are not relevant to our filter's design. If you wish, you can easily modify AppWall to implement these as well.

Let's have a look at how we register the filter:

kern_return_t AppWall_start (kmod_info_t * ki, void * d)
{
...
    ret = sflt_register(&socket_tcp_filter, PF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (ret != KERN_SUCCESS)
        goto bail;
    
    add_entry("ssh", 1);    // block the ssh application.
    add_entry("nc", 0);     // log data from the nc application.
   
    g_filter_registered = TRUE;
...
}

For brevity, we have left out general housekeeping code, such as allocating locks or error handling. Once the sflt_register() function returns, the filter may be active and we may start seeing our callbacks invoked. Therefore, it is vital that any needed resources, such as locks, are initialized prior to registering the filter.

The sflt_register() function takes four arguments:

  • The pointer to the socket filter structure, as mentioned earlier.
  • The protocol domain, which we specify as PF_INET, which is the IP version 4 family.
  • The type. We specify SOCK_STREAM, which refers to a full-duplex, stream-based socket.
  • And finally the protocol, which we specify as IPPROTO_TCP.

Tip: The domain, type, and protocol values are the same as those used in the user space socket API. Check the socket(2) manual page (man 2 socket) for more details about available domains, types, and protocols.

If you wish to handle other protocols, such as UDP, a second call to sflt_register() is needed. Each registered filter needs its own unique handle, so you will need to declare a second structure for the UDP filter. If desired, the second structure may share some or all callbacks with the first.

The last step is to add some entries to our list of blocked/monitored applications using the AppWall add_entry() function. In a real NKE, you would most likely have a kernel control that allowed a user space utility to configure this instead of hard coding. The add_entry() function creates an appwall_entry structure, as shown in Listing 13-3.

AppWall Operation and Data Structures

Before we start implementing the filter callbacks, we need to declare data structures to store information collected from the filter. We declare the data structures in a shared header file, which can be used by a user space utility in the future, but for now, is only used by the AppWall KEXT. The data structure is shown in Listing 13-3.

Listing 13-3. AppWall Header File

#ifndef APPWALL_H   /* include guard; closed by the final #endif */
#define APPWALL_H

#define BUNDLE_ID   "com.osxkernel.AppWall"

struct app_descriptor
{
    char name[PATH_MAX];
    unsigned long bytes_in;
    unsigned long bytes_out;
    unsigned long packets_in;
    unsigned long packets_out;
    int           do_block;
    int           outbound_blocked;
    int           inbound_blocked;

};

#if defined (KERNEL)
struct appwall_entry
{
    TAILQ_ENTRY(appwall_entry)   link;
    struct app_descriptor        desc;
    int                          users;   
};
#endif

#endif

The first structure, app_descriptor, holds the name of an application to be blocked or monitored. Entries with the do_block field set to non-zero are blocked, whereas a zero value means we simply collect and report statistics for the application. If we see a socket from an application for which no appwall_entry exists, our filter ignores it.

We use the name of the application and not a process identifier (PID) so that we track every instance of that program. While this is not secure, because the filter can be bypassed by renaming the executable, it is fine for the sake of example.
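The list handling behind add_entry() and find_entry_by_name() is not shown in the text. As a rough illustration only, the following user space sketch shows how such helpers could be built on the TAILQ macros from <sys/queue.h>. The list head g_entries, the abridged app_descriptor, the return values, and the use of calloc() are assumptions made for this sketch; the real KEXT would use kernel allocators and would hold its mutex around every list operation.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>
#include <sys/queue.h>

#ifndef PATH_MAX
#define PATH_MAX 1024
#endif

/* Abridged version of the descriptor; the statistics fields are omitted. */
struct app_descriptor {
    char name[PATH_MAX];
    int  do_block;
};

struct appwall_entry {
    TAILQ_ENTRY(appwall_entry) link;
    struct app_descriptor      desc;
    int                        users;
};

/* Assumed global list head; locking is left out of this sketch. */
static TAILQ_HEAD(appwall_list, appwall_entry) g_entries =
    TAILQ_HEAD_INITIALIZER(g_entries);

/* Returns the entry whose name matches exactly, or NULL if there is none. */
static struct appwall_entry *find_entry_by_name(const char *name)
{
    struct appwall_entry *entry;

    TAILQ_FOREACH(entry, &g_entries, link) {
        if (strcmp(entry->desc.name, name) == 0)
            return entry;
    }
    return NULL;
}

/* Appends a new entry. Returns 0 on success, -1 if allocation fails. */
static int add_entry(const char *name, int do_block)
{
    struct appwall_entry *entry = calloc(1, sizeof(*entry));

    if (entry == NULL)
        return -1;
    /* calloc() zeroed the buffer, so the copy stays NUL-terminated. */
    strncpy(entry->desc.name, name, PATH_MAX - 1);
    entry->desc.do_block = do_block;
    TAILQ_INSERT_TAIL(&g_entries, entry, link);
    return 0;
}
```

A lookup for a name that was never added returns NULL, which corresponds to the case where the attach callback declines to attach to the socket.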

Attaching and Detaching the Filter

The attach (sf_attach) and detach (sf_detach) functions are called whenever our filter attaches to or detaches from a socket. Attachment happens either because the client that owns the socket specifically requests that we attach or, for a global filter, when the socket is created. It is not possible to attach to a socket that is already established.

Because a filter may intercept a high volume of sockets, the callbacks should avoid doing any heavy processing, as it may impact the system's network performance. AppWall was designed for demonstration and to be as simple as possible, not as a high-performance socket filter.

Let's look at the implementation of the attach callback in AppWall:

static errno_t appwall_attach(void** cookie, socket_t so)
{
    errno_t                 result = 0;
    struct appwall_entry*   entry;
    char                    name[PATH_MAX];
    
    *cookie = NULL;
    
    proc_selfname(name, PATH_MAX);
    
    lck_mtx_lock(g_mutex);
    
    entry = find_entry_by_name(name);
    if (entry)
    {
        entry->users++;
        *cookie = (void*)entry;
        printf("AppWall: attaching to process: %s\n", name);
    }
    else
        result = ENOPOLICY; // don't attach to this socket.
    
    lck_mtx_unlock(g_mutex);
   
    return result;
}

We are passed two arguments: The first is a cookie parameter that we can use to assign per-socket data. The cookie pointer will be passed back to us in every callback. The second argument is an opaque reference to the socket itself. Since the socket is opaque, it must be accessed with the socket KPI.

RETRIEVING THE IP ADDRESS OF A SOCKET

When the appwall_attach() function gets called, we are executing in the context of the task that created the socket, and we can, therefore, call proc_selfname(), which returns the process name of the current task. Once we have a name, we search the global linked list of appwall_entry structures to see if we can find a match. If a match is found, we increment its users count and assign the entry to the cookie output argument.

All manipulation of the linked list is performed under a global mutex to protect against concurrent access. If a match is not found, we return ENOPOLICY. Any non-zero return code from the function prevents the filter from being attached to this socket (without affecting the socket's lifecycle) and, hence, no further callbacks will be seen for that socket.

If you have a socket_t handle, you can manually attach a filter to the socket by calling the sflt_attach() function.

The sf_detach() callback will be invoked when the filter should be detached from the socket, which occurs when a socket closes or as a result of the filter being unregistered with sflt_unregister(). The detach callback in AppWall is implemented as follows:

static void
appwall_detach(void* cookie, socket_t so)
{
    struct appwall_entry*       entry;

    if (cookie)
    {
        entry = (struct appwall_entry*)cookie;
        
        lck_mtx_lock(g_mutex);

        entry->users--;
        if (entry->users == 0)
        {
            printf("report for: %s\n", entry->desc.name);
            printf("===================================\n");
            
            if (entry->desc.do_block)
            {
                printf("inbound_blocked: %d\n", entry->desc.inbound_blocked);
                printf("outbound_blocked: %d\n", entry->desc.outbound_blocked);
            }
            else
            {
                printf("bytes_in: %lu\n", entry->desc.bytes_in);
                printf("bytes_out: %lu\n", entry->desc.bytes_out);
                printf("packets_in: %lu\n", entry->desc.packets_in);
                printf("packets_out: %lu\n", entry->desc.packets_out);
            }
            cookie = NULL;
        }        
        lck_mtx_unlock(g_mutex);
    }
    return;
}

The function simply prints a report of how many times connections were blocked, or if the application was monitored, dumps statistics for how many bytes and packets were transmitted.

Handling Connections

A socket filter can intercept calls to the connect() system call for outgoing connections. The system call handler calls our filter through the sf_connect_out filter function. The filter function is passed the following three arguments:

  • The cookie
  • A handle to the socket itself
  • A sockaddr structure describing the intended destination of the socket

Returning non-zero from the callback will have the effect of propagating the error directly back to the caller of the connect() function (from kernel or user space) and will prevent the socket from being established without any packets going out on the network, which is how AppWall is able to block outgoing connections.

There is a catch here for UDP. UDP is connectionless, so an application is not required to call connect() at all; it will do so only to set the default address for send() and recv(), which does not result in outgoing network traffic. Blocking UDP traffic can instead be done in the data-out or data-in callbacks on a per-packet basis.

The sf_connect_in function, on the other hand, is not called in response to a system call, as sf_connect_out is, but is called by a protocol handler just before a new connection is established. The sf_connect_in callback is currently only invoked for TCP and does not apply to UDP, which is connectionless.

As with the output filter, it is possible to reject the connection by returning non-zero, which prevents the connection from being established and any further data from being delivered to the socket. The sf_connect_in callback takes the same arguments as the output callback, but the sockaddr structure describes the remote address instead. AppWall implements the sf_connect_in filter function as follows:

static errno_t
appwall_connect_in(void* cookie, socket_t so, const struct sockaddr* from)
{
    struct appwall_entry*       entry;
    errno_t                     result = 0;
            
    entry = (struct appwall_entry*)cookie;
    if (!entry)
        goto bail;
    
    lck_mtx_lock(g_mutex);

    if (entry->desc.do_block)
    {
        printf("blocked incoming connection to: %s", entry->desc.name);
        if (from)
        {
            printf(" from: ");
            log_ip_and_port_addr((struct sockaddr_in*)from);
        }        
        entry->desc.inbound_blocked++;
        result = EPERM;
    }
    lck_mtx_unlock(g_mutex);
bail:
    
    return result;
}

The function looks for a non-NULL cookie, and if one is present, checks if the application owning the current socket should be blocked.
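The log_ip_and_port_addr() helper called above is not listed in the text. As a hedged illustration of what such a helper has to do, the following user space sketch formats a sockaddr_in as "address:port". The function name format_ip_and_port and its buffer-based interface are assumptions for this sketch and are not taken from AppWall.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: writes "a.b.c.d:port" into buf.
 * Returns 0 on success, -1 if the address is not IPv4 or cannot be formatted. */
static int format_ip_and_port(const struct sockaddr_in *addr, char *buf, size_t len)
{
    char ip[INET_ADDRSTRLEN];

    if (addr->sin_family != AF_INET)
        return -1;
    if (inet_ntop(AF_INET, &addr->sin_addr, ip, sizeof(ip)) == NULL)
        return -1;
    /* sin_port is stored in network byte order and must be converted. */
    snprintf(buf, len, "%s:%u", ip, (unsigned)ntohs(addr->sin_port));
    return 0;
}
```

The family check matters because the callback receives a generic sockaddr pointer; casting it to sockaddr_in is only valid for AF_INET sockets, which is all AppWall registers for.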

Socket Data Input and Output

The real power of socket filters lies in the sf_data_in and sf_data_out filter functions, which allow interception of incoming and outgoing packets. Packets seen by a socket filter's data functions are stripped of (or have not yet had attached) protocol header information, such as IP, TCP, or UDP headers. In the case of TCP and UDP, the data represents the actual payload that will be delivered to or from a socket. If you need data from the protocol headers, you may wish to write an IP or interface filter instead.

For incoming data packets, you can determine the network interface on which a packet was received by calling mbuf_pkthdr_rcvif() on the mbuf. For outgoing packets, this information isn't available, because the filter function executes before the packet is routed to a network interface. The sf_data_out function in AppWall is implemented as follows:

static errno_t
appwall_data_out(void* cookie, socket_t so, const struct sockaddr* to, mbuf_t* data,
                 mbuf_t* control, sflt_data_flag_t flags)
{
    struct appwall_entry*       entry;
    errno_t                     result = 0;
            
    entry = (struct appwall_entry*)cookie;
    if (!entry)
        goto bail;

    lck_mtx_lock(g_mutex);
    entry->desc.bytes_out += mbuf_pkthdr_len(*data);
    entry->desc.packets_out++;
    
    if (entry->desc.do_block)
        result = EPERM;
    lck_mtx_unlock(g_mutex);
bail:
    return result;
}

The function accepts the following six parameters:

  • The cookie containing the pointer to the appwall_entry structure.
  • A socket_t reference to the socket transmitting data.
  • A sockaddr structure containing the address of the host to which the packet is destined. The argument is NULL for TCP packets, but set for UDP. The destination of a TCP socket can be determined at the time the connection is created (sf_connect_out).
  • A pointer to an mbuf_t handle. Note that you cannot use the mbuf_t directly, as it is merely a handle; you have to use the mbuf KPI to extract data and information from it. Also note that the mbuf argument is a pointer, so it also functions as an output argument. It is possible to assign a different mbuf_t, which will be transmitted in lieu of the original.
  • A pointer to an mbuf_t handle containing additional control data.
  • Flags that indicate the type of data, such as normal, out-of-band, or record data. There are two valid flags: sock_data_filt_flag_oob and sock_data_filt_flag_record. A value of zero indicates normal data.

In the AppWall case, the data out function is implemented in a similar way to the connect function: it checks whether the calling socket has a cookie attached, which in turn means that the packet should be either logged or blocked. If the packet should be blocked (filtered), we return EPERM to signal to the caller that it should free the packet and halt further processing. If you wish to keep the packet, but prevent it from progressing further, you can return EJUSTRETURN instead, which prevents the caller from freeing the packet.

AppWall implements the data input function nearly identically. It will block an incoming packet by returning EPERM.
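The three possible outcomes of a data callback can be summarized in a few lines of code. The following user space sketch classifies a callback's return value the way the text describes the caller as interpreting it. The enum and the function name are hypothetical, and EJUSTRETURN is defined locally because it is a kernel-only constant; the value used here is an assumption for the sketch.

```c
#include <errno.h>

#ifndef EJUSTRETURN
#define EJUSTRETURN (-2)   /* kernel-only constant; assumed value for this sketch */
#endif

enum packet_fate {
    PACKET_CONTINUE,    /* caller keeps processing the packet */
    PACKET_DROPPED,     /* caller stops and frees the packet */
    PACKET_SWALLOWED    /* caller stops; the filter keeps ownership of the packet */
};

/* Hypothetical helper: maps a filter's return value to the packet's fate. */
static enum packet_fate fate_for_result(int result)
{
    if (result == 0)
        return PACKET_CONTINUE;
    if (result == EJUSTRETURN)
        return PACKET_SWALLOWED;
    return PACKET_DROPPED;
}
```

The practical consequence is about ownership: after returning EJUSTRETURN, the filter is responsible for eventually re-injecting or freeing the packet itself.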

Tip: If you wish to learn more about socket filters, Apple provides a more comprehensive socket filter example called tcplognke, which can be found on their developer website. It shows how to log connections as well as how to swallow (delay) and re-inject packets at a later time. It also demonstrates some of the other filter functions we have not covered here and the use of the kernel control mechanism.

Internet Protocol Filters

Internet Protocol (IP) filters allow filtering and injection of incoming and outgoing IP packets. The IP filter mechanism works for both IPv4 and IPv6. Because IP operates at the network layer, there is no concept of connections or sessions; those are handled by higher-layer protocols and mechanisms. At the IP level, there are only packets going in and out. As a result, IP filters are significantly less complex than socket filters. The programming interface is similar to that of socket filters. An IP filter is defined by the structure ipf_filter:

struct ipf_filter {
    void*           cookie;
    const char*     name;
    ipf_input_func  ipf_input;
    ipf_output_func ipf_output;
    ipf_detach_func ipf_detach;
};

The structure consists of the following fields and callbacks:

  • The cookie field is used to assign a pointer containing some data that should be passed along to all the filter functions.
  • The name is used for debugging purposes and should be set to something identifying your filter/KEXT.
  • The ipf_input and ipf_output fields define the actual filter functions, which will be called for incoming and outgoing IP packets, respectively.
  • The ipf_detach function is called when the filter is detached. Unlike a socket filter, which is detached when its socket is closed, an IP filter needs to be removed explicitly by calling ipf_remove(). Note that ipf_remove() may defer removal of the filter if one of the filter functions is executing when it is called. Therefore, a KEXT must wait for the ipf_detach function to complete before it can be unloaded; otherwise, the kernel will panic when the IP stack tries to call ipf_detach after the KEXT has been unloaded from memory.

A complete example of a minimal IP filter is shown in Listing 13-4.

Listing 13-4. MyIPFilter: Implementation of a Simple IP Filter

#include <mach/mach_types.h>
#include <sys/kernel_types.h>
#include <sys/systm.h>
#include <sys/kpi_mbuf.h>
#include <netinet/ip.h>
#include <netinet/kpi_ipfilter.h>

enum {
    kMyFiltDirIn,
    kMyFiltDirOut,
    kMyFiltNumDirs
};

struct myfilter_stats {
    unsigned long udp_packets[kMyFiltNumDirs];
    unsigned long tcp_packets[kMyFiltNumDirs];
    unsigned long icmp_packets[kMyFiltNumDirs];
    unsigned long other_packets[kMyFiltNumDirs];
};

static struct myfilter_stats g_filter_stats;
static ipfilter_t g_filter_ref;
static boolean_t g_filter_registered = FALSE;
static boolean_t g_filter_detached = FALSE;

static void log_ip_packet(mbuf_t* data, int dir) {
    char src[32], dst[32];
    struct ip *ip = (struct ip*)mbuf_data(*data);
    
    if (ip->ip_v != 4)
        return;
    
    bzero(src, sizeof(src));
    bzero(dst, sizeof(dst));
    inet_ntop(AF_INET, &ip->ip_src, src, sizeof(src));
    inet_ntop(AF_INET, &ip->ip_dst, dst, sizeof(dst));
    
    switch (ip->ip_p) {
        case IPPROTO_TCP:
            printf("TCP: ");
            g_filter_stats.tcp_packets[dir]++;
            break;
        case IPPROTO_UDP:
            printf("UDP: ");
            g_filter_stats.udp_packets[dir]++;
            break;
        case IPPROTO_ICMP:
            printf("ICMP: ");
            g_filter_stats.icmp_packets[dir]++;
            break;  /* without this, ICMP would also be counted as OTHER */
        default:
            printf("OTHER: ");
            g_filter_stats.other_packets[dir]++;
            break;
    }  
    printf("%s -> %s\n", src, dst);
}

static errno_t myipfilter_output(void* cookie, mbuf_t* data, ipf_pktopts_t options) {
    if (data)
        log_ip_packet(data, kMyFiltDirOut);
    return 0;
}

static errno_t myipfilter_input(void* cookie, mbuf_t* data, int offset, u_int8_t protocol) {
    if (data)
        log_ip_packet(data, kMyFiltDirIn);
    return 0;
}

static void myipfilter_detach(void* cookie) {
    /* cookie isn't dynamically allocated, no need to free in this case */
    struct myfilter_stats* stats = (struct myfilter_stats*)cookie;
    printf("UDP_IN: %lu UDP_OUT: %lu TCP_IN: %lu TCP_OUT: %lu ICMP_IN: %lu ICMP_OUT: %lu OTHER_IN: %lu OTHER_OUT: %lu\n",
           stats->udp_packets[kMyFiltDirIn],
           stats->udp_packets[kMyFiltDirOut],
           stats->tcp_packets[kMyFiltDirIn],
           stats->tcp_packets[kMyFiltDirOut],
           stats->icmp_packets[kMyFiltDirIn],
           stats->icmp_packets[kMyFiltDirOut],
           stats->other_packets[kMyFiltDirIn],
           stats->other_packets[kMyFiltDirOut]);
    
    g_filter_detached = TRUE;
}

static struct ipf_filter g_my_ip_filter = {
    &g_filter_stats,
    "com.osxkernel.MyIPFilter",
    myipfilter_input,
    myipfilter_output,
    myipfilter_detach
};  

kern_return_t MyIPFilter_start (kmod_info_t * ki, void * d) {  
    int result;
  
    bzero(&g_filter_stats, sizeof(struct myfilter_stats));
    result = ipf_addv4(&g_my_ip_filter, &g_filter_ref);
    
    if (result == KERN_SUCCESS)
        g_filter_registered = TRUE;
    
    return result;
}

kern_return_t MyIPFilter_stop (kmod_info_t * ki, void * d) {
    
    if (g_filter_registered)
    {
        ipf_remove(g_filter_ref);
        g_filter_registered = FALSE;
    }
    /* We need to ensure filter is detached before we return */
    if (!g_filter_detached)
        return KERN_NO_ACCESS; // Try unloading again.
    
    return KERN_SUCCESS;
}

The filter attaches itself when the KEXT is loaded and detaches when the KEXT is unloaded. It prints the source and destination of each incoming and outgoing IPv4 packet to the console and keeps statistics for TCP, UDP, ICMP, and other packets, a summary of which is printed once the filter is detached.
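The classification step in log_ip_packet() relies only on two fields of the IPv4 header: the version nibble in the first byte and the protocol number at byte offset 9. The following user space sketch performs the same classification on a raw byte buffer so that it can be tried outside the kernel. The enum, the function name, and the explicit length check are assumptions added for this sketch; the listing itself reads the header through struct ip.

```c
#include <stddef.h>
#include <stdint.h>

enum ip_class {
    IP_CLASS_TCP,
    IP_CLASS_UDP,
    IP_CLASS_ICMP,
    IP_CLASS_OTHER,
    IP_CLASS_NOT_IPV4
};

/* Hypothetical helper: classifies a packet by its IPv4 protocol field. */
static enum ip_class classify_ipv4(const uint8_t *pkt, size_t len)
{
    /* A minimal IPv4 header is 20 bytes; the version is the high nibble of byte 0. */
    if (len < 20 || (pkt[0] >> 4) != 4)
        return IP_CLASS_NOT_IPV4;

    switch (pkt[9]) {               /* protocol field */
        case 6:  return IP_CLASS_TCP;
        case 17: return IP_CLASS_UDP;
        case 1:  return IP_CLASS_ICMP;
        default: return IP_CLASS_OTHER;
    }
}
```

Because every case returns, no protocol can fall through into another, which is the property the per-protocol counters in the filter depend on.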

The ipf_filter structure is registered using the ipf_addv4() function, which registers an IPv4 filter. IPv6 filters can be registered with ipf_addv6().

The ipf_input and ipf_output callbacks are invoked from the IP stack on arrival or departure of an IP packet. For incoming IP packets, the filter function will be called just before the packet gets processed by a higher-level protocol handler, such as TCP or UDP. If the IP packet was fragmented, it is reassembled before being passed to the filter function. For outgoing packets, the filter function will be called before the packet is fragmented. Normally, a packet would only be seen by a filter function once. However, there is one exception, which is if the packet uses an encryption scheme like IPSec, where an IP packet may contain another encrypted IP packet. In this case, the filter function will be called once for the encrypted packet and once for the decrypted payload.

IP filters work across interfaces, so you will see packets from and to all active interfaces in the system. If you need to know which interface the packet arrived from, this information can be obtained from the mbuf packet header. For outgoing packets, this information is not yet available, because routing of the packet to a network interface happens after the output filter function is called. This is by design, because it is possible for the filter function to alter the destination of a packet, as we will see shortly.

IP filters are not limited to examining packets; it is also possible to modify packets, reject them, and inject your own packets. To illustrate the power of IP filters, we can modify the ipf_output filter function from Listing 13-4 with a new version:

static errno_t myipfilter_output_redirect(void* cookie, mbuf_t* data, ipf_pktopts_t options)
{
    struct in_addr addr_old;
    struct in_addr addr_new;
    int ret;
    
    struct ip* ip = (struct ip*)mbuf_data(*data);
    if (ip->ip_v != 4)
        return 0;
    
    addr_old.s_addr = htonl(134744072); // 8.8.8.8
    addr_new.s_addr = htonl(167837964); // 10.1.1.12
    
    // redirect packets to 8.8.8.8 to the IP address 10.1.1.12.
    if (ip->ip_dst.s_addr == addr_old.s_addr)
    {
        ip->ip_dst = addr_new;
        myipfilter_update_cksum(*data);
        ret = ipf_inject_output(*data, g_filter_ref, options);
        return ret == 0 ? EJUSTRETURN : ret;
    }
    return 0;
}

The preceding example redirects all IP traffic destined for the public IP address 8.8.8.8 to an internal IP address on our network (10.1.1.12). We do this by examining the destination address of each IP packet and, if it matches the address we are looking for, changing the packet's destination to the new address. Because we have modified the packet, we need to re-inject it. The modified packet is then treated as a new one and, hence, would pass through our filter again. We prevent this by passing the reference to our filter when we inject the packet, as shown in the preceding example.
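The decimal constants passed to htonl() in the example are just the four octets of each address packed into a 32-bit number. The following user space sketch builds the same values from the individual octets, which is easier to read and verify. The helper name make_ipv4 is hypothetical.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>

/* Hypothetical helper: builds a network-byte-order IPv4 address from
 * its four octets, e.g. make_ipv4(10, 1, 1, 12) for 10.1.1.12. */
static struct in_addr make_ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    struct in_addr addr;

    addr.s_addr = htonl(((uint32_t)a << 24) | ((uint32_t)b << 16) |
                        ((uint32_t)c << 8)  |  (uint32_t)d);
    return addr;
}
```

With this helper, make_ipv4(8, 8, 8, 8) produces the same s_addr as htonl(134744072), and make_ipv4(10, 1, 1, 12) the same as htonl(167837964).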

Since we have re-injected the packet, we need to stop the original packet from progressing further, which we do by returning EJUSTRETURN. This tells the caller to stop processing the packet without freeing it. To discard a packet completely, we can return a value other than zero or EJUSTRETURN, which causes the caller to stop processing and also free the packet. These rules apply to both incoming and outgoing packets.

When modifying an IP packet's header, we need to update its checksum to prevent the packet from being discarded as corrupt. The IP checksum covers only the IP header, not the payload. TCP and UDP checksums, however, are calculated over a pseudo-header that includes some fields of the IP header, including the source and destination addresses. Consequently, UDP and TCP checksums also need to be recalculated if an IP header's address fields are modified. IP, TCP, and UDP checksums can be calculated for an mbuf_t using the function mbuf_inet_cksum(). See the myipfilter_update_cksum() function in the book sample project MyIPFilter for an example of how to update the checksums.
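To show what recalculating the IP header checksum involves, the following user space sketch computes the standard Internet checksum (the one's complement sum described in RFC 1071) over a raw header and refreshes it after the destination address changes. This is not the book's myipfilter_update_cksum() function, and it covers only the IP header; the function names are hypothetical, and a kernel filter would normally rely on mbuf_inet_cksum() and would also have to update any TCP or UDP checksum.

```c
#include <stddef.h>
#include <stdint.h>

/* Internet checksum over a header of even length, read as
 * big-endian 16-bit words. Returns the checksum in host order. */
static uint16_t ip_header_checksum(const uint8_t *hdr, size_t len)
{
    uint32_t sum = 0;

    for (size_t i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}

/* Hypothetical helper: rewrites the destination address (bytes 16..19)
 * and stores a fresh checksum in bytes 10..11. */
static void set_dst_and_fix_checksum(uint8_t *hdr, size_t len, const uint8_t dst[4])
{
    uint16_t sum;

    for (int i = 0; i < 4; i++)
        hdr[16 + i] = dst[i];

    /* The checksum field must be zero while the sum is computed. */
    hdr[10] = 0;
    hdr[11] = 0;
    sum = ip_header_checksum(hdr, len);
    hdr[10] = (uint8_t)(sum >> 8);
    hdr[11] = (uint8_t)(sum & 0xff);
}
```

A receiver validates a header by running the same sum over all of its bytes, checksum field included; a result of zero from ip_header_checksum() means the header is intact.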

We can now test that our modified IP filter function works correctly using the ping command line utility:


$ ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 10.1.1.12: icmp_seq=0 ttl=64 time=307.636 ms
64 bytes from 10.1.1.12: icmp_seq=1 ttl=64 time=2.513 ms

As you can see, we now get replies from 10.1.1.12 instead of the original IP address. This happens to work with the ping utility, which uses a raw socket. For a regular socket-based application such as ssh, however, we also need to modify the source address of incoming packets to enable full two-way communication; otherwise, the IP stack will be confused when it receives unsolicited packets from the 10.1.1.12 host. You can modify the ipf_input filter function so that the source address of incoming packets is translated from 10.1.1.12 back to 8.8.8.8, thereby ensuring that each packet is delivered to the right application (which still thinks it is talking to 8.8.8.8). This is conceptually similar to how Network Address Translation (NAT) is implemented. NAT is the technique used by Mac OS X's Internet Sharing feature and by the iPhone when it shares its 3G connection with other wireless devices. Refer to the full source code of the MyIPFilter example to see how we can modify a packet on input.
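
The two-way translation described above can be modeled in a few lines of user-space C. This is only a sketch of the address-mapping logic, not the MyIPFilter code: struct addr_pair, the IPV4() macro, and the translate_output() and translate_input() functions are hypothetical names introduced here, and a real filter would operate on struct ip inside an mbuf_t, update the checksums, and re-inject the packet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the address fields of an IPv4 header. */
struct addr_pair {
    uint32_t src;
    uint32_t dst;
};

/* Pack four octets into one value; the same packing is used everywhere,
 * so byte order does not matter for this illustration. */
#define IPV4(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8) | (uint32_t)(d))

#define ADDR_OLD IPV4(8, 8, 8, 8)     /* address the application talks to */
#define ADDR_NEW IPV4(10, 1, 1, 12)   /* host that actually receives the traffic */

/* Outbound direction: rewrite the destination 8.8.8.8 -> 10.1.1.12.
 * Returns true if the packet was changed, in which case a real filter
 * would update the checksums and re-inject the packet. */
bool translate_output(struct addr_pair *p)
{
    assert(p != NULL);
    if (p->dst != ADDR_OLD)
        return false;
    p->dst = ADDR_NEW;
    return true;
}

/* Inbound direction: rewrite the source 10.1.1.12 -> 8.8.8.8 so that the
 * reply matches the connection the application opened. */
bool translate_input(struct addr_pair *p)
{
    assert(p != NULL);
    if (p->src != ADDR_NEW)
        return false;
    p->src = ADDR_OLD;
    return true;
}
```

Note that this simple mapping rewrites every packet arriving from 10.1.1.12, including traffic the local machine addressed to that host directly. A full NAT implementation avoids this by tracking individual connections, which is beyond the scope of the sketch.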

Although, in the previous example, we have only modified the destination address, it is possible to modify any part of the packet, including application layer data. It is also possible to completely replace a packet with a new one. The structure of a typical IP packet is shown in Figure 13-3.


Figure 13-3. An Ethernet frame with an IP, TCP header, and data payload

For both incoming and outgoing packets, an IP filter function sees the complete IP packet, but the packet data passed to it does not include any data-link layer header, such as an Ethernet header. For incoming packets, that header is processed before the packet enters the IP stack, where our filter function is called. For outgoing packets, the Ethernet or other data-link layer header is attached after the packet has passed through the filter function. Again, if you update any part of the packet, you must ensure that the relevant checksums are updated as well.

Interface Filters

Interface filters are as close to the metal as we can get using a filtering mechanism. They operate just before a packet is sent by a network interface and just after one is received. If a packet is destined for a physical interface, as opposed to a loopback or virtual interface, it will likely be sent to an I/O Kit driver for physical transmission. An interface filter is bound to only one interface, unlike an IP or socket filter, which sees the aggregate packet flow of all interfaces in the system. If you need to filter packets on multiple interfaces, you must register multiple filters, one for each interface.

The interface filter mechanism is very similar to that of socket and IP filters. As with socket filters, interface filters can also intercept out-of-band events, such as ioctl() messages sent to the interface, for example, requests to set or get the IP address, network mask, or MTU (maximum transmission unit). An interface filter can also trap events sent to the interface via the kernel event API. As with socket and IP filters, interface filters allow insertion, modification, rejection, and delay of packets. An interface filter is defined by the iff_filter structure:

struct iff_filter {
    void*             iff_cookie;
    const char*       iff_name;
    protocol_family_t iff_protocol;
    iff_input_func    iff_input;
    iff_output_func   iff_output;
    iff_event_func    iff_event;
    iff_ioctl_func    iff_ioctl;
    iff_detached_func iff_detached;
};

All the filter functions are optional, and functions you do not care about can be left as NULL. Unlike an IP or socket filter, an interface filter sees all packets regardless of protocol, including protocols handled in the kernel, such as ARP. If your filter is interested only in IP packets, you can set the iff_protocol field to AF_INET for IPv4 or AF_INET6 for IPv6, which ensures that the filter functions are not called for other protocols. It is only possible to specify protocol families, not individual protocols such as TCP or UDP. Furthermore, if your filter needs to examine IP packets, be aware that at this level the IP packets may already be fragmented, and that you will not have the opportunity to examine encrypted IP headers when IPsec is used. Listing 13-5 shows the implementation of a simple interface filter.

Listing 13-5. MyInterfaceFilter: A Simple Network Interface Filter

#include <libkern/libkern.h>
#include <sys/errno.h>
#include <sys/kpi_mbuf.h>
#include <mach/mach_types.h>
#include <net/kpi_interfacefilter.h>

#include <netinet/in.h>
#include <netinet/ip.h>
#include <net/ethernet.h>

static boolean_t g_filter_registered = FALSE; // set to TRUE once iflt_attach() succeeds
static boolean_t g_filter_detached = FALSE;
static interface_filter_t g_filter_ref;

static errno_t myif_filter_input(void* cookie, ifnet_t interface, protocol_family_t protocol,
                                 mbuf_t* data, char** frame_ptr)
{
    printf("incoming packet: %lu bytes\n", mbuf_pkthdr_len(*data));
    return 0;
}

static errno_t myif_filter_output(void* cookie, ifnet_t interface, protocol_family_t protocol,
                                  mbuf_t* data)
{
    printf("outgoing packet: %lu bytes\n", mbuf_pkthdr_len(*data));
    return 0;
}

static void myif_filter_detached(void* cookie, ifnet_t interface)
{
    g_filter_detached = TRUE;
}

static struct iff_filter g_my_iff_filter =
{
    NULL,
    "com.osxkernel.MyInterfaceFilter",
    0,
    myif_filter_input,
    myif_filter_output,
    NULL,
    NULL,
    myif_filter_detached,
};

kern_return_t MyInterfaceFilter_start (kmod_info_t* ki, void* d)
{
    ifnet_t interface;
    
    if (ifnet_find_by_name("en1", &interface) != KERN_SUCCESS) // change to your own interface
        return KERN_FAILURE;
     
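    // iflt_attach() returns an errno_t; zero indicates success.
    errno_t err = iflt_attach(interface, &g_my_iff_filter, &g_filter_ref);

    ifnet_release(interface);

    // Fail the load if the filter could not be attached; otherwise the
    // detached callback would never fire and the KEXT could not be unloaded.
    if (err != 0)
        return KERN_FAILURE;

    g_filter_registered = TRUE;
    return KERN_SUCCESS;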
}

kern_return_t MyInterfaceFilter_stop (kmod_info_t* ki, void* d)
{
    if (g_filter_registered)
    {
        iflt_detach(g_filter_ref);
        g_filter_registered = FALSE;
    }
    if (!g_filter_detached)
        return KERN_NO_ACCESS; // Don't allow unload until filter is detached.
        
    return KERN_SUCCESS;
}

Interface filters can be attached to a network interface using the iflt_attach() function. You can register a single iff_filter against multiple interfaces. A network interface is represented by the opaque type ifnet_t, which can be manipulated using the interface KPI (kpi_interface.h). In the preceding example, we use the interface KPI function ifnet_find_by_name() to obtain a reference to the network interface with the BSD name “en1,” which, on a MacBook, corresponds to the Wi-Fi interface.

The iff_input filter function is called when an incoming packet is received by the interface. The callback takes five arguments:

  • The cookie argument contains the pointer assigned to the iff_cookie field when the filter was registered.
  • The ifnet_t argument is a reference to the network interface that received the packet. This is especially useful in case the same filter function handles filters attached to more than one network interface.
  • The next parameter is the protocol family the incoming packet belongs to. Unless zero is specified for the iff_protocol field, this will always be the family you specified.
  • The mbuf_t represents the buffer containing the packet data.
  • The last argument, frame_ptr, is a pointer to the data-link frame header of the interface. The size and structure of the frame header vary depending on the network interface. For an Ethernet interface, the frame header consists of the destination and source MAC addresses followed by a 16-bit “ethertype” field, which identifies the encapsulated protocol. The field will be 0x0800 for an Ethernet frame containing an IPv4 packet. You can determine the length of the frame header for an interface by calling the ifnet_hdrlen(ifnet_t) function.
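
The layout of the Ethernet frame header described in the last bullet can be sketched in portable user-space C as follows. The struct eth_info type, the ETH_* constants, and the parse_eth_header() function are hypothetical names chosen for this illustration (kernel code would normally use struct ether_header from <net/ethernet.h>), and the sketch assumes an untagged frame with no VLAN header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ADDR_LEN    6        /* length of a MAC address in bytes */
#define ETH_HEADER_LEN  14       /* destination + source + ethertype */
#define ETH_TYPE_IPV4   0x0800   /* ethertype for an IPv4 payload */

/* Hypothetical container for the decoded header fields. */
struct eth_info {
    uint8_t  dst[ETH_ADDR_LEN];
    uint8_t  src[ETH_ADDR_LEN];
    uint16_t ethertype;          /* in host byte order */
};

/* Decode an untagged Ethernet header from a raw frame.
 * Returns 0 on success, or -1 if the buffer is too short. */
int parse_eth_header(const uint8_t *frame, size_t len, struct eth_info *out)
{
    assert(out != NULL);

    if (frame == NULL || len < ETH_HEADER_LEN)
        return -1;

    memcpy(out->dst, frame, ETH_ADDR_LEN);                 /* bytes 0..5  */
    memcpy(out->src, frame + ETH_ADDR_LEN, ETH_ADDR_LEN);  /* bytes 6..11 */

    /* The ethertype is stored in network (big-endian) byte order. */
    out->ethertype = (uint16_t)((frame[12] << 8) | frame[13]);
    return 0;
}
```

If the decoded ethertype equals ETH_TYPE_IPV4, the IPv4 header begins ETH_HEADER_LEN bytes into the frame. Reading the fields byte by byte avoids alignment and byte-order surprises that can occur when casting a raw buffer to a structure.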

The output filter function iff_output is similar to the input function, but does not provide the frame header as a separate argument. Instead, the mbuf_t contains the entire frame, including the data-link header, rather than pointing to the data after the data-link header. If we wish to examine the IP header of an outgoing packet in an interface filter's output function, we first need to parse the data-link header to find the offset of the IP header. An example of this is shown here:

static errno_t myif_filter_output(void* cookie, ifnet_t interface, protocol_family_t protocol,
                                  mbuf_t* data)
{
    char                  src[64], dst[64];
    unsigned char*        pktbuf = mbuf_data(*data);
    struct ether_header*  eth = (struct ether_header *)pktbuf;

    // Only handle interfaces that use a standard Ethernet header.
    if (ifnet_hdrlen(interface) != ETHER_HDR_LEN)
        return 0;

    // Read the headers only if both are contiguous in the first mbuf.
    if (mbuf_len(*data) >= ETHER_HDR_LEN + sizeof(struct ip) &&
        ntohs(eth->ether_type) == ETHERTYPE_IP)
    {
        // The IP header follows immediately after the data-link header.
        struct ip* iphdr = (struct ip*)(pktbuf + ETHER_HDR_LEN);
        inet_ntop(AF_INET, &iphdr->ip_src, src, sizeof(src));
        inet_ntop(AF_INET, &iphdr->ip_dst, dst, sizeof(dst));
        printf("outgoing packet: %lu bytes ip_src: %s ip_dst: %s\n",
                mbuf_pkthdr_len(*data), src, dst);
    }
    else
        printf("outgoing packet: %lu bytes\n", mbuf_pkthdr_len(*data));
    return 0;
}

The interface filter KPI does not provide functions for injecting incoming and outgoing packets. These are provided by the interface KPI instead: use ifnet_output_raw() to inject an outgoing packet and ifnet_input() to inject an inbound packet. For an example of how ifnet_output_raw() can be used, refer to the source code of the sample driver MyEthernetDriver discussed later in this chapter.
