Thursday, August 23, 2007
Dear Diary,
I’ve always been a big fan of vulnerabilities in operating system kernels because they’re usually quite interesting, very powerful, and tricky to exploit. I recently combed through several operating system kernels in search of bugs. One of the kernels that I searched through was the kernel of Sun Solaris. And guess what? I was successful.
On January 27, 2010, Sun was acquired by Oracle Corporation. Oracle now generally refers to Solaris as “Oracle Solaris.”
Since the launch of OpenSolaris in June 2005, Sun has made most of its Solaris 10 operating system freely available as open source, including the kernel. So I downloaded the source code[23] and started reading the kernel code, focusing on the parts that implement the user-to-kernel interfaces, like IOCTLs and system calls.
Input/output controls (IOCTLs) are used for communication between user-mode applications and the kernel.[24]
Any user-to-kernel interface or API that results in information being passed over to the kernel for processing creates a potential attack vector. The most commonly used are:
IOCTLs
System calls
Filesystems
Network stack
Hooks of third-party drivers
The vulnerability that I found is one of the most interesting I’ve discovered because its cause, an undefined error condition, is unusual for an exploitable vulnerability (compared to the average overflow bugs). It affects the implementation of the SIOCGTUNPARAM IOCTL call, which is part of the IP-in-IP tunneling mechanism provided by the Solaris kernel.[25]
I took the following steps to find the vulnerability:
Step 1: List the IOCTLs of the kernel.
Step 2: Identify the input data.
Step 3: Trace the input data.
These steps are described in detail below.
There are different ways to generate a list of the IOCTLs of a kernel. In this case, I simply searched the kernel source code for the customary IOCTL macros. Every IOCTL gets its own number, usually created by a macro. Depending on the IOCTL type, the Solaris kernel defines the following macros: _IOR, _IOW, and _IOWR. To list the IOCTLs, I changed to the directory where I unpacked the kernel source code and used the Unix grep command to search the code.
solaris$ pwd
/exports/home/tk/on-src/usr/src/uts
solaris$ grep -rnw -e _IOR -e _IOW -e _IOWR *
[..]
common/sys/sockio.h:208:#define SIOCTONLINK     _IOWR('i', 145, struct sioc_addrreq)
common/sys/sockio.h:210:#define SIOCTMYSITE     _IOWR('i', 146, struct sioc_addrreq)
common/sys/sockio.h:213:#define SIOCGTUNPARAM   _IOR('i', 147, struct iftun_req)
common/sys/sockio.h:216:#define SIOCSTUNPARAM   _IOW('i', 148, struct iftun_req)
common/sys/sockio.h:220:#define SIOCFIPSECONFIG _IOW('i', 149, 0) /* Flush Policy */
common/sys/sockio.h:221:#define SIOCSIPSECONFIG _IOW('i', 150, 0) /* Set Policy */
common/sys/sockio.h:222:#define SIOCDIPSECONFIG _IOW('i', 151, 0) /* Delete Policy */
common/sys/sockio.h:223:#define SIOCLIPSECONFIG _IOW('i', 152, 0) /* List Policy */
[..]
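To make the role of the _IOR, _IOW, and _IOWR macros more concrete, here is a minimal sketch of how such macros typically pack a direction, a parameter size, a group character, and a command number into a single IOCTL number. The DEMO_* names, the bit layout, and struct demo_req are assumptions chosen for illustration (a simplified BSD-style encoding); they are not copied from the Solaris headers, so the resulting values should not be expected to match the real SIOCGTUNPARAM number.

```c
#include <stdint.h>

/* Hypothetical, simplified BSD-style encoding; not the Solaris definitions. */
#define DEMO_IOCPARM_MASK 0xffu        /* parameter size field: 8 bits here */
#define DEMO_IOC_OUT      0x40000000u  /* kernel copies data out to the user */
#define DEMO_IOC_IN       0x80000000u  /* kernel copies data in from the user */

/* Pack direction, parameter size, group character, and number into 32 bits. */
#define DEMO_IOC(dir, grp, num, type) \
	((uint32_t)((dir) | ((sizeof (type) & DEMO_IOCPARM_MASK) << 16) | \
	((uint32_t)(grp) << 8) | (uint32_t)(num)))

#define DEMO_IOR(grp, num, type)  DEMO_IOC(DEMO_IOC_OUT, grp, num, type)
#define DEMO_IOW(grp, num, type)  DEMO_IOC(DEMO_IOC_IN, grp, num, type)
#define DEMO_IOWR(grp, num, type) \
	DEMO_IOC(DEMO_IOC_IN | DEMO_IOC_OUT, grp, num, type)

/* Hypothetical request structure standing in for struct iftun_req. */
struct demo_req {
	char name[32];
};

/* Helpers that unpack the fields again. */
static unsigned demo_group(uint32_t cmd)  { return (cmd >> 8) & 0xffu; }
static unsigned demo_number(uint32_t cmd) { return cmd & 0xffu; }
static unsigned demo_size(uint32_t cmd)
{
	return (cmd >> 16) & DEMO_IOCPARM_MASK;
}
```

With this layout, DEMO_IOR('i', 147, struct demo_req) yields a number whose group is 'i' and whose command number is 147, which is why a recursive grep for the macro names is enough to enumerate the IOCTLs.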
I now had a list of IOCTL names supported by the Solaris kernel. To find the source files that actually process these IOCTLs, I searched the whole kernel source for each IOCTL name on the list. Here is an example search for the SIOCTONLINK IOCTL:
solaris$ grep --include=*.c -rn SIOCTONLINK *
common/inet/ip/ip.c:1267: /* 145 */ { SIOCTONLINK, sizeof (struct sioc_addrreq), IPI_GET_CMD,
The Solaris kernel provides different interfaces for IOCTL processing. The interface that is relevant for the vulnerability I found is a programming model called STREAMS.[26] The fundamental STREAMS unit is called a Stream, which is a data transfer path between a process in user space and the kernel. All kernel-level input and output under STREAMS is based on STREAMS messages, which usually contain the following elements: a data buffer, a data block, and a message block. The data buffer is the location in memory where the actual data of the message is stored. The data block (struct datab) describes the data buffer. The message block (struct msgb) describes the data block and how the data is used.
The message block structure has the following public elements.
uts/common/sys/stream.h[27]
[..]
/*
 * Message block descriptor
 */
typedef struct msgb {
        struct msgb     *b_next;
        struct msgb     *b_prev;
        struct msgb     *b_cont;
        unsigned char   *b_rptr;
        unsigned char   *b_wptr;
        struct datab    *b_datap;
        unsigned char   b_band;
        unsigned char   b_tag;
        unsigned short  b_flag;
        queue_t         *b_queue;       /* for sync queues */
} mblk_t;
[..]
The structure elements b_rptr and b_wptr specify the current read and write pointers in the data buffer pointed to by b_datap (see Figure 3-1).
When using the STREAMS model, the IOCTL input data is referenced by the b_rptr element of the msgb structure, or its typedef mblk_t. Another important concept of the STREAMS model is that of linked message blocks. As described in the STREAMS Programming Guide, “[a] complex message can consist of several linked message blocks. If buffer size is limited or if processing expands the message, multiple message blocks are formed in the message” (see Figure 3-2).
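The relationship between message blocks, read/write pointers, and linked blocks can be sketched in user space as follows. This is only an illustration under stated assumptions: struct demo_datab and struct demo_msgb are hypothetical, cut-down stand-ins for struct datab and struct msgb, and the helper functions are invented for this example rather than taken from the kernel.

```c
#include <stddef.h>

/* Hypothetical, reduced stand-in for struct datab. */
struct demo_datab {
	unsigned char *db_base;   /* first byte of the data buffer */
	unsigned char *db_lim;    /* one past the last byte of the buffer */
};

/* Hypothetical, reduced stand-in for struct msgb (mblk_t). */
struct demo_msgb {
	struct demo_msgb  *b_cont;   /* next block of the same message */
	unsigned char     *b_rptr;   /* first unread byte in the buffer */
	unsigned char     *b_wptr;   /* first unwritten byte in the buffer */
	struct demo_datab *b_datap;  /* descriptor of the data buffer */
};

/* Number of unread bytes held by a single message block. */
static size_t demo_blk_len(const struct demo_msgb *mp)
{
	return (size_t)(mp->b_wptr - mp->b_rptr);
}

/* Total number of unread bytes across all linked blocks of a message. */
static size_t demo_msg_len(const struct demo_msgb *mp)
{
	size_t total = 0;

	for (; mp != NULL; mp = mp->b_cont)
		total += demo_blk_len(mp);
	return total;
}

/*
 * Follow b_cont n times, returning NULL if the chain is shorter.
 * n == 2 corresponds to an expression such as mp->b_cont->b_cont,
 * but with an explicit check at every hop.
 */
static const struct demo_msgb *
demo_nth_block(const struct demo_msgb *mp, unsigned n)
{
	while (mp != NULL && n-- > 0)
		mp = mp->b_cont;
	return mp;
}
```

In this model, the payload of a block is the byte range from b_rptr up to (but not including) b_wptr, and a complex message is simply a singly linked list of such blocks.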
I then took the list of IOCTLs and started reviewing the code. As usual, I searched the code for input data and then traced that data while looking for coding errors. After a few hours, I found the vulnerability.
uts/common/inet/ip/ip.c
ip_process_ioctl()[28]
[..]
26692 void
26693 ip_process_ioctl(ipsq_t *ipsq, queue_t *q, mblk_t *mp, void *arg)
26694 {
[..]
26717         ci.ci_ipif = NULL;
[..]
26735         case TUN_CMD:
26736                 /*
26737                  * SIOC[GS]TUNPARAM appear here. ip_extract_tunreq returns
26738                  * a refheld ipif in ci.ci_ipif
26739                  */
26740                 err = ip_extract_tunreq(q, mp, &ci.ci_ipif, ip_process_ioctl);
[..]
When a SIOCGTUNPARAM IOCTL request is sent to the kernel, the function ip_process_ioctl() is called. In line 26717, the value of ci.ci_ipif is explicitly set to NULL. Because of the SIOCGTUNPARAM IOCTL call, the switch case TUN_CMD is chosen (see line 26735), and the function ip_extract_tunreq() is called (see line 26740).
uts/common/inet/ip/ip_if.c
ip_extract_tunreq()[29]
[..]
8158 /*
8159  * Parse an iftun_req structure coming down SIOC[GS]TUNPARAM ioctls,
8160  * refhold and return the associated ipif
8161  */
8162 /* ARGSUSED */
8163 int
8164 ip_extract_tunreq(queue_t *q, mblk_t *mp, const ip_ioctl_cmd_t *ipip,
8165     cmd_info_t *ci, ipsq_func_t func)
8166 {
8167         boolean_t exists;
8168         struct iftun_req *ta;
8169         ipif_t *ipif;
8170         ill_t *ill;
8171         boolean_t isv6;
8172         mblk_t *mp1;
8173         int error;
8174         conn_t *connp;
8175         ip_stack_t *ipst;
8176
8177         /* Existence verified in ip_wput_nondata */
8178         mp1 = mp->b_cont->b_cont;
8179         ta = (struct iftun_req *)mp1->b_rptr;
8180         /*
8181          * Null terminate the string to protect against buffer
8182          * overrun. String was generated by user code and may not
8183          * be trusted.
8184          */
8185         ta->ifta_lifr_name[LIFNAMSIZ - 1] = '\0';
8186
8187         connp = Q_TO_CONN(q);
8188         isv6 = connp->conn_af_isv6;
8189         ipst = connp->conn_netstack->netstack_ip;
8190
8191         /* Disallows implicit create */
8192         ipif = ipif_lookup_on_name(ta->ifta_lifr_name,
8193             mi_strlen(ta->ifta_lifr_name), B_FALSE, &exists, isv6,
8194             connp->conn_zoneid, CONNP_TO_WQ(connp), mp, func, &error, ipst);
[..]
In line 8178, a linked STREAMS message block is referenced, and in line 8179, the pointer ta is set to reference the user-controlled IOCTL data. Later on, the function ipif_lookup_on_name() is called (see line 8192). The first two parameters of ipif_lookup_on_name() derive from the user-controllable data that ta points to.
uts/common/inet/ip/ip_if.c
ipif_lookup_on_name()
[..]
19116 /*
19117  * Find an IPIF based on the name passed in. Names can be of the
19118  * form <phys> (e.g., le0), <phys>:<#> (e.g., le0:1),
19119  * The <phys> string can have forms like <dev><#> (e.g., le0),
19120  * <dev><#>.<module> (e.g. le0.foo), or <dev>.<module><#> (e.g. ip.tun3).
19121  * When there is no colon, the implied unit id is zero. <phys> must
19122  * correspond to the name of an ILL. (May be called as writer.)
19123  */
19124 static ipif_t *
19125 ipif_lookup_on_name(char *name, size_t namelen, boolean_t do_alloc,
19126     boolean_t *exists, boolean_t isv6, zoneid_t zoneid, queue_t *q,
19127     mblk_t *mp, ipsq_func_t func, int *error, ip_stack_t *ipst)
19128 {
[..]
19138         if (error != NULL)
19139                 *error = 0;
[..]
19154         /* Look for a colon in the name. */
19155         endp = &name[namelen];
19156         for (cp = endp; --cp > name; ) {
19157                 if (*cp == IPIF_SEPARATOR_CHAR)
19158                         break;
19159         }
19160
19161         if (*cp == IPIF_SEPARATOR_CHAR) {
19162                 /*
19163                  * Reject any non-decimal aliases for logical
19164                  * interfaces. Aliases with leading zeroes
19165                  * are also rejected as they introduce ambiguity
19166                  * in the naming of the interfaces.
19167                  * In order to confirm with existing semantics,
19168                  * and to not break any programs/script relying
19169                  * on that behaviour, if<0>:0 is considered to be
19170                  * a valid interface.
19171                  *
19172                  * If alias has two or more digits and the first
19173                  * is zero, fail.
19174                  */
19175                 if (&cp[2] < endp && cp[1] == '0')
19176                         return (NULL);
19177         }
[..]
In line 19139, the value of error is explicitly set to 0. Then in line 19161, the interface name provided by the user-controlled IOCTL data is checked for the presence of a colon (IPIF_SEPARATOR_CHAR is defined as a colon). If a colon is found in the name, the bytes after the colon are treated as an interface alias. If an alias has two or more digits and the first is zero (ASCII zero or hexadecimal 0x30; see line 19175), the function ipif_lookup_on_name() returns to ip_extract_tunreq() with a return value of NULL, and the variable error is still set to 0 (see lines 19139 and 19176).
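The separator scan and the leading-zero check can be reproduced in user space to see which interface names reach the early return. The following is a minimal sketch that mirrors lines 19155 through 19176 of the excerpt; the function name demo_hits_early_return(), the DEMO_SEPARATOR macro, and the empty-string guard are assumptions added for this illustration. Note that, as excerpted, the check only requires that at least one more byte follows the zero; it does not verify that the byte is a digit.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_SEPARATOR ':'   /* stands in for IPIF_SEPARATOR_CHAR */

/*
 * Returns true when the name would reach the early "return (NULL)":
 * a colon, followed by '0', followed by at least one more byte.
 */
static bool demo_hits_early_return(const char *name)
{
	size_t namelen = strlen(name);
	const char *endp = &name[namelen];
	const char *cp;

	if (namelen == 0)
		return false;   /* guard added for this sketch only */

	/* Scan backward for the separator; cp stops at name[0] if none. */
	for (cp = endp; --cp > name; ) {
		if (*cp == DEMO_SEPARATOR)
			break;
	}

	if (*cp == DEMO_SEPARATOR) {
		/* Alias longer than one byte that starts with '0'. */
		if (&cp[2] < endp && cp[1] == '0')
			return true;
	}
	return false;
}
```

For example, a name such as "ip.tun0:01" takes the early-return path in this model, while "ip.tun0:0" and "ip.tun0:1" do not.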
uts/common/inet/ip/ip_if.c
ip_extract_tunreq()
[..]
8192         ipif = ipif_lookup_on_name(ta->ifta_lifr_name,
8193             mi_strlen(ta->ifta_lifr_name), B_FALSE, &exists, isv6,
8194             connp->conn_zoneid, CONNP_TO_WQ(connp), mp, func, &error, ipst);
8195         if (ipif == NULL)
8196                 return (error);
[..]
Back in ip_extract_tunreq(), the pointer ipif is set to NULL because ipif_lookup_on_name() returns that value (see line 8192). Since ipif is NULL, the condition of the if statement in line 8195 evaluates to true, and line 8196 is executed. The ip_extract_tunreq() function then returns to ip_process_ioctl() with error, which is still set to 0, as its return value.
uts/common/inet/ip/ip.c
ip_process_ioctl()
[..]
26717         ci.ci_ipif = NULL;
[..]
26735         case TUN_CMD:
26736                 /*
26737                  * SIOC[GS]TUNPARAM appear here. ip_extract_tunreq returns
26738                  * a refheld ipif in ci.ci_ipif
26739                  */
26740                 err = ip_extract_tunreq(q, mp, &ci.ci_ipif, ip_process_ioctl);
26741                 if (err != 0) {
26742                         ip_ioctl_finish(q, mp, err, IPI2MODE(ipip), NULL);
26743                         return;
26744                 }
[..]
26788         err = (*ipip->ipi_func)(ci.ci_ipif, ci.ci_sin, q, mp, ipip,
26789             ci.ci_lifr);
[..]
Back in ip_process_ioctl(), the variable err is set to 0 since ip_extract_tunreq() returns that value (see line 26740). Because err equals 0, the condition of the if statement in line 26741 evaluates to false, and lines 26742 and 26743 are not executed. In line 26788, the function pointed to by ipip->ipi_func (in this case the function ip_sioctl_tunparam()) is called while the first parameter, ci.ci_ipif, is still set to NULL (see line 26717).
uts/common/inet/ip/ip_if.c
ip_sioctl_tunparam()
[..]
9401 int
9402 ip_sioctl_tunparam(ipif_t *ipif, sin_t *dummy_sin, queue_t *q, mblk_t *mp,
9403     ip_ioctl_cmd_t *ipip, void *dummy_ifreq)
9404 {
[..]
9432         ill = ipif->ipif_ill;
[..]
Since the first parameter of ip_sioctl_tunparam() is NULL, the reference ipif->ipif_ill in line 9432 can be represented as NULL->ipif_ill, which is a classic NULL pointer dereference. If this NULL pointer dereference is triggered, the whole system will crash due to a kernel panic. (See Section A.2 for more information on NULL pointer dereferences.)
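To illustrate what NULL->ipif_ill means at the machine level, the sketch below computes the address that a member access through a structure pointer would touch, without actually dereferencing anything. The structure layout (struct demo_ipif with a leading ipif_next member) and the function demo_member_address() are assumptions made up for this example; the real ipif_t layout is not shown in the excerpts above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real ill_t and ipif_t are far larger. */
struct demo_ill {
	int ill_flags;
};

struct demo_ipif {
	struct demo_ipif *ipif_next;  /* assumed leading member */
	struct demo_ill  *ipif_ill;   /* member read in line 9432 */
};

/*
 * Address that a load of p->ipif_ill would access, computed with
 * offsetof() so that p itself is never dereferenced.
 */
static uintptr_t demo_member_address(const struct demo_ipif *p)
{
	return (uintptr_t)p + offsetof(struct demo_ipif, ipif_ill);
}
```

With p set to NULL, the computed address is simply the member's offset within the structure, that is, a small address near zero. If nothing is mapped there, the access faults, which in kernel context leads to the panic described above.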
Summary of the results so far:
An unprivileged user of a Solaris system can call the SIOCGTUNPARAM IOCTL (see (1) in Figure 3-3).
If the IOCTL data sent to the kernel is carefully crafted (there has to be an interface name with a colon directly followed by an ASCII zero and another arbitrary digit), it’s possible to trigger a NULL pointer dereference (see (2) in Figure 3-3) that leads to a system crash (see (3) in Figure 3-3).
But why is it possible to trigger that NULL pointer dereference? Where exactly is the coding error that leads to the bug?
The problem is that ipif_lookup_on_name() can be forced to return to its caller function without an appropriate error condition being set.
This bug exists in part because the ipif_lookup_on_name() function reports error conditions to its caller in two different ways: through the return value of the function (return (NULL)) as well as through the variable error (*error != 0). Each time the function is called, the authors of the kernel code must ensure that both error conditions are properly set and properly evaluated within the caller function. Such a coding style is error-prone and therefore not recommended. The vulnerability described in this chapter is an excellent example of the kind of problem that can arise from such code.
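The hazard of reporting errors through two channels can be modeled in a few lines of user-space C. This is a sketch under stated assumptions, not kernel code: all demo_* names are hypothetical, the bad_alias flag stands in for the leading-zero alias check, and the choice of EINVAL as the error code is an assumption for illustration, since the excerpts do not show how the code was eventually corrected.

```c
#include <errno.h>
#include <stddef.h>

struct demo_ipif {
	int id;
};

static struct demo_ipif demo_table[1] = { { 1 } };

typedef struct demo_ipif *(*demo_lookup_fn)(int bad_alias, int *error);

/* Mirrors the flawed shape: NULL can be returned while *error stays 0. */
static struct demo_ipif *demo_lookup_flawed(int bad_alias, int *error)
{
	if (error != NULL)
		*error = 0;
	if (bad_alias)
		return NULL;            /* failure, but no error code recorded */
	return &demo_table[0];
}

/* Variant that records an error code on every failing path. */
static struct demo_ipif *demo_lookup_strict(int bad_alias, int *error)
{
	if (error != NULL)
		*error = 0;
	if (bad_alias) {
		if (error != NULL)
			*error = EINVAL;    /* assumed error code */
		return NULL;
	}
	return &demo_table[0];
}

/* Caller shaped like ip_extract_tunreq(): forwards error on NULL. */
static int demo_extract(demo_lookup_fn lookup, int bad_alias,
    struct demo_ipif **out)
{
	int error;
	struct demo_ipif *ipif = lookup(bad_alias, &error);

	if (ipif == NULL)
		return error;           /* 0 here means "success" to the caller */
	*out = ipif;
	return 0;
}

/* Defensive caller: never reports success when the lookup returned NULL. */
static int demo_extract_defensive(demo_lookup_fn lookup, int bad_alias,
    struct demo_ipif **out)
{
	int error;
	struct demo_ipif *ipif = lookup(bad_alias, &error);

	if (ipif == NULL)
		return (error != 0) ? error : EINVAL;
	*out = ipif;
	return 0;
}
```

In this model, demo_extract() combined with demo_lookup_flawed() reports success while leaving the output pointer untouched, which is the same inconsistency that the kernel code exhibits. Either closing the gap in the callee or distrusting it in the caller removes the inconsistency.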
Figure 3-3. Summary of the results so far. An unprivileged user can force a system crash by triggering a NULL pointer dereference in the Solaris kernel.
uts/common/inet/ip/ip_if.c
ipif_lookup_on_name()
[..]
19124 static ipif_t *
19125 ipif_lookup_on_name(char *name, size_t namelen, boolean_t do_alloc,
19126     boolean_t *exists, boolean_t isv6, zoneid_t zoneid, queue_t *q,
19127     mblk_t *mp, ipsq_func_t func, int *error, ip_stack_t *ipst)
19128 {
[..]
19138         if (error != NULL)
19139                 *error = 0;
[..]
19161         if (*cp == IPIF_SEPARATOR_CHAR) {
19162                 /*
19163                  * Reject any non-decimal aliases for logical
19164                  * interfaces. Aliases with leading zeroes
19165                  * are also rejected as they introduce ambiguity
19166                  * in the naming of the interfaces.
19167                  * In order to confirm with existing semantics,
19168                  * and to not break any programs/script relying
19169                  * on that behaviour, if<0>:0 is considered to be
19170                  * a valid interface.
19171                  *
19172                  * If alias has two or more digits and the first
19173                  * is zero, fail.
19174                  */
19175                 if (&cp[2] < endp && cp[1] == '0')
19176                         return (NULL);
19177         }
[..]
In line 19139, the value of error, which holds one of the error conditions, is explicitly set to 0. Error condition 0 means that no error has occurred so far. By supplying a colon directly followed by an ASCII zero and an arbitrary digit in the interface name, it is possible to trigger the code in line 19176, which leads to a return to the caller function. The problem is that no valid error condition is set for error before the function returns. So ipif_lookup_on_name() returns to ip_extract_tunreq() with error still set to 0.
uts/common/inet/ip/ip_if.c
ip_extract_tunreq()
[..]
8192         ipif = ipif_lookup_on_name(ta->ifta_lifr_name,
8193             mi_strlen(ta->ifta_lifr_name), B_FALSE, &exists, isv6,
8194             connp->conn_zoneid, CONNP_TO_WQ(connp), mp, func, &error, ipst);
8195         if (ipif == NULL)
8196                 return (error);
[..]
Back in ip_extract_tunreq(), the error condition is returned to its caller function ip_process_ioctl() (see line 8196).
uts/common/inet/ip/ip.c
ip_process_ioctl()
[..]
26735         case TUN_CMD:
26736                 /*
26737                  * SIOC[GS]TUNPARAM appear here. ip_extract_tunreq returns
26738                  * a refheld ipif in ci.ci_ipif
26739                  */
26740                 err = ip_extract_tunreq(q, mp, &ci.ci_ipif, ip_process_ioctl);
26741                 if (err != 0) {
26742                         ip_ioctl_finish(q, mp, err, IPI2MODE(ipip), NULL);
26743                         return;
26744                 }
[..]
26788         err = (*ipip->ipi_func)(ci.ci_ipif, ci.ci_sin, q, mp, ipip,
26789             ci.ci_lifr);
[..]
Then in ip_process_ioctl(), the error condition is still set to 0. Thus, the condition of the if statement in line 26741 evaluates to false, and the kernel continues executing the rest of the function, which leads to the NULL pointer dereference in ip_sioctl_tunparam().
What a nice bug!
Figure 3-4 shows a call graph summarizing the relationships of the functions involved in the NULL pointer dereference bug.