tUGzGtG ljGhMkG r G wG
UGUG XG UGUG YG G y
jXG
{ hs G yGmG
¢G
jYG
Fig. 8. Overall System Architecture
scalar multiplier. We assume that Key can be addressable in single bit and the addressed bit is forwarded into Control 2. Control 2 also generates the Ready signal to indicate when the final outputs of X1, X2 and Z are ready. The outputs are compared with the pre-computed results in Tester. In our system, we suppose that the coordinate conversion into affine coordinate system and calculation of Y -coordinate value are performed in the counterpart of this system if it is needed. If we assume that this system is implemented in RFID tags, the counter part can be a tag reader or back-end system.
124
Y.K. Lee and I. Verbauwhede
Step (1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12) (13) (14) (15) (16) (17) (18)
Field Operation Command Initial Shift Copy 1. T2 ← X1 + X2 Add Shift Copy Shift 2. T2 ← T22 Multiply Shift∗4 3. T1 ← X1 × X2 Multiply Switch 4. X1 ← x Load x Shift∗2 Switch 5. X1 ← T2 × X1 Multiply Switch Shift∗5 6. X1 ← X1 + T1 Add
Reg1 Reg2 Reg3 Reg4 Reg5 Reg6 X1 X2 Z – – – – X1 X2 Z – – X1 X1 X2 Z – – T2 X1 X2 Z – – – T2 X1 X2 Z – T2 T2 X1 X2 Z – – T2 T2 X1 X2 Z T2 – – X1 X2 Z – X1 X2 Z T2 – T1 – X2 Z T2 – – T1 X2 Z T2 – X1 T1 X2 Z T2 – T2 – X1 T1 X2 Z – T2 X1 T1 X2 Z X1 T2 – T1 X2 Z T2 X1 – T1 X2 Z X1 – T1 X2 Z T2 X1 – T1 X2 Z T2
Fig. 9. Register File Management for Adding Algorithm
4.4
Register File Management for Algorithm Implementation
For better understanding how the system works, the register file management of Adding Algorithm of Fig. 5 is shown in Fig. 9. Note that when the algorithm is actually implemented, some more detailed controls are required. In this example, only the register file rearrangement is shown. Remember that the field addition and multiplication are performed as Reg1 ← Reg2 × Reg3 and Reg1 ← Reg1 + Reg3 respectively. Initially, we assume that the arrangement of register values are as step (1) in Fig. 9, and that Reg4, Reg5 and Reg6 are not available (marked as –) since meaningful values are not stored yet. The commands of Assign, Shift, Switch and Copy work as described in Fig. 7 and the rearrangements of register values are shown in each step. Note that Shift∗4 is the abbreviation of four times Shift operation and some register values are changed to be – when the old values are not used any more. The rest of the Montgomery scalar multiplication algorithm can be also described similarly. In fact, the use of this register file increases the number of cycles due to the control overhead. However, considering that a field multiplication takes a large number of cycles, the number of overhead cycles is relatively small. Note that a field multiplication requires 163 cycles for 163 bit words and the digit size of 1 (reference Fig. 3 for the digit size). We compare synthesis results for various cases in the following section.
A Compact Architecture for Montgomery Elliptic Curve
5
125
Synthesis Results
In order to verify our algorithm and architecture, we synthesized the proposed architecture using TSMC 0.18μm standard cell library. Summarized results are shown in Table 3. While version 1 uses a randomly accessible register file, the other versions use the circular shift register file which is shown in the previous section. Comparing version 1 with version 2, we can see how changing the register file management strategy can effectively reduce the area (17% reduction of total gate area). The use of the circular shift register file requires more cycles. However, if we increase the digit size into 4 (version 5), a much smaller number of cycles can be achieved with even less gate area. A comparison with other works is shown in Table 4. Since the architecture of [4] uses an affine coordinate system, it requires only 6 registers but require a larger number of cycles due to field inverse operations. The ALU of [4] has separate logic modules for multiplication, square and addition where multiplication requires 163 cycles and square and addition require 1 cycle. Except for [6], all the reported results include memory area. In [6], the reported gate area of 8,214 does not include the required RAM area. For a fairer comparison, we estimate the total gate area assuming that 1 bit memory is equivalent to 10 gate area. Note that 1 bit register require 6 gates and there should be some extra area for addressing. According to our experiment of synthesis for a 163 bit register in standard CMOS compilers, the number of gates per bit is above 10 gates. Even in this under-estimation, our results show much smaller Table 3. Synthesis Results Version Ver1 Ver2 Ver3 Ver4 Ver5
Digit size (d) 1 1 2 3 4
Register Type Random Access Circular Shift
Gate Area 15,894 13,182 14,188 14,896 15,538
Cycle 295,032 313,901 168,911 120,581 95,521
Table 4. Comparison with other works
[7] [8] [4]
Technology Key Size Digit Size 0.13 165 – 0.13 160 – – 163 –
[6]
0.13
163
1
Our Work (Ver2)
0.18
163
1
∗∗
Area (Gate) Cycle Memory Units 30,333 545,440 – 28,311 2,500,000 320∗8 bits 15,097 432,000 6∗163 bits 8,214 + 5 RAM 353,710 8∗163 bits (≈ 16, 364)∗∗ 13,182
313,901
6∗163 bits
The gate area of 8,214 does not include RAM area. The gate area of 16,364 is an estimation including the RAM area.
126
Y.K. Lee and I. Verbauwhede
gate number with a smaller cycle number. This result is obvious considering that our ALU is similar to [6] and the number of total memory units of our architecture is two less than [6]. In [8], among several implementations, we show the one having the smallest area, which is still much larger than our results and also has a much larger number of cycles. As a result, our implementation has not only the smallest area but also the smallest cycle.
6
Conclusion
We proposed a compact architecture for an elliptic curve scalar multiplier. This contribution has been achieved by reducing the number of the registers and the complexity of the register file. The reduction of the number of the registers is done in two different approaches. By proposing new formulae for the common Z projective coordinate system, one register was reduced and by the reuse of the registers, two more registers were reduced. Accordingly, three registers were reduced in total. The reduction of the complexity of the register file is done by designing a circular shift register file. As a result, for elliptic curve scalar multiplication, only 13.2k gates and 314k cycles are required. This result not only achieves the smallest area but also the smallest cycle number compared with fairly comparable architectures. Moreover, our processor architecture is secure against TA (Timing Analysis) and SPA (Simple Power Analysis) due to the property of Montgomery elliptic curve scalar multiplication. Acknowledgments. This work is supported by NSF CCF-0541472, FWO and funds from Katholieke Universiteit Leuven.
References 1. Montgomery, P.: Speeding the Pollard and elliptic curve methods of factorization. Mathematics of Computation 48, 243–264 (1987) 2. L´ opez, J., Dahab, R.: Fast multiplication on elliptic curves over GF (2m ) without precomputation. In: Ko¸c, C ¸ .K., Paar, C. (eds.) CHES 1999. LNCS, vol. 1717, pp. 316–327. Springer, Heidelberg (1999) 3. Meloni, N.: Fast and Secure elliptic Curve Scalar Multiplication Over Prime Fields Using Special Addition Chains. Cryptology ePrint Archive: listing for 2006 (2006/216) (2006) 4. Paar, C.: Light-Weight Cryptography for Ubiquitous Computing. Invited talk at the University of California, Los Angeles (UCLA). Institute for Pure and Applied Mathematics (December 4, 2006) 5. Sakiyama, K., Batina, L., Mentens, N., Preneel, B., Verbauwhede, I.: Small-footprint ALU for public-key processors for pervasive security. In: Workshop on RFID Security, 12 pages (2006)
A Compact Architecture for Montgomery Elliptic Curve
127
6. Batina, L., Mentens, N., Sakiyama, K., Preneel, B., Verbauwhede, I.: Low-cost Elliptic Curve Cryptography for wireless sensor networks. In: Butty´ an, L., Gligor, V., Westhoff, D. (eds.) ESAS 2006. LNCS, vol. 4357, pp. 6–17. Springer, Heidelberg (2006) ¨ ¨ Sunar, B., Savas, E.: Low-power elliptic curve cryptography using 7. OztAurk, E.A., scaled modular arithmetic. In: Joye, M., Quisquater, J.-J. (eds.) CHES 2004. LNCS, vol. 3156, pp. 92–106. Springer, Heidelberg (2004) 8. Satoh, A., Takano, K.: A Scalable Dual-Field Elliptic Curve Cryptographic Processor. IEEE Transactions on Computers 52(4), 449–460 (2003)
Windows Vault: Prevention of Virus Infection and Secret Leakage with Secure OS and Virtual Machine Yoshiki Sameshima1, Hideaki Saisho1, Tsutomu Matsumoto2, and Norihisa Komoda3 1 Hitachi Software Engineering, Co., Ltd. Graduate School of Environment and Information Science, Yokohama National University 3 Graduate School of Information Science and Technology, Osaka University {same,saisho}@hitachisoft.jp, [email protected], [email protected] 2
Abstract. We present an integrated system of two Windows workstations; while the first workstation is prepared to process secret information, the second is for non-secret which may contain computer virus, and the two workstations are integrated into a PC with secure OS, virtual machine and gateways. Since the two workstations are virtually separated at the physical level, the first workstation is not infected by virus, nor is secret leaked out to the Internet, even if the second is infected by unknown virus. Comparing previous work which realizes complete data isolation for intelligence community, user of the proposed system can import data securely from the second workstation to the first through security guaranteed channel between the two workstations. The user can also read e-mail from the Internet on the first without fear of virus infection, and as a result the user does not need to be aware that she/he uses the two workstations. Keywords: virus, secret leakage, information flow, virtual machine, secure OS.
1 Introduction The top two IT security threats are virus infection and secret leakage including PC theft [1-3]. The best practice to prevent virus infection is to use virus protection software and install security patch. However, the practice is becoming less effective, because of the following two reasons: The first reason is zero-day attack [2]; the attack code appears soon after vulnerability is announced, for example the attack code of MS07-002 appeared three hours later after the patch was released. As a result, virus definition file cannot be in time. The second reason is targeted attack [4]; while existing attack codes aim to be spread over many victims, the target of the new attack is very limited, for example a single organization or few people. As a result, there is less chance to detect the target attack and the virus definition file may not be issued. A solution against the virus threat is to utilize secure OS [5, 6]. The OS supports the Mandatory Access Control (MAC) [7], and the damage of attack to vulnerability S. Kim, M. Yung, and H.-W. Lee (Eds.): WISA 2007, LNCS 4867, pp. 128–141, 2007. © Springer-Verlag Berlin Heidelberg 2007
Windows Vault: Prevention of Virus Infection and Secret Leakage
129
of application is limited only to the application; the attack code cannot access file nor execute process which are not permitted in security policy, even if the code gets the administrator privilege. The secure OS is used mainly for server, but not client PC, because of management of the security policy; it is difficult for end user or system administrator to configure the security policy specifying which process is permitted to access to which resources with what kind of operations. Another solution is behavior based virus detection [8]. Virus has some specific behavior; some of virus code is encrypted to bypass the virus protection system, and decryption of code is one of features of the virus. Another virus sends many e-mail messages of its own copy. The new technology watches such behavior of virus and detects the virus, but the new virus detection may miss targeted attack, because the virus targets specific organization or information, and the virus may be tuned so as not to be detected by such virus protection software. On the other hand, more serious IT security threat is leakage of secret information or secret leakage [3]. The main reason of secret leakage is lost of PC or storage media, but other reason is intentional leakage by authorized user and exposition to the Internet by virus. Solutions against secret leakage are file encryption and prohibition of portable storage media/printer. The Windows OS supports file encryption, and its security policy can enforce to stop use of USB memory. However, these solutions are not effective for intentional leakage through e-mail or HTTP by authorized user; it is possible to stop sending e-mail outside or posting via HTTP, however, this is not practical for commercial organizations. NetTop [9, 10] is a countermeasure of the two threats; it is designed for intelligence community, and the goal is data isolation. User of NetTop accesses classified information of multiple categories and operates multiple workstations which are integrated into a single PC with Trusted Linux and Virtual Machine (VM). The workstations are separated virtually at physical level, so the threats do not happen. However, the user should always be aware that which workstation she/he is operating and needs to switch the two workstations. This is acceptable for users of intelligence community, but it is very troublesome for office workers of commercial companies. The authors propose Windows Vault as system isolation which usage is as same as a normal Windows as possible. The user operates a safe workstation isolated from the Internet, but she/he can access the external information that comes from the Internet on the safe workstation without threats of virus infection or secret leakage. The word ‘Vault’ means a room with thick walls and strong doors where valuables can be kept safely; Windows Vault is a vault running Windows, that is, a Windows workstation guarded by secure OS and gateways which establish secure data exchange between the isolated workstation and the external environment including the Internet. In the paper, the authors describe the architecture of Windows Vault in Section 2, evaluation of performance, security and usages in Section 3 through 5, compares with the previous works in Section 6, and conclude in Section 7.
130
Y. Sameshima et al.
2 Windows Vault 2.1 Concepts The principal of Windows Vault is very simple; data is divided into two categories, safe secret and unsafe non-secret, and the later includes information on the Internet and may contain virus. Windows Vault processes the two data categories with two virtual workstations; Internal Workstation for safe secret, and External Workstation for unsafe non-secret, and the two workstations are integrated into a single physical PC with use of VM and secure OS. Network is also divided; Internal Workstation is connected to Internal Network and External Workstation to External Network including the Internet. The above architecture realizes very high level security, as far as user processes the two categories in completely separated manner. But such use is not realistic. While main task of user of commercial company is processed on Internal Workstation, the user also needs to access the Internet and utilize information of the Internet as part of secret; text on Web and spread sheet data attached to e-mail from business partner are examples of such information. It is also desirable to use a single e-mail client; the user does not want to use two clients on Internal and External Workstations, because it is different from the current e-mail client usage. As for web browser, the other most used network application, it is normal that user operates multiple browser windows, and it is desirable that the user can operate browser window accessing a site on the Internet in the same operation of the window on Internal Network. As a result, the following functions are required with security guaranteed form: (1) Data import: data is imported from External Workstation to Internal Workstation. (2) Mail retrieval and sending on Internal Workstation: user operates e-mail client on Internal Workstation, retrieves and sends messages with the client from/to Internal and External Networks. (3) Browsing Internet sites from Internal Workstation: user operates web browser on External Workstation from Internal Workstation. With the above functions, the user needs to use Internal Workstation only and she/he can process information on External Workstation on Internal Workstation. Four gateways connect the two workstations and realize secure channel between the two workstations for e-mail and copy & paste operation. In the following, the overall architecture, platform OS, and four gateways are described. 2.2 Overall Architecture The overall architecture of Windows Vault is shown in Figure 1. Platform OS is the base of security, and we adopt the Security-Enhanced Linux (SELinux) [6], which supports the MAC based on the Type Enforcement model. Each of Internal and External Workstations consists of VM, Windows OS, and applications. Two LANs, Internal and External Networks, are connected to Platform OS. It is assumed that the network devices connected to Internal Network are managed, that is, only identified and authorized devices are connected to the network, and virus infected PCs are not connected.
Windows Vault: Prevention of Virus Infection and Secret Leakage
IW AP AP … Windows OS VM
Server
IW: Internal W/S EW: External W/S AP: Application
EW AP AP … Windows OS … VM Platform OS PC hardware
Gateway
Windows Vault
131
… Server
Internal Network
Internet Firewall
External Network
Fig. 1. Architecture of Windows Vault
Windows Vault POP Gateway IN: Internal Network EN: External Network IW: Internal W/S EW: External W/S
Display Gateway IW
Clipboard Gateway
EW
Terminal Gateway Virtual IN Virtual EN IN
EN Fig. 2. Gateways on Virtual Internal Networks
The virtual networks on Platform OS are configured as shown in Figure 2; Internal Workstation is connected to Internal Network and the virtual Internal Network, so on External Workstation, and each gateway is connected to the virtual networks and External Network. Data exchanged between Internal and External Workstations is limited only through the gateways by the configuration of Platform OS. 2.3 Platform OS In order to prevent user from changing configuration of Platform OS, direct access to Platform OS must be prohibited. The following configuration realizes this: (1) The default run level is changed to level 4. (2) The init process starts services required to manage Platform OS, for example syslogd, and does not start getty, nor trap ctl-alt-del. (3) The starting script kicks the following programs: the X window system display server, four gateways, two VMs, and screen lock program. (4) The shutdown process runs after both the VMs end. The security policy of Platform OS is configured as shown in Figure 3. The init process launches the xinit command, and the command starts the X window system display server, VMs and gateways. Each VM accesses three files, log, configuration,
132
Y. Sameshima et al.
init init_t call
xinit wv_start_t
X Display Server wv_xserver_t
w/o
VM VM wv_int_t wv_ext_t POP Gateway wv_pop_gw_t
Log File wv_int_log_t r/o
…
r/w
Configuration File wv_int_conf_t Virtual Disk File wv_int_disk_t
Fig. 3. Security Policy of Platform OS
and virtual disk image, and each of the files is assigned a different type of SELinux. The access kind of each type is minimal, for example, wv_int_log_t, the type of log file of Internal Workstation, is written only by wv_int_t, the domain of Internal Workstation. 2.4 Gateways The first request, data import from External Workstation to Internal Workstation without virus infection, is realized by Clipboard Gateway, which works as follows: (1) The clipboard watch agent on External Workstation transmits object on the clipboard to the gateway when the user copies object. (2) The gateway checks the object type, and transmits the object if the type is not file. Otherwise the gateway does not transmit. (3) The agent copies the object to the clipboard of Internal Workstation. In order to realize part of the second requirement, e-mail retrieval from External Network, the e-mail client on Internal Workstation accesses two POP servers on Internal and External Networks, and the client accesses the later through POP Gateway, which encapsulates attached files. The encapsulation and decapsulation processes are shown in Figure 4. POP gateway encrypts each attached file with a randomly generated AES key, encrypts the random key with an encryption key of RSA, and signs the encrypted random key and file with a signature key of RSA. When the user opens the attached file received from External Network, it is sent to Display Gateway, the gateway checks the signature with the verification key corresponding to the signature key, forwards the encrypted key and file to the display agent on External Workstation, and then the agent decrypts the file and displays it. The e-mail from Internal Network is opened normally on Internal Workstation. Attached Files from the Internet are opened in External Workstation and they are accessed safely from Internal Workstation through Terminal Gateway. Actually the gateway is a remote access client or terminal client running on a remote access server or terminal server; as shown in Figure 5, combination of two remote accesses, one from Internal Workstation to Terminal Gateway and the other from the gateway to External Workstation, realized a remote access from Internal Workstation to External Workstation.
Windows Vault: Prevention of Virus Infection and Secret Leakage
Text Message + Encrypted Attached File Header Text Body Encrypted Key Encrypted File Signature
Encryption Key, Signature Key
POP Gateway
Encrypted Attached File Encrypted Key Encrypted File Signature Internal Workstation
133
Display Gateway
Text Message + Attached File Header Text Body Attached File
POP Server Decryption Key
Encrypted Key Encrypted File
Verification Key
Display Agent Attached File
External Workstation
Fig. 4. Encapsulation and Decapsulation of Attached File
Internal Workstation Terminal Client Windows VM
Terminal Gateway Terminal Client Terminal Server Platform OS
External Workstation Terminal Server Windows VM
Fig. 5. Terminal Gateway as a Combination of Two Remote Accesses
Fig. 6. The same files opened in Internal Workstation (right) and External Workstation (left)
134
Y. Sameshima et al.
With use of Terminal and Display Gateways, the user can open an attached file of e-mail from the Internet with the same operation of opening attached file from Internal Network; the file is automatically displayed in the terminal client on Internal Workstation, and it looks almost the same as file opened in Internal Workstation locally as shown in Figure 6.
3 Performance Evaluation The authors have implemented a prototype of Windows Vault and measured performance in the environment shown in Table 1 with the benchmark program, CrystalMark 2004R2 [11], and result is shown in Table 2. The column M is the result of the mean score of Internal and External Workstations measured simultaneously, S is one of Internal Workstation, and W is a normal Windows PC with Intel Core Solo (1.66GHz). Table 1. Environment Item Platform OS VM Windows CPU Memory
Description Cent OS release 5 VMware Workstation 5.5.2 Windows XP Pro. + SP2 Intel Core 2 Duo (2.16GHz) 2GB
Table 2. CrystalMark Results Item Integer Float Memory HD GDI
M 8,361 9,511 10,447 11,457 1,979
S 8,383 9,465 10,722 11,985 2,205
W 3,901 4,584 4,764 4,485 1,426
Table 3. Performance of POP Gateway and Display Process File Size (B) 10K 345K 3,262K
Msg. Size (B) 15K 467K 4,406K
N (sec) 0.198 0.206 0.522
P (sec) 0.198 0.280 0.977
P/N 1.00 1.36 1.87
D (sec) 0.018 0.438 1.954
Comparing with the normal Windows, the performance of Internal Workstation and External Workstation is better, and each of the workstations has shown enough performance as a Windows PC. The performance of retrieving e-mail and opening attached file is shown in Table 3. The column N is the time without POP Gateway, and the column P is the case through the gateway. The overhead is about 0-90%. The column D is the
Windows Vault: Prevention of Virus Infection and Secret Leakage
135
transmission and decryption time of encrypted file through Display Gateway. There is overhead of encryption and decryption, however, it is not so heavy to give impact on usability.
4 Security Considerations 4.1 Attacks from External Workstation/Network to Internal Workstation Internal Workstation is not directly connected to External Network, but there are four routes of attack from External Network to Internal Workstation. The first route is via e-mail; the e-mail client receives messages from External Network through POP Gateway which encrypts attached files, and the files cannot be opened on Internal Workstation. As a result, there is no possibility to infect virus via e-mail attached file. Virus might be included in header or text body, so the gateway should check character code and line length, and sanitize if they do not meet the protocol specification. The second is via VM; External Workstation may be infected with virus which attacks the base VM, and such virus may attack Internal Workstation. But the MAC of Platform OS does not allow access between Internal and External Workstations, and such attack cannot happen. The third is via Clipboard Gateway. Normally user sees and selects an object, and copies it to the clipboard, so the possibility of virus in clipboard object, which is not a file, is considered to be low, but the object might contain virus code. In order to avoid such possibility, the following object check functions of the gateway are useful: (1) Plain text only: A clipboard object consists of type and data, and the gateway checks the type and only text object is transmitted to Internal Workstation. The size of text data and character code are also be checked. (2) Strictly defined data: If the object data type is strictly defined, it is possible for the gateway to check the clipboard object meets the definition and does not contain virus. The fourth route is via Terminal Gateway; basically the gateway transmits the keyboard and mouse events from Internal Workstation to External Workstation and graphical screen data from External Workstation to Internal Workstation, so virus code cannot come into Internal Workstation. Normally a remote access protocol supports clipboard sharing, however, Terminal Gateway kills the function, and virus infection through Terminal Gateway does not happen. 4.2 Attacks by User User may try to leak a secret file on Internal Workstation to External Workstation through Display Gateway, but the file is not forwarded to External Workstation, because the signature verification at the gateway fails. Consequently, there is no secret leakage of file created on Internal Workstation to External Workstation or Network through Display Gateway. The information flow of POP Gateway and Clipboard Gateway is only from External Workstation to Internal Workstation,
136
Y. Sameshima et al.
therefore secret leakage does not happen. Terminal Gateway transmits display image, mouse and keyboard events, and secret leakage from Internal Workstation to External Workstation cannot happen except that a malicious user leaks secret text by typing keyboard. The user can change consoles of character terminals or X Window by typing ctlalt-function key, however, Platform OS is configured as no getty and screen lock program is running on the X Window console, so the user can only access Internal and External Workstations. As a result, normal office worker cannot access Platform OS, nor change its configuration. But the user who has knowledge of Linux management can access Platform OS by trapping the boot process or direct access to hard disk. Possible solutions are change of the init process program, physical lock of the PC hardware or use of the Trusted Platform Module (TPM) [12]. The TPM is a chip on a PC motherboard and calculates hash values of software components. On request from a remote challenger, the TPM returns the hash value with signature generated within the chip. With this attestation process, the remote challenger can authenticate PC hardware and verify software integrity of Platform OS, and as a result, it is possible to detect physical attacks such as replace of PC hardware, workstations and gateways. The TPM is also used as a key storage; it is possible to encrypt the virtual disk images of the two workstations and decrypt only on the specific PC hardware that has the TPM storing the decryption key, and this countermeasure makes the attack of direct access to hard disk useless. A few services, such as system logging, are running on Platform OS, and it might be possible to attack such services. However, each service is given a domain and separated from the other processes by the MAC, so Internal Workstation cannot be attacked through a service even if there is vulnerability of the service. 4.3 Vulnerability of Gateways and Enhancements If POP Gateway has vulnerability, attacker may get the control of the gateway and can inject virus to Internal Workstation or steal secret from Internal Workstation. It is the same as Clipboard and Display Gateways. The MAC of Platform OS cannot cover the weakness of the three gateways, and the security quality of the gateways is very important. However, it is possible to enhance the security by dividing function of each gateway as follows: Display Gateway has three functions; firstly it receives encrypted and signed file from Internal Workstation, secondly verifies the signature and strips it, and finally sends the file to External Workstation. The three functions can be realized by three processes of different domains, receive process of r_display_t domain, verify process of v_display_t domain, and send process of s_display_t domain. Data between processes is passed via files of different types; the receive process receives file from Internal Workstation and saves it of rv_display_t type, the verify process verifies and strips the signature and saves data as file of vs_display_t type, and the send process sends it to External Workstation. The permitted operations between the domains and types are shown in Figure 7.
Windows Vault: Prevention of Virus Infection and Secret Leakage
Internal Workstation
External Workstation
Encrypted File + Signature
Encrypted File
Receive Process (r_display_t) write only SecurityEnhanced Display Gateway
137
Verify Process (v_display_t) read only
Encrypted File + Signature (rv_display_t)
write only
Send Process (s_display_t) read only Encrypted File (vs_display_t)
Fig. 7. Domains and Types of Security-Enhanced Display Gateway
Comparing the original implementation of the gateway, the new gateway is more secure, because of the following reasons: (1) An attacker, which might be the user, must exploit the vulnerability through file, which is enforced by the MAC, and the attack through file is more difficult than one through TCP/IP communication channel, because the former is not interactive and less measures of attack. (2) The order of attack is fixed; first the reception process, next the verification process, then the send process, and there is no other pass of attack, because the other pass is prohibited by the MAC, and the attacker has less means to attack. (3) Each process realizes one function and the code is simpler, so that it is more secure than process supporting multiple functions. (4) Even if an attacker succeeds to exploit all the processes, the information flow is limited from Internal Workstation to External Workstation by the MAC, and there is no chance virus infection of Internal Workstation. The situation of Clipboard Gateway is very similar to Display Gateway. The gateway has three functions; firstly it receives clipboard object from External Workstation, secondly verifies the type object, and finally sends the object to Internal Workstation. The security of the gateway can be enhanced in the same way of Display Gateway. Since the POP is an interactive protocol, the situation of POP Gateway is different from the other two gateways, but the basic idea is the same; the first process receives messages as a POP client, the next encrypts and signs the attached files, and the third behaves as a POP server and sends the messages to Internal Workstation. While the fail of security of the three gateways before the enhancements leads to catalysis, the channel through Terminal Gateway is safe even if there is vulnerability in the gateway. The gateway consists of two applications, terminal server and terminal client which domains are different. As a result, if an attacker gets control of one application, the attacker cannot get one of the other. Even if one application has
138
Y. Sameshima et al.
vulnerability and an attack code succeeds to transmit any type of data, the other application transmits only graphical display data, keyboard and mouse events, so that the security of the gateway is guaranteed. 4.4 Another Data Category: Unsafe Secret Current Windows Vault processes two data categories, safe secret and unsafe nonsecret, but there is another category data, unsafe secret. Online banking is a typical example; account number, password, balance sheet are secret, but the web page data sent from the server may contain virus. Another typical example is e-mail message sent from a business partner; an attached file contains business secret but the file might be infected with virus. The current Windows Vault processes such data as unsafe non-secret, because it comes from External Network, and there is a risk that such secret data is stolen from External Workstation. A solution is the third type workstation, which can access only trusted web sites through encrypted and authenticated channel over the Internet. An attached file of e-mail is transmitted through Display Gateway and opened on this workstation; even if the workstation is infected with virus, secret cannot be leaked out to the Internet, because the workstation is connected only the trusted web sites.
5 Usability of Network Applications In the following, usability of sending e-mail and web browsing is described. 5.1 Sending Message to External Network In the current implementation, user can receive e-mail message from External Network with e-mail client on Internal Workstation, but cannot reply to the message, because Internal Workstation cannot access External Network, and this leads to inconvenience. It is also true that the user cannot send a new message from Internal Workstation to External Network. As far as user sends from Internal Workstation to External Network, the only solution is encryption; all messages from internal to trusted external recipient are encrypted by the fifth gateway, SMTP gateway, which is connected to the Virtual Internal Network and External Network, and encrypts all received e-mail from Internal Workstation. A solution to reply to message from External Network is to add the original whole message as an attached file to the message; POP gateway encrypts and signs the whole message, and then adds it as the last attached file. When user wants to reply to the message, the user selects the last attached file to open, and then the file is sent to External Workstation through Display Gateway, an e-mail client opens the file and displays the original message, and then the user replies with normal operation through Terminal Gateway. As for a new message to External Network, the user needs to send it with the email client on External Network. However, by sending carbon copy to the user own account on Internal Workstation, the user can access the new message on Internal Workstation.
Windows Vault: Prevention of Virus Infection and Secret Leakage
139
5.2 Web Browsing With click of links, user can brows web pages without conscious of the page location. Windows Vault divides OS and network into internal and external, so the user needs to conscious of which network the accessing site belongs to, and changes browsers on Internal and External Workstations. This is big change of usage of web browser. In order to realize ‘smooth browsing from internal to external,’ it is better that with click of link to external page on internal page, the external page is displayed on Internal Workstation. This function can be realized with HTTP Gateway which calls web browser on External Workstation. The gateway is connected to the virtual External Network and Internal Network, and behaves as follows: (1) The web browser on Internal Workstation accesses the gateway according to the proxy configuration of the browser. (2) The gateway returns an error page to the browser, and sends the requested URL to the HTTP agent on External Workstation. (3) The agent directs browser on External Workstation to access the URL. (4) The browser accesses the page of the URL, displays the external page, and user can access the page from Internal Workstation through Terminal Gateway. With the gateway, the user can smoothly brows from a page on Internal Network to a page on External Network in the same as the current operation. The reverse direction browsing is also possible, but it needs to sanitize the URL in external page, since virus may be contained it the URL.
6 Related Works NetTop also consists of Trusted Linux and VMs, and Windows’ on VMs exchange data via ‘Regrade Server’ with explicit user authorization [10]. The user of NetTop is enforced to use two Windows OSes, two mail clients, two documentation tools, etc., and this is a burden for office workers of commercial companies. On the other hand, the user of Windows Vault accesses only Internal Workstation basically; the user can receive and read text body of e-mail from the Internet, open and read attached files with the same operation as the normal Windows. It is also true that the user can access web pages on the Internet from Internal Workstation through HTTP and Terminal Gateways. This is convenient for users who do not aware of multi-category security. The Trusted Virtual Domains framework [13] is a kind of system isolation based on Trusted Platform Module [12]. The goal of the framework is to establish secure communication channels between software components with the integrity assurance of the other components. The framework also utilizes multiple VMs and software components of different domains running on a hardware platform. There is a secure communication channel between the software components of the same domain, but no communication between those of different domains. However, Windows Vault is focusing the air gap between the different domains or workstations, and has established secure user data exchange between the two workstations. VIRTUS [14] is a new processor virtualization architecture for security-oriented next-generation mobile terminals. It creates OS instances, called domains, for
140
Y. Sameshima et al.
pre-installed applications and downloaded native applications. VIRTUS supports inter-domain communications, but it does not clearly specify its security. On the other hand, in Windows Vault the communication between Internal Workstation and External Workstation/External Network is designed carefully not to cause virus infection nor secret leakage.
7 Conclusions The authors have described Windows Vault, which consists of two Windows workstations, one for safe secret and the other for unsafe non-secret. The two workstations are integrated into a single PC with use of VM and secure OS, and connected securely by gateways. These gateways transmit data between the two workstations without virus infection of Internal Workstation or secret leakage from Internal Workstation to External Workstation. Comparing with the existing data isolation system, the proposed system realizes the same security without change of current user operations of e-mail or awareness of multi-category security of intelligence community.
References 1. Gordon, L.A., Loeb, M.P., Lucyshyn, W., Richardson, R.: 2006 CSI/FBI Computer Crime and Security Survey: Computer Security Institute (2006) 2. Symantec Reports Rise in Data Theft, Data Leakage, and Targeted Attacks Leading to Hackers’ Financial Gain: News Release (19th March 2007), http://www.symantec.com/ about/news/release/article.jsp?prid=20070319_01 3. McAfee, Inc. Releases New Research Suggesting Data Loss Will Lead To Next Major Corporate Collapse: Press Release (24th April 2007), http://www.symantec.com/ about/news/release/article.jsp?prid=20070319_01 4. Frantzen, S.: Targeted attack: Experience from the trenches: The SANS Institute (May 21, 2006) http://isc.sans.org/diary.html?storyid=1345 5. Argus Systems Group, Inc.: PitBull.comPack, OS-level Security for Solaris and AIX: White Paper (March 2001), http://www.argus-systems.com/public/docs/pitbull. white paper.oss.pdf 6. Loscocco, P.A., Smalley, S.D.: Meeting Critical Security Objectives with SecurityEnhanced Linux. In: Proceedings of the 2001 Ottawa Linux Symposium (2001), http://www.nsa.gov/selinux/papers/ottawa01.pdf 7. Trusted Computer System Evaluation Criteria; Department of Defense Standard 5200.28STD (August 1983) 8. Christodorescu, M., Jha, S., Seshia, S., Song, D., Bryant, R.E.: Semantics-Aware Malware Detection. In: IEEE Symposium on Security and Privacy (2005) 9. HP NetTop: A Technical Overview (December 2004) 10. Meushaw, R., Simard, D.: Nettop, Commercial Technology in High Assurance Applications: NSA Tech Trend Notes (Fall 2000), http://www.vmware. com/pdf/ TechTrendNotes.pdf 11. Crystal Dew World, http://crystalmark.info/?lang=en
Windows Vault: Prevention of Virus Infection and Secret Leakage
141
12. Trusted Computing Group, TCG Specification Architecture Overview, Specification Revision 1.4 (August 2, 2007) 13. Griffin, J.L., Jaeger, T., Perez, R., Sailer, R., van Doorn, L., Caceres, R.: Trusted Virtual Domains: Toward Secure Distributed Services. In: The First Workshop on Hot Topics in System Dependability (June 30, 2005) 14. Inoue, H., Ikeno, A., Kondo, M., Sakai, J., Edahiro, M.: VIRTUS: A new processor virtualization architecture for security-oriented next-generation mobile terminals. In: Proceedings of the 43rd annual conference on Design automation, pp. 484–489 (2006)
An Architecture Providing Virtualization-Based Protection Mechanisms Against Insider Attacks Frederic Stumpf0002 , Patrick R¨ oder00020002 , and Claudia Eckert Department of Computer Science, Technische Universit¨ at Darmstadt, Darmstadt, Germany {stumpf,roeder,eckert}@sec.informatik.tu-darmstadt.de
Abstract. Insider attacks are very powerful and are relevant in many scenarios, such as grid computing, corporate computing on home computers and electronic commerce of digital content. We present an example scenario to illustrate these attacks and perform a threat analysis to extract requirements for preventing insider attacks. We believe that these requirements are also representative of other scenarios. We develop a four layered protection architecture by using virtualization techniques based on these requirements. Therefore, the proposed architecture prevents insider attacks in scenarios with similar requirements as well.
1
Introduction
Insider attacks are relevant in scenarios where confidential data or intellectual property is processed, e.g., grid computing, corporate computing on home computers and electronic commerce of digital content. These attacks are more dangerous than outsider attacks, since the inside attacker can use his legitimate permissions to perform an attack, e.g., to steal data. An FBI/CSI survey [1] states that a substantial portion of losses is attributed to inside attackers, which shows that these attacks are important and must be considered. Depending on the scenario, the corresponding attacks differ in the attacker’s options to perform an attack. For example, a private user who is trying to bypass a digital rights management (DRM) protection mechanism has unlimited access to his computer. In contrast to this, office employees often do not have administrative access to their computers. The common problem of these attacks is that the attacker can modify his local system configuration to deactivate or bypass protection mechanisms. As a result, each scenario requires different protection mechanisms. These mechanisms should be effective, but should not limit usability, e.g., mechanisms for multi-purpose computers should maintain the flexibility of these computers. Therefore, we develop a flexible protection architecture for general purpose 0002 00020002
The author is supported by the Germand Research Foundation (DFG) under grant EC 163/4-1, project TrustCaps. The author is supported by the PhD program Enabling Technologies for Electronic Commerce of the German Research Foundation (DFG).
S. Kim, M. Yung, and H.-W. Lee (Eds.): WISA 2007, LNCS 4867, pp. 142–156, 2007. c Springer-Verlag Berlin Heidelberg 2007 0002
An Architecture Providing Virtualization-Based Protection Mechanisms
143
computers, which maintains the universal character of these computers. We introduce protection mechanisms which prevent software components on different trust levels from being able to influence each other. This gives us the ability to run additional software on the machine besides the software that requires protection. We use the scenario of handling confidential documents in a business environment to perform a threat analysis. Next, we extract requirements for our protection architecture from this analysis. A central challenge is to detect when the attacker has modified his local system configuration. For this purpose, we use the concepts of the Trusted Computing Group (TCG) and virtualization techniques. The remainder of this paper is organized as follows: We provide background information on the TCG concepts and virtualization techniques in Section 2. Section 3 explains our scenario. We perform a threat analysis for this scenario in Section 4. We extract requirements for the protection architecture which we present in Section 5. After that, we evaluate these mechanisms by discussing whether they prevent the attacks described before in Section 6. Section 7 discusses related work. Finally, we conclude and discuss future work in Section 8.
2
Background
In this section, we present background information on trusted computing and virtualization techniques, which are important for understanding our approach. 2.1
Trusted Computing
The core of the TCG mechanisms [2] is the Trusted Platform Module (TPM), which is basically a smart card soldered to the mainboard of a PC. The TPM serves as the root of trust. Tampering with the TPM is generally difficult, since it is implemented in hardware and uses non-migratable keys for certain cryptographic functions. Therefore, we assume that the TPM is trustworthy. One must also assume that the hardware vendor is trustworthy, and has designed the TPM chip according to the specifications. Although the TPM chip is not specified to be tamper-resistant, it is tamper-evident, meaning that unauthorized manipulations will at least be detected. In this paper, the Platform Configuration Registers (PCRs) are of particular interest. These registers are initialized on startup and then used to store the software integrity values. The TPM calculates the hash value of software components before they are executed and writes this software integrity value to a specific PCR by combining the current result with the previous value of the PCR. The following cryptographic function is used to calculate the values for the specific registers: Extend(P CRN , value) =SHA1 (P CRN value) SHA1 refers to the cryptographic hash function used by the TPM, while the operation represents a concatenation of byte arrays. The trust-anchor for a so
144
F. Stumpf, P. R¨ oder, and C. Eckert
called trust-chain is the Core Root of Trust Measurement (CRTM) , which resides in the BIOS and is first executed when a platform is powered up. The CRTM then measures itself and the BIOS, and hands over control to the next software component in the trust-chain. For each measured component, an event is created and stored in the stored measurement log (SML). The PCR values and SML are used to attest the platform’s state to a remote party. In order to guarantee the authenticity of these values, they are signed with a non-migratable TPM signing key, namely the Attestation Identity Key (AIK). A remote platform can compare those signed values with reference values to check whether the platform is in a trustworthy state or not. 2.2
Virtualization
Virtualization allows the execution of several different virtual machines, with different operating systems, on a single host entity by the introduction of a hypervisor, also known as a virtual machine monitor (VMM) [3]. This hypervisor provides an abstract interface to the hardware and partitions the underlying hardware resources. The underlying resources are then available to the virtual machines through the VMM which maintains full control of the resource given to the VM. The main contribution of this technology is the provision of strong isolation between different virtual machines, established by the virtual machine monitor.
3
Example Scenario
In this section, we present the example scenario of handling confidential documents in a business environment. We use a client server based document processing architecture (DPA) as an example. This architecture enables the user to view and edit documents, while enforcing access control policies, which are defined by corresponding rules. We perform a threat analysis for this architecture in the next section. The DPA is depicted in Figure 1 and uses three databases. The document database (Doc DB) contains all documents of the system. The rule database (Rule DB) contains the access control rules, which specify allowed or denied accesses to the documents and their parts. Finally, the user database (User DB) stores the credentials of the users of the system, as well as the corresponding roles, including their hierarchy. The document editor (DE) presents documents to the user and offers operations that can be performed on the documents. If the user invokes such an operation, the corresponding request is sent to the document processor (DP), which runs on the server and performs the requested operation if it is permitted. Inside the DP, the policy enforcement point (PEP) intercepts each operation and asks the policy decision point (PDP) whether the requested operation is allowed. The PDP uses the three databases to decide whether to allow or deny the requested operation. We believe that this scenario is a representative
An Architecture Providing Virtualization-Based Protection Mechanisms Client
User
145
Server
DE
DP PEP
Rule DB
Doc DB
PDP
User DB
Fig. 1. Document processing architecture (DPA)
example for evaluating insider attacks in scenarios where all documents reside on the server. Moreover, the scenario has similarities with scenarios for DRM or corporate computing on home computers. The common requirement of these scenarios is that the corresponding architectures depend on the trustworthiness of the client configuration. In addition to this, these architectures must ensure that data is not extracted from the protected system on the client side.
4
Threat Analysis
In this section, we evaluate possible attacks on the components of the DPA mentioned above. We describe two different types of attackers, namely the inside attacker and the outside attacker. Inside Attacker: An inside attacker is a legitimate user of the system who misuses his permissions to perform an attack. We assume that an inside attacker also has physical access to the client machine, which enables additional attacks. In addition to this, an inside attacker can perform remote attacks on the document processor. In a case, where the system is used within a company, an inside attacker is typically an employee of the company. Outside Attacker: An outside attacker is a user who does not have legitimate access to the system, which means that he has no credentials to login to the server and also has no physical access to any machine of the system. The outside attacker can only perform remote attacks on the client and the document processor. For example, if our system is used within a company an outside attacker might be either a professional hacker hired by a competitor, a government spy performing industry espionage or a hacker who tries to perform attacks just for his own entertainment. All together, an inside attacker can perform much more powerful attacks than an outside attacker. Since the attacks of an outside attacker are a subset of the attacks of an inside attacker, we focus on attacks by inside attackers. With this approach, we also prevent the attacks of an outside attacker. We assume the server to be trustworthy. Accordingly, we only examine attacks on the client.
146
4.1
F. Stumpf, P. R¨ oder, and C. Eckert
Attacks on the Client
In this section, we describe possible attacks in the scenario mentioned above. We focus on attacks that compromise the confidentiality of the protected data. Software Manipulations. In the following, we discuss software manipulations on the client, which can be performed on different components of the system. Extract data from the document editor. An inside attacker can try to extract confidential information from the document editor. As a result, the document editor must be designed in a way, that it is not possible to extract confidential data from it and transfer it to another application, e.g., an e-mail client. Manipulate the document editor. If the document editor prevents the extraction of confidential data, the attacker can try to manipulate the corresponding protection mechanisms. Alternatively, the attacker could use another document editor that is compatible with the protocols used in the DPA, but allows data to be extracted. Therefore, we must have a mechanism to ensure that the document editor is not modified and that it is the version that has been deployed originally. Use the operating system mechanisms to extract data. The attacker could also use the underlying operating system on which the document editor runs to extract data. One such method is to extract the confidential data from the memory by writing the memory used by the document editor to a file, which is also referred to as a memory dump. An attack like this requires that the application of the attacker runs in kernel mode, which enables access to the entire physical memory of the machine. There are many other similar methods, which all have in common that they use the services offered by the underlying operating system. Consequently, we must configure the underlying operating system in a way that prevents the use of services like memory dumps to extract the confidential data from the document editor. Manipulate operating system. If the underlying operating system is configured in a way that prevents the extraction of confidential data using its services, the attacker could modify the operating system or its configuration. The attacker could try to re-enable the mechanisms that we have disabled before. For example, he could exchange a system module, e.g., the module that performs memory management or the module that displays data on the screen. Another similar approach is to replace the entire operating system with a system that allows the extraction of data. As a result, we need a mechanism to ensure that the operating system is neither manipulated nor entirely replaced. This mechanism must ensure that the operating system is authentic, which means that it is the one that has been deployed and it is configured as we have defined. Extract confidential data from the swap file. The DPA does not store documents permanently on the client. Instead, the DPA keeps them in the memory to display and edit them, which avoids attacks based on accessing the hard drive of a client machine. Nevertheless, we must also ensure that the operating system does not
An Architecture Providing Virtualization-Based Protection Mechanisms
147
swap the memory used by the document editor to the hard disk, if available memory is getting low. This again must be done by configuring the operating system. As a consequence, there is no risk of losing confidential data by stealing a hard drive of an authorized client. Masquerading Attacks. In the following, we discuss different types of masquerading attacks on the DPA. We describe the cloning of a client machine and the spoofing of the server of the DPA. Clone a client machine. Another type of attack is to clone an authorized client machine by creating an exact copy of the configuration of an authentic machine, e.g., by creating an exact copy of the hard disk of the authentic client or by copying the authentication credentials to another client. For example, a legitimate user could set up such a cloned client machine in an area which is not under surveillance to extract data by taking pictures of the displayed data. To reduce the risk of this type of attack, we need a mechanisms that prevents the cloning of a client machine. This mechanism must ensure that the identity of the client machine is bound to the hardware of the machine. Masquerade the server. The attacker can also masquerade the server, e.g., by redirecting the network traffic using a DNS poisoning attack. Therefore, we need mutual authentication between client and server. In our scenario, the masquerading of a server is less dangerous, since no confidential documents are uploaded to the server. However, this can be the case in the scenario of corporate computing on home computers. Hardware Attacks and Analogue Attacks In the following, we discuss attacks on the hardware and analogue attacks. We consider DMA attacks as hardware attacks and discuss them in this section, too. Use DMA to extract data. Another type of attack is to use a device with DMA to extract confidential data. DMA bypasses any protection managed by the CPU and allows to access the entire memory. Extract data using probing attacks. Moreover, the attacker can perform probing attacks, such as mechanical or electrical probing attacks, on the hardware components of the DPA. Using this attack, he can extract confidential data directly from the hardware, e.g., from the TPM or from a memory module. Use analogue channels to extract data. Besides the attacks mentioned above, the attacker can take an analogous screen shot of the display using a camera. 4.2
Requirements for the Protection Architecture
In this section, we present the requirements for the protection architecture. The first two requirements apply to any scenario, whereas the requirements that follow are specific for the DPA. Requirement 1: The operating system is configured to support only the minimum set of services and resources. As we have discussed so far, some of the
148
F. Stumpf, P. R¨ oder, and C. Eckert
mechanisms for securing the client must be performed by configuring the operating system, e.g., configuring it to not swap the memory occupied by the document editor or disabling the possibility to take digital screen shots. Moreover, we must configure the operating system to allow only network connections to the document processor, which reduces the risk that data is sent to an unauthorized third party. Requirement 2: The number of software components of the operating system is minimal. To reduce the risk of vulnerabilities, it is highly desirable to employ a small operating system that has the minimal set of functions that are required to execute the document editor, because the chance of vulnerabilities increases with the complexity of a system. Consequently, we must use a operating system with the minimal number of user space software and kernel components, e.g., device drivers. Requirement 3: Attestation of the client machine. Moreover, we need to ensure that the operating system that we have supplied is neither exchanged with another operating system nor is its configuration manipulated to disable the security mechanisms of the protection architecture. Requirement 4: Authenticity of client and server. The user of the client platform could manipulate his machine to forward an attestation request to another machine with authentic software. Therefore we must ensure that the provided configuration values refer to the attested system and that the authenticity of both client and server is guaranteed. Requirement 5: Completeness of attestation. The definition of this state depends on what is measured on the client side. The usual approach [4] is to calculate the hash value of every executed binary and include it in the state definition. As a result, the verifier receives a list of all binaries that have been executed since the last reboot of the machine. However, this approach has two problems. First, the received list of executed binaries gives no hint about what was executed after the remote attestation. For example, a key logger or trojan horse can be started after the remote attestation. Although the malware was located on the client machine at the time of the remote attestation, it was not detected, because it was executed after the remote attestation. The second problem is, that using this approach, only the executable binaries of applications are inspected. Shell scripts and configuration files are not included in the measurement. This leads to the problem, that certain manipulations, e.g., manipulations of the configuration files, can not be detected by this approach. Consequently, we need an approach that measures the complete system configuration including scripts and configuration files.
5
Protection Architecture
As discussed before, we must ensure that the client machine can execute additional software, e.g., a web browser or an e-mail client, after the security
An Architecture Providing Virtualization-Based Protection Mechanisms
149
mechanisms are applied. This can be a problem when requirements 1 and 2 are fulfilled, since they reduce the compatibility of the client machine with additional software. For this reason, it is desirable to have a different operating system for different applications. As a consequence, we need a method to run different operating systems on one client machine, which we achieve by using virtualization techniques and running different virtual machines on a client machine. Each operating system can be executed in a separate virtual machine. Concerning the security of the system, we must ensure that these virtual machines can not influence each other. For example, we must ensure that malware running in one virtual machine can not extract confidential data from another virtual machine. We apply our approach for providing different virtual machines [5] and analyze whether the attacks described in Section 4.1 can be solved with this approach. The strong isolation achieved through virtualization guarantees that different virtual machines can not influence each other. Our previous approach uses virtualization in combination with the mechanisms defined by the Trusted Computing Group. It establishes several different execution environments by using various types of virtual machines, which are strongly isolated from each other. It also provides an abstraction of the underlying hardware TPM through a virtualized TPM (vTPM) interface. This allows the different virtual machines to use the measurement and reporting facilities of the TPM, thus they benefit from a hardware-based trust anchor. This approach has the advantage over the others [6] that the binding between TPM and vTPM is already specified, which is useful for remote attestation of virtual machines. We use this approach to execute the document editor in one such isolated virtual machine. Other applications are executed in a different virtual machine. Thus, they can not interfere with the document editor. Figure 2 depicts the approach applied to the DPA.
Document Editor
Open VM
Protection Layer 2
vTPM
Management VM
TPM
Trusted VM
Protection Layer 1
Hypervisor
Protection Layer 3
Hardware
Protection Layer 4
Fig. 2. Protection architecture organized in layers
The resulting protection architecture consists of components divided into four protection layers, in which components located on one layer provide security mechanisms to protect the components located on the layer directly above. In the case of a successful attack on one layer, the layers below can prevent the attacker from successfully transferring data to another physical machine.
150
F. Stumpf, P. R¨ oder, and C. Eckert
These components include three virtual machines, namely the open virtual machine, a management virtual machine and the trusted virtual machine (TVM). Additionally, a hypervisor partitions the underlying hardware and a TPM serves as a hardware-based trust anchor. The TPM provides hardware-based tamperevident cryptographic functions to protect the software components running on the layer directly above from unauthorized manipulations. Together with remaining hardware components, the TPM forms the lowest protection layer, more specifically, protection layer 4. 5.1
Protection Layer 4: TPM and Hardware
The TPM is the anchor of trust and the basis for the attestation. We store several non-migratable client-specific keys in the TPM, which are used for the challenge-response authentication with the server. This prevents the cloning of a client machine and additional attacks on the authenticity of the client machine. The attacker must perform a local physical attack, e.g., a mechanical or electrical probing attack, to the TPM to extract these keys. Since the TPM is specified to be tamper-evident, these attacks are not prevented, but can be detected afterwards. We assume that the hardware and the TPM behave as specified. 5.2
Protection Layer 3: Hypervisor and Management VM
The hypervisor is the first part of protection layer 3 and provides an abstraction layer to the underlying hardware. It has privileged access to the hardware and can grant and revoke resources, e.g., CPU cycles, to and from the running VMs. This hypervisor provides strong isolation of the virtual machines, which is the protection mechanism of this layer. It ensures that different virtual machines can not influence each other, e.g., by reading each others memory. In this approach, every virtual machine uses individual virtualized device drivers, which are executed within that VM. The hypervisor ensures, that these device drivers can only access the memory of the corresponding virtual machine. When running applications of different trust levels on a machine without virtualization, an attacker could use a malicious device driver to gain system wide access, e.g., to read the memory of the document editor and extract the confidential documents. We assume that the hypervisor and the management VM are set up by a trustworthy system administrator and that the user of the machine is not able to change this configuration. Because of its privileged position, the hypervisor needs to be trustworthy, since it can manipulate the CPU instructions of every virtual machine. We assume that the hypervisor is trustworthy and therefore guarantees strong isolation. Currently available virtualization solutions provide strong isolation. However, this still can be circumvented with direct memory access (DMA) operations [7]. These operations access the memory without intervention by the CPU and therefore bypass the hypervisor’s protection mechanisms. Hypervisors with secure sharing [8] prevent these attacks, but suffer from a high performance overhead, as well as a large trusted computing base, since the required I/O emulation is moved into the hypervisor layer.
An Architecture Providing Virtualization-Based Protection Mechanisms
151
The management virtual machine is the second part of protection layer 3. It is responsible for starting, stopping and configuring the virtual machines. It is part of this protection layer, since it is closely connected to the hypervisor and is a privileged VM, which has direct access to the hardware TPM. 5.3
Protection Layer 2: Open VM and Trusted VM
This protection layer consists of the open VM and the trusted VM. The open VM is allowed to run arbitrary software components. It runs applications with a lower trust level, such as web browsers or office applications. The open VM provides the semantics of today’s open machines and therefore has no additional protection mechanisms for upper layers. Since this virtual machine is not of interest for our approach, we will not focus on it in the rest of our work. The TVM runs the document editor and a tiny OS with a minimal number of software components, to reduce the possible number of security vulnerabilities. This fulfills requirements 1 and 2 without losing the ability to run other software on the client machine. The tiny OS and the document editor are part of a virtual appliance (VA), which is a fully pre-installed and pre-configured set of software for virtual machines. To ensure that the VA is not manipulated, the management VM measures its integrity before startup. The TVM runs in protection layer 2 and provides a virtual TPM (vTPM) as an additional protection mechanism. The operating system of the TVM uses this vTPM to protect the document editor running in protection layer 1. We use this vTPM to perform a complete attestation of the entire hard disk of the TVM, to ensure that neither the operating system, its configuration, nor the document editor running on top of the operating system is manipulated. As a result, the verification of the state of the entire virtual machine requires only one reference value. This eliminates the need to maintain a large amount of reference values, which is the main disadvantage of the binary attestation. Moreover, the server checks in the attestation whether the protection layers below the operating system, e.g., the hypervisor, are trustworthy. As a result, this mechanism fulfills Requirement 5. In addition to that, we use the vTPM to establish an authenticated channel between client and server. We discuss the corresponding protocol in Section 5.5. The TVM only accepts network requests from the server to reduce the chance of network attacks. After the booting process, which can not be interrupted, the operating system of the TVM directly executes the document editor. As a consequence, the only option for a user to interact with the TVM is to use the document editor provided by us. This improves the security of the TVM, since it limits the number of possible attacks. All I/O interfaces which can be used to extract data from the system are either blocked or controlled. For example, the management VM ensures that the hard disk is read only, which prevents an attacker from temporarily storing confidential data on this disk to extract it afterwards by booting a different operating system. Additionally, the management VM has a configuration file for the TVM, which defines that network connections are only allowed to the
152
F. Stumpf, P. R¨ oder, and C. Eckert
server of the DPA, which inhibits an attacker from sending confidential data to a different host. As a consequence, even if an attacker exploits a vulnerability of the document editor, he can not transfer the confidential data out of the TVM. 5.4
Protection Layer 1: Document Editor
This protection layer consists of the document editor. The document editor is written in Java, which is expected to minimize the risk of buffer overflows. In addition to this, the document editor can edit confidential documents in memory and does not need to write them to disk. The server’s authenticity is checked by verifying the server’s certificate before answering an attestation request. 5.5
Attestation Protocol
To prevent masquerading attacks on the authenticity of the platform configuration, we use an enhanced remote attestation protocol [9]. These attacks forward the integrity measurements of a conform host to masquerade a conform client state. The enhanced protocol adds a key establishment phase, to ensure that the channel of attestation is authentic. It also guarantees an end-to-end communication and prevents the attestation channel from becoming compromised by another application which could take over the attestation channel after the attestation has succeeded. The simplified process of our remote attestation protocol is illustrated in Figure 3. It consists of an initialization phase and an attestation phase. The initialization phase yields a vAIK credential which is then used in attestation phase to sign the PCRs. This vAIK credential is signed by an AIK from the hardware TPM. In the first step of the initialization phase, the vTPM is initialized,
Fig. 3. Simplified attestation process
An Architecture Providing Virtualization-Based Protection Mechanisms
153
which in turn requests a new vAIK credential from the hardware TPM (steps 2 and 3). The hardware TPM issues a vAIK credential and sends it to the vTPM (steps 5 and 6). The attestation phase is triggered by the document processor, which sends an attestation request consisting of a nonce and its public DiffieHellman key pair g a mod m for the key-establishment to the document editor (step 7). The document editor generates the corresponding Diffie-Hellman key pair g b mod m and sends this, together with the nonce, to the vTPM (step 8). The vTPM generates a digital signature using the vAIK (step 9) and transfers it together with the vAIK credential, to the document editor (step 10). The document editor forwards this data to the DP (step 11). Next, the DP verifies the authenticity of the document editor and its platform by inspecting the platform configuration registers (step 12). Finally, both the DP and the document editor calculate the shared session key for the following communication (step 13, not illustrated). The protocol for performing remote attestations guarantees that the endpoint of the communication is within the attested virtual machine. Therefore, relay or masquerading attacks are not possible. The combination of attestation and key establishment fulfills Requirement 4, since it prevents the masquerading of a trustworthy system configuration.
6
Evaluation of the Protection Architecture
In the following section, we evaluate whether the security mechanisms described in Section 5 prevent the attacks mentioned in Section 4.1. Software Manipulations. Each of the protection layers can be either manipulated during runtime or before it is executed. Runtime attacks are especially critical, since they are not detected by the current method of integrity measurement, which only measures components when they are executed. The underlying protection layer must be manipulated to modify the current layer, because the integrity of each layer is measured before execution by the layer below. This results in a chain of trust, with the TPM as a hardware anchor. As a consequence, the manipulation of any layer either requires a runtime attack or a hardware attack on the TPM. In the following, we discuss possible runtime attacks on each layer. The document editor is robust against buffer overflow attacks, since it is written in Java, which is commonly believed to decrease the possibility of buffer overflows. On the downside, other attacks, such as exploiting other programming errors, are still possible. The risk of runtime attacks on the operating system is reduced, because software with lower complexity is expected to have less errors than software with higher complexity. Moreover, we choose a strict system configuration to minimize possible attack methods, e.g., network connections are restricted to the server, the hard disk of the TVM is read-only and swapping is disabled. As a consequence, in the case of a successful attack on the operating system, the attacker has no options to transfer confidential data to another machine. This is an example of a lower layer preventing a successful
154
F. Stumpf, P. R¨ oder, and C. Eckert
attack, when the protection mechanisms of the layer above failed. Runtime attacks on the hypervisor are difficult, since it has a lower complexity compared to operating systems and it does not expose interfaces to the user which could be used for an attack. Moving protection mechanisms of upper layers into this layer, simplifies the verification of the correctness of these mechanisms. Masquerading attacks. Both client and server can be masqueraded, where the cloning of a machine is a special type of this attack. Since we assume that the server is trustworthy, we only focus on cloning attacks on the client. An attacker can create an exact copy of a client’s hard drive, but he can not copy the content of the corresponding TPM. The attacker can not use this cloned client to access the server, since our attestation protocol uses secrets stored in the TPM and therefore detects that the client is not authentic. One such secret is the AIK of the hardware TPM, which is used to sign the vAIK of the vTPM. This vAIK in turn, is used in the attestation protocol. The server can check the authenticity of the vAIK with the corresponding AIK credential that was installed on the server when the system was set up. Using an honest system to masquerade a trustworthy system state is prevented by our attestation protocol. This protocol also detects and prevents masquerading of a server, since the server’s certificate is checked by the document editor. Hardware attacks and analogue attacks. DMA attacks can either be handled entirely in software by emulating all I/O devices, which causes a high performance degradation. Alternatively, DMA attacks can be prevented by using hardware support, e.g., Intel’s Trusted Execution Technology. As a consequence, it depends on the implementation of our protection architecture whether or not DMA attacks are possible. Probing attacks are difficult to inhibit. At least the TPM is specified to be tamper-evident, which allows an attack to be detected afterwards. Analogue attacks are difficult to prevent with software mechanisms. Fortunately, the bandwidth of this channel is much lower compared to digitally copying confidential data. In addition to that, this attack method has a higher risk of being detected, if the machine is located at a monitored location, e.g., an office with many co-workers or an office that is under surveillance by security cameras. We do not provide a mechanism to prevent this type of attack.
7
Related Work
Terra [10] is an approach to create a virtual machine-based platform, which allows the attestation of virtual machines. It uses VMWare’s ESX Server to establish two types of virtual machines (Closed Box and Open Box ) and to report the state of a closed box machine to a remote entity. We could also use Terra’s Closed Box for our scenario and perform the analysis of the protection mechanisms based on Terra. However, Terra suffers from using a large trusted computing base, which is a potential security threat, since the trusted computing base could be exploited to undermine the protection mechanisms.
An Architecture Providing Virtualization-Based Protection Mechanisms
155
A similar approach that enables attestation is used in the Integrity Measurement Architecture (IMA) [4]. The authors present a comprehensive prototype based on trusted computing technologies, where integrity measurement is implemented by examining the hashes of all executed binaries. However, the prototype is not based on virtualization technologies, and therefore no strong isolation between processes is achieved. This requires to transfer a complete SML to the remote entity, which in turn, needs to validate all started processes to determine the platform’s trustworthiness. Similar to our proposed protection architecture a number of additional projects, such as EMSCB [6] and Open-TC exist that also use virtualization to establish a trustworthy computing platform. These projects are either based on the L4 microkernel [11] or on the Xen hypervisor. In this context, Sadeghi et al. have presented a prototype for DRM [12] that uses the L4 and provides secure storage and attestation of compartments. Our concepts and how the different protection layers interact with each other to prevent insider attacks could also be analyzed based on these alternative approaches. However, these approaches do not use a virtual TPM, therefore their protection mechanisms differ from the ones presented in this paper. Our concept of the DPA is very similar to server-based computing, where all documents remain on the server and are accessed by executing the corresponding applications on the server. Only the display content of these applications is transferred to the client. An example of such an architecture is the Display-Only File Server (DOFS) architecture [13]. However, in these approaches, a modified client is neither detected nor handled. Kuhlmann et al. [14] discuss similar scenarios as in our paper. The key difference to our work is that they do not focus on insider attacks and do not analyze how different protection mechanisms prevent insider attacks. This work is also based on [15], in which we have analyzed existing architectures that aim at preventing insider attacks and explained the need for protection mechanisms on the level of the operating system.
8
Conclusions and Future Work
In this paper, we presented an architecture for document processing, as an example, and analyzed insiders attacks on this architecture. Then, we performed a threat analysis and extracted requirements for a protection architecture that prevents these attacks. We introduced a four layer protection architecture based on virtualization techniques that fulfills these requirements. Among our mechanisms, we developed an authentication and communication protocol that is robust against masquerading attacks and ensures a confidential communication channel between client and server. We are finished with the implementation of this protection architecture. Currently, we are migrating the client of a document processing architecture developed by us into a trusted virtual machine. After that, we will perform penetration tests on the client and evaluate the effectiveness of the proposed protection mechanisms.
156
F. Stumpf, P. R¨ oder, and C. Eckert
References 1. Gordon, L.A., Loeb, M.P., Lucyshyn, W., Richardson, R.: 2006 CSI/FBI Computer Crime and Security Survey. Technical report, CSI (2006) 2. Group, T.C.: Trusted Platform Module (TPM) specifications. Technical report (2006), https://www.trustedcomputinggroup.org/specs/TPM 3. Goldberg, R.P.: Survey of Virtual Machine Research. Computer 7, 34–45 (1974) 4. Sailer, R., Zhang, X., Jaeger, T., van Doorn, L.: Design and Implementation of a TCG-based Integrity Measurement Architecture. In: 13th USENIX Security Symposium, IBM T. J. Watson Research Center (2004) 5. Stumpf, F., Benz, M., Hermanowski, M., Eckert, C.: An Approach to a Trustworthy System Architecture using Virtualization. In: ATC-2007. Proceedings of the 4th International Conference on Autonomic and Trusted Computing, Hong Kong, China. LNCS, Springer, Heidelberg (2007) 6. Base, E.M.S.C.: Towards trustworthy systems with open standards and trusted computing (2006), http://www.emscb.de/ 7. Fraser, K., Hand, S., Neugebauer, R., Pratt, I., Warfield, A., Williamson, M.: Safe Hardware Access with the Xen Virtual Machine Monitor. In: 1st Workshop on Operating System and Architectural Support for the on demand IT InfraStructure (OASIS) (2004) 8. Karger, P.A., Zurko, M.E., Bonin, D.W., Mason, A.H., Kahn, C.E.: A Retrospective on the VAX VMM Security Kernel. IEEE Trans. Softw. Eng. 17, 1147–1165 (1991) 9. Stumpf, F., Tafreschi, O., R¨ oder, P., Eckert, C.: A Robust Integrity Reporting Protocol for Remote Attestation. In: Second Workshop on Advances in Trusted Computing (WATC 2006 Fall) (2006) 10. Garfinkel, T., Pfaff, B., Chow, J., Rosenblum, M., Boneh, D.: Terra: a virtual machine-based platform for trusted computing. In: SOSP 2003. Proceedings of the nineteenth ACM symposium on Operating systems principles, pp. 193–206. ACM Press, New York (2003) 11. Liedtke, J.: On Micro-Kernel Construction. In: SOSP 1995. Proceedings of the fifteenth ACM symposium on Operating systems principles, pp. 237–250. ACM Press, New York (1995) 12. Sadeghi, A.R., Scheibel, M., St¨ uble, C., Wolf, M.: Play it once again, sam - enforcing stateful licenses on open platforms. In: 2nd Workshop on Advances in Trusted Computing (WATC 2006 Fall) (2006) 13. Yu, Y., Chiueh, T.: Display-Only File Server: A Solution against Information Theft Due to Insider Attack. In: DRM 2004. Proceedings of the 4th ACM workshop on Digital rights management, pp. 31–39. ACM Press, New York (2004) 14. Kuhlmann, D., Landfermann, R., Ramasamy, H., Schunter, M., Ramunno, G., Vernizzi, D.: An open trusted computing architecture - secure virtual machines enabling user-defined policy enforcement. Technical report, Open Trusted Computing consortium (OpenTC) (2007) 15. R¨ oder, P., Stumpf, F., Grewe, R., Eckert, C.: Hades - Hardware Assisted Document Security. In: Second Workshop on Advances in Trusted Computing (WATC 2006 Fall), Tokyo, Japan (2006)
Detecting Motifs in System Call Sequences William O. Wilson, Jan Feyereisl, and Uwe Aickelin School of Computer Science, The University of Nottingham, UK {wow,jqf,uxa}@cs.nott.ac.uk
Abstract. The search for patterns or motifs in data represents an area of key interest to many researchers. In this paper we present the Motif Tracking Algorithm, a novel immune inspired pattern identification tool that is able to identify unknown motifs which repeat within time series data. The power of the algorithm is derived from its use of a small number of parameters with minimal assumptions. The algorithm searches from a completely neutral perspective that is independent of the data being analysed and the underlying motifs. In this paper the motif tracking algorithm is applied to the search for patterns within sequences of low level system calls between the Linux kernel and the operating system’s user space. The MTA is able to compress data found in large system call data sets to a limited number of motifs which summarise that data. The motifs provide a resource from which a profile of executed processes can be built. The potential for these profiles and new implications for security research are highlighted. A higher level system call language for measuring similarity between patterns of such calls is also suggested.
1
Introduction
The investigation and analysis of time series data is a popular and well studied area of research. Common goals of time series analysis include the desire to identify known patterns in a time series, to predict future trends given historical information and the ability to classify data into similar clusters. Historically, statistical techniques have been applied to this problem domain whilst Immune System (IS) inspired techniques have remained fairly limited [1]. In this paper we describe the Motif Tracking Algorithm (MTA), a deterministic but nonexhaustive approach to identifying repeating patterns in time series data. The MTA abstracts principles from the human immune system, in particular the immune memory theory of Eric Bell [2]. Implementing principles from immune memory to be used as part of a solution mechanism is of great interest to the immune system community and here we are able to take advantage of such a system. The MTA implements the Bell immune memory theory by proliferating and mutating a population of solution candidates using a derivative of the clonal selection algorithm [3]. A subsequence of a time series that is seen to repeat within that time series is defined as a motif. The objective of the MTA is to find those motifs. The power of the MTA comes from the fact that it has no prior knowledge of the time series S. Kim, M. Yung, and H.-W. Lee (Eds.): WISA 2007, LNCS 4867, pp. 157–172, 2007. c Springer-Verlag Berlin Heidelberg 2007 0002
158
W.O. Wilson, J. Feyereisl, and U. Aickelin
to be examined or what motifs exist. It searches in a fast and efficient manner and the flexibility incorporated in its generic approach allows the MTA to be applied across a diverse range of problems. The MTA has already been applied to motif detection in industrial data sets [2]. Here we test its generic properties by applying it to motif identification in system call data sets. Considerable research has already been performed on identifying known patterns in time series [4]. In contrast little research has been performed on looking for unknown motifs in time series. A distinguishing feature of the MTA is its ability to identify variable length unknown patterns that repeat in a time series. This focus on the detection of unknown patterns makes it an ideal tool for investigating underlying patterns in low level system data generated from the execution of processes on a computer system. The nature of program execution, with the re-use of functions and methods, along with standardised programming structures, implies a set of motifs do exist within each running process. Each process on a system is executed by issuing sequences of system calls which are translated by the kernel into data understandable by the underlying hardware. All processes rely on such system calls making them of high interest to the security community. By looking at the sequences of such system calls, we can observe repeating motifs, which will be identifiable and explainable in terms of higher level functions. The MTA provides an ideal mechanism to compress the data found in these large system call data sets to a limited number of motifs which effectively summarise that data. The motifs could then provide a resource from which we can build a profile of executed processes. These profiles could be used to identify sequences that indicate potentially anomalous process behaviour. Related work on motif detection is discussed in Section 2 with details of system calls found in Section 3. Terms and definitions used are covered in Section 4 followed by the MTA pseudo code and the problem to be addressed in Sections 5 and 6. Results and future work are found in Sections 7 and 8 before concluding in Section 9.
2
Related Work
The search for patterns in data is relevant to a diverse range of fields, including biology, business, finance and computer security. For example, Guan [5] addresses DNA pattern matching using lookup table techniques that exhaustively search the data set to find recurring patterns. Investigations using a piecewise linear segmentation scheme [6] and discrete Fourier transforms [7] provide examples of mechanisms to search a time series for a particular motif of interest. An underlying assumption in all these common approaches is that the pattern to be found is known in advance. The matching task is much simpler as the algorithm just has to find re-occurrences of the required pattern. The search for unknown motifs is at the heart of the work conducted by Keogh et al. Keoghs probabilistic [8] and Viztree algorithms [9] are very successful in identifying unknown motifs but they require additional parameters compared to the MTA and they also assume prior knowledge of the length of the motif
Detecting Motifs in System Call Sequences
159
to be found. Motifs longer and potentially shorter than this predefined length may remain undetected in full. Work by Tanaka [10] attempts to address this issue by using minimum description length to discover the optimal length for the motif. Fu et al. [11] use self-organising maps to identify unknown patterns in stock market data, by representing patterns as perceptually important points. This provides an effective solution but again the patterns found are limited to a predetermined length. This prior awareness of the patterns to find, or their lengths, is not appropriate for intrusion detection systems as by their very nature intrusion techniques are constantly changing to avoid detection. This represents an ideal application for the MTA as it makes no such pre-assumptions and aims to find all unknown motifs of variable length from the data set. Security researchers have been investigating system calls for a number of years. System calls represent a low level communication mechanism between processes and the systems’ underlying hardware, providing a high enough level of abstraction for intelligent process behaviour analysis and modelling. The use of system calls for anomaly detection was first introduced by Forrest et al. [12]. Their IS inspired work looks at sequences of system calls using a sliding window mechanism. System calls are used to generate a database of normal behaviour, i.e. self, which is consequently used to capture anomalous behaviour, non-self. Forrest’s work instigated a new stream of intrusion detection research, with some researchers taking the idea of generating a database of normal behaviour and extending it further [13]. Novel approaches to solving related issues by varying the methods and types of signals used to generate a set of normal behaviour were also proposed [14]. Tandon et al. have looked at system call motifs as a source for their normal behaviour profile generation, however this differs from our approach because it uses an exhaustive search mechanism [15].
3
Intrusion Detection and System Calls
System calls are lower level functions/methods, which act as a communication channel between higher level processes (e.g. executable commands) and the lower level kernel of an operating system (OS). The system calls perform system actions on behalf of a user. Each system call performs a slightly different atomic task, yet in combination they achieve much more complex functionality. Examples of some of the simplest system calls are the file I/O calls, such as open(), read(), write() and close(). An application or a process can produce on average between a dozen and thousands of system calls per execution, depending on the complexity of the task. As such system calls are ideal data signals from a security point of view, they provide a detailed view of a system or process operation while avoiding complex issues such as encryption or other possible higher level evasion mechanisms. The focus of the MTA is to look for variable length unknown motifs in the data. This fits nicely with system calls as we are interested in seeing if motifs exist in system call execution sequences. Our inspiration originates from the
160
W.O. Wilson, J. Feyereisl, and U. Aickelin
way that programs are written, compiled and executed. An application usually consists of various classes, objects, methods or functions, along with variables and constants. All these structures are high level constructs, that are processed by lower level libraries which execute appropriate system calls accordingly. These atomic functions, which are deterministic as they do not depend on any variable input, are likely to form the building blocks of our motifs of interest, no matter what application calls them. The combination of such motifs could then provide a resource to generate a profile for a process that distinguishes a permitted execution from a malicious one. The MTA provides a mechanism for compressing and summarising all this system call data into a number of repeating motifs that are prevalent in the data. The MTA would highlight consistent patterns in the data that are understandable and of value to the user, to aid in the generation of these process profiles. The ability of the MTA to find variable length motifs, with no assumptions about the data or the motifs to find, ensures it is flexible enough to carry out such a task.
4
Motif Detection: Terms and Definitions
Whilst immunology provides the inspiration for the theory behind the MTA (see [2] for more information), the work of Keogh et al. [8] is the inspiration for the time series representation used by the MTA. Keogh’s Symbolic Aggregate approXimation (SAX) technique for representing a time series was utilised. Many of the following definitions used by the MTA are adapted from the work of Keogh [8], as summarised below. Definition 1. Time series. A time series T = t1 ,..,tm is a time ordered set of m real or integer valued variables. In order to identify patterns in T in a fast and efficient manner we break T up into subsequences. Definition 2. Subsequence. “Given a time series T of length m, a subsequence C of T consists of a sampling of length n ≤ m of contiguous positions from T.” [8]. Subsequences are extracted using a sliding window technique. Definition 3. Sliding window. Given a time series T of length m, and a subsequence C of length n, a symbol matrix S of all possible subsequences can be built by sliding a window of size n across T, one point at a time, placing each subsequence into S. After all sliding windows are assessed S will contain (m n + 1 ) subsequences. Each subsequence generated could represent a potential match to any of the other subsequences within S. If two subsequences match, we have found a pattern in the time series that is repeated. This pattern is defined as a motif. Definition 4. Motif. A subsequence from T that is seen to repeat at least once throughout T is defined as a motif. The re-occurrence of the subsequence need
Detecting Motifs in System Call Sequences
161
not be exact for it to be considered as a motif. The relationship between two subsequences C1 and C2 is assessed using a match threshold r. We use the most common distance measure (Euclidean distance) to examine the match between two subsequences C1 and C2 , ED(C1 , C2 ). If ED(C1 , C2 ) ≤ r the subsequences C1 and C2 are deemed to match and thus are saved as a motif. The motifs prevalent in a time series are detected by the MTA through the evolution of a population of trackers. Definition 5. Tracker. A tracker represents a signature for a motif sequence that is seen to repeat. It has within it a sequence of 1 to w symbols that are used to represent a dimensionally reduced equivalent of a subsequence. The subsequences generated from the time series are converted into a discrete symbol string using an intuitive technique described in Section 5. The trackers are then used as a tool to identify which of these symbol strings represent a recurring motif. The trackers also include a match count variable to indicate the level of stimulation received during the matching process.
5
The Motif Tracking Algorithm
The MTA pseudo code is detailed in Program 1. and a brief summary of this algorithm as applied to system call analysis is described in the subsequent sections. The MTA parameters include the length of a symbol s, the match threshold r, and the alphabet size a. Convert Time Series T to Symbolic Representation. The MTA takes as input a univariate time series data set consisting of system call data which has been converted to a list of integers as described in Section 6. To minimise amplitude scaling issues with subsequence comparisons across T we normalise the time series. We then use the SAX representation [8] to discretise the time series under consideration. SAX is a powerful compression tool that uses a discrete, finite symbol set to generate a dimensionally reduced version of a time Program 1. MTA Pseudo Code Initiate MTA (s, r, a) Convert Time series T to symbolic representation Generate Symbol Matrix S Initialise Tracker population to size a While ( Tracker population > 0 ) { Generate motif candidate matrix M from S Match trackers to motif candidates Eliminate unmatched trackers Examine T to confirm genuine motif status Eliminate unsuccessful trackers Store motifs found Proliferate matched trackers Mutate matched trackers } Memory motif streamlining
162
W.O. Wilson, J. Feyereisl, and U. Aickelin
series consisting of symbol strings. This intuitive representation has been shown to rival more sophisticated reduction methods such as Fourier transforms and wavelets [8]. Using SAX we slide a window of size s across the time series T one point at a time. Each sliding window represents a subsequence of system calls from T. The MTA calculates the average of the values from the sliding window and uses that average to represent the subsequence. The MTA now converts this average value into a symbol string. The user predefines the size a of the alphabet used to represent the time series T. Given T has been normalised we can identify the breakpoints for the alphabet characters that generate a equal sized areas under the Gaussian curve [8]. The average value calculated for the sliding window is then examined against the breakpoints and converted into the appropriate symbol. This process is repeated for all sliding windows across T to generate m-s+1 subsequences, each consisting of symbol strings comprising one character. Generate Symbol Matrix S. The string of symbols representing a subsequence is defined as a word. Each word generated from the sliding window is entered into the symbol matrix S. The MTA examines the time series T using these words and not the original data points to speed up the search process. Symbol string comparisons can be performed efficiently to filter out bad motif candidates, ensuring the computationally expensive Euclidean distance calculation is only performed on those motif candidates that are potentially genuine. Having generated the symbol matrix S, the novelty of the MTA comes from the way in which each generation a selection of words from S, corresponding to the length of the motif under consideration, are extracted in an intuitive manner as a reduced set and presented to the tracker population for matching. Initialise Tracker Population to Size a. The trackers are the primary tool used to identify motif candidates in the time series. A tracker comprises a sequence of 1 to w symbols. The symbol string contained within the tracker represents a sequence of symbols that are seen to repeat throughout T. Tracker initialisation and evolution is tightly regulated to avoid proliferation of ineffective motif candidates. The initial tracker population is constructed of size a to contain one of each of the viable alphabet symbols predefined by the user. Each tracker is unique, to avoid unnecessary duplication of solution candidates and wasted search time. Trackers are created of a length of one symbol. The trackers are matched to motif candidates via the words presented from the stage matrix S. Trackers that match a word are stimulated; trackers that attain a stimulation level ≥ 2 indicate repeated words from T and become candidates for proliferation. Given a motif and a tracker that matches part of that motif, proliferation enables the tracker to extend its length by one symbol each generation until its length matches that of the motif.
Detecting Motifs in System Call Sequences
163
Generate Motif Candidate Matrix M from S. The symbol matrix S contains a time ordered list of all the words, each containing just one symbol, that are present in the time series. Neighbouring words in S contain significant overlap as they were extracted via the sliding windows. Presenting all words in S to the tracker population would result in potentially inappropriate motifs being identified between neighbouring words. To prevent this issue such ‘trivial’ match candidates are removed from the symbol matrix S. Trivial match elimination (TME) is achieved as a word is only transferred from S for presentation to the tracker population if it differs from the previous word extracted. This allows the MTA to focus on significant variations in the time series and prevents excessive time being wasted on the search across uninteresting variations. Excessively aggressive trivial match elimination is prevented by limiting the maximum number of consecutive trivial match eliminations to s, the number of data points encompassed by a symbol. In this way a subsequence can eliminate as trivial all subsequences generated from sliding windows that start in locations contained within that subsequence (if they generate the same symbol string) but no others. The reduced set of words selected from S is transferred to the motif candidate matrix M and presented to the tracker population for matching. Match Trackers to Motif Candidates. During an iteration each tracker is taken in turn and compared to the set of words in M. Matching is performed using a simple string comparison between the tracker and the word. We define a match to occur if the comparison function returns a value of 0, indicating a perfect match between the symbol strings. Each matching tracker is stimulated by incrementing its match counter by 1. Eliminate Unmatched Trackers. Trackers that have a match count >1 indicate symbols that are seen to repeat throughout T and are viable motif candidates. Eliminating all trackers with a match count < 2 ensures the MTA only searches for motifs from amongst these viable candidates. Knowledge of possible motif candidates from T is therefore carried forward by the tracker population. After elimination the match count of the surviving trackers is reset to 0. Examine T to Confirm Genuine Motif Status. The surviving tracker population indicates which words in M represent viable motif candidates. However motif candidates with identical words may not represent a true match when looking at the time series data underlying the subsequences comprising those words. In order to confirm whether two matching words X and Y, containing the same symbol strings, correspond to a genuine motif we need to apply a distance measure to the original time series data associated with those candidates. The MTA uses the Euclidean distance to measure the relationship between two motif candidates ED(X,Y). If ED(X,Y) ≤ r a motif has been found. The match count of that tracker is stimulated to indicate a match. A memory motif is created to store the symbol string associated with X and Y. The start locations of X and Y are also saved.
164
W.O. Wilson, J. Feyereisl, and U. Aickelin
For further information on the derivation of this matching threshold please refer to [2]. The MTA then continues its search for motifs, focusing only on those words in M that match the surviving tracker population in an attempt to find all occurrences of the potential motifs. The trackers therefore act as a pruning mechanism, reducing the potential search space to ensure the MTA only focuses on viable candidates. Eliminate Unsuccessful Trackers. The MTA now removes any unstimulated trackers from the tracker population. These trackers represent symbol strings that were seen to repeat but upon further investigation with the underlying data were not proven to be valid motifs in T. Store Motifs Found. The motifs identified during the confirmation stage are stored in the memory pool for review. Comparisons are made to remove any duplication. The final memory pool represents the compressed representation of the time series, containing all the re-occurring patterns present. Proliferate Matched Trackers. Proliferation and mutation are needed to extend the length of the tracker so it can capture more of the complete motif. At the end of the first generation the surviving trackers, each consisting of a word containing a single symbol, represent all the symbols that are applicable to the motifs in T. The complete motifs in T can only consist of combination of these symbols. This subset of trackers is therefore stored as the mutation template for use by the MTA. Proliferation and mutation to lengthen the trackers will only involve symbols from the mutation template and not the full symbol alphabet, as any other mutations would lead to unsuccessful motif candidates. During proliferation the MTA takes each surviving tracker in turn and generates a number of clones equal to the size of the mutation template. The clones adopt the same symbol string as their parent. Mutate Matched Trackers. The clones generated from each parent are taken in turn and extended by adding a symbol taken consecutively from the mutation template. This creates a tracker population with maximal coverage of all potential motif solutions and no duplication. The tracker pool is fed back into the MTA ready for the next generation. A new motif candidate matrix M consisting of words with two symbols is now formulated to present to the evolved tracker population. In this way the MTA builds up the representation of a motif one symbol at a time each generation to eventually map to the full motif. Given the symbol length s we generate a word consisting of two consecutive symbols by taking the symbol from matrix S at position i and that from position i+s. Repeating this across S, and applying trivial match elimination as per Section 5, the MTA obtains a new motif candidate matrix M in generation two, each entry of which contains a word of two symbols, covering a length of 2 x s.
Detecting Motifs in System Call Sequences
165
The MTA continues to prepare and present new motif candidate matrix data to the evolving tracker population each generation. The motif candidates are built up one symbol at a time and matched to the lengthening trackers. This flexible approach enables the MTA to identify unknown motifs of a variable length. This process continues until all trackers are eliminated as non matching and the tracker population is empty. Any further extension to the tracker population will not improve their fit to any of the underlying motifs in T. Memory Motif Streamlining. The MTA streamlines the memory pool, removing duplicates and those encapsulated within other motifs to produce a list of motifs that it associates with T.
6
Detection of System Call Patterns
This paper demonstrates that an execution of a process shall produce a sequence of system calls containing a number of motifs of variable lengths and these shall be identifiable by the MTA. Such motifs should re-occur when the same or similar processes are run. The motifs in system call sequences can be used in various security applications, for example as a data reduction tool for behaviour profiling within an Intrusion Detection System (IDS). In our experiments we have two machines connected by a local network. The client machine (Windows XP machine running an SSH client PuTTY version 0.57) connects to the server machine (Debian Linux) which then performs actions based on commands sent by the client. Our experiments use a VMware virtual machine, running a Debian Linux distribution, v.3.3.5-13, with a Linux kernel, v.2.4.27-2, as our SSH server. The SSH daemon process (OpenSSH 3.8.1p1 Debian-8.sarge.6) is monitored along with all its children, using the standard strace utility. All system calls generated by the SSH daemon and its child processes are logged and stored in separate files based on their process ID (PID). The following sequence of actions is executed to generate our data set. The client connects to the server and an SSH session is established. The following commands, chosen at random, are then issued by the client: ls, ls, ls -lsa, pwd, ls, ps, ps aux, ls -lsa, chmod a+x file, chmod a+x directory, ls -lsa, chmod a-x file. The client then disconnects from the SSH server. The PID files generated are concatenated to produce one file. During concatenation the PID file from the child with the smallest PID is added to the parent data, this is then repeated for each remaining child process. Concatenation results in a single data set containing system call names with their respective arguments. The data is further pre-processed by converting the individual system call names to their appropriate Linux OS id numbers and removing their arguments. This generates a one dimensional data set comprising a sequence of 8,040 system call numbers. From the data set generated it can be observed that a small sequence of system calls relating to one particular monitored child process is repeated across a large proportion of the data. This process looks after the SSH terminal operation for
166
W.O. Wilson, J. Feyereisl, and U. Aickelin
the duration of the whole SSH session. Repeated read() and write() calls and various real time system call actions are performed over and over again. Due to the basic nature of this repeating sequence, it is not deemed of interest to our analysis. Instead we focus on the last 1,000 system calls from the data set, to investigate motifs that occur during the last seven commands issued by the client. The data set generated is available at http://cs.nott.ac.uk/∼jqf/MTA scdata.dat.
7
Results
Having introduced the MTA we now provide some experimental results which examine the ability of the MTA to identify motifs present in system call data. As defined in Section 6 the data set examined consists of 1,000 system calls represented by an ordered list of 1,000 integers. A bind threshold r = 0 was set since system call sequences need to match identically. Symbol length s and alphabet size a values were varied to investigate the sensitivity of the MTA to these parameters. The MTA was written in C++ and run on a Windows XP machine with a Pentium M 1.7 Ghz processor with 1.0 Gb of RAM. 7.1
System Call Motifs Identified by the MTA
In this scenario a = 10 to give a large alphabet diversity and s took the values 10, 20 and 40. To evaluate the impact on speed and accuracy the MTA was run with trivial match elimination (TME) and with no trivial match elimination (NTME). We focus on significant motifs whose length exceeds 40 system calls to enable fair comparison across different values of s. With s = 10 and NTME eight motifs are identified. Table 1 lists these motifs with the number of system calls they encompass and the start locations where they occur in the data set. Motif 1 dominates the data set, it consists of 280 system calls and occurs twice in the data set from locations 386 and 717. Figure 1 presents the list of system calls from location 350 to 1,000 and motif 1 is clearly evident in this sequence. From the commands issued during the SSH session (Section 6), we observe the existence of motifs within the command list itself as there are repetitions of the ls and chmod commands. From the MTA’s analysis of the system Table 1. List of motifs found by the MTA Motif No. Length Start Locations 1 280 386, 717 2 80 0, 227 3 70 8, 160, 235 4 50 198, 262 5 50 668, 950 6 40 39, 191, 266, 324 7 40 619, 668, 950 8 40 77, 120
Detecting Motifs in System Call Sequences
167
Fig. 1. Illustration of system calls 350 to 1,000, highlighting the occurrence of Motif 1 from system call 386 to 665 and 717 to 996
call data set, motif 1 relates to the repetition of these two observed commands. The ls commands contain the same arguments (-lsa) across both repetitions, whilst the chmod command includes execute permissions for all users to a file in the first occurrence and removes those same privileges from the file during the second occurrence. Motif 1 represents two processes occurring in succession at two different time points within the overall SSH session. Motif 1 also contains other sub-motifs such as motif 7. Motif 7 occurs at three different positions as seen from Table 1. This motif represents the chmod command that was executed three times during the session, each time with different arguments. The motifs found by the MTA are a super-set of those evident in the original command list, validating the accuracy of the MTA. The existence of motif 7 shows that applications with varying arguments (i.e. performing different actions), have atomic motifs that could be used for data reduction in a security application. Motif 2 is the second largest motif, which has two repetitions in the data set. It relates to the execution of the ps command. This command occurred twice in succession, with different arguments, spawning a new process each time. Motif 2 represents 80 system calls that are identical across these two spawned processes. Motifs 3, 4 and 8 partly overlap with motif 2 indicating a subset of system calls from motif 2 that is consistent across all these motifs. However this subset occurs with a higher frequency than motif 2, representing similarities between atomic parts of the ps processes not fully captured by the motif 2. This highlights parts of the process execution that are more dynamic and input dependent and which need to be dealt with when considering an IDS. Motif 6 again relates to a component of the ps command. At a lower level, the ps command reads a number of small files from the /proc/ directory of the Linux OS and prints the read information onto the screen. This information shows the running processes on the system to the user. From this a recurring sub-pattern of open(), read() and close() system calls with various arguments is observed. This sub-pattern is largely dependant on the input of the ps command.
168
W.O. Wilson, J. Feyereisl, and U. Aickelin
In this case it is the number of running processes on the monitored system, which in turn is the number of files in the aforementioned /proc/ directory. The motifs generated by the MTA are formed by such atomic system call sequences. The randomness in the input, which results in motifs of variable lengths is of major importance when considering IDS systems. For this purpose a system call expression language is proposed which, besides giving the MTA an alternative representation to assess system call similarity, gives security researchers a regular expression type language for describing system call motifs at a higher level. This language is described in more detail in Section 8. In the above analysis we have focused on significant motifs with sequences exceeding 40 system calls. The advantage of using the MTA on system call data and not the command list becomes apparent when we look at the shorter motifs that are generated, as these indicate atomic motifs that are found across varying command instructions. One such motif, referred to as Z, has a length of 30 system calls and occurred five times at positions 387, 620, 669, 718 and 951. Motif Z relates to the re-occurrence of a sequence that occurs during the initialisation of newly spawned processes executed across the commands ls -lsa, chmod a+x file, chmod a+x directory, ls -lsa and chmod a-x file respectively. Each C application under Linux, when it starts, calls and loads the standard C library, libc. The occurrences of motif Z corresponds to the loading of this library. Thus the MTA has found a motif that is present but embedded across differing commands. This approach of using system calls as input to the MTA, and not the commands, would aid in the detection of exploits that are far smaller than the commands themselves. An example of such an exploit is the SQL slammer worm which is only 376 bytes long, compared to the text segment of the chmod command of 29,212 bytes. 7.2
Sensitivity to Changes in the Symbol Length s
In total 961 of the 1,000 system calls are identified as being part of one or more of these eight motifs. From these results it is clear the MTA is able to successfully identify a reduced set of motifs from the large system call data set. By varying the value of s and the use of trivial match elimination we can examine the sensitivity of the MTA and assess its ability to retain knowledge of these eight motifs. The results of this sensitivity analysis can be seen in Table 2. Table 2 shows the total number of motifs found and the execution time of the MTA for the various values of s. In addition a measure of the quality of Table 2. Sensitivity of the MTA to variations in the symbol length s Motifs found s NTME TME 10 8 6 20 5 4 40 4 3
Execution NTME 315.8 56.9 12.1
time (sec) TME 262.0 26.5 1.9
Motif quality measure NTME TME 1,490 1,230 1,140 940 960 720
Detecting Motifs in System Call Sequences
169
the motifs found is included by multiplying the identified length of the motif by the identified frequency and summing for all motifs found. Any omissions in the length or frequency of the complete motif will cause a decline in this quality measure. As the symbol length s increases, the number of motifs detected declines. This appears logical as a higher s implies the search is less fine grained. Introducing TME also reduces the number of motifs found. TME significantly reduces the size of the motif candidate matrix M resulting in fewer candidates being examined. TME is key to the dimensionality reduction of the original data set leading to a fast search process, however it would appear that its inclusion does lead to a loss in detection accuracy. Given NTME, the MTA is only able to identify four of the eight motifs (1, 2, 6 and 7) if s rises from 10 to 40. The quality measure also indicates that, of the motifs found, there appears to be a loss in the detection of the full length or frequency of occurrence. The quality measure falls from 1,490 to 960. Of the four motifs still detected we lose 40 system calls from motif 1 and we only detect two of the three repetitions of motif 7. However raising the symbol length from 10 to 40 results in a 96.2% reduction in the MTA execution time, taking only 12.1 seconds compared to 315.8. 7.3
Sensitivity to Changes in the Alphabet Size a
Adjusting the alphabet size alters the symbol set used to represent the time series. Reducing a means a greater diversity of sequences are now grouped together as similar. TME with a reduced alphabet set should lead to a larger number of trivial match eliminations, leading to a faster but potentially less accurate search. This hypothesis is confirmed when we look at Table 3 which lists the motifs found for various alphabet sizes. In this scenario s = 20, r = 0 and a took the values 10, 8, 6, and 4. Table 3 shows the alphabet size has no impact on the detection ability of the MTA if there is NTME. The five motifs detected when s = 20, a = 10 (Table 2) are always found and have the same quality measurement. However the search time of the MTA with NTME improves by 30.1% as a is reduced from 10 to 4. With TME activated, changes to a have a more significant impact on the motifs detected. Reducing a from 10 to 8 causes the MTA to lose track of motif 8 but it now finds motif 2. As motif 2 is longer than motif 8 we get an improvement Table 3. Sensitivity of the MTA to variations in the alphabet size a Motifs found a NTME TME 10 5 4 8 5 5 6 5 3 4 5 5
Execution NTME 56.9 48.0 43.3 39.8
time (sec) TME 26.5 19.8 16.1 6.3
Motif quality measure NTME TME 1,140 940 1,140 1,140 1,140 860 1,140 1,140
170
W.O. Wilson, J. Feyereisl, and U. Aickelin
in the overall motif quality measure from 940 to 1,140. Reducing a further from 8 to 4 causes the MTA to lose motif 6 but gain knowledge of motif 4. Thus we see that TME causes a change in the location in search space where the MTA conducts its search, resulting in less consistent results. One could imply that this inconsistency due to trivial match elimination is detrimental to this particular search problem but this need not be the case. As is evident from Table 3 trivial match elimination significantly improves the search time of the MTA and the results from including trivial match elimination are still satisfactory. When a=4 activating trivial match elimination results in the MTA still finding five motifs but it reduces the search time by 84.2% from 39.8 seconds to 6.3 seconds and with no loss to the quality of those five motifs found. 7.4
Summary Discussion of Results
From these results it is apparent that the MTA is able to identify motifs that are present in this system call data set. The MTA can compress the original data set of 1,000 system calls down to eight repeating motifs. A trade off between speed and accuracy becomes apparent as the user is able to adjust the parameters of the algorithm to speed up the search process at the cost of a reduction in detection capability, allowing a flexible search mechanism. The sensitivity to changes in s and a noted here is due to the nature of system call data. In this paper we group system calls together as similar by averaging their system call values over a fixed sized window. These are then grouped by boundary conditions and represented by a symbol which is then subject to trivial match elimination. With system calls there is no real relationship between two separate system call values, i.e. system call 2 is not twice as large as system call 1. Therefore one could argue that a more appropriate representation may be more suitable as is discussed in Section 8. The results show that the MTA, developed to identify motifs in financial and industrial data sets, is successful in identifying motifs in system call data due to its generic and flexible approach. It provides a useful tool to compress a large data set into small subset of repeating patterns that are of immediate value to the user.
8
Future Work
The difficulty with analysing system call sequences for the purposes of intrusion detection is that the variety of sequences generated is largely dependant on the diversity of the application’s input. This potential variety sidesteps most forms of pattern detection as long as the detection mechanism is not able to encode the variations in a manner that is granular enough to be able to distinguish between normal and anomalous patterns. To address this issue, as part of our future work we propose a system call expression language (SCEL), which acts as a higher level regular expression type language consisting of constructs representing atomic system call motifs of meaningful actions.
Detecting Motifs in System Call Sequences
171
An example of this language can be presented using motif Z of length 30 as described in Section 7. In the SCEL motif Z could be represented by a higher level construct, such as lib loading(libc). Where lib loading represents a particular set of motifs for that action and (libc) denotes the class to which those motifs belong. Similar constructs could be devised for other operations which contain atomic motifs, representative of a higher level functionality. The motifs 2, 3, 4, and 8 in Table 1 indicate the re-occurrence of the open(), read() and close() system calls. These three calls could now be represented as one file read(small) construct to be used for files below a certain size threshold. In addition a file read(large) construct containing a wild card for the number of read() system calls between the open() and close() calls can be generated for instances when reading larger files, where numerous read() calls are executed depending on the file’s size. When reading such a file an attribute of the language construct could denote the number of motifs present in an observed process. For example lib loading(libc)[1], file read(small)[*], other(*)[*] could denote a complete ps command being executed. This language tool would prove of value to a user as it focuses on a high level of abstraction while maintaining the ability to conduct fine grained analysis of system calls. This new representation for system call similarity could now also be used as input for the MTA to enhance its motif detection ability.
9
Conclusion
The search for patterns or motifs in data represents a generic problem area that is of great interest to a huge variety of researchers. By extracting motifs that exist in data we gain some understanding as to the nature and characteristics of that data, so that we can benefit from that knowledge. The motifs provide an obvious mechanism to cluster, classify and summarise the data, placing great value on these patterns. Little research has been performed looking for unknown motifs in time series. The MTA takes up this challenge using a novel immune inspired approach to evolve a population of trackers that seek out and match motifs present in a time series. The MTA uses a minimal number of parameters with minimal assumptions and requires no knowledge of the data examined or the underlying motifs, unlike other alternative approaches. In this paper the MTA was applied to motif detection in system call data. The MTA was shown to compress the data set into a limited number of motifs that provide good coverage of the original data set resulting in a minimal loss of information. The authors propose that these motifs highlight repeating or atomic functions that can be used to build profiles of “system behaviour”. These profiles could then assist in tasks such as anomaly detection or behaviour classification. The authors provide information on a system call expression language that addresses system call granularity issues for computer security applications in the future. In its current form we believe the MTA offers a valuable contribution to an area of research that at present has received surprisingly little attention.
172
W.O. Wilson, J. Feyereisl, and U. Aickelin
References 1. Nunn, I., White, T.: The application of antigenic search techniques to time series forecasting. In: GECCO, pp. 353–360 (June 2005) 2. Wilson, W.O., Birkin, P., Aickelin, U.: Motif detection inspired by immune memory. In: ICARIS 2007. Proceedings of the 6th International Conference on Artificial Immune Systems, Santos, Brazil. LNCS, Springer, Heidelberg (2007) 3. de Castro, L.N., Von Zuben, F.J.: Learning and optimization using the clonal selection principle. IEEE Transactions on Evolutionary Computation 6(3), 239– 251 (2002) 4. Lin, J., Keogh, E., Lonardi, S., Patel, P.: Finding motifs in time series. In: The 2nd workshop on temporal data mining, at the 8th ACM SIGKDD international conference on knowledge discovery and data mining (July 2002) 5. Guan, X., Uberbacher, E.C.: A fast look up algorithm for detecting repetitive DNA sequences. In: Pacific symposium on biocomputing, Hawaii IEEE Tran. Control Systems Tech. (December 1996) 6. Keogh, E., Smyth, P.: A probabilistic approach to fast pattern matching in time series databases. In: Proceedings of the third international conference of knowledge discovery and data mining, pp. 20–24 (1997) 7. Faloutsos, C., Ranganathan, M., Manolopoulos, Y.: Fast subsequence matching in time series databases. In: Proceedings of the SIGMOD conference, pp. 419–429 (1994) 8. Chiu, B., Keogh, E., Lonardi, S.: Probabilistic discovery of time series motifs. In: SIGKDD (August 2003) 9. Lin, J., Keogh, E., Lonardi, S.: Visualizing and discovering non trivial patterns in large time series databases. Information visualization 4(2), 61–82 (2005) 10. Tanaka, Y., Uehara, K.: Discover motifs in multi-dimensional time series using the principal component analysis and the MDL principle. In: 3rd international conference on machine learning and data mining in pattern recognition, Leipzig, Germany, pp. 252–265 (2003) 11. Fu, T.C., Chung, F.L., Ng, V., Luk, R.: Pattern discovery from stock market time series using self organizing maps. In: Workshop notes of KDD 2001 workshop on temporal data mining, San francisco, CA, pp. 27–37 (2001) 12. Forrest, S., Hofmeyr, S.A., Somayaji, A., Longstaff, T.A.: A sense of self for UNIX processes. In: IEEE Symposium on Security and Privacy, pp. 120–128. IEEE Computer Society Press, Oakland, CA (1996) 13. Sekar, R., Bowen, T., Segal, M.: On preventing intrusions by process behavior monitoring. In: Proceedings of the Workshop on Intrusion Detection and Network Monitoring, pp. 29–40. USENIX Association, Berkeley, CA (1999) 14. Warrender, C., Forrest, S., Pearlmutter, B.: Detecting intrusions using system calls: Alternative data models. In: Proceedings of the 1999 Conference on Security and Privacy (S&P-99), pp. 133–145. IEEE Press, Los Alamitos (1999) 15. Tandon, G., Chan, P., Mitra, D.: Morpheus: Motif oriented representations to purge hostile events from unlabeled sequences. In: Proceedings of the 2004 ACM workshop on Visualization and data mining for computer security, pp. 16–25. ACM Press, New York (2004)
Comparative Studies in Key Disagreement Correction Process on Wireless Key Agreement System Toru Hashimoto1,0002 , Takashi Itoh1 , Masazumi Ueba1 , Hisato Iwai2 , Hideichi Sasaoka2, Kazukuni Kobara4,3 , and Hideki Imai3,4 1
4
ATR Wave Engineering Laboratories, Japan 2 Doshisha University, Japan 3 Chuo University, Japan National Institute of Advanced Industrial Science and Technology (AIST), Japan [email protected]
Abstract. This paper describes the comparison of the error-correcting codes that is adopted by the key disagreement correction process about wireless key agreement system called ESPARSKEY that is expected to achieve information-theoretic security. This system consists of AP with a variable directional antenna, that is, an ESPAR antenna, and U T with an omni-directional antenna. We employ conditional mutual information as the evaluation index. From experimental evaluation results, we clarified that the best way is adopting BCH(31,16,7) with table-aided soft-decision decoding as the key disagreement process where one eavesdropper exists more than 40cm from U T . After adopting this errorcorrecting code, we should transact 200 wireless packets between the nodes to share a 128-bit unguessable key against an eavesdropper.
1
Introduction
The rapid progress of such wireless communication technologies as RF devices, signal processing, modulation methods, antennas, and so on have greatly contributed to the cost reduction of terminals and the speed-up of wireless communication. As a result, such wireless communication as Wireless LAN or Personal Digital Cellular has become more popular and convenient. On the other hand, this popularization causes threats, such as tapping by eavesdroppers (passive attacks) and impersonations from third parties (active attacks). For active attacks, mutual authentication is the best solution, many kinds of which are incorporated in much application software. 0002
This work is part of “Research and development of ultra-high speed giga-bit rate wireless LAN systems” granted by National Institute of Information and Communications Technology (NICT).
S. Kim, M. Yung, and H.-W. Lee (Eds.): WISA 2007, LNCS 4867, pp. 173–187, 2007. c Springer-Verlag Berlin Heidelberg 2007 0002
174
T. Hashimoto et al.
The easiest way to protect wireless communication data from passive attackers is to encrypt the data with either public key encryption or common key encryption. The public key encryption system does not need an identical key for encryption and decryption, which is an advantage from view of the key management that includes key distribution, generation, sharing, and so on. But it has the disadvantage of low speed. Therefore, it has mainly been used for key exchange, authentication schemes, and digital signatures. Though a common key encryption system can encrypt and decrypt at high speed, it must manage common secret keys between nodes. For secure encryption using common key encryption schemes, key management between nodes is crucial. In the past, many kinds of key management schemes, especially key sharing or generating methods, have been proposed and adopted at many established wireless networks. Representative examples include the Diffie-Hellman key exchange algorithm [1] and key distribution methods using public key encryption such as RSA cryptosystem [2]. Both methods can be decoded in practical time if eavesdroppers have high computational power, which continues to increase rapidly, because the security of both is based on the difficulty of calculation, such as the Discrete Logarithm Problem or prime factorization called “computational security.” On the other hand, some methods are “unconditional (information-theoretic) secure,” based on quantum cryptography [3], noisy communication channels [4] [5], fluctuation of radio wave channels [6], and so on. Even assuming that eavesdroppers have infinite computational power, these methods are secure since they are based on information-theoretic security. In these methods, the most practical solution for wireless communication systems is the fluctuation of the radio wave channel response. Aono et al. [7] proposed a system based on the fluctuation of radio wave channels called ESPARSKEY that generates and shares keys without distributing them. In addition, for the intentional fluctuation of radio waves, this system exploits a variable directional antenna that is called Electronically Steerable Parasitic Array Radiator (ESPAR) antenna [8]. In this system, security evaluation studies have been performed. Imai et al. [9] reported that this system can achieve unconditionally secure key agreement as long as the passive eavesdropper is not located too close (on the order of tens of centimeters) to the players. In this paper, we describe the comparison of the error-correcting codes that is adopted by the key disagreement correction process for secure key generating concerning ESPARSKEY. The key disagreement correction process has possibilities not only of increasing the mutual information between AP and U T but also decreasing of the mutual information between EV and U T for miscorrection by error-correcting codes. As the result, the possibility to achieve unconditionally secure key agreement increases. We give the details of the wireless key generation system including its components, principle, and procedure in Section 2, an evaluation index in Section 3, and experiments and evaluations in Section 4. Finally we conclude this paper in Section 5.
Comparative Studies in Key Disagreement Correction Process
175
Fig. 1. Outline of AP
ESPAR antenna (variable directional) Communication device
RF signal
RF module
S y ndrome,etc
RS S I value
Making RSSI profile
DC voltage
Voltage control c i rcuit(D/A)
D/A converter
Reactance vector Pseudorandomnumber generator
Mic r o c o ntroller
K e y generator USB to PC
Fig. 2. Function block diagram of AP
2 2.1
Wireless Key Agreement System System Configuration
This system consists of Access Point(AP ) and User Terminal(U T ). AP ’s outline drawing is shown in Fig. 1, and its function block diagram is shown in Fig. 2 and consists of four parts: “Communication device,” “Microcontroller,” “D/A converter,” and “7-element ESPAR antenna.” – The “Communication device” is a CC2420 [10], which is a single-chip 2.4 GHz IEEE 802.15.4 compliant RF transceiver with baseband modem and MAC support. The simple specifications of the IEEE802.15.4 standard are shown in Table. 1. IEEE 802.15.4 is used at “ZigBee” as a physical layer and a MAC layer specification [11]. Such data as syndrome, initial value, etc.
176
T. Hashimoto et al. Table 1. Specifications of IEEE802.15.4 Frequency
2.4 GHz (ISM band) · same as wireless LAN (IEEE802.11b/g) Channel number 16 CH (CH 11∼CH 26) · use all CH for key generation · use arbitrary 1 CH for control Transmission power 1 mW Data rate 250 kbps
are set with a microcontroller and communicate with packets through an ESPAR antenna. The received power of packets is measured and converted into an RSSI value in this chip. – “Microcontroller” carries out the “making RSSI profile” and “generating the key” steps in the “key generator” function. These steps are based on the RSSI value from the “Communication device.” “Microcontroller” has a USB port to send the generated keys to a PC by USB cable. The “7-element ESPAR antenna” needs DC voltage to control the beam pattern of the antenna. The seeds of DC voltage, called reactance vectors, are made in the “Pseudo random number generator” function in the “Microcontroller,” which is set to the “D/A converter.” – The “D/A converter” changes the digital value of the reactance vectors into analog DC voltage and sets this voltage to the varactor in the “ESPAR antenna.” The DC voltage, which ranges from 0 to 5 [V], is set randomly in 4 bits resolution at the “D/A converter.” For example, the radiation patterns 6 of the 7-element ESPAR antenna result in 24 = 166 = 16777216 patterns, because it has six parasitic elements. – The “ESPAR antenna,” which is a variable-directional array antenna with a single central active radiator surrounded by parasitic elements, is a component of AP . The six parasitic elements are located at equal intervals around a single central active radiator, and each is loaded with a varactor diode, which is a variable-capacitance diode. By adjusting the DC voltage given to the varactors with reverse bias, the antenna’s beam can be formed. Because it only has a single RF radiator, it is expected to have lower power consumption than Digital Beam-Forming (DBF) array antennas. Many studies using this antenna have been performed for personal wireless communication or the direction of arrival estimation. U T ’s outline drawing is shown in Fig. 3, and its function block diagram is shown in Fig. 4. They consist of three parts: “an omni-directional patch antenna,” “a communication device,” and “a microcontroller.” The role of each block is identical to AP except for the antenna. The node is connected to the PC by a USB interface.
Comparative Studies in Key Disagreement Correction Process
177
Fig. 3. Outline of U T
Omni-directional antenna (patch antenna) RF signal
Communication device
RF module
Syndrome, etc
RSSI value
Micro controller
Making RSSI profile Key generator USB to PC
Fig. 4. Function block diagram of U T
2.2
Key Generation
Principle of Key Generation. This key generation system is based on the reciprocity theorem of radio wave propagation between AP and U T . In addition, a variable directional antenna called ESPAR antenna is used for intentional fluctuation. The key is made by this fluctuation of radio waves. We describe the principle of key generation in detail. We use the beam-forming technique of the ESPAR antenna; that is, by adjusting the DC voltage given to the varactors with reverse bias, we can intentionally undulate the response of the propagation channels between the two nodes. In this condition, we measure the Received Signal Strength Indicator (RSSI) by alternately transmitting short packets between nodes. From this measurement, AP and U T can independently obtain RSSI profiles whose characteristics are identical due to the reciprocity theorem of radio wave propagation. The same value is provided by making these
178
T. Hashimoto et al.
AP (Access Point) with a variable directional antenna
Iteration of controlled directional beam pattern
Procedure of key generation and sharing
Wireless Communication (without encryption)
UT (User Terminal) with an omni-directional antenna TDD: Time Division Duplex
Send a packet by beam pattern A
Wireless Communication (without encryption)
Receive a packet by omnidirectional pattern Measure RSSI value
Receive a packet by beam pattern A
Send a packet by omnidirectional pattern
Measure RSSI value ● ● ●
● ● ● ● ● ●
Build up RSSI profile
Build up RSSI profile
Define threshold level Delete data of the threshold level neighborhood
Define threshold level Information about delete data Information about delete data
Multilevel coding (Binary coding)
Delete data of the threshold level neighborhood Multilevel coding (Binary coding)
Generate a key candidate
1. RSSI Profile Generation Process
2. Data Deletion Process of the Threshold Neighborhood
3. Binary Coding Process
Generate a key candidate Syndrome
Correct disagreement key
Correct disagreement key Agree/Disagree
Set key Wireless Communication (with encryption)
4. Key Disagreement Correction Process
Set key TDD: Time Division Duplex
Wireless Communication (with encryption)
Fig. 5. Procedure flow of key generation
profiles multilevel coding, for example, binary coding. As a result, these values are generated keys in this system. An eavesdropper (EV ), who is a passive attacker listening at another place, has difficulty obtaining the same key between nodes because EV cannot obtain the same RSSI profile from AP and U T since their propagation characteristics are different. Procedure. Here we describe in detail the procedure of our key generation system. The preconditions are set as follows. AP has a 7-element ESPAR antenna, and U T has a conventional omni-directional antenna. EV is identical as U T except for the location. AP and U T can communicate at the same frequency by a method such as Time Division Duplex (TDD). The DC voltage sets at each varactor diode are connected to the parasitic element of the 7-element ESPAR antenna using a pseudorandom number.
Comparative Studies in Key Disagreement Correction Process
179
The procedure of this system is shown in Fig. 5 and described as follows. 1. RSSI Profile Generation Process AP decides a certain directional beam pattern by obtaining pseudorandom numbers and sending a short packet to U T , which receives a packet from AP and computes the RSSI value by measuring the received power. A computed RSSI value is recorded in U T ’s memory. U T returns a short packet to AP as soon as possible to maintain a propagation environment. AP receives a packet from U T and computes and records the RSSI value as well as U T . If a packet isn’t received by U T or AP , a packet-send-error is recognized by timeout, and AP will change the beam pattern and retry transmission. Next, AP changes the directional beam pattern and repeats the above procedure N times. Here, N consists of key length K and the length of redundant bit α, N = K + α. AP and U T can independently compile a RSSI data profile. Thanks to the reciprocity theorem of radio wave propagation between AP and U T , this profile in the nodes is identical, except for the random noise, the differences in transmission power, the receiver’s noise figures, and such antenna performance factors as sensitivity or directivity. From each profile, AP and U T independently define a threshold level that is considered the median value of each RSSI profile in this system. Now the value of the RSSI profile includes noise components. To reduce their influence, this system has a data deletion process of the threshold neighborhood, which is shown in Fig. 6. 2. Data Deletion Process of the Threshold Neighborhood AP and U T have an RSSI profile the size of N . As shown above, N consists of key length K and the length of redundant bit α. In Fig. 6, K is defined as 6 and α as 4. From this profile, the threshold level is decided by the median value of each RSSI profile sorted afterwards in each node by ascending order. At AP , a subset of the RSSI value is chosen as the most susceptible place for noise by picking the largest K/2 + β and the smallest K/2 + β RSSI values, with β < α/2 (in Fig. 6, β is defined as 1). The positions of the unchosen RSSI values are transmitted to U T and deleted at AP and U T . At U T , the RSSI values undeleted by transmitted information from AP are sorted again and the process is repeated, this time choosing the largest K/2 and smallest K/2 RSSI values from the remaining RSSI values. Again, the positions of the unchosen RSSI values are transmitted to AP , and unchosen RSSI values are deleted at both nodes. The remaining RSSI profiles whose size is K are sorted again by original number. 3. Binary Coding Process These RSSI profiles that have been coded to binary using the threshold level defined above are candidates for the shared key. 4. Key Disagreement Correction Process The key disagreement correction process is carried out on these candidates by an error-correcting code to produce the shared secure keys. In this process, we basically consider the bit patterns of key candidates a series of such block code as BCH code. In this condition, we calculate each syndrome,
T. Hashimoto et al.
AP (Access Point) with a variable directional antenna
UT (User Terminal) with an omni-directional antenna
RSSI profile Threshold level
1
2
3 4
5
6
7 8
9 10
RSSI
RSSI
RSSI profile
Threshold level
1
2
RSSI
RSSI
7 10 5 1
10 5 1 7
4 9 8 3
Delete No.9, No.10
Sort Threshold level
7
RSSI
Sort
Threshold level
1 3 4
8
A series of antenna directivity binary coding
100110
Delete No.2, No.6
The positions of the unchosen RSSI values
Delete No.9, No.10
5
9 10
10 5 1 7 4 9 2 6 8 3
The positions of the unchosen RSSI values
Delete No.2, No.6
3 4
8
Sort
7 10 5 1 4 9 6 2 3 8
1
5 6 7
RSSI
RSSI
Sort
4 9 8 3
3 4
A series of antenna directivity
A series of antenna directivity
RSSI
180
5
7 8
A series of antenna directivity binary coding
Key agreement
100110
Fig. 6. Data deletion process of the threshold neighborhood
SAP = xAP HT , and SUT = xUT HT , where xAP and xUT are bit patterns in the key candidates of AP and U T , respectively, HT denotes a check matrix, and superscript T is the transpose of the matrix. Calculated syndrome SUT is transmitted from U T to AP . At AP , we define the differences in the syndrome as S = SAP − SUT and in the bit patterns as e = xAP − xUT . The relationship between these parameters is expressed as S = eHT . If S = 0 is true, then e = 0 is true, and both bit patterns of the key candidates are considered to be in agreement. If S = 0 is false, then we will correctly estimate e to minimize the number of disagreement bits by an error correction technique. If no agreement is obtained after the key disagreement correction process, the generated key is rejected, and the entire process is repeated.
Comparative Studies in Key Disagreement Correction Process
181
I(YUT;LAP LEV) I(YUT;LAP) H(LAP)
H(YUT)
I(YUT;LEV) H(LEV)
Eavesdropper Information
Fig. 7. Entropy relation of LAP , YU T , and LEV
3
Security Evaluation Index
Fig. 7 denotes the entropies and their relation of LAP , YUT , and LEV . In this figure, L is the set of levels, and Y is the binary encoded data, Y ∈ {0, 1}N . Each node obtains these values from RSSI profiles. We calculate entropies (H) and information (I) by estimating the error rate. To estimate the error rate, we tried the time of the Np key generation procedure to show in Fig. 5. This error rate estimation procedure is described as the following protocol [9]: Protocol 1. For i = 1 to Np do: (a) For i = 1 to N do: i. AP sends a packet to U T , who receives it, measures its RSSI value, and records it as RUT (i). ii. As soon as possible U T returns a packet to AP , who receives it, measures its RSSI value, and records it as RAP (i). iii. AP changes the radiation pattern. Endfor. AP and U T conclude with their RSSI value sets RAP and RUT , respectively. (b) Both AP and UT perform the following: i. R is sorted by ascending order L ∈ {1, 2, . . . , N } and is coded to binary using threshold level TNp , which is the median value of each RSSI profiles, that is, Yi = 0, if R(i) < TNp and otherwise Yi = 1. AP and U T conclude with YAP and YUT , respectively. ii. AP and U T announce YAP and YUT to each other. For i = 1, 2, . . . , N, AP and U T calculate: err(i) = err(i) + 1, if YAP (i) 0003= YUT (i). Endfor. 2. AP and U T compute the estimated error rate(pi ) for each level as follows: For i = 1, 2, . . . , N, : pi = err(i)/Np
182
T. Hashimoto et al.
EV can get its own RSSI value and estimate the error rate by tapping the communication between AP and U T . After this procedure, we can calculate entropies H(LAP ), H(YUT ), and so on. In Fig. 5, the intersection of H(LAP ) and H(YUT ) is the amount of information about YUT obtained from LAP . In other words, it is mutual information I(YUT ; LAP ) between H(YUT ) and H(LAP ). Evaluation of only this quantity presents a problem because this quantity does not consider the eavesdropper. Because this mutual information I(YUT ; LAP ) includes information obtained by eavesdroppers, their influence must be removed. The information obtained by the eavesdropper is shown at the intersection of H(LAP ), H(YUT ), and H(LEV ) and is marked “Eavesdropper Information” in Fig. 7. For simplicity, we consider this value mutual information I(YUT ; LEV ) between H(YUT ) and H(LEV ), and we can get the upper bound of the information obtained by the eavesdropper. This quantity is shown at the intersection of H(YUT ) and H(LEV ) in Fig. 7. We need to evaluate mutual information I(YUT ; LEV ) in addition to the former evaluation. From these evaluations, the lower bound of mutual information I(YUT ; LAP LEV ) (based on LAP ) that an eavesdropper can’t know is shown as: I (YUT ; LAP LEV ) ≥ I (YUT ; LAP )− I (YUT ; LEV )
(1)
In the same way, the lower bound of I(YAP ; LUT LEV ) (based on LUT ) is shown as: I (YAP ; LUT LEV ) ≥ I (YAP ; LUT )− I (YAP ; LEV ) (2) In this paper, we show the calculation result of Eq. (1) defined as the “conditional mutual information per round” because there is little difference between the calculation result of Eqs. (1) and (2).
4 4.1
Evaluation Experiments and Simulations Experimental Condition
The conditions of the key generation method are described as follows. Key length K is 128 bits, and the length of redundant bit α, described in Subsection 2.2, is 256 bits. In other words, the number of RSSI value N is 384 at each node. Np is at least 100 key generations. This experimental system has an eavesdropper (EV ), whose figure and function block diagram is the same as U T . The difference between U T and EV is the positions in which the packet was received. The experimental room in which we carried out key generation experiments is shown in Fig. 8. Each node was arranged in 141 points all over the experiment room that has three metal walls and one concrete wall. The experiment was performed in turn by each node.
Comparative Studies in Key Disagreement Correction Process
183
Cabinet
6.7m
Table
Table
PC
PC
Table
Table
Entrance 8.4m Concrete wall
Metal wall
Fig. 8. Sketch of experimental room
4.2
Security Evaluation without Key Disagreement Correction Process
First, we evaluated the security on this wireless key agreement system. Fig. 9 shows the security evaluation results by conditional mutual information. The vertical scale of this graph is the mutual information per round in each result. The word ”round” means the process by which AP sends a packet to U T and U T returns it to AP . In this result, we find the following three things: – The mutual information between AP and U T shown in (a) is high level at any distance. – The mutual information between EV and U T shown in (b) is high level when the distance between EV and U T is close. – The system exhibits conditional mutual information 0.6 or higher when there is one EV more than 40 cm from U T . This means that to share a 128-bit unguessable key against an eavesdropper in this procedure without the key disagreement correction process, we should transact 214 wireless packets between the nodes. 4.3
Comparison of Error-Correcting Codes Concerning Key Disagreement Correction Process by Computer Simulation
Preparation. When we compare the differences between the error-correctiong codes concerning key disagreement correction processes by computer simulation, the following four types of coding methods have been prepared that mainly focus on implementation for this system: Method 1. BCH(31,21,5) with table-aided hard-decision decoding. Method 2. BCH(31,16,7) with table-aided hard-decision decoding.
184
T. Hashimoto et al. 1
Mutual information [bit/round]
0.99 0.98 0.97 0.96 0.95
40 [cm]
0.94 0.93 0.92 0.91 0.9 0
100
200
300
400
500
600
Distance between UT and EV [cm]
(a) Mutual Information between AP and UT 1
Mutual information [bit/round]
0.9 0.8 0.7 0.6
40 [cm]
0.5 0.4 0.3 0.2 0.1 0 0
100
200
300
400
500
600
Distance between UT and EV [cm]
Conditional mutual information [bit/round]
(b) Mutual Information between EV and UT 1 0.9 0.8 0.7 0.6
40 [cm]
0.5 0.4 0.3 0.2 0.1 0 0
100
200
300
400
500
600
Distance between UT and EV [cm]
(c) Conditional Mutual Information
Fig. 9. Mutual information without key disagreement correction process whose function is distance between U T and EV
Method 3. BCH(31,21,5) with table-aided soft-decision decoding. Method 4. BCH(31,16,7) with table-aided soft-decision decoding. The aided table for decoding is pre-shared by all terminals. Comparison of Mutual Information between AP and U T . Table 2 shows the mutual information between AP and U T . This value is the first term in Eq. (1). Five methods, one of which is the case of whthout a key disagreement correction process and another is described as the method of the key disagreement correction process in the previous subsection, are shown in this table. In addition, they divided by arranging EV . Case 1 is all the points in this experiment, case 2 is the points whose EV is located within 40 cm from U T , and case 3 is the other points.
Comparative Studies in Key Disagreement Correction Process
185
Table 2. Comparison of Mutual Information between AP and U T no error correction Case 1 0.9854 141 points (—) Case 2 0.9857 89 points (—) Case 3 0.9848 52 points (—)
Coding Method 1 0.9915 (0.0061) 0.9927 (0.0070) 0.9894 (0.0045)
Coding Method 2 0.9921 (0.0069) 0.9938 (0.0081) 0.9893 (0.0045)
Coding Coding Method 3 Method 4 0.9920 0.9923 (0.0066) (0.0069) 0.9935 0.9939 (0.0078) (0.0082) 0.9895 0.9896 (0.0047) (0.0047) (bit/round)
Table 3. Comparison of Mutual Information between EV and U T no error correction Case 1 0.2729 141 points (—) Case 2 0.3788 89 points (—) Case 3 0.0917 52 points (—)
Coding Method 1 0.2518 (-0.0211) 0.3535 (-0.0253) 0.0776 (-0.0141)
Coding Method 2 0.2719 (-0.0010) 0.3836 (0.0048) 0.0808 (-0.0109)
Coding Coding Method 3 Method 4 0.2486 0.2645 (-0.0243) (-0.0084) 0.3556 0.3866 (-0.0232) (0.0078) 0.0655 0.0555 (-0.0262) (-0.0362) (bit/round)
This table means that the key disagreement correction process is effective for mutual information between AP and U T for all methods regardless of the distance between EV and U T . Comparison of Mutual Information between EV and U T . Table 3 shows the mutual information between EV and U T . This value is the second term in Eq. (1). When EV is located near U T , methods 1 and 3 are more effective for the mutual information between EV and U T than methods 2 or 4 because the possibility that the RSSI profile of EV is similar to U T ’s profile increases when the distance of each node is short. Moreover, a different feature is found in which method 4 is the most effective in case 3. Miscorrection occurs in the key disagreement correction process based on the syndrome received by EV because the RSSI profile of EV is different from that of U T in case 3. Comparison of Conditional Mutual Information. Table 4 shows the conditional mutual information defined in Eq.(1). In this result, we clarified the effectiveness of the key disagreement correction process based on the error-correcting codes. Conditional mutual information has only increased 0.04 bit or less using this process. In fact, the best way is to adopt BCH(31,16,7) with table-aided soft-decision decoding as the key disagreement correction process when there is one EV more than 40 cm from U T . To share a 128-bit unguessable key against
186
T. Hashimoto et al. Table 4. Comparison of Conditional Mutual Information no error correction Case 1 0.7125 141 points (—) Case 2 0.6069 89 points (—) Case 3 0.8932 52 points (—)
Coding Method 1 0.7397 (0.0273) 0.6392 (0.0323) 0.9118 (0.0186)
Coding Method 2 0.7202 (0.0077) 0.6101 (0.0033) 0.9085 (0.0154)
Coding Coding Method 3 Method 4 0.7434 0.7278 (0.0309) (0.0154) 0.6378 0.6073 (0.0310) (0.0004) 0.9241 0.9341 (0.0309) (0.0409) (bit/round)
an eavesdropper in this procedure with the key disagreement correction process, we should transact 200 wireless packets between the nodes.
5
Conclusion
As a result of experiments and computer simulation, we clarified the following: – The system that has no key disagreement correction process exhibits conditional mutual information 0.6 or higher when there is one EV more than 40 cm from U T . – In this case, to share a 128-bit unguessable key against an eavesdropper, we should transact 214 wireless packets between the nodes. – The key disagreement correction process based on error-correcting code contributes to the improvement not only of the mutual information between AP and U T but also the mutual information between EV and U T . – The best way is to adopt BCH(31,16,7) with table-aided soft-decision decoding as the key disagreement correction process under the same conditions. – After this improvement, to share a 128-bit unguessable key against an eavesdropper, we should transact 200 wireless packets between nodes.
References 1. Diffie, W., Hellman, M.E.: New Directions in Cryptography. IEEE Trans. Information Theory 22(6), 644–654 (1976) 2. Rivest, R.L., Shamir, A., Adleman, L.M.: A Method for Obtaining Digital Signatures and Public-key Cryptosystems. Communications of the ACM 21, 120–126 (1978) 3. Bennett, C.H., Brassard, G.: Quantum Cryptography: Public Key Distribution and Coin Tossing. In: Proc. IEEE Int. Conf. Com. Sys. and Signal Processing, Bangalore, India (December 1984) 4. Maurer, U.: Secret Key Agreement by Public Discussion from Common Information. IEEE Trans. Inf. Theory 39(3), 733–742 (1993) 5. Maurer, U.: Information-Theoretically Secure Secret-Key Agreement by NOT Authenticated Public Discussion. In: Fumy, W. (ed.) EUROCRYPT 1997. LNCS, vol. 1233, pp. 209–225. Springer, Heidelberg (1997)
Comparative Studies in Key Disagreement Correction Process
187
6. Hershey, J.E., Hassan, A.A., Yarlagadda, R.: Unconventional Cryptographic Keying Variable Management. IEEE Trans. Commun. 43, 3–6 (1995) 7. Aono, T., Higuchi, K., Ohira, T., Komiyama, B., Sasaoka, H.: Wireless Secret Key Generation Exploiting Reactance-domain Scalar Response of Multipath Fading Channels. IEEE Trans. Antennas Propag. 53(11), 3776–3784 (2005) 8. Ohira, T., Cheng, J.: Analog smart antennas, Adaptive Antenna Arrays, pp. 184– 204. Springer, Heidelberg (2004) 9. Imai, H., Kobara, K., Morozov, K.: On the Possibility of Key Agreement Using Variable Directional Antenna. In: JWIS 2006. The 1st Joint Workshop on Information Security 2006, Seoul, Korea, pp. 153–167 (September 2006) 10. http://focus.ti.com/docs/prod/folders/print/cc2420.html 11. http://www.zigbee.org/
Breaking 104 Bit WEP in Less Than 60 Seconds Erik Tews, Ralf-Philipp Weinmann, and Andrei Pyshkin0002 TU Darmstadt, FB Informatik Hochschulstrasse 10, 64289 Darmstadt, Germany {e tews,weinmann,pyshkin}@cdc.informatik.tu-darmstadt.de
Abstract. We demonstrate an active attack on the WEP protocol that is able to recover a 104-bit WEP key using less than 40,000 frames with a success probability of 50%. In order to succeed in 95% of all cases, 85,000 packets are needed. The IV of these packets can be randomly chosen. This is an improvement in the number of required frames by more than an order of magnitude over the best known key-recovery attacks for WEP. On a IEEE 802.11g network, the number of frames required can be obtained by re-injection in less than a minute. The required computational effort is approximately 220 RC4 key setups, which on current desktop and laptop CPUs is negligible.
1
Introduction
Wired Equivalent Privacy (WEP) is a protocol for encrypting wirelessly transmitted packets on IEEE 802.11 networks. In a WEP protected network, all packets are encrypted using the stream cipher RC4 under a common key, the root key1 Rk. The root key is shared by all radio stations. A successful recovery of this key gives an attacker full access to the network. Although known to be insecure and superseded by Wi-Fi Protected Access (WPA) [18], this protocol is still is in widespread use almost 6 years after practical key recovery attacks were found against it [5,15]. In this paper we present a new key-recovery attack against WEP that outperforms previous methods by at least an order of magnitude. First of all we describe how packets are encrypted: For each packet, a 24-bit initialization vector (IV) IV is chosen. The IV concatenated with the root key yields the per packet key K = IV Rk. Over the data to be encrypted, an Integrity Check Value (ICV) is calculated as a CRC32 checksum. The key K is then used to encrypt the data followed by the ICV using the RC4 stream cipher. The IV is transmitted in the header of the packet. Figure 1 shows a simplified version of an 802.11 frame. A first analysis of the design failures of the WEP protocol was published by Borisov, Goldberg and Wagner [2] in 2001. Notably, they showed that the ICV merely protects against random errors but not against malicious attackers. Furthermore, they observed that old IV values could be reused, thus allowing 0002 1
Supported by a stipend of the Marga und Kurt-M¨ ollgaard-Stiftung. The standard actually allows for up to four different root keys; in practice however, only a single root key is used.
S. Kim, M. Yung, and H.-W. Lee (Eds.): WISA 2007, LNCS 4867, pp. 188–202, 2007. c Springer-Verlag Berlin Heidelberg 2007 0002
Breaking 104 Bit WEP in Less Than 60 Seconds
189
Fig. 1. A 802.11 frame encrypted using WEP
to inject messages. In the same year, Fluhrer, Mantin and Shamir presented a related-key ciphertext-only attack against RC4 [5]. In order for this attack to work, the IVs need to fulfill a so-called “resolved condition”. This attack was suspected to be applicable to WEP, which was later demonstrated by Stubblefield et al [15]. Approximately 4 million different frames need to be captured to mount this attack. Vendors reacted to this attack by filtering IVs fulfilling the resolved condition, so-called “weak IVs”. This countermeasure however proved to be insufficient: In 2004, a person using the pseudonym KoreK posted a family of statistical attacks against WEP that does not need weak IVs [9,3]; moreover the number of frames needed for key-recovery was reduced to about 500,000 packets. More recently, Klein [7] showed an improved way of attacking RC4 using related keys that does not need the “resolved condition” on the IVs and gets by with a significantly reduced number of frames. Table 1 shows a statistic of employed encryption methods in a sample of 490 networks, found somewhere in the middle of Germany in March 2007. Another survey of more than 15.000 networks was performed in a larger German city in September 2006 [4]. Both data sets demonstrate that WEP still is a popular method for securing wireless networks. Similar observations have been made by Bittau, Handley and Lackey [1]. Their article also give an excellent history of WEP attacks and describes a real-time decryption attack based on packet fragmentation that does not recover the key. The structure of the paper is as follows: In Section 2 we introduce the notation that is used throughout the rest of this paper, in Section 3.1 we present a summary of Klein’s attack on RC4, in Section 4 we specialize Klein’s Attack to WEP, Section 5 describes how sufficient amounts of key stream can be obtained for the attack, Section 6 describes extensions of the attack such as key ranking techniques in detail and Section 7 gives experimental results.
2
Notation
For arrays or vectors we use the [·] notation, as used in many programing languages like in C or Java. All indices start at 0. For a permutation P denote the inverse permutation by P−1 ; meaning P[i] = j ⇔ P−1 [j] = i. We will use x ≈n y as a short form for x ≈ y mod n.
190
E. Tews, R.-P. Weinmann, and A. Pyshkin Table 1. Methods used for securing wireless networks Time No Encryption WEP WPA1/2 March 2007 21.8% 46.3% 31.9% Middle of 2006 23.3% 59.4% 17.3%
Rk is the WEP or root key and IV is the initialization vector for a packet. K = Rk IV is the session or per packet key. X is a key stream generated using K. We will refer to a key stream X with the corresponding initialization vector IV as a session.
3
The Stream Cipher RC4
RC4 is a widely used stream cipher that was invented by Ron Rivest in 1987. It was a trade secret of RSA Security until 1994 when the algorithm was published anonymously on the Internet. RC4’s internal state consists of a 256-byte array S defining a permutation as well as two integers 0 ≤ i, j ≤ 255 acting as pointers into the array. The RC4 key setup initializes the internal state using a key K of up to 256 bytes. By exchanging two elements of the state in each step, it incrementally transforms the identity permutation into a “random” permutation. The quality of randomness of the permutation after the key setup will be analyzed in Section 3.1. The RC4 key stream generation algorithm updates the RC4 internal state and generates one byte of key stream. The key stream is XORed to the plaintext to generate the ciphertext. Listing 1.1. RC4 key setup 1 2 3 4 5 6 7 8 9 10
for i ← 0 to 255 do S[ i ] ← i end j ←0 for i ← 0 to 255 do j ← j+S[i]+K[i mod len(K)] mod 256 swap(S, i , j ) end i ←0 j ←0 Listing 1.2. RC4 key stream generation
1 2 3 4
i ← i + 1 mod 256 j ← j + S[i] mod 256 swap(S, i , j ) return S[ S[ i ] + S[j] mod 256 ]
We have a closer look at the RC4 key setup algorithm described in listing 1.1, especially at the values for S, i and j. After line 4, S is the identity permutation
Breaking 104 Bit WEP in Less Than 60 Seconds
191
and j has the value 0. We will use Sk and jk for the values of S and j after k iterations of the loop starting in line 5 have been completed. For example, if the key CA FE BA BE is used, S0 is the identity permutation and j0 = 0. After the first key byte has been processed, j1 = 202 and S1 [0] = 202, S1 [202] = 0, and S1 [x] = S0 [x] = x for 0 0005= x 0005= 202. For the rest of this paper, let n = 256. It is possible to generalize our results for modified versions of the RC4 algorithm, but we will only focus on the original algorithm as printed in listing 1.1 and 1.2. 3.1
Klein’s Attack on RC4
Suppose w key streams were generated by RC4 using packet keys with a fixed root key and different initialization vectors. Denote by Ku = (Ku [0], . . . , Ku [m]) = (IVu 0006Rk) the u-th packet key and by Xu = (Xu [0], . . . , Xu [m − 1]) the first m bytes of the uth key stream, where 1 ≤ u ≤ w. Assume that an attacker knows the pairs (IVu , Xu ) – we shall refer to them as samples – and tries to find Rk. In [7], Klein showed that there is a map Fi : (Z/nZ)i → Z/nZ with 1 ≤ i ≤ m such that 0002 K[i], with Prob ≈ 1.36 n Fi (K[0], . . . , K[i − 1], X[i − 1]) = a 0005= K[i], with Prob < n1 for all a So Fi can be seen as a kind of approximation for K[i], and we can write Fi ≈n K[i]. If the first i bytes of a packet key are known, then the internal permutation Si and the index j at the ith step of the RC4 key setup algorithm can be found. We have Fi (K[0], . . . , K[i − 1], X[i − 1]) = S−1 i [i − X[i − 1]] − (ji + Si [i]) mod n The attack is based on the following properties of permutations. Theorem 1. For a random number j ∈ {0, . . . , n − 1} and a random permutation P, we have Prob(P[j] + P[P[i] + P[j] mod n] ≡ i mod n) = Prob(P[j] + P[P[i] + P[j] mod n] ≡ c mod n) =
2 n
n−2 n(n − 1)
where i, c ∈ {0, . . . , n − 1} are fixed, and c 0005= i. Proof. see [7]. In the case of n = 256, the first probability is equal to 2−7 ≈ 0.00781, and the second one is approximately equal to 0.00389. From Theorem 1 it follows that for RC4 there is a correlation between i, Si+n [Si+n [i] + Si+n [j] mod n], and Si+n [j] = Si+n−1 [i]. Next, the equality Si+1 [i] = Si+n−1 [i] holds with high probability. The theoretical explanation of this is the following. If we replace the line 6 of the RC4 key setup, and the line 2 of the RC4 key stream generator by j ← RND(n), 2 then 2
0003 Some publications approximate 1 − rest of this paper.
0004 1 n−2 n
by
1 . e
0003 We will use 1 −
0004 1 n−2 n
for the
192
E. Tews, R.-P. Weinmann, and A. Pyshkin
Prob(Si+1 [i] = Si+n−1 [i]) =
0005 0006n−2 1 1− ≈ e−1 n
Moreover, we have Si+1 [i] = Si [ji+1 ] = Si [ji + Si [i] + K[i] mod n]. Combining this with Theorem 1, we get the probability that K[i] = S−1 i [i − Si+n [Si+n [i] + Si+n [j] mod n] mod n] − (ji + Si [i]) is approximately 0007 0005 0005 0006n−2 0006n−2 1 1.36 1 2 n−2 + 1− 1− ≈ 1− n n n n(n − 1) n
4
Extension to Multiple Key Bytes
With Klein’s attack, it is possible to iteratively compute all secret key bytes, if enough samples are available. This iterative approach has a significant disadvantage: In this case the key streams and IVs need to be saved and processed for every key byte. Additionally correcting falsely guessed key byte is expensive, because the computations for all key bytes following K[i] needs to be repeated if K[i] was incorrect. We extend the attack such that is it possible to compute key bytes independently of each other and thus make efficient use of the attack possible by using key ranking techniques. Klein’s attack is based on the the fact that K[i] ≈n S−1 i [i − X[i − 1]] − (Si [i] + ji ) K[i + 1]
≈n S−1 i+1 [(i
(1)
+ 1) − X[(i + 1) − 1]] − (Si+1 [i + 1] + ji+1 )
(2)
We may write ji+1 as ji + Si [i] + K[i]. By replacing ji+1 in equation 2, we get an approximation for K[i] + K[i + 1]: K[i] + K[i + 1] ≈n S−1 i+1 [(i + 1) − X[(i + 1) − 1]] − (Si+1 [i + 1] + ji + Si [i]) (3) i+k By repeatedly replacing ji+k , we get an approximation for l=i K[l]. Because 3+i i we are mostly interested in l=3 K[l] = l=0 Rk[l] in a WEP scenario, we will use the symbol σi for this sum. 0007 i+3
−1 Sl [l] = A˜i (4) σi ≈n S [(3 + i) − X[2 + i]] − j3 + 3+i
l=3
The right side of equation 4 still depends on the key bytes K[3] to K[i − 1], because they are needed to compute Sl and S−1 3+i . By replacing them with S3 , we get another approximation Ai for σi , which only depends on K[0] to K[2]. 0007 i+3
−1 S3 [l] = Ai (5) σi ≈n S3 [(3 + i) − X[2 + i]] − j3 + l=3
Breaking 104 Bit WEP in Less Than 60 Seconds
193
Under idealized conditions, Klein derives the following probability for the event A˜i = σi : 0007 0005 0006n−2 0006n−2 0005 1 1 2 n−2 ˜ (6) Prob σi = Ai ≈ 1 − · + 1− 1− · n n n n(n − 1) The first part of sum represents the probability that S[i+3] remains unchanged until X[2 + i] is generated, the second part represents the probability that S[i + 3] is changed during key scheduling or key stream generation with Ai still taking the correct value. By replacing Sl and Si+3 with their previous values, we have reduced that probability slightly. Sk+3 [k + 3] differs from S3 [k + 3] only if one of the values of j3 to jk+2 has been k + 3. All values of Sl [l] will be correct, if for all jz with 3 ≤ z ≤ 3 + i the / {z, . . . , 3 + i} holds. Assuming j changes randomly, this happens condition jz ∈ 0003 0004
with probability ik=1 1 − nk . Additionally S3+i [ji+3 ] should not be changed between iteration 3 and 3 + i. This is true if j does not take the value of ji+3 in a 0003 0004i previous round, which happens with probability ≈ 1 − n1 and i does not take 0004 0003 the value of ji+3 , which happens with probability ≈ 1 − ni . To summarize, the probability that replacing all occurrences of S in A˜i with S3 did not change anything is: qi =
0005 0006i 0005 0006 000e 0006 i 0005 1 i k 1− · 1− 1− · n n n
(7)
k=1
This results in the following probability pcorrecti being a lower bound for Ai taking the correct value for σi . 0007 0005 0005 0006n−2 0006n−2 1 2 1 n−2 (8) · + 1 − qi · 1 − · Prob (σi = Ai ) ≈ qi · 1 − n n n n(n − 1) Experimental results using more than 50,000,000,000 simulations with 104 bit WEP keys show that this approximations differs less than 0.2% from values determined from these simulations.
5
Obtaining Sufficient Amounts of Key Stream
The Internet Protocol (IP) is the most widely deployed network protocol. For our attack to work, we assume that version 4 (IPv4) of this protocol is used on the wireless networks we attack. If host A wants to send an IP datagram to host B, A needs the physical address of host B or the gateway through which B can be reached. To resolve IP addresses of hosts to their physical address, the Address Resolution Protocol (ARP) [13] is used. This works as follows: Host A sends an ARP request to the link layer broadcast address. This request announces that A is looking for the physical address of host B. Host B responds with an ARP reply containing his
194
E. Tews, R.-P. Weinmann, and A. Pyshkin
Fig. 2. Cleartext of ARP request and response packets
own physical address to host A. Since the Address Resolution Protocol is a link layer protocol it is typically not restricted by any kind of packet filters or rate limiting rules. ARP requests and ARP replies are of fixed size. Because the size of a packet is not masked by WEP, they can usually be easily distinguished from other traffic. The first 16 bytes of cleartext of an ARP packet are made up of a 8 byte long 802.11 Logical Link Control (LLC) header followed by the first 8 bytes of the ARP packet itself. The LLC header is fixed for every ARP packet (AA AA 03 00 00 00 08 06). The first 8 bytes of an ARP request are also fixed. Their value is 00 01 08 00 06 04 00 01. For an ARP response, the last byte changes to 02, the rest of the bytes are identical to an ARP request. An ARP request is always sent to the broadcast address, while an ARP response is sent to a unicast address. Because the physical addresses are not encrypted by WEP, it is easy to distinguish between an encrypted ARP request and response. By XORing a captured ARP packet with these fixed patterns, we can recover the first 16 bytes of the key stream. The corresponding IV is transmitted in clear with the packet. To speed up key stream recovery, it is possible to re-inject a captured ARP request into the network, which will trigger another reply. The destination answers the request with a new response packet that we can add to our list of key streams. If the initiator and the destination of the original request have been both wireless stations, every re-injected packet will generate three new packets, because the transmission will be relayed by the access point. Because ARP replies expire quickly, it usually takes only a few seconds or minutes until an attacker can capture an ARP request and start re-injecting it. The first public implementation of a practical re-injection attack was in the BSD-Airtools package [6]. It is even possible to speed up the time it takes to capture the first ARP request. A de-authenticate message can be sent to a client in the network, telling him that he has lost contact with the base station. In some configurations we saw clients rejoining the network automatically and at the same time flushing their ARP cache. The next IP packet sent by this client will cause an ARP request to look up the Ethernet address of the destination.
Breaking 104 Bit WEP in Less Than 60 Seconds
6
195
Our Attack on WEP
The basic attack is straightforward. We use the methods described in Section 5 to generate a sufficient amount of key stream under different IVs. Initially we assume that a 104 bit WEP key was used. For every σi from σ0 to σ12 , and every recovered key stream, we calculate Ai as described in equation 5 and call the result a vote for σi having the value Ai . We keep track of those votes in separate tables for each σi . Having processed all available key streams, we assume that the correct value for every σi is the one with the most votes received. The correct key is simply Rk[0] = σ0 for the first key byte and Rk[i] = σi − σi−1 for all other key bytes. If the correct key was a 40 bit key instead of a 104 bit key, the correct key is just calculated from σ0 to σ4 . 6.1
Key Ranking
If only a low number of samples is available, the correct value for σi is not always the most voted one in the table, but tends to be one of the most voted. Figure 3 contains an example in which the correct value has the second most votes after 35.000 sessions. Instead of collecting more samples, we use another method for finding the correct key. Checking if a key is the correct key is simple, because we have collected a lot of key streams with their corresponding IV. We can just generate a key stream using an IV and a guessed key, and compare it with the collected one. If the method used for key stream recovery did not always guess the key stream right, the correct value just needs to match a certain fraction of some key streams. For every key byte K[i], we define a set Mi of possible values σi might have. At the beginning, Mi is only initialized with the top voted value for σi from the table. Until the correct key is found, we look for an entry σ˜i ∈ / Mi in all tables having a minimum distance to the top voted entry in table i. We then add σ˜i to Mi and test all keys which can now be constructed from the sets M that have not been tested previously. 6.2
Handling Strong Keys
For equation 5 we assumed S3 to be an approximation of S3+i . This assumption is wrong for a fraction of the key space. We call these keys strong keys. For these keys, the value for ji+3 with high probability is taken by j in a iteration before i + 3 and after 3. This results in S[ji+3 ] being swapped with an unknown value, depending on the previous key bytes and the IV. In iteration i + 3, this value instead of S3 [ji+3 ] is now swapped with S[i]. More formally, let Rk be a key and Rk[i] a key byte of Rk. Rk[i] is a strong key byte, if there is an integer l ∈ {1, . . . , i} where i
k=l
(Rk[k] + 3 + k) ≡n 0
(9)
196
E. Tews, R.-P. Weinmann, and A. Pyshkin
Fig. 3. Distribution of votes for a strong and a normal key byte
A key Rk is a strong key, if at least one of its key bytes is a strong key byte. On the contrary, key bytes that are not strong key bytes are called normal key bytes and keys in which not a single strong key byte occurs are called normal keys. Assuming that S is still the identity permutation, the value 0 will be added to jl+3 from iteration l + 3 to i + 3, making ji+3 taking his previous value jl+3 . This results in the value qi of equation 7 being close to 0 and Prob(σi = Ai ) is very close to n1 . Figure 3 shows the distribution of votes for a strong and a non strong key byte after 35.000 and 300.000 samples. It is easy to see that the correct value for strong key byte has not received the most votes of all key bytes any longer. An alternative way must be used to determine the correct value for this key byte. Our approach can be divided into two steps: 1. Find out which key bytes are strong key bytes: If a key byte Rk[i] is a normal key byte, the correct value for σi should appear with probability ≈ pcorrecti (see equation 8). We assume that all (1−pcorrecti ) other values are equidistributed with probability pwrongi = . If n−1 Rk[i] is a strong key byte we assume that all values are equidistributed with probability pequal = n1 . Let Nib the fraction of votes for which σi = b holds. We calculate errstrongi =
n
0003 00042 Nij − pequal
(10)
j=0
errnormali
0005 00062 = max(Nib ) − pcorrecti + b
n
j=0,j0003=argmaxb (Nib )
0003 00042 Nij − pwrongi
(11)
Breaking 104 Bit WEP in Less Than 60 Seconds
197
If enough samples are available, this can be used as a test if errstrongi is smaller than errnormali . If that is the case, it is highly likely that key byte Rk[i] is a strong key byte. If only a small number of samples are available, errstrongi − errnormali can be used as an indicator for the likelihood of key byte Rk[i] being a strong key byte. 2. Find the correct values for these key bytes: Assuming that Rk[i] is a strong key byte and all values for Rk[0] . . . Rk[i − 1] are known, we can use equation 9 and get the following equation for Rk[i]: Rk[i] ≡n −3 − i −
i−1
(Rk[j] + 3 + j)
(12)
j=l
Because there at most i possible values for l, we can try every possible value for l and restrict Rk[i] to at most i possible values (12 if Rk[i] is the last key byte for a 104 bit key). This method can be combined with the key ranking method as described in Section 6.1. Instead of taking possible values for σi from the top voted value in the table for key byte i, we ignore the table and use the values calculated with equation 12 for Rk[i] and assume that σi−1 + Rk[i] was top voted in the table. Possible values for σi for all assumed to be normal key bytes are still taken from the top voted values in their tables. This approach is only feasible if the number of strong bytes in the key is small. For a 104 bit key all 12 key bytes can be strong key bytes. For such a key, we need to test up to 12! ≈ 228.8 different keys, which is possible, but slows down our attack. Because Klein’s attack as described in Section 3.1 is not hindered by strong keys, we suggest to additionally implement this attack without key ranking, to be able to attack even keys with the maximum number of strong key bytes in a reasonable time. 6.3
A Passive Version
For the attack in the previous section we assumed that an attacker obtains a sufficient amount of encrypted ARP packets by actively injecting ARP requests into the network. This attack strategy however might be detected by an intrusion detecting system (IDS). In this section we present a passive version of the attack. As we have seen, an ARP packet can be detected by its destination address and size. For our attack we will assume every packet which is not an ARP packet to be an IPv4 packet. This is expected to be true with a high probability for most networks in use today. Figure 4 shows the first 15 bytes of an IPv4 packet [14]. There are three fields in the first 7 bytes of the the IPv4 header which do not contain fixed values. These are: Total Length. This is the total length of the packet, starting from the IPv4 header. We can calculate this value from the length of the 802.11 frame, which can be observed.
198
E. Tews, R.-P. Weinmann, and A. Pyshkin
Fig. 4. First 15 bytes of a 802.11 frame containing an IPv4 packet
Identification. This field is used to keep track of packets for the reassembly of fragments and must be assumed to be random. Flags and Fragment Offset. This byte contains the most significant bits of the fragment offset and two control flags. Because IPv4 packets are usually short, the most significant bits of the fragment offset can be assumed to be zero, even if the packet was fragmented. After having analyzed traffic from various sources, we found that about 85% of all packets were sent with the don’t fragment flag set and about 15% had all flags cleared. In total, this means that we cannot guess two bytes at all, and one byte can be restricted to two possible values with high probability. We can now modify our attack as follows. The values for σ10 and σ11 are not determined by statistical tests, because we cannot recover X[12] and X[13]. Instead, we just iterate over all 216 possible values when it comes to determine these values. To prevent a slowdown of our attack by a factor of 216 , we do less key ranking. To determine the value of σ12 , we now introduce the concept of a “weighted vote”. Until now, we counted the output of the function Ai as a vote for σi having a specific value. We now calculate A12 two times, once under the assumption that X[12] = 64 (which is equivalent to don’t fragment flag set), and count it as a 220 256 vote for σ12 , and once with the assumption that X[12] = 0 (which is 36 vote for σ12 . equivalent to no flags set), and count it as a 256 Of course, the exact success rate of this passive attack heavily depends on the nature of the traffic captured, but some simulations have shown that it works with reasonable reliability and just needs some more captured packets than the active version just using ARP packets. Most of the work on the passive attack has been done by Martin Beck who also integrated our attack in the aircrack-ng toolsuite [16]. 6.4
Breaking Implementations Using Larger WEP Keys
Some vendors implemented WEP using root keys longer than 104 bit. We have seen implementations using up to 232 bit key length for the root key. Attacking such networks is not as easy as attacking a 104 bit key, because only the first 16 bytes of cleartext of an ARP packet are known, but 31 bytes of the cleartext would be needed. The missing 15 bytes are constant for every request or response, if only a single ARP packet was used for injection. Using the chopchop attack invented by KoreK [8] or the fragmentation attack published by Bittau, Handley and Lackey [1] allows us do decrypt the request
Breaking 104 Bit WEP in Less Than 60 Seconds
199
packet we used and one of the response packets. The decrypted packets can then be used for the ARP injection attack. Because ARP responses contain packet-unique values, we can assume that all other responses contain exactly the same plaintext. This allows us to recover enough plaintext for breaking even implementations with root keys longer than 232 bit.
7
Experimental Results
We wrote an implementation using the parallelized computation as described in Section 4 and the key ranking methods described in Section 6.1 and 6.2. At the beginning an upper bound on the number of keys to be tested is fixed (key limit). A limit of 1,000,000 keys seems a reasonable choice for a modern PC or laptop. Three different scenarios of attacks cover the most likely keys: scenario 1 tests 70% of the limit and uses key ranking as described in Section 6.1. As long as the set of possible keys does not exceed 70% of the key limit, a new value v ∈ / Mi is added to a set Mi . The value v is chosen to have minimal distance to the top voted entry. scenario 2 tests 20% of the limit and uses the key ranking method in combination with strong byte detection. We use the difference errstrongi − errnormali to determine which key byte is the most likely to be strong. We then use equation 12 to get possible values for this key byte. As long as the number of possible keys does not exceed 20% of the key limit all other key bytes are determined as in scenario 1. scenario 3 tests 10% of the limit and works like scenario 2, with the exception that 2 key bytes are assumed to be strong. To verify a possible key Rk for correctness, 10 sessions (IVi , Xi ) are chosen at random. If the key stream generated using IVi Rk is identical to Xi in the first 6 bytes for all sessions, we assume the key to be correct. If all three scenarios where unsuccessful, the attack is retried using just 10% of the key limit, this time under the assumption that the root key is a 40 bit key instead of a 104 bit one. Figure 5 shows the result from a simulation, showing that a 50% success rate is possible using just 40,000 packets. To test whether this attack works in a real world scenario we used the aircrackng tool suite [16]. aircrack-ng contains aireplay-ng, a tool which is able to capture and replay 802.11 frames; for example, frames containing ARP requests or other kinds of packets that cause traffic on the target network. Additionally, airodumpng can be used to capture and save the generated traffic including all needed IEEE 802.11 headers. On a mixed IEEE 802.11 b/g network, consisting of cards with chipsets from Atheros, Broadcom, Intersil and Intel, we achieved a rate of 764 different captured packets per second, using aireplay-ng and a network card with a PrismGT chipset for injecting and an Atheros based card for capturing. This number might vary, depending on the chipsets and the quality of the signal. This allowed us to
200
E. Tews, R.-P. Weinmann, and A. Pyshkin
1 0.9 probability of success
0.8
keylimit 1,000,000 5% randomized keylimit 1,000,000 keylimit 5,000 linux iv keylimit 1,000,000
0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 20000
25000
30000
35000
40000
45000
50000
55000
60000
65000
70000
75000
80000
85000
number of keystreams collected
Fig. 5. Success rate
recover 40,492 key streams in 53 seconds. Additional 2 seconds can be added for deauthenticating one or all clients forcing them to send out new ARP requests. The final cryptographic computation requires 1 to 3 seconds of CPU-time, depending on the CPU being used. For a 104 bit key we were able to consistently and successfully recover keys from a packet capture in less than 3 seconds on a Thinkpad T41p (1.7 GHz Pentium-M CPU) – this includes reading and parsing the dump file, recovering key streams and performing the actual attack. On more recent multi-core CPUs we expect this figure can be brought down to less than a second with a parallelized key space search. This results in 54 to 58 seconds to crack a 104 bit WEP key, with probability 50%. Main memory requirements for an efficient implementation are quite low. Our implementation needs less than 3 MB of main memory for data structures. Most of the memory is consumed by a bit field for finding duplicate IVs, this is 2 MB in size. CPU-time scales almost linearly with the number of keys tested. By reducing the number of keys tested to 5,000, this attack is suitable for PDA or embedded system usage too, by only reducing its success probability a little bit. The success rate with this reduced key limit is included in Figure 5 with label keylimit 5,000. 7.1
Robustness of the Attack
The key stream recovery method we used might not always be correct. For example, any kind of short packet (TCP, UDP, ICMP) might be identified as an ARP reply resulting in an incorrect key stream. To find out how our attack performs with some incorrect values in key streams, we ran a simulation with 5% of all key streams replaced with random values. The result is included in Figure 5, labeled ”5% randomized key limit 1,000,000”. Depending on the number of packets collected, the success rate is slightly reduced by less than 10%. If enough packets are available, there is no visible difference. In all previous simulations, we assumed that IVs are generated independently, using any kind of pseudo random function. Some drivers in fact do use an PRNG to generate the IV value for each packet, however others use a counter with some modifications. For example the 802.11 stack in the Linux 2.6.20 kernel uses an
Breaking 104 Bit WEP in Less Than 60 Seconds
201
counter, which additionally skips IVs which where used for an earlier attack on RC4 by Fuller, Mantin and Shamir which became known as FMS-weak-IVs. Using this modified IV generation scheme, the success rate of our attack (label linux iv keylimit 1,000,000 ) was slightly reduced by less than 5%, depending on the number of packets available. As before, there are no noticeable differences, if a high number of packets are available.
8
Related and Further Work
After we made a draft of this paper available on the IACR’s Cryptology ePrint Archive in early April 2007, other researchers published similar analyses. Subhamoy Maitra and Goutam Paul gave an independent analysis [10] of Klein’s attack with results similar to our multiple key bytes extension. Additionally, they found new correlations in RC4 independent of Klein’s analysis. Vaudenay and Vuagnoux presented a similar attack at SAC2007 [17], which additionally makes use of the fact that the RC4 key is stretched to 256 bytes by repeating it. The same trick was discovered by Ohigashi, Ozasa, Fujikawa, Kuwadako and Morii [12], who developed an improved version of our attack. Vaudenay and Vuagnoux additionally make use of a modified FMS attack, to improve their results. Still ongoing research of Ohigashi, Ozasa, Fujikawa, Kuwadako and Morii is expected to halve the number of packets needed for similar success rates as ours [11].
9
Conclusion
We have extended Klein’s attack on RC4 and have applied it to the WEP protocol. Our extension consists in showing how to determine key bytes independently of each other and allows us to dramatically decrease the time complexity of a brute force attack on the remaining key bytes. We have carefully analyzed cases in which a straightforward extension of Klein’s attack will fail and have shown how to deal with these situations. The number of packets needed for our attack is so low that opportunistic attacks on this security protocol will be most probable. Although it has been known to be insecure and has been broken by a key-recovery attack for almost 6 years, WEP is still seeing widespread use at the time of writing this paper. While arguably still providing a weak deterrent against casual attackers in the past, the attack presented in this paper greatly improves the ease with which the security measure can be broken and will likely define a watershed moment in the arena of practical attacks on wireless networks.
References 1. Bittau, A., Handley, M., Lackey, J.: The final nail in WEP’s coffin. In: IEEE Symposium on Security and Privacy, pp. 386–400. IEEE Computer Society Press, Los Alamitos (2006) 2. Borisov, N., Goldberg, I., Wagner, D.: Intercepting mobile communications: the insecurity of 802.11. In: ACM MobiCom 2001, pp. 180–189. ACM Press, New York (2001)
202
E. Tews, R.-P. Weinmann, and A. Pyshkin
3. Chaabouni, R.: Break WEP faster with statistical analysis. Technical report, EPFL, LASEC (June 2006), http://lasecwww.epfl.ch/pub/lasec/doc/cha06.pdf 4. D¨ orh¨ ofer, S.: Empirische Untersuchungen zur WLAN-Sicherheit mittels Wardriving. Diplomarbeit, RWTH Aachen (September 2006) (in German) 5. Fluhrer, S.R., Mantin, I., Shamir, A.: Weaknesses in the key scheduling algorithm of RC4. In: Vaudenay, S., Youssef, A.M. (eds.) SAC 2001. LNCS, vol. 2259, pp. 1–24. Springer, Heidelberg (2001) 6. Hulton, D. (h1kari).: bsd-airtools, http://www.dachb0den.com/projects/bsd-airtools.html 7. Klein, A.: Attacks on the RC4 stream cipher. Designs, Codes and Cryptography (submitted, 2007) 8. KoreK. chopchop (experimental WEP attacks) (2004), http://www.netstumbler.org/showthread.php?t=12489 9. KoreK. Next generation of WEP attacks (2004), http://www.netstumbler.org/showpost.php?p=93942&postcount=35 10. Maitra, S., Paul, G.: Many keystream bytes of RC4 leak secret key information. Cryptology ePrint Archive, Report2007/261(2007), http://eprint.iacr.org/ 11. Ohigashi, T., Kuwakado, H., Morii, M.: A key recovery attack on WEP with less packets (2007) 12. Ozasa, Y., Fujikawa, Y., Ohigashi, T., Kuwakado, H., Morii, M.: A study on the Tews, Weinmann, Pyshkin attack against WEP. In: IEICE Tech. Rep., Hokkaido, July 2007. ISEC2007-47, vol. 107, pp. 17–21 (2007) Thu, Jul 19, 2007 - Fri, Jul 20 : Future University-Hakodate (ISEC, SITE, IPSJ-CSEC) 13. Plummer, D.C.: RFC 826: Ethernet Address Resolution Protocol: Or converting network protocol addresses to 48.bit Ethernet address for transmission on Ethernet hardware (November 1982) 14. Postel, J.: Internet Protocol. Request for Comments (Standard) 791, Internet Engineering Task Force (September 1981) 15. Stubblefield, A., Ioannidis, J., Rubin, A.D.: A key recovery attack on the 802.11b wired equivalent privacy protocol (WEP). ACM Transactions on Information and System Security 7(2), 319–332 (2004) 16. The Aircrack-NG team. Aircrack-ng suite (2007), http://www.aircrack-ng.org 17. Vaudenay, S., Vuagnoux, M.: Passive-only key recovery attacks on RC4. In: Selected Areas in Cryptography 2007. LNCS, Springer, Heidelberg (to appear, 2007) 18. Wi-Fi Alliance. Wi-Fi Protected Acccess (WPA) (2003), http://www.wi-fi.org
Efficient Implementation of the Pairing on Mobilephones Using BREW Motoi Yoshitomi1 , Tsuyoshi Takagi1 , Shinsaku Kiyomoto2, and Toshiaki Tanaka2 1
Future University - Hakodate, School of System Information Science 116-2, Kamedanakano-cho, Hakodate, 041-0806, Japan 2 KDDI R&D Laboratories Inc. 2-1-15, Ohara, Fujimino, Saitama, 356-8502, Japan
Abstract. Pairing based cryptography can accomplish novel security applications such as ID-based cryptosystems, which have not been constructed efficiently without the pairing. The processing speed of the pairing based cryptography is relatively slow compared with the other conventional public key cryptography. However, several efficient algorithms for computing the pairing have been proposed, namely Duursma-Lee algorithm and its variant ηT pairing. In this paper, we present an efficient implementation of the pairing over some mobilephones, and examine the feasibility of the pairing based cryptosystems on ubiquitous devices. Indeed the processing speed of our implementation in ARM9 processors on BREW achieves under 100 milliseconds using the supersingular curve over F397 . It has become fast enough for implementing security applications using the pairing on mobilephones. Keywords: Pairing Based Cryptosystem, Mobilephone, BREW, Efficient Implementation.
1
Introduction
Tate pairing can realize novel cryptographic applications e.g. short signature [5], ID-based cryptography [4], which have not been achieved by conventional public key cryptosystems. Short signature is a digital signature suitable for applications on memory-constrained devices because the signature length in Tate pairing becomes about a half of that in elliptic curve cryptography. ID-based cryptosystems can replace the public key with an E-mail address or an IP address which can be easily memorized. However, the processing speed is slower than that of other public key cryptography. The timing in paper [2] shows that the processing speed of pairing-based cryptography is about 5 times or more slower than that of RSA cryptography or elliptic curve cryptography. Recently, Duursma and Lee proposed a very efficient algorithm for computing the pairing over supersingular curves [6]. Barreto et al. then presented ηT pairing which is about twice faster than Duursma-Lee algorithm [1]. The two algorithms can compute the pairing relatively efficient. S. Kim, M. Yung, and H.-W. Lee (Eds.): WISA 2007, LNCS 4867, pp. 203–214, 2007. c Springer-Verlag Berlin Heidelberg 2007 0002
204
M. Yoshitomi et al.
By the recent progress of devices technology, we are able to implement pairing based cryptosystems on ubiquitous devices that the processing speed is comparatively slow. Especially, it is important to implement and evaluate of a pairing based cryptography on mobilephones which are the most familiar as ubiquitous devices. There are two kinds of platforms for mobilephones, JAVA and BREW. An implementation of the pairing with mobilephones for JAVA was already reported [11]. In this paper, we report about an efficient implementation of the pairing by mobilephones in ARM9 processors on BREW. Indeed we implement both Duursma-Lee algorithm and ηT pairing over finite fields F3m for extension degrees m = 97, 167, 193, 239, 313. We try to improve the efficiency of pairing by the following procedures (mainly with engineering efforts). At first, functions of the finite field and the pairing are implemented from scratch in C language. Then, we investigate the functions which requires a lot of the processing time by using a profiling tool. It turns out that the multiplications in the finite field are about 80% of the whole computation of the pairing programs, and thus we focus on enhancing the speed of multiplication in F3m . Note that a substitution operation in F3m on ARM9 processors is relatively slow to our experiments. Comb method can perform the polynomial multiplication fast because there are few substitution operations than normal multiplication algorithm [10]. In this paper, we propose the improved Comb method which reduces the number of substitution operations by eliminating the first loop of the Comb method. The improved Comb method enhances the whole speed of pairing by about 20%. In addition, the unrollment of the loop in the programs further improves the speed due to the reducement of pipeline hazard. Finally, we examine the speeds of Duursma-Lee algorithm and ηT pairing using the above improvements. The processing speed of our implementation in ARM9 processors on BREW achieves under 100 milliseconds using the supersingular curve over F397 . Therefore the pairing on BREW mobilephones has become practical enough.
2
Algorithms for Implementing the Pairing
We explain some efficient algorithms for computing in finite fields and Tate pairing. 2.1
The Elements Representation in F3m
Let F3 [x] be the polynomials with coefficients over field F3 = {0, 1, 2}. Finite field F3m is the set of all polynomials represented by F3m = F3 [x]/f (x), where f (x) is an irreducible polynomial. An element A(x) in F3m can be represented as follows: A(x) = (am−1 , am−2 , · · · , a1 , a0 ), ai ∈ F3 , where ai is an element in F3 for i = 0, 1, · · · , m− 1. Note that F3 has the three values 0,1 and 2. Therefore, we represent an element in F3 as (hi,lo)-bit, where hi and
Efficient Implementation of the Pairing on Mobilephones Using BREW
205
lo are binary bits. Let W be the word size of a targeted computer, and let A[i] be a sequence of (hi, lo)-bits with size of W , where i is a positive integer including zero. Let t = 0003m/W 0004. Then A(x) can be represented by the right-to-left array A[t − 1], · · · , A[1], A[0]. In this case of W = 32 and m = 97, A(x) in F397 is represented by the array such that A[3] = (0, 0, · · · , 0, a96 ), A[2] = (a95 , a94 , · · · , a65 , a64 ), A[1] = (a63 , a62 , · · · , a33 , a32 ), A[0] = (a31 , a30 · · · , a1 , a0 ). Finally, let A[i]k be the k-th element of A[i] (for example, A[1]0 is a32 ). 2.2
Arithmetic in Finite Field F3m
Addition/Subtraction A(x) ± B(x) in F3m can be efficiently implemented by logical operations such as AND, OR and XOR [7]. Shift-and-Add method [10] is well known as a standard algorithm for computing the polynomial multiplication over finite fields. This method is an algorithm which shifts A(x) from right to left, and performs addition C(x) ← A(x) ± B(x) based on each (hi, lo)-bit of B(x), where A(x), B(x) and C(x) are elements in F3m . Therefore Shift-and-Add method requires 23 m2 additions a + b for a, b ∈ F3 and 23 m2 substitution operations c ← a + b for the resulting addition. In this paper the leftarrow (←) is called the substitution operation. Comb method [10] is another efficient algorithm for computing the polynomial multiplication, which shifts A(x) from right to left only after performing t additions in F3m based on each (hi, lo)-bit of the array B[t − 1], · · · , B[1], B[0]. A(x) · xiW +k is computed by shifting A(x) for k ∈ [0, W − 1], j ∈ [0, t − 1] and the word size W . Therefore the number of the substitution operations in Comb method becomes 0003m/W 0004 times smaller than that of Shift-and-Add method, namely it is 1/t. In our implementation in Section 3, we use the Comb method because the substitution operations on ARM9 processors are inefficient to our experiment. 0002m−1 In characteristic 3, cube can be computed as A(x)3 = i=0 ai x3i . Cube can be computed analogously to square in characteristic 2 by using table look-up [7]. Therefore, this process is fast because it does not require the multiplication step actually. Inversion can be computed by using the extended Euclidean algorithm for the polynomials over F3 [x] [11]. This algorithm was improved extended Euclidean algorithm in characteristic 2. Note that each operation in F3m has the following relationship of the cost: A < C < M < I, where A, C, M and I are the cost for addition/subtraction, cube, multiplication and inversion, respectively. In our implementation in F397 , the difference of the costs in each inequality sign (<) is about 10 times (see for Section 3.4). 2.3
Arithmetic in Extended Field F33m and F36m
We can represent F33m and F36m by a tower of extensions to use the operations of F3m . Let F33m = F3m [ρ]/(ρ3 − ρ − 1) and F36m = F33m [σ]/(σ 2 + 1). An element A in F36m is represented as follows:
206
M. Yoshitomi et al.
A = α1 σ + α0
(1)
= a5 σρ2 + a4 σρ + a3 σ + a2 ρ2 + a1 ρ + a0
(2)
= (a5 , a4 , a3 , a2 , a1 , a0 )
(3)
where αi is an element in F33m , aj be an element in F3m and i = 0, 1, j = 0, 1, · · · , 5. Operations in F33m and F36m are addition, subtraction, cube, multiplication and inversion, and the operations require operations in F3m . The operations in F33m and F36m are computed the same way in [8]. We show the costs of operations in F33m and F36m in Table 1. The operations cost of addition/subtraction and cube in F36m is just 2 times of that in F33m . Meanwhile the operation cost of multiplication and inversion in F36m is more than 2 times of that in F33m . Table 1. The costs of operations in F33m and F36m Operations F33m F36m Addition / Subtraction 3A 6A Multiplication 12A + 6M 51A + 18M Cube 3A + 3C 6A + 6C Inversion 6A + 15M + 1I 57A + 38M + 1I
2.4
Tate Pairing
Tate pairing requires operations of elliptic curve over finite fields. In this paper, we use the following supersingular elliptic curve over F3m , E(F3m ) = {(x, y) ∈ (F3m )2 y 2 = x3 − x + 1} ∪ {O} where O is the point at infinity. The group order 0006E of E(F3m ) is 0006E = 3m + 3(m+1)/2 + 1. Let r be a prime number which satisfies with r 0006E and r (36m − 1). Tate pairing is defined as follows: e0007·, · : E(F3m )[r] × E(F36m )[r] → F∗36m /(F∗36m )r where E(F3m )[r] is the subgroup of order r in E(F3m ). Point in E(F36m ) is generated from point in E(F3m ) by using distortion map φ(x, y) = (−x + ρ, yσ). The pairing e0007P, Q satisfies bilinearity e0007aP, Q = e0007P, aQ = e0007P, Qa , where P, Q are points in E(F3m )[r] and a is an integer. Miller proposed the first polynomial time algorithm for computing Tate pairing [14]. Duursma and Lee proposed an efficient algorithm using supersingular elliptic curve over finite fields in characteristic 3 [6]. Kwon proposed an improved algorithm of Duursma-Lee algorithm which requires no cube root. We present Duursma-Lee algorithm without cube root in Algorithm 1. Duursma-Lee algorithm has the step which is called the final exponentiation. 3m The step is necessary to compute T (3 −1) , where T = τ1 σ +τ0 is the representation in equation (2). One final exponentiation usually requires 3m multiplications and 1 inversion in F36m . However it can be computed by 1 multiplication and 3m 1 inversion due to T (3 −1) = (−τ1 σ + t0 )(τ1 σ + τ0 )−1 [12].
Efficient Implementation of the Pairing on Mobilephones Using BREW
207
Algorithm 1. Duursma-Lee algorithm [13] INPUT: P = (xp , yp ), Q = (xq , yq ) ∈ E(F3m )[r] OUTPUT: e0003P, Q0004 ∈ F36m 1: initialization: T ← 1 (in F36m ) a ← xp , b ← yp , x ← xq3 , y ← yq3 (in F3m ) d ← 1 (in F3 ) 2: for i ← 0 to m − 1 do 3: a ← a9 , b ← b9 (in F3m ) 4: c ← a + x + d (in F3m ) 5: R ← −byσ − ρ2 − cρ − c2 (in F36m ) 6: T ← T 3 (in F36m ) 7: T ← T R (in F36m ) 8: y ← −y (in F3m ) 9: d ← d − 1 (in F3 ) 10: end for 11: final exponentiation: T ← Algorithm 2 (T ) 12: return T
Algorithm 2. Final exponentiation (Duursma-Lee algorithm) [12] INPUT: T = τ1 σ + τ0 ∈ F36m 3m OUTPUT: T (3 −1) ∈ F36m −1 U ←T (in F36m ) τ1 ← −τ1 (in F33m ) T ← UT (in F36m ) return T
2.5
ηT Pairing
ηT pairing can reduce the cost of Duursma-Lee algorithm to the half by using Frobenius map [1]. An improved algorithm without cube root was also proposed in ηT pairing [3]. We show ηT pairing without cube root in Algorithm 3. ηT pairing need a final exponentiation step to compute T S , where S = (33m − 1)(3m + 1)(3m − 3(m+1)/2 + 1). Because S is a large value, it takes much time to compute the final exponentiation compared with Duursma-Lee algorithm. However, an efficient algorithm which uses the torus T2 for F∗36m was proposed, where T2 is defined T2 (F33m ) = {A0 +A1 σ ∈ F∗36m : A02 +A12 = 1} [15]. This algorithm for computing T S is shown in Algorithm 5. The output of Duursma-Lee algorithm and ηT pairing relates as follows: (m+1)/2
(ηT 0007P, QS )3(3
+1)2
(m+3)/2
= e0007P, Q−3
,
and e0007P, Q can compute from U = ηT 0007P, QS as follows [3]: 0005−1 0003 0004 (m+1)/2 3m e0007P, Q = U 2 · U 3 · U (m−1)/2 .
208
M. Yoshitomi et al.
Algorithm 3. ηT pairing [3] INPUT: P = (xp , yp ), Q = (xq , yq ) ∈ E(F3m )[r], S = (33m − 1)(3m + 1)(3m − 3(m+1)/2 + 1) (m+1)/2 OUTPUT: (ηT 0003P, Q0004S )3 ∈ F36m 1: initialization: a ← xp , b ← −yp , x ← xq , y ← yq (in F3m ) d ← 1 (in F3 ) c ← a + x + d (in F3m ) T ← yσ + bρ − bc (in F36m ) 2: for i ← 0 to (m − 1)/2 do 3: c ← a + x + d (in F3m ) 4: R ← byσ − ρ2 − cρ − c2 (in F36m ) 5: T ← T R (in F36m ) 6: T ← T 3 (in F36m ) 7: b ← −b (in F3m ) 8: x ← x9 , y ← y 9 (in F3m ) 9: d ← d − 1 (in F3 ) 10: end for 11: final exponentiation: T ← Algorithm 5 (T ) 12: return T Table 2. Computation costs of Duursma-Lee algorithm and ηT pairing in F397 Duursma-Lee algorithm (Alg. 1) 4635A + 972C + 1511M + 1I (final exponentiation (Alg. 2)) (108A + 56M + 1I) ηT pairing (Alg. 3) 2785A + 784C + 871M + 1I (final exponentiation (Alg. 5)) (496A + 294C + 86M + 1I)
≈ 1664M (≈ 67M) ≈ 987M (≈ 130M)
Here we estimate the computational costs of Duursma-Lee algorithm and ηT pairing. The extension degree of the underlying finite fields is usually chosen as m = 97, 167, 193, 239, 313 [1,2,3,8,11,12]. The computational cost with m = 97 is shown in Table 2. The third column is the estimated number of multiplications with A= 0.01M, C= 0.1M, I= 10M appeared in Section 3.4. Actually, the cost of ηT pairing is smaller than that of Duursma-Lee algorithm.
3
Implementation of the Pairing on BREW
In this section, we explain our efficient implementation of the pairing in mobilephones on BREW1 . 1
BREW is a registered trademark of Qualcomm company and it is an application platform developed for mobilephones of cdmaOne and cdma2000.
Efficient Implementation of the Pairing on Mobilephones Using BREW
3.1
209
Experimental Environment and Analysis of the Program
In this paper, we try to implement the pairing on ARM9 processors which is currently often used for mobilephones. BREW supports an emulator of ARM processors on a PC whose programs are written in C language. A source file in C is complied using an ARM compiler on BREW, and then an executable file (*.mod) of BREW applications for ARM processors is generated. Table 3. Timings by profiling the functions in F397 Duursma-Lee algorithm Time of Function % Hit Count 927.176 84.1 199900 83.649 7.6 97776 22.580 2.0 408606
Function FF Multi FF Cube FF Add
ηT pairing Time of Function % Hit Count 338.506 78.6 113200 53.508 12.4 165976 10.281 2.4 351406
Function FF Multi FF Cube FF Add
Here we are interested in the timing of the executable files on ARM processors. The same source code can be also compiled using a standard C compiler, and we can examine the timing of the compiled codes on a PC. We deploy a PC (AMD Opteron Processor 246 (2.0 GHz), RAM : 1 GByte) with GCC version 3.4.2 using the flags “-O2 -fomit-frame-pointer”, and mobilephones (150MHz ARM9 processor and 225MHz ARM9 processor) with an ARM complier using “-Otime” for optimizing the speed. In order to implement the pairing, we implement the functions of the finite field in Section 2.2 and the pairing in Sections 2.4-2.5 from scratch in C language. The extension degrees and their irreducible trinomial are chosen as m = 97, 167, 193, 239, 313 and x97 + x12 + 2, x167 + x96 + 2, x193 + x12 + 2, x239 + x24 + 2, x313 +x126 +2, respectively. The functions in our implementation are named as follows: FF Add (addition in F3m ), FF Multi (Comb method in F3m ), FF Cube (cube in F3m ) and so on. Then we examine the timing of the basic functions (FF Add, FF Multi, FF Cube, etc) by GCC using a profiling tool. The timings by profiling for Duursma-Lee algorithm and ηT pairing on F397 is shown in Table 3. In Table 3, the multiplication speeds of both Duursma-Lee algorithm and ηT pairing are about 80% in the whole program of the pairing. Accordingly, we try to optimize the speed of multiplication in F3m . 3.2
Optimized Multiplication for BREW
In the following we propose the improved Comb method, which reduces the number of the substitution operations by unrolling the first loop of Comb method.
210
M. Yoshitomi et al.
Here we explain the improved Comb method with extension degree m = 97, but it can be applicable to other extension degrees m = 167, 193, 237, 313. We now focus on B[3] which is one of the array representing B(x) ∈ F397 . Note that B[3] only contains the 96-th coefficient of B(x) as B[3]0 , namely, B[3] = (B[3]31 , B[3]30 , · · · , B[3]1 , B[3]0 ) = (0, 0, · · · , 0, b96 ). When we compute A(x) · B(x) using Comb method in Section 2.2, we have to perform the addition of A(x) · B[3] and substitution operations for a temporary save of the addition even B[3]j = 0 for j = 31, 30, · · · , 1. There are many substitution operations and additions which do not affect the result of A(x) · B(x). Those operations can be eliminated by unrolling the loop of computing B[3]. In other words, the loop corresponding to B[i]j with j > 0 is not computed for i = 3, and we only fulfill the complete loop for B[i]0 with i = 0, 1, 2, 3. As a result, the proposed scheme can save 31 substitution operations compared with Comb method, and we achieve about 20% faster multiplication. We show the improved Comb method in Algorithm 4. Algorithm 4. Improved Comb method m = 97 INPUT: A(x), B(x) ∈ F3m , W = 32 : word length OUTPUT: C(x) = A(x) · B(x) mod f (x) ∈ F3m 1: C(x) ← 0 2: for i ← 0 to 3 do 3: C(x) ← C(x) + B[i]0 A(x)xiW 4: end for 5: for j ← 1 to W do 6: for i ← 0 to 2 do 7: C(x) ← C(x) + B[i]j A(x)xiW +j 8: end for 9: end for 10: for i ← 2m − 2 downto m do 11: ci−85 ← ci−85 − ci 12: ci−97 ← ci−97 + ci 13: ci ← 0 14: end for 15: return C(x)
This algorithm has two steps, the polynomial multiplication step (line 1-9) and the reduction step (line 10-14), where the reduction step is the computation of c(x) mod f (x). The main loop is line 5-9, and process of loop unrolling is line 2-4. The difference of the proposed scheme from Comb method is line 6. The number of iteration in the loop of line 6 becomes one time shorter, and the omitted process is moved to line 2.
Efficient Implementation of the Pairing on Mobilephones Using BREW
3.3
211
Further Discussion on Speed-Up
We carry out the following two methods for speed-up of the pairing. In the one method, we perform effectively multiplication in F36m in the pairing algorithms. The multiplication T ·R in F36m is computed in line 7 of Algorithm 1 and line 5 of Algorithm 3. Kerins et al. pointed out the element R = r5 σρ2 + r4 σρ + r3 σ + r2 ρ2 + r1 ρ + r0 in F36m satisfies r4 = r5 = 0 and r2 = 2 [12]. The number of multiplication in F3m required for T · R is reduced because the multiplication with the constants (r2 , r4 and r5 ) is virtually for free. We can reduce the speed of the whole pairing about 10% by developing the optimized multiplication for T · R. In the other method, we unroll the loop used in the functions of F3m . The functions in F3m processes 32 coefficients depending on word length at a time. For example, the addition of m (hi, lo)-bits is constructed 0003m/W 0004 times of loop. By unrolling this loop, the count of a pipeline hazard in the target processor can be reduced and can be speeded up. Actually, the processing speed can be improved about 30% by unrolling the loop. Finally, we also implemented a window method in Algorithm 4. However the speed of the window method of width 2 was slower on the ARM9 processors (Note that it was faster on the Opteron processor). Therefore we do not use a window method in this paper. One of the reasons is that the precomputation table in a window method cannot store in the CPU cache. 3.4
Implementation Result
We show the average time of the operations in F3m and the pairing algorithms, Duursma-Lee algorithm and ηT pairing, on the ARM9 processors and on the Opteron processor in Table 4-6. We compute the average time for the pairing algorithm with random input at least 200 times on the ARM9 processors and at least 20,000 times on the Opteron processor. The optimized program at m = 97 is denoted by ”optF397 ”. This program uses the improved Comb method with unrolling the loop, and other programs use Comb method without it in F3m . Table 4. The average time of the operations in F3m and the pairing algorithms on the 150MHz ARM9 processor (msec) 150MHz ARM9 Addition (A) Subtraction (A) Cube (C) Multiplication (M) Inversion (I) Duursma-Lee algorithm ηT pairing
optF397 0.0006 0.0006 0.0070 0.0642 0.6915 98.96 56.50
F397 0.0009 0.0009 0.0067 0.0852 0.8360 129.19 76.68
F3167 0.0012 0.0012 0.0156 0.2055 1.8540 549.39 337.25
F3193 0.0014 0.0014 0.0189 0.2200 2.3765 701.18 401.27
F3239 0.0016 0.0016 0.0229 0.3410 3.4115 1303.07 738.23
F3313 0.0020 0.0019 0.0261 0.5308 5.8725 2616.63 1459.65
212
M. Yoshitomi et al.
Table 5. The average time of the operations in F3m and the pairing algorithms on the 225MHz ARM9 processor (msec) 225MHz ARM9 Addition (A) Subtraction (A) Cube (C) Multiplication (M) Inversion (I) Duursma-Lee algorithm ηT pairing
optF397 0.0004 0.0004 0.0050 0.0393 0.4590 66.00 37.52
F397 0.0005 0.0005 0.0051 0.0530 0.5825 84.70 50.34
F3167 0.0007 0.0007 0.0107 0.1448 1.6890 356.11 218.27
F3193 0.0008 0.0008 0.0133 0.2313 1.4480 457.93 261.88
F3239 0.0009 0.0010 0.0155 0.2200 2.3040 847.56 478.54
F3313 0.0012 0.0012 0.0162 0.3420 3.7280 1702.64 947.30
Table 6. The average time of the operations in F3m and the pairing algorithms on the 2.0 GHz Opteron processor (μ sec) 2.0 GHz Opteron Addition (A) Subtraction (A) Cube (C) Multiplication (M) Inversion (I) Duursma-Lee algorithm ηT pairing
optF397 0.0118 0.0118 0.1631 1.5955 16.6686 2,610 1,480
F397 0.0158 0.0160 0.1631 2.3438 19.7985 3,810 2,240
F3167 0.0219 0.0221 0.3062 5.1805 44.3534 13,980 8,540
F3193 0.0250 0.0250 0.3406 5.9468 62.2334 18,531 10,473
F3239 F3313 0.0280 0.0421 0.0281 0.0421 0.4151 0.4440 8.6590 14.7052 86.1984 137.4680 32,800 71,780 18,400 39,720
In optF397 and F397 , it turns out that the addition/subtraction (A) was about 0.01 times of the multiplication (M), the cube (C) was about 0.1 times and the inversion (I) was about 10 times. This estimation have little differences with the ARM9 processors and the Opteron processor. Based on the values, we can estimate the ratios of final exponentiation of Duursma-Lee algorithm and ηT pairing as 4.03% and 13.2%, respectively. The ratios required in F3m in the pairing computation become 2.78% addition/subtraction (A), 5.84% cube (C), 90.78% multiplication (M) and 0.60% inversion (I) in Duursma-Lee algorithm, and 2.82% addition/subtraction (A), 7.94% cube (C), 88.22% multiplication (M) and 1.01% inversion (I) in ηT pairing, respectively. In our programs except optF397 , when extension degree becomes about 2 times, the processing speed becomes about 5 times slower on both the ARM9 processors and the Opteron processor. The total size of executable files (*.mod) in optF397 was 36,528 Bytes. The size of other programs for each degree (m = 97, 167, 193, 239, 313) is smaller than that of optF397 . As an average size of executable files currently, 300 Kbytes or less is standard. Then, the size of our executable files becomes the size of around 10% in BREW applications. ηT pairing is more efficient than Duursma-Lee algorithm in ARM9 processors and an Opteron processor. The processing speed achieves 56.5 msec in the
Efficient Implementation of the Pairing on Mobilephones Using BREW
213
150MHz ARM9 processor and 37.52 msec in the 225MHz ARM9 processor for computing ηT pairing on the supersingular curve over F397 .
4
Conclusion
In this paper, we presented efficient implementation of Duursma-Lee algorithm and ηT pairing over F3m using BREW mobilephones. From our initial implementation in F397 , the whole timing required for the multiplication is about 80% in the computation of the pairings. We thus proposed improved Comb method, which is an optimized multiplication for BREW mobilephones, namely with fewer substitution operations. Moreover, we improved the multiplication T · R in F36m of the pairing algorithms and we performed loop unrolling in the finite field. As a result, the processing speed of our optimized pairing implementation using BREW on the 150MHz ARM9 and the 225MHz ARM9 achieved under 100 milliseconds. It has become efficient enough to implement security applications, such as short signature or ID-based cryptosystems, using the pairing on BREW mobilephones.
References 1. Barreto, P., Galbraith, S., O’hEigeartaigh, C., Scott, M.: Efficient Pairing Computation on Supersingular Abelian Varieties. Designs, Codes and Cryptography 42(3), 239–271 (2007) 2. Barreto, P., Kim, H., Lynn, B., Scott, M.: Efficient Algorithms for Pairing-Based Cryptosystems. In: Yung, M. (ed.) CRYPTO 2002. LNCS, vol. 2442, pp. 354–368. Springer, Heidelberg (2002) 3. Beuchat, J., Shirase, M., Takagi, T., Okamoto, E.: An Algorithm for the ηT Pairing Calculation in Characteristic Three and its Hardware Implementation. In: IEEE International Symposium on Computer Arithmetic, ARITH-18, pp. 97–104 (2007) 4. Boneh, D., Franklin, M.: Identity Based Encryption from the Weil Pairing. SIAM J. Comput. 32(3), 514–532 (2001) 5. Boneh, D., Lynn, B., Shacham, H.: Short Signatures from the Weil Pairing. In: Boyd, C. (ed.) ASIACRYPT 2001. LNCS, vol. 2248, pp. 514–532. Springer, Heidelberg (2001) 6. Duursma, I., Lee, H.: Tate Pairing Implementation for Hyperelliptic Curves y 2 = xp − x + d. In: Laih, C.-S. (ed.) ASIACRYPT 2003. LNCS, vol. 2894, pp. 111–123. Springer, Heidelberg (2003) 7. Galbraith, S., Harrison, K., Soldera, D.: Implementing the Tate Pairing. In: Fieker, C., Kohel, D.R. (eds.) Algorithmic Number Theory. LNCS, vol. 2369, pp. 324–337. Springer, Heidelberg (2002) 8. Granger, R., Page, D., Stam, M.: On Small Characteristic Algebraic Tori in PairingBased Cryptography. LMS Journal of Computation and Mathematics 9, 64–85 (2006) 9. Granger, R., Page, D., Stam, M.: Hardware and Software Normal Basis Arithmetic for Pairing-Based Cryptography in Characteristic Three. IEEE Transactions on Computers 54(7), 852–860 (2005)
214
M. Yoshitomi et al.
10. Hankerson, D., Menezes, A., Vanstone, S.: Guide to Elliptic Curve Cryptography. Springer, Heidelberg (2004) 11. Kawahara, Y., Takagi, T., Okamoto, E.: Efficient Implementation of Tate Pairing on a Mobile Phone using Java. In: CIS 2006. LNCS (LNAI), vol. 4456, pp. 396–405 (2007) 12. Kerins, T., Marnane, W., Popovici, E., Barreto, P.: Efficient Hardware for the Tate Pairing Calculation in Characteristic Three. In: Rao, J.R., Sunar, B. (eds.) CHES 2005. LNCS, vol. 3659, pp. 412–426. Springer, Heidelberg (2005) 13. Kwon, S.: Efficient Tate Pairing Computation for Supersingular Elliptic Curves over Binary Fields. Cryptology ePrint Archive, Report 2004/303 (2004) 14. Miller, V.: Short Programs for Functions on Curves (unpublished Manuscript) 15. Shirase, M., Takagi, T., Okamoto, E.: Some efficient algorithms for the final exponentiation of ηT pairing. In: ISPEC 2007. LNCS, vol. 4464, pp. 254–268 (2007)
A
Final Exponentiation for Pairing
In this appendix, we describe the algorithms used for the pairing in this paper. Algorithm 5. Final exponentiation (ηT pairing) [15] INPUT: K ∈ F36m , S = (33m − 1)(3m + 1)(3m − 3(m+1)/2 + 1) OUTPUT: K S ∈ F36m 3m m 1: K ← K 3 −1 , G ← Λ(K) = K 3 +1 m 2: K ← G, K ← Λ(K) = K 3 +1 3: for i ← 0 to (m − 1)/2 do 4: G ← G3 5: end for 6: g2 ← −g2 , g1 ← −g1 , g0 ← −g0 7: return K · G
Algorithm 6. Computation of Λ(K) [15] INPUT: K = (k5 , k4 , k3 , k2 , k1 , k0 ) ∈ F36m m OUTPUT: Λ(K) = K 3 +1 ∈ T2 (F33m ) 1: v0 ← k0 k2 , v1 ← k3 k5 , v2 ← k1 k2 , v3 ← k4 k5 , v4 ← (k0 + k3 )(k2 − k5 ) 2: v5 ← k3 k1 , v6 ← k0 k4 , v7 ← (k0 + k3 )(k1 + k4 ), v8 ← (k1 + k4 )(k2 − k5 ) 3: c0 ← 1 + v0 + v1 ∓ v2 ∓ v3 4: c1 ← v7 − v2 − v3 − v5 − v6 (m ≡ 1 mod 12) (m ≡ −1 mod 12) c1 ← v5 + v6 − v7 5: c2 ← v2 + v3 + v7 − v5 − v6 6: c3 ← v1 + v4 ± v5 − v0 ∓ v6 7: c4 ← v3 + v8 ± v0 − v2 ∓ v1 ∓ v4 8: c5 ← ±v3 ± v8 ∓ v2 9: return C = (c5 , c4 , c3 , c2 , c1 , c0 )
Security Analysis of MISTY1 Hidema Tanaka1 , Yasuo Hatano2 , Nobuyuki Sugio3 , and Toshinobu Kaneko4 1
National Institute of Information and Communications Technology, 4-2-1 Nukui-Kitamachi, Koganei, Tokyo, 184-8795, Japan 2 Systems Development Laboratory, Hitachi, Ltd. Yokohama Laboratory, 292, Yoshida-cho, Totsuka-ku, Yokohama, Kanagawa, 244-0817, Japan 3 Service & Solution Development Department, NTT DoCoMo, Inc. NTT DoCoMo R&D Center 3-5, Hikarinooka, Yokosuka, Kanagawa 239-8536, Japan 4 Tokyo University of Science 2641 Yamazaki Noda, 278-8510, Japan
Abstract. We analyze 64-bit block cipher MISTY1 from several standpoints. Our analysis consists of two algorithms based on the higher order differential property of the S-box. The first succeeds in attacking a six round MISTY1 provided 218.9 chosen plaintexts and 280.9 computational cost. The second succeeds in attacking a seven round MISTY1 with no FL functions by controlling the value of the fixed part of the plaintext and using a 2-round elimination method provided 211.9 chosen plaintexts and 2125.1 computational cost. Both algorithms exceeds the existing attack algorithms against MISTY1 and give new perspectives for the security of MISTY1. Keywords: MISTY1, Higher order differential, Chosen plaintext attack.
1
Introduction
MISTY1 is a 64-bit block cipher with a 128-bit secret key [6]. It has an eight round Feistel structure with FO functions and FL functions. Since its security is basically guaranteed by FO functions only, a three round MISTY1 with no FL functions is still provably secure against differential cryptanalysis and linear cryptanalysis. It has relatively high performance on both software environments and hardware platforms compared with other 64-bit block ciphers. Because of these desirable security properties and the high performance, MISTY1 is considered as one of the strongest 64-bit block ciphers and it has been widely evaluated by some international cryptographic technology evaluation projects such as NESSIE [8]. And it is considered as a strong candidate for several international standardization [9]. We also note that KASUMI, a variant of MISTY1, is the standard cipher within 3GPP systems [10]. Hence, it is significant for information security community to understand the precise security level of MISTY1. In this paper, we show a new development of security evaluation for MISTY1 by describing two attack algorithms using the higher order differential property S. Kim, M. Yung, and H.-W. Lee (Eds.): WISA 2007, LNCS 4867, pp. 215–226, 2007. c Springer-Verlag Berlin Heidelberg 2007 0002
216
H. Tanaka et al.
of S-box. By analyzing the key schedule and relation among sub-keys, we derive the condition for the secret key to be insecure. Under the condition, we can construct an attack algorithm against a six round MISTY1 that has FL functions in the last round. This algorithm needs 218.9 chosen plaintexts and 280.6 computational cost. Knudsen and Wagner have shown that a five round MISTY1 is attackable by integral cryptanalysis [2]. Their method needs 234 chosen plaintexts and 248 computational cost. MISTY1 has FL functions in the last round, however, Knudsen and Wagner study a five round MISTY1 that has no FL function in the last round. We study a six round MISTY1 that has FL functions in the last round. Therefore, our target is more realistic and harder to attack than the model in [2]. Next, we invent a technique that controls the value of the fixed part of the plaintext. Using this technique and the 2-round elimination method, we can construct an attack algorithm against a seven round MISTY1 with no FL functions. This uhn shows that a needs 211.9 chosen plaintexts and 2125.1 computational cost. K¨ six round structure of MISTY1 with no FL functions is attackable by impossible differential cryptanalysis [5]. His method needs 254 chosen plaintexts and 261 computational cost. So, we improve the number of attackable rounds from [5]. These results indicate that MISTY1’s simple key schedule and low order of S-boxes may be its weaknesses. P
FO function
?
FL1
FO1
FLb 2 -
h (( (h (h h -FO2- b h (( (h (h h
FL3
FL
· · FL9
· · FL10
-FO3- b 4 h h( h( (( h h (( (h (h h ? C
? b001bKOi1 FIi1 001bKIi1 b001b h (( (h (h h b001bKOi2 FIi2 001bKIi2 b001b hh (( ((h h b001bKOi3 FIi3 001bKIi3 b001b h h (( (h ( b001bKOi4h ?
FI function
?
S9
b001b h (( (h (h h
S7
b001b b001b ij1 b001bKIij2 hKI (( (h (h h
S9
FL function
?
? ∩ -b b001b ∪?
KLi1 KLi2
?
b001b h (( (h (h h ?
Fig. 1. MISTY1
Table 1. Key schedule Sub-key KOi1 KOi2 KOi3 KOi4 KIi1 KIi2 KIi3 KLi1 KLi2 Secret key Ki Ki+2 Ki+7 Ki+4 K’i+5 K’i+1 K’i+3 Ki (odd i) K’i+6 (odd i) sub-block K’i+1 (even i) Ki+3 (even i)
Security Analysis of MISTY1
Zi−2
Zi−1
- FO - ? b ? (i)
Zi
217
Zi0002
?
FL
?
Zi (ii)
Fig. 2. Variables
2 2.1
Preliminaries Notation
We explain briefly the notations in this paper. See Figure 1 for the structure of MISTY1. A plaintext P is expressed as follows : P = (PL PR )
0002
= (X15 , . . . , X8 X7 , . . . , X0 ), Xi ∈
GF(2)7 : i = even GF(2)9 : i = odd
(2.1)
MISTY1 has two types of S-box; S7 and S9 in the FI function. S7 is a 7-bit S-box whose order is 3. And S9 is a 9-bit S-Box whose order is 2. We summarize the key schedule in Table 1. The sub-keys are defined as follows. K = (K7 , . . . , K0 ), Ki ∈ GF(2)16 Ki0003 = FI(Ki ; Ki+1 )
(2.2)
We define the variables Zi and Zi0003 as follows (see Figure 2). When FL functions are operated after the FO function, Zi is rewritten as Zi0003 , and the output from 0003 the FL function is redefined as Zi (see Figure 2 (ii)). Therefore, Z−1 = PL and 0003 Z0 = PR . We denote the left-most m-bit of the sub-block Qi by QiLm , and the value of n-th order differential of the sub-block Qi by Δ(n) Qi , respectively. 2.2
Higher Order Differential Property
Babbage and Frisch analyzed the details of the relation between the linear and differential properties and the seventh order differential property [1]. Then they found an important property of the MISTY-structure and S-boxes. The property plays a crucial role in this paper. Property 1: When MISTY1 has no FL functions, if the most-right seven bits of the input of FOi consists of seventh order differential, the following holds for any value of the fixed part of the input or sub-keys. L7 = 0x6d Δ(7) Zi+1
(2.3)
For example, Δ(7) Z3L7 =0x6d holds for the variable sub-block X0 ∈ GF(2)7 . Our algorithms against MISTY1 exploit the property 1. In Section 3, we show
218
H. Tanaka et al.
that the condition of the secret key satisfies the equation (2.3) in the case of MISTY1 with FL functions as well. Our algorithm exploits the simple relation among sub-keys to make successful attack. In Section 4, we show an attack algorithm which exploits the property 1 as effective as possible. It focuses on the low cost of solving the attack equation because of the low order of S-box.
3
Weakness of Key Schedule
3.1
Basic Idea
In this section, we analyze MISTY1 with FL functions. The MISTY1’s compact key schedule contributes to high implementing performance. However, there are very simple relations among sub-keys produced by the key schedule. Therefore, the number of unknowns in an attack equation is small, and so the necessary complexity to solve it turns out to be small actually appeared. If the sub-key used by the AND operation contains only zeros and the sub-key used by the OR operation contains only ones, the output of the FL function is the EX-OR of its input and a constant. Thus, if we fix the sub-keys as follows, KL21 = KL31 = 0x0000,
KL22 = KL32 = 0xffff
(3.1)
the higher order differential property of Z30003 is same as the property of output from third round of MISTY1 with no FL functions. Thus Δ(7) Z30003L7 =0x6d holds from equation (2.3) (see Figure 3). The following condition for the secret key K is derived using the key schedule. K30003 = K2 = 0x0000,
K5 = K80003 = 0xffff
(3.2)
K80003 = FI(K8 ; K1 ).
(3.3)
where K30003 = FI(K3 ; K4 ),
Consequently, we need to fix the secret key sub-blocks K1 , K2 , K3 , K4 , K5 and K8 . However, we can show that the number of fixed secret key sub-blocks becomes small if we use an eighth order differential for the attack. In the case of fixed KL3 , from the formal analysis, the degree of the left-most 7 bits in the output of FO3 (Z30003 L7 ) is 9 (see left part of figure 4). However, equation (2.3) implies that Δ(7) Z30003 L7 =0x6d. In the case of NOT fixed KL3 , the formal degree of Z30003 L7 is estimated to be 8 (see right part of figure 4). Comparing these estimations, we conjecture that the degree of Z30003 L7 for NOT fixed KL3 is 7 and the value of its seventh order differential depends on the value of the fixed part of the plaintext and sub-keys. From the properties of higher order differentials, it is easy to see that if Δ(n) Z is constant, then Δ(n+1) Z is equal to 0. Since the attacker cannot specify the value of the seventh order differential of Z30003 L7 , we consider an attack using the
Security Analysis of MISTY1
? X0 PR KL21 =K’3 =0x0000 KL22 =K5 =0xffff
- FL? 2
PR +0xffff0000
?
-
FO2
KL31 =K’8 =0x0000 KL32 =K2 =0xffff
b001b - FL? 3 ? -b ?
? X0
FI21 [0 1]
PL
0001 +0xffff0000 Z2
FO3
S9
b001b h (( (h (h h
FI22 [3 3]
b001b h (( (h (h h
[1]
-
[2 2]
Δ(7) Z30002L7 =0x6d
b001b h (( (h (h h
S7 [3]
FI23
[3 3]
219
b001b h (( (h (h h
S9
b001b h (( (h (h h [3 3] ? [2 2]
[2] [3]
b001b h (( (h (h h [3] ? [3]
Fig. 3. Formal analysis [3 3]
? [2 2]
FI31 [9 12]
b001b h (( (h (h h
FI32
[6 8]
b001b h (( (h (h h
FI33
b001b h (( (h (h h [9 12] ? (7) 0002L7 Δ
Z3
=0x6d
[2 2]
?
S9 [4] b001b
h (( (h (h h
- S7 [6] b001b h (( (h (h h S9 001b [8] bh
(( (h (h h [6] [8] ?
[3 3]
?
S9 [6] b001b
[3 3]
001b
h (( (h (h h
S7 [9] b001b
h (( (h (h h
S9 001b [12]bh
(( (h (h h [9] [12] ? 6
? [2 2]
FI31 [9 12]
b001b hh (( (h ( h
FI32
[9 12]
b001b hh (( (h ( h
FI33 [8 11]
b001b hh (( (h ( h ? 0002L7
Δ(7) Z3
=unknown
Fig. 4. Formal analysis of FO3
eighth order differential. Moreover, we consider the case of NOT fixed KL21 . The plaintext sub-blocks X2 and X3 affect the maximum degree of output because they are inputted to FI21 . Therefore we cannot use them as variable sub-blocks. Thus we consider an eighth order differential using X0 and one bit selected from among X1 . We confirmed that Δ(8) Z30003 L7 =0 for such an eighth order differential by computer simulation although we have to omit the details of the simulation due to the limitted space. The attack equation derived from the equation Δ(8) Z30003 L7 = 0 has unknowns with respect to sub-keys for FL6 and FO5 . However, if we fix the value of the secret key sub-block as K7 = KL62 =0xffff, we have Δ(8) Z3L7 = 0. We can neglect the unknowns with respect to the sub-keys for FL6 .
220
H. Tanaka et al. B
Δ(8) Z3L7 =0 C -FO5 - b A h (( (h (h h -FO6 - b h h( h( (( h FL7
FL8
CL
CR
?
?
Fig. 5. Relation of the variables A, B and C in equation (3.4)
Our attack succeeds under the condition that the secret key sub-blocks satisfy K5 = K7 =0xffff. inter.tex 3.2
Attack Equation
In general, an attack equation is derived by calculating a higher order differential from the ciphertext side step by step. So, the attacker starts to derive an equation to solve the sub-keys in the last round. Then he derives a new equation to determine the next level unknown sub-keys using the information already obtained. So it is necessary for attacker to solve many attack equations. Contrary to such a general approach, we employ a different method, that is, we derive an attack equation for a secret key directly. So it suffices for us to construct an attack equation only once, and solve it. We apply a method in [7] to solve the attack equation. In this paper, we call it an algebraic method. The algebraic method was proposed by Moriai, Shimoyama and Kaneko, to solve KN cipher and CAST. This method solves higher order equations by regarding them as linear equations: basically, every product of (more than two) variables is identified to a new variables, and the resulting equation has order 1, that is, every term of the resulting equation contains only one variables. Let us demonstrate an example of higher order equation with three variables (x2 , x1 , x0 ) and the equation transformed. The equation a0 x0 ⊕ a1 x1 ⊕ a2 x0 x2 = 1 is transformed to a0 x0 ⊕ a1 x1 ⊕ a2 x3 = 1, where x3 is a new variable which is identified to x0 x2 . We should note that the resulting equation is linear, that is, every term contains at most one variable. So we can regard the equation as a linear equation which contains at most 3 C1 + 3 C2 + 3 C3 = 7 new variables if the original equation contains three variables. Since the higher order equation with three variables is defined as the linear equation with seven variables by this procedure, to determine variables, we need more linear independent equations than equations required by a brute force search. However, the computational cost to solve linear equations is much smaller than the cost required by a brute force search. The reader is referred to [7] for the precise definition and detailed discussion of the algebraic method. The number of the independent unknown terms turns out to be large if the number of secret key sub-blocks inputted to S-boxes is large. On the other hand,
Security Analysis of MISTY1
221
it is hard to solve a system of equations with a large number of independent unknown terms. Therefore, we need to keep the number of secret key sub-blocks inputted to S-boxes to a minimum in order to make the attack equation easily solvable. Following such a strategy, we construct an attack equation with a small number of independent unknown terms, and then present an attack algorithm against a six round MISTY1 using 2-round elimination method that estimates a part of the secret key sub-blocks with a brute force search. Let us see the construction of our attack equation as follows. First of all, the following attack equation can be derived using 2-round elimination method. Δ(8) Z3L7 = Δ(8) {A + B} = 0, ⎧ ⎪ A = FL8 (CR ; KL8 )L7 ⎪ ⎪ ⎪ ⎪ = FL8 (CR ; K60003 , K8 )L7 ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨ B = FO5 (C; KO5 , KI5 )L7 ⎪ ⎪ = FO5 (C; K1 , K20003 , K4 , K5 , K60003 , K7 , K80003 )L7 ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ C = FL7 (CL ; KL7 ) + FO6 (FL8 (CR ; K60003 , K8 ); KO6 , KI6 ) ⎪ ⎩ = FL7 (CL ; K20003 , K4 ) + FO6 (FL8 (CR ; K60003 , K8 ); K10003 , K2 , K30003 , K5 , K70003 , K8 ) (3.4) The relation of the variables A, B and C is shown in Figure 5. Next, we consider the optimal combination which divides unknowns into the group determined by an algebraic method and the group determined by a brute force search. Since K5 and K7 are fixed, we have 6 secret key sub-blocks (total: 96 bits) as unknowns. Since the number of terms inputted to the S-boxes becomes the maximum, it is difficult to solve KO61 , KO62 and KI61 by using the algebraic method. Thus we use a brute force search to estimate the secret key sub-blocks, K6 (= KO61 ), K8 (= KO62 ) and K30003 (= KI61 ). Since K70003 (= FI(K7 ; K8 )), we can treat the terms with respect to KI62 = K70003 as known values. Consequently, K5 , K7 = fixed, K30003 , K6 , K8 = estimated by brute force search, K70003 = calculated,
(3.5)
and we have K1 , K10003 , K2 , K20003 and K4 as unknowns. Note that K1 = KO54 is a constant term in the attack equation. Thus the terms with respect to K1 do not exist in the eighth order differential. Thus we can omit terms with respect to K1 and determine K10003 , K2 , K20003 and K4 (total 64 bit) by using the algebraic method. Consequently, our attack equation consists of equation (3.4) and (3.5). By solving the attack equation, we can determine secret key sub-blocks K10003 , K2 , K20003 , K30003 , K4 , K5 , K6 , K7 and K8 . Since K10003 = FI(K1 ; K2 ) and K30003 = FI(K3 ; K4 ), secret key sub-block K1 and K3 can be determined by using (K10003 , K2 ), or (K30003 , K4 ).
222
3.3
H. Tanaka et al.
Complexity
Let us assume that an attack equation derived from the value of the n-th order differential of the b-bit sub-block among the output. Let L be the number of independent unknowns in the attack equation. To determine all the unknowns, we need at least L linear equations. Since we can derive b linear equations from one n-th order differentials, we need 0004L/b0005 different n-th order differential to derive the L × L coefficient matrix. If we use the same techniques shown in [7] to derive the matrix, we need to calculate the F function operation using L different temporary sub-keys. Thus we need 2n × 0004L/b0005 chosen plaintexts and 2n × 0004L/b0005 × L F function operations (computational cost). After deriving the coefficient matrix, we can solve the equation by Gauss - Jordan elimination. In general, the cost required by Gauss - Jordan elimination can be negligible comparing with the cost required to calculate theL × L coefficient matrix. We consider the complexity to solve the attack equation by estimating the other unknown S (s bits). If we solve the attack equation using L + α linear equations, the equation holds a false value of S, with probability 2−α . Therefore, if we have additional α linear equations for such that 2s−α << 1, we are able to eliminate all false value of S. Thus we need 2n × 0004(L + α)/a0005 chosen plaintexts and 2n+s × 0004(L + α)/a0005 × L computational cost. We counted the number of independent unknown terms appeared in the attack equation (3.4) and (3.5) by computer simulation. Although the details are omitted, we found 13,269 independent unknown terms. Since we estimated an s = 48 bit unknown with the brute force search, we set α = 64 (248−64 = 2−16 << 1). And since we derive the attack equation (3.4) focusing on 7-bit output sub-block Z3L7 , b is equal to 7. Consequently, we estimate that 28 ×0004(13269 + 64)/70005 0006 218.9 chosen plaintexts and 28+48 × 0004(13269 + 64)/70005 × 13269 0006 280.6 computational cost are needed to determine a 128-bit secret key. In this section, we attacked 6 round MISTY1 with FL functions, and succeeded in attacking because we could make the complexity for solving the attack equation relatively small (280.6 ). There are two reasons that we could make it. First, the key schedule for MISTY1 makes only 16 combinations of sub-keys. Second, the relation between 16 combinations of secret key sub-blocks is relatively simple because the key schedule is simple. MISTY1 has high performance because of its simple key schedule whereas our result suggests its key schedule might be a weakness in its security.
4 4.1
Low Order of S-Box Basic Idea
In this section, we analyze MISTY1 with no FL functions (no FL version). We omit the key schedule in the following, because it is not defined for no FL version. The structure of no FL version is shown in Figure 6. We uses equivalent functions and sub-keys to simplify the attack algorithm. Our goal is to know the maximum attackable number of rounds for the no FL version using property 1.
Security Analysis of MISTY1
P
Equivalent FO
?
b b001bK K1 FO1 - b 2
?
Equivalent FI
? b001bKij1
h (( (h (h h -FO2 - b h (( (h (h h -FO3 - b h h( h( (( h -FO4 - b
FIi1
S9
FIi2
S7
· ·
FIi3
S9
· ·
h (( (h (h h ? C
b001b hh (( ((h h b001b h (( (h (h h
b001b h (( (h (h h ?
223
b001b hh (( ((h h b001bKij2 b001b h (( (h (h h b001bKij3 b001b h (( (h (h h ?
Fig. 6. MISTY1 with no FL functions
We consider an attack using the chosen plaintext P which has the following form : P = (V PR ), V = (0, 0, 0, 0, 0, 0, 0, X8)
(4.1)
where “0” denotes a fixed plaintext sub-block. In this case, since the variable sub-block is inputted to the first round, Δ(7) Z2L7 =0x6d holds from the equation (2.3). However, if the attacker can select the value of PR which satisfies Z1 ⊕ PR = constant,
(4.2)
he can get Δ(7) Z4L7 =0x6d. Let us discuss the details of FO1 . The structure of FO1 is shown in Figure 7. Since the output of FI11 and the outputs of the first S9 in FI12 and FI13 are constants, the value of the output of FO1 is determined by the value of equivalent sub-keys K122 , K123 , K132 and K133 (total: 32 bits). Let us consider the technique to adjust the value of PR using the equation PR = FO(V ; K)
(4.3)
where K denotes a 32 bit variable corresponding to equivalent sub-keys K122 , K123 , K132 and K133 . The other equivalent sub-keys in FO are fixed to some constant value (for example, all zeros). As a result, equation (4.2) holds with probability 2−32 . If the attacker has all the value of K, he can get Δ(7) Z4L7 =0x6d. 4.2
Attack Equation
From the discussion above, Δ(7) Z4L7 =0x6d holds with probability 2−32 . Thus we can derive the following attack equation by calculating the seventh order differential from the ciphertext side.
224
H. Tanaka et al.
? V
b001bK
? V
1j1
FI11
S9
b001b h (( (h (h h
FI12
b001b h (( (h (h h
FI13
b001b h (( (h (h h ?
-
b001b h (( (h (h h b001bK1j2
S7
b001b h (( (h (h h b001bK1j3
S9
b001b h (( (h (h h ?
j = 2or3
Fig. 7. Structure of FO1
Δ(7) {FO6 (CL ; K611 , K612 , K621 , K622 , ) + CR }L7 = 0x6d
(4.4)
We solve the equation using the algebraic method. Although the details are omitted, we found that the number of independent unknown terms appeared in the attack equation (4.4) is 74 by computer simulation. We use a brute force search for K. If K satisfies equation (4.2), the attack equation (4.4) holds and we can determine the unknown terms. But if it does not satisfy equation (4.2), the attack equation does not hold. In the same way as in Section 3.3, we set α = 48 (232−48 = 2−16 << 1) to solve the equation. And since we derive the attack equation (4.4) focusing on 7-bit output sub-block Z4L7 , b is equal to 7. As a result, we need 27 × 0004(74 + 48)/70005 0006 211.2 chosen plaintexts and 27+32 × 0004(74 + 48)/70005 × 74 0006 249.4 computational cost. This computational cost is much smaller than 2128 needed for brute force search for secret key. Therefore a 2-round elimination method using a brute force search for the sub-key in the seventh round (total: 75 bits) works. 4.3
Complexity
Since we determine a 107-bit unknown (a 32-bit K and 75-bit sub-key in the seventh round), we set α = 123 (2107−123 = 2−16 << 1). Thus we need 27 × 0004(74 + 123)/70005 0006 211.9 chosen plaintexts and 27+32+75 × 000474 + 123/70005 × 74 0006 2125.1 computational cost to attack a seven round MISTY1 with no FL functions. In this section, we attacked 7 round MISTY1 with no FL functions, and succeeded in attacking because our technique enable us to raise the number of rounds. The computational cost can be divided into three tasks: 2-round elimination, adjusting the value of PR , and matrix computation. Since S-boxes has low order, the order of the attack equation turns out to be low. Then the number of the
Security Analysis of MISTY1
225
independent unknown terms is small. This makes the cost for the matrix computation small. Then we can use our computation resource to the other tasks, that is, computation for the 2-round elimination and adjusting the value of PR . Our technique enables us to determine the output of FO1 for the input V using only 32-bit sub-key satisfying (4.2). Note that this is a big improvement because 75-bit K satisfying (4.2) is necessary to determine the output of FO1 unless a certain trick is employed. Because the cost for matrix computation is kept small, we have more computation cost available and then the reduction from 75-bit to 32-bit helps to raise the number of rounds, that is, we could attack seven round MISTY1 with no FL. Table 2. Comparison of results #rounds #chosen plaintexts comput. cost method 34 48 MISTY1 5 2 2 integral [2] with FL 6 218.9 280.9 Section 3 54 61 MISTY1 6 2 2 impossible differential [4,5] with no FL 7 239 2125.1 Section 4
5
Conclusions
First, we showed the attack of a six round MISTY1 with FL functions. This attack algorithm takes advantage of the simplicity of the key schedule. We gave the condition for the secret key to satisfy Δ(8) Z3L7 = 0. And we also showed the technique to reduce the complexity to solve the attack equation using the simple relation among sub-keys. Consequently, we showed the attack equation for a six round MISTY1 with FL functions and the effective algorithm to determine a 128-bit secret key. Our algorithm needs 218.9 chosen plaintexts and 280.6 computational cost. Since our target, reduced round MISTY1, has FL functions in the last round, our algorithm is more realistic and powerful than the existing methods. Although it may be unfair to compare only the number of rounds and the number of chosen plaintexts, our algorithm can attack more rounds using less number of chosen plaintexts than Knudsen and Wagner’s integral cryptanalysis. We also succeeded in attacking MISTY1 with no FL functions that has more rounds than the model attacked previously. We presented a novel technique to control the value of fixed part of plaintext. Since the order of S-boxes is very small and unknown terms are very few, the algebraic method works very efficiently. The 2-round elimination method works because the complexity needed by the algebraic method in the procedure of solving the attack equation is very small. Our algorithm needs 211.9 chosen plaintexts and 2125.1 computational cost, and is superior to the previous ones, in the sense that it can attack more rounds with fewer chosen plaintexts. We summarize our results and the related works in table 2.
226
H. Tanaka et al.
Acknowledgements We would like to thank Ms. Makiko Shirota for her contribution to the computer simulations.
References 1. Babbage, S., Frisch, L.: On MISTY1 Higher Order Differential Cryptanalysis. In: Won, D. (ed.) ICISC 2000. LNCS, vol. 2015, pp. 22–36. Springer, Heidelberg (2001) 2. Knudsenand, L.R., Wagner, D.: Integral Cryptanalysis. In: Daemen, J., Rijmen, V. (eds.) FSE 2002. LNCS, vol. 2365, pp. 114–129. Springer, Heidelberg (2002) 3. Jakobsen, T., Knudsen, L.R.: The Interpolation Attack on Block Cipher. In: Preneel, B. (ed.) Fast Software Encryption. LNCS, vol. 1008, pp. 28–40. Springer, Heidelberg (1995) 4. K¨ uhn, U.: Cryptanalysis of Reduced-Round MISTY. In: Pfitzmann, B. (ed.) EUROCRYPT 2001. LNCS, vol. 2045, pp. 325–339. Springer, Heidelberg (2001) 5. K¨ uhn, U.: Improved Cryptanalysis of MISTY1. In: Daemen, J., Rijmen, V. (eds.) FSE 2002. LNCS, vol. 2365, pp. 61–75. Springer, Heidelberg (2002) 6. Matsui, M.: New block encryption algorithm MISTY. In: Biham, E. (ed.) FSE 1997. LNCS, vol. 1267, pp. 54–68. Springer, Heidelberg (1997) 7. Moriai, S., Shimoyama, T., Kaneko, T.: Higher Order Differential Attack of a CAST Cipher. In: Vaudenay, S. (ed.) FSE 1998. LNCS, vol. 1372, pp. 17–31. Springer, Heidelberg (1998) 8. NESSIE, https://www.cosic.esat.kuleuven.ac.be/nessie/ 9. RFC2994, http://www.faqs.org/rfcs/rfc2994.html 10. 3GPP, http://www.3gpp.org/
A Generic Method for Secure SBox Implementation Emmanuel Prouff2 and Matthieu Rivain1,2 1
University of Luxembourg Faculty of Sciences, Technology and Communication 6, rue Richard Coudenhove-Kalergi L-1359 Luxembourg 2 Oberthur Card Systems, 71-73 rue des Hautes Pˆ atures, 92726 Nanterre Cedex, France {m.rivain,e.prouff}@oberthurcs.com
Abstract. Cryptographic algorithms embedded in low resource devices are vulnerable to side channel attacks. Since their introduction in 1996, the effectiveness of these attacks has been highly improved and many countermeasures have been invalidated. It was especially true for countermeasures whose security was based on heuristics and experiments. Consequently, there is not only a need for designing new and various countermeasures, but it is also necessary to prove the security of the new proposals in formal models. In this paper we provide a simple method for securing the software implementation of functions called SBoxes that are widely used in symmetric cryptosystems. The main advantage of the proposed solution is that it does not require any RAM allocation. We analyze its efficiency and we compare it with other well-known countermeasures. Moreover, we use a recently introduced proof-of-security framework to demonstrate the resistance of our countermeasure from the viewpoint of Differential Power Analysis. Finally, we apply our method to protect the AES implementation and we show that the performances are suitable for practical implementations.
1
Introduction and Motivations
Side Channel Analysis are powerful attacks that utilize side channel leakage of embedded devices such as timing execution or power consumption to obtain information about secret data. It essentially allows two kinds of Power Attacks: the Simple Power Analysis (SPA) and the Differential Power Analysis (DPA). SPA consists in directly interpreting power consumption measurements and in identifying the execution sequence. In a DPA, the attacker focuses on the power consumption of a single instruction and performs statistical tests to reveal some correlations between the distribution of the measurement values and the sensitive data (i.e. depending on a secret value) manipulated by the instruction. Since the publication of the first DPA, many papers describing either countermeasures or attack improvements have been published (see [1,3,4,12] for example). Among S. Kim, M. Yung, and H.-W. Lee (Eds.): WISA 2007, LNCS 4867, pp. 227–244, 2007. c Springer-Verlag Berlin Heidelberg 2007 0002
228
E. Prouff and M. Rivain
these improvements, Higher order DPA attacks (HODPA) are of particular interest. They extend the DPA by considering a set of several instructions instead of a single one. The number d of instructions targeted by the attack is called order of the DPA. The most common way of thwarting DPA involves random values (called masks) to de-correlate the leakage signal from the sensitive data which are manipulated [4, 6, 1]. If every sensitive variable is protected with a single mask, the implementation may thwart first order DPA but it can be theoretically attacked by a second order DPA targeting both the mask and the masked value. To thwart second order DPA (and more generally d-th order DPA), every sensitive variable must be masked with 2 (resp. d) random values. This implies that the timing-memory cost of the existing masking countermeasures increases greatly with the order of the DPA they aim to counteract. For applications where time constraints are very strong (such as contact-less applications), it may be considered as sufficient to mask all the sensitive data with a small number of random values generated at the beginning of the algorithm execution. The performances of many DPA countermeasures have been studied in this model. However recent results (see [13]) showed that second order DPA represents a real practical threat, especially when the same value is used to mask several intermediate results throughout the algorithm. Therefore, as noticed in [5, 15], the masks must be re-generated as often as possible and the analysis of the efficiency of a countermeasure has to take this fact into account. In this model, it appears that only a few among the existing countermeasures are still efficient in a low resource context and there is therefore still a need for investigating new and various countermeasures that thwart first order DPA efficiently when the masks change frequently. Due to the very large variety of Side Channel Attacks reported against cryptosystems and devices, sensitive applications (e.g. Banking, GSM or Identity Card) cannot make use of countermeasures with ad hoc security but need countermeasures which are provably secure against a precisely modeled adversary. Recently, new notions and tools have been introduced to evaluate the security of an implementation against Side Channel Attacks [20, 18, 17]. They allow for the formal validation or the formal invalidation of the resistance of a countermeasure under some realistic assumptions on the behavior of the device and on the power of the adversary. In this paper, we focus on DPA against block cipher algorithms. The most critical part when securing implementations of such algorithms against DPA is to protect their non-linear operations (i.e. the calls to the SBoxes). In the next section, we recall the methods which have been proposed in the Literature. Then, we introduce in Sect. 3 a new and simple countermeasure which counteracts first order DPA against SBox implementations whatever the algebraic structure of the SBox. When the masks change frequently, we argue that the new method has a good efficiency compared to the other generic methods. In Sect. 4, we analyze the security of our proposal by following the methodology described in [20]. In Appendix A, we apply our method to protect the implementation of
A Generic Method for Secure SBox Implementation
229
the AES SBox and we compare its efficiency and its security with the ones of other existing countermeasures.
2 2.1
Secure Implementation of Non-linear Functions in the Literature State of the Art of the Generic Methods
To counteract DPA attacks, one usually tries to make the power consumption signal as independent as possible of the sensitive data manipulated by the algorithm. Goubin and Patarin proposed in [6] a general solution, called duplication method, to protect the implementation of an algorithm. In this approach, every sensitive variable x is split into d blocks r1 , .., rd and every cryptographic primitive F that manipulates x is associated to d functions F1 , .., Fd and to a simple transformation σ (e.g. a simple bitwise addition) such that F (x) = σ(F1 [r1 , .., rd ], · · · , Fd [r1 , .., rd ]). Goubin and Patarin showed that an implementation protected by duplication method thwarts first order DPA if for every x (resp. every F [x]) the d blocks r1 , .., rd (resp. F1 [r1 , .., rd ], .. , Fd [r1 , .., rd ]) are never manipulated at the same time. Another approach consists in masking all the sensitive internal data with random values. Depending on the kind of operations performed by the linear parts of the algorithm, the mask values are introduced by modular addition, bitwise addition or multiplication. After selecting the masking operation 0003, the implementation of a cryptographic function F is rendered resistant to first order DPA by solving the following problem: Problem 1. Knowing x 0003 r, r and s, compute F (x) 0003 s such that every value of the power consumption signal is independent of x. If the function F is linear for the law 0003, then solving the above problem is a simple task. Indeed, since F (x0003r) equals F (x)0003F (r), we have F (x)0003s = F (x0003r)0003F (r)0003s. If F is non-linear for the law 0003 (which is the case when F is a SBox), then designing an implementation solving Problem 1 is much more difficult. Several kinds of methods have been proposed in the Literature and we recall two of them hereafter. The first one, called re-computation method [1, 11], involves the computation of a table corresponding to the masked SBox and the generation of one or several random value(s). In its most elementary version, two random values r and s are generated and the table T 0002 representing the function F 0002 : x 0002→ F (x 0003 r) 0003 s is computed from F and stored in the RAM of the device. Then, each time the masked value F [x] 0003 s has to be computed from the masked input x 0003 r, the table T 0002 is accessed. For such a method, the number of tables to be recomputed during the execution of the algorithm equals the number of different input/output masks which is allowed. Remark 1. The re-computation method is a particular case of the duplication method where d equals 2, where the sensitive value x is split into r1 = x 0003 r
230
E. Prouff and M. Rivain
and r2 = r and where the functions F1 and F2 equal r1 0002→ S[r1 ] 0003 s and r2 0002→ s respectively. The second kind of methods, that we call here SBox secure calculation, has been essentially applied to protect AES implementations [4, 7, 18, 19, 21] due to the strong algebraic structure of the AES SBox. The SBox outputs are not directly obtained by accessing a table but are computed on-the-fly by using a mathematical representation F of the SBox. Then, each time the masked value F [x] 0003 s must be computed from the pair (x 0003 r, r), an algorithm performing F and parameterized by the 3-tuple (x 0003 r, r, s) is executed. The computation of F is split into elementary operations (bitwise addition, bitwise multiplication, addition, multiplication, ..) and/or is performed in spaces of small dimensions (e.g. 4) by accessing one or several look-up table(s) (see [15]). The security of such a method is achieved by protecting each elementary operation and each memory transfer. When the same pair of input/output masks is used throughout the algorithm, the latter is said to be protected in the single-mask protection mode. In such a mode, a new pair of input/output masks (r, s) is generated at each execution of the algorithm and every computation F (x) performed during the execution is protected with this single pair. When the algorithm is protected in this mode, SBox secure calculation methods are often much more costly than the re-computation methods since they essentially replace a single access to a table by numerous logical operations and memory transfers. This difference between the performances of the two methods decreases when the number of different masks generated to protect the SBox calculations increases. In the multi-mask protection mode, the pair of masks (r, s) is re-generated each time a calculation F (x) must be protected (and thus many times per algorithm execution). In such a context, the SBox secure calculation methods become more appropriate and induce a smaller timing/memory overhead than the re-computation methods. Indeed, when re-computation methods are used in the multi-mask protection mode, a new table must be computed from F after each re-generation of masks (i.e. before every computation F (x)). As discussed in the previous paragraph, the choice between the first and the second category of methods highly depends on the protection mode, single-mask or multi-mask, in which the algorithm is implemented. We compare the two modes in the next section. 2.2
Single-Mask Protection Mode versus Multi-mask Protection Mode
When it is only required to thwart first order DPA, then implementing the algorithm in the single mask protection mode is sufficient. However recent results (e.g. [13]) show that second order DPA can represent a real practical threat when the amount of information leaking in the two consumption points targeted by the attacker is sufficiently high to make the effects of the de-synchronization and of the noise negligible. More generally, the analyses of second order DPA
A Generic Method for Secure SBox Implementation
231
published in [8,13,23] illustrate that the complexities of the various second order DPA are very different and show that some of them must be considered when implementing an algorithm for secure applications. In fact, as already put forward in [5,15], second order DPA are especially effective when the same pair of masks is used to protect all the inputs/outputs of the cryptographic primitives (e.g. all the inputs/outputs (x, F (x)) of the SBoxes) involved in the algorithm. Indeed, the beginning and the end dates of the execution of these primitives are quite easy to localize in the power consumption curves. Consequently, when all the inputs (resp. all the outputs) of the primitives are masked with the same value, then an attacker can precisely isolate two consumption points manipulating the same masks and is therefore able to unmask a sensitive data. A straightforward solution to make this particular class of second order DPA difficult to perform in practice consists in re-generating the masks as often as possible. Even if such a solution does not ensure that the algorithm thwarts all kinds of second order DPA, it allows to counteract those among the most efficient ones. For the reasons detailed above, we think that there is a practical interest to introduce an intermediate resistance level between the first order DPA-resistance, in which every first order DPA is counteracted, and the second order DPAresistance, in which every second order DPA is also counteracted. We call this intermediate level, first order DPA-resistance in the multi-mask protection mode. In this new level, the pair of masks (r, s) is re-generated each time a calculation F (x) must be protected (and thus many times per algorithm execution). Since the implementation of an algorithm perfectly thwarting second-order DPA requires a great timing and memory overhead and since an implementation thwarting only first order DPA does no longer provide enough security, we think that the first order DPA-resistance in the multi-mask protection mode nowadays offers the better security/efficiency trade-off. Analyzing the security of an implementation of an SBox for the new resistance level is equivalent to study the first order DPA resistance of the SBox implementation. The difference appears when it comes to investigate the efficiency of the countermeasure. For instance, a technic efficient in the single-mask mode (e.g. the re-computation method) can become much more costly in the multi-mask mode. In the next section, we present a simple SBox secure calculation method which resolves Problem 1 and we compare its performances with other generic methods in the multi-mask protection mode.
3 3.1
The New S-Box Secure Calculation Method Our Proposal
Let x denote a sensitive variable, let r and s be an input mask and an output mask and let F denote an SBox function mapping Fn2 into Fm 2 . The core idea of our proposal is to compute F ((x ⊕ r) ⊕ a) ⊕ s for every value a, storing the result in a register R0 if a equals r and in a second register R1 otherwise.
232
E. Prouff and M. Rivain
Let compare : x, y 0002→ compare(x, y) be the function returning 0 if x = y and 1 otherwise, we depict our proposal in the following algorithm. Algorithm 1. Computation of a masked S-Box output from a masked input Input: a masked value x ˜ = x ⊕ r, an input mask r, an output mask s, a look-up table for F Output: the masked S-Box output F (x) ⊕ s 1. for a = 0 to 2n − 1 do 2. cmp ← compare(a, r) x ⊕ a) ⊕ s 3. Rcmp ← F (˜ 4. return R0
Remark 2. Many microprocessors implement the function compare by a single instruction. Thus, we will assume in the rest of the paper that this function is an elementary operation. To verify the correctness of Algorithm 1., it can be checked that Step 3 performs the following operation: 0002 R0 ← F (˜ x ⊕ a) ⊕ s if a = r , (1) R1 ← F (˜ x ⊕ a) ⊕ s otherwise . x ⊕r)⊕s = F (x)⊕s when the loop is completed. Hence, R0 contains the value F (˜ The security of the new method highly depends on the assumption that the leakage generated by a register transfer is the same whatever is the register. This assumption is commonly accepted and it is the security core of SPA countermeasures used to protect asymmetric cryptosystems (see for instance [9]). As every variable manipulated during the execution of the algorithm is masked by a random value, it can be proven (in a similar way as done in Sect. 4) that it thwarts DPA in both Hamming Weight and Hamming Distance models. Nevertheless, Algorithm 1. has a potential security weakness because it involves dummy operations (i.e. operations which do not impact the value returned by the algorithm)3 . This flaw could be exploited by an attacker who would disrupt the Step 3 operation (for instance by fault injection [2]) during a chosen loop iteration a = a0 . By checking if Algorithm 1. output is erroneous or not, the attacker would be able to detect when the mask r equals the known loop index a0 . Finally, for the power consumption measurements corresponding to the cases r = a0 , the attacker would be able to unmask x˜ and to perform a classical first order DPA. To circumvent the flaw, dummy operations must be avoided. When F is balanced 4 , i.e. when #F −1 (y) equals 2n−m for every y ∈ Fm 2 , then we propose in 3 4
Attacks exploiting dummy operations have been mainly applied to attack SPAresistant implementations of RSA (see [24] for an example of such attacks). For security reasons, functions F involved in cryptographic applications are always balanced.
A Generic Method for Secure SBox Implementation
233
Algorithm 2. Computation of a masked S-Box output from a masked input Input: a masked value x ˜ = x ⊕ r, an input mask r, an output mask s, a look-up table for F Output: the masked S-Box output F (x) ⊕ s 1. 2. 3. 4. 5. 6. 7.
R0 ← s R1 ← s for a = 0 to 2n − 1 do cmp ← compare(a, r) x ⊕ a) Rcmp ← Rcmp ⊕ F (˜ cmp ← compare(R0 , R1 ) return R0 ⊕ (cmp × R1 )
the following a slightly modified version of Algorithm 1. where both registers R0 and R1 are involved in the computation of the output value. According to (1), register R0 contains the value F (x) ⊕ s at the 0003 end of the x⊕ loop. Moreover it can be checked that the content of R1 equals s ⊕ a∈Fn2 F (˜ a0003=r 0003 a). As F is balanced, the summation F (˜ x ⊕ a) equals F (˜ x ⊕ r) that a∈Fn 2 a0003=r 0003 is F (x) since we have x ˜ = x ⊕ r. Indeed, the summation F (x) can be a∈Fn 2 0003 0003 rewritten y∈Fm ( a∈Fn ; F (a)=y y). As F is assumed to be balanced, each term 2 2 0003 y corresponds to 2n−m times the sum of the vector y with itself. a∈Fn 2 ; F (a)=y 0003 Thus, each term a∈Fn ; F (a)=y y equals the null vector if n > m and equals y if 2 0003 0003 n = m. This implies that a∈Fn F (a) equals 0 if n > m and equals y∈Fm y if 2 2 n = m. Since 0003 the sum of all the elements of a space equals the zero vector, one deduces that a∈Fn F (a) is also equal to 0 if n and m are equal. Consequently, 2 x ⊕ r) = F (x) ⊕ s holds at the end when F is balanced, the equality R1 = s ⊕ F (˜ of the loop. Finally, Step 6 aims at verifying that the contents of the two registers R0 and R1 are equal. Then, Step 7 returns either the expected result if no perturbation occurred or an erroneous result otherwise. This simple improvement of Algorithm 1. ensures that the attacker is no longer able to determine when the index a0 of the targeted loop iteration equals the input mask r. Algorithm 1. requires 2n × 3 logical operations (2 x-or operations and 1 comparison per loop iteration) and 2n memory transfers (1 table look-up per loop operation). For Algorithm 2., two assignments (Steps 1 and 2), one comparison (Step 6), one multiplication and one x-or operation (Step 7) are added. Since changing the input and output masks at each execution of Algorithm 2. has no impact on its performances, our proposal is efficient in the multi-mask protection mode. Moreover, it is generic in the sense that it can be applied to any balanced SBox F without any assumption on the algebraic structure of F . In what follows, we compare the performances of our proposal with the ones of other generic methods.
234
3.2
E. Prouff and M. Rivain
Comparison with Other Generic Methods
We focus here on two well-known elementary masked SBox computation methods. The first one is the table re-computation method recalled in Sect. 2. The second one, that we call here the global look-up table method, uses a large look-up table addressed with the mask and the masked value. Re-computation method. Let T denote a 2n bytes table allocated in RAM5 . When the block cipher algorithm is protected in the multi-mask protection mode, a new pair of input/output masks is generated each time an SBox output is computed and the following sequence of operations is performed: 1. for x = 0 to 2n − 1 do 2. T [x] ← F (x ⊕ r) ⊕ s 3. return T [˜ x] This algorithm requires 2n × 2 logical operations (2 x-or operations per loop iteration) and 2n ×2+1 memory transfers (1 read operation and 1 write operation per loop iteration and one access to the re-computed table). The computational cost of the table re-computation method is approximatively the same as for Algorithm 2. However, it also requires the allocation of 2n bytes of RAM which can be problematic in a low resource context, especially when several SBoxes need to be protected. Global look-up table method. Let T 0002 denote the look-up table associated to the function (x, y) 0002→ F (x ⊕ y) ⊕ y. To compute F (x) ⊕ r from x ⊕ r and r, the global look-up table method performs a single operation: the table look-up x, r]. Its timing performances are ideal since it requires only one memory T 0002 [˜ transfer. However, the size 22n of the look-up table T 0002 makes an application of the method difficult in a low resource context. For instance, if n is greater than or equal to 7, the amount of ROM required is definitively too great (at least 16 KB!). When n is lower than or equal to 6, the feasibility of the method depends on the amount of ROM of the device and on the number of different SBoxes which must be protected. The method can become interesting when it comes to protect SBoxes mapping F42 into itself (as it is the case for FOX [10] where three such SBoxes are involved) or when the SBox calculus can be performed in spaces of dimensions lower than or equal to 4 (as it is the case for the AES SBox - see Appendix A -). From a security point of view, the global look-up table method has a flaw since it manipulates the mask r and the masked data x˜ at the same time. Indeed x˜ and r are concatenated to address the look-up table T 0002 and thus, the value x˜ r is transferred through the bus. Since the variables x˜ r and x are statistically dependent, the leakage on x ˜ r is potentially exploitable by a first order DPA. 5
To make the description easier, we assume that every element of Fm 2 is stored on one byte.
A Generic Method for Secure SBox Implementation
235
Table 1. Comparison of methods solving Problem 1 for the bitwise addition Method Table recomp. Table recomp. Global LUT Algo. 2.
Masking Mode Pre-computation SBox calculation RAM ROM Single-masking 2n+1 MT + 2n+1 RLO 1MT 2n 2n n+1 n+1 Multi-masking 0 (2 + 1)MT + 2 RLO 2n 2n Multi-masking 0 1MT 0 22n Multi-masking 0 2n MT + (3 × 2n + 5)RLO 0 2n
Table 1 summarizes the costs of the three previously considered methods according to the number of register logical operations (RLO), the number of memory transfers (MT), the memory size (bytes in RAM) and the code size (bytes in ROM).
4
Security Analysis
To study the security of our proposal we will use some basic notions of information Theory. We recall them in the next section. 4.1
Preliminaries
We use the calligraphic letters, like X , to denote finite sets. The corresponding large letter X is then used to denote a random variable over X , while the lowercase letter x - a particular element from X . The probability of the event (X = x) is denoted P[X = x]. The entropy H(X) of a random variable X aims at measuring the amount of information provided by an observation of X and 0004 entropy of satisfies H(X) = − x∈X P[X = x] log(P[X 0004 = x]). The conditional 0004 X given Y , denoted by H(X Y ), equals − y∈Y P[Y = y] x∈X P [X = x Y = y] log(P [X = x Y = y]). To quantify the amount of information that Y reveals about X, the notion of mutual information is usually involved. The mutual information of X and Y is the value I(X, Y ) defined by I(X, Y ) = H(X) − H(X Y ). The random variables X and Y are independent if and only if I(X, Y ) equals 0. Moreover, the mutual information is always positive or null and it satisfies the following property. Property 1. Let X and Y be two random variables respectively defined over X and Y. For every function f defined over Y, we have I(X, f (Y )) ≤ I(X, Y ). For our security analysis, we shall also use the following proposition. Proposition 1. Let X and Y be two random variables defined over X and let Z be a random variable defined over Z. If Z is mutually independent of X and Y and has a uniform distribution over Z, then for every measurable function f defined from X 2 into Z, we have I(X, Z ⊕ f (X, Y )) = 0. As a consequence of Proposition 1, we have I(X, Z ⊕ X) = 0 when X and Z satisfy the conditions of Proposition 1.
236
4.2
E. Prouff and M. Rivain
Evaluation Methodology
To evaluate the security of our proposal, we follow the outlines of the methodology depicted in [20]. This methodology holds in five steps: specify the target implementation, specify the target secret, define the adversary model, evaluate the information leakage and define a metric to evaluate the security. The target implementation is Algorithm 2. running on a smart card. The target secret is the un-masked value x corresponding to the masked input x ˜ of Algorithm 2. The adversary model. We assume that the attacker can query the targeted cryptographic primitive with an arbitrary number of plaintexts and obtain the corresponding physical observations, but cannot choose its queries in function of the previously obtained observations (such a model is called non-adaptive known plaintext model in [20]). We also assume that the attacker has access to the power consumption and electromagnetic emanations of the device and applies a first order DPA attack but is not able to perform HODPA. The effectiveness of the prediction made by the adversary is strongly related to the amount of information provided by the physical observations. In our analysis, we assume that the attacker knows how information leaks from the device and straightforwardly makes its prediction based on the leakage model (such a prediction is called device profiled prediction in [20]). Moreover, the physical observations are assumed to be perfect i.e. matching exactly the leakage model (which is a very favorable situation from the attacker’s viewpoint). For our analysis, we choose to consider a general leakage model that can be used for both power consumption or electromagnetic emanations. The leakage model. Different models coexist to quantify the leakage of CMOS circuits with respect to the data handled but two of them are predominantly used: the Hamming Weight model (where the leakage is related to the Hamming Weight of the data handled) and the Hamming Distance model (where the leakage is related to the Hamming Distance between the previous and the current data handled in a register or transmitted through a bus - see [3]). In the Hamming Distance model, the leakage of the bit-transitions 0 → 1 and 1 → 0 are assumed to be equal, which makes the leakage analysis much simpler. However this assumption, which is adequate when trying to attack an unprotected implementation, is not relevant to model a strong opponent against a secure implementation. Indeed, in practice CMOS gates leak differently when charging or discharging the load capacitance (especially in the case of electromagnetic emanations [17]). Hence, as mentioned in [17,20], a more accurate leakage model must be defined to model an attacker who is able to observe these differences.
A Generic Method for Secure SBox Implementation
237
Definition 1. [17] In the Hamming Distance Extended (HDE) model, the leakage LHDE (s, y) related to a variable y that replaces an initial state s, satisfies LHDE (s, y) = N0→1 (s, y) × P0→1 + N1→0 (s, y) × P1→0 + β ,
(2)
where N0→1 (s, y) (resp. N1→0 (s, y)) denotes the number of transitions 0 → 1 (resp. 1 → 0) from s to y, P0→1 (resp.P1→0 ) denotes the average energy consumed by a transition 0 → 1 (resp. 1 → 0) and where β denotes some noise. −P1→0 Denoting by δ the normalized difference P0→1 and by ε the leakage P0→1 , P0→1 we have P1→0 = ε(1 − δ) and Relation (2) can be rewritten:
0005 0006 δ δ LHDE (s, y) = ε 1 − HW(s ⊕ y) + ε (HW(y) − HW(s)) + β . 2 2
(3)
From Relation (3), it can be verified that the HDE model includes the Hamming Weight (HW) and the Hamming Distance (HD) models. Indeed, when there is no difference between the transitions 0 → 1 and 1 → 0 (i.e. when δ = 0), we have LHDE (s, y) = εHW (s ⊕ y) + β and the HDE model is equivalent to the HD model. On the other hand, when the initial state s is constant, equal to 0, then we have LHDE (s, y) = εHW (y) + β and the HDE model is equivalent to the HW model. The HDE model is also appropriate to quantify electro-magnetic emanations leaking in the signed distance model which assumes that P1→0 equals −P0→1 [17]. In this case, we have δ = 2 and the leakage satisfies LHDE (s, y) = ε(HW (y) − HW (s)) + β. Once the behavior of the device and the attacker capacities are modeled, a method can be deduced from Standaert et al. [20] to prove the security of a countermeasure. Evaluation of the security. Let us denote by X, Y and IS the random variables respectively corresponding to the sensitive data targeted by the attacker, the data manipulated at the date of a leakage and the initial state replaced by Y . In the theoretical model depicted by the four steps above, it has been proved in [18] and [20] that a first order DPA does not succeed in extracting information about X if and only if X and LHDE (IS, Y ) are independent for every pair (IS, Y ) appearing during the execution of the algorithm. In the following section, we evaluate I(X, LHDE (IS, Y )) for every pair (IS, Y ) appearing during the execution of Algorithm 2. 4.3
Proof of Security
If IS is an operation code, a constant memory address or a system variable which is independent of the intermediate results of the algorithm, then Property 1 and the positivity of the mutual information imply the following inequality: 0 ≤ I(X, LHDE (IS, Y )) ≤ I(X, Y ) .
(4)
In such a case, proving I(X, Y ) = 0 is sufficient to prove I(X, LHDE (IS, Y )) = 0.
238
E. Prouff and M. Rivain
If IS corresponds to an intermediate result of the algorithm, then Property 1 and the positivity of the mutual information imply the following inequality: 0 ≤ I(X, LHDE (IS, Y )) ≤ I(X, (IS, Y )) ,
(5)
where (IS, Y ) denotes the random variable that has the joint distribution of IS and Y . In this case, proving that I(X, (IS, Y )) equals 0 is sufficient to prove that I(X, LHDE (IS, Y )) equals 0. To show that I(X, LHDE (IS, Y )) equals 0 for every pair (IS, Y ) appearing during the execution of Algorithm 2., we decompose our security proof into two steps, depending on the nature of IS. In a first step, we show that for every intermediate variable Y manipulated by Algorithm 2., the mutual information I(X, Y ) equals 0. According to Inequality (4), this will prove that X and LHDE (IS, Y ) are independent when IS is assumed to be an operation code, a constant memory address or a system variable: we shall say in this case that there is no variables leakage. In a second time, we show that for every transition occurring between two intermediate results Y1 and Y2 , the mutual information I(X, (Y1 , Y2 )) equals 0. According to Inequality (5), this will prove that X and LHDE (IS, Y ) are independent when IS corresponds to an intermediate result of the algorithm: we shall say in this case that there is no transitions leakage. Variables leakage. We decompose Algorithm 2. into several elementary operations each manipulating an intermediate result computed from the sensitive variable X, the input mask R and the output mask S. Since the random variables R and S correspond to randomly generated values, we can assume that they have a uniform distribution and that 0003a X, R and S are mutually independent. j=0 F (X ⊕ R ⊕ j) and let tmp denote Let suma (X, R) denotes the sum j0003=R the register used to store the intermediate results at Step 5. Table 2 lists the intermediate values that occur during an execution of Algorithm 2. As R and S have a uniform distribution and as X, R and S are mutually independent, one straightforwardly deduces from Proposition 1 that all the intermediate results listed in Table 2 are independent of X. Table 2. Intermediate results manipulated during Algorithm 2 Step 5.
Instruction
Intermediate results
tmp ← x ˜
X⊕R
tmp ← tmp ⊕ a
X⊕R⊕a
tmp ← F (tmp)
F (X0007⊕ R ⊕ a)
Rcmp ← Rcmp ⊕ tmp
S⊕ S⊕
0
if R = a
0007 suma−1 (X, R) otherwise F (X) if R = a suma (X, R)
6.
cmp ← compare(R0 , R1 ) F (X) ⊕ S
7.
return R0 ⊕ (cmp × R1 ) F (X) ⊕ S
otherwise
A Generic Method for Secure SBox Implementation
239
Table 3. Transitions between intermediate results occurring during Algorithm 2 Step
Operation
Target
Initial State IS
5
tmp ← x ˜
tmp
F (X ⊕ R ⊕ (a − 1))
X ⊕R
5
tmp ← tmp ⊕ a
tmp
X ⊕R
X ⊕R⊕a
5
tmp ← F (tmp)
A-BUS
adF + (X ⊕ R ⊕ (a − 1))
adF + (X ⊕ R ⊕ a)
5
tmp ← F (tmp)
D-BUS
F (X ⊕ R ⊕ (a − 1))
F (X ⊕ R ⊕ a)
5
tmp ← F (tmp)
S-BUS
F (X ⊕ R ⊕ (a − 1))
adF + (X ⊕ R ⊕ a)
5
tmp ← F (tmp)
S-BUS
5
tmp ← F (tmp)
tmp
5
Rcmp ← Rcmp ⊕ tmp
Rcmp
R1 ← R1 ⊕ cmp
cmp
7
New State Y
adF + (X ⊕ R ⊕ a) ⎧ ⎨ 0 S⊕ ⎩ sum
F (X ⊕ R ⊕ a)
X ⊕R⊕a if R = a a−1 (X, R) F (X ⊕ S)
otherwise
S⊕
⎧ ⎨ ⎩
F (X ⊕ R ⊕ a) F (X)
if R = a
suma (X, R)
otherwise
F (X ⊕ S)
Transitions leakage. We consider hereafter the transitions between intermediate results that occur either on the bus or in the registers R0 , R1 , cmp and tmp. For the bus, we consider transitions that appear when memory addresses and data transit either on the same bus (here denoted S-BUS for single bus) or on different bus (an address bus denoted A-BUS and a data bus denoted D-BUS). Let adF denote the memory address of the look-up table F . In Table 3, we list the successive bus or register transitions occurring during an execution of Algorithm 2. We only list the transitions involving the sensitive data X. For all except the fifth row of Table 3, Proposition 1 and Property 1 straightforwardly imply that I(X, (IS, Y )) equals 0. For the fifth row, which corresponds to the update of Rcmp , let us study (IS, Y ) for R = a and R = a. If R equals a, then (IS, Y ) can be written (S, S⊕F (X⊕R⊕a)) where S is uniformly distributed over Fm 2 and is independent of the pair (X, R). If R differs from a, then (IS, Y ) can be written (S⊕suma−1 (X, R), S⊕suma(X, R)) that is (S 0007 , S 0007 ⊕F (X ⊕R⊕a)) after denoting S ⊕ suma−1 (X, R) by S 0007 . It can be verified that S 0007 has a uniform 0007 distribution over Fm 2 . Thus, due to Proposition 1, S is independent of (X, R). One deduces that (IS, Y ) is equivalent to a random variable (U, U ⊕F (X ⊕R⊕a)) where U is uniformly distributed over Fm 2 and independent of the pair (X, R). Then, from Property 1, we get I(X, (IS, Y )) ≤ I(X, (U, X ⊕R)) and Proposition 1 implies that I(X, (IS, Y )) equals 0. We showed in this section that the sensitive variable X is independent of every intermediate result Y and every transition IS → Y that occurs during the execution of Algorithm 2. As argued in Sect. 4.2, this implies that there is no mutual information between the sensitive variable and the instantaneous power consumption leakages. We can therefore conclude that Algorithm 2. is secure against first order DPA in the HDE model.
5
Conclusion
In this paper we have presented a new masking scheme for software SBox implementations that requires no RAM allocation. Since our method does not rely on specific SBox properties, it is generic and can thus be applied to protect any symmetric cryptosystem. We have argued that a first order DPA countermeasure
240
E. Prouff and M. Rivain
must be efficient not only in the single-mask protection mode but also in the multi-mask mode. In this mode, we have shown that our countermeasure is as efficient as the other classical generic methods and does not require RAM allocation. We have evaluated our solution within the framework recently introduced by Standaert et al. in [17, 20], proving its security against first order DPA under realistic assumptions about the attacker and the device behaviors. Finally, we have applied our method to AES and we have compared its efficiency with other secure implementations. Based on this analysis, we think that the timing and memory overhead of our countermeasure are suitable for practical implementations of the AES algorithm when a protection in multi-mask protection mode is required.
Acknowledgements We would like to thank Christophe Giraud and Emmanuelle Dottax for their fruitful comments and suggestions on this paper.
References 1. Akkar, M.-L., Giraud, C.: An Implementation of DES and AES, Secure against Some Attacks. In: Ko¸c, C ¸ .K., Naccache, D., Paar, C. (eds.) CHES 2001. LNCS, vol. 2162, pp. 309–318. Springer, Heidelberg (2001) 2. Boneh, D., DeMillo, R., Lipton, R.: On the Importance of Eliminating Errors in Cryptographic Computations. Journal of Cryptology 14(2), 101–119 (2001) 3. Brier, E., Clavier, C., Olivier, F.: Correlation Power Analysis with a Leakage Model. In: Joye, M., Quisquater, J.-J. (eds.) CHES 2004. LNCS, vol. 3156, pp. 16–29. Springer, Heidelberg (2004) 4. Chari, S., Jutla, C., Rao, J., Rohatgi, P.: Towards Sound Approaches to Counteract Power-Analysis Attacks. In: Wiener, M.J. (ed.) CRYPTO 1999. LNCS, vol. 1666, pp. 398–412. Springer, Heidelberg (1999) 5. Goli´c, J., Tymen, C.: Multiplicative Masking and Power Analysis of AES. In: Kaliski Jr., B.S., Ko¸c, C ¸ .K., Paar, C. (eds.) CHES 2002. LNCS, vol. 2523, pp. 198–212. Springer, Heidelberg (2003) 6. Goubin, L., Patarin, J.: DES and Differential Power Analysis – The Duplication Method. In: Ko¸c, C ¸ .K., Paar, C. (eds.) CHES 1999. LNCS, vol. 1717, pp. 158–172. Springer, Heidelberg (1999) 7. Gueron, S., Parzanchevsky, O., Zuk, O.: Masked Inversion in GF(2n ) Using Mixed Field Representations and its Efficient Implementation for AES. In: Embedded Cryptographic Hardware: Methodologies and Architectures, pp. 213–228. Nova Science Publishers (2004) 8. Joye, M., Paillier, P., Schoenmakers, B.: On Second-Order Differential Power Analysis. In: Rao, J.R., Sunar, B. (eds.) CHES 2005. LNCS, vol. 3659, pp. 293–308. Springer, Heidelberg (2005) 9. Joye, M., Yen, S.-M.: The Montgomery Powering Ladder. In: Kaliski Jr., B.S., Ko¸c, C ¸ .K., Paar, C. (eds.) CHES 2002. LNCS, vol. 2523, pp. 291–302. Springer, Heidelberg (2003)
A Generic Method for Secure SBox Implementation
241
10. Junod, P., Vaudenay, S.: FOX: a new family of block ciphers. In: Handschuh, H., Hasan, M.A. (eds.) SAC 2004. LNCS, vol. 3357, pp. 114–129. Springer, Heidelberg (2004) 11. Messerges, T.: Securing the AES Finalists Against Power Analysis Attacks. In: Schneier, B. (ed.) FSE 2000. LNCS, vol. 1978, pp. 150–164. Springer, Heidelberg (2001) 12. Messerges, T.: Using Second-Order Power Analysis to Attack DPA Resistant software. In: Paar, C., Ko¸c, C ¸ .K. (eds.) CHES 2000. LNCS, vol. 1965, pp. 238–251. Springer, Heidelberg (2000) 13. Oswald, E., Mangard, S., Herbst, C., Tillich, S.: Practical Second-Order DPA Attacks for Masked Smart Card Implementations of Block Ciphers. In: Pointcheval, D. (ed.) CT-RSA 2006. LNCS, vol. 3860, Springer, Heidelberg (2006) 14. Oswald, E., Mangard, S., Pramstaller, N., Rijmen, V.: A Side-Channel Analysis Resistant Description of the AES S-box. In: Gilbert, H., Handschuh, H. (eds.) FSE 2005. LNCS, vol. 3557, pp. 413–423. Springer, Heidelberg (2005) 15. Oswald, E., Schramm, K.: An Efficient Masking Scheme for AES Software Implementations. In: Song, J., Kwon, T., Yung, M. (eds.) WISA 2005. LNCS, vol. 3786, pp. 292–305. Springer, Heidelberg (2006) 16. Oswald, E.: Stefan, and N. Pramstaller. Secure and Efficient Masking of AES – A Mission Impossible? Cryptology ePrint Archive, Report 2004/134 (2004) 17. Peeters, E., Standaert, F.-X., Quisquater, J.-J.: Power and Electromagnetic Analysis: Improved Model, Consequences and Comparisons. In Integration, the VLSI Journal. Elsevier, Spring (to appear) 18. Prouff, E., Giraud, C., Aumonier, S.: Provably Secure S-Box Implementation Based on Fourier Transform. In: Goubin, L., Matsui, M. (eds.) CHES 2006. LNCS, vol. 4249, pp. 216–230. Springer, Heidelberg (2006) 19. Rudra, A., Bubey, P.K., Jutla, C.S., Kumar, V., Rao, J., Rohatgi, P.: Efficient Rijndael Encryption Implementation with Composite Field Arithmetic. In: Ko¸c, C ¸ .K., Naccache, D., Paar, C. (eds.) CHES 2001. LNCS, vol. 2162, pp. 171–184. Springer, Heidelberg (2001) 20. Standaert, F.-X., Malkin, T.G., Yung, M.: Side-Channel Resistant Ciphers: Model, Analysis and Design. Cryptology ePrint Archive, Report 2006/139 (2006) 21. Trichina, E.: Combinatorial Logic Design for AES SubByte Transformation on Masked Data. Cryptology ePrint Archive, Report 2003/236 (2003) 22. Trichina, E., Korkishko, L.: Secure and Efficient AES Software Implementation for Smart Cards. In: Lim, C.H., Yung, M. (eds.) WISA 2004. LNCS, vol. 3325, pp. 425–439. Springer, Heidelberg (2005) 23. Waddle, J., Wagner, D.: Toward Efficient Second-order Power Analysis. In: Joye, M., Quisquater, J.-J. (eds.) CHES 2004. LNCS, vol. 3156, pp. 1–15. Springer, Heidelberg (2004) 24. Yen, S.-M., Joye, M.: Checking before output may not be enough against faultbased cryptanalysis. IEEE Transactions on Computers 49(9), 967–970 (2000)
A
Application to AES
We recall that the AES SBox is composed of two parts: a non-linear function and an affine mapping. In the following we focus on the non-linear part, which will be denoted here by F . Let p(x) denotes the irreducible polynomial x8 ⊕ x4 ⊕
242
E. Prouff and M. Rivain
x3 ⊕ x ⊕ 1 ∈ F2 [x]. The function F is defined in F2 [x]/p(x) by F (a) = 0 if a = 0 and by F (a) = a−1 otherwise. At first, we applied the method depicted in Algorithm 2. for n = 8 to protect the SBox access of the AES algorithm. We implemented the solution on a classical 8051 chip running at 8 Mhz and we studied the performances of the implementation. Clearly, the timing of the resulting AES algorithm was not interesting (around 115 ms and 1, 8 KB of ROM) compared to 5ms for an implementation without countermeasures. Secondly, we represented F28 has an extension of F24 , allowing us to perform the computations in F24 instead of F28 (such a method is usually called composite field approach). We chose the two irreducible polynomials p0007 (x) = x2 + x + {e} and p00070007 (x) = x4 + x + 1 in F4 [x] and F2 [x] respectively and we denoted by map the field isomorphism which takes an element a of F2 [x]/p(x) as input and outputs the pair (ah , al ) ∈ (F2 [x]/p00070007 (x))2 corresponding to the coefficients of the linear polynomial (ah x + al ) ∈ F24 [x]/p0007 (x). Moreover, we denoted by InvF24 the function which corresponds to the inverse function over F2 [x]/(x4 + x + 1){0} and which maps 0 into itself. In the following, we depict the different steps of our computation: Algorithm 3. Inversion of a masked element a = a ⊕ ma in F28 Input: ( a = a ⊕ ma , ma ) ∈ F28 2 −1 = a−1 ⊕ m0002 , m0002 ) Output: (a a a 1. Pick up three 4-bit random md , m0002h and m0002l 2. (mh , ml ) ∈ F224 ← map(ma ) a) 3. ( ah , a000bl ) ∈ F224 ← map( 2
h ⊗ a000bl ⊕ a000bl 2 ⊕ md ⊕ a
h ⊗ ml 4. d ← a
h ⊗ {e} ⊕ a ⊕ a000bl ⊗ mh ⊕ m2h ⊗ {e} ⊕ m2l ⊕ mh ⊗ ml −1 ← Algorithm 2.(d, md , md−1 , InvF ) 5. d 24
[( ah , a000bl ) = (ah ⊕ mh , al ⊕ ml )] [d = d ⊕ md ] −1 = d−1 ⊕ m −1 ] [d d 0002 0002
[a = ah ⊕ m0002h ]
−1 ⊕ m0002 ⊕ m ⊗ d −1 ⊕ m −1 ⊗ a
0002 ← a
h ⊗ d
h ⊕ md−1 ⊗ mh 6. a h h d h 0002 0002 0002 −1 ⊕ m ⊕ a −1 ⊗ m ⊕ a
7. a000b0002 ← a000bl ⊗ d ⊕ d ⊗ m −1 ⊕ mh ⊕ ml ⊗ m l l l l
8. m0002a ← map−1 (m0002h , m0002l )
0002 , a000b0002 ) −1 ← map−1 (a 9. a 10. return
h l −1 , m0002 ) (a a
h
d
h
d−1
[a000b0002l = a0002l ⊕ m0002l ]
−1 = a−1 ⊕ m0002 ] [a a
For this version, the timing of the resulting AES algorithm are very interesting and the input and output masks can be changed at each execution of the algorithm. In the following table, we have listed the timing/memory performances of our proposal and the ones of other methods proposed in the Literature. As the performances have been measured for a particular implementation on a particular architecture, the table above does not aim at arguing that a method is better than another but aims at enlightening the main particularities (timing performances and ROM/RAM requirements) of each method.
A Generic Method for Secure SBox Implementation
243
Table 4. Comparison of several methods to protect AES against DPA Method Timings (ms) RAM (bytes) ROM (bytes) Multi-masking Straightforward implementation 5 0 1150 Re-computation Methods in the single-mask mode Re-computation Method in F28 [11] ×1.42 +256 +49% not allowed Re-computation Method in F24 [11] ×2.60 +16 +150% not allowed Re-computation Methods in the multi-mask mode Re-computation Method in F28 [11] ×50, 60 +256 +49% allowed Re-computation Method in F24 [11] ×5.86 +16 +150% allowed Secure SBox computation methods based on the composite field approach Oswald et al. [14, 16] ×5.20 0 +173% allowed This paper (Algo. 3.) ×5.30 0 +150% allowed Prouff et al. [18] ×6.40 0 +147% allowed Methods with security under discussion Oswald and Schramm [15] ×2.40 0 +200% allowed Trichina et al. [22] ×4.20 +256 +165% not allowed
The AES implementations listed above only differ in their approaches to protect the SBox access. The linear steps of the AES have been implemented in the same way and the internal sensitive data have been masked by bitwise addition of a random value. We chose to protect only rounds 1 to 3 and 8 to 10, assuming that the diffusion properties of the AES algorithm make DPA attacks impossible to mount on inner rounds 4 to 7 (this implies that the SBox calculations made in rounds 4 to 7 are performed by simply accessing the table representation of F which is stored in ROM). In the single mask mode, the re-computation method in F28 has the best timing performances but at least 256 bytes of RAM must be allocated to store the re-computed SBox table. As RAM is a sensitive resource in the area of embedded devices, we implemented a second version which follows the outlines of the composite field approach and then applies the re-computation method in F24 . Because only 16 bytes of RAM are required to store the table re-computed from the InvF24 function, the new implementation requires much less RAM than the version in F28 and the timing performances are suitable for practical applications. As the multi-mask protection mode offers better security with respect to power analysis attacks [5, 15], we tested the re-computation method in this mode. As expected, the re-computation method in F28 no longer gives full satisfaction in this mode (requiring 253 ms for one AES execution). The timing performances of the re-computation method in F24 are acceptable in the multi-mask protection mode, however they are close to (and even slightly greater than) the performances of the SBox secure calculation methods. Moreover, 16 bytes of RAM are required. The SBox secure calculation methods of Oswald et al., Prouff et al. and our proposed approach only differ in the ways of securely computing the value ˜ md and md−1 (i.e. to securely perform the fifth Step of d−1 ⊕ md−1 from d, Algorithm3.): – In [14,16], the inversion is performed by going down to F4 and its complexity approximatively equals the one of Algorithm 3. excluding the 5th Step which
244
E. Prouff and M. Rivain
is replaced by a square operation (since the inversion operation in F4 is equivalent to squaring). For our implementation of [14, 16], the number of cycles required for the fifth step is 267. – In [18], the inversion is essentially performed by computing a Fourier transform on F42 . For our implementation of [18], the number of cycles required for the fifth step is 468. – For the new solution presented here, the fifth step essentially corresponds to the computation of y −1 ⊕ md−1 for every y ∈ F42 . For our implementation, the number of cycles required by the fifth step is 270. The three methods can be used in the multi-mask protection mode without decreasing the performances of the implementation and they offer approximatively the same (good) level of security related to first order DPA attacks. The execution timings of AES implementations based on our proposal or on Oswald et al.’s method are very close and their RAM requirements are almost equal. The additional time required by the Prouff et al. method is slightly greater, however the code seems to be shorter (2844 bytes of ROM versus 2881 and 3144 bytes of ROM for our method and the Oswald et al.’s method). The methods proposed by Trichina [22] and Oswald-Shramm [15] have good timing performances but are not perfectly resistant to first order DPA attacks. – In the method of Trichina et al., a primitive element of F28 is computed and every non-zero element of F28 is expressed as a power of that element. To resolve Problem 1 for the bitwise operation, Trichina et al. use pre-computed discrete logarithm and exponentiation tables to realize the SBox operation. As argued in [15], the method has a faulty behavior when some intermediate values are null and to correct the method without introducing a flaw with respect to first order attacks seems to be an issue. – The method proposed by Oswald and Shramm offers the best timing performances. As for Algorithm 3., it is based on the composite field approach but steps 4 to 7 are replaced by a sequence of table look-ups and bitwise additions. The table look-ups have been render resistant to first order DPA attacks by applying the global look-up table method (which is recalled in Sect. 3). For example, the computation of d−1 ⊕ md−1 (Step 5) is performed by accessing the table Tinv associated to the function ((d ⊕ md ), md ) ∈ (F24 )2 0002→ (d−1 ⊕ md ) ∈ F24 . As argued in Sect. 3, the global look-up table method has a flaw with respect to first order DPA attacks. Indeed, to address the Tinv table the value (d ⊕ md ) md is manipulated, which results in a power consumption that leaks information on the sensitive value d. For instance, it can be checked that I(d, H((d ⊕ md ) md )) is not null, which results in an information leakage in the Hamming Weight model. Moreover, the input and output masks being equal, this method has also a potential flaw in the Hamming Distance model. Indeed, if a transition occurs between the index (d ⊕ md ) md and the value d−1 ⊕ md accessed in the look-up table (which is very likely in a single bus architecture), the mask md is canceled and information leaks about d and/or d−1 .
On the Security of a Popular Web Submission and Review Software (WSaR) for Cryptology Conferences Swee-Won Lo1,0002 , Raphael C.-W. Phan2 , and Bok-Min Goi3 1
School of Electrical & Electronics Engineering & Computer Science, Kyungpook National University, Sankyuk-dong, Buk-gu, Daegu 702-701, Korea [email protected] 2 Laboratoire de s´ecurit´e et de cryptographie (LASEC), Ecole Polytechnique F´ed´erale de Lausanne (EPFL), CH-1015 Lausanne, Switzerland [email protected] 3 Centre for Cryptography and Information Security (CCIS) Faculty of Engineering, Multimedia University, 63100 Cyberjaya, Malaysia [email protected]
Abstract. Most, if not all, conferences use an online system to handle paper submissions and reviews. Introduction of these systems has significantly facilitated the administration, submission and review process compared to traditional paper-based ones. However, it is crucial that these systems have strong resistance against Web attacks as they involve confidential data and privacy. Some submissions could be leading edge breakthroughs that authors do not wish to leak out and be subtly plagiarized. Also, security of the employed system will attract more submissions to conferences that use it and gives confidence of the quality that the conferences uphold. In this paper, we analyze the security of the Web-Submission-and-Review (WSaR) software - latest version 0.53 beta at the time of writing; developed by Shai Halevi from IBM Research. WSaR is currently in use by top cryptology and security-related conferences including Eurocrypt 2007 & 2008, Crypto 2007, and Asiacrypt 2007, annually sponsored by the International Association for Cryptologic Research (IACR). We present detailed analysis on WSaR’s security features. In particular, we first discuss the desirable security features that are designed into WSaR and what attacks these features defend against. Then, we discuss how some untreated security issues may lead to problems, and we show how to enhance WSaR security features to take these issues into consideration. Our results are the first known careful analysis of WSaR, or any type of online submission system for that matter. Keywords: Web submission and review software, security analysis, privacy, passwords, email, protocol. 0002
Part of work done while the author was at [email protected] University (Cyberjaya campus) and iSECURES [email protected] University of Tech (Sarawak campus).
S. Kim, M. Yung, and H.-W. Lee (Eds.): WISA 2007, LNCS 4867, pp. 245–265, 2007. c Springer-Verlag Berlin Heidelberg 2007 0002
246
1
S.-W. Lo, R.C.-W. Phan, and B.-M. Goi
Introduction
About two decades ago, authors interested to submit papers to a conference would submit via postal service or airmail. After being reviewed, papers (together with the reviews) would be sent back by similar means. Before the camera-ready deadline, authors of accepted papers would need to race against time to send their camera-ready versions over, and hope that they do not get lost in transit. This method of correspondence is not only costly, it is also time-consuming. More notably, it will be hard to trace lost or delayed papers and reviews since it all depends on the reliability of the postal system. Fast forward a few years, technology advances introduce the use of email (attachments) and facsimile. Although most email services are free of charge, this method is often limited in terms of the size of the attachment(s). Facsimile, on the other hand, can be costly although fast. Thus, the introduction of online web-based systems significantly facilitates the paper submission and review process [26], and overcomes shortfalls in the traditional paper-based system. Authors and reviewers can track their papers’ progress anytime, anywhere, as long as they have an Internet connection. In addition, the conference Chair is now capable of managing papers and reviews more effectively, as well as reacting quickly to feedback and complaints. According to a survey [26] by ALPSP1 on web submission and review systems for journals, among the 442 respondents selected at random from the ISI Web of Knowledge database, 81 per cent preferred to use web submission and review systems and 36 per cent said that they would think twice when choosing a journal without online submission for their work. Following the introduction of online submission, there was a 25 per cent increase in submission volumes and publishers reported a 30 per cent decrease in administration time. From this survey, we see that online submission and review systems are playing a significant role. Nevertheless, popular though they be, there is still an issue involved that should be a major concern among the community - how secure are the data handled by these systems? In this setting, security and privacy can be seen from two opposite perspectives. One, arguably less eminent, is the risk of malicious individuals attempting to obtain unauhorized access to leading edge research results and thus idea theft, or cause unfair dismissal of submitted papers. On the other is the case of honest paper authors desiring that the submission system maintains their privacy and secrecy of research ideas, and be able to verify to themselves and prove later to others that their submissions are properly handled by the system; at least that any errors should be detectable without unnecessary delay. Also desirable to the honest reviewer is that reviewer anonymity is upheld. In this paper, we analyze the web-submission-and-review system (known as WSaR from here onwards) [10] developed by Shai Halevi from IBM Research. WSaR is currently in use by top cryptology conferences including Eurocrypt 2007 & 2008, Crypto 2007 and Asiacrypt 2007, annually sponsored by the International Association for Cryptologic Research (IACR) [11]. See Appendix B for a longer list. 1
Association of Learned and Professional Society Publishers.
On the Security of a Popular Web Submission
2
247
WSaR and Its Security Features
WSaR is open source and is hosted at SourceForge [23]. While analyzing its HyperText Preprocessor (PHP) scripts, we found some security features that have been designed into WSaR to protect against several common Web attacks. This section will analyze how the features are added and the type of attacks the features defend against. 2.1
Password Strength
Passwords are often the first defense against intrusion. Relative to online submission and review systems, passwords are the first fortification to ensure the quality of conference proceedings because they are used to safeguard the submissions and their reviews. In order to gain initial access to the administration or review sites, WSaR computes and generates unique passwords for the conference Chair and reviewers respectively. Firstly, after the customization phase, the Chair will be given a 10-alphanumeric-character password to log in to the administration site. Once logged in, he has the option to change the default password. The same goes to the reviewers (in most security conferences, a reviewer with access to the online system is called a program committee member) - soon after the Chair grants the reviewers access to the review site, they will receive a notification email that gives them the password to log on to their own review sites. Here, they will be able to view the list of submissions (for which they have no conflict of interest), change their reviewing preferences, post their reviews, participate in paper discussions or take part in a ballot. On the other hand, throughout the submission phase, a distinct submission-ID and password will be generated for every paper. Authors will need these parameters to revise or withdraw the papers. However, they do not have the option to change the submission password. These “WSaR-generated” passwords are 10-alphanumeric-character strings. Each character is either uppercase (A-Z), lowercase (a-z), numerals (0-9), or sometimes a tilde (˜) or an underscore ( ). Figure 1 illustrates how passwords are generated in WSaR: 32 hexadecimal-character string
A random string 10753459c7300b71b5838899114
MD5 hash function
5870660775b9e7e9551c9f314c5bb651
Extracted 16 hexadecimal-character string 5870660775b9e7e9
10-alphanumeric-digit string Encoding scheme
M71c1tMvv_
Fig. 1. Password generation process for the reviewer
248
S.-W. Lo, R.C.-W. Phan, and B.-M. Goi
Fig. 2. An example email where generated password is emailed to the conference Chair
1. For the Chair and PC members: A long random string is generated using PHP functions such as uniqid(), mt_rand(), and rand(). This random string has length between 15 to 28 digits. For the submissions: A long random string is generated using the similar functions as above. This string is then appended by the submission’s title and the author’s name. 2. The hash of the resulting string is then computed via PHP’s MD5 function. 3. The first 16 hexadecimal digits of the resulted message digest are extracted. 4. A custom WSaR encoding function is used to compress these extracted digits into a 10-digit alphanumeric string. 5. The resulting string (which is the password) will be emailed to the user (see Figure 2). Note that an alphanumeric character corresponds to 6-bit entropy, so the entire 10-alphanumeric-character string requires an exhaustive effort of 260 to brute force. Therefore, the passwords generated by WSaR are deemed to be secure. 2.2
Password Storage for Conference Chair and PC Members
No matter where passwords are stored on the online system server, they should be stored in encrypted or hashed form, similar to multi-user operating systems like Unix, Linux etc. The most popular way to seal passwords in a database is via password hashing. There are currently two most commonly used hash functions, namely the Secure Hash Algorithm-1 (SHA-1) and the Message Digest 5 (MD5). MD5 produces a hash that is 128 bits long (equivalent to 32 hexadecimal characters) while SHA-1 computes a hash that is 160 bits long (equivalent to 40 hexadecimal characters). Among these two, MD5 is the more commonly used hash function to safeguard passwords as well as to ensure message or software integrity. Likewise, this function is also employed in WSaR when it comes to storing passwords.
On the Security of a Popular Web Submission
A random string
249
32-hexadecimal-character string MD5 hash function
1014245b61e33a37011350915937
2fa0d14fca37b6dce99aecc44521f298
22-alphanumeric-digit salt string BIY05o7gnK3vvHRaepMt1R
Compression function
salt string email password [email protected]_
32-hexadecimal-character string MD5 hash function
2f857cace861d81979f78854399058eb
Database
Fig. 3. Generation of salt string and password storing process
Here, we look at how WSaR protects the Chair and PC members’ passwords stored in the system database (see Figure 3): 1. Upon customization, a random string is generated using PHP functions such as uniqid(), mt_rand(), and rand(). The string’s message digest is computed and the digest is then compressed using a custom WSaR function into a 22-alphanumeric-character salt string. This salt string is saved and is constant throughout the entire conference. 2. The user’s email address and password are retrieved. 3. The salt string is then appended by the user’s email address and password, and it is fed to the MD5 hash function. 4. The resulting digest is stored into the database table (see Figure 4).
Fig. 4. The digests are stored in the database instead of the passwords
In this case, users’ passwords are not stored in the clear in the database, and it is infeasible for an attacker to reverse on the hash values to retrieve the preimages (passwords). In Section 3.5, we will discuss an issue with the storage of submissions’ passwords. Furthermore, this technique of appending a salt string to the email address and password before hashing it increases the difficulty in cracking the passwords using brute-force attack even if an attacker has access to the hash table. In Section 3.4, we discuss how using different salt strings (instead of a constant one) can increase the difficulty in cracking users’ passwords.
250
2.3
S.-W. Lo, R.C.-W. Phan, and B.-M. Goi
Input Sanitization
As discussed in [8], SQL injection is considered one of the most dangerous threats to Web applications because it allows an attacker to connect to the back-end database and extract any data as he wants. An example of an SQL injection attack is the log in form where a user enters his username and password to be authorized, and the server will retrieve the user’s ID and credit card number, for example. Generally, the SQL query is shown in Table 1. Table 1. SQL statement to retrieve user’s ID and credit card number SELECT FROM WHERE AND
ID, CREDIT_NUM users username = ‘$username’ password = ‘$password’
Assuming an attacker enters “Jane” in the username field and provides the string “anything’ OR ‘a’=‘a” in the password field. The SQL query would become: Table 2. SQL statement as “modified” by the attacker SELECT FROM WHERE AND
ID, CREDIT_NUM users username = ‘Jane’ password = ‘anything’ OR ‘a’=‘a’
The ‘a’=‘a’ part is always true regardless of what the first part of the query contains, thus the attacker would be able to trick the application to obtain database data that is not supposed to be returned by the application. Successful SQL injection attack will also result in authentication bypass and database modification [15]. The “one rule” to defend against SQL injection (as well as cross-site scripting, buffer overflows etc.) is input sanitization. If this is done, the Web application will be 80 per cent more secure. There are two commonly used ways to validate input: (1) strip off any undesirable characters (such as meta-characters) and (2) check input data for expected data type [12]. Both ways are implemented in WSaR. We note that the potential SQL injection characters include: (”*;&<>/’ˆ) [7]. In WSaR, a possible exploitation of special characters for an SQL Injection attack is in the “Submission/Revision Receipt” page where, as an example, the receipt page’s URL for submission A with password ‘ABC’ will be“http://localhost/receipt.php? subId=A&subPwd=ABC”. In this case, query to the database will be as in Table 3 (as an example).
On the Security of a Popular Web Submission
251
Table 3. SQL statement to retrieve submission with subId=A and subPwd=ABC SELECT FROM WHERE AND
title, authors, abstract submissions subId=‘A’ subPwd=‘ABC’
Firstly, WSaR uses the my_addslashes() function in PHP to remove undesirable characters. If an attacker happens to be one of the submitters to the conference, she would know the pattern of the receipt page’s URL. Now, she is interested in finding out the paper submitted by her rival, so she launches an SQL Injection attack on the receipt page by changing the URL to “http://localhost/ receipt.php?subId=1&subPwd=1’% 20OR% 20‘1’=‘1, where %20 represents a space in its HTML entity. This time, the query will be as follows: Table 4. SQL query for an SQL Injection attack SELECT FROM WHERE AND
title, authors, abstract submissions subId=‘1’ subPwd=‘1’ OR ‘1’=‘1’
However, since all special characters are removed by the my_addslashes() function - single quotes are ignored using backslashes (commented out), the ‘1’=‘1’ part no longer makes sense. Thus, the attacker will receive a generic error message as shown in the screen shot in Figure 5. If the software does not sanitize user’s input in all PHP scripts (i.e. all my_addslashes() functions are removed from every WSaR scripts), the URL
Fig. 5. Attacker receives an error message
252
S.-W. Lo, R.C.-W. Phan, and B.-M. Goi
Fig. 6. An attacker can click on the Revision or Withdrawal link to revise or withdraw the submission
constructed by the attacker will return submission 1’s receipt page and from there, she can redirect to the revision page and revise her rival’s paper or even withdraw it from the conference without her rival knowing it (see Figure 6). Secondly, developers should specify the type of input that is expected for certain form fields, and that the unexpected input will be removed. As an example, a form field that requests for user’s phone number should accept only numbers as input. WSaR always has a specific input type that is expected for the submission ID - it processes only integers for the submission ID field. In spite of that, developers can make use of regular expressions to tell the program the type of pattern in the text that it should look for [19]. If the data submitted does not match the regular expression, it will be ignored [7] or error messages will be generated. Both the above mentioned methods are employed in WSaR. Therefore, this system is not categorized as one of the 60 per cent of Web applications that is vulnerable to SQL injection (or other attacks due to invalidated input) [24]. 2.4
Resistance to Bypass of Access Control Checks Through Forced Browsing
Forced browsing refers to a technique used by attackers to access to resources that are not referenced, but are nevertheless accessible [25]. Access control checks are normally performed after a user gets authenticated and it monitors what authorized users are allowed to do. As an example, if a reviewer is blocked from reviewing certain submissions due to conflict of interest, he should not be able to bypass the access control checks by requesting the review form of that submission directly in the URL.
On the Security of a Popular Web Submission
253
Fig. 7. Review form for submission 1
Taking a look at the URL of a review form; to review submission A, reviewer will access the review form at the URL “http://localhost/review/ review.php?subId=A” (see Figure 7). Assume that the insider attacker is one of the reviewers in a conference and she is blocked from reviewing submission 1 since she is one of the authors of that submission. In order to make sure that her submission is accepted to the conference, the attacker needs good reviews for her paper. Thus, she tries to change submission A’s review form URL to which she has access, from “http://localhost/review/ review.php?subId=A” to “http://localhost/ review/review.php?subId=1” since she is prohibited to access the link directly from her review page. If the system then displays the review form for submission 1, the attacker has bypassed the access control checks through forced browsing. Fortunately, the attempt to bypass WSaR’s access control check is forbidden in the review form page, as well as the voting page and the reviewers’ discussion forum. Whenever a reviewer attempts to access a blocked submission by directly specifying the submission-ID in the URL, WSaR will firstly perform an authorization check in the reviewer table; if the reviewer is blocked from that submission, it will display an error message indicating that the submission is not found, or the reviewer has a conflict as shown in Figure 8.
Fig. 8. WSaR prevents broken access control
3
Security Issues and Enhancements
In addition to the security features already designed into WSaR discussed in Section 2, we have also discovered some security issues not treated in WSaR and in this section, we discuss them in detail and then describe ways to enhance WSaR security by taking them into consideration.
254
S.-W. Lo, R.C.-W. Phan, and B.-M. Goi
For any security issue, we will discuss its implications from the two opposing perspectives motivated in Section 1. We also highlight whether exploitation of issues can be traceable or uniquely pointing the finger to the culprit. This has devastating consequences if the attacker cannot be traced since it means even a curious (if not malicious) researcher from the scientific community could have mounted the attack without any counter-incentives i.e. no adverse effects on his reputation. Furthermore, issues that lead to attacks for which the culprit cannot be unambiguously accused will cause disputes for which a malicious attacker could deny his involvement or an honest user be unfairly thought by peers to have mounted an attack. 3.1
Browser Caching
Browser caching is categorized as one of the critical areas in OWASP’s Top Ten projects under “Broken Authentication and Session Management” [25]. In WSaR, the submission-ID and password are sent by the HTTP GET method. Information sent in such way will be displayed in the browser’s URL [18] and it is extremely undesirable (see Figure 9).
Fig. 9. Submission-ID and password are displayed in the page’s URL
This means the submission-ID and password are part of the URL. As has been highlighted in [25], authentication and session data should not be submitted as part of a GET to prevent someone malicious from using the “Back” button in an authorized user’s browser to backup the page and hence obtain the password from the URL. Alternatively, even if a browser window is closed, the URL info can also be obtained from the browser’s cache and history. Recall that browsers cache most of the contents of frequently visited pages so that the pages will load faster the next time they are visited, including images, sounds, cookies, web pages and their URLs. Information stored in browsers’ cache is not encrypted [1] and it can be obtained by anyone who accesses the computer. Both Netscape Navigator and Mozilla Firefox, by default require users to clear the browser’s cache (and disk cache - also known as “Temporary Internet Files”) manually. On the other hand, Internet Explorer will clear the browser’s cache but not the “Temporary Internet Files”, once the browser is restarted. In addition, previously visited URLs are typically stored in the browser’s history; for instance the latest versions (at the time of writing) of Netscape Navigator and Mozilla Firefox by default will only clear browser’s history every nine days, while the default for Internet Explorer is 20 days.
On the Security of a Popular Web Submission
255
Fig. 10. Internet Explorer’s temporary internet files that stores the receipt page’s URL
Figure 10 shows the receipt URLs stored in the “Temporary Internet Files” folder. In spite of that, if the user does not clear the browser’s cache or history after he uses the computer, the URLs will also be displayed in the browser’s history pane as shown in Figure 11.
Fig. 11. Submission ID and password are exposed to the attacker
This threat can be launched by any individual having access to a common machine previously utilized by a WSaR user, and upon retrieving the login information (ID and password), he can login as the WSaR victimized user. This attack is non-traceable in the sense that the attacker cannot be later pinpointed, so he can mount the attack without any risk of jeopardizing his reputation. Thus, the counter-incentive is non-existent that it will indeed be tempting for any individual to abuse this issue to the victim’s disadvantage. Browser caching can be prevented by submitting the submission-ID and password as part of a HTTP POST method [25] instead. If the POST method is used, we can employ the PHP’s predefined variable “$_POST” to retrieve the values entered in the submission-ID and password fields respectively. In this case, both submission-ID and password will not be displayed in the page’s URL. Other than this, WSaR users are advised to clear the browser’s cache before they leave the computer. In Netscape Navigator, users would have to clear both the memory and disk cache; in Mozilla Firefox, users should check all fields when clearing their private data [4]; in Internet Explorer, users should clear the history and all temporary internet files [21]. This way, users can ensure the security of their private information. 3.2
Constant Salt String for Reviewer and Chair Passwords
In Section 2.2, we discussed how WSaR uses a salt string appended to the reviewer’s email address and password to build a stronger defense in securing his
256
S.-W. Lo, R.C.-W. Phan, and B.-M. Goi
passwords. However, this salt string is constant throughout the entire conference for any party, thus all users of the same conference will have the same salt appended to their password. So what differentiates each user is just the email address (which is typically public information) and his password. This indicates that the salt string does not really provide better strength against insider attackers (users of the same conference) than conventional password-based authentication systems. Therefore, we recommend the use of different salt strings for different users. These salt strings will be stored in a single PHP script upon customization of WSaR, and they will be labeled in the sense that “SALT_2” will be used for PC member with the ID of 2 and so on. In this case, since the salt strings will be random and of different lengths, the attacker would have a much harder time trying to guess the exact salt that is appended to a specific user’s email address and password. Again, this threat is non-traceable as it allows a malicious insider individual to brute-force the password, upon which he can login as the victim without any evidence pointing back to him. 3.3
Storage of Submission Passwords
Throughout our analysis, we discovered that although WSaR stores the hashes of reviewers’ passwords, it does not do the same when it comes to storing submissions’ passwords (see Figure 12). For someone who manages to gain access to the database, he has the ID and password for every submission as well. We recommend that the submission passwords to be salted and hashed as well to secure them so that even if a hacker manages to penetrate the database’s security, he would face the prospect of a potentially expensive search for the exact password.
Fig. 12. Submissions’ passwords are stored in the clear in the database
3.4
Password Policy and Strength Checking
Using a good password by every user is vital to defending a system. The components of a good password should be at least eight characters long, should not consist of dictionary words (would be vulnerable to dictionary attack otherwise), should never be the same as the user’s log in name, should not consist of any item that is easily identified with the user, and have at least three of the following elements [5]: – One or more uppercase letters (A-Z) – One or more lowercase letters (a-z)
On the Security of a Popular Web Submission
257
– One or more numerals (0-9) – One or more special characters or punctuation marks However, it is worth noting that some systems do not permit the use of certain special characters in user’s password string. Although the passwords supplied by WSaR fulfill all the mentioned requirements, users might find the password hard to remember and opt to change it. Therefore, we analysed WSaR’s password change mechanism. WSaR does not have any password policies imposed to ensure the strength of new passwords, thus a careless user may happen to employ a password which is easily cracked by any password-cracking tools, without being reminded not to do so. Hence, we suggest that WSaR should force its users to adopt passwords that cannot be easily cracked. Google mail [9] uses a scale to show its users the strength of their passwords with parameters such as “Too short”, “Weak”, “Fair”, “Good” or “Strong”. This is a feature that could be added to WSaR.
Fig. 13. Google mail’s password strength evaluation
However, careless users tend to ignore the password strength evaluation and proceed to employing the new password. Consequently, the system’s security could still be breached. Having said that, WSaR’s password changing mechanism should perform an assessment on the new password - whether it is too short or it contains dictionary words or even a row of letters from a standard keyboard layout (e.g., ertyui) [14]. If a user’s new password happens to be a weak password, a pop up window should be issued to require him to re-enter a new password. Besides that, a password policy should also be implemented to require frequent change of passwords. If an attacker uses brute-force attack to crack a password, it is possible that he would be taking a long time (depending on the length of the password, speed of both the machine and network connection) to complete the attack [14]. In this case, if the user changes his password frequently, he could avoid his password from being cracked via brute-force attack. 3.5
Absence of File Integrity and Binding
Whenever we download any files or software, the first thing we would want to do is to check the file’s integrity. Presently, the most popular method used to verify a file’s integrity is via the use of the MD5 function. If the file content is modified, it can be easily detected by re-computing its hash value. In an online submission and review system, calculating a submission’s message digest is important both to ensure message integrity and to bind the author to the submitted paper due to absence of face-to-face communication. We note that WSaR does not have this feature, i.e. upon submission the file’s message digest is not computed and thus not supplied to both the author and the
258
S.-W. Lo, R.C.-W. Phan, and B.-M. Goi
Chair. Therefore, considering the case of malicious users, if an author happens to submit a not-so-perfect paper in order to meet the submission deadline, he can later deny that he submitted that version of the paper, then claim that the file is corrupted and request for a second-time submission, thus gaining advantage over other authors. More devastatingly, an honest author could have submitted a proper paper but the file became corrupted by the server machine; in which case it is desirable to be able to unambiguously prove that the corrupted version is not the submitted file, or at least be able to check during submission that it was properly received. This avoids ambiguity when a Chair notes during review phase that a file is corrupted and is unsure if it was intentionally submitted in that form in order to buy authors some time, or if it was indeed submitted properly but was corrupted in transit. Indeed, this issue is often not so much to guard against malicious authors but rather so as to allow a scientifically honest author in this situation to be able to prove his innocence beyond any doubt. To counter this issue, we recommend that authors be presented with a 128bit hash of his submission file and that this hash value is kept as a record for the Chair. Since WSaR is developed in PHP, we can apply the “md5_file()” function where it calculates the message digest of the given file [18]. Firstly, a new column has to be created in the “submissions” table to record the message digest by adding a “msgDigest varchar(255) BINARY NOT NULL” statement in the “create_table( )” function, which can be found in WSaR’s database.php script. Subsequently, we added a statement to calculate the file’s message digest in the act-submit.php and act-revise.php scripts, which are used in WSaR to process all parameters entered in the submission and revision form respectively. The resulted digest is inserted into the database under the “msgDigest” column and authors will be redirected to the receipt page. An email will also be generated to the author, and carbon-copied to the Chair with all submission details listed (including the message digest). Now, we can be assured of the file’s integrity as well as binding the author to his submission. The screenshots are shown as Figure 14 and Figure 15 in Appendix A for further illustration.
4
Protocol Sketch for Password Distribution Via Email
In general (not specific to WSaR), passwords for reviewers are commonly sent via emails to the reviewers’ email addresses. This is indeed the most practical way to distribute passwords, although it is known that email formats by default (and this is the setting that users commonly use) do not provide any form of confidentiality nor authenticity, unless explicit email clients or plug-ins like PGP are applied. We take what we view as a concrete step towards securing online systems by motivating here and sketching the basic idea of our ongoing work: a proposal for an email-based password distribution protocol for security conferences. Having this kind of protocol in place will prevent reviewer passwords from being easily
On the Security of a Popular Web Submission
259
compromised through attacks mounted not on the system itself but on how this password is distributed. Indeed, it is well known that the study of passwordbased key exchange protocols is a long-standing research topic in cryptology, thus we should avoid having security conferences using in practice the emailbased password distribution protocols that are not securely designed. The setting for an email ID-based password distribution protocol is different from those in typical key exchange protocols [3]. The protocol involves two parties, the program chair C and the reviewer R. Rather than requiring any public-key infrastructure (PKI), only a trusted web site bulletin board is used, for instance the IACR website (http://www.iacr.org), where URLs for different IACR conferences or workshops are hosted or mirrored. On this website is listed the email address IDC of the program chair. Meanwhile, it is common that the chair invites program committee members (the reviewers) who are experts in the field and for whom the chair knows the authentic email addresses IDR either himself, or for which he can ask from other experts through some out of band mechanism. Alternatively, the email addresses can be obtained from the IACR membership database. Thus, when a reviewer R receives an email from the chair C, it can check to be certain that the email came from the chair; vice versa a chair knows if an email came from a particular reviewer. This provides source authentication without resorting to any PKI. Then a general sketch of this kind of protocol proceeds as follows: 1. C generates an ID-based public and private key pair based on his IDC . Denote these as pkC and skC . Then C computes the message sigSKC (IDC , IDR ,Invite) where sigSK denotes signing under a person’s private key SK, and Invite contains a one-way (e.g. hashed) representation of a typical invitation email stating the conference name etc. 2. C sends m1 = sigSKC (IDC , IDR , Invite) to R. 3. R generates an ID-based public and private key pair based on his IDR . Denote these as pkR and skR . Then R computes the message sigSKR (IDR , IDC , Accept). 4. R sends m2 = sigSKR (IDR , IDC , Accept) to C. 5. C generates a password pwdR for R. 6. C sends m3 = 0002sigSKC (IDC , IDR , m1 , m2 , EncP KR (pwdR )), EncP KR (pwdR )0003 to R, where EncP K (·) denotes encryption under a person’s public key P K. 7. R obtains pwdR by decrypting EncP KR (pwdR ). In fact, this can be a concrete step to a long-term setting where one submission and review system is used for all conferences that use WSaR, so only a one-time setup cost e.g. the ID-based password distribution protocol as above, is incurred for new users, while existing users can update their passwords online through the system at any time; and new membership in conference program committees of existing users only require the program chair to email to a PC member R asking him to update his password himself rather than having to email newly generated passwords every time R is involved in a new conference program commitee. Indeed, this centralization is possible since conferences using WSaR are typically
260
S.-W. Lo, R.C.-W. Phan, and B.-M. Goi
hosted on a central machine e.g. at http://s1.iacr.org, unlike some other submission software that need to be locally set up. Furthermore, conferencesusing WSaR commonly involve program committee members who are involved in multiple conferences so the setup cost will be amortized over time and this centralization makes sense compared to treating each conference system separately. We emphasize here that the above is a sketch of a protocol design that we are currently working on, whose formal security we are in the process of proving. This should therefore preclude the sprouting of future papers that propose informal “breaks” on the above enumerated steps. The above sketch should only be taken as a basic template and taken for the general idea that it is sketching and nothing more. We do welcome comments for which will be acknowledged in future work, or collaborations in this direction.
5
Concluding Remarks
In this paper, we have seen that WSaR is built with a strong defense against a number of known attacks in the Web. We have also discovered and discussed the absence of several features that could jeopardise the security of WSaR users and we proposed suggestions to further enhance WSaR’s security that address these issues. As a side remark, we recommend that besides strengthening the defense of the software itself e.g. WSaR, Web administrators should secure the Web server as well as the back-end database and constantly monitor the activities going on in the Web server to detect any malicious behaviour. In addition, as a further step to securing these types of systems, we have also motivated the design of an email-based password distribution protocol for WSaR users and argued that together with such a design, there are advantages in centralizing conferences that use WSaR.
Acknowledgement The first author thanks Shai Halevi for encouragement during the initial stages of this research, for patiently answering queries about WSaR and for detailed comments on an early version of this paper. The second author thanks Thomas Baign`eres for stimulating discussions on another popular submission and review system iChair. Nous avons eu un temps merveilleux pendant la pause de caf`e, ou bien? We thank an anonymous referee for motivational comments especially on the need to make clear the distinction between pro- and counter-incentives for an attacker; and for acknowledging the fun of this research work :-)
References 1. AICT Security - Empty your Cache. Available online at https://www.ualberta.ca/AICT/Security/BrowserCache.html#private 2. Archer, T.: Are Hash Codes Unique? Available online at http://blogs.msdn.com/tomarcher/archive/2006/05/10/594204.aspx
On the Security of a Popular Web Submission
261
3. Bellare, M., Rogaway, P.: Entity Authentication and Key Distribution. In: Stinson, D.R. (ed.) CRYPTO 1993. LNCS, vol. 773, pp. 232–249. Springer, Heidelberg (1994) 4. CIBC - Clear Your Browser’s Cache. Available online at http://www.cibc.com/ca/legal/clear-browsers-cache.html 5. Conklin, W.A., White, G.B., Cothren, C., Williams, D., Davis, R.L.: Principles of Computer Security: Security+ TM and Beyond. McGraw-Hill, New York (2005) 6. EasyChair Conference System. Available online at http://www.easychair.org/ 7. Foster, J.C.: Defense Tactics for SQL Injection Attacks. Available online at http://searchappsecurity.techtarget.com/tip/0,289483,sid92 gci1219912, 00.html 8. Fyre, C.: One Simple Rule to Make your Web Apps more Secure (2006), Available online at http://searchappsecurity.techtarget.com/qna/0,289202, sid92 gci1225425,gci1225425,00.html00.html 9. Google Mail. Available online at http://gmail.google.com 10. Halevi, S.: Web Submission and Review Software. Available online at http://theory.csail.mit.edu/∼ shaih/websubrev 11. IACR Conferences. Available online at http://www.iacr.org/conferences/ 12. McClure, S., Shah, S., Shah, S.: Web Hacking: Attacks and Defense. AddisonWesley, Reading (2003) 13. Microsoft Corporation. Microsoft’s Conference Management Toolkit. Available online at http://msrcmt.research.microsoft.com/cmt/ 14. Password Cracking: Information from Answers.com (2006), Available online at http://www.answers.com/topic/password-cracking 15. Peikari, C., Chuvakin, A.: Security Warrior. O’Reilly (2004) 16. Phan, R.C.-W., Goi, B.-M.: Flaw in IEEE Trans on Consumer Electronics Online Submission System. In: Dawson, E., Vaudenay, S. (eds.) Mycrypt 2005. LNCS, vol. 3715, Springer, Heidelberg (2005) 17. Phan, R.C.-W., Ling, H.-C.: On the Insecurity of the Microsoft Research Conference Management Tool (MSRCMT) System. In: CITA 2005. Proceedings of International Conference on IT in Asia, pp. 75–79 (2005) Also presented at the rump session of Asiacrypt 2004, Jeju Island, Korea 18. PHP Manual. Full version available online at http://www.php.net/manual/en/ 19. Regular Expressions (2006), Available online at http://searchappsecurity. techtarget.com/sDefinition/0,290660,sid92 gci517740,00.html 20. ScholarOne, Inc. Manuscript Central: About Manuscript Central. Available online at http://www.scholarone.com/products manuscriptcentral aboutMC.shtml 21. Security Information Clearing Browser Cache and History. Available online at http://www.hlasset.com/files/Clearing Cache History.pdf 22. SoftConf.com - Software for Conferences. Available online at http://www.softconf.com/index.html 23. SourceForge.net: Web Submission and Review Software. Available online at http://sourceforge.net/projects/websubrev 24. What is SQL Injection? (2006), Available online at http://searchappsecurity. techtarget.com/sDefinition/0,290660,sid92 gci1003024,00.html 25. The Ten Most Critical Web Application Security Vulnerabilities (2004) Available online at http://osdn.dl.sourceforge.net/sourceforge/owasp/OWASPTop Ten2004.pdf 26. Ware, M.: Online Submission and Peer-Review System (2005) Available online at www.zen34802.zen.co.uk/Learned Publishing offprint.pdf
262
A
S.-W. Lo, R.C.-W. Phan, and B.-M. Goi
Storage and Display of Submissions’ Digests
We present below the screen shots of the “submissions” table, and the submission/revision receipt after the calculation of submission’s digest is incorporated.
Fig. 14. Digest of submission is stored in the database for Chair’s reference
Fig. 15. Digest of submission is presented in the receipt for the author
B
Conferences That Have Used or Are Using WSaR
In reverse chronological order: 1. EUROCRYPT 2008: 27th Annual International Conference on the Theory and Applications of Cryptographic Techniques 2. CT-RSA 2008: RSA Conference 2007, Cryptographers’ Track 3. LATIN 2008: 8th Latin American Theoretical Informatics 4. TCC 2008: 5th Theory of Cryptography Conference 5. PKC 2008: 11th International Conference on Theory and Practice in PublicKey Cryptography 6. ASIACRYPT 2007: 13th Annual International Conference on the Theory and Application of Cryptology and Information Security 7. ISC 2007: 10th Information Security Conference 8. CRYPTO 2007: 27th Annual International Cryptology Conference 9. ICALP 2007: Track C of the 34th International Colloquium on Automata, Languages and Programming
On the Security of a Popular Web Submission
263
10. GOCP 2007: 1st International Workshop on Group-Oriented Cryptographic Protocols 11. ACNS 2007: 5th International Conference on Applied Cryptography and Network Security 12. EUROCRYPT 2007: 26th Annual International Conference on the Theory and Applications of Cryptographic Techniques 13. USEC 2007: Usable Security Workshop 14. TCC 2007: 4th Theory of Cryptography Conference 15. CT-RSA 2007: RSA Conference 2007, Cryptographers’ Track 16. HVC 2006: 2nd Annual Haifa Verification Conference 17. PKC 2006: 9th International Conference on Theory and Practice of PublicKey Cryptography
C
Related Work
Many online paper submission and review systems are emerging. For the context of cryptology and information security, the predecessor to WSaR is the collection of PHP/Perl scripts written progressively by Chanathip Namprempre, Andre Adelsbach, Andrew Clark and the Computer Security and Industrial Cryptography (COSIC) group at Katholieke Universiteit Leuven. These scripts were used for almost all mainstream cryptology and information security conferences till 2006 when building on ideas in these scripts, two successor systems were developed independently: WSaR by Shai Halevi of IBM Research and iChair by Thomas Baign`eres and Matthieu Finiasz of LASEC at EPFL. These two systems are now used by almost all mainstream cryptology and information security conferences. A few other major submission and review systems used in other fields deserve some mention here: 1. Manuscript Central Manuscript Central [20], developed by ScholarOne, Inc., is the online submission and peer review system used to handle manuscript submissions to journals. It manages over 44,000 submissions per month and its comprehensive and user-friendly features result in reports that most journals using Manuscript Central achieve gains in submissions of 20 to 40 per cent per annum. This system is currently used by most IEEE and Association of Computing Machinery (ACM) journals. Manuscript Central is a fullydeveloped software with 24-hour support on weekdays. By contacting the sales representative, one would be able to obtain and understand the system’s functionalities, features, pricing and get the system running in two weeks’ time. 2. Microsoft Research Conference Management Tool (MSRCMT) Firstly developed for ACM SIGKDD 1999, the MSRCMT [13] is an academic conference management service sponsored by Microsoft Research. Surajit Chaudhuri, a Research Area Manager at Microsoft Research is the architect of MSRCMT. Since the year 1999, this system has been used in over 500
264
S.-W. Lo, R.C.-W. Phan, and B.-M. Goi
conferences, among them are the International Conference on Security of Information and Networks (SIN 2007) and the ACM SIGCOMM 2007 Data Communication Festival. Similar to Manuscript Central, the MSRCMT is also a fully-developed system. It is free and hosted by Microsoft Research, but with limited support since it is developed and managed by a small team. 3. EasyChair Developed in the year 2002 by Andrei Voronkov, a Professor from the University of Manchester, EasyChair [6] is used by over 600 conferences in year 2007 alone. EasyChair is free and it is currently hosted by University of Manchester’s Computer Science Department. EasyChair is capable of supporting two models: (1) the standard model for conferences having one program committee and (2) the multi-track version for conferences having multiple tracks that have their own program committee. There are a number of ACM and IEEE conferences/workshops that have used or are using EasyChair. Among them are the 8th IEEE/ACM International Conference on Grid Computing (Grid 2007), the 2nd ACM Workshop on Scalable Trusted Computing (STC’07) and the 20th IEEE Computer Security Foundation Symposium (CSF 20). 4. START V2 ConferenceManager START V2 [22], written by Rich Gerber, is a product from SoftConf.com. Apart from EasyChair, several IEEE and ACM conferences have, or are employing START V2 as the submission and peer review system since the year 2002. The IEEE Symposium on Intelligence and Security Informatics (ISI 2007), the 2007 IEEE Symposium on Security and Privacy, the 5th ACM Workshop on Recurring Malcode (WORM 2007) and the 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007) are among the many IEEE and ACM conferences using START V2. Users would need to contact the developer to order the system; the pricing depends on whether one needs the system to be hosted at softconf.com, and on the different types of licensing arrangements. This system is constantly improving based on the users’ feedback. It is worth to note two earlier work related to the security of online submission and review systems, although our work here is the first detailed analysis of this type of system. Phan and Goi [16] pointed out the lack of privacy in a system used by an IEEE Transactions where the URL of pages that disclose paper information and that allow paper revision for a particular submitted paper, differ from pages of other papers by an ID counter. Thus if the URL to revise author A’s submitted paper is uniquely identified by ID 100, then A can also view the revision page for the paper submitted right after (resp. before) his, which he knows will be uniquely identified by ID 101 (resp. 99). Correspondence with the administrator of the system obtained the response that this was not a significant issue. Phan and Ling [17] discovered by accident when submitting their paper to a conference using the Microsoft Research Conference Management Tool (MSRCMT) that the system automatically creates an account for co-authors of a
On the Security of a Popular Web Submission
265
corresponding author who submitted a paper, where email addresses of these co-authors are used as login usernames and the numeric 0 is used as the default password. This applied for any co-author(s) of any paper. Correspondence with developers of MSRCMT obtained the response that this was an exercise of regression testing, though the flaw was present for some months in the actual online system used by several conferences. As of 2005 it was verified that both the above systems no longer exhibit those issues.
Authorization Constraints Specification of RBAC Lilong Han, Qingtan Liu, and Zongkai Yang Department of Information and Technology&Engineering Research Center on Education Information Technology, Central China Normal Uninversity, 430079,Wuhan, China {Lilong Han,Qingtan Liu,Zongkai Yang,hanlilong2001}@yahoo.com.cn
Abstract. Constraints are an important aspect of role-based access control (RBAC) and are often regarded as one of the principle motivations behind RBAC. Although the importance of the constraints in RBAC has been recognized for a long time, they have not received much attention. In this article, we introduce an intuitive formal language for specifying role-based authorization constraints named RCL2000 including its basic elements, syntax and semantics. We show how previously identified role-based authorization constraints such as separation of duty (SOD) can be expressed in this language, and that there are other significant SOD properties that have not been previously identified in the literature. Our work indicates that there are many alternate formulations of even the simplest SOD properties, with varying degree of flexibility and assurance. So this language provides us a rigorous foundation for systematic study of role-based authorization constraints. Keywords: RBAC, Constraints, RCL2000, SOD, DSOD.
1 Introduction Role-based access control (RBAC) has emerged as a widely accepted alternative to classical discretionary and mandatory access controls. RBAC regulates the access of users to information and system resources on the basis of activities that users need to execute in the system, and requires the identification of roles in the system. Since roles in an organization are relatively persistent with respect to user turnover and task reassignment, RBAC provides a powerful mechanism for reducing the complexity, cost, and potential for error in assigning permissions to users within the organization. Because roles within an organization typically have overlapping permissions, RBAC models include features to establish role hierarchies, where a given role can include all of the permissions of another role. Another fundamental aspect of RBAC is authorization constraints (also simply called constraints). Although the importance of constraints in RBAC has been recognized for a long time [1], they have not received much attention in the research literature, while role hierarchies have been practiced and discussed at considerable length. In this article, our focus is on constraint specifications, that is, on how constraints can be expressed, whether in natural languages, such as English, or in more formal languages. Natural language specification has the advantage of ease of comprehension by human beings, but may be prone to ambiguities, and the specifications do not lend themselves to the analysis of properties of the set of constraints. S. Kim, M. Yung, and H.-W. Lee (Eds.): WISA 2007, LNCS 4867, pp. 266–276, 2007. © Springer-Verlag Berlin Heidelberg 2007
Authorization Constraints Specification of RBAC
267
To specify these constraints we introduce the specification language RCL2000 (for Role-Based Constraints Language 2000, pronounced Ríckle2000) which is the specification language for role-based authorization constraints [2]. In this article, we describe its basic elements, syntax, and the formal foundation of RCL2000. RCL2000 is a substantial generalization of RSL99 [Ahn and Sandhu 1999], which is the earlier version of RCL2000. It encompasses obligation constraints in addition to the usual separation of duty and prohibition constraints.
2 Role-Based Constraints Language (RCL 2000) RCL2000 is defined in context of the well-known family of models for RBAC of Sandhu et al. This model has become a widely cited authoritative reference and is the basis of a standard currently under development by the National Institute of Standards and Technology. Here we use a slightly augmented form of the model illustrated in Figure 1. We decompose permissions into operations and objects to enable formulation of certain forms of constraints. Also in Figure 1 we drop the administrative roles since they are not germane to RCL2000.
Fig. 1. Basic elements and system functions
Constraints are an important aspect of role-based access control and are a powerful mechanism for laying out higher-level organizational policy. The importance of flexible constraints to support emerging applications has been recently discussed by many scholars. Consequently, the specification of constraints needs to be considered. To date, this topic has not received much formal attention in the context of role-based access control. A notable exception is the work of Giuri and Iglio who defined a formal model for constraints on role-activation. RCL2000 considers all aspects of role-based constraints, not just those applying to role activation. RCL2000 goes beyond separation of duty to include obligation constraints such as those used in the constructions of Sandhu and Osborn et al. for simulating mandatory and discretionary access controls in RBAC.
268
L. Han, Q. Liu, and Z. Yang
One of our central claims is that it is futile to try to enumerate all interesting and practically useful constraints because there are too many possibilities and variations. Instead, we should pursue an intuitively simple yet rigorous language for specifying constraints such as RCL2000. The expressive power of RCL2000 is demonstrated in Section 4, where it is shown that many constraints previously identified in the RBAC literature and many new ones can be conveniently formulated in RCL2000. 2.1 Basic Components The basic elements and system functions on which RCL2000 is based are defined in Figure 2. Figure 1 shows the RBAC model which is the context for these definitions. RCL2000 has six entity sets called users (U), roles (R), objects (OBJ), operations (OP), permissions (P), and sessions (S). These are interpreted as in RBAC model as discussed above. OBJ and OP are not in RBAC model. OBJ is the passive entities that
… …
—U=a set of users,{u1, ,un} —R=a set of roles,{r1, ,rm} —OP=a set of operations,{op1, ,opo} —OBJ=a set of objects,{obj1, ,objr} J a set of permissions,{p1, —P=OP —S=a set of sessions,{s1, ,sr}
…
… …
…,p } q
—RH R Ris a partial order on R called the role hierarchy or role dominance relation,written as ≤ —UA U R, a many-to-many user-to-role assignment relation —PA P R=OP OBJ R, a many-to-many permission-to-role assignment relation —user :S U,a function mapping each session si to the single user. user: R U, a function mapping each role ri to a set of users. r , a function mapping the set U,P and S to a set of roles R. —roles :U P S roles*: U P S r, extends roles in presence of role hierarchy.
roles(ui)={r R (ui, r) U roles*(ui)={r R ( r ≥r)[(ui, r ) U roles(pi)={r R (pi, r) P roles*(pi)={r R ( R ≤r)[(pi, r ) P roles(si)={r R (sessions(si),r) U roles*(si)={r R ( R ≥r)[ r roles(si) —sessions: U 2s, a function mapping each user ui to a set of sessions
’ ’ ’ ’ ’ ’
—permissions :R p, a function mapping each role ri to a set of permissions. p , extends permissions in presence of role hierarchy. —permissions*: R permissions(ri)={p P (p,ri) PA} permissions*(ri)={p P ( r≤ri)[(p,ri) PA]} —Operations: R OBJ 2OP, a function mapping each role ri and object obji to a set of operations Operations(ri,obji) ={op OP (op,obji,ri) PA} OBI , a function mapping each permissions pi to a set of objects —object: P
Fig. 2. Basic elements and system functions
Authorization Constraints Specification of RBAC
269
contain or receive information. OP is an executable image of a program, which upon execution causes information flow between objects. P is an approval of a particular mode of operation to one or more objects in the system. 2.2 Additional Elements Additional elements and system functions used in RCL2000 are defined in Figure 3. The precise meaning of conflicting roles, permissions, and users will be specified as per organizational policy in RCL2000. For mutually disjoint organizational roles such as those of purchasing manager and accounts payable manager, the same individual is generally not permitted to belong to both roles. We defined these mutually disjoint roles as conflicting roles. We assume that there is a collection CR of sets of roles that have been defined as conflicting.
Fig. 3. Additional elements and nondeterministic functions
The concept of conflicting permissions defines conflict in terms of permissions rather than roles. Thus the permission to issue purchase orders and the permission to issue payments are conflicting, irrespective of the roles to which they are assigned. We denote sets of conflicting permissions as CP. As we show, defining conflict in terms of permissions offers greater assurance than defining it in terms of roles. Conflict defined in terms of roles allows conflicting permissions to be assigned to the same role by error (or malice). Conflict defined in terms of permissions eliminates this possibility. In the real world, conflicting users also should be considered. For example, for the process of preparing and approving purchase orders, it might be company policy that members of the same family should not prepare the purchase order, and also be a user who approves that order. RCL2000 has two nondeterministic functions, oneelement and allother. The oneelement(X) function allows us to get one element Xi from set X. We usually write oneelement as OE. Multiple occurrences of OE(X) in a single RCL2000 statement all select the same element Xi from X. With allother(X) we can get a set by taking out one element. We usually write allother as AO. These two nondeterministic functions are related by context, because for any set S, {OE(S)} U AO=S, and at the same time, neither is a deterministic function. In order to illustrate how to use these two functions to specify role-based constraints, we take the requirement of the static separation of duty (SOD) property which is the simplest variation of SOD [3]. For simplicity assume there is no role
270
L. Han, Q. Liu, and Z. Yang
hierarchy (otherwise replace roles by roles*). Requirement: No user can be assigned to two conflicting roles. In other words, conflicting roles cannot have common users. We can express this requirement as below. Expression: roles (OE (U)) I OE(CR) ≤ 1 OE(CR) means a conflicting role set and the function roles(OE(U)) returns all roles that are assigned to a single user OE(U). Therefore this statement ensures that a single user cannot have more than one conflicting role from the specific role set OE(CR). We can interpret the above expression as saying that if a user has been assigned to one conflicting role, that user cannot be assigned to any other conflicting role. We can also specify this property in many different ways using RCL2000, such as OE(OE (CR)) ∈ Roles(OE(U)) ⇒ AO(OE(CR)) I roles(OE(U))= φ or user(OE(OE(CR)))
I User(AO(OE(CR)))= φ .
The expression roles (OE (U)) I OE(CR) ≤ 1 specifies dynamic separation of duties (DSOD) applied to active roles in a single session as opposed to static separation applied to user-role assignment. Dynamic separation applied to all sessions of a user is expressed by roles(sessions(OE (U))) I OE(CR) ≤ 1. A permission-centric formulation of separation of duty is specified as roles(OE(OE(CP))) I roles(AO(OE(CP)))= φ .The expression roles(OE(OE(CP))) means all roles that have a conflicting permission from, say cpi, and roles(AO(OE(CP))) stands for all roles that have other conflicting permissions from the same conflicting permission set cpi. This formulation leaves open the particular roles to which conflicting permissions are assigned but requires that they be distinct. This is just a sampling of the expressive power of RCL2000 discussed in Section 4. 2.3 Syntax of RCL 2000 The syntax of RCL 2000 is defined by the syntax diagram and grammar given in Figure 4. The rules take the form of flow diagrams. The possible paths represent the possible sequence of symbols. Starting at the beginning of a diagram, a path is followed either by transferring to another diagram if a rectangle is reached or by reading a basic symbol contained in a circle. Backus Normal Form (BNF) is also used to describe the grammar of RCL2000 as shown in the bottom of Figure 4. The symbols of this form are “::=” meaning “is defined as” and “ ” meaning “or.” Figure 4 shows that RCL2000 statements consist of an expression possibly followed by implication ( ⇒ ) and another expression. Also RCL2000 statements can be recursively combined with a logical AND operator ( ∧ ). Each expression consists of a token followed by a comparison operator and token, size, set, or set with cardinality. Also a token itself can be an expression. Each token can be just a term or a term with cardinality. Each term consists of functions and sets including set operators. The sets and system functions described earlier in Section 2.1 are allowed in this syntax. Also, we denote oneelement and allother as OE and AO, respectively.
Authorization Constraints Specification of RBAC
271
Fig. 4. Syntax of language
3 Formal Semantics of RCL2000 In this section, the formal semantics for RCL2000 is discussed. We do so by identifying a restricted form of first-order predicate logic called RFOPL which is exactly equivalent to RCL2000. Any property written in RCL2000, called a RCL2000 expression, can be translated to an equivalent expression in RFOPL and vice versa. The translation algorithm, namely, Reduction, converts a RCL2000 expression to an equivalent RFOPL expression. The Reduction algorithm eliminates AO function(s) from a RCL2000 expression in the first step. Then we translate OE terms iteratively introducing universal quantifiers from left to right. If we have nested OE functions in the RCL2000 expression, translation will start from the innermost OE terms. This algorithm translates the RCL2000 expression to an RFOPL expression in time O(n), supposing that the number of OE terms is n.
272
L. Han, Q. Liu, and Z. Yang
For example, the following expression can be converted to an RFOPL expression according to the sequences below. Example 1. OE(OE(CR)) ∈ roles(OE(U)) ⇒ AO(OE(CR)) I roles(OE(U))= φ
(1)OE(OE(CR)) ∈ roles(OE(U) ⇒ (OE(CR)-{OE(OE(CR))} I roles(OE(U))= φ
(2) ∀ cr ∈ CR : OE(cr) ∈ roles(OE(U)) ⇒ (cr-{OE(cr)}) I roles(OE(U))= φ (3) ∀ cr ∈ CR, ∀ r ∈ cr : r ∈ roles(OE(U)) ⇒ (cr-{r}) I roles(OE(U))= φ (4) ∀ cr ∈ CR, ∀ r ∈ cr, ∀ u ∈ U : r ∈ roles(u) ⇒ (cr-{r}) I roles(u) = φ
The resulting RFOPL expression will have the following general structure. (1) The RFOPL expression has a (possibly empty) sequence of universal quantifiers as a left prefix, and these are the only quantifiers it can have. We call this sequence the quantifier part. (2) The quantifier part will be followed by a predicate separated by a colon(:) (i.e., universal quantifier part : predicate). (3) The predicate has no free variables or constant symbols. All variables are declared in the quantifier part (e.g., ∀ r ∈ R, ∀ u ∈ U: r ∈ roles(u)). (4) The order of quantifiers is determined by the sequence of OE elimination. In some cases this order is important so as to reflect the nesting of OE terms in the RCL2000 expression. For example, in ∀ cr ∈ CR, ∀ r ∈ cr, ∀ u ∈ U: predicate; the set cr, which is used in the second quantifier, must be declared in a previous quantifier as an element, such as cr in the first quantifier. (5) Predicate follows most rules in the syntax of RCL2000 except the term syntax in Figure 4. Because the reduction algorithm has a nondeterministic choice for reduction of the OE term, we may have several RFOPL expressions that are translated from a RCL2000 expression.
4 Expressive Power of RCL2000 In this section, we demonstrate the expressive power of RCL2000 by showing how it can be used to express a variety of separation of duty properties. As a security principle, SOD is a fundamental technique for prevention of fraud and errors, known and practiced long before the existence of computers [4]. It is used to formulate multiuser control policies, requiring that two or more different users be responsible for the completion of a transaction or set of related transactions. The purpose of this principle is to minimize fraud by spreading the responsibility and authority for an action or task over multiple users, thereby raising the risk involved in committing a fraudulent act by requiring the involvement of more than one individual. A frequently used example is the process of preparing and approving purchase orders. If a single individual prepares and approves purchase orders, it is easy and tempting to prepare and approve a false order and pocket the money. If different users must prepare and approve orders, then committing fraud requires a conspiracy of at least two, which significantly raises the risk of disclosure and capture.
Authorization Constraints Specification of RBAC
273
Although separation of duty is easy to motivate and understand intuitively, so far there is no formal basis for expressing this principle in computer security systems. Several definitions of SOD have been given in the literature. For the purpose of this article, we use the following definition. Role-based separation of duty ensures SOD requirements in role-based systems by controlling membership in, activation of, and use of roles as well as permission assignment. There are several papers in the literature over the past decades that deal with separation of duty. During this period various forms of SOD have been identified. Attempts have been made to systematically categorize these definitions. However, this work has significant limitations. It omits important forms of SOD including session-based dynamic SOD needed for simulating lattice-based access control and Chinese Walls in RBAC. It also does not deal with SOD in the presence of role hierarchies. Moreover, as shown, there are additional SOD properties that have not been identified in the previous literature. Here, we take a different approach to understanding SOD. Rather than simply enumerating different kinds of SOD we show how RCL2000 can be used to specify the various separation of duty properties. 4.1 Static SOD Static SOD (SSOD) is the simplest variation of SOD. In Table 1, we show our expression of several forms of SSOD. These include new forms of SSOD that have not previously been identified in the literature. This demonstrates how RCL2000 helps us in understanding SOD and discovering new basic forms of it. Property 1 is the most straightforward property. The SSOD requirement is that no user should be assigned to two roles which are in conflict with each other. In other words, it means that conflicting roles cannot have common users. RCL2000 can clearly express this property, which is the classic formulation of SSOD. It is a rolecentric property. Property 2 follows the same intuition as Property 1, but is permission-centric. Property 2 says that a user can have at most one conflicting permission acquired through roles assigned to the user. Property 2 is a stronger formulation than Property 1, which prevents mistakes in role permission assignment. This kind of property has not been previously mentioned before [5]. RCL2000 helps us discover such omissions in previous work. In retrospect, Property 2 is an “obvious property” but there is no mention of it in over a decade of SOD literature. Even though Property 2 allows more flexibility in role-permission assignment since the conflicting roles are not predefined, it can also generate roles that cannot be used at all. For example, two conflicting permissions can be assigned to a role. Property 2 simply requires that no user can be assigned to such a role or any role senior to it, which makes that role quite useless. Thus, Property 2 prevents certain kinds of mistakes in role-permissions but tolerates others. Property 3 eliminates the possibility of useless roles with an extra condition, permissions*(OE(R)) I OE(CP) ≤ 1. This condition ensures that each role can have at most one conflicting permission without consideration of user-role assignment. Property 4 can be viewed as a reformulation of Property 3 in a role-centric manner. Property 3 does not stipulate a concept of conflicting roles. However, we can interpret conflicting roles to be those that happen to have conflicting permissions assigned to
274
L. Han, Q. Liu, and Z. Yang
them. Thus, for every cpi, we can define cri ={r ∈ R cpi I permissions(R) ≠ φ }. With this interpretation, Properties 3 and 4 are essentially identical. The viewpoint of Property 3 is that conflicting permissions get assigned to distinct roles which thereby become conflicting, and therefore cannot be assigned to the same user. Which roles are deemed conflicting is not determined a priori but is a side-effect of permission-role assignment. The viewpoint of Property 4 is that conflicting roles are designated in advance and conflicting permissions must be restricted to conflicting roles. These properties have different consequences on how roles get designed and managed but essentially achieve the same objective with respect to separation of conflicting permissions. Both properties achieve this goal with much higher assurance than Property 1. Property 2 achieves this goal with similar high assurance but allows for the possibility of useless roles. Thus, even in the simple situation of static SOD, we have a number of alternative formulations offering different degrees of assurance and flexibility. Table 1. Static separation of duty Properties 1.SOOD-CR 2.SOOD-CP 3.Variation of 2 4.Variation of 1
5.SOOD-CU 6.Yet another variation
Expressions
roles*(OE(U)) I OE(CR) ≤ 1
permissions(roles*(OE(U))) I OE(CP) ≤ 1 (2) ∧ permissions*(OE(R)) I OE(CP) ≤ 1 (1) ∧ permissions*(OE(R)) I OE(CP) ≤ 1
∧ permissions(OE(R)) I OE(CP) ≠ φ ⇒ OE(R) I OE(CR) ≠ φ (1) ∧ user(OE(CR)) I OE(CU) ≤ 1 (4) ∧ (5)
Property 5 is a very different property and is also new to the literature. With a notion of conflicting users, we identify new forms of SSOD. Property 5 says that two conflicting users cannot be assigned to roles in the same conflicting role set. This property is useful because it is much easier to commit fraud if two conflicting users can have different conflicting roles in the same conflicting role set. This property prevents this kind of situation in role-based systems. A collection of conflicting users is less trustworthy than a collection of non-conflicting users, and therefore should not be mixed up in the same conflicting role set. This property has not been previously identified in the literature. We also identify a composite property that includes conflicting users, roles, and permissions. Property 6 combines Properties 4 and 5 so that conflicting users cannot have conflicting roles from the same conflict set while ensuring that conflicting roles have at most one conflicting permission from each conflicting permission set. This property supports SSOD in user-role and role-permission assignment with respect to conflicting users, roles, and permissions.
Authorization Constraints Specification of RBAC
275
4.2 Dynamic SOD In RBAC systems, a dynamic SOD property with respect to the roles activated by the users requires that no user can activate two conflicting roles. In other words, conflicting roles may have common users but users can not simultaneously activate roles that are in conflict with each other. From this requirement, we can express user-based Dynamic SOD as Property 1. We can also identify a session-based DSOD property that can apply to the single session as Property 2. We can also consider these properties with conflicting users such as Properties 1-1 and 2-1. Additional analysis of DSOD properties based on conflicting permissions can also be pursued as was done for SSOD. Table 2. Dynamic separation of duty Properties 1.Used-based DSOD 1-1.Used-based DSOD with CU
Expressions Roles*(sessions(OE(U))) I OE(CR) ≤ 1
Roles*(sessions(OE(OE(CU)))) I OE(CR) ≤ 1
Roles*(OE(sessions(OE(U)))) I OE(CR) ≤ 1 2-2.Session-based DSOD with CU Roles*(OE(sessions(OE(OE(CU))))) I OE(CR) ≤ 1 2.Session-based DSOD
5 Conclusion In this article, we have described the specification language RCL2000. This language is built on RBAC components and has two nondeterministic functions OE and AO. We have given a formal syntax and semantics for RCL2000. Any property written in RCL2000 may be translated to an expression written in a restricted form of first order predicate logic, which we call RFOPL. There is room for much additional work with RCL2000 and similar specification languages [6]. The language can be extended by introducing time and state. Analysis of RCL2000 specifications and their composition can be studied. The efficient enforcement of these constraints can also be investigated. A user-friendly front-end to the language can be developed so that it can be realistically used by security policy designers. Acknowledgments. This work was supported by National Science Foundation under Grant No:60673010 and supported by the Natural Science Foundation of Hubei Province under Grant No:2006ABC011 and supported by National Great Project of Scientific and Technical Supporting Programs Funded by Ministry of Science & Technology of China During the 11th Five-year Plan under Grant No: 2006BAH02A24.
References 1. Chen, F., Sandhu, R.S.: Constraints for Role-based Access Control. In: Proceedings of the First ACM Workshop on Role-Based Access Control, pp. 39–46. ACM Press, New York (1995) 2. Ahn, G.J., Sandh, R.: The RSL99 Language for Role-based Separation of Duty Constraints. In: Proceedings of 4th ACM Workshop on Role-Based Access Control, pp. 43–54. ACM Press, New York (1999)
276
L. Han, Q. Liu, and Z. Yang
3. Giuri, L., Iglio, P.: A Formal Model for Role-based Access Control with Constraints. In: Proceedings of 9th IEEE Workshop on Computer Security Foundations, pp. 136–145. IEEE Press, Piscataway, NJ (1996) 4. Gligor, V.D., Gavrila, S., Ferraiolo, D.: On the Formal Definition of Separation-of-duty Policies and Their Composition. In: Proceedings of the 1998 IEEE Computer Society Symposium on Research in Security and Privacy, pp. 172–183. IEEE Computer Society Press, Los Alamitos, CA (1998) 5. Jaeger, T.: On the Increasing Importance of Constraints. In: Proceedings of 4th ACM Workshop on Role-Based Access Control, pp. 33–42. ACM Press, New York (1999) 6. Osborn, S., Sandhu, R., Munawer, Q.: Configuring Role-based Access Control to Enforce Mandatory and Discretionary Access Control Policies. ACM Trans. Inf. Syst. Secur. 3(2) (2000)
Dynamic Access Control Research for Inter-operation in Multi-domain Environment Based on Risk0002 Zhuo Tang, Ruixuan Li, Zhengding Lu, and Zhumu Wen School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, Hubei, China hust [email protected], {rxli,zdlu}@hust.edu.cn, [email protected]
Abstract. For the complexity of the multi-domain environment and the ceaseless evolvement of the information secure sharing, the traditional access control method can not ensure the absolute security for the exchange of data resources. Through introducing the concept of risk, this paper proposes a dynamic access control model for multi-domain environment based on risk of inter-operations. The risk rank of an access policy can be calculated by the history of the inter-operations among domains, the security degree of the objects and the safety factor of the access events. Through adjusting the access policies which be considered the high risk, the risk in the system can be controlled in real time. The security analysis shows that this method can reinforce the facility of the access control and the security of the multi-domain environment.
1
Introduction
With the increase in information and data accessibility, there is a growing concern for security and privacy of data. The realization of the connections and the inter-operations among the different data sources under the distributed heterogeneous environment is becoming the practical problem. There is a growing concern for the security problem for the inter-operations between multi-domains. More and more researches try to resolve the defense for the vicious behaviors through the economic methods. In fact, recent research in these directions has suggested some economical models for a wide range of secure distributed systems, including a payment based security system for mobile agents [1] and game based model for secured grid computing [2]. However, risk remains un-quantified in these proposals. In fact, there exists an emerging consensus that every security question is indeed an economical question concerning the utility of the underlying system [3]. For example, it is easy to show that in a mobile agent based e-commerce system, both the protection of agents and hosts have a direct impact 0002
This work is partially supported by National Natural Science Foundation of China under Grant 60403027, Natural Science Foundation of Hubei Province under Grant 2005ABA258, Open Foundation of State Key Laboratory of Software Engineering under Grant SKLSE05-07.
S. Kim, M. Yung, and H.-W. Lee (Eds.): WISA 2007, LNCS 4867, pp. 277–290, 2007. c Springer-Verlag Berlin Heidelberg 2007 0002
278
Z. Tang et al.
on the utility: attacks on a host by malicious agents will cause loss of commercial secrets such as customers private information, downtime to the system, loss of customers, which will eventually be counted as utility loss. Attacks on agents will result in similar consequences that will also lead to the lost of utility. Thus utility maximization and risk minimization are important issues in the design of a secure distributed system if we seek to gain maximum economical benefits from the underlying system. This is also an important target of the distributed system security. But at present, there are little literatures to demonstrate the actual signification of the risk of distributed system [4]. In the economic area, there exist the consanguineous relations between the risk and trust. Mayer and Rousseau [5] discussed the difference and the relationship between the risk and trust. Jarvenpaa [6] proposed definitely: trust can influence the risk in a certain extent. Further more, it can influence the subjects’ behaviors. For example, there is the lower risk when you loan money to the familiar than stranger. Moreover, this literature considers that the extent of trust can influence the cognitive extent of the risk. The objective of the inter-operations is to offer the rational distributing and effective share of the resource, and the cooperation of the distributed systems. It means the ability of the two software components to communicate and cooperate to complete a common task. It contains two meanings: the basic and the application. The basic inter-operations mean the communications and cooperation among the different platforms. And the applied inter-operations mean the cooperation among the distributed application components which above the computational platforms. This paper mainly discusses the later. The multi-domain environment has the characteristic of dynamic and indetermination. As the frequent changes of the security policies in individual autonomy, the changes of the relationship among the domains, even the birth and the death of the individual domains, the any security polices can not insure the absolute security of the data resource in the process of inter-operations. For the access control method of the traditional model, the subject sometimes can use its permissions constantly once been authorized. It hardly satisfies the dynamic changes of the multi-domain environment. And it will bring many security risks and hidden trouble to the inter-operations among multi-domain environment. In order to decrease the risk of inter-operations, for the problems of trust and risk of the security inter-operations under the application tier, which base the mappings between the users and roles in the different domains, this paper proposes a risk based dynamic access control model for multi-domain environment. In this model, through calculating the trust degree between the subjects firstly, we can ascertain the risk extent when a subject in one domain has an operation to the objects in the other domains. Therefore, we can receive the risk degree of the inter-operations. Using this risk degree, we can adjust the subjects’ access privilege dynamically. The risky permissions will be revoked, and the utility of the system will be maximized. The rest of the paper is organized as follows. Section 2 describes the related works. Section 3 presents algorithms for the calculating the risk of the
Dynamic Access Control Research
279
inter-operations. Section 4 describes the dynamic access control model for multidomain environment, followed by the conclusion in Section 5.
2
Related Works
In the recently 20 years, people have acquired the plentiful achievement for the research of the access control. Many access control models have been proposed. The most popular models include discretionary access control (DAC), mandatory access controls (MAC) and role based access control (RBAC). In the RBAC family which be proposed by Sandhu in 1996[7], the users’ privilege is related with their roles, and the users acquire their privilege through roles. A role is a permission set for a special work station. When the users’ privilege needs to be changed, we can do it by revoking the roles or re-distributing the user’s roles. Michael J. Covington et al [8] have proposed the Generalized Role Based Access Control (GRBAC) model. In this model, they extend the traditional RBAC by applying the roles to all the entities in a system. (In RBAC, the role concept is only used for subjects). By defining three types of roles, i.e., Subject roles, Environment roles, and Object roles, GRBAC uses context information as a factor to make access decisions. Guangsen Zhang et al. [9] also uses context parameters in their dynamic role-based access control model under the two key ideas: (1) A user‘s access privileges should be changed when the user’s context changes. (2) A resource must adjust its access policy when the environment context information (e.g., network bandwidth, CPU usage, memory usage) changes. These above two papers make the access control dynamic and flexible but the decision-making process is not as powerful and precise as that in our model. They did not consider the aspect of security in making-decision process and the impact of security problems on the system. The Nathan Dimmock’s paper [10] uses the concept of outcome to calculate cost for each outcome and risk value. Comparing to this paper, they do not consider the context for risk assessment. So it loses the flexibility characteristic in evaluating risk. They did not consider risk as an important factor in their access control mechanism and they did not use risk directly in making decision. There is little attention to the trust and risk in the access control research [11]. The term trust management system was introduced by Blaze et al. in [12], but the solution it proposes involves an unduly static notion of trust application programmers choose where to insert code to evaluate their notion of trust, for example at the starting point of a given execution session. Most of the past research combining access control with trust concepts focuses on a trust-management approach in which trust values flow in a manually defined way through access control policy. For example, in literature [13] and [14], the mutual trust relationship is founded by the continuously negotiation. Literature [15] illuminates the relationship between the trust management and distributed access control, and it extends the access control system of OASIS and the access control language. So, the access policy can be decided base on the trust and risk. But the trust mentioned in this paper is defined by the special operation, and the relationship between risk and trust is faint in this paper.
280
Z. Tang et al.
The above access control methods mostly base the traditional model. They are all short of the dynamic description for the subjects. With the complexity of the system and the dynamics of the applications, the changes of the access control objects are very large. Hence, these methods may increase the difficulty of the authorization. These access control models all try to protect the resource from the perspective of system. The weakness of these passive security models is that they cannot manage the privilege according to the environment dynamically. Once the subject acquire the privilege, it can use this privilege until it be revoked. It can bring the risk easily. Compared with the traditional RBAC model, the paper’s main contributions are as follows: 1. Introducing of a concept of risk into access control area. This method can ascertain the risk of inter-operations between the different domains in real time through the histories of the interactive events. It is better able to adapt to the distributed, complex, and diverse multi-domain environment. 2. Through adjusting the privilege of the subjects dynamically according to the risk levels of the access events, the functions of the access control system can be changed from the static protect for the resources to the dynamic authorization. The system can detect the environment and the security venture in real time, and the permissions of the subjects are not unchangeable anymore since be authorized. The system can identify the risky permissions automatically and revoke them duly. 3. This method can bring the convenience to the security management. The difference between this dynamic model and the traditional access control models is as follows: The management for the user’s permission settings is according to the actual events and historical records. In this way, the permission management is more convenient, and the control to the authorization is more convincing.
3
The Risk of the Inter-operation in Multi-domains
In the traditional model of the trust relationship, trust was usually defined as a Boolean variable, that is to say, in the session of both trust entities, one trust another entirely, or absolutely not, there would never be middle status. For instance, the entity A trusts entity B, but it is hard to tell how much they trust each other. For this reason, we have to quantify their trust. In this section, firstly, we formalize the definitions of the permissions and the operations between the permissions in the multi-domain environment. On this basis, we introduce the description of the trust in multi-domains. 3.1
The Formalization of the Permissions
Definition 1. Authorization Term. Authorization terms are 2-tuple of the form: < object, accessmode >, which is denoted as < O, A > for short. It is the basic form of the permission. The set of authorization terms is denoted as P . We have P = {< O, A >}.
Dynamic Access Control Research
281
Definition 2. Permission Set. Permission set represents all permissions of some subject, which is the set of the authorization terms. We can formulize it as PS. For example, we can describe a role r1 ’s all permission as: P S(r1 ) = {< f ile1, +read >, < f ile2, −write >}. That is to say the users, which are assigned to r1 , can read file1 and write file2. The denotation P Su can also express the permission set of the user u. obviously, if a role r is assigned to a user u, then P S(u) ⊇ P S(r). In this paper, the denotation role(u) represents the role set, which is assigned to user u. We can define the basic operators of the PS. The BNF definition for permission set as follows: P S = P S P S ∪ P S P S ∩ P S P S − P S SoD(P S, P S) Where the ∪,∩ and − are the basic operation in set theory, SoD(P S1 (r), P S2 (r)) denotes the separation of duties, it returns P S1 (r) or P S2 (r) , but it can return the P S1 (r) and P S2 (r) concurrently. OS = {O/D} returns the controlled objects set for a subject. D denotes the object’s domain. For instance OS(r1 ) = {f ile1/A, f ile2/A}. 3.2
The Role-Mappings Based Trust Degree Between Domains
In a typical multi-domain environment, we partition the domains into external and local domains. The role mapping can be formalized as a 4-tuple: < r1 , d1 , r2 , d2 >, r1 is a role in domain d1 , r2 is a role in domain d2 respectively, in general, d1 is the local domain, and d2 is the external domain. A subject in the local domain can access the objects in the external through the inter-domain mappings. As the mapping exhibits, the permissions of the role r1 in the domain d1 is P S(r1 ) = P S(r1 ) ∪ P S(r2 ) and OS(r1 ) = OS(r1 ) ∪ OS(r2 ). Trust is one entity assessing to behavior credibility, reliability, integrity and performance of other entity. Trust relationship is such a case: if the subject meets the object’s expectation, then the subject is trustable to the object. Definition 3. The trust degree denotes the trust extent between the different domains which be formed by the role-mappings. Which is formalized: C (d1 , d2 ), depict the trust relationship between the domain d1 and d2 . As its value range
rB2
rA3
rA1
rA2 rB1
Domain A
Domain B
Fig. 1. The role-mappings between two domains
282
Z. Tang et al.
is [0, 1], supposing a role-mapping < r1 , d1 , r2 , d1 , f alse >, C(d1 , d2 ) = 1 means the complete trust, that is to say the all permissions of the role r2 in the domain d2 can be inherited by the role r1 in the domain d1 ; by contraries, C(d1 , d2 ) = 0 means the complete distrust, that is to say the all permissions of the role r2 in the domain d2 are forbidden to be inherited by the role r1 in the domain d1 . The trust degree between domains is changed according to the inter-operation events. This is denoted by the function as follows: δ :C ×E →C
(1)
Where, E denotes the set of the inter-operation events. In general, if there are role-mappings exist between two domains, for each inter-operation, if the result is successful, the trust degree will be strengthened; by contraries, if the result is failed, it will be weakened. The following subsection discusses how to found the trust relationship between two domains. In this paper, we consider the trust in the multi-domain environment through their past transaction experiences. Considering the interoperations between the domain i and j, if the subject in domain i request to access the resources in domain j, according the estimate of each event, if the request is be satisfied, then the estimate from i to j is positive, that is denoted as trk (i, j) = 1, by contraries, if the estimate is passive, then trk (i, j) = −1. This paper defines the denotation sij as 0002 the appraisement for the all transactions between the domain i and j: sij = (tr( ij). Let’s use sat(i, j) to denote the total of the positive appraisement, while the denotation unsat(i, j) is used to denote the total of the passive appraisement. Then, sij = sat(i, j) − unsat(i, j)
(2)
For the convenience of the denotation, we mapping the value of the trust value to [0, 1]: 0003 sij sij ≥ 0 C(i, j) = sat(i,j)+unsat(i,j) (3) 0 otherwise In this way, every domain maintains local trust degrees of the other domains which the local ever affiliates with. We can use a vector to describe the trust of a local domain to the externals: T = {C(i, 1), c(i, 2), . . . , C(i, n)}, n is the number of the external domains. 3.3
The Risk of the Inter-operations in Multi-domain Environment
In general, a domain can maintain the trust vector for the externals domains which it interact with. As mentioned above, in the multi-domain environment, the risk of the inter-operations is up to the trust value of the domains C(i, j), the security level of the operation object Os , and the safety factor of the access action As . The following function defines the risk of the interoperation between the different domains:
Dynamic Access Control Research
283
Definition 4. Ri = F (C(i, j), Os , As ). The parameter Os denotes the security level of the operation object, the more high level the role, the more security level the objects can be accessed. The As denotes the security extent of the access operation. In general, the risk of the inter-operations will be increased with the heightener of the security level of the objects. Reversely, it will be decreased with the heightener of the trust between the interrelated domains and the safety factor of the access action. The function of the risk for the inter-operations is defined as follows: F (C(i, j), Os , As ) = Os × (1 − C(i, j)) × (1 − As )
(4)
By (1), we can see the value of Ri is in the range of [0, 1]. Where, the value 0 denotes no risk, and the value 1 denotes the maximal risk. The following is the algorithm for the security level of all operation objects in the special domain. The basic idea is that the leaf nodes in a role hierarchy only access the objects with the lowest security level in a special domain. That is to say, if the objects can be only accessed by the senior role, their security level is higher in the domain. The detailed algorithm is as follows. The parameter k is the basic security parameter in a special domain. k is an integers. k ≥1. Algorithm 1. The calculation for the security level of the access control objects. program Obj Security level() begin 1. Searching the role hierarchy, find the deepest leaf nodes. 2. Setting the value of the security level of the objects which can be controlled by the leaf node as k. While, the “visited” flags of the nodes are modified as “already visited”. 3. Finding the directly senior up from the leaf node, the security level of the objects which directly under the next senior node is on the basis of an increase for the objects controlled by the directly junior nodes. 4. Searching the all unvisited nodes down from the root node, the security level of the objects which directly under the next junior node is on the basis of a decrease for the objects controlled by the directly senior nodes. 5. Adjusting the security level of all nodes in the role hierarchy. The value of security level of all nodes is divided by the value of the root to be mapped to the range [0, 1] end. There is a role-mapping < A4 , A, B2 , B, f alse > exist between the domains in the figure 2. In a moment, a user in the domain A requests the operation “write” to the object O5 which is in the domain B: < O5 , write >. We set the basic security parameter k in this example as 1. Firstly, according to algorithm 1, We set security level of the directly controlled objects of the deepest leaf nodes B4 , O7 and O8 , as 1. Followed up, we can get that the security level of the directly controlled objects of B2 is 2, and the security level of the directly controlled objects of B1 is 3. By the step 5, we can get the security level of O5 : (k+1) (k+2) = 2 3 = 0.67.
284
Z. Tang et al. A1
U2
B1 U1
A2
A3
U3
A4 B2
U4
A5
O4 A7
A6 U5 O1
O6 B6
B4 O2
B3 O11
O5
B5
O3
O10 O7
Domain A
O8
O9 Domain B
Fig. 2. The inter-operations between domain A and domain B Table 1. The historical inter-operations between domain A and domain B Sequence Subject Object Operation Status 1 U1 O5 read successful 2 U2 O6 write successful 3 U3 O7 read failed 4 U4 O5 execute successful 5 U5 O8 copy failed 6 U1 O9 write successful 7 U1 O10 read successful
The safety factor for all access events in this example, which are denoted as read, copy, write, execute, is set as (0.8, 0.6, 0.4, 0.2) respectively. We suppose that the inter-operation history between domain A and B is as the table 1 shows. There are seven access events which the subject is in domain A and the object is in domain B. And five of them are successful and two failed. By the above method = 0.43. to compute the trust degree between domain A and B, we have: (5−2) 7 So, According to (4), we can get the risk of the above 2-tuple < O5 , write > as: Ri (Os , C(i, j), A) = Os ×(1−C(i, j))×(1−As ) = 0.67×(1−0.43)×(1−0.4) = 0.23 Thus, we can educe the risk rank in the multi-domain environment. In this instance, the risk is divided as 5 levels, which are as {potty, little, general, grave, verygrave}. The mapping from risk values to the risk ranks is as table 2 shows. This table can be configured by the administrators according to the special context. Table 2. The mapping from risk values to the risk ranks Sequence ranks 1 I 2 II 3 III 4 IV 5 V
values description 0 ≤ Ri < 0.2 potty 0.2 ≤ Ri < 0.4 little 0.4 ≤ Ri < 0.6 general 0.6 ≤ Ri < 0.8 grave 0.8 ≤ Ri < 1.0 very grave
Dynamic Access Control Research
285
We can conclude from the table 2 that the above operation < O5 , write >, a user in the domain A requests the operation “write” to the object O5 which is in the domain B, it’s risk rank is II. This means that the operation < O5 , write > has little risk. It may bring the failure for the operation. In the following sections, we will discuss how to avoid the risk through adjusting the privileges of the subjects.
4 4.1
The Risk-Based Dynamic Access-Control Model for Multi-domain The Model of MD-R2 BAC
The traditional security mechanism is generally designed for static network and closed system. In these systems, the authorizations of the users are determinate, and the relationship between user’s privileges and resources are found early. Based this, the protected resource are only be accessed by the authorized users. As these security models are simpler, we can call them traditional security model. But in the multi-domain environment, as the requestor and the provider of the resource can be in the different domains, because there is no absolute trust between these domains, it can not satisfy the all requests of the requestors through the traditional access control mechanism. Further more, the multi-domain environment changes frequently, and the real-time update is also unpredictable, so, the requestor and the provider of the resource may do not know each other. Therefore, the traditional models do not match the multi-domain environment well. This paper proposes a risk-based dynamic access-control model through importing the risk of the event context to the policies of RBAC. The authorization of the traditional is general denoted as 3-tuple: < S, O, A >, where S represents the subject, O represents the object, and A is the set of the actions. If a 3-tuple < s, o, a > exist, that is to say, the subject S can do the operation A on the object O. These 3-tuples are all predefined in the security system, and they are effective of all times. For the privilege constrain to the users, this access control method is passive and negative. RR UA
U
UD
Ro
RP
P
RD
D
DD
Ri
Fig. 3. The model of MD-R2 BAC
286
Z. Tang et al.
The following definitions contain some elements in the literature. The relationship between these definitions is as the figure 3. Definition 5. The variables and the relationships in this model. 1. M D − R2 BAC = (U, Ro, P, D, Ri), where U is the set of the users, Ro is the set of the roles, P is the set of the permissions, D is the set of the domains, and Ri denotes the set of the risk banks. 2. User Assignment. This is a many-to-many relationship between users and roles. It is denoted as U A, U A ⊆ U × Ro. This function can be expressed as assigned − user(ro) = {u ∈ U (u, ro) ∈ U A}. 3. Permission Assignment. This is a one-to-many relationship from roles to permissions. And it is a function which from the roles and risk ranks to the permissions. It is denoted as RP , Ro × Ri → P . The function can be expressed as: assigned − permission(ro : Ro, ri : Ri) → 2P . 4. Role Relation. This relationship contains the hierarchy and inheriting between the roles. We denote the set of the relationship of roles as RR, RR ⊆ R × R. 5. Risk between Domains. This is a mapping from the inter-operation between the domains to the risk rank. It is a function from a special inter-operation event to a certain risk rank. We can denote as DR : D × D × P 0003 → Ri, P 0003 ⊆ P. 6. User Hypotaxis. This is a one-to-many relationship from roles to domains. It is denoted as RD : R → D. Each user and role in the model must belong to a special domain, and a user or role can not be subject to two or more domains synchronously. This constrain is defined as follows: – Constrain 1. Each user in the model must only subordinate to only domain. < u, d1 >∈ U D∩ < u, d2 >∈ U D → d1 = d2 – Constrain 2. Each role in the model must only subordinate to only domain. < r, d1 >∈ RD∩ < r, d2 >∈ RD → d1 = d2 The dynamic distribution of permission in the MD-R2 BAC is mainly embodied in relation of DR, RP. The ration DR can acquire the trust degree of two domains through the inter-operation history. All the more, it can acquire the risk rank of an access event according to the security extent of the access operation and the security level of the operation object. Where, the access event between different domains can also be denoted as 6-tuple: ¡S, O, A, D1 , D2 , ri¿. The meaning of the elements is the same as the authorization item. Hence, in the relation DR, we have P 0003 ⊆ P . RP is a real-time implementation of the dynamic function. It can adjust the authorization to a subject in a domain duly according to the risk rank of the inter-operation. This paper will detail the authorization and the adjustment in the following sections.
Dynamic Access Control Research
4.2
287
The Policy and Mechanism of the Access Control in MD-R2 BAC Model
MD-R2 BAC is an access control model which can be changed with the interoperation history between different domains. It is a dynamic process that the users acquire the permissions through the roles. In this process, system can adjust the subject’s permissions according to the risk rank of its operation. In this way, the access control implementation process can be divided into three steps: privilege distribution, ascertaining the risk rank, and dynamic adjustment of the privilege. Privilege distribution. The privilege distribution includes that the administrator assigns the roles to user and predefine the permissions of the roles. Referred to above, the users’ permissions can be denoted as a 2-tuple :¡ u, ro¿, the set is formalized as ua(u,ro). We use UA(ro) to denote that assigning the role ro to a set of users U A(ro) = {ua(ui, ro) ua(ui, ro) ∈ U A(i = 1, 2, . . . , n)} In the multi-domain environment, the privilege usually performs as the power of the subject to access the objects which in the different domain. The basic authorization item can be formalized as the 6-tuple < s, o, a, d1 , d2 , ri >,which is denoted as atomp(s, o, a, d1 , d2 , ri). It means that a subject s in the domain d1 can do the operation a to the object o which is in the domain d2 , and the risk rank of this operation is ri in the current context. RP (ro) = {atomp(s, o, a, d1 , d2 , ri) atomp(s, o, a, d1 , d2 , ri) ∈ P } Ascertaining the risk rank. The risk rank ri in authorization item is a function of the access history events in the multi-domain environment. We have detail the process of calculation for the risk rank in the third part of this paper. The function which acquires the risk rank of an inter-operation is recorded as risk count(s,o,a,d 1 , d2 ). It returns the risk rank of a subject s in the domain d1 do the operation a to the object o which is in the domain d2 Ri = {ri ri = risk count(s, o, a, d1 , d2 )} Dynamic adjustment of the privilege. The prominent character of MDR2 BAC is that it can adjust the subject’s privilege according to the risk rank of its operation to the objects. We can set a risk threshold RV between two different domains. For the subject which acquire the access permissions to the objects in the other domains through the role-mappings, we can check each authorization items, and revoke the items whose risk rank over the predefined threshold RV. The dynamic adjustment of the privilege mainly reflected in the relation of privilege distribution RP. It is a function which from the roles and risk ranks to the permissions. F (Ro, Ri) → P, P = {atomp1 , atomp2 , . . . , atompn }, this is the initial permission set.
288
Z. Tang et al.
G(Ro, Ri, P1 ) → P2 , P1 ⊆ P, P2 ⊆ P, P2 = P − P1 , G is the revoke function. Return to the example in the third section, suppose that the initial permission set of the role A4 in the domain A to the objects in the domain B is P S(A4 ) = {< O5 , write >, < O6 , read >, < O7 , read >, < O8 , write >, < O9 , read >, we can acquire the risk value of these five authorization items: risk count(A4 , O5 ,write,A,B) = Ri(O s (O5 ),C(A,B),As (write)) = 0.23,the risk rank is II; risk count(A4 , O6 ,read,A,B)= Ri(O s (O6 ),C(A,B),As (read))=0.08, the risk rank is I; risk count(A4 , O7 ,read,A,B)= Ri(O s (O7 ),C(A,B),As (read))=0.04, the risk rank is I; risk count(A4 , O8 ,write,A,B)= Ri(O s (O8 ),C(A,B),As (write))=0.11, the risk rank is I; risk count(A4 , O9 ,read,A,B)= Ri(O s (O9 ),C(A,B),As (read))=0.04, the risk rank is I; If the predefined threshold RV is set as 0.2, base the policy of dynamic adjustment of the privilege, we will revoke the write permission of the subject A4 to the object O5 between domains A and B: P S(A4 ) = {< O6 , read >, < O7 , read >, < O8 , write >, < O9 , read >} Through the privilege’s dynamic adjustment, whether the subject can acquire some privilege lie on the risk rank of the relevant authorization items. The course which the subjects acquire the privilege is a dynamic and frequent process. We can decide on that whether the operation is can be executed base the operations history, the security level of the operation object, and the security extent of the access operation. Hence, the authorization is dynamic which will be adjusted with the time and the hierarchy of the subjects and objects. 4.3
The Security Analyses for the MD-R2 BAC
Comparing with the traditional security model, the contribution of the MDR2 BAC is as follows: 1. It is adapted well to the dynamic change in the multi-domain environment. In the MD-R2 BAC, the change of operations and objects can bring the change of the authorization. Through the risk rank, this model can reflect the change of the operations and the hierarchy of the access objects. Further more, these changes in this model will not affect the special authorizations. Hence, it can be adapted well to the frequently change of the multi-domain environment. 2. The more security MD-R2 BAC first imports the concept of risk to access control model, and the ultimately minimal of a subject is up to the conclusion of the access history. The risky permissions will be revoked in the dynamic adjustment
Dynamic Access Control Research
289
for the authorizations. It will restrict the risky event from the source and advance the success probability of the inter-operations. Through the dynamic adjustment, this model supports these two famous security principles: – The least of privilege. Through the risk control, the privilege with high risk rank will be revoked in time. When the subject accesses the other domain’s objects, it only holds on the relative security privilege. – The separation of duty. Sometimes, there are some sensitive objects can not be accessed by one subjects at the same time, and two different subjects also do not hope access one object simultaneity. These above can be regarded as the high risky events. These events can be identified by the estimate of the risk rank. And the system can revoke some part of the privileges to implement this security principle.
5
Conclusion and Future Work
In the multi-domain environment, the randomicity exist in the share of the information, the validity of the security mechanism, and the demand of the information exchange between users. For the complexity of the multi-domain environment and the ceaseless evolvement of the information secure share, the traditional access control method can not ensure the absolute security for the exchange of data resource. The traditional security mechanism is always designed for static network or closed system. As lacking the dynamic description of the subjects and objects, the traditional security mechanism hardly to adjust the subjects’ privilege base the security status of the system. Through introducing the concept of risk, this paper proposes a dynamic access control model MD-R2 BAC for multi-domain environment based the risk of interoperations. This model can acquire the risk of the authorization items of the subject’s privilege base the operations history, the security level of the operation object, and the security extent of the access operation. And the risky permissions will be revoked in the dynamic adjustment for the authorizations. The security analyses for the MD-R2 BAC indicate that this model can reduce the risk and hidden trouble for the information exchange in the multi-domain environment, and advance the security of the system obviously. In this paper, we only discuss the risk for a single access operation in the multidomain environment. When an attacker who combines multiple low risk operations into a new operation, how to assess the risk for the new multiple operation is a problem being worth paying close attention to. It will be our future works.
References 1. Sonntag, M., Hrmanseder, R.: Mobile agent security based on payment. Operating Systems Review 34(4), 48–55 (2000) 2. Kwok, Y.K., Song, S., Hwang, K.: Selfish grid computing: Game-theoretic modeling and nas performance results cardiff. In: CCGrid-2005. Proceedings of the International Symposium on Cluster Computing and the Grid, Cardiff, UK, May, pp. 9–12 (2005)
290
Z. Tang et al.
3. IEEE (ed.): IEEE Security and Privacy. Economics of Information Security, vol. 3. IEEE Computer Society, Los Alamitos (2005) 4. Grandison, T., Sloman, M.: A Survey of Trust in Internet Applications. IEEE Communications Surveys 3(4), 2–16 (2000) 5. Mayer, R.C., Davis, J.H., Schoorman, F.D.: An Integrative Model of Organizational Trust. Academy of Management Review (20), 75–91 (1995) 6. Jarvenpaa, S.L., Leidner, D.E.: Communication and trust in global virtual teams. Organization Science 10(6), 791–815 (1999) 7. Sandhu, R., Coyne, E.J., Feinstein, H.L., Youman, C.E.: Role Based Access Control Models. Computer 29(2) (1996) 8. Moyer, M.J., Covington, M.J., Ahamad, M.: Generalized role-based access control for securing future applications. In: NISSC 2000. 23rd National Information Systems Security Conference, Baltimore, Md, USA (October 2000) 9. Zhang, G., Parashar, M.: Context-Aware Dynamic Access Control for Pervasive Applications. In: Proceedings of the Communication Networks and Distributed Systems Modeling and Simulation Conference (CNDS 2004), Western MultiConference (WMC), San Diego, CA, USA (January 2004) 10. Dimmock, N., Belokosztolszki, A., Eyers, D., Bacon, J., Moody, K.: Using Trust and Risk in Role-Based Access Control Policies. In: Proceedings of Symposium on Access Control Models and Technologies (2004) 11. Grandison, T., Sloman, M.: A Survey of Trust in Internet Applications. IEEE Communications Surveys 3(4), 2–16 (2000) 12. Blaze, M., Feigenbaum, J., Lacy, J.: Decentralized trust management. In: Proc. IEEE Conference on Security and Privacy. AT&T (May 1996) 13. Li, N., Mitchell, J.C., Winsborough, W.H.: Design of a role-based trust management framework. In: 2002 IEEE Symposium on Security and Privacy, pp. 114–131. IEEE, Los Alamitos (2002) 14. Teh-Ming, W., Fidelis, Y.: A policy-driven trust management framework. In: Nixon, P., Terzis, S. (eds.) iTrust 2003. LNCS, vol. 2692, Springer, Heidelberg (2003) 15. Dimmock, N., Belokosztolszki, A., Eyers, D., et al.: Using Trust and Risk in Role based Access Control Policies. In: SACMAT 2004, New York, USA, June 2-4, 2004 (2004)
A Compositional Multiple Policies Operating System Security Model Lei Xia, Wei Huang, and Hao Huang State Key Laboratory for Novel Software Technology, Department of Computer Science and Technology, Nanjing University, 22 Hankou Road, Nanjing 210093, China [email protected], [email protected], [email protected]
Abstract. Multilevel security policies aim at only confidentiality assurance, with less consideration on integrity assurance and weakness in expressing channel control policies. Besides, the trusted subjects it introduces to handle the information flow “downgrade” have many security flaws. Moreover, increasing diversity of the computing environments results in various security requirements. However, current mainstream security models are aiming at only one or few requirements of them each. The Multi-Policy Views Security Model is presented, which is based on the MLS model, combining the domain and role attributes to the model, to enforce the expression power in channel control policies, make permission management more fine-grained and enhance the ability of confining the permission of the trusted subjects. Moreover, MPVSM has integrated the properties and functions of MLS, Domain-Type and Role Based models into one unified model. It is able to enforce multi-policy views in operating system in a flexible way. Keywords: security model, multiple policy views, integrity assurance, confidentiality assurance, least privilege, separation of duties.
1 Introduction Multilevel Security Model (MLS) [1] is one of the most widely used security model in various system. MLS aims at confidentiality assurance of the information, preventing the unauthorized leakage of the data in high security level. However, it rarely considers the integrity assurance, which is also a very critical security requirement [2,7,11]. Biba model is a mathematical dual of BLP, intending to protect integrity of information. However, they are both based on lattice mechanism, and information policies based on the lattice are transitive. Therefore, MLS models are weak in expressing channel control policies [22]. A firewall’s control policy is a typical channel control policy. As an example in Figure 1.1, Information is allowed to flow from the Outside component to the Inside component via the Access Control, but is forbidden to do so directly. Such policies are hardly expressed in the MLS models. S. Kim, M. Yung, and H.-W. Lee (Eds.): WISA 2007, LNCS 4867, pp. 291–302, 2007. © Springer-Verlag Berlin Heidelberg 2007
292
L. Xia, W. Huang, and H. Huang
Firewall
Intranet
Inside
Access Control
Outside
Internet
Fig. 1. Information channel control in a firewall
In addition, in MLS models, information is not allowed to flow “downward”, such as from high confidentiality level to lower, or from lower integrity level to higher. However, in many situations, information flow “downward” is needed. Such an example of information flow from lower integrity level to the higher is the system call between user process and operating system. Data at a higher integrity level is more accurate and/or reliable than data at a lower integrity level. And the integrity level of user’s data is usually lower than the operating system’s data. Therefore, according to multilevel policies, user’s data is not allowed to flow to the operating system data objects. However, in actual computer system, though user applications are not permitted to violate the integrity of operating system data, they should be given appropriate ways to pass the information or parameters to operating system. To handle these problems, many MLS systems (such as HP-UX/CMW [8]) introduce some special trusted subjects outside the TCB. These special subjects are given privileges to bypass the MLS access control mechanisms. For they are not controlled by the access control mechanisms, they have almost all of privileges, which is far more than what they need to do they jobs. It is obviously a violation to the Principle of Least Privileges. These subjects turn to be the potential targets for malicious attacks. Once they are compromised, they can bring huge damages to the system. Moreover, various security requirements are coming up with the sharply increased diversity and complexity of the computing environments. To satisfy these security requirements, a variety of security models were proposed in last twenty years. Widely-used security policies in current mainstream systems include multi-level military security model (Bell-LaPadula model, BLP) [1] and its variants (Biba [6], Dion model), Domain and Type Enforcement (DTE) [9], Role-based access control (RBAC) [14], and etc. Each of these models aims mainly at one or several security requirements, such as BLP aiming at the confidentiality of the information, Biba aiming at data integrity assurance, DTE aiming at confining the information flow channels, etc. Previous trusted operating system usually enforced only one kind of mandatory access control model, for instance, Multics[3] implemented only BLP model in it. However, as mentioned above, the security goals in different applications are various. The different security requirements of applications result in different security models needed for them. How operating system to support this kind of multiple security model views--the access control model different applications can perceive in the system is different. Recently, as a policy neutral security model, RBAC provides a valuable level of permission abstraction. However, using RBAC to simulate multi-level security level or
A Compositional Multiple Policies Operating System Security Model
293
discretionary access control models [12] is over complex and therefore unpractical in real-world operating system. In this paper, a Compositional Multiple Policy Security Model (MPVSM) is presented. MPVSM is a hybrid security model, which is based on Multi-level Security models. It combines confidentiality and integrity lattices into a unified security lattice for confidentiality and integrity assurance. It then divides the subjects and objects in the same security level into different domains and types, using access control mechanisms in DTE to make the permission assignment and management more fine-grained and flexible, meantime enforce the separation of duties between subjects in the same security level. In addition, using the thought of RBAC, role is added in MPVSM. Roles are assigned the extra permissions, which are independent of MLS and Domain-Type parts of the model. MPVSM makes use of the flexible permission assignment and revocation mechanisms in RBAC to confine the permissions of those special “trusted subjects”, provides a way to make them do they job out of control range of the MLS and Domain-type access control parts, but meanwhile prevent them from too powerful to be potential security holes of the system. MPVSM has integrated the properties and functions of Multiple Level Security, Domain-Type and Role Based models into one unified model. By combining the elements and attributes of DTE and RBAC to the MLS model, MPVSM avoids the drawbacks of MLS. MPVSM is able to enforce channel control and assured pipelines policies, with providing fine-grained permissions management. In addition, MPVSM owns an enhanced ability of policy expression. It can ensure the enforcement of least privilege to these special “trusted subjects” in the MPVSM model. Moreover, MPVSM provides a framework to enforce multiple policy views in operating system. It can not only enforce the equivalent functions of these three kinds of models independently in the system, but also can enforce multi-policy views between different applications in system. The remainder of the paper is organized as follows. Section 2 describes the MPVSM formally. Section 3 gives the example policy configurations in MPVSM. Section 4 is the related works. And section 5 is the conclusion.
2 Multiple Policy Security Model 2.1 Overview The architecture of the MPVSM is shown in figure 2. MPVSM comprises of elements, relations and mappings. A user in the framework is a system user. A role is a job function or job title within some associated semantics regarding the authority. Subjects are active entities in the system, usually processes or transactions. Objects are data objects as well as resource objects. Domain is a control access attribute associated with each subject. And type is the other control attribute associated with objects. Two global matrixes are defined to represent allowed access or interaction modes between domains and types or domains and domains respectively. Permission is an approval of a particular mode of access to object or interaction to subject. Security label is a 2-tuple, containing a confidentiality label and an integrity label.
294
L. Xia, W. Huang, and H. Huang
There are several relations and mappings between elements. The relation between users and roles are defined in user-role assignment relation. The user-subject relation gives relation between subjects and users, while subject-role mapping figures out a subject’s current running role. Permissions in system can be authorized to roles. Roles’ authorized permissions are given in the role-permission authorization relation. Each role in system can have many authorized domains. Role-domain authorization relation gives the authorized domains of each role. Each subject has only one running domain, which is given in subject-domain mapping. Besides, each subject has a security label. The security labels are assigned to roles, and subject’s security label is determined by the label of its running role. Each object has a security attribute which includes the type and security label of that object. User
Subject
Role Label
Subject perms Role perms
Domain
Perms
Domain perms
Object Type
Label
MLS perms
Fig. 2. The MPVSM
The final permissions that a subject gets are based on three kinds of permissions corresponding to that subject: MLS permissions, Domain permissions and Role-based permissions. MLS Permissions are coarse-grained base permissions, indicating the subject has the read-related permission or write-related permission. Domain permissions are the fine-grained permissions based on the MLS Permissions. Role-based permissions is independent from MLS permissions and domain permissions. Correspond to the three kinds of permissions, MPVSM model contains three access control views: 1)Multi-level Security Access Control. The MLS permissions given to the subject are based on the security level of the subject and the target object. 2)Domain Access Control. Subjects run in different domains. A subject’s Domain permissions are based on its running domain and the target objects’ types. 3)Role-Based Access Control. The subject has the permissions of its running role as its Role-based permissions. In addition, MPVSM model is also an extensible security model, by adding other attributes to the role and object, besides the label and domain or type attributes of current role and object, more functions or properties can be added. 2.2 Formal Definitions The following definitions formalize the discussion above: Definition 2.1. Elements Sets • Users: U • Subjects: S
A Compositional Multiple Policies Operating System Security Model
295
• • • • • • • •
Objects: O Roles: R Domains: D Types: T Confidentiality Labels: C Integrity Labels: I Security Labels: SL ⊆ C×I Access modes: P, containing of two disjointed subsets: Read-related Modes RP={read(r), execute(e), getattr(g) .. }; Write-related Modes WP={ write(w), append(a), create(c), delete(d), setattr(s) .. }; P=RP WP. • Domain transfer operation: transfer(t) transfer denotes the subject can transfer from one domain to another domain. • Role Permissions CAP ⊆ P×O, (p, o) CAP denotes the permission to access o in mode p.
∪
∈
Definition 2.2. US ⊆ U×S,user-subject relation. More than one subject can run on behalf of one user at the same time. But each subject can only run on behalf of one user, called its running user. • user: S→U, mapping from subject to its running user. user(s) =u if and only if u U (u, s) US.
∧
∈
∈
Definition 2.3. UA ⊆ U×R, user-role assignment relation. Each user can be assigned many roles and each role can be assigned to many users. • UR: U→2R, mapping from user to the set of roles assigned to that user. UR(u) ={r R (u, r) UA}. • SR: S→R, subject-role mapping, from the subject to its running role. Each subject has a running user and running role at anytime, and the role must have been assigned to that user: SR(s) UR(user(s)).
∈
∈
∈
Definition 2.4. role’s security label • RL: R→SL, mapping from role to its security label. • Ssl: S→SL, mapping from subject to its security label. Subject’s security label is the same as its running role’s security label: Ssl (s)=RL(SR(s)). Definition 2.5. RD ⊆ R×D, role-domain authorization relation. Each role has many authorized domains and each domain can be authorized to many roles. • RDom: R→2D, mapping from role to its authorized domains set. RDom(r) ={d D (r, d) RD}. • SDom: S→D, mapping from subject to its running domain. Each subject is running in only one domain at anytime, and the domain is authorized to that subject’s running role: SDom(s) RDom (SR(s)).
∈
∈
∈
Definition 2.6. object’s security attribute • OT: O→T, mapping from object to its type. • OL: O→SL, mapping from object to its security label.
∈
Definition 2.7. RCAP ⊆ R×CAP, role-permission authorization relation. (r1,cap) RCAP denotes that role r1 has the role permission cap.
296
L. Xia, W. Huang, and H. Huang
• Rolecap: R→2CAP, mapping from role to its authorized Role permissions set. Rolecap(r)={cap (r,cap) RCAP}.
∈
Definition 2.8. Two control matrixes • DTM: D×T→2P, domain-type access control matrix. p DTM(d, t) denotes that the subjects in domain d can access objects with type t in mode p. • DDI: D×D→{Φ,{transfer}}, domain-domain interaction control matrix. transfer DDI(d1, d2) denotes that subjects in domain d1 can transfer into domain d2.
∈
∈
∈
Definition 2.9. Multiply Levels Security rule: MLS_rule: SL×SL→2P, a MLS_rule (ls, lo) implies that subjects with security label ls can access objects with security label lo in mode a. This rule combines the BLP confidentiality and Biba integrity lattices. Let ls=(cs, is), lo=(co, io): • • • •
≥co, permit all read-related operations, that is: RP ⊆ MLS_rule(ls, lo). <co, deny all read-related operations, that is: ∀ p∈RP, p∉ MLS_rule (ls, lo). ≥io, permit all write-related operations, that is: WP ⊆ MLS_rule(ls, lo). <io, deny all write-related operations, that is: ∀ p∈WP, p∉ MLS_rule (ls, lo).
If cs If cs If is If is
2.3 Permission Decision Definition 2.10 • MLS permission (MLP): a subject’s MLP on an object is determined as follow: mlp(s, o)={(o, p) p MLS_rule (Ssl(s), OL(o))} z Domain permission (DP): a subject’s DP on an object is determined as follow: dp(s, o)={(o, p) p DTM(SDom(s), OT(o)) z Role permission (RP): a subject’s RP on an object is determined as follow: rp(s, o)={(o, p) (o, p) Rolecap(SR(s))} A subject’s Final permissions on an object is determined as: fp(s, o)=(mlp(s, o) ∩dp(s, o)) rp(s, o).
∈
∈
∪
∈
3 Examples of Policy Configuration 3.1 Trusted Subjects’ Permission Confinement We take the information interaction between user process and operating system as an example on the information flow from lower integrity level to higher integrity level. As shown in Figure 3. User data is in lower integrity level, and operating system data in higher integrity level. In order to satisfy the needs of system calls, User process is permitted to write to the buffer data space of the operating system, but no permission to the other data object of the OS. Similarly, operating system writes the data to the buffer object of the user process. To enforce the policy described above, each process is assigned a security attribute {role, domain}, denoting the process’s running role and running domain. And each data object is assigned a security attribute {(c, i), t}, denoting the object’s confidentiality level c, integrity level i and type t, as shown in Figure 3. And Rolecap(usr_r)={(w, kerbuffer)}, role usr_r has write permission to the kerbuffer object, its security label is (0,1),
A Compositional Multiple Policies Operating System Security Model
297
User process {usr_r, usr_d} {(0,1),usr_t
{(0,1), usrbuf_t}
usrbuffer
usrprivate
Kernel process {ker_r, ker_d} {(0,2),kerbuf_t}
kerbuffer
kerprivate
{(0,2),ker_t}
Fig. 3. Information interaction between user process and operating system Table 1. DTM ker_t ker_d usr_d
r,w
Kerbuf_t
usr_t
r
usrbuf_t w
r,w
r
and its authorized domains set is {usr_d}. Rolecap(ker_r)= Φ, ker_r has no role permission, its security label is (0,2), and its authorized domains set is {ker_d}. The DTM and DDI between the domains and types are shown in Table 3.1. We isolate the operating system data and user data from each other by dividing them into different integrity level. The usrbuffer and usrprivate data objects which are in the same integrity level are divided into different types, therefore User process can have different fine-grained permissions to these two objects. According to the MLS policy, User process has no write permission to the kerbuffer objects for its integrity level is lower than the object. However, this write permission is necessary for getting job done. So, we assign write permission on object kerbuffer to the role usr_r, this permission is independent of MLS and Domain permissions. In this way, the function of system call is achieved without giving too much permission to the User process to bring potential security flaws to system. 3.2 Channel Control We design a simplified firewall system to demonstrate the use of MPVSM in configuring channel control policies. The firewall is shown in Figure 4. The security policy of the firewall is that all information flow from outside network to inside network or in reverse direction must be checked by the access control component. It can be described as follow: information is only allowed to flow from the Outside to the Inside or in reverse direction via the Access Control. It can not be flowed directly between them. Besides, all components are permitted to read the Config without modifying it. And all components can append information to the Log, but can not read it. Our configurations to enforce this policy are shown in Figure 3.2. The DTM and DDI between domains and types are shown in Table 3.2. And Rolecap(fw_r) =Φ, role
298
L. Xia, W. Huang, and H. Huang
Firewall
Log {(3,0),con_t} {f_r, ac_d} Access Control
{(1,1),out_t}
{(1,1),in_t
Outside
Inside
Intranet
{fw r, in d}
Internet
{fw_r, out_d}
Config
{(0,3),con_t}
Fig. 4. Policy configuration of the Firewall Table 2. The DTM and DDI in_t in_d
con_t
r,w
r,a
r,w
r,a
r,w
out_d ac_d
out_t
r,w
in_d
out_d
ac_d
r,a
fw_r has no Role permissions. The security label of the fw_r is (1,1), and its authorized domains are {ac_d, in_d, out_d}, there is no transfer permission between any two of these three domains. We upgraded the confidentiality level of the Log to make it unreadable to the components, upgraded the integrity level of the Config to make it unmodifiable by components. Then we divided the subjects of the same security level to different domains, and objects of the same security level to different types. By controlling the fine-grained permissions between the domains to types, information channel control policy between the inside and outside network is enforced. 3.3 Enforcing Multiple Security Policies Through different configuring ways, Multi-Level security model, DTE and RBAC can be enforced separately in the MPVSM, and multi-policy views between different user groups can be enforced too. 3.3.1 Enforcing Multi-level Security model The way configuring MPVSM to enforce Multi-Level Security model, which based on both confidentiality and integrity lattices, is described as following:
A Compositional Multiple Policies Operating System Security Model
299
1. R = SL , number of roles in the system is the same as the number of the security labels. Each role corresponds to one security label. 2. D={gen_d}, T={gen_t}, there are only one domain and one type in the system. RD={(r, gen_d) r R}, indicates that all roles’ authorized domain is gen_d. The type of all objects is gen_t: OT={(o, gen_t) o O}. 3. DTM={(d, t, p) d D, t T, p P}, which indicates domain gen_d have all domain permissions to type gen_t. 4. Rolecap(r:R)=Φ, no role permission is authorized to every role.
∈ ∈ ∈ ∈
∈
3.3.2 Enforcing DTE 1. R={gen_r}, only one role in system. UA={(u, gen_r) u U}, role gen_r is assigned to every users. 2. RD={(gen_r, d) d D}, indicates that all domains in system are authorized to the role gen_r. 3. SL={(Only_C,Only_I)}, only one security label in the system, therefore: RL= {(r, Only_C, Only_I)) r R}. 4. Rolecap(r:R)= Φ.
∈
∈
∈
3.3.3 Enforcing RBAC 1. D={gen_d}, T={gen_t}, only one domain and one type in system. RD={(r, gen_d) r R}, gen_d is authorized to every role in system. The type of all objects is gen_t, OT={(o, gen_t) o O}. 2. DTM(gen_d, gen_t)=Φ, denotes subjects in gen_d domain have no domain permissions to objects in type gen_t. 3. SL={(Only_C, Only_I)}, only one security label. RL={(r,(Only_C,Only_I)) r R}.
∈
∈
∈
3.3.4 Enforcing Multiple Model Views Assume all users in the system can be divided into three groups: Grpa, Grpb and Grpc. Now we may hope that users in each group can perceive different access control model views. For instance, users in Grpa think that the security model enforced in system is MLS, users in Grpb think that is RBAC and users in Grpc think that is DTE. The configuring method that enforces this multi-model views in one system is given as below.
∪
∪
1. U=Grpa Grpb Grpc, the three sets are disjointed each other. 2. R= mls_rs rbac_rs {dte_r}. mts_rs is the roles set corresponding to MLS model. rbac_rs is the roles set corresponding to RBAC model. And dte_r is the role corresponding to DTE model. 3. D= {mls_d} {rbac_d} dte_ds. 4. (u, r) UA (u, r’) ∉ UA, where u Grpa, r mls_rs, r’ ∉ mls_rs, the roles in mls_rs are only permitted to be assigned to users in Grpa. (u, r) UA (u, r’) ∉ UA, where u Grpb, r rbac_rs, r’ ∉ rbac_rs, the roles in rbac_rs can only be assigned to users in Grpb. Similarly, (u, dte_r) UA (u, r) ∉ UA, where u Grpc, r≠dte_r, every user in Grpc is assigned the only role dte_r. 5. mls_rs = SL , number of roles in set mls_rs is the same as the number of security labels in system. Each role in mls_rs corresponds to one security label. MLS_rule(Ssl(r), tsl)=M, r rbac_rs, tsl SL, have all of possible MLS permis-
∪
∪
∪
∪
∈ ∧
∈
∈
∈
∈
∈
∈ ∧
∈ ∧
∈
∈
300
L. Xia, W. Huang, and H. Huang
∈
6.
7.
sions to other subjects or objects. MLS_rule(Ssl(dte_r), tsl)=M, tsl SL, role dte_r’s MLS permissions to other subjects or objects include all of possible permissions too. The security label of the roles in rbac_rs is the lowest level label of the system, that is: ∀ r rbac_rs, RL(r)=(cs, is), ∀ c C, cs c and ∀ i I, is i. the security label of dte_r is the highese level label of the system, that is: RL(dte_r)=(cs, is), ∀ c C, cs c and ∀ i I, is i. (r, mls_d) RD (r,d) ∉ RD, where r mls_rs, d≠mls_d, every roles in mls_rs is authorized the only domain mls_d. (dte_r, d) RD (r’, d) ∉ RD, where r’≠dte_r, d dte_ds, all domains in dte_ds are authorized to role dte_r. Simliarly, (r,rbac_d) RD (r,d) ∉ RD, where r rbac_rs, d≠rbac_d, every role in rbac_rs is authorized the only domain rbac_d. (mls_d,t,p) DTM,t T, p P. DDI(mls_d, d)=Φ, d D, subjects in domain mls_d can not transfer to other domains. (rbac_d, t, p) DTM, t T, p P, the subjects in domain rbac_d have all domain permission to all types’ objects in system. DDI(rbac_d, d)=Φ, d D, subjects in domain rbac_d can not transfer to other domains. For every r mls_rs {dte_r} and c CAP, (r, c) ∉ RCAP, there is no role permission authorized to roles in set mls_rs and the role dte_r.
∈
≥
∈ ∈ ∧
8. 9.
10.
∈
∈ ≥ ∈ ∧
≤
∈
∈
∈
∈ ∈ ∈ ∈ ∈
∈
∪
∈ ≤
∈
∈ ∧
∈
∈
∈
4 The Related Works Bell-LaPadula [1] (BLP) model mainly emphasizes the protection of confidentiality. It is able to limit flow of information and unauthorized information leakage. However, it does not care about the integrity, which is also important [2,7,11]. Besides, BLP is weak in channel control of information flow [22]. Biba Integrity Model [6] is the mathematical dual of BLP, intending to protect the integrity in system. Type enforcement is a table-oriented mandatory access control policy for confining applications and restricting information flows. DTE [9] is an enhanced version of type enforcement designed to provide needed simplicity and compatibility. Role-based access control [14] provides a valuable level of abstraction to promote security administration at a business level. The Flask [10] security architecture emphasizes diverse security policies support. However, it applies only MAC to the Fluke Microkernel. It provides the mechanisms for diverse policies without giving how to enforce multiple policies in system. One of the earliest MAC mechanisms in operating system is Lattices [1, 20]. For instance, LOMAC [13] enforces Biba integrity. CMW [8] can dynamically relabel the current object for increased flexibility. Recently, Asbestos [17] provides labeling and isolation mechanisms that can support applications to express a wide range of policies and make MAC more practical. KernelSec [18] aims at improving the effectiveness of the authorization model and the security policies that can be implemented. In capability-based confinement systems, KeyKOS [21] achieved military-grade security by isolating processed into compartments and interposing reference monitors to control the use of capabilities. EROS [19] later successful realized this principles on
A Compositional Multiple Policies Operating System Security Model
301
the modern hardware. And the Coyotes kernel [5] mainly explores use of software verification techniques to achieve higher confidence in the correctness and security of the kernel. Mandatory access control can also be achieved with unmodified traditional operating system through virtual machines [16, 4].
5 Conclusions The Compositional Multiple Policy Security Model is presented, which is based on the MLS model, combining the domain and type attributes to the model, to eliminate the limitations of MLS models. It has enforced expression power in channel control policies, and made permission management more fine-grained and enhanced the ability of confining the permission of the trusted subjects. MPVSM is also able to enforce multiple policy views in operating system in a flexible way.
References 1. Bell, D., La Padula, L.: Secure Computer Systems: Mathematical Foundations. Technical Report MTR-2547, vol. I, MITRE Corporation (1975) 2. Amoroso, E., Nguyen, T., Weiss, J., et al.: Towards an Approach to Measuring Software Trust. In: 1991 IEEE Symposium on Research in Security and Privacy, pp. 198–218 (1991) 3. Organick, E.: The MULTICS System: An Examination of Its Structure. MIT Press, Cambridge (1972) 4. Karger, P.A., Zurko, M.E., Bonin, D.W., et al.: A VMM security kernel for the VAX architecture. In: 1990 IEEE Symposium on Security and Privacy, pp. 2–19 (1990) 5. Shapiro, J., Doerrie, M.S., Northup, E., et al.: Towards a Verified, General-Purpose Operating System Kernel. In: 1st NICTA Workshop on Operating System Verification (2004) 6. Biba, K.: Integrity Considerations for Secure Computer Systems. Technical Report MTR-3153, MITRE Corporation (1977) 7. Eswaran, K., Chamberlin, D.: Functional Specifications of Subsystem for Database Integrity. In: The International Conference on Very Large Data Bases (1975) 8. Berger, J.L., Picciotto, J., Woodward, J.P.L., Cummings, P.T.: Compartmented mode workstation: Prototype highlights. IEEE Transactions on Software Engineering, Special Section on Security and Privacy 16, 608–618 (1990) 9. Badger, L., Sterne, D.F., Sherman, D.L., et al.: A Domain and Type Enforcement UNIX Prototype. In: 5th USENIX UNIX Security Symposium (1995) 10. Spencer, R., Smalley, S., Hibler, M., et al.: The Flask Security Architecture: System Support for Diverse Security Policies. In: 8th USENIX Security Symposium, pp. 123–139 (1999) 11. Lipner, S.: Non-Discretionary Controls for Commercial Applications. In: 1982 Symposium on Privacy and Security (1982) 12. Osborn, S., Sandhu, R., Munawer, Q.: Configuring Role-based Access Control to Enforce Mandatory and Discretionary Access Control Policies. ACM Transactions on Information and System Security 3, 85–106 (2000) 13. Fraser, T.: LOMAC–low water-mark mandatory access control for Linux. In: 9th USENIX Security Symposium (1999)
302
L. Xia, W. Huang, and H. Huang
14. Sandhu, R., Coyne, E., Feinstein, H., Youman, C.: Role-Based Access Control. IEEE Computer 29 (1996) 15. Loscocco, P., Smalley, S.: Meeting critical security objectives with security-enhanced linux. In: Ottawa Linux Symposium 2001 (2001) 16. Goldberg, R.P.: Architecture of virtual machines. In: AFIPS National Computer Conference, vol. 42, pp. 309–318 (1973) 17. Efstathopoulos, P., Krohn, M., VanDeBogart, S., et al.: Labels and Event Processes in the Asbestos Operating System. In: 20th Symposium on Operating Systems Principles (2005) 18. Radhakrishnan, M., Solworth, J.A.: Application Support in the Operating System Kernel. In: ACM Symposium on Information, Computer and Communications Security (2006) 19. Shapiro, J.S., Smith, J.M., Farber, D.J.: EROS: A Fast Capability System. In: 17th ACM symposium on Operating systems principles (1999) 20. Saltzer, J.H., Schroeder, M.D.: The protection of information in computer system. Proceedings of the IEEE 63, 1278–1308 (1975) 21. Key Logic. The KeyKOS/KeySAFE System Design (1989), http://www.agorics.com/ Library/KeyKos/ keysafe/Keysafe.html 22. Rushby, J.: Noninterference, Transitivity, and Channel-Control Security Policies. Technical Report CSL-92-02, Computer Science Lab, SRI International (1992)
Longer Randomly Blinded RSA Keys May Be Weaker Than Shorter Ones Colin D. Walter Comodo Research Laboratory 7 Campus Road, Bradford, BD7 1HR, UK [email protected]
Abstract. Side channel leakage from smart cards has been of concern since their inception and counter-measures are routinely employed. So a number of standard and reasonable assumptions are made here regarding an implementation of RSA in a cryptographic token which may be subjected to non-invasive side-channel cryptanalysis. These include blinding the re-usable secret key, input whitening, and using an exponentiation algorithm whose operation sequence partially obscures the key. The working hypothesis is that there is limited side channel leakage which only distinguishes very imprecisely between squarings and multiplications. For this typical situation, a method is described for recovering the private exponent, and, realistically, it does not require an excessive number of traces. It just requires the modulus to be public and the public exponent not to be too large. The attack is computationally feasible unless parameters are appropriately adjusted. It reveals that longer keys are much more vulnerable than shorter ones unless blinding is proportional to key length. A further key conclusion is that designers must assume that the information theoretic level of leakage from smart cards can be transformed into usable key information by adversaries whatever counter-measures are put in place. Keywords: Side channel leakage, power analysis, SPA, DPA, RSA.
1
Introduction
Side channel leakage of secret key information from cryptographic devices has been known publicly for a number of years [1], and very widely since the work of Kocher [6,7]. In the case of RSA, the main software counter-measures to this have included message whitening, key blinding and more complex exponentiation algorithms. These, therefore, form the main assumptions here. In the past, there were no obvious ways of extracting weak leaked information from this and using it to recover the secret key. Either the leaked information had to distinguish clearly between squarings and multiplications for individual uses of the key [3] or, with less precise leakage, the same key had to be re-used many times in an unblinded state so that the leakage could be averaged to reduce noise [6,2,13]. S. Kim, M. Yung, and H.-W. Lee (Eds.): WISA 2007, LNCS 4867, pp. 303–316, 2007. c Springer-Verlag Berlin Heidelberg 2007 0002
304
C.D. Walter
However, from the information-theoretic standpoint, it is clear that there can be enough data for the key to be recovered when blinding is used but side channel leakage is imprecise. Here a means for obtaining the key is given for that situation, developed from the case of perfect, but partial, side channel information described by Fouque et al. [3]. One of the main contributions here is a metric for evaluating choices and enabling the best to be investigated first. The first objective is to determine the blinding factor for several dozen cases. This is done by brute force: testing every possible value until one is found which would provide a trace that matches the measured trace sufficiently well under a suitable metric. The analysis is complicated by an unknown factor k equal to the size of the public exponent E. That factor must also be determined from the side channel leakage in the same way, and therefore affects the computational feasibility of the method if E is large. However, there is no obvious way to avoid the exhaustive search. Once the blinding factors are determined for as many traces as are needed, the second objective is to determine the unblinded private exponent. Its bits are guessed from most to least significant by looking at both possible values and selecting the one which matches the observed leakage better. Incorrect choices are quickly noticed, and corrected by back-tracking and lookahead. This phase of the attack is less computationally intensive than the first, but it requires more traces when the leakage is weaker – a number inversely proportional to the strength of leakage. The adversary makes use of certain properties of the exponentiation algorithm which lead to the leakage. The standard 4-ary sliding windows [4] considered here has a pattern of squarings and multiplications which contains information about the bit pattern of the exponent. The method applies equally well to any other algorithm with a variable pattern of operations where the variation is derived from a local property of the secret key, such as bit values. Finally, the complexity of the attack is considered. There is low space complexity and the attack is highly, and easily, parallelisable to make full use of computing resources. The total time complexity is of order which is the product of the public key, the maximum blinding factor and a measure of the unreliability of the side-channel leakage. Thus, it appears to be computationally feasible to extract the key in many normal circumstances. A significant conclusion is that, for a fixed amount of blinding, longer keys are less secure because blinding factors are determined more accurately. This means that blinding should be increased in proportion to key length in order to thwart the attack. The organisation of the paper is as follows. The main assumptions, the leakage model, pre-requisite notation and background algorithms are covered in sections §2 to §5. Phase 1 of the attack, during which the blinding factors are recovered, is treated in §6. Phase 2 of the attack, namely the recovery of the secret key, is described in §7. The computational cost is reviewed in §8 and wide-ranging conclusions are drawn in §9.
Longer Randomly Blinded RSA Keys May Be Weaker Than Shorter Ones
2
305
Notation
The n-bit RSA modulus N and public exponent E are assumed to be known by the attacker. His aim is to recover the private key D which is re-used a number of times but only in the blinded form Di = D+ri φ(N ) where ri < R is a small random number (typically up to 32 bits) and φ(N ) is unknown. The modulus is a product of two primes N = P Q which must be of similar magnitude. For convenience, we assume P and Q have the 0002bits. √ same number of Then, without loss of generality, P < Q < 2P so that 2 N < P +Q < 3 N/2 and φ(N ) = N −(P +Q)+1 is bounded by 0002 √ (1) N − 3 N/2 + 1 < φ(N ) < N − 2 N + 1 √ This interval has length less than 18 N , so that more than half of the most significant bits of φ(N ) are known by the attacker from those of N . The exponents D and E are related by D×E = 1+kφ(N )
(2)
for some k. Without loss of generality, let D be the smallest non-negative solution to this congruence, so that D < φ(N ) and k < E. When key D is re-used many times, blinding factors are normally added to produce the randomly different exponents which are actually used for decryption or signing [6]: Di = D+ri φ(N )
(3)
where ri is a random number, usually of 16 to 32 bits. Thus, Di =
1 + (k+ri E)φ(N ) E
(4)
Let R be an upper bound on such ri . Then the coefficient k+ri E of φ(N ) is, in effect, a random number in the range 1 to RE. (So it is irrelevant whether k and D were chosen minimally in equation (2)). In equation (4) the adversary is initially only interested in the most significant half of the bits. He ignores the 1 and approximates φ(N )/E by computing N/E. By the earlier remarks this gives him at least the top n/2 bits of φ(N )/E. So, Di ≈ (k+ri E)N/E
(5)
The attacker now has to generate each of the RE possible values of the random coefficient of N/E in order to obtain a set containing an approximation to the value of Di used in the exponentiation which he has observed.
3
The Exponentiation
For convenience we assume that the exponentiation algorithm is 4-ary sliding windows using the re-coding in Fig. 1 [4,5]. This uses a window of 1 bit width
306
C.D. Walter Input: Binary D = (bn−1 ..b2 b1 b0 )2 Output: Recoding D = (dm−1 ..d2 d1 d0 ) i ← 0 ; m ← 0 ; While i < n do If bi = 0 then Begin dm ← 0 ; i ← i+1 ; m ← m+1 ; End else Begin dm ← 2bi+1 + 1 ; i ← i+2 ; m ← m+1 ; End Fig. 1. Quaternary Sliding Windows Recoding
when there is a digit zero in the recoded exponent, and otherwise a window of 2 bits width, for which the digit is 1 or 3. Although this does not provide the same protection against side channel cryptanalysis as the square-and-always multiply algorithm, it is more time efficient even than the usual square-andmultiply algorithm and also creates some difficulty for an attacker who may have to distinguish whether the multiplications pertain to digit 1 or digit 3. This algorithm, or its fixed-width equivalent, is typical of a smart card because of its speed and low storage overhead: only the first and third powers of the input message need storing for the exponentiation. Both algorithms generate a pattern of squarings and multiplications which is related to occurrences of the zero digit. This is the property that can be exploited here by an attacker.
4
The Leakage Model
With expected counter-measures in place it is unrealistic to assume that every long integer multiplicative operation in an RSA exponentiation can be identified and distinguished as a squaring or not. However, some imperfect deductions may be possible from a side channel trace, particularly in contactless cards where severe resource limitations and an explicit aerial limit the scope and effectiveness of any counter-measures. The following two leakage scenarios are likely in practice. Others are certainly possible. First, because the conditional subtraction in a Montgomery modular multiplication ([8], see Fig. 2) consumes a number of extra clock cycles, there is a possibility that it may be observed in a side channel trace via the longer time for the operation. The slightly different frequencies of the subtraction for squares and multiplies mean that each occurrence or absence of the subtraction makes
Longer Randomly Blinded RSA Keys May Be Weaker Than Shorter Ones
307
Input: A and B such that 0 ≤ A, B < N < r n and N prime to r. Output: C = ABr −n mod N C ← 0 ; For i ← 0 to n-1 do Begin qi ← -(c0 +ai b0 )n0 −1 mod r ; C ← (C+ai B+qi N) div r ; End ; { Assertion: Cr n ≡ A×B mod N and ABr −n ≤ C < N +ABr −n } If C ≥ N then C ← C-N ; Fig. 2. Montgomery’s Modular Multiplication Algorithm (MMM)
a square or multiplication marginally more likely [13]. As previous attacks have been unable to use this information in the presence of exponent blinding and message whitening, implementors may not perceive the leakage as a threat when such counter-measures are in place. One can therefore expect many of them to prefer more widely applicable code which includes the conditional subtraction, despite the existence of straightforward and efficient alternatives [12]. Secondly, the data loading cycles for multiplications and squarings are different and therefore vulnerable. For example, the Hamming weight of the words of the arguments may leak when they pass along the internal bus [7]. A squaring is almost certain where the Hamming weights are equal, and a multiplication must be the case if they are different. However, this information is usually well submerged in noise, and in a well designed implementation it should only yield a minimal bias towards a squaring or a multiplication. The above are very much more realistic leakage models than that of [3] where it was assumed that each multiplicative operation was known to be a squaring or a multiplication. In practice, only weak probabilistic information is known. In order to obtain specific measures of implementation strength, the attack here is modelled on the level of data leakage from observing every conditional subtraction in Montgomery modular multiplication. However, the attack is generic, and applies to both of the above scenarios as well as many others.
5
Selecting the Leakiest Traces
The word-based algorithm for Montgomery multiplication (MMM) is given in Fig. 2 where the digits ai , bi etc. are for the base r representation of the long integers A, B etc. From the assertion after the loop it is easy to establish the frequency of the conditional subtraction under the reasonable assumption of the output residues being uniformly distributed modulo N . The probability is proportional to the fraction of the interval which is greater than N , namely ABr−n N −1 . For a typical multiplication with independent arguments, this can be summed with respect to A and B over the interval [0, N ) to obtain the average probability of
308
C.D. Walter
pM ≈
1 −n Nr 4
(6)
Similarly, setting A = B and summing gives the probability of the subtraction for a squaring, namely 1 pS ≈ N r−n (7) 3 The difference between pM and pS shows that the occurrences of a conditional subtraction indicate a squaring is slightly more likely to be the case than a multiplication. The difference, however, is small. Early attacks on Montgomery’s algorithm relied on being able to perform hundreds or thousands of exponentiations with the same key in order to observe enough subtractions to conclude with high probability whether the operation was a squaring or a multiplication. The formulae (6) and (7) also indicate that decreasing N or increasing the number of iterations n will reduce the occurrences of the conditional subtraction and so make the algorithm more secure. However, the multiplications in individual exponentiations are not as random as used for the formula (6). For 4-ary sliding windows, one of the two precomputed powers of the input message is used as one of the arguments, the other being a random output from an earlier Montgomery multiplication. So only one argument is uniformly distributed in a given exponentiation. Let A be the fixed input to such a multiplication. Then, summing ABr−n N −1 with respect to B yields the true probability of a subtraction, viz. pA ≈
1 −n Ar 2
(8)
Thus, when the pre-computed powers of the input are small (resp. large) there will be very few (resp. many) conditional subtractions resulting from multiplications. This increases the probability of distinguishing between squarings and multiplications. Overall, this will be noticed by an adversary because the total number of conditional subtractions will be less (resp. greater) than the average for such exponentiations. This provides the opportunity for the adversary to select the leakiest traces with very little computational effort. Similarly, in the Hamming weight leakage scenario, instead of an enhanced or reduced frequency of conditional subtractions from large or small values of A, the adversary homes in on the argument pairs A, B which are the highest Hamming distance apart. They have the highest probability of being multiplications. This occurs most frequently when the re-used, pre-computed multiplier A has the highest number of extreme Hamming weights. So, by screening for extreme Hamming weights, traces which leak significantly more information than average can be identified easily by the adversary. In both leakage models, the attacker can therefore begin by selecting side channel traces which yield the greatest amount of information, and these can be chosen without excessive computational effort for the initial phase of data capture, signal processing and selection.
Longer Randomly Blinded RSA Keys May Be Weaker Than Shorter Ones
6
309
The Attack: Phase 1
In the leakage scenarios of §4, the attacker is expected to obtain little or no useful information about a multiplicative operation in many cases, and only a very weak probability in favour of a squaring rather than a multiplication (or vice versa) in other cases. However, as described in §5, he begins his attack by collecting as many traces as possible and selecting those for which the leakage promises to be greatest. Phase one of his attack then progresses as follows. Suppose he has selected a promising trace corresponding to the use of the blinded exponent Di . He first determines the top half of the digits of φ(N )/E as in §2 and then guesses the values of k and ri . Equation (5) gives him a possible approximation Di 0003 for Di . He then compares the side channel leakage expected from Di 0003 with that obtained from Di and discards the guessed pair (k, ri ) if the match is poor. Repeating this for all pairs leaves him with a set Si of the most likely blinding values for Di . This process is repeated with more traces – enough for him to complete the second phase successfully. The decision about whether guesses are good enough is based on a metric μ(tr(Di ), ops(Di 0003 , m)). The first parameter tr(Di ) is the processed side channel leakage from use of the unknown blinded key Di . Specifically, it is a list of probabilities pr(op) that the operations op of the exponentiation using Di were squares rather than multiplications. Thus tr(Di ) = [pr(ops ), pr(ops−1 ), .., pr(op3 ), pr(op2 ), pr(op1 )] where s = len(tr(Di )) is the total number of operations in the exponentiation with key Di . In the second parameter, Di 0003 is the bit sequence for the guessed value of Di , and ops(Di 0003 , m) is the sequence of multiplicative operations carried out in an exponentiation with key 0004Di 0003 /2m 0005. So the m least significant bits of Di 0003 are irrelevant, and need not have been guessed yet. This parameter will be a list containing, say, ‘0’ to denote a squaring, and ‘1’ a multiplication. In this phase we will set m = n/2+ log2 R. If the side channel leakage tr = tr(Di ) for Di indicates with probability trj that the jth operation was a squaring and the jth operation in ops(Di 0003 , m) is also a squaring then trj − 12 is added to the metric. However, if a multiplication occurs as the jth element in ops(Di 0003 , m), then 12 −trj is added. So 0 is added if the leakage provides no information since then trj = 12 , but there is a positive contribution to the sum when the operation in tr(Di ) is more likely to be the same as that in ops(Di 0003 , m), and there is a negative contribution when the two operations are more likely to be different. Thus, the sum for calculating the metric is μ(tr, ops(D0003 , m))
=
0003 1≤j≤nops(0005D0002 /2m 0006)
0002 1 (−1)ops(D ,m)j (trj − ) 2
(9)
when ops(D0003 , m)j ∈ {0, 1} as suggested above, and nops(D00030003 ) is the number of operations in exponentiating to the power D00030003 . Here the lists tr and ops(D0003 , m) are in temporal execution order assuming an exponentiation algorithm which processes bits from most to least significant. This correctly aligns corresponding elements of the lists when the lowest bits of D0003 are ignored.
310
C.D. Walter
If the guess (k, ri ) is correct, the value Di 0003 provides the same pattern of squarings and multiplications as Di over the first (approximately) half of the operations of the exponentiation. So, unless the noise is overwhelming, this should maximise the value for the sum when restricted to those operations. Therefore larger values for μ imply a better match between the guessed value Di 0003 and the targetted exponent Di , whereas smaller and negative values indicate a poor match. Because of the unknown difference between N and φ(N ), the n/2+ log2 R least significant bits of Di 0003 are unreliable even when (k, ri ) is guessed correctly. By taking m = n/2+ log2 R the operations corresponding to them are ignored. The metric could be improved by taking account of the dependence between consecutive operations: for example, the probable occurrence of a multiplication implies that the next operation is more likely to be a squaring. Schindler [9,11,10] treats this in detail and provides theory about the best choice of metric. 6.1
Phase 1 Simulation
In our simulation, values were chosen which correspond to a leaky implementation of MMM where every conditional subtraction is observed and N ≈ rn . Conditional subtractions were generated randomly with frequencies in accordance with the model described in §5. There was no selection of “better” traces on the grounds of fewer or more subtractions than normal. Since conditional subtractions occur with slightly greater frequency for squarings than for multiplications, the metric was (arbitrarily) incremented by pj − 12 = 0.1 for every conditional subtraction in the trace when there was a squaring in the guessed value and therefore incremented by 12 −pj = −0.1 (i.e. decremented) for every conditional subtraction that coincided with a multiplication. Table 1. Proportion of Guesses returning a Higher Value of μ than the Correct One log2 RE
8
n = 384 7.9×10−3 512 2.7×10−3 768 5.3×10−4 1024 2.0×10−5 1536 < 2.5×10−7
12
16
32
48
8.0×10−3 2.4×10−3 3.2×10−4 1.9×10−5 < 2.5×10−7
8.2×10−3 3.4×10−3 1.0×10−3 8.7×10−4 < 2.5×10−7
5.0×10−3 2.0×10−3 1.4×10−4 1.7×10−5 ..
3.6×10−3 1.0×10−3 1.2×10−4 8.0×10−6 ..
With only this weak knowledge to distinguish between squarings and multiplications, the “best” guess is rarely the correct one. The correct values are ranked among the best, but do not usually come top. Therefore, to assess the feasibility of the attack, it is necessary to know the size of the set S of best guesses which is big enough to include the correct guess. This depends on the strength of the leakage. With the parameters just described, the results in Table 1 were obtained. It gives a good indication of how well the matching process works and shows the minimum proportion of all guesses which must be considered if the correct one is not to be excluded. For example, with a modulus of n = 1024 bits, and
Longer Randomly Blinded RSA Keys May Be Weaker Than Shorter Ones
311
RE = 232 , the leakage of interest is from the top 512 or so bits. Then the metric μ places the correct values (k, ri ) above all but about RE×1.7×10−5 ≈ 216 incorrect values, on average. In information theoretic terms, the metric has extracted about 16 bits from the side channel, i.e. about 1 bit in every 512/16 = 32. This is the case for all the entries in the table: they all correspond to about 1 bit per 32 in the top half of the key, i.e. n/64 bits in total. An improved metric is possible (e.g. taking into account multiplications having to be next to squarings) and this would enable more information bits to be obtained. However, for 2048-bit keys (not tabulated), this means about 32 bits’ worth of information is recovered, so that k and ri should be determined almost uniquely when RE ≤ 232 . This is indeed what was found in the simulation. Clearly, longer keys are more vulnerable: – For a given size of blinding and public exponent, the longer the key, the more likely (k, ri ) is to be guessed correctly and uniquely. The figures in the table show little effect from increasing RE. k and ri blind information equivalent to about log2 RE bits’ worth of operations. However, longer blinding factors also seem to constrain the pattern of the blinded key more tightly. With these conflicting forces, the average success of the method is little changed: the number of bits leaked depends almost entirely on the length n of the key. Consequently, for a given key length, the same proportion of choices (k, ri ) are removed irrespective of the value of RE. Of course, the number of accepted pairs must still increase directly in proportion to RE. Thus, – Typical leakage from 2048-bit or longer keys will usually reveal (k, ri ) correctly with current standards for key blinding and a small public exponent; and – In these cases, an exhaustive search is computationally feasible to find the correct blinding factors (k, ri ). Incidentally, a powerful counter-measure in the case of Montgomery conditional subtractions is just to halve the modulus. This halves the number of conditional subtractions, and so halves the number of bits which are leaked. 6.2
Combining Traces to Determine k in Phase 1
The leakage from t traces can be processed for an outlay of t times the effort for one. If these traces are independent, t times as much bit information is extracted. Thus, a very small number of traces should result in k being determined with some confidence, since the same k is used in all cases. In fact, the correct value of k should have been guessed for all or almost all traces, and, if there is any bias, the correct value for k should be one of the most popular among the best guesses for an individual trace. Guesses at k are ranked as follows. For each sufficiently good guess k+ri E, the value of k = (k+ri E) mod E is extracted and the associated value of the
312
C.D. Walter
metric μ is added to the weighting of k. The higher the total weight for a guess at k, the more likely that value is to be correct. The possible values of k are then considered in descending order of weight in Phase 2, the heaviest first. Our simulation did not investigate how much this ranking reduces the search space in Phase 2 as a function of t; from the information theoretic point of view, it seems possible that k is almost completely determined by only a very small number of traces. This is an important detail that still needs to be researched as it affects the effectiveness of the blinding.
7
The Attack: Phase 2
Let Si be the set of plausible guesses at (k, ri ) for the ith trace, and suppose Si is partitioned into subsets Sik which share the same k. Armed with these sets, the adversary progresses to phase 2, which is the recovery of the remaining, least significant bits of φ(N ). He repeats this phase for each k separately, choosing the most likely k first. φ(N ) is constructed bit by bit from the most significant end. The first half of φ(N ) was obtained already from the public N and equation (1). Let φ(N )j = (φn−1 φn−2 ..φj )2 be the part of φ(N ) already determined, so φj−1 is the next bit to be guessed. Let Φj = (φj−1 φj−2 ..φj−w )2 be a guess at the next w bits of φ(N ). For each possible value of word Φj , the right side of equation (4) is evaluated with φ(N )j−w in place of φ(N ). This yields an approximation Dri ,j−w to Di in which only the most significant n−j+w bits are of interest. The same metric as in Phase 1 is used again to measure how well this matches the leakage from Di , namely μ(tr(Di ), ops(Dri ,j−w , j−w+ log2 R)). (As before, at the point before division by E, we ignore the lowest log2 RE bits containing a contribution from φj−w because they are too contaminated by the carries up from less significant bits of φ(N ).) For the given k, the sum 0003 0003 μ(tr(Di ), ops(Dri ,j−w , j−w+ log2 R)) (10) μw (k, j, Φj ) = i
ri ∈Sik
over all guesses is used to assess the worth of the choice for Φj . The leading bit of Φj from the maximum μw (k, j, Φj ) is selected as the value for φj−1 . Correct bit choices amplify any peak (i.e. maximum) values of the metric μw whilst incorrect choices decrease it. Moreover, previous mistakes reduce any peaks. When that happens, it is necessary to backtrack and select the most promising previous value. The difference between the two cases is determined using a threshold value for the metric which is obtained by experience. When it becomes too low for every value of Φj , it is necessary to backtrack and select the most promising previously untried value. The least significant bits of Φj are partly masked by carries up, and contribute less to the peak values than the more significant bits. So only the top one or two bits of the best Φj are chosen each time. In this way the bits φj are chosen from most to least significant. Once most bits have been guessed, the final log2 E bits are fully determined by the division being exact in equation (4).
Longer Randomly Blinded RSA Keys May Be Weaker Than Shorter Ones
313
Table 2. Probability of predicting the correct bit of φ(N ) from t correct guesses ri with w lookahead bits when n = 1024 and log 2 RE = 16 w t = 25 t = 50 t = 100 t = 250
7.1
1
2
3
4
6
8
0.613 0.642 0.673 0.706
0.767 0.819 0.846 0.863
0.833 0.896 0.922 0.930
0.868 0.939 0.954 0.971
0.914 0.973 0.981 0.991
0.930 0.989 0.994 0.995
Phase 2 Simulation
For the simulation it was assumed that the correct (k, ri ) had been chosen for each i, i.e. that k had been deduced correctly and for the ith trace only the correct ri had been selected. So Sik = 1 and Sik0002 = 0 if k 0003 = k. From the conclusions about Phase 1, this should usually be the case for long keys. As long as there is a reasonable probability of detecting the correct bit each time, all of φ(N ) can be determined. Typical probabilities can be seen in Table 2. There seems little to be gained from having more than 100 traces; more is achieved by having more lookahead bits. In fact, the probability of picking the wrong bit seems to fall exponentially as the number w of lookahead bits increases1 . From Table 3, w ≥ 8 allows a significant proportion of keys to be recovered if the k and the randoms ri have been guessed correctly. The figures are for an implementation of the algorithm without backtracking. When incorrect bits are predicted, the process does not recover and random bits are generated thereafter. With most bits being correct, backtracking is a cheaper alternative to solve this than increasing the number of lookahead bits. Assuming that Table 2 probabilities are constant over the length of the key and are independent of the key length, it is possible to compute the probability of successfully recovering the key: approximately pn/2 where p is the table entry and n the key length. Table 3 gives these probabilities as obtained from a simulation with 100 traces and w = 8. This corresponds to p = 0.9973. With 10 lookahead digits the simulation shows there is a 60% chance of recovering 2048-bit keys, and this corresponds to p = 0.999512. Lastly, with 50 traces but w varying dynamically between 8 and 16 as necessary, 2048-bit keys were recovered in 11% of cases. Since the values of k and ri from Phase 1 will be mostly correct for 2048-bit keys with log2 RE ≤ 32, – It is computationally feasible to recover a substantial number of 2048-bit keys using 50 traces, current standards for random blinding, typical small public exponents, and expected levels of weak side channel leakage. 1
The maximum values of μw (k, j, Φj ) were computed where Φj ranged over i) values with φj = 0 and ii) values with φj = 1. The difference between these was a good indicator of the reliability of the choice of φj . Increasing w just for the cases for which this difference was smallest led to a remarkable improvement in accuracy. Moreover, decreasing w for other cases led to a considerable computational saving. Many 2048-bit keys were recovered successfully using just 50 traces and varying w between 8 and 16.
314
C.D. Walter
Table 3. Probability of success in determining φ(N ) from t = 100, correct ri s and key length n with w = 8 lookahead bits, no back-tracking and log2 RE = 16
7.2
n
512
768 1024
1536
2048
prob
0.50
0.40
0.13
0.04
0.29
The Case of Some Incorrect Phase 1 Deductions
Now consider the case where not all pairs (k, ri ) are correct. If (k, ri ) is incorrect then the above process applied only to this pair (i.e. t = 1 and Sik = 1) would result in choosing the lower bits of φ(N ) to satisfy (4) with the incorrect values (k, ri ) and the correct Di . This makes the lower bits incorrect by a multiplication factor of (k 0003 +ri0003 E)/(k+ri E) where (k 0003 , ri0003 ) is the correct pair. Moreover, for these bit choices the metric retains the peak values associated with a correct choice. So, without the context of other traces, the error will remain undetected and the pair (k, ri ) cannot be removed from consideration. Thus, if the above process is performed with a set of pairs (k, ri ), some of which are correct and others incorrect, then the incorrect values predict random bits, while the correct ones predict the correct bits. This averages to a weaker prediction of the correct bits. However, the incorrect choices become more apparent as more correct bits are appended to φ(N ). Eventually this is noticed and those choices can be dropped to speed up the process. It is easy to choose threshold values for the metric – several standard deviations below the average, say – to guide this decision. So the Phase 2 process is applied to all the outputs of Phase 1 for a given k, i.e. every (k, ri ) ∈ Sik for every trace, and the sum of all the metric values is used to choose the lower bits of φ(N ). Clearly, however, the limiting factor in this phase is the ratio of correct to incorrect predictions (k, ri ). If this is too small it will not be possible to identify correct bits through peaks in the value of the metric. Table 1 shows that key length is a very strong contributor to this: longer keys improve the ratio, making recovery of φ(N ) much easier. 7.3
Comparison with Fouque
In this algorithm the bits of φ(N ) are determined in the reverse order from that used by Fouque et al. [3]. This has several advantages. It makes the transition between the known upper half of φ(N ) and unknown lower half seamless, it allows the metric easily to include the value of all previous decisions, and it allows the division by E to be done straightforwardly. The problems of carry influence in the multiplications of equation (4) is similar for both directions.
8
Complexity
The first phase has time complexity O(REt log(RE)) where t is the number of traces needed to complete the second phase, and depends on the level of leakage.
Longer Randomly Blinded RSA Keys May Be Weaker Than Shorter Ones
315
This complexity results from an exhaustive search over all possible (k, ri ). It was remarked that there was an information leakage which is proportional to the length of the traces. Therefore, recovering the log2 (RE) bits of each (k, ri ) only requires processing a part of the traces with length O(log(RE)), not the whole length. Space is not an issue in this phase as only one pair (k, ri ) need be considered at any one time. The pairs are treated independently and so the work can be completely parallelised. For standard choices of E = 216 +1, R = 232 and a similar level of leakage to the example, this is clearly computationally feasible. In the second phase the worst situation is that all RE guesses are considered for every trace at each bit selection, making a total time complexity O(REnt), which is at worst similar to the first phase. However, if only R0003 E 0003 guesses survive then the complexity is reduced to O(R0003 E 0003 nt). This assumes that metrics do not have to be recomputed over the whole length of the trace every time another bit is guessed; instead the incremental effect of the new bit is used to update the preceding value. This approach requires O(R0003 t) space as different values of k are processed sequentially. The second phase requires strong leakage or a high ratio of correct pairs (k, ri ) to have a chance of working. Therefore practical limitations on the number of traces that can be obtained guarantees that space will not be the overriding problem. Furthermore, the work can be parallelised without difficulty at least as far as distributing the effort for each k to different processors. This would reduce the time complexity by a factor of O(E).
9
Conclusion
The scope of the attack of Fouque et al. [3] has been extended to include imprecise leakage by introducing a practical metric which prioritises the selection of guesses at the random blinding factors and bits of φ(N ) for an RSA modulus N . Both attacks target the typical set-up for RSA decryption/signing in a smartcard with standard counter-measures which include exponent blinding. It was found that very weak, imprecise leaked data could be successfully manipulated to reduce the ambiguity in the blinding factors by a factor essentially proportional to the length of the keys, so that the blinding factors are fully determined when the key is long enough. For typical choices of public exponent and blinding parameters, and a leakage rate equivalent to only 1 bit per 32 bits of key per trace, the blinding factors can be recovered correctly for keys above about 2048 bits in length. Reconstruction of the unknown lower bits of φ(N ) requires most of the blinding factors to be recovered correctly and sufficiently many traces to be available. With a leakage rate of 1 bit per r key bits, 1.5r traces suffice to recover φ(N ) and hence factor N without any need for an expensive search. In a simulation, a sizeable proportion of 2048-bit keys were successfully recovered using leakage from only 50 traces (r=32). Thus the attack is certainly computationally feasible with only weak, imprecise leakage. So longer keys were found to be more vulnerable. The best counter-measure is to ensure that blinding increases with key length at least until it becomes
316
C.D. Walter
computationally infeasible to test every blinding value individually. The attack illustrates that the information theoretic level of leakage can be into converted successfully into the secret key even in the presence of a typical collection of standard counter-measures.
References 1. Portable Data Carrier including a Microprocessor, Patent 4211919, US Patent and Trademark Office (July 8, 1980) 2. Dhem, J.-F., Koeune, F., Leroux, P.-A., Mestr´e, P., Quisquater, J.-J., Willems, J.L.: A practical implementation of the Timing Attack. In: Schneier, B., Quisquater, J.-J. (eds.) CARDIS 1998. LNCS, vol. 1820, pp. 175–190. Springer, Heidelberg (2000) 3. Fouque, P.-A., Kunz-Jacques, S., Martinet, G., Muller, F., Valette, F.: Power Attack on Small RSA Public Exponent. In: Goubin, L., Matsui, M. (eds.) CHES 2006. LNCS, vol. 4249, pp. 339–353. Springer, Heidelberg (2006) 4. Knuth, D.E.: The Art of Computer Programming, 3rd edn. Seminumerical Algorithms, vol. 2. Addison-Wesley, Reading (1997) 5. Ko¸c, C ¸ .K.: High Radix and Bit Recoding Techniques for Modular Exponentiation. International J. of Computer Mathematics 40(3-4), 139–156 (1991) 6. Kocher, P.: Timing attack on implementations of Diffie-Hellman, RSA, DSS, and other systems. In: Koblitz, N. (ed.) CRYPTO 1996. LNCS, vol. 1109, pp. 104–113. Springer, Heidelberg (1996) 7. Kocher, P., Jaffe, J., Jun, B.: Differential Power Analysis. In: Wiener, M.J. (ed.) CRYPTO 1999. LNCS, vol. 1666, pp. 388–397. Springer, Heidelberg (1999) 8. Montgomery, P.L.: Modular Multiplication without Trial Division. Mathematics of Computation 44(170), 519–521 (1985) 9. Schindler, W.: A Combined Timing and Power Attack. In: Naccache, D., Paillier, P. (eds.) PKC 2002. LNCS, vol. 2274, pp. 263–279. Springer, Heidelberg (2002) 10. Schindler, W.: On the Optimization of Side-Channel Attacks by Advanced Stochastic Methods. In: Vaudenay, S. (ed.) PKC 2005. LNCS, vol. 3386, pp. 85–103. Springer, Heidelberg (2005) 11. Schindler, W., Walter, C.D.: More detail for a Combined Timing and Power Attack against Implementations of RSA. In: Paterson, K.G. (ed.) Cryptography and Coding. LNCS, vol. 2898, pp. 245–263. Springer, Heidelberg (2003) 12. Walter, C.D.: Precise Bounds for Montgomery Modular Multiplication and Some Potentially Insecure RSA Moduli. In: Preneel, B. (ed.) CT-RSA 2002. LNCS, vol. 2271, pp. 30–39. Springer, Heidelberg (2002) 13. Walter, C.D., Thompson, S.: Distinguishing Exponent Digits by Observing Modular Subtractions. In: Naccache, D. (ed.) CT-RSA 2001. LNCS, vol. 2020, pp. 192–207. Springer, Heidelberg (2001)
Differential Power Analysis of HMAC Based on SHA-2, and Countermeasures Robert McEvoy, Michael Tunstall, Colin C. Murphy, and William P. Marnane Department of Electrical & Electronic Engineering, University College Cork, Ireland {robertmce,miket,cmurphy,liam}@eleceng.ucc.ie
Abstract. The HMAC algorithm is widely used to provide authentication and message integrity to digital communications. However, if the HMAC algorithm is implemented in embedded hardware, it is vulnerable to side-channel attacks. In this paper, we describe a DPA attack strategy for the HMAC algorithm, based on the SHA-2 hash function family. Using an implementation on a commercial FPGA board, we show that such attacks are practical in reality. In addition, we present a masked implementation of the algorithm, which is designed to counteract first-order DPA attacks.
1
Introduction
In today’s modern society of e-mail, internet banking, online shopping and other sensitive digital communications, cryptography has become a vital tool for ensuring the privacy of data transfers. To this end, Message Authentication Code (MAC) algorithms are used to verify the identity of the sender and receiver, and to ensure the integrity of the transmitted message. These algorithms process the message to be authenticated along with a secret key, which is shared between the sender and receiver. The result is a short string of bits, called a MAC. HMAC [1] is a popular type of MAC algorithm which is used in the IPsec [14] and TLS protocols [6], and is based on a cryptographic hash function such as SHA-2 [16]. The last decade has also seen the emergence of attacks which target cryptographic algorithms that are implemented in embedded hardware [9]. Of particular interest are differential side-channel attacks, such as Differential Power Analysis (DPA) [12]. These non-invasive attacks exploit information that leaks from a cryptographic device via some side channel, such as timing information, power consumption, or electromagnetic emanations. Comparing small variations in the side-channel information as a device processes different messages can potentially allow an attacker to recover secret information stored within the device. In this paper, we examine the susceptibility to differential side-channel attacks of the HMAC algorithm based on the SHA-2 family of hash functions. Side-channel attacks on hash functions and the HMAC algorithm have been discussed in the past, but specific attack details for the SHA-2 family have not been given, nor have countermeasures been designed. In 2001, Steinwandt et al. [21] presented a theoretical attack on the SFLASH signature scheme, which S. Kim, M. Yung, and H.-W. Lee (Eds.): WISA 2007, LNCS 4867, pp. 317–332, 2007. © Springer-Verlag Berlin Heidelberg 2007
318
R. McEvoy et al.
targeted an exclusive-OR (XOR) operation in SHA-1. Coron and Tchoulkine [5] noted the vulnerability of the HMAC algorithm to a DPA attack. Lemke et al. [10] described a theoretical DPA attack on the HMAC algorithm based on the hash function RIPEMD-160, noting that a similar approach could be taken for a HMAC scheme based on SHA-1. Okeya et al. [18,19] highlight the susceptibility of MAC and HMAC algorithms to side-channel attacks, but the exposition is for the HMAC algorithm based on block-cipher based hash functions, in contrast with SHA-2, which is a dedicated cryptographic hash function. In this paper, we characterise a differential side-channel attack on an implementation of the HMAC algorithm that uses the SHA-2 hash function family. Furthermore, we provide attack results on a FPGA implementation of the algorithm. We also describe countermeasures that could be used to prevent such side-channel attacks, by designing masked circuits for the vulnerable SHA-2 operations. The rest of this paper is organised as follows. In Section 2, the necessary background theory regarding the HMAC algorithm, the SHA-2 family, and DPA attacks is introduced. Section 3 gives a detailed account of how the SHA-256 based HMAC scheme can be broken by a side-channel attacker. Results from a practical FPGA-based implementation of this attack are presented in Section 4. In Section 5, a masking scheme is designed as a countermeasure against the attack, and the resulting FPGA-based scheme is tested in Section 6. Section 7 concludes the paper.
2 2.1
Background Theory HMAC Algorithm Overview
The HMAC authentication scheme was first introduced by Bellare et al. at CRYPTO’96 [1]. The scheme was designed such that the security of the MAC is built upon the security of the underlying hash function h. The MAC is calculated as follows: (1) HMACk (m) = h((k ⊕ opad) h((k ⊕ ipad) m)) where k is the secret key (padded with zeros to equal the block size of h), and m is the message to be authenticated. ipad is a fixed string whose length equals the block size of h; generated by repeating the hexadecimal byte 0x36. Similarly, opad is fixed and is formed by repeating the hexadecimal byte 0x5C. ⊕ and denote XOR and concatenation respectively. Clearly, in order to calculate HMACk (m), the hash function h must be invoked twice. In this paper, we focus on the first call to the hash function, which calculates the partial MAC: HMAC0002k (m) = h((k ⊕ ipad) m)
(2)
In [1], the authors suggested using MD5 or SHA-1 to instantiate the hash function h. In 2002, the HMAC algorithm was released as a standard by NIST [17], in which h is defined as a NIST-approved hash function. In this paper, we adhere to this standard and choose SHA-256 to instantiate h. This follows a recent
Differential Power Analysis of HMAC
319
trend in the cryptographic community away from older hash functions, for which weaknesses have been identified [11], and towards newer constructions like the SHA-2 family [16]. We use the term “HMAC-SHA-256” to denote the HMAC algorithm that uses SHA-256 to instantiate h. 2.2
SHA-256 Description
There are four hash functions in the SHA-2 family: SHA-224, SHA-256, SHA384 and SHA-512. Each algorithm generates a fixed-length hash value; SHA-224 produces a 224-bit output, SHA-256 has a 256-bit output, etc. The compression functions in SHA-224 and SHA-256 are based on 32-bit operations, whereas the compression functions for SHA-384 and SHA-512 are based on 64-bit operations. We focus on SHA-256 in our attacks, because it is easier in practice to perform a side-channel attack on a 32-bit word than on a 64-bit word. However, in theory, the side-channel attacks and countermeasures described in this paper should also be applicable to HMAC-SHA-384 and HMAC-SHA-512. The SHA-256 algorithm essentially consists of three stages: (i) message padding and parsing; (ii) expansion; and (iii) compression. Message Padding and Parsing. The binary message to be processed is appended with a ‘1’ and padded with zeros until its bit length ≡ 448 mod 512. The original message length is then appended as a 64-bit binary number. The resultant padded message is parsed into N 512-bit blocks, denoted M (i) , for 1 ≤ i ≤ N . These M (i) message blocks are passed individually to the message expansion stage. Message Expansion. The functions in the SHA-256 algorithm operate on 32-bit words, so each 512-bit M (i) block from the padding stage is viewed as (i) sixteen 32-bit blocks denoted Mt , 1 ≤ t ≤ 16. The message expansion stage (also called the message scheduling stage) takes each M (i) and expands it into sixty-four 32-bit Wt blocks for 1 ≤ t ≤ 64, according to equations given in [16]. Message Compression. The Wt words from the message expansion stage are then passed to the SHA compression function, or the ‘SHA core’. The core utilises eight 32-bit working variables labelled A, B, . . . , H, which are initialised (0) (0) to predefined values H0 –H7 (given in [16]) at the start of each call to the hash function. Sixty-four iterations of the compression function are then performed, given by: A = T1 0002 T2 (3) E = D 0002 T1 (7) B=A (4) F =E (8) C=B D=C where T1 = H 0002
(5) (6) 0002 1
G=F H =G
(E) 0002 Ch(E, F, G) 0002 Kt 0002 Wt
(9) (10)
(11)
320
R. McEvoy et al.
T2 =
0002 0
(A) 0002 M aj(A, B, C)
(12)
Ch(x, y, z) = (x ∧ y) ⊕ (¯ x ∧ z) M aj(x, y, z) = (x ∧ y) ⊕ (x ∧ z) ⊕ (y ∧ z) 0002 (x) = ROT2 (x) ⊕ ROT13 (x) ⊕ ROT22 (x) 0 0002 (x) = ROT6 (x) ⊕ ROT11 (x) ⊕ ROT25 (x)
(13) (14) (15) (16)
1
and the inputs denoted Kt are 64 32-bit constants, specified in [16]. All additions in the SHA-256 algorithm are computed modulo 232 , denoted by 0002. The logical AND operator is denoted by ∧, and x¯ denotes the logical NOT operator. After 64 iterations of the compression function, a 256-bit intermediate hash value H (i) , (i) (i) comprising H0 –H7 , is calculated: (i)
(i−1)
H0 = A 0002 H 0
(i)
(i−1)
, H1 = B 0002 H1
(i)
(i−1)
, . . . , H7 = H 0002 H7
(17)
The SHA-256 compression algorithm then repeats and begins processing another 512-bit block from the message padder. After all N data blocks have been processed, the output, H (N ) , is formed by concatenating the final hash values: (N )
H (N ) = H0 2.3
(N )
H1
(N )
H2
(N )
. . . H7
(18)
Differential Side-Channel Analysis
Some of the most common forms of side-channel analysis are Differential Power Analysis (DPA) [9] and related attacks such as Correlation Power Analysis (CPA) [2]. In this type of attack, a series of power consumption traces are acquired using an oscilloscope, where each trace has a known associated input (e.g. the message block being processed). A comprehensive guide to this class of attacks is provided in [12]. The fundamental principle behind all DPA attacks is that at some point in an algorithm’s execution, a function f exists that combines a fixed secret value with a variable which an attacker knows. An attacker can form hypotheses about the fixed secret value, and compute the corresponding output values of f by using an appropriate leakage model, such as the Hamming Distance model [2]. The attacker can then use the acquired power consumption traces to verify her hypotheses, by partitioning the acquisitions or using Pearson’s correlation coefficient. These side-channel analysis attacks are aided by knowledge of details of the implementation under attack. Moreover, these attacks can be used to validate hypotheses about implementation details. In subsequent sections, these side-channel analysis attacks are referred to as DPA attacks.
3
Attacking HMAC-SHA-256
In this section, we describe an attack on HMAC-SHA-256 using DPA. This attack does not allow recovery of the secret key itself, but rather a secret intermediate
Differential Power Analysis of HMAC
321
state of the SHA-256 hash function. Knowledge of this intermediate state would allow an attacker to forge MACs for arbitrary messages. We note that the attack is not limited to DPA, and other side-channels, such as the electromagnetic sidechannel, could also be used. 3.1
Goal of the Attack
We assume that the attacker has access to a device that performs the HMAC algorithm, and that she has knowledge of the messages being processed by the device. This target device contains a basic implementation of the SHA-256 algorithm, and does not include any side-channel analysis countermeasures. Furthermore, we assume that the attacker has access to some side-channel information (e.g. the power consumption) while the device is calculating the MAC, which leaks the Hamming Distance between the internal signals as they change from one state to the next. As is common, we assume that the secret key is stored in a secure environment, which does not leak side-channel information. The attack focuses on the first execution of SHA-256, given in equation (2). The block size of SHA-256 is 512 bits; therefore, using the notation from Section 2.2, k = ipad = 512. Without loss of generality, we can assume that the size of the message m is such that N = 2, i.e. the device will run through the 64 iterations of the compression function twice, in order to calculate equation (2). When i = 1, the hash function is operating on (k ⊕ ipad), which clearly does not change from one execution of the HMAC algorithm to the next. Hence, the intermediate hash H (1) is also fixed and unknown. Recall that in order to perform a differential side-channel attack, we require fixed unknown data to be combined with variable known data. This criterion is fulfilled during the calculation of H (2) , when the variable m is introduced and combined with the previous intermediate hash H (1) . Therefore, in theory, a differential side-channel attack could be launched on a device calculating equation (2), in order to recover H (1) . This knowledge would allow the attacker to create partial MACs of her choice. Reapplying the side-channel attack on the second invocation of SHA256 in the HMAC algorithm would allow the attacker to forge full MACs for messages of her choosing. Consequently, the goal of the attacker is to recover the secret intermediate hash value H (1) . 3.2
Attack Strategy
The secret intermediate hash H (1) manifests itself as the initial values of the eight 32-bit working variables A–H, when i = 2. We use the subscript t, 1 ≤ t ≤ 64 to denote the round number, e.g. A1 refers to the value of A at the start of round 1 of the compression function, etc. The side-channel attacker’s goal is to uncover the eight variables A1 –H1 . A strategy for such an attack is now described. 1. With reference to equations (3) and (7), it is clear that at some point in the first round, the variable T 11 must be calculated. T 1t is a large sum with 5 terms, and can be re-written as: T 1t = θt 0002 Wt
(19)
322
R. McEvoy et al.
where θ t = Ht 0002
0002 1
(Et ) 0002 Ch(Et , Ft , Gt ) 0002 Kt
(20)
In round 1, θ1 is fixed and unknown, and W1 is known by the attacker, since it is related to m. Therefore, a DPA attack can be launched by making hypotheses about θ1 , and computing the corresponding values of T 11. Since SHA-256 uses 32-bit words, 232 hypotheses for θ1 are required. Furthermore, since we assume that the target device leaks the Hamming Distance (HD), the 232 possibilities for the previous state, T 10, must also be taken into account. Therefore, the attacker correlates the power traces with her 264 hypotheses for HD(T 10 , T 11). This allows the attacker to recover T 10 and θ1 , and then calculate T 11 for any message m. Clearly, correlating with 264 hypotheses would be computationally infeasible, even for well-resourced attackers. In Section 3.3, we describe how the Partial CPA technique [22] can be used to significantly reduce the attack’s complexity. 2. The above attack stage gives the attacker control over the value of T 11 , so it is now a known variable. Using equation (7), the attacker can now make hypotheses on the (fixed) bits of D1 , using the bits of E2 as selection bits. Using the Hamming Distance model, hypotheses for the previous (secret) state E1 are also generated. In this way, the attacker can recover her first secrets, D1 and E1 , and accurately predict the value of E2 for any message m. 3. Focusing on equation (3), we observe that T 11 is variable and known, whereas T 21 is fixed and unknown. The attacker can launch a DPA attack on A2 by forming hypotheses about T 21 and the previous state A1 . Hence, the secret value of A1 is revealed. Furthermore, with knowledge of both T 11 and T 21 , the attacker can now accurately predict A2 for any message m. Therefore, by analysing the side-channel signals from the first round, the attacker can recover the fixed secret values of θ1 , D1 , E1 , T 21 and A1 , and also predict the values of variables T 11 , A2 and E2 . 4. The attacker now turns her attention to the second SHA-256 round. Here, the Ch function is calculated as: Ch(E2 , F2 , G2 ) = (E2 ∧ F2 ) ⊕ (E2 ∧ G2 )
(21)
where E2 is variable, and known by the attacker. From equations (8) and (9), we observe that F2 and G2 are fixed at E1 and F1 , respectively. Therefore, the attacker can generate hypotheses about the unknown values F1 , and attack the output of the Ch function. Of course, 232 hypotheses for the previous state Ch(E1 , F1 , G1 ) are also required. Recovering F1 means that the attacker can now accurately predict the variable Ch(E2 , F2 , G2 ). 5. The next point of attack is the calculation of T 12 (equation (11)). At this stage, the only fixed unknown value in the equation is H2 , as every other variable can be predicted. The attacker already knows the previous state T 11 from stage 1 above. Mounting a DPA attack uncovers H2 , and allows
Differential Power Analysis of HMAC
323
T 12 to be predicted. From equation (10), it can be seen that H2 is equivalent to G1 . 6. The knowledge of T 12 gained from the previous attack stage can be applied to equation (7). Using the bits of E3 as the selection function, the attacker can mount a DPA attack that uncovers D2 . From equation (6), we observe that D2 is equivalent to C1 . 7. The M aj function in the second round can be expressed as: M aj(A2 , B2 , C2 ) = (A2 ∧ B2 ) ⊕ (A2 ∧ C2 ) ⊕ (B2 ∧ C2 )
(22)
where A2 is variable, and known by the attacker. From equations (4) and (5), we observe that B2 and C2 are fixed at A1 and B1 , respectively. Using a similar approach to that taken in stage 4 above, the attacker can perform DPA on M aj and discover the secret value of B1 . 8. By following the above strategy, the attacker can recover the fixed secrets A1 –G1 . The last remaining secret variable, H1 , can be found by reverting the focus to round 1, and substituting into equation (11), where the only unknown value is that of H1 . The eight 32-bit secret values are thus recovered, using seven first-order DPA attacks. 3.3
Complexity of the Attack
As noted above, it is currently computationally infeasible for an attacker to compute 264 hypotheses for a DPA attack. However, as indicated in [2] and illustrated in [22], a partial correlation may be computed, rather than the full correlation. If a correlation coefficient of ρ is obtained by correctly predicting all 32 bits of an 0003 intermediate variable, then we would expect to obtain a partial correlation of ρ l/32 by predicting l bits correctly. Hypotheses can be made on smaller sets of bits at a time, e.g. l = 8, and this strategy can be employed to keep only those hypotheses that produce the highest partial correlations. In this way, the full 32-bit correlation can be built up in stages, thereby reducing the complexity of the attack. This is similar to the ‘extend-and-prune’ strategy employed by a template attack [3].
4 4.1
Attack on FPGA Implementation Implementation Details
In order to demonstrate the feasibility of a DPA attack on HMAC-SHA-256, we implemented the algorithm on a Xilinx FPGA Board. FPGAs are attractive for implementing cryptographic algorithms because of their low cost (relative to ASICs), and their flexibility when adopting security protocol upgrades. FPGAs also allow rapid prototyping of various designs. For our experiments, we implemented SHA-256 on Xilinx’s low-cost Spartan™-3E Development Kit, which contains the Spartan™XC3S500E FPGA [23]. FPGAs consist mostly of Configurable Logic Blocks (CLBs), arranged in a regular array across the chip. In our
324
R. McEvoy et al. 0.3
8 bits 12 bits 16 bits 20 bits 24 bits 28 bits 32 bits
0.25
0.2
Correlation
0.15
0.1
0.05
0
−0.05
−0.1
0
0.5
1
1.5
Time
2
2.5
3 4
x 10
Fig. 1. Correlation and partial correlations between the power consumption and E2 , given the correct prediction of D1 and the previous state E1
case, each CLB contains 4 logic “slices”, which is the basic unit used to quantify the area occupied by a design. Each slice contains two four-input Look-Up Tables (LUTs) and two registers. Logic also exists within a slice which allows fast implementation of carry look-ahead addition. Each slice has a dedicated multiplexer, which is hard-wired to provide a fast carry chain between consecutive slices and CLBs. Indeed, fast carry logic is a feature of many modern FPGAs. We will make use of this dedicated addition circuitry in Section 5. Several optimisations for the SHA-2 family exist, such as pipelining and loop unrolling [4]. However, for simplicity, it was decided to implement a basic design without such optimisations. The design was captured using VHDL, and synthesis, placing and routing were performed using Xilinx ISE™v9.1i. The processor and interface circuitry utilise 951 slices, corresponding to 20% of the FPGA resources. The critical path in the design (i.e. the longest combinational path) is 16.4 ns, during the calculation of A (equation (3)). Block RAMs (BRAMs) on the FPGA are used to store the various messages to be processed; this reduces the communication requirements with the FPGA board. 4.2
Experimental Results
In order to obtain DPA power traces from the design, the FPGA board was configured with the basic SHA-256 design, and a 10 Ω resistor was inserted in the FPGA power supply line. Using a LeCroy WaveRunner 104Xi oscilloscope and a differential probe, we could measure the voltage fluctuations across the resistor as the SHA-256 algorithm was being executed. Therefore, the traces recorded on the oscilloscope were proportional to the power consumption of the FPGA during the execution of the algorithm. Traces for the first three rounds were captured while 4000 random messages were being processed by the FPGA. In order to reduce acquisition noise, each captured trace corresponded to the average of 700 executions with the same message. Figure 1 shows the correlation trace achieved when D1 and the unknown previous state E1 are correctly predicted.
Differential Power Analysis of HMAC
325
The different levels correspond to the correlation coefficients achieved when a certain number of bits are correctly predicted.
5
Masking the SHA-256 Algorithm
The preceding sections have demonstrated the susceptibility of hardware implementations of the HMAC algorithm to first-order DPA attacks. We now examine how to use masking as a countermeasure to such attacks. The masking technique aims to use random values to conceal intermediate variables in the implementation of the algorithm, thereby making the side-channel leakage independent of the secret intermediate variables. Much of the literature has focused on masking techniques for software implementations of cryptographic algorithms (e.g. [5,8,15]). In [7], Goli´c detailed techniques for masking hardware implementations, which we build upon below in order to mask HMAC-SHA-256. 5.1
Requirements
Consider a function f and intermediate variables x, y and z, such that z = f (x, y). If x or y are key-dependent ors personalized experience, the evaluation of recommendation credibility is depending on the common set of peers that have interacted with requestor and the recommendatory peers. As the increase of peers’ quantity, the common set is always very small [14]. Eigentrust [3] considers the recommendation credibility as being equal to the service trust. This metric is not suitable in circumstances where a peer may maintain a good reputation by providing high quality services but send malicious feedbacks to its competitors. Research [21] presents a new recommendation credibility calculation model, but there exists unfairness to blameless peers. Research [6] proposes the weighted majority algorithm (WMA), the main idea is to assign and tune the weights so that the relative weight assigned to the successful advisors is increased and the relative weight assigned to the unsuccessful advisors is decreased. But, the approaches mentioned above don’t consider more complex malicious strategies, for example, peers could try to gain trust from others by telling the truth over a sustained period of time and only then start lying, colluding peers could inflate reputation using unfair ratings flooding. Moreover, a peer may be penalized for an incorrect opinion that was based on a small number of interactions and/or a large variation in experience. Then, honest peers will be falsely classified as lying. Truthtelling Mechanisms. In order to stimulate reputation information sharing and honest recommendation elicitation, Jurca and Faltings [8] propose an incentive compatible reputation mechanism to deal with inactivity and lies. A client buys a recommendation about a service provider from a special broker named R-nodes. After interacting with the provider, the client can sell its feedback to the same R-node, but gets paid only if its report coincides with the next client’s report about the same service provider. One issue is that if the recommendation from an R-node is negative such that a client decides to avoid the service provider, the client will not have any feedback to sell. Or in the existence of opportunistic service providers that, for example, behave and misbehave alternatively, an honest feedback does not ensure payback. This opens up the possibility of an honest entity to have negative revenue and thus is unable to buy any recommendation. Besides, the effectiveness of their work depends largely on the integrity of R-nodes, which is assumed to be trusted a priori. To encourage the exchange of reputation information, Pinocchio [16] rewards participants that advertise their experience to others and uses a probabilistic honesty metric to detect dishonest users and deprive them of the rewards. The reward can be used to query the reputation of others. Pinocchio does not intend to protect against conspiracies or bad-mouthing. Research [17] proposes a mechanism for providing the incentives for reporting truthful feedback in a P2P system for exchanging services. Under their approach, both transacting peers submit ratings on performance of their mutual transaction. If these are in disagreement, then both transacting peers are punished, since such an occasion is a sign that one of them is lying. The severity of
374
J. Chang et al.
each peer’s punishment is determined by his corresponding non-credibility metric; this is maintained by the mechanism and evolves according to the peer’s record. But their proposal still avoids the fundamental problem that peers have no incentive to provide reputation feedback. Even peer can provide feedback, but, obviously, malicious peers may collude to weaken the mechanism (two colluding peers can provide the consistent rating for each other to increase their reputation value.). In contrast to these works, we propose a fully distributed mechanism based on local knowledge that provides malicious and non-participating entities an incentive for participation and honest behavior.
3 Incentive-Compatible Reputation Mechanism We adopt the terminology witness to denote a peer solicited for providing its recommendation. Finding the right set of witnesses is a challenging problem since the reputation value depends on their recommendations. Our approach for collecting recommendations follows the solution proposed by Yu et al [6], in which recommendations are collected by constructing chains of referrals through which peers help one another to find witnesses. In order to stimulate peers to send sufficiently honest recommendations, we make some changes (see Sect. 3.3 for details). In this section, we first introduce our trust valuation algorithm used by a peer to reason about trustworthiness of other peers based on the available local information which includes past interactions and recommendations received from witnesses. Then, we present our recommendation credibility calculation model, which can effectively detect dishonest recommendations in a variety of scenarios where more complex malicious strategies are introduced. Last, a simple yet effective trust information exchange protocol is proposed to stimulate sufficiently honest recommendations. 3.1 Trust Evaluation Algorithm In respect that peers may change their behaviors over time, and the earlier ratings may have little impact on the calculation result, it is desirable to consider more recent ratings, that is, to consider only the most recent ratings, and discard those previous ones. Such a restriction is motivated by game theoretic results and empirical studies on ebay that show that only recent ratings are meaningful [15]. Thus, in the following, only interactions that occur within a sliding window of width D are considered. Moreover, by doing so, the storage costs of our reputation system are reasonable and justified given its significant benefits. There are two kinds of trust relationships among peers, namely, direct trust and indirect trust [19]. The direct trust of peer i to peer j can be evaluated from the direct transaction feedback information between i and j. The indirect trust of i to j can be computed according to the recommendations of peers who have interacted with j. The overall trust of peer i to peer j is produced by combining these two trust value. Direct Trust. Any peer in our system will maintain a direct trust table for all the other peers it has interactions with directly. Suppose peer i has some interactions with peer j during the last D time units, the entry for peer j is denoted as
ICRep: An Incentive Compatible Reputation Mechanism for P2P Systems
375
E xp ( i , j ) = n , E X P _ L IS T , where n is the number of ratings, and E X P _ L IS T is an index in which these ratings are kept. The rating is in the form of r = (i, j, t, v). Here i and j are the peers that participated in interaction, and v is the rating peer i gave peer j. The range of v is [0, 1], where 0 and 1 means absolutely negative, absolutely positive respectively. t is the time stamp of the interaction. A rating is deleted from the direct trust table after an expiry period D. From the direct trust table, the direct trust valuation of peer i to peer j at time t is represented as < D ijt , ρ ijt >, where D ijt is the direct trust value and ρ ijt is introduced
to express the level of confidence about this direct trust value. Although there are a lot of elements that can be taken into account to calculate the level of confidence, we will focus on two of them: the number of experiences used to calculate the direct trust value and the variability of its rating values. D ijt is calculated by the following formula: t ij
D
∑ = ∑
e∈ Exp ( i , j ). EXP _ LIST
e.v ∗ α ( t − e.t )
α ( t − e .t ) e∈Exp ( i , j ). EXP _ LIST
(1)
Where e.v is the value of the rating e, and α is the decay factor in the range of (0, 1). A malicious node may strategically alter its behavior in a way that benefits itself such as starting to behave maliciously after it attains a high reputation. In order to cope with strategic altering behavior, the effect of an interaction on trust calculation must fade as new interactions happen [10]. This makes a peer to behave consistently. So, a peer with large number of good interactions can not disguise failures in future interactions for a long period of time. Let CIN ijt be the level of confidence based on the number of ratings that have been taken into account in computing D ijt . As the number of these ratings ( E x p ( i , j ). n ) grows, the level of confidence increases until it reaches a defined threshold (denoted by m). ⎧ E xp ( i , j ).n ⎪ C IN ijt = ⎨ m ⎪⎩1
if E xp ( i , j ).n ≤ m
(2)
otherw ise
Hence, the level of confidence CIN ijt increases from 0 to 1 when the number of ratings E x p ( i , j ) . n increases from 0 to m, and stays at 1 when E x p ( i , j ) . n exceeds m. Let CID ijt be the level of confidence based on the variability of the rating values.
CIDijt is calculated as the deviation in the ratings’ values: C ID ijt = 1 −
1 ∗ 2
∑
e∈ Exp ( i , j ). EX P _ LIST
α ( t − e .t ) ∗ e .v − D ijt
∑ e∈ Exp ( i , j ). E XP _ L IST α (t − e .t )
(3)
376
J. Chang et al.
This value goes from 0 to 1. A deviation value near 0 indicates a high variability in the rating values (this is, a low confidence on the direct trust value) while a value close to 1 indicates a low variability (this is, a high confidence on the direct trust value). Finally, the level of confidence
ρijt
about the direct trust value
Dijt combines the
two reliability measures above:
ρijt = CINijt ∗ CIDijt
(4)
Indirect trust. After collecting all recommendations about peer j using the rating discovery algorithm proposed by Yu et al [6], peer i can compute the indirect trust about peer j. Let Ti = {p1,p2,…,pti} be the set of trustworthy peers which reply the request. If p k ∈ T i had at least one service interaction with pj, it replies recommendation < Re c kjt , CI kjt > based on the local rating records with peer j. For an honest peer p, we have R e c kt j = D kt j and CI kjt = ρ kjt . The inclusion of level of confidence in the recommendation sent to the recommendation requestor allows the recommendation requestor to gauge the how much confidence the peer itself places in the recommendation it has sent. To minimize the negative influence of unreliable information, the recipient of these recommendations weighs them using this attached level of confidence and the credibility of the sender. If the level of confidence has a small value, the recommendation is considered weak and has less effect on the reputation calculation. Credibility is evaluated according to the past behavior of peers and reflects the confidence a peer has in the received recommendation. Credibility computation is presented in the next subsection. The indirect trust value of peer j according peer i, denoted by
Rijt , is given by the
following formula: t ij
R Where
∑ =
pk ∈Ti
∑
Crikt ∗ CI kjt ∗ Re ckjt
(5)
Crikt ∗ CI kjt p ∈T k
i
Crikt is the credibility of peer k according to peer i, CI kjt denotes the level of
confidence about the recommendation value Re cijt . So peer i gives more weight to recommendations that are considered to be of a high confidence and that come from peers who are more credible. Overall Trust. Base on the direct trust value and indirect trust value calculated t
above, the overall trust value of peer i to peer j’s service (denoted by Oij ) is defined in formula (6).
Oijt = λ ∗ Dijt + (1 − λ ) ∗ Rijt
(6)
ICRep: An Incentive Compatible Reputation Mechanism for P2P Systems
377
Where D ijt denotes the direct trust value of i to j, R ijt is the indirect trust of peer j according to peer i, the “self-confidence factor” is denoted by λ , which means that how a peer is confident to its evaluation of direct trust value. λ = Exp (i, j ).n / m ,
Exp (i, j ).n is the number of the direct interactions considered, and m is the maximum number to be considered for a peer, and the upper limit for
λ
is 1.
3.2 Recommendation Credibility Any peer in our system will also maintain a recommendation credibility table for all the other peers it has got recommendations from. Suppose peer i has got some recommendations with peer k during the last D time units, the entry for peer k is denoted as R E C ( i , k ) = n , R E C _ L I S T , where n is the number of credibility ratings, and REC _ LIST is an index in which these credibility ratings are kept. In more detail, after having an interaction with peer j, peer i gives its rating about j’s service performance as V ij . Now, if peer i received recommendation < V k j , C I kj > from peer k, then the credibility rating value V w for peer k about this recommendation is given in the following formula:
Vw = 1 − Vkj − Vij The credibility rating value
(7)
Vw is set to be inversely proportional to the difference
between a witness recommendation value and the actual performance (e.g. higher difference, lower credibility). The rating about peer k’s credibility — r = (i, k, t, V w , C I w ) — is then appended to peer i’s recommendation credibility table. t is the time of peer k providing peer i the recommendations about peer j, C I w = C I kj . The recommendation credibility of peer i to peer j at time t is denoted by C rijt , it is a [0, 1]-valued function which represents the confidence formed by peer i about the truthfulness of j's recommendations. This function is local and is evaluated on the recent past behavior of both peer i and j. It is locally used to prevent a false credibility from being propagated within the network. The credibility trust value
Crikt is
calculated as follows: ⎧∑e∈Rec(i,k).REC _ LIST e.Vw ∗αt−e.t ∗eCI . w ⎪ if Rec(i, k).REC _ LIST ≠∅ t t −e.ts Crik = ⎨ ∑e∈Rec(i,k ).REC_ LIST α ∗eCI . w ⎪ otherwise ⎩c0
(8)
Where α is the decay factor in the range of (0, 1) using equation (1), So, it can cope with more complex malicious strategies, for example, peers could try to gain trust from others by telling the truth over a sustained period of time and only then start lying. e .C I w is the level of confidence about the recommendation value, and
378
e .V
J. Chang et al.
w
is the value of the credibility rating e. If no such ratings has been recorded, we
will assign the default credibility trust value, denoted by
c0 , to peer j.
Our credibility model considers the level of confidence about the recommendation value. Giving incorrect recommendation can decrease the recommendation credibility of a peer. So, a peer can lower the level of confidence for opinions about which it is not very sure, therefore risking less loss of credibility in case its judgment is incorrect. If a weak recommendation is inaccurate, the recommendation credibility does not diminish quickly. A peer can not be penalized as much for an incorrect opinion that was based on a small number of interactions and/or a large variation in experience. Then, honest peers will not be falsely classified as lying. 3.3 Simple Trust Information Exchange Protocol The efficiency of the reputation mechanism fully depends on the number of received recommendations and the quality of each of them. In our reputation mechanism, incentive for participation and honest behavior is implemented through a fair differential service mechanism. The goal of service differentiation is not to provide hard guarantees but to create a distinction among the peers based on their contributions to the system. The basic idea is, the more the contribution, the better the relative service. In order to achieve this goal, first, we define two parameters that can be used to create service differentiation in trust information exchange, namely, level of participation, measuring if a peer is active in providing recommendations, and the recommendation credibility defined in section 3.2, assessing if a peer is providing honest recommendations. Second, based on the above two parameters, we propose a simple yet effective trust information exchange protocol using a “tit-for-tat” strategy, to elicit sufficient and honest participation. Based on the rating discovery algorithm proposed in [6], our protocol makes some changes to implement fair service differentiation. We introduce the level of participation notion as the propensity of a peer for replying to a rating request. It is described by function
lijt such that lijt represents the
percentage of times j provided its recommendation to i's queries regarding other peers over the last D time units, with
lij0 = 1. We use a simple approach to calculate lijt ,
which is calculated based on the number of recommendations provided by peer j to i during the last D time units. As the number of these recommendations (retrieved from the recommendation credibility table and denoted by
I ijt ) grows, the participation
level increases until it reaches a defined threshold (denoted by ⎧ I ijt ⎪ l ijt = ⎨ I m a x ⎪1 ⎩
if
I ijt ≤ I m a x
I m ax ). (9)
e lse
Finally, to elicit sufficient and honest participation, we apply the tit-for-tat strategy during the collect phase, i.e., upon receipt of a rating request from peer j, with
ICRep: An Incentive Compatible Reputation Mechanism for P2P Systems t
379
t
probability min( lij , Crij ) peer i provides its recommendation, otherwise it ignores the request. The more details are described in algorithm 1. Consequently, by not participating, requesting peers drive correct witnesses ignoring the request, which clearly makes their reputation mechanism useless. Hence there is a clear incentive for non participating peers to change their behavior. As for participating peers, when peer p receives a request from a requesting peer j then i satisfies j's request with probability
Crijt . By doing so, i satisfies j's request if it estimates that j is trustworthy, otherwise it notifies j of its recent faulty behavior by simply ignores the request. As previously, by cheating, a malicious peer penalizes itself by pushing correct witnesses to ignore its request, leading to its effective isolation. We claim that this social exclusion-based strategy motivates j to reliably cooperate.
Algorithm 1: Trust Information Exchange Protocol 1: upon (receipt of a rw(j,s,ttl,t) message at peer i) do 2: with (probability min( l itj , C rijt )) do 3:
if (i has interacted with s in the last D time units)
4:
recist ⇐〈 Dist , ρist 〉;
5:
send
recist to j;
6: else 7: ignore message; 8: end if 9: end do 10: if(ttl≠0) 11: A ⇐ getRandomNeighbor(b); // b is branching factor; 12: For each peer k in A do 13: send a rw(j,s,ttl−1,t) message to peer k; 14: send a witness(s,k,t) to j; 15: end do 16: end if 17: end do
4 Experimental Evaluation We will now evaluate the effectiveness of ICRep by means of experiments. Our intention with this section is to confirm that ICRep is robust against the collusion and badmouthing attacks, that it can effectively detect dishonest recommendations in a variety of scenarios where more complex malicious strategies are introduced, and that it is incentive compatible.
380
J. Chang et al.
4.1 Simulation Setup In our simulation, we use the topology of the system and the deception models as Bin Yu’s reputation mechanism [6]. In order to empirically evaluate our new reputation mechanism against more complex strategies, we make some changes. In our simulation experiment, the quality for a peer to be a SP (service provider) is independent of the quality for a peer to be a rater which provides recommendation. We first define the types of qualities of both SPs and raters used in our evaluation. Three types of behavior patterns of SPs are studied: good peers, fixed malicious peers and dynamic malicious peers. Good peers and fixed malicious peers provide good services and bad services without changing their qualities once the simulation starts respectively. Dynamic malicious peers alter their behavior strategically. The behaviors of peers as raters can be one of the three types: honest peers, fixed dishonest peers and dynamic dishonest peers. Honest and fixed dishonest peers provide correct and incorrect feedback without changing their patterns respectively. Dynamic dishonest peers provide correct feedback strategically, for example, the dynamic dishonest peers which tell the truth over a sustained period of time and only then start lying. Our initial simulated community consists of N peers, N is set to be 128. The percentage of the bad SPs is denoted by pb , the percentage of the bad raters is denoted by pf . Table 1 summarizes the main parameters related to the community setting and trust computation. The default values for most experiments are listed. In the default setting, 50% malicious peers are fixed malicious service providers, 50% malicious peers are dynamic malicious service providers, with 50% probability giving service maliciously. The dishonest raters are fixed dishonest peers which give complementary rating and the level of confidence is set to 1. We divide the total simulation time into multiple simulation time units. In every time unit, each peer initiates a single request that can be satisfied by any of the potential service providers. Every peer issues one service request per simulation round. When a peer receives a Table 1. Simulation Parameters Parameter N pb
Description number of peers in the community percentage of malicious peers in the community
Default 128 80%
pf
percentage of dishonest raters in the community probability peer responds to a service request
80% 0.1
λ α
self-confidence factor
Dynamic
the decay factor
0.9
c0 TTL B
initial credibility value
0.5
bound of the referral chain’s length branching factor in rating discovery algorithm exaggeration factor sliding time window the threshold number of recommendations the threshold number of interactions in formula (2)
4 2 1 10 20 5
res
ρ
D Imax M
ICRep: An Incentive Compatible Reputation Mechanism for P2P Systems
381
query, it answers it with res probability, or refers to other peers. res is set to 0.1 in the experiments. Two transaction settings are simulated, namely random setting and trusted setting. In random setting, peers randomly pick a peer from candidate peers who answer the service request to perform transactions. In trusted setting, peers select the reputable peer who has the maximal reputation value. The simulation program has been implemented in Java programming language. 4.2 Effectiveness of the Reputation Mechanism This simulation evaluates the immunity of ICRep reputation mechanism to the collusion and badmouthing attacks. This set of experiments demonstrates the benefit of reputation mechanism we proposed, peers compare the trustworthiness of peers and choose the peer with the highest trust value to interact with. A transaction is considered successful if both of the participating peers are good peers, otherwise is a failure transaction. We define successful transaction rate as the ratio of the number of successful transactions over the total number of transactions in the community up to a certain time. A community with a higher transactions success rate has a higher productivity and a stronger level of security. The experiment is performed in both non-collusive setting and collusive setting. We show the benefit of our reputation mechanism compared to a community without any trust scheme. We also compare the performance of our scheme against the trust management scheme proposed by Bin Yu in [6]. 1.4
1 0.8 0.6 0.4
1 0.8 0.6 0.4 0.2
0.2 0
Our reputation mechanism Bin Yu's scheme Without reputation mechanism
1.2 transaction success ratio
1.2 transaction success ratio
1.4
Our reputation mechanism Bin Yu's scheme Without reputation mechanism
0
2
4
6
8 10 time unit
12
14
(a) no-collusive setting
16
0
0
2
4
6
8 10 time unit
12
14
16
(b) collusive setting
Fig. 1. Effectiveness against the collusion and badmouthing attacks
Figure 1 shows the rate of success transactions with respect to the number of time units in collusive and non-collusive setting. We can see an obvious gain of the transaction success rate in communities equipped with a trust mechanism either in non-collusive setting or in collusive setting. Both ICRep and Bin Yu’s scheme can help peers avoid having interactions with malicious service providers in both settings, malicious peers are effectively identified even when they launch a collusion attack. This confirms that supporting trust is an important feature in a P2P community as peers are able to avoid untrustworthy peers. While in the collusive setting, dishonest peers’ collusive behaviors hardly disturb honest peers’ judgment. It needs more
382
J. Chang et al.
interactions to differentiate good peers from bad peers. Moreover, it is observed that Bin Yu’s scheme needs more interactions to differentiate good peers from bad peers in both setting, so ICRep outperforms Bin Yu’s reputation mechanism. 4.3 Predicting Honesty We now define a useful metric to evaluate the performance of our proposed recommendation credibility model. Definition 1. The average recommendation credibility of a witness C re j = 1
W j is
N
N
∑
i =1
(9)
C ri j
Where C rij is the credibility value of witness W j from peer Pi’s acquaintance model [6], and N is the number of peers in whose acquaintance model W j occurs. • Sensitiveness to Strategically Alter Behaviors of Peers The goal of this experiment is to show how credibility model we proposed works against strategic dynamic personality of peers. We simulated a community with all good peers but a dynamic malicious rater with dynamic personality. We simulated two changing patterns. First, peer could try to gain trust from others by telling the truth over a sustained period of time and only then start lying. Second, the peer is trying to improve its recommendation trust by telling the truth. Figure 2 illustrates the changes of recommendation trust of both the peer who is milking its recommendation trust and the peer who is building its recommendation trust in the whole process. The results indicate that our reputation mechanisms is very sensitive to peers’ strategically alter behaviors. 0.9
Milking Recommendation Trust Building Recommendation Trust
1 recommendation credibility
re co m m e n d a tio n cre d ib ility
1.2
0.8 0.6 0.4 0.2 0
exaggerated coefficient = 0.2 exaggerated coefficient = 0.3 exaggerated coefficient = 0.5
0.8
0.7
0.6
0.5
0.4
0
5
10
15 time unit
20
25
30
Fig. 2. Sensitiveness to strategically behaviors of peers
0
5
10
15 20 time unit
25
30
35
Fig. 3. Average recommendation credibility of alter witnesses for different exaggeration coefficients
ICRep: An Incentive Compatible Reputation Mechanism for P2P Systems
383
• Effects of Exaggeration Coefficient The present experiment studies the average recommendation credibility for such witnesses with different exaggeration coefficients [6]. Figure 3 shows the average recommendation credibility for witnesses with exaggerated negative ratings when exaggeration coefficient ρ is set to 0.2, 0.3, and 0.5, respectively. The results indicate that our approach can effectively detect witnesses lying to different degrees. For the witnesses with exaggerated negative ratings, their average recommendation credibility reaches to about 0.8, 0.7, and 0.5, respectively, after 10 time unit. So, the marginal lying cases can be detected. • Impact of Level of Confidence In the above two experiments, we only considered peers providing service with fixed personality. This experiment considers dynamic attack. An attacker, with x% probability, behaves maliciously by giving malicious service. In the other times, it behaves as a good peer. In this experiment, 80% peers are dynamic attackers with 50% probability giving service maliciously, other peers are good peer, and all the peers provide honest recommendations. The recommendation trust metrics has been observed to understand if honest peers assign fair recommendation trust values to each other. 0.9 recommendation credibility
0.85 0.8 0.75 0.7 0.65 0.6 0.55 0.5
0
5
10
without level of confidence with level of confidence 15 20 25 30 35 time unit
40
Fig. 4. Impact of level of confidence
Figure 4 shows the recommendation trust of honest peers in this setting, we can conclude that: without level of confidence, a peer may be penalized for an incorrect opinion that was based on a large variation in experience. Our approach allows a peer to determine the level of confidence about its recommendation. Giving incorrect recommendation can decrease the credibility of a peer. So, a peer can lower the level of confidence for opinions about which it is not very sure, therefore risking less loss of credibility in case its judgment is incorrect. If a weak recommendation is inaccurate, the recommendation credibility does not diminish quickly. A peer can not be penalized as much for an incorrect opinion that was based on a small number of interactions and/or a large variation in experience. Then, honest peers will not be falsely classified as lying.
384
J. Chang et al.
4.4 Effectiveness of Incentive Mechanism Since we have showed that our credibility model can help peers effectively detect inaccurate recommendations and generate a fair evaluation of recommendation in a variety of scenarios, we focus on the effectiveness of our incentive mechanism in stimulating active and truthful recommendations. We investigate and compare the performance of the 4 different types of recommenders similar as [20]: active truth-teller, inactive truth-teller, active liar and inactive liar. Each type of entity has the same population, i.e., 32 each. Honest recommenders recommend with their direct trust regarding the trustee, while dishonest recommenders send back lies which are complementary to their direct trust and level of confidence are set to 1. Active recommenders offer recommendations with 90% probability, while inactive ones offer with 10% probability. An active truth-teller can elicit more honest recommendations, which help him make right trust decisions regarding whether to interact with a peer or not. Therefore, first, we show the number of honest recommendations obtained by the four types of recommenders respectively. Second, we also display the number of wrong trust decisions made by different recommenders. When a peer fails to acquire any helpful recommendation, it has to base its trust decision solely on its direct experiences, which are not significant enough for a sound decision. Namely, the peer is subject to wrong trust decisions, which refer to either false positives (when an honest service provider is identified as an untrustworthy one) or false negatives (when a dishonest service provider is not identified as being so). • Elicited Honest Recommendations Figure 5 shows the number of elicited honest recommendations for different type of recommenders. We can see that at the beginning, very few recommendations are propagated and the four types of recommenders do not have much difference in the number of obtained honest recommendations. With the accumulation of experiences, the honest peers have enough experiences to recommend. Recommendation credibility is gradually recognized and the order of benefit (AT > IT > IL > AL) starts to be established, from simulation round 5 in Figure 5. 60
800
AT recommendations IT recommendations 700 AL recommendations IL recommendations
AT wrong trust decisions IT wrong trust decisions AL wrong trust decisions IL wrong trust decisions
50
600
number
number
40 500
400
30 20
300
10
200
0
100 1
2
3
4
5 time unit
6
7
8
9
1
2
3
4 5 time unit
6
7
8
Fig. 5. Number of elicited honest recommendations Fig. 6. Number of wrong trust decisions
ICRep: An Incentive Compatible Reputation Mechanism for P2P Systems
385
• Wrong Trust Decisions Figure 6 presents the number of wrong trust decisions made by the four types of recommenders. It can be seen that, with more accumulated experiences, every type of recommenders make less and less wrong trust decisions. Especially, with the help of honest recommendations, AT peers make the least number of wrong trust decisions and AL peers make the most (the order of AT > IT > IL > AL is enforced). Note that dishonest or inactive recommenders can also tell the honesty and activeness of a recommender using reputation system. However, they have access to less number of truthful recommendations for making the right trust decision. From these two experiments, we can get a conclusion that ICRep provides peers with the right incentives for truthful reporting of feedback information, as sincere peers receive always more benefit from the peer-to-peer system than liar peers, whose benefit is very low. Thus, the incentive mechanism is effective.
5 Conclusions and Future Work We present ICRep: an incentive compatible reputation system for P2P systems. Within this system a peer can reason about trustworthiness of other peers based on the available local information which includes past interactions and recommendations received from others. We focus on how to stimulate reputation information sharing and honest recommendation elicitation. Theoretic analysis and simulation show that ICRep can help peers effectively detect dishonest recommendations in a variety of scenarios where more complex malicious strategies are introduced. the precision of inaccuracy detection is improved (e.g. more marginal lying cases can be detected, and honest witnesses will not be falsely classified as lying because of an increased fluctuation in a provider’s performance). Moreover, it can also stimulate peers to send sufficiently honest recommendations. The latter is realized by ensuring that active and honest recommenders, compared to inactive or dishonest ones, can receive always more benefit from the peer-to-peer system. Interactions (service providing and recommendation providing) that occur within a sliding window of width D are considered, therefore, storage requirements for storing trust information are tolerable. The main overhead of our reputation mechanism comes from the reputation queries. In a service session, one provider is selected but reputation values about other providers are deleted. Thus, reputation values about unselected providers can be cached. Since a peer obtains more acquaintances with time, number of cache entries and cache hit ratio increase with time. By this way, we can reduce the overhead of our reputation mechanism comes from the reputation queries. As a next step, we will be evaluating our reputation mechanism as applied to a peer-to-peer network.
References 1. Oram, A.: Peer to Peer: Harnessing the power of disruptive technologies (2001) ISBN 0596-00110-X 2. Aberer, K., Despotovic, Z.: Managing Trust in a Peer-2-Peer Information System. In: The Proceedings of Intl. Conf. on Information and Knowledge Management (2001)
386
J. Chang et al.
3. Kamwar, S.D., Schlosser, M.T., Garcia-Molina, H.: The Eigentrust Algorithm for Reputation Management in P2P Networks. In: The Proceedings of the twelfth international conference on World Wide Web, Budapest, Hungary (2003) 4. Damiani, E., De Capitani di Vimercati, S., Paraboschi, S., Samarati, P.: Managing and sharing servents’ reputations in p2p systems. IEEE Transactions on Data and Knowledge Engineering 15(4), 840–854 (2003) 5. Xiong, L., Liu, L.: PeerTrust: Supporting reputation-based trust in peer-to-peer communities. IEEE Transactions on Data and Knowledge Engineering, Special Issue on Peer-to-Peer Based Data Management 16(7), 843–857 (2004) 6. Yu, B., Singh, M.P., Sycara, K.: Developing trust in large-scale peer-to-peer systems. In: Proceedings of First IEEE Symposium on Multi-Agent Security and Survivability (2004) 7. Chang, J., Wang, H., Yin, G.: A Time-Frame Based Trust Model for P2P Systems. In: Proceedings of 9th International Conference on Information Security Cryptology, Seoul, Korea (2006) 8. Jurca, R., Faltings, B.: An Incentive Compatible Reputation Mechanism. In: Proc. of the IEEE Conference on Electronic Commerce, Newport Beach, CA, USA (June 2003) 9. Lam, S.K., Riedl, J.: Shilling recommender systems for fun and profit. In: Proceedings of the 13th World Wide Web Conference (2004) 10. Srivatsa, M., Xiong, L., Liu, L.: TrustGuard: countering vulnerabilities in reputation management for decentralized overlay networks. In: WWW 2005, pp. 422–431 (2005) 11. Withby, A., Jøsang, A., Indulska, J.: Filtering Out Unfair Ratings in Bayesian Reputation Systems. In: Proceedings of the 7th Int. Workshop on Trust in Agent Societies (at AAMAS 2004), ACM, New York (2004) 12. Dellarocas, C.: Immunizing Online Reputation Reporting Systems Against Unfair Ratings and Discriminatory Behavior. In: ACM Conference on Electronic Commerce, pp. 150–157 (2000) 13. Chen, M., Singh, J.: Computing and Using Reputations for Internet Ratings. In: Proceedings of the Third ACM Conference on Electronic Commerce (EC 2001), ACM, New York (2001) 14. Feng, Z.J., Xian, T., Feng, G.J.: An Optimized Collaborative Filtering Recommendation Algorithm. Journal of Computer Research and Development 41(10) (2004) (in Chinese) 15. Dellarocas, C.: Reputation mechanisms. In: Hendershott, T. (ed.) Handbook on Information Systems and Economics, Elsevier, Amsterdam (2006) 16. Fernandes, A., Kotsovinos, E., Ostring, S., Dragovic, B.: Pinocchio: Incentives for honest participation in distributed trust management. In: Jensen, C., Poslad, S., Dimitrakos, T. (eds.) iTrust 2004. LNCS, vol. 2995, Springer, Heidelberg (2004) 17. Papaioannou, T.G., Stamoulis, G.D.: An Incentives’ Mechanism Promoting Truthful Feedback in Peer-to-Peer Systems. In: Proceedings of the 5th IEEE/ACM International Symposium in Cluster Computing and the Grid, Cardi, UK (2005) 18. Dellarocas, C.: Reputation mechanisms. In: Hendershott, T. (ed.) Handbook on Information Systems and Economics, Elsevier, Amsterdam (2006) 19. Beth, T., Borcherding, M., Klein, B.: Valuation of Trust in Open Networks. In: Gollmann, D. (ed.) Computer Security - ESORICS 1994. LNCS, vol. 875, Springer, Heidelberg (1994) 20. Liu, J., Issarny, V.: An incentive compatible reputation mechanism for ubiquitous computing environments. In: PST 2006. International Conference on Privacy, Security and Trust, Toronto, Canada (2006) 21. Huynh, T.D., Jennings, N.R., Shadbolt, N.: On handling inaccurate witness reports. In: Proc. 8th International Workshop on Trust in Agent Societies, Utrecht, The Netherlands, pp. 63–77 (2005)
Author Index
Aickelin, Uwe
157
Chan, Herwin 102 Chang, Junsheng 371 Choi, Myeonggil 359 Choo, Kim-Kwang Raymond Eckert, Claudia 142 Elliott, Stephen J. 48 Feyereisl, Jan
157
Goi, Bok-Min Gu, Yuan X.
245 61
Lo, Swee-Won Lu, Zhengding
30
Main, Alec 61 Marnane, William P. 317 Matsumoto, Tsutomu 128 McEvoy, Robert 317 Moon, SangJae 333 Mu, Yi 16 Murphy, Colin C. 317 Okamoto, Eiji
Ha, JaeCheol 333 Han, Lilong 266 Hashimoto, Toru 173 Hatano, Yasuo 215 Huang, Hao 291 Huang, Wei 291 Huang, Xinyi 16 Imai, Hideki 173 Itoh, Takashi 173 Iwai, Hisato 173 Jang, Jihyeon 48 Johnson, Harold 61 Kaneko, Toshinobu 215 Kawahara, Yuto 1 Kim, Hakil 48 Kim, Hyoung-Joong 76 Kim, Jeong-Nyeo 91 Kiyomoto, Shinsaku 203 Kobara, Kazukuni 173 Komoda, Norihisa 128 Lee, Byoungcheon 30 Lee, ChoelHoon 91 Lee, Yong Ki 102, 115 Li, Ruixuan 277 Lim, Jae-Deok 91 Liu, Qingtan 266
245 277
1
Park, JeaHoon 333 Phan, Raphael C.-W. 245 Prouff, Emmanuel 227 Pyshkin, Andrei 188 Rivain, Matthieu 227 R¨ oder, Patrick 142 Saisho, Hideaki 128 Sameshima, Yoshiki 128 Sasaoka, Hideichi 173 Shin, Sangmun 359 Shirase, Masaaki 1 Stumpf, Frederic 142 Sugio, Nobuyuki 215 Susilo, Willy 16 Takagi, Tsuyoshi 1, 203 Tanaka, Hidema 215 Tanaka, Toshiaki 203 Tang, Yangbin 371 Tang, Zhuo 277 Tews, Erik 188 Tunstall, Michael 317 Ueba, Masazumi Un, Sung-Kyong
173 91
Verbauwhede, Ingrid Walter, Colin D. Wang, Huaimin
303 371
102, 115
388
Author Index
Weinmann, Ralf-Philipp 188 Wen, Zhumu 277 Wilson, William O. 157 Wu, Wei 16 Xia, Lei 291 Xiang, Shijun 76 Xu, Shenkun 345
Yang, Jeongmo 30 Yang, Zongkai 266 Ye, Xiaojun 345 Yen, SungMing 333 Yin, Gang 371 Yoo, Seungjae 30 Yoshitomi, Motoi 203 Zhou, Yongxin
61